Intelligent Phishing URL Detection Using Association Rule Mining

A seminar report submitted to
MANIPAL UNIVERSITY
for Partial Fulfillment of the Requirement for the Award of the Degree of
Bachelor of Technology in Information Technology

by Hriddhi Dey
Reg. No. 130911388
September 2016
Abstract

Phishing is an online criminal act in which a malicious webpage impersonates a legitimate webpage in order to acquire sensitive information from the user. Phishing attacks continue to pose a serious risk for web users and an annoying threat within the field of electronic commerce. This paper focuses on discerning the significant features that discriminate between legitimate and phishing URLs. These features are then subjected to two association rule mining algorithms, apriori and predictive apriori. The rules obtained are interpreted to emphasize the features that are more prevalent in phishing URLs. Analyzing the knowledge accessible on phishing URLs, and using confidence as an indicator, features such as transport layer security, unavailability of the top level domain in the URL, and a keyword within the path portion of the URL were found to be sensible indicators of a phishing URL. In addition, the number of slashes in the URL, the dots in the host portion of the URL, and the length of the URL are also key factors for a phishing URL.
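To make the feature list and the role of confidence concrete, the following is a minimal Python sketch, not the report's implementation. The keyword list, the small set of top level domains, the reading of "unavailability of the top level domain" as "the last host label is not a known TLD", and the four sample URLs are all assumptions made for illustration. Confidence of a rule A -> phishing is computed as the fraction of URLs exhibiting A that are labeled phishing.

```python
from urllib.parse import urlparse

# Hypothetical keyword list; the report does not enumerate its keywords here.
SUSPICIOUS_KEYWORDS = ("login", "verify", "secure", "account", "update")
# Hypothetical, deliberately tiny set of top level domains.
KNOWN_TLDS = ("com", "org", "net", "edu", "in")


def extract_features(url):
    """Map a URL to the indicators named in the abstract (assumed encoding)."""
    parts = urlparse(url)
    host = parts.hostname or ""
    last_label = host.split(".")[-1]
    path = parts.path.lower()
    return {
        "no_tls": parts.scheme != "https",        # no transport layer security
        "no_tld": last_label not in KNOWN_TLDS,   # e.g. a raw IP address as host
        "keyword_in_path": any(k in path for k in SUSPICIOUS_KEYWORDS),
        "num_slashes": url.count("/"),
        "dots_in_host": host.count("."),
        "url_length": len(url),
    }


def confidence(rows, antecedent, consequent="phishing"):
    """confidence(A -> C) = count(A and C) / count(A); 0.0 if A never holds."""
    with_a = [r for r in rows if r[antecedent]]
    if not with_a:
        return 0.0
    return sum(1 for r in with_a if r[consequent]) / len(with_a)


# Illustrative, made-up samples (reserved example hosts), NOT data from the report.
samples = [
    ("http://192.0.2.10/bank/login/verify.php", True),
    ("http://example.test/secure/account/update", True),
    ("https://www.example.com/about", False),
    ("http://www.example.org/news", False),
]
rows = [dict(extract_features(url), phishing=label) for url, label in samples]

for feature in ("no_tls", "no_tld", "keyword_in_path"):
    print(feature, "-> phishing, confidence:", confidence(rows, feature))
```

A full apriori run would also enumerate multi-feature itemsets and prune them by minimum support; this sketch only scores single-feature rules, which is enough to show how confidence ranks indicators.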
1 Introduction

1.1 Phishing

Phishing is the attempt to obtain sensitive information such as usernames, passwords, and credit card details (and sometimes, indirectly, money), often for malicious reasons, by masquerading as a trustworthy entity in an electronic communication. The word is a neologism created as a homophone of fishing, due to the similarity of using a bait in an attempt to catch a victim. Communications purporting to be from popular...

... middle of paper ...

...e system prediction is not exclusively based on querying search engine results.

Huang et al. [8] proposed an SVM-based technique to detect phishing URLs. The features used are structural, lexical, and brand names that exist in the URL. However, more URL-related features are considered in the proposed work. Neda et al. [23] proposed a rule-based classification algorithm to detect phishing URLs. However, the rules used there are based on human experience rather than on an intelligent data mining technique. In the approach proposed by Han et al. [24], the system warns the user when the user submits the username and password for the first time, even if the current website is legitimate. This is because information about legitimate websites is not maintained. This login problem is eliminated in our system, as a whitelist repository is effectively maintained.
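The whitelist idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the whitelist entries, the function names, and the rule that a host matches when it equals a whitelisted domain or is a subdomain of one are all hypothetical, and the report's actual repository format is not shown in this excerpt.

```python
from urllib.parse import urlparse

# Hypothetical whitelist entries (reserved example domains).
WHITELIST = {"example.com", "example.org"}


def is_whitelisted(url, whitelist=WHITELIST):
    """True if the URL's host is a whitelisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    # Check the host and each parent domain: a.b.c -> a.b.c, b.c, c
    return any(".".join(labels[i:]) in whitelist for i in range(len(labels)))


def should_warn_on_credential_submit(url, whitelist=WHITELIST):
    """Warn only when credentials go to a site that is not known to be legitimate."""
    return not is_whitelisted(url, whitelist)
```

Matching on the trailing labels of the host, rather than on a substring, means a URL such as `http://example.com.evil.test/login` is not treated as whitelisted even though it contains a whitelisted name.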

