  DOI Prefix   10.20431


International Journal of Research Studies in Computer Science and Engineering
Volume 1, Issue 8, 2014, Page No: 35-40

Email Spam Detection Using Customized SimHash Function

G. Venkata Reddy¹, K. Ravichandra²

¹ CSE Dept., Nova College of Engineering & Technology, Vegavaram, Jangareddy Gudem.
² CSE Dept., M-Tech, Nova College of Engineering & Technology, Vegavaram, Jangareddy Gudem.

Citation: G. Venkata Reddy, K. Ravichandra, "Email Spam Detection Using Customized SimHash Function," International Journal of Research Studies in Computer Science and Engineering, 2014, 1(8): 35-40.


E-mail communication faces a persistent challenge today: spam messages mixed in with legitimate mail. This is a major obstacle when a message carries information that is important to its recipient. To address the problem, earlier work introduced a novel e-mail abstraction scheme that represents each e-mail by its layout structure. That technique generates the e-mail abstraction from the HTML content of the e-mail, and the resulting abstraction captures the near-duplicate phenomenon of spam more effectively. In that scheme, each subsequence is mapped to a node of a spam tree. In this paper we propose to replace that mapping with a special hash function, SimHash. Its advantage over other hash functions is that it sets a minimum on the number of members that two sets must share in order to match, which mitigates the effect of extremely common set members on data clusters. The SimHash-based approach is fast, flexible, customizable (HtmlSimhash), and scalable, and SimHash is patented by Google.
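
The abstract does not spell out how the customized HtmlSimhash is computed, so the following is only a rough sketch of the general idea: a generic SimHash fingerprint built from the HTML tag layout of an e-mail, with near-duplicates detected by Hamming distance. The names `TagCollector`, `tag_shingles`, `simhash`, and `hamming_distance`, the shingle length `k=3`, the use of MD5 as the per-feature hash, and the 64-bit fingerprint width are all illustrative assumptions, not details taken from the paper.

```python
import hashlib
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects opening tag names in order, ignoring text and attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_shingles(html, k=3):
    """Return overlapping runs of k consecutive tag names (assumed feature set)."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    tags = parser.tags
    if len(tags) < k:
        # Too few tags for a full shingle: use the whole sequence, if any.
        return [" ".join(tags)] if tags else []
    return [" ".join(tags[i:i + k]) for i in range(len(tags) - k + 1)]


def simhash(features, bits=64):
    """Generic SimHash: each feature votes +1 or -1 on every bit position."""
    counts = [0] * bits
    for feature in features:
        digest = hashlib.md5(feature.encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, count in enumerate(counts):
        if count > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    mail_a = "<html><body><table><tr><td>Buy now</td></tr></table><a href='x'>go</a></body></html>"
    mail_b = "<html><body><table><tr><td>Cheap pills</td></tr></table><a href='y'>here</a></body></html>"
    fp_a = simhash(tag_shingles(mail_a))
    fp_b = simhash(tag_shingles(mail_b))
    print(hamming_distance(fp_a, fp_b))
```

Because only the tag layout is hashed in this sketch, two messages that share a template but differ in text and link targets produce the same fingerprint. In practice a small Hamming-distance threshold would be chosen to decide when two fingerprints count as near-duplicates; that threshold is a tuning choice this sketch leaves open.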
