Old 11-27-2006, 10:02 AM   #1
Member (9 bit)
 
Join Date: Sep 2006
Posts: 393
How do SPAM filters actually work?

I have a couple of online email packages, Yahoo and GMail, and find myself getting very little SPAM at this point. Within Yahoo I consistently click on SPAM whenever a junk message comes in, and today I wondered:

If I haven't clicked on a message, how does a filter know an email is SPAM? Is it the number of recipients, a word or two in the body of the message, or what?

Of course most filters have their idiosyncrasies, but are there general rules for how SPAM filters actually work?
TallTravel is offline   Reply With Quote
Old 11-27-2006, 11:31 AM   #2
Come in Ray...
 
faulkner132's Avatar
 
Join Date: Sep 2004
Posts: 1,668
Since pretty much all spam follows the same pattern, the most effective filters implement some sort of "learning" Bayesian algorithm.
http://en.wikipedia.org/wiki/Bayes_Theorem
faulkner132 is offline   Reply With Quote
Old 11-27-2006, 11:51 AM   #3
Member (9 bit)
 
Join Date: Sep 2006
Posts: 393
Thanks much, but I was looking for a more understandable answer than the Bayesian theorems. It is rather tough to interpret the sort of thing on the site referenced above.

I sincerely appreciate the effort, however...

marginal probabilities of stochastic events, standardized likelihood, formulas up the wazoo, etc., etc.

If there were an English version I could understand that, since I speak a version of same....

I think this is saying that most SPAMMERS use the same tools, so anti-spam software builds rules for those tools in. What this does not answer is whether the text is important to filtering SPAM, or the number of recipients, or some combination of words....

My goal is simply to reduce SPAM to a minimum prior to a long trip during which I will be 3-5 days without checking my email.
TallTravel is offline   Reply With Quote
Old 11-27-2006, 12:14 PM   #4
Come in Ray...
 
faulkner132's Avatar
 
Join Date: Sep 2004
Posts: 1,668
In plain English, a Bayesian algorithm basically builds a dictionary of "good" and "bad" words, which is used to calculate a rating. The good and bad word dictionaries are built by classifying messages as spam or not spam (about 50 messages is a good starting size), with the text of each message stored in the respective dictionary. Then, when an incoming message is received, its text is scanned against the good and bad dictionaries and, depending on the rating, it is classified as spam or not spam.
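
To make the "good and bad dictionaries plus a rating" idea concrete, here is a minimal sketch in Python. It is a toy naive Bayes classifier, not how SpamBayes or any particular product is implemented; the class name `TinyBayesFilter`, the tokenizer, and the sample messages are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class TinyBayesFilter:
    """Toy naive Bayes spam filter (illustrative names, not a real product)."""

    def __init__(self):
        # The "bad" (spam) and "good" (ham) word dictionaries.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Store the words of one hand-classified message."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return a rating between 0 and 1; higher means more spam-like."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_msgs = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Prior: how common this class was in the training messages.
            score = math.log((self.message_counts[label] + 1) / (total_msgs + 2))
            for word in tokenize(text):
                # Add-one smoothing so an unseen word cannot zero out the score.
                score += math.log((counts[word] + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability, clamped to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700), -700)
        return 1 / (1 + math.exp(diff))
```

With a handful of training messages, `f.spam_probability("free money now")` comes out well above 0.5 and `f.spam_probability("team meeting tomorrow")` well below it. A real filter would train on far more mail and pick its own cutoff for what counts as spam.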
faulkner132 is offline   Reply With Quote
Old 11-27-2006, 08:13 PM   #5
Member (10 bit)
 
Kareeser's Avatar
 
Join Date: Mar 2006
Location: Toronto, Canada
Posts: 810
These may or may not work in conjunction with filters that block inline images, as spammers have tried to use images to circumvent filtering based on words.
Kareeser is offline   Reply With Quote
Old 11-27-2006, 09:24 PM   #6
Staff
Premium Member
 
mairving's Avatar
 
Join Date: Jul 1999
Location: Arlington, TN
Posts: 5,538
There is not just one spam filter. Several products that I have used in the past had 12-14 different methods of filtering. The 2 worst were Bayesian and Keyword since they yielded the most false positives. It is pretty much a process. First you generally look up the sending IP address against various blacklists. If the address is on a blacklist, the message gets deleted. Then you might check to see if the sending mail server is listed on an open relay list. Then you might use a SURBL list that will detect spam via the links in the email. If a link goes to a known spammer, the message is deleted. Then you have other filters like SPF, which detects forged sender domains. Of course, if you are large enough, then any spam that gets sent to a user account can be identified as spam and the rules changed or the sender blacklisted.

Despite all of the methods, when you are on the defense all the time you have a harder job, since you have to keep tweaking your filter.
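
The "it is pretty much a process" part can be sketched as a chain of checks that run in order, where the first check to object rejects the message. This is only an illustration in Python: the `Message` fields, the check functions, and the hard-coded sets are hypothetical stand-ins. Real filters query DNS-based blacklists, SURBL, and published SPF records rather than in-memory tables.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender_ip: str
    sender_domain: str
    links: list = field(default_factory=list)  # host names linked in the body


# Hypothetical data standing in for real blacklist and SPF lookups.
IP_BLACKLIST = {"203.0.113.7"}
LINK_BLACKLIST = {"pills.example"}
ALLOWED_SENDERS = {"example.org": {"198.51.100.10"}}


def check_ip(msg):
    """Stage 1: is the sending IP on a blacklist?"""
    return "sender IP is blacklisted" if msg.sender_ip in IP_BLACKLIST else None


def check_links(msg):
    """Stage 2: does any link point at a blacklisted host (SURBL-style)?"""
    for host in msg.links:
        if host in LINK_BLACKLIST:
            return f"links to blacklisted host {host}"
    return None


def check_sender(msg):
    """Stage 3: may this server send for the claimed domain (SPF-style)?"""
    allowed = ALLOWED_SENDERS.get(msg.sender_domain)
    if allowed is not None and msg.sender_ip not in allowed:
        return "sending server not authorized for domain"
    return None


CHECKS = [check_ip, check_links, check_sender]


def classify(msg):
    """Run the checks in order; the first one that objects rejects the message."""
    for check in CHECKS:
        reason = check(msg)
        if reason:
            return ("reject", reason)
    return ("accept", None)
```

For example, `classify(Message("203.0.113.7", "example.org"))` is rejected at the first stage, while a message from `198.51.100.10` with no bad links passes all three. Putting the cheap, low-false-positive checks first means most junk never reaches the slower content filters.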
__________________

Want to Make $$$$ with your Computer? No Risk! Simply press shift-4 four times in a row
mairving is offline   Reply With Quote
Old 11-27-2006, 10:42 PM   #7
Member (9 bit)
 
Join Date: Sep 2006
Posts: 393
Now THAT's a pretty darn comprehensive and simple explanation for the process. Thanks!
TallTravel is offline   Reply With Quote
Old 11-28-2006, 07:31 AM   #8
Come in Ray...
 
faulkner132's Avatar
 
Join Date: Sep 2004
Posts: 1,668
If you use Outlook (not Express), this is hands down the best spam prevention program.
http://spambayes.sourceforge.net/

It is a Bayesian algorithm (sorry mairv, I have to respectfully disagree with regard to them being the worst), and I've been using it here at work for over 6 months and I can count the number of false positives on one hand.
faulkner132 is offline   Reply With Quote
Old 11-28-2006, 08:34 AM   #9
Staff
Premium Member
 
mairving's Avatar
 
Join Date: Jul 1999
Location: Arlington, TN
Posts: 5,538
Obviously it depends on how the logic is constructed. Actually, the Keyword filters are the worst, but they are effective if you want to nuke emails with certain keywords. The overall idea is that you can't just use one type of filter nowadays.
mairving is offline   Reply With Quote
All times are GMT -5.