Saturday, March 24, 2007

Anti Spam techniques by Sakuraba

Anti-spam techniques

The US Department of Energy Computer Incident Advisory Capability (CIAC) has provided specific countermeasures against electronic mail spamming.


Some popular methods for filtering and refusing spam include e-mail filtering based on the content of the e-mail, DNS-based blackhole lists (DNSBL), greylisting, spamtraps, enforcing technical requirements, checksumming systems to detect bulk e-mail, and putting some sort of cost on the sender via a proof-of-work system or a micropayment. Each method has strengths and weaknesses, and each is controversial because of its weaknesses.


Detecting spam based on the content of the e-mail, either by matching keywords such as "viagra" or by statistical means, is very popular. Such filters can be very accurate when they are correctly tuned to the types of legitimate e-mail that an individual gets, but they can also make mistakes, such as detecting the keyword "cialis" in the word "specialist". Content also does not determine whether the e-mail was unsolicited or bulk, the two key features of spam. So if a friend sends you a joke that mentions "viagra", a content filter can easily mark it as spam even though it is both solicited and not bulk.
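
The "cialis" in "specialist" mistake can be illustrated with a small sketch. The function names here are hypothetical and the code is only an illustration of substring matching versus whole-word matching, not how any particular filter is implemented:

```python
import re

def naive_match(text, keyword):
    """Substring test: flags the keyword even inside a longer word."""
    return keyword.lower() in text.lower()

def word_match(text, keyword):
    """Whole-word test: the keyword must sit between word boundaries."""
    pattern = rf"\b{re.escape(keyword)}\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

message = "Please see a specialist about this."
print(naive_match(message, "cialis"))  # substring hit inside "specialist"
print(word_match(message, "cialis"))   # no whole-word hit
```

Whole-word matching avoids this particular false positive, but it still cannot tell whether a message was solicited, so the joke from a friend would be flagged either way.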


The most popular DNSBLs are lists of IP addresses of known spammers, open relays, zombie spammers, etc.
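
As a rough sketch of how a DNSBL is usually consulted: the octets of the sending IPv4 address are reversed and prepended to the list's DNS zone, and the resulting name is looked up. The zone `dnsbl.example.org` and the function names below are placeholders for illustration, not a real list or API:

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone="dnsbl.example.org"):
    """Build the DNS name to query for an IPv4 address (zone is a placeholder)."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip, zone="dnsbl.example.org"):
    """Return True if the query name resolves; a failed lookup means not listed."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False

print(dnsbl_query_name("192.0.2.99"))
```

A mail server would typically run a check like `is_listed` against the connecting address before accepting a message. Which lists to trust is a policy decision for the administrator.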


Spamtraps are often e-mail addresses that were never valid, or that have been invalid for a long time, and that are used to collect spam. An effective spamtrap is not announced and is found only by dictionary attacks or by harvesting addresses from hidden web pages. For a spamtrap to remain effective, the address must never be given to anyone. Some blacklists, such as SpamCop, use spamtraps to catch spammers and blacklist them.


Enforcing technical requirements of the Simple Mail Transfer Protocol (SMTP) can be used to block mail coming from systems that are not compliant with the RFC standards. Many spammers use poorly written software or are unable to comply with the standards because they do not have legitimate control of the computer sending the spam (a zombie computer). So by setting restrictions on the mail transfer agent (MTA), a mail administrator can reduce spam significantly. In many situations, simply requiring a valid fully qualified domain name (FQDN) in the SMTP EHLO (extended hello) command is enough to block 25% of incoming spam.
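
A minimal sketch of the EHLO check described above, assuming a purely syntactic test is acceptable. The function name `looks_like_fqdn` is hypothetical, and a real MTA would normally enforce this through its own configuration rather than custom code:

```python
import re

# One DNS label: 1 to 63 letters, digits or hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")

def looks_like_fqdn(helo_arg):
    """Syntactic check only: does the EHLO argument have the shape of an FQDN?"""
    name = helo_arg[:-1] if helo_arg.endswith(".") else helo_arg
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) < 2:          # a bare host name such as "localhost"
        return False
    if labels[-1].isdigit():     # looks like a raw IP address, not a domain name
        return False
    return all(_LABEL.match(label) for label in labels)

for arg in ("mail.example.com", "localhost", "192.168.1.5", "-bad.example.com"):
    print(arg, looks_like_fqdn(arg))
```

This only tests the shape of the name. It does not confirm that the name resolves in DNS or belongs to the connecting host, which stricter setups may also require.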


Obfuscating message content

Many spam-filtering techniques work by searching for patterns in the headers or bodies of messages. For instance, a user may decide that all e-mail he or she receives with the word "Viagra" in the subject line is spam, and instruct the mail program to automatically delete all such messages. To defeat such filters, the spammer may intentionally misspell commonly filtered words or insert other characters, as in the following examples:

V1agra
Via'gra
V I A G R A
Vaigra
\ /iagra
Vi@graa

The principle of this method is to leave the word readable to humans (whose pattern-recognition skills make them adept at picking out the true meaning of misspelled words), but not recognizable to a literal-minded computer program. This is effective up to a point. Eventually, filter patterns become generic enough to recognize the word "Viagra" no matter how misspelled -- or else they target the obfuscation methods themselves, such as insertion of punctuation into unusual places in a word.
(Note: Using most common variations, it is possible to spell "Viagra" in over 1,300,000,000,000,000,000,000 ways.)
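
One way a filter pattern can become "generic enough" is to allow look-alike characters and stray punctuation between letters. The sketch below is only an illustration: the `LOOKALIKES` table and the function name are assumptions made up for this example, not taken from any real filter.

```python
import re

# Hypothetical table of characters commonly swapped in for each letter.
LOOKALIKES = {
    "v": r"(?:v|\\\s*/)",   # "v", or a backslash and slash drawn as "\ /"
    "i": r"[i1!|]",
    "a": r"[a@4]",
    "g": r"[g9]",
}

def obfuscation_pattern(word):
    """Build a regex that tolerates substitutions, repeats and inserted separators."""
    parts = [LOOKALIKES.get(ch, re.escape(ch)) + "+" for ch in word.lower()]
    separator = r"[\W_]*"   # any run of punctuation or spaces between letters
    return re.compile(separator.join(parts), re.IGNORECASE)

pattern = obfuscation_pattern("viagra")
for sample in ["V1agra", "Via'gra", "V I A G R A", "Vaigra", "\\ /iagra", "Vi@graa"]:
    print(sample, bool(pattern.search(sample)))
```

Under these assumptions the pattern catches substitutions, doubled letters and inserted separators, but not the transposed "Vaigra". Catching that would need a different approach, such as edit distance, and a looser pattern also raises the risk of false positives.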


HTML-based e-mail gives the spammer more tools to obfuscate text. Inserting HTML comments between letters can foil some filters, as can including text made invisible by setting the font color to white on a white background, or shrinking the font size to the smallest fine print.
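
A filter can counter the comment trick by removing markup before it looks for keywords. The following is a minimal sketch using regular expressions. The function name is hypothetical, and regular expressions are not a robust way to parse HTML in general:

```python
import re

def strip_markup(html):
    """Drop HTML comments first, then any remaining tags, leaving the text."""
    without_comments = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    return re.sub(r"<[^>]+>", "", without_comments)

print(strip_markup("Buy Via<!-- filler -->gra <b>now</b>"))
```

This does nothing about white-on-white or tiny text. Detecting those would require interpreting the styling of the message, which is beyond this sketch.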


Another common ploy involves presenting the text as an image, which is either sent along with the message or loaded from a remote server. This can be foiled by not permitting the e-mail program to load images.


As Bayesian filtering has become popular as a spam-filtering technique, spammers have started using methods to weaken it. To a rough approximation, Bayesian filters rely on word probabilities. If a message contains many words which are only used in spam, and few which are never used in spam, it is likely to be spam. To weaken Bayesian filters, some spammers, alongside the sales pitch, now include lines of irrelevant, random words, in a technique known as Bayesian poisoning. A variant on this tactic may be borrowed from the Usenet abuser known as "Hipcrime" -- to include passages from books taken from Project Gutenberg, or nonsense sentences generated with "dissociated press" algorithms. Randomly generated phrases can create spamoetry (spam poetry) or spam art.
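
The effect of Bayesian poisoning can be shown with a toy word-probability scorer. This is a deliberately simplified sketch, not a real spam filter: the training messages, the smoothing choice and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter

def train(messages):
    """Count, for each word, how many messages in the class contain it."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts, len(messages)

def spam_log_odds(text, spam_model, ham_model):
    """Sum per-word log-likelihood ratios; a positive score leans towards spam."""
    spam_counts, n_spam = spam_model
    ham_counts, n_ham = ham_model
    score = math.log(n_spam / n_ham)
    for word in set(text.lower().split()):
        # Add-one smoothing so unseen words do not produce zero probabilities.
        p_spam = (spam_counts[word] + 1) / (n_spam + 2)
        p_ham = (ham_counts[word] + 1) / (n_ham + 2)
        score += math.log(p_spam / p_ham)
    return score

spam_model = train(["buy cheap pills now", "cheap pills online now"])
ham_model = train(["meeting notes for monday", "lunch on monday with notes"])

pitch = "buy cheap pills"
poisoned = pitch + " meeting notes monday lunch"
print(spam_log_odds(pitch, spam_model, ham_model))
print(spam_log_odds(poisoned, spam_model, ham_model))
```

With this tiny training set, padding the pitch with words that appear only in legitimate mail pulls the score from positive to negative. Truly random words would be unseen in both classes and would shift such a score far less, which is one reason poisoning tends to use ordinary text.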


Once nonsense subject lines came to be recognized as spam, the next trend in spam subjects began: Biblical passages. A program much like Mark V Shaney is fed Bible passages and chops them up into segments. The reasoning is that this text, written in a style very different from today's, will confuse both humans and spam filters.


Another method used to disguise spam as legitimate messages is the use of autogenerated sender names in the From: field, ranging from realistic ones such as "Jackie F. Bird" to (either by mistake or intentionally) bizarre, attention-grabbing names such as "Sloppiest U. Epiglottis" or "Attentively E. Behavioral". Return addresses are also routinely auto-generated.


About the Author
For more information, please visit http://antispam.awardspace.com or http://antispam.php0h.com/ or http://en.wikipedia.org
