Spam Classification

Written by: Ted Highway. Published: September 2007.




Through the use of classification techniques and forensic data gathering, we can identify specific spam groups. In some cases the identification can include a specific individual; in other cases, groups of e-mails can be positively linked to the same unspecified group. Forensic tools and techniques can allow the identification of group attributes, such as nationality, left- or right-handedness, operating system preferences, and operational habits.

Spam Organization

There are two key items for identifying individual spammers or specific spam groups: the bulk-mailing tool and the spammer’s operational habits. People who send spam generally send millions of e-mails at a time. To maintain the high volume of e-mail generation, spammers use bulk-mailing tools. These tools generate unique e-mail headers and e-mail attributes that can be used to distinguish e-mail generated by different mailing tools. Although some bulk-mailing tools do permit randomized header values, field ordering, and the like, the set of items that can be randomized and the set of random values are still limited to specific data subsets.
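
As a rough illustration of how header attributes can separate mail from different tools, the sketch below groups messages by the order of their header fields. It is only one possible fingerprint, not a method taken from the article: the function names and the sample messages are invented for this example, and a real fingerprint would also consider header values and formatting.

```python
from collections import defaultdict
from email import message_from_string


def header_fingerprint(raw_message):
    """Return the header field names, lower-cased, in the order they appear."""
    msg = message_from_string(raw_message)
    return tuple(name.lower() for name in msg.keys())


def group_by_fingerprint(raw_messages):
    """Bucket messages whose header fields appear in exactly the same order."""
    groups = defaultdict(list)
    for raw in raw_messages:
        groups[header_fingerprint(raw)].append(raw)
    return dict(groups)


# Hypothetical sample messages, invented for illustration only.
SAMPLE_A = "From: a@example.com\nTo: b@example.com\nSubject: one\n\nbody\n"
SAMPLE_B = "From: c@example.com\nTo: d@example.com\nSubject: two\n\nbody\n"
SAMPLE_C = "Subject: three\nTo: e@example.com\nFrom: f@example.com\n\nbody\n"

if __name__ == "__main__":
    for fingerprint, members in group_by_fingerprint(
        [SAMPLE_A, SAMPLE_B, SAMPLE_C]
    ).items():
        print(fingerprint, len(members))
```

Here the first two samples land in the same bucket because their fields are ordered identically, while the third forms a bucket of its own.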

More important than the mailing tool is the fact that spammers are people, and people act consistently (until they need to change). They will use the same tools, the same systems, and the same feature subsets in the same order every time they do their work.

Most spammers also appear to be cheap, which simplifies the identification process. Although there are commercial bulk-mailing tools, most are very expensive. Spammers would rather create their own tools or pay someone to create a cheaper tool for them. Custom tools may have a limited distribution, but different users will use the tools differently. For example, Secure Science Corporation (SSC), a San Diego, California-based technology research company, has a forensic research tool that generates a unique header that is used in a unique way, which in many cases makes it easy to sort and identify e-mails.

There are many different types of spam, and identifying an individual or group from an unsorted collection is very difficult. But there are things we can do to filter the spam. For example, a significant number of spam messages have capital-letter hash busters (strings of random characters added to defeat filters that compare message hashes) located at the end of the subject line. So, we can sort the spam and look only at messages with capital-letter subject hash busters.
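
The filtering step can be sketched with a regular expression. This is a heuristic under stated assumptions, not the article's actual tooling: it treats a trailing run of four or more capital letters as a hash buster, and it skips subjects written entirely in capitals, where the last word is probably ordinary text. The pattern, the threshold of four letters, and the sample subjects are all invented for illustration.

```python
import re

# Assumption: a hash buster is a final token of 4+ capital letters.
HASH_BUSTER = re.compile(r"^(?P<text>.*\S)\s+(?P<buster>[A-Z]{4,})\s*$")


def split_hash_buster(subject):
    """Return (subject_text, hash_buster), or (subject, None) if none is found."""
    match = HASH_BUSTER.match(subject)
    # An all-caps subject would make every last word look like a hash buster.
    if match is None or match.group("text").isupper():
        return subject.strip(), None
    return match.group("text"), match.group("buster")


def with_hash_busters(subjects):
    """Keep only the subjects that end in a capital-letter hash buster."""
    return [s for s in subjects if split_hash_buster(s)[1] is not None]


if __name__ == "__main__":
    samples = [
        "Lower your mortgage rate QXZJKW",  # hypothetical spam subject
        "Meeting notes for Tuesday",
        "FREE OFFER TODAY",
    ]
    print(with_hash_busters(samples))
```

A heuristic like this will produce false positives (for instance, a subject ending in a long acronym), so it is better suited to sorting a spam corpus than to deciding whether a message is spam.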

By sorting the spam based on specific features, we can detect some organization. We can further examine these e-mails and look for additional common attributes. For example, a significant number of spam messages have a Date header with a time zone of -1700. Real time zone offsets range only from -1200 to +1400, so there is no -1700 time zone anywhere on Earth, and this becomes a unique attribute that can be used to further organize the spam.
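
A check for impossible time zones might look like the following sketch. It reads the numeric offset directly from the Date header text and compares it against the range of offsets in real use (UTC-12:00 to UTC+14:00). The function names and sample headers are hypothetical.

```python
import re

# Matches a numeric zone such as "-0800" or "+0530" in a Date header value.
OFFSET = re.compile(r"\s([+-])(\d{2})(\d{2})(?:\s|$)")

MIN_OFFSET_MINUTES = -12 * 60  # UTC-12:00
MAX_OFFSET_MINUTES = 14 * 60   # UTC+14:00


def utc_offset_minutes(date_header):
    """Return the claimed UTC offset in minutes, or None if no zone is present."""
    match = OFFSET.search(date_header)
    if match is None:
        return None
    sign = -1 if match.group(1) == "-" else 1
    return sign * (int(match.group(2)) * 60 + int(match.group(3)))


def has_impossible_zone(date_header):
    """True when the header claims an offset that no real time zone uses."""
    minutes = utc_offset_minutes(date_header)
    if minutes is None:
        return False
    return not (MIN_OFFSET_MINUTES <= minutes <= MAX_OFFSET_MINUTES)


if __name__ == "__main__":
    print(has_impossible_zone("Tue, 04 Sep 2007 10:15:00 -1700"))
    print(has_impossible_zone("Tue, 04 Sep 2007 10:15:00 -0800"))
```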

Based on the results of this minimal organization, we can identify specific attributes of the spammer:

■ The hash buster is nearly always connected to the subject.

■ The subject typically does not end with punctuation. However, if punctuation is included, it is usually an exclamation point.

■ The messages are all roughly the same length (between 50 and 140 lines, which is short compared to most spam messages).

■ Every one of the forged e-mail addresses claims to come from yahoo.com.

■ Every one of the fake account names appears to be repetitive letters followed by a number. In particular, the letters are predominantly from the left-hand side of the keyboard. This particular bulk-mailing tool requires the user to specify the fake account name. This can be done in one of two ways: the user can either import a database of names or type them in by hand. In this case, the user is drumming his or her left hand on the keyboard (bcvbcv and cxzxca indicate finger drumming). With the right hand on the mouse, the user clicked the Enter key. Since the user’s right hand is on the mouse, the user is very likely right-handed.
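
The left-hand observation in the last item can be approximated by measuring what fraction of the letters in an account name fall on the left half of a keyboard. The sketch below assumes a standard QWERTY layout with conventional touch-typing hand assignments, which the article does not state. The function name and the example address "johnny7@yahoo.com" are invented, while bcvbcv and cxzxca come from the text above.

```python
# Assumption: QWERTY layout, letters conventionally typed with the left hand.
LEFT_HAND_LETTERS = set("qwertasdfgzxcvb")


def left_hand_ratio(address):
    """Fraction of letters in the account name typed with the left hand.

    Digits and punctuation are ignored; returns 0.0 if there are no letters.
    """
    local_part = address.split("@", 1)[0].lower()
    letters = [ch for ch in local_part if ch.isalpha()]
    if not letters:
        return 0.0
    left = sum(1 for ch in letters if ch in LEFT_HAND_LETTERS)
    return left / len(letters)


if __name__ == "__main__":
    for name in ("bcvbcv42@yahoo.com", "cxzxca7@yahoo.com", "johnny7@yahoo.com"):
        print(name, left_hand_ratio(name))
```

A single high ratio proves nothing, since plenty of ordinary names lean left. The signal is a whole batch of forged names that consistently score near 1.0.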

Although this spammer sends spam daily, he does take an occasional day off: for example, Thanksgiving, New Year’s Eve, the Fourth of July, a few days after Christmas, and every Raiders home game. Even though this spammer always relays through open SOCKS servers that could be located anywhere in the world, we know that the spammer is located in the United States. We can even identify the region as the Los Angeles basin, with annual travel in the spring to Chicago (for one to two months) and in the fall to Mexico City (for one to two weeks).

The main items that help in this identification are:

■ Bulk-mailing tool identification: This does not necessarily mean identifying the specific tool; rather, this is the identification of unique mailing attributes found in the e-mail header.

■ Feature subsets: Items such as hash busters (format and location), content attributes (spelling errors, grammar), and unique feature subsets from the bulk-mailing tool.

■ Sending methods: Does the spammer use open relays or compromised hosts? Is there a specific time of day that the sender prefers?

The result from this classification is a profile of the spammer and/or his spamming group.
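
One piece of such a profile, the sender's preferred time of day, can be estimated by tallying the hour in each message's Date header. This is a minimal sketch and not the article's procedure: it trusts the Date header, which a spammer can forge, and a more careful analysis would use the timestamps in Received headers added by trusted servers. The function name and sample dates are hypothetical.

```python
from collections import Counter
from datetime import timezone
from email.utils import parsedate_to_datetime


def sending_hours_utc(date_headers):
    """Count messages per hour of day (UTC) from a list of Date header values."""
    hours = Counter()
    for value in date_headers:
        sent = parsedate_to_datetime(value)
        if sent.tzinfo is None:
            # No usable zone in the header; skip rather than guess.
            continue
        hours[sent.astimezone(timezone.utc).hour] += 1
    return hours


if __name__ == "__main__":
    # Hypothetical Date header values, invented for illustration.
    sample_dates = [
        "Tue, 04 Sep 2007 10:15:00 -0800",
        "Wed, 05 Sep 2007 10:45:00 -0800",
        "Thu, 06 Sep 2007 23:05:00 +0000",
    ]
    print(sending_hours_utc(sample_dates).most_common(1))
```

The same tally, keyed by calendar date instead of hour, would expose the gaps in sending that the article uses to infer days off.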

Classification Techniques

After we identify and profile individual spam groups, we can discern their intended purpose. To date, there are eight specific top-level spam classifications, including these four:

■ Unsolicited commercial e-mail (UCE): This type is generated by a true company trying to contact existing or potential customers. True UCE is extremely rare, accounting for less than one-tenth of 1 percent of all spam. (If all UCE were to vanish today, nobody would notice.)

■ Nonresponsive commercial e-mail (NCE): NCE is sent by a true company that continues to contact a user after being told to stop. The key differences between UCE and NCE are (1) the user initiated contact and (2) the user later opted out from future communication. Even though the user opted out, the NCE mailer will continue to contact the user. NCE is only a problem to people who subscribe to many services, purchase items online, or initiate contact with the NCE company.

■ List makers: These are spam groups that make money by harvesting e-mail addresses and then using the list for profit, such as by selling the list to other spammers or marketing agencies.

■ Scams: Scams constitute the majority of spam. The goal of the scam is to acquire valuable assets through misrepresentation. Subsets under scams include 419 (“Nigerian-style” scams), malware, and phishing.

Phishing

Phishing is a subset of the scam category. Phishers represent themselves as respected companies (the target) to acquire customer accounts, information, or access privileges. Through the classification techniques just described, we can identify specific phishing groups. The key items for identification include:

■ Bulk-mailing tool identification and features

■ Mailing habits, including, but not limited to, their specific patterns and schedules

■ Types of systems used for sending the spam (e-mail origination host)

■ Types of systems used for hosting the phishing server

■ Layout of the hostile phishing server, including the use of HTML, JS, PHP, and other scripts

To date, according to SSC, there are an estimated four dozen phishing groups worldwide, with more than half the groups targeting customers in the United States.
