APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Analyzing some affiliate marketing email spam for Bayesian poisoning


2012/10/11

A large percentage of the spam email we get is from affiliate marketing. That is, someone who wants to sell something offers a commission to people who send them buyers. People hoping to make money post affiliate links on their websites (I do that here for a number of products, but mostly for tech-related books I have reviewed or mentioned), and others send out blind mailings to you, me and everyone else whose email address can be found, bought or stolen.

If you'd like to know more about how those folks work, try these links: "Behind the Curtain of an Affiliate Marketing Spam Email" and "How Spammers get paid". My purpose here is to dive into the actual mail they send.

21 Spam emails

I selected these 21 recent spam emails because they were all nearly identical.

[Image: Credit score basic spam attempt]

The picture above is the basic idea: check your credit score. It's likely to be an affiliate link because of the format of the links you'd click on:

<a href="http://attention.bluecreditamerica.com/
global-stateselection.action7719906924c752301904117172512_bankamerica.review"
><img src="http://attention.bluecreditamerica.com/7734837117a/297643760.jpg"
 border="0"></a>
 

All of these are basically the same - the links and the affiliate code (probably the "7719906924c752301904117172512" string in this case) differ, but plainly each of these was taken from a template. The spammer can and usually will add to that template, but that first example was quite plain.
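To compare the links across messages, something like the following could pull out the hostname and the long token from each href. This is only a sketch: the function name `split_link` is made up here, and the rule "first run of 20 or more digits and lowercase hex letters in the path" is an assumption based on this one example, not a known format.

```python
import re
from urllib.parse import urlparse

# The href from the example above, rejoined onto one line.
href = (
    "http://attention.bluecreditamerica.com/"
    "global-stateselection.action7719906924c752301904117172512_bankamerica.review"
)

def split_link(url):
    """Return (hostname, candidate tracking token or None)."""
    parsed = urlparse(url)
    # Assumption: the token is the first run of 20 or more digits or
    # lowercase hex letters in the path. Other templates may differ.
    match = re.search(r"[0-9a-f]{20,}", parsed.path)
    return parsed.hostname, match.group(0) if match else None

host, token = split_link(href)
print(host)
print(token)
```

Running the same function over every message would make it easy to see whether any two share a token.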

It also got tagged as spam, gathering a score of 8.8 on my system:

X-Spam-Status: Yes, hits=8.8 required=5.0
        tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
        T_REMOTE_IMAGE: 0.01,T_SURBL_MULTI1: 0.01,URIBL_AB_SURBL: 4.499,
        URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,TOTAL_SCORE: 8.804,autolearn=no
X-Spam-Flag: YES
 

Note that it was the URL blacklists that were primarily responsible for that score.
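To put a number on that, the header can be split into rule/score pairs and summed by type. This is a rough sketch that assumes the "NAME: score" layout shown above and treats any rule whose name contains URIBL or SURBL as a URL blacklist lookup. The helper names are invented for the example.

```python
import re

header = """X-Spam-Status: Yes, hits=8.8 required=5.0
        tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
        T_REMOTE_IMAGE: 0.01,T_SURBL_MULTI1: 0.01,URIBL_AB_SURBL: 4.499,
        URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,TOTAL_SCORE: 8.804,autolearn=no"""

def rule_scores(status):
    """Map each rule name to its score, leaving out the TOTAL_SCORE entry."""
    pairs = re.findall(r"([A-Z][A-Z0-9_]*): (-?\d+(?:\.\d+)?)", status)
    return {name: float(score) for name, score in pairs if name != "TOTAL_SCORE"}

def is_blacklist_rule(name):
    # Assumption: names containing URIBL or SURBL are URL blacklist lookups.
    return "URIBL" in name or "SURBL" in name

scores = rule_scores(header)
blacklist = sum(v for k, v in scores.items() if is_blacklist_rule(k))
other = sum(v for k, v in scores.items() if not is_blacklist_rule(k))
print(f"blacklist rules: {blacklist:.3f}")
print(f"everything else: {other:.3f}")
```

For this message the blacklist rules contribute about 8.16 of the 8.80 points. Without them it would have scored well under the 5.0 threshold.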

In this next example, the spammer has added text:

[Image: Attempting Bayesian poisoning]

Interestingly, when there was text added, it was all taken from the same source, a KnowledgeTree forum. Different amounts of text were used, apparently because it was extracted at different dates, suggesting that this part might also be part of the template. Here's one with much more text:

[Image: Large amount of poisoning]

This table shows the variations. Note that some of the IPs are likely in the same subnet, so those could be one spammer running multiple outbound mail senders. None of the affiliate IDs match, though, so they may not be.

Name          Date          Line Count  Word Count  Spam Score  URL                                 IP
00000010.eml  Oct 3 20:11   213         925         8.8         rewards.credit-clocking.com         207.150.177.130
00000020.eml  Oct 4 13:53   98          262         5.7         nest.habitchangingmodels.com        216.99.147.174
00000030.eml  Oct 4 15:42   95          255         3.7         trys.outerspacedocs.net             216.246.20.192
0000004e.eml  Oct 5 10:51   211         917         4.2         platinum.thecreditomni.com          184.82.124.67
0000003a.eml  Oct 5 11:53   97          260         5.7         brack.megabyteconsumers.org         198.12.66.88
0000004c.eml  Oct 5 13:13   95          252         2.0         meter.brainstormcheckins.org        108.163.209.140
00000050.eml  Oct 6 11:53   97          265         5.4         calc.postpapercouponmarket.com      184.154.186.172
00000059.eml  Oct 6 14:51   212         923         2.5         register.capital-omni.com           199.38.243.74
0000005c.eml  Oct 6 12:51   94          250         1.7         phrase.secondarymediaplatforms.net  216.99.147.170
00000066.eml  Oct 6 17:56   214         934         5.6         urgent.omincapital.com              184.82.124.67
00000067.eml  Oct 6 18:21   214         936         8.8         urgent.credit-omni.com              216.99.159.142
000000bc.eml  Oct 6 19:01   211         918         4.2         notification.omnicreditscore.com    184.82.169.230
000000b6.eml  Oct 7 10:12   1082        3992        4.2         quorum.credit-usa-financial.com     75.102.10.195
000000b2.eml  Oct 7 14:53   1082        3992        4.2         application.bankusafinancial.com    199.38.243.77
000000ac.eml  Oct 8 09:08   1082        3988        0.6         attention.creditusabusiness.com     75.102.10.190
000000ab.eml  Oct 8 09:11   1082        3988        0.6         application.bluecreditnational.com  184.82.169.227
000000aa.eml  Oct 8 09:34   1082        3988        0.6         attention.bluecreditamerica.com     199.38.243.70
000000a5.eml  Oct 8 13:47   170         585         1.8         www.expresshomesections.com         184.82.169.99
000000a8.eml  Oct 8 12:02   1093        3993        2.4         application.bluecreditnational.com  184.82.169.227
0000007d.eml  Oct 10 11:10  1082        3995        2.6         quorum.bluecreditusa.com            216.99.159.136
000000bd.eml  Oct 10 14:47  1083        3997        4.3         review.creditblueusa.com            75.102.10.189
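The "same subnet" guess is easy to check mechanically. The sketch below groups the sending IPs from the table by /24 network using Python's ipaddress module. The /24 prefix length is my assumption, since the real allocations could be larger or smaller, and `group_by_prefix` is just a name chosen for the example.

```python
import ipaddress
from collections import defaultdict

# Sending IPs from the table above (duplicates removed).
ips = [
    "207.150.177.130", "216.99.147.174", "216.246.20.192", "184.82.124.67",
    "198.12.66.88", "108.163.209.140", "184.154.186.172", "199.38.243.74",
    "216.99.147.170", "216.99.159.142", "184.82.169.230", "75.102.10.195",
    "199.38.243.77", "75.102.10.190", "184.82.169.227", "199.38.243.70",
    "184.82.169.99", "216.99.159.136", "75.102.10.189",
]

def group_by_prefix(addresses, prefix_len=24):
    """Group addresses by the network they fall in at the given prefix length."""
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False accepts a host address and returns its enclosing network.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(net)].append(addr)
    return dict(groups)

# Keep only the networks that more than one sender falls into.
shared = {net: members for net, members in group_by_prefix(ips).items() if len(members) > 1}
for net, members in sorted(shared.items()):
    print(net, members)
```

Sharing a /24 does not prove a common operator, since these could simply be neighbouring customers of the same hosting company.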

Why are they adding all that text? They are trying to "poison" the Bayes database.

Bayes poisoning

As explained in Wikipedia's Bayesian poisoning article:


The spammer hopes that the addition of random (or even carefully selected) words that are unlikely to appear in a spam message will cause the spam filter to believe the message to be legitimate.
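A toy calculation shows why padding could help the spammer. The sketch below combines per-token spam probabilities with the textbook naive Bayes formula. The probabilities are invented for illustration, and real filters differ in how they choose and combine tokens, so treat this as the general idea only.

```python
import math

def combined_spam_probability(token_probs):
    """Naive Bayes combination of per-token spam probabilities.

    Computes prod(p) / (prod(p) + prod(1 - p)) in log space so that
    long messages do not underflow.
    """
    log_spam = sum(math.log(p) for p in token_probs)
    log_ham = sum(math.log(1.0 - p) for p in token_probs)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))

# Hypothetical per-token probabilities, made up for this example.
spammy = [0.95, 0.90, 0.85]  # words from the sales pitch
padding = [0.20] * 10        # words lifted from a technical forum

plain = combined_spam_probability(spammy)
padded = combined_spam_probability(spammy + padding)
print(f"pitch only:   {plain:.4f}")
print(f"with padding: {padded:.4f}")
```

Each innocent-looking word pulls the combined figure down a little, and enough of them can outweigh a handful of strongly spammy words.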


Look at the spam scores for the messages that added the most text (the "Word Count" column). It looks like that might work, right?

Maybe. I'm not really sure, because the ones that did get high spam scores got them because of blacklists. Here are the actual scores:

==> 00000010.eml < (925 words)==
X-Spam-Status: Yes, hits=8.8 required=5.0
	tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
	T_REMOTE_IMAGE: 0.01,T_SURBL_MULTI1: 0.01,URIBL_AB_SURBL: 4.499,
	URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,TOTAL_SCORE: 8.804,autolearn=no
X-Spam-Flag: YES

==> 00000020.eml < (262 words)==
X-Spam-Status: Yes, hits=5.7 required=5.0
	tests=HTML_IMAGE_ONLY_12: 1.629,HTML_MESSAGE: 0.001,HTML_SHORT_LINK_IMG_1: 0.139,
	T_URIBL_SEM_FRESH: 0.01,T_URIBL_SEM_FRESH_10: 0.01,T_URIBL_SEM_FRESH_15: 0.01,
	URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,URIBL_RHS_DOB: 0.276,
	TOTAL_SCORE: 5.723,autolearn=no

==> 00000030.eml < (255 words)==
X-Spam-Status: No, hits=3.7 required=5.0
	tests=HTML_IMAGE_ONLY_12: 1.629,HTML_MESSAGE: 0.001,HTML_SHORT_LINK_IMG_1: 0.139,
	T_URIBL_SEM_FRESH: 0.01,T_URIBL_SEM_FRESH_10: 0.01,T_URIBL_SEM_FRESH_15: 0.01,
	URIBL_DBL_SPAM: 1.7,URIBL_RHS_DOB: 0.276,TOTAL_SCORE: 3.775,autolearn=no

==> 0000003a.eml < (260 words)==
X-Spam-Status: Yes, hits=5.7 required=5.0
	tests=HTML_IMAGE_ONLY_12: 1.629,HTML_MESSAGE: 0.001,HTML_SHORT_LINK_IMG_1: 0.139,
	T_URIBL_SEM_FRESH_10: 0.01,T_URIBL_SEM_FRESH_15: 0.01,URIBL_DBL_SPAM: 1.7,
	URIBL_JP_SURBL: 1.948,URIBL_RHS_DOB: 0.276,TOTAL_SCORE: 5.713,autolearn=no
X-Spam-Flag: YES

==> 0000004c.eml < (252 words) ==
X-Spam-Status: No, hits=2.0 required=5.0
	tests=HTML_IMAGE_ONLY_12: 1.629,HTML_MESSAGE: 0.001,HTML_SHORT_LINK_IMG_1: 0.139,
	T_URIBL_SEM_FRESH_10: 0.01,T_URIBL_SEM_FRESH_15: 0.01,URIBL_RHS_DOB: 0.276,
	TOTAL_SCORE: 2.065,autolearn=no

==> 0000004e.eml < (917 words)==
X-Spam-Status: No, hits=4.2 required=5.0
	tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
	T_REMOTE_IMAGE: 0.01,URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,
	TOTAL_SCORE: 4.295,autolearn=no

==> 00000050.eml < (265 words)==
X-Spam-Status: Yes, hits=5.4 required=5.0
	tests=HTML_IMAGE_ONLY_12: 1.629,HTML_MESSAGE: 0.001,HTML_SHORT_LINK_IMG_1: 0.139,
	T_URIBL_SEM_FRESH_10: 0.01,T_URIBL_SEM_FRESH_15: 0.01,URIBL_DBL_SPAM: 1.7,
	URIBL_JP_SURBL: 1.948,TOTAL_SCORE: 5.437,autolearn=no
X-Spam-Flag: YES

==> 00000059.eml < (923 words)==
X-Spam-Status: No, hits=2.5 required=5.0
	tests=AWL: -1.799,HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,
	MIME_HTML_ONLY: 0.001,T_REMOTE_IMAGE: 0.01,T_URIBL_SEM_FRESH: 0.01,
	T_URIBL_SEM_FRESH_10: 0.01,T_URIBL_SEM_FRESH_15: 0.01,URIBL_DBL_SPAM: 1.7,
	URIBL_JP_SURBL: 1.948,TOTAL_SCORE: 2.526,autolearn=no

==> 0000005c.eml < (250 words)==
X-Spam-Status: No, hits=1.7 required=5.0
	tests=HTML_IMAGE_ONLY_12: 1.629,HTML_MESSAGE: 0.001,HTML_SHORT_LINK_IMG_1: 0.139,
	T_URIBL_SEM_FRESH_10: 0.01,T_URIBL_SEM_FRESH_15: 0.01,TOTAL_SCORE: 1.789,autolearn=no
Received: from phrase.secondarymediaplatforms.net ([216.99.147.170])

==> 00000066.eml < (934 words)==
X-Spam-Status: Yes, hits=5.6 required=5.0
	tests=AWL: -3.120,HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,
	MIME_HTML_ONLY: 0.001,T_REMOTE_IMAGE: 0.01,T_SURBL_MULTI1: 0.01,
	URIBL_AB_SURBL: 4.499,URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,
	TOTAL_SCORE: 5.684,autolearn=no

==> 00000067.eml < (936 words)==
X-Spam-Status: Yes, hits=8.8 required=5.0
	tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
	T_REMOTE_IMAGE: 0.01,T_SURBL_MULTI1: 0.01,T_URIBL_SEM_FRESH: 0.01,
	T_URIBL_SEM_FRESH_10: 0.01,T_URIBL_SEM_FRESH_15: 0.01,URIBL_AB_SURBL: 4.499,
	URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,TOTAL_SCORE: 8.834,autolearn=no

==> 0000007d.eml < (3995 words)==
X-Spam-Status: No, hits=2.6 required=5.0
	tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
	URIBL_DBL_SPAM: 1.7,URIBL_RHS_DOB: 0.276,TOTAL_SCORE: 2.613,autolearn=no
Received: from quorum.bluecreditusa.com ([216.99.159.136])

==> 000000a5.eml < (585 words)==
X-Spam-Status: No, hits=1.8 required=5.0
	tests=HS_INDEX_PARAM: 0.023,HTML_MESSAGE: 0.001,MIME_HTML_ONLY: 0.001,
	MISSING_MID: 0.14,T_URIBL_SEM_FRESH_10: 0.01,T_URIBL_SEM_FRESH_15: 0.01,
	URIBL_DBL_SPAM: 1.7,TOTAL_SCORE: 1.885,autolearn=no

==> 000000a8.eml < (3993 words)==
X-Spam-Status: No, hits=2.4 required=5.0
	tests=AWL: -1.824,HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,
	MIME_HTML_ONLY: 0.001,URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,
	TOTAL_SCORE: 2.461,autolearn=no

==> 000000aa.eml < (3988 words)==
X-Spam-Status: No, hits=0.6 required=5.0
	tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
	TOTAL_SCORE: 0.637,autolearn=no
Received: from attention.bluecreditamerica.com ([199.38.243.70])

==> 000000ab.eml < (3988 words)==
X-Spam-Status: No, hits=0.6 required=5.0
	tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
	TOTAL_SCORE: 0.637,autolearn=no
Received: from application.bluecreditnational.com ([184.82.169.227])

==> 000000ac.eml < (3988 words)==
X-Spam-Status: No, hits=0.6 required=5.0
	tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
	TOTAL_SCORE: 0.637,autolearn=no
Received: from attention.creditusabusiness.com ([75.102.10.190])

==> 000000b2.eml < (3992 words)==
X-Spam-Status: No, hits=4.2 required=5.0
	tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
	URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,TOTAL_SCORE: 4.285,autolearn=no
Received: from application.bankusafinancial.com ([199.38.243.77])

==> 000000b6.eml < (3992 words)==
X-Spam-Status: No, hits=4.2 required=5.0
	tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
	URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,TOTAL_SCORE: 4.285,autolearn=no
Received: from quorum.credit-usa-financial.com ([75.102.10.195])

==> 000000bc.eml < (918 words)==
X-Spam-Status: No, hits=4.2 required=5.0
	tests=HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,MIME_HTML_ONLY: 0.001,
	T_REMOTE_IMAGE: 0.01,URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,
	TOTAL_SCORE: 4.295,autolearn=no

==> 000000bd.eml < (3997 words)==
X-Spam-Status: No, hits=4.3 required=5.0
	tests=AWL: 0.046,HTML_MESSAGE: 0.001,HTML_MIME_NO_HTML_TAG: 0.635,
	MIME_HTML_ONLY: 0.001,URIBL_DBL_SPAM: 1.7,URIBL_JP_SURBL: 1.948,
	TOTAL_SCORE: 4.331,autolearn=no
 

Notice that AWL is also sometimes lowering scores and sometimes raising them. It's hard to tell if these poisoning attempts work, but it's certainly obvious that the senders think they will!
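One rough way to look at the question is to average the scores by message length, using the word counts and scores from the table above. The band cut-offs below are arbitrary choices of mine, and `band` and `summarize` are names made up for the sketch.

```python
from collections import defaultdict

# (word count, spam score) pairs copied from the table above.
messages = [
    (925, 8.8), (262, 5.7), (255, 3.7), (917, 4.2), (260, 5.7), (252, 2.0),
    (265, 5.4), (923, 2.5), (250, 1.7), (934, 5.6), (936, 8.8), (918, 4.2),
    (3992, 4.2), (3992, 4.2), (3988, 0.6), (3988, 0.6), (3988, 0.6),
    (585, 1.8), (3993, 2.4), (3995, 2.6), (3997, 4.3),
]

def band(words):
    # Assumption: these cut-offs are just a convenient way to split the sample.
    if words < 500:
        return "under 500 words"
    if words < 2000:
        return "500 to 1999 words"
    return "2000 words or more"

def summarize(rows, threshold=5.0):
    """Return {band: (count, mean score, number at or above threshold)}."""
    grouped = defaultdict(list)
    for words, score in rows:
        grouped[band(words)].append(score)
    return {
        name: (len(s), sum(s) / len(s), sum(1 for x in s if x >= threshold))
        for name, s in grouped.items()
    }

summary = summarize(messages)
for name, (count, mean, flagged) in summary.items():
    print(f"{name}: n={count}, mean score={mean:.2f}, flagged={flagged}")
```

None of the eight longest messages reached the 5.0 threshold, but with only 21 samples and the blacklist rules dominating the totals, that is suggestive at best.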

Certainly I might want to add these senders to my own blacklists, but they only use these IPs for a short time, so that might not be a smart idea. Fighting spam is a never-ending task.




© Anthony Lawrence


