APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

SpamCheetah


Some material is very old and may be incorrect today

© August 2009 Girish Venkatachalam

Girish Venkatachalam is a UNIX hacker with more than a decade of networking and crypto programming experience. His hobbies include yoga, cycling and cooking, and he runs his own business. Details here:

http://gayatri-hitech.com
http://spam-cheetah.com


The genesis of the SpamCheetah spam buster

How did I end up doing this?

First off, I should thank Tony for letting me write about something close to my heart. I believe deeply that spam is a problem all of us should get together and fight. Many of us feel strongly about dirty spammers and never want spam in our inboxes. Neither do we want to lose important mail at the altar of spam control.

E-mail has been around for quite some time. Electronic messaging has existed since the early days of the Internet. But the world has come a long way since then, and businesses have realized the low cost and large audience that e-mail gives them. The e-mail protocols grew up in the early culture of the Internet, when everyone trusted the computers and the people on the network and nobody knew that the Internet would become as widespread as it is today. People were lax about security, and consequently neither the Internet nor the e-mail protocols were designed with authentication or verification built in.

E-mail has become not only a communication tool but also a very effective business tool. Businesses rely on e-mail for everything, and among the things that directly affect their bottom line, e-mail efficiency ranks very high. That being the case, companies hate losing employee productivity to unwanted mail landing in inboxes. In fact, it is hard to predict or measure the productivity lost to porn mails or messages about Nigerian widows and bumper lotteries.

All said and done, we all know the kind of problem that spam causes. But my motivation to develop a spam control product does not have much to do with what I wrote above. I knew it had potential, but I wanted a product to start my business with. I wanted something that I thought would sell. That is all. It just happened to be this. My interests are wide and I would have got into anything where I could add value.

Anyway, I have now spent a good part of the last two years developing this product, and even before that I had started learning the various spam control techniques. After evaluating a lot of the theory I felt that something was quite wrong with statistical filtering and content scanning. Though Zdziarski wrote eloquently about how well his DSPAM filter worked, I was not quite impressed.

There were many different mechanisms, such as distributed checksum computation (the idea behind Vipul's Razor) and advanced Bayesian filtering. Paul Graham's articles and much other literature on the Internet gave me a sound understanding of the ways in which people attacked this problem. I was confused for a long time about what greylisting meant and how it was implemented in OpenBSD.

I was way off the mark in the beginning, and I sent a mail to the mailing list asking some embarrassingly basic questions. But I got very detailed answers from one of the developers, Nick Holland. A user also testified to how effective OpenBSD greylisting was in his network. Later he also posted a neat readme telling me and everyone else on the mailing list how to get it working.

I could instinctively feel that there was something novel in their approach. Something different and interesting in the way they approached the spam problem. I sat there thinking.

I could only think and do nothing more, since I had neither the money nor the resources to run my own mail server. This was until a client of mine, a company in Chennai that relied completely on e-mail for the back office operations it ran for big banks, asked me to develop a spam filter.

It happened this way. They had purchased a Windows-based software product for a hefty sum of four and a half lakh Indian rupees (450,000). And like most Windows-based products, it came with a lot of strings attached and all kinds of silly restrictions. They got fed up with this content scanning filter since their mails were getting delayed and some key mails were getting lost (false positives). If you want to know which product I am talking about, mail me and I will tell you! But it is not important, since all products have defects and we cannot achieve anything by nitpicking.

So now I had a golden opportunity to try out my ideas in their network. I developed the initial version of SpamCheetah sitting in their server room, slogging for countless hours in heavy air conditioning. India being a hot country, and Chennai being hot most of the time, this was a nice situation to be in.

It did not take me long to develop a product. It was called 'anjal' in those days, as 'anjal' means 'letter' in Tamil. Interestingly, the development of this product was not hard. What hit me hard in the face was its network deployment.

The technical details

I could develop this product without much ado, but how was I to deploy and test it? This technology had the restriction that it could not be tested in a test setup. It had to go live. So I had to wait for the opportunity to test my product in their production network.

Now, they are a major customer of Citibank, and we all know how major corporations work. They demanded certification for my product. They were scared of audits, and security had to be proven by certification. All the silly firewalls out there in the market (once again I shall not name them) come with a big bunch of certifications. But where could I go? How does one obtain certification for open source software?

OpenBSD is widely regarded as one of the most secure operating systems, and for many people it has the last word in security. I am also well aware of its innards: the various stack protection mechanisms, its audit process, its development culture and so on. I am a cryptographer myself, so I do know something about security.

Bob Beck, the principal author of OpenBSD spamd, sent me a certificate from a Japanese OpenBSD developer. But it was not sufficient. I was stuck. Then they told me that if I could somehow deploy my product inside the DMZ, behind the firewall, I would not need a certificate.

Thankfully I had advanced networking skills. I knew something about port forwarding and routing topologies. In spite of my knowledge and experience, I had to validate my claims with netcat and several computers on my home network. A sniffer like tcpdump also helped. I learnt that with aliasing one can simulate several networks using just 3 computers. I wrote about this for the local Linux Users' Group, and I started becoming good at networking techniques as well.
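The kind of check that netcat makes when validating a port-forwarding setup can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used in that network. The function name `port_open` and the address in the usage note are hypothetical.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable and timed-out connections all end up here.
        return False
```

Calling something like `port_open("192.0.2.10", 25)` from a machine on each simulated network shows whether the forwarded SMTP port is reachable from that side, while tcpdump on the box in the middle shows which path the packets took.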

Finally I got it installed, and very quickly they realized the power and value of this technology. It has been running successfully on their premises for close to 18 months now and they get ZERO spam. They used to get close to 50 spam messages a day per mailbox, but now they get none. They process all their banking transactions by e-mail and receive close to 10,000 mails a day on average. They were happy and I got paid a good sum of money. By that I mean enough to get going.

Why product development is not hard with open source tools and UNIX

But money was not the thing. Having a product that is used in production by a large enterprise is very satisfying in itself. I have developed several products while working for companies, but there you never get the joy of end-to-end development and real deployment.

In the open source world, by contrast, you get ample opportunities for real-life exposure for your programs. That is what attracted me to open source years ago. Moreover, my charity mindset and the technical excellence found in OpenBSD attracted me further.

I will not bore you with other details but suffice it to say that even after having a proven product I still needed to develop the web interface. Now I am a hacker and web programming is not something that I knew. But I had to learn. I knew that in the long run no matter what product I developed it had to have a web interface.

Once again my open source underpinnings and focus on simplicity attracted me to jQuery. At that time jQuery had not yet been adopted by Microsoft and it was somewhat nascent. But I was quick to realize its potential.

Realizing the potential of a technology is key to success. You also have to be quick at realizing the potential of people. Only then can you employ good people and grow. The insight you get from learning various technologies will help you in a big way. I repeat:

Never pass up an opportunity to use the right tool for the job.

I used OpenBSD spamd for my product. Greylisting has been around for a long time, but OpenBSD did it right and combined it effectively with tarpitting (teergrubing), blacklisting and database updates. It does not use MySQL for storing the 3-tuple involved in greylisting (sending IP address, envelope sender and envelope recipient). OpenBSD developed its own database. OpenBSD also integrated spamd with its firewall, pf, and so delivered a complete solution that is well integrated with the OS.
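To make the greylisting idea concrete, here is a minimal in-memory sketch in Python. It is not spamd's code and not SpamCheetah's. It assumes the classic triplet of sending IP address, envelope sender and envelope recipient. The class name `Greylist`, the return strings and the three timing constants are illustrative assumptions rather than anything taken from the article.

```python
import time
from typing import Dict, Optional, Tuple

# All three timings are illustrative assumptions, not values from the article.
PASS_TIME = 25 * 60          # seconds a new triplet must wait before a retry counts
GREY_EXPIRY = 4 * 3600       # seconds after which an unretried triplet is forgotten
WHITE_EXPIRY = 36 * 86400    # seconds a host that passed stays whitelisted

Triplet = Tuple[str, str, str]  # (sender IP, envelope sender, envelope recipient)


class Greylist:
    """Toy in-memory greylist; a real deployment would persist this state."""

    def __init__(self) -> None:
        self.first_seen: Dict[Triplet, float] = {}
        self.white_until: Dict[str, float] = {}

    def check(self, ip: str, mail_from: str, rcpt_to: str,
              now: Optional[float] = None) -> str:
        """Return "accept" or "tempfail" for one delivery attempt."""
        now = time.time() if now is None else now

        # A host that already proved it retries is let straight through.
        if self.white_until.get(ip, 0.0) > now:
            return "accept"

        key = (ip, mail_from, rcpt_to)
        seen = self.first_seen.get(key)

        # First sighting, or the earlier attempt is too old to count.
        if seen is None or now - seen > GREY_EXPIRY:
            self.first_seen[key] = now
            return "tempfail"

        # Retried too soon: keep deferring.
        if now - seen < PASS_TIME:
            return "tempfail"

        # A patient retry: whitelist the host and drop the grey entry.
        del self.first_seen[key]
        self.white_until[ip] = now + WHITE_EXPIRY
        return "accept"
```

With these assumed timings, a first attempt from an unknown host is deferred, a retry one minute later is still deferred, and a retry half an hour later is accepted and whitelists the host. A legitimate mail server queues and retries on a temporary failure. Much spam software never comes back, which is why the technique works without looking at message content.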

All this makes a big difference. No software exists in isolation. No software is an island. It has to work with everything else to form a complete whole. Even in my product, SpamCheetah does only spam control. It does not do e-mail. This is the old UNIX philosophy of do one job, but do it well.

This gives you the ability to run any mail server behind SpamCheetah. You are free to run MS Exchange, sendmail, qmail, postfix or whatever. All you have to do is set up SpamCheetah properly in your network. That requires slightly advanced networking skills. I have given clear instructions here, but you need some experience in networking to follow them.

You can see a short video of OpenBSD spamd in action here. The tarpit receives mail at 1 character per second and annoys spammers! If more and more people use OpenBSD greylisting with stock OpenBSD or with SpamCheetah or whatever then the spammers will go out of business!
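The one-character-per-second figure can be illustrated with a small throttling sketch in Python. It is a toy model of the tarpit idea, not spamd's implementation. The names `drip` and `seconds_tied_up` are hypothetical, and the `sleep` parameter exists only so the pacing can be replaced in tests.

```python
import time
from typing import Callable, Iterator


def drip(data: bytes, delay: float = 1.0,
         sleep: Callable[[float], None] = time.sleep) -> Iterator[bytes]:
    """Yield data one byte at a time, pausing `delay` seconds before each byte."""
    for i in range(len(data)):
        sleep(delay)
        yield data[i:i + 1]


def seconds_tied_up(data: bytes, delay: float = 1.0) -> float:
    """Total time one connection is held while `data` trickles through."""
    return len(data) * delay
```

At one byte per second, a 2,000-byte exchange would hold a connection open for about 2,000 seconds, a little over 33 minutes. The tarpit side spends almost nothing to do this, while the sending host keeps a socket and a delivery slot busy for the whole time.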

Why would spammers go out of business? Because OpenBSD spamd eats up their resources. I am really happy to be able to serve society by taking OpenBSD's great technology to a wider audience. How successful SpamCheetah will prove to be, only time will tell. I hope you enjoyed reading about this experience of mine. I wish you success in product development. Please mail me for any clarifications. Thanks!


References and further reading

  1. SpamCheetah technology backgrounder
  2. Understanding the network level behavior of spammers
  3. SpamCheetah's OpenBSD spamd video
  4. OpenBSD papers
  5. OpenBSD spamd - greylisting and beyond

