Here's something worth sharing: I get a lot of comments on this blog. Well, not a lot really, but probably on the order of a hundred every week or so. All of them are spam. And they all have one really obvious thing in common: they're all written in HTML. So they all start with <p>, which makes them really easy to identify. Now there's something to put in my spam filter, whenever I get around to making a decent one.
EDIT: now silly me... it's my comment posting/formatting code that adds in the <p>, not the spambots. So that tactic goes out the window. But I'm still pretty sure I've never actually gotten a legitimate comment. (If you know otherwise... leave a comment? ;-)
In the current implementation of my blog software, when someone sends a trackback request, there's no check to make sure that the remote page actually links to this site. So as you might imagine, I get quite a bit of trackback spam: bogus trackback requests that specify URLs for drug sites, porn sites, etc. It used to be something on the order of 20-30 per day, whereas I'd only get one comment (spam, of course) every few days. But when I just went to clear out the spam from my database, I saw about 40 spam comments compared to only 2 spam trackbacks. So I have to wonder, are spammers finding comments more profitable than trackbacks? Are they wising up to the fact that comments appear on this site automatically whereas trackbacks don't? (Doubtful, since my site isn't really worth that kind of attention.)
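For what it's worth, the missing check could look something like the sketch below. This is not what my blog software does (it does nothing, as noted above); it's just one way to test whether a page's HTML contains a link back to a given host. The names `LinkCollector` and `page_links_to` are made up for this example, and actually fetching the remote page (say, with `urllib.request`) is left out on purpose.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def page_links_to(html, page_url, my_host):
    """Return True if the HTML of page_url contains a link to my_host.

    my_host is a lowercase hostname such as "blog.example.org"
    (a placeholder, not a real site).
    """
    parser = LinkCollector()
    parser.feed(html)
    for href in parser.hrefs:
        try:
            # Resolve relative links against the page's own URL first.
            host = urlparse(urljoin(page_url, href)).hostname
        except ValueError:
            # Malformed URL: ignore it rather than reject the whole page.
            continue
        if host == my_host:
            return True
    return False
```

A trackback whose page fails this check could then be dropped, or at least held for moderation. It wouldn't stop a spammer who bothers to include a real link, but it would probably weed out the laziest ones.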
There are a lot of programs and protocols out there devoted to stopping, or at least reducing, the flow of spam email around the internet. But one of the most effective is also one of the simplest: greylisting.
In order to understand greylisting, you first need to know that a typical email message on its way through the internet travels through four computers ("nodes"):
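1. The sender's computer
2. The sender's email server
3. The recipient's email server (if the email is addressed to user@example.com, this is the email server for example.com)
4. The recipient's computer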
Greylisting is actually a simple process: the first time node 2 tries to send the email to node 3, node 3 responds with an SMTP 450 error code, which basically means "try again later". And a standards-compliant mail server will indeed try again later. But a spammer's server usually won't. Spammers typically operate their own mail servers which are specially designed to send out as many emails as possible to as many people as possible, and it's not worth their time to try sending the same mail twice.
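To make that concrete, here's a minimal sketch of the decision node 3 makes. It's an illustration, not the code my mail server actually runs. A few things in it are my own assumptions rather than part of the description above: remembering each message by the combination of sending IP, sender address and recipient address, the five-minute minimum wait, and the `Greylist` class name itself. A real implementation would also expire old entries and whitelist known-good servers.

```python
import time


class Greylist:
    """Toy greylisting policy: defer the first attempt, accept later retries."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a sender must wait (assumed value)
        self.clock = clock          # injectable so the logic can be tested
        self.first_seen = {}        # (ip, sender, recipient) -> first attempt time

    def check(self, client_ip, sender, recipient):
        """Return the (SMTP code, message) node 3 should answer with."""
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Record the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(key, now)
        if now - first < self.min_delay:
            # First attempt, or a retry that came back too quickly.
            return 450, "Greylisted, please try again later"
        return 250, "OK"
```

A legitimate server that gets the 450 queues the message and retries a few minutes later, at which point `check` returns 250 and the mail goes through. A spam cannon that never retries simply never gets past the first answer.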
It might be hard to believe that this really works; you'd think that spammers would be smarter than that. But I just enabled greylisting on my mail server yesterday and my daily spam count dropped from around 70 to only 1! The great thing about greylisting is that unlike, say, Bayesian filtering (which tries to identify spam based on its content), there's very little risk of flagging a legitimate email as bad: a standards-compliant server just retries, so the worst that normally happens is that the first message from a new sender arrives a bit late. Of course, if greylisting catches on, spammers will probably start to work around it, but we're still a long way from that landmark...
Okay, not the worst — there's plenty of room on the internet for any amount of incompetence you care to look for — but anyway: I just got this link in what looked like a rare well-crafted piece of spam. Now, for a while I've wanted to actually follow some of these spam links, see what the latest phishing tech looks like, and maybe if I'm feeling nice feed them some fake login information (like empty calories for websites, I suppose).
Turns out, when these particular spammers created their mockup of the Enom web page, they completely neglected to change the links! So anyone who clicks anything on the page — even the "forgot password" link — gets redirected to the legitimate site. Way to completely defeat the point...
Not to mention, the email they're sending talks about domain information in the WHOIS database, which is something that only a very small percentage of internet users would know/care about. Probably the same small percentage who know how to identify spam. And who know to actually look at the addresses of the websites they go to. All in all, I have to wonder whether this particular scheme is actually going to work on anyone.
But then again, as they say, people are stupid ;-)
The email:
Dear user,
On Wed, 29 Oct 2008 11:25:09 +0700 we received a third party complaint of invalid domain contact information in the Whois database for this domain Whenever we receive a complaint, we are required by ICANN regulations to initiate an investigation as to whether the contact data displaying in the Whois database is valid data or not. If we find that there is invalid or missing data, we contact both the registrant and the account holder and inform them to update the information.
The contact information for the domain which displayed in the Whois database was indeed invalid. On Wed, 29 Oct 2008 11:25:09 +0700 we sent a notice to you at the admin/tech contact email address and the account email address informing you of invalid data in breach of the domain registration agreement and advising you to update the information or risk cancellation of the domain. The contact information was not updated within the specified period of time and we canceled the domain. The domain has subsequently been purchased by another party. You will need to contact them for any further inquiries regarding the domain.
PLEASE VERIFY YOUR CONTACT INFORMATION - http://www.enom.com
If you find any invalid contact information for this domain, please respond to this email with evidence of the specific contact information you have found to be invalid on the Whois record for the domain name. Examples would be a bounced email or returned postal mail. If you have a bounced email, please attach or forward with your reply or in the case of returned postal mail, scan the returned letter and attach to your email reply or please send it to:
Attn: Domain Services 14455 N Hayden Rd Suite 219 Scottsdale, AZ 85260
LINK TO CHANGE INFORMATION - http://www.enom.com
Thank you,
Domain Services[IncidentID:34399]