Thanks for the Spam … I think

An odd thing has happened in the last three days.  It appears that my site traffic has more than doubled, and I'm starting to get way more comments.

Funny thing about all these comments, though ... they seem to think I need Viagra or some help with the size of my body parts.

With the increased traffic, this site is now sucking up 20% of my monthly bandwidth allowance (not a problem, yet), and I'm now in the top 800,000 blogs according to Technorati (note the site rank widget at the side).  An average of 90 comments wait for my approval every day, and FireStats shows 1200 new hits every hour from countries such as China, the Republic of Korea and the Russian Federation.  This wouldn't bother me so much if it were actually people from these countries looking at my site ... but with thousands of hits per day coming from a handful of distinct IP addresses, it's clear that these are bots made to look like real users.  They don't seem to look for new content, really ... they just view every page repeatedly in every language I have available.

Luckily, there are tools to help combat these nuisances.

To handle spam comments, I would recommend Akismet.  This handy little utility comes with WordPress and can be integrated into almost any site (from what I've read over the last few months).  This program is right about 99.9% of the time when marking something as spam, and several very popular bloggers swear by it.

To combat the bots, I'd suggest blocking the worst offenders from your site.  In my case, the excessive hits from China, Korea and Russia looked like actual users but all came from seven IP addresses.  Filtering on whatever user-agent they employed would also have blocked anyone really using IE 5 or 6, and a robots.txt entry is only a request that badly behaved bots are free to ignore anyway.  So I was forced to ban their IPs from accessing my site instead.  This can be done pretty easily in the .htaccess file as seen below:

order allow,deny
deny from 123.45.6.7
deny from 012.34.5.
allow from all

Of course, this isn't for everyone.  And you wouldn't want to block all bots from your site unless you were trying to stay completely off the search engines.  But with due diligence and a little log reading, spam and excessive bot traffic can be turned into something a little more manageable.
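
If the log reading part sounds tedious, a few lines of Python can do the counting.  What follows is only a rough sketch, not the exact method I used: it assumes your access log is in the usual Apache style where each line starts with the visitor's IP address, and the function names, the sample lines and the threshold of 3 hits are all made up for illustration.  It tallies hits per IP and prints a "deny from" line for each address that crosses the threshold.

```python
import re
from collections import Counter

# Assumption: each log line begins with the client's IPv4 address,
# as in Apache's common and combined log formats.
IP_AT_START = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\s")


def count_hits_by_ip(log_lines):
    """Return a Counter mapping each client IP to its number of requests."""
    hits = Counter()
    for line in log_lines:
        match = IP_AT_START.match(line)
        if match:
            hits[match.group(1)] += 1
    return hits


def deny_lines(hits, threshold):
    """Build a 'deny from' line for every IP with at least `threshold` hits."""
    heavy = sorted(ip for ip, count in hits.items() if count >= threshold)
    return ["deny from " + ip for ip in heavy]


# Made-up sample lines using documentation-range addresses, not real log data.
sample_log = [
    '203.0.113.9 - - [01/Jan/2009:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '203.0.113.9 - - [01/Jan/2009:10:00:01 +0000] "GET /about HTTP/1.1" 200 734',
    '198.51.100.4 - - [01/Jan/2009:10:00:02 +0000] "GET / HTTP/1.1" 200 512',
    '203.0.113.9 - - [01/Jan/2009:10:00:03 +0000] "GET /archive HTTP/1.1" 200 901',
]

hits = count_hits_by_ip(sample_log)
for line in deny_lines(hits, threshold=3):
    print(line)
```

On a real log you would pass in the open file instead of the sample list and pick a much higher threshold.  Look up who owns each address before pasting anything into .htaccess, since a legitimate search engine crawler can rack up plenty of hits too.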
