Spam Bots and UBB (Usage Based Billing)
Be forewarned: when you set up a website running WordPress, spam bots are going to flood it.
If you are running WordPress on your AppleTV server at home, spam bots are going to create a lot of unnecessary web traffic, which will cost you money in extra bandwidth and slow down your Internet connection substantially unless you take precautions. Click here to see a subjective price for UBB in Canada.
Here is a site that contains a long list of fake blog comment posts that bots and content spammers are using. Check the comments on your site against this list and delete any that match.
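Checking comments against a list of known fake posts can be done by hand, but a small script makes it less tedious. The following is a minimal sketch, assuming you have copied phrases from such a list into a Python list. The two sample phrases and the function names are hypothetical placeholders, not entries from the actual site.

```python
import re

# Hypothetical sample phrases; replace with entries copied from a real list.
KNOWN_SPAM = [
    "Great post! I will bookmark your site.",
    "I really enjoyed reading this, keep up the good work",
]

def normalize(text: str) -> str:
    """Lower-case, drop punctuation and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(text.split())

def looks_like_known_spam(comment: str, known=KNOWN_SPAM) -> bool:
    """Return True if any known spam phrase appears inside the comment."""
    body = normalize(comment)
    return any(normalize(phrase) in body for phrase in known)
```

Normalizing first means small changes in punctuation or capitalization do not hide a match. Comments that merely resemble a phrase will still slip through, so this only supplements moderation.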
How to block spam bots or at least minimize them
1) Don’t use WordPress at all. Only use static web pages
Other than the WordPress pages, the rest of Technoids.com is plain HTML with a little bit of JavaScript and cached PHP code
2) If you still want to use WordPress here are some remedies
Do not allow any comments, or allow comments only from users who have registered first
At the very least, moderate all posts
Use the G.A.S.P. plug-in (it will stop about 90% of the bots from auto-posting)
Akismet works, but it increases Internet traffic, so it is not recommended
Try not to use the words “WordPress”, “Blog” or “Forum” on any page
Disallow all bots from the WordPress folder in robots.txt
Block search engines in the lighttpd.conf file
Under Settings > Discussion Settings
Uncheck all boxes except the following:
Comment author must fill out name and e-mail
Break comments into pages ……
E-mail me whenever (both boxes)
A comment is held for moderation
An administrator must always approve the comment
Comment author must have a previously approved comment
Under Settings > Privacy Settings
Select “I would like to block search engines, but allow normal visitors”
Under Settings > Writing Settings > Update Services
Make sure it says this
“WordPress is not notifying any Update Services because of your site’s privacy settings”
Other Tricks to Keep Your Bandwidth Down
Use a lightweight WP theme
Use as few images as possible and optimize those
Don’t allow downloads or uploads of large files
Set the Quick Cache Expiry Time to at least 1 week (604800)
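The expiry value above is one week expressed in seconds. As a quick check, and as a way to work out other intervals, here is a small sketch. The helper name `expiry_seconds` is made up for illustration and is not part of any plug-in.

```python
from datetime import timedelta

def expiry_seconds(weeks: int = 0, days: int = 0) -> int:
    """Convert a cache lifetime in weeks and days to whole seconds."""
    return int(timedelta(weeks=weeks, days=days).total_seconds())

one_week = expiry_seconds(weeks=1)   # 7 * 24 * 60 * 60
```

A longer expiry means cached pages are rebuilt less often, which helps on a low-powered home server.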
Addendum
List of User Agent Strings is Here
lighttpd.conf Blocking Search Bots Example
$HTTP["useragent"] =~ "YandexBot" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "MLBot" { url.access-deny = ( "" ) }
This returns a 403 error to these bots and blocks their access to the whole site
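Before adding more bots to the server config, it can help to try the user-agent match offline. The sketch below is only an approximation of what the lighttpd condition does for plain names with no regex metacharacters. The function names are hypothetical, and the combined rule it prints is an assumption that should be tested on your own server before you rely on it.

```python
import re

# Bot name fragments taken from the lighttpd example.
BLOCKED_BOTS = ["YandexBot", "MLBot"]

def is_blocked(user_agent: str, blocked=BLOCKED_BOTS) -> bool:
    """Return True if any blocked name occurs in the User-Agent header.

    This is an unanchored, case-sensitive search, like a plain
    pattern used with the =~ operator.
    """
    return any(re.search(re.escape(name), user_agent) for name in blocked)

def lighttpd_rule(blocked=BLOCKED_BOTS) -> str:
    """Build one combined deny rule instead of one line per bot."""
    pattern = "|".join(re.escape(name) for name in blocked)
    return '$HTTP["useragent"] =~ "(%s)" { url.access-deny = ( "" ) }' % pattern

if __name__ == "__main__":
    print(lighttpd_rule())
```

Names that contain punctuation are escaped by `re.escape`, and the escaped form may need adjusting by hand for the lighttpd config syntax.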
Sample robots.txt (goes in the web root folder)
Note: bad bots ignore robots.txt completely
User-agent: *
Disallow: /wordpress/wp-content/
Disallow: /wordpress/wp-includes/
Disallow: /wordpress/trackback/
Disallow: /wordpress/wp-admin/
Disallow: /wordpress/archives/
Disallow: /wordpress/category/
Disallow: /wordpress/tag/*
Disallow: /wordpress/tag/
Disallow: /wordpress/wp-*
Disallow: /wordpress/login/
Disallow: /wordpress/*.js$
Disallow: /wordpress/*.inc$
Disallow: /wordpress/*.css$
Disallow: /wordpress/*.php$
User-agent: All
Allow: /
User-agent: Googlebot-Image
Disallow: /
User-agent: ia_archiver
Disallow: /
User-agent: duggmirror
Disallow: /
User-agent: YandexBot
Disallow: /
Sample allowing Google, Yahoo and Bing only
User-agent: Googlebot
Disallow:
User-agent: Slurp
Disallow:
User-agent: Bingbot
Disallow:
User-agent: *
Disallow: /
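An allow-list like this is easy to get subtly wrong, so it is worth checking it with a parser before uploading it. The sketch below uses Python's standard `urllib.robotparser`. It writes the rules with the bare crawler names `Googlebot`, `Slurp` and `Bingbot`, which is an assumption made here because this parser compares names without any version suffix. The helper `can_crawl` is a made-up name.

```python
from urllib.robotparser import RobotFileParser

# Allow-list rules, written with bare crawler tokens (assumption:
# "Googlebot", "Slurp" and "Bingbot" are the names being matched).
ALLOW_LIST = """\
User-agent: Googlebot
Disallow:

User-agent: Slurp
Disallow:

User-agent: Bingbot
Disallow:

User-agent: *
Disallow: /
"""

def can_crawl(robots_txt: str, agent: str, path: str = "/") -> bool:
    """Return True if the given agent may fetch the path under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)

if __name__ == "__main__":
    for bot in ("Googlebot", "Slurp", "Bingbot", "YandexBot"):
        print(bot, can_crawl(ALLOW_LIST, bot))
```

This only shows how a well-behaved crawler should read the file. Python's parser does not understand `*` or `$` wildcards in paths, and bad bots ignore the file regardless.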
If you don’t want any “friendly” bots scanning your site
User-agent: *
Disallow: /
