Blocking Voilabot



Chris
02-19-2010, 04:03 PM
One of my servers had Apache going down every 2 hours today. It was just getting hammered because every search engine was crawling it: Cuil, MSN, Yahoo, Google... and VoilaBot, which shows up in stats as Voilabot and in netstat as natcrawlbloc.

It was hitting my server like crazy, more than the others, and it apparently is notorious for not obeying robots.txt.

I found this page, banned it, and the problem ceased. Whatdyaknow?

http://www.electricmonk.nl/log/2008/08/19/blocking-voilabot/

mobilebadboy
02-19-2010, 05:10 PM
Ran into that one a long time ago. I prefer blocking by name (if they identify themselves). IPs tend to change from time to time on some annoying crawlers. However, some user-agents can change too (like DuckDuckGo, which used to identify itself but no longer does).
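
For anyone who wants to see what blocking by name amounts to, here is a rough Python sketch. It assumes you check the User-Agent header yourself in application code rather than in the web server config. The names `BLOCKED_AGENTS` and `is_blocked` are made up for illustration, and only "voilabot" comes from this thread.

```python
# Hypothetical blocklist; only "voilabot" is taken from this thread.
BLOCKED_AGENTS = ("voilabot",)


def is_blocked(user_agent, blocked=BLOCKED_AGENTS):
    """Return True if any blocked name appears in the User-Agent string.

    The match is a case-insensitive substring test, so "VoilaBot" and
    "Voilabot" are treated the same. A missing header is not blocked.
    """
    ua = (user_agent or "").lower()
    return any(name in ua for name in blocked)
```

A request handler could call `is_blocked(headers.get("User-Agent"))` and return a 403 when it is true. As noted above, this only helps while the crawler keeps identifying itself.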

A new annoyance (of many new ones) has become the [X]AnalysisAgents (LegalAnalysisAgent, FinancialAnalysisAgent, ParchedAnalysisAgent, EagleContentAnalysis, etc.).
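
Since those names follow a pattern, one regular expression can cover the family instead of listing each one. This is only a sketch: the pattern is a guess based on the four names above, the full user-agent strings these bots send are not shown in this thread, and `ANALYSIS_BOT` and `looks_like_analysis_bot` are hypothetical names.

```python
import re

# Guessed pattern: one or more letters, then "Analysis", optionally "Agent".
# It may also catch legitimate clients whose names happen to fit.
ANALYSIS_BOT = re.compile(r"\b[A-Za-z]+Analysis(?:Agent)?\b")


def looks_like_analysis_bot(user_agent):
    """Return True if the User-Agent contains a name shaped like [X]AnalysisAgent."""
    return bool(ANALYSIS_BOT.search(user_agent or ""))
```

It is worth running a pattern like this against your own logs before blocking on it, to see what else it catches.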

I'm probably pushing at least 150 bots blocked.