Please see the bottom of the page for more info on what the "Indy Library" is.

I'm a compulsive log-file reader - if I get bored I like to settle back, download Flarp's web server log, and have a casual read through the activity of the last day or two. Recently I saw the return of an old acquaintance - very dodgy activity from a web crawler bot identifying itself as "Mozilla/3.0 (compatible; Indy Library)". It is my assertion that this agent is acting in very bad faith. The evidence (a couple of pages of Apache log file with annotations) is available here. This is my reasoning for disliking this agent:
Here are some other interesting things to note about this "thing", which I'm now personally convinced is just harvesting email addresses for spamming:
Now that I've grepped February's log too, I notice that there have been several "minor" hits from this "thing" from fairly radically different IP addresses (all included in the evidence file). A quick check of the IP addresses that these scans came from indicates that 100% of the incidents I've quoted in the evidence file came from ISPs and networks in China. I've now excluded the Indy Library user-agent from Flarp, using either a little bit of PHP scripting or an Apache .htaccess file - check this page in the Server Magic section for details.

Update: It turns out, after a little investigation, that the Indy Library user-agent actually belongs to a Delphi/C++ Builder suite of tools for doing internet stuff. Therefore, someone has written the spambot that I've noticed here in Delphi or C++ Builder and used this library. If you see little individual hits from something describing itself as Indy Library, it could be some innocent application. Developers who use the Indy Library for legitimate reasons should really change the User-Agent that their software sends when making HTTP requests, to avoid being tarred with the same brush.

Another Update (18/04/03): The particular spammer operating out of China that first alerted me to the abuse of the Indy Libraries has either rewritten their software or just changed the User-Agent header it sends, so that it now reports itself as "Zeus 2.6" (btw: the "real Zeus" is a web server, not a spambot). I've just been spammed by them again (on behalf of trafficmagnet.com); here's the log. When I have the time to spare I will drastically overhaul the PHP method of excluding potential spambots to include a basic CAPTCHA, just in case a legitimate visitor gets wrongly picked up as a spambot.

Oh No, More Updates (10/08/03): Ed Kohler has written at more length about the particular spammers who triggered this piece in the first place; you can find his article here.

If you have anything to add to this, you can email me at:
(If by some miracle a big news site like Slashdot or something wants to link to this page - please contact me FIRST because I can't afford the bandwidth impact of getting /.ed!)
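
For anyone who wants to repeat the log grepping described above on their own server, here is a rough Python sketch. It is not the PHP or .htaccess method used on Flarp, just an illustration. It assumes the log is in Apache's "combined" format (user-agent as the last quoted field), and the names `SUSPECT_AGENTS`, `is_suspect` and `suspect_hits_by_ip` are made up for this example. The sample lines are invented and use documentation-range IP addresses; they are not taken from the evidence file.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format, where the user-agent is the
# last double-quoted field on each line.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"\s*$'
)

# Substrings of the two user-agent strings named on this page.
SUSPECT_AGENTS = ("Indy Library", "Zeus 2.6")


def is_suspect(agent):
    """Return True if the user-agent contains one of the suspect substrings."""
    lowered = agent.lower()
    return any(s.lower() in lowered for s in SUSPECT_AGENTS)


def suspect_hits_by_ip(lines):
    """Count requests per client IP that were sent by a suspect user-agent."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and is_suspect(match.group("agent")):
            hits[match.group("ip")] += 1
    return hits


if __name__ == "__main__":
    # Invented sample lines, purely for illustration.
    sample = [
        '192.0.2.10 - - [01/Mar/2003:10:00:00 +0000] "GET / HTTP/1.0" 200 1234 "-" "Mozilla/3.0 (compatible; Indy Library)"',
        '198.51.100.7 - - [01/Mar/2003:10:00:05 +0000] "GET / HTTP/1.1" 200 1234 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    ]
    for ip, count in suspect_hits_by_ip(sample).most_common():
        print(ip, count)
```

Matching on a substring is deliberately crude: as noted in the first update, it will also catch innocent applications built with the same library, so the counts are a starting point for reading the log by hand rather than proof of anything.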