One of the rules the modbot uses is to count Google hits for the numeric IP of an untrusted poster. It turns out that HTTP proxies have a real proclivity for getting indexed. A lot. Legitimate IPs, not so much. I wrote a little online tool to call Google to get these counts; the tool is here and the write-up of the code is here. That rule is currently blocking about 40% of spam (I don't have good statistical analysis in place yet, so that figure is very approximate).
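A minimal sketch of the hit-count rule in Python, for illustration only; it is not the modbot's actual code. The threshold value, the function names, and the canned counts are all assumptions, and the search call itself is passed in as a plain function (`count_hits`, hypothetical) so that no particular search API is implied.

```python
import ipaddress
from typing import Callable

# Hypothetical cutoff; the post does not say what threshold the modbot uses.
HIT_THRESHOLD = 10


def looks_like_proxy(
    ip: str,
    count_hits: Callable[[str], int],
    threshold: int = HIT_THRESHOLD,
) -> bool:
    """Return True when the quoted IP string has at least `threshold` hits."""
    # Validate and normalize first, so junk input never reaches the search call.
    # ipaddress.ip_address raises ValueError for anything that is not an IP.
    normalized = str(ipaddress.ip_address(ip))
    # Quote the address so it is searched as an exact string.
    return count_hits(f'"{normalized}"') >= threshold


def fake_count_hits(query: str) -> int:
    """Stand-in for the real search call; the counts here are made up."""
    canned = {'"203.0.113.7"': 1240, '"198.51.100.23"': 2}
    return canned.get(query, 0)


if __name__ == "__main__":
    # Both addresses come from ranges reserved for documentation.
    for ip in ("203.0.113.7", "198.51.100.23"):
        print(ip, looks_like_proxy(ip, fake_count_hits))
```

Keeping the search call behind a function argument means the rule can be tried out against canned numbers like these before it is pointed at a live search engine.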
Finally, as a spinoff of this project, I've started a spam archive. There's nothing to present yet, but I hope to start doing some interesting analysis and, most specifically, to build a searchable database of the spam, along with a searchable database of spamvertised sites. That ought to overlap with the sites spamvertised by email spam as well, and that's going to be an interesting thing to look at. We'll see.
I've stumbled onto a spam link network of staggering extent in the course of examining forum spam. A spammer has a site somewhere, and then spamvertises it. But then some of the spam starts to link to other forum spam, which in turn links to the site. Some sites auto-forward to other sites using obscured JavaScript (I haven't figured out just why yet; if you have a rationale, I'd be happy to hear it). Anyway, after that goes on for a while, the result is a huge network of vulnerable fora linking to other vulnerable fora. There is a true treasure trove of information available to the interested party. Which would, of course, be me. I will definitely be following up on that and posting on it.
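One way to picture that network is as a directed graph: each spam post or site is a node, and each link or auto-forward is an edge. The sketch below is only an illustration under that assumption. The page names are invented, the function names are hypothetical, and it says nothing about how the links would actually be harvested. It walks the graph breadth-first from one spam post and reports the pages that link nowhere further, which are the likely spamvertised sites.

```python
from collections import deque
from typing import Dict, List, Set


def reachable_pages(start: str, links: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first walk of the link graph, starting from one spam post."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # The `seen` set stops the walk from looping when fora link
            # back to each other.
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def final_destinations(start: str, links: Dict[str, List[str]]) -> Set[str]:
    """Reachable pages with no known outgoing links."""
    return {page for page in reachable_pages(start, links) if not links.get(page)}


# Invented example data: forum posts linking to other forum posts, one
# auto-forwarding site, and a single site at the end of every chain.
EXAMPLE_LINKS = {
    "forum-a.example/post1": ["forum-b.example/post9"],
    "forum-b.example/post9": ["forum-c.example/post3", "redirect.example"],
    "forum-c.example/post3": ["pills.example", "forum-a.example/post1"],
    "redirect.example": ["pills.example"],
}

if __name__ == "__main__":
    print(sorted(reachable_pages("forum-a.example/post1", EXAMPLE_LINKS)))
    print(sorted(final_destinations("forum-a.example/post1", EXAMPLE_LINKS)))
```

Treating an auto-forward as just another edge keeps the structure simple; if the forwards turn out to matter in their own right, the edges could carry a label instead.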
Anyway, it's been nice talking to you. Back to work!