Useful sites about machine-learning algorithms, for developers of spam filters: Machine Learning Network, the Bow toolkit, Latent Semantic Analysis (used by Apple's mail client), Bayesian Latent Semantic Analysis, text clustering, more text clustering, Using Clustering to Boost Text Classification (PDF) and TFIDF notes.

I wouldn't be entirely surprised if neural networks worked well here, either--the problem has that "figure out where to draw the boundaries between clusters" aspect that maps nicely onto the math of neural networks.
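
To make the "drawing boundaries" idea concrete, here is a rough sketch of the simplest neural unit, a single perceptron, learning a linear boundary between spam and non-spam over word counts. Everything in it is an illustrative assumption on my part: the toy messages, the whitespace tokenizer, and the function names `tokenize`, `train_perceptron` and `predict` are hypothetical and not taken from any of the sites above. A real filter would need far more data and better features.

```python
from collections import Counter

# Toy, made-up training set: +1 means spam, -1 means legitimate mail.
TRAINING = [
    ("cheap pills buy now", 1),
    ("win free money now", 1),
    ("meeting agenda for monday", -1),
    ("lunch on monday with the team", -1),
]


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


def score(weights, bias, text):
    """Signed distance-like score: positive leans spam, negative leans ham."""
    counts = Counter(tokenize(text))
    return bias + sum(weights[word] * n for word, n in counts.items())


def train_perceptron(examples, epochs=20):
    """Classic perceptron rule: nudge the boundary after each mistake."""
    weights = Counter()  # missing words count as weight 0
    bias = 0
    for _ in range(epochs):
        mistakes = 0
        for text, label in examples:
            if label * score(weights, bias, text) <= 0:
                # Misclassified (or on the boundary): move toward the label.
                for word, n in Counter(tokenize(text)).items():
                    weights[word] += label * n
                bias += label
                mistakes += 1
        if mistakes == 0:
            break  # every training example is on the correct side
    return weights, bias


def predict(weights, bias, text):
    """Return +1 (spam) or -1 (ham) depending on the side of the boundary."""
    return 1 if score(weights, bias, text) > 0 else -1


weights, bias = train_perceptron(TRAINING)
print(predict(weights, bias, "buy cheap pills"))
print(predict(weights, bias, "monday meeting"))
```

A single perceptron can only draw a straight boundary, so it is the degenerate case of a neural network; hidden layers are what would let a network carve out the curvier cluster boundaries the paragraph above has in mind.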