Posted by Eric
Tue, 24 Sep 2002 04:00:00 GMT
I'm in a maze of twisty little library interfaces, all different. I'm
dealing with three C libraries (MPW StdCLib, CarbonStdCLib.o and MSL),
two MacOS platforms (PPC and Carbon), two build systems (MPW and
CodeWarrior) and a growing sense of desperation. Of course, no piece of
this cruft wants to talk to any other piece.
Tags Mac | no comments
Posted by Eric
Mon, 23 Sep 2002 04:00:00 GMT
Sincere Choice: A lobbying
effort arguing that free software and proprietary software should compete
on an equal footing.
no comments
Posted by Eric
Mon, 23 Sep 2002 04:00:00 GMT
The FTC appears to have a huge spam
database.
Tags Spam | 1 comment
Posted by Eric
Mon, 23 Sep 2002 04:00:00 GMT
Useful sites about machine-learning algorithms, for developers of
spam filters: Machine Learning Network, the Bow toolkit, Latent
Semantic Analysis (used by Apple's mail client), Bayesian Latent
Semantic Analysis, text clustering, more text clustering, Using
Clustering to Boost Text Classification (PDF) and TFIDF notes.
I wouldn't be entirely surprised if neural networks worked well here,
either--the problem has that "figure out where to draw the boundaries
between clusters" aspect that maps nicely onto the math of neural
networks.
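To make the "draw the boundaries" idea concrete, here is a minimal sketch of a single-layer perceptron, about the simplest neural network there is, learning a linear boundary between two toy clusters of messages. The vocabulary, the training messages and every function name are made up for illustration; a real filter would need far more data and far better features.

```python
def featurize(text, vocab):
    """Turn a message into a vector of word counts over a fixed vocabulary."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]


def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Classic perceptron rule: nudge the boundary after each mistake."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            error = y - predicted  # 0 when the guess was right
            if error:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (spam side of the boundary) or 0 (non-spam side)."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Hypothetical toy data: label 1 is spam, label 0 is legitimate mail.
VOCAB = ["free", "money", "meeting", "patch"]
TRAINING = [
    ("free money free", 1),
    ("money money now", 1),
    ("meeting about the patch", 0),
    ("patch review meeting", 0),
]

samples = [featurize(text, VOCAB) for text, _ in TRAINING]
labels = [label for _, label in TRAINING]
weights, bias = train_perceptron(samples, labels)
```

After training, predict(weights, bias, featurize("free money", VOCAB)) lands on the spam side of the boundary. A single perceptron can only draw a straight line, which is why multi-layer networks are the more interesting candidates for messier clusters.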
Tags AI, Probability, Spam | no comments
Posted by Eric
Mon, 23 Sep 2002 04:00:00 GMT
For deadly-accurate spam filtering, combine a well-trained bogofilter with
SpamAssassin. Here's how.
Add the following lines to your procmailrc file, before you run
SpamAssassin:
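# Run each message (header and body) through bogofilter;
# an exit status of 0 means bogofilter judged it to be spam.
:0HB
* ? bogofilter
{
    # Filter the message through formail to set the marker header.
    :0fw
    | formail -I "X-Spam-Bogofilter: yes"
}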
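For readers who don't speak procmail, here is a rough Python sketch of what the recipe above does: hand the message to a classifier and, if it says spam, replace any existing X-Spam-Bogofilter header with "yes" (which is what formail -I does). The function names are hypothetical, and the sketch assumes bogofilter reads the message on stdin and exits with status 0 for spam.

```python
import email
import subprocess

HEADER = "X-Spam-Bogofilter"


def bogofilter_says_spam(raw_message):
    """Assumption: bogofilter reads the message on stdin and exits 0 for spam."""
    result = subprocess.run(
        ["bogofilter"], input=raw_message.encode("utf-8", "replace")
    )
    return result.returncode == 0


def tag_message(raw_message, classifier=bogofilter_says_spam):
    """Return the message text, with the marker header set if it looks like spam."""
    if not classifier(raw_message):
        return raw_message  # not spam: pass the message through untouched
    msg = email.message_from_string(raw_message)
    del msg[HEADER]         # drop any existing copies, like formail -I
    msg[HEADER] = "yes"
    return msg.as_string()
```

The classifier is a parameter so the tagging logic can be tried out without bogofilter installed, for example with tag_message(raw, classifier=lambda m: True).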
Add the following lines to your /etc/spamassassin/local.cf
file:
header BOGOFILTER X-Spam-Bogofilter =~ /yes/
describe BOGOFILTER Message has too many bogons.
score BOGOFILTER 5.0
Presto! This plugs almost all the holes in SpamAssassin's defense,
and uses SpamAssassin's auto-whitelist (you've got it turned on,
right?) to protect against false positives.
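Why the whitelist matters: a 5.0 score is enough to push most messages over the line by itself, so something has to pull known-good senders back. Here is a simplified Python sketch of the arithmetic, not SpamAssassin's actual code. It assumes a spam threshold of 5.0 and an auto-whitelist that regresses the score halfway toward the sender's long-term mean (the behavior SpamAssassin documents for auto_whitelist_factor 0.5); the function names are hypothetical.

```python
import re

REQUIRED_SCORE = 5.0  # assumed spam threshold
AWL_FACTOR = 0.5      # assumed auto-whitelist regression factor
BOGOFILTER_SCORE = 5.0  # from: score BOGOFILTER 5.0


def bogofilter_rule(headers):
    """Mimic: header BOGOFILTER X-Spam-Bogofilter =~ /yes/"""
    value = headers.get("X-Spam-Bogofilter", "")
    return BOGOFILTER_SCORE if re.search(r"yes", value) else 0.0


def final_score(other_rules_score, headers, sender_mean=None):
    """Sum the rule scores, then let the auto-whitelist pull toward the mean."""
    score = other_rules_score + bogofilter_rule(headers)
    if sender_mean is not None:  # sender has a history in the whitelist
        score += (sender_mean - score) * AWL_FACTOR
    return score


def is_spam(score):
    """A message is flagged once its score reaches the threshold."""
    return score >= REQUIRED_SCORE
```

With these assumptions, an unknown sender whose message scores 1.0 on other rules and trips bogofilter ends up at 6.0 and is flagged, while the same message from a correspondent with a long-term mean of -2.0 comes out at 2.0 and gets through.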
Tags Spam | no comments