How To Test a Trainable Spam Filter

Posted by Eric on Sun, 22 Sep 2002 04:00:00 GMT

Ever since Paul Graham published "A Plan for Spam", "trainable" spam filters have been the latest fashion. These filters train themselves on the characteristics of your personal e-mail. Supposedly, this extra knowledge lets them make fewer mistakes and makes them harder to fool. But do these filters actually work? In this article, I try out Eric Raymond's bogofilter, a trainable Bayesian spam filter, and describe the steps required to evaluate such a filter accurately.
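
As a rough illustration of what "evaluating a filter" can mean, here is a minimal Python sketch that measures false positives (real mail flagged as spam) and false negatives (spam let through) on messages the filter was not trained on. It is not bogofilter's interface and not necessarily the procedure used in the article: the names error_rates, toy_filter, and held_out are hypothetical, and the toy filter is only a stand-in for a real trained classifier.

```python
from typing import Callable, Iterable, Tuple


def error_rates(classify: Callable[[str], bool],
                held_out: Iterable[Tuple[str, bool]]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    classify(message) returns True when the filter calls the message spam.
    held_out yields (message, is_spam) pairs the filter was NOT trained on.
    """
    fp = fn = ham = spam = 0
    for message, is_spam in held_out:
        flagged = classify(message)
        if is_spam:
            spam += 1
            if not flagged:
                fn += 1  # spam that slipped through
        else:
            ham += 1
            if flagged:
                fp += 1  # legitimate mail wrongly flagged
    fp_rate = fp / ham if ham else 0.0
    fn_rate = fn / spam if spam else 0.0
    return fp_rate, fn_rate


# Hypothetical stand-in for a real filter: flags any message mentioning "viagra".
def toy_filter(message: str) -> bool:
    return "viagra" in message.lower()


# Hypothetical hand-labeled test messages: (text, is_spam).
held_out = [
    ("Cheap VIAGRA now", True),
    ("You won a prize", True),
    ("Lunch tomorrow?", False),
    ("Patch for the build script", False),
]

print(error_rates(toy_filter, held_out))
```

Reporting the two rates separately matters because the costs differ: a lost legitimate message is usually far worse than one spam in the inbox.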


Things I Hate About CodeWarrior, Part I

Posted by Eric on Fri, 20 Sep 2002 04:00:00 GMT

Metrowerks CodeWarrior is a fairly nice compiler suite and IDE for the Macintosh. Unfortunately, it suffers from several severe flaws. Most of these flaws involve CodeWarrior's binary project files.

A short list of problems with this design:

  1. The project files are completely opaque. As Unix users like to complain, a binary file is just a blob of bytes. This breaks such vital utilities as diff and merge.
  2. The project files change every time you compile your program. For some unknown reason, CodeWarrior stores object code in the project files, so every build modifies them. This makes CVS grumpy.
  3. The project file format is always changing. I've never upgraded CodeWarrior without having to re-import all my project files.
  4. CodeWarrior can't read very old project files at all. Just today, it told me it couldn't open one of my old project files. I wonder what was in there.

Now, don't get me wrong: CodeWarrior was a really sweet product back in early 1995. But by modern standards, it's pretty painful.
