Plush Cthulhu

Posted by Eric Thu, 19 Sep 2002 00:00:00 GMT

For H.P. Lovecraft fans only: Plush Cthulhu.

Weekend Spam Update

Posted by Eric Mon, 16 Sep 2002 00:00:00 GMT

Between midnight Friday and 11:00am Monday, I received over 160 spams. SpamAssassin stopped all but five of them. It also misidentified two legitimate messages as spam; both were unimportant mailing list messages from a user whose site is frequently used to send spam. (If I needed to correspond with this user on a regular basis, I'd add his name to my whitelist--or help educate him.)

If you don't need a public e-mail address, let me suggest a new rule: Never give your real e-mail address to anybody you don't know. This includes online vendors. If necessary, use a throwaway webmail address instead.

I also devoted quite a bit of work to repacking libJudy, HP's ultra-optimized associative array library. This library is used by bogofilter, Eric Raymond's promising new spam filter.

Content-based spam filtering is extremely good, and is improving rapidly. Just don't send me any e-mail about hot stock picks involving real estate companies in Nigeria that specialize in toner cartridge factories, and there shouldn't be any problem.


Why Hygienic Macros Rock

Posted by Eric Fri, 13 Sep 2002 00:00:00 GMT

I've recently been reading a lot of excellent essays on programming language design by Paul Graham. Paul and I agree about a number of things: (1) LISP is a beautiful and powerful family of languages, even by modern standards, (2) all existing dialects of LISP are lacking a certain something, and (3) programmatic macros are a Good Idea.



Back from France

Posted by Eric Fri, 13 Sep 2002 00:00:00 GMT

I've just returned from a lovely vacation in France, where I saw lots of museums, ate lots of crepes, and generally had a much-needed chance to unwind. Along the way, I learned quite a bit about the French Middle Ages and the French Revolution, which has left me thinking about politics and ideology.

Interestingly enough, the groups most influenced by supposedly noble ideologies--the church, the chevaliers, the revolutionaries--seem to be the groups most likely to spill the blood of innocents by the thousands. This holds true for the history of North and South America, too--witness the conquistadors and the settlement of the American west.

I'm pretty sure there's an important lesson here, but I'm not yet certain what it is.

Bogofilter: A New Spam Filter

Posted by Eric Fri, 13 Sep 2002 00:00:00 GMT

According to Linux Weekly News, Eric Raymond is writing a new spam filter called bogofilter based on Bayesian analysis, as suggested by Paul Graham. Unlike the excellent SpamAssassin, which merely requires whitelisting a small number of addresses, bogofilter requires training with around 1,000 e-mail messages. But bogofilter may ultimately offer more hope for defeating spam.

Once trained, bogofilter recognizes most incoming spam (allegedly as much as SpamAssassin, but we'll have to wait and see). More importantly, however, bogofilter is very good at not recognizing legitimate e-mail as spam (in other words, it has a very low false positive rate).

The secret strength of bogofilter, however, is the training process. Because bogofilter is trained by the user, each user gets a personalized spam filter. This means that (1) information of professional interest to the reader will generally be recognized as non-spam (however incriminating it might otherwise look), and (2) there won't be a centralized list of rules for the spammer to read.
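To make the personalization point concrete, here is a minimal sketch of a per-user trained filter. It is a plain naive Bayes classifier with add-one smoothing and equal priors, which is an assumption chosen for brevity and not bogofilter's actual algorithm; the class name, tokenizer, and training messages are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: lowercase runs of letters, digits, and a few symbols.
TOKEN_RE = re.compile(r"[a-z0-9$'-]+")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


class PersonalFilter:
    """A toy per-user filter (naive Bayes); not bogofilter's real code."""

    def __init__(self):
        # For each label, how many training messages contained each token.
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Count each token once per message.
        self.counts[label].update(set(tokenize(text)))
        self.messages[label] += 1

    def spam_probability(self, text):
        # Sum per-token log-odds, assuming equal priors for spam and ham.
        log_odds = 0.0
        for token in set(tokenize(text)):
            # Add-one smoothing so unseen tokens never give a zero probability.
            p_spam = (self.counts["spam"][token] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][token] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp before converting to a probability to avoid overflow in exp().
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


# A user who legitimately buys toner (made-up training data).
office_manager = PersonalFilter()
office_manager.train("toner cartridge order invoice", "ham")
office_manager.train("invoice for toner cartridge shipment", "ham")
office_manager.train("hot stock picks now", "spam")
office_manager.train("cheap pills now", "spam")

# A user for whom toner mail is always junk (made-up training data).
programmer = PersonalFilter()
programmer.train("cheap toner cartridge offer", "spam")
programmer.train("toner cartridge sale offer", "spam")
programmer.train("meeting notes for project", "ham")
programmer.train("lunch on friday", "ham")

message = "toner cartridge offer"
print(office_manager.spam_probability(message))
print(programmer.spam_probability(message))
```

The same message gets a low spam probability from the first filter and a high one from the second, because each filter has seen only its own user's mail. There is also no shared rule list for a spammer to study.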

I suspect that the new MacOS X 10.2 mail client may be using a similar technique.


Busy for a While

Posted by Eric Tue, 27 Aug 2002 00:00:00 GMT

I've been working on some heavy code refactorings lately, and things are going well--the newly cleaned-up program structure is allowing me to add new features at a breakneck pace.

It's really nice to see a tangled old hairball of code get better quickly after a long, long slog. And we're just getting warmed up.

But for a number of reasons, I won't be updating this site for a few weeks. When I resume updates, I'll see if I can post some screenshots.

RedHat Bill Update

Posted by Eric Tue, 13 Aug 2002 00:00:00 GMT

Steve Sheldon sent me a URL to the RedHat bill which I mentioned yesterday. I can't find anything on Mandrake's, IBM's or Linux International's websites, so I'll assume that they're innocent until proven guilty (I don't trust the CNET fact checkers, and I seriously doubt that Linux International's corporate members would approve of this silliness).

So what's going on? It looks like somebody in RedHat's Open Source Now division is trying to pull a publicity stunt at the Linux World Expo. I'm not convinced that RedHat's management seriously supports this bill; they have a lot of partnerships with companies such as Oracle.

garym has written a scathing satire about this bill. And I have reason to believe that Linus would hate it: "And I personally refuse to use inferior tools because of ideology. In fact, I will go as far as saying that making excuses for bad tools due to ideology is stupid, and people who do that think with their gonads, not their brains."

Coffee Mug Question

Posted by Eric Tue, 13 Aug 2002 00:00:00 GMT

I notice that many Radio UserLand weblogs include a coffee mug icon. This icon allows Radio users to subscribe to a weblog with a click or two, instead of fooling around with RSS feeds manually. It's a good idea, and I'd like to add this feature to my site. One problem: I've dug around quite a bit, and I can't find any documentation.

Can non-Radio weblogs participate in this system? And if so, how? I'm perfectly willing to spend $40/year on Radio, if that's what's required, but I'd like to keep using my own tools.

California Open Source Bill: A Really Bad Idea

Posted by Eric Mon, 12 Aug 2002 00:00:00 GMT

According to CNET (via LWN), IBM, RedHat and several others are backing a bill which would prohibit the California government from purchasing proprietary software. If this is true, these companies have taken leave of their senses.

I write free software for a living, and I would be adamantly opposed to any such legislation. This is bad strategy (it would only alienate potential users), bad policy (there aren't open source products in many important markets), bad politics (it makes the sponsors look like self-serving fools without even a chance of victory), and bad business (running to the government when you can't compete in the market is tacky).

I have a hard time believing that RedHat is this dumb--or that IBM is this united behind a single proposal--so I'll wait a few days and see if CNET is misquoting the sponsors. But if the bill really says what CNET claims, I'm ready to oppose it.

SpamAssassin: A Decent Spam Filter

Posted by Eric Tue, 06 Aug 2002 00:00:00 GMT

SpamAssassin is a highly accurate open source spam filter.

There are two major components to the SpamAssassin filtering system: a set of rules which match various properties of an e-mail (e.g., whether it mentions stock alerts or Nigerian banks), and a set of weights, one for each rule. The weights are assigned automatically, by analyzing various real-world mail spools. So SpamAssassin is essentially an adaptive system--the rules are periodically recalibrated, and whether a given property is good or bad may change over time.
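The rules-plus-weights design can be sketched in a few lines. This is only an illustration of the idea: the rule names, patterns, weights, and threshold below are invented for the example and are not taken from SpamAssassin's actual rule set.

```python
import re

# Hypothetical rules: name -> (pattern, weight). A positive weight counts
# toward spam, a negative weight counts against it. In the real system the
# weights would come from analyzing mail spools, not from hand-picking.
RULES = {
    "MENTIONS_STOCK_ALERT": (re.compile(r"stock alert", re.IGNORECASE), 2.5),
    "MENTIONS_NIGERIAN_BANK": (re.compile(r"nigerian? bank", re.IGNORECASE), 3.0),
    "MENTIONS_PATCH": (re.compile(r"\bpatch\b", re.IGNORECASE), -1.0),
}

# Assumed cutoff for this sketch: a total at or above this is called spam.
THRESHOLD = 5.0


def score_message(text):
    """Return (total score, names of the rules that matched)."""
    hits = [name for name, (pattern, _) in RULES.items() if pattern.search(text)]
    total = sum(RULES[name][1] for name in hits)
    return total, hits


def is_spam(text):
    total, _ = score_message(text)
    return total >= THRESHOLD


print(score_message("Stock alert from a Nigerian bank"))
print(is_spam("Please review my patch for the stock alert module"))
```

Recalibrating the filter then means changing only the numbers in the table, including flipping a weight's sign if a property stops being a good spam indicator.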

SpamAssassin also includes an "auto whitelist", which supposedly learns to recognize your most frequent correspondents.

There are probably some chewy ideas in here for an evolutionary biologist--spam filtering involves an arms race between the spammers and the mail administrators of the world, and the most advanced spam filters are beginning to resemble immune systems.

(If you're a Debian user, type apt-get install spamassassin spamc libnet-dns-perl razor and take a look at the setup instructions. If you want to use spamd, try using the --max-children 10 argument; it will save you a lot of grief.)

