I’ve been looking at various libraries for natural language processing, and I’m pleasantly surprised by the tools created by the Python community. Some examples:

  • The Python NLTK library provides parsers for many popular corpora, visualization tools, and a wide variety of simple natural language algorithms (though few of these are probabilistic). Highlights include:
      • WordNet support.
      • NumPy integration (see below).
      • An accessible introductory book on natural language processing.
  • ConceptNet provides a simple semantic model of the world.
  • NumPy (and SciPy) provide extensive support for linear algebra and other numerical computing, and they pair well with plotting libraries such as Matplotlib for data visualization.
  • PyCUDA provides access to Nvidia GPUs for high-performance scientific computation, and it integrates with NumPy.
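
As a small taste of the NumPy side of this list, here is a minimal sketch of its linear algebra support. The 2x2 system is made up purely for illustration and has nothing to do with any particular NLP task.

```python
import numpy as np

# Hypothetical system of equations, chosen only for illustration:
#   3x + 1y = 9
#   1x + 2y = 8
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# np.linalg.solve finds x such that A @ x == b.
# A must be square and non-singular, otherwise LinAlgError is raised.
x = np.linalg.solve(A, b)

# Sanity check: substituting the solution back in reproduces b.
assert np.allclose(A @ x, b)
print(x)
```

The same array type is what the other numerical libraries above exchange data in, which is a large part of why they compose so easily.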

If you need to build a web crawler, there’s Twisted, which makes it easy to write fast, asynchronous networking code.

All in all, I usually prefer Ruby to Python, because I love Ruby’s metaprogramming support. But the Python community has built an impressive variety of scientific and linguistic tools. Many thanks to everybody who contributed to these projects!