Interesting Python libraries for natural language processing
I’ve been looking at various libraries for natural language processing, and I’m pleasantly surprised by the tools created by the Python community. Some examples:
- The Python NLTK library provides parsers for many popular corpora, visualization tools, and a wide variety of simple natural language algorithms (though few of these are probabilistic). Highlights include:
  - WordNet support.
  - NumPy integration (see below).
  - An accessible introductory book on natural language processing.
- ConceptNet provides a simple semantic model of the world.
- NumPy (and SciPy) provide extensive support for linear algebra and data visualization.
- PyCUDA provides access to Nvidia GPUs for high-performance scientific computation, and it integrates with NumPy.
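To give a rough idea of what NumPy's linear algebra support looks like in a language-processing setting, here's a minimal sketch that compares two toy sentences by the cosine similarity of their word-count vectors. The helper names (`bag_of_words`, `cosine_similarity`), the vocabulary, and the sentences are all made up for illustration; none of them come from the libraries listed above.

```python
import numpy as np


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in tokens (hypothetical helper)."""
    index = {word: i for i, word in enumerate(vocabulary)}
    counts = np.zeros(len(vocabulary))
    for token in tokens:
        if token in index:
            counts[index[token]] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / norm) if norm else 0.0


# Toy vocabulary and sentences, chosen only for this example.
vocabulary = ["python", "ruby", "language", "library"]
doc_a = bag_of_words("python library for language processing".split(), vocabulary)
doc_b = bag_of_words("ruby library for language processing".split(), vocabulary)

similarity = cosine_similarity(doc_a, doc_b)
print(similarity)
```

Words outside the vocabulary are simply ignored here, which keeps the vectors short; a real project would more likely build the vocabulary from a corpus.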
If you need to build a web crawler, there’s Twisted, which makes it easy to write fast, asynchronous networking code.
All in all, I usually prefer Ruby to Python, because I love Ruby’s metaprogramming support. But the Python community has built an impressive variety of scientific and linguistic tools. Many thanks to everybody who contributed to these projects!