Visualizing WordNet relationships as graphs

The WordNet database records many kinds of relationships between words: it can place words into category hierarchies, list the parts of an object, and answer many related questions.

The code below relies on the NLTK and NetworkX libraries for Python.

Categorizing words

What, exactly, is a dog? It’s a domestic animal and a carnivore, not to mention a physical entity (as opposed to an abstract entity, such as an idea). WordNet knows all these facts:

How do we generate this image? We start by looking up the first noun sense of “dog” in WordNet. This returns a “synset”, or a set of words with equivalent meanings.

dog = wn.synset('dog.n.01')

Next, we compute the transitive closure of the hypernym relationship, or (in English) we look for all the categories to which “dog” belongs, and all the categories to which those categories belong, recursively:

graph = closure_graph(dog,
                      lambda s: s.hypernyms())
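
To make "transitive closure" concrete without needing the WordNet corpus, here is a minimal sketch over a hand-written toy hierarchy. The TOY_HYPERNYMS table and the closure helper are invented for illustration; they are not WordNet's data or NLTK's API.

```python
# Hypothetical, hand-written fragment of a hypernym hierarchy (not real WordNet data).
TOY_HYPERNYMS = {
    "dog": ["canine", "domestic_animal"],
    "canine": ["carnivore"],
    "carnivore": ["animal"],
    "domestic_animal": ["animal"],
    "animal": ["entity"],
    "entity": [],
}

def closure(start, fn):
    """Return everything reachable from start by applying fn repeatedly."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in fn(node):
            if parent not in seen:  # skip categories we have already visited
                seen.add(parent)
                stack.append(parent)
    return seen

ancestors = closure("dog", lambda w: TOY_HYPERNYMS.get(w, []))
print(sorted(ancestors))
# ['animal', 'canine', 'carnivore', 'domestic_animal', 'entity']
```

The seen set matters because "animal" is reachable along two different paths, and we only want to expand it once.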

After that, we just pass the resulting graph to NetworkX for display:

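# nx.draw_graphviz is gone from newer NetworkX releases; this equivalent needs pygraphviz and matplotlib.
nx.draw(graph, pos=nx.nx_agraph.graphviz_layout(graph), with_labels=True)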

The implementation

The closure_graph function calls fn on the supplied synset, then recursively on each result, and records every synset and relationship it finds in a NetworkX directed graph. The imports go at the top of the file, so you can use wn and nx in your own code.

from nltk.corpus import wordnet as wn
import networkx as nx

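def closure_graph(synset, fn):
    """Build a directed graph of everything reachable from synset via fn."""
    seen = set()
    graph = nx.DiGraph()

    def recurse(s):
        if s not in seen:
            seen.add(s)
            # On NLTK 3+, name() is a method; older releases used the attribute s.name.
            graph.add_node(s.name())
            for s1 in fn(s):
                # add_edge also creates the node for s1 if it is missing.
                graph.add_edge(s.name(), s1.name())
                recurse(s1)

    recurse(synset)
    return graph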

By using a high-quality graph library, we make it much easier to merge, analyze and display our graphs.
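
As a rough illustration of the "merge and analyze" part, the sketch below combines two small graphs and asks which categories they share. The node names are toy stand-ins typed by hand, not output from closure_graph, and the example assumes NetworkX is installed.

```python
import networkx as nx

# Toy stand-ins for two closure graphs; edges point from a word to its category.
dog = nx.DiGraph([("dog", "canine"), ("canine", "carnivore"), ("carnivore", "animal")])
cat = nx.DiGraph([("cat", "feline"), ("feline", "carnivore"), ("carnivore", "animal")])

# Merge: compose returns the union of both graphs' nodes and edges.
merged = nx.compose(dog, cat)

# Analyze: categories common to both words are the nodes reachable from each.
shared = nx.descendants(merged, "dog") & nx.descendants(merged, "cat")

# Analyze: the chain of categories linking "cat" to "animal".
path = nx.shortest_path(merged, "cat", "animal")

print(sorted(shared))
print(path)
```

The same calls should work on graphs returned by closure_graph, since those are ordinary nx.DiGraph objects keyed by synset name.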

More graphs

Parts of the finger, generated with synset('finger.n.01') and part_meronyms:

Types of running, generated with synset('run.v.01') and hyponyms:

Matthew wrote on Jan 01, 2010:

Do you know of any similar modules in Haskell that will let me play around with this sort of thing?

Eric wrote on Jan 01, 2010:

Google finds HWordNet, which looks pretty reasonable. For maximum enjoyment, you'll also want some kind of DAG or graph library, and visualization tools. I have to say, I'm delighted about WordNet, because it is both terrifyingly comprehensive and remarkably robust--you can actually get away with writing software that reasons robustly over WordNet relationships. I find this remarkable, and may post some examples soon.
