Recently, I was investigating the state of RDF in the Ruby world. Here are some notes, in case anybody is curious. I have used only a few of these Ruby RDF libraries, so please feel free to add your own comments with corrections and other alternatives.

There's also some stuff about ActiveModel and ActiveRelation down at the end, for people who are interested in Rails 3.

For a list of available Ruby RDF libraries, run:

gem search -dr rdf

RDF.rb: A high-level, pure-Ruby RDF library

RDF.rb appears to be the most complete of the Ruby RDF libraries. It represents RDF triples using a hierarchy of Ruby classes, and it supports many formats and data stores via plugins. RDF.rb is actively maintained, with the latest commit occurring less than 8 hours before this post was written.

The short descriptions below are taken directly from the gem search output.

Formats:

  • N-Triples (included)
  • rdf-json: RDF/JSON support for RDF.rb.
  • rdf-n3: Notation-3 (n3-rdf) and Turtle reader/writer for RDF.rb.
  • rdf-rdfa: RDFa reader for RDF.rb.
  • rdf-rdfxml: RDF/XML reader/writer for RDF.rb.
  • rdf-trix: TriX support for RDF.rb.
  • rdf-xml: An RDF.rb plugin for XML files.

Storage adapters:

  • In-memory RDF store (included).
  • rdf-4store: 4store adapter for RDF.rb.
  • rdf-bert: BERT-RPC repository proxy for RDF.rb.
  • rdf-cassandra: Apache Cassandra adapter for RDF.rb.
  • rdf-do: RDF.rb plugin providing a DataObjects storage adapter.
  • rdf-mongo: A storage adapter for integrating MongoDB and RDF.rb.
  • rdf-redstore: RDF.rb plugin providing a RedStore storage adapter.
  • rdf-sesame: Sesame 2.0 adapter for RDF.rb.
  • rdf-talis: RDF.rb plugin providing a Talis platform storage adapter.

Related libraries:

  • rdf-isomorphic: RDF.rb plugin for graph bijections and isomorphic equivalence.
  • rdf-raptor: Raptor RDF Parser wrapper for RDF.rb.
  • rdf-rasqal: Rasqal RDF Query Library plugin for RDF.rb.
  • rdf-sparql: RDF.rb plugin for parsing / writing SPARQL queries.
  • rdf-spec: RSpec extensions for RDF.rb.
  • rdfgrid: Map/Reduce pipelines for RDF.rb.

ActiveRDF: Rails object mapper (from the pre-ActiveModel days)

Once upon a time, ActiveRDF was the premier high-level library for working with RDF and Rails. It re-implemented much of the ActiveRecord API, allowing Rails developers to treat RDF datastores in much the same way they treated SQL databases.

Unfortunately, ActiveRDF no longer appears to be actively maintained. Furthermore, because ActiveRDF is a re-implementation of ActiveRecord, it doesn't take advantage of the new ActiveModel and ActiveRelation libraries that have become available with Rails 3. (See below for why this would be nice.)

Storage adapters and libraries:

  • activerdf_jena: ActiveRDF adapter to the Jena RDF store
  • activerdf_rdflite: An RDF database for usage in ActiveRDF (based on sqlite3).
  • activerdf_redland: ActiveRDF adapter to Redland RDF store.
  • activerdf_rules: A rulebase and forward chaining production system for activerdf databases.
  • activerdf_sesame: Jruby adapter to sesame2 datastore (for usage in ActiveRDF).
  • activerdf_sparql: ActiveRDF adapter to SPARQL endpoint.
  • activerdf_agraph: AllegroGraph storage adapter. Not currently available from gemcutter. (Full disclosure: Several years ago, Franz paid me to work on this project part time.)

Redland Ruby bindings

The Redland library is widely used by C developers, and it includes Ruby bindings. This code appears to be actively maintained.

agraph: Low-level AllegroGraph bindings

phifty has recently been working on low-level Ruby bindings for AllegroGraph. I've tried these out, and they work quite well—it took about 5 minutes to get started after installing AllegroGraph's free Java edition on an EC2 server. (More details on that in a future post, if anybody is interested.)

One word of warning: As far as I can tell, the agraph gem requires string literals to be passed as raw RDF syntax rather than as high-level objects. This is slightly awkward. For example, note how the following code specifies the English-language string "John Doe":

stmts = repo.statements
stmts.create('<http://example.com/people#jdoe>',
             '<http://xmlns.com/foaf/0.1/name>',
             '"John Doe"@en')
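
If you end up building these raw strings by hand a lot, a couple of tiny helpers can take some of the sting out. This is only a sketch: rdf_literal, rdf_uri and LITERAL_ESCAPES are names I made up for illustration, they are not part of the agraph gem, and I'm assuming that N-Triples-style backslash escapes are what the server expects.

```ruby
# Hypothetical helpers, NOT part of the agraph gem.
# Assumption: the raw syntax follows N-Triples-style escaping.
LITERAL_ESCAPES = {
  '"'  => '\\"',   # double quote -> \"
  '\\' => '\\\\',  # backslash    -> \\
  "\n" => '\\n',   # newline      -> \n
  "\r" => '\\r',   # return       -> \r
  "\t" => '\\t'    # tab          -> \t
}

# Builds a quoted literal, optionally followed by a language tag.
def rdf_literal(text, lang = nil)
  escaped = text.gsub(/["\\\n\r\t]/) { |ch| LITERAL_ESCAPES.fetch(ch) }
  literal = %("#{escaped}")
  lang ? "#{literal}@#{lang}" : literal
end

# Wraps a URI in angle brackets.
def rdf_uri(uri)
  "<#{uri}>"
end

# Possible usage with the statements object from the example above:
#   stmts.create(rdf_uri('http://example.com/people#jdoe'),
#                rdf_uri('http://xmlns.com/foaf/0.1/name'),
#                rdf_literal('John Doe', :en))
```

The escape table only covers the most common characters, so treat it as a starting point rather than a complete serializer.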

Other RDF resources

The rubyrdf library appears to have been abandoned. The public-rdf-ruby mailing list seems pretty quiet these days, too. The rdf_context library is still going strong, but it appears to fill roughly the same niche as RDF.rb without the depth of add-on libraries.

Besides ActiveRDF, there are two other RDF object mappers available for Ruby. These have some interesting ideas, but they don't seem to have a lot of traction.

  • rdf-mapper: An RDF object mapper sitting directly on top of RDF.rb. According to the website, this is under heavy development and not yet ready for production use. It handles RDF vocabularies and properties in a rather nice way, relying on RDF.rb's infrastructure whenever possible. However, because this is a standalone library, it doesn't have deep integration into the Rails 3 ecosystem.
  • dm-rdf: DataMapper storage adapter for a variety of different RDF data stores, including Sesame and RDFcache. Has a reasonable syntax for declaring object properties and default namespaces.

If anybody is looking for ideas on how to make good dynamic-language APIs for working with RDF, here are some Python RDF libraries for comparison purposes:

What's missing: ActiveModel and ActiveRelation

As far as I can tell, two major players are missing from the Ruby RDF scene: ActiveModel and ActiveRelation. If you haven't heard of these, you might enjoy the slick introductory video (6m 30s).

Up until Rails 3, Rails developers used the built-in ActiveRecord API to access SQL databases. If they wanted to work with non-SQL databases, they had to make some sacrifices. Libraries like ActiveRDF and MongoMapper re-implemented much of the ActiveRecord API, but they had a number of shortcomings: features like validations always worked in a slightly incompatible fashion, and other Rails add-ons would frequently be confused by the API differences.

The Rails 3 team decided to stop this pointless reinvention of (square) wheels, and extracted much of ActiveRecord's functionality into ActiveModel. While they were at it, they replaced ActiveRecord's ad hoc query APIs with ActiveRelation, which offers a fully composable interface to the relational algebra in Ruby:

class Post < ActiveRecord::Base
  scope :published, where(:published => true)
  scope :visible,
        published.where(:hidden => false)
  scope :reverse_publication_order,
        order('posts.created_at DESC')
end

# Return the 10 most recent visible posts.
Post.visible.reverse_publication_order.limit(10)

Using has_scope, you can even map these scopes directly onto URL query parameters!

For some more examples of ActiveModel goodness, see Mongoid and the Rails 3 libraries for neo4j. If you want to implement your own object mapper, I strongly recommend the excellent Crafting Rails Applications book and this talk on ActiveModel and ActiveRelation.

Clearly, there are some real possibilities for working with RDF here.
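
To make that a little more concrete, here is a rough, pure-Ruby sketch of what an ActiveRelation-style chainable query over RDF triples might feel like. Everything in it is hypothetical: TripleRelation, its where and limit methods, and the hash-based triple format are invented for illustration and don't correspond to any existing gem. The one idea borrowed from ActiveRelation is that each call returns a new relation, and nothing is evaluated until you enumerate it.

```ruby
# Hypothetical sketch: TripleRelation is invented for illustration only.
class TripleRelation
  include Enumerable

  def initialize(triples, patterns = [], max = nil)
    @triples  = triples
    @patterns = patterns
    @max      = max
  end

  # Returns a NEW relation with one more pattern to match.
  # A pattern is a hash such as { :predicate => some_uri }.
  def where(pattern)
    TripleRelation.new(@triples, @patterns + [pattern], @max)
  end

  # Returns a NEW relation that yields at most `count` triples.
  def limit(count)
    TripleRelation.new(@triples, @patterns, count)
  end

  # Matching happens here, only when the relation is enumerated.
  def each(&block)
    matches = @triples.select do |triple|
      @patterns.all? do |pattern|
        pattern.all? { |position, value| triple[position] == value }
      end
    end
    matches = matches.first(@max) if @max
    matches.each(&block)
  end
end

# Sample data (made up for this example).
JDOE      = '<http://example.com/people#jdoe>'
ASMITH    = '<http://example.com/people#asmith>'
FOAF_NAME = '<http://xmlns.com/foaf/0.1/name>'
FOAF_NICK = '<http://xmlns.com/foaf/0.1/nick>'

TRIPLES = [
  { :subject => JDOE,   :predicate => FOAF_NAME, :object => '"John Doe"@en' },
  { :subject => JDOE,   :predicate => FOAF_NICK, :object => '"jdoe"' },
  { :subject => ASMITH, :predicate => FOAF_NAME, :object => '"Alice Smith"@en' }
]

# Relations compose much like the scopes above.
NAMES      = TripleRelation.new(TRIPLES).where(:predicate => FOAF_NAME)
JDOE_NAMES = NAMES.where(:subject => JDOE)
```

A real implementation would presumably translate the accumulated patterns into a SPARQL query instead of filtering an in-memory array, but the chaining would look the same from the caller's side.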