The state of Ruby, RDF and Rails 3

Posted by Eric Kidd Mon, 20 Dec 2010 19:56:00 GMT

Recently, I was investigating the state of RDF in the Ruby world. Here are some notes, in case anybody is curious. I have used only a few of these Ruby RDF libraries, so please feel free to add your own comments with corrections and other alternatives.

There’s also some stuff about ActiveModel and ActiveRelation down at the end, for people who are interested in Rails 3.

For a list of available Ruby RDF libraries, run:

gem search -dr rdf

RDF.rb: A high-level, pure-Ruby RDF library

RDF.rb appears to be the most complete of the Ruby RDF libraries. It represents RDF triples using a hierarchy of Ruby classes, and it supports many formats and data stores via plugins. RDF.rb is actively maintained, with the latest commit occurring less than 8 hours before this post was written.

The short descriptions below are taken directly from the output of the gem command.

Formats:

  • N-Triples (included)
  • rdf-json: RDF/JSON support for RDF.rb.
  • rdf-n3: Notation-3 (n3-rdf) and Turtle reader/writer for RDF.rb.
  • rdf-rdfa: RDFa reader for RDF.rb.
  • rdf-rdfxml: RDF/XML reader/writer for RDF.rb.
  • rdf-trix: TriX support for RDF.rb.
  • rdf-xml: An RDF.rb plugin for XML files.

Storage adapters:

  • In-memory RDF store (included).
  • rdf-4store: 4store adapter for RDF.rb.
  • rdf-bert: BERT-RPC repository proxy for RDF.rb.
  • rdf-cassandra: Apache Cassandra adapter for RDF.rb.
  • rdf-do: RDF.rb plugin providing a DataObjects storage adapter.
  • rdf-mongo: A storage adapter for integrating MongoDB and RDF.rb.
  • rdf-redstore: RDF.rb plugin providing a RedStore storage adapter.
  • rdf-sesame: Sesame 2.0 adapter for RDF.rb.
  • rdf-talis: RDF.rb plugin providing a Talis platform storage adapter.

Related libraries:

  • rdf-isomorphic: RDF.rb plugin for graph bijections and isomorphic equivalence.
  • rdf-raptor: Raptor RDF Parser wrapper for RDF.rb.
  • rdf-rasqal: Rasqal RDF Query Library plugin for RDF.rb.
  • rdf-sparql: RDF.rb plugin for parsing / writing SPARQL queries.
  • rdf-spec: RSpec extensions for RDF.rb.
  • rdfgrid: Map/Reduce pipelines for RDF.rb.

ActiveRDF: Rails object mapper (from the pre-ActiveModel days)

Once upon a time, ActiveRDF was the premier high-level library for working with RDF and Rails. It re-implemented much of the ActiveRecord API, allowing Rails developers to treat RDF datastores in much the same way they treated SQL databases.

Unfortunately, ActiveRDF no longer appears to be actively maintained. Furthermore, because ActiveRDF is a re-implementation of ActiveRecord, it doesn’t take advantage of the new ActiveModel and ActiveRelation libraries that have become available with Rails 3. (See below for why this would be nice.)

Storage adapters and libraries:

  • activerdf_jena: ActiveRDF adapter to the Jena RDF store
  • activerdf_rdflite: An RDF database for usage in ActiveRDF (based on sqlite3).
  • activerdf_redland: ActiveRDF adapter to Redland RDF store.
  • activerdf_rules: A rulebase and forward chaining production system for activerdf databases.
  • activerdf_sesame: Jruby adapter to sesame2 datastore (for usage in ActiveRDF).
  • activerdf_sparql: ActiveRDF adapter to SPARQL endpoint.
  • activerdf_agraph: AllegroGraph storage adapter. Not currently available from gemcutter. (Full disclosure: Several years ago, Franz paid me to work on this project part time.)

Redland Ruby bindings

The Redland library is widely used by C developers, and it includes Ruby bindings. This code appears to be actively maintained.

agraph: Low-level AllegroGraph bindings

phifty has recently been working on low-level Ruby bindings for AllegroGraph. I’ve tried these out, and they work quite well—it took about 5 minutes to get started after installing AllegroGraph’s free Java edition on an EC2 server. (More details on that in a future post, if anybody is interested.)

One word of warning: As far as I can tell, the agraph gem requires that string literals be passed as raw RDF syntax, and not as high-level objects. This is slightly awkward. Note how we specify the English-language string “John Doe” in the following example:

stmts = repo.statements
stmts.create('<http://example.com/people#jdoe>',
             '<http://xmlns.com/foaf/0.1/name>',
             '"John Doe"@en')
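
If you'd rather not hand-write that raw syntax everywhere, a pair of small helpers can build the strings for you. This is only a minimal sketch in plain Ruby: rdf_literal and rdf_uri are hypothetical names of my own, not part of the agraph gem, and the escaping covers just the common cases (backslash, double quote, newline, carriage return, tab) rather than every corner of the N-Triples grammar. It uses keyword arguments, so it assumes a reasonably current Ruby.

```ruby
# Hypothetical helpers (NOT part of the agraph gem): they build the raw
# N-Triples-style strings that the example above writes by hand.

# Characters escaped inside a literal, mapped to their escaped form.
LITERAL_ESCAPES = {
  '\\' => '\\\\',
  '"'  => '\\"',
  "\n" => '\\n',
  "\r" => '\\r',
  "\t" => '\\t'
}.freeze

# Returns a quoted literal, optionally tagged with a language or a datatype.
def rdf_literal(value, lang: nil, datatype: nil)
  raise ArgumentError, 'give lang or datatype, not both' if lang && datatype

  # Escape backslashes, quotes and common control characters in one pass.
  escaped = value.to_s.gsub(/[\\"\n\r\t]/) { |ch| LITERAL_ESCAPES.fetch(ch) }

  literal = %("#{escaped}")
  literal += "@#{lang}" if lang
  literal += "^^<#{datatype}>" if datatype
  literal
end

# Wraps an IRI in angle brackets.
def rdf_uri(iri)
  "<#{iri}>"
end

puts rdf_uri('http://xmlns.com/foaf/0.1/name')  # prints <http://xmlns.com/foaf/0.1/name>
puts rdf_literal('John Doe', lang: 'en')        # prints "John Doe"@en
```

With these in place, the third argument in the example above could be written as rdf_literal('John Doe', lang: 'en') instead of a hand-quoted string.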

Other RDF resources

The rubyrdf library appears to have been abandoned. The public-rdf-ruby mailing list seems pretty quiet these days, too. The rdf_context library is still going strong, but it appears to fill roughly the same niche as RDF.rb without the depth of add-on libraries.

Besides ActiveRDF, there are two other RDF object mappers available for Ruby. These have some interesting ideas, but they don’t seem to have a lot of traction.

  • rdf-mapper: An RDF object mapper sitting directly on top of RDF.rb. According to the website, this is under heavy development and not yet ready for production use. It handles RDF vocabularies and properties in a rather nice way, relying on RDF.rb’s infrastructure whenever possible. However, because this is a standalone library, it doesn’t have deep integration into the Rails 3 ecosystem.
  • dm-rdf: DataMapper storage adapter for a variety of different RDF data stores, including Sesame and RDFcache. Has a reasonable syntax for declaring object properties and default namespaces.

If anybody is looking for ideas on how to make good dynamic-language APIs for working with RDF, here are some Python RDF libraries for comparison purposes:

What’s missing: ActiveModel and ActiveRelation

As far as I can tell, two major players are missing from the Ruby RDF scene: ActiveModel and ActiveRelation. If you haven’t heard of these, you might enjoy the slick introductory video (6m 30s).

Up until Rails 3, Rails developers used the built-in ActiveRecord API to access SQL databases. If they wanted to work with non-SQL databases, they had to make some sacrifices. Libraries like ActiveRDF and MongoMapper re-implemented much of the ActiveRecord API, but they had a number of shortcomings: features like validations always worked in a slightly incompatible fashion, and other Rails add-ons would frequently be confused by the API differences.

The Rails 3 team decided to stop this pointless reinvention of (square) wheels, and extracted much of ActiveRecord’s functionality into ActiveModel. While they were at it, they replaced ActiveRecord’s ad hoc query APIs with ActiveRelation, which offers a fully-composable interface to the relational algebra in Ruby:

class Post < ActiveRecord::Base
  scope :published, where(:published => true)
  scope :visible,
        published.where(:hidden => false)
  scope :reverse_publication_order,
        order('posts.created_at DESC')
end

# Return the 10 most recent visible posts.
Post.visible.reverse_publication_order.limit(10)

Using has_scope, you can even map these scopes directly onto URL query parameters!
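
To make "fully-composable" a bit more concrete, here is a toy model of the idea in plain Ruby. It is emphatically not how ActiveRelation is implemented: TinyRelation and everything inside it are names I made up for illustration, and the quoting is deliberately naive. The point is only that each call returns a new, immutable query object, and that no SQL is generated until you ask for it.

```ruby
# A toy stand-in for a relation. This is NOT ActiveRelation's implementation;
# the class and method names are illustrative assumptions only.
class TinyRelation
  attr_reader :table, :wheres, :orders, :limit_value

  def initialize(table, wheres: [], orders: [], limit_value: nil)
    @table = table
    @wheres = wheres.freeze
    @orders = orders.freeze
    @limit_value = limit_value
  end

  # Each builder returns a NEW relation, so partial queries can be reused.
  def where(conditions)
    clauses = conditions.map { |column, value| "#{table}.#{column} = #{quote(value)}" }
    copy(wheres: wheres + clauses)
  end

  def order(expression)
    copy(orders: orders + [expression])
  end

  def limit(count)
    copy(limit_value: Integer(count))
  end

  # SQL is only generated when somebody asks for it.
  def to_sql
    sql = "SELECT * FROM #{table}"
    sql += " WHERE #{wheres.join(' AND ')}" unless wheres.empty?
    sql += " ORDER BY #{orders.join(', ')}" unless orders.empty?
    sql += " LIMIT #{limit_value}" if limit_value
    sql
  end

  private

  # Builds a new relation from the current state plus the given overrides.
  def copy(overrides)
    state = { wheres: wheres, orders: orders, limit_value: limit_value }.merge(overrides)
    self.class.new(table, **state)
  end

  # Deliberately naive quoting: fine for a sketch, not for real SQL.
  def quote(value)
    case value
    when true then 'TRUE'
    when false then 'FALSE'
    when Numeric then value.to_s
    else "'#{value.to_s.gsub("'", "''")}'"
    end
  end
end

posts     = TinyRelation.new('posts')
published = posts.where(published: true)
visible   = published.where(hidden: false)
recent    = visible.order('posts.created_at DESC').limit(10)

puts recent.to_sql
puts published.to_sql # the intermediate relation is unchanged and reusable
```

Because published is never mutated, it can serve as the base for any number of other queries, which is the same property that lets the visible scope above build on the published scope.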

For some more examples of ActiveModel goodness, see Mongoid and the Rails 3 libraries for neo4j. If you want to implement your own object mapper, I strongly recommend the excellent Crafting Rails Applications book and this talk on ActiveModel and ActiveRelation.

Clearly, there are some real possibilities for working with RDF here.
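
As a thought experiment, imagine a scope-like query object that emits SPARQL triple patterns where a relational one would emit SQL. The sketch below is plain Ruby and entirely hypothetical: TriplePatternQuery is not an API from RDF.rb, ActiveRelation or any other library, it only handles a single subject variable, and it expects objects to be passed as raw RDF syntax. I have no idea yet whether this approach scales to real queries, but it shows the shape of the idea.

```ruby
# Thought experiment only: a chainable query object that emits SPARQL.
# None of these names come from RDF.rb, ActiveRelation or any other library.
class TriplePatternQuery
  FOAF = 'http://xmlns.com/foaf/0.1/'

  def initialize(patterns = [], limit_value = nil)
    @patterns = patterns
    @limit_value = limit_value
  end

  # Adds one "?s <predicate> object ." pattern per pair and returns a new query.
  # Objects must already be in raw RDF syntax, e.g. '"John Doe"@en'.
  def where(conditions)
    added = conditions.map { |predicate, object| "?s <#{predicate}> #{object} ." }
    self.class.new(@patterns + added, @limit_value)
  end

  def limit(count)
    self.class.new(@patterns, Integer(count))
  end

  # The SPARQL text is only built when requested.
  def to_sparql
    sparql = "SELECT ?s WHERE { #{@patterns.join(' ')} }"
    sparql += " LIMIT #{@limit_value}" if @limit_value
    sparql
  end
end

people = TriplePatternQuery.new
named  = people.where("#{TriplePatternQuery::FOAF}name" => '"John Doe"@en')

puts named.limit(5).to_sparql
```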


Comments

  1. Ben Lavender said about 6 hours later:

    Good overview! Let us know if you’re needing some aspect of RDF that RDF.rb doesn’t support yet.

    On the ORM front, I have to pipe up and say you missed Spira, a work-in-progress ORM for RDF.

    http://github.com/datagraph/spira

    It needs plenty of work, but does fine for simple applications.

    One thing I’ve learned with Spira is that adapting RDF to a relational model is not trivial. The semantics are different, and ActiveRelation and DataMapper both make a lot of assumptions about a relational data model.

    You can find an overview of some of the design decisions here: http://blog.datagraph.org/2010/05/spira

    That blog also has some RDF.rb tutorials.

  2. Ben Lavender said about 13 hours later:

    For the record, after reading this, I went down a rabbit hole of recorded talks and googling for information on ActiveModel more recent than when I started Spira some 6 months ago. Suffice to say ‘thanks’, Spira needs more than ‘some work’ now :)

  3. Arto Bendiken said about 14 hours later:

    That’s a good summary of the state of RDF in Ruby, Eric.

    Note that the dm-rdf gem is discontinued in favor of Spira, linked to by Ben above.

    Also, we have some initial SPARQL support for RDF.rb in the form of the sparql-client gem.

    The easiest way to discover currently-available RDF.rb plugins is to browse RubyGems.org using the “rdf-” and “sparql-” prefixes:

    http://rubygems.org/search?query=rdf-
    http://rubygems.org/search?query=sparql-

  4. Eric said about 17 hours later:

    Arto: Thank you for all your work on RDF.rb! I’m really impressed at the size of the ecosystem you’ve built around RDF.rb. And I’ll take a good look through the SPARQL plugins for RDF.rb later; I didn’t see those when I wrote the article.

    Ben: Spira looks really great! I had just found your email about Spira’s design decisions this morning. You’ve clearly thought about the implications of the “open model” nature of RDF, and how that affects ORMs.

    To a certain extent, I’m approaching the question of an RDF ORM from the other direction: I want something that works, seamlessly, with Rails plugins like inherited_resources and has_scope. This would allow existing Rails users to drop RDF models into their applications, and they wouldn’t have to know very much about RDF at first.

    For an example of deep Rails 3 integration, see Mongoid, which gets MongoDB users about 80% of the way to Rails nirvana. And MongoDB really isn’t a relational database at all. I’d love it if the RDF community could have ORMs at least as “Rails-like” as Mongoid.

    Clearly, I need to spend some time thinking about how ActiveRelation and RDF interact. Because ActiveRelation is really about SQL queries (and not about ORM mapping at all), I suspect that we’re looking at an essentially mathematical question: Is there a clean mapping between the n-tuple relations of SQL’s relational algebra, and the open-ended triples used by RDF? There’s clearly been a lot of work on this question, which I need to read.

    If there is a reasonable mathematical mapping between the relational algebra and RDF, the next goal would be to take the query “parse tree” generated by ActiveRelation and to try to compile it to SPARQL instead of SQL. I have no idea whether this would actually work, but it would certainly be instructive.

    If I keep digging into this question, I’ll probably reach much the same conclusions that you have, Ben. I just like doing things the hard way. :-)

    Many thanks to both of you for your excellent contributions to the Ruby RDF world!

  5. Eric said about 19 hours later:

    Here are some more discussions on Ruby and RDF, courtesy of the various “overflow” sites:

    StackOverflow: The State of RDF in Ruby
    SemanticOverflow: What Ruby library do you use for working with RDF?

    There are also some further discussions of RdfContext and RDFObjects, which I don’t really cover above.

  6. Arto Bendiken said about 21 hours later:

    I should have also mentioned that Gregg Kellogg (@gkellogg), the author of RdfContext, has joined the RDF.rb core development team (to be announced when we release 0.3.0 later this week), which now consists of him, Ben (@bhuga), and myself.

    Many of the most important parser/serializer plugins for RDF.rb are Gregg’s creations and based on code he originally wrote for RdfContext.

    Similarly, Ross Singer (@rsinger), the author of RDFObjects, has also mentioned plans to have RDFObjects be based on and compatible with RDF.rb. I don’t know of an ETA, however.

    So we’re all basically pulling in the same direction here.
