Screencast: Use Rails and RDF.rb to parse Best Buy product reviews

Posted by Eric Kidd Sun, 05 Jun 2011 19:07:00 GMT

In the past few years, many companies have been embedding machine-readable metadata in their web pages. Among these is Best Buy, which provides extensive RDFa data describing their products, prices and user reviews.

The following 20-minute screencast shows how to use Ruby 1.9.2, Rails 3.1rc1, RDF.rb and my rdf-agraph gem to compare user ratings of the iPad and various Android Honeycomb tablets.



Heroku "Celadon Cedar" review

Posted by Eric Kidd Fri, 03 Jun 2011 19:20:00 GMT

Heroku just released a new version of their hosting service for Ruby on Rails. It’s called Celadon Cedar, and it adds support for arbitrary background processes, Node.js servers and long-polling over HTTP.

I just finished porting a large Rails 3.0 application from Chef+EC2 to Heroku’s Cedar stack, and I’m deeply impressed. But there are still some rough edges, especially with regard to asset caching.

Procfiles are really cool

Previous versions of Heroku could only run two types of processes: Web servers and delayed_job workers. If you needed to monitor a ZeroMQ queue or run a cron job every minute, you were out of luck. So even though I loved Heroku, about 2/3rds of my clients couldn’t even consider using it.

Celadon Cedar, however, allows you to create a Procfile specifying a list of process types to run:

web:    bundle exec rails server -p $PORT
worker: bundle exec rake jobs:work
clock:  bundle exec clockwork config/clock.rb

Once you’ve deployed your project, you can specify how many of each process you want:

heroku scale web=3 worker=2 clock=1

Even better, if you’re running on a development machine, or if you want to deploy to a regular Linux server, you can use the Foreman gem to launch the processes manually, or to generate init scripts:

foreman start
foreman export upstart /etc/init -u username

If you’re feeling more ambitious, you can also run Unicorn and Node.js servers on Heroku.

Asset caching is even worse than before

Previous versions of Heroku had a built-in Varnish cache, which would cache CSS, JavaScript and images for 12 hours. The Varnish cache was automatically flushed on redeploy, so it gave you a nice performance boost for zero work.

However, if you were running a high-performance site, you would generally want to run all your JavaScript and CSS through YUI Compressor, which vastly improves your download times. Under the previous version of Heroku, this was annoying to set up: You had to either commit your compiled assets into git, or deploy them to a CDN manually.

The Celadon Cedar stack, unfortunately, doesn’t make it any easier to set up YUI Compressor, and it removes the existing Varnish cache. In place of Varnish, Heroku encourages you to set up Rack::Cache with memcached as a storage backend.

You may want to consider adding the following line to your file, right before the run statement:

use Rack::Deflater

Combined with Rack::Cache, this will give you back some of the functionality of Varnish. But it’s a lot more work than you needed to do before, and the results aren’t as good. Heroku made this decision deliberately, because Varnish prevented them from doing cool things with Node.js servers and long-polled HTTP connections. But it still represents a retreat from Heroku’s famous ease of use.
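As a rough sketch of how the two middlewares might sit together in a Rails 3.0 rackup file: the file name config.ru, the application constant MyApp::Application, and the memcached address are all assumptions for illustration, and the rack-cache gem plus a memcached client gem would need to be in your Gemfile.

```ruby
# config.ru (assumed file name; MyApp::Application is a hypothetical app constant)
require ::File.expand_path('../config/environment', __FILE__)
require 'rack/cache'

# Cache responses in memcached. The host and port below are placeholders;
# on Heroku they would come from your memcached add-on's settings.
use Rack::Cache,
  :metastore   => 'memcached://localhost:11211/meta',
  :entitystore => 'memcached://localhost:11211/body',
  :verbose     => false

# Compress response bodies, placed right before the run statement.
use Rack::Deflater

run MyApp::Application
```

Responses only end up in the cache if your application sends suitable Cache-Control headers, so this is a starting point and not a drop-in replacement for Varnish.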

What Heroku’s Cedar stack really needs is first-class support for Rack::Cache, Rack::Deflater, and the new Sprockets asset caching in Rails 3.1. Please, just allow me to add a couple of lines to my Gemfile and have everything work automagically. Yeah, you’ve spoiled me and made me lazy.

You’ll have to upgrade to Ruby 1.9.2

According to the official documentation, only Ruby 1.9.2 is supported under Celadon Cedar. This isn’t entirely surprising—Rails 3.1 recommends Ruby 1.9.2 as well—but it may be a problem for some users.

Fortunately, my client’s application worked flawlessly under Ruby 1.9.2 with only a single change to the Gemfile.

Running a cron job once per minute is really easy, but it costs $71/month

One of Heroku’s engineers explains how to run high-frequency cron jobs using Clockwork and delayed_job.

Basically, you add a couple of lines to your Procfile:

worker: bundle exec rake jobs:work
clock:  bundle exec clockwork config/clock.rb

…and you put something like the following in config/clock.rb:

require File.expand_path('../environment',  __FILE__)

# Run our heartbeat once per minute.
every(1.minutes, 'myapp.heartbeat') { MyApp.delay.heartbeat }

This creates a DelayedJob and hands it off to our worker process. According to the tutorial, you’re supposed to do the actual work in a separate process, so as not to interfere with other events. This approach is elegant, but it’s going to cost you $71/month for two “dynos”. Ouch.

Cedar is a great new stack, but it needs polishing

I’m really impressed with Celadon Cedar. Heroku has vastly improved their support for complex applications with a lot of moving parts. But along the way, they’ve made it slightly harder to deploy simple applications, and they still don’t have a painless way to do asset caching. Of course, the situation should improve dramatically once the Ruby community plays with Cedar for a few weeks.

Many thanks to Heroku for a great new release! I’ll be moving more applications over soon.

Does anybody have any suggestions on how to make better use of Cedar and Rails?


The state of Ruby, RDF and Rails 3

Posted by Eric Kidd Mon, 20 Dec 2010 19:56:00 GMT

Recently, I was investigating the state of RDF in the Ruby world. Here are some notes, in case anybody is curious. I have used only a few of these Ruby RDF libraries, so please feel free to add your own comments with corrections and other alternatives.

There’s also some stuff about ActiveModel and ActiveRelation down at the end, for people who are interested in Rails 3.



Wave Hackathon

Posted by Eric Kidd Sat, 21 Nov 2009 23:35:00 GMT

I’m currently attending the Wave hackathon at the Massachusetts GTUG. Here’s some code from a protocol-level Wave agent that I just demoed:

# Capitalize random words.
replace /\b(random|words)\b/i do |word|
  # ...
end

# Shorten URLs.
replace /\bhttp:\/\/([^ ]+)/ do |url|
  # ...
end

In keeping with the traditions of hackathons, this agent is horribly fragile. It only works with FedOne’s console-based wave client, and it doesn’t handle annotations correctly.

Some earlier—and more robust—wave-related projects:

  • Pick Several: A gadget which implements approval voting. Written using GWT. Includes a reusable library for writing GWT-based wave gadgets.
  • BugLinky: A robot which links bug numbers to a bug tracker. Includes a reusable library for simple pattern-matching, text replacement and annotation.

Many thanks to GTUG and to Google for organizing this hackathon!


Write a 32-line chat client using Ruby, AMQP & EventMachine (and a GUI using Shoes)

Posted by Eric Kidd Fri, 08 May 2009 18:06:00 GMT

Have you ever considered using instant messages to communicate between programs? You can do this using Jabber’s XMPP protocol, of course. But it’s also worth taking a look at AMQP, a distributed messaging protocol first used at JPMorgan Chase. AMQP is fast, easy to use, and implemented by at least 4 open source servers.

To try it out, install the excellent Ruby AMQP bindings, and set up the RabbitMQ server (which is written in Erlang using Mnesia). On a Mac, you might do something like this:

sudo gem install amqp
sudo port install python25 rabbitmq-server
sudo rabbitmq-server

Once your server is running, save the following code as chat.rb:

require 'rubygems'
gem 'amqp'
require 'mq'

unless ARGV.length == 2
  STDERR.puts "Usage: #{$0} <channel> <nick>"
  exit 1
end
$channel, $nick = ARGV

AMQP.start(:host => 'localhost') do
  $chat = MQ.topic('chat')

  # Print any messages on our channel.
  queue = MQ.queue($nick)
  queue.bind('chat', :key => $channel)
  queue.subscribe do |msg|
    if msg.index("#{$nick}:") != 0
      puts msg
    end
  end

  # Forward console input to our channel.
  module KeyboardInput
    include EM::Protocols::LineText2
    def receive_line data
      $chat.publish("#{$nick}: #{data}",
                    :routing_key => $channel)
    end
  end
  EM.open_keyboard(KeyboardInput)
end

Now, run copies in two different terminals:

ruby chat.rb channel_1 sarah
ruby chat.rb channel_1 joe

Everything you type into one terminal will be relayed to the other.

How it works

The following line creates a topic exchange named “chat”:

$chat = MQ.topic('chat')

A topic exchange allows many-to-many communication. Here, we bind a listener to our exchange, and ask to receive all messages tagged with our channel name:

queue.bind('chat', :key => $channel)

Note that :key may be hierarchical, and it may contain wildcards. To write data to our topic exchange, we use publish:

$chat.publish("#{$nick}: #{data}",
              :routing_key => $channel)
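To make the wildcard rules a little more concrete: AMQP topic keys are dot-separated words, where * stands for exactly one word and # stands for zero or more words. The broker does this matching for you, so the following plain-Ruby sketch exists only to illustrate the rules. The helper names topic_match? and match_words are hypothetical and are not part of the amqp gem.

```ruby
# Returns true if an AMQP-style topic pattern matches a routing key.
# '*' stands for exactly one dot-separated word, '#' for zero or more.
def topic_match?(pattern, key)
  match_words(pattern.split('.'), key.split('.'))
end

# Recursively compares the pattern words against the key words.
def match_words(pat, words)
  return words.empty? if pat.empty?
  head, *rest = pat
  case head
  when '#'
    # Let '#' absorb 0, 1, 2, ... of the remaining words.
    (0..words.length).any? { |n| match_words(rest, words.drop(n)) }
  when '*'
    !words.empty? && match_words(rest, words.drop(1))
  else
    words.first == head && match_words(rest, words.drop(1))
  end
end

topic_match?('chat.*', 'chat.ruby')        # true
topic_match?('chat.*', 'chat.ruby.rails')  # false
topic_match?('chat.#', 'chat.ruby.rails')  # true
```

So a listener bound with a key like chat.# would see every message whose routing key starts with chat, while chat.* would only see keys exactly two words long.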

Our keyboard input is processed using EventMachine, a Ruby library for writing high-performance, multi-protocol servers. It’s very similar to Python’s Twisted library, though it has less documentation and support for fewer protocols.

We use EventMachine’s EM.open_keyboard to create an asynchronous keyboard input channel, and we use EM::Protocols::LineText2 to treat the keyboard input as a line-oriented protocol.

Adding a Shoes GUI

Shoes is an eccentric, entertaining, and highly-portable GUI library by _why the lucky stiff. With a certain amount of grotesque kludging (and some pointers from “s1kx” on the #shoes IRC channel), I managed to get the Mac version of Shoes to talk to EventMachine. You may find that this code fails strangely on your computer. Honestly, I don’t know anything about Shoes. And I’m doing some pretty bad things with threads.

First, the pretty pictures:

Next, the code:

Shoes.setup { gem 'amqp' }
require 'mq'

$app = Shoes.app(:width => 256) do
  background(gradient('#CFF', '#FFF'))
  @output = stack(:margin => 10)

  def nick str
    span(str, :stroke => red)
  end

  def display text
    @output.append do
      if text =~ /^([^:]+): (.*)$/
        para nick("#{$1}: "), $2
      else
        para text
      end
    end
  end
end

Thread.new do
  begin
    AMQP.start(:host => 'localhost') do
      queue = MQ.queue('shoes')
      queue.bind('chat')
      queue.subscribe do |msg|
        $app.display(msg)
      end
    end
  rescue => e
    # Try to report at least _some_ errors
    # where we'll be able to see them.
    $app.display("Error: #{e.message}") # assumption: show it in the chat window
  end
end

Note that the GUI client listens to all channels simultaneously, because it doesn’t pass a :key to bind. And when writing code to run in a Shoes background thread, don’t expect to see any error messages.

Learning more about AMQP

The Ruby AMQP documentation page has a good list of papers, magazine articles, and other background material on AMQP.


Designing programs with RSpec and Cucumber (plus a book recommendation)

Posted by Eric Kidd Thu, 30 Apr 2009 15:07:00 GMT

Over the last couple of years, I’ve occasionally written Ruby programs using RSpec and (more recently) Cucumber. These two tools are inspired by Test Driven Development (TDD), a school of thought which says you should write unit tests before implementing a feature.

When doing TDD, you work inwards from the interface to the implementation. You start by writing a test case against the interface you wish you had, and then you make that test case work. This is a subtle shift in how you approach a design problem, but it frequently results in beautiful APIs. (And you also get a fully automated test suite for your software, liberating you to make much larger changes without fear of breaking things.)

The problem with the word “test”

Unfortunately, the name “Test Driven Development” is misleading. Most folks think of “testing” as something you do after development is complete. But TDD is really more of a design activity—you’re specifying how your APIs should work before you actually start coding.

Dan North spent some time struggling to teach developers about TDD. After a while, he decided that the main barrier to understanding was the word “test.” He proposed replacing TDD with Behavior Driven Development (BDD), and he started referring to unit tests as “specifications.”

In the Ruby community, the most popular BDD tool is RSpec. Using RSpec, you might specify an API something like this:

describe "simplify_name" do
  it "should convert all letters to lowercase" do
    simplify_name("AbC").should == "abc"
  end

  it "should remove everything but letters and spaces" do
    simplify_name(" Joe Smith 3 -+\n").should == "joe smith"
  end
end

After writing this specification, you would then go ahead and implement simplify_name. And from then on, whenever you changed your program, you could automatically check it against this specification.
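For instance, one implementation that would satisfy both examples in the specification fits on a single line. This is only a sketch: the specification says nothing about repeated spaces in the middle of a name, so this version leaves them alone.

```ruby
# Lowercase the name, drop everything except letters and spaces,
# then trim leading and trailing whitespace.
def simplify_name(name)
  name.downcase.gsub(/[^a-z ]/, '').strip
end

simplify_name("AbC")                 # "abc"
simplify_name(" Joe Smith 3 -+\n")   # "joe smith"
```

If a later requirement turned up (say, collapsing double spaces), the workflow would be to add another "it" block first and then extend the method until it passes.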

Using specifications to communicate with clients and users

By itself, RSpec is mostly useful for programmers. Sure, a specification looks a lot like English. But would you really want to show it to an end user?

Cucumber goes one step further. Instead of using code to specify how an API should work, it uses plain text to describe how a user interface should work. For example:

Feature: Log in and out
  As an administrator
  I want to restrict access to certain portions of my site
  In order to prevent users from changing the content

  Scenario: Logging in
    Given I am not logged in as an administrator
    When I go to the administrative page
    And I fill in the fields
      | Username | admin  |
      | Password | secret |
    And I press "Log in"
    Then I should be on the administrative page
    And I should see "Log out"

  Scenario: Logging out

Here’s the neat part: This specification is actually an executable program. Each line of text corresponds to a “step”, which is defined in another file. Here’s an example from the standard webrat_steps.rb file:

Then /^I should see "([^\"]*)"$/ do |text|
  response.should contain(text)
end

Cucumber encourages you to think at a very high level, and to specify how different users will actually use your software. It’s particularly helpful if you need to communicate between programmers and end-users.
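If the link between plain-text lines and step definitions seems magical, a toy version may help. The sketch below is not Cucumber's implementation; STEPS, step and run_step are hypothetical names invented for illustration. It only shows the general idea: each step is a regular expression paired with a block, and the captured groups become the block's arguments.

```ruby
# A toy step registry: pairs of [pattern, block].
STEPS = []

# Register a step definition.
def step(pattern, &body)
  STEPS << [pattern, body]
end

# Find the first pattern matching a line of the feature file and
# call its block with the regex captures as arguments.
def run_step(line)
  # Drop the leading keyword; it does not take part in matching here.
  text = line.strip.sub(/\A(Given|When|Then|And|But)\s+/, '')
  STEPS.each do |pattern, body|
    match = pattern.match(text)
    return body.call(*match.captures) if match
  end
  raise "Undefined step: #{line}"
end

step(/^I press "([^"]*)"$/)      { |label| "pressed #{label}" }
step(/^I should see "([^"]*)"$/) { |text|  "checked for #{text}" }

run_step('And I press "Log in"')        # "pressed Log in"
run_step('Then I should see "Log out"') # "checked for Log out"
```

In a real project the blocks would drive a browser simulator and make assertions, as the webrat step above does, instead of returning strings.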

My experiences with RSpec and Cucumber

I’ve been using RSpec on and off for a couple of years now, and Cucumber since late last year. Initially, I found both tools fascinating, but also a bit frustrating. Both RSpec and Cucumber have very strong opinions about how you should write software. Now, I found those opinions very interesting, and I was quite happy to be influenced by the assumptions built into the tools. But every now and then, I would need to do something that the authors of RSpec and Cucumber hadn’t anticipated, and I would inevitably wind up struggling to make things work.

But recent versions of RSpec and Cucumber are richer and more flexible. They cover more important cases straight out of the box, and they’re easier to customize. So I can finally recommend both tools for real-world projects: They’ll still guide your thinking, but they should give you enough flexibility to handle the corner-cases.

The RSpec (and Cucumber) book

Unfortunately, the documentation for RSpec and Cucumber is scattered around the web, and there aren’t enough online guides showing the best way to solve common problems.

But the Pragmatic Press is working on The RSpec Book, which contains a large section on Cucumber, and a walkthrough of a typical development session using Cucumber and RSpec.

Currently, the RSpec book is available as a “beta book”. This is a downloadable, DRM-free PDF, with periodic updates throughout the publishing process. Right now, between one-third and one-half of the chapters have been roughed in, and the book is already very useful.

So if you’re curious about RSpec and Cucumber, have a look around the two web sites, and maybe watch some of the screencasts. If you decide to investigate further, pick up the beta book and dive in.


Ruby-style metaprogramming in JavaScript (plus a port of RSpec)

Posted by Eric Kidd Sun, 01 Jul 2007 19:00:00 GMT

Programming in Ruby makes me happy. It’s a lovable language, with a pleasantly quirky syntax and lots of expressive power.

Programming in JavaScript, on the other hand, frustrates me to no end. JavaScript could be a reasonable language, but it has all sorts of ugly corner cases, and it forces me to roll everything from scratch.

I’ve been trying to make JavaScript a bit more like Ruby. In particular, I want to support Ruby-style metaprogramming in JavaScript. This would make it possible to port over many advanced Ruby libraries.

You can check out the interactive specification, or look at some examples below. If the specification gives you any errors, please post them in the comment thread, and let me know what browser you’re running!



Some useful closures, in Ruby

Posted by Eric Kidd Thu, 01 Feb 2007 18:36:00 GMT

Reginald Braithwaite has just posted a short introduction to closures in Ruby. Closures allow you to pass functions around your program, and build new functions from old ones.
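As a quick reminder of what that means in practice, here is a minimal sketch (make_adder is a hypothetical name, not taken from the linked introduction). The lambda "closes over" the local variable n, so each function it builds remembers its own value:

```ruby
# Builds a new function that adds 'n' to its argument.
def make_adder(n)
  lambda { |x| x + n }
end

add5  = make_adder(5)
add10 = make_adder(10)

add5.call(1)   # 6
add10.call(1)  # 11
```

Because add5 and add10 are ordinary values, they can be stored in variables, passed to other methods, and combined, which is exactly what the functions below do.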

Programming languages that support closures include Perl, Ruby, Python (sorta), Lisp, Haskell, Dylan, JavaScript and many others.

The Dylan programming language included four very useful functions built using closures: complement, conjoin, disjoin and compose. The names are a bit obscure, but they can each be written in a few lines of Ruby.

Let’s start with complement:

# Builds a function that returns true
# when 'f' returns false, and vice versa.
def complement f
  lambda {|*args| not f.call(*args) }
end

We can use this to build the “opposite” of a function:

is_even = lambda {|n| n % 2 == 0 }
is_odd  = complement(is_even)
is_odd.call(1) # true
is_odd.call(2) # false

compose is another useful function:

# Builds a function which calls 'f' with
# the return value of 'g'.
def compose f, g
  lambda {|*args| f.call(g.call(*args)) }
end

We can use this to pass the output of one function to the input of another:

mult2 = lambda {|n| n*2 }
add1  = lambda {|n| n+1 }
mult2_add1 = compose(add1, mult2)
mult2_add1.call(3) # 7

The conjoin function is a bit more complicated, but still very useful:

# Builds a function which returns true
# whenever _every_ function in 'predicates'
# returns true.
def conjoin *predicates
  base = lambda {|*args| true }
  predicates.inject(base) do |built, pred|
    lambda do |*args|
      built.call(*args) && pred.call(*args)
    end
  end
end

We can use it to construct the logical “and” of a list of functions:

is_number = lambda {|n| n.kind_of?(Numeric) }
is_even_number = conjoin(is_number, is_even)
is_even_number.call("a") # false
is_even_number.call(1)   # false
is_even_number.call(2)   # true

The opposite of conjoin is disjoin:

# Builds a function which returns true
# whenever _any_ function in 'predicates'
# returns true.
def disjoin *predicates
  base = lambda {|*args| false }
  predicates.inject(base) do |built, pred|
    lambda do |*args|
      built.call(*args) || pred.call(*args)
    end
  end
end

This allows us to construct the logical “or” of a list of functions:

is_string  = lambda {|n| n.kind_of?(String) }
is_string_or_number =
  disjoin(is_string, is_number)
is_string_or_number.call("a") # true
is_string_or_number.call(1)   # true
is_string_or_number.call(:a)  # false

These were four of the first closure-related functions I ever used, and they’re still favorites today.

Feel free to post versions in other languages below!


13 Ways of Looking at a Ruby Symbol

Posted by Eric Kidd Sat, 20 Jan 2007 03:20:00 GMT

New Ruby programmers often ask, “What, exactly, is a symbol? And how does it differ from a string?” No one answer works for everybody, so–with apologies to Wallace Stevens–here are 13 ways of looking at a Ruby symbol.
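As a small taste of the difference, here is a sketch of one of the simplest ways to look at it (the variable names are just for illustration): two strings with the same contents are usually separate objects, while a given symbol is always the same object.

```ruby
a = "name".dup
b = "name".dup

a == b        # true:  same contents
a.equal?(b)   # false: two distinct String objects

:name == :name               # true
:name.equal?(:name)          # true: one shared Symbol object
"name".to_sym.equal?(:name)  # true: converting a string yields that same symbol
```

This is a large part of why symbols are the conventional choice for hash keys and option names, where identity matters more than text manipulation.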



Why Ruby is an acceptable LISP

Posted by Eric Kidd Sat, 03 Dec 2005 11:30:00 GMT

Years ago, I looked at Ruby and decided to ignore it. Ruby wasn’t as popular as Python, and it wasn’t as powerful as LISP. So why should I bother?

Of course, we could turn those criteria around. What if Ruby were more popular than LISP, and more powerful than Python? Would that be enough to make Ruby interesting?

Before answering this question, we should decide what makes LISP so powerful. Paul Graham has written eloquently about LISP’s virtues. But, for the sake of argument, I’d like to boil them down to two things:

  1. LISP is a dense functional language.
  2. LISP has programmatic macros.

As it turns out, Ruby compares well as a functional language, and it fakes macros better than I’d thought.


