New Ruby programmers often ask, "What, exactly, is a symbol? And how does it differ from a string?" No one answer works for everybody, so—with apologies to Wallace Stevens—here are 13 ways of looking at a Ruby symbol.

A Ruby symbol is:

  1. ...the name of something, not just a blob of text
  2. ...a label in a free-form enumeration
  3. ...a constant, unique name
  4. ...an "interned" string
  5. ...an object with O(1) comparison
  6. ...a Lisp identifier
  7. ...a Ruby identifier
  8. ...the keyword for a keyword argument
  9. ...an excellent choice for a hash key
  10. ...like a Mac OSType
  11. ...a memory leak
  12. ...a clever way to store only a single copy of a string
  13. ...a C typedef named "ID"

1. A Ruby symbol is the name of something, not just a blob of text

In Ruby, we would generally use symbols when referring to things by name:

find_speech(:gettysburg_address)

But to represent large chunks of text, we would use strings:

"Four score and seven years ago..."

2. A Ruby symbol is a label in a free-form enumeration

In C++ (and many other languages), we can use "enumerations" to represent families of related constants:

enum BugStatus { OPEN, CLOSED };
BugStatus original_status = OPEN;
BugStatus current_status  = CLOSED;

But because Ruby is a dynamic language, we don't worry about declaring a BugStatus type, or keeping track of the legal values. Instead, we represent the enumeration values as symbols:

original_status = :open
current_status  = :closed
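
Since nothing declares the legal values, code that consumes a status usually checks for them itself. Here is a minimal sketch of that idea; describe_status is a hypothetical helper invented for this example, not part of Ruby or any library:

```ruby
# Hypothetical helper: maps each known status symbol to a message
# and rejects anything that is not part of our informal enumeration.
def describe_status(status)
  case status
  when :open   then "Bug is still open"
  when :closed then "Bug has been closed"
  else raise ArgumentError, "unknown status: #{status.inspect}"
  end
end

current_status = :closed
puts describe_status(current_status)  # prints "Bug has been closed"
```

The else branch is the only place where the "legal values" are enforced, which is the trade-off of a free-form enumeration.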

3. A Ruby symbol is a constant, unique name

In Ruby, we can change the contents of a string:

"foo"[0] = ?b # "boo"

But we can't change the contents of a symbol:

:foo[0]  = ?b # Raises an error

Similarly, we can have two different strings with the same contents:

# Same string contents, different strings.
"open".object_id != "open".object_id

But two symbols with the same name are always the same underlying object:

# Same symbol name, same object.
:open.object_id == :open.object_id
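
The same two properties can be checked directly with equal? (object identity) and frozen?. This is a small sketch; it builds the strings with String.new so that the result does not depend on whether string literals are frozen in your setup:

```ruby
# Strings: mutable, and equal contents do not imply the same object.
text = String.new("foo")
text[0] = "b"
puts text                                            # prints "boo"
puts String.new("open") == String.new("open")        # true: same contents
puts String.new("open").equal?(String.new("open"))   # false: different objects

# Symbols: no mutating methods, and one object per name.
name = :foo
puts name.respond_to?(:[]=)   # false: there is no Symbol#[]=
puts name.frozen?             # true
puts :open.equal?(:open)      # true: same object
```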

4. A Ruby symbol is an "interned" string

In Ruby, we can convert a string to a symbol using intern:

"foo".intern # returns :foo

intern maintains a hash table mapping strings to the corresponding symbol. The first time intern sees a string, it creates a new symbol and stores it in the hash table. The next time intern sees that string, it retrieves the original object.

We could implement our own version of Symbol and intern as follows:

class MySymbol
  TABLE={}
  def initialize(str) @str = str end
  def to_s() @str end
  def ==(other)
    self.object_id == other.object_id
  end
end

class String
  def my_intern
    table = MySymbol::TABLE
    unless table.has_key?(self)
      table[self] = MySymbol.new(self)
    end
    table[self]
  end
end

"foo".my_intern

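We can see the effect of that table with the built-in intern. In this short sketch, the second string is assembled at runtime to show that only the contents matter, not where the string came from:

```ruby
first  = "foo".intern
second = ("fo" + "o").intern   # a different String object with the same contents

puts first.equal?(second)          # true: both calls return the same Symbol
puts first.equal?(:foo)            # true: and it is the same object as the literal
puts "foo".to_sym.equal?(first)    # true: to_sym does the same job as intern
```
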
5. A Ruby symbol is an object with O(1) comparison

To compare two strings, we potentially need to look at every character. For two strings of length N, this will require N+1 comparisons (which computer scientists refer to as "O(N) time").

def string_comp str1, str2
  return false if str1.length != str2.length
  for i in 0...str1.length
    return false if str1[i] != str2[i]
  end
  return true
end
string_comp "foo", "foo"

But since every appearance of :foo refers to the same object, we can compare symbols by looking at object IDs. We can do this with a single comparison (which computer scientists refer to as "O(1) time").

def symbol_comp sym1, sym2
  sym1.object_id == sym2.object_id
end
symbol_comp :foo, :foo
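
To make the N+1 figure concrete, here is a sketch that counts the comparisons instead of timing them. count_string_comparisons is a hypothetical helper written for this example; it models the loop above, not Ruby's real String#==, which is implemented in C and may take shortcuts:

```ruby
# Counts the comparisons a character-by-character check performs:
# one for the lengths, then one per character until a mismatch.
def count_string_comparisons(str1, str2)
  comparisons = 1  # the length check
  return comparisons if str1.length != str2.length
  str1.length.times do |i|
    comparisons += 1
    return comparisons if str1[i] != str2[i]
  end
  comparisons
end

puts count_string_comparisons("foo", "foo")                # 4, which is N+1 for N=3
puts count_string_comparisons("a" * 1000, "a" * 1000)      # 1001
puts count_string_comparisons("foo", "bar")                # 2: stops at the first mismatch
```

The symbol version always performs exactly one comparison, no matter how long the name is.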

6. A Ruby symbol is a Lisp identifier

The earliest ancestors of Ruby symbols are Lisp symbols. In Lisp, symbols are used to represent "identifiers" (variable and function names) in a parsed program. Let's say we have a file named double.l containing a single function:

(defun double (x)
  (* x 2))

We can parse this file using read:

;; read takes a stream, so open the file first (Common Lisp).
(with-open-file (in "double.l")
  (read in))
;; Returns the list (defun double (x) (* x 2))

This returns a nested list containing the symbols defun, double, *, x (twice) and the number 2.

7. A Ruby symbol is a Ruby identifier

In Ruby, we can look up identifiers (variable, function and constant names) while the program is running. This is typically done using symbols.

class Demo
  # The stuff we'll look up.
  DEFAULT = "Hello"
  def initialize
    @message = DEFAULT
  end
  def say() @message end

  # Use symbols to look up identifiers.
  def look_up_with_symbols
    [Demo.const_get(:DEFAULT),
     method(:say),
     instance_variable_get(:@message)]
  end
end

Demo.new.look_up_with_symbols
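
The relationship also runs the other way: when we ask Ruby which identifiers exist, it answers with symbols. A small sketch, assuming Ruby 1.9 or later (Ruby 1.8 returned strings from some of these methods), with a hypothetical Greeter class standing in for Demo:

```ruby
# Hypothetical class used only for this illustration.
class Greeter
  DEFAULT = "Hello"

  def initialize
    @message = DEFAULT
  end

  def say
    @message
  end
end

greeter = Greeter.new
puts greeter.respond_to?(:say)      # true
puts greeter.public_send(:say)      # prints "Hello": call a method by name
p Greeter.instance_methods(false)   # [:say]
p greeter.instance_variables        # [:@message]
p Greeter.constants                 # [:DEFAULT]
```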

8. A Ruby symbol is the keyword for a keyword argument

When passing keyword arguments to a Ruby function, we specify the keywords using symbols:

# Build a URL for 'bug' using Rails.
url_for :controller => 'bug',
        :action => 'show',
        :id => bug.id
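
Newer versions of Ruby (2.1 and later for required keywords) also let a method declare its keywords, and the keys still arrive as symbols. The sketch below uses bug_path and capture_options, two hypothetical helpers made up for this example; neither is the Rails url_for:

```ruby
# Hypothetical stand-in for a URL helper, with declared keyword arguments.
def bug_path(controller:, action:, id:)
  "/#{controller}/#{action}/#{id}"
end

puts bug_path(controller: 'bug', action: 'show', id: 42)  # prints "/bug/show/42"

# Collect arbitrary keywords into a hash to inspect the keys.
def capture_options(**options)
  options
end

p capture_options(controller: 'bug', action: 'show').keys  # [:controller, :action]
```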

9. A Ruby symbol is an excellent choice for a hash key

Typically, we'll use symbols to represent the keys of a hash table:

options = {}
options[:auto_save]     = true
options[:show_comments] = false
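
One thing to keep in mind is that a symbol key and a string key with the same characters are different keys. A brief sketch, using the literal syntax for symbol keys available since Ruby 1.9:

```ruby
options = { auto_save: true, show_comments: false }

puts options[:auto_save]            # true
p options["auto_save"]              # nil: the string is a different key
puts options.key?(:show_comments)   # true
p options.keys                      # [:auto_save, :show_comments]
```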

10. A Ruby symbol is like a Mac OSType

The MacOS uses four-character abbreviations to represent open-ended enumerations:

enum {
  kSystemFolderType  = 'macs',
  kDesktopFolderType = 'desk',
  // ...and so on...
  kTrashFolderType   = 'trsh'
};
OSType folder = kSystemFolderType;

In Ruby, we'd typically use symbols for the same purpose:

:system_folder
:desktop_folder
:trash_folder

11. A Ruby symbol is a memory leak

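Because of the way Ruby symbols are stored, they have historically never been garbage collected: before Ruby 2.2, if we create 10,000 one-off symbols that we'll never use again, we'll never get the memory back. (Ruby 2.2 and later can garbage collect symbols that were created dynamically, for example with to_sym.)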

Some Scheme implementations use a clever version of intern that looks up symbols using a weak hash table. This allows symbols to be garbage collected without destroying their uniqueness properties.
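
We can watch the symbol table grow with Symbol.all_symbols, which returns the symbols Ruby currently knows about. This is only a sketch: the names are made up, the exact counts differ from one Ruby build to another, and whether entries are ever reclaimed depends on the Ruby version.

```ruby
# Create 100 one-off symbols from strings built at runtime.
# We keep references to them so they stay alive for the check below.
one_off_symbols = (1..100).map { |i| "one_off_symbol_#{i}".to_sym }

# Every symbol we just created now has an entry in the symbol table.
missing = one_off_symbols - Symbol.all_symbols
puts missing.empty?

# The total size of the table (varies by Ruby version and loaded code).
puts Symbol.all_symbols.size
```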

12. A Ruby symbol is a clever way to store only a single copy of a string

(For a similar idea, see this article.)

Let's say we're working on a natural language parser that tries to understand breakfast orders. We have a corpus of 30,000 sentences that represent real-world breakfast orders, and we're trying to find the patterns.

But even though we have a huge number of sentences, the actual vocabulary is fairly limited. We don't want to store 15,000 copies of the word "bacon" in memory! Instead, we can use symbols to represent the individual words:

corpus = [
  [:i, :want, :some, :bacon],
  [:i, :want, :some, :eggs],
  [:give, :me, :some, :bacon],
  [:chunky, :bacon],
  # ... 29,995 more phrases ...
  [:some, :toast, :please]
]

In the early days of AI, many Lisp programs used exactly this strategy for representing text.
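
Here is a sketch of how such a corpus might be built from raw text, using a five-sentence sample in place of the real 30,000. It then checks that each distinct word is backed by a single object:

```ruby
# A tiny sample standing in for the full corpus.
sentences = [
  "i want some bacon",
  "i want some eggs",
  "give me some bacon",
  "chunky bacon",
  "some toast please"
]

# Split each sentence into words and convert each word to a symbol.
corpus = sentences.map { |sentence| sentence.split.map(&:to_sym) }

words = corpus.flatten
puts words.size                          # 17 words in total
puts words.map(&:object_id).uniq.size    # 10 distinct objects, one per vocabulary word

bacon = words.select { |word| word == :bacon }
puts bacon.size                          # 3 occurrences of "bacon"...
puts bacon.map(&:object_id).uniq.size    # ...all backed by 1 object
```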

13. A Ruby symbol is a C typedef named "ID"

Internally, Ruby 1.8 represents symbols using the type ID. This is a typedef for an unsigned integer. An ID represents an entry in Ruby's symbol table.

typedef unsigned long ID;

Some interesting symbol-related functions include:

// Enter a C string into symbol table.
ID rb_intern(const char *name);
// Convert an ID to a Symbol object.
#define ID2SYM(x)
// Convert a String to a Symbol object.
VALUE rb_str_intern(VALUE s);

Other explanations of Ruby symbols

If none of these explanations work for you, you might have luck with one of the following:

  1. Symbols Are Not Immutable Strings
  2. Using Symbols for the Wrong Reasons
  3. Yet Another Blog About Ruby Symbols
  4. Digging into Ruby Symbols
  5. Understanding Ruby Symbols