13 Ways of Looking at a Ruby Symbol
New Ruby programmers often ask, “What, exactly, is a symbol? And how does it differ from a string?” No one answer works for everybody, so—with apologies to Wallace Stevens—here are 13 ways of looking at a Ruby symbol.
A Ruby symbol is:
- …the name of something, not just a blob of text
- …a label in a free-form enumeration
- …a constant, unique name
- …an “interned” string
- …an object with O(1) comparison
- …a Lisp identifier
- …a Ruby identifier
- …the keyword for a keyword argument
- …an excellent choice for a hash key
- …like a Mac OSType
- …a memory leak
- …a clever way to store only a single copy of a string
- …a C typedef named “ID”
1. A Ruby symbol is the name of something, not just a blob of text
In Ruby, we would generally use symbols when referring to things by name:
find_speech(:gettysburg_address)
But to represent large chunks of text, we would use strings:
"Four score and seven years ago..."
2. A Ruby symbol is a label in a free-form enumeration
In C++ (and many other languages), we can use “enumerations” to represent families of related constants:
enum BugStatus { OPEN, CLOSED };
BugStatus original_status = OPEN;
BugStatus current_status = CLOSED;
But because Ruby is a dynamic language, we don’t worry about declaring a BugStatus type, or keeping track of the legal values.
Instead, we represent the enumeration values as symbols:
original_status = :open
current_status = :closed
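Because nothing declares the legal values, we sometimes want to list them ourselves. Here is a minimal sketch of one way to do that; BUG_STATUSES and describe_status are hypothetical names invented for this illustration, not part of Ruby.

```ruby
# Hypothetical list of legal values; Ruby itself does not require one.
BUG_STATUSES = [:open, :closed].freeze

# Returns a human-readable description, rejecting unknown labels.
def describe_status(status)
  unless BUG_STATUSES.include?(status)
    raise ArgumentError, "unknown status: #{status.inspect}"
  end

  case status
  when :open   then "still needs work"
  when :closed then "finished"
  end
end

puts describe_status(:open)    # prints "still needs work"
puts describe_status(:closed)  # prints "finished"
```

Adding a new status is just a matter of using a new symbol, which is what makes the enumeration "free-form".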
3. A Ruby symbol is a constant, unique name
In Ruby, we can change the contents of a string:
"foo"[0] = ?b # "boo"
But we can’t change the contents of a symbol:
:foo[0] = ?b # Raises an error
Similarly, we can have two different strings with the same contents:
# Same string contents, different strings.
"open".object_id != "open".object_id
But two symbols with the same name are always the same underlying object:
# Same symbol name, same object.
:open.object_id == :open.object_id
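The same points can be checked with equal?, which tests object identity directly. This is a small sketch; it uses String.new so that the result does not depend on any frozen-string-literal setting, and the frozen? result assumes a reasonably current Ruby.

```ruby
# String.new builds a fresh object on every call.
first  = String.new("open")
second = String.new("open")

puts first == second        # true:  same contents
puts first.equal?(second)   # false: two distinct objects

puts :open.equal?(:open)    # true:  one shared object
puts :open.frozen?          # true:  symbols cannot be modified
```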
4. A Ruby symbol is an “interned” string
In Ruby, we can convert a string to a symbol using intern:
"foo".intern # returns :foo
intern maintains a hash table mapping strings to the corresponding symbols. The first time intern sees a string, it creates a new symbol and stores it in the hash table. The next time intern sees the same string, it retrieves the original object.
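We can watch this happen from Ruby itself. The following sketch only uses the built-in String#intern, String#to_sym and Symbol#to_s; the variable names are arbitrary.

```ruby
name = "foo"

sym_a = name.intern   # String#intern
sym_b = name.to_sym   # String#to_sym does the same job

puts sym_a.equal?(sym_b)   # true: both calls return the one :foo object
puts sym_a.equal?(:foo)    # true: the literal is that same object
puts sym_a.to_s            # prints "foo"
```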
We could implement our own version of Symbol and intern as follows:
class MySymbol
  TABLE = {}
  def initialize(str) @str = str end
  def to_s() @str end
  def ==(other)
    self.object_id == other.object_id
  end
end

class String
  def my_intern
    table = MySymbol::TABLE
    unless table.has_key?(self)
      table[self] = MySymbol.new(self)
    end
    table[self]
  end
end

"foo".my_intern
5. A Ruby symbol is an object with O(1) comparison
To compare two strings, we potentially need to look at every character. For two strings of length N, this will require N+1 comparisons (which computer scientists refer to as “O(N) time”).
def string_comp(str1, str2)
  return false if str1.length != str2.length
  for i in 0...str1.length
    return false if str1[i] != str2[i]
  end
  return true
end

string_comp "foo", "foo"
But since every appearance of :foo refers to the same object, we can compare symbols by looking at object IDs. We can do this with a single comparison (which computer scientists refer to as “O(1) time”).
def symbol_comp(sym1, sym2)
  sym1.object_id == sym2.object_id
end

symbol_comp :foo, :foo
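To get a rough feel for the difference, we can time both kinds of comparison with the standard Benchmark module. This is only a sketch: the string length and repeat count are arbitrary choices, and the timings will vary from machine to machine.

```ruby
require "benchmark"

# Two long, equal strings held in distinct objects (length chosen arbitrarily).
LONG_A = "x" * 100_000
LONG_B = "x" * 100_000

# Interning equal strings yields one and the same symbol object.
SYM_A = LONG_A.to_sym
SYM_B = LONG_B.to_sym

REPEATS = 1_000

string_time = Benchmark.realtime { REPEATS.times { LONG_A == LONG_B } }
symbol_time = Benchmark.realtime { REPEATS.times { SYM_A == SYM_B } }

puts "string ==: #{string_time} seconds"
puts "symbol ==: #{symbol_time} seconds"
```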
6. A Ruby symbol is a Lisp identifier
The earliest ancestors of Ruby symbols are Lisp symbols. In Lisp, symbols
are used to represent “identifiers” (variable and function names) in a
parsed program. Let’s say we have a file named double.l containing a single function:
(defun double (x)
  (* x 2))
We can parse this file using read:
(read "double.l")
;; Returns '(defun double (x) (* x 2))
This returns a nested list containing the symbols defun, double, *, x (twice) and the number 2.
7. A Ruby symbol is a Ruby identifier
In Ruby, we can look up identifiers (variable, function and constant names) while the program is running. This is typically done using symbols.
class Demo
  # The stuff we'll look up.
  DEFAULT = "Hello"
  def initialize
    @message = DEFAULT
  end
  def say() @message end

  # Use symbols to look up identifiers.
  def look_up_with_symbols
    [Demo.const_get(:DEFAULT),
     method(:say),
     instance_variable_get(:@message)]
  end
end

Demo.new.look_up_with_symbols
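Symbols also name things when we call or define methods at run time. Here is a short sketch using the built-in send, public_send, respond_to?, define_method and instance_variable_set; the Greeter class is a made-up example.

```ruby
class Greeter
  def initialize
    @greeting = "Hello"
  end

  def greet
    @greeting
  end

  # define_method takes the new method's name as a symbol.
  define_method(:shout) { @greeting.upcase }
end

greeter = Greeter.new

puts greeter.respond_to?(:greet)          # true
puts greeter.send(:greet)                 # prints "Hello"
puts greeter.public_send(:shout)          # prints "HELLO"

greeter.instance_variable_set(:@greeting, "Hi")
puts greeter.greet                        # prints "Hi"

p Greeter.instance_methods(false).sort    # prints [:greet, :shout]
```

Note that Ruby hands method names back to us as symbols, too.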
8. A Ruby symbol is the keyword for a keyword argument
When passing keyword arguments to a Ruby function, we specify the keywords using symbols:
# Build a URL for 'bug' using Rails.
url_for :controller => 'bug',
:action => 'show',
:id => bug.id
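Newer versions of Ruby (2.1 and later) also let a method declare its keywords directly. The sketch below assumes such a Ruby; build_path is a hypothetical helper loosely modeled on the call above, not the Rails implementation of url_for.

```ruby
# Hypothetical helper: controller is required, the other keywords have defaults.
def build_path(controller:, action: "index", id: nil)
  path = "/#{controller}/#{action}"
  path += "/#{id}" if id
  path
end

puts build_path(controller: "bug", action: "show", id: 42)  # prints "/bug/show/42"
puts build_path(controller: "bug")                          # prints "/bug/index"

# A hash with symbol keys can be passed along using the ** operator.
options = { :controller => "bug", :action => "show", :id => 7 }
puts build_path(**options)                                  # prints "/bug/show/7"
```

Either way, the keywords themselves are still symbols.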
9. A Ruby symbol is an excellent choice for a hash key
Typically, we’ll use symbols to represent the keys of a hash table:
options = {}
options[:auto_save] = true
options[:show_comments] = false
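Ruby 1.9 and later also offer a shorter literal syntax for symbol keys. A small sketch (OPTIONS is just an example name) that also shows a common pitfall: a string and a symbol with the same text are different keys.

```ruby
# key: value is shorthand for :key => value.
OPTIONS = { auto_save: true, show_comments: false }.freeze

puts OPTIONS[:auto_save]           # prints true
p OPTIONS["auto_save"]             # prints nil: the string is a different key
puts OPTIONS.fetch(:font, "none")  # prints "none": fetch supplies a default
p OPTIONS.keys                     # prints [:auto_save, :show_comments]
```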
10. A Ruby symbol is like a Mac OSType
Mac OS uses four-character abbreviations to represent open-ended enumerations:
enum {
  kSystemFolderType = 'macs',
  kDesktopFolderType = 'desk',
  // ...and so on...
  kTrashFolderType = 'trsh'
};
OSType folder = kSystemFolderType;
In Ruby, we’d typically use symbols for the same purpose:
:system_folder
:desktop_folder
:trash_folder
11. A Ruby symbol is a memory leak
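Because of the way Ruby symbols are stored, versions of Ruby before 2.2 can never garbage collect them. So if we create 10,000 one-off symbols that we’ll never use again, we’ll never get the memory back. Ruby 2.2 and later can collect symbols that were created dynamically at run time (for example, with intern), which removes most of this concern.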
Some Scheme implementations use a clever version of intern that looks up symbols using a weak hash table. This allows symbols to be garbage collected without destroying their uniqueness properties.
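We can peek at Ruby's symbol table with the built-in Symbol.all_symbols. The sketch below only shows that dynamically created symbols get entries in the table; it does not measure whether those entries are later reclaimed. make_one_off_symbols is a hypothetical helper written for this example.

```ruby
# Hypothetical helper: builds `count` distinct symbols from strings at run time.
def make_one_off_symbols(count)
  (1..count).map { |i| "one_off_symbol_#{i}".to_sym }
end

# Keep references to the new symbols so they stay alive while we look.
created = make_one_off_symbols(1_000)

# Symbol.all_symbols lists every symbol currently in the symbol table.
table = Symbol.all_symbols
all_present = (created - table).empty?

puts table.size
puts all_present   # true: every new symbol has an entry
```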
12. A Ruby symbol is a clever way to store only a single copy of a string
(For a similar idea, see this article.)
Let’s say we’re working on a natural language parser that tries to understand breakfast orders. We have a corpus of 30,000 sentences that represent real-world breakfast orders, and we’re trying to find the patterns.
But even though we have a huge number of sentences, the actual vocabulary is fairly limited. We don’t want to store 15,000 copies of the word “bacon” in memory! Instead, we can use symbols to represent the individual words:
corpus = [
  [:i, :want, :some, :bacon],
  [:i, :want, :some, :eggs],
  [:give, :me, :some, :bacon],
  [:chunky, :bacon],
  # ... 29,995 more phrases ...
  [:some, :toast, :please]
]
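In practice we would build such a corpus from raw text. Here is a minimal sketch of that step; tokenize, SENTENCES and CORPUS are hypothetical names, and the three sentences are stand-ins for a real data set.

```ruby
# Hypothetical helper: split a sentence into one symbol per word.
def tokenize(sentence)
  sentence.downcase.split.map(&:to_sym)
end

SENTENCES = [
  "I want some bacon",
  "Give me some bacon",
  "Chunky bacon"
].freeze

CORPUS = SENTENCES.map { |sentence| tokenize(sentence) }

p CORPUS.first          # prints [:i, :want, :some, :bacon]

# Every :bacon in the corpus is the same object, so its text is stored once.
bacon_ids = CORPUS.flatten.select { |word| word == :bacon }.map(&:object_id)
p bacon_ids.uniq.size   # prints 1

# Count how often each word appears.
counts = Hash.new(0)
CORPUS.flatten.each { |word| counts[word] += 1 }
p counts[:bacon]        # prints 3
```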
In the early days of AI, many Lisp programs used exactly this strategy for representing text.
13. A Ruby symbol is a C typedef named “ID”
Internally, Ruby 1.8 represents symbols using the type ID. This is a typedef for an unsigned integer. An ID represents an entry in Ruby’s symbol table.
typedef unsigned long ID;
Some interesting symbol-related functions include:
// Enter a C string into symbol table.
ID rb_intern(const char *name);
// Convert an ID to a Symbol object.
#define ID2SYM(x)
// Convert a String to a Symbol object.
VALUE rb_str_intern(VALUE s);
Other explanations of Ruby symbols
If none of these explanations work for you, you might have luck with one of the following: