<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Random Hacks</title>
    <link>http://www.randomhacks.net/</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Technology and Other Fun Stuff</description>
    <item>
      <title>Best article I've seen on SOPA</title>
      <description>&lt;p&gt;Wikipedia, Google and many other internet sites are protesting PIPA and SOPA today. But their &lt;a href="http://en.wikipedia.org/wiki/Wikipedia:SOPA_initiative/Learn_more"&gt;official&lt;/a&gt; &lt;a href="https://www.google.com/landing/takeaction/"&gt;explanations&lt;/a&gt; don&amp;#8217;t include very many details about the actual legislation.&lt;/p&gt;

&lt;p&gt;If you&amp;#8217;d like to learn more, check out &lt;a href="http://www.dailykos.com/story/2012/01/18/1055849/-Confessions-Of-A-Hollywood-Professional:-Why-I-Cant-Support-the-Stop-Online-Piracy-Act-%28UPDATED%29"&gt;this excellent background piece&lt;/a&gt; by a freelance film editor.&lt;/p&gt;</description>
      <pubDate>Wed, 18 Jan 2012 19:29:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:09960263-a39f-4216-97c0-14408c54d207</guid>
      <author>Eric Kidd</author>
      <link>http://www.randomhacks.net/articles/2012/01/18/best-article-on-sopa</link>
      <trackback:ping>http://www.randomhacks.net/articles/trackback/857</trackback:ping>
    </item>
    <item>
      <title>Screencast: Use Rails and RDF.rb to parse Best Buy product reviews</title>
      <description>&lt;p&gt;In the past few years, many companies have been embedding machine-readable metadata in their web pages. Among these is Best Buy, which &lt;a href="http://jay.beweep.com/category/rdfa/"&gt;provides extensive RDFa data&lt;/a&gt; describing their products, prices and user reviews.&lt;/p&gt;

&lt;p&gt;The following 20-minute screencast shows how to use Ruby 1.9.2, Rails 3.1rc1, &lt;a href="http://rdf.rubyforge.org/"&gt;RDF.rb&lt;/a&gt; and my &lt;a href="http://rdf-agraph.rubyforge.org/"&gt;rdf-agraph&lt;/a&gt; gem to compare user ratings of the iPad and various Android Honeycomb tablets.&lt;/p&gt;&lt;p&gt;&lt;a href="http://www.randomhacks.net/articles/2011/06/05/screencast-rails-rdf-agraph-product-reviews"&gt;Read More&lt;/a&gt;&lt;/p&gt;</description>
      <pubDate>Sun, 05 Jun 2011 19:07:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:f2fc22a5-16da-46a9-96a2-999264effcf1</guid>
      <author>Eric Kidd</author>
      <link>http://www.randomhacks.net/articles/2011/06/05/screencast-rails-rdf-agraph-product-reviews</link>
      <category>Ruby</category>
      <category>Rails</category>
      <category>RDF</category>
      <trackback:ping>http://www.randomhacks.net/articles/trackback/850</trackback:ping>
    </item>
    <item>
      <title>Heroku &amp;quot;Celadon Cedar&amp;quot; review</title>
      <description>&lt;p&gt;Heroku just released a new version of their hosting service for Ruby on Rails. It&amp;#8217;s called &lt;a href="http://blog.heroku.com/archives/2011/5/31/celadon_cedar/"&gt;Celadon Cedar&lt;/a&gt;, and it adds support for arbitrary background processes, Node.js servers and long-polling over HTTP. &lt;/p&gt;

&lt;p&gt;I just finished porting a large Rails 3.0 application to Heroku&amp;#8217;s Cedar stack from &lt;a href="http://www.opscode.com/chef/"&gt;Chef&lt;/a&gt;+&lt;a href="http://aws.amazon.com/ec2/"&gt;EC2&lt;/a&gt;, and I&amp;#8217;m deeply impressed. But there are still some rough edges, especially with regard to asset caching.&lt;/p&gt;

&lt;h3&gt;Procfiles are really cool&lt;/h3&gt;

&lt;p&gt;Previous versions of Heroku could only run two types of processes: Web servers and &lt;a href="http://rubydoc.info/gems/delayed_job/2.1.4/frames"&gt;delayed_job&lt;/a&gt; workers. If you needed to monitor a ZeroMQ queue or run a cron job every minute, you were out of luck. So even though I loved Heroku, about 2/3rds of my clients couldn&amp;#8217;t even consider using it.&lt;/p&gt;

&lt;p&gt;Celadon Cedar, however, allows you to create a &lt;code&gt;Procfile&lt;/code&gt; specifying a list of process types to run:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_default "&gt;web:    bundle exec rails server -p $PORT
worker: bundle exec rake jobs:work
clock:  bundle exec clockwork config/clock.rb&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Once you&amp;#8217;ve deployed your project, you can specify how many of each process you want:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_sh "&gt;heroku scale web=3 worker=2 clock=1&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Even better, if you&amp;#8217;re running on a development machine, or if you want to deploy to a regular Linux server, you can use the &lt;a href="http://adam.heroku.com/past/2011/5/9/applying_the_unix_process_model_to_web_apps/"&gt;Foreman&lt;/a&gt; gem to launch the processes manually, or to generate init scripts:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_sh "&gt;foreman start
foreman export upstart /etc/init -u username&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If you&amp;#8217;re feeling more ambitious, you can also run &lt;a href="http://michaelvanrooijen.com/articles/2011/06/01-more-concurrency-on-a-single-heroku-dyno-with-the-new-celadon-cedar-stack/"&gt;Unicorn&lt;/a&gt; and &lt;a href="http://devcenter.heroku.com/articles/node-js"&gt;Node.js&lt;/a&gt; servers on Heroku.&lt;/p&gt;

&lt;h3&gt;Asset caching is even worse than before&lt;/h3&gt;

&lt;p&gt;Previous versions of Heroku had a built-in &lt;a href="http://www.varnish-cache.org/"&gt;Varnish cache&lt;/a&gt;, which would cache CSS, JavaScripts and images for 12 hours. The Varnish cache was automatically flushed on redeploy, so it gave you a nice performance boost for zero work.&lt;/p&gt;

&lt;p&gt;However, if you were running a high-performance site, you would generally want to run all your JavaScript and CSS through &lt;a href="http://developer.yahoo.com/yui/compressor/"&gt;YUI Compressor&lt;/a&gt;, which  vastly improves your download times. Under the previous version of Heroku, this was annoying to set up: You had to either commit your compiled assets into &lt;code&gt;git&lt;/code&gt;, or deploy them to a CDN manually.&lt;/p&gt;

&lt;p&gt;The Celadon Cedar stack, unfortunately, doesn&amp;#8217;t make it any easier to set up YUI Compressor, and it removes the existing Varnish cache. In place of Varnish, Heroku &lt;a href="http://devcenter.heroku.com/articles/http-caching"&gt;encourages you to set up Rack::Cache with memcached as a storage backend&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You may want to consider adding the following line to your &lt;code&gt;config.ru&lt;/code&gt; file, right before the &lt;code&gt;run&lt;/code&gt; statement:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;use&lt;/span&gt; &lt;span class="constant"&gt;Rack&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Deflater&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Combined with Rack::Cache, this will give you back some of the functionality of Varnish. But it&amp;#8217;s a lot more work than you needed to do before, and the results aren&amp;#8217;t as good. Heroku made this decision deliberately, because Varnish prevented them from doing cool things with Node.js servers and long-polled HTTP connections. But it still represents a retreat from Heroku&amp;#8217;s famous ease of use.&lt;/p&gt;

&lt;p&gt;What Heroku&amp;#8217;s Cedar stack really needs is first-class support for Rack::Cache, Rack::Deflater, and the new &lt;a href="http://www.rubyinside.com/how-to-rails-3-1-coffeescript-howto-4695.html"&gt;Sprockets asset caching in Rails 3.1&lt;/a&gt;. Please, just allow me to add a couple of lines to my &lt;code&gt;Gemfile&lt;/code&gt; and have everything work automagically. Yeah, you&amp;#8217;ve spoiled me and made me lazy.&lt;/p&gt;

&lt;h3&gt;You&amp;#8217;ll have to upgrade to Ruby 1.9.2&lt;/h3&gt;

&lt;p&gt;According to the &lt;a href="http://devcenter.heroku.com/articles/cedar#stack_software_versions"&gt;official documentation&lt;/a&gt;, only Ruby 1.9.2 is supported under Celadon Cedar. This isn&amp;#8217;t entirely surprising—Rails 3.1 recommends Ruby 1.9.2 as well—but it may be a problem for some users.&lt;/p&gt;

&lt;p&gt;Fortunately, my client&amp;#8217;s application worked flawlessly under Ruby 1.9.2 with only a single change to the &lt;code&gt;Gemfile&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;Running a cron job once per minute is really easy, but it costs $71/month&lt;/h3&gt;

&lt;p&gt;One of Heroku&amp;#8217;s engineers explains how to run high-frequency cron jobs using &lt;a href="http://adam.heroku.com/past/2010/6/30/replace_cron_with_clockwork/"&gt;Clockwork&lt;/a&gt; and &lt;a href="http://rubydoc.info/gems/delayed_job/2.1.4/frames"&gt;delayed_job&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Basically, you add a couple of lines to your &lt;code&gt;Procfile&lt;/code&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_default "&gt;worker: bundle exec rake jobs:work
clock:  bundle exec clockwork config/clock.rb&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&amp;#8230;and you put something like the following in &lt;code&gt;config/clock.rb&lt;/code&gt;:&lt;/p&gt;

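&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="constant"&gt;File&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;expand_path&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;../environment&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt;  &lt;span class="constant"&gt;__FILE__&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
&lt;span class="comment"&gt;# Load Clockwork and bring its DSL (such as every) into scope.&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;clockwork&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;Clockwork&lt;/span&gt;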

&lt;span class="comment"&gt;# Run our heartbeat once per minute.&lt;/span&gt;
&lt;span class="ident"&gt;every&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;minutes&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;myapp.heartbeat&lt;/span&gt;&lt;span class="punct"&gt;')&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="constant"&gt;MyApp&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;delay&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;heartbeat&lt;/span&gt; &lt;span class="punct"&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This creates a DelayedJob and hands it off to our worker process. According to the tutorial, you&amp;#8217;re supposed to do the actual work in a separate process, so as not to interfere with other events. This approach is elegant, but it&amp;#8217;s going to cost you $71/month for two &amp;#8220;dynos&amp;#8221;. Ouch.&lt;/p&gt;

&lt;h3&gt;Cedar is a great new stack, but it needs polishing&lt;/h3&gt;

&lt;p&gt;I&amp;#8217;m really impressed with Celadon Cedar. Heroku has vastly improved their support for complex applications with a lot of moving parts. But along the way, they&amp;#8217;ve made it slightly harder to deploy simple applications, and they still don&amp;#8217;t have a painless way to do asset caching. Of course, these minor drawbacks should improve dramatically once the Ruby community plays with Cedar for a few weeks.&lt;/p&gt;

&lt;p&gt;Many thanks to Heroku for a great new release! I&amp;#8217;ll be moving more applications over soon.&lt;/p&gt;

&lt;p&gt;Does anybody have any suggestions on how to make better use of Cedar and Rails?&lt;/p&gt;</description>
      <pubDate>Fri, 03 Jun 2011 19:20:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:3df2e7b3-d7e1-4fe5-9468-ffa7596237a2</guid>
      <author>Eric Kidd</author>
      <link>http://www.randomhacks.net/articles/2011/06/03/heroku-celadon-cedar-review</link>
      <category>Ruby</category>
      <category>Rails</category>
      <trackback:ping>http://www.randomhacks.net/articles/trackback/849</trackback:ping>
    </item>
    <item>
      <title>Derivatives of algebraic data structures: An excellent tutorial</title>
      <description>&lt;p&gt;Last month, the folks at Lab49 explained &lt;a href="http://blog.lab49.com/archives/3011"&gt;how to compute the derivative of a data structure&lt;/a&gt;. This is a great example of how to write about mathematical subjects for a casual audience: They draw analogies to well-known programming languages, they follow a single, well-chosen thread of explanation, and there&amp;#8217;s a clever payoff at the end.&lt;/p&gt;

&lt;p&gt;The Lab49 blog post is, of course, based on two &lt;a href="http://strictlypositive.org/diff.pdf"&gt;classic&lt;/a&gt; &lt;a href="http://strictlypositive.org/Dissect.pdf"&gt;papers&lt;/a&gt; by Conor McBride, and Huet&amp;#8217;s original paper &lt;a href="http://www.st.cs.uni-saarland.de/edu/seminare/2005/advanced-fp/docs/huet-zipper.pdf"&gt;The Zipper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you&amp;#8217;re interested in real-world applications of this technique, there&amp;#8217;s a great explanation in the final chapter of &lt;a href="http://learnyouahaskell.com/zippers"&gt;Learn You a Haskell for Great Good&lt;/a&gt;. If you&amp;#8217;re interested in some deeper mathematical connections, see the &lt;a href="http://lambda-the-ultimate.org/node/1957"&gt;discussion at Lambda the Ultimate&lt;/a&gt;.&lt;/p&gt;</description>
      <pubDate>Fri, 20 May 2011 20:01:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:1f6b612b-f3cd-4f3a-8442-d23ce7769c8c</guid>
      <author>Eric Kidd</author>
      <link>http://www.randomhacks.net/articles/2011/05/20/derivatives-of-algebraic-data-structures-an-excellent-tutorial</link>
      <category>Haskell</category>
      <category>Math</category>
      <trackback:ping>http://www.randomhacks.net/articles/trackback/836</trackback:ping>
    </item>
    <item>
      <title>What do these fixed points have in common?</title>
      <description>&lt;p&gt;A question asked while standing in the shower: What do all of the following have in common?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Banach_fixed_point_theorem"&gt;Banach&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Brouwer_fixed_point_theorem"&gt;Brouwer fixed points&lt;/a&gt;. If you&amp;#8217;re in Manhattan, and you crumple up a map of Manhattan and place it on the ground, at least one point on your map will be exactly over the corresponding point on the ground. (This is true even if your map is &lt;em&gt;larger&lt;/em&gt; than life.)&lt;/li&gt;
&lt;li&gt;The fixed points computed by the &lt;a href="http://en.wikipedia.org/wiki/Fixed_point_combinator"&gt;Y combinator&lt;/a&gt;, which is used to construct anonymous recursive functions in the lambda calculus.&lt;/li&gt;
&lt;li&gt;The &lt;a href="http://en.wikipedia.org/wiki/Nash_equilibrium"&gt;Nash equilibrium&lt;/a&gt;, which is the stable equilibrium of a multi-player game (and one of the key ideas of economics). See also this lovely—if metaphorical—&lt;a href="http://www.scottaaronson.com/blog/?p=418"&gt;rant by Scott Aaronson&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;a href="http://en.wikipedia.org/wiki/Eigenvector"&gt;eigenvectors of a matrix&lt;/a&gt;, which will still point in the same direction after multiplication by the matrix.&lt;/li&gt;
&lt;/ol&gt;
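
&lt;p&gt;To make item 2 concrete, here is a minimal Ruby sketch. Ruby evaluates arguments eagerly, so this uses the Z combinator (the strict-evaluation variant of Y) rather than Y itself. The names &lt;code&gt;Z&lt;/code&gt; and &lt;code&gt;FACTORIAL&lt;/code&gt; are purely illustrative:&lt;/p&gt;

```ruby
# Z combinator: the strict-evaluation variant of the Y combinator.
# Given a function f that expects "itself" as an argument, Z returns
# a fixed point of f, i.e. a function g such that f.(g) behaves like g.
Z = ->(f) {
  ->(x) { f.(->(v) { x.(x).(v) }) }.(
    ->(x) { f.(->(v) { x.(x).(v) }) }
  )
}

# An anonymous recursive factorial: the lambda never refers to itself
# by name, only to the recur argument supplied by Z.
FACTORIAL = Z.(->(recur) {
  ->(n) { n.zero? ? 1 : n * recur.(n - 1) }
})

puts FACTORIAL.(5)  # prints 120
```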

&lt;p&gt;At what level of abstraction are all these important ideas really just the same idea? If we strip everything down to &lt;a href="http://en.wikipedia.org/wiki/Abstract_nonsense"&gt;generalized abstract nonsense&lt;/a&gt;, is there a nice simple formulation that covers all of the above?&lt;/p&gt;

&lt;p&gt;(I can&amp;#8217;t play with this shiny toy today; I have to work.)&lt;/p&gt;</description>
      <pubDate>Thu, 12 May 2011 12:09:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:59fdd543-cb2f-4186-8d02-7ddf121ccc7c</guid>
      <author>Eric Kidd</author>
      <link>http://www.randomhacks.net/articles/2011/05/12/what-do-fixed-points-have-in-common</link>
      <category>Math</category>
      <category>Haskell</category>
      <trackback:ping>http://www.randomhacks.net/articles/trackback/832</trackback:ping>
    </item>
    <item>
      <title>AWS outage timeline &amp;amp; downtimes by recovery strategy</title>
      <description>&lt;p&gt;Renting a server from Amazon is no substitute for a disaster recovery plan.&lt;/p&gt;

&lt;p&gt;If you run your own servers, you need backups.  If you can&amp;#8217;t afford to go
down, you also need offsite replication. But if you lease servers in the
cloud, how can you protect against problems like this week&amp;#8217;s Amazon outage?&lt;/p&gt;

&lt;p&gt;Keep reading for a timeline of the outage, plus a list of recovery
strategies and the minimum downtime that each would have incurred.&lt;/p&gt;

&lt;h3&gt;A timeline of the Amazon outage&lt;/h3&gt;

&lt;p&gt;Here&amp;#8217;s a timeline of what went wrong, and when it was fixed. Note, in
particular, the window from roughly 1:00 AM to 1:48 PM PDT when several of
Amazon&amp;#8217;s availability zones were partially unavailable. (For a
glossary of Amazon Web Service terminology, see the bottom of this post.)&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ve also included Heroku&amp;#8217;s status reports on this timeline.&lt;/p&gt;

&lt;div style="font-weight: bold; text-align: center"&gt;21 April 2011&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;1:15 AM PDT&lt;/strong&gt; Heroku begins investigating high error rates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1:41 AM PDT&lt;/strong&gt; Amazon admits they are seeing problems with EBS volumes and
EC2 instances in US East 1.  The outage affects multiple availability
zones.  Amazon later described the problem as follows:&lt;/p&gt;

&lt;blockquote&gt;
A networking event early this morning triggered a large amount of
re-mirroring of EBS volumes in US-EAST-1. This re-mirroring created a
shortage of capacity in one of the US-EAST-1 Availability Zones, which
impacted new EBS volume creation as well as the pace with which we could
re-mirror and recover affected EBS volumes. Additionally, one of our
internal control planes for EBS has become inundated such that it&amp;#8217;s
difficult to create new EBS volumes and EBS backed instances. We are
working as quickly as possible to add capacity to that one Availability
Zone to speed up the re-mirroring, and working to restore the control plane
issue. We&amp;#8217;re starting to see progress on these efforts, but are not there
yet. We will continue to provide updates when we have them.
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;1:52 AM PDT&lt;/strong&gt; Heroku reports that applications and tools are functioning
intermittently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3:05 AM PDT&lt;/strong&gt; Amazon reports that RDS databases replicated across
multiple Availability Zones are not failing over as expected.  This is a
big deal, because these multi-AZ RDS databases are intended to be an
expensive, highly-reliable option for storing data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1:48 PM PDT&lt;/strong&gt; EBS volumes and EC2 instances are now working correctly in
all but one availability zone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2:15 PM PDT&lt;/strong&gt; Heroku reports that they can now launch new EBS instances.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2:35 PM PDT&lt;/strong&gt; Amazon restores access to &amp;#8220;majority&amp;#8221; of multi-AZ RDS
databases.  (There&amp;#8217;s nothing in the Amazon timeline to indicate when &lt;em&gt;all&lt;/em&gt;
of the multi-AZ RDS databases came back online.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3:07 PM PDT&lt;/strong&gt; Heroku brings core services back online, and restores
service to many applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4:15 PM PDT&lt;/strong&gt; Heroku reports: &amp;#8220;In some cases the process of bringing many
applications online simultaneously has created intermittent availability
and elevated error rates.&amp;#8221;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8:27 PM PDT&lt;/strong&gt; Heroku finishes restoring API services.&lt;/p&gt;

&lt;div style="font-weight: bold; text-align: center"&gt;22 April 2011&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;2:19 AM PDT&lt;/strong&gt; Heroku reports that all dedicated databases are back
online.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6:25 AM PDT&lt;/strong&gt; Heroku reports that new application creation is enabled.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1:30 PM PDT&lt;/strong&gt; Amazon reports &amp;#8220;majority&amp;#8221; of EBS volumes in affected zone
have been recovered.  Remaining volumes will require a more time-consuming
recovery process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;9:11 PM PDT&lt;/strong&gt; Amazon reports that &amp;#8220;control plane&amp;#8221; congestion is limiting
the speed at which they can recover the remaining volumes.&lt;/p&gt;

&lt;div style="font-weight: bold; text-align: center"&gt;23 April 2011&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;11:54 AM PDT&lt;/strong&gt; Amazon is still wrestling with control plane congestion.&lt;/p&gt;

&lt;blockquote&gt;
Quick update. We&amp;#8217;ve tried a couple of ideas to remove the bottleneck in
opening up the APIs, each time we&amp;#8217;ve learned more but haven&amp;#8217;t yet solved
the problem.  We are making progress, but much more slowly than we&amp;#8217;d
hoped. Right now we&amp;#8217;re setting up more control plane components that should
be capable of working through the backlog of attach/detach state changes
for EBS volumes. These are coming online, and we&amp;#8217;ve been seeing progress on
the backlog, but it&amp;#8217;s still too early to tell how much this will accelerate
the process for us.
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;8:39 PM PDT&lt;/strong&gt; Amazon finishes re-enabling their APIs for all recovered
volumes in the affected zone.  Not all EBS volumes have been recovered yet,
however.&lt;/p&gt;

&lt;blockquote&gt;
We continue to see stability in the service and are confident now that that
the service is operating normally for all API calls and all restored EBS
volumes.
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;8:39 PM PDT&lt;/strong&gt; Heroku reports that all applications are back online,
though a few still cannot deploy new code via git.&lt;/p&gt;

&lt;div style="font-weight: bold; text-align: center"&gt;24 April 2011&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;3:26 AM PDT&lt;/strong&gt; Amazon re-enables RDS APIs in the affected zone, but not
all databases have been recovered:&lt;/p&gt;

&lt;blockquote&gt;
The RDS APIs for the affected Availability Zone have now been restored. We
will continue monitoring the service very closely, but at this time RDS is
operating normally in all Availability Zones for all APIs and restored
Database Instances. Recovery is still underway for a small number of
Database Instances in the affected Availability Zone.
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;5:21 AM PDT&lt;/strong&gt; Heroku reports that all functionality is fully restored,
including deploying new applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7:35 PM PDT&lt;/strong&gt; Amazon reports that all EBS volumes are back online.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7:39 PM PDT&lt;/strong&gt; Amazon reports that all RDS databases are back online.&lt;/p&gt;

&lt;h3&gt;Strategies for surviving a major cloud outage, and associated downtime&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Rely on a single EBS volume with no snapshots.&lt;/strong&gt; If you relied on
  a single EBS volume with no snapshots, there&amp;#8217;s a chance that your site
  would have been offline for &lt;strong&gt;over 3.5 days&lt;/strong&gt; after the initial outage.
  There&amp;#8217;s also at least a 0.1% to 0.5% annual chance of losing your EBS
  volume entirely.  This is &lt;em&gt;not&lt;/em&gt; a recommended approach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Deploy into a single availability zone, with EBS snapshots.&lt;/strong&gt; In this
  scenario, if an availability zone goes down, you can theoretically
  restore from backup into another availability zone.  During this recent
  outage, your site might have remained offline for over &lt;strong&gt;12 hours&lt;/strong&gt;, and you
  might have lost any changes since your last backup (unless you
  reintegrated them manually).  Given Amazon&amp;#8217;s record during 2009
  and 2010, this could still give you 99.95% uptime if no other EBS volume
  failures occurred.  Despite the recent events, this may still be a viable
  strategy for many smaller, lower-revenue sites.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Rely on multi-AZ RDS databases to fail over to another availability zone.&lt;/strong&gt; This approach &lt;em&gt;should&lt;/em&gt; have lower downtime than
  relying on EBS snapshots, but in this case, the multi-AZ RDS failover
  mechanisms took &lt;strong&gt;longer than 14 hours&lt;/strong&gt; for some users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Run in 3 AZs, at no more than 60% capacity in each.&lt;/strong&gt; This is the
  approach taken by &lt;a href="https://twitter.com/#!/adrianco/status/61076362680745984"&gt;Netflix&lt;/a&gt;, which sailed through this
  outage with &lt;strong&gt;no known downtime&lt;/strong&gt;.  If a single AZ fails, then the
  remaining two zones will be at 90% capacity.  And because the extra
  capacity is running at all times, Netflix doesn&amp;#8217;t need to launch new
  instances in the middle of a &amp;#8220;bank run&amp;#8221; (see below).&lt;/p&gt;
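
&lt;p&gt;To make the arithmetic concrete, here is a minimal Ruby sketch. The &lt;code&gt;load_after_failure&lt;/code&gt; helper is hypothetical (it is not part of any AWS or Netflix tooling), and it assumes that traffic is spread evenly across zones and that total demand stays constant during the failure:&lt;/p&gt;

```ruby
# Hypothetical helper, for illustration only: the percentage load on each
# surviving zone after some zones fail. Assumes traffic is spread evenly
# and that total demand does not change during the failure.
def load_after_failure(zones, percent_per_zone, failed = 1)
  survivors = zones - failed
  raise ArgumentError, "no surviving zones" unless survivors.positive?
  (zones * percent_per_zone).fdiv(survivors)
end

puts load_after_failure(3, 60)     # prints 90.0
puts load_after_failure(3, 60, 2)  # prints 180.0: one zone cannot carry it all
```

&lt;p&gt;By the same arithmetic, anything above roughly 66% per zone would leave the two surviving zones over capacity.&lt;/p&gt;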

&lt;p&gt;&lt;strong&gt;5. Replicate data to another AWS region or cloud provider.&lt;/strong&gt; This is still
  the gold standard for sites which require high uptime guarantees.
  Unfortunately, it requires transmitting large amounts of data over the
  public internet, which is both expensive and slow.  In this case,
  downtime is a function of external systems and how quickly they can fail
  over to the replicated database.&lt;/p&gt;

&lt;p&gt;There are some other approaches, such as writing backups and transaction
logs to S3, where they are likely to remain available even in the case of
severe outages.&lt;/p&gt;

&lt;h3&gt;Lessons learned&lt;/h3&gt;

&lt;p&gt;For some excellent post-mortems, see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://agilesysadmin.net/ec2-outage-lessons"&gt;Today’s EC2 / EBS Outage: Lessons learned&lt;/a&gt;. A good overall analysis, with recommendations.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://joyeur.com/2011/04/22/on-cascading-failures-and-amazons-elastic-block-store/"&gt;On Cascading Failures and Amazon’s Elastic Block Store&lt;/a&gt;. How emergency fail-over code can actually make an outage worse.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Update:&lt;/em&gt; &lt;a href="http://blog.rightscale.com/2011/04/25/amazon-ec2-outage-summary-and-lessons-learned/"&gt;Amazon EC2 outage: summary and lessons learned&lt;/a&gt;. RightScale has posted an excellent post-mortem. They note that the outage actually spread to more EBS volumes over time, and link to a long list of related posts. (They also claim that the other AZs were functioning again after 4 hours, which doesn&amp;#8217;t match either Amazon&amp;#8217;s public claims or the experiences of people I&amp;#8217;ve spoken to.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here are some of the most important points:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The biggest danger in a well-engineered cloud system is a &amp;ldquo;&lt;a href="http://joyeur.com/2011/04/22/on-cascading-failures-and-amazons-elastic-block-store/"&gt;run on the bank&lt;/a&gt;&amp;#8221;, where initial failures trigger error-recovery code, which in turn may drive the load far beyond normal limits.&lt;/strong&gt; According to Amazon, an initial network problem triggered an
  EBS re-mirroring, which in turn overloaded their management plane.  This,
  in turn, triggered emergency recovery scripts written by AWS customers,
  forcing the total load even higher.  To stabilize the situation, Amazon
  was forced to disable API access to multiple zones.  Just as in 1933, the
  easiest solution to a bank run is a &lt;a href="http://en.wikipedia.org/wiki/Emergency_Banking_Act"&gt;bank holiday&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Availability Zone failures are correlated.&lt;/strong&gt; Even though Amazon claims
  that multiple availability zones should not fail at the same time, it&amp;#8217;s
  clear that all the availability zones within a region share a management
  plane.  This means that a large enough failure can overload the shared
  management plane.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. EBS remains the weakest link.&lt;/strong&gt; Recent months have seen widespread
  &lt;a href="http://blog.reddit.com/2011/03/why-reddit-was-down-for-6-of-last-24.html"&gt;complaints about EBS&lt;/a&gt;, and Netflix has published an article
  on &lt;a href="http://perfcap.blogspot.com/2011/03/understanding-and-using-amazon-ebs.html"&gt;working around those limitations&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Few cloud providers publish their disaster recovery plans, making it hard to estimate downtime.&lt;/strong&gt;  If you were a Heroku customer last week,
  you had no way to evaluate how Heroku would respond to a major outage, or
  their plans for keeping your site on the air.  As it turns out, they had
  widespread dependencies on EBS, and no plan for getting Heroku-based
  sites back on the air if an availability zone failed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Test your disaster recovery plan.&lt;/strong&gt;  If you haven&amp;#8217;t tested your
  disaster recovery plan, then you have no idea how long it will take you
  to get back on the air.&lt;/p&gt;&lt;p&gt;&lt;a href="http://www.randomhacks.net/articles/2011/04/25/aws-outage-timeline-and-recovery-strategy-downtimes"&gt;Read More&lt;/a&gt;&lt;/p&gt;</description>
      <pubDate>Mon, 25 Apr 2011 08:41:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:a86cd5b6-62f2-493f-adab-e9955d543b3a</guid>
      <author>Eric Kidd</author>
      <link>http://www.randomhacks.net/articles/2011/04/25/aws-outage-timeline-and-recovery-strategy-downtimes</link>
      <trackback:ping>http://www.randomhacks.net/articles/trackback/815</trackback:ping>
    </item>
    <item>
      <title>The state of Ruby, RDF and Rails 3</title>
      <description>&lt;p&gt;Recently, I was investigating the state of RDF in the Ruby world.  Here are
some notes, in case anybody is curious.  I have used only a few of
these Ruby RDF libraries, so please feel free to add your own comments with
corrections and other alternatives.&lt;/p&gt;

&lt;p&gt;There&amp;#8217;s also some stuff about &lt;a href="http://rubyonrails.org/screencasts/rails3/active-relation-active-model"&gt;ActiveModel and ActiveRelation&lt;/a&gt; down at the end, for people who are interested in Rails 3.&lt;/p&gt;&lt;p&gt;&lt;a href="http://www.randomhacks.net/articles/2010/12/20/the-state-of-ruby-rdf-and-rails-3"&gt;Read More&lt;/a&gt;&lt;/p&gt;</description>
      <pubDate>Mon, 20 Dec 2010 19:56:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:174f6eb6-ee3b-4c43-a665-b623fba65a63</guid>
      <author>Eric Kidd</author>
      <link>http://www.randomhacks.net/articles/2010/12/20/the-state-of-ruby-rdf-and-rails-3</link>
      <category>Ruby</category>
      <category>RDF</category>
      <category>Rails</category>
      <trackback:ping>http://www.randomhacks.net/articles/trackback/807</trackback:ping>
    </item>
    <item>
      <title>Feedhose demo: Real-time RSS using Node.js and Socket.io</title>
      <description>&lt;p&gt;Yesterday evening, I released an experimental Node.js/Socket.io application:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://feedhose.randomhacks.net/"&gt;feedhose.randomhacks.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Just leave your web browser open, and watch the New York Times headlines scroll by. Dave Winer is &lt;a href="http://scripting.com/stories/2010/10/13/ericKiddsFeedhoseClientInJ.html"&gt;sending me some traffic&lt;/a&gt; this morning, so I&amp;#8217;m going to find out how well this stack scales.&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ve tested it in IE 6, IE 7, Firefox 3.5 and a ridiculously new version of Chrome, and it runs without any major problems. Please &lt;a href="http://www.randomhacks.net/contact/"&gt;let me know&lt;/a&gt; if you encounter any problems in other browsers!&lt;/p&gt;

&lt;p&gt;During the day, I&amp;#8217;ll update this post with technical details: How it works, how much it costs to run, and some tricks I&amp;#8217;m using to keep the system alive.&lt;/p&gt;</description>
      <pubDate>Wed, 13 Oct 2010 12:09:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:d35d4fbc-1f9c-4780-bc9d-2b80d3e21127</guid>
      <author>Eric Kidd</author>
      <link>http://www.randomhacks.net/articles/2010/10/13/feedhose-realtime-rss-using-nodejs-and-socketio</link>
      <trackback:ping>http://www.randomhacks.net/articles/trackback/780</trackback:ping>
    </item>
    <item>
      <title>Visualizing WordNet relationships as graphs</title>
      <description>&lt;p&gt;The &lt;a href="http://wordnet.princeton.edu/"&gt;WordNet&lt;/a&gt; database contains all sorts of interesting relationships between words: it can categorize words into hierarchies, find the parts of an object, and answer many other interesting questions.&lt;/p&gt;

&lt;p&gt;The code below relies on the &lt;a href="http://www.nltk.org/"&gt;NLTK&lt;/a&gt; and &lt;a href="http://networkx.lanl.gov/"&gt;NetworkX&lt;/a&gt; libraries for Python.&lt;/p&gt;

&lt;h3&gt;Categorizing words&lt;/h3&gt;

&lt;p&gt;What, exactly, is a dog? It&amp;#8217;s a domestic animal and a carnivore, not to mention a physical entity (as opposed to an abstract entity, such as an idea). WordNet knows all these facts:&lt;/p&gt;

&lt;p&gt;&lt;a href="/files/dog.png"&gt;&lt;img src="/files/dog.png" width="406" height="306" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How do we generate this image? First, we look up the first entry for &amp;#8220;dog&amp;#8221; in WordNet. This returns a &amp;#8220;synset&amp;#8221;, or a set of words with equivalent meanings.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_python "&gt;dog = wn.synset('dog.n.01')&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Next, we compute the &lt;a href="http://en.wikipedia.org/wiki/Transitive_closure"&gt;transitive closure&lt;/a&gt; of the &lt;a href="http://en.wikipedia.org/wiki/Hyponymy"&gt;hypernym&lt;/a&gt; relationship, or (in English) we look for all the categories to which &amp;#8220;dog&amp;#8221; belongs, and all the categories to which &lt;em&gt;those&lt;/em&gt; categories belong, recursively:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_python "&gt;graph = closure_graph(dog,
                      lambda s: s.hypernyms())&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;After that, we just pass the resulting graph to &lt;a href="http://networkx.lanl.gov/"&gt;NetworkX&lt;/a&gt; for display:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_python "&gt;nx.draw_graphviz(graph)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3&gt;The implementation&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;closure_graph&lt;/code&gt; function calls &lt;code&gt;fn&lt;/code&gt; on the supplied synset, then recursively on each synset that &lt;code&gt;fn&lt;/code&gt; returns, and uses the results to build a directed &lt;a href="http://networkx.lanl.gov/"&gt;NetworkX&lt;/a&gt; graph. Put this code at the top of your file, so that &lt;code&gt;wn&lt;/code&gt; and &lt;code&gt;nx&lt;/code&gt; are available to your own code.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_python "&gt;from nltk.corpus import wordnet as wn
import networkx as nx

def closure_graph(synset, fn):
    seen = set()
    graph = nx.DiGraph()

    def recurse(s):
        if s not in seen:
            seen.add(s)
            graph.add_node(s.name)
            for s1 in fn(s):
                graph.add_node(s1.name)
                graph.add_edge(s.name, s1.name)
                recurse(s1)

    recurse(synset)
    return graph&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;By using a high-quality graph library, we make it much easier to merge, analyze and display our graphs.&lt;/p&gt;
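
&lt;p&gt;To experiment with the traversal without installing NLTK or NetworkX, here is a minimal sketch that runs on a small hand-written table. The &lt;code&gt;TOY_HYPERNYMS&lt;/code&gt; table and the &lt;code&gt;closure_edges&lt;/code&gt; helper are illustrative assumptions, not part of either library, and the sketch uses an explicit stack instead of recursion:&lt;/p&gt;

```python
# Hand-written table for illustration only; it is not read from WordNet.
# Each name maps to its immediate, more general categories.
TOY_HYPERNYMS = {
    'dog': ['domestic_animal', 'canine'],
    'domestic_animal': ['animal'],
    'canine': ['carnivore'],
    'carnivore': ['placental'],
    'placental': ['mammal'],
    'mammal': ['vertebrate'],
    'vertebrate': ['chordate'],
    'chordate': ['animal'],
    'animal': ['organism'],
    'organism': ['living_thing'],
    'living_thing': ['whole'],
    'whole': ['object'],
    'object': ['physical_entity'],
    'physical_entity': ['entity'],
    'entity': [],
}


def closure_edges(start, fn):
    # Collect every (node, parent) edge reachable from start by applying fn.
    seen = set()
    edges = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already expanded, so skip it
        seen.add(node)
        for parent in fn(node):
            edges.add((node, parent))
            stack.append(parent)
    return edges


edges = closure_edges('dog', lambda name: TOY_HYPERNYMS[name])
```

&lt;p&gt;Each pair in &lt;code&gt;edges&lt;/code&gt; corresponds to one &lt;code&gt;graph.add_edge&lt;/code&gt; call in &lt;code&gt;closure_graph&lt;/code&gt;.&lt;/p&gt;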

&lt;h3&gt;More graphs&lt;/h3&gt;

&lt;p&gt;Parts of the finger, generated with &lt;code&gt;synset('finger.n.01')&lt;/code&gt; and &lt;code&gt;part_meronyms&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="/files/wn_finger.png"&gt;&lt;img src="/files/wn_finger.png" width="406" height="306" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Types of running, generated with &lt;code&gt;synset('run.v.01')&lt;/code&gt; and &lt;code&gt;hyponyms&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="/files/wn_run.png"&gt;&lt;img src="/files/wn_run.png" width="406" height="306" /&gt;&lt;/a&gt;&lt;/p&gt;</description>
      <pubDate>Tue, 29 Dec 2009 20:38:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:bf20469d-bdce-4636-a7f1-33579f49b54c</guid>
      <author>Eric Kidd</author>
      <link>http://www.randomhacks.net/articles/2009/12/29/visualizing-wordnet-relationships-as-graphs</link>
      <category>Python</category>
      <category>NLP</category>
      <trackback:ping>http://www.randomhacks.net/articles/trackback/739</trackback:ping>
    </item>
    <item>
      <title>Experimenting with NLTK</title>
      <description>&lt;p&gt;The &lt;a href="http://www.nltk.org/"&gt;Natural Language Toolkit&lt;/a&gt; for Python is a great framework for simple, non-probabilistic natural language processing. Here are some example snippets (and some troubleshooting notes).&lt;/p&gt;

&lt;h3&gt;Concordances&lt;/h3&gt;

&lt;p&gt;We can search for &amp;#8220;dog&amp;#8221; in &lt;a href="http://www.gutenberg.org/etext/1695"&gt;Chesterton&amp;#8217;s &lt;em&gt;The Man Who Was Thursday&lt;/em&gt;&lt;/a&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_default "&gt;&amp;gt;&amp;gt;&amp;gt; from nltk.book import *
&amp;gt;&amp;gt;&amp;gt; text9.concordance(&amp;quot;dog&amp;quot;, width=40)
Displaying 4 of 4 matches:
ead of a cat or a dog , it could not ha
d you ever hear a dog bark like that ?&amp;quot;
aid , &amp;quot; is that a dog -- anybody ' s do
og -- anybody ' s dog ?&amp;quot; There broke up&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
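
&lt;p&gt;A concordance is just a list of every occurrence of a word, shown with a fixed amount of context on either side. Here is a rough sketch of that idea in plain Python. The &lt;code&gt;toy_concordance&lt;/code&gt; function and the token list are made up for illustration, and NLTK&amp;#8217;s own output may differ in its details:&lt;/p&gt;

```python
def toy_concordance(tokens, word, width=40):
    # Return one context line per occurrence of word, at most width wide.
    # Characters of context on each side, leaving room for two spaces.
    half = max(1, (width - len(word) - 2) // 2)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() != word.lower():
            continue
        # Rebuilding the context per hit is slow on long texts, fine for a demo.
        left = ' '.join(tokens[:i])[-half:]
        right = ' '.join(tokens[i + 1:])[:half]
        lines.append('{0} {1} {2}'.format(left.rjust(half), token, right))
    return lines


# A short hand-typed token list, standing in for a real tokenized text.
tokens = 'did you ever hear a dog bark like that ? is that a dog ?'.split()
lines = toy_concordance(tokens, 'dog', width=40)
for line in lines:
    print(line)
```

&lt;p&gt;Padding the left context with &lt;code&gt;rjust&lt;/code&gt; keeps the search word in the same column on every line, which is what makes a concordance easy to scan.&lt;/p&gt;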

&lt;h3&gt;Synonyms and categories&lt;/h3&gt;

&lt;p&gt;We can use WordNet to look up synonyms:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_python "&gt;from nltk.corpus import wordnet

dog = wordnet.synset('dog.n.01')
print dog.lemma_names&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This prints:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_python "&gt;['dog', 'domestic_dog', 'Canis_familiaris']&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can also look up the &amp;#8220;hypernyms&amp;#8221;, or larger categories that include the word &amp;#8220;dog&amp;#8221;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_python "&gt;paths = dog.hypernym_paths()

def simple_path(path):
    return [s.lemmas[0].name for s in path]

for path in paths:
    print simple_path(path)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This prints:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_python "&gt;['entity', 'physical_entity', 'object',
 'whole', 'living_thing', 'organism',
 'animal', 'domestic_animal', 'dog']
['entity', 'physical_entity', 'object',
 'whole', 'living_thing', 'organism',
 'animal', 'chordate', 'vertebrate',
 'mammal', 'placental', 'carnivore',
 'canine', 'dog']&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
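
&lt;p&gt;Conceptually, each path runs from a root category with no hypernyms down to the starting word. The following sketch shows the idea on a hand-written table. The table and the &lt;code&gt;toy_hypernym_paths&lt;/code&gt; helper are assumptions made for illustration, not NLTK&amp;#8217;s own code:&lt;/p&gt;

```python
# Hand-written table for illustration only; it is not read from WordNet.
# Each name maps to its immediate, more general categories.
TOY_HYPERNYMS = {
    'dog': ['domestic_animal', 'canine'],
    'domestic_animal': ['animal'],
    'canine': ['carnivore'],
    'carnivore': ['placental'],
    'placental': ['mammal'],
    'mammal': ['vertebrate'],
    'vertebrate': ['chordate'],
    'chordate': ['animal'],
    'animal': ['organism'],
    'organism': ['living_thing'],
    'living_thing': ['whole'],
    'whole': ['object'],
    'object': ['physical_entity'],
    'physical_entity': ['entity'],
    'entity': [],
}


def toy_hypernym_paths(name, table):
    # Return every path from a root (a name with no hypernyms) down to name.
    parents = table[name]
    if not parents:
        return [[name]]
    paths = []
    for parent in parents:
        # Extend each path that reaches the parent with the current name.
        for path in toy_hypernym_paths(parent, table):
            paths.append(path + [name])
    return paths


paths = toy_hypernym_paths('dog', TOY_HYPERNYMS)
for path in paths:
    print(path)
```

&lt;p&gt;Because &lt;code&gt;dog&lt;/code&gt; has two immediate hypernyms in this table, the sketch yields two paths that share everything from &lt;code&gt;entity&lt;/code&gt; down to &lt;code&gt;animal&lt;/code&gt;.&lt;/p&gt;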

&lt;p&gt;For more neat examples, take a look at the &lt;a href="http://www.nltk.org/book"&gt;NLTK book&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Installation notes&lt;/h3&gt;

&lt;p&gt;While setting up NLTK, I bumped into a few problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; The &lt;code&gt;dispersion_plot&lt;/code&gt; function returns immediately without displaying anything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; &lt;a href="http://matplotlib.sourceforge.net/users/shell.html#mpl-shell"&gt;Configure your matplotlib back-end correctly.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; The &lt;code&gt;nltk.app.concordance()&lt;/code&gt; GUI fails with the error:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_default "&gt;out of stack space (infinite loop?)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; &lt;a href="http://code.google.com/p/nltk/issues/detail?id=445"&gt;Recompile Tcl with threads.&lt;/a&gt; On the Mac:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_sh "&gt;sudo port install tcl +threads&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</description>
      <pubDate>Mon, 28 Dec 2009 21:31:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:d23d8142-890b-4aed-ad8c-38b73ffc102a</guid>
      <author>Eric Kidd</author>
      <link>http://www.randomhacks.net/articles/2009/12/28/experimenting-with-nltk</link>
      <category>Python</category>
      <category>NLP</category>
      <trackback:ping>http://www.randomhacks.net/articles/trackback/738</trackback:ping>
    </item>
  </channel>
</rss>
