Objects as Ruby Hash Keys

If you've been following the Ruby 3x3 effort, you've probably heard of Optcarrot. It's an NES emulator written in pure Ruby.

I was recently looking over the Optcarrot source and one interesting detail stuck out to me. It makes extensive use of a feature of Ruby's hashes that is often overlooked, but quite useful. That is the ability to use any object as a hash key.

The context: NES Memory Mapping

As high-level programmers, we tend to think of memory as RAM. But at a lower level, "memory" has many other uses.

Reading from and writing to "memory" is how the NES's CPU communicates with the graphics chip (the PPU), the control pads and any special electronics on the cartridge. Depending on the address used, a write_to_memory method call could reset the joystick, swap out VRAM or play a sound.

How would you implement this in Ruby?

Optcarrot does it by storing two Method objects for each of the 65536 addresses. One is a getter and one is a setter. It looks something like this:

@getter_methods[0x0001] = @ram.method(:[])
@setter_methods[0x0001] = @ram.method(:[]=)
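
To make the idea concrete, here is a minimal sketch of such a dispatch table. The TinyBus class, its method names, and the choice of 0x4000 as a "sound" address are all invented for illustration; this is not Optcarrot's actual code.

```ruby
# Hypothetical, heavily simplified memory bus; not Optcarrot's real implementation.
class TinyBus
  attr_reader :log

  def initialize
    @ram = Array.new(0x0800, 0)
    @log = []
    @getter_methods = Array.new(0x10000)
    @setter_methods = Array.new(0x10000)

    # Addresses 0x0000-0x07FF are backed by plain RAM.
    0x0000.upto(0x07FF) do |addr|
      @getter_methods[addr] = @ram.method(:[])
      @setter_methods[addr] = @ram.method(:[]=)
    end

    # Pretend address 0x4000 talks to a sound chip (an assumption for this sketch).
    @getter_methods[0x4000] = method(:read_sound)
    @setter_methods[0x4000] = method(:write_sound)
  end

  # Every read and write is routed through the Method object stored for that address.
  def read(addr)
    @getter_methods[addr].call(addr)
  end

  def write(addr, value)
    @setter_methods[addr].call(addr, value)
  end

  private

  def read_sound(_addr)
    0
  end

  def write_sound(_addr, value)
    @log << "sound: #{value}"
  end
end

bus = TinyBus.new
bus.write(0x0001, 42)  # lands in RAM
bus.write(0x4000, 7)   # triggers the pretend sound chip instead
bus.read(0x0001)       # => 42
bus.log                # => ["sound: 7"]
```

The caller never needs to know what sits behind an address. The same write call stores a byte or triggers hardware depending on which Method object is in the table.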

The problem: Duplicate Objects

The problem with using Object#method in this way is that it creates a lot of individual Method objects that are identical.

We can see this by looking at object_id:

> a = []
> a.method(:[]=).object_id
=> 70142391223600
> a.method(:[]=).object_id
=> 70142391912420

The two Method objects have different object_id values, so they're different objects even though they do the same thing.
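
If you'd rather not compare object_id values by eye, the same point can be checked directly. This is a small sketch using only core Ruby methods; the variable names are arbitrary.

```ruby
a = []
first  = a.method(:[]=)
second = a.method(:[]=)

first.equal?(second)  # => false, two separate objects in memory
first == second       # => true, same receiver and same method definition
```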

Normally, we might not care about a few extra Method objects, but in this case we're dealing with thousands of them.

The solution: Memoizing via a Hash

Optcarrot avoids the duplicate Method object problem with a trick that's so simple it's easy to overlook.

It uses a hash to memoize and deduplicate. The simplified code below demonstrates the technique:

def initialize
  @setter_methods = []
  @setter_cache = {}
  ...
end

def add_setter(address, setter)
  # Doesn't store duplicates
  @setter_cache[setter] ||= setter

  # Use the deduped version
  @setter_methods[address] = @setter_cache[setter]  
end
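
Here is a runnable version of the same idea that you can paste into IRB. The SetterTable class name and the attr_reader are additions made for this sketch so the result can be inspected; the caching logic is the same as in the simplified code above.

```ruby
# Hypothetical wrapper class, used only to make the technique runnable on its own.
class SetterTable
  attr_reader :setter_methods, :setter_cache

  def initialize
    @setter_methods = []
    @setter_cache = {}
  end

  def add_setter(address, setter)
    # Keep the first Method object seen for a given receiver and method...
    @setter_cache[setter] ||= setter
    # ...and reuse that same object for every later address.
    @setter_methods[address] = @setter_cache[setter]
  end
end

ram = []
table = SetterTable.new
0x0000.upto(0x07FF) { |addr| table.add_setter(addr, ram.method(:[]=)) }

table.setter_cache.size                                  # => 1
table.setter_methods.size                                # => 2048
table.setter_methods[0].equal?(table.setter_methods[1])  # => true
```

Each call to ram.method(:[]=) still creates a temporary Method object, but only the first one is kept. The rest are never stored, so they can be garbage collected.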

This works because Hash accepts any object as a key, and two Method objects for the same receiver and the same method count as the same key.

If this is confusing, try it in IRB with strings:

> cache = {}
> cache["foo"] ||= "bar"
=> "bar"
> cache["foo"] ||= "baz"
=> "bar"

Now, consider that in Ruby, strings are instances of the class String. The mechanism Ruby uses to look up a string key is the same one it uses to look up a Method object key.

How Hash calculates equality

When using non-string objects as hash keys, the question arises: how does Hash know if two keys are equal?

The answer has two parts. First, Hash calls the key's hash method, which returns an integer that Hash uses to decide where to look. It looks like this:

> a.method(:[]=).hash
=> 929915641391564853

Objects that are equal must produce the same hash value, so comparing hash values is a fast first check:

a.hash == b.hash

A matching hash value isn't proof of equality, though, because two unrelated objects can occasionally produce the same number. So when the hash values match, Hash confirms the result with the eql? method:

a.eql?(b)

Both checks pass for the Method objects in our example:

> a.method(:[]=).hash == a.method(:[]=).hash
=> true
> a.method(:[]=).eql?(a.method(:[]=))
=> true
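
The same rules apply to your own classes. As a rough sketch, here is a made-up Point class (not something from Optcarrot) that defines hash and eql? so that two separate instances with the same coordinates act as the same hash key:

```ruby
# Hypothetical value class used only to illustrate hash-key lookup.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Used by Hash to confirm that two keys with the same hash value really match.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias_method :==, :eql?

  # Used by Hash to decide where to look; equal points must return equal values.
  def hash
    [x, y].hash
  end
end

lookup = { Point.new(1, 2) => "found" }
lookup[Point.new(1, 2)]  # => "found"
lookup[Point.new(3, 4)]  # => nil
```

Without these two methods, Point would fall back to the defaults inherited from Object, which treat every instance as a distinct key.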

Conclusion

Having gotten used to Ruby web development patterns, it was really interesting for me to look at the Optcarrot source and see how a real-time, non-web app uses different patterns. In a web app I doubt I'll ever make an array with 65536 elements, but here, as part of the setup for a "desktop" app, it makes a lot of sense.

If you have any questions or comments, please be in touch at starr@honeybadger.io or @StarrHorne on Twitter.