Caching Dictionaries in Python vs. Ruby
A while ago I made a slightly-underinformed post (see the corrections in the comments) trying to draw a difference between Python and Ruby. I’ve finally got a decent example and can explain what I’m getting at.
I’m processing all the items in a big list, and part of that is performing some expensive calculation on an item attribute. There are only a few dozen values for the attribute, so I cache them:
results = {}
for item in really_big_list:
    if item.attribute in results:
        result = results[item.attribute]
    else:
        result = expensive_calculation(item.attribute)
        results[item.attribute] = result  # store it so the next lookup hits the cache
    # ... do something useful with item and result
It’s simple and readable, but it’s also lengthy. If I’m doing several expensive calculations on different attributes (and I am), my actual work gets lost in the noise. So I defined a dictionary that can do the heavy operation and cache the result:
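class LambdaDict(dict):
    def __init__(self, l):
        super(LambdaDict, self).__init__()
        self.l = l  # function that computes the value for a missing key

    def __getitem__(self, key):
        if key in self:
            return self.get(key)
        else:
            # first lookup: compute the value, cache it, then return it
            self.__setitem__(key, self.l(key))
            return self.get(key)
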
# and now my code becomes
results = LambdaDict(lambda key:expensive_calculation(key))
for item in really_big_list:
    result = results[item.attribute]
    # ... do something useful with item and result
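For comparison, here is a minimal sketch of the same idea built on `dict`'s `__missing__` hook, which Python calls from `dict.__getitem__` when a subclass defines it and the key is absent. The names `MissingDict`, `calls`, and the body of `expensive_calculation` are hypothetical stand-ins for illustration, not part of my real code.

```python
class MissingDict(dict):
    """dict that computes and caches a value the first time a key is looked up."""

    def __init__(self, compute):
        super().__init__()
        self.compute = compute  # called once per missing key

    def __missing__(self, key):
        # dict.__getitem__ calls this hook only when the key is absent
        value = self[key] = self.compute(key)
        return value


calls = []  # records which keys actually triggered a computation


def expensive_calculation(attribute):  # hypothetical stand-in for the real work
    calls.append(attribute)
    return attribute.upper()


results = MissingDict(expensive_calculation)
values = [results[attr] for attr in ["red", "blue", "red", "red"]]
```

Four lookups only trigger two computations, one per distinct key. Note that `results.get(key)` bypasses `__missing__`, so this only helps with square-bracket lookups.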
That’s really nice, clean code that I’m satisfied with. The difference between Python and Ruby here is that Ruby hashes (the equivalent of dicts) include this behavior by default: just pass a block (lambda) when constructing the hash and it’ll be called for every missing key. The block receives the hash and the key, so it can store its result and only run once per key.
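The closest thing in Python's standard library is `collections.defaultdict`, and a small sketch shows why it doesn't cover this case: its factory is called with no arguments, so it can't compute a value from the missing key. The function body and the sample words below are made up for illustration.

```python
from collections import defaultdict


def expensive_calculation(attribute):  # hypothetical stand-in for the real work
    return len(attribute)


# The factory takes no arguments, which is fine for "start with an empty list"
by_length = defaultdict(list)
for word in ["hash", "dict", "block"]:
    by_length[expensive_calculation(word)].append(word)

# A factory that needs the key fails as soon as a lookup misses,
# because defaultdict calls it as expensive_calculation()
keyed = defaultdict(expensive_calculation)
try:
    keyed["hash"]
    factory_got_key = True
except TypeError:
    factory_got_key = False
```

So `defaultdict` handles "give me a fresh default", but not "compute the value from the key", which is what the Ruby block does.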
I’ll get the same behavior in Python or Ruby; it’s just that the default Python object doesn’t give me a handy way to build a dictionary that performs some arbitrary expensive method on a simple lookup. Ruby is in favor of implicit magic, so it holds out the hook. This is the difference I was trying to get at: Python’s builtin objects have fewer methods and convenient hooks for me to do weird and useful things than Ruby’s, and Ruby will even let me tinker with them. Ruby has open classes, so I can extend both the builtin and defined classes with my own methods:
module Enumerable # included by Arrays and similar objects
  def sum
    inject(0) { |x, y| x + y }
  end
end
This adds a sum() method to every Array in my program, whether I construct it (like I created my own results dict) or get it from some library code. In Python I’d have to define my own list subclass, and I’d be out of luck if I’m getting plain lists back from a library, because builtin types like list don’t accept new methods. Monkey patching is Python’s (deliberately clunky) name for adding or replacing a method at runtime. It works on classes written in Python, but not on the builtins, so I’d have to wrap or convert every object as it’s returned from the library rather than just being able to declare that every object of that class should have my method.
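A short sketch of what that looks like in practice, assuming CPython, where assigning an attribute on a builtin type raises `TypeError`. The names `Summable`, `LibraryResult`, and `total` are hypothetical, chosen only for this illustration.

```python
# Attempt to reopen the builtin list type, Ruby style
try:
    list.total = lambda self: sum(self)
    patched_builtin = True
except TypeError:
    # CPython refuses to set attributes on builtin types
    patched_builtin = False


class Summable(list):
    """My own list subclass: only objects I construct get the method."""

    def total(self):
        return sum(self)


class LibraryResult:  # hypothetical class defined in Python by some library
    def __init__(self, values):
        self.values = values


# Classes written in Python can be patched, and every instance sees the method
LibraryResult.total = lambda self: sum(self.values)

from_library = [3, 4, 5]             # a plain list handed back by a library
converted = Summable(from_library)   # has to be converted before .total() exists
```

The conversion on the last line is the step Ruby's open classes let me skip.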
If this is the first time you’ve seen open classes: yes, when I first saw it I felt exactly the same way you do. Dangerous unreliable hackery, a recipe for disaster. But I’ve seen a lot of useful things happen in Rails because of it, and my Ruby projects have benefited from being able to add a method to an existing library’s objects, to overwrite (or just chain my code before or after) another method.
It’s something of a last resort that lets me build the cleanest object system I can as I interface with builtin objects and library code. I don’t have any utility methods floating around, and I don’t have to tweak each object as I get it back from an API call. As I cook, Ruby’s metaprogramming is the garnish that finishes the dish. I called Python academically inconvenient because it feels like there’s an academic designer at my shoulder saying, “No, you don’t want to do that, it’d be messy and might be abused. Pedagogically unsound.” I love that Python pushes me to build explicit and obvious code, but I find myself wanting to tuck in one little bit of magic to make the code perfect.