The Two Kinds of Integers

Published: 2015-04-11
Category: Code
Tags: Haskell, Ruby, abstractions, assembly, values

A small thing I see tripping up developers is that there are two kinds of integers: numbers and identifiers.

We don’t know what the integer 4 means unless we know what it’s for. Are we counting things, or identifying them? It’s really easy to slip between the two, like this real code snippet I found:

```haml
- if event.event_type < 3
  = event.label.humanize
```

Events have an integer to enumerate their possible types. It just so happens that the first two should have their label printed and the later ones shouldn’t, but that has nothing to do with the numeric properties of the IDs. It doesn’t make sense to multiply an ID by 7 or take the absolute value of an ID, but it’s often tempting to add 1 for the next item or compare magnitudes, even though the fact that you can do these things is an implementation detail.
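One way to keep the template from leaning on the magnitude of the ID is to state the rule by name. Here is a minimal sketch, not the original application's code: the module `EventType`, the constants `MEETING`, `CALL`, and `NOTE`, their values, and the predicate `labelled?` are all invented for illustration. The snippet only tells us that types below 3 print their label.

```ruby
require "set"

# Hypothetical event types; the names and values are assumptions.
module EventType
  MEETING = 1
  CALL    = 2
  NOTE    = 3

  # The rule is expressed as membership in a named set,
  # not as a numeric comparison between identifiers.
  LABELLED = Set[MEETING, CALL].freeze

  def self.labelled?(event_type)
    LABELLED.include?(event_type)
  end
end
```

The template condition could then read `- if EventType.labelled?(event.event_type)`, and adding a fourth type that also prints its label would mean editing the set rather than hoping the new ID lands on the right side of 3.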

One of the things I’ve really liked in the Haskell code I’ve read is how, everywhere, the basic types like integer are aliased to meaningful types like age, distance, or different identifiers. There’s a lot more ceremony involved in creating a value class in Ruby, so it doesn’t happen.
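To give a feel for that ceremony, here is a rough sketch of what an identifier type might look like in plain Ruby. The class name `EventId` and the validity rule (a positive integer) are assumptions made for the example. The point is that the wrapper supports equality and hashing but deliberately offers no arithmetic or ordering.

```ruby
# A hypothetical identifier type: it wraps an Integer but only
# supports equality and hashing, not arithmetic or comparison.
class EventId
  def initialize(value)
    # Validity is checked in one place (the rule itself is an assumption).
    unless value.is_a?(Integer) && value.positive?
      raise ArgumentError, "invalid event id: #{value.inspect}"
    end
    @value = value
    freeze
  end

  # Explicit escape hatch for persistence, URLs, and the like.
  def to_i
    @value
  end

  def ==(other)
    other.is_a?(EventId) && other.to_i == @value
  end
  alias eql? ==

  # Equal IDs hash alike, so they work as Hash keys.
  def hash
    [self.class, @value].hash
  end

  def to_s
    "EventId(#{@value})"
  end
end
```

With this in place `EventId.new(4) == EventId.new(4)` is true, while `EventId.new(4) + 1` and `EventId.new(4) < EventId.new(5)` raise `NoMethodError`, because the class never defines those operations. That is roughly twenty-five lines for what a Haskell `newtype` declares in one.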

When we don’t have classes or types, we get primitive obsession, where the knowledge of how to work with a value (especially basic things, like what it means to be valid) spreads across the system instead of clustering. (To say nothing of the problems avoided by static typing.) Sometimes we get misled by the primitives and misuse them in ways that result in weird bugs.

“You never know where it’s going to put things”, he explained, “so you’d have to use separate constants”.

It was a long time before I understood that remark. Since Mel knew the numerical value of every operation code, and assigned his own drum addresses, every instruction he wrote could also be considered a numerical constant. He could pick up an earlier “add” instruction, say, and multiply by it, if it had the right numeric value. His code was not easy for someone else to modify.

Leaping between abstraction and representation was useful decades ago when computers were far more limited, but now I can only see code like the sample above as sloppy. I assume there are still some places where it’s useful, in the design of embedded software or low-level protocols, but up in the business code I write all day it’s a great way to set up future bugs.

I’m writing this because x86 assembly was my third programming language, learned when I was still very green. Knowing assembly is wonderful for understanding what’s happening behind all the abstractions, but it left me with a really poor ability to form my own. I used to think I was clever for coding at multiple layers of abstraction at once, and now I think I’m clever for not.