Versioning: The Next Big Thing
In the web development world, anyways. So, in the grand scheme of things, maybe not a huge deal to anyone else. Versioning is going to be one of the biggest problems and opportunities there is in web development, and it’s going to take us at least five years to get it right.
Actually, let me admit up front that five years is a shot in the dark, and optimistic to boot. If people keep hanging out with bondage and discipline languages like Java and C# that are still catching up to language and framework developments from the ’90s, it’ll take us more like ten years. (Attention Lisp Weenies: Yes, I know you solved every problem forty years ago for certain values of “solved” and “problem” while the rest of us were getting work done.) Not only is versioning a difficult technical problem, it will be difficult to educate programmers in what it is, how it works, and why you’ll wish you had used it about a year after you decided it was too much work.
I’m writing to help get the ball rolling on the process of solving this problem and publicizing it. So maybe it’s about time I go into what the problem is instead of just yammering on, eh?
Because your website is served from your machines, you can update it every few days or hours. This is almost pedantically obvious, but it’s a big change from having to press CDs, ship them to a store, and wait for them to go home with customers -- it’s even a big change from posting a new version online for customers to download and install.
The web’s faster pace means that updating, say, your database model to add or remove fields is a common occurrence rather than something your InstallShield wizard does every 1-2 years. It’s vital that changes be streamlined and safeguarded. What kind of defaults get set when you add fields? How do you save data when you lose fields? How does the programmer make sure the Right Thing happens? If you have the fields Name and Address and want to combine them into a simple MailingAddress field, you’re not just deleting two columns and adding one.
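To make the Name/Address example concrete, here is a rough sketch of the "add" half of that change, using Python's built-in sqlite3 module. The table name customer, the column names, and the choice to join the two values with a newline are all my own assumptions for illustration; the point is only that the new column needs a default and the old data has to be carried over before anything gets dropped.

```python
import sqlite3

def add_mailing_address(conn):
    """Add mailing_address and backfill it from the existing name and address."""
    # The new column gets a default so rows written by older code stay valid.
    conn.execute(
        "ALTER TABLE customer ADD COLUMN mailing_address TEXT NOT NULL DEFAULT ''"
    )
    # Carry the existing data over instead of losing it; NULLs become empty strings.
    conn.execute(
        "UPDATE customer "
        "SET mailing_address = coalesce(name, '') || char(10) || coalesce(address, '') "
        "WHERE mailing_address = ''"
    )
    conn.commit()

# Hypothetical table and row, just to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute(
    "INSERT INTO customer (name, address) VALUES ('Ada Lovelace', '12 St James Square')"
)
add_mailing_address(conn)
print(conn.execute("SELECT mailing_address FROM customer").fetchone()[0])
```

Note that the old name and address columns are deliberately left alone here. Dropping them is a separate, riskier change.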
Not only is it complex to change your schema, getting those changes into production is not easy. The script that updates definitions and migrates data (which we tested somewhere other than production, right?) needs to placate the multiple web and database servers used for load-balancing. If you update the database server first, the application code on the web server shouldn’t break because it didn’t get the word about the new schema.
I see too many problems with trying to get multiple servers to update to a new version at the Exact Same Instant. I think schema changes are going to have to be a four-step process: a backwards-compatible first step applied to the database (add columns, loosen restrictions), a code update (with testing!), a backwards-incompatible database change (drop columns, tighten restrictions), and finally another code update to remove outdated usage.
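Here is a toy model of why the order matters, again in Python. Nothing in it touches a real database: the schema is just a set of column names, each running code version is the set of columns it needs, and a code update is split into "begins" and "ends" because servers don't all switch at once. The version labels v1, v2, and v3 and all the function names are made up for this sketch, and it only checks that the columns exist, not that the data in them stays consistent.

```python
OLD_COLUMNS = {"name", "address"}
NEW_COLUMNS = {"mailing_address"}

def compatible(schema, running):
    """True if every running code version finds the columns it needs."""
    return all(needs <= schema for needs in running.values())

def first_breaking_step(steps):
    """Apply steps in order; return the label of the first one that breaks running code."""
    schema = set(OLD_COLUMNS)
    running = {"v1": OLD_COLUMNS}
    for label, change in steps:
        change(schema, running)
        if not compatible(schema, running):
            return label
    return None

def add_new_columns(schema, running):
    schema.update(NEW_COLUMNS)

def drop_old_columns(schema, running):
    schema.difference_update(OLD_COLUMNS)

def start_deploy_v2(schema, running):
    running["v2"] = NEW_COLUMNS  # old and new code now run side by side

def finish_deploy_v2(schema, running):
    del running["v1"]

def deploy_cleanup(schema, running):
    running["v3"] = NEW_COLUMNS  # same columns as v2, minus the outdated usage
    del running["v2"]

def swap_schema(schema, running):
    add_new_columns(schema, running)
    drop_old_columns(schema, running)

FOUR_STEP = [
    ("1. add columns", add_new_columns),
    ("2. code update begins", start_deploy_v2),
    ("2. code update ends", finish_deploy_v2),
    ("3. drop columns", drop_old_columns),
    ("4. cleanup code update", deploy_cleanup),
]

ALL_AT_ONCE = [
    ("swap schema", swap_schema),
    ("code update begins", start_deploy_v2),
    ("code update ends", finish_deploy_v2),
]

print(first_breaking_step(FOUR_STEP))    # None
print(first_breaking_step(ALL_AT_ONCE))  # swap schema
```

In the four-step plan every intermediate state has a schema that satisfies whatever code is running. In the all-at-once plan, the old code is still live the moment its columns disappear.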
Jacob Kaplan-Moss, one of the creators of Django, just opened the discussion of versioning in Django (with links to prior art). For a web framework with an object-relational mapper, versioning is an important feature.
In addition to databases, web APIs need versioning. Adam Kelsey has opened that discussion, but so far the solutions are pretty rough and I don’t have anything to add.
If you’d like to read more about how web development differs from other development, Steve Yegge wrote an excellent article titled “It’s Not Software,” coining the term “servware” to emphasize that web developers need to recognize the ways our code ends up so different. (It’s easy to turn this topic into rah-rah “We’re so special we don’t have to learn from history” bullshit, but nobody wants that.)