14 Years to Unicode

Last August I was chatting with some friends (every coder has an IRC channel with around six nerds they shoot the breeze with, right?) and said:

In other news, I will be really happy in like fifteen years when everyone has broken down and admitted that Unicode is the only sane way to go and all the tools use it by default. At the moment it’s built into most everything, but it 1. is generally broken and 2. is not the default.

(Clarification: I mean Unicode implementations are broken, not the standard itself.)

Google just posted a great graph of the uptake of Unicode on the web:

[Graph showing strongly rising Unicode adoption]

This is really encouraging: it’s great to see Unicode taking a plurality (if not yet a majority or totality) share. I think I had my two concerns reversed. Unicode needs to be set as the default for everything before all the varying implementations work out their myriad bugs.

But that’s progressing well. Python 3000 and Ruby 1.9 will use Unicode as a default, and more content-producing tools are picking it up. There are still lots of random standards that don’t support it (think of metadata stuffed into MP3s and images) or only half-support it in the “well, it’s 8-bit clean...” way that can leave you guessing at the encoding.
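To make the “guessing at encoding” problem concrete, here is a minimal Python 3 sketch. The sample string and the `guess_decode` helper are made up for illustration, and the UTF-8-then-Latin-1 fallback is just one common heuristic, not something any particular metadata format prescribes.

```python
# An "8-bit clean" field stores bytes but not which encoding they are in.
# Hypothetical example: a tag value written as UTF-8 by one tool.
raw = "Café".encode("utf-8")          # b'Caf\xc3\xa9'

# The same bytes read back under two different assumptions.
as_utf8 = raw.decode("utf-8")         # 'Café'
as_latin1 = raw.decode("latin-1")     # 'CafÃ©' (classic mojibake)


def guess_decode(data: bytes) -> str:
    """Hypothetical helper: try UTF-8 first, fall back to Latin-1.

    Latin-1 maps every byte to a character, so the fallback never fails,
    but it can silently produce the wrong text.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("latin-1")


print(as_utf8, as_latin1)
print(guess_decode("Café".encode("latin-1")))
```

The fallback only works here because `b'Caf\xe9'` happens to be invalid UTF-8. Plenty of byte sequences are valid in several encodings at once, which is exactly why an undeclared encoding leaves you guessing.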

I don’t have anything profound to say; I just like Unicode and wanted to make my prediction public.