Case Sensitivity in In-Page Anchors
As I’ve been working on Chibrary, I ran into a small cross-browser compatibility issue: only some browsers treat anchors as case-sensitive. The call numbers that uniquely identify messages would be perfect for linking to in the middle of a long discussion, but some of them only differ by case.
So I wondered: is that acceptable in linking to an anchor inside the page? A quick experiment with Firefox worked fine, but I wanted to be thorough.
When the HTML5 spec talks about scrolling to a “fragment identifier” (aka “fragid” or “named anchor”), it mentions that the special value “top” is matched case-insensitively, and otherwise:
- Let decoded fragid be the result of applying the UTF-8 decoder algorithm to fragid bytes. If the UTF-8 decoder emits a decoder error, abort the decoder and instead jump to the step labeled no decoded fragid.
- If there is an element in the DOM that has an ID exactly equal to decoded fragid, then the first such element in tree order is the indicated part of the document; stop the algorithm here.
- No decoded fragid: If there is an a element in the DOM that has a name attribute whose value is exactly equal to fragid (not decoded fragid), then the first such element in tree order is the indicated part of the document; stop the algorithm here.
So it’s not case-insensitive according to HTML5 (though it used to be in HTML 4.01).
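As a rough sketch, the quoted steps can be modeled in a few lines of Python. This is not how any browser implements it: the `Element` class and the `find_indicated_part` function are hypothetical names made up for illustration, the percent-decoding step is an assumption about how “fragid bytes” is obtained, and the special handling of “top” is left out.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import unquote_to_bytes


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element (not a real browser API)."""
    tag: str
    id: Optional[str] = None
    name: Optional[str] = None


def find_indicated_part(elements: List[Element], fragid: str) -> Optional[Element]:
    """Return the element a fragment points at, or None.

    `elements` must already be in tree order. Every comparison is an
    exact string match, so "ANCHOR" and "anchor" are different targets.
    """
    # Assumption: "fragid bytes" is the percent-decoded fragment.
    fragid_bytes = unquote_to_bytes(fragid)
    try:
        decoded_fragid = fragid_bytes.decode("utf-8")
    except UnicodeDecodeError:
        decoded_fragid = None  # jump to the "no decoded fragid" step

    # First choice: an element whose ID exactly equals decoded fragid.
    if decoded_fragid is not None:
        for el in elements:
            if el.id == decoded_fragid:
                return el

    # Fallback: an <a> whose name exactly equals the raw fragid.
    for el in elements:
        if el.tag == "a" and el.name == fragid:
            return el
    return None


page = [Element("h2", id="ANCHOR"), Element("h2", id="anchor")]
print(find_indicated_part(page, "anchor"))  # the second element
print(find_indicated_part(page, "Anchor"))  # None: no exact match
```

Because the match is exact, two IDs that differ only by case are two distinct targets under this reading of the spec.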
The HTML5 spec goes on to reference RFC 3986 for what a URL is, and section 3.5 begins its definition:
The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information. The identified secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations. A fragment identifier component is indicated by the presence of a number sign (“#”) character and terminated by the end of the URI.
So the relevant specs have no mention of case-sensitivity. Still, being a paranoid developer, I created a test case. The page had enough lorem ipsum to fill a few screens, and two anchors named #ANCHOR and #anchor. They’re color-coded so I could assess the result at a glance, which I did by linking to the lower-case version and seeing if I ended up at the earlier upper-case version.
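For readers who want to reproduce something similar, here is a minimal sketch of such a test page, plus a small Python check of what an exact match and a case-folding match would each pick. The markup is a guess, not the original test page: using `id` attributes, the red/green colors, and the helper names `IdCollector` and `first_match` are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical reconstruction of the test page; the real one is not shown.
TEST_PAGE = """
<!DOCTYPE html>
<title>Anchor case test</title>
<p><a href="#anchor">Jump to the lower-case anchor</a></p>
<p>Lorem ipsum dolor sit amet...</p>
<h2 id="ANCHOR" style="background: red">ANCHOR (wrong target)</h2>
<p>Lorem ipsum dolor sit amet...</p>
<h2 id="anchor" style="background: green">anchor (right target)</h2>
<p>Lorem ipsum dolor sit amet...</p>
"""


class IdCollector(HTMLParser):
    """Collects every id attribute value in document order."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key == "id" and value is not None:
                self.ids.append(value)


def first_match(ids, fragid, case_sensitive=True):
    """Return the first id matching fragid, or None if nothing matches."""
    for candidate in ids:
        if case_sensitive and candidate == fragid:
            return candidate
        if not case_sensitive and candidate.lower() == fragid.lower():
            return candidate
    return None


collector = IdCollector()
collector.feed(TEST_PAGE)

print(first_match(collector.ids, "anchor"))                        # anchor
print(first_match(collector.ids, "anchor", case_sensitive=False))  # ANCHOR
```

A browser that compares exactly lands on the later, lower-case heading. One that folds case lands on the earlier, upper-case heading, which is the failure the test is designed to expose.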
Want to guess which one browser got it wrong?
Yep, Internet Explorer, versions 6 through 11. Here’s a screenshot of a current IE 11 on Windows 8.1 when linked to #anchor:
[Screenshot: IE 11 on Windows 8.1 showing the test page]
I briefly considered base64 encoding the call numbers, but then I remembered my longstanding commitment to myself: never fix an IE-only bug unless required by a job. Internet Explorer has been broken in many ways for a long time, and it often leaves bugs unfixed for the sake of backwards compatibility. As long as it’s going to keep wasting so much of my time working around it (...and for a good while after, should it ever stop), I’m not going to donate more time to it.