Alphabetical Sorting SQL Without “The” «
»


Code: , , ,
1 comment

Chris McAvoy asked how to sort alphabetically so that entries starting with “A”, “An”, or “The” end up in the proper place instead of jumbled into the As and Ts.

Do I create a field “name_without_the” and select * from bands order by name_without_the? I’m assuming this is a common alpha-issue that’s been solved millions of times, but I can’t find a best-practice suggestion via two minutes of Googling.

Here’s the query I came up with to do it in-database:


SELECT name FROM bands
ORDER BY CASE
WHEN LOWER(LEFT(name, 2)) ="a " THEN SUBSTRING(name, 3)
WHEN LOWER(LEFT(name, 3)) ="an " THEN SUBSTRING(name, 4)
WHEN LOWER(LEFT(name, 4)) ="the " THEN SUBSTRING(name, 5)
ELSE name
END;

The problem with this is that it’s sloooow — your database engine will have to look at every single row of the table instead of scanning an index. So if speed is an issue, you need to add another column, and you have two choices:

First, you could add a name_article column and split your data. For the band “The Who”, name_article would be “The” and name would be “Who”. You can sort by name but you’ll need to recombine the fields to use them. If you don’t have an ORM to nicely abstract this it’ll be painful. So while it saves duplication of data, it’s just a bad idea to make database columns depend on each other.

The better way to do it is Chris’s original idea: create a name_without_article column to store it. For the band “The Who”, name_without_article would be “Who” and name would be “The Who”. This will just be straightforward to deal with, and the extra space you use will be offset by the simpler and faster code you write.


Comments

  1. Hello.

    With PostgreSQL there is also an other very nice way: creating an index on the expression which removes the given prefixes. This way (a) you don’t have to clutter your database schema with columns for the sake of optimization (b) you don’t have to make sure to update them from your code (Postgres does it automatically) and (c) you still get the performance benefit of indexes.

    Postgres rocks :)

Leave a Reply

Your email address will not be published.