IF...THEN in a SQL Server SELECT Clause

November 19, 2010 21:45 by Matt Whitfield

This is a commonly asked question: 'How do I do the equivalent of an IF...THEN in a SELECT Clause?'

The answer is fairly commonly given that the Transact-SQL CASE expression is the way to achieve this. So, for example, we might want to achieve:

SELECT (IF [name] LIKE '%ID' THEN 'Yes' ELSE 'No') as ColumnNameEndsWithID  FROM [sys].[columns];

This is done using the case expression:

SELECT (CASE WHEN [name] LIKE '%ID' THEN 'Yes' ELSE 'No' END) as ColumnNameEndsWithID  FROM [sys].[columns];

So the syntax here is fairly close to how we would express it in natural language - and will look fairly familiar to anyone who has used a ternary operation in C, C++ or C#.

However, the CASE expression can also handle switch style syntax. So, for example, we might want to return the letter 'A', 'B', 'C' or 'D' depending on whether a column's value is 0, 1, 2 or 3.

This would look like the following:

SELECT TOP 10 CASE [integerValue]              WHEN 0 THEN 'A'               WHEN 1 THEN 'B'               WHEN 2 THEN 'C'               WHEN 3 THEN 'D'               END AS [characterValue]FROM   [dbo].[myTable];

Excellent - we have the basics of a switch syntax - although we are limited to expressions and not statements here - i.e. we can return different values based on the input value, but we cannot perform different operations.

This works well when the input value to the CASE statement is deterministic. What do we mean by deterministic? A deterministic function always returns the same value for the same inputs. So, looking at the value of the column at a given point in time is deterministic. A non-deterministic function might be something like GETDATE() - which returns a different value each time it is called.

So how does CASE cause us problems with non-deterministic input? Consider the following SQL:

SELECT TOP 10 CASE ABS(CONVERT (INT, RAND() * object_id)) % 4               WHEN 0 THEN 'A'               WHEN 1 THEN 'B'               WHEN 2 THEN 'C'               WHEN 3 THEN 'D'               ENDFROM   [sys].[columns];

What do we expect this to do? We expect it to return us a random character, 'A', 'B', 'C' or 'D' for each of the 10 rows we're selecting. What happens when we run that? Well, you'll get a random result set, but here's the one I got:


Hang on - NULL? Why is that there? Ok, so we're asking for a random positive value, then taking modulo 4 - which will always be 0, 1, 2 or 3. So how come our CASE statement isn't matching on any of those branches? To answer that question, we have to look at how the query is executed - or how it is expanded. I know I was surprised when I saw this in a presentation given by Peter Larsson, but the actual statement you're running is more akin to:

SELECT TOP 10 CASE WHEN ABS(CONVERT (INT, RAND() * OBJECT_ID)) % 4 = 0 THEN 'A' ELSE                CASE WHEN ABS(CONVERT (INT, RAND() * OBJECT_ID)) % 4 = 1 THEN 'B' ELSE                  CASE WHEN ABS(CONVERT (INT, RAND() * OBJECT_ID)) % 4 = 2 THEN 'C' ELSE                    CASE WHEN ABS(CONVERT (INT, RAND() * OBJECT_ID)) % 4 = 3 THEN 'D' END                  END                END              ENDFROM   [sys].[columns];

Oh. So, because the expression on which we're matching is non-deterministic, we are evaluating the expression on each branch of the CASE statement, and coming up with a different value each time. That would explain how we could fail to match on any of the branches - seeing as the value tested against each branch is different from the last value that was tested.

You can see the same effect when using non-deterministic scalar functions too. The moral of the story? Be careful when using case with non-deterministic expressions...

Tags: ,


Add comment

(Will show your Gravatar icon)

  Country flag

  • Comment
  • Preview


Data Inspector

Tag cloud


<<  March 2015  >>

View posts in large calendar