Nested case statement in sql vs multiple criteria case statement - which should be used?
Had an interesting discussion with a colleague today over optimising case statements and whether it's better to leave a case statement which has overlapping criteria as individual when clauses, or make a nested case statement for each of the overlapping statements.
As an example, say we had a table with 2 integer fields, column a and column b. Which of the two queries would be more processor friendly? Since we're only evaluating if a=1 or a=0 once using the nested statements, would this be more efficient for the processor, or would creating a nested statement eat up that optimization?
Multiple criteria for the case statement:
Select
case
when a=1 and b=0 THEN 'True'
when a=1 and b=1 then 'Trueish'
when a=0 and b=0 then 'False'
when a=0 and b=1 then 'Falseish'
else null
end AS Result
FROM tableName
Nesting case statements:
Select
case
when a=1 then
case
when b=0 then 'True'
when b=1 then 'Trueish'
end
When a=0 then
case
when b=0 then 'False'
when b=1 then 'Falseish'
end
else null
end AS Result
FROM table name
nested case statement in sql vs multiple criteria case statement
Someone may come along and dig into the details of how these operations are actually processed on the CPUs, or what instructions get generated differently between those two queries. But from a practical standpoint, it seems that there's not a significant difference.
This loads up a table with 10 million rows of two bit columns, with a somewhat arbitrary distribution of 1 and 0 values.
CREATE DATABASE [239583];
GO
USE [239583];
GO
SELECT TOP 10000000
CASE WHEN v.number % 2 = 0 THEN CAST(1 AS bit) ELSE CAST(0 AS bit) END AS a,
CASE WHEN v.number % 2 = 1 THEN CAST(1 AS bit) ELSE CAST(0 AS bit) END AS b
INTO dbo.TableName
FROM master.dbo.spt_values v
CROSS JOIN master.dbo.spt_values v2
CROSS JOIN master.dbo.spt_values v3;
I then told SSMS to discard results after execution (Query -> Query Options -> Grid -> Discard results after execution), because I don't have time to wait on SSMS to crash loading 20mm rows of 1s and 0s.
Then ran this code:
SET STATISTICS IO, TIME ON;
Select
case
when a=1 and b=0 THEN 'True'
when a=1 and b=1 then 'Trueish'
when a=0 and b=0 then 'False'
when a=0 and b=1 then 'Falseish'
else null
end
FROM tableName;
Select
case
when a=1 then
case
when b=0 then 'True'
when b=1 then 'Trueish'
end
When a=0 then
case
when b=0 then 'False'
when b=1 then 'Falseish'
end
else null
end
FROM table name;
I typical run has CPU output like this on my machine:
Query 1: CPU time = 2141 ms, elapsed time = 2191 ms.
Query 2: CPU time = 2359 ms, elapsed time = 2461 ms.
Sometimes they are slightly closer to equal, or the second query is slightly faster, but I think those are just the usual variations of the CPUs on my laptop doing other stuff than just running this query.