PIVOT! – Manual Pivot and the PIVOT function

Pivot

(Whenever I hear PIVOT this is what I think of, PIVOTTTT!!!)

Right, to start this series of posts I wanted to write about a method of manipulating data I refer to as a “manual PIVOT”.

For those who have use the TSQL PIVOT you with know it enables you to transpose data like you can do in excel (Rows become columns etc). For those who don’t already use it have you ever looked at the syntax in TSQL? It is horrible and almost impossible to remember (see below).

SELECT,
    [first pivoted column] AS,
    [second pivoted column] AS,
    ...
    [last pivoted column] AS
FROM
    (<select>)
    AS 
PIVOT
(
    ()
FOR
[]
    IN ( [first pivoted column], [second pivoted column],
    ... [last pivoted column])
) AS 
;
</select>

You can however create something similar using conditional logic.

If you’re using a version of SQL Server before SQL2012 then you can use a CASE statement, for those on a version greater than or equal to SQL2012 you can use either a CASE or the new Inline If (IIF). If you are unfamiliar with the CASE statement the syntax is as follows:

CASE WHEN Expression THEN Result WHEN Expression THEN Result ELSE Result END as x

Each “Expression” is a boolean, supplement with conditional logic which evaluates to TRUE.

For all of these examples I will use the AdventureWorks2012 database. If you don’t have this you can grab it from CodePlex

use AdventureWorks2012;
GO
 
SELECT CustomerID, OrderDate, TotalDue FROM Sales.SalesOrderHeader ORDER BY CustomerID

Pivot_Q1
The query above produces a simple list of what each customer has spent ordered by the customerID. What if we want to know how much each customer had spent each year? Well we could write something like this:

SELECT  CustomerID
		, YEAR(OrderDate) as 'OrderYear'
        , SUM(TotalDue) AS 'OrderTotal'
FROM    Sales.SalesOrderHeader
GROUP BY CustomerID, YEAR(OrderDate)
ORDER BY CustomerID

Pivot_Q2

This is fine but what if the requirement was to have years as columns to see the spend trend. This is where we can use the conditional logic to create a manual pivot. Using CASE and the YEAR(OrderDate) we can check to see which year the sales relate to.

SELECT  CustomerID
        , CASE WHEN YEAR(ORDERDate) = 2005 THEN (TotalDue) END AS '2005_OrderTotal'        
		, CASE WHEN YEAR(ORDERDate) = 2006 THEN (TotalDue) END AS '2006_OrderTotal'
        , CASE WHEN YEAR(ORDERDate) = 2007 THEN (TotalDue) END AS '2007_OrderTotal'
        , CASE WHEN YEAR(ORDERDate) = 2008 THEN (TotalDue) END AS '2008_OrderTotal'
        , CASE WHEN YEAR(ORDERDate) = 2009 THEN (TotalDue) END AS '2009_OrderTotal'
FROM    Sales.SalesOrderHeader
ORDER BY CustomerID

Pivot_Q3

The problem with this is that we have multiple rows for our customers. Where a customer has had a spend in 2005, all other years will show NULL. As we are looking for the total amount spent in each year we can simply SUM these totals and GROUP BY the CustomerID.

SELECT  CustomerID
        , SUM(CASE WHEN YEAR(ORDERDate) = 2005 THEN (TotalDue) END) AS '2005_OrderTotal'        
		, SUM(CASE WHEN YEAR(ORDERDate) = 2006 THEN (TotalDue) END) AS '2006_OrderTotal'
        , SUM(CASE WHEN YEAR(ORDERDate) = 2007 THEN (TotalDue) END) AS '2007_OrderTotal'
        , SUM(CASE WHEN YEAR(ORDERDate) = 2008 THEN (TotalDue) END) AS '2008_OrderTotal'
        , SUM(CASE WHEN YEAR(ORDERDate) = 2009 THEN (TotalDue) END) AS '2009_OrderTotal'
FROM    Sales.SalesOrderHeader
GROUP BY CustomerID
ORDER BY CustomerID

Pivot_Q4

This is now displaying one row for each customer with the amounts spent between 2005 and 2009 as columns.

This method works with all sorts of use cases and isn’t restricted to mathematical calculations. You can use this method to to calculate the top 3 products sold for each customer as columns for example.

; WITH RankedProducts AS 
(
SELECT  CustomerID
		, p.Name
		, SUM(OrderQty) OrderQty
		, ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY SUM(OrderQty) DESC) OrderRank
FROM    Sales.SalesOrderDetail
		INNER JOIN Sales.SalesOrderHeader ON Sales.SalesOrderHeader.SalesOrderID = Sales.SalesOrderDetail.SalesOrderID
        INNER JOIN Production.Product p ON p.ProductID = Sales.SalesOrderDetail.ProductID
GROUP BY 
	CustomerID
		, p.Name
)
SELECT CustomerID
	, MAX(CASE WHEN OrderRank = 1 THEN Name ELSE NULL END) AS 'ProductSales1'
	, MAX(CASE WHEN OrderRank = 2 THEN Name ELSE NULL END) AS 'ProductSales2'
	, MAX(CASE WHEN OrderRank = 3 THEN Name ELSE NULL END) AS 'ProductSales3'
FROM RankedProducts
GROUP BY 
	CustomerID

Pivot_Q5

This works by apply in a MAX to the Name based on the OrderRank. OrderRank is calculated using a window function for more on window functions, see an upcoming post in this series. MAX works because it treats the Name as alphanumeric, which allows the MAX to take the word over the NULL as that is higher alphanumerically.

You can use this to work out all sorts scenarios.

Thanks for reading.