The PIVOT operation is one frequently discussed when talking about more advanced SQL Queries. Pivoting is the process of switching rows and columns, and is for example an advanced feature in Excel. Before Oracle 11g, there were several approaches to pivoting, that were not work-arounds for the lack of the PIVOT operator in the Oracle SQL Language.
With Oracle 11g, this has changed. Now we do have a PIVOT and an UNPIVOT operator at our disposal. Let’s first take a look at UNPIVOT. In Oracle, the UNPIVOT operation is the process of turning Columns to Rows. Put simply, by applying the UNPIVOT operator to a number of columns, every row is split into that same number of rows. Each of these rows has two new columns: one for the column that this row stems from – one of the columns the data set was UNPIVOTed by – and one with the value from the column. The original UNPIVOT-columns are no longer part of the newly created records.
Let’s take a look at a simple example, using the well known EMP table. We decide to consider the SAL and COMM columns as two different types of INCOME_COMPONENT. We would like to present our employees with an INCOME_COMPONENT column – and two rows for each employee with two income components.
This UNPIVOT query would look like this:
select * from ( select ename, job, sal, comm from emp ) unpivot ( income_component_value for income_component_type in (sal, comm) ) /
Here we have specified that the rows selected from emp, with columns ename, job, sal and comm, should be UNPIVOTed by the SAL and COMM column. This means that for each row with a value in the SAL column, we get a new row with columns ENAME and JOB, a new column INCOME_COMPONENT_TYPE with value ‘SAL’ and a new column INCOME_COMPONENT_VALUE with the value from the SAL column. If that row also has a value for COMM it spawns another new row with the same ENAME and JOB values, a value of ‘COMM’ for the INCOME_COMPONENT_TYPE column and the COMM value in the new column INCOME_COMPONENT_VALUE:
ENAME JOB INCO INCOME_COMPONENT_VALUE ---------- --------- ---- ---------------------- SMITH CLERK SAL 800 ALLEN SALESMAN SAL 1600 ALLEN SALESMAN COMM 300 WARD SALESMAN SAL 1250 WARD SALESMAN COMM 500 JONES MANAGER SAL 2975 MARTIN SALESMAN SAL 1250 MARTIN SALESMAN COMM 1400 BLAKE MANAGER SAL 2850 CLARK MANAGER SAL 2450 KING PRESIDENT SAL 5000 TURNER SALESMAN SAL 1500 TURNER SALESMAN COMM 0 ADAMS CLERK SAL 1100 JAMES CLERK SAL 950 MILLER CLERK SAL 1300 SCOTT ANALYST SAL 3000 FORD ANALYST SAL 3000 18 rows selected.
With this result, it becomes easy to calculate the total income for our employees, although with a simple “sal+nvl(comm,0) as Income” that could be done without UNPIVOT of course:
select ename , job , sum(income_component_value) income from ( select ename, job, sal, comm from emp ) unpivot ( income_component_value for income_component_type in (sal, comm) ) group by ename , job /
ENAME JOB INCOME ———- ——— ———- WARD SALESMAN 1750 SMITH CLERK 800 CLARK MANAGER 2450 TURNER SALESMAN 1500 JAMES CLERK 950 JONES MANAGER 2975 MARTIN SALESMAN 2650 ADAMS CLERK 1100 SCOTT ANALYST 3000 KING PRESIDENT 5000 FORD ANALYST 3000 ALLEN SALESMAN 1900 BLAKE MANAGER 2850 MILLER CLERK 1300 14 rows selected.
This example does not do much I fear to illustrate the usefulness of the UNPIVOT operation. Where it really shines I believe is when the data design has not been as denormalized as we would have liked and rather than the transactional format we would prefer for aggregations, we are stuck with multiple columns in each row. Typical examples are records with columns for each day of the week or even day of the month or for several fixed – or so we thought – times during each day. When the screen to enter values looks like a matrix, all too frequently the table has such a column layout. And data operations are harder than they should be. Enter the UNPIVOT.
Look at this example.
SQL> desc daily_measurements Name Null? Type ----------------------------------------- -------- --------------- DAY DATE MORNING_MEASUREMENT NUMBER(5,2) AFTERNOON_MEASUREMENT NUMBER(5,2) EVENING_MEASUREMENT NUMBER(5,2)
At one point, this table design seemed a good idea. By now, we know better. Getting an average measurement per day is much harder than one would think, knowing that our overworked staff could easily miss one of the daily measurements – and of course we do not know which one.
DAY MORNING_MEASUREMENT AFTERNOON_MEASUREMENT EVENING_MEASUREMENT --------- ------------------- --------------------- ------------------- 27-SEP-07 10.2 45.91 35.6 28-SEP-07 16.12 25.13 38.21 29-SEP-07 21.65 81.2
select (morning_measurement+afternoon_measurement+evening_measurement)/3 day_average is not going to do the trick. First of all we need NVL(measurement,0) to not have NULLs distort the summation. Then we should divide by the number of measurements that is actually there: case when morning_measurement is null then 0 else 1 end + case when afternoon_measurement is null then 0 else 1 end +… is one way to arrive at the factor to divide by. Not elegant!
Using UNPIVOT, life becomes so much easier in these circumstances:
select day , avg(measurement) from daily_measurements unpivot ( measurement for measurement_time in ( morning_measurement , afternoon_measurement , evening_measurement ) ) group by day / DAY AVG(MEASUREMENT) --------- ---------------- 27-SEP-07 30.57 28-SEP-07 26.4866667 29-SEP-07 51.425
That is all we need to do. We take the three measurement columns in the daily_measurements table and use them for UNPIVOTing. Each of these measurement columns spawns – when it contains a value – a new record for the day. Instead of one row for each day, with three measurements, we get a row per measurement.
In the old days, with Oracle 10g or earlier, we probably would have solved this with something like:
select day , morning_measurement measurement , 'morning_measurement' measurement_time from daily_measurements where morning_measurement is not null union all select day , afternoon_measurement measurement , 'afternoon_measurement' measurement_time from daily_measurements where afternoon_measurement is not null union all select day , evening_measurement measurement , 'evening_measurement' measurement_time from daily_measurements where evening_measurement is not null /
It works, but with a growing number of columns, it is not pretty. I have not investigated, but it would be my expectation that UNPIVOT is faster as well.