Aggregating in SQL across multiple levels and along several dimensions using ROLLUP and GROUPING SETS

Lucas Jellema 1
0 0
Read Time:3 Minute, 32 Second

In preparation for the Oracle 7Up Workshop as well as the Oracle Quiz – The Water is Still Burning (presentation and paper) with Alex for the ODTUG 2006 conference in June in Washington DC, I revisited some old pals of mine: the ROLLUP and the GROUPING SETS aggregation operators. Let’s take a brief look at what again they can be used for. Note that I intentionally shy away from the CUBE operator for now…

Select statements have been perfectly capable of retrieving aggregate values. Maximum and Average Salaries, Count of Job occurrences, Minimum Hiredate: all hunky dory. There are two serious limitations though: all aggregations in a query have to be along the same dimension and at the same level. You cannot query in a single the query the Maximum Salary for all Salesmen and the Average Salary per Department. Nor can you query the average salary per Employee as well as per Hire Year.

In this article we discuss ROLLUP (8i) and GROUPING SETS (9i) that make both operations possible.

....

Crossing Dimensions using GROUPING SETS

Using Grouping Sets, introduced in Oracle 9i, you can aggregate along different dimensions without resorting to not so elegant and less than perfect performing UNION (ALL). The following query for example gives us the maximum salary per JOB as well as per DEPARTMENT:

select deptno
, job
, max(sal) max_sal
from emp
group
by grouping sets (deptno, job)

The results suggest that PRESIDENT in Department 10 is the best paid position:
 
Grouping Sets are a quite flexible mechanism to determine exactly for which (combinations of) columns we want to have an aggregate value calculated. We can also add aggregates for the combination of DEPTNO and JOB:

select deptno
, job
, max(sal) max_sal
from emp
group
by grouping sets (deptno, job, (deptno, job))


Note: all these results can be thought of as to be constructed using multiple group by clauses in a single select-statement, like this:

select deptno
, job
, max(sal) max_sal
from emp
group
by deptno
group
by job
group
by job, deptno

This is invalid syntax, but it illustrates what grouping sets are like when executed.

Aggregation at various levels of a hierarchy

 

Now for aggregation at various levels in a single query. If we want to retrieve the Average Salary for all employees as well as the Average Salary per Year and per Quarter and per Month in which employees have been hired, we are out of luck with the normal aggregation; that is: we need to use UNION ALL to combine several queries. However, using the ROLLUP operator, we can aggregate from multiple levels in a single query.

SELECT extract(year from hiredate) year
, 'Q'||to_char(hiredate, 'Q') quarter
, to_char(hiredate, 'MONTH') month
, count(empno) num
, avg(sal) avg_sal
FROM emp
GROUP
BY ROLLUP
( extract(year from hiredate)
, 'Q'||to_char(hiredate, 'Q')
, to_char(hiredate, 'MONTH')
)

This query returns the number of employees hired per month, quarter, year and in total – as well as the average salary at each of those levels:
 
With ROLLUP we specify to the SQL engine that we desire aggregations at each of the levels of a ‘hire-time hierarchy’ that stretches from Month via Quarter and Year to the Grand Total level. Note that the order of the expressions in the ROLLUP clause determines this hierarchy: the most detailed level is last.

The last record in the resultset is somewhat enigmatic. It only says 14 and 2073 – there is no clue as to what the year, quarter and month are. Does this mean we have 14 employees with NULL for a hiredate? Probably not, but to be sure, we can use the GROUPING function to determine for a record whether it contains an aggregation at a certain level:

SELECT case grouping(extract(year from hiredate))
when 1
then 'All Years'
else to_char(extract(year from hiredate))
end year
, case grouping('Q'||to_char(hiredate, 'Q'))
when 1
then 'All Quarters'
else 'Q'||to_char(hiredate, 'Q')
end quarter
, case grouping(to_char(hiredate, 'MONTH'))
when 1
then 'All Months'
else to_char(hiredate, 'MONTH')
end month
, count(empno) num
, avg(sal) avg_sal
FROM emp
GROUP
BY ROLLUP
( extract(year from hiredate)
, 'Q'||to_char(hiredate, 'Q')
, to_char(hiredate, 'MONTH')

With this query, the results look like this:
 

About Post Author

Lucas Jellema

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director and Oracle Developer Champion. Solution architect and developer on diverse areas including SQL, JavaScript, Kubernetes & Docker, Machine Learning, Java, SOA and microservices, events in various shapes and forms and many other things. Author of the Oracle Press book Oracle SOA Suite 12c Handbook. Frequent presenter on user groups and community events and conferences such as JavaOne, Oracle Code, CodeOne, NLJUG JFall and Oracle OpenWorld.
Happy
Happy
0 %
Sad
Sad
0 %
Excited
Excited
0 %
Sleepy
Sleepy
0 %
Angry
Angry
0 %
Surprise
Surprise
0 %

Average Rating

5 Star
0%
4 Star
0%
3 Star
0%
2 Star
0%
1 Star
0%

One thought on “Aggregating in SQL across multiple levels and along several dimensions using ROLLUP and GROUPING SETS

  1. A remark to a statement, that UNION (ALL) is “less than perfect performing”:
    * UNION is bad performing, it has to sort all queries and remove duplicates
    * UNION ALL is perfectly performing, it just executes first query, returns rows, executes the second and returns rows. It is equal to executing both queries.

    Viliam

Comments are closed.

Next Post

JDeveloper, not just for development...

When you startup JDeveloper, default it gives you a “Tip of the day”. I always enjoy reading these tips and I would like to share this one with you: Related posts: Migration from MySQL to Oracle – using Oracle Migration Workbench Cost Based Optimizing through time travel? – the value […]
%d bloggers like this: