Courtesy of Tom Kyte: Generating rows in SQL with the CUBE statement - no dummy table or table function required 20188367001

Courtesy of Tom Kyte: Generating rows in SQL with the CUBE statement – no dummy table or table function required

In a previous post Making up records in SQL Queries – Table Functions and 10g Model clause I discussed how you can make use of either a Table Function or the 10g SQL Model Clause to make up records in queries – as alternative to using a large table or dummy table to produce a certain number of records. During the three-day seminar by Tom Kyte that we – a seven man delegation from AMIS – attended this week, I saw a new way to do this ‘wizardry of new rows where none existed before’ using the CUBE statement.

From time to time there is a need in my SQL queries to have access to a table with a specific number of records. Sometimes I want to explode the results from a query – double, treble etc. the number of records. On other occasions I want to select from a specific range or outer join with a specified range to have an entry for every rank, date etc. More concrete examples (in terms of EMP and DEPT):

  • select the number of employees hired on every day of the week – include entries for days on which no one was hired
  • select the employees ranked on even positions when ordered by salary (note: there are many ways to approach this)
  • present for the numbers 1 to 5 the number of employees that manage that number of subordinates
  • select all employees as many times as there are Clerks

Of course the old way of doing this was create a utility or helper table that contained a very large number of records (something like dual but with many rows). Each row typically only contained a number. At other times you also have created tables with seven records (days of the week), twelve records (months of the year), 26 records (letters in the alphabet) etc. However, these could all have been derived in views from the one table with just many numeric values. You could also fake this table by using a view or table in your schema that you know to be always available and contain a substantial number of records. Tom Kyte’s favorite for example is ALL_OBJECTS. My previous post discussed ‘modern’ ways: TABLE FUNCTIONS (Oracle 9i) and the MODEL clause (10g). Now we will use the CUBE for an even more elegant – though slight less controllable way.

Using CUBE to generate rows

CUBE is used in the group by clause of select statements, to produce aggregations along every combination of dimensions described by the CUBE parameters. Typical use of CUBE is to produce subtotals at all dimension values. Some background on the CUBE for example in The SQL of OLAP Don’t overlook the core strength of your OLAP technology solution: SQL. By Michael Gonzales

The CUBE supergroup is the other extension to the GROUP BY clause that produces a result set that contains all the subtotal rows of a ROLLUP aggregation and, in addition, contains “cross-tabulation” rows. Cross-tabulation rows are additional superaggregate rows. They are, as the name implies, summaries across columns as if the data were represented as a spreadsheet. Like ROLLUP, a CUBE group can also be thought of as a series of grouping-sets. In the case of a CUBE, all permutations of the cubed grouping expression are computed along with the grand total. Therefore, the n elements of a CUBE translate to 2 raised to the power of n grouping-sets.

GROUP BY CUBE (a,b,c) is equivalent to:

GROUP BY GROUPING SETS
(
(a,b,c)
(a,b)
(a,c)
(b,c)
(a)
(b)
( c)
()
)

Notice that the three elements of the CUBE translate to eight grouping sets.

The number of additional records produced by a CUBE depends on the number of distinct values in each dimension. For example: show the sum of salaries for all job values as well as all departments as well as all combinations of the two as well as a grand total:

 select d.dname
 ,      e.job
 ,      sum(e.sal)
 from   emp e
 ,      dept d
 where  e.deptno = d.deptno
 group by cube(d.dname, e.job)
 /
DNAME          JOB       SUM(E.SAL)
-------------- --------- ----------
                              29025
               CLERK           4150
               ANALYST         6000
               MANAGER         8275
               SALESMAN        5600
               PRESIDENT       5000
SALES                          9400
SALES          CLERK            950
SALES          MANAGER         2850
SALES          SALESMAN        5600
RESEARCH                      10875
RESEARCH       CLERK           1900
RESEARCH       ANALYST         6000
RESEARCH       MANAGER         2975
ACCOUNTING                     8750
ACCOUNTING     CLERK           1300
ACCOUNTING     MANAGER         2450
ACCOUNTING     PRESIDENT       5000

In this case, the number of records returned is 18: 4 grouping sets: (job, dept),(job),(dept),() with 3 departments and 5 jobs (but only three jobs per department): 3 (dept) + 5 (job) + 1 () + 3 * 3 (job, dept).

We can also use CUBE to generate ‘dummy totals’ without any real aggregations. CUBE will produce POWER( 2,x) records, where x is the number of dimensions listed in the parameterlist. For example:

select 1 from dual group by CUBE (1,2,3,4)

will generate 2 to the power of 4 = 2*2*2*2= 16 rows. Note that CUBE (1,1,1,1) or CUBE (a,a,a,a) would give the same result; I use 1,2,3,4 to easily derive the number of records returned.

Using this ‘row generator’ in an in-line view with where clause on rownum we can control exactly how many records are produced. This can be used in many ways.
List the days of the week (knowing that 21st november 2004 is a Sunday

SELECT to_char( days.day + to_date('21-11-2004','DD-MM-YYYY'),'DAY') day
FROM   ( select rownum  day
         from   ( select 1
                  from   dual
                  group
                  by     cube (1,2,3)
                )
         where rownum<8) days

The result:

DAY
---------
MONDAY
TUESDAY
WEDNESDAY
THURSDAY
FRIDAY
SATURDAY
SUNDAY

Perhaps slightly more elegant is the following incarnation of this query:

WITH nums as
( select rownum num
  from   ( select 1 num
           from   dual
           group
           by     cube (1,2,3)
         )
)
SELECT to_char( days.day + to_date('21-11-2004','DD-MM-YYYY'),'DAY') day
FROM   ( select num day
         from   nums
         where  num<8) days

Find for the numbers 0 through 5 the number of managers who have that number of subordinates:

WITH nums as
( select rownum num
  from   ( select 1 num
           from   dual
           group
           by     cube (1,2,3)
         )
)
SELECT num_subs.num number_of_subordinates
,      count(subs_count.cnt) number_of_mgrs
FROM   nums num_subs
       left outer join
       (select count(*) cnt from emp group by mgr) subs_count
       on  num_subs.num-1 = subs_count.cnt
where  num_subs.num < 7
group
by     num_subs.num

Results:

NUMBER_OF_SUBORDINATES NUMBER_OF_MGRS
---------------------- --------------
                     0              0
                     1              4
                     2              1
                     3              1
                     4              0
                     5              1

Select the employees ranked 5 through 7 when ordered by salary, descending:

WITH nums as
( select rownum num
  from   ( select 1 num
           from   dual
           group
           by     cube (1,2,3)
         )
)
select *
from   nums
,      ( select emp.*, rownum rn
         from ( select emp.*
                from emp
                order
                by sal desc) emp
       ) emp
where  rn = nums.num
and    nums.num between 5 and 7
COLUMN_VALUE      EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO         RN
------------ ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- -
           5       7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30          5
           6       7782 CLARK      MARKETEER       7839 09-JUN-81       2450                    10          6
           7       7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30          7

11 Comments

  1. Bert de Geus November 6, 2007
  2. harm August 25, 2005
  3. Alex Nuijten August 25, 2005
  4. anton August 25, 2005
  5. Alex Nuijten August 24, 2005
  6. anton August 23, 2005
  7. harm June 2, 2005
  8. Alex Nuijten May 31, 2005
  9. Lucas February 13, 2005
  10. Mr. Ed February 12, 2005
  11. Alex Nuijten February 11, 2005