Making up records in SQL Queries - Table Functions and 10g Model clause 20188367001

Making up records in SQL Queries – Table Functions and 10g Model clause

From time to time there is a need in my SQL queries to have access to a table with a specific number of records. Sometimes I want to explode the results from a query – double, treble etc. the number of records. On other occasions I want to select from a specific range or outer join with a specified range to have an entry for every rank, date etc. More concrete examples (in terms of EMP and DEPT):

  • select the number of employees hired on every day of the week – include entries for days on which no one was hired
  • select the employees ranked on even positions when ordered by salary (note: there are many ways to approach this)
  • present for the numbers 1 to 5 the number of employees that manage that number of subordinates
  • select all employees as many times as there are Clerks

Of course the old way of doing this was create a utility or helper table that contained a very large number of records (something like dual but with many rows). Each row typically only contained a number. At other times you also have created tables with seven records (days of the week), twelve records (months of the year), 26 records (letters in the alphabet) etc. However, these could all have been derived in views from the one table with just many numeric values. You could also fake this table by using a view or table in your schema that you know to be always available and contain a substantial number of records. Tom Kyte’s favorite for example is ALL_OBJECTS. We will look at the ‘modern’ way: TABLE FUNCTIONS (Oracle 9i) and the MODEL clause (10g).

Using Table-Functions

Much has been written about Table Functions that I will not repeat here. Basically, a Table Function is a PL/SQL function that returns a Nested Table. Such a function can be used in SQL queries, with the function wrapped inside a TABLE( ) operator. We can use TABLE(function) just like any table, view or in-line view in the query; it is just another result-set or record-source.

A useful, generic Table-Function is the following:

create type number_table as table of number
create or replace function number_tbl
( p_lower in number:=0, p_upper in number:=10)
return number_table
is
  l_num_tbl  number_table := number_table();
begin
  for i in p_lower..p_upper loop
    l_num_tbl.extend;
    l_num_tbl(l_num_tbl.last):= i;
  end loop;
  return l_num_tbl;
end number_tbl;

Using this function number_tbl we have a row-generator at our disposal that can be used in many ways.
List the days of the week (knowing that 21st november 2004 is a Sunday

SELECT to_char( days.column_value + to_date('21-11-2004','DD-MM-YYYY'),'DAY') day
FROM   table(number_tbl(0,6)) days

The result:

DAY
---------
SUNDAY
MONDAY
TUESDAY
WEDNESDAY
THURSDAY
FRIDAY
SATURDAY

Find for the numbers 0 through 5 the number of managers who have that number of subordinates:

SELECT num_subs.column_value number_of_subordinates
,      count(subs_count.cnt) number_of_mgrs
FROM   table(number_tbl(0,5)) num_subs
       left outer join
       (select count(*) cnt from emp group by mgr) subs_count
       on  num_subs.column_value = subs_count.cnt
group
by     num_subs.column_value

Results:

NUMBER_OF_SUBORDINATES NUMBER_OF_MGRS
---------------------- --------------
                     0              0
                     1              4
                     2              1
                     3              1
                     4              0
                     5              1

Select the employees ranked 5 through 7 when ordered by salary, descending:

select *
from   table( number_tbl(5, 7)) num
,      ( select emp.*, rownum rn
         from ( select emp.*
                from emp
                order
                by sal desc) emp
       ) emp
where  rn = num.column_value
COLUMN_VALUE      EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO         RN
------------ ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- -
           5       7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30          5
           6       7782 CLARK      MARKETEER       7839 09-JUN-81       2450                    10          6
           7       7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30          7

Using the 10g Model Clause

Inspired by Tom Kyte’s column in Oracle Magazine of November 2004, I realized how you can make use of the MODEL clause in Oracle 10g to have records appear out of thin air. The advantage of this approach is that we do not have to create TYPEs and FUNCTIONs before we can start querying.

To select the number of employees hired in the years 1979 through 1983:

select years.year
,      count(empno)
from   ( SELECT 1979+cell year
         FROM   DUAL
         MODEL
          DIMENSION BY (0 attr)
          MEASURES (0 cell)
          RULES  ITERATE (5)
          ( cell[iteration_number] = iteration_number)
       ) years
       left outer join
       emp
on     (extract(year from hiredate) = years.year)
group
by     year
ORDER
BY     year

The outcome, in case you were wondering:

  YEAR COUNT(EMPNO)
------ ------------
  1979            0
  1980            1
  1981           10
  1982            1
  1983            0

The in-line view years makes use of DUAL and explodes the set returned from dual. It builds a two-dimensional array:

SELECT cell , attr
FROM   DUAL
  MODEL
  DIMENSION BY (0 attr)
  MEASURES (0 cell)
  RULES  ITERATE (5)
  ( cell[iteration_number] = iteration_number)

 CELL       ATTR
----- ----------
    0          0
    1          1
    2          2
    3          3
    4          4

With the ITERATE rule we specify the number of records that should be created in th model. With the DIMENSION and MEASURES, we define the composition of each individual record. Finally we use the almost simplest assignment expression for the cell (MEASURE) values.

In a similar fashion we can get a list of the days of the week:

SELECT cell day_number
,      to_char( cell + to_date('21-11-2004','DD-MM-YYYY'),'DAY') day
FROM   DUAL
  MODEL
  DIMENSION BY (0 attr)
  MEASURES (0 cell)
  RULES  ITERATE (7)
  ( cell[iteration_number] = iteration_number)

DAY_NUMBER DAY
---------- ---------
         0 SUNDAY
         1 MONDAY
         2 TUESDAY
         3 WEDNESDAY
         4 THURSDAY
         5 FRIDAY
         6 SATURDAY

Here I happen to use a SUNDAY as starting point (21st November 2004, today).

Using the MODEL clause in this fashion, you can any number of records you desire with values that are either from 1 to the number of records or derived from those values – such as dates, ascii characters etc. Get a list of all letters in the alphabet:

SELECT chr(97+cell) letter
FROM   DUAL
  MODEL
  DIMENSION BY (0 attr)
  MEASURES (0 cell)
  RULES  ITERATE (26)
  ( cell[iteration_number] = iteration_number)

L
-
a
b
c
d
e
f
g
h
i
j
...

Resources on the MODEL Clause:
SQL Reference (Oracle 10g On Line Documentation) – Model Expressions
SQL Reference (Oracle 10g On Line Documentation) – Model Clause
Data Warehousing Guide (Oracle 10g On Line Documentation) – 22. SQL For Modeling

Partition Outer Join (10g)

In certain circumstances you may also want to make use of the Partition Outer Join Clause. For example, if we want to select from EMP the number of employees hired in the years 1980, 1981 and 1982 per department – we need a partition outer join to ensure that we will have a result for each of these years for each department.

If we execute this query – to find the number of employees hired in each year for every department:

select years.year
,      deptno
,      count(empno)
from   ( SELECT 1980+cell year
         FROM   DUAL
         MODEL
          DIMENSION BY (0 attr)
          MEASURES (0 cell)
          RULES  ITERATE (3)
          ( cell[iteration_number] = iteration_number)
       ) years
       left outer join
       emp
on     (extract(year from hiredate) = years.year)
group
by     year
,      deptno
ORDER
BY     year
,          deptno

the result is somewhat incomplete:

  YEAR     DEPTNO COUNT(EMPNO)
------ ---------- ------------
  1980         20            1
  1981         10            1
  1981         20            3
  1981         30            6
  1982         10            1

No entries for departments 10 and 30 in 1980 nor entries in 1982 for departments 20 and 30. With the partition outer join clause we can complement this result:

 select years.year
 ,      deptno
 ,      count(empno)
 from   ( SELECT 1980+cell year
          FROM   DUAL
          MODEL
           DIMENSION BY (0 attr)
           MEASURES (0 cell)
           RULES  ITERATE (3)
           ( cell[iteration_number] = iteration_number)
        ) years
        left outer join
        emp partition by (deptno)
 on     (extract(year from hiredate) = years.year)
 group
 by     year
 ,      deptno
 ORDER
 BY     year
 ,      deptno

Here we have “left outer joined” years with emp. That can be read as: take all left records and find joinable records from emp; if you can find none, add an “empty” record representing the right side – emp. In the previous query, since every year could be matched with at least one department’s score, there were no empty records added; hence we had no row at all for the departments 10 and 30 for 1980. In this query, we have added partition by (deptno) to emp. This tells the SQL Engine that the assignment to left outer join years to emp must be executed for every deptno-partition-of-emp. The set of records returned from EMP is partitioned by deptno (three partitions). Then, the outer join is performed again for each of the three partitions. This must results in a record for every year and every partition (i.e. deptno). The result of this query:

  YEAR     DEPTNO COUNT(EMPNO)
------ ---------- ------------
  1980         10            0
  1980         20            1
  1980         30            0
  1981         10            1
  1981         20            3
  1981         30            6
  1982         10            1
  1982         20            0
  1982         30            0

10 Comments

  1. drkalucard September 5, 2006
  2. Mike April 13, 2006
  3. Tony Andrews September 15, 2005
  4. Lucas December 16, 2004
  5. Lucas December 8, 2004
  6. Lucas December 8, 2004
  7. Lucas Jellema November 24, 2004
  8. Jason Grundy November 24, 2004
  9. sheetal November 23, 2004
  10. sheetal November 23, 2004