Making up records in SQL Queries – Table Functions and 10g Model clause

11

From time to time there is a need in my SQL queries to have access to a table with a specific number of records. Sometimes I want to explode the results from a query – double, treble etc. the number of records. On other occasions I want to select from a specific range or outer join with a specified range to have an entry for every rank, date etc. More concrete examples (in terms of EMP and DEPT):

  • select the number of employees hired on every day of the week – include entries for days on which no one was hired
  • select the employees ranked on even positions when ordered by salary (note: there are many ways to approach this)
  • present for the numbers 1 to 5 the number of employees that manage that number of subordinates
  • select all employees as many times as there are Clerks

Of course the old way of doing this was create a utility or helper table that contained a very large number of records (something like dual but with many rows). Each row typically only contained a number. At other times you also have created tables with seven records (days of the week), twelve records (months of the year), 26 records (letters in the alphabet) etc. However, these could all have been derived in views from the one table with just many numeric values. You could also fake this table by using a view or table in your schema that you know to be always available and contain a substantial number of records. Tom Kyte’s favorite for example is ALL_OBJECTS. We will look at the ‘modern’ way: TABLE FUNCTIONS (Oracle 9i) and the MODEL clause (10g).

Using Table-Functions

Much has been written about Table Functions that I will not repeat here. Basically, a Table Function is a PL/SQL function that returns a Nested Table. Such a function can be used in SQL queries, with the function wrapped inside a TABLE( ) operator. We can use TABLE(function) just like any table, view or in-line view in the query; it is just another result-set or record-source.

A useful, generic Table-Function is the following:

create type number_table as table of number
create or replace function number_tbl
( p_lower in number:=0, p_upper in number:=10)
return number_table
is
  l_num_tbl  number_table := number_table();
begin
  for i in p_lower..p_upper loop
    l_num_tbl.extend;
    l_num_tbl(l_num_tbl.last):= i;
  end loop;
  return l_num_tbl;
end number_tbl;

Using this function number_tbl we have a row-generator at our disposal that can be used in many ways.
List the days of the week (knowing that 21st november 2004 is a Sunday

SELECT to_char( days.column_value + to_date('21-11-2004','DD-MM-YYYY'),'DAY') day
FROM   table(number_tbl(0,6)) days

The result:

DAY
---------
SUNDAY
MONDAY
TUESDAY
WEDNESDAY
THURSDAY
FRIDAY
SATURDAY

Find for the numbers 0 through 5 the number of managers who have that number of subordinates:

SELECT num_subs.column_value number_of_subordinates
,      count(subs_count.cnt) number_of_mgrs
FROM   table(number_tbl(0,5)) num_subs
       left outer join
       (select count(*) cnt from emp group by mgr) subs_count
       on  num_subs.column_value = subs_count.cnt
group
by     num_subs.column_value

Results:

NUMBER_OF_SUBORDINATES NUMBER_OF_MGRS
---------------------- --------------
                     0              0
                     1              4
                     2              1
                     3              1
                     4              0
                     5              1

Select the employees ranked 5 through 7 when ordered by salary, descending:

select *
from   table( number_tbl(5, 7)) num
,      ( select emp.*, rownum rn
         from ( select emp.*
                from emp
                order
                by sal desc) emp
       ) emp
where  rn = num.column_value
COLUMN_VALUE      EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO         RN
------------ ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- -
           5       7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30          5
           6       7782 CLARK      MARKETEER       7839 09-JUN-81       2450                    10          6
           7       7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30          7

Using the 10g Model Clause

Inspired by Tom Kyte’s column in Oracle Magazine of November 2004, I realized how you can make use of the MODEL clause in Oracle 10g to have records appear out of thin air. The advantage of this approach is that we do not have to create TYPEs and FUNCTIONs before we can start querying.

To select the number of employees hired in the years 1979 through 1983:

select years.year
,      count(empno)
from   ( SELECT 1979+cell year
         FROM   DUAL
         MODEL
          DIMENSION BY (0 attr)
          MEASURES (0 cell)
          RULES  ITERATE (5)
          ( cell[iteration_number]= iteration_number)
       ) years
       left outer join
       emp
on     (extract(year from hiredate) = years.year)
group
by     year
ORDER
BY     year

The outcome, in case you were wondering:

  YEAR COUNT(EMPNO)
------ ------------
  1979            0
  1980            1
  1981           10
  1982            1
  1983            0

The in-line view years makes use of DUAL and explodes the set returned from dual. It builds a two-dimensional array:

SELECT cell , attr
FROM   DUAL
  MODEL
  DIMENSION BY (0 attr)
  MEASURES (0 cell)
  RULES  ITERATE (5)
  ( cell[iteration_number]= iteration_number)

 CELL       ATTR
----- ----------
    0          0
    1          1
    2          2
    3          3
    4          4

With the ITERATE rule we specify the number of records that should be created in th model. With the DIMENSION and MEASURES, we define the composition of each individual record. Finally we use the almost simplest assignment expression for the cell (MEASURE) values.

In a similar fashion we can get a list of the days of the week:

SELECT cell day_number
,      to_char( cell + to_date('21-11-2004','DD-MM-YYYY'),'DAY') day
FROM   DUAL
  MODEL
  DIMENSION BY (0 attr)
  MEASURES (0 cell)
  RULES  ITERATE (7)
  ( cell[iteration_number]= iteration_number)

DAY_NUMBER DAY
---------- ---------
         0 SUNDAY
         1 MONDAY
         2 TUESDAY
         3 WEDNESDAY
         4 THURSDAY
         5 FRIDAY
         6 SATURDAY

Here I happen to use a SUNDAY as starting point (21st November 2004, today).

Using the MODEL clause in this fashion, you can any number of records you desire with values that are either from 1 to the number of records or derived from those values – such as dates, ascii characters etc. Get a list of all letters in the alphabet:

SELECT chr(97+cell) letter
FROM   DUAL
  MODEL
  DIMENSION BY (0 attr)
  MEASURES (0 cell)
  RULES  ITERATE (26)
  ( cell[iteration_number]= iteration_number)

L
-
a
b
c
d
e
f
g
h
i
j
...

Resources on the MODEL Clause:
SQL Reference (Oracle 10g On Line Documentation) – Model Expressions
SQL Reference (Oracle 10g On Line Documentation) – Model Clause
Data Warehousing Guide (Oracle 10g On Line Documentation) – 22. SQL For Modeling

Partition Outer Join (10g)

In certain circumstances you may also want to make use of the Partition Outer Join Clause. For example, if we want to select from EMP the number of employees hired in the years 1980, 1981 and 1982 per department – we need a partition outer join to ensure that we will have a result for each of these years for each department.

If we execute this query – to find the number of employees hired in each year for every department:

select years.year
,      deptno
,      count(empno)
from   ( SELECT 1980+cell year
         FROM   DUAL
         MODEL
          DIMENSION BY (0 attr)
          MEASURES (0 cell)
          RULES  ITERATE (3)
          ( cell[iteration_number]= iteration_number)
       ) years
       left outer join
       emp
on     (extract(year from hiredate) = years.year)
group
by     year
,      deptno
ORDER
BY     year
,          deptno

the result is somewhat incomplete:

  YEAR     DEPTNO COUNT(EMPNO)
------ ---------- ------------
  1980         20            1
  1981         10            1
  1981         20            3
  1981         30            6
  1982         10            1

No entries for departments 10 and 30 in 1980 nor entries in 1982 for departments 20 and 30. With the partition outer join clause we can complement this result:

 select years.year
 ,      deptno
 ,      count(empno)
 from   ( SELECT 1980+cell year
          FROM   DUAL
          MODEL
           DIMENSION BY (0 attr)
           MEASURES (0 cell)
           RULES  ITERATE (3)
           ( cell[iteration_number]= iteration_number)
        ) years
        left outer join
        emp partition by (deptno)
 on     (extract(year from hiredate) = years.year)
 group
 by     year
 ,      deptno
 ORDER
 BY     year
 ,      deptno

Here we have “left outer joined” years with emp. That can be read as: take all left records and find joinable records from emp; if you can find none, add an “empty” record representing the right side – emp. In the previous query, since every year could be matched with at least one department’s score, there were no empty records added; hence we had no row at all for the departments 10 and 30 for 1980. In this query, we have added partition by (deptno) to emp. This tells the SQL Engine that the assignment to left outer join years to emp must be executed for every deptno-partition-of-emp. The set of records returned from EMP is partitioned by deptno (three partitions). Then, the outer join is performed again for each of the three partitions. This must results in a record for every year and every partition (i.e. deptno). The result of this query:

  YEAR     DEPTNO COUNT(EMPNO)
------ ---------- ------------
  1980         10            0
  1980         20            1
  1980         30            0
  1981         10            1
  1981         20            3
  1981         30            6
  1982         10            1
  1982         20            0
  1982         30            0
Share.

About Author

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director for Fusion Middleware. Consultant, trainer and instructor on diverse areas including Oracle Database (SQL & PLSQL), Service Oriented Architecture, BPM, ADF, Java in various shapes and forms and many other things. Author of the Oracle Press book: Oracle SOA Suite 11g Handbook. Frequent presenter on conferences such as JavaOne, Oracle OpenWorld, ODTUG Kaleidoscope, Devoxx and OBUG. Presenter for Oracle University Celebrity specials.

11 Comments

  1. what do you think about this?
    select Job, DECODE(DEPTNO,10,sum(SAL), NULL) Dept 10,
    DECODE(DEPTNO,20,sum(SAL), NULL) Dept 20,
    DECODE(DEPTNO,30,sum(SAL), NULL) Dept 10,
    sum(SAL) Total
    from EMP
    group by Job;

  2. I am trying to create a view that shows the total of overdue people by, less than 30 days, 31 to 59days, and over 60 days overdue.
    What is wrong with my code.

    CREATE OR REPLACE VIEW csu_status_date_vu
    (category, less than 30 days, 31-59 days, 60 days and greater)

    AS SELECT ass_attribute9, effective_start_date

    FROM PER_ALL_ASSIGNMENTS_F

    WHERE (SELECT SUM (CASE
    WHEN effective_start_date > TRUNC (SYSDATE – 30)
    THEN 1
    ELSE 0
    END) “less than 30 days”,
    SUM (CASE
    WHEN effective_start_date = TRUNC (SYSDATE – 59)
    THEN 1
    ELSE 0
    END) “31 -59 days”,
    SUM (CASE
    WHEN effective_start_date

  3. Re the pivot query, isn’t this a whole lot simpler? :-

    SQL> select job
    2 , sum(case when deptno=10 then sal else 0 end) dept10
    3 , sum(case when deptno=20 then sal else 0 end) dept20
    4 , sum(case when deptno=30 then sal else 0 end) dept30
    5 , sum(sal) total
    6 from emp
    7 group by job;

    JOB DEPT10 DEPT20 DEPT30 TOTAL
    ——— ———- ———- ———- ———-
    ANALYST 0 6000 0 6000
    CLERK 1300 1900 950 4150
    MANAGER 2450 2975 2850 8275
    PRESIDENT 5000 0 0 5000
    SALESMAN 0 0 5600 5600

  4. OK, using the SQL MODEL clause (only in Oracle 10g I am afraid) I found the near perfect solution. Its result looks as follows:
    JOB DEPT10 DEPT20 DEPT30 GRAND_TOTAL
    --------- ---------- ---------- ---------- -----------
    CLERK 1300 1900 950 4150
    SALESMAN 5600 5600
    PRESIDENT 5000 5000
    MANAGER 2450 2975 2850 8275
    ANALYST 6000 6000

    The query itself:

    select job
    , dept10
    , dept20
    , dept30
    , grand_total
    from (select distinct deptno, job, sum(sal) OVER ( PARTITION BY deptno, job) sumsal , sum(sal) over( partition by job) sumjob from emp )
    model
    return updated rows
    partition by (job)
    dimension by (deptno)
    measures (sumsal, lpad(' ',10) dept10, lpad(' ',10) dept20, lpad(' ',10) dept30 , sumjob grand_total)
    rules upsert
    (
    dept10 [0] = sumsal [10]
    , dept20 [0] = sumsal [20]
    , dept30 [0] = sumsal [30]
    , grand_total [0] = max(grand_total) [ANY]
    )

    For more insight and an explanation, see this post Pivoting in SQL using the 10g Model Clause

  5. OK, I missed the proper pivot. By replacing job and deptno in the above query we get the following result:

    JOB       SAL_SUM1             SAL_SUM2             SAL_SUM3
    --------- -------------------- -------------------- -------------
    ANALYST   10:                  20: 6000             30:
    CLERK     10: 1300             20: 1900             30: 950
    MANAGER   10: 2450             20: 2975             30: 2850
    PRESIDENT 10: 5000             20:                  30:
    SALESMAN  10:                  20:                  30: 5600            

    The query now looks as follows:

    select job
    ,      substr(sal_sum1, 1, 20) sal_sum1
    ,      substr(sal_sum2, 1, 20) sal_sum2
    ,      substr(sal_sum3, 1, 20) sal_sum3
    ,      substr(sal_sum4, 1, 20) sal_sum4
    ,      substr(sal_sum5, 1, 20) sal_sum5
    ,      substr(sal_sum6, 1, 20) sal_sum6
    from ( select job
           ,      deptno||': '||sal_sum sal_sum1
           ,      lead(deptno,1) over (partition by job order by deptno)
                  ||': '||lead(sal_sum,1) over (partition by job order by deptno) sal_sum2
           ,      lead(deptno,2) over (partition by job order by deptno)
                  ||': '||lead(sal_sum,2) over (partition by job order by deptno) sal_sum3
           ,      lead(deptno,3) over (partition by job order by deptno)
                  ||': '||lead(sal_sum,3) over (partition by job order by deptno) sal_sum4
           ,      lead(deptno,4) over (partition by job order by deptno)
                  ||': '||lead(sal_sum,4) over (partition by job order by deptno) sal_sum5
           ,      lead(deptno,5) over (partition by job order by deptno)
                  ||': '||lead(sal_sum,5) over (partition by job order by deptno) sal_sum6
           ,      row_number() over (partition by job order by deptno) rnk
           from ( select distinct
                         job
                  ,      deptnos.deptno
                  ,      sum(sal) over (partition by emp.deptno, job) sal_sum
                  from   emp partition by (job)
                         right outer join
                         (select distinct deptno from emp) deptnos
                         on (emp.deptno = deptnos.deptno)
                  order
                  by     job
                  ,      deptno
                )
         )
    where rnk = 1 
  6. Maybe a better, more generic solution with less hard-coding and less coding in general – though only suitable for Oracle 10g because of the partion by clause in the outer join expression – would render the following result:

     DEPTNO SAL_SUM1             SAL_SUM2             SAL_SUM3             SAL_SUM4             SAL_SUM5             SAL_SUM6
    ------- -------------------- -------------------- -------------------- -------------------- -----
         10 ANALYST:             CLERK: 1300          MANAGER: 2450        PRESIDENT: 5000      SALESMAN:            :
         20 ANALYST: 6000        CLERK: 1900          MANAGER: 2975        PRESIDENT:           SALESMAN:            :
         30 ANALYST:             CLERK: 950           MANAGER: 2850        PRESIDENT:           SALESMAN: 5600       :

    The code used to produce this result:

    select deptno
    ,      substr(sal_sum1, 1, 20) sal_sum1
    ,      substr(sal_sum2, 1, 20) sal_sum2
    ,      substr(sal_sum3, 1, 20) sal_sum3
    ,      substr(sal_sum4, 1, 20) sal_sum4
    ,      substr(sal_sum5, 1, 20) sal_sum5
    ,      substr(sal_sum6, 1, 20) sal_sum6
    from ( select deptno
           ,      job||': '||sal_sum sal_sum1
           ,      lead(job,1) over (partition by deptno order by job)
                  ||': '||lead(sal_sum,1) over (partition by deptno order by job) sal_sum2
           ,      lead(job,2) over (partition by deptno order by job)
                  ||': '||lead(sal_sum,2) over (partition by deptno order by job) sal_sum3
           ,      lead(job,3) over (partition by deptno order by job)
                  ||': '||lead(sal_sum,3) over (partition by deptno order by job) sal_sum4
           ,      lead(job,4) over (partition by deptno order by job)
                  ||': '||lead(sal_sum,4) over (partition by deptno order by job) sal_sum5
           ,      lead(job,5) over (partition by deptno order by job)
                  ||': '||lead(sal_sum,5) over (partition by deptno order by job) sal_sum6
           ,      row_number() over (partition by deptno order by job) rnk
           from ( select distinct
                         deptno
                  ,      jobs.job
                  ,      sum(sal) over (partition by emp.job, deptno) sal_sum
                  from   emp partition by (deptno)
                         right outer join
                         (select distinct job from emp) jobs
                         on (emp.job = jobs.job)
                  order
                  by     deptno
                  ,      job
                )
         )
    where rnk = 1
  7. Lucas Jellema on

    I am not sure this is the most elegant solution, but it most certainly gives the matrix report you asked for:

    JOB DEPT10 DEPT20 DEPT30 TOTAL
    -------- ---------- ---------- ---------- ----------
    Clerks 1430 2090 1045 4565
    Managers 0 2975 2850 5825
    Salesmen 0 0 3100 3100


    select case dn.n
    when 1
    then 'Clerks'
    when 2
    then 'Managers'
    when 3
    then 'Salesmen'
    else null
    end job
    , max( case
    when rn=1 and dn.n = 1
    then clerks
    when rn=1 and dn.n = 2
    then managers
    when rn=1 and dn.n = 3
    then salesmen
    else null
    end) dept10
    , max( case
    when rn=2 and dn.n = 1
    then clerks
    when rn=2 and dn.n = 2
    then managers
    when rn=2 and dn.n = 3
    then salesmen
    else null
    end ) dept20
    , max(case
    when rn=3 and dn.n = 1
    then clerks
    when rn=3 and dn.n = 2
    then managers
    when rn=3 and dn.n = 3
    then salesmen
    else null
    end) dept30
    , sum(case
    when dn.n = 1
    then clerks
    when dn.n = 2
    then managers
    when dn.n = 3
    then salesmen
    else 0
    end) total
    from
    (select deptno
    , row_number() over (order by deptno) rn
    , sum( case job
    when 'CLERK'
    then sal
    else 0
    end
    ) clerks
    , sum( case job
    when 'MANAGER'
    then sal
    else 0
    end
    ) Managers
    , sum( case job
    when 'SALESMAN'
    then sal
    else 0
    end
    ) salesmen
    from emp
    group
    by deptno
    )
    , (select rownum n from emp where rownum<4) dn
    group
    by case dn.n
    when 1
    then 'Clerks'
    when 2
    then 'Managers'
    when 3
    then 'Salesmen'
    else null
    end

  8. There are some good examples here but don’t forget that in many circumstances an alternative approach
    for magicing rows into existence is to deliberately make a cartesian join to this type of in-line view:

    select rownum
    from {large table}
    where rownum < {rows that you want}

  9. how do i create a matrix query to display job, salary for that job based on dept no., and total salary for that job fro all departments,
    job dept10 dept20 dept30 total
    —- —— —— —— —–
    clerk 1300 1900 950 4150

    using grouping functions

  10. how do i create a matrix query to display job, salary for that job based on dept no., and total salary for that job fro all departments,
    job dept10 dept20 dept30 total
    —- —— —— —— —–
    clerk 1300 1900

  11. Pingback: » Courtesy of Tom Kyte: Generating rows in SQL with the CUBE statement - no dummy table or table function required