Making up records in SQL Queries – Table Functions and 10g Model clause

Lucas Jellema November 21, 2004 Database, IT, Oracle, PL/SQL 10 Comments

From time to time there is a need in my SQL queries to have access to a table with a specific number of records. Sometimes I want to explode the results from a query – double, treble etc. the number of records. On other occasions I want to select from a specific range or outer join with a specified range to have an entry for every rank, date etc. More concrete examples (in terms of EMP and DEPT):

select the number of employees hired on every day of the week – include entries for days on which no one was hired
select the employees ranked on even positions when ordered by salary (note: there are many ways to approach this)
present for the numbers 1 to 5 the number of employees that manage that number of subordinates
select all employees as many times as there are Clerks

Of course the old way of doing this was create a utility or helper table that contained a very large number of records (something like dual but with many rows). Each row typically only contained a number. At other times you also have created tables with seven records (days of the week), twelve records (months of the year), 26 records (letters in the alphabet) etc. However, these could all have been derived in views from the one table with just many numeric values. You could also fake this table by using a view or table in your schema that you know to be always available and contain a substantial number of records. Tom Kyte’s favorite for example is ALL_OBJECTS. We will look at the ‘modern’ way: TABLE FUNCTIONS (Oracle 9i) and the MODEL clause (10g).

Using Table-Functions

Much has been written about Table Functions that I will not repeat here. Basically, a Table Function is a PL/SQL function that returns a Nested Table. Such a function can be used in SQL queries, with the function wrapped inside a TABLE( ) operator. We can use TABLE(function) just like any table, view or in-line view in the query; it is just another result-set or record-source.

A useful, generic Table-Function is the following:

create type number_table as table of number
create or replace function number_tbl
( p_lower in number:=0, p_upper in number:=10)
return number_table
is
  l_num_tbl  number_table := number_table();
begin
  for i in p_lower..p_upper loop
    l_num_tbl.extend;
    l_num_tbl(l_num_tbl.last):= i;
  end loop;
  return l_num_tbl;
end number_tbl;

Using this function number_tbl we have a row-generator at our disposal that can be used in many ways.
List the days of the week (knowing that 21st november 2004 is a Sunday

SELECT to_char( days.column_value + to_date('21-11-2004','DD-MM-YYYY'),'DAY') day
FROM   table(number_tbl(0,6)) days

The result:

DAY
---------
SUNDAY
MONDAY
TUESDAY
WEDNESDAY
THURSDAY
FRIDAY
SATURDAY

Find for the numbers 0 through 5 the number of managers who have that number of subordinates:

SELECT num_subs.column_value number_of_subordinates
,      count(subs_count.cnt) number_of_mgrs
FROM   table(number_tbl(0,5)) num_subs
       left outer join
       (select count(*) cnt from emp group by mgr) subs_count
       on  num_subs.column_value = subs_count.cnt
group
by     num_subs.column_value

Results:

NUMBER_OF_SUBORDINATES NUMBER_OF_MGRS
---------------------- --------------
                     0              0
                     1              4
                     2              1
                     3              1
                     4              0
                     5              1

Select the employees ranked 5 through 7 when ordered by salary, descending:

select *
from   table( number_tbl(5, 7)) num
,      ( select emp.*, rownum rn
         from ( select emp.*
                from emp
                order
                by sal desc) emp
       ) emp
where  rn = num.column_value

COLUMN_VALUE      EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO         RN
------------ ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- -
           5       7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30          5
           6       7782 CLARK      MARKETEER       7839 09-JUN-81       2450                    10          6
           7       7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30          7

Using the 10g Model Clause

Inspired by Tom Kyte’s column in Oracle Magazine of November 2004, I realized how you can make use of the MODEL clause in Oracle 10g to have records appear out of thin air. The advantage of this approach is that we do not have to create TYPEs and FUNCTIONs before we can start querying.

To select the number of employees hired in the years 1979 through 1983:

select years.year
,      count(empno)
from   ( SELECT 1979+cell year
         FROM   DUAL
         MODEL
          DIMENSION BY (0 attr)
          MEASURES (0 cell)
          RULES  ITERATE (5)
          ( cell[iteration_number] = iteration_number)
       ) years
       left outer join
       emp
on     (extract(year from hiredate) = years.year)
group
by     year
ORDER
BY     year

The outcome, in case you were wondering:

  YEAR COUNT(EMPNO)
------ ------------
  1979            0
  1980            1
  1981           10
  1982            1
  1983            0

The in-line view years makes use of DUAL and explodes the set returned from dual. It builds a two-dimensional array:

SELECT cell , attr
FROM   DUAL
  MODEL
  DIMENSION BY (0 attr)
  MEASURES (0 cell)
  RULES  ITERATE (5)
  ( cell[iteration_number] = iteration_number)

 CELL       ATTR
----- ----------
    0          0
    1          1
    2          2
    3          3
    4          4

With the ITERATE rule we specify the number of records that should be created in th model. With the DIMENSION and MEASURES, we define the composition of each individual record. Finally we use the almost simplest assignment expression for the cell (MEASURE) values.

In a similar fashion we can get a list of the days of the week:

SELECT cell day_number
,      to_char( cell + to_date('21-11-2004','DD-MM-YYYY'),'DAY') day
FROM   DUAL
  MODEL
  DIMENSION BY (0 attr)
  MEASURES (0 cell)
  RULES  ITERATE (7)
  ( cell[iteration_number] = iteration_number)

DAY_NUMBER DAY
---------- ---------
         0 SUNDAY
         1 MONDAY
         2 TUESDAY
         3 WEDNESDAY
         4 THURSDAY
         5 FRIDAY
         6 SATURDAY

Here I happen to use a SUNDAY as starting point (21st November 2004, today).

Using the MODEL clause in this fashion, you can any number of records you desire with values that are either from 1 to the number of records or derived from those values – such as dates, ascii characters etc. Get a list of all letters in the alphabet:

SELECT chr(97+cell) letter
FROM   DUAL
  MODEL
  DIMENSION BY (0 attr)
  MEASURES (0 cell)
  RULES  ITERATE (26)
  ( cell[iteration_number] = iteration_number)

L
-
a
b
c
d
e
f
g
h
i
j
...

Resources on the MODEL Clause:
SQL Reference (Oracle 10g On Line Documentation) – Model Expressions
SQL Reference (Oracle 10g On Line Documentation) – Model Clause
Data Warehousing Guide (Oracle 10g On Line Documentation) – 22. SQL For Modeling

Partition Outer Join (10g)

In certain circumstances you may also want to make use of the Partition Outer Join Clause. For example, if we want to select from EMP the number of employees hired in the years 1980, 1981 and 1982 per department – we need a partition outer join to ensure that we will have a result for each of these years for each department.

If we execute this query – to find the number of employees hired in each year for every department:

select years.year
,      deptno
,      count(empno)
from   ( SELECT 1980+cell year
         FROM   DUAL
         MODEL
          DIMENSION BY (0 attr)
          MEASURES (0 cell)
          RULES  ITERATE (3)
          ( cell[iteration_number] = iteration_number)
       ) years
       left outer join
       emp
on     (extract(year from hiredate) = years.year)
group
by     year
,      deptno
ORDER
BY     year
,          deptno

the result is somewhat incomplete:

  YEAR     DEPTNO COUNT(EMPNO)
------ ---------- ------------
  1980         20            1
  1981         10            1
  1981         20            3
  1981         30            6
  1982         10            1

No entries for departments 10 and 30 in 1980 nor entries in 1982 for departments 20 and 30. With the partition outer join clause we can complement this result:

 select years.year
 ,      deptno
 ,      count(empno)
 from   ( SELECT 1980+cell year
          FROM   DUAL
          MODEL
           DIMENSION BY (0 attr)
           MEASURES (0 cell)
           RULES  ITERATE (3)
           ( cell[iteration_number] = iteration_number)
        ) years
        left outer join
        emp partition by (deptno)
 on     (extract(year from hiredate) = years.year)
 group
 by     year
 ,      deptno
 ORDER
 BY     year
 ,      deptno

Here we have “left outer joined” years with emp. That can be read as: take all left records and find joinable records from emp; if you can find none, add an “empty” record representing the right side – emp. In the previous query, since every year could be matched with at least one department’s score, there were no empty records added; hence we had no row at all for the departments 10 and 30 for 1980. In this query, we have added partition by (deptno) to emp. This tells the SQL Engine that the assignment to left outer join years to emp must be executed for every deptno-partition-of-emp. The set of records returned from EMP is partitioned by deptno (three partitions). Then, the outer join is performed again for each of the three partitions. This must results in a record for every year and every partition (i.e. deptno). The result of this query:

  YEAR     DEPTNO COUNT(EMPNO)
------ ---------- ------------
  1980         10            0
  1980         20            1
  1980         30            0
  1981         10            1
  1981         20            3
  1981         30            6
  1982         10            1
  1982         20            0
  1982         30            0

About The Author

Lucas Jellema

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director and Oracle Developer Champion. Solution architect and developer on diverse areas including SQL, JavaScript, Kubernetes & Docker, Machine Learning, Java, SOA and microservices, events in various shapes and forms and many other things. Author of the Oracle Press book Oracle SOA Suite 12c Handbook. Frequent presenter on user groups and community events and conferences such as JavaOne, Oracle Code, CodeOne, NLJUG JFall and Oracle OpenWorld.

10 Comments

drkalucard September 5, 2006

what do you think about this?
select Job, DECODE(DEPTNO,10,sum(SAL), NULL) Dept 10,
DECODE(DEPTNO,20,sum(SAL), NULL) Dept 20,
DECODE(DEPTNO,30,sum(SAL), NULL) Dept 10,
sum(SAL) Total
from EMP
group by Job;
Mike April 13, 2006

I am trying to create a view that shows the total of overdue people by, less than 30 days, 31 to 59days, and over 60 days overdue.
What is wrong with my code.

CREATE OR REPLACE VIEW csu_status_date_vu
(category, less than 30 days, 31-59 days, 60 days and greater)

AS SELECT ass_attribute9, effective_start_date

FROM PER_ALL_ASSIGNMENTS_F

WHERE (SELECT SUM (CASE
WHEN effective_start_date > TRUNC (SYSDATE – 30)
THEN 1
ELSE 0
END) “less than 30 days”,
SUM (CASE
WHEN effective_start_date = TRUNC (SYSDATE – 59)
THEN 1
ELSE 0
END) “31 -59 days”,
SUM (CASE
WHEN effective_start_date
Tony Andrews September 15, 2005

Re the pivot query, isn’t this a whole lot simpler? :-

SQL> select job
2 , sum(case when deptno=10 then sal else 0 end) dept10
3 , sum(case when deptno=20 then sal else 0 end) dept20
4 , sum(case when deptno=30 then sal else 0 end) dept30
5 , sum(sal) total
6 from emp
7 group by job;

JOB DEPT10 DEPT20 DEPT30 TOTAL
——— ———- ———- ———- ———-
ANALYST 0 6000 0 6000
CLERK 1300 1900 950 4150
MANAGER 2450 2975 2850 8275
PRESIDENT 5000 0 0 5000
SALESMAN 0 0 5600 5600
Lucas December 16, 2004

OK, using the SQL MODEL clause (only in Oracle 10g I am afraid) I found the near perfect solution. Its result looks as follows:
JOB DEPT10 DEPT20 DEPT30 GRAND_TOTAL --------- ---------- ---------- ---------- ----------- CLERK 1300 1900 950 4150 SALESMAN 5600 5600 PRESIDENT 5000 5000 MANAGER 2450 2975 2850 8275 ANALYST 6000 6000

The query itself:
select job , dept10 , dept20 , dept30 , grand_total from (select distinct deptno, job, sum(sal) OVER ( PARTITION BY deptno, job) sumsal , sum(sal) over( partition by job) sumjob from emp ) model return updated rows partition by (job) dimension by (deptno) measures (sumsal, lpad(' ',10) dept10, lpad(' ',10) dept20, lpad(' ',10) dept30 , sumjob grand_total) rules upsert ( dept10 [0] = sumsal [10] , dept20 [0] = sumsal [20] , dept30 [0] = sumsal [30] , grand_total [0] = max(grand_total) [ANY] )

For more insight and an explanation, see this post Pivoting in SQL using the 10g Model Clause

Lucas December 8, 2004

OK, I missed the proper pivot. By replacing job and deptno in the above query we get the following result:

JOB       SAL_SUM1             SAL_SUM2             SAL_SUM3
--------- -------------------- -------------------- -------------
ANALYST   10:                  20: 6000             30:
CLERK     10: 1300             20: 1900             30: 950
MANAGER   10: 2450             20: 2975             30: 2850
PRESIDENT 10: 5000             20:                  30:
SALESMAN  10:                  20:                  30: 5600

The query now looks as follows:

select job
,      substr(sal_sum1, 1, 20) sal_sum1
,      substr(sal_sum2, 1, 20) sal_sum2
,      substr(sal_sum3, 1, 20) sal_sum3
,      substr(sal_sum4, 1, 20) sal_sum4
,      substr(sal_sum5, 1, 20) sal_sum5
,      substr(sal_sum6, 1, 20) sal_sum6
from ( select job
       ,      deptno||': '||sal_sum sal_sum1
       ,      lead(deptno,1) over (partition by job order by deptno)
              ||': '||lead(sal_sum,1) over (partition by job order by deptno) sal_sum2
       ,      lead(deptno,2) over (partition by job order by deptno)
              ||': '||lead(sal_sum,2) over (partition by job order by deptno) sal_sum3
       ,      lead(deptno,3) over (partition by job order by deptno)
              ||': '||lead(sal_sum,3) over (partition by job order by deptno) sal_sum4
       ,      lead(deptno,4) over (partition by job order by deptno)
              ||': '||lead(sal_sum,4) over (partition by job order by deptno) sal_sum5
       ,      lead(deptno,5) over (partition by job order by deptno)
              ||': '||lead(sal_sum,5) over (partition by job order by deptno) sal_sum6
       ,      row_number() over (partition by job order by deptno) rnk
       from ( select distinct
                     job
              ,      deptnos.deptno
              ,      sum(sal) over (partition by emp.deptno, job) sal_sum
              from   emp partition by (job)
                     right outer join
                     (select distinct deptno from emp) deptnos
                     on (emp.deptno = deptnos.deptno)
              order
              by     job
              ,      deptno
            )
     )
where rnk = 1

Lucas December 8, 2004

Maybe a better, more generic solution with less hard-coding and less coding in general – though only suitable for Oracle 10g because of the partion by clause in the outer join expression – would render the following result:

 DEPTNO SAL_SUM1             SAL_SUM2             SAL_SUM3             SAL_SUM4             SAL_SUM5             SAL_SUM6
------- -------------------- -------------------- -------------------- -------------------- -----
     10 ANALYST:             CLERK: 1300          MANAGER: 2450        PRESIDENT: 5000      SALESMAN:            :
     20 ANALYST: 6000        CLERK: 1900          MANAGER: 2975        PRESIDENT:           SALESMAN:            :
     30 ANALYST:             CLERK: 950           MANAGER: 2850        PRESIDENT:           SALESMAN: 5600       :

The code used to produce this result:

select deptno
,      substr(sal_sum1, 1, 20) sal_sum1
,      substr(sal_sum2, 1, 20) sal_sum2
,      substr(sal_sum3, 1, 20) sal_sum3
,      substr(sal_sum4, 1, 20) sal_sum4
,      substr(sal_sum5, 1, 20) sal_sum5
,      substr(sal_sum6, 1, 20) sal_sum6
from ( select deptno
       ,      job||': '||sal_sum sal_sum1
       ,      lead(job,1) over (partition by deptno order by job)
              ||': '||lead(sal_sum,1) over (partition by deptno order by job) sal_sum2
       ,      lead(job,2) over (partition by deptno order by job)
              ||': '||lead(sal_sum,2) over (partition by deptno order by job) sal_sum3
       ,      lead(job,3) over (partition by deptno order by job)
              ||': '||lead(sal_sum,3) over (partition by deptno order by job) sal_sum4
       ,      lead(job,4) over (partition by deptno order by job)
              ||': '||lead(sal_sum,4) over (partition by deptno order by job) sal_sum5
       ,      lead(job,5) over (partition by deptno order by job)
              ||': '||lead(sal_sum,5) over (partition by deptno order by job) sal_sum6
       ,      row_number() over (partition by deptno order by job) rnk
       from ( select distinct
                     deptno
              ,      jobs.job
              ,      sum(sal) over (partition by emp.job, deptno) sal_sum
              from   emp partition by (deptno)
                     right outer join
                     (select distinct job from emp) jobs
                     on (emp.job = jobs.job)
              order
              by     deptno
              ,      job
            )
     )
where rnk = 1

Lucas Jellema November 24, 2004

I am not sure this is the most elegant solution, but it most certainly gives the matrix report you asked for:
JOB DEPT10 DEPT20 DEPT30 TOTAL -------- ---------- ---------- ---------- ---------- Clerks 1430 2090 1045 4565 Managers 0 2975 2850 5825 Salesmen 0 0 3100 3100
select case dn.n when 1 then 'Clerks' when 2 then 'Managers' when 3 then 'Salesmen' else null end job , max( case when rn=1 and dn.n = 1 then clerks when rn=1 and dn.n = 2 then managers when rn=1 and dn.n = 3 then salesmen else null end) dept10 , max( case when rn=2 and dn.n = 1 then clerks when rn=2 and dn.n = 2 then managers when rn=2 and dn.n = 3 then salesmen else null end ) dept20 , max(case when rn=3 and dn.n = 1 then clerks when rn=3 and dn.n = 2 then managers when rn=3 and dn.n = 3 then salesmen else null end) dept30 , sum(case when dn.n = 1 then clerks when dn.n = 2 then managers when dn.n = 3 then salesmen else 0 end) total from (select deptno , row_number() over (order by deptno) rn , sum( case job when 'CLERK' then sal else 0 end ) clerks , sum( case job when 'MANAGER' then sal else 0 end ) Managers , sum( case job when 'SALESMAN' then sal else 0 end ) salesmen from emp group by deptno ) , (select rownum n from emp where rownum<4) dn group by case dn.n when 1 then 'Clerks' when 2 then 'Managers' when 3 then 'Salesmen' else null end
Jason Grundy November 24, 2004

There are some good examples here but don’t forget that in many circumstances an alternative approach
for magicing rows into existence is to deliberately make a cartesian join to this type of in-line view:

select rownum
from {large table}
where rownum < {rows that you want}
sheetal November 23, 2004

how do i create a matrix query to display job, salary for that job based on dept no., and total salary for that job fro all departments,
job dept10 dept20 dept30 total
—- —— —— —— —–
clerk 1300 1900 950 4150

using grouping functions
sheetal November 23, 2004

how do i create a matrix query to display job, salary for that job based on dept no., and total salary for that job fro all departments,
job dept10 dept20 dept30 total
—- —— —— —— —–
clerk 1300 1900
Pingback: » Courtesy of Tom Kyte: Generating rows in SQL with the CUBE statement - no dummy table or table function required February 11, 2005

Using Table-Functions

Using the 10g Model Clause

Partition Outer Join (10g)

Share this:

Like this:

Related Posts

About The Author

Lucas Jellema

10 Comments