Sorting PL/SQL Collections, the quite simple way (part two: Have the SQL Engine do the heavy lifting)

10

Yesterday I published an article on this blog describing an approach to sorting the elements in PL/SQL Collections:Sorting PL/SQL Collections, the hard way, the intermediate way and the quite simple way (part one) . This approach makes use of the property of PL/SQL Associative Arrays that the keys are always kept in sorted order. We do not have to do anything, just stuff the things we want to have sorted into an Associative Array and when done, retrieve them again. They will reappear properly sorted! With some additional logic, we can use this mechanism to quickly sort large collections of complex records.

However, there are a few downsides to this approach: it will only sort on VARCHAR2 and NUMBER fields – and the alphabetical sort may not be the one we require. More complex sorting rules are not supported. Besides, the code may not be as clear and elegant as we would like. So far I have not done any performance tests – that would be the next step.

But first I want to demonstrate another way of sorting PL/SQL Collections. We will make use of the TABLE, MULTISET and CAST operators to out-source the heavy lifting to the SQL Engine! This means we have the full power of SQL at our disposal for manipulating the PL/SQL collection. Note: not just for sorting, also for filtering and aggregating! There is one important requirement: the PL/SQL Collection must be based on a type that is defined in the database!

.... 

Let’s start with a very simple example:

create type num_tbl is table of number
/

declare
   l_num_tbl num_tbl:= num_tbl(6,12,9,1,54,21, 11, 2);
 

We have defined a type num_tbl and now we have an unsorted collection of that type. Our first objective: get the elements of this collection in a properly sorted order. Let’s see how simple that is:

 begin
   select cast ( multiset( select *
                          from table( l_num_tbl )
                          order by 1
                         ) as num_tbl)
   into l_num_tbl
   from dual;
 

Using the table operator we pretend that l_num_tbl is a table-like-recordset. We can select records from this table-like-structure in an sequence determined by a normal SQL ORDER BY clause. The result of that query is turned into a collection again, using MULTISET, and subsequently casted into the desired collection type, namely NUM_TABLE. The result is selected into the variable l_num_tbl that is the same one as we started with.

Reporting the values in the l_num_tbl collection at this point:

   l_idx:= l_num_tbl.first;
   loop
     dbms_output.put_line(l_num_tbl(l_idx));
     l_idx:= l_num_tbl.next(l_idx);
     exit when l_idx is null;
   end loop;
end;
/
1
2
6
9
11
12
21
54

PL/SQL procedure successfully completed.

And we have done it! That’s even less code than in the comparable example in Sorting PL/SQL Collections, the hard way, the intermediate way and the quite simple way (part one). Note that this exchange with the SQL Engine is only possible because l_num_tbl is based on a type that is created in the database and not defined in PL/SQL.

Sorting Collection of more complex records

What we have seen so far was relatively trivial. Let’s now look at a collection of more complex structures that we may want to sort in more advanced ways. Again, the SQL engine can only sort collections of types that are defined in the database. We define two helper types in the database:

create type sortable
is object
( value varchar2(50)
, pointer number
)
/
create type sortable_table
is table of sortable
/
 

These types are later used to perform the sort on the field in the records that we want to have sorted.

Now let’s meet the complex record definition and the collection we want to sort:

declare
    type complex_type is record ( last_name varchar2(50)
                                , street varchar2(50)
                                , city varchar2(50)
                                , birthdate date
                                , zipcode varchar2(15)
                                , email_address varchar2(50)
                                , many_more_elements varchar2(4000)
                                );
    type complex_type_table_type is table of complex_type;
    -- define and initialize a complex Nested Table Collection
    l_cpx_tbl complex_type_table_type:= complex_type_table_type();
    l_cpx_rec complex_type;

The records in our collection are created and added as follows:

  begin
    -- first populate the collection with a few values.
    l_cpx_rec.last_name:= 'Jellema';
    l_cpx_rec.street:= 'Rembrandtlaan';
    l_cpx_rec.email_address:= 'toby231@hitmail.com';
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;
    l_cpx_rec.last_name:= 'Abramovic';
    l_cpx_rec.street:= 'Chelsea Road';
    l_cpx_rec.email_address:= 'abra1@notfail.com';
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;
    l_cpx_rec.last_name:= 'Rooney';
    l_cpx_rec.street:= 'ManYou 32';
    l_cpx_rec.email_address:= 'wayne@noavail.com';
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;
    l_cpx_rec.last_name:= 'Basten';
    l_cpx_rec.street:= 'Russia Park 1988';
    l_cpx_rec.email_address:= 'bassie@oneteam.com';
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;

 

We want to sort the collection by the last_name field in the complex_type records. Currently, the order is rather random – well, actually it is the order in which the records were added to the collection:
 

    l_idx:= l_cpx_tbl.first;
    loop
      dbms_output.put_line('**     '||l_cpx_tbl(l_idx).last_name
                         ||' email: '||l_cpx_tbl(l_idx).email_address);
      l_idx:= l_cpx_tbl.next(l_idx);
      exit when l_idx is null;
    end loop;
**     Jellema email: toby231@hitmail.com
**     Abramovic email: abra1@notfail.com
**     Rooney email: wayne@noavail.com
**     Basten email: bassie@oneteam.com

The sort logic makes use of a helper collection that contains records with two field: the value field on which the complex_type records should be sorted and a pointer field that refers to a position in the l_cpx_tbl.

  declare
    l_sort_tbl sortable_table:= sortable_table();
    l_idx integer;
  begin
    l_idx:= l_cpx_tbl.first;
    -- PREPARE L_SORT_TBL

    -- loop over all elements in the l_cpx_tbl collection
    -- that we want to sort. Use the last_name for every
    -- element in l_cpx_tbl as a key for the l_sort_tbl. Associate
    -- the key with a current element in the l_cpx_tbl collection.
    loop
      l_sort_tbl.extend;
      -- note: the first field in the sortable object is the expression used for sorting!!!
      --       the second field is a pointer to the current position of the element in the l_cpx_tbl
      l_sort_tbl( l_sort_tbl.last):= new sortable( l_cpx_tbl(l_idx).last_name, l_idx);
      l_idx:= l_cpx_tbl.next(l_idx);
      exit when l_idx is null;
    end loop;
 

The sort is again done by the SQL Engine. The sort is on the value field and the most important result is the pointers in the sorted order. 

   -- SORT L_SORT_TBL
   -- using the table operator we pretend that l_sort_tbl is a table-like-recordset;
   -- we can select records from this table-like-structure in an sequence determined by a
   -- normal SQL ORDER BY clause
   -- the result of that query is turned into a collection again, using MULTISET, and casted
   -- into the desired collection type, namely SORTABLE_TABLE
   select cast ( multiset( select *
                           from table( l_sort_tbl )
                           order by 1 -- note: here we have a very simple sort-operation;
                                      -- it could be much more complex obviously
                         ) as sortable_table)
   into l_sort_tbl
   from dual;
   -- at this point, l_sort_tbl is sorted

 

The set of pointers started out as 1,2,3,4 and is now set to 2,4,1,3:

   -- The hard work was done by the SQL engine!!
   -- brief inspection of the sort helper table l_sort_tbl:
      dbms_output.put_line('Brief look at the contents of the helper table l_sort_tbl:');
      l_idx:= l_sort_tbl.first;
      loop
        dbms_output.put_line('sort index: '||l_idx
                           ||' and sort value: '||l_sort_tbl(l_idx).value
                           ||' and pointer: '||l_sort_tbl(l_idx).pointer);
        l_idx:= l_sort_tbl.next(l_idx);
        exit when l_idx is null;
      end loop;

Brief look at the contents of the helper table l_sort_tbl:
sort index: 1 and sort value: Abramovic and pointer: 2
sort index: 2 and sort value: Basten and pointer: 4
sort index: 3 and sort value: Jellema and pointer: 1
sort index: 4 and sort value: Rooney and pointer: 3

At this point we can recreate the l_cpx_tbl collection with the elements in the proper order:

  declare
    l_ctr integer:=1;
    l_temp_cpx_tbl complex_type_table_type:= complex_type_table_type();
  begin
    l_idx:= l_sort_tbl.first;
    loop
      l_temp_cpx_tbl.extend;
      -- feed into the current slot the element that is at the pointer-indicated position in the original l_cpx_tbl
      l_temp_cpx_tbl(l_temp_cpx_tbl.last):= l_cpx_tbl(l_sort_tbl(l_idx).pointer);
      l_idx:= l_sort_tbl.next(l_idx);
      exit when l_idx is null;
    end loop;
    l_cpx_tbl.delete;
    l_cpx_tbl:= l_temp_cpx_tbl;
    -- if l_cpx_tbl was a large collection
    -- then the delete operation on the collection let go of a lot of memory
    -- that we should make available to the pool
    dbms_session.free_unused_user_memory;
    -- DONE! At this point, l_cpx_tbl is properly sorted
  end;

The result when printed to the output is what we were looking for: 

    dbms_output.put_line('**********************************************************');
    dbms_output.put_line('We proudly present the sorted results: ');
    l_idx:= l_cpx_tbl.first;
    loop
      dbms_output.put_line('**     '||l_cpx_tbl(l_idx).last_name
                         ||' email: '||l_cpx_tbl(l_idx).email_address);
      l_idx:= l_cpx_tbl.next(l_idx);
      exit when l_idx is null;
    end loop;
 end;
 /
**********************************************************
We proudly present the sorted results:
**     Abramovic email: abra1@notfail.com
**     Basten email: bassie@oneteam.com
**     Jellema email: toby231@hitmail.com
**     Rooney email: wayne@noavail.com

PL/SQL procedure successfully completed.

And we see that the sort was successful.

More advanced SQL operations on our collection

There is still quite a bit of PL/SQL code involved. Besides: we can still only sort on a single expression. And: if we leverage the SQL engine, can’t we make use of other SQL operations on our Collection? The answer is a resounding YES!

Let’s now look at a further integration with the SQL engine. Note: it is my expectation that given the more massive  data exchange from PL/SQL to SQL and back, this approach might have issues with performance and resource usage when the number or size of collection elements increases. Again, we should investigate this.

Since we want the SQL engine to have full access to the elements in the collection, we have to base our work on database types:

create type complex_type
is object
( last_name varchar2(50)
, street varchar2(50)
, city varchar2(50)
, birthdate date
, zipcode varchar2(15)
, email_address varchar2(50)
, many_more_elements varchar2(4000)
)
/

create type complex_type_table_type
is table of complex_type
/
 

Now the code to initializes the collection is slightly changed. Note though how similar the PL/SQL RECORD and SQL OBJECT types are!

declare
    -- define and initialize a complex Nested Table Collection
    l_cpx_tbl complex_type_table_type:= complex_type_table_type();
  begin
    -- first populate the collection with a few values.
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= complex_type('Jellema','Rembrandtlaan', 'Enschede', sysdate - 36*365, '7545 ZJ','toby231@hitmail.com',null );
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= complex_type('Abramovic','Chelsea Road', 'London', sysdate - 43*365, 'MSCW','abra1@notfail.com',null );
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= complex_type('Rooney','ManYou 32', 'Manchester', sysdate - 23*362, '32121','wayne@noavail.com',null );
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= complex_type('Basten','Russia Park 1988', 'Utrecht', sysdate - 39*365+43, '2006 WK','bassie@oneteam.com',null );
 

Now that we have our collection in a format understood by the SQL engine, operating on it becomes a breeze. Let’s look at an example that just sorts:

declare
    ...
    l_idx integer;
    l_sort_idx varchar2(50);
    l_cpx_rec complex_type;
  begin
  ...
    -- have the SQL engine do the heavy lifting in terms of sorting the collection!
    select cast ( multiset( select complex_type
                                   ( last_name
                                   , street
                                   , city
                                   , birthdate
                                   , zipcode
                                   , email_address
                                   , many_more_elements )
                            from table( l_cpx_tbl )
                            order by last_name -- note: here we have a very simple sort-operation;
                                               -- it could be much more complex obviously
                        ) as complex_type_table_type)
    into l_cpx_tbl
    from dual;
    -- DONE! At this point, l_cpx_tbl is properly sorted
 

The results of this operation look like this:

    dbms_output.put_line('**********************************************************');
    dbms_output.put_line('We proudly present the sorted results: ');
    l_idx:= l_cpx_tbl.first;
    loop
      dbms_output.put_line('**     '||l_cpx_tbl(l_idx).last_name
                         ||' email: '||l_cpx_tbl(l_idx).email_address);
      l_idx:= l_cpx_tbl.next(l_idx);
      exit when l_idx is null;
    end loop;

  end;
/
**********************************************************
We proudly present the sorted results:
**     Abramovic email: abra1@notfail.com
**     Basten email: bassie@oneteam.com
**     Jellema email: toby231@hitmail.com
**     Rooney email: wayne@noavail.com

PL/SQL procedure successfully completed.

A more advanced SQL operation on the same collection is just as easy to implement:

    -- have the SQL engine do the heavy lifting in terms of sorting the collection!
    select cast ( multiset(  select complex_type
                                    ( last_name
                                    , street
                                    , city
                                    , birthdate
                                    , zipcode
                                    , email_address
                                    , many_more_elements
                                      ||' '||runcount
                                    )
                             from   ( select ct.*
                                      ,      count(*) over
                                                      (order by last_name desc) runcount
                                      from   table( l_cpx_tbl ) ct
                                      where  city <> 'London'
                                      order
                                      by     city
                                      ,      length(last_name)
                                      ,      sysdate - birthdate
                                      ,      2
                                    )
                        ) as complex_type_table_type)
    into l_cpx_tbl
    from dual;
    -- DONE! At this point, l_cpx_tbl is properly sorted
 

and the results are:

    dbms_output.put_line('**********************************************************');
    dbms_output.put_line('We proudly present the sorted results: ');
    l_idx:= l_cpx_tbl.first;
    loop
      dbms_output.put_line('**     '||l_cpx_tbl(l_idx).last_name
                         ||' city: '||l_cpx_tbl(l_idx).city
                         ||' email: '||l_cpx_tbl(l_idx).email_address
                         ||' birthdate: '||l_cpx_tbl(l_idx).birthdate
                         ||' many_more_elements: '||l_cpx_tbl(l_idx).many_more_elements
                         );
      l_idx:= l_cpx_tbl.next(l_idx);
      exit when l_idx is null;
    end loop;
  end;
/
 **********************************************************
We proudly present the sorted results:
**     Jellema city: Enschede email: toby231@hitmail.com birthdate: 09-JUN-70 many_more_elements:  2
**     Rooney city: Manchester email: wayne@noavail.com birthdate: 14-AUG-83 many_more_elements:  1
**     Basten city: Utrecht email: bassie@oneteam.com birthdate: 23-JUL-67 many_more_elements:  3

PL/SQL procedure successfully completed.

 

Resources

Download the source code for this article: plsqlCollectionSortSQLScripts.sql

 

Share.

About Author

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director for Fusion Middleware. Consultant, trainer and instructor on diverse areas including Oracle Database (SQL & PLSQL), Service Oriented Architecture, BPM, ADF, Java in various shapes and forms and many other things. Author of the Oracle Press book: Oracle SOA Suite 11g Handbook. Frequent presenter on conferences such as JavaOne, Oracle OpenWorld, ODTUG Kaleidoscope, Devoxx and OBUG. Presenter for Oracle University Celebrity specials.

10 Comments

  1. Hi Lucas
    Great article and examples -thanks.
    One issue though, I got his working for my needs on a relatively small dataset but when I tried to scale it up to sort a pl/sql table of around 4 million rows I received the following error:
    ORA-22813: operand value exceeds system limits
    This error arises on the following statement.
    select cast(multiset(select t_pair
    (actual,
    predicted,
    original_index)
    from table(l_tbl)
    order by predicted desc, original_index asc) as t_pairs)
    into l_tbl
    from dual;
    I know that you said that you hadn’t tested it with a large data set, I guess i just did.   Any suggestions on to sorting larger pl/sql tables?
    Thanks
    shab

  2. Combining this with analytical functions is very useful.  I am using the technique in a modeling package that uses bulk inserts to speed the processing of large volumes of rows (35 million).  By using this method I can get the analytical function results without having to go to the DB a second time for updates after the inital inserts.

  3. Thanks for your simple collection sorting method. I have been looking for a way to do this without relying on an index.

    It’s so hard to find clear solutions for what should be simple Oracle tasks. No wonder I usually do things like this on the Java application side.

  4. Patrick Sinke on

    I also encountered a way to do it without the necessity to create a type outside the package (which is annoying):

    DECLARE
    TYPE typ_sorttabIS TABLE OF NUMBER(1) index by VARCHAR2(100) ;
    l_order_tab typ_sorttab;
    ln_pos NUMBER(8) ;
    lv_input VARCHAR2(100) ;
    lv_input_string varchar2(100) ;
    lb_doorgaan boolean default TRUE;
    lv_result VARCHAR2(100) := ‘;'; — string begint met een ‘;’
    BEGIN
    lv_input_string := ‘;BLA;DIE;BLOE;TEST;’
    — 1e gedeelte leest de inputstring in een array
    ln_pos := instr(lv_input_string, ‘;’);
    WHILE lb_doorgaan
    LOOP
    lv_input := substr(lv_input_string, 1, ln_pos-1) ; — alles tot eerstvolgende scheidingsteken
    IF lv_input is not null
    THEN
    — deze controle is nodig omdat de eerste keer een lege string zal opleveren
    l_order_tab(lv_input) := 1 ;
    END IF;
    lv_input_string := substr(lv_input_string, ln_pos + 1) ; — input_string min input
    ln_pos := instr(lv_input_string, ‘;’) ;– volgende scheidingsteken
    lb_doorgaan := ln_pos > 0 ; — scheidingstekens nog aanwezig
    END LOOP ;
    — 2e gedeelte voor output
    lv_input := l_order_tab.first ;
    WHILE lv_input IS NOT NULL
    LOOP
    lv_result := lv_result || lv_input || ‘;’ ;
    lv_input := l_order_tab.next(lv_input) ;
    END LOOP ;

    end;

    The trick is too create a table of number with index on the field to be sorted. In this example the array is filled manually from a comma-separated string and sorted by looping through the array in order of index.

  5. Kate Johnson on

    Thank you so much for this article, I have been hunting for how to properly use TABLE/CAST and now you made it even easier with TABLE/CAST/MULTISET. Thank you very much. 2 days of searching is now over.

    Next, INSERT/UPDATE/DELETE examples….

  6. Great !
    Another solution to sort a collection of records by using the . (dot) notation:

    create or replace TYPE typ_emp_rec AS OBJECT( empno NUMBER(4), ENAME VARCHAR(10) )
    /

    create or replace TYPE typ_tab_emp_rec AS TABLE OF typ_emp_rec
    /

    SQL> declare
    2 tremp typ_tab_emp_rec := typ_tab_emp_rec( null);
    3 tremp2 typ_tab_emp_rec := typ_tab_emp_rec( null);
    4 nb pls_integer := 0 ;
    5 begin
    6
    7 tremp.extend(5) ;
    8 for i In reverse 1..5 loop
    9 nb := nb + 1 ;
    10 tremp(nb) := new typ_emp_rec(i, ‘Name’ || to_char(i) ) ;
    11 end loop ;
    12
    13 — output the collection non-sorted —
    14 dbms_output.put_line( ‘*** non-sorted collection ***’ ) ;
    15 for i In tremp.first .. tremp.last loop
    16 dbms_output.put_line( tremp(i).ename ) ;
    17 end loop ;
    18
    19 select cast ( multiset( select t.*
    20 from table( tremp ) t
    21 order by t.ename
    22 ) as typ_tab_emp_rec)
    23 into tremp2
    24 from dual;
    25 — output the collection sorted on ename —
    26 dbms_output.put_line( ‘*** sorted collection (field 2) ***’ ) ;
    27 for i In tremp2.first .. tremp2.last loop
    28 dbms_output.put_line( tremp2(i).ename ) ;
    29 end loop ;
    30 end ;
    31
    32 /
    *** non-sorted collection ***
    Name5
    Name4
    Name3
    Name2
    Name1
    *** sorted collection (field 2) ***
    Name1
    Name2
    Name3
    Name4
    Name5

    Procédure PL/SQL terminée avec succès.