Sorting PL/SQL Collections, the hard way, the intermediate way and the quite simple way (part one)

Bubble Sort, Quick Sort, Insertion Sort, Shuttle Sort… When time is short, these are not the easiest ways of implementing sort operations on your PL/SQL Collections. In this brief article – or at least I will try to be brief – we will discuss two smarter ways of sorting collections than using these tried and tested algoritms. The first makes use of the fact that the PL/SQL engine will keep the keys of any Associative Array sorted in natural order, automatically. The second is a little more advanced and uses the MULTISET and CAST operators to have the SQL Engine do the sort for us – in even less code.

Note: Alex and I are doing a Quiz on SQL and PL/SQL at ODTUG 2006 in Washington next month. Yesterday we had a try-out of this quiz – 21 questions and some 20 odd contestants. The quiz and all demonstrations went okay, but we learned we had to shed at least 8 questions. The one we had one sorting PL/SQL Collections was one of the ones to – hence I can share it with you on the weblog. Of course the remaining 13 or so questions are not on the weblog until after we run the quiz in DC. However, you might care to take a look at the accompanying paper that you can find here: http://www.amis.nl/tech_artikelen.php?id=340 .

 

I assume you as a reader are familiar with PL/SQL Collections, so I will not spend any time on further introducing them to you. Let’s create ourselves a simple collection:

l_num_tbl num_tbl:= num_tbl(6,12,9,1,54,21, 11, 2);

Note that I am relying on a type NUM_TBL that was created earlier in our database with the following statement.

create type num_tbl is table of number
/

Note that for the first approach towards sorting the collection, the type could have been either a database type or a pl/sql type. The second approach that we will discuss later mandates the use of a database type.

If we write a little PL/SQL snippet that writes the contents of the collection to the output, we notice its hideous un-sortedness:

declare
   l_num_tbl num_tbl:= num_tbl(6,12,9,1,54,21, 11, 2);
   l_idx integer;
 begin
   l_idx:= l_num_tbl.first;
   loop
     dbms_output.put_line(l_num_tbl(l_idx));
     l_idx:= l_num_tbl.next(l_idx);
     exit when l_idx is null;
   end loop;
end;         

SQL> set serveroutput on size 99999
SQL> /
6
12
9
1
54
21
11
2

That is what we want to fix! We want it in proper order.

 

Sorting Collections using an Associative Array

You may or not know – but it is the truth all the same – that when you populate an Associative Array (the thing that once upon a time was called PL/SQL Table and Index By Table), Oracle will keep the keys in sorted order. If our associative array is index(ed) by varchar2 that means the keys are in alphabetical order and when the array is index(ed) by any of the numeric types, its keys are kept in natural order from smallest to largest value.

We can easily leverage this fact or feature – but not fiction – in the following way:

 declare
    -- define and initialize a simple Nested Table Collection
    l_num_tbl num_tbl:= num_tbl(6,12,9,1,54,21, 11, 2);
    l_idx integer;
  begin
    declare
      -- here is where the sorting magic starts
      -- we create an Associative Array that is indexed by binary_integer
      -- we know that this collection's keys (the only thing we care about in this case)
      -- will always be kept in sorted order
      type num_aat_t is table of number index by binary_integer;
      l_num_aat num_aat_t;
    begin
      l_idx:= l_num_tbl.first;
      -- loop over all elements in the l_num_tbl collection 
      -- that we want to sort. Use every element in l_num_tbl
      -- as a key for the l_num_aat associative array. Associate
      -- the key with a meaningless value; we do not care about
      -- the value in this case.
      loop
        l_num_aat( l_num_tbl(l_idx)):=0;
        l_idx:= l_num_tbl.next(l_idx);
        exit when l_idx is null;
      end loop;
      -- remove all elements from l_num_tbl
      l_num_tbl.delete;
      -- start repopulating l_num_tbl - in the proper order -
      -- from the sorted collection of keys in the l_num_aat collection
      l_idx:= l_num_aat.first;
      loop
        l_num_tbl.extend;
        l_num_tbl(l_num_tbl.last):= l_idx;
        l_idx:= l_num_aat.next(l_idx);
        exit when l_idx is null;
      end loop;
    end;
    -- DONE! At this point, l_num_tbl is properly sorted
    l_idx:= l_num_tbl.first;
    loop
      dbms_output.put_line('**     '||l_num_tbl(l_idx));
      l_idx:= l_num_tbl.next(l_idx);
      exit when l_idx is null;
    end loop;
 end;
 /
**     1
**     2
**     6
**     9
**     11
**     12
**     21
**     54

PL/SQL procedure successfully completed.

Note that this time the values are written in the proper, sorted sequence.

Of course it would be interesting to apply this technique to a more challenging collection, both in terms of the complexity of the collection members as well as the sheer size of the collection. Let’s start with the first.

Sorting a Collection of Complex Records

We create a more complex record type and a collection of those records:

    type complex_type is record ( last_name varchar2(50)
                                , street varchar2(50)
                                , city varchar2(50)
                                , birthdate date
                                , zipcode varchar2(15)
                                , email_address varchar2(50)
                                , many_more_elements varchar2(4000)
                                );
    type complex_type_table_type is table of complex_type;
    -- define and initialize a complex Nested Table Collection
    l_cpx_tbl complex_type_table_type:= complex_type_table_type();

We populate the l_cpx_tbl collection with a number of complex_type records:

    l_cpx_rec complex_type;
  begin
    -- first populate the collection with a few values.
    l_cpx_rec.last_name:= 'Jellema';
    l_cpx_rec.street:= 'Rembrandtlaan';
    l_cpx_rec.email_address:= 'toby231@hitmail.com';
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;
    l_cpx_rec.last_name:= 'Abramovic';
    l_cpx_rec.street:= 'Chelsea Road';
    l_cpx_rec.email_address:= 'abra1@notfail.com';
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;
    l_cpx_rec.last_name:= 'Rooney';
    l_cpx_rec.street:= 'ManYou 32';
    l_cpx_rec.email_address:= 'wayne@noavail.com';
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;
    l_cpx_rec.last_name:= 'Basten';
    l_cpx_rec.street:= 'Russia Park 1988';
    l_cpx_rec.email_address:= 'bassie@oneteam.com';
    l_cpx_tbl.extend;
    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;

Now we want to sort this collection on the last_name property in the records. The way we do this is again through the use of an Associative Array.

    declare
      -- here is where the sorting magic starts
      -- we create an Associative Array that is indexed by 
      -- varchar2 as we want to
sort by the last_name
     -- we know that this collection's keys 
      -- will always be kept in sorted order
      type sort_aat_t is table of complex_type index by varchar2(50);
      l_sort_aat sort_aat_t;
    begin
      l_idx:= l_cpx_tbl.first;
      -- loop over all elements in the l_cpx_tbl collection 
      -- that we want to sort. Use the last_name for every 
      -- element in l_cpx_tbl as a key for the l_sort_aat associative array. Associate
      -- the key with a current element in the l_cpx_tbl collection.
      loop
        l_sort_aat( l_cpx_tbl(l_idx).last_name):= l_cpx_tbl(l_idx);
        l_idx:= l_cpx_tbl.next(l_idx);
        exit when l_idx is null;
      end loop;

When we inspect the contents of the l_sort_aat array, these are the results:

      -- brief inspection of the sort helper table l_sort_aat:
      dbms_output.put_line('Brief look at the contents of the helper table l_sort_aat:');
      l_sort_idx:= l_sort_aat.first;
      loop
        -- iterate one step in both l_cpx_tbl and l_sort_aat
        dbms_output.put_line('sort index: '||l_sort_idx||' and sort value: '||l_sort_aat(l_sort_idx).last_name);
        l_sort_idx:= l_sort_aat.next(l_sort_idx);        
        exit when l_sort_idx is null;
      end loop;

Brief look at the contents of the helper table l_sort_aat:
sort index: Abramovic and sort value: Abramovic
sort index: Basten and sort value: Basten
sort index: Jellema and sort value: Jellema
sort index: Rooney and sort value: Rooney

Now it is easy to re-instantiate l_cpx_tbl with the properly sorted collection of COMPLEX_TYPE records:

      -- remove all elements from l_cpx_tbl
      l_cpx_tbl.delete;
      -- start repopulating l_cpx_tbl - in the proper order -
      -- from the sorted collection of keys in the l_num_aat collection
      l_sort_idx:= l_sort_aat.first;
      loop
        l_cpx_tbl.extend;
        l_cpx_tbl(l_cpx_tbl.last):= l_sort_aat(l_sort_idx);
        l_sort_idx:= l_sort_aat.next(l_sort_idx);        
        exit when l_sort_idx is null;
      end loop;
    end;

Take a look at the end result:

    -- DONE! At this point, l_cpx_tbl is properly sorted
    dbms_output.put_line('**********************************************************');
    dbms_output.put_line('We proudly present the sorted results: ');
    l_idx:= l_cpx_tbl.first;
    loop
      dbms_output.put_line('**     '||l_cpx_tbl(l_idx).last_name
                         ||' email: '||l_cpx_tbl(l_idx).email_address);
      l_idx:= l_cpx_tbl.next(l_idx);
      exit when l_idx is null;
    end loop;

**********************************************************
We proudly present the sorted results:
**     Abramovic email: abra1@notfail.com
**     Basten email: bassie@oneteam.com
**     Jellema email: toby231@hitmail.com
**     Rooney email: wayne@noavail.com

PL/SQL procedure successfully completed.

Lower memory usage and better performance through ‘reference swapping’

Although the method described here is very code efficient, its run-time characteristics may be slightly open for improvement – especially when the records get larger and more complex. Note: at this point I have not done any research into resource usage, I am merely hazarding a guess. The next approach is a small extension of what we did before – this time using references or pointers into the l_cpx_table and swapping those around. It may be that this is extra complexity that does not gain us anything – I still have to find out. I happen to know we will use this pointer-style swapping approach with the MULTISET CAST way of sorting that I will describe in a second installment.

Again we use an Associative Array, indexed by VARCHAR2. We populate this array with the last_name values as keys and the current position in the l_cpx_tbl collection as the values:

    declare
      -- here is where the sorting magic starts
      -- we create an Associative Array that is indexed by 
      -- varchar2 as we want to sort by the last_name
      -- we know that this collection's keys (the only thing we care about in this case)
      -- will always be kept in sorted order
      type sort_aat_t is table of number index by varchar2(50);
      l_sort_aat sort_aat_t;
    begin
      l_idx:= l_cpx_tbl.first;
      -- loop over all elements in the l_cpx_tbl collection 
      -- that we want to sort. Use the last_name for every 
      -- element in l_cpx_tbl as a key for the l_sort_aat associative array. Associate
      -- the key with a current position of the element in the l_cpx_tbl collection.
      loop
        l_sort_aat( l_cpx_tbl(l_idx).last_name):= l_idx;
        l_idx:= l_cpx_tbl.next(l_idx);
        exit when l_idx is null;
      end loop;

When we write the contents of the l_sort_aat table to the output after the previous code fragment, it looks like this:

      -- brief inspection of the sort helper table l_sort_aat:
      dbms_output.put_line('Brief look at the contents of the helper table l_sort_aat:');
      l_sort_idx:= l_sort_aat.first;
      loop
        -- iterate one step in both l_cpx_tbl and l_sort_aat
        dbms_output.put_line('sort index: '||l_sort_idx||' and sort value: '||l_sort_aat(l_sort_idx));
        l_sort_idx:= l_sort_aat.next(l_sort_idx);        
        exit when l_sort_idx is null;
      end loop;

Brief look at the contents of the helper table l_sort_aat:
sort index: Abramovic and sort value: 2
sort index: Basten and sort value: 4
sort index: Jellema and sort value: 1
sort index: Rooney and sort value: 3

This looks a little like the next figure.

Sorting PL/SQL Collections, the hard way, the intermediate way and the quite simple way (part one) plsqlCollSort1

The next piece of code demonstrates the real sorting core. Here is where based on the ordering of keys of the complex_type records in the l_sort_aat table we will now relocate the elements in the l_cpx_tbl:

      -- iterate through l_sort_aat and use it to start swapping around 
      -- the elements in the l_cpx_tbl
      dbms_output.put_line('Iterate through l_sort_aat and use it to start swapping around in l_cpx_tbl:');
      l_sort_idx:= l_sort_aat.first;
      l_idx:= l_cpx_tbl.first;
      loop
        -- make the swap:
        -- 1. first copy the value in the slot to be overwritten 
        --    in l_cpx_tbl to the local placeholder
        l_cpx_rec:= l_cpx_tbl(l_idx);
        dbms_output.put_line('swapped out '||l_cpx_rec.last_name||' from slot '||l_idx);

        -- 2. put the value indicated by the current (index)value 
        --    in the sorted aat into the current slot in l_cpx_tbl
        l_cpx_tbl(l_idx):= l_cpx_tbl(l_sort_aat(l_sort_idx));
        dbms_output.put_line('swapped in '||l_cpx_tbl(l_sort_aat(l_sort_idx)).last_name||' => into slot '||l_idx);
        -- 3. copy the evacuated value from l_cpx_rec to the now free
        --    slot where the swapped value came from
        l_cpx_tbl(l_sort_aat(l_sort_idx)):= l_cpx_rec;
        -- and update the value (pointer) in l_sort_aat:
        l_sort_aat(l_cpx_rec.last_name):= l_sort_aat(l_sort_idx);
        -- iterate one step in both l_cpx_tbl and l_sort_aat
        l_sort_idx:= l_sort_aat.next(l_sort_idx);
        l_idx:= l_cpx_tbl.next(l_idx);
        dbms_output.put_line('l_idx '||l_idx);
        exit when l_sort_idx is null;
      end loop;
    end;
    -- DONE! At this point, l
_cp
x_tbl is properly sorted

Output:
Iterate through l_sort_aat and use it to start swapping around in l_cpx_tbl:
swapped out Jellema from slot 1
swapped in Abramovic => into slot 1
l_idx 2
swapped out Jellema from slot 2
swapped in Basten => into slot 2
l_idx 3
swapped out Rooney from slot 3
swapped in Jellema => into slot 3
l_idx 4
swapped out Rooney from slot 4
swapped in Rooney => into slot 4
l_idx

After the first swap, our figure would be changed to:

Sorting PL/SQL Collections, the hard way, the intermediate way and the quite simple way (part one) plsqlCollSort2

After this is done, here is what the sorted collection looks like:

    dbms_output.put_line('**********************************************************');
    dbms_output.put_line('We proudly present the sorted results: ');
    l_idx:= l_cpx_tbl.first;
    loop
      dbms_output.put_line('**     '||l_cpx_tbl(l_idx).last_name
                         ||' email: '||l_cpx_tbl(l_idx).email_address);
      l_idx:= l_cpx_tbl.next(l_idx);
      exit when l_idx is null;
    end loop;

**********************************************************
We proudly present the sorted results:
**     Abramovic email: abra1@notfail.com
**     Basten email: bassie@oneteam.com
**     Jellema email: toby231@hitmail.com
**     Rooney email: wayne@noavail.com

As we had hoped for – and of course I had tried it out before writing this article – the records have properly sorted. Not by me – by the PL/SQL engine that keeps the keys of Associative Arrays always sorted.

Resources

Download code: plsqlCollectionSortScripts.sql.

Francois Degrelle wrote an article on a very similar topic, also using the built-in sorting of the keys of an Associative Array, called: see Using a collection to sort a string (http://fdegrelle.over-blog.com/categorie-266041.html) . It is amazing how people can have so similar thoughts at virtually the same time.

Sometimes I hate Google. I really had come up with all the ideas in this article by myself. But I decided to Google on PL/SQL Collection and Sort. And what do you know: I also ran into CASTing About For a Solution: Using CAST and Table Functions in PL/SQL
by Jim Czuprynski at http://www.dbasupport.com/oracle/ora9i/cast.shtml . I would almost start to think myself I am a little of a plagiarist when of course I am not.

The next installment in this series: Sorting PL/SQL Collections, the quite simple way (part two: Have the SQL Engine do the heavy lifting) at https://technology.amis.nl/blog/?p=1217

9 Comments

  1. Michael September 20, 2007
  2. Vizith February 26, 2007
  3. Miquel November 1, 2006
  4. Ziya Åženol August 3, 2006
  5. firefight July 9, 2006
  6. Aino May 31, 2006
  7. Lucas Jellema May 31, 2006
  8. Francois May 31, 2006
  9. Gary May 31, 2006