Sorting PL/SQL Collections, the hard way, the intermediate way and the quite simple way (part one)

9

Bubble Sort, Quick Sort, Insertion Sort, Shuttle Sort… When time is short, these are not the easiest ways of implementing sort operations on your PL/SQL Collections. In this brief article – or at least I will try to be brief – we will discuss two smarter ways of sorting collections than using these tried and tested algoritms. The first makes use of the fact that the PL/SQL engine will keep the keys of any Associative Array sorted in natural order, automatically. The second is a little more advanced and uses the MULTISET and CAST operators to have the SQL Engine do the sort for us – in even less code.

Note: Alex and I are doing a Quiz on SQL and PL/SQL at ODTUG 2006 in Washington next month. Yesterday we had a try-out of this quiz – 21 questions and some 20 odd contestants. The quiz and all demonstrations went okay, but we learned we had to shed at least 8 questions. The one we had one sorting PL/SQL Collections was one of the ones to – hence I can share it with you on the weblog. Of course the remaining 13 or so questions are not on the weblog until after we run the quiz in DC. However, you might care to take a look at the accompanying paper that you can find here: http://www.amis.nl/tech_artikelen.php?id=340 .

....

I assume you as a reader are familiar with PL/SQL Collections, so I will not spend any time on further introducing them to you. Let’s create ourselves a simple collection:

l_num_tbl num_tbl:= num_tbl(6,12,9,1,54,21, 11, 2); 

Note that I am relying on a type NUM_TBL that was created earlier in our database with the following statement.

create type num_tbl is table of number<br />/<br />&nbsp;

Note that for the first approach towards sorting the collection, the type could have been either a database type or a pl/sql type. The second approach that we will discuss later mandates the use of a database type.

If we write a little PL/SQL snippet that writes the contents of the collection to the output, we notice its hideous un-sortedness:

declare<br />   l_num_tbl num_tbl:= num_tbl(6,12,9,1,54,21, 11, 2);<br />   l_idx integer;<br /> begin<br />   l_idx:= l_num_tbl.first;<br />   loop<br />     dbms_output.put_line(l_num_tbl(l_idx));<br />     l_idx:= l_num_tbl.next(l_idx);<br />     exit when l_idx is null;<br />   end loop;<br />end;         <br /><br />SQL&gt; set serveroutput on size 99999<br />SQL&gt; /<br />6<br />12<br />9<br />1<br />54<br />21<br />11<br />2 <br />

That is what we want to fix! We want it in proper order. 

 

Sorting Collections using an Associative Array

You may or not know – but it is the truth all the same – that when you populate an Associative Array (the thing that once upon a time was called PL/SQL Table and Index By Table), Oracle will keep the keys in sorted order. If our associative array is index(ed) by varchar2 that means the keys are in alphabetical order and when the array is index(ed) by any of the numeric types, its keys are kept in natural order from smallest to largest value.

We can easily leverage this fact or feature – but not fiction – in the following way:

 declare<br />    -- define and initialize a simple Nested Table Collection<br />    l_num_tbl num_tbl:= num_tbl(6,12,9,1,54,21, 11, 2);<br />    l_idx integer;<br />  begin<br />    declare<br />      -- here is where the sorting magic starts<br />      -- we create an Associative Array that is indexed by binary_integer<br />      -- we know that this collection's keys (the only thing we care about in this case)<br />      -- will always be kept in sorted order<br />      type num_aat_t is table of number index by binary_integer;<br />      l_num_aat num_aat_t;<br />    begin<br />      l_idx:= l_num_tbl.first;<br />      -- loop over all elements in the l_num_tbl collection <br />      -- that we want to sort. Use every element in l_num_tbl<br />      -- as a key for the l_num_aat associative array. Associate<br />      -- the key with a meaningless value; we do not care about<br />      -- the value in this case.<br />      loop<br />        l_num_aat( l_num_tbl(l_idx)):=0;<br />        l_idx:= l_num_tbl.next(l_idx);<br />        exit when l_idx is null;<br />      end loop;<br />      -- remove all elements from l_num_tbl<br />      l_num_tbl.delete;<br />      -- start repopulating l_num_tbl - in the proper order -<br />      -- from the sorted collection of keys in the l_num_aat collection<br />      l_idx:= l_num_aat.first;<br />      loop<br />        l_num_tbl.extend;<br />        l_num_tbl(l_num_tbl.last):= l_idx;<br />        l_idx:= l_num_aat.next(l_idx);<br />        exit when l_idx is null;<br />      end loop;<br />    end;<br />    -- DONE! At this point, l_num_tbl is properly sorted<br />    l_idx:= l_num_tbl.first;<br />    loop<br />      dbms_output.put_line('**     '||l_num_tbl(l_idx));<br />      l_idx:= l_num_tbl.next(l_idx);<br />      exit when l_idx is null;<br />    end loop;<br /> end;<br /> /<br />**     1<br />**     2<br />**     6<br />**     9<br />**     11<br />**     12<br />**     21<br />**     54<br /><br />PL/SQL procedure successfully completed. <br />

Note that this time the values are written in the proper, sorted sequence. 

Of course it would be interesting to apply this technique to a more challenging collection, both in terms of the complexity of the collection members as well as the sheer size of the collection. Let’s start with the first. 

Sorting a Collection of Complex Records

We create a more complex record type and a collection of those records:

    type complex_type is record ( last_name varchar2(50)<br />                                , street varchar2(50)<br />                                , city varchar2(50)<br />                                , birthdate date<br />                                , zipcode varchar2(15)<br />                                , email_address varchar2(50)<br />                                , many_more_elements varchar2(4000)<br />                                );<br />    type complex_type_table_type is table of complex_type;<br />    -- define and initialize a complex Nested Table Collection<br />    l_cpx_tbl complex_type_table_type:= complex_type_table_type();<br />&nbsp;

We populate the l_cpx_tbl collection with a number of complex_type records:

    l_cpx_rec complex_type;<br />  begin<br />    -- first populate the collection with a few values.<br />    l_cpx_rec.last_name:= 'Jellema';<br />    l_cpx_rec.street:= 'Rembrandtlaan';<br />    l_cpx_rec.email_address:= 'toby231@hitmail.com';<br />    l_cpx_tbl.extend;<br />    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;<br />    l_cpx_rec.last_name:= 'Abramovic';<br />    l_cpx_rec.street:= 'Chelsea Road';<br />    l_cpx_rec.email_address:= 'abra1@notfail.com';<br />    l_cpx_tbl.extend;<br />    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;<br />    l_cpx_rec.last_name:= 'Rooney';<br />    l_cpx_rec.street:= 'ManYou 32';<br />    l_cpx_rec.email_address:= 'wayne@noavail.com';<br />    l_cpx_tbl.extend;<br />    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;<br />    l_cpx_rec.last_name:= 'Basten';<br />    l_cpx_rec.street:= 'Russia Park 1988';<br />    l_cpx_rec.email_address:= 'bassie@oneteam.com';<br />    l_cpx_tbl.extend;<br />    l_cpx_tbl(l_cpx_tbl.last):= l_cpx_rec;<br />&nbsp;

Now we want to sort this collection on the last_name property in the records. The way we do this is again through the use of an Associative Array.

    declare<br />      -- here is where the sorting magic starts<br />      -- we create an Associative Array that is indexed by <br />      -- varchar2 as we want to
sort by the last_name<br />
     -- we know that this collection's keys <br />      -- will always be kept in sorted order<br />      type sort_aat_t is table of complex_type index by varchar2(50);<br />      l_sort_aat sort_aat_t;<br />    begin<br />      l_idx:= l_cpx_tbl.first;<br />      -- loop over all elements in the l_cpx_tbl collection <br />      -- that we want to sort. Use the last_name for every <br />      -- element in l_cpx_tbl as a key for the l_sort_aat associative array. Associate<br />      -- the key with a current element in the l_cpx_tbl collection.<br />      loop<br />        l_sort_aat( l_cpx_tbl(l_idx).last_name):= l_cpx_tbl(l_idx);<br />        l_idx:= l_cpx_tbl.next(l_idx);<br />        exit when l_idx is null;<br />      end loop;<br />&nbsp;

When we inspect the contents of the l_sort_aat array, these are the results:

      -- brief inspection of the sort helper table l_sort_aat:<br />      dbms_output.put_line('Brief look at the contents of the helper table l_sort_aat:');<br />      l_sort_idx:= l_sort_aat.first;<br />      loop<br />        -- iterate one step in both l_cpx_tbl and l_sort_aat<br />        dbms_output.put_line('sort index: '||l_sort_idx||' and sort value: '||l_sort_aat(l_sort_idx).last_name);<br />        l_sort_idx:= l_sort_aat.next(l_sort_idx);        <br />        exit when l_sort_idx is null;<br />      end loop;<br /><br /><p>Brief look at the contents of the helper table l_sort_aat:<br />sort index: Abramovic and sort value: Abramovic<br />sort index: Basten and sort value: Basten<br />sort index: Jellema and sort value: Jellema<br />sort index: Rooney and sort value: Rooney <br /></p>

Now it is easy to re-instantiate l_cpx_tbl with the properly sorted collection of COMPLEX_TYPE records:

      -- remove all elements from l_cpx_tbl<br />      l_cpx_tbl.delete;<br />      -- start repopulating l_cpx_tbl - in the proper order -<br />      -- from the sorted collection of keys in the l_num_aat collection<br />      l_sort_idx:= l_sort_aat.first;<br />      loop<br />        l_cpx_tbl.extend;<br />        l_cpx_tbl(l_cpx_tbl.last):= l_sort_aat(l_sort_idx);<br />        l_sort_idx:= l_sort_aat.next(l_sort_idx);        <br />        exit when l_sort_idx is null;<br />      end loop;<br />    end;<br /><br />

Take a look at the end result:

    -- DONE! At this point, l_cpx_tbl is properly sorted<br />    dbms_output.put_line('**********************************************************');<br />    dbms_output.put_line('We proudly present the sorted results: ');<br />    l_idx:= l_cpx_tbl.first;<br />    loop<br />      dbms_output.put_line('**     '||l_cpx_tbl(l_idx).last_name<br />                         ||' email: '||l_cpx_tbl(l_idx).email_address);<br />      l_idx:= l_cpx_tbl.next(l_idx);<br />      exit when l_idx is null;<br />    end loop;<br /><br />**********************************************************<br />We proudly present the sorted results:<br />**     Abramovic email: abra1@notfail.com<br />**     Basten email: bassie@oneteam.com<br />**     Jellema email: toby231@hitmail.com<br />**     Rooney email: wayne@noavail.com<br /><br />PL/SQL procedure successfully completed. <br />

Lower memory usage and better performance through ‘reference swapping’

Although the method described here is very code efficient, its run-time characteristics may be slightly open for improvement – especially when the records get larger and more complex. Note: at this point I have not done any research into resource usage, I am merely hazarding a guess. The next approach is a small extension of what we did before – this time using references or pointers into the l_cpx_table and swapping those around. It may be that this is extra complexity that does not gain us anything – I still have to find out. I happen to know we will use this pointer-style swapping approach with the MULTISET CAST way of sorting that I will describe in a second installment.

Again we use an Associative Array, indexed by VARCHAR2. We populate this array with the last_name values as keys and the current position in the l_cpx_tbl collection as the values:

    declare<br />      -- here is where the sorting magic starts<br />      -- we create an Associative Array that is indexed by <br />      -- varchar2 as we want to sort by the last_name<br />      -- we know that this collection's keys (the only thing we care about in this case)<br />      -- will always be kept in sorted order<br />      type sort_aat_t is table of number index by varchar2(50);<br />      l_sort_aat sort_aat_t;<br />    begin<br />      l_idx:= l_cpx_tbl.first;<br />      -- loop over all elements in the l_cpx_tbl collection <br />      -- that we want to sort. Use the last_name for every <br />      -- element in l_cpx_tbl as a key for the l_sort_aat associative array. Associate<br />      -- the key with a current position of the element in the l_cpx_tbl collection.<br />      loop<br />        l_sort_aat( l_cpx_tbl(l_idx).last_name):= l_idx;<br />        l_idx:= l_cpx_tbl.next(l_idx);<br />        exit when l_idx is null;<br />      end loop;<br />&nbsp;

When we write the contents of the l_sort_aat table to the output after the previous code fragment, it looks like this:

      -- brief inspection of the sort helper table l_sort_aat:<br />      dbms_output.put_line('Brief look at the contents of the helper table l_sort_aat:');<br />      l_sort_idx:= l_sort_aat.first;<br />      loop<br />        -- iterate one step in both l_cpx_tbl and l_sort_aat<br />        dbms_output.put_line('sort index: '||l_sort_idx||' and sort value: '||l_sort_aat(l_sort_idx));<br />        l_sort_idx:= l_sort_aat.next(l_sort_idx);        <br />        exit when l_sort_idx is null;<br />      end loop;<br /><br />Brief look at the contents of the helper table l_sort_aat:<br />sort index: Abramovic and sort value: 2<br />sort index: Basten and sort value: 4<br />sort index: Jellema and sort value: 1<br />sort index: Rooney and sort value: 3 <br />

This looks a little like the next figure.

 

The next piece of code demonstrates the real sorting core. Here is where based on the ordering of keys of the complex_type records in the l_sort_aat table we will now relocate the elements in the l_cpx_tbl:

      -- iterate through l_sort_aat and use it to start swapping around <br />      -- the elements in the l_cpx_tbl<br />      dbms_output.put_line('Iterate through l_sort_aat and use it to start swapping around in l_cpx_tbl:');<br />      l_sort_idx:= l_sort_aat.first;<br />      l_idx:= l_cpx_tbl.first;<br />      loop<br />        -- make the swap:<br />        -- 1. first copy the value in the slot to be overwritten <br />        --    in l_cpx_tbl to the local placeholder<br />        l_cpx_rec:= l_cpx_tbl(l_idx);<br />        dbms_output.put_line('swapped out '||l_cpx_rec.last_name||' from slot '||l_idx);<br /><br />        -- 2. put the value indicated by the current (index)value <br />        --    in the sorted aat into the current slot in l_cpx_tbl<br />        l_cpx_tbl(l_idx):= l_cpx_tbl(l_sort_aat(l_sort_idx));<br />        dbms_output.put_line('swapped in '||l_cpx_tbl(l_sort_aat(l_sort_idx)).last_name||' =&gt; into slot '||l_idx);<br />        -- 3. copy the evacuated value from l_cpx_rec to the now free<br />        --    slot where the swapped value came from<br />        l_cpx_tbl(l_sort_aat(l_sort_idx)):= l_cpx_rec;<br />        -- and update the value (pointer) in l_sort_aat:<br />        l_sort_aat(l_cpx_rec.last_name):= l_sort_aat(l_sort_idx);<br />        -- iterate one step in both l_cpx_tbl and l_sort_aat<br />        l_sort_idx:= l_sort_aat.next(l_sort_idx);<br />        l_idx:= l_cpx_tbl.next(l_idx);<br />        dbms_output.put_line('l_idx '||l_idx);<br />        exit when l_sort_idx is null;<br />      end loop;<br />    end;<br />    -- DONE! At this point, l
_cp
x_tbl is properly sorted<br /><br />Output:<br />Iterate through l_sort_aat and use it to start swapping around in l_cpx_tbl:<br />swapped out Jellema from slot 1<br />swapped in Abramovic =&gt; into slot 1<br />l_idx 2<br />swapped out Jellema from slot 2<br />swapped in Basten =&gt; into slot 2<br />l_idx 3<br />swapped out Rooney from slot 3<br />swapped in Jellema =&gt; into slot 3<br />l_idx 4<br />swapped out Rooney from slot 4<br />swapped in Rooney =&gt; into slot 4<br />l_idx<br />&nbsp;

After the first swap, our figure would be changed to:

 

After this is done, here is what the sorted collection looks like:

    dbms_output.put_line('**********************************************************');<br />    dbms_output.put_line('We proudly present the sorted results: ');<br />    l_idx:= l_cpx_tbl.first;<br />    loop<br />      dbms_output.put_line('**     '||l_cpx_tbl(l_idx).last_name<br />                         ||' email: '||l_cpx_tbl(l_idx).email_address);<br />      l_idx:= l_cpx_tbl.next(l_idx);<br />      exit when l_idx is null;<br />    end loop;<br /><br />**********************************************************<br />We proudly present the sorted results:<br />**     Abramovic email: abra1@notfail.com<br />**     Basten email: bassie@oneteam.com<br />**     Jellema email: toby231@hitmail.com<br />**     Rooney email: wayne@noavail.com <br />

As we had hoped for – and of course I had tried it out before writing this article – the records have properly sorted. Not by me – by the PL/SQL engine that keeps the keys of Associative Arrays always sorted.

Resources

Download code: plsqlCollectionSortScripts.sql

Francois Degrelle wrote an article on a very similar topic, also using the built-in sorting of the keys of an Associative Array, called: see Using a collection to sort a string (http://fdegrelle.over-blog.com/categorie-266041.html) . It is amazing how people can have so similar thoughts at virtually the same time.

Sometimes I hate Google. I really had come up with all the ideas in this article by myself. But I decided to Google on PL/SQL Collection and Sort. And what do you know: I also ran into CASTing About For a Solution: Using CAST and Table Functions in PL/SQL
by Jim Czuprynski at http://www.dbasupport.com/oracle/ora9i/cast.shtml . I would almost start to think myself I am a little of a plagiarist when of course I am not.

The next installment in this series: Sorting PL/SQL Collections, the quite simple way (part two: Have the SQL Engine do the heavy lifting) at http://technology.amis.nl/blog/?p=1217  

Share.

About Author

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director for Fusion Middleware. Consultant, trainer and instructor on diverse areas including Oracle Database (SQL & PLSQL), Service Oriented Architecture, BPM, ADF, Java in various shapes and forms and many other things. Author of the Oracle Press book: Oracle SOA Suite 11g Handbook. Frequent presenter on conferences such as JavaOne, Oracle OpenWorld, ODTUG Kaleidoscope, Devoxx and OBUG. Presenter for Oracle University Celebrity specials.

9 Comments

  1. i guyz i tried to develop a number puzzle forms6i .. and am in a half way strucked with logic in it .. so can any one help me out to finish my project.

    i added the followin code

    declare

    temp varchar2(40);
    lbl varchar2(40);
    begin
    lbl:=get_item_property(name_in(‘system.trigger_item’),label);
    temp:= :system.trigger_item;
    if get_item_property(temp,VISIBLE)= ‘TRUE’ then
    set_item_property(:parameter.HIDENUM,VISIBLE,PROPERTY_TRUE);
    set_item_property(:parameter.HIDENUM,ENABLED,PROPERTY_TRUE);
    set_item_property(:parameter.HIDENUM,label,lbl);
    go_item(:parameter.HIDENUM);
    set_item_property(temp,VISIBLE,PROPERTY_FALSE);
    :parameter.HIDENUM:=temp;
    end if;
    /*if get_item_property(‘number_game.TWO’,displayed)=’TRUE’ then
    set_item_property(‘number_game.TWO’,displayed,’FALSE’);
    end if;
    if get_item_property(‘number_game.THREE’,displayed)=’TRUE’ then
    set_item_property(‘number_game.THREE’,displayed,’FALSE’);
    end if;*/
    if lbl =’EXIT’ then exit_form(no_validate);
    end if;
    end;

  2. With a little adjustement (2 dimension table for sorting) it works for duplicates:
    /* Formatted on 2006/11/01 20:11 (Formatter Plus v4.8.5) */
    DECLARE
    – define and initialize a simple Nested Table Collection
    TYPE num_tbl IS TABLE OF NUMBER;

    l_num_tbl num_tbl := num_tbl (6, 12, 9, 1, 54, 21, 11, 2, 12, 12, 11);
    l_idx INTEGER;
    i INTEGER;
    BEGIN
    DECLARE
    – here is where the sorting magic starts
    – we create an Associative Array that is indexed by binary_integer
    – we know that this collections keys (the only thing we care about in this case)
    – will always be kept in sorted order
    TYPE num_aat_t IS TABLE OF NUMBER
    INDEX BY BINARY_INTEGER;

    TYPE num_aat_t2 IS TABLE OF num_aat_t
    INDEX BY BINARY_INTEGER;
    – two dimension table to handle duplicates
    l_num_aat num_aat_t2;
    BEGIN
    l_idx := l_num_tbl.FIRST;

    – loop over all elements in the l_num_tbl collection
    – that we want to sort. Use every element in l_num_tbl
    – as a key for the l_num_aat associative array. Associate
    – the key with a meaningless value; we do not care about
    – the value in this case.
    LOOP
    i := 1;

    — if duplicate increment second dimension
    WHILE l_num_aat.EXISTS (l_num_tbl (l_idx))
    AND l_num_aat (l_num_tbl (l_idx)).EXISTS (i)
    LOOP
    i := i + 1;
    END LOOP;

    l_num_aat (l_num_tbl (l_idx)) (i) := 0;
    l_idx := l_num_tbl.NEXT (l_idx);
    EXIT WHEN l_idx IS NULL;
    END LOOP;

    – remove all elements from l_num_tbl
    l_num_tbl.DELETE;
    – start repopulating l_num_tbl – in the proper order -
    – from the sorted collection of keys in the l_num_aat collection
    l_idx := l_num_aat.FIRST;

    LOOP
    FOR i IN 1 .. l_num_aat (l_idx).LAST
    LOOP
    l_num_tbl.EXTEND;
    l_num_tbl (l_num_tbl.LAST) := l_idx;
    END LOOP;

    l_idx := l_num_aat.NEXT (l_idx);
    EXIT WHEN l_idx IS NULL;
    END LOOP;
    END;

    – DONE! At this point, l_num_tbl is properly sorted
    l_idx := l_num_tbl.FIRST;

    LOOP
    DBMS_OUTPUT.put_line (‘ ** ‘ || l_num_tbl (l_idx));
    l_idx := l_num_tbl.NEXT (l_idx);
    EXIT WHEN l_idx IS NULL;
    END LOOP;
    END;

  3. The source of the problem is that PL/SQL is severly lacking in many areas, e.g. there’s no language support for casting the record type to a generic type, and implementing a “Comparable” interface, al la Java & Collections.sort.

    It being designed after Pascal led to so many of the problems with type system.

  4. Another pitfall of this approach is that duplicates will disappear from the collection and will probably cause some unexpected behaviour with ‘reference swapping’.

  5. “alphabetical order ”
    This is the bit that has worried me. I haven’t found a document reference to the actual ordering sequence. I’ve assumed that it basic ASCII, so you’d get ‘a’ sorted after ‘A’ and ‘B’. I haven’t tested against a machine that is natively EBCDIC, which has ‘a’ before ‘A’ and ‘B’.
    And then there’s multilingual sorts….
    Just something to bear in mind (and test) if appropriate to your situation.