Re-ordering without procedural code or using a sequence

5

Question from one of my customers:
“Please help me find an update statement that’s able to re-order a primary key after a delete.. and by the way.. you are not allowed to use PL/SQL, triggers or a sequence, because we cannot change anything in the app schema”.
Updating a PK seemed a bit odd, but then I was told about a composite PK over 2 columns, with the second column a number, meant to always start with 0 and ascending per value in the first column, and – most importantly! – without any holes in de numbering.

At first I couldn’t see a way to use pure SQL for this request, and thought of PL/SQL loops and using a temporary table.
But after some time I came up with a solution, using row_number() and rowid.
I think it’s a nice example of being pressed to think in sets instead of row based processing, with the solution being short, fast, and rather elegant.
Try it yourself!

– create test table
create table t1( name varchar2(100), seqnr number(4), note varchar2(30));
alter table t1 add ( constraint t_pk primary key ( name, seqnr));

– insert test data
insert into t1 values ('harry', 0, 'test record 100');
insert into t1 values ('harry', 1, 'test record 101');
insert into t1 values ('harry', 2, 'test record 102');
insert into t1 values ('harry', 3, 'test record 103');
insert into t1 values ('harry', 4, 'test record 104');
insert into t1 values ('ann', 0, 'test record 105');
insert into t1 values ('ann', 1, 'test record 106');
insert into t1 values ('ann', 2, 'test record 107');
insert into t1 values ('bob', 0, 'test record 108');
insert into t1 values ('bob', 1, 'test record 109');
insert into t1 values ('bob', 2, 'test record 110');
insert into t1 values ('bob', 3, 'test record 111');
insert into t1 values ('ronald', 0, 'test record 112');
insert into t1 values ('ronald', 1, 'test record 113');
insert into t1 values ('ronald', 2, 'test record 114');
insert into t1 values ('ronald', 3, 'test record 115');
insert into t1 values ('ronald', 4, 'test record 116');
insert into t1 values ('ronald', 5, 'test record 117');
insert into t1 values ('ronald', 6, 'test record 118');
insert into t1 values ('patrick', 0, 'test record 119');
insert into t1 values ('patrick', 1, 'test record 120');
insert into t1 values ('patrick', 2, 'test record 121');
insert into t1 values ('patrick', 3, 'test record 122');
insert into t1 values ('marie', 0, 'test record 123');
insert into t1 values ('marie', 1, 'test record 124');
commit;

– save the original data in a second table
create table t2 as select * from t1 order by 3;

– delete some rows, creating holes in the sequence for every name
delete t1 where seqnr in (0,2);
commit;

– reset the sequence number, starting with 0 again, filling up any hole in
– the numbering, in fact, updating a part of the composite primary key!
update t1
set seqnr = ( select seqnr_new from ( select rowid rid,(row_number() over ( partition by name order by seqnr ))-1 seqnr_new from t1 ) t0
where t1.rowid = t0.rid );
commit;

– new sequence starts at 0 again, and isn’t showing any holes!
select * from t1;

– start again
drop table t1 cascade constraints purge;
create table t1 as select * from t2 order by 3;
alter table t1 add ( constraint t1_pk primary key ( name, seqnr));

– drop all
drop table t1 cascade constraints purge;
drop table t2 cascade constraints purge;

Share.

About Author

Harry Dragstra is Senior Oracle DBA. He is specialized in Oracle RAC, Oracle Virtualization, and has a a strong interest in APEX development and PL/SQL.

5 Comments

  1. Your solution does a new full index scan for each and every record. Also, it updates every record whether seqnr changes or not. Run the update twice in a row and you will see the same number of lines changed.

    When doing batch updates, always run the code twice in a row and make sure 0 rows are updated the second time.

    With MERGE, you can select just the rows that need to be changed.

    I would have posted this yesterday but my company blocks comments on most blogs.
    merge into T1
    using (
    select rid, new_seqnr from (
    select rowid rid, SEQNr,
    ROW_NUMBER() over(partition by name order by SEQNR) - 1 NEW_SEQNR
    from T1
    )
    where SEQNR != NEW_SEQNR
    ) TN
    on (T1.rowid = TN.RID)
    when matched then update set seqnr = new_seqnr;

    • Harry Dragstra on

      Stew, you’re right.
      BTW. I also received an email from Rob van Wijk with his merge example, and I post it here with his permission:
      =================================================================================================================
      Sure, here is an example with a merge statement:

      SQL> merge into t1
      2 using ( select rowid rid
      3 , row_number() over (partition by name order by seqnr) - 1 seqnr
      4 from t1
      5 ) new
      6 on ( t1.rowid = new.rid )
      7 when matched then
      8 update set seqnr = new.seqnr
      9 /

      With this plan:

      ------------------------------------------------------------------------------------------------------------------
      | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
      ------------------------------------------------------------------------------------------------------------------
      | 0 | MERGE STATEMENT | | 1 | | 0 |00:00:00.01 | 48 | | | |
      | 1 | MERGE | T1 | 1 | | 0 |00:00:00.01 | 48 | | | |
      | 2 | VIEW | | 1 | | 14 |00:00:00.01 | 8 | | | |
      |* 3 | HASH JOIN | | 1 | 14 | 14 |00:00:00.01 | 8 | 1114K| 1114K| 1252K (0)|
      | 4 | VIEW | | 1 | 14 | 14 |00:00:00.01 | 1 | | | |
      | 5 | WINDOW BUFFER | | 1 | 14 | 14 |00:00:00.01 | 1 | 73728 | 73728 | |
      | 6 | INDEX FULL SCAN| T_PK | 1 | 14 | 14 |00:00:00.01 | 1 | | | |
      | 7 | TABLE ACCESS FULL| T1 | 1 | 14 | 14 |00:00:00.01 | 7 | | | |
      ------------------------------------------------------------------------------------------------------------------

      While the update statement gives this plan (and watch the 196 in the A-rows):

      ----------------------------------------------------------------------------------------------------------------
      | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
      ----------------------------------------------------------------------------------------------------------------
      | 0 | UPDATE STATEMENT | | 1 | | 0 |00:00:00.01 | 69 | | | |
      | 1 | UPDATE | T1 | 1 | | 0 |00:00:00.01 | 69 | | | |
      | 2 | INDEX FULL SCAN | T_PK | 1 | 14 | 14 |00:00:00.01 | 1 | | | |
      |* 3 | VIEW | | 14 | 14 | 14 |00:00:00.01 | 14 | | | |
      | 4 | WINDOW BUFFER | | 14 | 14 | 196 |00:00:00.01 | 14 | 2048 | 2048 | 2048 (0)|
      | 5 | INDEX FULL SCAN| T_PK | 14 | 14 | 196 |00:00:00.01 | 14 | | | |
      ----------------------------------------------------------------------------------------------------------------

      Regards,
      Rob.
      =====================================================================================================================================

      I tried the update as wel as the merge statement on a test table with 600 instead of 25 rows, and the results are obvious.
      This is what I did to test the performance of both update and merge statement:

      – drop test table
      SQL>drop table t1 cascade constraints purge;

      – create test table with 600 rows
      SQL>create table t1
      as
      select name, seqnr, name||' record '||seqnr note
      from (
      select case when rownum between 1 and 100 then 'harry'
      when rownum between 101 and 200 then 'ann'
      when rownum between 201 and 300 then 'bob'
      when rownum between 301 and 400 then 'ronald'
      when rownum between 401 and 500 then 'patrick'
      when rownum between 501 and 600 then 'marie' end name,
      case when rownum between 1 and 100 then rownum-1
      when rownum between 101 and 200 then rownum-101
      when rownum between 201 and 300 then rownum-201
      when rownum between 301 and 400 then rownum-301
      when rownum between 401 and 500 then rownum-401
      when rownum between 501 and 600 then rownum-501 end seqnr
      from dual
      connect by rownum <= 600 );
      alter table t1 add ( constraint t1_pk primary key ( name, seqnr));

      –delete 18 rows, creating holes in the sequence for every name
      SQL>delete t1 where seqnr in (0,2,99);
      commit;

      – reset the sequence number, starting with 0 again, filling up any hole in
      – the numbering, in fact, updating a part of the composite primary key!
      SQL>update /*+ GATHER_PLAN_STATISTICS */ t1
      set seqnr = ( select seqnr_new from ( select rowid rid,(row_number() over ( partition by name order by seqnr ))-1 seqnr_new from t1 ) t0
      where t1.rowid = t0.rid );
      select * from table ( dbms_xplan.display_cursor( format => 'ALLSTATS LAST') );

      ----------------------------------------------------------------------------------------------------------------------
      | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
      ----------------------------------------------------------------------------------------------------------------------
      | 0 | UPDATE STATEMENT | | 1 | | 0 |00:00:02.33 | 6527 | | | |
      | 1 | UPDATE | T1 | 1 | | 0 |00:00:02.33 | 6527 | | | |
      | 2 | TABLE ACCESS FULL | T1 | 1 | 582 | 582 |00:00:00.01 | 5 | | | |
      |* 3 | VIEW | | 582 | 582 | 582 |00:00:02.19 | 2962 | | | |
      | 4 | WINDOW SORT | | 582 | 582 | 338K|00:00:02.25 | 2962 | 46080 | 46080 |40960 (0)|
      | 5 | INDEX FAST FULL SCAN| T1_PK | 582 | 582 | 338K|00:00:00.29 | 2962 | | | |
      ----------------------------------------------------------------------------------------------------------------------

      SQL>rollback;

      – better alternative by Rob van Wijk and Stew Ashton
      SQL>merge /*+ GATHER_PLAN_STATISTICS */ into t1
      using ( select rowid rid
      , row_number() over (partition by name order by seqnr) - 1 seqnr
      from t1
      ) new
      on ( t1.rowid = new.rid )
      when matched then
      update set seqnr = new.seqnr;
      select * from table ( dbms_xplan.display_cursor( format => 'ALLSTATS LAST') );

      ------------------------------------------------------------------------------------------------------------------------
      | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
      ------------------------------------------------------------------------------------------------------------------------
      | 0 | MERGE STATEMENT | | 1 | | 0 |00:00:00.08 | 2962 | | | |
      | 1 | MERGE | T1 | 1 | | 0 |00:00:00.08 | 2962 | | | |
      | 2 | VIEW | | 1 | | 582 |00:00:00.01 | 10 | | | |
      |* 3 | HASH JOIN | | 1 | 582 | 582 |00:00:00.01 | 10 | 1114K| 1114K| 1252K (0)|
      | 4 | VIEW | | 1 | 582 | 582 |00:00:00.01 | 5 | | | |
      | 5 | WINDOW SORT | | 1 | 582 | 582 |00:00:00.01 | 5 | 73728 | 73728 | |
      | 6 | INDEX FAST FULL SCAN| T1_PK | 1 | 582 | 582 |00:00:00.01 | 5 | | | |
      | 7 | TABLE ACCESS FULL | T1 | 1 | 582 | 582 |00:00:00.01 | 5 | | | |
      ------------------------------------------------------------------------------------------------------------------------

      – new sequence starts at 0 again, and isn’t showing any holes!
      SQL>select * from t1;

      SQL>rollback;

      Look at the actual time it took to update 582 rows, and compare it with the merge…. I actually tried the update on a test table with 60.000 rows, and it didn’t even finish within 15 minutes on my XE 11gR2 testdatabase, so I cancelled it. The merge however took just a couple of seconds to finish, so your point is taken, Rob and Stew!!
      The really horrible part of the update is the full scan on all rows for every row in the table, resulting in a total index scan of 582*582 = 338.724 rows. This statement just isn’t scalable… imagine a table with 100.000 rows… and the update to renumber after delete of some rows resulting in an index scan of nearly 100.000*100.000 =10.000.000.000 rows.
      The merge does way better in this respect…

Leave a Reply