Question from one of my customers:
“Please help me find an update statement that’s able to re-order a primary key after a delete.. and by the way.. you are not allowed to use PL/SQL, triggers or a sequence, because we cannot change anything in the app schema”.
Updating a PK seemed a bit odd, but then I was told about a composite PK over 2 columns, with the second column a number, meant to always start with 0 and ascending per value in the first column, and – most importantly! – without any holes in de numbering.
At first I couldn’t see a way to use pure SQL for this request, and thought of PL/SQL loops and using a temporary table.
But after some time I came up with a solution, using row_number() and rowid.
I think it’s a nice example of being pressed to think in sets instead of row based processing, with the solution being short, fast, and rather elegant.
Try it yourself!
— create test table
create table t1( name varchar2(100), seqnr number(4), note varchar2(30)); alter table t1 add ( constraint t_pk primary key ( name, seqnr));
— insert test data
insert into t1 values ('harry', 0, 'test record 100'); insert into t1 values ('harry', 1, 'test record 101'); insert into t1 values ('harry', 2, 'test record 102'); insert into t1 values ('harry', 3, 'test record 103'); insert into t1 values ('harry', 4, 'test record 104'); insert into t1 values ('ann', 0, 'test record 105'); insert into t1 values ('ann', 1, 'test record 106'); insert into t1 values ('ann', 2, 'test record 107'); insert into t1 values ('bob', 0, 'test record 108'); insert into t1 values ('bob', 1, 'test record 109'); insert into t1 values ('bob', 2, 'test record 110'); insert into t1 values ('bob', 3, 'test record 111'); insert into t1 values ('ronald', 0, 'test record 112'); insert into t1 values ('ronald', 1, 'test record 113'); insert into t1 values ('ronald', 2, 'test record 114'); insert into t1 values ('ronald', 3, 'test record 115'); insert into t1 values ('ronald', 4, 'test record 116'); insert into t1 values ('ronald', 5, 'test record 117'); insert into t1 values ('ronald', 6, 'test record 118'); insert into t1 values ('patrick', 0, 'test record 119'); insert into t1 values ('patrick', 1, 'test record 120'); insert into t1 values ('patrick', 2, 'test record 121'); insert into t1 values ('patrick', 3, 'test record 122'); insert into t1 values ('marie', 0, 'test record 123'); insert into t1 values ('marie', 1, 'test record 124'); commit;
— save the original data in a second table
create table t2 as select * from t1 order by 3;
— delete some rows, creating holes in the sequence for every name
delete t1 where seqnr in (0,2); commit;
— reset the sequence number, starting with 0 again, filling up any hole in
— the numbering, in fact, updating a part of the composite primary key!
update t1 set seqnr = ( select seqnr_new from ( select rowid rid ,(row_number() over ( partition by name order by seqnr ))-1 seqnr_new from t1 ) t0 where t1.rowid = t0.rid ); commit;
— new sequence starts at 0 again, and isn’t showing any holes!
select * from t1;
— start again
drop table t1 cascade constraints purge; create table t1 as select * from t2 order by 3; alter table t1 add ( constraint t1_pk primary key ( name, seqnr));
— drop all
drop table t1 cascade constraints purge; drop table t2 cascade constraints purge;
It would be much easier and readable if you used loop instead. Was it intentional or something?
Your solution does a new full index scan for each and every record. Also, it updates every record whether seqnr changes or not. Run the update twice in a row and you will see the same number of lines changed.
When doing batch updates, always run the code twice in a row and make sure 0 rows are updated the second time.
With MERGE, you can select just the rows that need to be changed.
I would have posted this yesterday but my company blocks comments on most blogs.
merge into T1
using (
select rid, new_seqnr from (
select rowid rid, SEQNr,
ROW_NUMBER() over(partition by name order by SEQNR) – 1 NEW_SEQNR
from T1
)
where SEQNR != NEW_SEQNR
) TN
on (T1.rowid = TN.RID)
when matched then update set seqnr = new_seqnr;
Stew, you’re right.
BTW. I also received an email from Rob van Wijk with his merge example, and I post it here with his permission:
================================
Sure, here is an example with a merge statement:
SQL> merge into t1
2 using ( select rowid rid
3 , row_number() over (partition by name order by seqnr) – 1 seqnr
4 from t1
5 ) new
6 on ( t1.rowid = new.rid )
7 when matched then
8 update set seqnr = new.seqnr
9 /
With this plan:
——————————————————————————————————————
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
——————————————————————————————————————
| 0 | MERGE STATEMENT | | 1 | | 0 |00:00:00.01 | 48 | | | |
| 1 | MERGE | T1 | 1 | | 0 |00:00:00.01 | 48 | | | |
| 2 | VIEW | | 1 | | 14 |00:00:00.01 | 8 | | | |
|* 3 | HASH JOIN | | 1 | 14 | 14 |00:00:00.01 | 8 | 1114K| 1114K| 1252K (0)|
| 4 | VIEW | | 1 | 14 | 14 |00:00:00.01 | 1 | | | |
| 5 | WINDOW BUFFER | | 1 | 14 | 14 |00:00:00.01 | 1 | 73728 | 73728 | |
| 6 | INDEX FULL SCAN| T_PK | 1 | 14 | 14 |00:00:00.01 | 1 | | | |
| 7 | TABLE ACCESS FULL| T1 | 1 | 14 | 14 |00:00:00.01 | 7 | | | |
——————————————————————————————————————
While the update statement gives this plan (and watch the 196 in the A-rows):
—————————————————————————————————————-
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
—————————————————————————————————————-
| 0 | UPDATE STATEMENT | | 1 | | 0 |00:00:00.01 | 69 | | | |
| 1 | UPDATE | T1 | 1 | | 0 |00:00:00.01 | 69 | | | |
| 2 | INDEX FULL SCAN | T_PK | 1 | 14 | 14 |00:00:00.01 | 1 | | | |
|* 3 | VIEW | | 14 | 14 | 14 |00:00:00.01 | 14 | | | |
| 4 | WINDOW BUFFER | | 14 | 14 | 196 |00:00:00.01 | 14 | 2048 | 2048 | 2048 (0)|
| 5 | INDEX FULL SCAN| T_PK | 14 | 14 | 196 |00:00:00.01 | 14 | | | |
—————————————————————————————————————-
Regards,
Rob.
================================
I tried the update as wel as the merge statement on a test table with 600 instead of 25 rows, and the results are obvious.
This is what I did to test the performance of both update and merge statement:
— drop test table
SQL>drop table t1 cascade constraints purge;
— create test table with 600 rows
SQL>create table t1
as
select name, seqnr, name||’ record ‘||seqnr note
from (
select case when rownum between 1 and 100 then ‘harry’
when rownum between 101 and 200 then ‘ann’
when rownum between 201 and 300 then ‘bob’
when rownum between 301 and 400 then ‘ronald’
when rownum between 401 and 500 then ‘patrick’
when rownum between 501 and 600 then ‘marie’ end name,
case when rownum between 1 and 100 then rownum-1
when rownum between 101 and 200 then rownum-101
when rownum between 201 and 300 then rownum-201
when rownum between 301 and 400 then rownum-301
when rownum between 401 and 500 then rownum-401
when rownum between 501 and 600 then rownum-501 end seqnr
from dual
connect by rownum <= 600 );
alter table t1 add ( constraint t1_pk primary key ( name, seqnr));
–delete 18 rows, creating holes in the sequence for every name
SQL>delete t1 where seqnr in (0,2,99);
commit;
— reset the sequence number, starting with 0 again, filling up any hole in
— the numbering, in fact, updating a part of the composite primary key!
SQL>update /*+ GATHER_PLAN_STATISTICS */ t1
set seqnr = ( select seqnr_new from ( select rowid rid,(row_number() over ( partition by name order by seqnr ))-1 seqnr_new from t1 ) t0
where t1.rowid = t0.rid );
select * from table ( dbms_xplan.display_cursor( format => ‘ALLSTATS LAST’) );
———————————————————————————————————————-
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
———————————————————————————————————————-
| 0 | UPDATE STATEMENT | | 1 | | 0 |00:00:02.33 | 6527 | | | |
| 1 | UPDATE | T1 | 1 | | 0 |00:00:02.33 | 6527 | | | |
| 2 | TABLE ACCESS FULL | T1 | 1 | 582 | 582 |00:00:00.01 | 5 | | | |
|* 3 | VIEW | | 582 | 582 | 582 |00:00:02.19 | 2962 | | | |
| 4 | WINDOW SORT | | 582 | 582 | 338K|00:00:02.25 | 2962 | 46080 | 46080 |40960 (0)|
| 5 | INDEX FAST FULL SCAN| T1_PK | 582 | 582 | 338K|00:00:00.29 | 2962 | | | |
———————————————————————————————————————-
SQL>rollback;
— better alternative by Rob van Wijk and Stew Ashton
SQL>merge /*+ GATHER_PLAN_STATISTICS */ into t1
using ( select rowid rid
, row_number() over (partition by name order by seqnr) – 1 seqnr
from t1
) new
on ( t1.rowid = new.rid )
when matched then
update set seqnr = new.seqnr;
select * from table ( dbms_xplan.display_cursor( format => ‘ALLSTATS LAST’) );
————————————————————————————————————————
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
————————————————————————————————————————
| 0 | MERGE STATEMENT | | 1 | | 0 |00:00:00.08 | 2962 | | | |
| 1 | MERGE | T1 | 1 | | 0 |00:00:00.08 | 2962 | | | |
| 2 | VIEW | | 1 | | 582 |00:00:00.01 | 10 | | | |
|* 3 | HASH JOIN | | 1 | 582 | 582 |00:00:00.01 | 10 | 1114K| 1114K| 1252K (0)|
| 4 | VIEW | | 1 | 582 | 582 |00:00:00.01 | 5 | | | |
| 5 | WINDOW SORT | | 1 | 582 | 582 |00:00:00.01 | 5 | 73728 | 73728 | |
| 6 | INDEX FAST FULL SCAN| T1_PK | 1 | 582 | 582 |00:00:00.01 | 5 | | | |
| 7 | TABLE ACCESS FULL | T1 | 1 | 582 | 582 |00:00:00.01 | 5 | | | |
————————————————————————————————————————
— new sequence starts at 0 again, and isn’t showing any holes!
SQL>select * from t1;
SQL>rollback;
Look at the actual time it took to update 582 rows, and compare it with the merge…. I actually tried the update on a test table with 60.000 rows, and it didn’t even finish within 15 minutes on my XE 11gR2 test database, so I cancelled it. The merge however took just a couple of seconds to finish, so your point is taken, Rob and Stew!!
The really horrible part of the update is the full scan on all rows for every row in the table, resulting in a total index scan of 582*582 = 338.724 rows. This statement just isn’t scalable… imagine a table with 100.000 rows… and the update to renumber after delete of some rows resulting in an index scan of nearly 100.000*100.000 =10.000.000.000 rows.
The merge does way better in this respect!
Interesting… do you have an example.
Regards,
Harry
This has O(N^2) performance, so I’d use a merge statement here.
Regards,
Rob.