Deleting a handful of rows from an Oracle database is one thing; deleting tens of millions of rows is another matter entirely. Although the delete itself is technically the same, maintaining indexes and validating constraints can consume so much time and so many resources that deleting millions of rows requires a vast amount of undo and redo, whereas deleting just a few rows needs hardly any.
A classic recipe for limiting the impact on undo and temp is to commit in batches. The delete is then executed as a series of transactions run sequentially, each of which needs only a modest amount of undo and temp. But this delete method does not use database resources efficiently, it generates far more redo than necessary, and it considerably lengthens total transaction time.
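For reference, the batch-commit recipe typically looks something like the following PL/SQL sketch; the table and column names (app.log, a date column stamp) and the one-month retention are illustrative only.

```sql
BEGIN
  LOOP
    DELETE FROM app.log
     WHERE stamp < ADD_MONTHS(SYSDATE, -1)
       AND ROWNUM <= 10000;        -- cap undo usage per transaction
    EXIT WHEN SQL%ROWCOUNT = 0;    -- stop when nothing is left to delete
    COMMIT;                        -- release undo after every batch
  END LOOP;
  COMMIT;
END;
/
```

Every batch re-executes the predicate, which is exactly why this approach burns extra resources and redo compared to a single statement.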
Another option is to chop the delete into parts and execute those parts in parallel… this uses your resources far better, but because all parts hit undo and temp at the same time, you end up consuming a lot of undo and temp again, as before. The best option probably is to set up partitioning… the delete then becomes a DDL statement – alter table x drop partition p – and excessive generation of undo, redo and temp is no longer an issue. But this requires the Enterprise Edition plus a separate license for the Partitioning Option, which some customers cannot afford.
So let us assume this is an Oracle Standard Edition database, and you want the delete of 10 million rows to be just one fast transaction, using no more than 2-4GB of undo, 2-4GB of temp, and as little redo as possible. This is how you could do it, provided there is a short window in which you may drop all indexes without exception – even primary key, unique key and foreign key indexes.
During the delete, any application that touches this table will slow down because of the dropped indexes, and the table reorganization after the delete with the “move” command takes the table offline for a short time. If that is not acceptable, you could use the dbms_redefinition package to reorganize the table online, at the expense of a considerably longer overall transaction time.
First of all, do a CXTAS of all rows you want to delete, including their rowids. You may wonder what CXTAS means… Create eXternal Table As Select. Because all indexes are still active, the where-clause predicate should be fast, and because this is effectively a DDL statement, virtually no redo is generated. By the way, if the deleted data does not need to be archived or backed up, consider creating an external table containing rowids only.
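As a static illustration of such a CXTAS – all names here (app.log, the date column stamp, the directory object bck_dir) are made up for the example:

```sql
CREATE TABLE xdel_log
ORGANIZATION EXTERNAL
( TYPE ORACLE_DATAPUMP
  DEFAULT DIRECTORY bck_dir
  ACCESS PARAMETERS ( NOLOGFILE )
  LOCATION ( 'log.dmp' )
)
AS SELECT ROWID rid, t.*
   FROM app.log t
   WHERE stamp < ADD_MONTHS(SYSDATE, -1);
```

The ORACLE_DATAPUMP driver writes the selected rows to log.dmp in the directory, so the same statement doubles as a backup of the data about to be deleted.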
Analyze the external table.
Secondly, disable the primary, unique and foreign key constraints, dropping their indexes. Drop all other indexes as well, or set them unusable. Then delete all rows in the original table whose rowids are saved in the external table you just made, and commit when finished.
Thirdly, move the original table in nologging mode to reorganize it and lower its high-water mark, re-enable all constraints, and recreate – or rebuild, if set unusable before – all other indexes.
Analyze the table.
And last, if the deleted data must be archived, move the external table dump file to another database or filesystem.
Check out the following code… this cleanup procedure is actually in use at one of our customers. It is a dynamic setup for cleaning up any table in any schema, provided the table contains just one (primary key) index, possibly a LOB or CLOB column, and a timestamp or date column. If your environment demands another setup, feel free to adjust.
CREATE OR REPLACE PROCEDURE CLEANUP_TABLE
( P_schema       VARCHAR2    := 'ETL'
, P_table_name   VARCHAR2    := 'LOG'
, P_column_name  VARCHAR2    := 'TIMESTAMP'
, P_constraint   VARCHAR2    := 'LOG_PK'
, P_directory    VARCHAR2    := 'ETL_BCK_DIR'
, P_keep_data    PLS_INTEGER := 1 -- 1 month
, P_unload2file  PLS_INTEGER := 0
, P_rollback     PLS_INTEGER := 0
)
AS
  v_curr_schema         VARCHAR2(30 CHAR)  := SYS_CONTEXT('USERENV','CURRENT_SCHEMA');
  v_directory           VARCHAR2(30 CHAR)  := UPPER(TRIM(P_directory));
  v_schema              VARCHAR2(30 CHAR)  := UPPER(TRIM(P_schema));
  v_table_name          VARCHAR2(30 CHAR)  := UPPER(TRIM(P_table_name));
  v_column_name         VARCHAR2(30 CHAR)  := UPPER(TRIM(P_column_name));
  v_constraint          VARCHAR2(30 CHAR)  := UPPER(TRIM(P_constraint));
  v_dmp                 VARCHAR2(100 CHAR) := LOWER(TRIM(P_table_name))||'.dmp';
  v_external_table_name VARCHAR2(30 CHAR)  := 'XDEL_'||UPPER(TRIM(SUBSTR(P_table_name,1,25)));
  v_stat                VARCHAR2(4000 CHAR);
  -----------------------------------------------------------------------------------------------------
  PROCEDURE Delete_Backupped_Table_Data
  IS
  BEGIN
    BEGIN
      EXECUTE IMMEDIATE 'DROP TABLE '||v_external_table_name||' PURGE';
    EXCEPTION
      WHEN OTHERS THEN NULL;
    END;
    BEGIN
      UTL_FILE.FREMOVE ( v_directory, v_dmp );
    EXCEPTION
      WHEN OTHERS THEN NULL;
    END;
  END Delete_Backupped_Table_Data;
  -----------------------------------------------------------------------------------------------------
  PROCEDURE Backup_Table_Data_2b_Deleted
  IS
  BEGIN
    IF P_unload2file = 1 THEN
      v_stat := 'CREATE TABLE '||v_external_table_name||chr(10)
              ||'ORGANIZATION EXTERNAL'||chr(10)
              ||'('||chr(10)
              ||'TYPE ORACLE_DATAPUMP'||chr(10)
              ||'DEFAULT DIRECTORY '||v_directory||chr(10)
              ||'ACCESS PARAMETERS ( NOLOGFILE )'||chr(10)
              ||'LOCATION ( '''||v_dmp||''' )'||chr(10)
              ||')'||chr(10)
              ||'AS SELECT ROWID RID, t.* FROM '||v_schema||'.'||v_table_name||' t'||chr(10)
              ||'WHERE TO_NUMBER(TO_CHAR('||v_column_name||',''YYYYMMDD'')) < '||chr(10)
              ||'TO_NUMBER(TO_CHAR(TRUNC(ADD_MONTHS(SYSDATE, -'||P_keep_data||')),''YYYYMMDD''))';
    ELSE
      v_stat := 'CREATE TABLE '||v_external_table_name||chr(10)
              ||'ORGANIZATION EXTERNAL'||chr(10)
              ||'('||chr(10)
              ||'TYPE ORACLE_DATAPUMP'||chr(10)
              ||'DEFAULT DIRECTORY '||v_directory||chr(10)
              ||'ACCESS PARAMETERS ( NOLOGFILE )'||chr(10)
              ||'LOCATION ( '''||v_dmp||''' )'||chr(10)
              ||')'||chr(10)
              ||'AS SELECT ROWID RID FROM '||v_schema||'.'||v_table_name||chr(10)
              ||'WHERE TO_NUMBER(TO_CHAR('||v_column_name||',''YYYYMMDD'')) < '||chr(10)
              ||'TO_NUMBER(TO_CHAR(TRUNC(ADD_MONTHS(SYSDATE, -'||P_keep_data||')),''YYYYMMDD''))';
    END IF;
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
  END Backup_Table_Data_2b_Deleted;
  -----------------------------------------------------------------------------------------------------
  PROCEDURE Exec_Cleanup
  IS
  BEGIN
    -- drop index
    v_stat := 'ALTER TABLE '||v_schema||'.'||v_table_name||' MODIFY CONSTRAINT '||v_constraint||' DISABLE DROP INDEX';
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
    -- delete rows
    v_stat := 'DELETE '||v_schema||'.'||v_table_name||chr(10)
            ||' WHERE ROWID IN ( SELECT RID FROM '||v_external_table_name||' )';
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
    if P_rollback = 1 then
      rollback;
    end if;
    -- move (reorganize) table and lobs
    v_stat := 'ALTER TABLE '||v_schema||'.'||v_table_name||' MOVE NOLOGGING';
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
    for i in ( select column_name, tablespace_name
               from   dba_lobs
               where  owner      = v_schema
               and    table_name = v_table_name )
    loop
      v_stat := 'ALTER TABLE '||v_schema||'.'||v_table_name||' MOVE NOLOGGING '||chr(10)
              ||' LOB ('||i.column_name||') STORE AS ( TABLESPACE '||i.tablespace_name||' )';
      dbms_output.put_line ( v_stat );
      execute immediate v_stat;
    end loop;
    -- create pk index again
    v_stat := 'ALTER TABLE '||v_schema||'.'||v_table_name||' MODIFY CONSTRAINT '||v_constraint||' ENABLE';
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
  END Exec_Cleanup;
  -----------------------------------------------------------------------------------------------------
BEGIN
  dbms_output.put_line ( '************************************************************************' );
  dbms_output.put_line ( TO_CHAR(SYSDATE,'DY DD-MON-YYYY HH24:MI:SS')||' >> START CLEANUP TABLE '||v_schema||'.'||v_table_name );
  dbms_output.put_line ( '************************************************************************' );
  -- Drop external table and dump file, if they already exist
  Delete_Backupped_Table_Data;
  -- Create external table and dump file, selecting the data to be deleted
  Backup_Table_Data_2b_Deleted;
  -- Analyze the external table you just made
  DBMS_STATS.GATHER_TABLE_STATS( v_curr_schema, v_external_table_name );
  -- Delete table data, using the rowids in the external table
  Exec_Cleanup;
  -- Analyze the table after reorg
  DBMS_STATS.GATHER_TABLE_STATS( v_schema, v_table_name );
  dbms_output.put_line ( '************************************************************************' );
  dbms_output.put_line ( TO_CHAR(SYSDATE,'DY DD-MON-YYYY HH24:MI:SS')||' >> END CLEANUP TABLE '||v_schema||'.'||v_table_name );
  dbms_output.put_line ( '************************************************************************' );
EXCEPTION
  WHEN OTHERS THEN
    RAISE_APPLICATION_ERROR( -20010, SQLERRM );
END;
/
Great information on minimal undo, redo and temp!
Your “create pk index again” logic did not specify a tablespace for the restored Primary Key Index.
In most production environments, the large table indexes have a defined tablespace.
For example, our 4.7 billion record SERVICE table has tablespace, SERVICE_DATA, for the data,
and tablespace, SERVICE_INDX, for all the indexes.
In testing your logic, we modified the sample DDL by adding a “USING INDEX” clause.
ALTER TABLE CRM.SERVICE ENABLE PRIMARY KEY USING INDEX TABLESPACE SERVICE_INDX
Please be aware that this DDL may take considerable time to execute (several hours in our test).
Thanks for sharing, Harry. Good stuff to read, as always.
Hi Harry,
Your post about deleting many rows from an Oracle table reminded me of a project I did a long time ago (2001). In that project the system had to delete almost all of its data every day, keeping only a few rows. We faced the same problems as you describe, and I came up with an idea: why not keep the data we want to preserve instead of throwing away the unwanted data?
I implemented this by creating a partitioned copy of the original table, selecting only the data I wanted to preserve. Then I exchanged the newly created table’s partition with the original table and rebuilt the indexes. That’s all!
I tried to code it in a similar way as you did. Advantages of this way of ‘deleting’ data I see:
– it is really fast.
– new data segment is organized by default.
– can be parallelized if needed.
– indexes can be pre-built on the temp-table before exchange (not implemented in example).
– very little redo and undo.
I hope you like it 🙂
create or replace PROCEDURE CLEANUP_TABLE2
( P_schema       VARCHAR2
, P_table_name   VARCHAR2
, P_column_name  VARCHAR2
, P_keep_data    VARCHAR2 := '00-01' -- 00 years and 01 months
)
authid current_user
AS
  c_nl             VARCHAR2(1 CHAR)  DEFAULT chr(10);
  v_schema         VARCHAR2(30 CHAR) := UPPER(TRIM(P_schema));
  v_table_name     VARCHAR2(30 CHAR) := UPPER(TRIM(P_table_name));
  v_column_name    VARCHAR2(30 CHAR) := UPPER(TRIM(P_column_name));
  v_tmp_table_name VARCHAR2(30 CHAR) := 'TMP_EXCHG';
  v_stat           VARCHAR2(4000 CHAR);
  -----------------------------------------------------------------------------------------------------
  PROCEDURE Delete_Backupped_Table_Data
  IS
  BEGIN
    BEGIN
      EXECUTE IMMEDIATE 'DROP TABLE '||v_tmp_table_name||' PURGE';
      dbms_output.put_line('Table '||v_tmp_table_name||' dropped.');
    EXCEPTION
      WHEN OTHERS THEN
        dbms_output.put_line('Table '||v_tmp_table_name||' doesn''t exist.');
    END;
  END Delete_Backupped_Table_Data;
  -----------------------------------------------------------------------------------------------------
  PROCEDURE Keep_Table_Data_2b_Preserved
  IS
  BEGIN
    v_stat := 'create table '||v_tmp_table_name||c_nl
            ||'partition by range ('||v_column_name||')'||c_nl
            ||'(partition pmax values less than (maxvalue))'||c_nl
            ||'AS SELECT t.* FROM '||v_schema||'.'||v_table_name||' t'||c_nl
            ||'WHERE not('||v_column_name||' < '||'TRUNC(SYSDATE - TO_YMINTERVAL('||''''||P_keep_data||''''||')))';
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
    dbms_output.put_line('Table '||v_tmp_table_name||' created.');
  END Keep_Table_Data_2b_Preserved;
  -----------------------------------------------------------------------------------------------------
  PROCEDURE Exec_Cleanup
  IS
  BEGIN
    -- exchange partition with table
    v_stat := 'alter table '||v_tmp_table_name||' exchange partition pmax with table '||v_schema||'.'||v_table_name;
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
    dbms_output.put_line('Table exchanged with partition.');
    for r_idx in (select idx.owner
                  ,      idx.index_name
                  from   all_indexes idx
                  where  idx.table_owner = v_schema
                  and    idx.table_name  = v_table_name
                  and    idx.status     <> 'VALID'
                 )
    loop
      v_stat := 'alter index '||r_idx.owner||'.'||r_idx.index_name||' rebuild';
      dbms_output.put_line ( v_stat );
      execute immediate v_stat;
    end loop;
    dbms_output.put_line('Indexes rebuilt.');
  END Exec_Cleanup;
  -----------------------------------------------------------------------------------------------------
BEGIN
  dbms_output.put_line ( '************************************************************************' );
  dbms_output.put_line ( TO_CHAR(SYSDATE,'DY DD-MON-YYYY HH24:MI:SS')||' >> START CLEANUP TABLE '||v_schema||'.'||v_table_name );
  dbms_output.put_line ( '************************************************************************' );
  -- Drop temporary table, if it already exists
  Delete_Backupped_Table_Data;
  -- Create partitioned table with data to be preserved
  Keep_Table_Data_2b_Preserved;
  -- Exchange table data with the temporary table
  Exec_Cleanup;
  -- Analyze the table after reorg
  DBMS_STATS.GATHER_TABLE_STATS( v_schema, v_table_name );
  dbms_output.put_line ( '************************************************************************' );
  dbms_output.put_line ( TO_CHAR(SYSDATE,'DY DD-MON-YYYY HH24:MI:SS')||' >> END CLEANUP TABLE '||v_schema||'.'||v_table_name );
  dbms_output.put_line ( '************************************************************************' );
END;
/
Arjen, I just found a bit of a bug with the TO_YMINTERVAL function… it does not take into account the difference in the number of days per month when subtracting or adding one or more months from or to a specific date.
For instance, if you subtract a month from March 30th of this year, the result should be February 28th 2015, but instead you run into an ORA-01839: “date not valid for month specified.” The ADD_MONTHS function behaves as expected, so I adjusted the PL/SQL code to use it. Best if you adjust your code too.
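The difference is easy to demonstrate from SQL*Plus; ADD_MONTHS snaps to the last valid day of the target month, while interval arithmetic raises an error for a non-existent date:

```sql
-- Returns 28-FEB-2015, the last valid day of the target month
SELECT ADD_MONTHS(DATE '2015-03-30', -1) FROM dual;

-- Raises ORA-01839: date not valid for month specified,
-- because 30 February does not exist
SELECT DATE '2015-03-30' - TO_YMINTERVAL('00-01') FROM dual;
```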
In reaction to your comment…
Using a CTAS to set all remaining rows apart in a single DDL statement is quite a common approach to bulk deletes. By subsequently dropping the original table, renaming the newly created table to the original name, and recreating the indexes, you avoid any DML statement, resulting in low undo, redo and temp usage. But this presupposes enough disk space for an extra table, no need to archive the deleted data, few or no dependencies, and an offline time slot for the table switch.
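A minimal sketch of that CTAS-and-rename approach, with illustrative names only (app.log, a date column stamp, a primary key log_pk on an id column):

```sql
-- Keep only the rows to preserve; NOLOGGING CTAS keeps redo minimal
CREATE TABLE app.log_new NOLOGGING AS
  SELECT * FROM app.log
   WHERE stamp >= ADD_MONTHS(SYSDATE, -1);

-- The switch: requires an offline window and no open dependencies
DROP TABLE app.log PURGE;
ALTER TABLE app.log_new RENAME TO log;

-- Recreate constraints and indexes on the new table
ALTER TABLE app.log ADD CONSTRAINT log_pk PRIMARY KEY (id);
```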
Your solution using partitioning with CTAS puzzles me a bit… why not partition the original table and drop the partitions you no longer need? That is what I referred to when I mentioned the Partitioning Option on Oracle Enterprise Edition as probably the best method for a bulk delete. All the other stuff, including my code, assumes the Oracle Database Standard Edition.