How-to bulk delete ( or archive ) as fast as possible, using minimal undo, redo and temp

4

Deleting some rows or tens of millions of rows from an Oracle database should be treated in a completely different fashion. Though the delete itself is technically the same, maintaining indexes and validating constraints may have such a time and resource consuming influence that a vast amount of undo and redo is necessary when deleting millions of rows, in contrast to just a bit when you delete some.

A classic recipe in order to limit the effect on undo and temp is to commit in batches. The result is that the delete is executed in the context of more than one transaction, with these transactions actually done sequentially, and thus saving on undo and temp. But this delete method is not very efficient as to using database resources, it generates a lot more redo than necessary, and it is quite detrimental to transaction time.
Another option is to chop the delete in parts and execute the parts in parallel… this way you are using your resources a lot better, but by hitting undo and temp at the same time, you end up using a lot of undo and temp again, as before. The best option probably is to setup partitioning… this way the delete will be converted into a DDL statement – alter table x drop partition p – and excessive generation of undo, redo and temp won’t be an issue any more. But this requires the Enterprise Edition and on top of that a separate license for the Partitioning Option, some customers cannot afford.

So let us assume this is an Oracle Standard Edition Database, and you want the delete of 10 million rows to be just one fast transaction, with no more than 2-4GB undo and 2-4GB temp usage, and redo should be as minimal as possible. This is how you could do it, presuming there is a short window allowing you to drop all indexes, with no exception, so even a drop of primary, unique and foreign key indexes is allowed.
During the delete an application that touches this table will slow down because of the dropped indexes, and the table reorganization after the delete with the “move” command will offline the table for a short time. If this is not allowed, you could use the dbms_redefinition package to reorganize the table online, but at the expense of considerably slowing down overall transaction time.

First of all, do a CXTAS of all rows you want to delete, including the rowid’s. You may wonder what CXTAS means… Create eXternal Table As Select. Because of all indexes still active the where clause predicate should be fast, and because this is effectively a DDL statement virtually no redo is generated. By the way, if no archiving or backup of the data to be deleted is needed, consider creating an external table with rowid’s only.
Analyze the external table.
Secondly, disable primary, unique and foreign key constraints, dropping the index. Also drop all other indexes or set unusable. Then delete all rows in the original table with rowid’s saved in the external table you just made, and commit when finished.
Thirdly, move the original table in nologging mode so as to reorganize and lower its high watermark, enable all constraints again and recreate or rebuild – if set unusable before – all other indexes.
Analyze the table.
And last, if the deleted data must be archived, move the external table file dump to some other database or filesystem.

Checkout the following code… this cleanup procedure is actually in use at one of our customers. It is a dynamic setup for cleanup of any table in any schema, provided the table contains just one (primary key ) index, possibly a lob or clob, and a timestamp or date column. If your environment demands another setup, feel free to adjust.

CREATE OR REPLACE PROCEDURE CLEANUP_TABLE
  ( P_schema       VARCHAR2    := 'ETL'
  , P_table_name   VARCHAR2    := 'LOG'
  , P_column_name  VARCHAR2    := 'TIMESTAMP'
  , P_constraint   VARCHAR2    := 'LOG_PK'
  , P_directory    VARCHAR2    := 'ETL_BCK_DIR'
  , P_keep_data    PLS_INTEGER := 1 -- 1 month
  , P_unload2file  PLS_INTEGER := 0
  , P_rollback     PLS_INTEGER := 0 )
AS
  v_curr_schema          VARCHAR2(30 CHAR)  := SYS_CONTEXT('USERENV','CURRENT_SCHEMA');
  v_directory            VARCHAR2(30 CHAR)  := UPPER(TRIM(P_directory));
  v_schema               VARCHAR2(30 CHAR)  := UPPER(TRIM(P_schema));
  v_table_name           VARCHAR2(30 CHAR)  := UPPER(TRIM(P_table_name));
  v_column_name          VARCHAR2(30 CHAR)  := UPPER(TRIM(P_column_name));
  v_constraint           VARCHAR2(30 CHAR)  := UPPER(TRIM(P_constraint));
  v_dmp                  VARCHAR2(100 CHAR) := LOWER(TRIM(P_table_name))||'.dmp';
  v_external_table_name  VARCHAR2(30 CHAR)  := 'XDEL_'||UPPER(TRIM(substr(P_table_name,1,25)));
  v_stat                 VARCHAR2(4000 CHAR);

-----------------------------------------------------------------------------------------------------
PROCEDURE Delete_Backupped_Table_Data
IS

BEGIN

  BEGIN
    EXECUTE IMMEDIATE 'DROP TABLE '||v_external_table_name||' PURGE';
  EXCEPTION
  WHEN OTHERS THEN
    NULL;
  END;

  BEGIN
    UTL_FILE.FREMOVE ( v_directory, v_dmp );
  EXCEPTION
  WHEN OTHERS THEN
    NULL;
  END;

END Delete_Backupped_Table_Data;
----------------------------------------------------------------------------------------------------
PROCEDURE Backup_Table_Data_2b_Deleted
IS
BEGIN

IF unload2file = 1 THEN
  v_stat := 'CREATE TABLE '||v_external_table_name||chr(10)
          ||'ORGANIZATION EXTERNAL'||chr(10)
          ||'('||chr(10)
          ||'TYPE ORACLE_DATAPUMP'||chr(10)
          ||'DEFAULT DIRECTORY '||v_directory||chr(10)
          ||'ACCESS PARAMETERS ( NOLOGFILE )'||chr(10)
          ||'LOCATION ( '''||v_dmp||''' )'||chr(10)
          ||')'||chr(10)
          ||'AS SELECT ROWID RID, t.* FROM '||v_schema||'.'||v_table_name||' t'||chr(10)
          ||'WHERE TO_NUMBER(TO_CHAR('||v_column_name||',''YYYYMMDD'')) < '||chr(10)
          ||'TO_NUMBER(TO_CHAR(TRUNC(ADD_MONTHS(SYSDATE, -'||P_keep_data||')),''YYYYMMDD''))';
ELSE
  v_stat := 'CREATE TABLE '||v_external_table_name||chr(10)
          ||'ORGANIZATION EXTERNAL'||chr(10)
          ||'('||chr(10)
          ||'TYPE ORACLE_DATAPUMP'||chr(10)
          ||'DEFAULT DIRECTORY '||v_directory||chr(10)
          ||'ACCESS PARAMETERS ( NOLOGFILE )'||chr(10)
          ||'LOCATION ( '''||v_dmp||''' )'||chr(10)
          ||')'||chr(10)
          ||'AS SELECT ROWID RID FROM '||v_schema||'.'||v_table_name||chr(10)
          ||'WHERE TO_NUMBER(TO_CHAR('||v_column_name||',''YYYYMMDD'')) < '||chr(10)
          ||'TO_NUMBER(TO_CHAR(TRUNC(ADD_MONTHS(SYSDATE, -'||P_keep_data||')),''YYYYMMDD''))';
END IF;
  dbms_output.put_line ( v_stat );
  execute immediate v_stat;

END Backup_Table_Data_2b_Deleted;
-----------------------------------------------------------------------------------------------------
PROCEDURE Exec_Cleanup
IS
BEGIN

-- drop index
  v_stat := ' ALTER TABLE '||v_schema||'.'||v_table_name||' MODIFY CONSTRAINT '||v_constraint||' DISABLE DROP INDEX';
  dbms_output.put_line ( v_stat );
  execute immediate v_stat;

-- delete rows
  v_stat := ' DELETE '||v_schema||'.'||v_table_name||chr(10)
          ||' WHERE ROWID IN ( SELECT RID FROM '||v_external_table_name||' )';
  dbms_output.put_line ( v_stat );
  execute immediate v_stat;
  if P_rollback = 1 then
    rollback;
  end if;

-- move (reorganize) table and clobs
  v_stat := ' ALTER TABLE '||v_schema||'.'||v_table_name||' MOVE NOLGGING';
  dbms_output.put_line ( v_stat );
  execute immediate v_stat;
  for i in ( select column_name, tablespace_name
             from dba_lobs
             where owner = v_schema
             and table_name = v_table_name )
  loop
    v_stat := ' ALTER TABLE '||v_schema||'.'||v_table_name||' MOVE NOLOGGING '||chr(10)
            ||' LOB ('||i.column_name||')  STORE AS ( TABLESPACE '||i.tablespace_name||' )';
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
  end loop;

-- create pk index again
  v_stat := ' ALTER TABLE '||v_schema||'.'||v_table_name||' MODIFY CONSTRAINT '||v_constraint||' ENABLE';
  dbms_output.put_line ( v_stat );
  execute immediate v_stat;

END Exec_Cleanup;
-----------------------------------------------------------------------------------------------------

BEGIN
  dbms_output.put_line ( '************************************************************************' );
  dbms_output.put_line ( TO_CHAR(SYSDATE,'DY DD-MON-YYYY HH24:MI:SS')||' >> START CLEANUP TABLE '||v_schema||'.'||v_table_name );
  dbms_output.put_line ( '************************************************************************' );
-- Drop external table and dump file, if they already exist
  Delete_Backupped_Table_Data;
-- Create external table and dump file, using a select stat of the data 2b deleted
  Backup_Table_Data_2b_Deleted;
-- Analyze the external table you just made
  DBMS_STATS.GATHER_TABLE_STATS( v_curr_schema, v_external_table_name );
-- Delete table data, using the rowids in the external table
  Exec_Cleanup;
-- Analyze the table after reorg
  DBMS_STATS.GATHER_TABLE_STATS( v_schema, v_table_name );
  dbms_output.put_line ( '************************************************************************' );
  dbms_output.put_line ( TO_CHAR(SYSDATE,'DY DD-MON-YYYY HH24:MI:SS')||' >> END CLEANUP TABLE '||v_schema||'.'||v_table_name );
  dbms_output.put_line ( '************************************************************************' );

EXCEPTION
  WHEN OTHERS THEN
  RAISE_APPLICATION_ERROR( -20010, SQLERRM );
END;
/

About Author

Harry Dragstra is Senior Oracle DBA. He is specialized in Oracle RAC, Oracle Virtualization, and has a a strong interest in APEX development and PL/SQL.

4 Comments

  1. William DeLorie on

    Great information on minimal undo, redo and temp!

    Your “create pk index again” logic did not specify a tablespace for the restored Primary Key Index.
    In most production environments, the large table indexes have a defined tablespace.
    For example, our 4.7 billion record SERVICE table has tablespace, SERVICE_DATA, for the data,
    and tablespace, SERVICE_INDX, for all the indexes.

    In testing your logic, we modified the sample DDL by adding a “USING INDEX” clause.

    ALTER TABLE CRM.SERVICE ENABLE PRIMARY KEY USING INDEX TABLESPACE SERVICE_INDX

    Please be aware that this DLL may take considerable time to execute (several hours for our test).

  2. Hi Harry,

    Your post about deleting many rows from an Oracle table made me think of a project I did a long time ago (2001). In that project the system had to delete almost all of the data every day, except for a few rows. We faced the same problems as you describe. I came up with the idea; why not keep the data we want to preserve instead of throwing away unwanted data.

    I implemented this by creating a partitioned copy of the original table and selecting the data I want to preserve. Then exchange the newly created table partition with the original table and rebuild the indexes. That’s all!

    I tried to code it in a similar way as you did. Advantages of this way of ‘deleting’ data I see:
    – it is really fast.
    – new data segment is organized by default.
    – can be parallellized if needed.
    – indexes can be pre-built on the temp-table before exchange (not implemented in example).
    – very little redo and undo.

    I hope you like it 🙂

    create or replace PROCEDURE CLEANUP_TABLE2
    ( P_schema VARCHAR2
    , P_table_name VARCHAR2
    , P_column_name VARCHAR2
    , P_keep_data VARCHAR2 := ’00-01′ — 00 year and 01 months
    )
    authid current_user
    AS
    c_nl VARCHAR2(1 CHAR) DEFAULT chr(10);
    v_schema VARCHAR2(30 CHAR) := UPPER(TRIM(P_schema));
    v_table_name VARCHAR2(30 CHAR) := UPPER(TRIM(P_table_name));
    v_column_name VARCHAR2(30 CHAR) := UPPER(TRIM(P_column_name));
    v_tmp_table_name VARCHAR2(30 CHAR) := ‘TMP_EXCHG’;
    v_stat VARCHAR2(4000 CHAR);

    —————————————————————————————————–
    PROCEDURE Delete_Backupped_Table_Data
    IS

    BEGIN

    BEGIN
    EXECUTE IMMEDIATE ‘DROP TABLE ‘||v_tmp_table_name||’ PURGE’;
    dbms_output.put_line(‘Table ‘||v_tmp_table_name||’ dropped.’);
    EXCEPTION
    WHEN OTHERS THEN
    dbms_output.put_line(‘Table ‘||v_tmp_table_name||’ doesn”t exist.’);
    END;

    END Delete_Backupped_Table_Data;
    —————————————————————————————————-
    PROCEDURE Keep_Table_Data_2b_Preserved
    IS
    BEGIN

    v_stat := ‘create table ‘||v_tmp_table_name||c_nl
    ||’partition by range (‘||v_column_name||’)’||c_nl
    ||'(partition pmax values less than (maxvalue))’||c_nl
    ||’AS SELECT t.* FROM ‘||v_schema||’.’||v_table_name||’ t’||c_nl
    ||’WHERE not(‘||v_column_name||’ < '||'TRUNC(SYSDATE – TO_YMINTERVAL('||''''||P_keep_data||''''||')))';
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
    dbms_output.put_line('Table '||v_tmp_table_name||' created.');

    END Keep_Table_Data_2b_Preserved;
    —————————————————————————————————–
    PROCEDURE Exec_Cleanup
    IS
    BEGIN

    — exchange partition with table
    v_stat := 'alter table '||v_tmp_table_name||' exchange partition pmax with table '||v_schema||'.'||v_table_name;
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
    dbms_output.put_line('Table exchanged with partition.');

    for r_idx in (select idx.owner
    , idx.index_name
    from all_indexes idx
    where idx.table_owner = v_schema
    and idx.table_name = v_table_name
    and idx.status ‘VALID’
    )
    loop
    v_stat := ‘alter index ‘||r_idx.owner||’.’||r_idx.index_name||’ rebuild’;
    dbms_output.put_line ( v_stat );
    execute immediate v_stat;
    end loop;
    dbms_output.put_line(‘Indexes rebuild’);
    END Exec_Cleanup;
    —————————————————————————————————–

    BEGIN
    dbms_output.put_line ( ‘************************************************************************’ );
    dbms_output.put_line ( TO_CHAR(SYSDATE,’DY DD-MON-YYYY HH24:MI:SS’)||’ >> START CLEANUP TABLE ‘||v_schema||’.’||v_table_name );
    dbms_output.put_line ( ‘************************************************************************’ );
    — Drop temporary table, if they already exist
    Delete_Backupped_Table_Data;
    — Create partitioned table with data to be preserved
    Keep_Table_Data_2b_Preserved;
    — Exchange table data with the temporary table
    Exec_Cleanup;
    — Analyze the table after reorg
    DBMS_STATS.GATHER_TABLE_STATS( v_schema, v_table_name );
    dbms_output.put_line ( ‘************************************************************************’ );
    dbms_output.put_line ( TO_CHAR(SYSDATE,’DY DD-MON-YYYY HH24:MI:SS’)||’ >> END CLEANUP TABLE ‘||v_schema||’.’||v_table_name );
    dbms_output.put_line ( ‘************************************************************************’ );

    END;

    • Harry Dragstra on

      Arjen, I just found a bit of a bug in the TO_YMINTERVAL function… it doesn’t take into account a difference in the amount of days within a month when subtracting or adding one or more months from or to a specific date.
      For instance, if you subtract a month from date March 30th of this year, the result should be February 28 2015 but instead you run into an ORA-01839: “date not valid for month specified.” The ADD_MONTHS function behaves more like expected, so I adjusted the PL/SQL code to use this. Best if you adjust your code too.

      In reaction to your comment…
      Using a CTAS to set all remaining rows apart with a DDL statement is quite a common approach to bulk deletes. By subsequently dropping the original table, renaming the newly created table to its original name, and recreating indexes you avoid any DML statement, resulting in low undo, redo and temp usage. But this presupposes enough disk space for an extra table, no archiving of the deleted data, no dependencies or at least not many dependencies, and an offline time slot for the table switch.
      Your solution using partitioning with CTAS is a bit puzzling to me… why not use partitioning on the original table and drop partitions you no longer need. That’s what I referred to when I mentioned using the Partitioning Option on Oracle Enterprise Edition as probably the best method for a bulk delete. All other stuff, including my code, assumes the Oracle Database Standard Edition.