Oracle dbms_datapump provides a parallel option for exports and imports, but some objects cannot be processed in this mode. In a migration project from 11gR2 on AIX to 12c on an ODA X5-2 ( OL 5.9 ) that included an initial load for GoldenGate, I had to deal with one of those objects: a 600 GB table with LOB columns, stored in the database as BasicFiles ( = traditional LOB storage ).
By applying some DIY parallelization I was able to bring the export time down from 14 hours to 35 minutes.
Instrumental in this solution are the handy “detach” feature of the dbms_datapump package and the use of dbms_rowid to split the table data into equally sized chunks. The first allowed me to define and start datapump jobs without having to wait for each one to finish; the second ensures that all jobs end within a short time of each other.
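To illustrate what this rowid-based split does, here is a minimal sanity-check query ( a sketch: <YOUR_TABLE> and the degree of 4 are placeholders ). Every row is assigned to a chunk based on the data block it resides in, so the chunks come out roughly equal in size without needing a key column:

-- row count per chunk for a degree of 4; the counts should be roughly equal
SELECT MOD( DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID), 4 ) AS chunk
     , COUNT(*)                                       AS row_count
FROM   <YOUR_TABLE>
GROUP  BY MOD( DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID), 4 )
ORDER  BY chunk;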
The following PL/SQL exports tables in 32 chunks with 32 concurrent datapump jobs. Feel free to adjust this “dop” as well as the schema and table names. Just one parameter is provided: it makes the export procedure as a whole wait for the end of all exports, so some other action may start automatically ( e.g. a file transfer ).
CREATE OR REPLACE PACKAGE Datapump_Parallel_Exp_Pck IS

  g_parallel CONSTANT NUMBER       := 32;
  g_dmp_dir  CONSTANT VARCHAR2(25) := 'DATA_PUMP_DIR';

  -------------------------------------------------------------------------------------------------
  PROCEDURE Exec_Export ( P_wait IN PLS_INTEGER := 0 );
  -------------------------------------------------------------------------------------------------

END Datapump_Parallel_Exp_Pck;
/
SHOW ERRORS;

CREATE OR REPLACE PACKAGE BODY Datapump_Parallel_Exp_Pck IS

  -------------------------------------------------------------------------------------------------
  -- sleep for P_milliseconds, implemented as a Java call spec
  PROCEDURE Sleep ( P_milliseconds IN NUMBER )
  AS LANGUAGE JAVA NAME 'java.lang.Thread.sleep(long)';

  -------------------------------------------------------------------------------------------------
  FUNCTION Get_Current_Scn RETURN NUMBER
  IS
    v_ret NUMBER := 0;
  BEGIN
    SELECT current_scn
      INTO v_ret
      FROM v$database;
    RETURN v_ret;
  EXCEPTION
    WHEN OTHERS THEN
      RAISE_APPLICATION_ERROR( -20010, SQLERRM||' - '||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
  END Get_Current_Scn;

  -------------------------------------------------------------------------------------------------
  PROCEDURE Exp_Tables_Parallel ( P_scn IN  NUMBER
                                , P_dmp OUT VARCHAR2 )
  IS
    h1                  NUMBER(10);
    v_dop               NUMBER        := g_parallel;
    v_curr_scn          NUMBER        := P_scn;
    v_job_name_org      VARCHAR2(30)  := 'PX_'||TO_CHAR(sysdate,'YYYYMMDDHH24MISS'); -- PX: Parallel Execution
    v_job_name          VARCHAR2(30)  := v_job_name_org;
    v_dmp_file_name_org VARCHAR2(100) := lower(v_job_name||'.dmp');
    v_dmp_file_name     VARCHAR2(100) := v_dmp_file_name_org;
    v_log_file_name_org VARCHAR2(100) := lower(v_job_name||'.log');
    v_log_file_name     VARCHAR2(100) := v_log_file_name_org;
  BEGIN
    -- drop master table for "orphaned job" if it exists
    for i in ( select 'DROP TABLE '||owner_name||'.'||job_name||' PURGE' stat
                 from dba_datapump_jobs
                where owner_name = USER
                  and instr(v_job_name, upper(job_name)) > 0
                  and state = 'NOT RUNNING'
                  and attached_sessions = 0 )
    loop
      execute immediate i.stat;
    end loop;

    -- set out parameter
    P_dmp := v_dmp_file_name;

    -- start jobs in parallel
    DBMS_OUTPUT.PUT_LINE('**** START SETTING DATAPUMP PARALLEL_TABLE_EXPORT JOBS ****' );
    for counter in 0 .. v_dop-1
    loop
      v_job_name      := v_job_name_org||'_'||lpad(counter+1, 3, '0');
      v_dmp_file_name := v_dmp_file_name_org||'_'||lpad(counter+1, 3, '0');
      v_log_file_name := v_log_file_name_org||'_'||lpad(counter+1, 3, '0');

      h1 := dbms_datapump.open ( operation => 'EXPORT'
                               , job_mode  => 'SCHEMA'
                               , job_name  => v_job_name
                               , version   => 'LATEST');
      DBMS_OUTPUT.PUT_LINE('Successfully opened job: '||v_job_name);

      dbms_datapump.set_parallel(handle => h1, degree => 1);
      dbms_datapump.set_parameter(handle => h1, name => 'KEEP_MASTER', value => 0);
      dbms_datapump.set_parameter(handle => h1, name => 'ESTIMATE', value => 'BLOCKS');
      dbms_datapump.set_parameter(handle => h1, name => 'INCLUDE_METADATA', value => 0);
      dbms_datapump.set_parameter(handle => h1, name => 'METRICS', value => 1);
      dbms_datapump.set_parameter(handle => h1, name => 'FLASHBACK_SCN', value => v_curr_scn);
      DBMS_OUTPUT.PUT_LINE('Successfully set job parameters for job '||v_job_name);

      -- export just these schemas
      dbms_datapump.metadata_filter( handle => h1
                                   , name   => 'SCHEMA_LIST'
                                   , value  => ' ''<SCHEMA01>'',''<SCHEMA02>'',''<SCHEMA03>'' ');
      DBMS_OUTPUT.PUT_LINE('Successfully set schemas for job '||v_job_name);

      -- export tables only
      dbms_datapump.metadata_filter( handle => h1
                                   , name   => 'INCLUDE_PATH_EXPR'
                                   , value  => q'[='TABLE']' );
      DBMS_OUTPUT.PUT_LINE('Successfully set table export for job '||v_job_name);

      -- export just these tables
      dbms_datapump.metadata_filter( handle      => h1
                                   , name        => 'NAME_LIST'
                                   , value       => ' ''<TABLE01>'',''<TABLE02>'',''<TABLE03>'',''<TABLE04>'',''<TABLE05>'' '
                                   , object_path => 'TABLE');
      DBMS_OUTPUT.PUT_LINE('Successfully set tables for job '||v_job_name);

      -- export just a 1/v_dop part of the data
      dbms_datapump.data_filter( handle => h1
                               , name   => 'SUBQUERY'
                               , value  => 'WHERE MOD(DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID), '||v_dop||')='||counter);
      DBMS_OUTPUT.PUT_LINE('Successfully set data filter for job '||v_job_name);

      dbms_datapump.add_file ( handle    => h1
                             , filename  => v_dmp_file_name
                             , directory => g_dmp_dir
                             , filetype  => DBMS_DATAPUMP.KU$_FILE_TYPE_DUMP_FILE
                             , reusefile => 1 );
      DBMS_OUTPUT.PUT_LINE('Successfully added dmp file: '||v_dmp_file_name);

      dbms_datapump.add_file ( handle    => h1
                             , filename  => v_log_file_name
                             , directory => g_dmp_dir
                             , filetype  => DBMS_DATAPUMP.KU$_FILE_TYPE_LOG_FILE);
      DBMS_OUTPUT.PUT_LINE('Successfully added log file: '||v_log_file_name);

      dbms_datapump.log_entry( handle  => h1
                             , message => 'Job '||(counter+1)||'/'||v_dop||' starting at '||to_char(sysdate,'dd-mon-yyyy hh24:mi:ss')||' as of scn: '||v_curr_scn );

      dbms_datapump.start_job(handle => h1, skip_current => 0, abort_step => 0);
      DBMS_OUTPUT.PUT_LINE('Successfully started job '||(counter+1)||'/'||v_dop||' at '||to_char(sysdate,'dd-mon-yyyy hh24:mi:ss')||' as of scn: '||v_curr_scn );

      -- detach: leave the job running and move on to defining the next one
      dbms_datapump.detach(handle => h1);
      DBMS_OUTPUT.PUT_LINE('Successfully detached from job');
    end loop;
    DBMS_OUTPUT.PUT_LINE('**** END SETTING DATAPUMP PARALLEL_TABLE_EXPORT JOBS ****' );
  EXCEPTION
    WHEN OTHERS THEN
      dbms_datapump.detach(handle => h1);
      DBMS_OUTPUT.PUT_LINE('Successfully detached from job');
      DBMS_OUTPUT.PUT_LINE('Error: '||SQLERRM||' - '||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
      DBMS_OUTPUT.PUT_LINE('**** END SETTING DATAPUMP PARALLEL_TABLE_EXPORT JOBS ****' );
      RAISE_APPLICATION_ERROR( -20010, SQLERRM||' - '||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
  END Exp_Tables_Parallel;

  -------------------------------------------------------------------------------------------------
  PROCEDURE Exec_Export ( P_wait IN PLS_INTEGER := 0 )
  IS
    v_scn       NUMBER;
    v_dmp       VARCHAR2(200);
    export_done PLS_INTEGER := 0;
  BEGIN
    -- get current scn
    v_scn := Get_Current_Scn;

    -- start parallel export processes + detach
    Exp_Tables_Parallel( v_scn, v_dmp );

    if P_wait = 1 then
      -- wait till all parallel export processes are finished:
      -- as long as a master table of one of the jobs exists, that job is still running
      -- check every 5 minutes
      export_done := 0;
      loop
        for i in ( select 1
                     from ( select count(*) cnt
                              from user_tables
                             where instr(table_name, upper(replace(v_dmp,'.dmp'))) > 0 )
                    where cnt = 0 )
        loop
          export_done := 1;
        end loop;
        if export_done = 1 then
          exit;
        end if;
        Sleep(300000); -- 5 minutes
      end loop;
    end if;
  EXCEPTION
    WHEN OTHERS THEN
      DBMS_OUTPUT.PUT_LINE('Error: '||SQLERRM||' - '||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
      RAISE_APPLICATION_ERROR( -20010, SQLERRM||' - '||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
  END Exec_Export;

  -------------------------------------------------------------------------------------------------

END Datapump_Parallel_Exp_Pck;
/
SHOW ERRORS;
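Usage comes down to a single call: compile the package, replace the schema and table placeholders, and run Exec_Export. A minimal sketch ( the PX_ job-name prefix and the 5-minute polling interval come from the package above ):

SET SERVEROUTPUT ON
-- start the 32 export jobs and return immediately
EXEC Datapump_Parallel_Exp_Pck.Exec_Export;

-- or: wait until all jobs have finished, so a next step can start automatically ( e.g. a file transfer )
EXEC Datapump_Parallel_Exp_Pck.Exec_Export( P_wait => 1 );

-- the detached jobs can be monitored from another session
SELECT owner_name, job_name, state
FROM   dba_datapump_jobs
WHERE  job_name LIKE 'PX_%'
ORDER  BY job_name;

Note that the dump files are suffixed _001 up to _032, which does not match the two-digit %U substitution pattern, so for the import they can simply be passed to impdp as an explicit comma-separated DUMPFILE list.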
How would you import this data? Do we specify “px_20200320110018.dmp_%U” as the dump file name for impdp?
Can you please post the script usage?
Nice script, Harry, and it works like a charm 😉
Harry, thanks for this Do It Yourself post.
Question for a correct understanding: is this specifically useful for Oracle Standard Edition, since you can’t use the SET_PARALLEL procedure (EE only) to set the degree of parallelism?
Regards, Job.
Yes, on Enterprise Edition in any case where parallelization just doesn’t work ( like on a table with LOBs stored as BasicFiles instead of SecureFiles ), but also on Oracle Standard Edition, where parallelization isn’t part of the license.
To the best of my knowledge dbms_rowid can be used in any edition, but I’m not an expert on license rules, so better make sure this is allowed.