Read an Excel xlsx with PL/SQL

Anton Scheffer January 19, 2013 APEX, PL/SQL 131 Comments

At the OTN SQL and PLSQL forum I promised to publish some code I use for a project I’ still working on. This code allows you to select the content from an Excel document

select * from table( as_read_xlsx.read( as_read_xlsx.file2blob( 'DOC', 'Book1.xlsx' ) ) );

  SHEET_NR SHEET_NAME ROW_NR COL_NR CELL  CEL STRING_VAL NUMBER_VAL DATE_VAL
---------- ---------- ------ ------ ----- --- ---------- ---------- --------------------------
         1 Mijn naam       1      1 A1    N                      11
         1 Mijn naam       1      2 B1    N                      12
         1 Mijn naam       1      3 C1    N                      13
         1 Mijn naam       2      1 A2    N                      21
         1 Mijn naam       2      2 B2    N                      22
         1 Mijn naam       2      3 C2    N                      23
         1 Mijn naam       3      1 A3    N                      31
         1 Mijn naam       3      2 B3    N                      32
         1 Mijn naam       3      3 C3    N                      33
         1 Mijn naam       4      4 D4    S   D4
         1 Mijn naam       6      2 B6    D
         1 Mijn naam       7      2 B7    D
         1 Mijn naam       8      2 B8    D
         1 Mijn naam       9      2 B9    D
         1 Mijn naam      10      2 B10   D
         2 Sheet3          2      2 B2    S   Test
         2 Sheet3          3      3 C3    D                         19-JAN-2013 20:17:00
         2 Sheet3          4      1 A4    S   Anton

18 rows selected.

So here it is

Anton

** Changelog:
** 18-02-2013 – Ralph Bieber
Handle cell type “str” to prevent ORA-06502
if cell content is a string calculated by formula,
then cell type is “str” instead of “s” and value is inside tag
** 19-02-2013 – Ralph Bieber
Add column formula in tp_one_cell record, to show, if value is calculated by formula
** 20-02-2013 – Anton Scheffer
Handle cell types ‘inlineStr’ and ‘e’ to prevent ORA-06502
** 19-03-2013 – Anton Scheffer
Support for formatted and empty strings
Handle columns per row to prevent ORA-31186: Document contains too many nodes
** 12-06-2013 – Anton Scheffer
Handle sharedStrings.xml on older Oracle database versions
** 18-09-2013 – Anton Scheffer
** 18-09-2013 – Anton Scheffer
Fix for LPX-00200 could not convert from encoding UTF-8 to …
(Note, this is an error I can’t reproduce myself, maybe depending on database version and characterset)
Thank you Stanislav Safonov for this solution
Handle numbers with scientific notation
** 20-01-2014 – Anton Scheffer
Fix for a large number (60000+) of strings
** 16-05-2014 – Anton Scheffer
round to 15 digits
** 22-01-2016 – Anton Scheffer
free some used memory
** 02-02-2016 – Anton Scheffer
fixed bug regarding only returning first sheet, thanks Martin Goblet
** 19-07-2017 – Anton Scheffer
added fix for some invalid xlsx wrongly having /xl/ before file location

Tags:excel, plsql

About The Author

Anton Scheffer

Oracle Consultant at AMIS

131 Comments

Jose Luis A.R. July 25, 2016

Hello Anton:

First of all thanks for the input, I’m trying to read a file in excel, I’m using Toad for Oracle 9.7.2, and shows me the following error:

select * from table (as_read_xlsx.read (as_read_xlsx.file2blob (‘LEEEXCEL’, ‘Fuente.xlsx’)))

Error:
ORA-22288: failure file or LOB operation FILEOPEN
The device is not ready.
ORA-06512: at “DB_DATOS.AS_READ_XLSX”, line 376

Please could you help me, thanks.
- Anton Scheffer July 25, 2016
  
  Are you still using Oracle 9? I’ve never tried as_read_xlsx on an Oracle 9 database, so I’ve no idea if it will work at all. The error you get has nothing to do with parsing the Excel file though. The error is from the function file2blob. So the the file itself can’t be read. I can’t help you with that.
  - Jose Luis A.R. August 1, 2016
    
    Good thank you
Simon June 30, 2016

Apologies, seems it was an issue with the file…. recreated the xlsx files and appears to be working as expected…. thanks
Simon June 29, 2016

question: if I have a column that is formatted as “general” I’m not getting a value….. any ideas?
- Anton Scheffer June 29, 2016
  
  No, I have no idea. But when you can send me that Excel-file I will take a look and maybe I can tell you more.
Lucas June 25, 2016

Fantastic! So many thanks!
COUAS June 13, 2016

Hi Anton

I use APEX 4.0.2 with Oracle10 Express Edition, could i use your package to Import xlsx file from windows PC ?
Currently i have an error message ORA-22285: répertoire ou fichier inexistant pour l’opération FILEOPEN
ORA-06512: à “SCE_PROD.AS_READ_XLSX”, ligne 374

Regards
Phil
- Anton Scheffer June 13, 2016
  
  No, you can’t use this package to read from a file on a client PC. It works on a blob (i.e. in the database) or on a file on your database server.
  - Martin Goblet June 13, 2016
    
    As Anton just mentionned, you’ll need a procedure to get your file from your pc to the database/server. Have a look at this package https://technology.amis.nl/wp-content/uploads/images/FTP_INTERFACE_v2.1.sql
    - COUAS June 14, 2016
      
      Hi,
      
      Currently i use File WWW_FLOW_FILE with upload button. So i think using it ?
      
      My current code to upload an CSV file
      select blob_content into v_blob_data
      from wwv_flow_files
      where last_updated = (select max(last_updated) from wwv_flow_files where UPDATED_BY = :APP_USER)
      and id = (select max(id) from wwv_flow_files where updated_by = :APP_USER);
      
      Regards
      - Anton Scheffer June 14, 2016
        
        This package will not work on CSV-files. If you upload a XLSX file you can use this package to parse that. Just take the blob from wwv_flow_files.
        select *
        from table( as_read_xlsx.read( v_blob_data ) )
Nagaraj June 12, 2016

Hi ANTON SCHEFFER,

This was an awesome package. It helped me a lot in my current implementation.Just one thing i wanted to know, about the Hyperlink creation. I am facing issue while creating the hyperlinks in excel sheet. Facing issue while creating hyperlink using “Place in This Document” option within the excel. It will be a great help if you send me an example for creating hyperlink.

Thanks in Advance!

Regards,
Nagaraj
- Anton Scheffer June 13, 2016
  
  With this package you can read a XLSX-file, but you can’t create a XLSX or a hyperlink inside a XLSX.
  - Nagaraj June 17, 2016
    
    Hi Anton,
    
    I tried below option and it worked for me creating hyperlink. Please let me know your suggestion on the below.
    
    hyperlink ( p_col => 1, p_row => 1 , p_value => ‘EMP’ , p_url => ‘C:Exampleexamples.xlsx#’EMP’!A1′,p_sheet => 1);
    
    Regards,
    Nagaraj
  - Nagaraj June 17, 2016
    
    Hi Anton,
    I tried below option and it worked for me creating hyperlink. Please let me know your suggestion on the below.
    
    hyperlink ( p_col => 1, p_row => 1 , p_value => ‘EMP’ , p_url => ‘C:\Example\examples.xlsx#’EMP’!A1′,p_sheet => 1);
    
    Regards,
    Nagaraj
Nagaraj June 10, 2016

Hi ANTON SCHEFFER,

This was an excellent package i never came across. It helped me a lot in my implementation.

One thing i am facing issues is while creating the Hyperlinks in the excel. Could you please share with me, if you have any example for creating the Hyperlink using your package?

Thanks in Advance!

Regards,
Nagaraj
GZB4ZP March 16, 2016

Anton, this is a very interesting solution. Have you done anything similar for Excel 97-2003 type files, i.e. .xls?
- Anton Scheffer March 17, 2016
  
  No, this solution is only for xlsx format
  - GZB4ZP March 17, 2016
    
    Thanks for the quick reply.
  - Stanislav March 17, 2016
    
    Anton, you’re doomed to answer that question forever! )
    - Sampathkumar March 21, 2016
      
      Anton,
      
      I am getting below error.
      
      ORA-22285: non-existent directory or file for FILEOPEN operation
      ORA-06512: at “APPS.AS_READ_XLSX”, line 377
      - Anton Scheffer March 21, 2016
        
        That error says everything you need to know. The directory (parameter p_dir) you use doesn’t exists on your database server, or you (= user APPS) don’t have the necessary grants to use it.
      - martin.goblet@dornach.eu March 21, 2016
        
        @ SAMPATHKUMAR : got the same error in the past. Problem : I used the physical name of the directory instead of the name saved in ALL_DIRECTORIES. Hope this helps.
      - Sampathkumar March 21, 2016
        
        Thanks for your prompt response.
        
        My requirement is
        I have one excel file with 25 sheets.
        I need load the data into 25 different tables.
        How I need to approach.
        
        I am calling this package from apps user
        for (parameter p_dir) do I need o create any directory.
        
        Thanks in advance
        
        Sampathkumar Reddy E
      - Sampathkumar March 21, 2016
        
        Hi Anton,
        
        I have created the directory still i am getting below error
        
        ORA-22288: file or LOB operation FILEOPEN failed
        No such file or directory
        ORA-06512: at “APPS.AS_READ_XLSX”, line 377
      - snaseer March 21, 2016
        
        Create a DATABASE directory, which stores the path WHERE the files will be downloaded/read
        
        CREATE OR REPLACE DIRECTORY file_dir AS ‘/opt/FILE’;
  - Manoj May 27, 2016
    
    I need to do the same with xls e.g Excel 97-2003 type files. Please help
    - Anton Scheffer May 27, 2016
      
      Help with what? Finding the Excel97 specs? Writing some code? Money to hire a PL/SQL developer? This blog entry clearly states that it is about Excel xlsx. It has nothing to do with Excel 97-2003 type files. You’re welcome to ask anything about this blog, but for general questions about Oracle you can better use https://asktom.oracle.com or https://community.oracle.com/community/database/developer-tools/sql_and_pl_sql/content
    - Andrew May 27, 2016
      
      xls files are binary and you need the Excel program or a parser that can read the binary file.
      xlsx files are zipped directory structures containing a defined structure of xml files which are just text. That is why Anton’s plsql package can read xlsx and not xls.
      
      Try saving your xls files as xlsx.
Martin Goblet February 2, 2016

Hi Anton

The last version with the release of the memory only reads the first worksheet. The other worksheets seems to be ignored with the memory release, so i guess it’s not at the right place.

Regards
Martin
- Anton Scheffer February 2, 2016
  
  @Martin Goblet
  Fixed it, thanks for pointing out that error. See fixed version
  - naveen reddy Somu April 14, 2016
    
    Hi Anton,
    
    In the latest version all the records are getting duplicated. can you Please check this?
    
    Thanks,
    Naveen
    - Anton Scheffer April 14, 2016
      
      I have not heard from anybody having problems with the latest version. But I do know that I had some problems regarding double records when I used a Oracle 10 XE database, see https://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/ That turned out to be a Oracle bug. With Oracle 11 and 12 databases it should work.
      - naveen reddy Somu April 14, 2016
        
        Hi Anton,
        
        thanks very much for the quick response.
        
        currently I am testing in “Oracle Database 10g Express Edition Release 10.2.0.1.0 – Product” environment. as per you this is a problem with Oracle 10 version . Can it be work around with this version?
      - Anton Scheffer April 14, 2016
        
        I don’t know if a work around exists, but I certainly don’t have one. Sorry!
mahmoud kandeel January 24, 2016

Thanks a lot Anton
mahmoud kandeel January 24, 2016

Hi Andrew,

regarding the modification to decrease load on memory, would you explain the code?
I made the first modification:
————————————————————–
function blob2node( p_blob blob )
return dbms_xmldom.domnode
is
t_ndoc dbms_xmldom.DOMDocument;
begin
if p_blob is null or dbms_lob.getlength( p_blob ) = 0
then
return null;
end if;
t_ndoc := dbms_xmldom.newdomdocument( xmltype( p_blob, nls_charset_id( ‘AL32UTF8’ ) ) );
return dbms_xmldom.makenode( dbms_xmldom.getdocumentelement( t_ndoc ) );
/*return dbms_xmldom.makenode ( dbms_xmldom.getdocumentelement ( dbms_xmldom.newdomdocument ( xmltype ( p_blob, nls_charset_id ( ‘AL32UTF8’ ) ) ) ) );*/
exception
when others
then
declare
t_nd dbms_xmldom.domnode;
t_clob clob;
t_dest_offset integer;
t_src_offset integer;
t_lang_context number := dbms_lob.default_lang_ctx;
t_warning integer;
begin
dbms_lob.createtemporary( t_clob, true );
t_dest_offset := 1;
t_src_offset := 1;
dbms_lob.converttoclob( t_clob
, p_blob
, dbms_lob.lobmaxsize
, t_dest_offset
, t_src_offset
, nls_charset_id(‘AL32UTF8’)
, t_lang_context
, t_warning
);
t_nd := dbms_xmldom.makenode( dbms_xmldom.getdocumentelement( dbms_xmldom.newdomdocument( t_clob ) ) );
dbms_lob.freetemporary( t_clob );
return t_nd;
end;
end;
————————————————————–
but I’m not sure where to put
dbms_xmldom.freedocument(t_ndoc);
dbms_session.free_unused_user_memory;
return;

Regards
Mahmoud Kandeel
- Anton Scheffer January 24, 2016
  
  I’ve put a new version https://technology.amis.nl/wp-content/uploads/2013/01/as_read_xlsx8.txt on the blog. That version releases some unused memory, including a dbms_xmldom.freedocument(t_ndoc)
Andrew January 21, 2016

@Martin Goblet
” I guess oracle does not remove everything from the memory : I’ve searched a lot, but not found a solution!!!”

I noticed this as well. It is a memory leak issue with the plsql package not releasing the memory with additional calls in the same Oracle session. I made an adjustment to the package by creating a local variable instead of passing a function call as a parameter. This allows the code to explicitly release the memory.

See my post on October 11, 2013 20:18
mahmoud kandeel January 21, 2016

After discussion with Martin, it’s figured out that the package needs memory, and no limit for excel records.

Thanks Martin
Thanks Anton
mahmoud kandeel January 21, 2016

hi Martin,

would you please try my excel file? But I don’t know how to send to you.
However, my e-mail address is: mahmoud.kandeel@gmail.com

Best Regards,
Mahmoud Kandeel
Martin Goblet January 21, 2016

Hi MAHMOUD

Just tested with an excel sheet with 42164 rows, 12 columns on a PC (a very small machine) : it took 3,5 seconds to get the first records and 142 seconds to retrieve all the 505968 records.

Most of the time I do not read those records like this, but in columns with MAX (I’ve already explained in another post), to get something that looks directly like the lines of the worksheet.

PS : be aware, that if you launch several time the statement in the same session, each time it’s longer to execute. I guess oracle does not remove everything from the memory : I’ve searched a lot, but not found a solution!!!

Regards
Martin
mahmoud kandeel January 21, 2016

Hi Martin,

tell me please, how long did it take to extract 50,000 records from the excel file.

I have the same problem regarding “REPLY” button, never mind

Regards
Mahmoud kandeel
mahmoud kandeel January 21, 2016

Hi Martin,

tell me how much time did it take to extract 50,000 records from the excel

I have the same problem regarding “REPLY” button, never mind

Regards
Mahmoud kandeel
Martin Goblet January 21, 2016

Hi MAHMOUD

You don’t need to do a DISTINCT, just SELECT * FROM TABLE(as_read_xlsx.read(as_read_xlsx.file2blob(ps_filepath, ps_xlsx_sheet)));

PS : the REPLY button does no more work, So i’ve to create each time a new conversations instead of answering directly. Tried whit Chrome and IE.

Regards
Martin
mahmoud kandeel January 21, 2016

hi Martin,

please provide me with your steps, I might drop something when using the Anton’s package.
note: I use the following query to get data from a BLOB field within a database table
select distinct D.sheet_nr
, D.sheet_name
, D.row_nr
, D.col_nr
, D.cell
, D.cell_type
, D.string_val
, D.number_val
, D.date_val
, D.formula
from table
(select
(as_read_xlsx.read(g.FILE_BLOB))
from UPLOAD_BLOB g) D

Regards
Mahmoud Kandeel
- Anton Scheffer January 21, 2016
  
  There’s no limit on the number of records. This query should work
  select D.sheet_nr
  , D.sheet_name
  , D.row_nr
  , D.col_nr
  , D.cell
  , D.cell_type
  , D.string_val
  , D.number_val
  , D.date_val
  , D.formula
  from upload_blob g
  , table( as_read_xlsx.read(g.FILE_BLOB) ) d
Martin Goblet January 21, 2016

Hi MAHMOUD

I use Anton’s package on excel sheets with >50000 records without any problem.

Regards
Martin
mahmoud kandeel January 21, 2016

Hi ANTON,

the package works perfectly well with excel files with 5000 records or less.
is there a way out from this trap?

Regards
Mahmoud Kandeel
mahmoud kandeel January 14, 2016

Anton,

your packages are highly appreciated, and the results are really good.

I can use as_read_xlsx package by calling a BLOB column existing in some table of the database.
But calling a database directory doesn’t work.
My database directory path is as follows ‘\\10.10.1.33\general’, which works very fine using utl_file.fopen to read text files, as this path is fully accessible to all domain users with read and write privileges.

CREATE OR REPLACE DIRECTORY UTL_FILE_DIR AS ‘\\10.10.1.33\general’;

grant read,write on directory utl_file_dir to ;

select *
from table(
nonmed.as_read_xlsx.read
(nonmed.as_read_xlsx.file2blob( ‘UTL_FILE_DIR’, ‘EXCEL_EDITED.xlsx’ ))
);

the following error returned:
ORA-22288: file or LOB operation FILEOPEN failed

Thanks & Best Regards
Mahmoud kandeel
- Anton Scheffer January 19, 2016
  
  What do you mean with “which works fine using utl_file.fopen”
  That this is working?
  declare
  fh utl_file.file_type;
  begin
  fh := utl_file.fopen( ‘UTL_FILE_DIR’, ‘EXCEL_EDITED.xlsx’, ‘rb’ );
  utl_file.fclose( fh );
  end;
  
  Or this
  declare
  fh utl_file.file_type;
  begin
  fh := utl_file.fopen( ‘\\10.10.1.33\general’, ‘EXCEL_EDITED.xlsx’, ‘rb’ );
  utl_file.fclose( fh );
  end;
Lilianna November 18, 2015

Is that any update, because I found some column and row that return null, but show in select query. Example in excel just have 4 column and 10 row, but the select query return more than 4 column and 10 row, and different number of column.
- MARTIN GOBLET November 18, 2015
  
  Hi Lilianna
  
  When you are in the excel file and press the keys END HOME, do you get to field D10 or another ? If it’s another, then you have somewhere a space (or something like this) in a field. Delete the unused columns, save and try again.
  
  Regards
  Martin
  - Lilianna November 19, 2015
    
    I can’t avoid that space or anything in that field because the excel is provide by user, so I make dynamic query for select just for the right column, thanks for the fast reply.
TH Tech September 13, 2015

Hello Anton – great package, and works for most sane XLSX files … but I ran into a batch of XLSX files that unpredictably contain cellls with more than 4000 characters. So I had to modify your code to handle CLOBS. Yup, seems like overkill for a spreadsheet – but it is what iit i and I need to read it …

So.. the VARCHAR2(4000) limitation comes in the string_val declaration and the DBMS_XSLPROCESSOR.VALUEOF call. I switched by string_Vl declaration to CLOB, and switched the VALUEOF function call to a VALUEOF procedure call. The VALUE procedure will return up to 3767 chars, and that OUT parameters gets implicitly treated as a CLOB. I used CLOB because that what I need to store my data as.
Thank you!

Read more here … http://wp.me/pPnuD-6y
Jack R August 1, 2015

Hi Anton,

Thank you so much for sharing with us.

I have one similar requirements,read the excel sheet data (in blob column) from one of worksheet and first clear existing data and add new data from other table only for one selected worksheet, other worksheet data remains as it is. Do you have any such type of things which change the data and return in blob type.

Thanks
Jack R
Sagay June 3, 2015

@Anton – Thanks for the package. It is parsing the Excel file as expected.

When I run the given query in SQL developer, after returning few 100 rows it hits “ORA-06502: PL/SQL: numeric or value error: character string buffer too small” at line 296 of the package.

Can you please help?
- Anton Scheffer June 3, 2015
  
  @Sagay. The max size of a varchar2 in a SQL-query is 4000 characters. If your excel contains larger strings you will get that error. I don’t handle that in purpose. If you do want to truncate those values, just change that line to t_one_cell.string_val := substr( t_strings( to_number( t_val ) ), 1, 4000 );
CUGAR April 22, 2015

Hi, this is great package, but I have one problem.

When column in excel file contains only column name (in first row) and rest of values in taht column contaions NULLs, package will read only column name and maybe one more row with Null value.
It will not read other NULL’s values from that column.

How can it be fixed?
- Anton Scheffer April 23, 2015
  
  In my opinion that doesn’t need to be fixed. The package reads values stored in a Excel file. If a column only contains a “column name”, that means there’s nothing stored in the other rows (for that column). And that means no output from this package. But if you know that the Excel contains a matrix, and you expect null values for empty cells, YOU can change the query to add those values, outer joining it to something for instance. But feel free to change to package yourself to fix it.
- Nicolas Denis April 23, 2015
  
  Here the way I proceed the Excel once loaded. I use this to rebuild the matrix to its original state using a single SQL query.
  I dump the excel into a Global Temporary table for faster access, detect the “header” row (it may not be the first row) and rebuild my matrix with a query similar to:
  
  SELECT MAX(DECODE(col_nr, 1, TRIM(string_val), NULL)) COL_A,
  MAX(DECODE(col_nr, 2, TRIM(string_val), NULL)) COL_B
  FROM GLOBAL_TEMPORARY_TABLE
  WHERE SHEET_NAME = ‘My Sheet’
  AND ROW_NR > nFirstRowID
  GROUP BY ROW_NR;
  
  That way, even if the package do not read empty cell – the output you get include the blank column with NULL.
  Use “string_val”, “number_val” or “date_val” in the query depending with datatype you expect.
  
  Hope this will help
Martin Goblet March 12, 2015

Hi Anton

Again, many thanks for this wonderful package (and for the as_zip too)

Just a little correction, because I got a error on a huge string : added a substr(…, 1, 4000) on 2 lines

IF t_t IN (‘str’, ‘inlineStr’, ‘e’) THEN
t_one_cell.cell_type := ‘S’;
t_one_cell.string_val := SUBSTR(t_val, 1, 4000);
ELSIF t_t = ‘s’ THEN
t_one_cell.cell_type := ‘S’;

IF t_val IS NOT NULL THEN
t_one_cell.string_val := SUBSTR(t_strings(TO_NUMBER(t_val)), 1, 4000);
END IF;
ELSE

An other problem I noticed : when I run the procedure several times, it takes more time at each run. When I often run it on large xlsx files, I’ve to close my session and reopen it to start from scratch.

I guess that some memory is not freed up (DBMS_LOB.freetemporary, DBMS_SESSION.free_unused_user_memory, …) somewhere ?

Regards
Martin
- Andrew March 12, 2015
  
  Martin,
  I experienced the elapsed time increase with successive runs as well. There is a memory leak because a call to dbms_xmldom.getdocumentelement() is used as a parameter in a call to dbms_xmldom.newdocument(). The allocated memory is not released.
  I made a fix for that by using a local variable t_ndoc for the getdocumentelement() call and then explicitly freeing the memory.
  
  See my post on Oct 11, 2013.
  
  -–return dbms_xmldom.makenode( dbms_xmldom.getdocumentelement( –dbms_xmldom.newdomdocument( xmltype( p_blob, nls_charset_id( ‘AL32UTF8′ ) ) ) ) );
  
  t_ndoc := dbms_xmldom.newdomdocument( xmltype( p_blob, nls_charset_id( ‘AL32UTF8′ ) ) );
  return dbms_xmldom.makenode( dbms_xmldom.getdocumentelement( t_ndoc ) );
  - Martin Goblet March 12, 2015
    
    Andrew
    
    I didn’t see you post before Tried your solution, but the elapsed time still increases (a little bit less now) by each run. Could it be possible that the BLOB parameters memory is never reset (p_xlsx, p_blob, p_zipped_blob) ?
    
    Regards
    Martin
    - Andrew March 12, 2015
      
      I haven’t noticed any continued degradation after that fix but it may be possible the memory for the blobs are not released as well. Perhaps my files are not large enough to notice a difference.
      
      There is this line which may have a similar effect since it is using a function to pass as a parameter.
      
      t_nd := blob2node( get_file( p_xlsx, ‘xl/workbook.xml’ ) );
      
      Perhaps setting the get_file() call to a local variable first will have an effect but that is just speculation right now. You may have to search through the code to find instances like that and try it out.
Steven Holley August 6, 2014

Anton,
This is an excellent package that has saved me countless hours. I have only run into one problem. When reading values from the spreadsheet using the following command – select *
from table( as_read_xlsx.read( as_read_xlsx.file2blob( gv_directory, p_filename ) ) ) – if there are two consecutive null cells in the row, the second one does not show up in the returned data set. Is there any way to fix this?

Thanks for your time
Steve
Sokratis Aristovoulidis May 17, 2014

Many thanks for sharing!!
Martin Goblet May 14, 2014

Hi Anton

Many thanks for this code, it saved me a lot of time.

Just 2 points where I’ve some trouble with.
– Numbers appears like this 0.82280000000000009 instead of 0.8228. There are some false decimals at the end. Workaround I use is a round(), because I know the expected decimals in this field.
– I’ve a worksheet (link placed in the website) with a formula in a column (for each row). I only get the formula from the first row, the others are null.

Regards
Martin
- Anton Scheffer May 16, 2014
  
  @Martin Goblet
  I’ve added rounding to the package.
  The formula’s are empty because it are “shared” formula’s. Something I didn’t know about before, and that’s not handled (yet)
  - Martin Goblet May 19, 2014
    
    Many thanks.
critian April 28, 2014

Anton Scheffer you sent or explain like work in xls…greetings you are the best.
Cristian April 9, 2014

Hola Anton
Una consulta, cuando mando a listar me lista pero con data repetida
Gracias
Nicolas DENIS April 2, 2014

Hello,

I did do a quick improvement to the code.
When you put by mistake a large file (did test with a 500K text file) that is NOT a excel file, the get_file take pretty long time to return nothing.

I did add this at the beginning of the get_file function:
BEGIN
IF dbms_lob.substr(p_zipped_blob, 4, 1) = hextoraw(‘504B0304’) THEN
….
END IF;
RETURN NULL;
END;

To test the ZIP Header signature. If not a ZIP file, I simply ignore the request. (I could have add an exception but was not required).

I did also add a function to test if the file look like a valid Excel file:

FUNCTION testFile(p_xlsX BLOB) RETURN VARCHAR2 IS
cTest BLOB;
BEGIN
cTest := get_file(p_xlsx, ‘xl/workbook.xml’);
IF cTest IS NULL THEN
RETURN ‘N’;
ELSE
RETURN ‘Y’;
END IF;
END;

Otherwise, Thanks for this useful piece of code!
Jackson March 25, 2014

Hi thanks for the very good example, but you have a package with the same example reading xls?
- Anton Scheffer March 25, 2014
  
  @Jackson. No, I don’t have a package which can do something similar with xls files.
Krishna Santosh February 10, 2014

Hi,
I am Getting File2blob as a error while executing the script
- Anton Scheffer February 10, 2014
  
  @Krishna Santosh
  Make sure your directory object points to an existing directory on the file system, you have enough right to use it, you have enough rights on the file system and you have an excel file at that place.
Guido Leenders December 14, 2013

I’ve been using the XML storage facility of Excel for some time (going back to 2003). It works, too bad Oracle is not delivering a more extensive library set for PL/SQL when compared to Java or .NET. For integration of Excel and Oracle we’ve created an add-in to Excel named Invantive Control. Works the other way around: the user uploads (and downloads) all Excel data to/from a database or data vault. In that way, he can throw away the Excel sheet when done
N.S. December 12, 2013

when I try to open my xlsx file that has 27,916 lines is presenting the error

ORA-04030: out of process memory When trying to allocate 16,408 bytes

How could I fix this?

thank you
N.S. December 10, 2013

It would be possible to show the select in the same way that I see in excel, ie columns and data?

for example…..

select * from table( as_read_xlsx.read( as_read_xlsx.file2blob( ‘DOC’, ‘test.xlsx’ ), ‘Sheet1’ ) )

…. and show as below….

table_cod table_name product unit description price
1 name 1 166 CX test 46,56
2 name 2 203 CX second test name 63,06

thank you!
- Anton Scheffer December 10, 2013
  
  No
  - Martin Goblet March 12, 2015
    
    I do it this way :
    
    SELECT MAX(DECODE(col_nr, 1, SUBSTR(string_val, 1, 10), ”)) cod_taric
    , MAX(DECODE(col_nr, 1, SUBSTR(string_val, 12, 2), ”)) genre
    , MAX(DECODE(col_nr, 2, date_val, ”)) dat_valid_from
    , MAX(DECODE(col_nr, 3, string_val, ”)) utilisable
    FROM AS_READ…
    WHERE sheet_nr = 1 AND row_nr > 1 — Void description line
    GROUP BY sheet_nr, row_nr
    ORDER BY sheet_nr, row_nr;
    - mago28 March 12, 2015
      
      Thanks.
Frank November 10, 2013

Wondering if you can assist? I’ve found this package extremely useful, however some XLSX files are producing the following error:

ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00200: could not convert from encoding UTF-8 to ISO-8859-1
Error at line 257
ORA-06512: at “SYS.XMLTYPE”, line 265
ORA-06512: at “CSDPT.AS_READ_XLSX”, line 38
ORA-06512: at “CSDPT.AS_READ_XLSX”, line 191
ORA-06512: at line 1
31011. 00000 – “XML parsing failed”
*Cause: XML parser returned an error while trying to parse the document.
*Action: Check if the document to be parsed is valid.

As best as I can tell, it must be encountering invalid characters and killing the parser, however I do not have any control over the source XLSX file in order to see it remedied. Is there a method whereby the package can incorporate an error trap for this?
- Anton Scheffer November 11, 2013
  
  Try the latest version of this package. It might fix your problem. Note, this is an error I can’t reproduce myself, so I’m not sure if works for you.
  - Herbert Broich March 10, 2015
    
    Hello Anton!
    I have the same problem with UTF8 and MSWIN1252.
    Where can I get the latest version of this package?
    
    Regards
    Herbert Broich
    PRODATIC
    - Anton Scheffer March 10, 2015
      
      The latest version is linked in this blog.
      - Herbert Broich March 12, 2015
        
        Thanks a lot!
        This version works very well!
Ralph Bieber November 8, 2013

I ran into error ORA-06502, when parsing a number with exponential value.
Cell content was 3.3013052861106202E+17 (number 330130528611062020)

Following format string in the TO_NUMBER conversion raised the error:
t_nr := to_number( t_val, translate( t_val, ‘.0123456789,-+’, ‘D9999999999’ ), ‘NLS_NUMERIC_CHARACTERS=.,’ );

I replaced by:
t_nr := to_number(t_val, RegExp_Replace(translate(t_val, ‘.0123456789,+-‘, ‘D9999999999′ ),'(E).*$’,’EEEE’,1,1,’i’), ‘NLS_NUMERIC_CHARACTERS=.,’ );
- Anton Scheffer November 8, 2013
  
  Or you could use
  t_nr := to_number( t_val
  , case when instr( t_val, ‘E’ ) = 0
  then translate( t_val, ‘.012345678,-+’, ‘D999999999’ )
  else translate( substr( t_val, 1, instr( t_val, ‘E’ ) – 1 ), ‘.012345678,-+’, ‘D999999999’ ) || ‘EEEE’
  end
  , ‘NLS_NUMERIC_CHARACTERS=.,’
  );
Alban October 31, 2013

Thanks a lot, It’s really fast !
I had a trouble with some empty cells where your package found some values !! (it’s empty in Excel not in Oracle Result) After copy and paste entire sheet in a new one the problem gone. I think the values came from a previous action like ‘cut’ to this cell and the xlsx format keep it.
M.R. October 23, 2013

A carriage return character in a Excel file cell is coming as _x000D_ in Oracle…
M. October 22, 2013

Amazing solution, works great! Thank you very much for sharing it.

I have a question though: it does not work with xls files (older Excel format, 2003), it only works with xlsx files. Would be possible to get it to work with xls as well?

Regards,
M. Rusu, Canada
- Anton Scheffer October 22, 2013
  
  It’s certainly possible to write a a function to read the xls format, but that function would have almost nothing in common with the function of this blog.
- Jackson March 25, 2014
  
  hi you got something to xls
Andrew Markiewicz October 11, 2013

Thanks for this package. It saved me a lot of work.
However, there currently seems to be a memory leak using dbms_xmldom. I am on Oracle 11.2.0.3 so I thought this would have been fixed from 10g.

I was able to find a fix for this with a minor alteration to your package in the read function. I created a local domdocument variable instead of the implicit document parameter to dbms_xmldom.getdocumentelement()

….
t_ndoc dbms_xmldom.domdocument;
—
function blob2node( p_blob blob )
return dbms_xmldom.domnode
is
begin
if p_blob is null or dbms_lob.getlength( p_blob ) = 0
then
return null;
end if;
t_ndoc := dbms_xmldom.newdomdocument( xmltype( p_blob, nls_charset_id( ‘AL32UTF8’ ) ) );
–return dbms_xmldom.makenode( dbms_xmldom.getdocumentelement( dbms_xmldom.newdomdocument( xmltype( p_blob, nls_charset_id( ‘AL32UTF8’ ) ) ) ) );
return dbms_xmldom.makenode( dbms_xmldom.getdocumentelement( t_ndoc ) );
end;

This allows us to explicitly free up the memory at the end of the pipelined function with dbms_xmldom.freedocument(). I also call dbms_session.free_unused_user_memory() for good measure but that may not be necessary.

……
dbms_xmldom.freedocument(t_ndoc);
dbms_session.free_unused_user_memory;
return;
end;

Thanks again
Andrew
Robertson September 25, 2013

Hi Anto,
Today i copied the package , it is selecting the infinity empty rows after the data has finished in the Excel sheet. Please help on that
Stan September 17, 2013

Hello, Anton!
I’ve too stuck with same error, but your cure didn’t help. Is any way to get rid of those unrecognized simbols, or substitute something in place?
Thank you for your package!
- Anton Scheffer September 17, 2013
  
  @Stan
  It’s a problem I can’t reproduce on my database. So I’m only guessing what the cause could be, and what a solution might be. So, if the “cure” doesn’t help I have no idea how to solve it.
  Anton
TJ Hall July 18, 2013

Hi. I need somw help to execute. Where is the directory located when I execute the following?

select *
from table( as_read_xlsx.read( as_read_xlsx.file2blob( ‘DOC’, ‘Book1.xlsx’ ) ) )

I created the package in TOAD under my ID and I am trying to execute the select statement in TOAD. I get an error that it can’t find the directory (ORA-22285). Where is it looking for the directory? I tried using c:\dir name but that didn’t work either. Thanks Much.
- Anton Scheffer July 18, 2013
  
  That example is looking at directory object DOC. And that directory object could be pointing at a directory some where at your Oracle DB server. Something that you have to set up your self.
  - Selva January 22, 2015
    
    Anton,
    
    Same error showing for me ,
    select * from table( as_read_xlsx.read( as_read_xlsx.file2blob( ‘DOC’, ‘TestExcel.xlsx’ ) ) )
    
    Can’t find the directory line number 361 .. Am using sql developer ..the particular line is below ,
    
    t_nd := blob2node (get_file (p_xlsx, ‘xl/styles.xml’));
    t_nl :=
    DBMS_XSLPROCESSOR.selectnodes (t_nd,
    ‘/styleSheet/numFmts/numFmt’,
    t_ns
    );
    
    Need to change anything in code level , please suggest me .. i thing its too long back am asking question in this forum ..Thank you
  - Selva January 22, 2015
    
    This is the error am getting ..
    
    ORA-22285: non-existent directory or file for FILEOPEN operation
  - Alberto March 31, 2015
    
    Hello Anton,
    you could be a little more specific when you say “DOC” in which directory server database references? as could change this path. thanks for this great contribution
    - Alberto March 31, 2015
      
      Resolved, understood as change this directory, thank you. http://www.tuinformaticafacil.com/oracle-11g/como-crear-y-administrar-objetos-directory-en-oracle
Jon Renaut July 15, 2013

Thanks very much for sharing this – you’ve saved me a ton of time. I am, however, having some trouble with the code. It seems to be both doubling the sheets and doubling the cells – if I parse an xlsx with one sheet and one populated cell, I get four entries. Any ideas?
- Anton Scheffer July 15, 2013
  
  This function doesn’t return only populated cells, it also returns emptry cells. And because also the sheetname, and a row, column and cell number are returned it’s easy to see if you get double values. But I have never heard of anybody who gets any doubling.
  - michael July 15, 2013
    
    Thanks to Anton I got my “version” of the code to work, but I am now experiencing the same probelm an Jon Renaut: Doubling of results…
    - Anton Scheffer July 16, 2013
      
      Are you, by chance, using a 10.2.0.1.0 database? The last time I heard of double output was with some other blog of mine, regarding docx files: http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql
      And that was an Oracle bug.
      - Jon Renaut July 16, 2013
        
        Yeah, we’re still on 10. Upgrade in the works for at least a year with no progress. Thanks.
      - michael July 16, 2013
        
        Yes, we are using 10.2.0.3… We plan on moving to 11 by year end…
        
        I will assume that the problem is with ORACLE and apply code fix accordingly.
        
        Thanks.
michael July 9, 2013

I am trying to read an XLSX file:

select *
from table(as_read_xlsx.read(as_read_xlsx.file2blob(‘CALM_RESULTS’,’Closing_Final3.xlsx’),’Main Sheet’, ‘AF7’));

and getting the following error:

ERROR at line 2:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00200: could not convert from encoding UTF-8 to ISO-8859-1
Error at line 2
ORA-06512: at “SYS.XMLTYPE”, line 265
ORA-06512: at “DATASTORE.AS_READ_XLSX”, line 38
ORA-06512: at “DATASTORE.AS_READ_XLSX”, line 191
ORA-06512: at line 1

any ideas?

Thanks,
- Anton Scheffer July 9, 2013
  
  @michael
  Change the character set of your database? The error message says it all, the Excel contains some UTF-8 characters which can’t be handled by your database.
  You could try to handle it by changing this line
  t_nd := blob2node( get_file( p_xlsx, ‘xl/sharedStrings.xml’ ) );
  to
  declare
  t_blob blob;
  t_clob clob;
  t_dest_offset integer;
  t_src_offset integer;
  t_lang_context number := dbms_lob.default_lang_ctx;
  t_warning integer;
  XML_parsing_failed exception;
  PRAGMA EXCEPTION_INIT( XML_parsing_failed, -31011 );
  begin
  t_nd := blob2node( get_file( p_xlsx, ‘xl/sharedStrings.xml’ ) );
  exception
  when XML_parsing_failed
  then
  dbms_lob.createtemporary( t_clob, true);
  t_dest_offset := 1;
  t_src_offset := 1;
  dbms_lob.converttoclob( t_clob
  , get_file( p_xlsx, ‘xl/sharedStrings.xml’ )
  , dbms_lob.lobmaxsize
  , t_dest_offset
  , t_src_offset
  , nls_charset_id( ‘AL32UTF8’ )
  , t_lang_context
  , t_warning
  );
  dbms_lob.createtemporary( t_blob, true);
  t_dest_offset := 1;
  t_src_offset := 1;
  dbms_lob.converttoblob( t_blob
  , t_clob
  , dbms_lob.lobmaxsize
  , t_dest_offset
  , t_src_offset
  , nls_charset_id( ‘AL32UTF8’ )
  , t_lang_context
  , t_warning
  );
  t_nd := blob2node( t_blob );
  dbms_lob.freetemporary( t_blob );
  dbms_lob.freetemporary( t_clob );
  end;
Naseer June 11, 2013

Anton thanks a lot for your help. With your help I am able to see the data in the string_val column by replacing pattern from ‘.’ to ‘*/text()’

Thanks again.
Naseer June 11, 2013

I think in my case this line is causing string_val to be empty.
t_strings( i ) := dbms_xslprocessor.valueof( dbms_xmldom.item( t_nl, i ), ‘.’ );

It looks like patron ‘.’ is incorrect because dbms_xslprocessor.valueof( dbms_xmldom.item( t_nl, i ), ‘.’ ); is returning NULL.

Do you think there is some kind of version issue with XML related packages?
Naseer June 10, 2013

Query is:
select *
from table( as_read_xlsx.read( as_read_xlsx.file2blob( ‘UPLOAD_DIR’, ‘Spring2013.xlsx’ ) ) )

Query can read file and correctly list all columns except STRING_VAL (I am suing formula)

I will send you .xlsx file in few minutes.

Regards,
Anton Scheffer June 10, 2013

You could show me the query you are using or send me the excel you use
scheffer @ “the company I work” .nl
Naseer June 10, 2013

Thanks for quick reply. My bad I though I gave the necessary info :(.
In my case whenever the output for column CELL_TYPE is S in these cases STRING_VAL column is blank.
I tried to play with line below but no success.
t_one_cell.string_val := t_strings( to_number( t_val ) );

What info you are looking to suggest any possible reason.

Thanks a lot.
Naseer June 10, 2013

Hello Anton,

This is great utility. In my case String valuses are not comming up (blank). Number are comming correctly. We are at office 2010 and db is 11.1.0.7.0 – 64bit.

Any Idea?

Thanks,

Naseer – Houston
- Anton Scheffer June 10, 2013
  
  @Naseer.
  No, I have no ideas. But it could be that you gave just to little information to work on :).
tuataneo February 19, 2013

Excellent, but when I try *xls file, I cannot read * xls file. How I read *xls file in PL-SQL? is it possible?
Thanks
- Anton Scheffer February 19, 2013
  
  @tuataneo This blog is called read Excel xlsx for a reason. Of course you can read xls, or every other format you can think of with plsql, but not with the code I provided here.
- Jackson March 25, 2014
  
  Hi, you could use xls?
ynixon January 20, 2013

Excellent,
Since I am using oracle 10g, I cannot use the PIVOT function
Are you going to implement a formatted output the same as it looks in the excel ?
- Anton Scheffer January 20, 2013
  
  @ynixon
  No, I have no plans to change anything in this code. But if you know that your Excel has always the same number of columns its very easy to change the code yourself to return all Excel columns in one row.
  
  Anton
  - regrbde February 19, 2013
    
    Anton,
    
    thanks for providing this great code for parsing XLSX files. I was looking a long time for exactly this.
    The code works well, except if you use formula inside cells.
    I changed the code just a little to be able to handle and show formulas.
    
    Are you interested in the changed package?
    
    kind regards
    
    Ralph
    - Anton Scheffer February 19, 2013
      
      @regrbde. This code shows the result of every formula I tried, but please show your changes or mail it to me scheffer@…..
      - critian April 28, 2014
        
        what is your email.
        My email is netzaq80@gmail.com
      - regrbde May 9, 2014
        
        don’t know if I sent the code yet, but I sent a mail today
        
        Regards
        Ralph (regrbde)
    - critian April 28, 2014
      
      what is your twitter

Share this:

Like this:

Related Posts

About The Author

Anton Scheffer

131 Comments