Probably this is old news for all of you who have lots of experience with XMLDB, but for me this is something new. In order for me to remember this sort of stuff, I need to write it down (I am getting older, you know) hence this blog. What I needed was a way to extract the data from an XML and insert all the values into two relational tables with a Master-Detail relation between them. The XML was initially stored in a table with a column of XMLType. This is the structure of the XML that needed to be processed:
<SinterklaasWishlists> <child> <name>Tim</name> <wishlist> <article> <artno>21491269</artno> <description>Crane</description> <price>12.50</price> </article> <article> <artno>21499517</artno> <description>Keyboard</description> <price>10</price> </article> <article> <artno>21521591</artno> <description>Crime Investigation Game</description> <price>9.95</price> </article> </wishlist> </child> </SinterklaasWishlists> The Master data in the XML is the Child for whom this is the Wishlist. The Detail data is the actual wishlist, the toys that the child would like to get from Santa Clause. Let assume for now only one child can be in the XML. And these are the relational tables in which the extracted data is going to be stored:
create table kids (name varchar2(25) PRIMARY key ) / create table wishlist (kid_name VARCHAR2(25) references kids ,artno number ,description varchar2(50) ,price number (6,2) ) /
Using XPath expressions it is easy to retrieve the value from a node in the XML. When you want to retrieve the Childs name from the XML file, this is the syntax:
SQL> SELECT extractvalue (wishlist, '*/child/name') 2 FROM t 3* WHERE id = 1 SQL> / EXTRACTVALUE(WISHLIST,'*/CHILD/NAME') --------------------------------------------------- Tim SQL>
Retrieving the detail information, the actual toys, posed more of a challenge. When you want to retrieve the article description with a similar syntax as before you get a exception:
SQL> SELECT extractvalue (wishlist, '*/child/wishlist/article/description') 2 FROM t 3 / SELECT extractvalue (wishlist, '*/child/wishlist/article/description') * ERROR at line 1: ORA-19025: EXTRACTVALUE returns value of only one node
Extractvalue can only retrieve the value of a single node, it reminded me of a TOO-MANY-ROWS exception. And it makes sense, the Wishlist contains multiple articles in a single record so to speak. Which of the articles should be shown when you only have a single record? Should it be the Keyboard, the Crane or the Crime Investigation Game? There are just too many rows to pick from. What is needed is a way to create multiple records from this single XML. Creating a cartesian product would generate lots of rows and this is technique is used in the below example:
SQL> SELECT extractvalue (Value (wl), '*/description') 2 FROM t 3 , TABLE (xmlsequence (extract (wishlist, '*/child/wishlist/article'))) wl 4 / EXTRACTVALUE(VALUE(WL),'*/DESCRIPTION') ----------------------------------------------------------------------------------- Crane Keyboard Crime Investigation Game
XMLSequence returns a VARRAY of XMLType. With the TABLE function, you can query a VARRAY as if it was a regular table. How this TABLE function works is beyond the scope of this Blog. Here is the final query to extract all the values from the XMLType .
SQL> SELECT extractvalue (wishlist, '*/child/name') childname 2 , extractvalue (Value (wl), '*/artno') Article_number 3 , extractvalue (Value (wl), '*/description') description 4 , extractvalue (Value (wl), '*/price') price 5 FROM t 6 , TABLE (xmlsequence (extract (wishlist, '*/child/wishlist/article'))) wl 7 where t.id = 1 8 / CHILDNAME ARTICLE_NUMBER DESCRIPTION PRICE --------------- -------------------- ----------------------------------- ----------- Tim 21491269 Crane 12.50 Tim 21499517 Keyboard 10 Tim 21521591 Crime Investigation Game 9.95
Multiple Kids
And just when you think you are done, things change. What happens when there are multiple wishlists in a single XML? The query that we just created will fail, with the same exception that we saw earlier: “ORA-19025: EXTRACTVALUE returns value of only one node”. The thing to do is use another XMLSequence to retrieve the wishlist for each of the kids:
SQL> SELECT extractvalue (value (kids), '*/name') childname 2 , extractvalue (value (wl), '*/artno') Article_number 3 , extractvalue (value (wl), '*/description') description 4 , extractvalue (value (wl), '*/price') price 5 FROM t 6 , TABLE (xmlsequence (extract (wishlist, '*/child'))) kids 7 , TABLE (xmlsequence (extract (value (kids), '*/wishlist/article'))) wl 8 WHERE id = 2 9 / CHILDNAME ARTICLE_NUMBER DESCRIPTION PRICE --------------- --------------- ----------------------------------- ------------- Tim 21491269 Crane 12.50 Tim 21499517 Keyboard 10 Tim 21521591 Crime Investigation Game 9.95 Lara 21539350 Chalk Board 12 Lara 21271168 Winnie the Pooh Bingo 10 Lara 21517846 Washing Machine 10
Multi Table Insert
Since all elements are falling in place, the last step was to insert all the values into the relational tables. This could be done using a Multi Table Insert, just like:
SQL> INSERT ALL 2 WHEN rn = 1 3 THEN INTO kids 4 VALUES (childname) 5 WHEN article_number IS NOT NULL 6 THEN INTO wishlist 7 VALUES (childname,article_number, description, price) 8 SELECT Row_Number() over (PARTITION BY childname 9 ORDER BY null) rn 10 , childname 11 , article_number 12 , description 13 , price 14 from (select extractvalue (value (kids), '*/name') childname 15 , extractvalue (value (wl), '*/artno') Article_number 16 , extractvalue (value (wl), '*/description') description 17 , extractvalue (value (wl), '*/price') price 18 FROM t 19 , TABLE (xmlsequence (extract (wishlist, '*/child'))) kids 20 , TABLE (xmlsequence (extract (value (kids), '*/wishlist/article'))) wl 21 WHERE id = 2 22 ) 23 / 8 rows created.
And there you have it, going from a single XML to filling two tables with a Master-Detail relation (sometimes called a Parent-Child relation, where the Parent table in this example is the KIDS table.) in a single SQL statement.
Using XMLTABLE syntax
As I am using a Oracle 10r2 database on my laptop, there is also a different kind of syntax you can use.
“Older” Syntax |
SELECT extractvalue (wishlist, '*/child/name') childname , extractvalue (Value (wl), '*/artno') Article_number , extractvalue (Value (wl), '*/description') description , extractvalue (Value (wl), '*/price') price FROM t , TABLE (xmlsequence (extract (wishlist, '*/child/wishlist/article'))) wl WHERE id = 1 |
SELECT extractvalue (value (kids), '*/name') childname , extractvalue (value (wl), '*/artno') Article_number , extractvalue (value (wl), '*/description') description , extractvalue (value (wl), '*/price') price FROM t , TABLE (xmlsequence (extract (wishlist, '*/child'))) kids , TABLE (xmlsequence (extract (value (kids), '*/wishlist/article'))) wl WHERE id = 2 |
“Newer” Syntax |
SELECT extractvalue (wishlist, '*/child/name') childname , wl.* FROM t , xmltable ('*/child/wishlist/article' passing t.wishlist columns articlenumber number path 'artno' , description varchar2(25) path 'description' , price varchar2(10) path 'price' ) wl WHERE id = 1 |
SELECT kids.childname , wl.* FROM t , xmltable ('*/child' passing t.wishlist columns childname varchar2(25) path 'name' , articles xmltype path 'wishlist' ) kids , xmltable ('*/article' passing kids.articles columns articlenumber number path 'artno' , description varchar2(25) path 'description' , price varchar2(10) path 'price' ) wl WHERE id = 2 |
Sinterklaas
In the Netherlands, where AMIS is located, we celebrate Sinterklaas. As the legend goes, he lives in Spain and arives in the Netherlands by Steamboat. December fifth is his birthday and on that day children get presents. Our Sinterklaas is probably derived from the same legend as Santa Clause. The wishlists in this blog belong to my children. They are eagerly anticipating the arrival of Sinterklaas. More on Sinterklaas: http://www-astro.physics.ox.ac.uk/~erik/sint/sint.html
Update: How could I forget? Like I said before.. I am getting older 😉 Here is the Demo file I used in this blog.
For Steve (from the comments below)
Because it is not possible to properly format code in the comments section, I put it up here. What you want to do when a kid doesn’t have a wishlist list (indeed hard to imagine) is do an outerjoin. Yes, it is a simple as that…
First lets add an extra kid to our XML who doesn’t have a wishlist:
<child> <name>Alex</name> </child>
Then add the outerjoin to the query:
SELECT extractvalue (value (kids), '*/name') childname , extractvalue (value (wl), '*/artno') Article_number , extractvalue (value (wl), '*/description') description , extractvalue (value (wl), '*/price') price FROM t , TABLE (xmlsequence (extract (wishlist, '*/child'))) kids , TABLE (xmlsequence (extract (value (kids), '*/wishlist/article')))(+) wl WHERE id = 2
And that’s all there is to it…
For Val (from the comments below)
You can refer to attributes using an xpath-expression. To learn more about xpath expressions, I highly recommend www.w3schools.com.
First I’ll create another entry in the T-table with a thing in the article tag which contains the attributes we’re interested in. Using the @-sign, you can refer to attribute values.
INSERT INTO t VALUES (3, xmltype ( '<sinterklaaswishlists> <child> <name>Tim</name> <wishlist> <article> <thing price="12.50" description="Crane" artno="21491269"> first </thing> </article> <article> <thing price="10" description="Keyboard" artno="21499517"> second </thing> </article> <article> <thing price="9.95" description="Crime Investigation Game" artno="21521591"> third </thing> </article> </wishlist> </child> </sinterklaaswishlists> ')) / COMMIT / SELECT extractvalue (value (kids), '*/name') childname , extractvalue (value (wl), '*/thing/@artno') Article_number , extractvalue (value (wl), '*/thing/@description') description , extractvalue (value (wl), '*/thing/@price') price FROM t , TABLE (xmlsequence (extract (wishlist, '*/child'))) kids , TABLE (xmlsequence (extract (value (kids), '*/wishlist/article'))) wl WHERE id = 3 / CHILDNAME ARTICLE_NUMBER DESCRIPTION PRICE ------------------------- ------------------------- ------------------------- ----- Tim 21491269 Crane 12.50 Tim 21499517 Keyboard 10 Tim 21521591 Crime Investigation Game 9.95
Hope this helps.
Have a look at the following XMLTABLE with FOR ORIDINALITY example, where in this case the “GrossDividentRate” represents a repeating group of elements in the XML document and is passed as an XML fragment into the next XMLTABLE function:
See for more info and howto’s the Oracle OTN XMLDB forum and or my dedicated XMLDB blog, besides my posts on XML DB on the technology.amis.nl blog.
HTH
Marco
ÂÂ
Some good advice to all who read this post.
Be aware that the propriety XSQL functions are getting deprecated from 11gR2 and onwards.
The table(xmlsequence(extract())) syntax (XPath V1) should only be used in Oracle 1ogR1 and below.
The optimized XMLTABLE function (XPath V2 and XQuery 1.0 compatible) should be used from Oracle 10gR2 and onwards. This functionality is (performance wise) optimized in 10gR2 and onwards while the older propriety functionality is maintained only. XMLTABLE makes use of the XQuery engine introduced in Oracle 1ogR2.
To answer KILE’s question. Have a look at the FOR ORDINALITY syntax in XMLTABLE to sort out repeating groups/elements. In principal you can pass on the repeating group to a SECOND XMLTABLE function as a XML Fragment by choosing XMLTYPE as its datatype. In the second XMLTABLE function you refer in the PASSING clause to defined XMLTYPE column in the XMLTABLE function.
You could do the same (passing on XML fragments) with the table(xmlsequence(extract())) construct via the “position” XPath function as an alternative for the FOR ORDINALITY, but I wouldn’t advice this to be used in Oracle 10gR2. To make this more interesting. Officially, despite that it works, the “position” xpath function is not supported by Oracle in 10g and below…
…AND ALWAYS, ALWAYS try to use the full XPATH.
Avoid, but better, NEVER USE WILDCARD’s like “*” or “//”, if you favor performance. By giving the XML parser (or in some cases the Oracle Cost based Optimizer) only such hints to work with, the whole XML document will be searched. In effect this is very very very inefficient regarding memory and CPU use… You have been warned by someone who nowadays knows…
HTH
Marco
ÂÂ
Hi,
I’m using the following select to get the values of the items1 field:
SELECT EXTRACT (VALUE (xmltable0), ‘//value/text()’).getstringval () AS nombre_completo
FROM TABLE i, TABLE (XMLSEQUENCE (EXTRACT (i.xml_informe, ‘//main/items2/value’))) xmltable0
Where a record from this table is:
<main>
<items1>
<value>3</value>
<value>5</value>
<value>6</value>
<value>7</value>
<value>8</value>
</items1>
<items2>
<value>3</value>
<value>5</value>
<value>
<value>1-7</value>
<value>1-8</value>
</value>
<value>7</value>
<value>8</value>
</items2>
<main>
So I get 5 rows with values: 3 5 6 7 8.
My question is that sometimes I get more than 1 level of hierarchy, like in items2. How could be a good way to get it? I’m a little lost about how to make it, so any help will be more than welcome.
ÂÂ
Thank u very much in advance
ÂÂ
ÂÂ
Hey,
really a very useful article on XML DB. Easy to understand.
Resolved my question with SQL:
SELECT extractvalue(Value(a1), ‘*/ProductReference’) productreference,
extractvalue(Value(a1), ‘*/SerialNumber’) serialnumber,
extractvalue(Value(a1), ‘*/Model’) “model”,
extractvalue(Value(a1), ‘*/Coverage’) coverage,
extractvalue(Value(a1), ‘*/CoverageStatus’) coveragestatus,
extractvalue(Value(s1), ‘*/CountryCode’) countrycode,
extractvalue(Value(s1), ‘*/CountryName’) countryname,
extractvalue(Value(s1), ‘*/CustomerName’) customername,
extractvalue(Value(s1), ‘*/CustomerNumber’) customernumber,
extractvalue(Value(s1), ‘*/SiteNumber’) sitenumber,
extractvalue(Value(s1), ‘*/SiteShortName’) siteshortname,
extractvalue(Value(s1), ‘*/Address1’) address1,
extractvalue(Value(s1), ‘*/Address2’) address2,
extractvalue(Value(s1), ‘*/City’) city,
extractvalue(Value(s1), ‘*/State’) state,
extractvalue(Value(s1), ‘*/PostalCode’) postalcode,
extractvalue(Value(s1), ‘*/Location’) sitelocation,
extractvalue(Value(s1), ‘*/SitePhoneNumber’) sitephonenumber,
extractvalue(Value(s1), ‘*/ContactName’) contactname,
extractvalue(Value(s1), ‘*/Contactphone’) contactphone,
extractvalue(Value(s1), ‘*/CSRCode’) csrcode,
extractvalue(Value(s1), ‘*/CSRName’) csrname,
extractvalue(Value(s1), ‘*/BranchCode’) branchcode,
extractvalue(Value(s1), ‘*/TerritoryCode’) territorycode,
sdi.sdiid as reqid
FROM aradmin.sdi_xml_tab sdi,
TABLE (xmlsequence (extract (sdidocxml, ‘*/Assets/Asset’))) a1,
TABLE (xmlsequence (extract (value(a1), ‘*/Site’))) s1
Thanks,
Arthur
I have a similar structured XML where I would like to display each Asset’s tags and the Site tags for each Asset. when I run the query below, I get error- ORA-00904: “ASSET”: invalid identifier. What’s wrong?
XML:
===
–
+
–
–
…
…
…
…
–
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
.
.
.
200
84
QUERY:
=====
SELECT extractvalue(Value(a1), ‘*/ProductReference’) productreference,
extractvalue(Value(a1), ‘*/SerialNumber’) serialnumber,
extractvalue(Value(a1), ‘*/Model’) “model”,
extractvalue(Value(a1), ‘*/Coverage’) coverage,
extractvalue(Value(a1), ‘*/CoverageStatus’) coveragestatus,
extractvalue(Value(s1), ‘*/CountryCode’) countrycode,
extractvalue(Value(s1), ‘*/CountryName’) countryname,
extractvalue(Value(s1), ‘*/CustomerName’) customername,
extractvalue(Value(s1), ‘*/CustomerNumber’) customernumber,
extractvalue(Value(s1), ‘*/SiteNumber’) sitenumber,
extractvalue(Value(s1), ‘*/SiteShortName’) siteshortname,
extractvalue(Value(s1), ‘*/Address1’) address1,
extractvalue(Value(s1), ‘*/Address2’) address2,
extractvalue(Value(s1), ‘*/City’) city,
extractvalue(Value(s1), ‘*/State’) state,
extractvalue(Value(s1), ‘*/PostalCode’) postalcode,
extractvalue(Value(s1), ‘*/Location’) sitelocation,
extractvalue(Value(s1), ‘*/SitePhoneNumber’) sitephonenumber,
extractvalue(Value(s1), ‘*/ContactName’) contactname,
extractvalue(Value(s1), ‘*/Contactphone’) contactphone,
extractvalue(Value(s1), ‘*/CSRCode’) csrcode,
extractvalue(Value(s1), ‘*/CSRName’) csrname,
extractvalue(Value(s1), ‘*/BranchCode’) branchcode,
extractvalue(Value(s1), ‘*/TerritoryCode’) territorycode,
sdi.sdiid as reqid
FROM aradmin.sdi_xml_tab sdi,
TABLE (xmlsequence (extract (Asset, ‘*/Assets’))) a1,
TABLE (xmlsequence (extract (value(a1), ‘*/Asset/Site’))) s1
Thank you, Val for your comments. You can use xpath expressions to refer to attributes.
I’ll put up an example in the body of the post (formatting issue in the comments section)
Thank you for writing this. I do have a question – what if my XML file has only three tags and all the values are in attributes? What’s the syntax of selecting attribute values?
Thank you for your comments Steve. I put an answer to your question in the posting (it is not possible to properly format code in this section). Now that I reread your question I think I interpreted your question wrong. Maybe you don’t want an outerjoin after all. If you don’t want to see the kids without wishlist, but only the kids who do have a wishlist, the queries as the were posted originally do that.
I’ve been looking at a similar query, but always wondered how to do an inner join if the kids had no wishlist (hard to believe!), i.e. I don’t want to see aa row at all if there is no wishlist for the child.
Thank you, Jaromir. Good point. This is something I probably should have explained more or better (if at all) in this blog.
Hi,
> How you got the cartesian product is hard to determine..
not for my as I made the same error:)
the wrong way is to do:
5 FROM t
6 , TABLE (xmlsequence (extract (wishlist, ‘*/child’))) kids
7 , TABLE (xmlsequence (extract (wishlist, ‘*/wishlist/article’))) wl
8 WHERE id = 2
Note the usage of whishlist instead of value (kids) in the second TABLE() line
regards,
Jaromir D.B. Nemec
Hi Alan,
Thank you for your comment.
Table T is the table that looks like this:
create table t
(id number
,wishlist xmltype
)
In this table there are two records, the first one (with ID 1) has a single child’s wishlist. The second one (with ID 2) contains the wishlist of
two children.
The demo-script, included with the blog, contains both the table definition and the contents of this table.
How you got the cartesian product is hard to determine as you didn’t include the query that you used.
When I create another entry in the T-table (with ID 3) like:
INSERT INTO T
VALUES (3, xmltype (‘
Tim
21491269
Crane
12.50
21499517
Keyboard
10
21521591
Crime Investigation Game
9.95
Lara
21539350
Chalk Board
12
21517846
Washing Machine
10
21271168
Winnie the Pooh Bingo
10
Rachel
21539341
Barbie Doll
11.50
‘));
commit;
And issue this query:
SELECT Row_Number() over (PARTITION BY childname
ORDER BY null) rn
, childname
, article_number
, description
, price
from (
select extractvalue (value (kids), ‘*/name’) childname
, extractvalue (value (wl), ‘*/artno’) Article_number
, extractvalue (value (wl), ‘*/description’) description
, extractvalue (value (wl), ‘*/price’) price
FROM t
, TABLE (xmlsequence (extract (wishlist, ‘*/child’))) kids
, TABLE (xmlsequence (extract (value (kids), ‘*/wishlist/article’))) wl
WHERE id = 3
);
This is the result:
1 1 Lara 21539350 Chalk Board 12
2 2 Lara 21517846 Washing Machine 10
3 3 Lara 21271168 Winnie the Pooh Bingo 10
4 1 Rachel 21539341 Barbie Doll 11.50
5 1 Tim 21491269 Crane 12.50
6 2 Tim 21499517 Keyboard 10
7 3 Tim 21521591 Crime Investigation Game 9.95
Hope this answers your questions. And for those of you curious about how got what? Tim got the keyboard and Lara the chalkboard. 😉
I tried your suggested solutions but had a few problems.
I presume the table with an XMLtype column was called t.
Where is the column id defined?
With multiple names and multiple Article_number, description and price my result set produced
CHILDNAME ARTICLE_NUMBER DESCRIPTION PRICE
Tim 21491269 Crane 12.5
Tim 21499517 Keyboard 10
Tim 21521591 Crime Investigation Game 9.95
Tim 21491255 doll 52.5
Tim 21499517 Keyboard 10
Tim 21521599 Kitchen 29.95
Rachel 21491269 Crane 12.5
Rachel 21499517 Keyboard 10
Rachel 21521591 Crime Investigation Game 9.95
Rachel 21491255 doll 52.5
Rachel 21499517 Keyboard 10
Rachel 21521599 Kitchen 29.95
i.e. a cartesian join between children and Article_number.
Any suggestions?
Very interesting and useful article. Good job!