Extracting Master-Detail data from an XMLType with a single SQL statement startup 594127 1280

Extracting Master-Detail data from an XMLType with a single SQL statement

Probably this is old news for all of you who have lots of experience with XMLDB, but for me this is something new. In order for me to remember this sort of stuff, I need to write it down (I am getting older, you know) hence this blog. What I needed was a way to extract the data from an XML and insert all the values into two relational tables with a Master-Detail relation between them. The XML was initially stored in a table with a column of XMLType. This is the structure of the XML that needed to be processed:

<SinterklaasWishlists>
   <child>
      <name>Tim</name>
      <wishlist>
         <article>
            <artno>21491269</artno>
            <description>Crane</description>
            <price>12.50</price>
         </article>
         <article>
            <artno>21499517</artno>
            <description>Keyboard</description>
            <price>10</price>
         </article>
         <article>
            <artno>21521591</artno>
            <description>Crime Investigation Game</description>
            <price>9.95</price>
         </article>
      </wishlist>
   </child>
</SinterklaasWishlists>
The Master data in the XML is the Child for whom this is the Wishlist. The Detail data is the actual wishlist, the toys that the child would like to get from Santa Clause. Let assume for now only one child can be in the XML. And these are the relational tables in which the extracted data is going to be stored:
create table kids
(name varchar2(25) PRIMARY key
)
/
create table wishlist
(kid_name VARCHAR2(25) references kids
,artno number
,description varchar2(50)
,price number (6,2)
)
/

Using XPath expressions it is easy to retrieve the value from a node in the XML. When you want to retrieve the Childs name from the XML file, this is the syntax:

SQL> SELECT extractvalue (wishlist, '*/child/name')
  2    FROM t
  3*  WHERE id = 1
SQL> /

EXTRACTVALUE(WISHLIST,'*/CHILD/NAME')
---------------------------------------------------
Tim

SQL>

Retrieving the detail information, the actual toys, posed more of a challenge. When you want to retrieve the article description with a similar syntax as before you get a exception:

SQL> SELECT extractvalue (wishlist, '*/child/wishlist/article/description')
  2    FROM t
  3  /
SELECT extractvalue (wishlist, '*/child/wishlist/article/description')
*
ERROR at line 1:
ORA-19025: EXTRACTVALUE returns value of only one node

Extractvalue can only retrieve the value of a single node, it reminded me of a TOO-MANY-ROWS exception. And it makes sense, the Wishlist contains multiple articles in a single record so to speak. Which of the articles should be shown when you only have a single record? Should it be the Keyboard, the Crane or the Crime Investigation Game? There are just too many rows to pick from. What is needed is a way to create multiple records from this single XML. Creating a cartesian product would generate lots of rows and this is technique is used in the below example:

SQL> SELECT extractvalue (Value (wl), '*/description')
  2    FROM t
  3       , TABLE (xmlsequence (extract (wishlist, '*/child/wishlist/article'))) wl
  4  /

EXTRACTVALUE(VALUE(WL),'*/DESCRIPTION')
-----------------------------------------------------------------------------------
Crane
Keyboard
Crime Investigation Game

XMLSequence returns a VARRAY of XMLType. With the TABLE function, you can query a VARRAY as if it was a regular table. How this TABLE function works is beyond the scope of this Blog. Here is the final query to extract all the values from the XMLType .

SQL> SELECT extractvalue (wishlist, '*/child/name') childname
  2       , extractvalue (Value (wl), '*/artno') Article_number
  3       , extractvalue (Value (wl), '*/description') description
  4       , extractvalue (Value (wl), '*/price') price
  5    FROM t
  6       , TABLE (xmlsequence (extract (wishlist, '*/child/wishlist/article'))) wl
  7   where t.id = 1
  8  /

CHILDNAME       ARTICLE_NUMBER       DESCRIPTION                         PRICE
--------------- -------------------- ----------------------------------- -----------
Tim             21491269             Crane                               12.50
Tim             21499517             Keyboard                            10
Tim             21521591             Crime Investigation Game            9.95

Multiple Kids

And just when you think you are done, things change. What happens when there are multiple wishlists in a single XML? The query that we just created will fail, with the same exception that we saw earlier: “ORA-19025: EXTRACTVALUE returns value of only one node”. The thing to do is use another XMLSequence to retrieve the wishlist for each of the kids:

SQL> SELECT extractvalue (value (kids), '*/name') childname
  2       , extractvalue (value (wl), '*/artno') Article_number
  3       , extractvalue (value (wl), '*/description') description
  4       , extractvalue (value (wl), '*/price') price
  5    FROM t
  6       , TABLE (xmlsequence (extract (wishlist, '*/child'))) kids
  7       , TABLE (xmlsequence (extract (value (kids), '*/wishlist/article'))) wl
  8   WHERE id = 2
  9  /

CHILDNAME       ARTICLE_NUMBER  DESCRIPTION                         PRICE
--------------- --------------- ----------------------------------- -------------
Tim             21491269        Crane                               12.50
Tim             21499517        Keyboard                            10
Tim             21521591        Crime Investigation Game            9.95
Lara            21539350        Chalk Board                         12
Lara            21271168        Winnie the Pooh Bingo               10
Lara            21517846        Washing Machine                     10

Multi Table Insert

Since all elements are falling in place, the last step was to insert all the values into the relational tables. This could be done using a Multi Table Insert, just like:

SQL> INSERT ALL
  2    WHEN rn = 1
  3    THEN INTO kids
  4    VALUES (childname)
  5    WHEN article_number IS NOT NULL
  6    THEN INTO wishlist
  7    VALUES (childname,article_number, description, price)
  8  SELECT Row_Number() over (PARTITION BY childname
  9                                ORDER BY null) rn
 10       , childname
 11       , article_number
 12       , description
 13       , price
 14    from (select extractvalue (value (kids), '*/name') childname
 15               , extractvalue (value (wl), '*/artno') Article_number
 16               , extractvalue (value (wl), '*/description') description
 17               , extractvalue (value (wl), '*/price') price
 18            FROM t
 19               , TABLE (xmlsequence (extract (wishlist, '*/child'))) kids
 20               , TABLE (xmlsequence (extract (value (kids), '*/wishlist/article'))) wl
 21           WHERE id = 2
 22         )
 23  /

8 rows created.

And there you have it, going from a single XML to filling two tables with a Master-Detail relation (sometimes called a Parent-Child relation, where the Parent table in this example is the KIDS table.) in a single SQL statement.

Using XMLTABLE syntax

As I am using a Oracle 10r2 database on my laptop, there is also a different kind of syntax you can use.

“Older” Syntax
SELECT extractvalue (wishlist, '*/child/name') childname
     , extractvalue (Value (wl), '*/artno') Article_number
     , extractvalue (Value (wl), '*/description') description
     , extractvalue (Value (wl), '*/price') price
  FROM t
     , TABLE (xmlsequence (extract (wishlist, '*/child/wishlist/article'))) wl
WHERE id = 1
SELECT extractvalue (value (kids), '*/name') childname
     , extractvalue (value (wl), '*/artno') Article_number
     , extractvalue (value (wl), '*/description') description
     , extractvalue (value (wl), '*/price') price
  FROM t
     , TABLE (xmlsequence (extract (wishlist, '*/child'))) kids
     , TABLE (xmlsequence (extract (value (kids), '*/wishlist/article'))) wl
 WHERE id = 2
“Newer” Syntax
SELECT extractvalue (wishlist, '*/child/name') childname
     , wl.*
  FROM t
     , xmltable ('*/child/wishlist/article'
       passing t.wishlist
       columns articlenumber number path 'artno'
             , description varchar2(25) path 'description'
             , price varchar2(10) path 'price'
       ) wl
WHERE id = 1
SELECT kids.childname
     , wl.*
  FROM t
     , xmltable ('*/child'
       passing t.wishlist
       columns childname varchar2(25) path 'name'
             , articles xmltype path 'wishlist'
       ) kids
     , xmltable ('*/article'
       passing kids.articles
       columns articlenumber number path 'artno'
             , description varchar2(25) path 'description'
             , price varchar2(10) path 'price'
       ) wl
 WHERE id = 2

Sinterklaas

In the Netherlands, where AMIS is located, we celebrate Sinterklaas. As the legend goes, he lives in Spain and arives in the Netherlands by Steamboat. December fifth is his birthday and on that day children get presents. Our Sinterklaas is probably derived from the same legend as Santa Clause. The wishlists in this blog belong to my children. They are eagerly anticipating the arrival of Sinterklaas. More on Sinterklaas: http://www-astro.physics.ox.ac.uk/~erik/sint/sint.html

Update: How could I forget? Like I said before.. I am getting older 😉 Here is the Demo file I used in this blog.

For Steve (from the comments below)

Because it is not possible to properly format code in the comments section, I put it up here. What you want to do when a kid doesn’t have a wishlist list (indeed hard to imagine) is do an outerjoin. Yes, it is a simple as that…
First lets add an extra kid to our XML who doesn’t have a wishlist:

<child>
   <name>Alex</name>
</child>

Then add the outerjoin to the query:

SELECT extractvalue (value (kids), '*/name') childname
     , extractvalue (value (wl), '*/artno') Article_number
     , extractvalue (value (wl), '*/description') description
     , extractvalue (value (wl), '*/price') price
  FROM t
     , TABLE (xmlsequence (extract (wishlist, '*/child'))) kids
     , TABLE (xmlsequence (extract (value (kids), '*/wishlist/article')))(+) wl
 WHERE id = 2

And that’s all there is to it…

 

For Val (from the comments below)

You can refer to attributes using an xpath-expression. To learn more about xpath expressions, I highly recommend www.w3schools.com.
First I’ll create another entry in the T-table with a thing in the article tag which contains the attributes we’re interested in. Using the @-sign, you can refer to attribute values.

INSERT INTO t
VALUES (3, xmltype (
'<sinterklaaswishlists>
   <child>
      <name>Tim</name>
      <wishlist>
         <article>
         <thing price="12.50" description="Crane" artno="21491269">
            first
         </thing>
         </article>
         <article>
           <thing price="10" description="Keyboard" artno="21499517">
            second
         </thing>
         </article>
         <article>
            <thing price="9.95" description="Crime Investigation Game" artno="21521591">
            third
         </thing>
         </article>
      </wishlist>
   </child>
</sinterklaaswishlists>
'))
/
COMMIT
/
SELECT extractvalue (value (kids), '*/name') childname
     , extractvalue (value (wl), '*/thing/@artno') Article_number
     , extractvalue (value (wl), '*/thing/@description') description
     , extractvalue (value (wl), '*/thing/@price') price
  FROM t
     , TABLE (xmlsequence (extract (wishlist, '*/child'))) kids
     , TABLE (xmlsequence (extract (value (kids), '*/wishlist/article'))) wl
 WHERE id = 3
/
CHILDNAME                 ARTICLE_NUMBER            DESCRIPTION               PRICE
------------------------- ------------------------- ------------------------- -----
Tim                       21491269                  Crane                     12.50
Tim                       21499517                  Keyboard                  10
Tim                       21521591                  Crime Investigation Game  9.95

Hope this helps.

15 Comments

  1. Marco Gralike March 6, 2010
  2. Marco Gralike March 6, 2010
  3. KILE March 6, 2010
  4. Madhu January 29, 2010
  5. Arthur Johnson October 22, 2007
  6. Arthur Johnson October 22, 2007
  7. Alex Nuijten August 30, 2007
  8. Val August 30, 2007
  9. Alex Nuijten April 12, 2007
  10. Steve April 10, 2007
  11. Alex Nuijten January 22, 2007
  12. Jaromir D.B. Nemec January 19, 2007
  13. Alex Nuijten January 8, 2007
  14. Alan Smith December 29, 2006
  15. Paweł Barut November 20, 2006