Working for years with Oracleâ€™s relational environment, XML DB is a completely different kind of beast. Youâ€™re confronted with all the languages and techniques the XML environment has to offer – XML, XML Schema, XSLT, Xpath, DTD, DOM, SAX, etc,etc,etc – plus the ones you know from Oracleâ€™s (object) relational (database) world â€“ SQL, PL/SQL, java, OCI, java, etc.

I am currently working for a customer, who implemented the XML DB functionality in a new system. The total amount of XML data is roundabout 200 Gb. The XML data is split-up in 4 tables, 2 tables containing an XMLType column (XML Schema based, CLOB storage) and 2 XMLType tables (XML Schema based, Object Relational storage).

Some XML documents are more than 200 pages printed data and fairly complex in their structure , sometimes as big as 3 to 4 Mb. So it’s a real challenge to get the data you want, in a decent time period. Even simple questions like â€œgive me a count ofâ€¦ï¿½? or â€œsummerize the following numbersâ€¦ï¿½? can be tricky if you are dealing with complextype constructions/elements. Usage of COUNT and SUM can be tricky in an XML environment.

The following will give you more insight in Oracle SQL/XPath solutions and the need for a clear understanding of your XML Schema’s and the solutions presented by the Oracle database.

You can use Oracleâ€™s XMLType functions to query and/or extract data. From Oracleâ€™s XML DB developers Guide it states:

You can query XMLType data and extract portions of it using the existsNode(), extract(), or extractValue() functions. These functions use a subset of the W3C XPath recommendation to navigate the document.

<blockquote>
<li>existsNode()</li>
<li>extract()</li>
<li>extractValue()</li>
</blockquote>

The extract() function returns the node or nodes that match the XPath expression. Nodes are returned as an instance of XMLType. The results of extract() can be either a document or documentFragment.

<blockquote><pre>
SELECT extract(object_value,’/PurchaseOrder/Reference’) "Reference"
FROM PURCHASEORDER;
Reference
———————————————————————–
<reference>SBELL-2002100912333601PDT</reference>

1 row selected.
</pre></blockquote>

The extractValue() function returns the value of the text node or attribute value that matches the supplied XPath expression. The value is returned as a SQL scalar datatype. This means that the XPath expression passed to extractValue() must uniquely identify a single text node or attribute value within the document.

<blockquote><pre>
SELECT extractValue(object_value,’/PurchaseOrder/Reference’) "Reference"
FROM PURCHASEORDER;
Reference
———————————————————————–
SBELL-2002100912333601PDT

1 row selected.
</pre></blockquote>

The existsNode function evaluates whether or not a given document contains a node which matches a W3C XPath expression. The existsNode() function returns true (1) if the document contains the node specified by the XPath expression supplied to the function and false (0) if it does not.

<blockquote><pre>
SELECT COUNT(*)
FROM PURCHASEORDER
WHERE existsNode(object_value,’/PurchaseOrder/Reference’) = 1;
COUNT(*)
———-
132
1 row selected.

SELECT count(*)
FROM PURCHASEORDER
WHERE existsNode(object_value, ‘/PurchaseOrder[Reference="SBELL-2002100912333601PDT"]’) = 1;
COUNT(*)
———-
1
1 row selected.
</pre></blockquote>

These examples (see <a href="http://oraclesvca2.oracle.com/docs/cd/B14117_01/appdev.101/b10790/xdb03usg.htm#BABBBJBE" target="_blank"> "Example 3-23 Searching XML Content Using the existsNode() Function"</a> ) from the XML DB Developers Guide can be misleading if you focus to much on the count(*) expression.

The count is only correct (in the last example) because there is an explicit â€œwhere Reference = {value}ï¿½? XPath declaration in it. If you would base your count, as in the statement above, in XPath terms on â€œwhere there exists a node called referenceï¿½?, then the Xpath expression will probably not give you the expected result.

This is because the existsNode only validates the existance off the node â€“ not how many nodes there are of this node in this branch. If it finds a node â€“ it validates the condition as valid – and exits the branchâ€¦

This would give you the "expected" result for the count(*), in the example below = 2 records, if the structure would be a structure like:

<blockquote><pre>
<purchaseorder>
<reference>SBELL-2002100912333601PDT</reference>
</purchaseorder>
<purchaseorder>
<reference>SBELL-2004100912389972PDT</reference>
</purchaseorder>
</pre></blockquote>

The statement would give you an "unexpected" result for the count(*), in the example below = 1 record, containting repeating elements like a structure as described below:

<blockquote><pre>
<purchaseorder>
<reference>SBELL-2004100912333601PDT</reference>
<reference>SBELL-2004100912389972PDT</reference>
</purchaseorder>
</pre></blockquote>

Look at the following XML Schema structure:

Based on this schema we now create a table and insert some rows.

set lines 50
set long 10000
set trimspool on

drop table amis_xml;

create table amis_xml (data xmltype);

insert into amis_xml
values
(xmltype(‘<PAYMENT><PERSON><SURNAME>Marco</SURNAME><NAME>Gralike</NAME><SALARY>2345</SALARY></PERSON></PAYMENT>’));

insert into amis_xml
values
(xmltype(‘<PAYMENT><PERSON><SURNAME>BertJan</SURNAME><NAME>Meinders</NAME><SALARY>3456</SALARY><SALARY>125</SALARY></PERSON></PAYMENT>’));

insert into amis_xml
values
(xmltype(‘<PAYMENT><PERSON><SURNAME>Chris</SURNAME><NAME>Gralike</NAME><SALARY>4567</SALARY></PERSON></PAYMENT>’));

insert into amis_xml
values
(xmltype(‘<PAYMENT><PERSON><SURNAME>Karin</SURNAME><NAME>Kriebisch</NAME></PERSON></PAYMENT>’));

commit;

desc AMIS_XML
Name Null? Type
———————– ——– —————-
DATA XMLTYPE

select t.data from AMIS_XML t;

DATA
————————————————–
<PAYMENT>
<PERSON>
<SURNAME>Marco</SURNAME>
<NAME>Gralike</NAME>
<SALARY>2345</SALARY>
</PERSON>
</PAYMENT>
<PAYMENT>
<PERSON>
<SURNAME>BertJan</SURNAME>
<NAME>Meinders</NAME>
<SALARY>3456</SALARY>
<SALARY>125</SALARY>
</PERSON>
</PAYMENT>
<PAYMENT>
<PERSON>
<SURNAME>Chris</SURNAME>
<NAME>Gralike</NAME>
<SALARY>4567</SALARY>
</PERSON>
</PAYMENT>
<PAYMENT>
<PERSON>
<SURNAME>Karin</SURNAME>
<NAME>Kriebisch</NAME>
</PERSON>
</PAYMENT>

4 rows selected.

select * from amis_xml t
where existsnode(t.data,’/PAYMENT/PERSON/SALARY’)=1;

3 rows selected.

select count(*) from amis_xml t
where existsnode(t.data,’/PAYMENT/PERSON/SALARY’)=1;

COUNT(*)
———
3

1 row selected.

</pre></code>

Instead off the (maybe) expected 4 rows selected, existsnode only returns 3. The two salary nodes of Mr. Meinders are counted as one.

The solution to this problem is presented by the XMLSequence function.

The XMLSequence() function makes it possible to take an XMLType containing a fragment and perform SQL operations on it. It generates a collection of XMLType objects from an XMLType containing a fragment. The collection contains one XMLType for each of the root elements in the fragment. This collection of XMLType objects can then be converted into a virtual table using the SQL table() function. Converting the fragment into a virtual table makes it easier to use SQL to process the results of an extract() function that returned multiple nodes.

The <a href="http://oraclesvca2.oracle.com/docs/cd/B14117_01/appdev.101/b10790/xdb03usg.htm#BABFEHBD" target="_blank">examples</a> as shown in the XML DB Developers Guide, show us examples why and when we should use XMLSequence (mark the XXX records selected).

The described structure:

<blockquote><pre>
<purchaseorder>
<reference>SBELL-2004100912333601PDT</reference>
<reference>SBELL-2004100912389972PDT</reference>
</purchaseorder>
</pre></blockquote>

Would give with the statement:

<blockquote><pre>
SELECT extract(object_value,’/PurchaseOrder/Reference’) "Reference"
FROM PURCHASEORDER
;
Reference
———————————————————————–
<reference>SBELL-2004100912333601PDT</reference>
<reference>SBELL-2004100912389972PDT</reference>
.
1 record selected
</pre>
</blockquote>

<blockquote><pre>
SELECT extract(value(v),’/Reference’) "Reference"
FROM PURCHASEORDER t
, TABLE(XMLSequence(extract(object_value, ‘/PurchaseOrder/Reference’))) v
;
Reference
———————————————————————–
<reference>SBELL-2004100912333601PDT</reference>
<reference>SBELL-2004100912389972PDT</reference>
.
2 records selected
</pre>
</blockquote>

Knowing this, we now can re-write our SQL/XPath statements.

<blockquote><pre>
select extract(value(v),’/SALARY’) "Salary"
from amis_xml t
, TABLE(XMLSequence(extract(t.data,’/PAYMENT/PERSON/SALARY’))) v;
.
Salary
————————————————–
<SALARY>2345</SALARY>
<SALARY>3456</SALARY>
<SALARY>125</SALARY>
<SALARY>4567</SALARY>
.
4 rows selected.

select count(*)
from amis_xml t
, TABLE(XMLSequence(extract(t.data,’/PAYMENT/PERSON/SALARY’))) v;
.
COUNT(*)
———
4
.
1 row selected.

select extractvalue(value(v),’/SALARY’) "Salary"
from amis_xml t
, TABLE(XMLSequence(extract(t.data,’/PAYMENT/PERSON/SALARY’))) v;
.
Salary
——
2345
3456
125
4567
.
4 rows selected.

select sum(extractvalue(value(v),’/SALARY’)) "Total"
from amis_xml t
, TABLE(XMLSequence(extract(t.data,’/PAYMENT/PERSON/SALARY’))) v;

Total
———
10493
1 row selected.
</pre>
</blockquote>

The next release of the Oracle database, version 10g version 2, will hopefully present more flexible tools like XQuery and or better XPath possibilities like:

<blockquote><pre>
select extract(t.data,’count(/PAYMENT/PERSON/SALARY)’)
from amis_xml t ;
ERROR:
ORA-31012: Given XPATH expression not supported
</pre>
</blockquote>

5 Comments

Marco Gralike November 17, 2005

Sean Dillon on asktom (http://asktom.oracle.com) has always very good examples. I looked it up a little bit over there (right now i am into “storage” performacne problems not really into select, update, deletes etc), but i think i found an artikel that displains how to do this. See artikel http://asktom.oracle.com/pls/ask/f?p=4950:8:13085855774406036280::NO::F4950_P8_DISPLAYID,F4950_P8_CRITERIA:6634140285004
(i used search for “update xml sean dillon”). A search there on “sean dillon” is a good read anyway.
Foo November 14, 2005

Marco, how would one insert a new reference element into the xml block below using Oracle’s XML funcs(ie: extract, updateXml). Is it possible to remove a reference element also ? It seems updateXMl w/ NULL just makes the reference element become NULL. SBELL-2004100912333601PDT
SBELL-2004100912389972PDT
Foo November 14, 2005

Marco, do you know how I could insert a new element into an xml block like: SBELL-2004100912333601PDT
SBELL-2004100912389972PDT
How would I insert another reference element using Oracle XML DB funcs ? Is it possible also to remove a reference element completely ? deleteXML() doesnt exist in 10g
Marco May 23, 2005

Oops, your right. I will correct it asap. Thanx Harm
harm May 21, 2005

Hi Marco, can you explain the difference in node in your XSD and node in your data. To me it looks like your data is not being validated by your XSD, or is it?

CU Harm

Mastering XML DB – COUNT and SUM

Like this:

About The Author

Marco Gralike

5 Comments

Share this:

Like this:

Related Posts

About The Author

Marco Gralike

5 Comments