Oracle RDBMS 10GR1: solution to avoid character encoding in XML with UPDATEXML

0

On a recent project, I ran into a problem with an XML document, that had to be enclosed within another XML document, generated from a database query. The problem I ran into was the character encoding of the XMLElement function, which eventually was worked around with UPDATEXML.

In this blog post I would like to share with you several attempts to solve this problem, why they failed and the final solution to the problem.

Printing of official documents happened in a separate printing module. Whenever a document needed to be printed, an XML document was generated in a database query and sent to the printing module. Most of the data came from relational database tables, but pieces of standard text were supplied as XML documents by another party… and saved in our database in an XMLType column based on CLOB storage.

Our printing module expected to find these standard text XML documents within a tag in our generated XML document. Also both XML documents, our generated XML and the supplied XML, had different namespaces, this needed to maintained this way. The Oracle database version we were working with was 10.1.0.5.0.

In this post I will use a simplified version of the real XML documents and queries to explain the problem and solution.

Incoming XML:

<doc:DOCUMENT xmlns:doc="http://www.company.nl/document">
<doc:TEXT>
Incoming text
</doc:TEXT>
</doc:DOCUMENT>

Expected result XML:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xm
lns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org
/1999/XMLSchema">
<SOAP-ENV:Body>
<serviceControl xmlns="http://www.company.nl/default_xmlns">
<version>1.0</version>
<prt:tag1 xmlns:prt="http://www.company.nl/print">
<prt:tag2>
<doc:DOCUMENT xmlns:doc="http://www.company.nl/document">
<doc:TEXT>
Incoming text
</doc:TEXT>
</doc:DOCUMENT>
</prt:tag2>
</prt:tag1>
</serviceControl>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

This is the first version of the query:

SELECT XMLElement( "SOAP-ENV:Envelope"
, XMLAttributes( 'http://schemas.xmlsoap.org/soap/envelope/' AS "xmlns:SOAP-ENV"
, 'http://www.w3.org/1999/XMLSchema-instance' AS "xmlns:xsi"
, 'http://www.w3.org/1999/XMLSchema' AS "xmlns:xsd"
)
, XMLElement( "SOAP-ENV:Body"
, XMLElement( "serviceControl"
, XMLAttributes( 'http://www.company.nl/default_xmlns' AS "xmlns" )
, XMLElement( "version", '1.0' )
, XMLElement( "prt:tag1"
, XMLAttributes( 'http://www.company.nl/print' AS "xmlns:prt" )
, XMLElement( "prt:tag2"
, '<doc:DOCUMENT xmlns:doc="http://www.company.nl/document">'
||'<doc:TEXT>Incoming text</doc:TEXT></doc:DOCUMENT>'
)
)
)
)
)
FROM dual;

And this came out as:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xm
lns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org
/1999/XMLSchema">
<SOAP-ENV:Body>
<serviceControl xmlns="http://www.company.nl/default_xmlns">
<version>1.0</version>
<prt:tag1 xmlns:prt="http://www.company.nl/print">
<prt:tag2>
<doc:DOCUMENT xmlns:doc="http://www.company.nl/document">
<doc:TEXT>
Incoming text
</doc:TEXT>
</doc:DOCUMENT>
</prt:tag2>
</prt:tag1>
</serviceControl>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

As you can see in bold, the

<, >,”

are automatically encoded by XMLElement as &gt;, &lt;, &quot;,which is not the desired result. Avoiding this became a real problem after several experiments, none with a satisfying outcome.

A number of attempts are described here:

  • REPLACE: I tried to replace a dummy tag with the real XML, with REPLACE. At first it seemed to work, but REPLACE caused an implicit conversion from CLOB to VARCHAR2. Unfortunately our end result was too long for the VARCHAR2 data type.
SELECT REPLACE( XMLElement( "SOAP-ENV:Envelope"
, XMLAttributes( 'http://schemas.xmlsoap.org/soap/envelope/' AS "xmlns:SOAP-ENV"
, 'http://www.w3.org/1999/XMLSchema-instance' AS "xmlns:xsi"
, 'http://www.w3.org/1999/XMLSchema' AS "xmlns:xsd"
)
, XMLElement( "SOAP-ENV:Body"
, XMLElement( "serviceControl"
, XMLAttributes( 'http://www.company.nl/default_xmlns' AS "xmlns" )
, XMLElement( "version", '1.0' )
, XMLElement( "prt:tag1"
, XMLAttributes( 'http://www.company.nl/print' AS "xmlns:prt" )
, XMLElement( "prt:tag2"
, 'BODY_DUMMY_TEXT_IN'
)
)
)
)
)
, 'BODY_DUMMY_TEXT_IN'
, '<doc:DOCUMENT xmlns:doc="http://www.company.nl/document">'
||'<doc:TEXT>Incoming text</doc:TEXT></doc:DOCUMENT>'
)
FROM dual;
  • CDATA: I tried to enclose the XML in a CDATA tag. All text inside a CDATA section will be ignored by the parser. Oracle introduced a CDATA function in RDBMS 10g release 2, but unfortunately we are working in RDBMS 10g release 1. Therefore I tried to concatenate CDATA around the enclosed XML. But the result was the same as the first query: the same encoding problem occurred as before,
    <, >,”

    were automatically encoded as &gt;, &lt;, &quot;.

SELECT XMLElement( "SOAP-ENV:Envelope"
, XMLAttributes( 'http://schemas.xmlsoap.org/soap/envelope/' AS "xmlns:SOAP-ENV"
, 'http://www.w3.org/1999/XMLSchema-instance' AS "xmlns:xsi"
, 'http://www.w3.org/1999/XMLSchema' AS "xmlns:xsd"
)
, XMLElement( "SOAP-ENV:Body"
, XMLElement( "serviceControl"
, XMLAttributes( 'http://www.company.nl/default_xmlns' AS "xmlns" )
, XMLElement( "version", '1.0' )
, XMLElement( "prt:tag1"
, XMLAttributes( 'http://www.company.nl/print' AS "xmlns:prt" )
, XMLElement( "prt:tag2"
, '<![CDATA['
||'<doc:DOCUMENT xmlns:doc="http://www.company.nl/document">'
||'<doc:TEXT>Incoming text</doc:TEXT></doc:DOCUMENT>'
||']]>'
)
)
)
)
)
FROM dual;
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xm
lns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org
/1999/XMLSchema">
<SOAP-ENV:Body>
<serviceControl xmlns="http://www.company.nl/default_xmlns">
<version>1.0</version>
<prt:tag1 xmlns:prt="http://www.company.nl/print">
<prt:tag2>
lt;![CDATA[
<doc:DOCUMENT xmlns:doc="http://www.company.nl/document">
<doc:TEXT>
Incoming text
</doc:TEXT>
</doc:DOCUMENT>
]]>
</prt:tag2>
</prt:tag1>
</serviceControl>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
  • Eventually, satisfaction came with the next version of the query with UPDATEXML, where dummy element “dummy_text_in” is completely replaced by the incoming xml:
 
SELECT UPDATEXML
   ( XMLElement
   ( "SOAP-ENV:Envelope" , XMLAttributes( 'http://schemas.xmlsoap.org/soap/envelope/' 
      AS "xmlns:SOAP-ENV" 
   , 'http://www.w3.org/1999/XMLSchema-instance' AS "xmlns:xsi" 
   , 'http://www.w3.org/1999/XMLSchema' AS "xmlns:xsd" ) 
   , XMLElement( "SOAP-ENV:Body" 
   , XMLElement( "serviceControl" 
      , XMLAttributes( 'http://www.company.nl/default_xmlns' AS "xmlns" ) 
   , XMLElement( "version", '1.0' ) 
   , XMLElement( "prt:tag1" 
      , XMLAttributes( 'http://www.company.nl/print' AS "xmlns:prt" ) 
   , XMLElement( "prt:tag2" , XMLElement( "prt:dummy_text_in" , 'BODY_DUMMY_TEXT_IN' 
   ) ) ) ) ) 
   ) , '//prt:tag1/prt:tag2/prt:dummy_text_in[1]' 
   , '<doc:DOCUMENT xmlns:doc="http://www.company.nl/document">' 
   ||'<doc:TEXT>Incoming text</doc:TEXT></doc:DOCUMENT>' 
   , 'xmlns:prt="http://www.company.nl/print"' ) 
FROM dual;
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xm
lns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org
/1999/XMLSchema">
<SOAP-ENV:Body>
<serviceControl xmlns="http://www.company.nl/default_xmlns">
<version>1.0</version>
<prt:tag1 xmlns:prt="http://www.company.nl/print">
<prt:tag2>
<doc:DOCUMENT xmlns:doc="http://www.company.nl/document">
<doc:TEXT>
Incoming text
</doc:TEXT>
</doc:DOCUMENT>
</prt:tag2>
</prt:tag1>
</serviceControl>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Another Note: do keep in mind that using a relative location path in UPDATEXML (‘//prt:tag1/prt:tag2/prt:dummy_text_in[1]‘) searches the entire XML tree.More info on UPDATEXMLUPDATEXML is an Oracle SQL/XML function as part of Oracle XMLDB. UPDATEXML takes as arguments an XMLType instance and an XPath-value pair and returns an XMLType instance with the updated value.Oracle Documentation about UPDATEXML:http://docs.oracle.com/cd/B14117_01/server.101/b10759/functions190.htm#sthref1977

Share.

About Author

Comments are closed.