Building an RSS Feed Reader in PL/SQL – using dbms_xmlparser, dbms_xmldom and dbms_xslprocessor for parsing and transforming

Lucas Jellema April 29, 2006 Database, IT, Oracle, XML 20 Comments

This article will demonstrate how we can build an RSS Feed Reader in PL/SQL. We will use DBMS_XMLPARSER to read an RSS feed from an URL and parsethe RSS document, then using the DBMS_XMLDOM package we will analyze the document and then make use of the DBMS_XSLPROCESSOR package to transform the RSS document into a presentable format.

Many websites today publish RSS feeds. These feeds contain news items, the latest article publications, the most recent entries in discussion forums, software releases etc. The information available from RSS Feeds is structured according to a generic (almost) standard structure. That means that we can easily create software that processes RSS Feeds! Of course many RSS Feed Readers are available, for example as plugins for FireFox or other browsers, for Eclipse (created by my colleague Jeroen, see his article soon to be published on the weblog) or just as stand-alone tool.

In this article we create our own RSS Feed Reader, in PL/SQL of all programming languages!

Building an RSS Feed Reader in PL/SQL - using dbms_xmlparser, dbms_xmldom and dbms_xslprocessor for parsing and transforming rssFRPLSQL2

Some of the RSS Feeds that we can read programmatically:

https://technology.amis.nl/blog/?feed=rss2
http://rss.cnn.com/rss/cnn_topstories.rss
http://newsrss.bbc.co.uk/rss/newsonline_world_edition/front_page/rss.xml
http://www.oracle.com/technology/syndication/rss_otn_news.xml
http://blogs.oracle.com/readingLists/oracleblogs.xml

Retrieve, Parse and Report

We will use the supplied package utl_http to retrieve our RSS document. This package is fed a URL and returns the HttpResponse that is received from the WebServer behind that URL. You could say that using UTL_HTTP the database acts like a Web Browser! Well, actually: we do not even need to use UTL_HTTP ourselves as DBMS_XMLPARSER can parse content from an URL directly! (underneath it will probably leverage utl_http, but we need not bother).

Since we know that the response from an RSS feed returned from the URL is in fact an XML Document, we can parse the text into an XML Document, using the DBMS_XMLPARSER package.

     rssFeedDoc    dbms_xmldom.domdocument;
   begin
     rssFeedDoc   := dbms_xmlparser.parse(url);
     ...

Now that we have the RSS Feed as parsed XML Document, we can process its contents using the functions and procedures in the DBMS_XMLDOM package:

   rssItemsList:= dbms_xmldom.GETELEMENTSBYTAGNAME(rssFeedDoc, 'item');
   itemNode:= dbms_xmldom.item(rssItemsList,0);

It is quite easy to read the relevant node-values into fields of our PL/SQL Record type variable and then write a brief summary from each RSS Feed entry to the output. The entire code block:

declare
 
 type rss_item_type is record
 ( title    varchar2(2000)
 , link     varchar2(200)
 , pubDate  varchar2(200)
 , author   varchar2(200)
 , category varchar2(500)
 , description varchar2(4000)
 );

 url varchar2(500):= 'https://technology.amis.nl/blog/?feed=rss2';
 rssFeedDoc    dbms_xmldom.domdocument;
 itemNode      dbms_xmldom.domnode;
 childnode     dbms_xmldom.domnode;
 myParser      dbms_xmlparser.Parser;
 rssItemsList  dbms_xmldom.domnodeList  ;
 rssItemFields dbms_xmldom.domnodeList  ;
 childNodeText varchar2(32000);
 
 l_rss_item        rss_item_type;
 l_empty_rss_item  rss_item_type;
 
 procedure pl(text in varchar2)
 is
 begin
   if length(text)> 255
   then
     for i in 1..(trunc(length(text)/255)+1) loop
       dbms_output.put_line(substr(text,1+ 255*(i-1),255));
     end loop;
   else
     dbms_output.put_line(text);
   end if;
 end pl;

 procedure report_rss_item(p_rss_item in rss_item_type)
 is
 begin
   pl(p_rss_item.title||' by '||p_rss_item.author);
   pl('Published on: '||p_rss_item.pubDate);
   pl('Category: '||p_rss_item.category);
   pl('==');
   pl('Description: '||p_rss_item.description);
   pl('----------------------------------------');
   pl(' ');
 end report_rss_item;

 
begin
   myParser := dbms_xmlparser.newParser;
   rssFeedDoc   := dbms_xmlparser.parse(url);
   rssItemsList:= dbms_xmldom.GETELEMENTSBYTAGNAME(rssFeedDoc, 'item');
   l_rss_item:= l_empty_rss_item;
   pl('RSS Feed from AMIS Technology Weblog - most recent '||dbms_xmldom.getlength(rssItemsList)||' articles:');
   pl('=================================================================');
   for i in 1..dbms_xmldom.getlength(rssItemsList) loop
     itemNode:= dbms_xmldom.item(rssItemsList,i-1);
     rssItemFields:= dbms_xmldom.getChildNodes(itemNode);
     for j in 1..dbms_xmldom.getlength(rssItemFields) loop
       childnode:= dbms_xmldom.item(rssItemFields, j-1);
       childNodeText:= dbms_xmldom.getNodeValue(dbms_xmldom.getFirstChild(childnode));
       case dbms_xmldom.getNodeName(childnode)
       when 'title'
       then l_rss_item.title := childNodeText;
       when 'link'
       then l_rss_item.link := childNodeText;
       when 'pubDate'
       then l_rss_item.pubDate := childNodeText;
       when 'dc:creator'
       then l_rss_item.author := childNodeText;
       when 'description'
       then l_rss_item.description := childNodeText;
       when 'category'
       then l_rss_item.category := childNodeText;
       else null;
       end case;             
     end loop;
     report_rss_item(l_rss_item);
   end loop;
end;
/

The output of this PL/SQL code for the AMIS Technology WebLog RSS Feed:

RSS Feed from AMIS Technology Weblog - most recent 10 articles:
=================================================================
Building an RSS Feed Reader in PL/SQL - using utl_http, dbms_xmldom and
dbms_xslprocessor for parsing and transforming by Lucas Jellema
Published on: Sat, 29 Apr 2006 18:22:53 +0000
Category: XML
==
Description: This article will demonstrate how we can build an RSS Feed Reader
in PL/SQL. We will use utl_http to read an RSS feed from an URL, parse the RSS
document using the DBMS_XMLDOM package and then make use of the
DBMS_XSLPROCESSOR package to tran
sform the RSS document into a presentable format.&#160;Many websites today
publish RSS [...]
----------------------------------------
Getting used to Oracle SQL Developer by Patrick Sinke
Published on: Fri, 28 Apr 2006 15:30:34 +0000
Category: Devel. + PL/SQL tools
==
Description: 
&#160;It took a few months, but I finally start appreciating the
Oracle
SQL Developer developed by Oracle (The Project Formerly Known As
Raptor).
There are still some bugs, for instance with debugging through
a
firewall (hey, didn&#8217;t we
have the same problem with JDeveloper?) and
some features are missing, but
it&#8217;s quite usable. Especially when there
is no TOAD or [...]
----------------------------------------
Software Deploying using ANT by bertjan
Published on: Thu, 27 Apr 2006 16:13:45 +0000
Category: KC Software Engineering
==
Description: 
  Some time ago a customer of AMIS asked me to setup the aut
omatic
...

Let’s add a bit of XSL-T transformation to our code.

Using the dbms_xslprocessor package for XSL-T transformations

The presentation of the RSS items has been programmed somewhat clumsily. We do not really benefit from the fact that the RSS Feed is available as XML Document. Using an XSLT stylesheet, we do not need to programmatically extract individual nodes and print their contents to the output.

The XSL-T Stylesheet is first defined in a simple VARCHAR2 variable that subsequently is parsed into an XMLDocument.

 myParser      dbms_xmlparser.Parser;
 proc        dbms_xslprocessor.processor;
 xsltdomdoc  dbms_xmldom.domdocument;
 xsl         dbms_xslprocessor.stylesheet;
 outdomdocf  dbms_xmldom.domdocumentfragment;
 outnode     dbms_xmldom.domnode;
 ...
  xsldoc     varchar2(32000):= 
   '<?xml version="1.0"?> 
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      ...
    </xsl:stylesheet>';
   ...
   -- get XSLT stylesheet in stylesheet type variable
   myParser := dbms_xmlparser.newParser;
   dbms_xmlparser.parseBuffer(myParser, xsldoc);
   xsltdomdoc := dbms_xmlparser.getDocument(myParser);
   xsl        := dbms_xslprocessor.newstylesheet(xsltdomdoc, '');

Then we can use the dbms_xmlprocessor to use this stylesheet for transforming the RSS Feed document into a more report-oriented XML document:

 --apply stylesheet to DOM document   
   proc       := dbms_xslprocessor.newProcessor;

The entire block of code:

declare
 

-- url varchar2(500):= 'https://technology.amis.nl/blog/?feed=rss2';
 --url varchar2(500):= 'http://rss.cnn.com/rss/cnn_topstories.rss ';
 url varchar2(500):= 'http://newsrss.bbc.co.uk/rss/newsonline_world_edition/front_page/rss.xml ';
--http://www.oracle.com/technology/syndication/rss_otn_news.xml
--http://blogs.oracle.com/readingLists/oracleblogs.xml


 rssFeedDoc    dbms_xmldom.domdocument;
 itemNode      dbms_xmldom.domnode;
 childnode     dbms_xmldom.domnode;
 myParser      dbms_xmlparser.Parser;
 rssItemsList  dbms_xmldom.domnodeList  ;
 rssItemFields dbms_xmldom.domnodeList  ;
 childNodeText varchar2(32000);
 proc        dbms_xslprocessor.processor;
 
 xsltdomdoc  dbms_xmldom.domdocument;
 xsl         dbms_xslprocessor.stylesheet;
 outdomdocf  dbms_xmldom.domdocumentfragment;
 outnode     dbms_xmldom.domnode;
 buf         varchar2(32000); 
 
  xsldoc     varchar2(32000):= 
   '<?xml version="1.0"?> 
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output encoding="utf-8"/>
<xsl:template match="/">
         <rssItems>
         <xsl:apply-templates select="//item"/>
        </rssItems>
      </xsl:template>      
      <xsl:template match="//item">
         <rssItem>
           <xsl:value-of select="title"/><xsl:text>
           by: </xsl:text>
           <xsl:value-of select="creator"/><xsl:text>
           Published on: </xsl:text><xsl:value-of select="pubDate"/>
           <xsl:text>
           Category: </xsl:text><xsl:value-of select="category"/>
           <xsl:text>
           ==
           </xsl:text><xsl:value-of select="description"/><xsl:text>
           -----------------------------------------------------------------------
           </xsl:text>
           </rssItem>
      </xsl:template>
    </xsl:stylesheet>';
 
 procedure pl(text in varchar2)
 is
 begin
   if length(text)> 255
   then
     for i in 1..(trunc(length(text)/255)+1) loop
       dbms_output.put_line(substr(text,1+ 255*(i-1),255));
     end loop;
   else
     dbms_output.put_line(text);
   end if;
 end pl;

 
 
begin
   rssFeedDoc   := dbms_xmlparser.parse(url);

   -- get XSLT stylesheet in stylesheet type variable
   myParser := dbms_xmlparser.newParser;
   dbms_xmlparser.parseBuffer(myParser, xsldoc);
   xsltdomdoc := dbms_xmlparser.getDocument(myParser);
   xsl        := dbms_xslprocessor.newstylesheet(xsltdomdoc, '');

   
   --apply stylesheet to DOM document   
   proc       := dbms_xslprocessor.newProcessor;
   outdomdocf := dbms_xslprocessor.processxsl(proc, xsl, rssFeedDoc);
   outnode    := dbms_xmldom.makenode(outdomdocf); 
   dbms_xmldom.writetobuffer(outnode, buf);
   pl('RSS Feed from AMIS Technology Weblog - most recent articles:');
   pl('=================================================================');
   pl(buf);
end;
/

The results:

RSS Feed from AMIS Technology Weblog - most recent articles:
=================================================================
<rssItems>
  <rssItem>Building an RSS Feed Reader in PL/SQL - using
dbms_xmlparser, dbms_xmldom and dbms_xslprocessor for parsing and transforming

by: 
           Published on: Sat, 29 Apr 2006 18:22:53 +0000

Category: General

==
           
This article will demonstrate how we can build an RSS Feed Reader
in PL/SQL. We will use DBMS_XMLPARSER to read an RSS feed from an URL and
parsethe RSS document, then using the DBMS_XMLDOM package we will analyze the
document and then
make use of the DBMS_XSLPROCESSOR package to transform the RSS document into a
[...]

-----------------------------------------------------------------------

</rssItem>
  <rssItem>Getting used to Oracle SQL Developer
           by: 

Published on: Fri, 28 Apr 2006 15:30:34 +0000
           Category: Devel. +
PL/SQL tools
           ==
           
&amp;#160;It took a few months, but I
finally start appreciating the Oracle
SQL Developer developed by Oracle (The
Project Former
ly Known As Raptor).
There are still some bugs, for instance with debugging
through a
firewall (hey, didn&amp;#8217;t we have the same problem with
JDeveloper?) and
some features are missing, but it&amp;#8217;s quite usable.

Web Publishing our RSS Feed using MOD_PLSQL

Using the MOD_PLSQL extension to the Apache Web Server, we can easily create a PL/SQL procedure that publishes the RSS Feed we have read from the Internet URL. Using a proper XSLT-stylesheet, we can turn the RSS XML Document into a pretty presentable HTML page. With the following XSL-T stylesheet:

<?xml version="1.0"?> 
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output encoding="utf-8"/>
<xsl:template match="/">
<h2>RSS Feed from AMIS Technology Weblog - most recent articles:</h2><hr />
         <ul>
         <xsl:apply-templates select="//item"/>
        </ul>
      </xsl:template>      
      <xsl:template match="//item">
<li>
         <xsl:element name="a">
           <xsl:attribute name="target"><xsl:text>_blank</xsl:text></xsl:attribute>
           <xsl:attribute name="href"><xsl:value-of select="link" /></xsl:attribute>
           <xsl:value-of select="title" />
         </xsl:element>
         <span style="font-family:sans-serif;font-size:8pt">
         <xsl:attribute name="title"><xsl:value-of select="description" /></xsl:attribute>
         <xsl:text>( Published on: </xsl:text><xsl:value-of select="pubDate" /><xsl:text>)</xsl:text></span>
       </li>
      </xsl:template>
    </xsl:stylesheet>

we create the web-page as shown below:

Building an RSS Feed Reader in PL/SQL - using dbms_xmlparser, dbms_xmldom and dbms_xslprocessor for parsing and transforming rssFRPLSQL1

Note how we use the HTML attribute title to show the bubble text whenever the cursor hovers over the publication date.

The procedure rss_reader that we now invoke through the MOD_PLSQL module is as follows:

create or replace
procedure rss_reader
as 

 url varchar2(500):= 'http://newsrss.bbc.co.uk/rss/newsonline_world_edition/front_page/rss.xml ';

 rssFeedDoc    dbms_xmldom.domdocument;
 itemNode      dbms_xmldom.domnode;
 childnode     dbms_xmldom.domnode;
 myParser      dbms_xmlparser.Parser;
 rssItemsList  dbms_xmldom.domnodeList  ;
 rssItemFields dbms_xmldom.domnodeList  ;
 childNodeText varchar2(32000);
 proc        dbms_xslprocessor.processor;
 
 xsltdomdoc  dbms_xmldom.domdocument;
 xsl         dbms_xslprocessor.stylesheet;
 outdomdocf  dbms_xmldom.domdocumentfragment;
 outnode     dbms_xmldom.domnode;
 buf         varchar2(32000); 


  xsldoc     varchar2(32000):= 
   '<?xml version="1.0"?> 
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output encoding="utf-8"/>
<xsl:template match="/">
<h2>RSS Feed from AMIS Technology Weblog - most recent articles:</h2><hr />
         <ul>
         <xsl:apply-templates select="//item"/>
        </ul>
      </xsl:template>      
      <xsl:template match="//item">
<li>
         <xsl:element name="a">
           <xsl:attribute name="target"><xsl:text>_blank</xsl:text></xsl:attribute>
           <xsl:attribute name="href"><xsl:value-of select="link" /></xsl:attribute>
           <xsl:value-of select="title" />
         </xsl:element>
         <span style="font-family:sans-serif;font-size:8pt">
         <xsl:attribute name="title"><xsl:value-of select="description" /></xsl:attribute>
         <xsl:text>( Published on: </xsl:text><xsl:value-of select="pubDate" /><xsl:text>)</xsl:text></span>
       </li>
      </xsl:template>
    </xsl:stylesheet>';
 
begin
   rssFeedDoc   := dbms_xmlparser.parse(url);

   -- get XSLT stylesheet in stylesheet type variable
   myParser := dbms_xmlparser.newParser;
   dbms_xmlparser.parseBuffer(myParser, xsldoc);
   xsltdomdoc := dbms_xmlparser.getDocument(myParser);
   xsl        := dbms_xslprocessor.newstylesheet(xsltdomdoc, '');

   
   --apply stylesheet to DOM document   
   proc       := dbms_xslprocessor.newProcessor;
   outdomdocf := dbms_xslprocessor.processxsl(proc, xsl, rssFeedDoc);
   outnode    := dbms_xmldom.makenode(outdomdocf); 
   dbms_xmldom.writetobuffer(outnode, buf);
   --pl(buf);
   htp.htmlOpen;
   htp.headOpen;
   htp.title('RSS Feed - BBC Headlines');
   htp.headClose;   
   htp.BodyOpen;
   htp.p(buf);
   htp.BodyClose;
   htp.htmlClose;
end;
/

Resources

Oracle Documentation: XML Developer’s Guide

About The Author

Lucas Jellema

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director and Oracle Developer Champion. Solution architect and developer on diverse areas including SQL, JavaScript, Kubernetes & Docker, Machine Learning, Java, SOA and microservices, events in various shapes and forms and many other things. Author of the Oracle Press book Oracle SOA Suite 12c Handbook. Frequent presenter on user groups and community events and conferences such as JavaOne, Oracle Code, CodeOne, NLJUG JFall and Oracle OpenWorld.

20 Comments

Helen Hunt January 1, 2010

I don’t know what is wrong with my installation, but it does not seem to want to work. I will run it again and post the error message here if that’s allowed
Pon September 2, 2008

Hello,

I have a question regarding on how to restrict the number of rss feeds to be listed in the page?
Is there any way I could pass some kind of number, so that I would be able to display only first 5 rss feeds.

Any help is highly appreciable.

Thanks
Pon August 12, 2008

Hello,

I am trying to use this example for developing a RSS reader. When I change the URL for the rss feed, I am getting the following error:

Value too large for the buffer.

How can I resolve this? I changed the data type from VARCHAR2 to LONG.

Thanks
Chris March 26, 2008

These are great solutions to the problem of grabbing XML from a remote RSS feed.

I’m curious what woudl be the best way to simply return the RSS feed as a plain XML document so I could parse it later through another tier (such as an RIA, like Flex)?

Would I just want to use UTL_HTTP?

Also, does anyone have experience providing search parameters to the select * from xmltable using XQuery? So if I wanted to return only results from the feed where description contained a certain word?

Thanks.
JLW June 15, 2006

This code works great for a small feed app. I’ve been building, however I’m getting a strange parse error when I try to run this code over an non-rss xml document. Seems like it should work ….

I’m working in 10gR2. Here’s the xml document:

——————————————————————————————

]>

1. A retirement plan that allows you a charitable deduction is
A. an IRA.
B. a 401(k) plan.
C. a unitrust.
2. A portion of the sum paid each year from a charitable gift annuity is
A. tax-free.
B. appreciated property.
C. deferred.
3. A pooled income fund
A. holds liquid assets only.
B. must be invested in short-term securities.
C. distributes its earnings to a number of beneficiaries.
4. To get a current tax deduction, your life income gift must be
A. revocable.
B. amendable.
C. irrevocable.
5. A charitable remainder annuity trust pays the beneficiary
A. a percentage of market value.
B. a fixed dollar amount.
C. its net income.
6. The greater the annual payout of a charitable remainder trust,
A. the greater the deduction.
B. the smaller the deduction.
C. the higher the capital gain.
7. You can get a current income tax charitable deduction for your home if you
A. deed it to a charitable organization.
B. will it to the IRS.
C. move out now.
8. Your most important reward from a life income gift is
A. a tax deduction.
B. increasing your income.
C. the joy of helping others.

Now, check your answers.
]]>

———————————————————————————————-
When I call the code from a stored procedure (to a web page) I get the following error:

PACKAGE: RSS.story SQLCODE: ORA-31011: XML parsing failed ORA-19202: Error occured in XML processingLPX-00111: Warning: element “ARTICLE” attribute “CATEGORIES” has invalid enumeration value “QUIZZES” Error at line 17

Any clues would be greatly appreciated …
JLW
anton May 4, 2006

mmm the WordPress Blog doesn’t like my XSLT, the previous should be

And if namespaces are causing any troubles, you can always remove them from the XML.

The following works on 9.2, 10.1 and 10.2

select value( i ).getstringval() creator
from table( XMLSequence( HTTPURITYPE( ‘http://technology.amis.nl/blog?feed=rss2’ ).getXML().transform
( xmltype( ‘<?xml version=”1.0″?>
<xsl:stylesheet version=”1.0″ xmlns:xsl=”http://www.w3.org/1999/XSL/Transform”>
<xsl:template match=”*”>
<xsl:element name=”{local-name()}”>
<xsl:apply-templates select=”node()”/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>’ )
).extract(‘//item’) ) ) i

Note that the XSLT transformation removes not only the namespaces, but also the attributes. If you need them you have tou use something like

<?xml version=”1.0″?>
<xsl:stylesheet version=”1.0″ xmlns:xsl=”http://www.w3.org/1999/XSL/Transform”>
<xsl:template match=”*”>
<xsl:element name=”{local-name()}”>
<xsl:apply-templates select=”@*|node()”/>
</xsl:element>
</xsl:template>
<xsl:template match=”@*”>
<xsl:attribute name=”{local-name()}”>
<xsl:value-of select=”.”/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
anton May 4, 2006

And if namespaces are causing any troubles, you can always remove them from the XML.

The following works on 9.2, 10.1 and 10.2

select value( i ).extract( ‘/item/creator/text()’ ).getstringval() creator
from table( XMLSequence( HTTPURITYPE( ‘http://technology.amis.nl/blog?feed=rss2’ ).getXML().transform
( xmltype( ‘

‘ )
).extract(‘//item’) ) ) i

Note that the XSLT transformation removes not only the namespaces, but also the attributes. If you need them you have tou use something like

Marco Gralike May 1, 2006

…OR in the new syntax with XMLTABLE (can’t find a 10gR1 database so maybe it won’t work on that version)

select *
from   xmltable('//item'
                passing HTTPURITYPE('http://technology.amis.nl/blog?feed=rss2').getXML()
                columns
                  title            varchar2(4000) path '/item/title/text()',
                  link             varchar2(4000) path '/item/link/text()',
                  publication_date varchar2(4000) path '/item/pubDate/text()',
                  creator          varchar2(4000) path '/item/*[namespace-uri()="http://purl.org/dc/elements/1.1/" and local-name()="creator"]/text()',
                  description      varchar2(4000) path '/item/description/text()'
--                  category         varchar2(4000) path '/item/category/text()'
             )
;

This syntax doesn’t like the multiple categories values. Haven’t found (yet) a workaround (but because its XQuery related stuff – there should be one)

anton May 1, 2006

Handling namespaces can be done in another way:

select value( i ).extract( ‘/item/dc:creator/text()’, ‘xmlns:dc=http://purl.org/dc/elements/1.1/”‘ ).getstringval() creator
from table( XMLSequence( HTTPURITYPE( ‘http://technology.amis.nl/blog?feed=rss2’ ).getXML().extract(‘//item’) ) ) i

And the difference between 10.1 and 10.2

On 10.1 the extract(‘//items’) returns
[dc:creator]Bert Jan Meinders[/dc:creator]

i.e. without the namespace information

On 10.2 the namespace url is added
[dc:creator xmlns:dc=”http://purl.org/dc/elements/1.1/”]Bert Jan Meinders[/dc:creator]
Lucas Jellema May 1, 2006

So apparently I am somewhat old fashioned (10gR1) as Harm pointed out. Anton’s very neat trick which did not even require the use of PL/SQL will be available to me when I finally upgrade to 10gR2. I am most impressed by the way with Anton’s handling of the dc:author element – namespaces can be horrible when parsing and transforming!

It is interesting to see how many ways there are to work with XML content in the Oracle database. It is also good to have colleagues like Harm and Anton who can show me the (proper) way.
anton May 1, 2006

Mijn code werkt op 10.2, meer geeft op 10.1 de volgende fout
ERROR at line 2:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00234: namespace prefix “dc” is not declared
Error at line 6
ORA-06512: at “SYS.XMLTYPE”, line 105
ORA-06512: at line 1

De code van Lucas werkt op 10.1, maar geeft op 10.2 de volgende fout
ERROR at line 1:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00216: invalid character 128 (0x80)
Error at line 145
ORA-06512: at “XDB.DBMS_XMLDOM”, line 4145
ORA-06512: at “XDB.DBMS_XMLDOM”, line 4170
ORA-06512: at line 45
anton May 1, 2006

Did you ever tried something like:

select
value( i ).extract( ‘/item/title/text()’ ).getstringval() title
, value( i ).extract( ‘/item/link/text()’ ).getstringval() link
, value( i ).extract( ‘/item/pubDate/text()’ ).getstringval()
, value( i ).extract( ‘/item/*[namespace-uri()=”http://purl.org/dc/elements/1.1/” and local-name()=”creator”]/text()’ ).getstringval() creator
, value( i ).extract( ‘/item/description/text()’ ).getclobval() description
, value( i ).extract( ‘/item/category/text()’ ).getstringval() category
from table( XMLSequence( HTTPURITYPE( ‘http://technology.amis.nl/blog?feed=rss2’ ).getXML().extract(‘//item’) ) ) i

Anton
Harm Verschuren May 1, 2006

About compactness: my solution will use less CPU. dbms_xmlparser, dbms_xmltransform and dbms_xmldom under the covers use java (CPU intensive and context-switch between PL/SQL – java); xmltype and dbms_genxml don’t. So the latter will have less impact on the performance (important when data volumes increase).
Xmltype parses automatically under the covers upon cretion and also has a transform method. Try to use that one instead of dbms_xmltransform.
Lucas Jellema May 1, 2006

OK, found it: http://www.stanford.edu/dept/itss/docs/oracle/10g/appdev.101/b10802/d_xmldom.htm#1124851:

DBMS_XMLDOM.GETXMLTYPE(
doc IN DOMDocument)
RETURN SYS.XMLType;
Lucas Jellema May 1, 2006

What I found on the internet suggestes no direct transformation. Here is some code:

declare
xml xmltype;
childClob clob;
doc dbms_xmldom.domdocument;
root_element dbms_xmldom.domelement;
begin
… — fetch data into xml
— Turn XMLType into DomDocument
doc := dbms_xmldom.newdomdocument(xml);
— do some processing on DomDocument
root_element := dbms_xmldom.getdocumentelement(doc);
root_node := dbms_xmldom.makenode(root_element);
— turn Node from DomDocument into XMLType
dbms_xmldom.writetoclob(root_node, childClob);
xml := xmltype(child_clob);
…
end;

You would hope it could go more straightforward from DomDocument to XMLType, benefiting from the fact that it is already an XML tree structure.
Lucas Jellema May 1, 2006

Thanks Harm, that is pretty straightforward.

I am not sure it is more compact than my original
rssFeedDoc := dbms_xmlparser.parse(url);
but I can see how the methods available on XMLType will make further processing easier than through the dbms_xmldom package.

Do you happen to know whether there is an easy way of converting a dbms_xmldom.domdocument variable in an XMLType variable?
Harm Verschuren May 1, 2006

Oh, and in case you have trouble with the ampersand in the url, issue the following in sqlplus: SET DEFINE OFF
Harm Verschuren May 1, 2006

Here’s a simple function that you could use. It returns a parsed RSS feed, based on the URL (and optional proxy address in case you are behind a proxyserver).

Usage: select get_rss(‘http://technology.amis.nl/blog/?feed=rss2&p=1173’, null) from dual;

create or replace function get_rss(p_url varchar2, p_proxy varchar2 default null)
return sys.XMLType
is
—
— Author: Harm Verschuren, 01-05-2006
—
t_pieces utl_http.html_pieces;
t_idx pls_integer;
t_payload clob := empty_clob();
begin
if p_proxy is null
then
t_pieces := utl_http.request_pieces(p_url);
else
t_pieces := utl_http.request_pieces(url => p_url, proxy => p_proxy);
end if;
for t_idx in t_pieces.first..t_pieces.last
loop
t_payload := t_payload || t_pieces(t_idx);
end loop;
return xmltype(t_payload);
end;
Lucas Jellema April 30, 2006

Is there an easy way (in 10gR1) to create an XMLType variable based on a document found at an URL somewhere on the internet (like I have done with dbms_xmlparser.parse(url)? Or would I use utl_http to get the data in a CLOB and then construct the XMLType using XMLType.createXML( p_clob)?

On dbms_output: I am using a 10g1R1 database so for me that limit still exists (it was removed only in 10gR2).
Aino Andriessen April 30, 2006

You can also use XMLTYPE instead of using dbms_xmldom and extract the data directly with the use of XPath, perform schema validation and perform xmslt transformation. The usage is much easier..
More info at the OTN Oracle documentation http://download-uk.oracle.com/docs/cd/B19306_01/appdev.102/b14258/t_xml.htm#i1007914.
By the way, dbms_output.put_line does not have a 255 character limit anymore :-).