Enriching XMLType data using relational data – XQuery and fn:collection in action

Share this on .. Tweet about this on TwitterShare on LinkedInShare on FacebookShare on Google+Email this to someoneShare on TumblrBuffer this page

One of my colleagues had an interesting challenge. In an Oracle Database he has a table that contains an XMLType column. The column contains XML documents with people collections. This data is retrieved and transformed inside the database using an XSLT stylesheet, to a format – for example XHTML – that can be served directly to an end user in a browser. However, the XMLType contains a country code element where the displayed data should show the name of the country. There is a lookup table that contains the country data; this table can be used to enrich the data in the XMLType. The question now was: what is the best moment in the processing pipeline (query => transform => output) to perform this enrichment. The definition of ‘best’ should include performance, scalability, programming effort and (infrastructural) complexity.

One of the options we discussed was enriching in the first stage of the pipeline, as part of the retrieval step. This would be done using an XQuery operation against the XMLType, producing another [enriched]XMLType that would be fed into the XSLT transformation. This article shows how that could be done. Using the experienced insights of my colleague Marco Gralike, I can even mention some performance considerations. We will be using the XQuery operation and the fn:collection function that was introduced in Oracle Database 11gR2. Other options – completely out of scope for this article – could include using a Service Bus or other middleware infrastructure.

Continue reading

Share this on .. Tweet about this on TwitterShare on LinkedInShare on FacebookShare on Google+Email this to someoneShare on TumblrBuffer this page

Oracle Database 12c: XQuery Full Text

Share this on .. Tweet about this on TwitterShare on LinkedInShare on FacebookShare on Google+Email this to someoneShare on TumblrBuffer this page

New in Oracle 12c and one of the big new features in XMLDB is the XQuery Full Text functionality and, as mentioned in the post about XQuery Update, is the official W3C standard to handle unstructured pieces of XML content. The XQuery Full Text and XQuery Full Text Index extends the XQuery XMLDB language to efficiently and XML aware handle unstructured content in XML. So finally in Oracle 12c, we are able to handle “structured”, “semi-structured” and “unstructured” content, XML aware and based on standards, across the board with regards to XML content.

Screen Shot 2013-06-28 at 14.24.38

XQuery Full Text FTSelection / Match

The following “Full Text FTSelection Operators” are supported:

  • FTAnd (ftand)
  • FTMildNot (not in)
  • FTOr (ftor)
  • FTOrder (ordered)
  • FTUnaryNot (ftnot)
  • FTWindow (window)
  • FTWords

The following “XQuery Full Text Match Options” are supported:

  • FTStemOption (stemming, no stemming)

The default behavior specified in the XQuery and XPath Full Text 1.0 Recommendation is used for each unsupported match option, with the following exceptions:

  • FTLanguage (unsupported) – The language used is the language defined by the default lexer, which means the language that was used when the database was installed.
  • FTStopWordOption (unsupported) – The stoplist used is the stoplist defined for that language.

For instance, the following XQuery Full Text will search for “Wikipedia Content” in a table with Wikipedia stored XML content, within a window of 6 words, for matches of “oracle” and “12c” via the ftand operator (Full Text AND)
.

SELECT xt1.PAGE_TEXT as "Wikipedia Content"
  FROM BINARYXML_RANGE_PART_XSD t1,
       XMLTABLE(xmlnamespaces(default 'http://www.mediawiki.org/xml/export-0.8/' ),
                '$P/page/revision/text'
                PASSING t1.content as "P"
                COLUMNS
                  PAGE_TEXT varchar2(4000) PATH '.'
               ) xt1
WHERE XMLExists('xquery version "1.0"; 
                 declare default element namespace "http://www.mediawiki.org/xml/export-0.8/"; 
                 $P/page/revision/text[. contains text 
                                         {$PHRASE1} ftand {$PHRASE2} 
                                         using stemming window 6 words]'
                 PASSING
                    t1.content as "P",
                   'oracle' as "PHRASE1",
                   '12c' as "PHRASE2"
               )
/

During tests it proved very fast on English Wikipedia content (10++ Gb) and delivered the results within less than a second. But such a statement will only be picked up very efficiently if the new, introduced in 12c, corresponding Oracle XQuery Full-Text Index has been created.

As seen in the manual this can be done very easily via, for example:

Creating an XML Full-Text Index

Index preference BASIC_STORAGE specifies the tablespace and creation parameters for the database tables and indexes that constitute an Oracle Text index.

BEGIN
  CTX_DDL.create_section_group('mysecgroup', 'PATH_SECTION_GROUP');
  CTX_DDL.set_sec_grp_attr('mysecgroup', 'XML_ENABLE', 'T');

  CTX_DDL.create_preference('mypref', 'BASIC_STORAGE');
  CTX_DDL.set_attribute('mypref',
                        'D_TABLE_CLAUSE',
                        'TABLESPACE my_ts
                         LOB(DOC) STORE AS SECUREFILE 
                         (TABLESPACE my_ts COMPRESS MEDIUM CACHE)');
  CTX_DDL.set_attribute('mypref',
                        'I_TABLE_CLAUSE',
                        'TABLESPACE my_ts
                         LOB(TOKEN_INFO) STORE AS SECUREFILE
                         (TABLESPACE my_ts NOCOMPRESS CACHE)');
END;
/

CREATE INDEX po_ctx_idx ON po_binxml(OBJECT_VALUE)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS('storage mypref section group mysecgroup');

So on my Wikipedia table this looked like:

DROP INDEX FT_RANGE_PART_IDX FORCE;
  begin
    CTX_DDL.DROP_SECTION_GROUP('FTRI');
  end;
/

begin
  CTX_DDL.DROP_PREFERENCE('STORAGE_PREFS');
end;
/

begin
  CTX_DDL.CREATE_SECTION_GROUP('FTRI','PATH_SECTION_GROUP');
  CTX_DDL.SET_SEC_GRP_ATTR('FTRI','XML_ENABLE','T');
  CTX_DDL.create_preference('STORAGE_PREFS', 'BASIC_STORAGE');
  CTX_DDL.set_attribute('STORAGE_PREFS',
                        'D_TABLE_CLAUSE',
                        'LOB(DOC) STORE AS SECUREFILE (COMPRESS MEDIUM CACHE)');
  CTX_DDL.set_attribute('STORAGE_PREFS',
                        'I_TABLE_CLAUSE',
                        'LOB(TOKEN_INFO) STORE AS SECUREFILE (NOCOMPRESS CACHE)'
                       );
  END;
/

CREATE INDEX FT_RANGE_PART_IDX
 ON BINARYXML_RANGE_PART_XSD(CONTENT)
 INDEXTYPE IS CTXSYS.CONTEXT
 LOCAL
  PARAMETERS('storage STORAGE_PREFS
              section group FTRI')
 PARALLEL;

More information

More information can be found in the Oracle 12c XMLDB Developers Guide, or via the following introduction / presentation:

…and/or its accompanied pdf document, describing the syntax used for one of the storage methods, called “Creating Structure in Unstructured Data, DML DDL used” (pdf 559 Kb)

Share this on .. Tweet about this on TwitterShare on LinkedInShare on FacebookShare on Google+Email this to someoneShare on TumblrBuffer this page

Oracle Database 12c: XQuery Update

Share this on .. Tweet about this on TwitterShare on LinkedInShare on FacebookShare on Google+Email this to someoneShare on TumblrBuffer this page

New, new…? No, not really new, XQuery Update (W3C standard/draft 2011) was already implemented in 11.2.0.3.0, but is now officially also announced. Besides the XQuery Full Text support (XQFT for short, W3C standard/draft 2011), this is one of the big new features in Oracle 12c. With this update XQuery functionality, the XQuery standard is further expanded in a mature query language for XML (/unstructured) information.

Screen Shot 2013-06-28 at 12.01.40
Now also in Oracle 12c, this has immediate consequences for following, implementing a XQuery standard across the board. The following features, functions have now been given the status “deprecated” (most of them introduced in Oracle version 10.2) for updating XML data:

All Oracle SQL functions for updating XML data are deprecated. Oracle recommends that you use XQuery Update instead. These are the deprecated XML updating functions:

    updateXML
    insertChildXML
    insertChildXMLbefore
    insertChildXMLafter
    insertXMLbefore
    insertXMLafter
    appendChildXML
    deleteXML

Although being signaled “deprecated”, be aware that they are still supported, but Oracle move to a position that the new XQuery Update functionality will be the basis for further (performance) improvements and after a while will be the sole solution to update XML content. Following along of the XQuery standard is not that surprising, especially noticing very interesting movements regarding XQuery extensions like JSoniq (JSON Extension for XQuery – have a go at it in the Zorba demo environment). This will make interesting standard architectural solutions available for all kinds unstructured data environment.

Screen Shot 2013-06-28 at 12.22.35

A small example of an update of XML content could now be done via:

--
-- Modifying the content of existing nodes using XQuery Update.
--
update PURCHASEORDER
   set object_value = XMLQuery
                      (
                        'copy $NEWXML := $XML modify (
                           for $PO in $NEWXML/PurchaseOrder return (
                                replace value of node $PO/User with $USERID,
                                replace value of node $PO/Requestor with $FULLNAME,
                                replace value of node $PO/LineItems/LineItem/Part[@Description=$OLDTITLE]/@Description with $NEWTITLE 
                               )
                         )
                        return $NEWXML'
                        passing object_value as "XML",
                        'KCHUNG' as "USERID",
                        'Kelly Chung' as "FULLNAME",
                        'The Mean Season' as "OLDTITLE",
                        'The Wizard of Oz' as "NEWTITLE"
                        returning content
                      )
 where xmlExists(
         '$XML/PurchaseOrder[Reference=$REF]/LineItems/LineItem/Part[@Description=$OLDTITLE]'
          passing object_value as "XML",
                  'AFRIPP-20120430212831873PDT' as "REF",
                  'The Mean Season' as "OLDTITLE"
       )
/

Have a further look at some of the Oracle OpenWorld Hands-on Lab Oracle OpenWorld 2012 Hands-on Lab Examples or the Oracle 12c XMLDB Developers Guide.

Share this on .. Tweet about this on TwitterShare on LinkedInShare on FacebookShare on Google+Email this to someoneShare on TumblrBuffer this page

Oracle Database 12c: Oracle XMLDB is now Mandatory!

Share this on .. Tweet about this on TwitterShare on LinkedInShare on FacebookShare on Google+Email this to someoneShare on TumblrBuffer this page

Is this one of my highlights? No not really, but it is a remarkable one. Oracle XMLDB functionality is now mandatory and can not be uninstalled. As mentioned in the new feature section of the Oracle XMLDB Developers Guide for Oracle 12c:

Oracle XML DB is now a mandatory component of Oracle Database. You cannot uninstall it, and there is no option not to include it when you create Oracle Database. It is automatically installed when you create a new database or (if not existing already) when you upgrade an existing database to Oracle Database 12c Release 1 (12.1.0.1).

…and a bit further on…
Continue reading

Share this on .. Tweet about this on TwitterShare on LinkedInShare on FacebookShare on Google+Email this to someoneShare on TumblrBuffer this page