Query Rewrite is an increasingly more important feature of the Oracle database. Linked intimately to Materialized Views, Query Rewrite is used by the CBO to make queries leverage intermediate results – often aggregate values or pre-joined records – available in Materialized Views when executing a query, thus preventing the need for and performance hit of re-retrieving those data. This article gives a very simple example of such a query rewrite operation and then continues to illustrate how providing some additional meta-data to the database – next to constraints and proper data-type definitions – in the form of Dimensions can make all the different to the Query Rewrite capabilities of the CBO.

Before we start on this exploration, we will have to ask you to assume that the EMP table has millions of records. Otherwise, you will probably not be duly impressed.

....

In order to be able to efficiently query the records in our Employee data warehouse (i.e. table EMP), we have created a Materialized View, as follows:

create materialized view emp_time_mv<br />build immediate<br />refresh on demand<br />enable query rewrite<br />as<br />select to_char(hiredate, 'mmyyyy') mmyyyy<br />,      count(sal)<br />,      sum(sal)<br />from   emp e<br />group<br />by     to_char(hiredate, 'mmyyyy')<br />/<br />&nbsp;

This Materialized View is Query rewrite enabled. That means that we allow the Cost Based Optimizer to make use of this MV if that helps for speeding up our queries against base table EMP. The consequence can be that the following query against our monstrous EMP table:

select *<br /> from   ( select avg(sal) avg_sal<br />          ,      to_char(hiredate, 'mmyyyy') mmyyyy<br />          from   emp e<br />          group<br />          by     to_char(hiredate, 'mmyyyy')  <br />        )<br /> where  mmyyyy = 121981<br />&nbsp;

can be executed by the CBO as query against the Materialized View with a maximum of some hundreds of records:


 
What might surprise you here is that the CBO is able to recognize the fact that AVG is calculated from SUM and COUNT, and therefore it can leverage the intermediate results in the Materialized view.

Now suppose we would like to have the average salary per year. What if we query for that value against table EMP, will the CBO be smart enough to do the query rewrite?

select avg(sal) avg_sal<br />,      substr(to_char(hiredate, 'mmyyyy'),3) yyyy<br />from   emp e<br />group<br />by     substr(to_char(hiredate, 'mmyyyy'), 3)<br />&nbsp;

The results are fine, as is the execution plan:

Because we asked for aggregation by year – with year a substring of the myyyy string we have used as a basis for the Materialized View -the CBO was able to rewrite our query and make use of the MV.

Now suppose a different situation. We are still interested in aggregate salary values per timeframe. We want look at Quarters and we want to specify some additional characteristics for the years.

We retrace our steps a little bit. First we create a table to hold the dimension values, here for the time dimension:

create table time_hierarchy <br />( mmyyyy, qtr_yyyy, yyyy)<br />as <br />select distinct<br />       to_char(hiredate, 'mmyyyy') mmyyyy<br />,      'Q'||to_char(hiredate, 'q')||to_char(hiredate, 'yyyy') qtr_yyyy<br />,      to_char(hiredate, 'yyyy') yyyy<br />from   emp<br /><br />

This table contains records for every month in every year that we have Employee records for.
 
Now we recreate the Materialized View EMP_TIME_MV

create materialized view emp_time_Mv<br /> build immediate<br /> refresh on demand<br /> enable query rewrite<br /> as<br /> select t.mmyyyy<br /> ,      count(sal)<br /> ,      sum(sal)<br /> from   emp e<br /> ,      time_hierarchy t<br /> where  to_char(hiredate, 'mmyyyy') = t.mmyyyy<br /> group<br /> by     t.mmyyyy<br />/<br />

Here we have joined the EMP aggregation results with the entries in the Time Dimension. At this point, the CBO is still unaware of the hierarchical relationship between MMYYY through QTR_YYYY to YYYY. If we query:

select t.yyyy<br />,      avg(sal)<br />from   emp e<br />,      time_hierarchy t<br />where  to_char(hiredate, 'mmyyyy') = t.mmyyyy<br />group<br />by     t.yyyy<br />/<br /><br />

the CBO does not know it can make use of the Materialized View.
 
Let’s tell the CBO about the hierarchy between Months, Quarters and Years:

create dimension time_hierarchy<br /> level mmyyyy is time_hierarchy.mmyyyy<br /> level qtr_yyyy is time_hierarchy.qtr_yyyy<br /> level yyyy is time_hierarchy.yyyy <br />hierarchy time_roll_up<br />( mmyyyy child of qtr_yyyy child of yyyy<br />)<br /><br />

If we now try to select the aggregate year data – the average salary per hire year – we will see the CBO is able to leverage the Materialized, thereby speeding up the query tremendously:

select t.yyyy<br />,      avg(sal)<br />from   emp e<br />,      time_hierarchy t<br />where  to_char(hiredate, 'mmyyyy') = t.mmyyyy<br />group<br />by     t.yyyy<br />/<br /><br />