Query Rewrite is an increasingly more important feature of the Oracle database. Linked intimately to Materialized Views, Query Rewrite is used by the CBO to make queries leverage intermediate results – often aggregate values or pre-joined records – available in Materialized Views when executing a query, thus preventing the need for and performance hit of re-retrieving those data. This article gives a very simple example of such a query rewrite operation and then continues to illustrate how providing some additional meta-data to the database – next to constraints and proper data-type definitions – in the form of Dimensions can make all the different to the Query Rewrite capabilities of the CBO.

Before we start on this exploration, we will have to ask you to assume that the EMP table has millions of records. Otherwise, you will probably not be duly impressed.

In order to be able to efficiently query the records in our Employee data warehouse (i.e. table EMP), we have created a Materialized View, as follows:

create materialized view emp_time_mv
build immediate
refresh on demand
enable query rewrite
as
select to_char(hiredate, 'mmyyyy') mmyyyy
,      count(sal)
,      sum(sal)
from   emp e
group
by     to_char(hiredate, 'mmyyyy')
/

This Materialized View is Query rewrite enabled. That means that we allow the Cost Based Optimizer to make use of this MV if that helps for speeding up our queries against base table EMP. The consequence can be that the following query against our monstrous EMP table:

select *
 from   ( select avg(sal) avg_sal
          ,      to_char(hiredate, 'mmyyyy') mmyyyy
          from   emp e
          group
          by     to_char(hiredate, 'mmyyyy')  
        )
 where  mmyyyy = 121981

can be executed by the CBO as query against the Materialized View with a maximum of some hundreds of records:

Cost Based Optimizing through time travel? - the value of meta-data for enabling Query Rewrite qrw1

What might surprise you here is that the CBO is able to recognize the fact that AVG is calculated from SUM and COUNT, and therefore it can leverage the intermediate results in the Materialized view.

Now suppose we would like to have the average salary per year. What if we query for that value against table EMP, will the CBO be smart enough to do the query rewrite?

select avg(sal) avg_sal
,      substr(to_char(hiredate, 'mmyyyy'),3) yyyy
from   emp e
group
by     substr(to_char(hiredate, 'mmyyyy'), 3)

The results are fine, as is the execution plan:

Cost Based Optimizing through time travel? - the value of meta-data for enabling Query Rewrite qrw2

Because we asked for aggregation by year – with year a substring of the myyyy string we have used as a basis for the Materialized View -the CBO was able to rewrite our query and make use of the MV.

Now suppose a different situation. We are still interested in aggregate salary values per timeframe. We want look at Quarters and we want to specify some additional characteristics for the years.

We retrace our steps a little bit. First we create a table to hold the dimension values, here for the time dimension:

create table time_hierarchy 
( mmyyyy, qtr_yyyy, yyyy)
as 
select distinct
       to_char(hiredate, 'mmyyyy') mmyyyy
,      'Q'||to_char(hiredate, 'q')||to_char(hiredate, 'yyyy') qtr_yyyy
,      to_char(hiredate, 'yyyy') yyyy
from   emp

This table contains records for every month in every year that we have Employee records for.
Cost Based Optimizing through time travel? - the value of meta-data for enabling Query Rewrite qrw3
Now we recreate the Materialized View EMP_TIME_MV

create materialized view emp_time_Mv
 build immediate
 refresh on demand
 enable query rewrite
 as
 select t.mmyyyy
 ,      count(sal)
 ,      sum(sal)
 from   emp e
 ,      time_hierarchy t
 where  to_char(hiredate, 'mmyyyy') = t.mmyyyy
 group
 by     t.mmyyyy
/

Here we have joined the EMP aggregation results with the entries in the Time Dimension. At this point, the CBO is still unaware of the hierarchical relationship between MMYYY through QTR_YYYY to YYYY. If we query:

select t.yyyy
,      avg(sal)
from   emp e
,      time_hierarchy t
where  to_char(hiredate, 'mmyyyy') = t.mmyyyy
group
by     t.yyyy
/

the CBO does not know it can make use of the Materialized View.
Cost Based Optimizing through time travel? - the value of meta-data for enabling Query Rewrite qrw4
Let’s tell the CBO about the hierarchy between Months, Quarters and Years:

create dimension time_hierarchy
 level mmyyyy is time_hierarchy.mmyyyy
 level qtr_yyyy is time_hierarchy.qtr_yyyy
 level yyyy is time_hierarchy.yyyy 
hierarchy time_roll_up
( mmyyyy child of qtr_yyyy child of yyyy
)

If we now try to select the aggregate year data – the average salary per hire year – we will see the CBO is able to leverage the Materialized, thereby speeding up the query tremendously:

select t.yyyy
,      avg(sal)
from   emp e
,      time_hierarchy t
where  to_char(hiredate, 'mmyyyy') = t.mmyyyy
group
by     t.yyyy
/

Cost Based Optimizing through time travel? - the value of meta-data for enabling Query Rewrite qrw5

5 Comments

Maxim April 21, 2006

That are indeed exciting features, interesting is as well, the rewrite can be “bidirectional”, i.e. if your materialized view is built on count(*) and avg(*), it is used to query a sum (it means formula to calculate the total sum from partial averages and partial counts is known to Query Transformer), there are even much more complex transformations known ( total variance can be calculated from partial variances etc). Also big help to recognize rewrite capabilities does dbms_mview.explain_rewrite.

Regards

Maxim
Karl April 21, 2006

Yes!
i got it!
http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#i82005

The Query Optimizer (CBO)
has following sub components
– Query Transformer (what i meant with Query Rewriter )
– Estimator
– Plan generator
Thanks greetings

Interesting is that the Documentation of 10G does not talk about CBO at the given link but of the ‘Query Optimizer’
Greetings
Karl
Alex Nuijten April 21, 2006

Karl, according to the documentation it’s the CBO that’s responsible for the query rewrite.
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96520/qr.htm#29014

Great article btw…
Karl April 21, 2006

Great article!
‘That means that we allow the Cost Based Optimizer to make use of this MV’
I am not sure about this : Not CBO rewrites the Query but a Component call Query Rewriter? the Query Rewriter rewrites the Query and then let’s the CBO evalualte the costs? Then out of the set of rewritten queries the lowes cost rewritten query is used.
Greetings
Karl
Marco Gralike April 21, 2006

Great example. Thanx

Cost Based Optimizing through time travel? – the value of meta-data for enabling Query Rewrite

Like this:

About The Author

Lucas Jellema

5 Comments

Share this:

Like this:

Related Posts

About The Author

Lucas Jellema

5 Comments