Do you know what it sounds like when a production system comes to a grinding halt? It sounds like a telephone ringing.
Actually it sounds like a dozen phones ringing at the same time.
This is exactly what happened last week after an application upgrade. The release of the application went smoothly, trouble started when the end users actually started using the application. Go Figure!
Early Monday morning all was still fine, but as the number of users increased the performance went down, and it went down fast.
That’s when the phones started ringing.
The application performs a series of predefined checks based on customer information. All these predefined checks and results are stored into a table. When all checks are done an overall result is determined based on the results found.
The part of the application that starting causing problems was the process where the overall result was determined.
Just to give you an idea of what I’m talking about:
create table master (id number primary key) / create table results (masterid number ,code varchar2(5) ,value varchar2(25) ) / alter table results add constraint results_fk foreign key (masterid) references master / create table final_results (a varchar2(10) ,b varchar2(10) ,c varchar2(10) ,d varchar2(10) ,e varchar2(10) ,final varchar2 (10) ) /
The content of the RESULTS table may look like:
1 | A | A result |
1 | B | B result |
1 | C | C result |
1 | D | D result |
1 | E | E result |
The content of the FINAL_RESULTS may look like:
A result | B result | C result | D result | E result | Great |
A result | B result | C result | D result | Good | |
A result | B result | C result | Fair | ||
A result | B result | Neutral | |||
A result | Poor |
In the RESULTS table the different checks and outcome are stored for a particular MASTER. The FINAL_RESULTS table holds all possible outcomes of the RESULTS and the overall result. Quality ratings like “Great”,”Good”, “Fair”, “Neutral”, and “Poor” could be the final result,depending on the outcome of the individual checks.
Because of the structure of the tables, the RESULTS needed to be pivoted to match a row in the FINAL_RESULS table.
Now that we have the test tables and functions set up, let me show you the culprit query:
select id , m.a , m.b , m.c , m.d , m.e , fres.final from (select id , f ('A', 1) a --<---- Note these function calls , f ('B', 1) b --<---- in this in-line view. , f ('C', 1) c --<---- These are used to Pivot , f ('D', 1) d --<---- the rows into columns , f ('E', 1) e --<---- . from master ) m join final_results fres on (m.a = fres.a and m.b = fres.b and m.c = fres.c and m.d = fres.d and m.e = fres.e ) where id = 1 --<------ Note this predicate/
The function calls were used to create the columns “a” through “e” with their individual value. Note that these were stored as rows in the RESULTS table. This in line view would pivot the results based on the Master Id.
So, why did the performance go down the drain?
We could see that most of the sessions were busy with waiting. They were basically doing nothing. Because of the instrumentation in the source code this was easy to track down. Sometimes it would take up to 3500 seconds before the whole process was completed. And most of time it was spend waiting…
The time it takes to execute this query is around 1 to 2 seconds on my test system. That’s a single user environment.
You may notice the amount of IO that’s taking place to execute this query. All this IO is what was creating the problems. Each consistent get is a latch, and latches are the root cause of contention.
Obviously, the predicate was in the wrong place. Adding the predicate to the in-line view made only a minor change in execution time and the amount of IO that was taking place. What did the trick was to add ROWNUM to the in-line view.
The ROWNUM materialized the in-line view so all those function calls were reduced.
Just adding the ROWNUM and the predicate to the in-line view solved the problems for now.
Look at the effect of ROWNUM and the extra predicate in TKPROF.
This is the output from the original query:
******************************************************************************** select id , m.a , m.b , m.c , m.d , m.e , fres.final from (select id , f ('A', 1) a --< ---- Note these function calls , f ('B', 1) b --<---- in this in-line view. , f ('C', 1) c --<---- These are used to Pivot , f ('D', 1) d --<---- the rows into columns , f ('E', 1) e --<---- . from master ) m join final_results fres on (Nvl (m.a, 'x') = Nvl (fres.a, 'x') and nvl (m.b, 'x') = nvl (fres.b, 'x') and nvl (m.c, 'x') = nvl (fres.c, 'x') and nvl (m.d, 'x') = nvl (fres.d, 'x') and nvl (m.e, 'x') = nvl (fres.e, 'x') ) where id = 1 --<------ Note this predicate call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.00 112.64 0 6 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.00 112.64 0 6 0 1 Misses in library cache during parse: 0 Optimizer goal: CHOOSE Parsing user id: 21 Rows Row Source Operation ------- --------------------------------------------------- 1 NESTED LOOPS 1 INDEX UNIQUE SCAN SYS_C007896 (object id 57738) 1 TABLE ACCESS FULL FINAL_RESULTS ******************************************************************************** SELECT VALUE FROM RESULTS WHERE MASTERID = :B2 AND CODE = :B1 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 166.67 20.48 0 0 0 0 Execute 24 0.00 61.44 0 0 0 0 Fetch 24 9500.00 9431.04 0 8568 0 24 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 49 9666.67 9512.96 0 8568 0 24 Misses in library cache during parse: 1 Optimizer goal: CHOOSE Parsing user id: 21 (recursive depth: 1) ********************************************************************************
… and here’s the TKPROF from the “ROWNUM” query:
******************************************************************************** select id , m.a , m.b , m.c , m.d , m.e , fres.final from (select id , rownum --< --- Added to materialize the in-line view , f ('A', 1) a --<---- These function calls , f ('B', 1) b --<---- stayed in place. , f ('C', 1) c --<---- Note the difference in , f ('D', 1) d --<---- the IO that's taking , f ('E', 1) e --<---- place. from master where id = 1 --<----- This Predicated was added ) m join final_results fres on (Nvl (m.a, 'x') = Nvl (fres.a, 'x') and nvl (m.b, 'x') = nvl (fres.b, 'x') and nvl (m.c, 'x') = nvl (fres.c, 'x') and nvl (m.d, 'x') = nvl (fres.d, 'x') and nvl (m.e, 'x') = nvl (fres.e, 'x') ) where id = 1 --<------ It's still here. call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.00 30.72 0 5 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.00 30.72 0 5 0 1 Misses in library cache during parse: 0 Optimizer goal: CHOOSE Parsing user id: 21 Rows Row Source Operation ------- --------------------------------------------------- 1 HASH JOIN 5 TABLE ACCESS FULL FINAL_RESULTS 1 VIEW 1 COUNT 1 INDEX UNIQUE SCAN SYS_C007896 (object id 57738) ******************************************************************************** SELECT VALUE FROM RESULTS WHERE MASTERID = :B2 AND CODE = :B1 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 10.24 0 0 0 0 Execute 15 0.00 10.24 0 0 0 0 Fetch 15 6166.67 6164.48 0 5355 0 15 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 31 6166.67 6184.96 0 5355 0 15 Misses in library cache during parse: 0 Optimizer goal: CHOOSE Parsing user id: 21 (recursive depth: 1) ********************************************************************************
The difference is quite noticeable, in elapsed time and reduction in IO.
If we could only get rid of all these function calls altogether, that could really make a dramatic difference. So a rewrite of the query to exclude the function calls should perform best of all, and it does!
Here is the TKPROF from the Pivot Query:
******************************************************************************** select m.mid , m.a , m.b , m.c , m.d , m.e , fres.final from (select masterid mid , max (decode (code, 'A', value)) a , max (decode (code, 'B', value)) b , max (decode (code, 'C', value)) c , max (decode (code, 'D', value)) d , max (decode (code, 'E', value)) e from (select masterid , code , value from results where masterid = 1 ) m group by masterid ) m join final_results fres on (Nvl (m.a, 'x') = Nvl (fres.a, 'x') and nvl (m.b, 'x') = nvl (fres.b, 'x') and nvl (m.c, 'x') = nvl (fres.c, 'x') and nvl (m.d, 'x') = nvl (fres.d, 'x') and nvl (m.e, 'x') = nvl (fres.e, 'x') ) call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 333.33 409.60 0 361 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 333.33 409.60 0 361 0 1 Misses in library cache during parse: 0 Optimizer goal: CHOOSE Parsing user id: 21 Rows Row Source Operation ------- --------------------------------------------------- 1 HASH JOIN 1 VIEW 1 SORT GROUP BY 5 TABLE ACCESS FULL RESULTS 5 TABLE ACCESS FULL FINAL_RESULTS ********************************************************************************
A sample script is available for download here, so you can see the results for yourself.
Very instructive example! Great post.
This is explained in little more detail in Tom Kyte’s book Effective Oraclee by Design p 491-496.
Great article Alex, Thanks.
Patrick