Are Index Scans always good……


During my work I have often heard the same rumour: many people state that when a system is doing index scans, the system is doing well. But is this actually true? That is the question. In my work I have often been called in to look at a system where specific (BI) queries were not performing well. In many of these cases I saw that the system was doing lots of "db file sequential reads" and not "db file scattered reads". "db file sequential read" waits occur when index scans (single-block reads) are done; "db file scattered read" waits occur when full table scans (multi-block reads) are done.....

When you encounter such a situation, you should start by looking at a 10046 level 8 (or higher) trace of the statement involved. In this trace, first check which object is read most often in the "db file sequential read" waits, for example:

WAIT #1: nam='db file sequential read' ela= 159 p1=14 p2=137858 p3=1

In this line, "ela" gives the time the read took in millionths of a second (microseconds), "p1" is the file_id, "p2" the block number in the file, and "p3" the number of blocks read. The block number, together with the file_id and a query on dba_extents, can identify the object being read.
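The mapping from p1/p2 to an object can be done with a query along these lines (the literal values are the ones from the example wait line above):

```sql
-- Map file_id (p1) and block number (p2) from the wait line to a segment.
SELECT owner, segment_name, segment_type
FROM   dba_extents
WHERE  file_id = 14
AND    137858 BETWEEN block_id AND block_id + blocks - 1;
```

Note that this query scans dba_extents and can be slow on databases with many segments, so it is best run once per distinct p1/p2 pair rather than per wait line.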

The object you find here is what is causing the long waits. Another way is to check the running session online, via v$session_wait and v$session, to see which object the session is waiting for.
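A sketch of that online check, assuming you know the SID of the session running the query (the bind value is a placeholder):

```sql
-- What is the session waiting on right now?
SELECT w.event, w.p1 AS file_id, w.p2 AS block_no
FROM   v$session_wait w
WHERE  w.sid = :sid;

-- Resolve the waited-for object directly via v$session.row_wait_obj#:
SELECT o.owner, o.object_name, o.object_type
FROM   v$session s, dba_objects o
WHERE  s.sid = :sid
AND    o.object_id = s.row_wait_obj#;
```

The first query gives the same p1/p2 values as the trace file; the second shortcuts the dba_extents lookup when row_wait_obj# is populated.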

After finding which object is involved in the "db file sequential read" waits, you can check the row source data in the execution plan of the query and see where that table appears in the plan. Most of the time the involved object is a table read via sequential reads after an index access. You will find something like:

 148243                  TABLE ACCESS BY INDEX ROWID <FOUND TABLE>
 998934                   INDEX RANGE SCAN INDEX1 (object id 29469)

The above two lines have been extracted from a tkprof output; they can also be found in the STAT lines of a raw trace. They say that via the index "INDEX1" 998934 rows are retrieved from the table, while in reality, given the outcome of the query and its selection criteria, no more than 148243 rows should have been retrieved. So in this case we are actually juggling around almost 7 times more rows (998934 / 148243 ≈ 6.7) than are needed for the result, which means almost 7 times more sequential I/O than needed, much more CPU processing power used, etc, etc.....

So as we can see in the above example, an index was indeed used to retrieve rows from the table, but the index used is not selective enough to prevent reading a lot of rows unnecessarily. So it is very important to check whether such situations exist in a system. The thing to look for is a high row count in front of an index and a (comparatively) low row count in front of the table above it.

So now the big question arises: how can we get around such an issue? The first thing to check is how many indexes exist on the table and on which columns. The second thing is to know which selection criteria are not covered by the index. After this analysis we have some options to choose from, and not all are simple to implement; it depends heavily on the situation you are in:
- Change the optimizer behaviour: regather schema statistics, use histograms, use hints, etc.
- Add an extra index
- Change the selection criteria in such a way that the functional result stays the same
- Etc.
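As an illustration of the second option, if the query filters on two columns but the existing index covers only one of them, a composite index can cut down the rows fetched from the table. All names here are made up for illustration; they are not from the case above:

```sql
-- Hypothetical: the query filters on COL_A and COL_B, but INDEX1 covers
-- only COL_A, so the extra COL_B rows are filtered after the table access.
-- A composite index lets the filtering happen inside the index instead.
CREATE INDEX found_table_ab_ix ON found_table (col_a, col_b);
```

Keep in mind that every extra index adds maintenance cost to DML on the table, so this trade-off should be checked before adding one.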

In the example above, the solution was to change a selection criterion in the query (a Business Objects query change). This resulted in the situation below:

 148243                  TABLE ACCESS BY INDEX ROWID <FOUND TABLE>
 152241                   INDEX RANGE SCAN INDEX2 (object id 29478)

So the end result is that the table is now accessed via another index (INDEX2), and only around 4000 rows too many (152241 - 148243 = 3998) are read from the table. This is much better, and the total performance of the query improved by a factor of 8-9, because everything was put under much less stress.

What we can see in the above example is that a high number of "db file sequential read" waits is not always a good thing, just as lots of full table scans are not necessarily bad..... The most important thing is that the actual amount of I/O stays very close to the amount of I/O that is really needed. Finding this out is sometimes a challenge; a 10046 level 8 trace can help with it.
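For completeness, such a trace can be switched on and off for your own session like this (the trace file ends up in the user dump destination of the instance):

```sql
-- Enable SQL trace with wait events (level 8) for the current session:
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';

-- ... run the statement under investigation ...

-- Switch tracing off again:
ALTER SESSION SET EVENTS '10046 trace name context off';
```

The resulting raw trace file can then be formatted with tkprof to get the row source lines shown earlier.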

As a last note, apply one important principle: never do I/O if it is not needed, and do it only if there is no other option.

Regards, Gerwin
