Myths on bitmap indexes Oracle Headquarters Redwood Shores1 e1698667100526

Myths on bitmap indexes

Although bitmap indexes are already available in Oracle 8i or even before (some sources even say 7.3!!!), they are still surrounded by many myths. Time for some clearing up. And believe it or not, every single myth mentioned here I heard many times in several projects.

First, I would like to point out that these are my personal experiences on specific projects and that they everyone’s experience with what is said ‘in the field’ may be different. It possibly is. And that brings me to the second point:

Secondly, in no way I claim to hold all the wisdom on this subject. I noticed that there are a lot of half-true opinions and ‘heard-of’ stories about bitmap index use (and other Oracle features, in fact). I will not pretend to tell the definitive story on Bitmap Indexes. However, my personal experiences can be useful to trigger you to do some investigation
in your specific environment as well; who knows what unexpected pleasure you
can derive from using Bitmap Indexes. At the very least my findings can stir up
some discussion with others bringing their experiences to the table.

Some people (one of them being Jonathan Lewis) pointed out that I was not entirely correct (or thorough) in my statements about bitmap indexes. He is surely right about that. So I took the liberty of rewriting part of this blog. I’m sure that it still doesn’t cover every case, so be warned. The only thing I ‘m trying to say is: check everything you read (including everything I’ve written here) or hear, on the internet, in classroom training events or during  conference presentations . Test it in your own specific case, in your application on your database version. There is no general rule of thumbs. Do not make assumptions, especially not if they are based on bold statements written down on the internet.

Myth 1 – Bitmap indexes take up a lot more space than B-Tree indexes.

mydb> create table bm_test( datum date, ind1 varchar2(1));
Tabel is aangemaakt.
mydb> desc bm_test
 Naam                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 DATUM                                              DATE
 IND1                                               VARCHAR2(1)
mydb> create index btr1 on bm_test( ind1 );
Index is aangemaakt.
mydb> select segment_name, bytes, blocks, extents
  2  from user_segments where segment_type = 'INDEX';
SEGMENT_NAME     BYTES     BLOCKS    EXTENTS
--------------------------------- ---------- ----------
BTR1       1228800        150         30
mydb> drop index btr1;
Index is verwijderd.
mydb> create bitmap index bmi1 on bm_test( ind1 );
Index is aangemaakt.
mydb> select segment_name, bytes, blocks, extents
  2  from user_segments where segment_type = 'INDEX';
SEGMENT_NAME     BYTES     BLOCKS    EXTENTS

-------------------------- ---------- ----------
BMI1     40960          5          1
1 rij is geselecteerd.

There, 41 kB for the bitmap index compared to 1.2 MB for a B-tree, or 5 blocks vs 150 blocks. Keep in mind that the space occupied by bm indexes is related to many factors. In your environment, results may be entirely different.

Myth 2 – Bitmap indexes are only suitable for data warehouses

I’m still not sure what is the logic behind this thesis, because it is usually not followed by any explanation. I can imagine that for example usage flags are often queried and have extreme poor performance with b-tree indexes. Check the explain plans:

use of btree index on column with values ‘J’ and ‘N’

Execution plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=19 Card=13415 Bytes=
          134150)
   1    0   TABLE ACCESS (FULL) OF 'BM_TEST' (TABLE) (Cost=19 Card=134
          15 Bytes=134150)
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         65  consistent gets
         59  physical reads
          0  redo size
        574  bytes sent via SQL*Net to client
        283  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         22  rows processed

use of bitmap index on same column

Execution plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=24 Bytes=240)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'BM_TEST' (TABLE) (Cost=5
           Card=24 Bytes=240)
   2    1     BITMAP CONVERSION (TO ROWIDS)
   3    2       BITMAP INDEX (SINGLE VALUE) OF 'BMI1' (INDEX (BITMAP))
Statistics
----------------------------------------------------------
        136  recursive calls
          0  db block gets
         26  consistent gets
          3  physical reads
          0  redo size
        595  bytes sent via SQL*Net to client
        283  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          6  sorts (memory)
          0  sorts (disk)
         22  rows processed

Myth 3 – Bitmap indexes have slow performance on DML actions

This might me one of the origins for Myth 2. I’ve tried some updates on those 80K records, but found no proof this is a real problem. I did some timing experiments (which are not the most trustworthy, I know!), and in some cases the update on a table with bitmap was even twice as fast as with a b-tree. To be honest, I’m not too surprised with the amount of blocks used in mind. Especially look at the difference in Redo size and recursive calls:

mydb> create index btr1 on bm_tst( ind1) ;
Index is aangemaakt.
mydb> update bm_test set ind1 ='X';
80487 rijen zijn bijgewerkt.
Verstreken: 00:00:19.03
Execution plan
----------------------------------------------------------
   0      UPDATE STATEMENT Optimizer=CHOOSE (Cost=51 Card=80487 Bytes=
          160974)
   1    0   UPDATE OF 'BM_TEST'
   2    1     TABLE ACCESS (FULL) OF 'BM_TEST' (TABLE) (Cost=51 Card=8
          0487 Bytes=160974)
Statistics
----------------------------------------------------------
       3190  recursive calls
     410208  db block gets
       1319  consistent gets
        365  physical reads
   57035576  redo size
        475  bytes sent via SQL*Net to client
        445  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
         42  sorts (memory)
          0  sorts (disk)
      80487  rows processed
mydb> drop index btr1;
Index is verwijderd.
mydb> create bitmap index bmi1 on bm_test( ind1);
Index is aangemaakt.
mydb> update bm_test set ind1 ='Y';
80487 rijen zijn bijgewerkt.
Verstreken: 00:00:11.18
Execution plan
----------------------------------------------------------
   0      UPDATE STATEMENT Optimizer=CHOOSE (Cost=3 Card=80487 Bytes=1
          60974)
   1    0   UPDATE OF 'BM_TEST'
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'BM_TEST' (TABLE) (Cost
          =3 Card=80487 Bytes=160974)
   3    2       BITMAP CONVERSION (TO ROWIDS)
   4    3         BITMAP INDEX (FULL SCAN) OF 'BMI1' (INDEX (BITMAP))
Statistics
----------------------------------------------------------
         50  recursive calls
      88282  db block gets
        212  consistent gets
        965  physical reads
   20102176  redo size
        474  bytes sent via SQL*Net to client
        445  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          2  sorts (memory)
          1  sorts (disk)
      80487  rows processed

There is a big catch here though, and I think that is exactly where the thesis ‘bm-indexes are slow’ comes from. When you do an update on a bitmap index, all the records in your table which have the same value will be locked untill you commit or rollback. Mind you, only the records that have the same key value. So not the entire table as is sometimes assumed. Also, this will occur only when you update the key value column, not when you update another column in the table.

What does all this say? Again, not much more than that it is crucial to know what your application does to (or with) your data, and what the and benefits and risks of every type of index is (or broader, every feature in the Oracle database). Is there a lot of DML on these low-cardinality columns? Orderstatus fields might be changed often, gender fields probably not (unless you’re writing software for a gender-change clinic). This is something few people can accurately predict beforehand.

Conclusion: test, test, test, do not make assumptions.

 

 

22 Comments

  1. John Wu October 10, 2007
  2. Robert H. Goretsky September 24, 2007
  3. John Wu September 22, 2007
  4. Jaromir D.B. Nemec December 22, 2006
  5. Jonathan Lewis December 10, 2006
  6. Karl r. December 7, 2006
  7. Patrick Sinke December 7, 2006
  8. Laurent Schneider December 7, 2006
  9. Patrick Sinke December 7, 2006
  10. Gary December 7, 2006
  11. Patrick Sinke December 6, 2006
  12. Edwin van Meerendonk December 6, 2006
  13. laurent schneider December 6, 2006
  14. William Robertson December 6, 2006
  15. William Robertson December 6, 2006
  16. Tonguç December 5, 2006
  17. Stewart Bryson December 5, 2006
  18. Laurent Schneider December 5, 2006
  19. Stewart Bryson December 5, 2006
  20. Laurent Schneider December 5, 2006
  21. Tonguç December 5, 2006
  22. DJ December 5, 2006