Database Transaction Recorder – Adding Who to When and What to make Flashback take over from Journalling tables


In previous articles, I have suggested that the arrival of the Flashback Data Archive in Oracle RDBMS 11g allowed us to finally say goodbye to journalling tables. Keeping track of all changes and previous states of our records in dedicated tables of our own is no longer required, was my statement (for example in http://technology.amis.nl/blog/2453/oracle-11g-total-recall-flashback-in-the-hands-of-database-designers-and-application-developers-at-last-and-the-end-of-journalling-tables). Not only does using Flashback Data Archives require less programming, it also improves performance of DML operations substantially and puts a lot of functionality at our fingertips: Flashback Queries easily incorporate historical records in straightforward SQL, and using dbms_flashback we can even turn back time and regard all data as it was back then, using the same application and the very same queries.
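As a quick illustration of that last point, a session can be moved back in time with dbms_flashback; a sketch (the one-hour offset is arbitrary):

```sql
begin
  -- from here on, every query in this session sees the data
  -- as it was one hour ago
  dbms_flashback.enable_at_time(systimestamp - interval '1' hour);
end;
/

-- ... run the unchanged application queries against the past ...

begin
  -- return the session to the present
  dbms_flashback.disable;
end;
/
```

Note that DML is not allowed while flashback is enabled in the session; this mode is for looking at the past, not changing it.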

In the not too distant past, Flashback came with quite a long list of limitations that made it almost impossible to make any changes to a table definition and still retain the historical data. With recent improvements in 11gR2, most of these limitations have been removed and DDL such as adding, renaming and removing columns and constraints is now possible for tables with an associated Flashback Data Archive.
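For example, for a table associated with a Flashback Data Archive (the archive name, tablespace and columns below are illustrative), 11gR2 tracks DDL like the following transparently, without losing the history:

```sql
-- create an archive and associate the table with it
create flashback archive emp_archive
  tablespace users
  retention 5 year;

alter table scott.emp flashback archive emp_archive;

-- these DDL operations no longer invalidate the archived history
alter table scott.emp add (dept_notes varchar2(200));
alter table scott.emp rename column dept_notes to department_notes;
alter table scott.emp drop column department_notes;
```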

One element of journaling tables is missing in the Flashback Data Archive and from the Flashback Queries. In addition to the When (timestamp for the DML operation) and the What (type of operation and the old values of the data record) it would be nice to know WHO caused the change to occur.

It is important to discuss right at this point what we mean by WHO in this respect. In the past that used to be the USER – the database user committing the transaction. However, in today's multi-tier world, with Service Bus components and web applications accessing the database through pooled database connections, the database user is usually not at all indicative of the real user behind a transaction. The end user behind the operation is usually (or should be) available from sys_context('userenv', 'client_identifier'), set by dbms_session.set_identifier() or directly on the connection (in JDBC, for example). For our journalling purposes, we need to record and retrieve the real perpetrator and therefore the client identifier.
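The client identifier is set per session; whether the current session has one can be checked with a quick sanity query:

```sql
select sys_context('userenv', 'client_identifier') as client_id
from   dual;
```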

This article describes ways to record and later on find the client identifier value at the time of the transaction.

Standard Database Audit Facilities

I have to admit I am not overly familiar with the audit capabilities of the Oracle RDBMS. For some reason that feels like someone else's domain – one I have not ventured into a lot. However, when trying to pry information about past transactions from somewhere in the database, it is only logical to turn to auditing of course. And this approach readily gives me what I need: a record per transaction – with a transaction id that can be joined to the flashback versions queries – holding data such as the time and all information about the user executing the operation, including the client identifier.

Auditing in the Oracle Database can be enabled for many different actions – from CREATE and GRANT to UPDATE and EXECUTE – and at various levels: per statement type, per database user executing the action and even per individual object. Auditing may result in huge volumes of data that may have to be purged on a regular schedule. We will come back to that a little later.
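For illustration, auditing at those various levels would be switched on with statements like these (illustrative only; BY ACCESS records an entry per statement execution rather than per session):

```sql
-- per statement type, database wide:
AUDIT UPDATE TABLE BY ACCESS;

-- per statement type, for one database user:
AUDIT INSERT TABLE, UPDATE TABLE BY scott BY ACCESS;

-- per individual object:
AUDIT ALL ON scott.emp BY ACCESS;
```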

To get started with creating the transaction history I desire for my alternative journaling table, I need to switch on auditing. And I need to decide on the level of auditing – which operations and which objects (and perhaps user?). Let’s say that I want to have full journaling for table EMP in SCOTT’s schema. I need to associate this table with a FLASHBACK DATA ARCHIVE – as described in the referenced article – and I need to ensure that every transaction with a DML operation on this table is recorded. That means, the following auditing should be switched on:

AUDIT INSERT, UPDATE, DELETE ON SCOTT.EMP BY ACCESS;
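Note that standard auditing only records anything when the AUDIT_TRAIL initialization parameter is set to DB (or DB,EXTENDED, which also captures the SQL text of the audited statements); changing it requires an instance restart:

```sql
-- check the current setting (SQL*Plus)
show parameter audit_trail

-- enable the database audit trail; takes effect after restart
ALTER SYSTEM SET audit_trail = 'DB,EXTENDED' SCOPE = SPFILE;
```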

With this auditing in place, I can pretend to be a web application through which a real end user makes changes to the data. The statements executed in the database would for example be:

To set the client identifier the application would execute

begin
  dbms_session.set_identifier('TheRealUser');
end;

and next the DML operation executed could be something like:

update emp
set    sal = sal + 100
where  job = 'CLERK';

commit;

After this transaction has been committed, the new values are available from table EMP – and the old values are now part of the history that Flashback Versions Query can unveil to us:

select ename
,      sal
,      versions_xid
,      VERSIONS_STARTTIME
,      VERSIONS_ENDTIME
from   emp VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE
where  job = 'CLERK'

rendering the following data:

ENAME      SAL    VERSIONS_XID     VERSIONS_STARTTIME           VERSIONS_ENDTIME
---------- ------ ---------------- -------------------------    -------------------------
ADAMS      1405   07001E00A9040000 06-FEB-11 08.12.09.00 AM     06-FEB-11 08.42.13.00 AM
ADAMS      1505   04000E0088040000 06-FEB-11 08.42.13.00 AM
ADAMS      1400                                                 06-FEB-11 08.12.09.00 AM
JAMES      1255   07001E00A9040000 06-FEB-11 08.12.09.00 AM     06-FEB-11 08.42.13.00 AM
JAMES      1355   04000E0088040000 06-FEB-11 08.42.13.00 AM
JAMES      1250                                                 06-FEB-11 08.12.09.00 AM
MILLER     1605   07001E00A9040000 06-FEB-11 08.12.09.00 AM     06-FEB-11 08.42.13.00 AM
MILLER     1705   04000E0088040000 06-FEB-11 08.42.13.00 AM
MILLER     1600                                                 06-FEB-11 08.12.09.00 AM
SMITH      1105   07001E00A9040000 06-FEB-11 08.12.09.00 AM     06-FEB-11 08.42.13.00 AM
SMITH      1205   04000E0088040000 06-FEB-11 08.42.13.00 AM
SMITH      1100                                                 06-FEB-11 08.12.09.00 AM

Clearly this data does not specify the user that was responsible for the modification. However, with the audit trail started with the AUDIT statement earlier on, we can now join the Flashback Versions Query to the USER_AUDIT_TRAIL view and find out more about WHO did this to our data:

select ename
,      sal
,      uat.client_id
,      uat.os_username
,      uat.userhost
,      uat.username
,      uat.scn
,      versions_xid
,      VERSIONS_STARTTIME
,      VERSIONS_STARTSCN
,      VERSIONS_ENDTIME
,      VERSIONS_ENDSCN
from   emp VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE eh
       join
       user_audit_trail uat
       on ( eh.versions_xid = uat.transactionid)
where  job = 'CLERK'
order
by     ename, versions_starttime

The results from this query show the client identifier value that betrays the real user behind the change:

ENAME  SAL  CLIENT_ID    OS_USERNAME USERHOST USERNAME SCN      VERSIONS_XID     VERSIONS_STARTTIME       VERSIONS_STARTSCN
------ ---- ------------ ----------- -------- -------- -------- ---------------- ------------------------ -----------------
ADAMS  1505 TheRealUser  demo        xp-vm    SCOTT    2200769  04000E0088040000 06-FEB-11 08.42.13.00 AM 2200778
JAMES  1355 TheRealUser  demo        xp-vm    SCOTT    2200769  04000E0088040000 06-FEB-11 08.42.13.00 AM 2200778
MILLER 1705 TheRealUser  demo        xp-vm    SCOTT    2200769  04000E0088040000 06-FEB-11 08.42.13.00 AM 2200778
SMITH  1205 TheRealUser  demo        xp-vm    SCOTT    2200769  04000E0088040000 06-FEB-11 08.42.13.00 AM 2200778

Note: if we want to include the records from the flashback versions query from before we started the audit trail, we should use a left outer join in our query:

select ename
,      sal
,      uat.client_id
,      uat.os_username
,      uat.userhost
,      uat.username
,      uat.scn
,      versions_xid
,      VERSIONS_STARTTIME
,      VERSIONS_STARTSCN
,      VERSIONS_ENDTIME
,      VERSIONS_ENDSCN
from   emp VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE eh
       left outer join
       ( select uat.*
         ,      row_number() over (partition by transactionid order by extended_timestamp) rn
         from   user_audit_trail uat
       ) uat
       on ( eh.versions_xid = uat.transactionid
            and uat.rn = 1
          )
where  job = 'CLERK'
order
by     ename, versions_starttime

Note how I have introduced an inline view to select only one record from user_audit_trail for every transaction, rather than one for every statement in each transaction – which is what user_audit_trail would otherwise return.

This approach works and is pretty painless too. For every table that we associate with a Flashback Data Archive because we want the journaling information, we simply also enable auditing. And that is that.

There is a drawback, however. Auditing introduces overhead. There is a slight performance penalty – as there is with triggers on journaling tables obviously, the latter firing for every record being touched – but that is not so much the issue. The volume of information stored in the system audit table SYS.AUD$ can be quite substantial, and it is stored collectively for all operations, users and objects that are being audited. This table should regularly be purged of data. Note that this can be done selectively – targeting audit trail entries for specific objects or operations.
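In 11g the dbms_audit_mgmt package can take care of such purges; a sketch (the 24-hour cleanup interval and the 30-day cut-off are arbitrary choices):

```sql
begin
  -- one-time initialization of audit trail cleanup
  dbms_audit_mgmt.init_cleanup
  ( audit_trail_type         => dbms_audit_mgmt.audit_trail_aud_std
  , default_cleanup_interval => 24 /* hours */
  );
  -- records older than this timestamp become eligible for purging
  dbms_audit_mgmt.set_last_archive_timestamp
  ( audit_trail_type  => dbms_audit_mgmt.audit_trail_aud_std
  , last_archive_time => systimestamp - interval '30' day
  );
  -- purge the standard audit trail up to that timestamp
  dbms_audit_mgmt.clean_audit_trail
  ( audit_trail_type        => dbms_audit_mgmt.audit_trail_aud_std
  , use_last_arch_timestamp => true
  );
end;
/
```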

To combine the ease of use of the standard audit trail with a controlled audit trail, we could implement a job that periodically retrieves the records from the sys.aud$ table and uses them to construct the transaction history we require. As the auditing mechanism records an entry for every DML statement in a transaction and we only need a single record per transaction, we can typically substantially reduce the size of the transaction history by aggregating the per statement audit trail entries to per transaction records.

The transaction history table could be something like:

create table transaction_history
( transaction_id              varchar2(100) not null
, client_identifier           varchar2(200)
, os_user                     varchar2(200)
, scn                         number
, transaction_start_timestamp timestamp default systimestamp
, db_user                     varchar2(100)
, application_info            varchar2(2000)
)

and a statement to retrieve the transaction history from the audit trail could be:

insert into transaction_history
( transaction_id
, client_identifier
, os_user
, scn
, transaction_start_timestamp
, db_user
)
select uat.transactionid
,      uat.client_id
,      uat.os_username
,      uat.scn
,      uat.extended_timestamp
,      uat.username
from   ( select uat.*
         ,      row_number() over (partition by transactionid order by extended_timestamp) rn
         from   user_audit_trail uat
        ) uat
where  rn = 1
and    transactionid is not null
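Such a periodic job can be scheduled with dbms_scheduler; a sketch, assuming the insert statement above has been wrapped in a (hypothetical) procedure refresh_transaction_history that also guards against capturing the same transaction twice:

```sql
begin
  dbms_scheduler.create_job
  ( job_name        => 'CAPTURE_TRANSACTION_HISTORY'
  , job_type        => 'STORED_PROCEDURE'
  , job_action      => 'refresh_transaction_history'
  , repeat_interval => 'FREQ=HOURLY'
  , enabled         => true
  );
end;
/
```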

With the audit trail (potentially partially purged) and the transaction history table (potentially not completely up to date) in place, we can combine the two as follows to construct the history of the records in table EMP:

select ename
,      sal
,      versions_xid
,      nvl(th.client_identifier, uat.client_id) client_id
,      nvl(th.os_user, uat.os_username) os_user
,      uat.userhost userhost
,      nvl(th.db_user, uat.username) db_user
,      nvl(th.scn,uat.scn) scn
,      VERSIONS_STARTTIME
,      VERSIONS_STARTSCN
,      VERSIONS_ENDTIME
,      VERSIONS_ENDSCN
from   emp VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE eh
       left outer join
       ( select uat.*
         ,      row_number() over (partition by transactionid order by extended_timestamp) rn
         from   user_audit_trail uat
       ) uat
       on ( eh.versions_xid = uat.transactionid
            and uat.rn = 1
          )
       left outer join
       transaction_history th
       on ( eh.versions_xid = th.transaction_id)
where  job = 'CLERK'
order
by     ename, versions_starttime

Note: it would be practical to combine the audit trail information and our own transaction_history table in a single View.
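A sketch of such a view (the view name is mine and the column list is abbreviated; rawtohex is used to align the RAW transaction id from the audit trail with the varchar2 transaction_id column):

```sql
create or replace view all_transaction_history
as
select transaction_id
,      client_identifier
,      os_user
,      db_user
,      scn
from   transaction_history
union all
-- audit trail entries not (yet) copied into transaction_history
select rawtohex(uat.transactionid)
,      uat.client_id
,      uat.os_username
,      uat.username
,      uat.scn
from   ( select uat.*
         ,      row_number() over (partition by transactionid order by extended_timestamp) rn
         from   user_audit_trail uat
       ) uat
where  uat.rn = 1
and    uat.transactionid is not null
and    not exists
       ( select null
         from   transaction_history th
         where  th.transaction_id = rawtohex(uat.transactionid)
       )
```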

The approach outlined above provides us with a way to create journaling for standard applications – for which we are, for example, not allowed to create triggers. Simply enable auditing of DML statements on the tables of the standard application – and associate these tables with a flashback data archive – and journaling is in full swing. More on that in a later article.

There may be one hurdle to overcome before we can actually implement journaling in this way: the DBA. Auditing – especially the standard auditing features – is very much the domain of the database administrator, and as application developers in need of a journaling feature in our application we may not be able to count on the DBA's blessing. (However, we need him or her to set up the flashback data archives as well.)

In a future article I will discuss two slightly alternative – though similar – approaches to track the transaction history and thus complement the journaling details gathered in the flashback data archive.

Resources

My original article on replacing journalling tables through using Flashback Query – http://technology.amis.nl/blog/2453/oracle-11g-total-recall-flashback-in-the-hands-of-database-designers-and-application-developers-at-last-and-the-end-of-journalling-tables

Very good article on Flashback Data Archive, Flashback Query and analyzing historical data by Rob van Wijk – including a discussion of the last remaining limitations (and the ones that 11gR2 got rid of): http://rwijk.blogspot.com/2009/12/journaling-using-flashback-data.html

Rob’s overview of how to set up the alternative to Journaling Tables relying on ‘Total Recall’ or Flashback Data Archives: http://rwijk.blogspot.com/2008/11/journaling-using-flashback-data.html.

Oracle Database 11gR2 Documentation on
Flashback Data Archive – http://download.oracle.com/docs/cd/E11882_01/appdev.112/e17125/adfns_flashback.htm#ADFNS1008 (Advanced Application Development), http://download.oracle.com/docs/cd/E11882_01/appdev.112/e16760/d_flashb.htm#ARPLS142 (DBMS_FLASHBACK in Packages and Types)
Audit – http://download.oracle.com/docs/cd/E11882_01/server.112/e17118/statements_4007.htm#SQLRF01107 (the AUDIT statement), http://download.oracle.com/docs/cd/E11882_01/network.112/e16543/auditing.htm#DBSEG006 (Security Guide)


About Author

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director for Fusion Middleware. Consultant, trainer and instructor on diverse areas including Oracle Database (SQL & PLSQL), Service Oriented Architecture, BPM, ADF, Java in various shapes and forms and many other things. Author of the Oracle Press book: Oracle SOA Suite 11g Handbook. Frequent presenter on conferences such as JavaOne, Oracle OpenWorld, ODTUG Kaleidoscope, Devoxx and OBUG. Presenter for Oracle University Celebrity specials.
