In previous articles, I have suggested that the arrival of the Flashback Data Archive in Oracle RDBMS 11g allowed us to finally say goodbye to journalling tables. Keeping track of all changes and previous states of our records in our own dedicated tables is no longer required, was my statement (for example in http://technology.amis.nl/blog/2453/oracle-11g-total-recall-flashback-in-the-hands-of-database-designers-and-application-developers-at-last-and-the-end-of-journalling-tables). Not only would using Flashback Data Archives require less programming, it improves performance for DML operations substantially and also provides a lot of functionality at our fingertips, as Flashback Queries easily incorporate historical records in straightforward SQL queries and using dbms_flashback we can even turn back time and regard all data as it was back then, using the same application and all the same queries.
In the not too distant past, Flashback knew quite a long list of limitations that made it almost impossible to make any changes to a table definition and still retain the historical data. With recent improvements in 11gR2, most of these limitations have been removed and DDL such as adding, renaming and removing columns and constraints is now possible for tables with an associated Flashback Data Archive.
One element of journaling tables is missing in the Flashback Data Archive and from the Flashback Queries. In addition to the When (timestamp for the DML operation) and the What (type of operation and the old values of the data record) it would be nice to know WHO caused the change to occur.
It is important to discuss right at this point what we mean by WHO in this respect. In the past that used to be the USER – the database user committing the transaction. However, in today’s multi-tier world with Service Bus components and Web Applications accessing the database through pooled Database Connections, the database user is usually not indicative at all of the real user behind a transaction. The end user behind the operation is usually (or should be) available from sys_context(‘userenv’, ‘client_identifier’), set by dbms_session.set_identifier() or directly on the connection (in JDBC for example). For our journalling purposes, we need to record or retrieve the real perpetrator and therefore the client identifier.
This article describes ways to record and later on find the client identifier value at the time of the transaction.
Standard Database Audit Facilities
I have to admit I am not overly familiar with the Audit capabilities of the Oracle RDBMS. For some reason that feels like someone else’s domain – one I have not ventured into a lot. However, when trying to pry information about past transactions from somewhere in the database, it is only logical to turn to auditing of course. And this approach readily gives me what I need: a record per tranasaction – and by transactionid that can be joined to the flashback version queries – of data such as time and all information about the user executing the operation, including the client identifier.
Auditing in the Oracle Database can be enabled on many different actions – from create and grant to update and execute – and on various levels – per object type, or per database user executing the action and even per individual object. Auditing may result in huge volumes of data, that may have to be purged on a regular schedule. We will come back to that a little later.
To get started with creating the transaction history I desire for my alternative journaling table, I need to switch on auditing. And I need to decide on the level of auditing – which operations and which objects (and perhaps user?). Let’s say that I want to have full journaling for table EMP in SCOTT’s schema. I need to associate this table with a FLASHBACK DATA ARCHIVE – as described in the referenced article – and I need to ensure that every transaction with a DML operation on this table is recorded. That means, the following auditing should be switched on:
AUDIT INSERT, UPDATE, DELETE ON SCOTT.EMP
With this auditing in place, I could pretend to a be a web application through which a real end user makes changes in the data. The statements executed in the database would be for example:
To set the client identifier the application would execute
begin dbms_session.set_identifier('TheRealUser'); end;
and next the DML operation executed could be something like:
update emp set sal = sal + 100 where job ='CLERK'
After this transaction has been committed, the new values are available from table EMP – and the old values are now part of the history that Flashback Versions Query can unveil to us:
select ename , sal , versions_xid , VERSIONS_STARTTIME , VERSIONS_ENDTIME from emp VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE where job = 'CLERK'
rendering the following data:
ENAME SAL VERSIONS_XID VERSIONS_STARTTIME VERSIONS_ENDTIME ---------- ------ ---------------- ------------------------- ------------------------- ADAMS 1405 07001E00A9040000 06-FEB-11 08.12.09.00 AM 06-FEB-11 08.42.13.00 AM ADAMS 1505 04000E0088040000 06-FEB-11 08.42.13.00 AM ADAMS 1400 06-FEB-11 08.12.09.00 AM JAMES 1255 07001E00A9040000 06-FEB-11 08.12.09.00 AM 06-FEB-11 08.42.13.00 AM JAMES 1355 04000E0088040000 06-FEB-11 08.42.13.00 AM JAMES 1250 06-FEB-11 08.12.09.00 AM MILLER 1605 07001E00A9040000 06-FEB-11 08.12.09.00 AM 06-FEB-11 08.42.13.00 AM MILLER 1705 04000E0088040000 06-FEB-11 08.42.13.00 AM MILLER 1600 06-FEB-11 08.12.09.00 AM SMITH 1105 07001E00A9040000 06-FEB-11 08.12.09.00 AM 06-FEB-11 08.42.13.00 AM SMITH 1205 04000E0088040000 06-FEB-11 08.42.13.00 AM SMITH 1100 06-FEB-11 08.12.09.00 AM
Clearly this data does not specify the user that was responsible for the modification. However, with the audit trail started with the AUDIT statement earlier on, we can now join the Flashback Versions Query to the USER_OBJECT_AUDIT view and find out more about WHO did this to our data:
select ename , sal , uat.client_id , uat.os_username , uat.userhost , uat.username , uat.scn , versions_xid , VERSIONS_STARTTIME , VERSIONS_STARTSCN , VERSIONS_ENDTIME , VERSIONS_ENDSCN from emp VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE eh join user_audit_trail uat on ( eh.versions_xid = uat.transactionid) where job = 'CLERK' order by ename, versions_starttime
The results from this query shows the client identifier’s value that betrays the real user behind the change:
ENAME SAL CLIENT_ID OS_USERNAME USERHOST USERNAME SCN VERSIONS_XID VERSIONS_STARTTIME VERSIONS_STARTSCN ---------- ---------------------- ---------------------------------------------------------------- ------------------------ ADAMS 1505 TheRealUser demo xp-vm SCOTT 2200769 04000E0088040000 06-FEB-11 08.42.13.00 AM 2200778 JAMES 1355 TheRealUser demo xp-vm SCOTT 2200769 04000E0088040000 06-FEB-11 08.42.13.00 AM 2200778 MILLER 1705 TheRealUser demo xp-vm SCOTT 2200769 04000E0088040000 06-FEB-11 08.42.13.00 AM 2200778 SMITH 1205 TheRealUser demo xp-vm SCOTT 2200769 04000E0088040000 06-FEB-11 08.42.13.00 AM 2200778
Note: if we want to find the records from the flashback versions query from before we started doing the audit trail, we should use a left outer join in our query:
select ename , sal , uat.client_id , uat.os_username , uat.userhost , uat.username , uat.scn , versions_xid , VERSIONS_STARTTIME , VERSIONS_STARTSCN , VERSIONS_ENDTIME , VERSIONS_ENDSCN from emp VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE eh left outer join ( select uat.* , row_number() over (partition by transactionid order by extended_timestamp) rn from user_audit_trail uat ) uat on ( eh.versions_xid = uat.transactionid) where job = 'CLERK' and rn = 1 order by ename, versions_starttime
Note how I have introduced an inline view, to only select one record from user_audit_trail for every transaction, rather than one for every statement in each transaction which is what user_audit_trail will return.
This approach works and is pretty painless too. For every table that we associate with a Flashback Data Archive because we want to have the journaling information we simply also enable auditing. And that is that.
There is a drawback however. Auditing introduces overhead. There is a slight performance penalty – as there is with triggers on journaling tables obviously, the latter firing for every record being touched – but that is not so much the issue. The volume of information stored in the system audit table SYS.AUD$ can be quite substantial and it is stored collectively for all operations, users and objects that are being audited. This table should regularly be purged to data. Note that this can be done selectively – targeting audit trail entries for specific objects or operations.
To combine the ease of use of the standard audit trail with a controlled audit trail, we could implement a job that periodically retrieves the records from the sys.aud$ table and uses them to construct the transaction history we require. As the auditing mechanism records an entry for every DML statement in a transaction and we only need a single record per transaction, we can typically substantially reduce the size of the transaction history by aggregating the per statement audit trail entries to per transaction records.
The transaction history table could be something like:
create table transaction_history ( transaction_id varchar2(100) not null , client_identifier varchar2(200) , os_user varchar2(200) , scn number , transaction_start_timestamp timestamp default systimestamp , db_user varchar2(100) , application_info varchar2(2000) )
and a statement to retrieve the transaction history from the audit trail could be:
insert into transaction_history ( transaction_id , client_identifier , os_user , scn , transaction_start_timestamp , db_user ) select uat.transactionid , uat.client_id , uat.os_username , uat.scn , uat.extended_timestamp , uat.username from ( select uat.* , row_number() over (partition by transactionid order by extended_timestamp) rn from user_audit_trail uat ) uat where rn = 1 and transactionid is not null
With the audit trail (potentially partially purged) and the transaction history table (potentially not completely up to date) we can use the following combination of the two to construct the history of the records in table EMP.
select ename , sal , versions_xid , nvl(th.client_identifier, uat.client_id) client_id , nvl(th.os_user, uat.os_username) os_user , uat.userhost userhost , nvl(th.db_user, uat.username) db_user , nvl(th.scn,uat.scn) scn , VERSIONS_STARTTIME , VERSIONS_STARTSCN , VERSIONS_ENDTIME , VERSIONS_ENDSCN from emp VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE eh left outer join ( select uat.* , row_number() over (partition by transactionid order by extended_timestamp) rn from user_audit_trail uat ) uat on ( eh.versions_xid = uat.transactionid) left outer join transaction_history th on ( eh.versions_xid = th.transaction_id) where job = 'CLERK' and uat.rn = 1 order by ename, versions_starttime
Note: it would be practical to combine the audit trail information and our own transaction_history table in a single View.
The approach outlined above provides us with a way to creating journaling for standard applications, for which we for example are not allowed to create triggers. Simply enable auditing of DML statements on the tables of the standard application – and associate these tables with a flashback data archive – and journaling is in full swing. More on that in a later article.
There may be one hurdle to overcome before we can actually implement journaling in this way: the DBA. Auditing – especially the standard auditing features – are very much the domain of the database administrator and as application developers in need of a journaling feature in our application we may not be able to count on the DBA’s blessing. (however, we need him or her to set up the flashback data archives as well).
In a future article I will discuss two slightly alternative – though similar – approaches to track the transaction history and thus complement the journaling details gathered in the flashback data archive.
My original article on replacing Journaling tables through usinjg Flashback Query – http://technology.amis.nl/blog/2453/oracle-11g-total-recall-flashback-in-the-hands-of-database-designers-and-application-developers-at-last-and-the-end-of-journalling-tables
Very good article on Flashback Data Archive, Flashback Query and analyzing historical data by Rob van Wijk – including a discussion of the last remaining limitations (and the ones that 11gR2 got rid of): http://rwijk.blogspot.com/2009/12/journaling-using-flashback-data.html
Rob’s overview of how to set up the alternative to Journaling Tables relying on ‘Total Recall’ or Flashback Data Archives: http://rwijk.blogspot.com/2008/11/journaling-using-flashback-data.html.
Oracle Database 11gR2 Documentation on
Flashback Data Archive – http://download.oracle.com/docs/cd/E11882_01/appdev.112/e17125/adfns_flashback.htm#ADFNS1008 (Advanced Application Development), http://download.oracle.com/docs/cd/E11882_01/appdev.112/e16760/d_flashb.htm#ARPLS142 (DBMS_FLASHBACK in Packages and Types)
Audit – http://download.oracle.com/docs/cd/E11882_01/server.112/e17118/statements_4007.htm#SQLRF01107 (the AUDIT statement), http://download.oracle.com/docs/cd/E11882_01/network.112/e16543/auditing.htm#DBSEG006 (Security Guide)