Role of the Database in a J2EE Architecture - Dumb Data Store? Only Data Tier? Also Business Tier? americas cup win 2682133k1

Role of the Database in a J2EE Architecture – Dumb Data Store? Only Data Tier? Also Business Tier?

The other day, we had a very interesting debate about the role of the database in (our) Java/J2EE projects and architectures. Should you limit the use of the database to data storage and retrieval and maybe the odd index to improve query performance? Should you try to leverage every advanced database feature money already has bought? How important – or realistic – is it to make the application completely vendor independent? And is the only way of achieving the desired level of database independence completely abandoning all specific database functionality – only using the lowest common denominator between all database platforms you may ever want to migrate your application to?

The discussion was part of the design of the presentation – and its abstract – that my colleague Aino has entered for the JFall conference (October 12th in Ede, The Netherlands, see http://www.nljug.org/) on proper use of the Database in a Java/J2EE architecture: Database and Java: combined forces.

Our company has long tradition as an Oracle specialist – in 1999 we started using Java/J2EE in full swing. We know a lot about the Oracle database and all its capabilities. And while we must watch out for the ‘we have a hammer so that problem is definitely a nail’ syndrome, we have a distinct feeling that in many J2EE architectures and Java applications, the usage of the database and its specific powers is needlessly limited, often even to a damaging level when it comes to for example performance, security and scalability.

A very simple example of the typical dilemma we see quite frequently: we were developing an application with one of our customers. The customer wanted a clean, strict OO and J2EE-ish application. In this application, we had a requirement for copying a certain business object. This meant deep-cloning an object graph with up to several thousands of objects, most of which were not even instantiated in our middle tier. Running this copy operation in the Java Middle Tier, as OO theory and J2EE seemed to dictate, took close to 10 seconds and a lot of code. Implementing the same operation in the database resulted in about 0.3 seconds execution time and less than 15% of the original amount of code. In this situation, the database code was a piece of PL/SQL that primarily contained a number of SQL statements. The code was not directly portable to other databases, though it would be easy to implement the same Stored Procedure API – so that the JDBC call from the Java application would be the same on each database.

Another subject on which we often have raging debates is the implementation of (data oriented) business logic. Rules that dictate the state of the data. There are several considerations when discussing the place to implement these rules. There is the element of user feedback, which suggests client side implementation. There is the issue of data integrity above all, which suggests implementation in the database. Then there are the mechanisms, such as ApacheCommons (Struts) Validator, ADF BC Validations, Drools, Oracle Rules, Declarative Database Constraints etc. that each suggest a certain way of implementing the rules.

We believe in separating the tiers, defining clear responsibilities. However, we do not necessarily assume that the borders between the tiers also need to be the borders between the physical layers. Unless you define all columns in your database tables as VARCHAR2, there is always some validation of data going on when you insert of update the data. The discussion could be on how much validation – implicit because of conversion or explicit with constraints or triggers – should take place inside the database.

I do not accept an argument like: the database is (in) the data tier and therefore should not contain any business logic or validations. First of all, who says the database is only part of the data tier? Why should it not implement part of the business tier as well? Second of all, it is plain stupid – if at all possible- to ban all validations and business logic from the database. Not only is the database much better suited for simple things like uniqueness enforcement and referential integrity performance wise, it is also the only place where you can really guard the integrity of the data.

Part of the reasoning behind the split between Application Logic (Business Tier) and the Presentation Tier – the sound ambition to have multiple Applications or Presentation Components make use of the same Business Logic – forms a very important argument for implementing the Data Integrity rules or Data Oriented Business Rules inside the database. You Java application will very likely NOT be the only client of the database. It is extremely likely that at some point in time, other applications or processes or even the Database Administrator or Application Administrator, will also access the tables in your database. Through low level database operations such as Import/Export, through direct manipulation through SQL, through messaging applications or in other ways. But in my experience you are naive to trust that all database access will be through your Java based middle tier Business Logic Tier. So apart from scalability, security, performance and ease (== productivity) of programming, the crucial concern for data integrity should motivate you to implement the data rules in the database.

Modern databases – and that includes MySQL, PostgreSQL in addition to the commercial heavy lifting RDBMS’s – provide a wealth of functionality in addition to just storing and retrieving data. No one forces you to use even the smallest bit of this functionality. But not using it because you have some dogmatic point of view about simply not wanting to touch it is really inexcusable. And I have the impression that it is not at all uncommon to have people abstaining from doing anything but the most trivial things with their databases because of a sheer lack of knowledge. Trying to cover up for your lack of knowledge with partylines like ‘we must ensure complete database independence’ and ‘we need our application to be portable across databases’ is a much sadder approach than trying to find out what these modern database can offer and then based on the merits of each case deciding whether or not to use them.

Show me the organizations that switched databases – from Oracle 9i to DB2 for example. Show me the applications that needed to be migrated from one database platform to another. Now show me the organizations that spent thousands or hundreds of thousands of dollars or euros on their database – and then use it so sparingly that they might as well have ‘bought’ the mentally challenged brother of MySQL. And then see if that organizations perhaps spent way too much on development because everything was done the hard way – in the Java middle tier. And spent even more on fixing data issues because of integrity violations.

If database independence is an issue, and of course those situations do exist, think of better ways of achieving maximum or better yet: optimal independence without giving up on the real power of databases. Yes, even if your database is not an Oracle database.

Oracle 9i Database Features every Java developer can benefit from

I am not an expert when it comes to other databases besides Oracle. So I would not want to presume to make any suggestions on those databases. However, I am pretty sure that each database offers a lot of functionality that as a Java developer developing against such a database you really ought to be aware of.

For Oracle databases, there is a number of areas that I would like to mention in particular, primarily because these features have saved my day on numerous occasions. And knowing about them is really worth your time.

  • Database Sequences (Oracle’s way of generating unique numbers), Before Row triggers and the Returning clause on Insert statements
  • Database Triggers in general and Table and View triggers in particular – yes, you can define a trigger on a view; it is a so-called Instead-Of trigger. When you perform a DML statement (insert, update or delete) against a view with an Instead-Of trigger, rather than attempting to execute the DML on the underlying tables, the PL/SQL code in your trigger is executed. You have access to all old and new values on the record being manipulated. And you can do whatever you like in this trigger.
  • PL/SQL in general, Table Functions in particular – where a PL/SQL function can be wrapped in a View as if it were a table. Imagine this statement: select * from PLSQL_function. That is more or less what Table Functions allow you to do.
  • Virtual Private Database (VPD), aka Row Level Security or Fine Grained Access Control. VPD has been developed especially for web-applications and ASP environments where multiple organisations have their data hosted in a single database. With VPD, each table is virtually partitioned: every user only sees a subset of records. However, there is only one table and the SQL and PL/SQL on the VPD-ed table is exactly the same as SQL against a non-VPD-ed table. VPD is particularly useful when depending on who has logged in to your web application, certain data should be visible. Note: for this feature, you require the Oracle Enterprise Edition.
  • Declarative Integrity – the core value of relational databases is in integrity. Think Primary Keys, Unique Keys and Foreign Keys. No better performing, more solid mechanism for maintaining referential integrity and the uniqueness of logical keys. Oracle also offers check constraints for declaratively specifying and enforcing tuple rules – at the record level. Do not even think about implementing either Foreign Key or Unique Key in the Java Middle Tier.
  • Ref Cursors – a great way of transfering data from a PL/SQL API to a Java application. Ref Cursors are easily defined in PL/SQL – using the SYS_REFCURSOR type – and can surprisingly easily be processed in Java: a Ref Cursor is converted into a JDBC ResultSet. Ref Cursors are ideal for hiding SQL inside the database while providing SQL-like access to data from the database
  • Communication from the database – Oracle knows how to access Web Services, do HttpRequests (get or post) , send emails, communicate with message queues and write to and read from files
  • XML functionality – the Oracle Database can easily return XML documents as query result; it can also validate XML against XSD, transform XML using XSLT, store XML documents either as relational data or as XML Document etc.
  • Good Old SQL – You may think you know pretty much about SQL. I know I thought so, a year and a half ago. Then I discovered Analytical Functions, In Line Views, Hierarchical Queries (9i style), Merge operations, Materialized Views, Database Event triggers, Regular Expressions inside SQL and many more useful syntax, functions and functionality (this resulted in our Oracle 7Up Workshop). Take a look. You might just be surprised.

Conclusion

The first step every Java/J2EE architect should take is be up to speed with the capabilities of the database you have at your disposal. Going from there – with an open mind towards what ought-to-be and what-should in terms of ‘proper’ implementation of functionality and even with regard to traditional boundaries between logical Tiers – decide how to implement functional requirements. Be pragmatical about them, not theoretical or even worse: dogmatic. There are many things that have been optimized in the database kernel. Things your Java code will never be able to do better – in a realistic timeframe. Aggregations, uniquess enforcement, data auditing, security, scalability, copy of large volumes of data, bulk updating or deleting are just a few examples of things that the database is simply better at.

If database portability really is a concern, try to access the database through simple APIs, SQL and Stored Procedure based interfaces, that can be defined in any serious database. The implementation can be different between databases, but your application only converses with the interfaces.

Consider whether the database is really only good for data tier stuff. Dumb data stuff. Be sure that you know what can be done before deciding you do not want it. Do not develop on an $100000 Enterprise Edition database as if it were Microsoft Access. Unless you want keep open the option to migrate to MS Access at some point in the future.

5 Comments

  1. daryl fong September 9, 2005
  2. Toon Koppelaars September 6, 2005
  3. Mike Friedman September 3, 2005
  4. Robert Gauf September 3, 2005
  5. Ben C September 2, 2005