RuleGen – the next generation framework for Data Logic (aka Data oriented business rules)?!

RuleGen is a framework that aids in implementing data integrity rules inside the Oracle RDBMS. That’s what it says on the website – www.rulegen.com – and that was the story told by Toon Koppelaars, architect of the RuleGen product and co-founder of the RuleGen company, at the AMIS KC Business Rules session two days ago. For an audience of some 25 Oracle specialists, Toon made it very clear where he stands, what RuleGen does – and what it does not – and what is in it for us. In this article I will discuss RuleGen as well as the question if RuleGen is the next generation framework for implementing Data Logic – and as such perhaps the successor of the CDM RuleFrame framework that I laid the foundation for in the late 1990s (while working at Oracle).

RuleGen - the next generation framework for Data Logic (aka Data oriented business rules)?! soahumantask082 ....

Scope 

First of all Toon defined the term Business Rule in the scope of RuleGen. He divided information systems and application code into  three segments: Data Logic – everything that describes valid states of the data -, UI – everything that contributes to painting the user interface and responding to user events – and Business Logic – that vague, hard to define area of everything else. RuleGen and his talk on Thursday were about Data Logic. Business Rules – data constraints – that apply to data – in the database. Not to data in files, client or middle tier. Not to process logic, workflow or authorization and not to UI. When a Business Rule can be completely expressed in terms of the tables and columns – then it’s a rule for RuleGen.

Some examples of such business rules:

  • Employees may not earn more than their managers
  • A department may not have more than two clerks
  • The most recently hired employee in a certain job may not have the highest salary of all employees with that job.

The real value of any company – apart from its people, Toon granted us – is its data. Application may come and go, the data(base) is here to stay. And protecting that data, ensuring its integrity is what Toon and RuleGen are all about.

RuleGen complements the Oracle integrity constraints infrastructure. Toon makes a clear choice: for attribute and tuple rule – use the Column Definition and Check Constraints that Oracle offers. Use PK, UK and FK constraints where appropriate. For the other constraint rules that Oracle does not have a declarative solution for – consider RuleGen. For those of you familiar with CDM and CDM RuleFrame – we are primarily talking about entity and inter entity rules. At this point RuleGen does not support dynamic rules – though some types of dynamic rules are within scope for the next release(s) of RuleGen. RuleGen will probably never handle ‘complementary’ rules such as the Change Event Rules with or without DML and complex default column values. Those are not business rules according to Toon’s definition.

Concepts 

At the heart of RuleGen is the idea of the Transition Effect (TE). That is the set of records that a DML statement has modified. It consists of three subsets: inserted rows, updated rows and deleted rows. Constraints only need to be enforced for rows that are part of the TE. Before the statement, the data integrity was satisfied. Only the changes – collectively known as TE – can have caused a violation of the integrity rules so when enforcing the business rules, all we need to do is look at those changes. RuleGen makes the TE available through three views (per table): the ITE, UTE and DTE. These views provide for all rows changed in a statement the relevant column values – and in case of updates both the old and the new values. (we will come back to the ‘relevant’  columns).

Of course not every row in a TE view for a certain table is to be validated for every rule defined against that table. Inserting a new clerk for example will never violate a rule that only applies to managers. Deleting an employee will never lead to violation of a rule that puts an upper limit on the number of employees. Going forward from that conclusion, the second stage in RuleGen is filtering – per rule – the set of rows in the TE, down to a subset of rows that is potentially capable of violating the rule.

The third stage is the actual validation: given the selected rows, those that are part of the TE – changed in the current statement – and have the potential of causing a rule violation, we will perform the actual check whether the rule has violations against it at the end of the statement. 

For example: if the rule specifies that the average salary in any department should be under 10000 –

  1. the first stage (TE) will give us all inserted, updated and deleted employees,
  2. the second stage will limit this to only newly inserted employees as well employees whose salary was increased (as salary decreases could never lead to violation of this particular rule) and employees that were deleted (as removal of a poorly paid employee could bring up the salary) AND for these employees it will return the set of department numbers (each department only once)
  3. the third stage will check for each department if the average salary is under 10000

Note that validation takes place at statement level – not at transaction level (commit time). More on this later on.

The first stage is based on row-level triggers – the only mechanism in the Oracle database that allows us to build up the TE views. The Oracle RDBMS does not offer a list of all rows modified during the current transaction. You need to keep track of those yourself.

The second and third stage are largely SQL driven: SQL to filter from the TE the rows that should be validated in the third stage and the SQL in the third stage to perform the actual validation (perform a query to find violations and declare success when none are found).

The theoretical foundation for RuleGen is described in the book Applied Mathematics for Database Professionals (2007, APress) that Toon co-authored with Lex de Haan. In particular Chapter 11 gives a clear overview of the logic underlying RuleGen. It also gives code examples of implementing TE views and triggers that use them for implementing constraints. This chapter also describes the locking mechanism for properly handling multiple sessions changing rows that trigger the same business rule. After closely reading this chapter – you could attempt to build your own RuleGen (although it would still be a considerable task). At least you will understand very well what RuleGen does on the inside.

Implementation 

RuleGen uses one After Row trigger for every table that has business rules defined against it – this trigger helps constructs the Transition Event (TE) – using a global temporary table. Note: there is but a single temporary table that supports every table with rules defined against it. This is an object table that holds instances of subtypes created per table. The table specific subtype has attributes for the involved columns.  It also has an After Statement trigger for every table with rules – that is where and when the validation is initiated. The actual 2nd and 3rd stage – filter with TE query and validate with CV query – is done in a package. Every rule is implemented in its own PL/SQL package.

For the implementation of a business rule, these steps are required:

  1. Create the rule and indicate the involved tables and columns (which column values are required for the validation which is equivalent to ‘values in/changes of which columns could trigger rule violation)
  2. Provide up to three TE queries – for Insert, Update and Delete – , that specify for the rule which rows from the TE should be investigated (note: this can also mean a distinct list of parameter values that help focus the validation on a specific area, such as a list of departments for which the number of Clerks needs to be validated). If for example deleting records can never cause the rule to be violated, no TE query for the delete event needs to be provided.
  3. Provide the CV query – Constraint Validation – that is executed for every row returned from the TE queries, can make use of parameters provided from the TE-queries, and queries for the existence of any rows that violate the business rule. When no row is returned by the CV query, the rule is not violated. If the query returns a row (never more than one) that signifies a violation and the CV query should return the appropriate error message.

TE queries refer to views: INSERTED_ROWS, UPDATED_ROWS and DELETED_ROWS. These views – part of the RuleGen runtime infrastructure – are implemented using Table Functions – they query from the table specific TE collection and filter on the required event.

RuleGen provides an APEX based – somewhat sober – Web Frontend where these steps are performed by the developer. It will generate the packages per rule as well as the two triggers and the subtype required for a table. Note that not only the RuleGen runtime packages are wrapped, but also the packages it generates for every rule!

RuleGen offers control of the sequence in which rules are validated for a table – something we cannot do with declarative constraints in the Oracle RDBMS.

Much more importantly, RuleGen allows us to specify both immediate and deferred as trigger moment. That means that we can have rule violations reported at the statement level: if the at the end of the statement the data is not in a valid state, an Application Error is raised and the statement rolled back. And also at the transaction level: only if at the end of the transaction, after perhaps many DML statements the data is not in a valid state, an Application Error is raised and the statement rolled back. Note however that the actual validation is at the statement level in both cases and after each statement, we can ask the RuleGen engine for the current list of violations. RuleGen will verify for each subsequent statement whether the outstanding violations have now been repaired. This initially puzzling approach – why not simply validate at transaction level? – is quite useful (though perhaps slightly less efficient performance wise) for supporting ‘rollback to savepoint’ – an operation that is bound to severely confuse a framework such as CDM RuleFrame. Note: Toon strongly recommends only to use the deferred trigger time if it is really required for some business reason. Even though conceptually a transaction only has to abide by all the rules when it is committed, Toon’s considers it preferable to handle all validations at statement level.

 

RuleGen does not have an equivalent to the 10gR2 feature of DML Error Logging that allows us to complete a statement that violates (immediate) constraints and saves the violations into a separate table for inspection. It would be nice if it did – but it is only a nice to have. However, the above mentioned deferred trigger moment with the per statement validation and list of violations comes very close.

RuleGen follows the Oracle approach in many other ways. Take the default behavior of Oracle with regard to constraints and you pretty much know the default behavior for rules – constraints – implemented with RuleGen. Deferrable, initially immediate for example (statement level validation) and Enabled. It allows us to disable  rules (per session if we like) and turn them on at some later point in time. It does not have – at this point, however it is on the list with Toon’s intended enhancements – the option to enforce the rule at the time of re-enabling the constraint (and possibly saving violations into an exceptions table).

A nice option is the switch between STOP_ON_ERROR and CONTINUE_ON_ERROR. The default – following Oracle’s approach with declarative constraints – is STOP_ON_ERROR which basically means that as soon as an error ( a rule violation) is found, an application error is raised and all further processing is aborted. With CONTINUE_ON_ERROR, RuleGen will continue the validation of all rows and rules, and at the end report back with an application error and a list of all violations it encountered – comparable to the error (message) stack in CDM RuleFrame.

I was impressed by the fact that RuleGen already knows how to deal with multiple sessions: changes made in one session at the same time as changes made in another session can lead to incorrect results. For example with a rule like ‘only one president allowed’: session 1 inserts a president and validates: since its president is the first, the validation succeeds. Session 2 does exactly the same, and since session 1 has not yet committed, the president is not yet visible to session 2 and therefore its president also seems to be the first and validation succeeds. Since both sessions are in a valid state, both can commit and we end up in an invalid state – with two presidents. Of course if we do the validation at the transaction level – like RuleFrame does – chances of this occurring are much smaller. But this situation can still arise.

The solution I came up with – back in 2003, with a valuable insight from Ton Elie as we landed on Miami Internation Airport for the ODTUG 2003 conference- uses the dbms_lock package for grabbing user defined locks for such business rules. And lo and behold – RuleGen has a very similar solution to this issue, that effectively repairs this potential hole – that exists in almost all homegrown business rule implementations! In the example above, session 1 would try to get the user defined lock called ‘BR_ONE_PRESIDENT_ONLY’; it would succeed and insert the president. Session 2 would fail to grab the lock, as session 1 holds it. Session 2 has to wait for session 1 to commit.By that time the president inserted by session 1 is visible to session 2 and validation of its president would correctly fail. If the rule would be ‘only one president in a department’ the user defined lock would be called ‘ONLY ONE PRESIDENT IN DEPT’||deptno – and the two sessions would only collide if they were dealing with the same department. All of this – fairly complex stuff – is already part of RuleGen.

Quick Summary of Strong and Weak spots of RuleGen 

In a very brief, black & white summary, I would say the strong points of RuleGen include:

  • Clear focus and scope: data logic, just entity, inter-entity and in the near future also dynamic rules 
  • SQL (set) oriented (not row by row processing); efficient, clear, robust; single expertise (no PL/SQL skills required)
  • Closely aligned with the standard mechanisms and approach of the Oracle RDBMS (same defaults, same terminology, very complementary)
  • Support for Immediate and Deferred
  • Support for continue on error (as well as stop on error)
  • Support for Rollback to Savepoint
  • Support for Multiple Sessions (where most solutions fall short)
  • Run-time Monitor (though only very simple Logging at this time)
  • No dependencies other than the Oracle RDBMS (no Designer, Forms,… etc.)
  • Run-time UI for managing rules (enabled/disabled, immediate/deferred) (there could be much more run-time management, but it is a nice start)

 Areas where improvement would be nice:

  • there is no support from the RuleGen framework for custom PL/SQL functions you might want to use for very complex business rules (where the SQL simply becomes to hard to write or maintain)
  • the support for the error messages is very rudimentary; no support for i18n – translation of error message or distinction between message and message parameters
  • the UI could be much more user friendly and much more productive
  • the UI should come in at least two, perhaps three flavors (or roles): developer, DBA and application manager/administrator
  • I would like to see more run-time monitoring support (which rules are frequently violated (upgrade the application), by whom (train this person!) and for which SQL statement (as suggested by Toon himself)
  • support for Analysis
  • support (guidelines and enformance) for naming conventions
  • projects and/or authorization in the RuleGen UI: not all developers should be able to access all rule definitions
  • earlier checking of the SQL (now erroneous SQL is only revealed when the rule package is installed in the database)
  • support for rule patterns through SQL templates for TE and CV queries or even wizards that compile those queries based on some input parameters (Toon mentioned rule patterns as one of the enhancement high on his list)
  • derive the involved columns from the TE and CV queries (using 11g Column Level dependencies) (perhaps even derive the TE queries from the CV query?)
  • support for validation of rule upon enabling the rule
  • the RuleGen repository currently has no support for version control or any type of life cycle management of rules. Since the generated code is wrapped, it does not work well in Version Control systems – especially with regard to compare and merge operations; I find this a limitation for putting RuleGen to use in larger, more complex environments.

Comparison with CDM RuleFrame

Given my own background – as well as the fact that it is the only alternative framework for Business Rule implementation in the Oracle database (working with Toon’s definition of Business Rule as Data Logic – data model based rules for valid end-states) – it seems logical to compare RuleGen to CDM RuleFrame – the framework I helped build starting in the Summer of 1999. See: Headstart Oracle Designer homepage on OTN for more details on CDM RuleFrame.

First of all, CDM RuleFrame reached its peek around 2002 and has more or less been dormant since 2004, the year Toon started development of RuleGen. For the last four years, CDM RuleFrame has not undergone any (serious) changes. In September 2004 I wrote an article on this blog: The future of CDM RuleFrame – the Oracle framework for server-side business rules. In this article, I provide a list of over a dozen enhancements that I considered from useful to mandatory for a healthy future for CDM RuleFrame. Many of the changes I proposed were fairly easy to implement. At least one of them was absolutely required, as CDM RuleFrame has a hole – it is not entirely robust. You can read about this hole in this article: On the false sense of security with PL/SQL based implementation of business rules – and what to do about it. RuleGen has a solution for this hole! CDM RuleFrame also was never disassociated from Oracle Designer – its strong ties with Designer initially a great boon but now an increasing liability, given where Oracle Designer is headed.

RuleGen only recently hit the market – although its development started in 2004 and it has been in use at Centraal Boekhuis – a major Oracle user with advanced logistical processes – since 2005. It uses modern database facilities, has a strong theoretical foundation and has dedicated, passionate attention from one of the best guys in this area. It is not yet complete (no dynamic rule support for example) and development cannot progress very rapidly at this point given the size of its development team. I wonder whether Toon would consider open sourcing (part of) RuleGen to invite a large community to help grow the product. I know we at AMIS would very much like to help with such a project (and we may do that anyway, although the wrappedness of much of RuleGen is not very inviting).

CDM RuleFrame is for most intents and purposes an Oracle 7 framework. At one point in time, that was a huge advantage, as any organization still on Oracle 7 could adopt it. However, going forward, CDM RuleFrame has failed to leverage many more modern facilities in the Oracle RDBMS. While there could have been gained much – especially in terms of performance and ease of implementation. There are no signs from Oracle Consultancy, the owner of CDM RuleFrame, that substantial improvements are to expected. I even heard rumors that the next generation Oracle Health (OpenZorg) – will not make use of CDM RuleFrame. We invited Oracle Consulting to also present in our Knowledge Center session on Business Rules, but they declined that invitation.

CDM RuleFrame – rooted in CDM, Oracle’s Custom Development Method, provides a very structured approach to analysis and design of business rules. Its classification scheme, naming conventions, event analysis and analysis tool support (tied to Oracle Designer) offer analysts, designers and developers a shared BR language, approach environment. It also covers a much wider range of rules – including much of what Toon considers Business Logic, outside his domain of Data Logic. RuleGen focuses on implementation of rules and offers a little support for technical design. Both RuleFrame and RuleGen have important run-time services in addition to the core engine that takes care of identifying the rows that need investigating and invoking the validation code at the proper time. I would like to see some guidance at least – and preferably considerably more – as to how do perform analysis and design for business rules leading up to a RuleGen implementation. Toon starts his story with the definitions of the Business Rules to be implemented already available. He currently does not have analysis/design in his scope. Nor does he pay much attention to naming business rules. I think RuleGen would be much improved by stronger support at that level – or at least a clear How-To on how to do analysis and make the transition to RuleGen. Perhaps even a way to import Business Rule definitions from for example an Excel document into RuleGen would we useful.

CDM RuleFrame is very much a PL/SQL oriented framework. Not just the triggers registering which rows are changed – unavoidable and unavoidably PL/SQL based – but also the logic identifying which rows to validate (the need functions in the CAPI packages) and the (per row) validations themselves are coded in PL/SQL. Especially for batch operations, this approach can cause performance issues. A set based (SQL) approach for validation can be much more efficient for larger numbers of rows. Writing the need function logic and the actual validation logic can be easier in SQL – for simple rules – or in PL/SQL – for very complex rules. RuleGen expects you to write all filtering (the TE queries) and validating logic (CV queries) in SQL. It is a clear, straightforward and consistent approach that only requires your team to have SQL skills – but sometimes quite advanced SQL skills (or supporting PL/SQL functions for which RuleGen does not seem to offer support for creating or managing). However, in many situations where Entity or Inter Entity rules are enforced, a SQL based set approach is superior to a row-by-row PL/QL implementation.

CDM RuleFrame is very much transaction oriented – rather than statement level. While it can do statement level validations when so instructed, the default behavior is to validate only once the transaction is complete. The philosophy behind this: we only care of the data satisfies all conditions for validity when the client (user) considers the transaction logically complete. Before that time, while still making changes, the transaction can be in any invalid state – it does not (yet) count. (in fact, I have used a slide created by Toon himself for making that point about intermediate ‘invalid’ database state when presenting on CDM RuleFrame!). All validations are performed at the end of the transaction. That can prove very efficient: for example a rule such as ‘there may not be more than two Clerks in a department’ only needs to be validated once during (at the end of) a transaction for every department. With statement level validation – as RuleGen does – you stand the risk of inspecting the same Department several times during a transaction, if during that transaction multiple statements insert or update (or delete if RuleGen is in deferred mode for this business rule). Transaction level validation would also mean taking locks for much shorter periods of time: instead taking a br-lock as soon as the statement occurs and only releasing it upon commit, a br-lock would only need to be taken at the end of the transaction, after which only validations follow and the commit will follow shortly afterwards. Thus transaction level validation reduces contention and helps scalability. (though in a rather small way for most applications).

CDM RuleFrame very much has client applications in mind. It has a good mechanism for making the message stack available, with complete implementations for PL/SQL, Forms and Java (ADF BC) clients. Toon -and therefore RuleGen – are more inwardly oriented it seems. RuleGen will raise an application error and leaves it up to the client to analyze the PL/SQL error stack. There is one run time service that gives some relieve, but it would be a good thing to do a little bit more in RuleGen for supporting the various clients. For example a code example of how to work with RuleGen from ADF BC would be very useful.

RuleGen requires an Oracle 10gR2 database, while CDM RuleFrame only needs Oracle 8i.

All code used in the implementation of your Data Logic is completely accessible with CDM RuleFrame – you can not only see but also edit all generated code as well as the runtime engine. RuleGen does not allow you to see or edit either the generated code or the run-time engine: it’s all PL/SQL, and it’s all wrapped.

CDM RuleFrame comes as part of the iDevelopment Accelerators Suite, of which Headstart (supporting generation of Oracle Forms from Oracle Designer) is the main component. The entire package sells at $2500 per developer. (or at least that is the latest information I have on the pricing). RuleGen has a very different pricing model: you pay per implemented rule: the first set of 40 rules go for 100 euro per rule (4000 euro). You can acquire additional sets of 20 rules for a decreasing price per rule. When you hit 400 rules, you can buy an Enterprise license with an unlimited quota of rules – for 30000 euro (currently some 45k USD). Note: these prices are not per application or project, but for your database.

RuleGen shows much promise: the core engine is there, the theoretical foundation is solid. The hardest challenges have been dealt with. Toon has great ideas for enhancements and extensions – for example Rule Patterns or Templates, more run-time services and monitoring and support for dynamic rules. 

Conclusions

Implementation of Data Logic – data oriented business rules or data constraints – is of the utmost importance for ensuring the integrity of an organization’s most important asset: its data. Implementing Data Logic is by no means trivial. It is not simple nor straightforward to achieve robust, efficient implementation of data logic. Data Logic should be implemented close to the data, inside the database. That is the only way to guarantee robustness. Besides: it typically gives the most efficient implementation, both to develop and to execute.

A supporting framework is valuable – to put it mildly, especially for the more complex business rules for which the declarative database constraints do not suffice. Such a framework can provide higher quality (robustness, real integrity) and better performance (of the rule validation) and productivity (during development and even more so during maintenance).

RuleGen is a modern, advanced and very well designed framework. It is the best solution for implementing Data Logic I have come across in many years. Even though it does not yet offer support in some key areas that should be well within its scope, it already has the key components for any BR framework: the core runtime engine, support for statement and transaction level rule enforcement (including support for Rollback to Savepoint!) , a solution for handling multi-session issues, (rudimentary) error message handling and (rudimentary) runtime monitoring. What is currently lacks in RuleGen from my list of desirable or even necessary functions can be added fairly easily – even by external parties. As long as Toon makes some haste with the addition of Transition Rules (dynamic rules).

I would like to hear a confirmation from Toon that any organization starting now with RuleGen is entitled to free updates of later releases – at least until some of the key functionality has been added. That would help the adoption of RuleGen starting right now.

From our (AMIS’s) point of view: we will seriously consider using RuleGen for our projects and advise our relations on how to use it. We will investigate in optimal Testing approaches for RuleGen Business Rule implementations – Patrick will hopefully soon be able to blog about that. We may work with Toon in creating how-to’s on ‘migration from RuleFrame’ and ‘integrating RuleGen into Java Web Applications’.

I am really enthousiastic, inspired almost, by Toon’s presentation. I have spent several years of my career working on data logic and business rules. This evening was a happy trip down memory lane as well as the beginning of a path leading into the future. Well done Toon. I am looking forward to working with you.

One Response

  1. Patrick Barel March 22, 2008