EJB 3.0 Persistence - Introducing EJBQL 3.0 americas cup win 2682133k1

EJB 3.0 Persistence – Introducing EJBQL 3.0

EJB QL or Enterprise Java Beans Query Language has been introduced
in the EJB 2.0 Specification, back in 2001. For some reason, it never
grew on me. I did not like EJB QL and I even ridiculed it. Having yet
another language to learn in the the J2EE stack, trying to imitate SQL,
in a very poor way. I was wary of it and avoided it.

However, at
the same time I was a happy user of the Top Query Language. And of
Hibernate HQL. I liked the elegant way in which queries could be
written in terms of objects. I especially liked the ‘navigation style’
of writing query expressions. And even though at times I missed the
liberty of writing direct SQL – being convinced that only I could write
superior SQL, efficient and functionally complete. The main thing of
writing queries in TopLink’s query language or HQL was that the result
of the query was a bunch of domain objects – POJOs or Entities. I did
not have to convert JDBC resultsets. And those domain objects lived in
the cache and could be used for further manipulation.

So how
come EJB QL was not good enough and those specialized OO Query
Languages were? Part of it was the fact that EJB QL had huge functional
limitations. I also tried to avoid EJBs in general – as they often
brought much more complexity than could be justified by what they had
to offer; I rarely needed to distribute my application for example.

In this article, I will introduce EJB QL as it is defined in EJB 3.0. And I will try to explain why now all of a sudden I am interested
in using it. In a subsequent article, I will go through code examples
of using the new EJBQL 3.0 language, on top of the GlassFish JEE 5
Reference Implementation.

History

EJB QL was introduced in September 2001, in EJB
2.0. Before that time, EJB had finder methods for retrieving objects
based on database data: find all objects and find by primary key. There
was a clear need for more refined queries, was CMP going to really fly.
However, there were many limitations to the EJB 2.0 QL, such as a very
poor selection of SQL functions, no support for joins or subqueries and
not even order by and aggregation.

Some of these shortcomings were rectified in EJB 2.1 in November
2003. In J2EE 1.4, the EJB 2.1 specification tackled two widely
anticipated enhancements to EJB QL. These included a new ORDER BY
clause enabling sorting capabilities that previously were either
hand-coded by developers or offered through proprietary extensions and
new aggregate functions such as AVG, MIN, MAX, and COUNT. Because EJB
QL naturally maps to SQL, these set-based manipulation additions were
seen as critical by many developers and were now required by the
specification for EJB QL 2.1. EJB 2.1 also allows string comparisons
using the greater-than and less-than symbols ( > , < ) in WHERE
statements.

However, even then there were large holes in EJB QL
compared to proper Query Languages such as SQL and even Toplink’s Query
Language and Hibernate’s HQL. And of course EJB QL was only used for
EJBs – and those were becoming less and less popular due to their –
perceived – complexity and relatively limited benefit.

In 2005
the EJB 3.0 specification evolved and in 2006 it will  be finalized as
part of the JEE 5 standard. It is rapidly becoming clear that EJB 3.0
will revolutionize EJB. And will mean the breakthrough for EJB usage by
a much wider audience than before. The main reasons for this:

  • broad support by the main O/R players in the industry, such as Sun, Hibernate/JBoss, Oracle and SolarMetric
  • functional enhancements that bring EJB 3.0 Persistence – including O/R Mapping and EJB QL – to whole new level
  • the ability to use EJB 3.0 Persistence outside the context of JEE
    containers; this not only means that you can much easier (unit-)test
    EJBs, it basically means that any Java application that needs to access
    a database can do so using EJB 3.0. That in turn means that your
    application is much more portable: from outside to inside an EJB
    Container as well as from one O/R vendor to another. Besides, EJB 3.0
    Persistence and EJB QL 3.0 add support for Native Queries: whenever the
    functionality in EJB QL is too limited, you have the option of stepping
    outside the boundaries without sacrifing most of its advantages –
    except for database portability
  • a very good and open source Reference Implementation (GlassFish)

Before diving into EJB QL in EJB 3.0, let’s take a brief look at the
discussions that have surrounded EJB QL from its incarnation in EJB 2.0:

Advantages and Benefits of EJB QL

What is EJB QL to start with? Unlike an SQL-based query, an EJB
QL query queries the abstract schema, not the database. When a finder
method is invoked, the EJB container executes the EJB QL query
associated with that method and returns references to one or more
entity beans that represent data satisfying the query criteria. An
example of a finder method might be one that identifies bank accounts
that have an account balance that is less than a certain threshold. The
EJB QL query for that finder method queries the abstract schema and
returns references to one or more entity beans that represent accounts
meeting the query criteria.
So you do not write SQL in terms of
tables and columns but a query in similar syntax in terms of objects,
properties and references or relations,supporting ‘navigation’ such as in: select e from employee e where e.department.location = ‘DALLAS’.

EJB QL provides a very good way to turn query results into
objects: the result of a EJB QL query is a Collection of Objects or
Entities. The alternative using JDBC and PreparedStatements based on
Strings with SQL statements results in ResultSet objects that the
application somehow has to turn into Entities, Domain Objects or POJOs.

EJB QL is portable across Application Servers and
across databases – any J2EE compliant application server had to
implement EJB QL. So an application using EJB QL deployed in any of the
compliant Application Servers would function okay – at least in theory.

Java Developer does not have to learn SQL to access database
and he/she most certainly does not have to learn the SQL dialect of
each RDBMS that the application may be deployed against. However of
course, they must learn EJB QL – even if they already happen to know
SQL…

An EJB Container parses
and validates EJB QL queries before entity beans are deployed, so there
is not really compile time checking of queries but definitely a
verification before run-time, which is better than Strings with JDBC
SQL statements that are only tested when the application is run. On the
other hand: such Strings typically can easily be tested in SQL frontend
tools such as Oracle’s SQL*Plus, so in fact they are easier to verify
than EJB QL queries that can only be verified inside the EJB container.
Or at least: up until EJB 2.1 that used to be the case.

Disadvantages/Limitations of EJB QL (at least before 3.0)

No native SQL so no way to leverage the SQL features of a particular
database such as calling PL/SQL functions from a Query, using advanced
features such as InLine Views, Scalar SubQueries and Analytical
Functions. Sometimes we could work around such limitations through the
use of Database Views, but typically the results could not be easily
mapped to the Entities.

Since no SQL knowledge is required,
queries may not be optimally efficient  – how much faith do you have in
the OO and Java specialists that developed your J2EE Server’s EJB
container when it comes to writing generic, yet optimally performing
SQL statements for your particular flavor of Database (version 8.1.7.4
of Oracle Standard Edition or version 3.23 of MySQL with InnoDB).

EJBQL up to and including 2.1 missed essential features like
HAVING for testing on aggregate results in the where clause. Also: the
number of functions that is supported in EJB QL 2.1 and before is
relatively small, compared to the richness of SQL in popular relational
databases such as MySQL, PostgreSQL and Oracle. In 2.0, EJB QL had just
five functions: CONCAT(), LENGTH(), LOCATE(), SUBSTRING(), ABS(), and SQRT(). In 2.1 the MOD() function was added, along with aggregate functions: COUNT(), MAX( ), MIN( ), AVG( ) and SUM( ).EJB
QL 2.0 even lacked features like the ORDER BY identifier, which becomes
very handy as the application becomes large and complex. Some of the
application servers may provide these features, but usage of the same
may limit portability across application servers. Date and time values
should be passed as millisecond value using Java primitive type, long.
EJB QL 2.0 did not support fixed decimal comparison in arithmetic
expressions. String and Boolean comparison is restricted to = and
<>. However, the built-in functions(CONCAT, SUBSTRING, etc.) can
be used to perform other operations on String. EJB QL 2.0 does not
support comments.

EJB 2.1 did (or I should say does since it is still the current specification) not support Group By, Having or SubQueries.

Since
EJBQL looks so much like SQL, you can hardly claim that developers do
not have to learn SQL: they have to learn EJBQL which is very close to
it. In fact, since most know already some SQL you can also call this a
disadvantage, as developers need to learn yet another syntax. On the
other hand, they do not have to learn the SQL dialect for each database
their application will run against. So there are pros and cons here.

Portability
across databases sounds great. However, in actual practice this is not
frequently required or made use of. Most organizations tend to
standardize on one or two Enterprise RDBMS’s, typically Oracle, DB2 or
SQL Server, and there is very little migration between those platforms.

And
of course compared to the out-of-container options of EJB 3.0, 2.1 and
before are somewhat limited as they can only be used for CMP in EJB
inside J2EE containers.

New and Improved in EJB QL in the 3.0 Specification

Native Queries
– whose results are still returned as objects! Allow direct SQL over
actual physical database schema. Very useful for some applications
where database portability is not important. Allow SQL query results to
be mapped into entity beans and/or instances of other Java classes.

Bulk DML
– EJBQL has a new update and delete statement, so it’s not just for
queries anymore. Update and delete let developers do things like, give
all of your favorite customers discount or something like that, without
having to load all of those customer objects into the application
server’s cache, so you can do the updates directly in the database and
that should be a big win for performance. Statements such as: delete
from Customer cust where cust.id = 12345 and update OrderLine ol set
ol.fulfilled = ‘Y’ where ol.order.id = 9876543

–Named parameter bindingThe
use of named parameters applies to EJB QL, and is not defined for
native queries. Only positional parameter binding may be portably used
for native queries.

–

Pagination control
the Query interface provides methods to set the first record to return
(for example starting at record 50) and the maximum number of records
to return – for example the next 25.

Projections – Multiple object types or partial objects returned from a single query.
Previously EJBQL was restricted to retrieving an entire object or one
column of data. The new EJBQL 3.0 specification allows developers to
create projections, which is a list of arbitrary columns from one or
more objects in the database. It is a powerful method for querying
reporting functions which can produce subsets of objects, etc. This
makes it unnecessary to retrieve an entire large object when only a
partial list of fields needs to be represented. Retrieving and
presenting data is now possible without the extra overhead of
retrieving the entire complex object or object model. This provides a
much simpler and useful method for querying complex object model data
which also increases application performance at the same time. It is
nice to be able to run queries that return a data
structure which spans multiple tables, without having to go to multiple
entities. At the moment people use fast reader patterns to bypass EJB
for this task.

A projection query is basically a query that
spans one or more mapped entity returning a select set of attributes. A
projection could optionally reference the relational schema directly
but most usages involve leveraging the mappings and defining the
attributes needed and selection criteria based on these mappings.

SELECT new CustomerDetails(c.id, c.status, o.count)
FROM Customer c JOIN c.orders o
WHERE o.count > 100

Join Support:
Explicit support for both inner and outer JOIN operations (select o
from Order o left join o.lineItems li where li.amount > 100). A
FETCH JOIN allows you to indicate that the related entities—e.g.,
Orders for a Customer—should also be fetched, though the SELECT clause
only selects Customers.

SELECT c.id, c.status
FROM Customer c JOIN c.orders o
WHERE o.count > 100

Sub Selects : select … where exists(select li from o.lineItems li). Example of correlated sub-query:

SELECT DISTINCT emp
FROM Employee emp
WHERE EXISTS (
SELECT spouseEmp
FROM Employee SpouseEmp
WHERE spouseEmp = emp.spouse)

Support for aggregation including group by and having Something like:  select o.id, sum(li.amount) from Order o join o.lineItems li group by o.id and

SELECT c.status, avg(c.filledOrderCount), count(c)
FROM Customer c
GROUP BY c.status
HAVING c.status IN (1, 2)

(Additional) EJB QL functions: concat(), trim(),  substring(), lower(), upper(), length(), locate(), abs(), sqrt(), size(), mod(),

Support for dynamic queries through EntityManager APIs. So in addition to static queries, EJB 3.0 supports dynamic queries,
which I think is a very useful thing and I am very excited that that
has been added.

Support for setting Hints on queries
These hints can be passed to the specific EJB 3.0 Implementation – for
example Hibernate or TopLink or perhaps even to the underlying SQL
engine of the RDBMS in use. If the Hint is not recognized, it is
ignored, so it should not impede portability.
PolyMorphic Queries

By
default, all queries are polymorphic. That is, the FROM clause of a
query designates not only instances of the specific entity class(es) to
which it explicitly refers, but subclasses as well. The
instances returned by a query include instances of the subclasses that satisfy the query conditions. For example, the query

select avg(e.salary) from Employee e where e.salary > 80000

returns the average salary of all employees, including subtypes of Employee, such as Manager and Exempt.

Resources

The New EJBâ„¢ 3.0 Persistence API. Linda DeMichiel, Sun Microsystems, Gavin King, JBoss, Mike Keith, Oracle, Patrick Linskey, SolarMetric – JavaOne 2005

How-To Queries using EJB 3.0 (Date: 2/28/05) by Andrei Ilitchev – Oracle Technology Network

Enterprise JavaBeans 3.0 by Kenneth Saks, NYC JavaSIG February 23, 2005

Hibernate HQL Reference – also doing a fair bit on EJB QL

EJB 3.0 Expert Group, JSR 220: Enterprise JavaBeansTM,Version 3.0 Java Persistence API – Public Draft

2 Comments

  1. Hoang Anh Nguyen December 11, 2006
  2. Lucas Jellema January 10, 2006