Oracle RDBMS 11gR2 – goodbye Connect By or: the end of hierarchical querying as we know it
Many years ago, Oracle basically set the standard in SQL. Whatever was Oracle SQL could be seen as the standard. This has never been absolutely true – ANSI SQL was different from Oracle SQL. In some respects the differences can be traced back to lack of functional richness in the standard. However, in certain areas, Oracle has walked its own path with certain functions and features with the rest of the RDBMS pack following another route. It seems to be as if starting with Oracle RDBMS 9iR2, Oracle has made several important steps towards embracing the ANSI SQL standard syntax, usually while maintaining its own specific flavor of those same functions.
Some examples of Oracle specific syntax that are also available through their ANSI SQL counterpart – that in most instances has even more functionality – and that is almost always the preferred approach going forward:
- outer join syntax: in addition to Oracle’s (+) notation, 9iR2 introduced the left outer join, right outer join and full outer join (like (+) on both ends of the join condition, something that Oracle does not support)
- the Decode function that with 9iR2 can (and should) be replace by the CASE expression
- the to_char function that in many instances can be replace by extract {year|month|day|HOUR | MINUTE | SECOND} from date-value
- the ltrim and rtrim functions that can be replaced by the trim function with leading and trailing settings
The 11gR2 release has another example of Oracle specific SQL that is now complemented with the ANSI SQL standard approach:the CONNECT BY syntax for hierarchical querying that can now be replaced by the ANSI SQL standard Recursive Subquery Factoring clause, which is an extension to the WITH clause introduced as long ago as Oracle7 and still relatively underused by Oracle SQL developers.
The table I will use for demonstrating this new syntax is EMP. The records in EMP refer via their MGR column to other records in EMP, allowing us to create an organization hierarchy starting with KING – an employee without manager – and traversing the org-chart all the way down.
The classic way of performing the hierarchical query looks like this:
with employees (empno, name, level) as ( select empno, ename, job, mgr, level from emp start with mgr is null connect by prior empno = mgr ) select * from employees
This syntax has been available – in this very Oracle proprietary way – from the very first releases of the Oracle RDBMS (release 2 or 3). Later releases add some niceties to hierarchical queries, most notably the Oracle RDBMS 9iR2 and 10g releases.
- order siblings by
- sys_connect_path
- connect_by_root, connect_by_is_leaf,
- nocycle, connect_by_is_cycle
However, ANSI SQL defines an alternative approach to Hierarchical Queries. It is called Recursive Subquery Factoring (or Recursive With Clause). Apparently it is supported (to some extend) in DB2, SQL Server and PostgreSQL. It extends the WITH clause to enable formulation of recursive queries: “The query in the WITH clause (subquery) can refer to itself”
This Recursive WITH clause is ANSI. SQL The introduction of this syntax in 11gR2 makes Oracle ANSI-SQL-compatible for recursive queries.
So what is the equivalent of the standard hierarchical query shown above in the new syntax:
with employees (empno, name, job, mgr, hierlevel) as
( select empno, ename, job, mgr, 1
from emp
where mgr is null
union all
select e.empno, e.ename, e.job
, e.mgr, m.hierlevel + 1
from emp e
join
employees m
on (m.empno = e.mgr)
)
select *
from employees
It looks a little complex at first. And it is more lines of code – so how can it be better? Well, give it a moment or two. When we look at more complex hierarchical queries and the use of the specials like sys_connect_path and the connect_by_root some advantages of this syntax will start to shine through.
For the moment, let’s disect this query:
The with clause consists of two queries that are combined the UNION ALL. Both queries return the same list of columns – as is required by UNION ALL. The first query is used to retrieve the root-node(s), the starting point(s) of the tree. In this case, this query only returns KING – the employee with MGR is null. The second query will add all other tree nodes to the result set returned by the subquery. This is where the magic lives: the second query refers to the subquery as a whole. The query that returns the other-than-the-root tree nodes joins with the subquery itself. What this is saying is something to the effect of:
- the second query in the recursive subuquery is recursively processed, in subsequent iterations
- in every iteration, we enter a deeper level in the tree
- each iteration uses the the collection of all nodes retrieved by the first subquery (the root nodes) and all previous iterations. that means for example that the first iteration of select e.empno, e.ename from emp e cross join employees m on (m.empno = e.mgr) has the root-node KING to join with from the employees set. The second iteration will have the other managers serving under KING to join with.
With this understanding in our heads, could we then come up with the syntax we need for retrieving what we can do with the Oracle goodie SYS_CONNECT_PATH, the path from every tree node all the way up to its root with node values added to a string concattenated together for all nodes in the path?
It is actually surprisingly simple. To return the alternative to SYS_CONNECT_PATH(ename, ‘/’) all we need is the following:
with employees (empno, name, mgr, hierlevel, path) as
( select empno, ename, job, mgr, 1, ename
from emp
where mgr is null
union all
select e.empno, e.ename, e.job
, e.mgr, m.hierlevel + 1
, m.path||'/'||e.ename
from emp e
join
employees m
on (m.empno = e.mgr)
)
select *
from employees
This will select ‘KING’ as path for the first node and then for each node deeper in the tree take the path value from the parent node and add ‘/’||ename to it.
In later articles we will look at the more advanced aspects of the new recursive subquery factoring syntax and functionality.
Resources
Oracle 11gR2 SQL Language reference – Recursive With Clause



2/9/2009 - 12:09 am
Nice post. Thanks a bunch for this recap.
9/9/2009 - 2:46 pm
Horrible, almost uniteligible! Hopefully, Oracle Corp. will not de-support CONNECT BY anytime soon.
9/9/2009 - 3:58 pm
The query fails using “cross join”, I need to replace it with “inner join”:
SQL> with employees (empno, name, job, mgr, hierlevel) as
2 ( select empno, ename, job, mgr, 1
3 from emp
4 where mgr is null
5 union all
6 select e.empno, e.ename, e.job
7 , e.mgr, m.hierlevel + 1
8 from emp e
9 cross join
10 employees m
11 on (m.empno = e.mgr)
12 )
13 select *
14 from employees
15 ;
on (m.empno = e.mgr)
*
ERROR at line 11:
ORA-00907: missing right parenthesis
SQL> ed
Escrito file tmp\8592.buf
1 with employees (empno, name, job, mgr, hierlevel) as
2 ( select empno, ename, job, mgr, 1
3 from emp
4 where mgr is null
5 union all
6 select e.empno, e.ename, e.job
7 , e.mgr, m.hierlevel + 1
8 from emp e
9 inner join
10 employees m
11 on (m.empno = e.mgr)
12 )
13 select *
14* from employees
15 /
EMPNO NAME JOB MGR HIERLEVEL
———- ———- ——— ———- ———-
7839 KING PRESIDENT 1
7566 JONES MANAGER 7839 2
7698 BLAKE MANAGER 7839 2
7782 CLARK MANAGER 7839 2
7499 ALLEN SALESMAN 7698 3
7521 WARD SALESMAN 7698 3
7654 MARTIN SALESMAN 7698 3
7788 SCOTT ANALYST 7566 3
7844 TURNER SALESMAN 7698 3
7900 JAMES CLERK 7698 3
7902 FORD ANALYST 7566 3
7934 MILLER CLERK 7782 3
7369 SMITH CLERK 7902 4
7876 ADAMS CLERK 7788 4
14 filas seleccionadas.
26/9/2009 - 7:18 am
Hi Mladen: the syntax will grow on you, trust me! I had that same initial difficulty with this syntax. And I remember feeling similar about Analytical Functions. And boy, have I gotten used to them later on! I trust that this recursive subquery clause will in the end prove more intuitive, especially when it comes to extras like sys_connect_path and connect_by_root. However, the good old connect by syntax will be supported for a very long time (just like the (+) legacy outer join notation or the decode statement.
Lucas
26/9/2009 - 7:19 am
Hi Joaquin, You are quite correct: the cross join syntax is incorrect. Cross Join are cross joins and do not need/work with the ON condition. I have updated the article likewise. Thanks for pointing it out to me.
Lucas
27/10/2009 - 2:29 am
Is this query any faster than the Connect By? Will it work in 10g?
27/10/2009 - 7:28 am
I assume it is equally fast – though it may be a little bit faster (I have not tried it out) – especially with functions like SYS_CONNECT_BY_PATH and the likes. More importantly: it is functionally richer. And no, it will not work on 10g – it is an 11gR2 feature that is not available in previous releases of the database.
Lucas
21/1/2010 - 5:39 pm
Very good article! I was wondering if I could run past you a problem that requires a recursive SQL such as this. Would this be ok?