Oracle 11gR2 – alternative for CONNECT_BY_ISLEAF function for Recursive Subquery Factoring (dedicated to Anton)
On our blog, we have been discussing the new hierarchical query functionality in Oracle Database 11g Release 2, using Recursive Suquery Factoring. Instead of using CONNECT BY and its close associates such as START WITH, PRIOR, LEVEL and more exotic comrades like SYS_CONNECT_BY_PATH, CONNECT_BY_ROOT and NOCYCLE this release gave us a new, less proprietary and eventually probably more intuitive and functionally rich approach. We have also written how though we have no straightforward alternatives for LEVEL, SYS_CONNECT_BY_PATH and CONNECT_BY_ROOT – in the new recursive approach they are fantastically easy to emulate.
Until recently I have been quite happy with the new hierarchical querying and telling the world how I felt. Then an esteemed colleague – a far more experienced SQL programmer than I am – came up to me and remined me how the recursive sub query syntax at the present does not have a replacement for the CONNECT_BY_ISLEAF function – the SQL function that tells us whether a node produced in an hierarchical query has any children or is at the bottom of the chain – i.e. a leaf node. For leaf nodes (child-less), the function returns a value of one and for parent nodes the value is zero.
Anton (my colleague) was right and unfortunately I did not have a quick retort. However, after giving it some thought I believe I have found a way of emulating the CONNECT_BY_ISLEAF as well, using the new DEPTH FIRST ordering capabilities of the recursive subquery. I hope this will satisfy Anton as well.
First a quick example of using CONNECT_BY_ISLEAF to illustrate what I am trying to regain. This next query brings back the organizational hierarchy from table EMP, starting with employees without a manager and following the (hierarchical) chain of command downward by linking EMPNO (manager) to MGR (in subordinate):
SELECT empno empid , ename name , mgr mgrid , CONNECT_BY_ISLEAF leaf_node FROM emp CONNECT BY PRIOR empno = mgr start with mgr is null
The CONNECT_BY_ISLEAF function indicates which employees are at the bottom of the pyramid.
and the result
Now the recursive subquery does not use the CONNECT BY clause. Instead it has a two-part subquery in the WITH clause of the query; both parts are combined through a UNION ALL . The first part selects the root-node(s) for the tree. The second part recursively joins with the subquery itself, to iteratively find the children for the previously selected round of nodes (in the first iteration that round contains the root-nodes produced by the first query, in later rounds these (candidate) parent nodes are returned by the second query. For example:
WITH employees (empid, name, mgrid, lev) AS ( SELECT empno, ename, mgr , 1 FROM emp WHERE mgr is null UNION ALL SELECT empno, ename, mgr, lev+1 FROM employees r, emp e WHERE r.empid = e.mgr ) select e.empid , e.name , e.mgrid from employees e /
My reasoning for the alternative solution with the is based on the SEARCH DEPTH FIRST option we have for specifying a sequence number to the result of the recursive query. This way of determining the order of the nodes is based on first going down the tree (root, first child of root, first grandchild of root, up again – down again to second grandchild of root, up again, up again, second child of root, down again …… ) instead of first visiting the root nodes, then the first level of children etc. Something like
1 2 5 8 3 4 6 7 9 10
1 2 3 4 5 6 7 8 9 10
Now I believe we can make use of this specific way of searching to locate the leafs. Every leaf node does not have a child (duh!!) and therefore the next node in the 'depth first' sequence HAS TO BE AT THE SAME LEVEL OR HIGHER. By looking at the 'next node's hierarchical level when the nodes are ordered by the depth-first-sequence-number; and checking whether that level is either lower (no leaf) or the same/higher (IS LEAF!) than the node under scrutiny, we can find the leaf nodes.
The recursive query will return all nodes and will this depth-first-sequence number. A second query can then use the analytical LEAD function to find the sequence number assigned to the next node:
lead(lev) over (order by seq))
and compare it with the current node, to declare either is leaf (1) or is no leaf (0).
case when (lev - lead(lev) over (order by seq)) < 0 then 0 else 1 end is_leaf
The complete query now would be something like this:
WITH each_level (empid, name, mgrid, lev) AS ( SELECT empno, ename, mgr , 1 FROM emp WHERE mgr is null UNION ALL SELECT empno, ename, mgr, lev+1 FROM each_level r, emp e WHERE r.empid = e.mgr ) <strong>SEARCH DEPTH FIRST BY mgrid SET seq</strong> select e.* , <strong>case when (lev - lead(lev) over (order by seq)) < 0 then 0 else 1 end is_leaf </strong>from each_level e /
The result is as follows:
- Oracle RDBMS 11gR2 – new style hierarchical querying using Recursive Subquery Factoring
- Oracle RDBMS 11gR2 – Solving a Sudoku using Recursive Subquery Factoring
- Subquery Factoring in Oracle 11g
- Oracle RDBMS 11gR2 – goodbye Connect By or: the end of hierarchical querying as we know it
- Oracle Database 11gR2 – New analytical function NTH_VALUE
- ADF DVT: Using the Tree Map visualization component – to compare relative sizes and distributions
- Oracle SQL – Finding free adjacent seats in an airplane, using Subquery Factoring, Analytical Functions (LISTAGG and LEAD), Outer Join, PIVOT and good old INSTR
- Out of the box usage of ADF DVT Scheduling Gantt Chart to report Database Query Results using stacked bar charts per time period
- Oracle SQL – spotting combinations that occur and those that do not – demonstrating Analytical Functions, Outer Join and SubQuery Factoring
- Oracle SQL: Using subquery factoring in an INSERT statement
- OTN Yathra 2013 – Spreading the story of Oracle across India – (Half time)
- ADF DVT Speed Date: Present Metrics per Year, Quarter and Month using a zoom-enabled ADF DVT Resource Utilization Gantt and ADF BC
- Using TRUNC in SQL to get the first date in a period
- ADF DVT: Visualizing valid periods using Project and Scheduling Gantt Charts
- Advanced SQL to find valid periods – juggling with outer joins, running totals and analytical functions