Is this a bug in Analytical Functions? NULL is supposed to be UNEQUAL to NULL?

Lucas Jellema 13
0 0
Read Time:1 Minute, 37 Second

While presenting the 7Up Training – upgrading experienced Oracle developers from their Oracle 7 base of knowledge and experience ("still writing 7 backwards compatible code against a 10g database") to 10g, we ran into a funny little example with the Oracle Analytical SQL Functions – that also applies to standard GROUP BY operations by the way – that suggests that NULL == NULL in some circumstances. To be fair: it feels (intuitively) correct. but theoretically (what do you say Toon??) it is a little fishy I think.

....
 

The situation is very simple: I have updated my EMP table to have several employees without department. Then I queried to find out for each employee his name, department, salary and the percentage of the total salary sum within the department allocated to the employee. The query is straightforward:

select ename
, deptno
, sal
, ratio_to_report(sal)
over ( partition by deptno)
sal_percentage
from emp
order
by 2, 4

The data in EMP shows two employees without Department (i.e. their DEPTNO is NULL). The results of this query show the two employees without department – MILLER and KING – as having 20 respectively 80 percent of the total salary sum in their department (the partition of all employees with the same DEPTNO). However, their DEPTNO is NULL and since NULL is never equal to NULL, there are no other employees with the same DEPTNO. Or are there?

ENAME          DEPTNO        SAL SAL_PERCENTAGE
---------- ---------- ---------- --------------
CLARK 10 2695 1
SMITH 20 800 .071604386
ADAMS 20 1100 .09845603
SCOTT 20 3000 .268516447
FORD 20 3000 .268516447
JONES 20 3272.5 .292906691
JAMES 30 950 .09808983
MARTIN 30 1250 .129065565
WARD 30 1250 .129065565
TURNER 30 1500 .154878678
ALLEN 30 1600 .165203924
BLAKE 30 3135 .323696438
MILLER 1300 .200896307
KING 5171 .799103693

On further analysis – something I apparently never actively realized – the classic GROUP BY does the exact same thing:

 select deptno
, count(*)
from emp
group
by deptno

The results also suggest an equality between NULL…

DEPTNO COUNT(*)
-------- ----------
30 6
2 <<<<-------
20 5
10 1
 

About Post Author

Lucas Jellema

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director and Oracle Developer Champion. Solution architect and developer on diverse areas including SQL, JavaScript, Kubernetes & Docker, Machine Learning, Java, SOA and microservices, events in various shapes and forms and many other things. Author of the Oracle Press book Oracle SOA Suite 12c Handbook. Frequent presenter on user groups and community events and conferences such as JavaOne, Oracle Code, CodeOne, NLJUG JFall and Oracle OpenWorld.
Happy
Happy
0 %
Sad
Sad
0 %
Excited
Excited
0 %
Sleepy
Sleepy
0 %
Angry
Angry
0 %
Surprise
Surprise
0 %

Average Rating

5 Star
0%
4 Star
0%
3 Star
0%
2 Star
0%
1 Star
0%

13 thoughts on “Is this a bug in Analytical Functions? NULL is supposed to be UNEQUAL to NULL?

  1. I think, though, that you alway should see the result of those queries from the point of view from the end user. And what an end user wants to see is not

    DEPTNO COUNT(*)
    ===== =======
    30 6
    “null” 1
    “null” 1
    “null” 1
    “null” 1
    “null” 1
    “null” 1

    20 5
    10 1

    That’s just not convenient. Any query I would write for an end user that yields this result would not be accepted by that user. He would want me to rewrite the query so that all the ‘null’ or ‘unknown’ departments to be put together in one result, because separate records with count=1 have no meaning at all.
    So probably Oracle has made the choice to handle nulls like that, and my opinion is that they made the most logical (or call it: intuitive) choice 🙂

  2. Giving each NULL value its own group seems unwieldy, especially if you have an N-column GROUP BY, and then all-but-1, all-but-2, …, or all-but-(N-1) of those columns have NULL values.

    And what about:

    select null from dual union select null from dual;

    It only returns 1 row. Should GROUP BY be any different?

  3. Hmmmm, even wordpress has problems with (or NOT?)

    😉

    Second try…

    Who argues for outcome:

    DEPTNO COUNT(*)
    ===== =======
    30 6
    “null” 1
    “null” 1
    20 5
    10 1

    (hopefully it will now print correctly, otherwise “Lucas Help?!!!”)

  4. From “An introduction to Database Systems” (International Editon – Eighth Edition) by C.J.Date:

    Possible meanings / reasons why there is a absence of a value (page 577):

    – value unknown
    – value not applicable
    – value does not exist
    – value undefined
    – value not supplied
    – …

    (page 595): Aggregate Operators:

    The SQL aggregate operators (SUM,AVG,etc.) do not behave in accordance with the rules for scalar operators explained in Section 19.2, but instead simply ignore any nulls in their argument (except for count(*), where nulls are treated as if they were regular values). Also if the argument to such an operator happens to evaluate to an empty set, COUNT returns zero; the other operators all return NULL. (As noted in Chapter 8, this bahaviour is logically incorrect, but it is the way SQL is defined.)

    MG

  5. @Patrick: “…and the employees that are not in any department are summed”, but this is the problem using NULL Patrick, you are assuming that they “are not in any department”. A NULL coudl also imply “we KNOW the belong to a department but it is UNKNOWN to us in which…”

    @Jurgen: you don’t want to get into that one… It could be the difference between a salary check with a VALUE on it at the end of the month or a salary check without a VALUE… (you could think of more worse-case real life example of course…you did a 911 call, which is registered in a database, which triggers…).

  6. If think your

    select deptno
    , count(*)
    from emp
    group
    by deptno

    is a great example of Oracle desperatly wanting to give answer where it should not give one.

    A department with value NULL is not equal to a department with value NULL. Also a nice example why one should avoid NULL’s

  7. “The results also suggest an equality between NULL…”

    From the viewpoint of Oracle’s implementation of NULL’s, it does. Your “count(*)” spoils the soup. See the examples (nice post) of beter nullogy on http://www.databasedesign-resource.com/null-values-in-a-database.html. I think we are dealing with a “this is an error in the COUNT function: Any column would give the same COUNT result…” problem here.

    Nice links found regarding nullogy:

    *) Fabian Pascal: http://www.dbdebunk.com/page/page/1396241.htm
    *) Hugh Darwen: http://web.onetel.com/~hughdarwen/TheThirdManifesto/Missing-info-without-nulls.pdf
    *) http://www.dbdebunk.com/about.html

  8. It feels indeed intuitively correct. But what could theoretically be against having a group of undefined objects?

  9. I don’t think you can say that this function suggests that null = null. What you show here is the count of employees per department. The function sums op the employees that are in dept 30 (6), in dept 20 (5), in dept 10 (1), and the employees that are not in any department are summed: 2 employees apply to that condition.
    So in my opinion it’s not a bug. On the contrary, it is a very correct behaviour againts null values in these functions!

Comments are closed.

Next Post

Have MERGE remove records from Target that are not in the Source - Oracle 10g

The Oracle 10g Database release saw an extension of the MERGE statement with a DELETE clause. That by the way makes MERGE the only statement that can fire all three Statement Level triggers: Insert, Update, Delete. And it will make Oracle glad that back in 2003 when it discussed the […]
%d bloggers like this: