Analytic Functions and the Oracle Data Cartridge

Last week at the ODTUG conference, I did a presentation on Analytic Functions. Later on someone came up to me and asked me if there is also a Median-function in Oracle. At that time I was pretty sure there is no such a function in Oracle SQL.
He told me that he wrote a quite elaborate function to perform a Median calculation, but it wat quite costly. This functions worked quite well. There was a downside however, using this function with a large dataset took a long time to complete.
My first suggestion was to use the Oracle Data Cartridge to leverage the power of the Oracle database to perform such calculations. This might improve performance significantly.


The nice thing of using the Oracle Data Cartridge is that it's "Analytic"-friendly. This means that the function you create that implements the Oracle Data Cartridge can be used as an Analytic Function. In the next section I will show you an example of using the Oracle Data Cartridge as an Analytic Function.....

Analytic Function and the Oracle Data Cartridge

In order to use the Oracle Data Cartridge, you need to cohere to the specification of the Oracle Data Cartridge. The complete reference guide can be found here.
In short it boils down to:

  • Create an Object that implements the Oracle Data Cartridge Interface
  • Also Implement the Objects body which specifies the functionality
  • Create a function that uses this Object, with the AGGREGATE USING keywords

And that's it.

In the demo-file that comes in this blog, I create a function that "aggregates" strings. With this function you can execute queries just like:

SQL> select Amisvarchar2Sum(dname) sumstring
  2  from   dept d
  3  /
SUMSTRING
--------------------------------------------
ACCOUNTING, RESEARCH, SALES, OPERATIONS

And because these User Defined Aggregates are "Analytic Friendly" you can also use them like

SQL> select ename
  2       , amisvarchar2sum (ename) over (partition by deptno
  3                                      ) sumstring
  4    from emp
  5  /
ENAME          SUMSTRING
---------- --------------------------------------------------
CLARK      CLARK, KING, MILLER
KING       CLARK, KING, MILLER
MILLER     CLARK, KING, MILLER
SMITH      SMITH, ADAMS, FORD, SCOTT, JONES
ADAMS      SMITH, ADAMS, FORD, SCOTT, JONES
FORD       SMITH, ADAMS, FORD, SCOTT, JONES
SCOTT      SMITH, ADAMS, FORD, SCOTT, JONES
JONES      SMITH, ADAMS, FORD, SCOTT, JONES
ALLEN      ALLEN, BLAKE, MARTIN, JAMES, TURNER, WARD
BLAKE      ALLEN, BLAKE, MARTIN, JAMES, TURNER, WARD
MARTIN     ALLEN, BLAKE, MARTIN, JAMES, TURNER, WARD
JAMES      ALLEN, BLAKE, MARTIN, JAMES, TURNER, WARD
TURNER     ALLEN, BLAKE, MARTIN, JAMES, TURNER, WARD
WARD       ALLEN, BLAKE, MARTIN, JAMES, TURNER, WARD

14 rows selected.

Or do an "upside down anchored window" (I don't know why you would want to, but it's possible):

SQL> select ename
  2       , amisvarchar2sum(ename) over (partition by deptno
  3                                          order by ename
  4                                          range between current row
  5                                                    and unbounded following
  6                                      ) sumstring
  7    from emp
  8  /
ENAME          SUMSTRING
---------- --------------------------------------------------
CLARK      CLARK, KING, MILLER
KING       KING, MILLER
MILLER     MILLER
ADAMS      ADAMS, FORD, JONES, SCOTT, SMITH
FORD       FORD, JONES, SCOTT, SMITH
JONES      JONES, SCOTT, SMITH
SCOTT      SCOTT, SMITH
SMITH      SMITH
ALLEN      ALLEN, BLAKE, JAMES, MARTIN, TURNER, WARD
BLAKE      BLAKE, JAMES, MARTIN, TURNER, WARD
JAMES      JAMES, MARTIN, TURNER, WARD
MARTIN     MARTIN, TURNER, WARD
TURNER     TURNER, WARD
WARD       WARD

14 rows selected.

Google is your friend

Before I started out to create a Median function using the Oracle Data Cartridge, it's always a good idea to Google and see if something similar already exists.
And lo and behold the Median is implemented in Oracle 10 release 2.
But what to do if your database has a lower version? You can use the Analytic Function PERCENTILE_CONT with a value of 0.5 percentile. This will give you the same answer but is less descriptive than the new MEDIAN function.

 

Conclusion

In conclusion, before you start developing all kind of fancy new functionality. Check if it is not implemented in the Database already, it can save you lots of time and effort. When I found that the MEAN function was already implemented I told the man asking the question and that really made his day. It's always nice if you can help someone, gives you a nice fuzzy feeling Smiley


For a definition of the Median function I refer you to this Google listing.
The Demo script can be found here.

3 Comments

  1. bob de theije July 4, 2006
  2. site admin June 27, 2006
  3. bob de theije June 27, 2006