Using an aggregation function to query a JSON-string straight from SQL

2

Last week I read this blogpost by Scott Wesley. In this post he describes that he uses a custom aggregate function to create large JSON-strings.
And for that he used a solution as described in this post by Carsten Czarski. That post of Scott reminded me of a post by my collegue Lucas Jellema, in which he uses the “normal” listagg aggregation function. When Lucas wrote his post I thought that I could beat the 4000 char limit of his aproach with a custom aggregate function.

I started out with “Tom Kytes stragg_type”, see here, just changed the type of the stragg_type attribute string to clob and the return-type of the functions to clob.

That worked, no problem to aggregate strings of size 4000, no problem for size 10000, but for larger strings it became slower and slower.
Too slow, 15000 bytes took 15 seconds.

So I changed the type back to varchar2, but with a size of varchar2(32767).
The worked, and fast. But only for strings shorter than 29989 bytes. For larger strings I would get a
ORA-22813: operand value exceeds system limits

To solve that I added a clob attribute, just as Carsten Czarski does in his listagg.
Used the varchar2 string for speed, and as soon as it result became to large, the clob for size.
And that worked too. But as soon as the aggregated string size exceeded 58894 bytes the ORA-22813 popped up again.
And as soon as the odciaggregatemerge function used the clob another error: ORA-22922: nonexistent LOB value
So I gave up, 4000 bytes is a nice limit for JSON, if you want something bigger you have to use PL/SQL. So I thought.

But after reading the post of Scott I compared my code with the code of Carsten Czarski to see how he solved my problems.
And it turned out that the first one was easy to solve, just limit the string to 4000 again.
And Carsten’s odciaggregatemerge function will raise a ORA-22922 too. I expect that it is an Oracle bug 🙂
But, because the odciaggregatemerge function is only executed if the optimizer decides that it will execute the aggregating query parallel, you aggregate very large strings without ever seeing that error.

So, now it’s time to introduce my JSON aggregator. It’s a custom aggregate function, which aggregates a query into a JSON-array. The elements of this array are JSON-objects.

create or replace type agg_json as object
( t_varchar2 varchar2(32767)
, t_clob clob
, static function odciaggregateinitialize( sctx in out agg_json )
  return number
, member function odciaggregateiterate
    ( self in out agg_json
    , a_val dbmsoutput_linesarray
    )
  return number
, member function odciaggregateterminate
    ( self in out agg_json
    , returnvalue out clob
    , flags in number
    )
  return number
, member function odciaggregatemerge
    ( self in out agg_json
    , ctx2 in out agg_json
    )
  return number
, static function json_obj( p_obj dbmsoutput_linesarray )
  return varchar2
, static function escape_json( p_val varchar2 )
  return varchar2
)

Just a type with two attributes, the standard functions for implementing a custom aggregation function, and two supporting static functions.
But notice the a_val parameter of odciaggregateiterate. dbmsoutput_linesarray is a varray of varchar2(32767).
Every name-value pair in the JSON-Object is formed by 3 entries in that varray.
The first entry is the name of the name-value pair.
The second entry is the value of the name-value pair.
And the third is a indicator for the value, is it a string or not.
The fourth entry is the name of the second name-value pair.
The fifth entry is the value of the second name-value pair.

After creating the aggregation function you can create JSON

create or replace function json_agg( agg dbmsoutput_linesarray )
return clob
parallel_enable aggregate using agg_json;

For example, this query

select json_agg( dbmsoutput_linesarray( 'id', level, 'n'
                                      , 'name', level, '' 
                                      , 'test', 'test' || level, ''
                                      )
               ) 
from dual
connect by level <= 3

produces this JSON

 [{"id":1,"name":"1","test":"test1"}
,{"id":2,"name":"2","test":"test2"}
,{"id":3,"name":"3","test":"test3"}]

And to get the JSON from Lucas example nest two calls to this new aggregation function

select agg_json.json_obj
         ( dbmsoutput_linesarray( 'company'
                                , json_agg( dbmsoutput_linesarray( 'name', d.dname, ''
                                                                 , 'identifier', d.deptno, '' 
                                                                 , 'location', d.loc, '' 
                                                                 , 'employees', json_agg( dbmsoutput_linesarray( 'name', e.ename, ''
                                                                                                               , 'job', e.job, ''
                                                                                                               , 'salary', e.sal, 'n'
                                                                                                               , 'manager', nvl( ( select agg_json.json_obj( dbmsoutput_linesarray( 'name', m.ename, ''
                                                                                                                                                                                  , 'salary', m.sal, 'n'
                                                                                                                                                                                  , 'hiredate', to_char( m.hiredate, 'DD-MM-YYYY' ), ''
                                                                                                                                                                                  )
                                                                                                                                                           )
                                                                                                                                   from emp m
                                                                                                                                   where m.empno = e.mgr
                                                                                                                                 ), '{}' ), 'o'
                                                                                                               , 'hiredate', to_char( e.hiredate, 'DD-MM-YYYY' ), '' 
                                                                                                               ) 
                                                                                        ), 'a'
                                          )
                                          )
                                , 'a'
                                )
         )
from dept d
   , emp e
where d.deptno = e.deptno
group by d.dname, d.deptno, d.loc

Here is the code.

About Author

Oracle Consultant at AMIS

2 Comments

  1. Hi,
    You have written an excellent routine. I tried to use it but I am getting an error.
    ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual: 1726726, maximum: 4000)

    I am using Oracle12c db.I want to call this procedure to receive the full data as JSON

    My code:
    create or replace procedure company
    (
    myOracleSearch IN varchar2 DEFAULT NULL,
    nd IN NUMBER DEFAULT NULL,
    rows IN NUMBER DEFAULT NULL,
    page IN NUMBER DEFAULT NULL,
    sidx IN varchar2 DEFAULT NULL,
    sord IN VARCHAR2 DEFAULT NULL
    )

    as
    l_data clob;
    cerror varchar2(1000);

    begin

    select agg_json.json_obj
    ( dbmsoutput_linesarray( ‘company’
    , json_agg(
    dbmsoutput_linesarray
    (
    ’empno’, a.empno, ”
    , ‘ename’, a.ename, ”
    , ‘job’, a.job, ”
    , ‘mgr’, a.mgr, ”
    , ‘hiredate’, a.hiredate, ”
    , ‘sal’, a.sal, ”
    , ‘comm’, a.comm, ”
    , ‘deptno’, a.deptno, ”
    )
    ) , ‘a’
    )
    )
    into l_data
    from emp1 a; /* around 10000 rows in this */

    htp.p(l_data);

    exception
    when others then
    DBMS_OUTPUT.PUT_LINE( SQLERRM);
    htp.p( sqlerrm );
    end company;

    • Anton Scheffer on

      That’s because you wrap the aggregation function in a agg_json.json_obj function, which is returning a varchar2. Restricted to length 32767 in PLSQL, and just 4000 in SQL (as you did).
      It’s easier to use something like
      select ‘{“company”:’ || json_agg(
      dbmsoutput_linesarray
      (
      ’empno’, a.empno, ”
      , ‘ename’, a.ename, ”
      , ‘job’, a.job, ”
      , ‘mgr’, a.mgr, ”
      , ‘hiredate’, a.hiredate, ”
      , ‘sal’, a.sal, ”
      , ‘comm’, a.comm, ”
      , ‘deptno’, a.deptno, ”
      ) || ‘}’
      into l_data
      from emp1 a; /* around 10000 rows in this */

      By the way, I don’t think that htp.p accepts a clob parameter 🙂