How to unit test a SQL query?

Lucas Jellema March 5, 2021 Database 7 Comments

A simple challenge: I am working on a SQL query. Not a trivial query. I am trying various approaches. I believe I may have found a solution that works. That produces the correct solution. How can I be sure? How can I try to further refine the query while maintaining its integrity? I am still in a development phase where I make quick changes and want to get quick feedback. The SQL is not yet embedded in an application – that may happen later. For now, I am just laboring away on the query.

Given a test data set that I have set up in my database tables, I know what the query should produce. So I can test the query by comparing that expected result with the actual outcome of my SQL statement. A quick – tool-free – way of doing that could be:

create a view from my SQL query, for example called my_query
create a test-query that compares the outcome from this view with the inline defined query result

That second query could look like this (note: based on VinceM’s comment, I made an edit here on 7th March):

with result_to_verify as
( select *
   from   my_query
)
, expected_result as
( select ‘X’, ‘Y’, 42 from dual
   union all
   select ‘A’, ‘B’, 3.141592 from dual
)
( select *
   from result_to_verify
   minus
   select *
   from expected_result
) — superfluous results
union all
(
   select *
   from expected_result
   minus
   select *
   from result_to_verify
   ) — missing results

This query should return zero rows.

Applied to SQL challenge in this article SQL–Only Counting Records Sufficiently Spaced apart using Analytics with Windowing Clause and Anti Join – where I created a solution that turned out to be incorrect for a lack of testing! – this works out as follows:

Create View for the Query that I believe is the correct solution:

Test the result of executing the query vs the expected result. If the result of this query is NO DATA FOUND then this test has passed.

A second test case

All test cases can be stored in a simple SQL Developer worksheet, a script file or even a PL/SQL function that returns an OK or NOK.

Try out these statements in Live SQL.

Note: if I had created such a simple testing harness before starting out with creating my query, I would not have sent my incorrect solution to my colleague and to the world. Of course the quality of the test set – test data and test cases – is crucial for the quality of my tests.

Using JSON for describing More Complex Result Sets

As I was reaching out to esteemed peers such as Tim Scott and Connor McDonald to learn their thoughts about simple SQL query test harnasses, it became clear that such a simple straightforward way of testing a SQL query under development does not exist out there, We also concluded that leveraging the database itself for comparing query result with the expected results is the obvious way to go. There are several ways for defining the expected results – for example as records in a special table, hardcoded in PL/SQL code or hardcoded in a SQL statement. Balancing light-weightedness vs reusability of test results, you can arrive at slightly different approaches. Comparing the results with the expected results can be done as shown above in a purely relational way – with the MINUS SET operator – or could be done based on the JSON representation of the results.

Using JSON_TABLE we can turn the contents of a JSON document into a relation result set to be used in the context of a SQL query. The next illustration shows how we can define the expected result as a JSON document and use the contents of that document in the comparison with the actual results.


with expected_result as (   
  SELECT *   
  FROM JSON_TABLE('   
  [ {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "00:00"}   
  , {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "02:39"}   
  , {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "04:59"}   
  ]' , '$[*]'   
  COLUMNS (PERSONID VARCHAR2(10) PATH '$.PERSONID',   
           BLOCK_START_LOGIN_TIME VARCHAR2 PATH '$.BLOCK_START_LOGIN_TIME'   
           )   
  )   
)   
, result_to_verify as (   
  select PERSONID  
  ,      BLOCK_START_LOGIN_TIME
  from   my_query   
  where  personid = 1
)   
( select *   
  from result_to_verify   
  minus   
  select *   
  from expected_result   
) -- superfluous results
union all   
( 
  select *   
  from expected_result   
  minus   
  select *   
  from result_to_verify   
  ) -- missing results

Now the expected_result can be expressed as a JSON document – which is somewhat more convenient than expressing it as select value, value, value from dual UNION ALL select … etc. Note the ‘$[*]’ path expression that indicates that we want to loop over all array elements in the source JSON document – turning each array element into a row in the result set. From each array element we take the properties PERSONID and BLOCK_START_LOGIN_TIME, as designated by $.PERSONID and $.BLOCK_START_LOGIN_TIME.

We can even use a somewhat more compact way to express the JSON document:

Or the terse and a little cryptic:

Instead of turning the expected result from JSON into relational data and using MINUS to compare the expected and actual result set, we can do the reverse: turn the actual result into JSON and do the comparison for the JSON documents.

The JSON_ARRAYAGG operator can be used to turn a SQL result set into a corresponding JSON document, as is shown here:

Using this operator, we can represent the result_to_verify as a JSON document. Now the expected_result is again expressed as a JSON document and the result from the query against the view under test scrutiny is also turned to JSON as is shown in the figure. This result (result_to_verify) is the query against the view to test; the results from this query are turned into a JSON document using the JSON_ARRAYAGG operator. Note the use of ORDER BY – the order of the records in the JSON document is meaningful in the eventual comparison. The expected results are in this case defined in a text-block inside an inline function. The text block can easily be manipulated. It cannot easily be reused though.

The comparison between the actual result and the expected result are done using the JSON_EQUAL operator (introduced in Oracle Database 18c). This comparison does not look at the order of properties or meaningless whitespace. The order of the elements in the array is meaningful however. JSON_EQUAL returns true for JSON documents that are both valid and have the same content. The query as shown here will return only a result when the expected and actual results are not equal according to JSON_EQUAL

In green are the elements that are specific to this test case. The none-highlighted elements are generic – they are the same for all test cases. The fact that so much is generic – captured in a template – is attractive, as is the fact that no additional database objects need to be created for running this test case.

The CODE used for this last example:

with function expected_result return varchar2 
is
begin
return ' [ {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "00:00"}
           , {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "02:39"}
           , {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "04:59"}
         ]';
end;
result_to_verify as
( SELECT JSON_ARRAYAGG(JSON_OBJECT(*)) result_json
  FROM   my_query
  WHERE  personid =1
  order
  by     BLOCK_START_LOGIN_TIME
)
select 'FAILURE'
from   DUAL
where  NOT JSON_EQUAL( (select result_json from result_to_verify), expected_result )

Here is the result when the expected and actual result are not the same. Note that this does not indicate where the deviation is in the actual result from the desired result. For this we would have to further analyze the two JSON documents.

Thanks to Tim Scott and Connor McDonald for their willingness to brainstorm on this and their insights and code contributions.

Resources

Oracle Base (Tim Scott’s) article on JSON_EQUAL – https://oracle-base.com/articles/18c/json_equal-condition-18c

Database docs on JSON_EQUAL https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/SQL-JSON-Conditions.html#GUID-35C7012D-FCDB-4106-88C1-CABA78326896

ASK Tom – Inline Function – https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:9534452300346807610

Database 21c Docs – JSON Developer Guide – https://docs.oracle.com/en/database/oracle/oracle-database/21/index.html

Database Docs on JSON_ARRAYAGG – https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/JSON_ARRAYAGG.html#GUID-6D56077D-78DE-4CC0-9498-225DDC42E054 and on JSON_TABLE https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/JSON_TABLE.html#GUID-3C8E63B5-0B94-4E86-A2D3-3D4831B67C62

Tags:json, JSON_EQUAL, json_table, Oracle, query, sql, test, unit test

About The Author

Lucas Jellema

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director and Oracle Developer Champion. Solution architect and developer on diverse areas including SQL, JavaScript, Kubernetes & Docker, Machine Learning, Java, SOA and microservices, events in various shapes and forms and many other things. Author of the Oracle Press book Oracle SOA Suite 12c Handbook. Frequent presenter on user groups and community events and conferences such as JavaOne, Oracle Code, CodeOne, NLJUG JFall and Oracle OpenWorld.

7 Comments

Micke Vrolijk May 24, 2021
utPLSQL also has good cursor and comparison logic in place. The bonus of utPLSQL is that it is easy to set up and execute multiple cases independently. For non-trivial complex queries we create a shadow table with expected results and compare the two. In my experience, this is the most maintainable way, as other alternatives to express the expected results become quickly unreadable.
Reply
- Lucas Jellema May 25, 2021
  Hi Micke,
  Thanks for your comment and sharing this information. I should probably take another look at at utPLSQL. My main concern is the setting up of database objects just to verify the functionality of a query – especially in case that happens in a database where I do not have privileges to create such objects.
  kind regards,
  Lucas
  Reply
G.C. March 8, 2021
I have recently wrote a library that is able to test the correctness of our sql logic. The logic is stored in db views and the the library mocks such views, substituting the input objects (views/tables in the FROM clause) with the desired test inputs and checking that the view produces the correct output.
It runs as normal JUnit tests, and the input and output values are stored in the project, allowing for proper versioning and pull requests to be created. Test are also executed when the views are modified (build pipeline currently in bitbucket, db object management done with liquibase).
I am trying to open source this work, if there’s enough interest and if the company approves. Please let me know if you would be interested in seeing such work when available.
Reply
- Lucas Jellema March 8, 2021
  Hi Giovanni,
  This certainly sounds interesting. Perhaps you could share a preview under NDA and I can give you my two cents’?
  kind regards,
  Lucas
  Reply
VinceM March 6, 2021
This doesn’t necessarily produce the correct results, e.g. the following query gives a NO_DATA_FOUND result but the queries are not equivalent:
with result_to_verify as (
select ‘X’, ‘Y’, 42 from dual
union all
select ‘A’, ‘B’, 3.141592 from dual
)
, expected_result as (
select ‘X’, ‘Y’, 42 from dual
union all
select ‘A’, ‘B’, 3.141592 from dual
union all
select ‘C’, ‘C’, 100 from dual
)
select *
from result_to_verify
minus
select *
from expected_result
Reply
- Lucas Jellema March 6, 2021
  Hi Vincent,
  Good point. I probably should have extended the query to
  with result_to_verify as ( select ‘X’, ‘Y’, 42 from dual union all select ‘A’, ‘B’, 3.141592 from dual ) , expected_result as ( select ‘X’, ‘Y’, 42 from dual union all select ‘A’, ‘B’, 3.141592 from dual union all select ‘C’, ‘C’, 100 from dual ) (select * from result_to_verify minus select * from expected_result ) union all (select * from expected_result minus select * from result_to_verify)
  I hope the readers get the drift of what I am trying to do here. Create a set of simple to execute queries – that do not require additional tools to be set up – that are used to verify the results from the SQL query I am developing.
  thanks,
  Lucas
  Reply
Tako de Jong March 5, 2021
This is a great mathematically proven and very useful mechanism, chapeau!
Reply

How to unit test a SQL query?

Using JSON for describing More Complex Result Sets

Resources

Like this:

About The Author

Lucas Jellema

7 Comments

Leave a ReplyCancel reply

Using JSON for describing More Complex Result Sets

Resources

Share this:

Like this:

Related Posts

About The Author

Lucas Jellema

7 Comments

Leave a ReplyCancel reply