How to unit test a SQL query?

Lucas Jellema 5

A simple challenge: I am working on a SQL query. Not a trivial query. I am trying various approaches. I believe I may have found a solution that works. That produces the correct solution. How can I be sure? How can I try to further refine the query while maintaining its integrity? I am still in a development phase where I make quick changes and want to get quick feedback. The SQL is not yet embedded in an application – that may happen later. For now, I am just laboring away on the query.

Given a test data set that I have set up in my database tables, I know what the query should produce. So I can test the query by comparing that expected result with the actual outcome of my SQL statement. A quick – tool-free – way of doing that could be:

  • create a view from my SQL query, for example called my_query
  • create a test-query that compares the outcome from this view with the inline defined query result

That second query could look like this (note: based on VinceM’s comment, I made an edit here on 7th March):

with result_to_verify as
( select *
   from   my_query
)
, expected_result as
( select ‘X’, ‘Y’, 42 from dual
   union all
   select ‘A’, ‘B’, 3.141592 from dual
)

( select *  
   from result_to_verify  
   minus  
   select *  
   from expected_result  
) — superfluous results
union all  
(
   select *  
   from expected_result  
   minus  
   select *  
   from result_to_verify  
   ) — missing results  

This query should return zero rows.

Applied to SQL challenge in this article SQL–Only Counting Records Sufficiently Spaced apart using Analytics with Windowing Clause and Anti Join – where I created a solution that turned out to be incorrect for a lack of testing! – this works out as follows:

Create View for the Query that I believe is the correct solution:

SNAGHTML24247bb

Test the result of executing the query vs the expected result. If the result of this query is NO DATA FOUND then this test has passed.SNAGHTML23fbdb5

A second test case

image

All test cases can be stored in a simple SQL Developer worksheet, a script file or even a PL/SQL function that returns an OK or NOK.

Try out these statements in Live SQL.

Note: if I had created such a simple testing harness before starting out with creating my query, I would not have sent my incorrect solution to my colleague and to the world. Of course the quality of the test set – test data and test cases – is crucial for the quality of my tests.

Using JSON for describing More Complex Result Sets

As I was reaching out to esteemed peers such as Tim Scott and Connor McDonald to learn their thoughts about simple SQL query test harnasses, it became clear that such a simple straightforward way of testing a SQL query under development does not exist out there, We also concluded that leveraging the database itself for comparing query result with the expected results is the obvious way to go. There are several ways for defining the expected results – for example as records in a special table, hardcoded in PL/SQL code or hardcoded in a SQL statement. Balancing light-weightedness vs reusability of test results, you can arrive at slightly different approaches. Comparing the results with the expected results can be done as shown above in a purely relational way – with the MINUS SET operator – or could be done based on the JSON representation of the results.

Using JSON_TABLE we can turn the contents of a JSON document into a relation result set to be used in the context of a SQL query. The next illustration shows how we can define the expected result as a JSON document and use the contents of that document in the comparison with the actual results.

image


with expected_result as (   
  SELECT *   
  FROM JSON_TABLE('   
  [ {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "00:00"}   
  , {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "02:39"}   
  , {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "04:59"}   
  ]' , '$[*]'   
  COLUMNS (PERSONID VARCHAR2(10) PATH '$.PERSONID',   
           BLOCK_START_LOGIN_TIME VARCHAR2 PATH '$.BLOCK_START_LOGIN_TIME'   
           )   
  )   
)   
, result_to_verify as (   
  select PERSONID  
  ,      BLOCK_START_LOGIN_TIME
  from   my_query   
  where  personid = 1
)   
( select *   
  from result_to_verify   
  minus   
  select *   
  from expected_result   
) -- superfluous results
union all   
( 
  select *   
  from expected_result   
  minus   
  select *   
  from result_to_verify   
  ) -- missing results   

Now the expected_result can be expressed as a JSON document – which is somewhat more convenient than expressing it as select value, value, value from dual UNION ALL select … etc. Note the ‘$[*]’ path expression that indicates that we want to loop over all array elements in the source JSON document – turning each array element into a row in the result set. From each array element we take the properties PERSONID and BLOCK_START_LOGIN_TIME, as designated by $.PERSONID and $.BLOCK_START_LOGIN_TIME.

We can even use a somewhat more compact way to express the JSON document:

imageOr the terse and a little cryptic:

image

Instead of turning the expected result from JSON into relational data and using MINUS to compare the expected and actual result set, we can do the reverse: turn the actual result into JSON and do the comparison for the JSON documents.

The JSON_ARRAYAGG operator can be used to turn a SQL result set into a corresponding JSON document, as is shown here:

image

Using this operator, we can represent the result_to_verify as a JSON document. Now the expected_result is again expressed as a JSON document and the result from the query against the view under test scrutiny is also turned to JSON as is shown in the figure. This result (result_to_verify) is the query against the view to test; the results from this query are turned into a JSON document using the JSON_ARRAYAGG operator. Note the use of ORDER BY – the order of the records in the JSON document is meaningful in the eventual comparison. The expected results are in this case defined in a text-block inside an inline function. The text block can easily be manipulated. It cannot easily be reused though.

The comparison between the actual result and the expected result are done using the JSON_EQUAL operator (introduced in Oracle Database 18c). This comparison does not look at the order of properties or meaningless whitespace. The order of the elements in the array is meaningful however. JSON_EQUAL returns true for JSON documents that are both valid and have the same content. The query as shown here will return only a result when the expected and actual results are not equal according to JSON_EQUAL

In green are the elements that are specific to this test case. The none-highlighted elements are generic – they are the same for all test cases. The fact that so much is generic – captured in a template – is attractive, as is the fact that no additional database objects need to be created for running this test case.

image

The CODE used for this last example:

with function expected_result return varchar2 
is
begin
return ' [ {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "00:00"}
           , {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "02:39"}
           , {"PERSONID" : "1", "BLOCK_START_LOGIN_TIME" : "04:59"}
         ]';
end;
result_to_verify as
( SELECT JSON_ARRAYAGG(JSON_OBJECT(*)) result_json
  FROM   my_query
  WHERE  personid =1
  order
  by     BLOCK_START_LOGIN_TIME
)
select 'FAILURE'
from   DUAL
where  NOT JSON_EQUAL( (select result_json from result_to_verify), expected_result )

Here is the result when the expected and actual result are not the same. Note that this does not indicate where the deviation is in the actual result from the desired result. For this we would have to further analyze the two JSON documents.

image

Thanks to Tim Scott and Connor McDonald for their willingness to brainstorm on this and their insights and code contributions.

Resources

Oracle Base (Tim Scott’s) article on JSON_EQUAL – https://oracle-base.com/articles/18c/json_equal-condition-18c

Database docs on JSON_EQUAL https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/SQL-JSON-Conditions.html#GUID-35C7012D-FCDB-4106-88C1-CABA78326896 

ASK Tom – Inline Function – https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:9534452300346807610 

Database 21c Docs – JSON Developer Guide – https://docs.oracle.com/en/database/oracle/oracle-database/21/index.html 

Database Docs on JSON_ARRAYAGG – https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/JSON_ARRAYAGG.html#GUID-6D56077D-78DE-4CC0-9498-225DDC42E054 and on JSON_TABLE https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/JSON_TABLE.html#GUID-3C8E63B5-0B94-4E86-A2D3-3D4831B67C62

5 thoughts on “How to unit test a SQL query?

  1. I have recently wrote a library that is able to test the correctness of our sql logic. The logic is stored in db views and the the library mocks such views, substituting the input objects (views/tables in the FROM clause) with the desired test inputs and checking that the view produces the correct output.

    It runs as normal JUnit tests, and the input and output values are stored in the project, allowing for proper versioning and pull requests to be created. Test are also executed when the views are modified (build pipeline currently in bitbucket, db object management done with liquibase).

    I am trying to open source this work, if there’s enough interest and if the company approves. Please let me know if you would be interested in seeing such work when available.

  2. This doesn’t necessarily produce the correct results, e.g. the following query gives a NO_DATA_FOUND result but the queries are not equivalent:

    with result_to_verify as (
    select ‘X’, ‘Y’, 42 from dual
    union all
    select ‘A’, ‘B’, 3.141592 from dual
    )
    , expected_result as (
    select ‘X’, ‘Y’, 42 from dual
    union all
    select ‘A’, ‘B’, 3.141592 from dual
    union all
    select ‘C’, ‘C’, 100 from dual
    )
    select *
    from result_to_verify
    minus
    select *
    from expected_result

    1. Hi Vincent,

      Good point. I probably should have extended the query to
      with result_to_verify as (
      select ‘X’, ‘Y’, 42 from dual
      union all
      select ‘A’, ‘B’, 3.141592 from dual
      )
      , expected_result as (
      select ‘X’, ‘Y’, 42 from dual
      union all
      select ‘A’, ‘B’, 3.141592 from dual
      union all
      select ‘C’, ‘C’, 100 from dual
      )
      (select *
      from result_to_verify
      minus
      select *
      from expected_result
      )
      union all
      (select *
      from expected_result
      minus
      select *
      from result_to_verify)

      I hope the readers get the drift of what I am trying to do here. Create a set of simple to execute queries – that do not require additional tools to be set up – that are used to verify the results from the SQL query I am developing.

      thanks,
      Lucas

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Next Post

How to build [a CI/CD pipeline for] an Oracle Database application–kicking off a six part series

Build is an overloaded term in the world of software engineering. Building software used to be mainly a fancy way of saying “programming”. Now after the construction of the code, the actual build process is kicked off, that takes the work from the individual programmer and turns it into something […]