Today, against all priorities, I have resumed an old hobby: the creation of charts with nothing but SQL to work with. Several years back I created pure SQL Hi-Lo Charts, Pie Charts, Gauge (Speedometer), Bar Chart and Gantt Chart (all these chart types and their underlying SQL are available on the AMIS Technology Blog). Today, I felt an urgent need for a scatterplot – a simple X-Y axis presentation of two variable data points. Examples are the Euro/Dollar conversion ratio against time, a growth curve (height vs. age), position vs. time, nerdiness vs. distance from Silicon Valley etc. Oracle SQL has many facilities – especially the WITH clause for subquery factoring – that make the creation of charts not as hard as you might think. Their usefulness is still pretty questionable though. To be honest: the Scatterplot took a little more effort than I had anticipated. Maybe I am getting a little rusty in SQL (with all my Java and SOA activities and all). Anyways, one of the charts I will be creating in SQL looks like this: It presents the growth curve for a female subject between ages 9 and a half years and 18 years. In that period, the subject grew from just over 134 cm to about 170 cm. The chart has a Y-axis with some markers to indicate the value scale (for the Y-coordinate of the data points). The horizontal axis has similar markers to indicate the value scale for the X-values. Both axes are configurable in the underlying SQL-query: their value range, the number of markers and the overall size of the chart. The marker used in the chart (* in this case) can also be set. The data set for this sample chart can be presented as a simple table: This table does not exist as a database table – instead, an inline view is created that injects the data in to the chart generator, as we will see in a moment. Two types were created to help adapt the data to the SQL query:
create type measurement_t as object (x number(5,2), y number(12,5)); create type measurement_tbl_t as table of measurement_t;
Using these types, the Sub Query is defined as follows:
with age_height_measurements as ( select x, y from table( measurement_tbl_t ( measurement_t(9.5, 137) , measurement_t(10.0, 138) , measurement_t(10.5, 139) , measurement_t(11, 142) , measurement_t(11.5, 144) , measurement_t(12, 147) , measurement_t(12.5, 150) , measurement_t(13, 153) , measurement_t(13.5, 155) , measurement_t(14, 158) , measurement_t(14.5, 161) , measurement_t(15, 163) , measurement_t(15.5, 165) , measurement_t(16, 167) , measurement_t(16.5, 168) , measurement_t(17.0, 169) , measurement_t(17.5, 169) , measurement_t(18.0, 169) ) ) ) , chart_query as -- the query that drives the chart; one bar per groupid, one sub-stack per category ( select x , y from age_height_measurements )
Based on this sub query, the more generic chart-initialization subqueries chart_query (for the data), chart_dimensions (for the configuration of the chart, independent from the data) and limits (to specify the value ranges for the two axes) are defined:
, chart_query as -- the query that drives the chart; one bar per groupid, one sub-stack per category ( select x , y from dollar_euro_measurements ) , chart_dimensions as ( select 20 height , 50 width -- the width of the chart, can be changed , 10 x_margin -- positions to the left of the y-axis , 5 y_marker_distance , 20 x_marker_distance , '*' marker from dual ) , limits as ( select min(y)*0.8 low_limit -- decide to use 0 as the baseline for the chartÅ› y-axis or some other value , max(y)*1.2 high_limit , min(x) start_x , max(x) end_x from chart_query )
The Y-axis with the marker values can be produced using two additional queries. One produces a record for each line in the chart and the other finds the lines that should have an Y-axis marker as well as the value associated with that marker:
, lines as ( select level line from chart_dimensions connect by level <= height ) , y_markers as ( select line , (height-line)/height * (high_limit - low_limit) + low_limit marker_value from lines cross join chart_dimensions cross join limits where mod(height - line, y_marker_distance) = 0 )
The X-axis and its markers is produced as two additional records, to be added (UNION ALL) to the final result of the query:
... union all select height+1 line, lpad(' ',x_margin)||axis from x_axis_1 cross join chart_dimensions where x = width union all select height + 2 line, lpad(' ',x_margin)||axis from x_axis_2 cross join chart_dimensions where x = width
The lines produced by these two queries look like this:
The subquery columns produces a single record for every column in the chart (determined by the width setting in chart_dimensions).
The x_axis subquery produces a record for each column and determines for the column if it represents a marker or not (resulting in a | or _ character) and if it is a marker, it also determines the value to be displayed for the marker. Subquery x_axis_1 is a recursive subquery that creates the axis itself by concatenating together all characters (_ and |) produced in the per-column-records produced by x_axis. Subquery x_axis_2 is similar, also using the recursive subquery mechanism. This query creates line with marker values in the right columns – matching the markers.
, columns as ( select level x from chart_dimensions connect by level <= width ) , x_axis as ( select x , case mod(width - x, x_marker_distance) when 0 then '|' else '_' end character , case when mod(width-x , x_marker_distance)= 0 then substr( lpad(x/width * ( end_x- start_x) + start_x,4), 1,4) when mod(width -x, x_marker_distance)< 4 then '' else ' ' end marker_value from columns cross join chart_dimensions cross join limits ) , x_axis_1 (axis, x) as ( select character axis , x from x_axis where x = 1 union all select axis || character , x_axis.x from x_axis_1 join x_axis on (x_axis_1.x +1 = x_axis.x) ) , x_axis_2 (axis, x) as ( select to_char(marker_value) axis , x from x_axis where x = 1 union all select axis || marker_value , x_axis.x from x_axis_2 join x_axis on (x_axis_2.x +1 = x_axis.x) )
Subquery values_on_line finds for each line which values should be plotted on that line. These values to be plotted in the chart are subsequently gathered by the chart_lines sub query. This query returns records for all lines in the chart using a recursive subquery that joins with values_on_line. This query works its way through all columns for all lines and for each column determines whether it should contain a marker or a space character.
, values_on_line as ( select line , c.x , round((c.x - start_x)/(end_x - start_x) * width) col from lines l cross join limits cross join chart_dimensions left outer join chart_query c on (c.y between (height-line-0.5)/height * (high_limit - low_limit) + low_limit and (height-line+0.5)/height * (high_limit - low_limit) + low_limit ) ) , chart_lines (line, chart_line, col) as ( select line , '|' axis , 0 from lines union all select c.line , chart_line|| case when v.line is null then ' ' else marker end , c.col+1 from chart_lines c left outer join values_on_line v on (v.line = c.line and c.col+1 = v.col) cross join chart_dimensions where c.col < width )
Finally the main query sweeps all results together – combining the y_markers with the chart_lines and adding the x_axis_1 and x_axis_2:
select line , substr(lpad(nvl(to_char(round(marker_value,1)),' '),25),26-x_margin,x_margin) ||chart_line y from lines left outer join y_markers using (line) join chart_lines using (line) cross join chart_dimensions where chart_lines.col = width union all select height+1 line, lpad(' ',x_margin)||axis from x_axis_1 cross join chart_dimensions where x = width union all select height + 2 line, lpad(' ',x_margin)||axis from x_axis_2 cross join chart_dimensions where x = width order by line
The complete query is now:
with age_height_measurements as ( select x, y from table( measurement_tbl_t ( measurement_t(9.5, 137) , measurement_t(10.0, 138) , measurement_t(10.5, 139) , measurement_t(11, 142) , measurement_t(11.5, 144) , measurement_t(12, 147) , measurement_t(12.5, 150) , measurement_t(13, 153) , measurement_t(13.5, 155) , measurement_t(14, 158) , measurement_t(14.5, 161) , measurement_t(15, 163) , measurement_t(15.5, 165) , measurement_t(16, 167) , measurement_t(16.5, 168) , measurement_t(17.0, 169) , measurement_t(17.5, 169) , measurement_t(18.0, 169) ) ) ) , chart_query as -- the query that drives the chart; one bar per groupid, one sub-stack per category ( select x , y from age_height_measurements ), chart_dimensions as ( select 20 height , 50 width -- the width of the chart, can be changed , 10 x_margin -- positions to the left of the y-axis , 5 y_marker_distance , 20 x_marker_distance , '*' marker from dual ) , limits as ( select min(y)*0.8 low_limit -- decide to use 0 as the baseline for the chartÅ› y-axis or some other value , max(y)*1.2 high_limit , min(x) start_x , max(x) end_x from chart_query ) , lines as ( select level line from chart_dimensions connect by level <= height ) , y_markers as ( select line , (height-line)/height * (high_limit - low_limit) + low_limit marker_value from lines cross join chart_dimensions cross join limits where mod(height - line, y_marker_distance) = 0 ) , columns as ( select level x from chart_dimensions connect by level <= width ) , x_axis as ( select x , case mod(width - x, x_marker_distance) when 0 then '|' else '_' end character , case when mod(width-x , x_marker_distance)= 0 then substr( lpad(x/width * ( end_x- start_x) + start_x,4), 1,4) when mod(width -x, x_marker_distance)< 4 then '' else ' ' end marker_value from columns cross join chart_dimensions cross join limits ) , x_axis_1 (axis, x) as ( select character axis , x from x_axis where x = 1 union all select axis || character , x_axis.x from x_axis_1 join x_axis on (x_axis_1.x +1 = x_axis.x) ) , x_axis_2 (axis, x) as ( select to_char(marker_value) axis , x from x_axis where x = 1 union all select axis || marker_value , x_axis.x from x_axis_2 join x_axis on (x_axis_2.x +1 = x_axis.x) ) , values_on_line as ( select line , c.x , round((c.x - start_x)/(end_x - start_x) * width) col from lines l cross join limits cross join chart_dimensions left outer join chart_query c on (c.y between (height-line-0.5)/height * (high_limit - low_limit) + low_limit and (height-line+0.5)/height * (high_limit - low_limit) + low_limit ) ) , chart_lines (line, chart_line, col) as ( select line , '|' axis , 0 from lines union all select c.line , chart_line|| case when v.line is null then ' ' else marker end , c.col+1 from chart_lines c left outer join values_on_line v on (v.line = c.line and c.col+1 = v.col) cross join chart_dimensions where c.col < width ) select line , substr(lpad(nvl(to_char(round(marker_value,1)),' '),25),26-x_margin,x_margin) ||chart_line y from lines left outer join y_markers using (line) join chart_lines using (line) cross join chart_dimensions where chart_lines.col = width union all select height+1 line, lpad(' ',x_margin)||axis from x_axis_1 cross join chart_dimensions where x = width union all select height + 2 line, lpad(' ',x_margin)||axis from x_axis_2 cross join chart_dimensions where x = width order by line
This chart displays the Dollar/Euro conversion ratio for the last three years Using this query combined with the data for the Dollar/Euro conversion ratio for the last three years, it is easy to create another scatterplot:
The start of the the query producing this chart is like this:
with dollar_euro_measurements as ( select x, y from table( measurement_tbl_t ( measurement_t(10.08, 1.43) , measurement_t(10.16, 1.37) , measurement_t(10.25, 1.136) , measurement_t(10.5, 1.28) , measurement_t(10.75, 1.39) , measurement_t(11.0, 1.34) , measurement_t(11.25, 1.45) , measurement_t(11.5, 1.43) , measurement_t(11.75, 1.37) , measurement_t(12.0, 1.29) , measurement_t(12.25, 1.33) , measurement_t(12.5, 1.236) , measurement_t(12.75, 1.30) , measurement_t(13.0, 1.33) , measurement_t(13.25, 1.30) , measurement_t(13.5, 1.31) , measurement_t(13.75, 1.36) , measurement_t(14.0, 1.36) , measurement_t(14.25, 1.38) , measurement_t(14.375, 1.37) ) ) ) , chart_query as -- the query that drives the chart; one bar per groupid, one sub-stack per category ( select x , y from dollar_euro_measurements ) , chart_dimensions as ( select 20 height ,...