Posts tagged analytical functions
In a recent article – Advanced SQL to find valid periods – juggling with outer joins, running totals and analytical functions – I discussed how to use Analytical Functions in SQL to cleverly (!) derive the valid periods from a database table that contains periods of inclusion and exclusion. A valid period is a period for which there is at least one inclusion and for which there is no exclusion. I used several powerpoint based graphics to illustrate the business case. For example:
to depict the periods of inclusion and exclusion and this figure to demonstrate how to derive the valid periods (the blue bars):
After completing this article – and fiddling around in Powerpoint quite a bit – I realized that for visualizing data in a table, I have a perfect tool at my fingertips: the Data Visualization Tags (DVT) in ADF 11g are created for this very purpose: turning data into information through visualization. And this rich library of DVTs components contains – in addition to fairly straightforward visualizations such as bar charts, pie charts and line graphs – also more complex visualization components such as the Bubble Chart, Thematic Map and Gantt Chart. The Gantt Chart has three More >
Advanced SQL to find valid periods – juggling with outer joins, running totals and analytical functions2
After teaching the Advanced SQL Masterclass recently, one of the attendees confronted me with an interesting challenge for me to solve with all the tools I had been discussing all day. This challenge originates in the world of insurance and revolves around policies. Apparently (I am not much of an insurance expert) policies can have periods of inclusions (when they are valid) as well as periods of exclusion (when the policy is definitely not valid, even when there are inclusions that suggest the opposite. The exclusions win, so to say.
Visually, I can describe the situation for one specific policy like this:
Here we see a policy with five inclusions (in green), that partly overlap, as well as four exclusions (in red). The challenge is that we need to find out from a table that contains all periods of inclusion (green) and exclusion (red) what the resulting periods of validity are for the policy. Visually that would be deriving the blue bars in the following figure:
This article describes how this challenge can be approached in SQL.
It is a strange thing really: querying the database to find missing data. Queries usually report on what is there. Not the queries in this article (and its prequel): they report on what is not there. We explore a number of approaches to uncovering missing information – because the fact that something is not found in itself is a fact we may want to know about. And it gives me a nice motive to demonstrate a number of fun SQL features.
We are still very much engaged in the EMP and DEPT tables in the SCOTT schema that back in the early ’80s was pivotal in demonstrating the capabilities of the first Oracle RDBMS releases. It still does an excellent job for most of my demos today. Must be me…
This article delves into urgent matters such as ‘which letters do not occur in the names of the employees’, ‘in which months of the year was no one hired’, ‘who in each department does not have a colleague with less experience (or a lower salary)’, ‘which employees do not have a colleague in the same job (either in their own department or in the entire company)’, and more.
You are probably familiar with the FIRST_VALUE and LAST_VALUE analytical functions that were introduced some time ago into the Oracle RDBMS, in the 9iR2 release I believe (or at least that is when they made their way into the Standard Edition). These values are used to find the first respectively last value in a window in a partition that has been ordered in a certain way.
Oracle introduces a new, related function in 11gR2, called NTH_VALUE. Instead of simply the first or last value in an analytical window, we can ask for a specific row number, such as 2nd row (ND_VALUE?), the 7th or the 223th. This article shows the syntax for this new function.
Last week, this weblog hit the 9M post reads mark. Since we started technology.amis.nl/blog in July 2004, it took us some 5 years to get to that point. And I started to wonder when we reached earlier milestones, like 100k, 1M and 5M. This took me to some SQL puzzles that before I started in anger might have been difficult to solve, but turned out to be rather straightforward – thanks to Analytical functions.
In this article a brief report on how to reconstruct history using Running Sums and a bit of LAG/LEAD analysis.
One of the intriguing subjects at last weeks Oracle Open World conference for me was Complex Event Processing. The ability to process streams of real time events, and up to 1000s per second of them, opens up many new avenues of interesting opportunities. Detecting patterns and anomalies, aggregating and comparing – reporting only the noteworthy events or findings, thereby making sense (information) of enormous volumes of data is both great fun and tremendously useful in specific situations. The SQL like CQL language – Continuous Query Language – provides us with a syntax to analyze data streams in various ways – detecting trends and patterns, variations and aberrations. There are several key characteristics in the CEP architecture: handling large volumes of events, applying this special language for event analysis and executing the queries continuously. That last bit means that the query is registered with a stream of events, not executed at a single moment in time against the data as it is at that moment but looking for query results continously as the events keep streaming in. Query results can be reported for as long as the query registration remains active and the events keep More >