Posts tagged lead
Oracle SQL – Finding free adjacent seats in an airplane, using Subquery Factoring, Analytical Functions (LISTAGG and LEAD), Outer Join, PIVOT and good old INSTR1
Using Oracle SQL to resolve meaningful and slightly less serious challenges is one of my favorite (semi-)professional pastimes. In the last two weeks, I have been presenting on various topics including Oracle SQL to audiences in six cities all across India as part of the OTN Yathra 2013. These presentations and the interaction with the attendees on the various capabilities of SQL have inspired me in several ways. One of the outcomes is this article – also inspired by the fairly long journey home and the many flights within India. In this article I will use several powerful options in Oracle SQL to resolve some simple to ask questions. The SQL functions I am using include:
- Insert with Multiple Subqueries
- Insert generating some random data
- LISTAGG for aggregating strings
- LEAD to produce the result for one row using information from the next
- PIVOT to present the data in a matrix format
The statements are straightforward (relatively), the data model is simple. You will like it.
Oracle SQL – spotting combinations that occur and those that do not – demonstrating Analytical Functions, Outer Join and SubQuery Factoring0
The challenge in this article is a simple one. I have a table with records (duh!) and each record describes the occurrence of a certain payload. In this example the payload will be a color. Every record also has a sequence number to indicate well, when it appeared relative to the other observations. A subset of the data would look like this:
The challenges I will discuss in this article are simple:
- which sequences of three color observations occur in the data set
- how often do these sequences of three occur (give me the top 3 sequences)
- given the colors that have been observed, which combinations of three can be created
- which possible combinations (or color sequences of three colors) have not been observed at all
Using Oracle SQL features such as Analytical Functions, Outer Join and Subquery Factoring – these questions become very easy to answer.
Advanced SQL to find valid periods – juggling with outer joins, running totals and analytical functions2
After teaching the Advanced SQL Masterclass recently, one of the attendees confronted me with an interesting challenge for me to solve with all the tools I had been discussing all day. This challenge originates in the world of insurance and revolves around policies. Apparently (I am not much of an insurance expert) policies can have periods of inclusions (when they are valid) as well as periods of exclusion (when the policy is definitely not valid, even when there are inclusions that suggest the opposite. The exclusions win, so to say.
Visually, I can describe the situation for one specific policy like this:
Here we see a policy with five inclusions (in green), that partly overlap, as well as four exclusions (in red). The challenge is that we need to find out from a table that contains all periods of inclusion (green) and exclusion (red) what the resulting periods of validity are for the policy. Visually that would be deriving the blue bars in the following figure:
This article describes how this challenge can be approached in SQL.
Some IT organizations that I visit have a sign somewhere indicating the number of days that have passed without a major issue. Production sites with proper respect for health and safety to something similar with signs like this one:
In this article, I am wondering about two things:
- given a list of all incidents, can I easily derive the longest streak of days without accident
- and the reverse: given a list of events, can I easily find the longest consecutive string of days on which that particular type of event happened (this time not accidents but more positive things like ‘the boss giving a compliment’, ‘an inspection passed’ , ‘a shiny new blog article published’, ‘my neighbour not drinking an alcoholic beverage’ or ‘my son getting up on time’
Using Analytical Functions, it should be fairly easy to find out both things, I suspect – although it turns out that the first is a little more straightforward.
Oracle 11gR2 – alternative for CONNECT_BY_ISLEAF function for Recursive Subquery Factoring (dedicated to Anton)5
On our blog, we have been discussing the new hierarchical query functionality in Oracle Database 11g Release 2, using Recursive Suquery Factoring. Instead of using CONNECT BY and its close associates such as START WITH, PRIOR, LEVEL and more exotic comrades like SYS_CONNECT_BY_PATH, CONNECT_BY_ROOT and NOCYCLE this release gave us a new, less proprietary and eventually probably more intuitive and functionally rich approach. We have also written how though we have no straightforward alternatives for LEVEL, SYS_CONNECT_BY_PATH and CONNECT_BY_ROOT – in the new recursive approach they are fantastically easy to emulate.
Until recently I have been quite happy with the new hierarchical querying and telling the world how I felt. Then an esteemed colleague – a far more experienced SQL programmer than I am – came up to me and remined me how the recursive sub query syntax at the present does not have a replacement for the CONNECT_BY_ISLEAF function – the SQL function that tells us whether a node produced in an hierarchical query has any children or is at the bottom of the chain – i.e. a leaf node. For leaf nodes (child-less), the function returns a value of one and for parent nodes the More >
One of the intriguing subjects at last weeks Oracle Open World conference for me was Complex Event Processing. The ability to process streams of real time events, and up to 1000s per second of them, opens up many new avenues of interesting opportunities. Detecting patterns and anomalies, aggregating and comparing – reporting only the noteworthy events or findings, thereby making sense (information) of enormous volumes of data is both great fun and tremendously useful in specific situations. The SQL like CQL language – Continuous Query Language – provides us with a syntax to analyze data streams in various ways – detecting trends and patterns, variations and aberrations. There are several key characteristics in the CEP architecture: handling large volumes of events, applying this special language for event analysis and executing the queries continuously. That last bit means that the query is registered with a stream of events, not executed at a single moment in time against the data as it is at that moment but looking for query results continously as the events keep streaming in. Query results can be reported for as long as the query registration remains active and the events keep More >