Using Oracle SQL to resolve meaningful and slightly less serious challenges is one of my favorite (semi-)professional pastimes. In the last two weeks, I have been presenting on various topics including Oracle SQL to audiences in six cities all across India as part of the OTN Yathra 2013. These presentations and the interaction with the attendees on the various capabilities of SQL have inspired me in several ways. One of the outcomes is this article – also inspired by the fairly long journey home and the many flights within India. In this article I will use several powerful options in Oracle SQL to resolve some simple to ask questions. The SQL functions I am using include:
- Insert with Multiple Subqueries
- Insert generating some random data
- LISTAGG for aggregating strings
- LEAD to produce the result for one row using information from the next
- PIVOT to present the data in a matrix format
The statements are straightforward (relatively), the data model is simple. You will like it.
The case is an airplane. It has 46 rows and 6 seats per row. These seats are recorded in table SEATS:
Boarding passes are handed out to passengers who check in. When the boarding pass is created, the seat is allocated. The table BOARDING_PASSES holds… guess what?
The initial data in the SEATS table is created using a single INSERT statement. Interesting about this statement is the way in which it uses the WITH clause for subquery factoring to prepare the data to be inserted:
The statement contains the subquery positions that returns capitals A through F. Subqwuery rowws produces 46 rows. The corss join between the two subqueries produces the 276 seat records.
Next, in order to later on create meaningful queries for finding free adjacent seats, I will hand out boarding passes for approximately 35% of the seats, in a more or less random fashion, scattered across the plane. This is done with the next insert statement:
This subquery seat_allocations produces 276 records that each have a random value between 1 and 100. The main query filters on records that have a value below 35. These records represent seats that have been taken. Records are created in table BOARDING_PASSES for these records.
To get an overview of which seats are taken and which are free – a bit like the seat map of the plane often shown when doing an electronic check in – we can use a PIVOT query to create a matrix that has the 6 positions (A through F) as the columns. For each row the six seats are represented with a 0 (free) or a 1 (taken).
The query that produces this matrix:
Note how the result from the inner most query contains a single record for each seat with an indication of 0 or 1. The PIVOT operator pivots these results and turns six rows into one: all the records with the same roww value are condensed into one with the columns A through F introduced into the result, returning the flag indicator for the corresponding seat.
The result of the PIVOT is shown next:
Every row is shown and for each row the 6 seats from left to right (A through F). A value of 1 indicates a taken seat and a 0 indicates a seat that is still free.
Next I want to select a bitmap per row – the bitmap is a string with six 0 and 1 values, representing the free an taken seats. Similar to the matrix result produced by the pivot, but this time as simple strings. The interesting part about this query is the LISTAGG operator. It can be used as an analytical function as well as an aggregator. The latter is the case here. This operator adds strings together by concatenating them. In this case, we group by roww and for each roww, the bitmap of the seats is produced by concatenating the seat free flag values (0 or 1) for the seats. The (order by position) ensures that the flags are included in the string in the correct order [A..F].
The selected bitmaps per row are produced like this:
We can use these bitmaps to easily find the rows that have adjacent seats available. However, we need to cater for the fact that positions C and D are not really adjacent because there is an aisle in between. The query to find such rows looks like this – catering for the aisle:
When we query for only those rowws that have two adjacent free seats, here is the result:
When the seats start filling up, we may not always be able to find adjacent seats. In addition to truly adjacent we have alternatives that are perhaps acceptable as well. For example: seats that are on both sides of the aisle (C and D on the same row). Or seats next to the aisle on two subsequent rows. Each of these could be awarded a certain score. For example:
- truly adjacent = 10 points
- adjacent across the aisle = 8 points
- aisle seats in subsequent rows = 4 points
Using such a scoring scheme, we can create a query that returns multiple options, even if the ideal option is not available. A query that will award scores to the various rows could be created like this next one:
The CASE expression calculates the scores, by trying out the various options and picking the first one that actually is the case for a row. It uses LEAD again to find two subsequent rows that have free seats on the same side of the aisle.
The scores for the first few rows:
Resources
Download the SQL scripts for this article: seats_in_plane.txt.
Nice! One slight remark, For your random barding passes the good old sample clause is much more suitable
insert into boarding_passes( seat_id, more_details )
select id, ‘Seat was allocated’
from seats sample( 35 )