Tour de France 2011 – Analysis using ADF DVT Graphs – Part 3 – Distance, Speed and Withdrawals with Pareto, Combination, Stock Chart (High/Low) and Bubble Chart

1

Another article on analyzing and visualizing the results from the Tour de France 2011 using the ADF DVT components. This article uses the same set of data already discussed in several previous articles – including the standings per stage as well as the overall standings after each stage. This article will focus on using the combination graph, the bubble graph and the high/low (aka stock) chart for taking a closer look at speed, length and withdrawals (and any connection there can be between these aspects).

Some of the pretty pictures created in this article:

ImageImage

ImageImage

Image

Distance – per stage and overall using a Combination Graph

The distance varies quite substantially between stages. A combination graph is a good way to visualize both the distance per stage as well as the overall distance in the Tour up until a certain stage.

The SQL Query that underlies the ViewObject that underpins this graph is shown here:

select stagesequence
  ,      totaldistance - nvl(lag(totaldistance) over (order by stagesequence), 0) stagedistance
  ,      totaldistance
  from   stage e
         join
         tour t
         on (e.tour_id = t.id)
  where  t.year = 2011

The collection on the Data Control based on the ViewObject based on this query can be dragged to the page and droppped as a Graph. Select Combination graph this time:

Image

Configure the graph:

Image

Note how the two data points are Stagedistance (for the bar chart against the Y1 axis) and Totaldistance (for the line against the Y2 axis).

Run the page to show the following overview:

Image

Withdrawals per Stage – using Combination and Pareto

The withdrawals – riders who do not appear in the overall ranking from a certain stage onwards – can be analyzed in various ways. One way is also through the use of a Combination Chart that shows both the number of withdrawals in a stage (bar) as well as the total number of withdrawals until a certain stage (using a line). This is configured using the following query:

with stages as
(
  select e.stagesequence
  ,      e.totaldistance - lag(e.totaldistance) over (order by e.stagesequence) stagedistance
  ,      e.totaldistance
  ,      (select count(s.id) from standing s where s.stage_id = e.id and s.type_of_standing='ITG') rider_count
  from   stage e
         join
         tour t
         on (e.tour_id = t.id)
  where  t.year = 2011
)
, stages_and_withdrawals as
(
select stagesequence
,      stagedistance
,      totaldistance
,      rider_count
,      lag(rider_count) over (order by stagesequence) - rider_count stage_withdrawals
from   stages
)
select stagesequence
,      stagedistance
,      totaldistance
,      rider_count
,      stage_withdrawals
,      sum(stage_withdrawals) over (order by stagesequence) cummulative_withdrawals
from    stages_and_withdrawals

Based on this query (using a ViewObject in the Application Module) the following Combination Graph is created:

Image

And the when the page is run, it shows up like:

Image

One other, similar way of visualizing the number of withdrawals per stage, especially in order to find the stages that were particularly rough for some reason, is through the use of a Pareto chart. It is easily created, using only the stage sequence and the number of withdrawals in a particular stage for input. Drag the same collection as before to the page, drop it as Graph and choose the Pareto chart:

Image

Configure the chart:

Image

and run the page – giving the following overview of withdrawals:

Image

Clearly stage 9 was a bastard of stage, accounting for close to 30% of all withdrawals all by itself!

Speed per Stage – High, Low and Median using a Stock Chart

The Stock Chart (aka High/Low chart) displays an indicator per date. The indicator is constructed using three values: the maximum and minimum for a day and a third value that for real stocks indicates the close value and will be used for the median speed in my visualization. The stock chart can be used with additional values: a fourth one for the opening value and optionally a fifth one to indicate the stock volume. I will not be using those 4th and 5th parameters – initially at least.

The following query returns the highest, lowest and average speed for each stage of the 2011 Tour de France:

with stages as
(
  select e.stagesequence
,      nvl(e.totaldistance - lag(e.totaldistance) over (order by e.stagesequence), e.totaldistance) stagedistance  ,      e.totaldistance
  ,      (select count(s.id) from standing s where s.stage_id = e.id and s.type_of_standing='ITG') rider_count
  ,      (select min(s.totaltime) from standing s where s.stage_id = e.id and s.type_of_standing='ITE' ) winner_time
  ,      (select max(s.totaltime) from standing s where s.stage_id = e.id and s.type_of_standing='ITE' ) last_time
  ,      (select median(s.totaltime) from standing s where s.stage_id = e.id and s.type_of_standing='ITE' ) median_time
,      to_date('30-06-2011','DD-MM-YYYY')+ stagesequence date_of_stage
  from   stage e
         join
         tour t
         on (e.tour_id = t.id)
  where  t.year = 2011
)
, stages_and_withdrawals as
(
select stagesequence
,      stagedistance
,      totaldistance
,      rider_count
,      lag(rider_count) over (order by stagesequence) - rider_count stage_withdrawals
,      winner_time
,      last_time
,      median_time
,      stagedistance/(extract(hour from winner_time) + 1/60* extract(minute from winner_time) + 1/3600 * extract(second from winner_time)) winner_speed
,      stagedistance/(extract(hour from last_time) + 1/60* extract(minute from last_time) + 1/3600 * extract(second from last_time)) last_speed
,      stagedistance/(extract(hour from median_time) + 1/60* extract(minute from median_time) + 1/3600 * extract(second from median_time)) median_speed
,      date_of_stage
from   stages
)
select stagesequence
,      stagedistance
,      totaldistance
,      rider_count
,      stage_withdrawals
,      sum(stage_withdrawals) over (order by stagesequence) cummulative_withdrawals
,      winner_time
,      last_time
,      median_time
,      winner_speed
,      last_speed
,      median_speed
,      date_of_stage
,      'Speed' metric
from    stages_and_withdrawals
where  median_speed is not null

Note: I had some errors regarding lacking data when I used the true date for the stages – because of rest-days the days are not consecutive and the stock chart had a problem dealing with that. So I have manipulated the query in creating a time wise consecutive set of data.

After creating a ViewObject based on query, adding it to the Application Module, creating a new page and refreshing the Data Control, I am ready to create the data bound stock chart.

Drag the collection to the page, drop it as Graph:

Image

Select Stock Chart:

Image

Configure the stock chart using the attribute from the ViewObject:

Image

Make some small stilistic changes and run the page.

So here is the chart that indicates the speed for each stage. The top end of the marker indicates the highest speed – or the speed of the winner of the day – expressed in kilometer per hour. The low end of the indicator represents the speed for the last rider to finish in a stage. The horizontal bar indicates the median speed – that is the speed in the middle of all speed values in sorted order.

Image

When winner and median speed are the same: the stage ended with a group sprint – such as the first stage. When the indicator is nothing but a horizontal bar – such as for the 21st stage in Paris – all riders finished in one group. When the horizontal bar is at the bottom, equal to the last rider’s speed, there was a group or a single rider ahead with a compact group behind it. Obviously the length of the marker indicates the spread in speed or the gap in time between the winner and the rider to come last. The contre-le-montre (speed time trial) on Saturday (20th stage) has the largest difference between winner and last rider.

Bubble Chart to visualize the relation between Speed, Distance and the number of Withdrawals

I have wondered whether there is a relation between the number of withdrawals in a certain stage and the (combination of) speed and distance in the stage. Such a association can be visualized using a Bubble Chart – where bubbles are plotted for each stage against both distance (X-axis) and speed (Y-axis). The radius (size) of the bubble visualizes the number of withdrawals in the stage.

Image

The Bubble Chart is configured with attributes for X-axis, Y-axis and Bubble Size and optionally for bubble color as well.

Image

Well, I am not sure too many conclusion can be drawn from this chart:

Image

The fastest stage was of in-between length and has two withdrawals, the shortest have no witdrawals but the slowest all have at least one withdrawal. The largest number of withdrawals – the biggest bubble – is fairly long, of average speed. But the longest – of almost the same speed – has not withdrawals at all. I am not ready to declare a strong correlation I am afraid. Slippery roads, carelss car drivers or wreckless riders may also have something to do with this.

Resources

Download JDeveloper 11.1.2 Application: TourDeFranceBubbleParetoStockPlusExport.

Tour de France 2011 – the official website – http://www.letour.fr/us/index.html

Previous Articles:

Building Java Object Graph with Tour de France results – using screen scraping, java.util.Parser and assorted facilities –http://technology.amis.nl/blog/12786/building-java-object-graph-with-tour-de-france-results-using-screen-scraping-java-util-parser-and-assorted-facilities

Using JPA to persist the Tour de France Java Object Graph to relational database tables http://technology.amis.nl/blog/12796/using-jpa-to-persist-the-tour-de-france-java-object-graph-to-relational-database-tables

Tour de France – Analysis using ADF DVT Graphs - http://technology.amis.nl/blog/12956/tour-de-france-analysis-using-adf-dvt-graphs

Tour de France 2011 – Analysis using ADF DVT Graphs – Part 2 – Gap with Cadel for Top 5 throughout the stages  http://technology.amis.nl/blog/12972/tour-de-france-2011-analysis-using-adf-dvt-graphs-part-2-gap-with-cadel-for-top-5-throughout-the-stages

Share.

About Author

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director for Fusion Middleware. Consultant, trainer and instructor on diverse areas including Oracle Database (SQL & PLSQL), Service Oriented Architecture, BPM, ADF, Java in various shapes and forms and many other things. Author of the Oracle Press book: Oracle SOA Suite 11g Handbook. Frequent presenter on conferences such as JavaOne, Oracle OpenWorld, ODTUG Kaleidoscope, Devoxx and OBUG. Presenter for Oracle University Celebrity specials.

1 Comment

  1. ravindraveeramalla on

    Hi Lucas,

    Very nice post.

    I am struggling to find the property to remove the thick intersecting lines, which are drawn in the middle point of X and Y axis in the bubble graph. Can you please suggest me the property to remove those lines?

    Your help is really appreciated.

Leave a Reply