Using JPA to persist the Tour de France Java Object Graph to relational database tables

In a recent article – https://technology.amis.nl/blog/12786/building-java-object-graph-with-tour-de-france-results-using-screen-scraping-java-util-parser-and-assorted-facilities – I described how I retrieved the statistics for recent Tour de France editions from the official Tour de France website from my Java program and constructed an Java Object Graph for the data on stages, riders and rankings. In this article, I will show how I have persisted that data, from the Java Objects to Relational Tables in my local Oracle XE database. Note: the fact that this concerns Tour de France data is not really relevant for this story – it is a generic story about how JPA is used to map and persist Java Classes and Objects to a relational database.

The Class diagram for the classes involved looks like this:

Image

The Java program discussed in the previous article retrieves data from the Le Tour De France website and creates an Object Graph according to these object definitions. Note: in comparison with the previous article, I have already applied a few small changes that will help with othe ORM mapping that JPA will do for me. The stage and rider references in Standing as well as the tour references in both Rider and Stage have been added. These will easily result in foreign keys to the table holding the Tour instances from the tables for Riders and Stages for example.

The steps for persisting the Java Object graph to the database are fairly straightforward:

  • define the JPA Persistence Unit (through the persistence.xml file)
  • specify annotations in the Entities (the Java POJOs that have to be persisted)
  • write the code that takes the Object Graph, acquires an EntityManager, begins a transaction, persists the relevant objects and commits the transaction

This article will show these three simple steps.

Configuring the Persistence Unit

The persistence.xml file (in the classes/META-INF directory of the Java application) contains the definitions for the database connection to use for persisting the data, some settings regarding logging and the creation of any tables (mapped to the Entities) that do not already exists as well as a list of all the Java classes that represent or are Entities.

The file in this particular case looks like this:

<?xml version="1.0" encoding="windows-1252" ?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd"
             version="2.0">
    <persistence-unit name="Model">
        <provider>org.eclipse.persistence.jpa.PersistenceProvider</provider>
        <jta-data-source>java:/app/jdbc/jdbc/ScottLocalXEDS</jta-data-source>
        <class>nl.amis.tourdefrance.model.Tour</class>
        <class>nl.amis.tourdefrance.model.Rider</class>
        <class>nl.amis.tourdefrance.model.Stage</class>
        <class>nl.amis.tourdefrance.model.Standing</class>
        <properties>
            <property name="javax.persistence.jdbc.url" value="jdbc:oracle:thin:@localhost:1521:XE"/>
            <property name="javax.persistence.jdbc.user" value="scott"/>
            <property name="javax.persistence.jdbc.driver" value="oracle.jdbc.OracleDriver"/>
            <property name="javax.persistence.jdbc.password" value="tiger"/>
            <property name="eclipselink.target-server" value="WebLogic_10"/>
            <property name="javax.persistence.jtaDataSource" value="java:/app/jdbc/jdbc/ScottLocalXEDS"/>
            <property name="eclipselink.ddl-generation" value="create-tables"/>
            <property name="eclipselink.logging.level" value="FINEST"/>
        </properties>
    </persistence-unit>
</persistence>

Note the setting true for eclipselink.ddl-generation that instructs the JPA implementation to generate tables and constraints in the designated database schema when these do not already exist.

Specify JPA annotations in the Entities

Most of the persisting of the POJOs is done automatically by JPA. All we need to do is add some well chosen Annotations in the Entities – the four classes that we want to persist to the database. Each of the classes gets an @Entity annotation – to indicate the class is in fact a JPA entity. Additionally, we will make use of generated, surrogate, meaningless primary keys. Each entity is extended with a property called id that is annotated with @Id and @GeneratedValue; the latter means that during insert of the entity, a column Id created for this entity attribute will be populated with an auto-generated value.

The Tour class is annotated according to these guidelines and now looks like this:

@Entity
public class Tour {

    @Id
    @GeneratedValue
    private Integer id;

    @Column(nullable = false)
    Integer year;
    int numberOfStages;
    boolean prologue;
    @Temporal(value = TemporalType.DATE)
    Date startDate;
    @OneToMany(cascade = { CascadeType.PERSIST, CascadeType.MERGE }, mappedBy = "tour")
    List<Stage> stages = new ArrayList<Stage>();

    @OneToMany(cascade = { CascadeType.PERSIST, CascadeType.MERGE }, mappedBy = "tour")
    Map<Integer, Rider> riders = new HashMap<Integer, Rider>();

an interesting annotation is found for both the riders and stages properties. Both are annotated with

@OneToMany(cascade = {CascadeType.PERSIST,CascadeType.MERGE }, mappedBy = "tour")

This indicates that one Tour instance may have multiple associated Rider (and Stage) instances, that refer to the Tour through their property tour. Additionally, the CascadeType.PERSIST and CascadeType.MERGE specify that when the Tour entity instance is persisted or merged, its associated riders and stages should be persisted or merged as well.

The corresponding annotations can be found in Rider and Stage – for the tour property:

@ManyToOne(cascade=CascadeType.ALL)
@JoinColumn(name = "TOUR_ID")
private Tour tour;

This specifies how a TOUR_ID column should be creeated in the table generated for Riders (or Stages) to contain the Foreign Key reference to Tour.

The full set of annotations for Stage is shown here. Stage is somewhat interesting as it has both a ManyToOne relation with Tour as well as a OneToMany relation with Standing.

@Entity
public class Stage {

    @Id @GeneratedValue
    private Integer id;
    @Column(name = "departure")
    private String from;
    @Column(name = "finish")
    private String to;
    private float totalDistance;
    @Column(name = "stageSequence")
    private int sequence;
    @Column(name = "stageType")
    private String type; // contre montre, team suivant,
    @Column(name = "stage_date")

    private String date;
    @OneToMany(cascade = {CascadeType.PERSIST,CascadeType.MERGE },
       mappedBy = "stage")

    private List<Standing> standings = new ArrayList<Standing>();
    @ManyToOne(cascade=CascadeType.ALL)
    @JoinColumn(name = "TOUR_ID")
    private Tour tour;

Standing finally refers to both Stage and Rider, using similar annotations as the ones we have seen above.

Code for interacting with the Entity Manager and persisting the object graph

With both the persistence.xml file and all annotations in place – and leveraging the code introduced in the previous article for creating the object graph from the Tour data on www.letour.fr – the code to actually persist that object graph is remarkably simple.

    public static void main(String[] args) {
        // Use persistence.xml configuration
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("Model");
        EntityManager em = emf.createEntityManager(); // Retrieve an application managed entity manager

        em.getTransaction().begin();
        TourManager tm = new TourManager();
        for (Tour tour : tm.getTours()) {
            em.persist(tour);
        }
        em.getTransaction().commit();
        em.close();
        emf.close();
        ...

The EntityManager is retrieved from the EntityManagerFactory that itself is instantiated using the definition of a Persistence Unit (Model) in the persistence.xml file. Once the EntityManager is available, the transaction is begun. Every Tour entity is persisted (and because of the Cascade settings, all contained Riders, Stages and Standings are persisted along with the Tour). Then the transaction is committed and both EntityManager and EntityManagerFactory are closed. The code is not very robust (no proper exception handling for example) but will serve my purpose of getting the data from the website into the database.

Running the Java program for persisting the Tour de France data

Running the Java program is a piece of cake now. The current logging level in the persistence.xml file is set to produce finest log messages – every last little detail is written to the output. For example:

Image

Once the program is done, we can inspect the database schema. We will find four new tables – TOUR, RIDER, STAGE and STANDING – that have been generated by JPA:

Image

Upon closer inspection of the contents of these tables, we will find that they hold data – that has all the desired mutual references:

Image

This allows for SQL queries with joins, such as (to find the top 3 for the 18th stage in the 2010 Tour de France):

select s.rank
,      r.name
from   standing s
       join
       rider r
       on (s.rider_id = r.id)
       join
       stage st
       on (st.id = s.stage_id)
       join
       tour t
       on (st.tour_id = t.id)
where  st.stagesequence = 18
and    t.year = 2010
order
by     s.rank asc

Image

The website tells us:

Image

So the SQL Query against the locally persisted data returns the same answer as the official website. That seems like a good start!

Resources

Download the sources for this article: TourDeFrancePersistingObjectGraphWithJPA.