The Split-Join can be a very useful tool in your OSB services, yet it seems to be underestimated. When I asked around, it turned out that not many developers use it, even though I can come up with plenty of uses for it. The Split-Join’s strength is in numbers: it is at its most powerful when you need to process many pieces of similar data. For this example I used a simplified version of a project I am working on. In this project mobile devices send data about rainfall to a database. The data is collected at a regular interval, with each measurement creating a record, and is sent to the database per session, where each session contains a large set of records. Instead of processing these records one at a time I can process them concurrently and save a lot of processing (and waiting) time.
I created the XML Schema files and WSDLs for the two services using JDeveloper rather than Eclipse/OEPE, because its design interface for these files is a lot more user-friendly (although this is of course personal preference).
The image shows the Record element. At first I had defined Record as a complex type, using it as the type of the element in the request messages. However, this actually makes implementing your Split-Join smoothly a bit harder. Having Record defined as an element allows me to create variables using this structure and process them without XQuery transformations, which saves a significant bit of processing time inside your service. Especially when dealing with repeating actions (like the parallel processing of multiple records) you want to be aware of unnecessary overhead and try to avoid it. Or as a colleague put it, ‘for each step consider if you really need it’.
As you can see, there are requests and responses for both ‘InsertRecord’ and ‘InsertRecords’. The first set is for the service called storeRecord, which processes and stores a single record in the database. The second set of messages is for the service that is exposed to the outside: the endpoint where devices will send their collections of records. As you can see, the messages are relatively straightforward and, most importantly, they both use the same Record element.
The Split-Join is a separate file/component in your project, so first we create a new Split-Join. When asked for an operation we use the InsertRecords operation from the exposed WeatherData service. This will automatically create a Split-Join object with a receive and a reply action, as well as a request and a response variable. Keep in mind that variables in a Split-Join need to be initialized before we can do anything with them; in this example we use an assign or copy action to initialize variables. The request variable is initialized for us, since it contains the request that is sent to the Split-Join. The response we initialize ourselves by assigning an empty InsertRecordsResponse to it: <InsertRecordsResponse xmlns="http://tu.delft/iot/Services/messages"/>.
I added a storeRecordResponseList because I want to collect all responses from the StoreRecord service, but I don’t want to send them back as a list of tens or hundreds of response messages. Instead we will process this list into a single, concise InsertRecordsResponse after all the records have been processed. In a Split-Join you can add a variable by right-clicking the Variables listing. Here you pick the structure for your variable. Remember how I created a Record element instead of a type? I created an element for this list as well, and this is where it comes in handy. Were I to use a type, the variable would automatically use the type’s name as the root element’s name, even if you initialize it with another root element (of the appropriate type). You could solve this by giving your types the names you would use for elements, but that would break any naming convention and create a messy schema.
Now that we have set up the basics, the real fun can begin. After the assign actions, place a for-each component in your Split-Join. This for-each component is where the magic happens. Unlike the normal for-each component, this one is able to process multiple items in parallel, and that is what gives the Split-Join its strength.
In the for-each we set the counter start value to 1 and the final value to the total number of records: ‘count($request.parameters/weat:Record)’. Give your counter a simple yet clear name, since you will need it later.
The second step is to create a variable for storing the Record we are currently working on. After we create the Record variable we initialize it by copying the Record for this execution of the for-each loop into it. To find the right Record we use the counter we set previously.
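To make the counter logic concrete outside of OSB, here is a minimal Python sketch of the same idea: count the Record elements, loop from 1 to that count, and pick the Record at the counter's position. The namespace URI bound to the `weat` prefix, the InsertRecordsRequest wrapper and the Rainfall field are assumptions for illustration only; the real schema is not shown in this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the `weat` prefix; the real URI is not given here.
NS = {"weat": "http://example.org/weather"}

# Hypothetical request: wrapper element and Rainfall field are assumptions.
REQUEST = """
<weat:InsertRecordsRequest xmlns:weat="http://example.org/weather">
  <weat:Record><weat:Rainfall>0.4</weat:Rainfall></weat:Record>
  <weat:Record><weat:Rainfall>1.2</weat:Rainfall></weat:Record>
  <weat:Record><weat:Rainfall>0.0</weat:Rainfall></weat:Record>
</weat:InsertRecordsRequest>
"""


def records_by_counter(request_xml):
    """Return (counter, record) pairs, with the counter starting at 1."""
    root = ET.fromstring(request_xml)
    # Same role as count($request.parameters/weat:Record) in the for-each.
    final_value = len(root.findall("weat:Record", NS))
    selected = []
    for counter in range(1, final_value + 1):
        # Positional predicates are 1-based, just like the for-each counter.
        record = root.find(f"weat:Record[{counter}]", NS)
        selected.append((counter, record))
    return selected


if __name__ == "__main__":
    for counter, record in records_by_counter(REQUEST):
        print(counter, record.findtext("weat:Rainfall", namespaces=NS))
```

The point of the sketch is only that the counter doubles as a 1-based index into the repeating Record elements, which is why a clear counter name pays off.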
Once we have our record safely stored away we’ll go ahead and create the service call-out that processes our Record. In a Split-Join this component is called Invoke Service, and just like with a Service Call-Out we configure the service and operation we are invoking, as well as an input and an output variable. Remember to create these variables inside your for-each loop. In this case we’ll call them storeRecordRequest and storeRecordResponse. storeRecordResponse will be initialized with the response message of the service we invoke; storeRecordRequest, however, needs to be initialized by us.
We will initialize it by assigning the value ‘<InsertRecordRequest xmlns="http://tu.delft/iot/Services/messages"/>’ to it.
Once it is initialized we can go ahead and put our Record in the request. Because we use the same Record element structure everywhere, we can simply insert it into the storeRecordRequest variable as a child of the InsertRecordRequest element.
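As a rough illustration of these two steps (initialize an empty request, then insert the Record as its child), the following Python sketch does the same with plain XML elements. It is not how OSB does it internally; the function name and the un-namespaced Record with a Rainfall field are hypothetical, and only the messages namespace is taken from the snippets in this post.

```python
import copy
import xml.etree.ElementTree as ET

# Namespace as quoted in the initialization snippets of this post.
MSG_NS = "http://tu.delft/iot/Services/messages"


def build_store_record_request(record):
    """Create an empty InsertRecordRequest and insert the Record as its child."""
    # Step 1: the 'assign' of an empty request element.
    request = ET.Element(f"{{{MSG_NS}}}InsertRecordRequest")
    # Step 2: the 'insert' of the current Record, copied so that the
    # original request message is left untouched.
    request.append(copy.deepcopy(record))
    return request


if __name__ == "__main__":
    # Hypothetical Record; the real fields are not shown in the post.
    record = ET.fromstring("<Record><Rainfall>0.4</Rainfall></Record>")
    print(ET.tostring(build_store_record_request(record), encoding="unicode"))
```

Because the Record is inserted unchanged, no transformation step is needed between the incoming message and the outgoing one, which is the benefit of sharing one Record element across both services.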
Last but not least we need to do something with the return messages. Remember the storeRecordResponseList variable? It is set to hold one or more storeRecordResponse messages. Again, we do not want to do too much, so with a simple insert we add our local storeRecordResponse to the global list.
Now that we have processed every Record all we need to do is process the response list into a concise response. We can do this by putting an assign after the for-each but before the reply. This will be executed once all parallel executions of the for-each are finished. With an XQuery resource that takes a ResponseList for input we construct the InsertRecordsResponse message for the Reply (response).
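The post does not show the XQuery resource itself, so the following Python sketch only illustrates the kind of reduction it performs: many per-record responses go in, one short summary comes out. The element names inside the list, the Status values and the Processed/Failed summary fields are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical response list: element names and Status values are assumptions.
RESPONSE_LIST = """
<StoreRecordResponseList>
  <InsertRecordResponse><Status>OK</Status></InsertRecordResponse>
  <InsertRecordResponse><Status>OK</Status></InsertRecordResponse>
  <InsertRecordResponse><Status>FAILED</Status></InsertRecordResponse>
</StoreRecordResponseList>
"""


def summarize(response_list_xml):
    """Collapse many per-record responses into one summary element."""
    root = ET.fromstring(response_list_xml)
    statuses = [r.findtext("Status") for r in root.findall("InsertRecordResponse")]
    summary = ET.Element("InsertRecordsResponse")
    # Hypothetical summary fields: total handled and number not reported as OK.
    ET.SubElement(summary, "Processed").text = str(len(statuses))
    ET.SubElement(summary, "Failed").text = str(sum(1 for s in statuses if s != "OK"))
    return summary


if __name__ == "__main__":
    print(ET.tostring(summarize(RESPONSE_LIST), encoding="unicode"))
```

Running the reduction once, after the loop, keeps the work inside each parallel branch down to a single insert into the list.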
To put our newly created Split-Join to use, all we have to do is generate a business service based on the .flow file that we just created (make sure you save it first). Right-click the file and select Oracle Service Bus > Generate Business Service. We call this business service from a proxy service using a simple routing that passes the inbound and outbound messages straight through; make sure to select that option in your routing. Since the Split-Join and our proxy service use the same WSDL, we don’t have to do anything to these messages. Of course you could add some functionality to your proxy service, like a validation step before sending the message off to your Split-Join.
To test the performance increase I created a mock service for the StoreRecord service using SoapUI. In this service I set a 40 ms delay to simulate processing time before it responds. Next I created a similar proxy service using a standard for-each component instead of a Split-Join, following the same steps as in the Split-Join example with as little alteration as necessary. Again using SoapUI, I created a request with 10 records and sent it alternately to the Split-Join service and the for-each service. The first averages around 520 ms of total response time, while the latter takes all of 3000 ms for the complete round trip.
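The effect can be reproduced in miniature outside of OSB. The sketch below is a stand-alone Python simulation, not the SoapUI test itself: a hypothetical `store_record` function sleeps for 40 ms, and 10 records are processed first one by one and then in parallel threads. It will not reproduce the 520 ms and 3000 ms figures, which include OSB and network overhead, but it shows why the sequential total grows with the number of records while the parallel total stays close to a single call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_DELAY_S = 0.040  # 40 ms, matching the delay set in the mock service


def store_record(record):
    """Hypothetical stand-in for the StoreRecord service: wait, then respond."""
    time.sleep(SIMULATED_DELAY_S)
    return {"record": record, "status": "OK"}


def process_sequentially(records):
    # Comparable to a standard for-each: one call finishes before the next starts.
    return [store_record(r) for r in records]


def process_in_parallel(records):
    # One worker per record, similar in spirit to the parallel for-each branches.
    with ThreadPoolExecutor(max_workers=max(1, len(records))) as pool:
        return list(pool.map(store_record, records))


def timed(func, records):
    """Run func(records) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(records)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    records = list(range(10))
    _, sequential_s = timed(process_sequentially, records)
    _, parallel_s = timed(process_in_parallel, records)
    print(f"sequential: {sequential_s * 1000:.0f} ms")
    print(f"parallel:   {parallel_s * 1000:.0f} ms")
```

Note that `pool.map` returns results in input order, so the collected responses line up with the records even though the calls overlap in time.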
As a last note, I can recommend creating a proxy service using a normal for-each after you’ve created your first proper Split-Join, to see exactly what the differences are.
You can download the project here, import it in Eclipse and play around with it. For ease of use the SoapUI mockservice has been replaced with a storeRecord proxy service that echoes a standard response.
There is now a follow-up on this post about dealing with large amounts of messages that can be read here.
5 thoughts on “Using Split-Joins in OSB Services for parallel processing of messages.”
I am having a problem understanding one of the steps above.
When you used the $Record variable inside the copy node: please help me understand whether this Record variable can be of any type, or whether $Record should have the type of request.parameters.
eg if i have an input xml
here i am trying to form the loops in copy as
What should the declared type of the $Record variable be here? Is it an arbitrary variable, or does it have to point to something like below:
Please help me understand this, as I have been stuck for a week.
Can we do a similar implementation in OSB 10g?
Hello Rutger, I wonder if this approach could also be used with retrieval.
Let’s say I have a hardware store with a lot of different types of parts (millions).
The service request contains the article IDs of the requested parts.
I have a cluster of managed nodes and an OSB. I would like to spread the retrieval load over the cluster, but I also need to join the end result.
Could I even choose a retrieval option based on the total number of requested items? When it is smaller than 250, do a single retrieval; otherwise split into requests of 250 items.
Rutger, I believe it is worth mentioning that, while the basic parallel service implementation shown in your post is a good start, implementing a reliable service requires some extra work.
For instance, any of the insertRecord calls may fail. How to handle it? Ignore? Fail the whole parallel execution immediately? Continue but report the failed requests to the caller?
Or batching: your set of records could become enormously large. How to cap the number of records inserted in parallel?
Split-Join, like any concurrent programming, can become tricky and introduce hard-to-troubleshoot issues if not handled with care.
Hi Vlad, you raise a good point in your reply. Obviously this blog post only takes you through the basic steps. A complete implementation would require proper error handling, as well as a cap on the number of calls that can be made concurrently. This is indeed not something to be ignored.
However, including all this would turn the blog post into a small book, and it is merely meant to get people started. I will take your feedback and work on a more in-depth follow-up to this blog post.