Performance in service-oriented environments is often an issue, usually caused by a combination of infrastructure, configuration and service efficiency. In this blog article I provide several suggestions to improve performance by applying patterns in service implementations. The patterns are described in general terms, since implementations can differ across specific use cases, and I add some points to consider when implementing them. The patterns are technology-independent, although the technology you use does of course determine the implementation options you have. This blog article was inspired by a session at AMIS by Lucas Jellema and flavored with personal experience.
Patterns
Asynchronous services
Suppose a synchronous call is made and the system takes a while to process the information. In the meantime the end-user is waiting for the processing to complete, while he might not (immediately) be interested in the response. Why not make the process asynchronous?
Making a process asynchronous has some drawbacks. The result of processing the request will not be available immediately in the front-end and back-end, so you cannot use this information yet, and often you do not know when (and if) it will become available. If something goes wrong during processing, who will be informed so measures can be taken? And (how) does the back-end inform the front-end when it is done? You can think of server push mechanisms here.
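As an illustration, here is a minimal Java sketch of an asynchronous service front: the caller immediately receives a request id and can poll for the result later. The names (submit, pollResult) and the in-memory status store are illustrative assumptions, not a prescribed implementation; in a real landscape this would typically be a queue plus a persistent status table or a push channel.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncServiceSketch {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Map<String, String> results = new ConcurrentHashMap<>();

    // Accept the request, hand it to a background worker, return at once.
    public String submit(String payload) {
        String requestId = UUID.randomUUID().toString();
        worker.submit(() -> {
            String result = process(payload);   // the slow part runs in the background
            results.put(requestId, result);     // back-end "informs" via a status store
        });
        return requestId;                       // caller is not kept waiting
    }

    // The front-end can poll (or be pushed to) with this id.
    public String pollResult(String requestId) {
        return results.get(requestId);          // null means "not done yet"
    }

    private String process(String payload) {
        try { Thread.sleep(2000); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return "processed:" + payload;
    }

    public static void main(String[] args) throws Exception {
        AsyncServiceSketch service = new AsyncServiceSketch();
        String id = service.submit("order-42"); // returns immediately
        System.out.println("Submitted, got id " + id);
        Thread.sleep(2500);
        System.out.println("Result: " + service.pollResult(id));
        service.worker.shutdown();
    }
}
```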
Claim-check
This is of course a famous pattern. The claim-check pattern is often used for large objects, such as large binary files, which you do not want to pull through your entire middleware layer. The data is labelled and saved somewhere, and the middleware only handles a reference to it. This reference can be sent to the place where it is needed, and the data can be fetched and processed there.
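A minimal sketch of the idea, with an in-memory map standing in for the real store (in practice a database or file share); the names checkIn/checkOut are illustrative:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class ClaimCheckSketch {
    private final Map<String, byte[]> store = new ConcurrentHashMap<>();

    // Check the large object in; return a small reference to route onwards.
    public String checkIn(byte[] largePayload) {
        String claimCheck = UUID.randomUUID().toString();
        store.put(claimCheck, largePayload);
        return claimCheck;
    }

    // At the destination, redeem the ticket for the actual data.
    public byte[] checkOut(String claimCheck) {
        return store.remove(claimCheck);
    }

    public static void main(String[] args) {
        ClaimCheckSketch sketch = new ClaimCheckSketch();
        byte[] bigFile = new byte[10 * 1024 * 1024];   // pretend this is a large binary
        String ticket = sketch.checkIn(bigFile);       // only this string crosses the bus
        System.out.println("Routing reference: " + ticket);
        byte[] retrieved = sketch.checkOut(ticket);    // fetched where it is processed
        System.out.println("Retrieved " + retrieved.length + " bytes");
    }
}
```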
Set processing
Service calls are expensive since they often traverse several layers of hardware and software. Suppose I need to fetch data on a lot of persons and I have a service that fetches person information. I can call this service for every individual person. Every call can mean a Service Bus instance, a SOA composite instance, a SOA component instance, a database adapter instance, a database connection and the fetching of a single item all the way back (not even mentioning hardware and software load-balancers). Every instance and connection (e.g. HTTP, database) takes time. If you can minimize the number of instances and connections, you can obviously gain a lot of performance. Doing this is easier than it might seem: just fetch more than one person in a single request.
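A small sketch of the difference, assuming a hypothetical PersonClient interface: instead of one call (and one full instance chain) per person, a single set-based call returns all of them.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SetProcessingSketch {

    interface PersonClient {
        Map<Integer, String> fetchPersons(List<Integer> ids); // one call, many results
    }

    public static void main(String[] args) {
        // Stub back-end: maps every id to a name.
        PersonClient client = ids -> ids.stream()
                .collect(Collectors.toMap(id -> id, id -> "Person " + id));

        List<Integer> ids = List.of(1, 2, 3, 4, 5);

        // Anti-pattern: one service instance + connection per person.
        // for (Integer id : ids) { client.fetchPersons(List.of(id)); }

        // Set processing: a single request fetches the whole set.
        Map<Integer, String> persons = client.fetchPersons(ids);
        persons.forEach((id, name) -> System.out.println(id + " -> " + name));
    }
}
```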
Caching
If fetching certain pieces of information takes a lot of time, it might be worthwhile not to fetch it from the source every time you need it, but to use a cache. Of course you need to think about (among other things) how up to date the data in the cache needs to be, how and when you are going to update it, and cache consistency. Also, you only want to put data in the cache that is likely to be retrieved again in the near future; this might require predictive analysis. You can preload a cache or add data the moment it is fetched for the first time.
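As a sketch, here is a minimal read-through cache in Java that adds data the moment it is fetched for the first time; eviction, TTL and consistency handling are deliberately left out, and those are exactly the points mentioned above that a real cache needs to address.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class CacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> slowSource;

    public CacheSketch(Function<String, String> slowSource) {
        this.slowSource = slowSource;
    }

    public String get(String key) {
        // computeIfAbsent only hits the slow source on a cache miss.
        return cache.computeIfAbsent(key, slowSource);
    }

    public static void main(String[] args) {
        CacheSketch cache = new CacheSketch(key -> {
            System.out.println("Fetching " + key + " from the source...");
            return "value-for-" + key;
        });
        System.out.println(cache.get("customer-1")); // miss: goes to the source
        System.out.println(cache.get("customer-1")); // hit: served from the cache
    }
}
```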
Caching can be done at different layers: in the front-end, at the service layer or at the database layer. You can even cache service requests in a proxy server.
Often data from different sources needs to be integrated. This can be done efficiently in a database, in which case you can consider implementing an Operational Data Store. This is also a nice place to do some caching.
Parallel processing
If you are processing data serially and you have cores/threads/connections to spare, you might consider running certain processing steps in parallel. There is often an optimum in performance when increasing the number of parallel threads: at first processing time decreases, but after the optimum has been reached, processing time increases again. You should do some measurements to determine this optimum.
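A simple way to do those measurements is to run the same workload with different pool sizes and compare wall-clock times. The sketch below does exactly that, with dummy tasks standing in for the real processing step.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelOptimumSketch {

    static void runWorkload(int threads, List<Callable<Void>> tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        pool.invokeAll(tasks);                       // blocks until all tasks finish
        pool.shutdown();
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(threads + " thread(s): " + millis + " ms");
    }

    public static void main(String[] args) throws Exception {
        // 32 dummy tasks of ~50 ms each stand in for the real processing step.
        List<Callable<Void>> tasks = IntStream.range(0, 32)
                .mapToObj(i -> (Callable<Void>) () -> { Thread.sleep(50); return null; })
                .collect(Collectors.toList());

        // Measure each configuration; somewhere in this range lies the optimum.
        for (int threads : new int[] {1, 2, 4, 8, 16}) {
            runWorkload(threads, tasks);
        }
    }
}
```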
If a system has to interface with another system that does not support concurrency, you can use throttling mechanisms (e.g. pick up a message from the queue every few seconds) to spread the load and avoid parallel requests. Do not forget the effects of having a clustered environment here: each node may be polling the queue, multiplying the effective rate.
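As a minimal single-node sketch: messages queue up freely, but the non-concurrent back-end is handed at most one message every few seconds.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ThrottleSketch {
    public static void main(String[] args) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 1; i <= 5; i++) queue.add("message-" + i); // burst of requests

        ScheduledExecutorService poller = Executors.newSingleThreadScheduledExecutor();
        // One message every 2 seconds: never more than one back-end call at a time.
        poller.scheduleAtFixedRate(() -> {
            String message = queue.poll();
            if (message != null) {
                System.out.println("Sending to back-end: " + message);
            }
        }, 0, 2, TimeUnit.SECONDS);

        Thread.sleep(12_000);                        // let the queue drain for a while
        poller.shutdown();
    }
}
```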
If a service is sometimes unavailable, a retry might solve the issue. If however the system is unstable due to high load, a retry might increase the load further and make the system even more unstable, causing more errors. This can cause a snowball effect, since the errors lead to retries which further increase the load on the system. You can avoid this by throttling a queue near the back-end system causing the issue and putting the normal requests and the retry requests on that same queue. This probably requires custom fault-handling though.
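A sketch of that idea, with a hypothetical callBackend helper: a failed message is put back on the same throttled queue, so retries obey the same rate limit as fresh requests instead of piling extra load onto a struggling back-end.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class RetryOnQueueSketch {
    public static void main(String[] args) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("request-1");
        queue.add("request-2");

        int processed = 0;
        while (processed < 2) {
            String message = queue.take();
            try {
                callBackend(message);                // hypothetical back-end call
                processed++;
            } catch (Exception e) {
                System.out.println(message + " failed, re-queueing");
                queue.add(message);                  // retry shares the normal throttle
            }
            Thread.sleep(1000);                      // throttle: one attempt per second
        }
    }

    private static int attempts = 0;

    private static void callBackend(String message) throws Exception {
        if (attempts++ == 0) throw new Exception("back-end overloaded"); // fail first call
        System.out.println("Processed " + message);
    }
}
```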
Quality of service
You might want to prevent your background processing (e.g. a batch) from interfering with your front-end requests. There are several ways to help achieve this. First, you can use queues with priority messages: batch requests get a lower priority than front-end requests. You can also split the hardware and have separate servers do the batch processing while other servers serve the front-end. Whether this is a viable option depends on the effort required to create new servers to run the batch on.
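A minimal sketch of the priority-queue option: a front-end request that arrives later is still served before the earlier batch items. The Message class and priority values are illustrative.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

public class PrioritySketch {

    static class Message {
        final String body;
        final int priority;                          // lower number = more urgent
        Message(String body, int priority) { this.body = body; this.priority = priority; }
    }

    public static void main(String[] args) {
        PriorityBlockingQueue<Message> queue = new PriorityBlockingQueue<>(
                16, Comparator.comparingInt((Message m) -> m.priority));

        queue.add(new Message("batch item 1", 5));      // low-priority batch work
        queue.add(new Message("front-end request", 1)); // added later, served first
        queue.add(new Message("batch item 2", 5));

        while (!queue.isEmpty()) {
            System.out.println("Processing: " + queue.poll().body);
        }
    }
}
```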
Service granularity and layering
Usually services in a SOA landscape are layered (e.g. data services, integration services, business services, presentation services, utility services, …). A service call usually has overhead (e.g. different hardware layers are crossed, instances are created). Layering increases this overhead because it increases the number of service calls being made. If, on the other hand, you have only a few services which each provide a lot of functionality, you suffer in terms of re-use and flexibility. You should therefore take the number of service calls required for a specific piece of functionality into account when thinking about your service granularity and layering. When performance is important, it might help to create a single service that provides the functionality of a couple of services, in order to reduce the service instance creation and communication overhead. Also think about where you join your data; a database is quite efficient at this kind of thing. You might consider the previously mentioned Operational Data Store and have your services fetch data from there.
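As a sketch of the coarse-grained option: a facade service makes the fine-grained calls once, server-side and close to the data, so the caller pays for only one round trip. All service names here are illustrative.

```java
public class FacadeSketch {

    interface CustomerService { String getCustomer(int id); }
    interface OrderService   { String getOrders(int customerId); }

    // The facade bundles the fine-grained calls behind one operation.
    static class CustomerOverviewService {
        private final CustomerService customers;
        private final OrderService orders;

        CustomerOverviewService(CustomerService customers, OrderService orders) {
            this.customers = customers;
            this.orders = orders;
        }

        // One network round trip for the caller instead of two.
        String getOverview(int customerId) {
            return customers.getCustomer(customerId) + " | " + orders.getOrders(customerId);
        }
    }

    public static void main(String[] args) {
        CustomerOverviewService facade = new CustomerOverviewService(
                id -> "Customer " + id,
                id -> "2 open orders");
        System.out.println(facade.getOverview(42));
    }
}
```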
Summary
If you want to keep performance in mind when creating services, there are several patterns that can help you. Most are based on:
- Reduce superfluous data fetching. Think about efficient caching (and cache maintenance). Think about implementing a claim-check pattern.
- Reduce the number of service calls. Think about service layering, granularity and fetching of data sets instead of single pieces of data.
- When will the data be needed? Think about asynchronous processing (and how you know when it’s done).
- Optimally use your resources. Think about parallel processing and efficient load-balancing.
- Keep the user experience in mind. Think about priority of message-processing (batches often have lower priority).
Of course this article does not give you a complete list of patterns. They are however the patterns I suggest you at least consider when performance becomes an issue, next to technical optimizations at the infrastructure, configuration and service level.
Another pattern is to spread the load over a longer period.
For example, when you mail all your customers and there are a lot of them, and they can or have to do something on your site (e.g. a personalized offer), you can mail them all at once or divide them into several groups and mail group by group (with a certain time, a week or a month, in between).
Another example of spreading the load over a longer period is, when an (uploaded) file has to be processed, to read and process it chunk by chunk, as described in my blog: https://technology.amis.nl/2015/11/27/processing-large-xml-files-in-the-soa-suite/
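A minimal sketch of the group-by-group mailing idea, with a hypothetical sendMail helper; the pause is seconds here, while in practice it would be days, a week or a month:

```java
import java.util.List;

public class StaggeredMailingSketch {
    public static void main(String[] args) throws Exception {
        List<String> customers = List.of("a@x.nl", "b@x.nl", "c@x.nl", "d@x.nl", "e@x.nl");
        int groupSize = 2;

        for (int from = 0; from < customers.size(); from += groupSize) {
            int to = Math.min(from + groupSize, customers.size());
            for (String customer : customers.subList(from, to)) {
                sendMail(customer);                  // this group hits the site soon after
            }
            if (to < customers.size()) {
                System.out.println("Waiting before the next group...");
                Thread.sleep(5_000);                 // in reality: days, a week or a month
            }
        }
    }

    private static void sendMail(String customer) {
        System.out.println("Mailing " + customer);
    }
}
```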
Hi Maarten,
Nice summary. I only miss the most effective one… NOT doing something! :-))
It might sound like a joke, but sometimes it’s seriously something to consider.
There are also several ‘levels’ of not-doing-something:
– Don’t run debug/test functionality in the production (and acceptance) environment.
Sometimes people forget to change certain settings. E.g. we recently had a load test on the acceptance environment. This environment is used to do the user acceptance tests based on functionality, so the log level was set to debug (more specifically, the Audit level in the SOA environment was set to Development mode). Doing a load test doesn’t make sense when these settings are not changed.
– Skip (optional) functionality when it takes a long time (so the end user has to wait) or puts a big load on the system. You might consider switching it on/off depending on the current load or on certain timestamps (when load is high).
Another approach is to do it later and inform the user about it. E.g. confirm to the user that the file has been uploaded successfully and will be processed later on. (Don’t forget to inform him how he can check whether the processing has been successful! By email, a check on the website, …)
– Don’t even build the functionality at all! Sometimes you should consider whether the functionality is worth building. This is more a business question, but sometimes IT has to indicate that certain functionality is very costly to build for the value it adds. Sometimes, especially when a system is rebuilt, some business functionality is not used any more or not needed any more (they do it because it has always been done).
E.g. a while ago I worked on a project where a financial ERP system was completely replaced. Instead of changing all the connecting systems, they chose not to change them but to change (convert) the messages to the new format on the fly. One of the connecting systems was a meeting room reservation system which did not generate more than one thousand euro per month. After indicating this to the business (the costs of building an adapter for a small system are about the same as for a large system), this functionality (the adapter) was skipped.