Saturday 22 November 2014

SOA11 and Coherence integration case study

On one of the projects I worked on, the customer was facing issues caused by the large number of web service calls triggered by their portal. Basically, the Service Bus was crashing because it was reaching the limit of web service transactions it could support.

The initial architecture



Users logged in to the WebCenter portal triggered calls through the ADF framework to the OSB; those calls were validated, enriched, and then routed to the target third-party services. All those operations were synchronous.

The analysis

We decided to take a deeper look to understand which services were being called, how many times, and how frequently.
The tool we used was the statistics feature of the OSB. We enabled it on all the web services, ran the portal, and exercised the functionality that called the WS under analysis.

After 10 minutes we got the following result, where we noted 2 main things:

  1. Some WS operations, like GetTask and GetCurrentState, were called very frequently (one call every 20 ms for each logged-in user), while all the others were called only a few times.
  2. The WS calls retrieved information only for the specific WC Portal user who was triggering those calls.
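
To put the first observation in numbers, here is a minimal sketch of the arithmetic. The 20 ms interval comes from the statistics above; the user count and the number of polled operations in the usage note are purely illustrative assumptions, not figures from the project.

```java
/** Back-of-the-envelope load estimate; class and method names are hypothetical. */
public class CallRateEstimate {

    /**
     * Calls per second hitting the OSB when every logged-in user triggers
     * each polled operation once per interval.
     */
    static double callsPerSecond(int loggedInUsers, int polledOperations, long intervalMillis) {
        return loggedInUsers * polledOperations * 1000.0 / intervalMillis;
    }
}
```

One call every 20 ms is 50 calls per second for a single user and a single operation. Assuming, for illustration only, 100 logged-in users and 2 polled operations, `callsPerSecond(100, 2, 20)` gives 10,000 synchronous calls per second.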




The Solution

We spoke with the developers of the third-party WS and proposed moving to a "bundle" approach. Basically, we asked them to expose two more operations, GetBundleTask and GetBundleState, which behaved like GetTask and GetCurrentState respectively, but took a list of user IDs/task IDs as input.
We configured a new Coherence server, with the idea of using an asynchronous approach and caching the data. The architecture now looks like this:


Benefits:
  • The WebCenter/ADF layer did not need any modification, since we did not touch the OSB WS interfaces.
  • The SOA async process fetched the data in a bundle for all the logged-in users in one call (or very few calls), with much better results.
  • The services exposed on the OSB got the data from the local cache 99% of the time, so the network latency was now almost zero and the average response time was 30 times lower.


Here are the new statistics after applying those changes. The number of calls from the OSB to the GetTask and GetCurrentState external web services dropped by 99%; nearly all the data was now coming from the cache.



Here the async BPEL service faced the same latency when calling the bundle services as the OSB had with the single-user services, but at least this time we were calling them with a list of users.

Here are some sequence diagrams that explain the new sequence of operations. The OSB service now tries to get the data from Coherence; if the data is not found, it gets it from the old external services and "subscribes" the user for updates.

The SOA business service called the external bundle services based on the list of "subscribed" users.
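
The two flows described above can be sketched in plain Java. This is only an illustration of the pattern, not the project's actual OSB or BPEL configuration: the class, method, and field names are hypothetical, a `ConcurrentHashMap` stands in for the Coherence cache (a Coherence `NamedCache` is also a `java.util.Map`), and the two `Function` fields stand in for the external single-user and bundle services.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Sketch of the cache lookup with subscription, plus the bundle refresh. */
public class TaskCacheSketch {
    // Stand-in for the Coherence cache, keyed by user ID.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Users "subscribed" for refresh by the asynchronous process.
    private final Set<String> subscribedUsers = ConcurrentHashMap.newKeySet();
    // Stand-in for the old single-user external service (e.g. GetTask).
    private final Function<String, String> singleService;
    // Stand-in for the bundle external service (e.g. GetBundleTask);
    // assumed to return a map of non-null values keyed by user ID.
    private final Function<List<String>, Map<String, String>> bundleService;

    public TaskCacheSketch(Function<String, String> singleService,
                           Function<List<String>, Map<String, String>> bundleService) {
        this.singleService = singleService;
        this.bundleService = bundleService;
    }

    /** OSB path: read from the cache; on a miss, call the old service and subscribe the user. */
    public String getTask(String userId) {
        String cached = cache.get(userId);
        if (cached != null) {
            return cached; // served locally, no external call
        }
        String fresh = singleService.apply(userId);
        if (fresh != null) {
            cache.put(userId, fresh);
        }
        subscribedUsers.add(userId); // later refreshes arrive via the bundle call
        return fresh;
    }

    /** SOA async path: one bundle call refreshes the cache for every subscribed user. */
    public void refreshSubscribedUsers() {
        if (subscribedUsers.isEmpty()) {
            return; // nothing to refresh, skip the external call
        }
        List<String> users = new ArrayList<>(subscribedUsers);
        cache.putAll(bundleService.apply(users));
    }
}
```

In this sketch only the first `getTask` call for a user reaches the external service; subsequent calls are answered from the map, and `refreshSubscribedUsers` would be run periodically by the asynchronous process. Expiry of cache entries and unsubscribing users who log out are left out.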




