Improving the Efficiency of SOA-Based Applications

Using an Application Grid with large XML documents to build SOA applications that scale linearly and predictably

SOA and the Application Grid
A next-generation SOA platform built on an application grid provides what you would expect from a service infrastructure: service-level abstraction, mediation in the form of data transformation and routing, multi-protocol support, adapters, and so on. It combines these with application grid capabilities to enable in-memory caching of service request payloads and shared service state data, service result caching, and event-driven architectures (EDA).

How can we implement patterns that realize these benefits today? In a typical SOA scenario, multiple services in a process flow may interact with the same data. Without the grid, each service must be given the data it needs every time it is invoked. With the grid, we implement a variation of the "Claim Check" pattern [2], made popular by the book Enterprise Integration Patterns [3] by Gregor Hohpe and Bobby Woolf, together with the "State Repository" [5] and "Service Grid" [4] patterns from SOA Design Patterns [6] by Thomas Erl, et al.

The variation on this pattern is that, rather than using a database to store message payload data, we use the application grid to hold that information in memory, and each service is simply passed a key or a list of keys to the data on which it will operate. The means of passing the key from one service to the next will vary depending on the ESB, process engine, and transport, but the key will typically be carried in the service request as a protocol-specific header property or as an agreed-upon element of the (much smaller) XML payload. The services become "grid-aware": they can access data as necessary and call for aggregate operations on the data set. Once processed, the data set can remain in memory for extremely fast read operations, or it can be persisted to a database using asynchronous write-behind, often in subsets and in formats suitable for longer-term relational data storage (see Figure 5).
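The key-passing idea described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not one of the article's listings: a ConcurrentHashMap stands in for the application grid, and the names ClaimCheckSketch, checkIn, enrichService, and lengthService are hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Claim Check sketch: the payload lives in the "grid"; services exchange only a key. */
final class ClaimCheckSketch {

    // Assumption: a local map stands in for a distributed grid cache.
    static final Map<String, String> GRID = new ConcurrentHashMap<>();

    /** First step in the flow: store the payload once and return the claim check (the key). */
    static String checkIn(String payload) {
        String key = UUID.randomUUID().toString();
        GRID.put(key, payload);
        return key;
    }

    /** A downstream service: receives only the key and updates the shared data in place. */
    static void enrichService(String key) {
        GRID.computeIfPresent(key, (k, payload) -> payload + " [enriched]");
    }

    /** Another downstream service: reads the current state by key. */
    static int lengthService(String key) {
        return GRID.get(key).length();
    }

    public static void main(String[] args) {
        String key = checkIn("order payload");
        enrichService(key);
        System.out.println(key + " -> " + GRID.get(key));
    }
}
```

Only the key crosses each service boundary here; in a real flow it would travel in a header property or a small XML element, as described above.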

Figure 5: SOA and an Application Grid provide in-memory access to service state data, minimizing boundary costs using the "Claim Check" pattern

The XML Grid Example - Describing the Scenario
Using the grid to store service request payload data, and so implement a Claim Check pattern for a multistep business process, is useful background for this article. The main focus of our example, however, is how to store and manipulate large XML documents in the grid so that they can be processed using a Claim Check pattern.

The scenario for the example is as follows: A large XML document needs to be processed by a number of services. Rather than having each service de-serialize, parse, manipulate, and re-serialize the entire document, the document will be broken up into smaller parts, converted to Java objects, and stored in the application grid. This operation would be performed once by the first service in the chain, or by a utility service that intercepts the XML document before it reaches the first service. A much smaller XML message - the "claim check" - which contains a key for accessing the data in the application grid, is passed from one service to the next.

By the way, you don't necessarily need multiple services executed serially for this pattern to be useful. The data could be reference data, such as rental car rates or airline flight data and availability, that is populated once a day and operated on all day by various services, or accessed via user queries through portal apps.

Splitting up the XML
We use a StAX parser to stream through the XML document and break it up into its constituent parts. Because an object tree is still materialized for whichever element the parser is told to start at, we intentionally start at the first of the many repeating elements. If the XML has thousands of 'item' nodes in a container node called 'items', we start the parser at 'item' to avoid materializing the entire 'items' tree. Listing 1 shows the key part of this operation.
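As a rough sketch of the streaming step described above (not the article's Listing 1), the following uses the JDK's StAX cursor API to visit each 'item' element in turn without building a tree for the 'items' container. The class name ItemSplitter and the 'id' attribute are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Streams through a document and stops only at the repeating 'item' elements. */
final class ItemSplitter {

    /** Returns the 'id' attribute of every 'item' element, in document order. */
    static List<String> itemIds(String xml) {
        List<String> ids = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // The cursor moves event by event; nothing is retained between events.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    // This is the point where a per-item unmarshal would be triggered.
                    ids.add(reader.getAttributeValue(null, "id"));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(itemIds("<items><item id=\"1\"/><item id=\"2\"/></items>"));
    }
}
```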

We then use JAXB to convert individual XML elements into Java objects. JAXB represents XML as POJOs, which makes serialization easier than with other XML-to-Java technologies. Before running this code, we generated JAXB classes from an XML Schema using the Eclipse XJC plug-in. Listing 2 shows the beginning of the loop through the XML stream, where we create a JAXB object during each iteration.
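To illustrate the element-to-object step without depending on generated classes, the sketch below maps each 'item' to a plain Java object by hand. This is a stand-in for the JAXB unmarshalling the article uses, not a reproduction of Listing 2; the Item class and its 'id', 'name', and 'comment' fields are hypothetical.

```java
import java.io.Serializable;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Converts each 'item' element into a POJO, one element at a time. */
final class ItemMapper {

    /** Hypothetical item type; a JAXB-generated class would take its place. */
    static final class Item implements Serializable {
        String id;
        String name;
        String comment;
    }

    static List<Item> readItems(String xml) {
        List<Item> items = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            Item current = null;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String tag = r.getLocalName();
                    if ("item".equals(tag)) {
                        current = new Item();
                        current.id = r.getAttributeValue(null, "id");
                    } else if (current != null && "name".equals(tag)) {
                        current.name = r.getElementText();      // consumes the end tag too
                    } else if (current != null && "comment".equals(tag)) {
                        current.comment = r.getElementText();
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "item".equals(r.getLocalName())) {
                    items.add(current);                         // one object per 'item'
                    current = null;
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return items;
    }
}
```

Hand-written mapping like this only scales to small schemas; generated binding classes remove the per-field code, which is presumably why the article uses JAXB.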

Once we have a reference to the Item as a JAXB object, we put it onto an application grid. In this case, we are using Oracle Coherence, which can mimic the Java Map APIs for Java and C++, or Dictionary interfaces for .NET apps (see Listing 3).
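Because the grid is addressed through a Map-style API, loading the objects can be sketched against java.util.Map. The example below is an assumption-laden illustration rather than Listing 3: a ConcurrentHashMap stands in for the grid cache, the GridLoader and Item names are hypothetical, and batching with putAll is a design choice added here to reduce the number of round trips a remote cache would incur.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Loads items into a Map-style cache in batches, keyed by item id. */
final class GridLoader {

    /** Hypothetical item type standing in for a generated class. */
    static final class Item {
        final String id;
        final String name;
        Item(String id, String name) { this.id = id; this.name = name; }
    }

    /** Puts all items into the grid and returns the number of batches written. */
    static int load(Map<String, Item> grid, List<Item> items, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Map<String, Item> batch = new HashMap<>();
        int batches = 0;
        for (Item item : items) {
            batch.put(item.id, item);
            if (batch.size() == batchSize) {
                grid.putAll(batch);   // one bulk write instead of many single puts
                batch.clear();
                batches++;
            }
        }
        if (!batch.isEmpty()) {       // write the final, partially filled batch
            grid.putAll(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        Map<String, Item> grid = new ConcurrentHashMap<>(); // stand-in for the grid
        int n = load(grid, List.of(new Item("1", "bolt"), new Item("2", "nut")), 100);
        System.out.println(n + " batch(es), " + grid.size() + " item(s)");
    }
}
```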

Tapping into the Power of the Application Grid
Once the objects are stored in the grid, data can be accessed as in-memory Java, C++, or C# objects. We can take advantage of some advanced capabilities such as parallel query operations against the in-memory object data, continuous query, and parallel processing in the application grid.

In our example we are going to perform a simple query across the data in the grid. The code in Listing 4 shows simple processing logic that updates the comment field of all occurrences of items containing "foo".
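The query-then-update logic described above can be approximated on a local map as follows. This is a sketch under assumptions, not Listing 4: a ConcurrentHashMap and a parallel stream stand in for the grid's distributed query, the match is assumed to be on the item's name, and GridQuery, Item, and commentMatching are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

/** Finds every item whose name contains a substring and rewrites its comment. */
final class GridQuery {

    /** Hypothetical immutable item; an update replaces the whole value. */
    static final class Item {
        final String name;
        final String comment;
        Item(String name, String comment) { this.name = name; this.comment = comment; }
    }

    /** Returns the sorted keys of the items that were updated. */
    static List<String> commentMatching(ConcurrentMap<String, Item> grid,
                                        String needle, String newComment) {
        // Query step: scan the entries in parallel and collect the matching keys.
        List<String> keys = grid.entrySet().parallelStream()
                .filter(e -> e.getValue().name.contains(needle))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
        // Update step: replace each matching value atomically, one key at a time.
        for (String key : keys) {
            grid.computeIfPresent(key, (k, item) -> new Item(item.name, newComment));
        }
        return keys;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Item> grid = new ConcurrentHashMap<>();
        grid.put("1", new Item("foo-widget", ""));
        grid.put("2", new Item("bar-widget", ""));
        System.out.println("updated keys: " + commentMatching(grid, "foo", "reviewed"));
    }
}
```

On a real grid the filter and the update would typically be shipped to the nodes that own the data instead of pulling entries to the caller; that API is not shown here.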

Executing Grid-Based Processing Logic as Events
Using a simple JavaBean listener pattern, the application grid can run Java-based logic as events that get triggered whenever a piece of data is written to the grid, or when it is read. This is very much like a Java-based read/write trigger or stored procedure. Listing 5 shows a grid-based event that simply displays the new value when data in the grid is updated.
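The listener idea can be shown with a small self-contained class. The sketch below is not Listing 5 and does not use the grid's own listener API; ObservableGrid and WriteListener are hypothetical names, and a local map stands in for the grid.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** A Map-like store that notifies registered listeners on every write. */
final class ObservableGrid<K, V> {

    /** Callback invoked after a write; oldValue is null when the key is new. */
    interface WriteListener<K, V> {
        void onWrite(K key, V oldValue, V newValue);
    }

    private final Map<K, V> data = new ConcurrentHashMap<>();
    private final List<WriteListener<K, V>> listeners = new CopyOnWriteArrayList<>();

    void addListener(WriteListener<K, V> listener) {
        listeners.add(listener);
    }

    /** Stores the value, then fires the listeners in registration order. */
    V put(K key, V value) {
        V old = data.put(key, value);
        for (WriteListener<K, V> listener : listeners) {
            listener.onWrite(key, old, value);
        }
        return old;
    }

    V get(K key) {
        return data.get(key);
    }

    public static void main(String[] args) {
        ObservableGrid<String, String> grid = new ObservableGrid<>();
        // Display the new value whenever an entry is written, as the text describes.
        grid.addListener((key, oldValue, newValue) ->
                System.out.println(key + " is now " + newValue));
        grid.put("item-1", "draft");
        grid.put("item-1", "final");
    }
}
```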

Reconstructing the XML
Eventually the business process that is manipulating this XML document comes to an end, and the data in the grid needs to be stored or shared with another application or service. If the data needs to be stored in a relational database, an object-to-relational mapping operation can run in the background in a manner that doesn't interfere with the real-time interaction between the services and the application grid.
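One way to picture background persistence is a write-behind buffer that coalesces updates and hands them to the database in batches. The sketch below is an illustration under assumptions, not the grid's actual mechanism: WriteBehindBuffer is a hypothetical name, the store is any Consumer supplied by the caller, and flush is called explicitly here where a real system would call it from a background timer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Buffers writes in memory and pushes them to a slower store in batches. */
final class WriteBehindBuffer<K, V> {

    private final Map<K, V> pending = new LinkedHashMap<>();
    private final Consumer<Map<K, V>> store;

    /** The store receives one map per flush, for example to run a batched insert. */
    WriteBehindBuffer(Consumer<Map<K, V>> store) {
        this.store = store;
    }

    /** Returns immediately; later writes to the same key overwrite earlier ones. */
    synchronized void write(K key, V value) {
        pending.put(key, value);
    }

    /** Drains the buffer into the store and returns the number of entries written. */
    int flush() {
        Map<K, V> batch;
        synchronized (this) {
            if (pending.isEmpty()) {
                return 0;
            }
            batch = new LinkedHashMap<>(pending);
            pending.clear();
        }
        store.accept(batch); // slow I/O happens outside the lock
        return batch.size();
    }

    public static void main(String[] args) {
        WriteBehindBuffer<String, String> buffer =
                new WriteBehindBuffer<>(batch -> System.out.println("persisting " + batch));
        buffer.write("item-1", "draft");
        buffer.write("item-1", "final"); // coalesced with the previous write
        buffer.flush();
    }
}
```

Callers of write never wait on the database, which is the property the paragraph above relies on.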

As with any implementation of the Claim Check pattern, all services participating in the process need to be aware that they are implementing this pattern in order to realize the benefits. Inevitably the process needs to communicate with a service that cannot be, or has not yet been, converted to use the Claim Check pattern, and therefore a "Transform and Route" step from Figure 5 needs to be put in place to serialize the data back into its on-the-wire XML format. In this case we still incur the boundary cost, but for steps 1, 2, and 3 we are much more efficient.

Listing 6 shows a technique for using the application grid API to find all of the objects by key and using JAXB to create XML from the objects. This is a simple example that uses Java to create the XML by brute force, but one could get more creative and use an XSLT style sheet, with multiple queries to the grid to get the data necessary to populate it. In addition, streaming techniques can once again be used to reconstitute the XML in pieces, to avoid fully materializing an in-memory DOM tree.
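The streaming reconstruction mentioned above can be sketched with the JDK's StAX writer, which emits XML piece by piece without building a DOM. This is not Listing 6 and does not use JAXB; the XmlRebuilder and Item names, the element layout, and the Map standing in for the grid are assumptions.

```java
import java.io.StringWriter;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Rebuilds an 'items' document by looking up each object by key and streaming it out. */
final class XmlRebuilder {

    /** Hypothetical item type standing in for a generated class. */
    static final class Item {
        final String id;
        final String name;
        Item(String id, String name) { this.id = id; this.name = name; }
    }

    static String toXml(Map<String, Item> grid, List<String> keys) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("items");
            for (String key : keys) {
                Item item = grid.get(key);      // one lookup per key
                if (item == null) {
                    continue;                   // skip keys that are no longer present
                }
                w.writeStartElement("item");
                w.writeAttribute("id", item.id);
                w.writeStartElement("name");
                w.writeCharacters(item.name);   // the writer escapes &, <, and >
                w.writeEndElement();
                w.writeEndElement();
            }
            w.writeEndElement();
            w.flush();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Item> grid = Map.of("1", new Item("1", "bolt"));
        System.out.println(toXml(grid, List.of("1")));
    }
}
```

Writing to a StringWriter keeps the sketch short; writing to an output stream instead would keep memory use flat for very large documents.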

In summary, an application grid can be used to dramatically improve the efficiency and scalability of a SOA-based application that deals with large amounts of data. By storing and manipulating service request payloads, whether large and unwieldy or small and many, in a horizontally scalable application grid, we can scale the application with predictable latency. In addition, using the capabilities of the application grid we can execute distributed parallel queries and updates at in-memory access speeds. As the data gets bigger, we meet the demand by simply scaling the grid. This helps us achieve more processing capability than before on more modest hardware, and lets us architect for scalability from the beginning without having to revisit the application design every time we hit the scalability wall.

Complete listings and instructions for running this example can be found here.

References

  1. Moore's Law
  2. Claim Check pattern
  3. "Enterprise Integration Patterns" Gregor Hohpe and Bobby Woolf. ISBN: 0321200683
  4. Service Grid pattern - Dave Chappell
  5. State Repository pattern - Thomas Erl
  6. "SOA Design Patterns" Thomas Erl, et al. ISBN: 013613516

More Stories By Dave Chappell

David Chappell is vice president and chief technologist for SOA at Oracle Corporation, and is driving the vision for Oracle’s SOA on App Grid initiative.

More Stories By Andrew Gregory

Andrew Gregory is currently a Sales Consultant at Oracle Corporation. He has worked in Development, Product Support, Infrastructure, and Sales over 13 years in the industry.
