How to test service-oriented architectures is no idle question. A failure in a SOA system at Heathrow Airport’s $8.6 billion Terminal 5 caused 1.6 million British pounds (about 3.2 million U.S. dollars) of losses in one week. The error? Simply that a filter put in place to ensure that the baggage handler was tested in isolation was never removed, so event messages were never passed on to other, dependent systems.
That error was pure functionality; we haven’t even begun to cover orchestration, security, load or performance. In this article, I cover some of the fundamental issues with testing service-oriented architectures, expand on risks and strategies, and close with a few personal lessons learned.
So, What Are We Testing?
Please allow me to introduce you to Stoic Financial Services, or SFS, as an example. Until recently, SFS was restricted to credit cards, but has decided to grow through acquisition to cover the entire financial services sector. That means that SFS will sell mortgages, insurance, investment and retirement services. When it acquires a new company, SFS will want new customer information, sales leads and account and financial information to flow into its corporate HQ system, while keeping the old system running.
Each proprietary system will be ‘wrapped’ in Web service capability in order to integrate these systems without re-writing them. When a change occurs on any system, the Web service will capture that change and notify the company’s Enterprise Service Bus, or ESB. The ESB is responsible for communicating that change throughout the company.
What this means is that when a new customer purchases an insurance product in Schenectady, New York, the local sales office is notified, the other business units are notified of a new sales lead and the financial system records the transaction. Automatically.
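The fan-out described above can be sketched as a tiny in-memory event bus. This is a hypothetical illustration of the pattern, not SFS’s actual ESB; the `Bus` class, topics and handlers are all invented for the example.

```python
# Minimal sketch of ESB-style fan-out: one published event notifies
# every subscriber registered for that topic. Names are illustrative.

class Bus:
    """A toy in-memory event bus: subscribers register per topic."""
    def __init__(self):
        self.subscribers = {}

    def subscribe(self, topic, handler):
        self.subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, event):
        for handler in self.subscribers.get(topic, []):
            handler(event)

notified = []

bus = Bus()
bus.subscribe("customer.created", lambda e: notified.append(("sales_office", e["region"])))
bus.subscribe("customer.created", lambda e: notified.append(("lead_queue", e["name"])))
bus.subscribe("customer.created", lambda e: notified.append(("finance", e["amount"])))

# One purchase event reaches all three downstream systems automatically.
bus.publish("customer.created",
            {"name": "J. Doe", "region": "Schenectady", "amount": 120.00})
```

The testing problem is exactly this fan-out: one input, several observable outputs, each on a different system.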
But how do you test this capability?
Layers and Layers of Transactions
The new customer example above is a complex example—and that’s by design. Instead of one transaction, we have a half-dozen. Ideally, all of these will track back to use cases in requirements, but, remember, SFS is going to grow through acquisition. That means that the companies it acquires will have legacy systems in place that may predate use cases or any modern programming techniques.
Somehow, somewhere, someone at the acquired company will know how to make maintenance changes, so they will know how to do traditional testing for each system in isolation. Out of those test cases, we will extract cases where we want to interact with other components and our system test plan will come out of that work.
The overall SOA effort should include requirements for how our systems interoperate; each of these becomes a use case for messaging.
We’ll also want to do volume testing, soap opera testing, attribute testing (more about those later) and exploratory testing. Exploratory testing is just what it sounds like: learning about the product as you test, with the results from one action helping guide your strategy. To do that, you’ll need to watch the messages that are sent to different systems, if not simulate them by hand.
Fundamentally, we are talking about testing a messaging system — making sure that the messages that go across the wire for every transaction are correct. If something does go wrong, was the problem at the sender or the receiver? It can be hard to tell. If there is a central “hub,” it may have incoming and outgoing log messages to check. If your company is just integrating commercial, off-the-shelf software, you may be able to test using a tool like Empirix or iTKO’s LISA.
If your technical staff is creating its own Web services, they should be able to create tools to send and monitor messages on the service bus. Once these tools are in place, you can record what messages go across the wire and the results, and use some kind of driver to “play them back”—a form of test automation. With such a test suite in place, adding a new service is relatively easy. Run all the old tests, record and run new tests, and see success. These tools will be especially valuable for attribute testing, which I discuss next.
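One way to picture record-and-playback is as a harness that captures (message, result) pairs from a known-good run and then replays the messages against a later version, flagging any drift. The sketch below is hypothetical; the helper names and the stand-in “service” are invented for illustration.

```python
# Hypothetical record-and-playback harness for bus messages.

def record(service, messages):
    """Run messages through the service; keep the results as a baseline."""
    return [{"msg": m, "result": service(m)} for m in messages]

def playback(service, recording):
    """Replay recorded messages; return (msg, expected, actual) mismatches."""
    failures = []
    for entry in recording:
        actual = service(entry["msg"])
        if actual != entry["result"]:
            failures.append((entry["msg"], entry["result"], actual))
    return failures

# A stand-in "service": echoes a normalized account id.
service_v1 = lambda m: {"account": m["account"].upper()}
baseline = record(service_v1, [{"account": "abc-1"}, {"account": "xyz-9"}])

# A later version should still pass the entire recorded suite.
service_v2 = lambda m: {"account": m["account"].upper()}
assert playback(service_v2, baseline) == []
```

Adding a new service then means recording a few new pairs and replaying everything, old and new, in one run.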
The software may function, but how quickly does it perform each call? And how fast is fast enough? What happens at the end of the month, when the systems run billing in batch? If a customer orders a part, how fast does that order need to appear on a website? Minutes? Hours? We call these “ities” (scalability, security, reliability) attributes; thus the term “attribute testing.”
Security, redundancy, failover, performance, load and volume — we should consider the risks of each, consult with management, and put these tests into our test plan.
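The simplest attribute test is a timing check against an agreed service level. Here is a minimal sketch: the half-second threshold and the stand-in service are assumptions made for the example, not values from any real SLA.

```python
# Timing one call against a service-level threshold using the
# standard-library perf_counter clock. The "service" is a stand-in.

import time

def call_service():
    # Stand-in for a real Web service call.
    time.sleep(0.01)
    return "ok"

def within_sla(fn, max_seconds):
    """Invoke fn; report its result and whether it met the deadline."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    return result, elapsed <= max_seconds

result, fast_enough = within_sla(call_service, max_seconds=0.5)
```

In a real plan the same wrapper would run under load, at month-end batch volumes, and against each agreed threshold in turn.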
Soap Opera Testing
If you’ve gotten this far, you probably realize that comprehensive testing is impossible. What we strive for is the best possible testing in the time we are given, or, as Bill Hetzel put it so well, “The only truly exhaustive testing is when the tester is exhausted.” Our goal is not “perfect software,” but instead software that is fit for use — in other words, responsible testing.
To be effective with the time we have, we want to test the most powerful test scenarios, tests that check many, many different variables at one time; we call these “soap opera tests.” For example, in an insurance system, that might be a man who starts a business with the wife having primary coverage. Then she has a baby and leaves the husband two weeks later, keeping the baby and herself on the policy. Then the man’s long-lost son, who is in college, returns, and the (now ex-)wife agrees to purchase a policy for the son. Two days before the son turns 21 and becomes ineligible for the policy, he has a heart attack. The tests are: does the insurance policy pay out? To whom?
Sounds a bit like a soap opera, doesn’t it? Soap opera tests stress multiple business rules at the same time, making them very powerful. The logic is very simple: If basic testing and the soap opera tests pass, the general case will probably do just fine.
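A soap opera test can be written as one long scripted scenario against a model of the business rules. The toy policy class below is entirely hypothetical — invented coverage rules and an invented age limit — but it shows the shape: many rule-stressing steps in a single test.

```python
# A soap opera scenario against a toy insurance-policy model.
# The rules (age limit, coverage set) are made up for illustration.

class Policy:
    DEPENDENT_AGE_LIMIT = 21

    def __init__(self, holder):
        self.holder = holder
        self.covered = {holder}

    def add_dependent(self, name, age):
        if age >= self.DEPENDENT_AGE_LIMIT:
            raise ValueError("dependent too old for coverage")
        self.covered.add(name)

    def remove(self, name):
        self.covered.discard(name)

    def claim(self, name):
        # Pays out only if the claimant is still covered.
        return name in self.covered

# One scenario, many rules: coverage changes, a departure, an
# age-limit boundary, then a claim.
policy = Policy("wife")
policy.add_dependent("baby", 0)
policy.remove("husband")          # never covered: removal must not fail
policy.add_dependent("son", 20)   # two days before 21: still eligible
assert policy.claim("son") is True
```

Each line exercises a different rule; a failure anywhere in the chain points at an interaction a single-rule test would have missed.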
Live Data Testing
John McConda, a consultant at Moser Consulting, points out that many testing efforts are naive in that they ignore the full end-game scenario. After all, the ultimate test of a service-oriented architecture is “orchestration”: all of the pieces working together under a realistic load, such as a monthly billing cycle. According to McConda, “It seems to me that what you really want is for all these systems to work together; I’ve seen test plans that checked each component but failed to consider if the whole was going to work together with real data.”
The simplest way to do this is to take a copy of production, perhaps with unique identifiers like social security numbers removed, and perform the largest operations the system will have, such as bill batches, claims batches and so on.
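Scrubbing identifiers from a production copy can be as simple as replacing sensitive fields with placeholders before the data reaches the test environment. This is a hedged sketch; the field name “ssn” and the record shape are assumptions for the example.

```python
# Replace sensitive fields in a copy of production data with
# placeholder values, leaving the original records untouched.

import copy

def anonymize(records, sensitive_fields=("ssn",)):
    """Return a deep copy with sensitive fields replaced by placeholders."""
    scrubbed = copy.deepcopy(records)
    for i, rec in enumerate(scrubbed):
        for field in sensitive_fields:
            if field in rec:
                rec[field] = f"TEST-{i:06d}"
    return scrubbed

production = [{"name": "A. Smith", "ssn": "123-45-6789", "balance": 250.0}]
test_data = anonymize(production)
```

The balances and relationships stay realistic, which is the point of live-data testing, while the identifiers that make the data dangerous to hold are gone.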
A Few Lessons Learned
SOA is more than a technique; it is a design philosophy. Another core design philosophy is incremental and evolving delivery, instead of big bang up front. In other words, the easiest way to have a huge, messy SOA is to develop each system independently and then to put off integration until the end.
Instead, allow your SOA project to start with two or three components that need to work together. Integrate them successfully, deploy to production, and then add piecemeal.
Second, be sure to test multiple passes of the software. I remember one project where every transaction stored the account balance on the first pass; the second pass then looked that balance up. I was brought in to diagnose the bugs. Of course, the lookup was failing; it had never been tested.
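A two-pass check can be tiny. In this hypothetical sketch, the first billing run stores a balance and the second must read it back; a test suite that ran only the first pass would never exercise the lookup.

```python
# Two passes over the same account: pass one stores, pass two looks up.
# Testing only pass one would miss a broken lookup entirely.

balances = {}

def run_billing(account, charge):
    """Add a charge to the stored balance, creating it if absent."""
    previous = balances.get(account, 0.0)   # the lookup that was never tested
    balances[account] = previous + charge
    return balances[account]

first = run_billing("ACCT-1", 100.0)
second = run_billing("ACCT-1", 50.0)   # second pass must see the stored 100.0
```

The assertion that matters is on `second`, because that is the first moment the stored state is actually read.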
A common problem with SOA systems is performance involving multiple Web service requests. For example, I worked with a system that needed to create bills, so it asked for a list of all accounts and then called “Create a Bill” for each one. This caused a number of repetitive lookups. We refactored the code so that it called “Create Bills” once and let the client keep that data in memory; this saved the system from thrashing and cut processing time by 80 percent.
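The repeated-lookup problem can be made concrete with a toy billing service: per-account calls each pay a fixed lookup cost, while one batched call pays it once. All names and numbers here are invented for illustration.

```python
# N+1 lookups versus one batched lookup. A counter records how many
# times the expensive rate lookup runs under each approach.

calls = {"lookups": 0}

def lookup_rates():
    calls["lookups"] += 1
    return {"standard": 10.0}

def create_bill(account):          # original: one lookup per account
    rates = lookup_rates()
    return (account, rates["standard"])

def create_bills(accounts):        # refactored: one lookup for the batch
    rates = lookup_rates()
    return [(a, rates["standard"]) for a in accounts]

accounts = ["A1", "A2", "A3", "A4"]

calls["lookups"] = 0
slow = [create_bill(a) for a in accounts]
per_account_lookups = calls["lookups"]

calls["lookups"] = 0
fast = create_bills(accounts)
batched_lookups = calls["lookups"]

assert slow == fast and batched_lookups < per_account_lookups
```

The two approaches produce identical bills; only the number of round trips changes, which is exactly what the refactoring bought.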
David Christiansen, a project management and testing expert, points out that identifying who is responsible for testing which pieces of the SOA can be critical for success, especially during maintenance. For example, if a service provider changes its implementation, you still have to do integration testing, and the integration team may no longer exist.
Putting It All Together
It’s just not possible to have enough time to test every possible input and output in a SOA system. The challenge of testing SOA is to pick the right tests — the ones that will provide the most information about the software in the smallest amount of time.
To test any service-oriented architecture, we need to test all the components in isolation, as well as the common use cases where the systems are interdependent. To have a complete strategy, we will also perform soap opera testing, exploratory testing, and large volume testing of, ideally, live data. To test these, we have to test for both the happy, use-case path and the extreme, negative, destructive path.
If you follow the advice in this article and think critically, you probably won’t end up with a system that is perfect, but you might end up with one that is good enough, one that is fit for use. And as it turns out, “fitness for use” is how the quality guru Dr. Joseph M. Juran defined quality itself.
So go forth, and test!