Thursday, February 19, 2009

Integration Testing Asynchronous Services

Whether you use Mule or another integration framework, writing integration tests for asynchronous services can be a little tricky, as you may start running assertions in your main test thread while messages are still being processed.

Suppose you want to ensure that your content based router sends messages to the expected target services. Your integration test suite will consist in sending valid and invalid messages to the input channel in front of this router and check they end-up in the expected destination services.

But how to be sure that the delivery has happened and it is safe to start asserting values?

The naive solution to this problem consists in:
Thread.sleep(1000);

It is naive because it arbitrarily slows down your test suite without giving any guarantee that the message will be delivered. It may work on your machine but the same suite running on a busy integration server will behave very differently: one second may not be enough a wait if the threads handling the message delivery behind the scene are slower than usual.

A problematic fix to this broken solution consists in:
while (messageNotReceived) Thread.sleep(1000);

This is problematic because if, for any reason, the message does not hit the expected service, this loop will hang the test suite for ever. Of course, you can add a counter to limit the loop and reduce the sleep time to be more reactive, but this amounts to re-inventing the wheel when higher level concurrency primitives are ready-made for that.

Indeed, we can use a countdown latch to hold the main test thread until the message hits the expected service. Here is the code I use for my Mule integration testing:

I leverage the notification mechanism of Mule and the capacity its functional test component has to broadcast such an event when it is hit by a message. Though a Mule specific implementation, the design principle I discuss here is applicable to other situations.

Notice how I pass the triggering action of the test (like sending a message with the MuleClient) as a Runnable to this method. This allows me to prepare a latch and register a listener before calling this Runnable and, then, to wait on the latch until it is lowered. When the execution comes back to the caller method, it can safely assess properties on the received message.

Obviously, the test is designed so its normal execution leads to a message always hitting the right service. Hence, in practice, the wait on the latch is very short, way under the safeguard timeout of 15 seconds.

But what if we want to test that no message will hit a particular service?

This is when things get tricky with asynchronous testing, because there is not systematic approach to testing the absence of an event when it can occur unpredictably in the future.

One strategy I use is to turn the problem around and change the negative assertion into a positive one. For example, by always ensuring a message will go somewhere, even if it is a service that sends messages to oblivion. By doing so, instead of testing that a bad message never hits a good service, I test that a bad message hits the oblivion service.

Do you have other advices to share about asynchronous testing?