Runtime Integration Testing

Another Use for Unit Testing Techniques

Runtime integration testing is important, even in statically-typed languages, because end-users throw software components together in combinations that nobody has tested before. These component integrations can be quite complex, involving several large components and assorted glue code, which leads to a combinatorial explosion in the number of unanticipated edge cases that end-users encounter.

Component Interfaces

When I start using some software component, I look for the usual suspects:

By "interface," I mean the full specification of an API, not a Java Interface. This is closer to the CORBA/COM meaning of the word, but the only place I've seen a really elegant language feature for this is in Standard ML. Go learn enough ML to grok structures; it's fun.

The more sophisticated sample programs could easily be written as well-commented unit tests. Making them unit tests ensures that the examples are kept in sync as the API evolves. Unit tests like these for an external API are also a little different from most unit tests: they're restricted to using only the external parts of the API, so they verify the semantic side of the API. This goes much further than Java Interfaces or C++ abstract base classes in ensuring that an implementation of the interface does what you expect it to.

The big advantage of component interfaces is orthogonality. You have one interface that allows m users to use n implementing components, giving you (m * n) combinations. If you write one set of unit tests against the component API and one that implements the API as a mock component for testing, you come really close to ensuring that those (m * n) combinations all work by testing (m + n) pieces.
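Here is a minimal sketch of those two pieces in Python, assuming a made-up key-value Store interface with put and get. None of these names come from a real library. check_store is the conformance suite that every implementation runs against, and MockStore is the stand-in that every user of the interface can test against.

```python
class DictStore:
    """One of the n implementations of the hypothetical Store interface."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]  # raises KeyError for unknown keys


class BrokenStore(DictStore):
    """A deliberately faulty implementation: put() never overwrites."""

    def put(self, key, value):
        self._data.setdefault(key, value)


def check_store(factory):
    """Conformance suite: uses only put/get and returns a list of failures."""
    failures = []
    store = factory()
    store.put("a", 1)
    if store.get("a") != 1:
        failures.append("get does not return the value stored by put")
    store.put("a", 2)
    if store.get("a") != 2:
        failures.append("put does not overwrite an existing key")
    try:
        store.get("missing")
        failures.append("get of an unknown key does not raise KeyError")
    except KeyError:
        pass
    return failures


class MockStore:
    """Mock implementation for testing users of the interface: records calls."""

    def __init__(self):
        self.calls = []
        self._data = {}

    def put(self, key, value):
        self.calls.append(("put", key, value))
        self._data[key] = value

    def get(self, key):
        self.calls.append(("get", key))
        return self._data[key]


def remember_login(store, user):
    """One of the m users of the interface (hypothetical client code)."""
    store.put("last_login", user)
```

Calling check_store(DictStore) returns an empty list, while check_store(BrokenStore) reports the overwrite failure. On the other side, passing a MockStore to remember_login lets you inspect mock.calls without any real storage behind it.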

Plugin APIs as Component Interfaces

This argument breaks down when m = 1, as it seems to be in the most common component API pattern, plugins. Plugin APIs are typically complicated, application-specific beasts. For example, browser plugins are tied to specific browsers like Firefox or IE. But using these two types of unit tests you can still get an (m * n) to (m + n) efficiency conversion.

That's a strong claim, so let me qualify it: you still need to write (m * n) different adapters, but you can get away with (m + n) unit test suites and mocks. With m = 5 and n = 4, that is 20 adapters but only 9 suites and mocks. Maybe you think that defeats the purpose: my code base size is still ∝ (m * n). But in the real world, kicking the amount of unit test code you need to write out of the multiplicative category makes a big difference.

First, you write m unit test suites, one for each plugin API. Then write n mock component implementations, one for each software component being adapted. Now you can run the right combination to test any plugin adapter you want.

As a thought experiment, consider IDEs and version control systems. Each IDE must have a plugin for working with each major VC system: one for CVS, one for Subversion, and so on. If each IDE comes with unit tests for verifying the correctness of plugins, and each version control system provides a mock test implementation of its APIs, you can have unit tests that pretend to be Eclipse load the Eclipse-Subversion plugin and test it against the mock Subversion implementation.
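The thought experiment might look roughly like this in Python. Everything here is hypothetical: MockVersionControl stands in for what a VC project would ship, ide_plugin_suite for what an IDE project would ship, and IdeVcsPlugin for the adapter that a third party writes. None of it reflects the real Eclipse or Subversion APIs.

```python
class MockVersionControl:
    """Stand-in shipped by the version control project (hypothetical API)."""

    def __init__(self):
        self.commits = []

    def commit(self, paths, message):
        if not message:
            raise ValueError("empty commit message")
        self.commits.append((tuple(paths), message))
        return len(self.commits)  # fake revision number


class IdeVcsPlugin:
    """The adapter under test: maps the IDE's plugin API onto the VC API."""

    def __init__(self, vcs):
        self._vcs = vcs

    def on_save_and_commit(self, filename, message):
        revision = self._vcs.commit([filename], message)
        return "Committed revision %d" % revision


def ide_plugin_suite(plugin):
    """Suite shipped by the IDE project: plays the IDE's role toward a plugin."""
    failures = []
    status = plugin.on_save_and_commit("Main.java", "Fix typo")
    if not isinstance(status, str) or not status:
        failures.append("on_save_and_commit must return a status string")
    return failures


def test_adapter():
    """Pick the right suite and the right mock for this particular adapter."""
    mock = MockVersionControl()
    failures = ide_plugin_suite(IdeVcsPlugin(mock))
    return failures, mock.commits
```

The adapter is the only piece written per combination. The suite and the mock are each written once and reused for every pairing they take part in.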

The mock implementation is needed so that you don't have to create new repositories for testing. Its job is to make sure the Subversion API is being used correctly, so it should also validate incoming data values, such as checking repository URLs for schemes you wouldn't normally test if you were setting up a test repository (think svn+ssh:// URLs).
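A validating mock along these lines could be sketched as follows. MockRepositoryClient and its checkout method are invented for illustration and are not the real Subversion bindings, and the scheme list is my assumption about what a Subversion client commonly accepts.

```python
from urllib.parse import urlsplit

# Assumption: the URL schemes a Subversion client commonly accepts.
KNOWN_SCHEMES = {"file", "http", "https", "svn", "svn+ssh"}


class MockRepositoryClient:
    """Hypothetical mock: validates arguments instead of touching a repository."""

    def __init__(self):
        self.checkouts = []

    def checkout(self, url, target_dir):
        parts = urlsplit(url)
        if parts.scheme not in KNOWN_SCHEMES:
            raise ValueError("unsupported repository URL scheme: %r" % parts.scheme)
        # file:// URLs legitimately have no host; every other scheme needs one.
        if parts.scheme != "file" and not parts.netloc:
            raise ValueError("repository URL has no host: %r" % url)
        if not target_dir:
            raise ValueError("empty target directory")
        # Record the call so a test can assert on what the plugin asked for.
        self.checkouts.append((url, target_dir))
```

A plugin that builds an svn+ssh:// URL correctly gets its call recorded, while one that mangles the scheme or drops the host fails immediately, without anyone having to set up an SSH-accessible test repository.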

Runtime Integration Testing

Since these software components are combined by end-users at run time, that's probably the right time and place to test the various combinations. An application can unit test a plugin at registration time and report any problems with the plugin's behavior to the user, and optionally to the software's maintainers. As a developer, this protects you from the embarrassment of having poorly-written plugins crash your product. And each party has an incentive to provide their part of the testing infrastructure.
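Registration-time testing could be sketched like this. PluginRegistry, check_describe, and the two plugin classes are all hypothetical names; a real application would plug in its own conformance checks and its own way of reporting problems.

```python
class PluginRegistry:
    """Hypothetical registry that runs conformance checks before accepting a plugin."""

    def __init__(self, conformance_checks):
        self._checks = conformance_checks
        self.plugins = {}
        self.rejected = {}  # plugin name -> list of problems to report

    def register(self, name, plugin):
        problems = []
        for check in self._checks:
            try:
                problem = check(plugin)
            except Exception as exc:
                # A plugin that raises during a check is a failed check.
                problem = "%s raised %s: %s" % (
                    check.__name__, type(exc).__name__, exc)
            if problem:
                problems.append(problem)
        if problems:
            self.rejected[name] = problems
            return False
        self.plugins[name] = plugin
        return True


def check_describe(plugin):
    """Return a problem description, or None if the plugin conforms."""
    text = plugin.describe()
    if not isinstance(text, str) or not text:
        return "describe() must return a non-empty string"
    return None


class GoodPlugin:
    def describe(self):
        return "Spell checker"


class CrashingPlugin:
    def describe(self):
        raise RuntimeError("boom")
```

With PluginRegistry([check_describe]), registering a GoodPlugin succeeds, while registering a CrashingPlugin returns False and leaves a message in rejected that can be shown to the user or sent to the maintainers.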

Testing this sort of semantic conformance to an API is very powerful. To protect your users' experience of your program, you can check that the plugin doesn't take too long or allocate too much memory during a simulated interaction. With the tests forked off in their own process, you can blame the plugin for nasty situations like infinite loops, heap exhaustion, and even memory corruption.
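One way to get that isolation is to run the plugin's tests in a child interpreter with a time limit. The sketch below uses Python's standard subprocess module. run_isolated is a hypothetical helper, the source string stands in for whatever test script exercises the plugin, and memory limits are left out because they are platform-specific.

```python
import subprocess
import sys


def run_isolated(source, timeout_seconds=5.0):
    """Run plugin test code in a child interpreter and return a verdict string."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return "timeout"
    if result.returncode != 0:
        # Covers uncaught exceptions, non-zero exits, and death by signal.
        return "crashed"
    return "ok"
```

An infinite loop in the child shows up as "timeout" and a hard crash as "crashed", and in both cases the host application keeps running and knows which plugin to blame.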

December 29, 2007