Shouldn't We Check Only One Integration Point Per Test?

Question

I watched a video in which you demonstrate Integration Tests in practice: https://www.youtube.com/watch?v=RyQnJUWcXFo

In that video you test the SellOneItemController integration with Catalog and Display.

I have a doubt about the right strategy for testing the integration between components. Which is better?

  1. Test ProductFound(), ProductNotFound() and EmptyBarcode() as you showed in that video.
  2. Test FindPriceMustBeCalledWithBarcodeWhenBarcodeIsPresent(), FindPriceMustNotBeCalledWhenBarcodeIsEmpty(), DisplayPriceMustBeCalledWhenProductFound(), DisplayProductNotFoundMessageMustBeCalledWhenProductNotFound(), DisplayScannedEmptyBarcodeMessageMustBeCalledWhenBarcodeIsEmpty()?

Wouldn't it be better to check only one integration per test method?

Discussion

Yes, I generally prefer to check a single bit of desired behavior in each test. I don't, however, like your suggestion in option 2 here, because it assumes that querying the Catalog is a significant goal of the Controller, and I don't see it that way.

Finding Interesting Integration Surfaces

I have chosen to design the Controller in a way that its goal is to display the right things at the right time. (The implementation of the Display will decide precisely how to "display things".) I have chosen to design this with a Display interface, so that the goal of each test becomes for the Controller to invoke the appropriate method on the Display. The integration between Controller and Display, therefore, becomes the integration surface (where components fit together) that interests me. Now I have a choice:

  • Look at all the various inputs to the Controller and decide what actions the Display should be told to perform.
  • Look at all the various actions that the Display could perform and decide in which situations the Controller should tell the Display to perform those actions.

This is only a choice of how to think about the integration. Neither choice "is better" than the other. I have no rule for choosing; I think it is merely related to my mood at the time. In this case, I chose to think from the point of view of the input to the Controller: a barcode matching a price, a barcode not matching a price, and then the special case of an empty barcode, only because this last case might "be weird". (As you will have seen in the rest of the course, this point is questionable, but it leads to some interesting design choices later.) In each case, we decide what to ask the Display to display, and these become the assertions of the respective tests. Since we only care about one side-effect (one thing to display), each test conveniently has one assertion, which happens to respect a useful principle of designing tests (One Assertion Per Test). [1]

This explains how I arrived at the three cases:

  1. Product found
  2. Product not found
  3. Empty barcode
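
As a rough illustration, the three cases might look like the sketch below. This is not the code from the video: the method names (onBarcode, displayPrice and so on), the use of String for prices, the sample barcodes, and the hand-rolled RecordingDisplay are all assumptions made for the example. Each test stubs the Catalog and checks exactly one thing that the Display was asked to do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical collaborator interfaces; names and signatures are assumptions.
interface Catalog {
    String findPrice(String barcode); // null means "barcode not found"
}

interface Display {
    void displayPrice(String price);
    void displayProductNotFoundMessage(String barcode);
    void displayScannedEmptyBarcodeMessage();
}

// Assumed Controller logic, just enough to make the three tests meaningful.
class SellOneItemController {
    private final Catalog catalog;
    private final Display display;

    SellOneItemController(Catalog catalog, Display display) {
        this.catalog = catalog;
        this.display = display;
    }

    void onBarcode(String barcode) {
        if (barcode.isEmpty()) {
            display.displayScannedEmptyBarcodeMessage();
            return;
        }
        String price = catalog.findPrice(barcode);
        if (price == null) {
            display.displayProductNotFoundMessage(barcode);
        } else {
            display.displayPrice(price);
        }
    }
}

// Hand-rolled test double that records every action the Controller requests.
class RecordingDisplay implements Display {
    final List<String> actions = new ArrayList<>();

    public void displayPrice(String price) { actions.add("price " + price); }
    public void displayProductNotFoundMessage(String barcode) { actions.add("not found " + barcode); }
    public void displayScannedEmptyBarcodeMessage() { actions.add("empty barcode"); }
}

class SellOneItemControllerTest {
    // Stubbed query: this Catalog knows only barcode 12345.
    static final Catalog CATALOG = barcode -> "12345".equals(barcode) ? "$7.95" : null;

    static List<String> actionsAfterScanning(String barcode) {
        RecordingDisplay display = new RecordingDisplay();
        new SellOneItemController(CATALOG, display).onBarcode(barcode);
        return display.actions;
    }

    // One assertion per test: the Display received exactly this one action.
    static void expectOnly(String expected, List<String> actual) {
        if (!actual.equals(Collections.singletonList(expected))) {
            throw new AssertionError("expected [" + expected + "] but got " + actual);
        }
    }

    static void productFound() { expectOnly("price $7.95", actionsAfterScanning("12345")); }
    static void productNotFound() { expectOnly("not found 99999", actionsAfterScanning("99999")); }
    static void emptyBarcode() { expectOnly("empty barcode", actionsAfterScanning("")); }
}
```

Notice that no test asks whether findPrice was called. The stubbed Catalog only supplies answers, and the single check in each test looks at the Display.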

You asked about checking the integration between Controller and Catalog. In this case, I don't worry about whether the Controller invokes the "right" methods on the Catalog, because this is not the purpose of the Controller, nor the goal of handling a scanned barcode. I want to focus on checking the goals of a scenario or test, without becoming distracted by the details of every little aspect of the integration between components. (In legacy code, by contrast, we check every little aspect of the integration between components while we reverse-engineer which ones are important.)

What's The Goal?

Why do we handle barcodes? In order to display the price on the Display. What if we can't display a price on the Display? We display some kind of message that explains why we haven't displayed a price. The Catalog plays only a supporting role in this bundle of scenarios. Accordingly, I don't feel the need to directly verify the integration between the Controller and the Catalog; I ask the Controller to use the Catalog only in order to be able to decide what to Display. The goal remains to check how the Controller directs the Display. I could, if I wanted to, implement that searching behavior in a variety of ways (Catalog interface, lambda, embedded SQL rolled up into the Controller itself... not all these designs are equally sensible), but the objective of the Controller would remain to ask the Display to display the "right thing" at the "right time".

Since the Controller/Catalog integration merely supports the Controller/Display integration, I don't check the Controller/Catalog integration, but rather simply specify what the Controller expects from the Catalog so that I can implement the Catalog correctly. This leads to articulating the contract of the Catalog: findPrice turns barcodes into prices; a null price means "barcode not found", and a non-null price means "this is the price of that barcode".
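
One way to write that contract down is as a pair of checks that every Catalog implementation must pass. The sketch below is only an illustration: CatalogContract, InMemoryCatalog, the factory methods, the String price type, and the sample barcodes are hypothetical names and choices, not something taken from the course.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed signature: a null return value means "barcode not found".
interface Catalog {
    String findPrice(String barcode);
}

// The contract, stated once; each implementation supplies its own factories.
abstract class CatalogContract {
    abstract Catalog catalogWith(String barcode, String price);
    abstract Catalog catalogWithout(String barcode);

    // A known barcode maps to its price.
    boolean productFound() {
        return "$7.95".equals(catalogWith("12345", "$7.95").findPrice("12345"));
    }

    // An unknown barcode maps to null.
    boolean productNotFound() {
        return catalogWithout("99999").findPrice("99999") == null;
    }
}

// Hypothetical implementation backed by a map.
class InMemoryCatalog implements Catalog {
    private final Map<String, String> pricesByBarcode;

    InMemoryCatalog(Map<String, String> pricesByBarcode) {
        this.pricesByBarcode = pricesByBarcode;
    }

    public String findPrice(String barcode) {
        return pricesByBarcode.get(barcode); // Map.get returns null when the key is absent
    }
}

// Runs the shared contract against the in-memory implementation.
class InMemoryCatalogContract extends CatalogContract {
    Catalog catalogWith(String barcode, String price) {
        Map<String, String> prices = new HashMap<>();
        prices.put(barcode, price);
        return new InMemoryCatalog(prices);
    }

    Catalog catalogWithout(String barcode) {
        return new InMemoryCatalog(new HashMap<>());
    }
}
```

A database-backed Catalog would extend the same CatalogContract with its own factory methods, so the stubs used in the Controller tests and the real implementation stay in agreement.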

Stub Queries; Expect Actions

In general, queries (like findPrice) tend to help a component decide which actions to perform. The interfaces of queries tend to be less stable than the interfaces of actions, so expecting queries tends to be riskier than expecting actions. The precise interface of actions tends to be more important, whereas the precise interface of queries tends to be less sensitive to changes. Invoking queries multiple times tends to create fewer problems than invoking actions multiple times: the former might be a performance problem, but the latter would be clearly incorrect behavior. All this leads to the rule of thumb: stub queries, but expect actions.
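
The asymmetry between repeated queries and repeated actions can be sketched as follows. Both "controllers" here are hypothetical static methods invented for the illustration, as are DisplaySpy and the reduced one-method Display interface. The stubbed Catalog does not care how often it is asked, while the check on the Display demands exactly one invocation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Reduced, assumed interfaces for the illustration.
interface Catalog {
    String findPrice(String barcode);
}

interface Display {
    void displayPrice(String price);
}

// Records every invocation so that a test can expect the action exactly once.
class DisplaySpy implements Display {
    final List<String> displayedPrices = new ArrayList<>();

    public void displayPrice(String price) {
        displayedPrices.add(price);
    }
}

class StubQueriesExpectActions {
    // Hypothetical Controller that wastefully asks the Catalog twice.
    static void scanWithRedundantQuery(String barcode, Catalog catalog, Display display) {
        catalog.findPrice(barcode); // redundant, but harmless to the outcome
        display.displayPrice(catalog.findPrice(barcode));
    }

    // Hypothetical Controller that mistakenly displays the price twice.
    static void scanWithRepeatedAction(String barcode, Catalog catalog, Display display) {
        String price = catalog.findPrice(barcode);
        display.displayPrice(price);
        display.displayPrice(price); // clearly incorrect behavior
    }

    // The expectation: the Display was told to show this price once and only once.
    static boolean displayedExactlyOnce(String price, DisplaySpy display) {
        return display.displayedPrices.equals(Collections.singletonList(price));
    }
}
```

With a stub such as `barcode -> "$7.95"`, the redundant query still satisfies displayedExactlyOnce, because the stub tolerates any number of calls. The repeated action fails the same check, which is the failure I would want a test to report.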

But Sometimes Expect Queries

I might expect a query if I'm generating a report and the goal of the scenario is to apply the correct filters to the data in that report. In this case, I'd verify (expect) that my Controller asks the CustomerBase for all the customers with orders pending as of today, as opposed to some other date. I might have one test for each interesting different query, checking that the Controller invokes each query with the correct parameters, and then one more test that stubs the query and expects the report module to receive exactly the data set that the query returns. Although less common, sometimes I will expect queries; but unless I have a very good reason to do so, I typically only stub queries and expect actions.
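
A minimal sketch of that report scenario might look like this. Every name in it is an assumption made for illustration: findCustomersWithOrdersPendingAsOf, the Report interface, PendingOrdersReportController, customers represented as plain strings, and the sample date. The first test expects the query, because asking with the correct date is the goal; the second stubs the query and expects the action.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical query interface.
interface CustomerBase {
    List<String> findCustomersWithOrdersPendingAsOf(LocalDate date);
}

// Hypothetical action interface: the report module receives a data set.
interface Report {
    void render(List<String> customers);
}

// Assumed Controller: passes the query result straight to the report module.
class PendingOrdersReportController {
    private final CustomerBase customerBase;
    private final Report report;

    PendingOrdersReportController(CustomerBase customerBase, Report report) {
        this.customerBase = customerBase;
        this.report = report;
    }

    void generate(LocalDate today) {
        report.render(customerBase.findCustomersWithOrdersPendingAsOf(today));
    }
}

class PendingOrdersReportControllerTest {
    // Expect the query: record the dates the Controller asked about.
    static List<LocalDate> datesQueried(LocalDate today) {
        List<LocalDate> dates = new ArrayList<>();
        CustomerBase spy = date -> {
            dates.add(date);
            return new ArrayList<>();
        };
        Report ignored = customers -> { };
        new PendingOrdersReportController(spy, ignored).generate(today);
        return dates;
    }

    // Stub the query, expect the action: record what the report module received.
    static List<List<String>> dataSetsRendered(List<String> queryResult) {
        List<List<String>> rendered = new ArrayList<>();
        CustomerBase stub = date -> queryResult;
        Report spy = rendered::add;
        new PendingOrdersReportController(stub, spy).generate(LocalDate.of(2024, 1, 15));
        return rendered;
    }
}
```

The first test would check that datesQueried(today) contains only today. The second would check that dataSetsRendered(customers) contains exactly the list that the stub returned.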

A Parting Thought About Names

I typically don't name tests with both the special case input and the expected result, as in "DisplayScannedEmptyBarcodeMessageMustBeCalledWhenBarcodeIsEmpty". Instead, I either name the tests according to the special case input ("empty barcode") or according to the desired outcome ("display empty barcode message") and let the test code explain either what happens in each special case or what situation leads to the desired outcome.

If I have several special cases leading to the same desired outcome, then that desired outcome becomes a group of tests (a test case class in JUnit or a "context" group in RSpec) within which I write a test for each special case leading to that outcome. You don't have to remember any special rules here—it's enough to remove duplication in the names of the tests, because the duplication in the names reflects the structure that the tests want to arrange themselves in.

References

Steve Freeman, "Test Smell: Too Many Expectations". This offers a mechanical explanation for when to stub and when to expect method invocations: expect a method when it would be an error for other things to happen or for the thing to happen more than once, but stub when it would be acceptable to invoke the method multiple times (or never!).


  [1] I recommend practising for several months following the principle of one assertion per test to see how it affects your designs. I recommend trying it, even when it doesn't seem "helpful" or "better". Think of it like Object Calisthenics, but for tests: I don't always design this way, but trying to do it helped me understand the value of deeper design principles.
