
Bad tests pass; good tests fail; great tests say why

Ronald Chen June 6th 2022

What's the deal with software developers wanting more tests? Why do others hate tests? Why would testing ever be controversial? What makes a test valuable?


The value of a test is measured by the amount of useful information it conveys.


Bad tests pass

In the extreme, bad tests actively harm: not only do they convey no information, they add noise. An example is a test that tests nothing at all. Sometimes such a test is left over from a refactoring and should be deleted. Sometimes it is deliberately written and added by someone who does not understand the system or testing itself.

Unfortunately, bad tests are common. While most bad tests do convey some information, they make one work for it by:

  • testing the implementation not the API
  • being unclear what is being tested
  • testing more than one thing at a time
  • having overly DRY test code, which makes it difficult to separate scenarios that should diverge
  • being under-abstracted and showing the wrong level of detail (for example, a missing fixture, or code that should be hidden in a Cucumber step)
  • being slow
  • being flaky and intermittently failing
  • using an inappropriate fake, stub, or mock

Good tests fail

Tests should be written to fail. In the extreme, there isn’t even a need for happy path test cases. A set of good tests can fully constrain what the code can do, so that any bug causes exactly one test to fail.

Great tests say why

Let’s look at an example of a bad test and how to refactor it into a set of great tests that convey useful information when they fail.

Our system under test is a re-implementation of padStart. The example is loosely written in TypeScript and Jest.

/**
 * Pads input string to target length by prepending pad string, repeating pad string if necessary.
 * Pad string is truncated if target length has been reached.
 * Input string is returned as-is if already at or exceeds target length.
 * Input string is also returned as-is for nonsensical zero length padString.
 * Pad string defaults to "space" character (U+0020) if not provided.
 */
function padStart(input: string, targetLength: number, padString?: string): string
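
The post gives only the signature. As a reference point for the tests that follow, here is one way the documented contract could be implemented. This is a sketch written for illustration, not the author's implementation; the default parameter value and the loop-based repetition are assumptions.

```typescript
// One possible implementation of the documented contract (illustrative only).
function padStart(input: string, targetLength: number, padString: string = " "): string {
  // Number of characters still needed to reach targetLength.
  const missing = targetLength - input.length;

  // Input is already long enough (this also covers zero and negative
  // targetLength), or there is nothing to pad with.
  if (missing <= 0 || padString.length === 0) {
    return input;
  }

  // Repeat padString until there is at least enough padding...
  let padding = "";
  while (padding.length < missing) {
    padding += padString;
  }

  // ...then cut off any excess so the result is exactly targetLength long.
  return padding.slice(0, missing) + input;
}
```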

Bad test

test ('happy path', () => {
  expect(padStart('foo', 6, 'x') === 'xxxfoo').toBe(true)
});

Problems with this test:

  • useless "happy path" description
  • combined multiple cases into one test
  • missing cases, such as omitting the optional padString or passing a padString longer than a single character
  • poor usage of Jest, which emits a pointless "false is not true" style message when the test fails
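
The last point can be illustrated without running Jest. The two helpers below are hypothetical stand-ins, not Jest APIs, and their message wording is invented for this sketch. They only model the difference between asserting on a precomputed boolean and asserting on the actual value.

```typescript
// Hypothetical stand-ins for two assertion styles (not Jest's real API or output).

// Style 1: the comparison happens before the assertion sees anything,
// so only a boolean is left to report.
function describeBooleanFailure(condition: boolean): string | null {
  return condition ? null : "expected true, received false";
}

// Style 2: the assertion receives both values and can report them.
function describeValueFailure(actual: string, expected: string): string | null {
  return actual === expected
    ? null
    : `expected ${JSON.stringify(expected)}, received ${JSON.stringify(actual)}`;
}

// Pretend a buggy padStart returned one "x" too few.
const buggyResult = "xxfoo";

const booleanMessage = describeBooleanFailure(buggyResult === "xxxfoo");
const valueMessage = describeValueFailure(buggyResult, "xxxfoo");

console.log(booleanMessage); // expected true, received false
console.log(valueMessage);   // expected "xxxfoo", received "xxfoo"
```

In Jest terms, this is the difference between expect(padStart('foo', 6, 'x') === 'xxxfoo').toBe(true) and expect(padStart('foo', 6, 'x')).toBe('xxxfoo'): only the second form hands the matcher the values it needs to explain the failure.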

Great tests

padStart is simple in concept, but is composed of a surprising number of cases. By carefully reading and analyzing the function documentation, we can extract all the different cases to test.

The critical functionality of padStart is determined by the length of the input string compared to targetLength. This is a valuable insight, as we can organize our tests along these lines.

  1. When input is shorter than targetLength, the result will be of targetLength
  2. When input is already at or exceeds targetLength, the input string is returned as-is

Cases for: When input is shorter than targetLength, the result will be of targetLength

The simplest case is where the input string is a single character short of targetLength and padString is a single character. The result should be padString prepended to input.

The next case is when padString needs to be used repeatedly.

What if padString isn't provided? Then space should be prepended.

What if the input is an empty string?

What if padString is longer than 1 character? Then we have multiple cases to consider:

  • What if a single usage of padString reaches targetLength?
  • What if a single usage of padString exceeds targetLength? padString should be truncated
  • What if padString is used repeatedly and reaches targetLength?
  • What if padString is used repeatedly but the last usage exceeds targetLength? The last usage should be truncated
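
These four cases differ only in how many whole copies of padString fit into the gap in front of the input, and how many characters of one final, partial copy are needed. The helper below is a hypothetical illustration of that arithmetic and is not part of padStart itself.

```typescript
// Hypothetical helper: how the gap in front of the input gets filled.
// Assumes padLength > 0.
interface PaddingPlan {
  wholeCopies: number;     // complete copies of padString that fit in the gap
  truncatedLength: number; // characters taken from one extra, truncated copy
}

function planPadding(inputLength: number, targetLength: number, padLength: number): PaddingPlan {
  const gap = Math.max(targetLength - inputLength, 0);
  return {
    wholeCopies: Math.floor(gap / padLength),
    truncatedLength: gap % padLength,
  };
}

// The input is "foo" (length 3) in each case.
console.log(planPadding(3, 6, 3)); // "abc",  target 6: one whole copy, nothing truncated
console.log(planPadding(3, 6, 4)); // "abcd", target 6: no whole copy, 3 characters of a truncated copy
console.log(planPadding(3, 9, 3)); // "abc",  target 9: two whole copies, nothing truncated
console.log(planPadding(3, 6, 2)); // "ab",   target 6: one whole copy plus 1 character of the next
```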

Cases for: When input is already at or exceeds targetLength, the input string is returned as-is

The simplest case is when input is longer than targetLength. The next case is when input is the same length as targetLength.

What if input is an empty string and targetLength is zero?

What if targetLength is a negative number?

One last edge case

What the two previous categories didn't cover is the edge case where padString is a zero-length string. If this case weren't handled, then whenever input is shorter than targetLength it would be impossible to return something of targetLength. We add one final test case.
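
To see why this case needs its own guard, consider a naive padding loop that keeps appending padString until the gap is filled. With an empty padString each append adds nothing, so the loop can never finish. The function below is a hypothetical sketch, and its iteration cap exists purely so the demonstration terminates.

```typescript
// Hypothetical naive loop with a safety cap so the demo cannot hang.
function buildPaddingNaively(gap: number, padString: string, maxIterations: number): string | null {
  let padding = "";
  let iterations = 0;
  while (padding.length < gap) {
    if (iterations === maxIterations) {
      return null; // gave up: the loop was not making progress
    }
    padding += padString;
    iterations += 1;
  }
  return padding.slice(0, gap);
}

console.log(buildPaddingNaively(3, "x", 1000)); // xxx
console.log(buildPaddingNaively(3, "", 1000));  // null
```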

Here are all the tests written out completely.


describe("Results in string of targetLength when input is shorter than targetLength", () => {
  test("Single character padString is prepended to bring input up to targetLength", () => {
    const input = "foo";
    const targetLength = 4;
    const padString = "x";

    const result = padStart(input, targetLength, padString);

    expect(result).toBe("xfoo");
  });

  test("Single character padString is prepended repeatedly to bring input up to targetLength", () => {
    const input = "foo";
    const targetLength = 6;
    const padString = "x";

    const result = padStart(input, targetLength, padString);

    expect(result).toBe("xxxfoo");
  });

  test("padString defaults to space if not provided", () => {
    const input = "foo";
    const targetLength = 6;

    const result = padStart(input, targetLength);

    expect(result).toBe("   foo");
  });

  test("Default padString is repeated targetLength times if input is empty", () => {
    const input = "";
    const targetLength = 6;

    const result = padStart(input, targetLength);

    expect(result).toBe("      ");
  });

  test("padString can be longer than 1 character", () => {
    const input = "foo";
    const targetLength = 6;
    const padString = "abc";

    const result = padStart(input, targetLength, padString);

    expect(result).toBe("abcfoo");
  });

  test("padString longer than 1 character can be used repeatedly", () => {
    const input = "foo";
    const targetLength = 9;
    const padString = "abc";

    const result = padStart(input, targetLength, padString);

    expect(result).toBe("abcabcfoo");
  });

  test("padString is truncated when targetLength has been reached", () => {
    const input = "foo";
    const targetLength = 6;
    const padString = "abcd";

    const result = padStart(input, targetLength, padString);

    expect(result).toBe("abcfoo");
  });

  test("last repeated padString is truncated when targetLength has been reached", () => {
    const input = "foo";
    const targetLength = 6;
    const padString = "ab";

    const result = padStart(input, targetLength, padString);

    expect(result).toBe("abafoo");
  });
});

describe("Results in input string as-is if input length is already at or exceeds targetLength", () => {
  test("input is longer than targetLength", () => {
    const input = "foo";
    const targetLength = 1;

    const result = padStart(input, targetLength);

    expect(result).toBe("foo");
  });

  test("input is already at targetLength", () => {
    const input = "foo";
    const targetLength = 3;

    const result = padStart(input, targetLength);

    expect(result).toBe("foo");
  });

  test("empty input and targetLength of zero results in empty string", () => {
    const input = "";
    const targetLength = 0;

    const result = padStart(input, targetLength);

    expect(result).toBe("");
  });

  test("input is always longer when targetLength is negative", () => {
    const input = "foo";
    const targetLength = -1;

    const result = padStart(input, targetLength);

    expect(result).toBe("foo");
  });
});

test("input string returned as-is if padString is empty string", () => {
  const input = "foo";
  const targetLength = 6;
  const padString = "";

  const result = padStart(input, targetLength, padString);

  expect(result).toBe("foo");
});

This may seem like a lot of test code for such a simple function, but this should be considered normal. A well-maintained system typically has an order of magnitude more test code than actual runtime code.

Notice how each case tests very little. This maximizes the chance that very specific information is conveyed when a bug is introduced. Clearly and narrowly written tests make it a joy when they fail.
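
As a rough demonstration, the sketch below runs a handful of the cases above against a deliberately broken variant that forgets to truncate the padding. Both the buggy variant and the small table-driven runner are hypothetical stand-ins for Jest, written only so the example is self-contained.

```typescript
// Deliberately buggy variant (hypothetical): the padding is never truncated.
function padStartWithoutTruncation(input: string, targetLength: number, padString: string = " "): string {
  const gap = targetLength - input.length;
  if (gap <= 0 || padString.length === 0) {
    return input;
  }
  let padding = "";
  while (padding.length < gap) {
    padding += padString;
  }
  return padding + input; // bug: should be padding.slice(0, gap) + input
}

interface PadCase {
  name: string;
  input: string;
  targetLength: number;
  padString?: string;
  expected: string;
}

// A subset of the test cases from this post, expressed as data.
const padCases: PadCase[] = [
  { name: "Single character padString is prepended repeatedly", input: "foo", targetLength: 6, padString: "x", expected: "xxxfoo" },
  { name: "padString defaults to space if not provided", input: "foo", targetLength: 6, expected: "   foo" },
  { name: "padString is truncated when targetLength has been reached", input: "foo", targetLength: 6, padString: "abcd", expected: "abcfoo" },
  { name: "last repeated padString is truncated when targetLength has been reached", input: "foo", targetLength: 6, padString: "ab", expected: "abafoo" },
  { name: "input is already at targetLength", input: "foo", targetLength: 3, expected: "foo" },
];

// Collect the names of the cases that the buggy variant gets wrong.
const failedCaseNames = padCases
  .filter((c) => padStartWithoutTruncation(c.input, c.targetLength, c.padString) !== c.expected)
  .map((c) => c.name);

console.log(failedCaseNames);
```

Only the two truncation cases fail, and their names already say which rule was broken.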

Do you want to practise writing great tests? You're in luck, Battlefy is hiring.
