Property-Based Testing – Matthaei Dev

Property-based testing is a technique I came into contact with that probably changed the way I code the most. It is a powerful tool to describe the properties (/the behavior) of the units software consists of. It greatly reduces the amount of test code that needs to be written, while increasing the coverage of edge cases at the same time. We will start by finding out that we actually have a problem with existing test methods.

The Problem

Achieving a high or even 100% coverage rate with unit tests is already fairly labor-intensive. But even if all the lines, or the branches of code are covered, there still is the possibility of errors being in there. A quick example:

def pop_last(container):
    last = container[-1]
    container.pop()
    return (container, last)

def test_pop_last():
    assert pop_last([0,1]) == ([0], 1)

This function gets the last element of a container and returns it in a tuple together with all the other elements in a new container. So far so good. The test case passes. We’re done!

Unfortunately there are edge cases here. The most glaring one is the empty list. Passing an empty list raises ‘IndexError: list index out of range’. Also not all containers provide the pop method. This would namely fail for tuples. In fact, all non-mutable containers would not work.

To tackle this problem equivalence class decomposition should be used to find the “classes” of input that would likely cause (the same) failures. This causes a lot of work in two ways: First, there is a lot of cognitive work involved in thinking through the kinds of inputs that could be given to a function (even ignoring that basically values of all types can be passed in Python). Second, one would have to write a lot of test cases with a lot of repetitive code.

This is where property-based testing aims to help.

How It Works

Property-based testing basically shifts the responsibility of finding relevant test data from the programmer to a framework. The programmer only states that the implementation shall have some properties and lets the framework figure out the rest. Typically, the framework needs to be told the types of the input parameters as well. This sounds fairly abstract now, but it isn’t if we look at an example. The hypothesis library is used for this example. ‘hypothesis’ is an implementation of property-based testing in Python.

import hypothesis
from hypothesis import strategies as hs

@hypothesis.given(hs.one_of(
    hs.lists(hs.integers()),
    hs.tuples(hs.integers())))
def test_pop_last(container):
    if len(container) == 0:
        # typically assert an exception here,
        # but skip for this example
        return
    assert len(pop_last(container)[0]) == len(container) - 1
    assert pop_last(container) == (container[:-1], container[-1])

What happens here? We state the properties that we would expect the implementation to have. The returned container should be one shorter than before, and we explicitly state the split we would expect. The test function is decorated with hypothesis, which will provide us with (100) samples drawn from the “strategy”. It is either a tuple or a list.

It will do so by some thought through code from that framework that tries empty lists, long tuples, zeros in the containers, one element lists and tuples, etc. To summarize, it is acting a lot smarter and working much harder than a programmer that is on a tight schedule and just wishing to get it over with. Hypothesis also offers “shrinking” to find the “easiest” example producing a failure and report it.

From this specifically we see that return (container[:-1], container[-1]) is a much better implementation for all indexable containers (as it creates copies). It will pass this much stricter test. Depending on requirements, the empty list might need to be handled specifically (i.e. by returning ([], None)).

Implementations and Further Reading

Frameworks for property-based testing are available for all major (and some minor) programming languages. I personally have experience with hypothesis for Python and rapidcheck for C++. Both of which I found to be excellent. List of implementations for other languages are available.

I would also recommend reading the article of the creator of hypothesis about what property-based testing means to him. But most important is trying it now and seeing whether it helps your development process or not.

The Problem

How It Works

Implementations and Further Reading

By Tilmann Matthaei