A quick intro to Property-Based Testing

2020-12-08

With Elixir, PropCheck and the Fibonacci series

First off, what’s Property-Based Testing? to describe it in a phrase that’s simple, easy to understand and wrong, “it’s a method by which you automatically generate automated tests that check for a given rule”. It is kind of like that, but the most important part of PBT doesn’t lie in the code itself but in the mindset you have to adopt to make full use of it: thinking in terms of properties instead of tests is key, but it’s not something that can be taught in a simple tutorial such as this. So instead, my aim is to just show it a little, and see if you’re interested enough to study it on your own.

The sample project

We’ll do a simple library in Elixir that implements a well-known algorithm: the Fibonacci series.

We start with creating the project:

$ mix new fib

Then we add in mix.exs our dependency, PropCheck:

defp deps do
  [
    {:propcheck, "~> 1.3", only: [:test, :dev]}
  ]
end

And of course, we add our tail-optimized implementation:

def fib(n), do: f(n, 0, 1)

defp f(n, a, b)
defp f(0, a, _b), do: a
defp f(n, a, b), do: f(n - 1, a + b, a)

Now, you’d be tempted to write some traditional tests to make sure this works—the code does look a bit messy, no? so let’s go ahead and add them:

describe "traditional tests" do
  test "validate the initial elements in the series" do
    assert Fib.fib(1) == 1
    assert Fib.fib(2) == 1
    assert Fib.fib(3) == 2
    assert Fib.fib(4) == 3
    assert Fib.fib(5) == 5
    assert Fib.fib(6) == 8
  end
end

A bit wordy, but they validate just fine, and will give us some peace of mind as we move on to testing properties instead.

Our first property

The beauty of mathematical functions is that properties are so easy to find. And what’s the easiest property we can imagine for the Fibonacci series? that for all $n > 0, Fib(n) > 0$. So first we add PropCheck into our code:

use PropCheck, default_opts: [numtests: 100]

The numtest variable already hints at why Property-Based Testing is so cool: with just the coding of a single property, we can get a hundred tests for “free”. So let’s go ahead and do that:

describe "the fibonacci series" do    
  property "is composed only of positive integers for n > 0" do
    forall n <- non_neg_integer() do
      num = Fib.fib(n)

      is_integer(num) and num > 0
    end
  end
end

So we run it with mix test and… oops.

Property Elixir.FibTest.property the fibonacci series is composed only of non-zero integers() failed. Counter-Example is:
     [0]

Of course, we forgot to make sure our n variable wasn’t zero and $Fib(0)=0$, so we make two simple changes: add our own custom generator, and rewrite the property to use it:

defp positive_int, do: such_that num <- non_neg_integer(), when: num > 0

describe "the fibonacci series" do    
  property "is composed only of positive integers for n > 0" do
    forall n <- positive_int() do
      num = Fib.fib(n)

      is_integer(num) and num > 0
    end
  end
end

And…

$ mix test
....................................................................................................
OK: Passed 100 test(s).

Great!

Now, the part about writing our own custom generators might sound scary at first, but as you can see it’s just a bit of metaprogramming magic that makes it easy to extend PropCheck’s own generators and as you can see, it flows quite naturally once you learn the basic syntax.

The model

Now we’ll implement another kind of property: the model. Which is, simply, ensure that our production-grade implementation works the same as a slower, but more obviously correct reference implementation. So let’s try that, first by writing a straight implementation of the definition for the Fibonnaci series as a test helper:

defp model_fib(n) do
  if n <= 2 do
    1
  else
    model_fib(n-1) + model_fib(n-2)
  end
end

And then our property comparing both their results:

property "works the same as the reference implementation for n in (1, 30)" do
  forall n <- integer(1, 30) do
    num = Fib.fib(n)
    ref = model_fib(n)

    num == ref
  end
end

OK, I admit it: I cheated. Not because of any problem with my implementation, but the reference one: it’s extremely slow for integers as small as $n=100$, so I forced it to choose a number between 1 and 30 instead of reusing our generator. And…

$ mix test
.....................................................................................................
OK: Passed 100 test(s).
.....................................................................................................
OK: Passed 100 test(s).
.

Finished in 0.6 seconds
2 properties, 1 test, 0 failures

Sweet!

Refactoring

This model-based Property-Based Testing is particularly useful for TDD, since you can take your initial implementation as the model, then optimize to your heart’s content with a really strong regression suite backing it up; no matter how crazy your idea, your tests ensure your implementation remains consistent, among a wider variety of inputs than with tests written by hand.

It happened while I was writing this article, in fact, when I thought of implementing the Fibonacci series using Streams. Streams are a feature present in many languages (such as Java) where you construct a List-like object using a lazily-evaluated generator function; usually to avoid having the entire list in memory, but they can also be used to implement infinite series like Fibonacci.

So, first we use our tail-recursive implementation, whose correctness we’ve already established, as the new model:

defp model_fib(n), do: f(n, 0, 1)

defp f(n, a, b)
defp f(0, a, _b), do: a
defp f(n, a, b), do: f(n - 1, a + b, a)

This allows us to remove our “cheat” from last section, and test the entirety of positive integers again:

property "works the same as the reference implementation" do
  forall n <- positive_int() do
    num = Fib.fib(n)
    ref = model_fib(n)

    num == ref
  end
end

And finally, our Fibonacci implementation using Streams:

@seed {0, 1}

def stream do
  @seed
  |> Stream.iterate(fn {a, b} -> {a + b, a} end)
  |> Stream.map(fn {a, _b} -> a end)
end

def fib(n) do
  stream()
  |> Enum.at(n)
end

Much like before, we check that our properties are still fulfilled:

$ mix test
.....................................................................................................
OK: Passed 100 test(s).
.....................................................................................................
OK: Passed 100 test(s).
.

Finished in 0.2 seconds
2 properties, 1 test, 0 failures

Even though our Stream implementation is short, I wouldn’t be surprised if some were confused by it; they work quite differently from other enumerable types like List and Map that one may be accustomed to. However, thanks to our property-based test suite, we know it’s not only correct, but highly performant as well.

Conclusion

This is a really quick introduction to Property-Based Testing, but hopefully you can now see why I like it, and maybe even motivate you enough to study it in depth yourself. With only this little code, we already have hundreds of tests checking the validity of our small library, checking for data we likely wouldn’t have thought to check with traditional tests.

To study Property-Based Testing, I recommend Fred Hebert’s excellent book, Property-Based Testing with PropEr, Erlang, and Elixir), also available over at PragProg.

Also, if you want, you can check out the code for the samples used in this post over at my GitHub.