This will be used far more in the randomized tests than in the unit
tests and benchmarks. A randomized test runs the test function multiple
times, checks each result, and optionally starts shrinking the failing
input. Generators will also be able to fail, and such failures produce
some of the new TestResult variants.
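
A minimal sketch of the loop described above, written in Rust purely
for illustration. Every name here is an assumption rather than the
actual API: the TestResult variant names, the Rng helper, and the
run_randomized and shrink functions are all hypothetical, and the
shrinking strategy (halve, then decrement) is just a simple stand-in.

```rust
// Hypothetical sketch: none of these names come from the real code base.

/// Outcome of one randomized test (variant names are assumptions).
#[derive(Debug, PartialEq)]
enum TestResult {
    /// Every generated input passed the test function.
    Passed { runs: u32 },
    /// The test function failed; `input` is the (possibly shrunk) failing input.
    Failed { input: u32, shrink_steps: u32 },
    /// The generator itself could not produce an input.
    GeneratorFailed { reason: String },
}

/// Tiny xorshift PRNG so the sketch needs no external crates.
struct Rng(u64);

impl Rng {
    fn next_u32(&mut self) -> u32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 32) as u32
    }
}

/// Runs `test` on up to `runs` generated inputs and stops at the first failure.
fn run_randomized<G, T>(seed: u64, runs: u32, shrink_enabled: bool, mut generate: G, test: T) -> TestResult
where
    G: FnMut(&mut Rng) -> Result<u32, String>,
    T: Fn(u32) -> bool,
{
    let mut rng = Rng(seed.max(1)); // xorshift state must be non-zero
    for _ in 0..runs {
        // A failing generator ends the run with its own variant.
        let input = match generate(&mut rng) {
            Ok(value) => value,
            Err(reason) => return TestResult::GeneratorFailed { reason },
        };
        if !test(input) {
            // Shrinking is optional; without it the raw failing input is reported.
            let (input, shrink_steps) = if shrink_enabled { shrink(input, &test) } else { (input, 0) };
            return TestResult::Failed { input, shrink_steps };
        }
    }
    TestResult::Passed { runs }
}

/// Repeatedly tries smaller candidates that still fail, until none is left.
fn shrink<T: Fn(u32) -> bool>(mut failing: u32, test: &T) -> (u32, u32) {
    let mut steps = 0;
    loop {
        let candidates = [failing / 2, failing.saturating_sub(1)];
        let next = candidates.iter().copied().find(|&c| c < failing && !test(c));
        match next {
            Some(smaller) => {
                failing = smaller;
                steps += 1;
            }
            None => return (failing, steps),
        }
    }
}
```

In this sketch a generator signals failure by returning Err, which the
runner turns into a dedicated variant instead of panicking, so callers
can tell "the test failed" apart from "no input could be generated".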