Note: This is based on the ddmin implementation in the fuzzingbook.

Pyodide takes a little time to initialize. Please wait until the Run all button has a red border.

About Delta Debugging

Delta Debugging is a method to reduce failure-inducing inputs to the smallest size that still induces the same failure. It was first formally introduced in the paper Simplifying and Isolating Failure-Inducing Input by Zeller and Hildebrandt.

The idea of delta debugging is fairly simple. We start by partitioning the given input string, initially into two partitions of a given partition length. Then, we check whether any of these parts can be removed without losing the observed failure. If any can be removed, we remove all such parts of the given length. Once no parts of the given length can be removed, we halve the partition length and repeat the process. This yields a 1-minimal failure-inducing string, where removing even a single character makes the observed failure disappear.

Given a causal function as below, which returns true exactly when its input reproduces the failure of interest:
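What follows is a minimal sketch of such a function. It assumes the failure of interest is triggered by any input containing a complete <SELECT ...> tag; this particular predicate is only an assumption, chosen to be consistent with the example run below.

import re

def causal(inp):
    # Assumed predicate: the failure is reproduced whenever the input
    # contains '<SELECT' eventually followed by a closing '>'.
    return re.search(r'<SELECT[^>]*>', inp) is not None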



Here is an example run:

$ python ddmin.py '<SELECT NAME="priority" MULTIPLE SIZE=7>'
.  ty" MULTIPLE SIZE=7> 20
.  <SELECT NAME="priori 20
.  ME="priority" MULTIPLE SIZE=7> 30
+  <SELECT NAty" MULTIPLE SIZE=7> 30
+  <SELECT NALE SIZE=7> 20
.  <SELECT NA 10
.  CT NALE SIZE=7> 15
.  <SELELE SIZE=7> 15
+  <SELECT NAZE=7> 15
.  <SELECT NA 10
.  ELECT NAZE=7> 13
.  <SECT NAZE=7> 13
.  <SELT NAZE=7> 13
.  <SELECNAZE=7> 13
+  <SELECT ZE=7> 13
+  <SELECT =7> 11
+  <SELECT > 9
.  <SELECT  8
.  SELECT > 8
.  <ELECT > 8
.  <SLECT > 8
.  <SEECT > 8
.  <SELCT > 8
.  <SELET > 8
.  <SELEC > 8
+  <SELECT> 8
.  <SELECT 7
<SELECT>

Implementation

How do we implement this?

First, the prerequisites:



remove_check_each_fragment()

Given a partition length, we want to split the string into partitions of that length, remove each partition one at a time from the string, and check whether causal() still succeeds without it. If it does, we can drop that section of the string.
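Here is a sketch of this step. It assumes causal() returns True exactly when the failure is reproduced, and follows the description above rather than any particular published implementation. Note that the string it returns always satisfies causal(), which is the point made in the next paragraph.

def remove_check_each_fragment(instr, part_len, causal):
    # Walk the string in chunks of part_len. If dropping a chunk still
    # reproduces the failure, leave it out; otherwise keep it in pre.
    pre = ''
    for i in range(0, len(instr), part_len):
        without_fragment = pre + instr[i + part_len:]
        if causal(without_fragment):
            continue  # this fragment is not needed for the failure
        pre = pre + instr[i:i + part_len]
    return pre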



There is a reason this function is split out from the main function, unlike in the original implementation of ddmin: remove_check_each_fragment obeys the invariant that any string it returns satisfies the contract represented by the causal function. This means that any test case produced by remove_check_each_fragment will reproduce the specified behavior, and can be used for other computations. For example, one may use it for evaluating test reduction slippage, or for finding other reductions.

ddmin()

The main function. We start with the smallest number of partitions: two. Then, we check each fragment to see whether it can be removed. If removing a fragment succeeds, we set the current string to the string without that fragment; in this way, we remove all fragments that can be removed at the current partition size. If none of the fragments could be removed, we halve the partition length. If the partition length cannot be halved again (i.e., the last partition length was one) or the string has become empty, we stop the iteration.
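A sketch that follows this description, reusing the remove_check_each_fragment() sketched above:

def ddmin(cur_str, causal):
    # Start with two partitions, i.e. a partition length of half the string.
    part_len = len(cur_str) // 2
    while part_len and cur_str:
        reduced = remove_check_each_fragment(cur_str, part_len, causal)
        if reduced == cur_str:
            # Nothing could be removed at this granularity; halve it.
            part_len = part_len // 2
        cur_str = reduced
    return cur_str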



The driver.
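A possible driver, assuming the code above is saved as ddmin.py as in the example run. It wraps causal() so that every candidate is logged with a '+' or '.' marker and its length; the exact logging format is an assumption chosen to match the output shown earlier.

import sys

def logged_causal(inp):
    # '+' marks candidates that still reproduce the failure, '.' those that do not.
    res = causal(inp)
    print('%s  %s %d' % ('+' if res else '.', inp, len(inp)))
    return res

if __name__ == '__main__':
    print(ddmin(sys.argv[1], logged_causal))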







The nice thing is that, if you invoke the driver, you can see the reduction in input length in action. Note that our driver is essentially a best-case scenario. In the worst case, the complexity is \(O(n^2)\).

Recursive

That was of course illuminating. However, is that the only way to implement this? Delta debugging, at its heart, is a divide-and-conquer algorithm. Can we implement it recursively?

The basic idea is that, given a string, we can split it into two parts and check whether either part reproduces the failure on its own. If either one does, we call ddrmin() on the part that reproduced the failure.

If neither one does, then there is some part in the first partition that is required for the failure, and some part in the second partition that is required as well. All that we need to do now is to isolate these parts. How should we do that?

Call ddrmin(), but with an updated check. For example, for the first part, rather than checking whether some portion of the first part alone produces the failure, check whether some part of the first, when combined with the second, causes the failure.

All we have left to do is to define the base case. In our case, a string of length one cannot be partitioned into strictly smaller parts. Further, we already know that any string passed into ddrmin() was required for reproducing the failure, so we do not have to worry about the empty string. Hence, we can return it as is.

Here is the implementation.

ddrmin()
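Below is a recursive sketch along the lines just described. The precondition is that the string passed in reproduces the failure under the given check; it is invoked the same way as ddmin(), e.g. ddrmin(failing_input, causal).

def ddrmin(s, causal):
    # Base case: a single character cannot be split into smaller parts, and
    # we already know it is required for the failure, so return it as is.
    if len(s) <= 1:
        return s
    mid = len(s) // 2
    first, second = s[:mid], s[mid:]
    # If either half reproduces the failure on its own, reduce just that half.
    if causal(first):
        return ddrmin(first, causal)
    if causal(second):
        return ddrmin(second, causal)
    # Both halves are needed: reduce each half with a check that keeps the
    # (possibly already reduced) other half attached.
    first_min = ddrmin(first, lambda part: causal(part + second))
    second_min = ddrmin(second, lambda part: causal(first_min + part))
    return first_min + second_min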



Given that it is a recursive procedure, one may worry about stack exhaustion, especially in languages such as Python, which sets a fairly low recursion limit by default. The nice thing here is that, since we split the string in half again and again, the maximum recursion depth required is \(O(\log N)\) in the input size. So there is no danger of exhaustion.

The recursive algorithm is given in Yesterday, my program worked. Today, it does not. Why? by Zeller in 1999.

Is this Enough? (Syntax)

One of the problems with the unstructured version of ddmin() above is that it assumes that parts of the input can be cut away while the input still remains valid enough to reach the testing function. This, however, may not be a reasonable assumption, especially in the case of structured inputs. The problem is that if you have a JSON input [{"a": null}] that produces an error because the key value is null, ddmin() will try to partition it as [{"a": followed by null}], neither of which is valid. Further, any chunk removed from either part is also likely to result in an invalid input. Hence, ddmin() will not be able to proceed.

The solution here is to go for a variant called Hierarchical Delta Debugging, described in HDD: hierarchical delta debugging by Misherghi et al. in 2006. The basic idea is to first extract the parse tree (either by parsing the input using a grammar, by relying on a generator to produce the parse tree along with the input, or by extracting the parse tree in flight from the target program if it supports that), and then to apply ddmin() on each level of the derivation tree. Another notable improvement is Automatically Reducing Tree-Structured Test Inputs by Herfert et al. in 2017. In this paper, the authors describe a simple strategy: try to replace a given node by one of its child nodes. This works reasonably well for inputs that are context sensitive, such as programs that can trigger bugs in compilers. Another is Perses: Syntax-Guided Program Reduction by Sun et al. in 2018, which uses the program grammar directly to avoid creating invalid nodes.

The problem with the above approach is that it assumes there exists a child node of the same type as the parent node. This need not be the case. Hence, an even better idea might be to do a breadth-first search for the first descendant node that has the same symbol as the current node, try replacing the current node with it, and continue with the next node of the same kind if that fails.

Perses

Below is a variant of hierarchical ddmin() from “Perses: syntax-guided program reduction” by Sun et al. in 2018.



A few helpers

Check if a given token is a nonterminal
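The sketches in this section assume fuzzingbook-style conventions: nonterminals are written as <name>, a grammar maps each nonterminal to a list of rules (each rule being a list of tokens), and a derivation tree is a (symbol, children) tuple whose children form a list. Under those assumptions:

def is_nt(symbol):
    # A token is a nonterminal if it is written in angle brackets, e.g. '<expr>'.
    return len(symbol) > 2 and symbol[0] == '<' and symbol[-1] == '>'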



We prioritize the smallest trees.
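As the measure of size, the sketch below simply counts the nodes of a derivation tree; this count is what we use as the priority.

def count_nodes(tree):
    # Size of a derivation tree: this node plus all of its descendants.
    symbol, children = tree
    return 1 + sum(count_nodes(c) for c in children)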



A priority queue
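A small sketch of a priority queue built on heapq. The tie-breaking counter is there only because trees themselves are not orderable.

import heapq
import itertools

class PriorityQueue:
    def __init__(self):
        self.heap = []
        self.counter = itertools.count()

    def push(self, priority, item):
        # The counter breaks ties so heapq never compares the items themselves.
        heapq.heappush(self.heap, (priority, next(self.counter), item))

    def pop(self):
        return heapq.heappop(self.heap)[-1]

    def empty(self):
        return not self.heap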



A way to temporarily replace a node in a tree.
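Since our trees are immutable tuples, the replacement can be done functionally: rebuild only the spine along a path of child indices, and the original tree stays untouched.

def replace_path(tree, path, new_subtree):
    # Return a copy of tree with the node at path (a sequence of child
    # indices) replaced by new_subtree; the original tree is unchanged.
    if not path:
        return new_subtree
    symbol, children = tree
    i = path[0]
    return (symbol, children[:i]
            + [replace_path(children[i], path[1:], new_subtree)]
            + children[i + 1:])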



Find all subtrees with the given symbol
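A straightforward recursive collection:

def subtrees_with_symbol(tree, symbol):
    # Return every subtree (including tree itself) whose root symbol matches.
    found = []
    sym, children = tree
    if sym == symbol:
        found.append(tree)
    for child in children:
        found.extend(subtrees_with_symbol(child, symbol))
    return found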



The Perses algorithm itself.
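The sketch below captures only the subtree-replacement part of Perses: keep a queue of candidate trees ordered by size, and for each nonterminal node try to replace it with a strictly smaller subtree of the same symbol, so that the result stays within the grammar. The full algorithm also exploits grammar quantifiers to delete optional nodes. The helpers tree_to_string and all_nodes are additional assumptions of this sketch, and the precondition is that the input tree itself satisfies the predicate.

def tree_to_string(tree):
    # Concatenate the terminal leaves of a derivation tree.
    symbol, children = tree
    if not children:
        return '' if is_nt(symbol) else symbol
    return ''.join(tree_to_string(c) for c in children)

def all_nodes(tree, path=()):
    # Yield (path, node) pairs for every node in the tree.
    yield path, tree
    for i, child in enumerate(tree[1]):
        yield from all_nodes(child, path + (i,))

def perses_delta_debug(tree, predicate):
    # predicate takes the string form of a tree and returns True when the
    # behavior of interest (here, the failure) is still reproduced.
    q = PriorityQueue()
    q.push(count_nodes(tree), tree)
    best = tree
    seen = {tree_to_string(tree)}
    while not q.empty():
        current = q.pop()
        if count_nodes(current) < count_nodes(best):
            best = current
        for path, node in all_nodes(current):
            if not is_nt(node[0]):
                continue
            # Try replacing this node with any smaller subtree of the same symbol.
            for repl in subtrees_with_symbol(node, node[0]):
                if count_nodes(repl) >= count_nodes(node):
                    continue
                candidate = replace_path(current, path, repl)
                s = tree_to_string(candidate)
                if s in seen:
                    continue
                seen.add(s)
                if predicate(s):
                    q.push(count_nodes(candidate), candidate)
    return best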



The driver

First, we define the predicate. Our predicate is simple: we check whether the expression has doubled parentheses.
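A possible version of that predicate; the exact check is an assumption of this sketch.

import re

def predicate(s):
    # 'Interesting' inputs are expressions containing doubled parentheses,
    # e.g. '((2 * 3))'.
    return re.search(r'\(\(.*\)\)', s) is not None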



We have the following grammar
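A plausible arithmetic-expression grammar in the dictionary format assumed earlier; the grammar actually used may differ in detail.

EXPR_GRAMMAR = {
    '<start>': [['<expr>']],
    '<expr>': [['<term>', ' + ', '<expr>'],
               ['<term>', ' - ', '<expr>'],
               ['<term>']],
    '<term>': [['<factor>', ' * ', '<term>'],
               ['<factor>', ' / ', '<term>'],
               ['<factor>']],
    '<factor>': [['(', '<expr>', ')'],
                 ['<digit>']],
    '<digit>': [[str(i)] for i in range(10)],
}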



And the following input
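Any expression with doubled parentheses will do; for instance:

my_input = '1 + ((2 * 3))'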



We parse and generate a derivation tree as follows

Loading the prerequisite:





The derivation tree looks like this





Now, we are ready to use the perses_delta_debug()
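Assuming derivation_tree is the tree produced by the parsing step above, in the (symbol, children) format, the call looks like this:

reduced_tree = perses_delta_debug(derivation_tree, predicate)
print(tree_to_string(reduced_tree))

With the sketches above, the reduced string would be something like ((3)): a smallest expression that still contains doubled parentheses.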



Is this Enough? (Semantics)

Note that at this point, we can generate syntactically valid inputs while reducing, but there is no guarantee that they will be semantically valid (variable def-use relations and so on). So, we still suffer a large rate of unresolved (invalid) test cases. Is there a better solution? One option, notably followed by creduce, is to rely on custom reduction passes that maintain the validity of the string being operated on; that is, to incorporate domain knowledge into the reduction.

A rather different approach is taken by Hypothesis. Hypothesis again performs generator-based reduction.

The Hypothesis reducer works as follows: given any generator that relies on randomness, one can envision an array of random bits that is passed to the generator. For example, consider the simple grammar generator below. It is passed bits, an array of random bits with a method next(), which returns an integer and advances the index.

def unify_key(bits, grammar, key):
    # For a nonterminal, consume one choice to pick which rule to expand;
    # for a terminal, return it as is.
    return unify_rule(bits, grammar, grammar[key][bits.next() % len(grammar[key])]) if key in grammar else [key]

def unify_rule(bits, grammar, rule):
    # Expand every token in the rule and concatenate the resulting terminals.
    return sum([unify_key(bits, grammar, token) for token in rule], [])

When the generator wants to make a random choice, it consumes a number of bits from this array, and advances the index to denote that the bits were consumed. Now, at the end of generation, the bits consumed so far are essentially the trail of the generation.
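To actually run the generator above, we need something standing in for bits. Here is a minimal sketch; the class name Bits, and the choice to raise an IndexError when the recorded choices run out (rather than filling in further random values), are assumptions of this sketch, not how Hypothesis itself behaves.

class Bits:
    # A recorded trail of choices. next() returns the current integer and
    # advances the index; running past the recorded choices raises IndexError.
    def __init__(self, choices):
        self.choices = choices
        self.index = 0

    def next(self):
        v = self.choices[self.index]
        self.index += 1
        return v

# With the EXPR_GRAMMAR sketched earlier, a fixed trail deterministically
# produces a fixed string:
# ''.join(unify_key(Bits([0, 2, 2, 1, 5]), EXPR_GRAMMAR, '<start>')) == '5'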

Hypothesis tries to reduce the length of this trail (which it calls the choice sequence). It tries to identify shorter choice sequences (in short-lex order) that can reproduce the failure. It does this through certain surgeries on the choice sequence obtained from generating the string. The particular surgeries are simply removals of chunks that seem to work well in practice, with no specific reasoning behind them. If it finds that more bits are required than in the original choice sequence, it discards the edit. Otherwise, the idea is that short-lex order almost always leads to shorter strings by prioritizing strings with a smaller number of decisions, and consequently a smaller number of recursions and hence terminal symbols.
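Here is a crude sketch of such a reduction in the spirit described above, reusing the Bits, unify_key(), EXPR_GRAMMAR and predicate sketches from earlier: try deleting chunks of the recorded choice sequence, regenerate, and keep an edit only if generation does not need more choices than remain and the predicate still holds. Hypothesis's actual shrinker has many more passes and orders candidates in short-lex order; this is only an illustration.

def reduce_choices(choices, grammar, predicate):
    chunk = len(choices) // 2
    while chunk > 0:
        i = 0
        while i < len(choices):
            candidate = choices[:i] + choices[i + chunk:]
            try:
                s = ''.join(unify_key(Bits(candidate), grammar, '<start>'))
            except IndexError:
                s = None  # the edit needs more choices than we have: discard it
            if s is not None and predicate(s):
                choices = candidate  # keep the shorter choice sequence
            else:
                i += chunk
        chunk = chunk // 2
    return choices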

Note that unlike HDD variants, any edit on the choice sequence is guaranteed to produce a valid string.

One question here is: why do these almost arbitrary edits to choice sequences work? The surprise is that arbitrary edits work even though any edit can cause the remaining bits to be passed to different choices than before. The reason seems to be that bugs are usually locally concentrated. That is, once one is able to reproduce the failure pattern, one can fill in the remaining parts of the input with randomly generated (but valid) values. Note that the skeleton of the input is generated by the initial choices. So the Hypothesis strategy seems to be to find edits that reach the failure pattern quickly, and then fill in the remaining blanks, which seems to work well in practice.