Important: Pyodide takes time to initialize. Initialization completion is indicated by a red border around Run all button.

This post is the implementation of my paper Input Algebras

In my previous post on inducing faults I explained the deficiency of abstract failure inducing inputs mined using DDSet, and showed how to overcome that by inserting that abstract (evocative) pattern into a grammar, producing evocative grammars that guarantee that the evocative fragment is present in any input generated. However, what if one wants to produce inputs that contain two evocative fragments? or wants to produce inputs that are guaranteed to contain at least one of them? This is what we will discuss in this post.

We import the following modules

# Producing inputs with two fault inducing fragments guaranteed to be present.

From the previous post inducing faults we extracted two evocative subtrees

We now want to produce a grammar such that any input produced from that grammar is guaranteed to contain both evocative subtrees. First, let us extract the corresponding grammars. Here is the first one

We save this for later

Here is the second grammar.

We save this for later

## And Grammars

Now, we want to combine these grammars. Remember that a gramamr has a set of definitions that correspond to nonterminals, and each definition has a set of rules. We start from the rules. If we want to combine two grammars, we need to make sure that any input produced from the combined grammar is also parsed by the original grammars. That is, any rule from the combined grammar should have a corresponding rule in the original grammars. This gives us the algorithm for combining two rules. First, we can only combine rules that have similar base representation. That is, if ruleA is [<A f1>, <B f2>, 'T'] where <A> and <B> are nonterminals and T is a terminal and ruleB is [<A f1>, <C f3>], these can’t have a combination in the combined grammar. On the other hand, if ruleB is [<A f3>, <B f4> 'T'] then, a combined rule of [<A f1 & f3>, <B f2 & f4>, 'T'] can infact represent both parent rules. That is, when combining two rules from different, grammars, their combination is empty if they have different base representation.

The idea for combining two definitions of nonterminals is simply using the distributive law. A definition is simply # A1 or B1 or C1 where A1 etc are rules. Now, when you want to and two defintions, you have and(A1 or B1 or C1, A2 or B2 or C2) , and you want the or out again. So, this becomes

(A1 AND A2) OR (A1 AND B2) OR (A1 AND C2) OR
(A2 AND B1) OR (A2 AND C1) OR
(B1 AND B2) OR (B1 AND C2) OR
(B2 AND C1) OR (C1 AND C2)


which is essentially that many rules.

### Combining tokens

If they have the same base representation, then we only have to deal with how to combine the nonterminal symbols. The terminal symbols are exactly the same in parent rules as well as combined rule. So, given two tokens, we can combine them as follows. The and of a refined nonterminal and a base nonterminal is always the refined nonterminal. Otherwise, it is simply an and() specialization of both refinements.

Using it.

## Combining rules

Next, we define combination for rules

Using it.

### Combining rulesets

Next, our grammars may contain multiple rules that represent the same base rule. All the rules that represent the same base rule is called a ruleset. combining two rulesets is done by producing a new ruleset that contains all possible pairs of rules from the parent ruleset.

Using it.

Next, we define a few helper functions that collects all rulesets

### definition conjunction

Now, we can define the conjunction of definitions as follows.

Using it.

### grammar conjunction

We can now define our grammar conjunction as follows.

Using it.

This grammar is now guaranteed to produce instances of both characterizing subtrees.

## Producing inputs with at least one of the two fault inducing fragments guaranteed to be present.

How do we construct grammars that are guaranteed to contain at least one of the evocative patterns? This is actually much less complicated than and The idea is simply using the distributive law. A definition is simply A1 or B1 or C1 as before where A1 etc are rules. Now, when you want to or two definitions, you have or(A1 or B1 or C1, A2 or B2 or C2), then it simply becomes A1 or B1 or C1 or A2 or B2 or C2 At this point, our work is essentially done. All that we need to do is to merge any rules that potentially allow us to merge. However, this is not compulsory.

### Nonterminals

For nonterminals, it is similar to and except that the base cases differ. or of a base nonterminal with a refined nonterminal is always the base.

## rules

What rules can be merged? Only those rules can be merged that has a single refinement difference. That is if we have or(<A 1> <B 5> <C>, <A 2> <B 5> <C>), then this merges to <A or(1,2)><B 5><C>. However or(<A 1> <B 5> <C>, <A 2> <B 6> <C>) is not mergeable.

## rulesets

For or rulesets we first combine both rulesets together then (optional) take one at a time, and check if it can be merged with another.

Using it.

### definition disjunction

Now, we can define the disjunction of definitions as follows.

### grammar disjunction

Using it.

This grammar is now guaranteed to produce at least one of the evocative subtrees Using it.

As before, the runnable source of this notebook can be found here