TLDR; This tutorial is a complete implementation of the TTT algorithm for active automata learning in Python. TTT combines the discrimination tree of Kearns and Vazirani with binary search counterexample analysis from Rivest and Schapire, and adds prefix transformation and discriminator finalization to eliminate all redundant membership queries. The Python interpreter is embedded so that you can work through the implementation steps.

In my previous post, I implemented Angluin’s L* algorithm for learning regular languages from a blackbox oracle. L* uses a flat observation table to track state distinctions, which leads to redundant membership queries: when a counterexample arrives, all its suffixes are added as columns even though most distinguish no new states.

The key insight that the discrimination tree is a better data structure for this job was due to Kearns and Vazirani 1, who replaced L*’s observation table with a binary tree of discriminators. Rivest and Schapire 2 independently contributed binary search counterexample analysis, which finds the single relevant suffix in a counterexample in \(O(\log k)\) queries rather than adding all \(k\) suffixes.

TTT by Isberner, Howar and Steffen 3 adds two further refinements: prefix transformation, which keeps access sequences minimal, and discriminator finalization, which keeps the discrimination tree shallow. Together these make TTT provably redundancy-free. That is, it never makes a membership query whose answer could have been derived from earlier queries.

TTT is the algorithm of choice in practical automata learning tools such as LearnLib 4. ADT 5 extends TTT with adaptive distinguishing sequences, which can reduce resets in hardware settings, though performance differences in software engineering settings are modest.

Definitions

  • Alphabet \(A\): the set of input symbols the DFA reads.
  • Membership query: a string passed to the blackbox oracle. The oracle answers yes (accepted) or no (rejected).
  • Equivalence query: a hypothesis grammar passed to the teacher. The teacher answers yes, or returns a counterexample string where the hypothesis and the target disagree.
  • PAC oracle: a probabilistic approximation to the equivalence oracle. After \(N\) random tests without finding a counterexample, we declare the hypothesis probably approximately correct.
  • Discrimination tree (DT): a binary tree whose inner nodes are discriminator suffixes and whose leaves are states. Sifting a string \(w\) through the tree classifies it to a state using one membership query per level.
  • Access sequence \(acc(q)\): the shortest known string that reaches state \(q\) in the target.
  • Spanning tree: a mapping from each known state to its access sequence. A dict from state to the shortest string known to reach it.
  • Open transition: a transition from state \(q\) on symbol \(a\) whose target state has no access sequence yet, meaning TTT has not yet determined which state it leads to.
  • Counterexample decomposition: the process of finding the split point in a counterexample, extracting a new discriminator, and splitting a leaf in the DT.

Contents

  1. Definitions
  2. Prerequisites
  3. The Road from L* to TTT
  4. The DFA Representation
  5. The Oracle
  6. The Discrimination Tree
    1. Sifting
  7. The Spanning Tree
  8. Hypothesis Construction
    1. Incremental Hypothesis Update
  9. Counterexample Decomposition
    1. The Split Point
    2. Prefix Transformation
    3. Splitting a Leaf
    4. Discriminator Finalization
    5. Putting Decomposition Together
    6. Worklist Growth in close_transitions
  10. Non-Redundancy
  11. A Note on the Equivalence Oracle
  12. DT Coherence After Split
  13. The Main Loop
  14. Examples
  15. Evaluating Model Accuracy
  16. Comparison with L*
  17. References
  18. Artifacts

Important: Pyodide takes time to initialize. Initialization completion is indicated by a red border around Run all button.

Prerequisites

We use the Teacher and Oracle from the L* post unchanged. The PAC equivalence oracle in Teacher is a direct drop-in for TTT. The rest of the algorithm is completely independent of how equivalence queries are answered.

Available Packages These are packages that refer either to my previous posts or to pure python packages that I have compiled, and is available in the below locations. As before, install them if you need to run the program directly on the machine. To install, simply download the wheel file (`pkg.whl`) and install using `pip install pkg.whl`.
  1. simplefuzzer-0.0.1-py2.py3-none-any.whl from "The simplest grammar fuzzer in the world".
  2. rxfuzzer-0.0.1-py2.py3-none-any.whl from "Fuzzing With Regular Expressions".
  3. earleyparser-0.0.1-py2.py3-none-any.whl from "Earley Parser".
  4. cfgrandomsample-0.0.1-py2.py3-none-any.whl from "Uniform Random Sampling of Strings from Context-Free Grammar".
  5. cfgremoveepsilon-0.0.1-py2.py3-none-any.whl from "Remove Empty (Epsilon) Rules From a Context-Free Grammar.".
  6. gatleastsinglefault-0.0.1-py2.py3-none-any.whl from "Specializing Context-Free Grammars for Inducing Faults".
  7. hdd-0.0.1-py2.py3-none-any.whl from "Hierarchical Delta Debugging".
  8. ddset-0.0.1-py2.py3-none-any.whl from "Simple DDSet".
  9. lstar-0.0.1-py2.py3-none-any.whl from "L* Algorithm".


Since this notebook serves both as a web notebook as well as a script that can be run on the command line, we redefine canvas if it is not defined already. The __canvas__ function is defined externally when it is used as a web notebook.



The Road from L* to TTT

In L*, when the equivalence oracle returns a counterexample \(ce\) of length \(k\), the algorithm adds all \(k\) suffixes of \(ce\) as new columns. For each new column, it must re-query every existing row. Many of these new columns are, however, redundant. They do not distinguish any new pair of states, but given that they exist, the cells need to be filled, and hence you pay the price in queries.

The key insight in TTT is that a counterexample only ever witnesses one new distinction: one pair of states was wrongly merged. Hence, exactly one new discriminator is sufficient, not \(k\). Getting to this point required three independent contributions:

  • Kearns and Vazirani (1994)1 replaced the observation table with a discrimination tree. A binary tree of discriminator suffixes where each leaf is a state. Sifting a string down the tree classifies it in \(O(depth)\) queries rather than \(O(|suffixes|)\).
  • Rivest and Schapire (1993)2 showed that binary search over the counterexample finds the single relevant split point in \(O(\log k)\) queries, rather than adding all \(k\) suffixes.
  • Isberner, Howar and Steffen (2014)3 combined these with prefix transformation (keeping access sequences minimal) and discriminator finalization (keeping the DT shallow), producing TTT.

The DFA Representation

The DFA class is similar to the one from the RPNI post.



We also add a run() method which returns the state reached after consuming a string, and ensure_state() which registers a state in the grammar without allocating a new key, which is needed when manually constructing DFAs in tests and when close_transitions (defined later) discovers states from the DT.



The DFA class can now model a deterministic finite state machine. We also add a helper to render any DFA as a Graphviz dot diagram.



Let us test it thoroughly.



The Oracle

Let us define a simple mock oracle for testing the components in isolation. The Teacher imported from L* is the full oracle, and will be used in the main loop.



The Discrimination Tree

The discrimination tree (DT) replaces L*’s flat observation table. Think of it as a game of 20 questions: each inner node asks: if I append suffix \(d\) to this string, does the target accept it? and routes left (no) or right (yes). Each leaf is a known state.

There are exactly two kinds of nodes: Leaf and Inner.



We test the node types before proceeding.



Sifting

To classify any string \(w\) to a state, we walk the DT from root to leaf. At each inner node labeled \(d\), we query \(member(w \cdot d)\) and go right (yes) or left (no). The leaf we land on is the state \(w\) belongs to. Each step is one membership query, so sifting costs at most \(depth(DT)\) queries, far fewer than L*’s \(O(|suffixes|)\).



We also add a helper to render a DT as a Graphviz dot diagram. Inner nodes show their discriminator suffix; leaves show the state name. Left edges (membership query returned False) are labelled “no”; right edges (True) are labelled “yes”. An empty discriminator is shown as ε (epsilon), meaning the string itself is queried directly with nothing appended. An optional tracer argument (a DTTracer, defined below) colours the recorded sift path blue.



DTTracer wraps a real DT node and records every step taken during a sift. It proxies is_leaf, discriminator, left, right, and state to the wrapped node, but intercepts the left/right accesses to record which edge was taken. After sift(DTTracer(root), w, oracle) returns, the tracer holds path_nodes and path_edges as sets of id()s referencing the unwrapped nodes, so they match what dt_to_dot sees.



We test sifting on the even-a’s example: the DT has one discriminator (the empty string) that separates even-a states from odd-a states.

single leaf: everything maps to start



two-level tree: [''] / <1> (odd) \ <start> (even)



Sifting 'a' (odd a’s) goes left to <1>; sifting 'aa' (even a’s) goes right to <start>.



Sifting 'aa' goes right.



The even-a’s DT always uses ε as its discriminator because membership of the access sequence alone is enough to tell states apart. Most targets produce non-empty discriminators. Consider strings over {a, b} that end in ba. After two counterexamples the DT has two levels: the root asks member(w + 'a') (discriminator 'a'), and the left subtree then asks member(w + ε) to separate <start> from the accepting state <2>.



Sifting 'b' (access of <1>): member('b'+'a') = member('ba') = True so go right -> leaf <1>.



Sifting 'ba' (access of <2>): member('ba'+'a') = member('baa') = False so go left; then member('ba'+ε) = member('ba') = True -> go right -> leaf <2>.



Sifting '' (access of <start>): member(''+a) = member('a') = False so go left; then member(''+ε) = member('') = False -> go left -> leaf <start>.



The Spanning Tree

If you have read the RPNI post, the spanning tree will look familiar. The RPNI Prefix Tree Acceptor (PTA) is a tree-shaped DFA where every path from root to a node spells out the string that reaches that state. The spanning tree is the dual of the PTA:

  • PTA maps strings -> states. You start with examples and build states to match them.
  • Spanning tree maps states -> strings. You start with states (discovered by TTT) and record the string that reaches each one.

In TTT, we never traverse the spanning tree as a tree. We only ever look up \(acc(q)\) for a given state, or add a new state with its access sequence. So the implementation reduces to a simple dict.



We add a helper to render a spanning tree as a Graphviz dot diagram. Each node shows its state name and access sequence. Edges are labelled with the character that extends the parent’s access sequence to reach the child.



We test the spanning tree.



Hypothesis Construction

At any point during learning, we have a DT and a spanning tree. Together they are enough to construct a complete hypothesis DFA. The idea is simple: to find the target state of transition \(q \xrightarrow{a} ?\), form the string \(acc(q) \cdot a\) and sift it through the DT. Whatever leaf it lands on is the state we assign as the target.

Concretely, for the even-a’s example with a two-leaf DT (discriminator '', left = <1>, right = <start>), and spanning tree {<start>: '', <1>: 'a'}, we sift each transition in turn. The blue path shows the route taken through the DT.

\(\langle start \rangle \xrightarrow{a} ?\): sift '' + 'a' = 'a'. Is 'a' + '' accepted? No (odd a’s) — go left — leaf <1>. So <start> -a-> <1>.



\(\langle start \rangle \xrightarrow{b} ?\): sift '' + 'b' = 'b'. Is 'b' + '' accepted? Yes (zero a’s) — go right — leaf <start>. So <start> -b-> <start>.



\(\langle 1 \rangle \xrightarrow{a} ?\): sift 'a' + 'a' = 'aa'. Is 'aa' + '' accepted? Yes (even a’s) — go right — leaf <start>. So <1> -a-> <start>.



\(\langle 1 \rangle \xrightarrow{b} ?\): sift 'a' + 'b' = 'ab'. Is 'ab' + '' accepted? No (odd a’s) — go left — leaf <1>. So <1> -b-> <1>.



A transition is open if we have not yet determined its target: either the transition is missing from the DFA entirely, or its recorded target has no access sequence in the spanning tree. The second case arises when a sift landed on a leaf whose state is not yet in the spanning tree, meaning a new state was just discovered and needs to be registered.



close_transitions works through all known states, sifting each open transition. When sifting discovers a state not yet in the spanning tree, that state is added and appended to the work list so its own transitions are processed in the same pass. This means the loop may grow as it runs: starting with one state, it may end with the full set.

leaf_index records, for each DT leaf, which (state, char) transitions were routed there. After a split, we use this index to find and re-sift only the transitions that are now stale, rather than re-sifting everything.



Accepting states are determined by querying the oracle directly on each access sequence. If \(acc(q)\) is accepted by the target, then \(q\) is an accepting state.



To see what closing looks like step by step, consider the even-a’s target with alphabet {a, b}. We start with one state <start> and a single-leaf DT. Every transition is open because no target state has been determined yet. Closing sifts acc(<start>) + 'a' = 'a' and acc(<start>) + 'b' = 'b' through the single-leaf DT. Both land on <start>, so both transitions point back to <start> – the hypothesis says everything loops. We visualise the DT, spanning tree, and resulting DFA at this stage.



The spanning tree



DFA before the split.



After the first counterexample (say 'a') is processed, decompose splits the DT: a new inner node with discriminator '' separates <start> (goes right: accepts '') from new state <1> (goes left: rejects ''). Re-sifting now correctly routes 'a' to <1> and 'b' back to <start>.



Spanning tree



DFA after split



Incremental Hypothesis Update

When decompose splits a leaf \(\ell\) into an inner node, every transition that was previously routed to \(\ell\) is now stale: it pointed to a single state, but the DT now says some of those strings belong to the left child and some to the right.

We could re-sift every transition in the hypothesis from scratch, but that is wasteful. Instead, leaf_index tells us exactly which (state, char) pairs were routed to \(\ell\). We remove only those transitions from the DFA and re-sift only them. Every other transition remains correct.

Concretely, continuing the even-a’s example: after the DT is split on '' and state <1> is created, the stale transitions are those that sifted to the old single leaf – all of them, since there was only one leaf. Each is removed and re-sifted through the new two-node DT. The transition <start> -a-> <start> becomes <start> -a-> <1>, and all others stay the same. We show the DFA before and after the update. split_id is the Python id() of the leaf before it was mutated; the caller must capture it before calling split_leaf, since the mutation changes the object in place and the old id is the key in leaf_index.



We test hypothesis construction on the even-a’s example.



Counterexample Decomposition

When the equivalence oracle returns a counterexample \(ce\), we know the hypothesis is wrong on \(ce\). But where exactly does it go wrong?

The Split Point

Walk \(ce\) through the hypothesis. At position 0, both hypothesis and target are in \(\langle start \rangle\), so they agree. At position \(|ce|\), they disagree (that is what the counterexample means). So somewhere in between is the first point of disagreement. That is, the position \(i\) where the hypothesis first takes a wrong transition.

We find this by binary search in \(O(\log|ce|)\) queries. At each midpoint \(m\), we check whether \(acc(q_m) \cdot ce[m:]\) gives the same answer as the full counterexample. If yes, the split point is to the right; if no, it is here or to the left.

Prefix Transformation

After finding the split point \(i\), we need the string that reaches state \(q_i\) and then reads \(ce[i]\). The raw counterexample prefix \(ce[:i+1]\) would work, but we use \(acc(q_i) \cdot ce[i]\) instead. This is the prefix transformation, and it gives two guarantees:

  • Correctness: \(acc(q_i)\) traces a known path through the hypothesis, so the sift is guaranteed to work even if the hypothesis is partially stale.
  • Minimality: the new state gets access sequence \(acc(q_i) \cdot ce[i]\), which is the shortest possible. Using \(ce[:i+1]\) could produce a much longer access sequence, making future sifts more expensive.


Splitting a Leaf

Once we have the split point, we know:

  • A leaf \(\ell\) currently represents old_state
  • old_state and new_state were treated as identical by the hypothesis, but the counterexample proves they are different
  • The discriminator \(ce[i+1:]\) is the suffix that tells them apart

We mutate the leaf in place into an inner node. This is essential because other parts of the tree already hold references to \(\ell\), so replacing it with a new object would leave those references stale.



We test split_leaf.



After the split, the single leaf becomes an inner node separating <start> and <1>.



We now show update_hypothesis in action. We build the stale hypothesis (single-leaf DT, everything loops to <start>), then split the leaf and call update_hypothesis. The stale transition <start> -a-> <start> is removed and re-sifted to <1>.



Capture split_id before mutating the leaf, then split it.



Now update: stale transitions are removed and re-sifted.



DFA after update: -a-> <1>, all other transitions unchanged.



Discriminator Finalization

The discriminator \(ce[i+1:]\) is correct but may be longer than necessary. A shorter suffix of \(ce[i+1:]\) may distinguish the two states just as well. Keeping discriminators short keeps the DT shallow, which reduces sifting costs in all future iterations.

We try suffixes from shortest to longest, stopping at the first one that distinguishes the two states. Once a candidate fails, all shorter suffixes will also fail, so we stop early.



We test discriminator finalization.



Putting Decomposition Together

We now have all the pieces. decompose finds the split point by binary search, applies the prefix transformation, finalizes the discriminator, and splits the leaf. One counterexample yields one new state and one new discriminator. This is the tightest possible refinement.

Note: decompose uses hypothesis transitions only to find the split point. The actual split uses \(acc(q_i)\), which is always correct with respect to the target. This means decompose is correct even if the hypothesis is partially stale. The access sequences in the spanning tree are ground truth, independent of hypothesis quality.



We test decompose.



test 1: single symbol counterexample ‘a’



test 2: longer counterexample ‘aab’ — binary search finds split at position 0, so the new state still gets access sequence ‘a’ and discriminator is ‘b’.



test 3: two states, counterexample ‘aa’ reveals a third state. The DT gains a second level; the new state gets access sequence ‘aa’.



Worklist Growth in close_transitions

We now trace the full (a|b)*ba learning run to show how state names emerge from the algorithm and how the worklist grows during close_transitions. No states are pre-declared — they are created by dfa.new_state() inside decompose as each counterexample is processed.

Step 1. Single-leaf DT, only <start> in the spanning tree. close_transitions sifts '' + 'a' and '' + 'b'. Both land on the only leaf <start>, so no new states are discovered.



Spanning tree and DFA after step 1.



DFA step 1 — everything loops back to <start>.



Step 2. Counterexample 'ba': the hypothesis accepts it as <start> (which is accepting) but shouldn’t — <start> should only accept ''. decompose creates a fresh state (call it s1) and splits the DT leaf with discriminator 'a'. update_hypothesis re-sifts stale transitions; sifting '' + 'b' now lands on s1 — new, so it is appended to the worklist and its transitions are closed immediately. Worklist grows from ['<start>'] to ['<start>', s1].



Spanning tree after step 2 — s1 now has access sequence 'b'.



DFA after step 2.



Step 3. Counterexample 'ba' again — now the hypothesis rejects it because s1 -a-> <start> is wrong (should reach an accepting state). decompose creates s2 and splits the <start> leaf with discriminator ε. update_hypothesis re-sifts; sifting acc(s1) + 'a' = 'ba' lands on s2 — new, appended. Worklist grows to include s2. Here is that sift path — the one that grows the worklist:



And sifting acc(s1) + 'b' = 'bb' lands on <start> — known, no append.



After update_hypothesis finishes, all three states are wired up.



Spanning tree after step 3 — three states, all access sequences minimal.



Final DFA — states named by the algorithm as they were discovered.



Non-Redundancy

The central claim of the TTT is that it never makes a membership query whose answer could have been derived from earlier queries. To see what this means concretely: in L*, if the counterexample is ba, the algorithm adds both a and ba as new suffix columns and re-queries every existing state against both. If a was already a column, that work is wasted. TTT avoids this by extracting exactly one new suffix per counterexample and routing all future classification through the DT. This holds at every level:

  • Sifting is non-redundant. Every query is \(w \cdot d\) where \(d\) was placed in the DT by a previous split that proved it necessary.
  • Splitting is non-redundant. Each split adds exactly one discriminator, proven necessary by the counterexample.
  • Closing is non-redundant. Each transition is sifted exactly once per iteration. Newly discovered states come with their DT position already established by the sift that found them, so no extra queries are needed.

This contrasts with L*, where adding all \(k\) suffixes of a counterexample forces re-querying every existing row against every new column, most of which add no new information.

A Note on the Equivalence Oracle

TTT assumes the equivalence oracle is exact: if it says the hypothesis is wrong, it returns a string the hypothesis genuinely misclassifies. The Teacher we use is a PAC oracle: it samples a finite set of strings and declares equivalence if none expose a mistake. This is an approximation.

In principle, a false counterexample from a PAC oracle could cause TTT to create a redundant state: one that could have been merged with an existing state without changing the language the DFA accepts. The DFA would still be correct, just slightly larger than necessary. Neither TTT nor L* guarantees a globally minimal DFA under a PAC oracle: both can acquire spurious states or columns from false counterexamples, and neither performs a global minimization pass afterward. In practice this is a minor concern: the PAC oracle is unlikely to produce false counterexamples with reasonable parameters, and the language accepted by the DFA is unaffected either way.

DT Coherence After Split

After split_leaf turns a leaf into an inner node, some transitions in the hypothesis that targeted the old leaf are now stale, because the DT now routes some of those strings to the new child instead.

The incremental strategy finds exactly those stale transitions via leaf_index, removes them from the DFA, and re-sifts only them. Every other transition remains valid and costs no queries. New states discovered during re-sifting are registered and their own transitions are closed in the same pass.

The Main Loop

The main loop orchestrates everything:

  1. Build the initial hypothesis: one state, all transitions open.
  2. Ask the equivalence oracle. If it says yes, we are done.
  3. If not, decompose the counterexample to find one new state and one new discriminator.
  4. Incrementally update the hypothesis: re-sift only the stale transitions.
  5. Repeat from step 2.

The loop runs exactly \(n - 1\) times where \(n\) is the number of states in the minimal DFA, one counterexample per new state discovered.



Examples

We test TTT on three targets of increasing complexity.

target 1: strings over {a, b} with an even number of a’s



target 2: strings over {a, b} that end in ‘b’



target 3: binary strings whose value is divisible by 3

This target has no convenient regex, so we write a custom teacher. It is a good stress test: the minimal DFA has exactly 3 states (one per remainder mod 3), the alphabet is {0, 1}, and transitions are determined by how reading a bit updates the current value modulo 3. This exercises TTT on a target where the states correspond to arithmetic structure rather than string patterns.



Test this.



Evaluating Model Accuracy

We measure precision and recall by cross-fuzzing the target grammar and the inferred grammar. Precision is the fraction of strings generated by the inferred DFA that the target accepts. Recall is the fraction of strings generated by the target that the inferred DFA accepts.

The inferred DFA may contain a dead/sink state: a non-accepting state with no exit, representing strings the target permanently rejects. Such a state causes LimitFuzzer to loop, because the grammar has no finite derivation from it. We remove dead states before fuzzing using fuzzer.compute_cost, which assigns each nonterminal the minimum number of steps needed to reach a terminal string. Any nonterminal with infinite cost is a dead state; we remove it and all rules that reference it.



We define a match helper that wraps the Earley parser in a boolean check.



Testing Each pair is (regex, alphabet). Cases cover a range of DFA shapes: two-segment and three-segment chains, prefix-anchored, suffix-anchored, substring-containment, exact-alternation, and disjoint finite sets.



Comparison with L*

  L* TTT
Data structure Flat observation table Discrimination tree (Kearns-Vazirani)
Counterexample processing Add all \(k\) suffixes Binary search for 1 suffix (Rivest-Schapire)
Prefix transformation No Yes (minimal access sequences, TTT)
Discriminator finalization No Yes (shallow DT, TTT)
Redundant queries Many None: the DT structure prevents re-querying known distinctions
Closedness check Explicit global scan Lazy, local (open transitions)
Consistency check Explicit global scan Structurally prevented by DT

The DT depth is bounded by \(n\) (one split per state), so sifting never becomes expensive. This makes TTT the preferred algorithm when membership queries are costly, as is typical when learning protocol implementations or library APIs through testing.

References

Artifacts

The runnable Python source for this notebook is available here.

  1. Michael Kearns and Umesh Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994. pp. 44-58.  2

  2. Ronald L. Rivest and Robert E. Schapire. Inference of Finite Automata Using Homing Sequences. Information and Computation, 103(2):51-73, 1993.  2

  3. Malte Isberner, Falk Howar, and Bernhard Steffen. The TTT Algorithm: A Redundancy-Free Approach to Active Automata Learning. RV 2014.  2

  4. Falk Howar and Bernhard Steffen. Active Automata Learning in Practice. Springer, 2022. 

  5. Markus Frohme. Active Automata Learning with Adaptive Distinguishing Sequences. Master Thesis, TU Dortmund, 2015. https://arxiv.org/abs/1902.01139