- Conjunction of two regular grammars
- Disjunction of two regular grammars
- Complement of regular grammars
Important: Pyodide takes time to initialize. Initialization completion is indicated by a red border around Run all button.
In the previous post I showed how to produce a grammar out of regular expressions. In the second post, I claimed that we need a regular grammar because regular grammars have more interesting properties such as being closed under intersection and complement. Now, the question is how do we actually do the intersection between two regular grammars? For this post, I assume that the regular expressions are in the canonical format as given in this post. That is, there are only two possible rule formats \(A \rightarrow a B\) and \(A \rightarrow \epsilon\). Further, the canonical format requires that there is only one rule in a nonterminal definition that starts with a particular terminal symbol. Refer to this post for how convert any regular grammar to the canonical format. We start with importing the prerequisites
System ImportsThese are available from Pyodide, but you may wish to make sure that they are installed if you are attempting to run the program directly on the machine.
Available PackagesThese are packages that refer either to my previous posts or to pure python packages that I have compiled, and is available in the below locations. As before, install them if you need to run the program directly on the machine.
The imported modules
There are only two patterns of rules in canonical regular grammars.
- \[A \rightarrow aB\]
- \[A \rightarrow \epsilon\]
The idea is that when evaluating intersection of the start symbol, pair up all rules that start with the same terminal symbol. Only the intersection of these would exist in the resulting grammar. The intersection of
<A1> ::= a <B1>
<A2> ::= a <B2>
<and(A1,A2)> ::= a <and(B1,B2)>
For constructing such rules, we also need to parse the boolean expressions in the nonterminals. So, we define our grammar first.
Next, we define our expression class which is used to wrap boolean expressions and extract components of boolean expressions.
Ensure that it can parse boolean expressions in nonterminals.
Conjunction of two regular grammars
Next, we define the conjunction of two regular grammars in the canonical format. We will define the conjunction of two definitions, and at the end discuss how to stitch it together for the complete grammar. The nice thing here is that, because our grammar is in the canonical format, conjunction disjunction and negation is really simple, and follows roughly the same framework.
Conjunction of nonterminal symbols
Ensure that it works
Conjunction of rules
We only provide conjunction for those rules whose initial terminal symbols are the same or it is an empty rule.
We check to make sure that conjunction of rules work.
Conjunction of definitions
Checking that conjunction of definitions work.
Disjunction of two regular grammars
For disjunction, the strategy is the same. We line up rules in both definitions with same starting symbol, then construct the conjunction of the nonterminal parts. Unlike conjunction, however, we do not throw away the rules without partners in the other definition. Instead, they are added to the resulting definition.
Disjunction of nonterminal symbols
Ensure that it works
Disjunction of rules
We assume that the starting terminal symbol is the same for both rules.
We check to make sure that disjunction for rules work.
Disjunction of definitions
We check that disjunction of definitions work.
Complement of regular grammars
For complement, the idea is to treat each pattern separately. We take the definition of each nonterminal separately.
- If the nonterminal definition does not contain \(\epsilon\), we add
NT_EMPTYto the resulting definition. If it contains, then we skip it.
- We collect all terminal symbols that start up a rule in the definition. For each such rule, we add a rule that complements the nonterminal used. That is, given
<A> := a <B>
<neg(A)> := a <neg(B)>
as one of the complement rules.
- For every remaining terminal in the
TERMINAL_SYMBOLS, we add a match for any string given by
We start by producing the complement of a single nonterminal symbol.
Ensure that it works
Complement of a single rule
Ensure that it works
Complement a definition
We complement a definition by applying complement to each rule in the original definition, and adding any new rules that did not match the original definition.
Checking complement of a definition in an example.
Until now, we have only produced conjunction, disjunction, and complement for definitions. When producing these, we have introduced new nonterminals in definitions that are not yet defined. For producing a complete grammar, we need to define these new nonterminals too. This is what we will do in this section. We first define a few helper procedures.
remove_empty_defs() recursively removes any nonterminal that has empty
definitions. That is, of the form
"<A>" : . Note that it is different from
an epsilon rule which is
"<A>" : []
Next, we define
complete() which recursively computes the complex
nonterminals that is left undefined in a grammar from the simpler
That is, for any conjunction, disjunction, or negation of grammars, we start
at the start symbol, and produce the corresponding operation in the definition
of the start symbol. Then, we check if any new new nonterminal was used in any
of the rules. If any were used, we recursively define them using the
nonterminals already present in the grammar. This is very similar to the
ReconstructRules from fault expressions
for context-free grammars, but is also different enough. Hence, we define a
completely new class.
We start with reconstructing a single key. For example, given the two grammars
G2, and their start symbols
S2, to compute an intersection
G1 & G2, we simply reconstruct
<and(S1,S2)> from the two grammars, and
recursively define any undefined nonterminals.
Given a complex boolean expression, construct the definition for it from the grammar rules.
Produce disjunction of grammars
Produce conjunction of grammars
We now verify that it works.
Next, we come to complement.
Ensure that negation also works.
The runnable code for this post is available here