The algorithm uses the graphical representation of the automaton intensively. The automaton itself is generalized, allowing, as labels transitions, not only letters, but regular expressions. Starting from a finite automaton, one gradually eliminates the states, and at the end, it left an automaton having only one transition. The label of this transition is a regular expression for the language recognized by the automaton.
A generalized automaton is defined as a traditional non-deterministic finite automaton, with the following particularities:
- it has a single initial state α and a single final state ω
- transitions are labeled with regular expressions
- no transition enters in state α and no transition leaves the final state ω
We can easily transform an automaton A into a generalized automaton: add the states α and ω and ε-transitions of α to the initial states of A, and ε-transitions of the terminal states from A to ω.
Reduction of the automaton
Given a generalized automaton, one calculates a generalized automaton having fewer transitions or states, by applying the transformations below without modifying the recognized language. In the end, only the two states α and ω remain and a transition of α and ω whose label is a regular expression denoting the recognized languages. Transformations are reductions in transitions and state reductions.
To reduce a state, all paths through this state (to an adjacent state) must be performed. The language recognized by this path is then added to the reduced automaton (see the example).
The following example shows the main reductions as well as the regular expressions generated on the automaton:
Which gives the following generalized automaton:
We eliminate q2: the path q0 -> q2 -> q3 gives the language a², so it is added to the path q0 -> q3. The path q3 -> q2 -> q3 gives the language ba or (ba) * because it is a loop on q3 -> q3
We eliminate q1: here the loop on q1 will give a*
We eliminate q0:
We eliminate q3:
Note that the expression obtained also depends on the order of elimination, but that all generated languages are equal.