McNaughton and Yamada’s algorithm

Given an automaton with n states, whose states are numbered from 1 to n, we give an expression for the compound languages of the words that label the paths from i to j, for any pair i, j. This expression is built by recurrence by means of a condition on the paths. This condition states that the paths only pass through certain authorized states.

Since our automaton has n states, we focus on the construction of a regular expression for each of the n×n languages Ls,s’ = { w∈X* | s.w = s‘ } formed of all words that send any state s to another state s’. If we succeed, the language recognized by the automaton will be the value of a finite union of regular expressions so a regular expression.

The idea is to proceed step by step taking into account a growing number of intermediate states between s and s’. The states are supposed numbered from 1 to n, we start with no intermediate state, only s1, then s1 and s2, and so on.

Let Ls,s’(k) be the expression for the set of words that label paths from s to s’ and whose all intermediate vertices are less than or equal to k. The starting and ending vertices s and s’ are not intermediate, so they are not constrained to be less than or equal to k.

For k = 0, since vertices are numbered from 1, the constraint simply expresses that there is no intermediate vertex. The only paths are transitions from s to s’.

So we have with s = p and s’ = q:


For the recurrence, consider a path from s to s’ whose intermediate vertices are smaller than k. Two cases are then possible:

  1. the intermediate vertices are smaller than k-1, then the label is in Ls,s’(k-1)
  2. the path goes through state k. We then break the path into parts whose intermediate vertices are smaller than k-1. For this, we consider each occurrence of the vertex k in this path: between two consecutive occurrences, the intermediate vertices are smaller than k-1. We then have the formula with s = p and s’ = q: langage41.

So there are n + 1 steps. Each step requires the computation of n² expressions, and the size of the expressions themselves increases with k.


Take the following automaton:


With k = 0 we have:


Then k = 1:


Then k = 2:


Then k = 3:


For the last step we simply calculate two expressions of interest (initial states to final states):


The language is: ab(a + b)∗ + (b + aa)a∗.