| Introduction to Automata
Theory, Languages, and Computation
|
Solutions for Chapter 7
Revised 2/18/05.
Solutions for Section 7.1
Solutions for Section 7.2
Solutions for Section 7.3
Solutions for Section 7.4
Solutions for Section 7.1
Exercise 7.1.1
A and C are clearly generating, since they have
productions with terminal bodies.
Then we can discover S is generating because of the production
S->CA, whose body consists of only symbols that are
generating.
However, B is not generating.
Eliminating B, leaves the grammar
S -> CA
A -> a
C -> b
Since S, A, and C are each reachable from S, all
the remaining symbols are useful, and the above grammar is the answer to
the question.
Exercise 7.1.2
Revised 6/27/02.
- a)
- Only S is nullable, so we must choose, at each point where
S occurs in a body, to eliminate it or not.
Since there is no body that consists only of S's, we do not have
to invoke the rule about not eliminating an entire body.
The resulting grammar:
S -> ASB | AB
A -> aAS | aA | a
B -> SbS | bS | Sb | b | A | bb
- b)
- The only unit production is B -> A.
Thus, it suffices to replace this body A by the bodies of all the
A-productions.
The result:
S -> ASB | AB
A -> aAS | aA | a
B -> SbS | bS | Sb | b | aAS | aA | a | bb
- c)
- Observe that A and B each derive terminal strings, and
therefore so does S.
Thus, there are no useless symbols.
- d)
- Introduce variables and productions C -> a and D -> b,
and use the new variables in all bodies that are not a single terminal:
S -> ASB | AB
A -> CAS | CA | a
B -> SDS | DS | SD | b | CAS | CA | a | DD
C -> a
D -> b
Finally, there are bodies of length 3; one, CAS, appears twice.
Introduce new variables E, F, and G to split these bodies,
yielding the CNF grammar:
S -> AE | AB
A -> CF | CA | a
B -> SG | DS | SD | b | CF | CA | a | DD
C -> a
D -> b
E -> SB
F -> AS
G -> DS
Exercise 7.1.10
It's not possible.
The reason is that an easy induction on the number of steps in a
derivation shows that every sentential form has odd length.
Thus, it is not possible to find such a grammar for a language as simple
as {00}.
To see why, suppose we begin with start symbol S and try to pick
a first production.
If we pick a production with a single terminal as body, we derive a
string of length 1 and are done.
If we pick a body with three variables, then, since there is no way for
a variable to derive epsilon, we are forced to derive a string of length
3 or more.
Exercise 7.1.11(b)
The statement of the entire construction may be a bit tricky, since you
need to use the construction of part (c) in (b), although we are not
publishing the solution to (c).
The construction for (b) is by induction on i, but it needs to be
of the stronger statement that if an Ai-production has a body
beginning with Aj, then j > i (i.e., we use part (c)
to eliminate the possibility that i=j).
Basis:
For i = 1 we simply apply the construction of (c) for i =
1.
Induction:
If there is any production of the form Ai -> A1..., use the
construction of (a) to replace A1.
That gives us a situation where all Ai production bodies begin
with at least A2 or a terminal.
Similarly, replace initial A2's using (a), to make A3
the lowest possible variable beginning an Ai-production.
In this manner, we eventually guarantee that the body of each
Ai-production either begins with a terminal or with Aj,
for some j >= i.
A use of the construction from (c) eliminates the possibility that i
= j.
Exercise 7.1.11(d)
As per the hint, we do a backwards induction on i, that the
bodies of Ai productions can be made to begin with terminals.
Basis:
For i = k, there is nothing to do, since there are no variables
with index higher than k to begin the body.
Induction:
Assume the statement for indexes greater than i.
If an Ai-production begins with a variable, it must be
Aj for some j > i.
By the induction hypothesis, the Aj-productions all have bodies
beginning with terminals now.
Thus, we may use the construction (a) to replace the initial Aj,
yielding only Ai-productions whose bodies begin with terminals.
After fixing all the Ai-productions for all i, it is time
to work on the Bi-productions.
Since these have bodies that begin with either terminals or Aj
for some j, and the latter variables have only bodies that begin
with terminals, application of construction (a) fixes the Bj's.
Return to Top
Solutions for Section 7.2
Exercise 7.2.1(a)
Let n be the pumping-lemma constant and consider string z =
anbn+1cn+2.
We may write z = uvwxy, where v and x, may be
``pumped,'' and |vwx| <= n.
If vwx does not have c's, then uv3wx3y has at
least n+2 a's or b's, and thus could not be in the
language.
If vwx has a c, then it could not have an a,
because its length is limited to n.
Thus, uwy has n a's, but no more than 2n+2
b's and c's in total.
Thus, it is not possible that uwy has more b's than
a's and also has more c's than b's.
We conclude that uwy is not in the language, and now have a
contradiction no matter how z is broken into uvwxy.
Exercise 7.2.1(d)
Let n be the pumping-lemma constant and consider
z = 0n1n2.
We break Z = uvwxy according to the pumping lemma.
If vwx consists only of 0's, then uwy has n2 1's
and fewer than n 0's; it is not in the language.
If vwx has only 1's, then we derive a contradiction similarly.
If either v or x has both 0's and 1's, then
uv2wx2y is not in 0*1*, and thus could not be in the
language.
Finally, consider the case where v consists of 0's only, say
k 0's, and x consists of m 1's only, where k
and m are both positive.
Then for all i, uvi+1wxi+1y consists of
n + ik 0's and n2 + im 1's.
If the number of 1's is always to be the square of the number of 0's, we
must have, for some positive k and m:
(n+ik)2 = n2 + im, or
2ink + i2k2 = im.
But the left side grows quadratically in i, while the right side
grows linearly, and so this equality for all i is impossible.
We conclude that for at least some i, uvi+1wxi+1y is
not in the language and have thus derived a contradiction in all cases.
Exercise 7.2.2(b)
It could be that, when the adversary breaks z = uvwxy, v =
0k and x = 1k.
Then, for all i, uviwxiy is in the language.
Exercise 7.2.2(c)
The adversary could choose z = uvwxy so that v and
x are single symbols, on either side of the center.
That is, |u| = |y|, and w is either epsilon (if z
is of even length) or the single, middle symbol (if z is of odd
length).
Since z is a palindrome, v and x will be the same
symbol.
Then uviwxiy is always a palindrome.
Exercise 7.2.4
The hint turns out to be a bad one.
The easiest way to prove this result starts with a string
z = 0n1n0n1n where the
middle two blocks are distinguished.
Note that vwx cannot include 1's from the second block and also
1's from the fourth block, because then vwx would have all
n distinguished 0's and thus
at least n+1 distinguished symbols.
Likewise, it cannot have 0's from both blocks of 0's.
Thus, when we pump v and x, we must get an imbalance
between the blocks of 1's or the blocks of 0's, yielding a string not in
the language.
Return to Top
Solutions for Section 7.3
Exercise 7.3.1(a)
For each variable A of the original grammar G, let A' be a
new variable that generates init of what A generates.
Thus, if S is the start symbol of G,
we make S' the new start
symbol.
If A -> BC is a production of G, then in the new grammar
we have A -> BC, A' -> BC', and A' -> B'.
If A -> a is a production of G, then the new grammar has
A -> a, A' -> a, and A' -> epsilon.
Exercise 7.3.1(b)
The construction is similar to that of part (a), but now A' must
be designed to generate string w if and only if A
generates wa.
That is, A''s language is the result of applying /a to
A's language.
If G has production A -> BC, then the new grammar has A
-> BC and A' -> BC'.
If G has A -> b for some b != a, then the new
grammar has A -> b, but we do not add any production for
A'.
If G has A -> a, then the new grammar has A -> a
and A' -> epsilon.
Exercise 7.3.3(a)
Consider the language L = {aibjck | 1 <= i and 1 <= j and (i
<= k or j <=
k)}.
L is easily seen to be a CFL; you can design a PDA that guesses
whether to compare the a's or b's with the c's.
However, min(L) = {aibjck | k = min(i,j)}.
It is also easy to show, using the pumping lemma, that this language is
not a CFL.
Let n be the pumping-lemma constant, and consider z =
anbncn.
Exercise 7.3.4(b)
If we start with a string of the form 0n1n and intersperse any
number of 0's, we can obtain any string of 0's and 1's that begins with
at least as many 0's as there are 1's in the entire string.
Exercise 7.3.4(c)
Given DFA's for L1 and L2, we can construct an NFA for
their shuffle by using the product of the two sets of states, that is,
all states [p,q] such that p is a state of the
automaton for L1, and q is a state of the automaton for
L2.
The start state of the automaton for the shuffle consists of the start
states of the two automata, and its accepting states consist of pairs of
accepting states, one from each DFA.
The NFA for the shuffle guesses, at each input, whether it is from
L1 or L2.
More formally, δ([p,q],a) = {[δ1(p,a),q], [p,δ2(q,a)]},
where δi is the transition function for the DFA for
Li (i = 1 or 2).
It is then an easy induction on the length of w that
δ-hat([p0,q0],w) contains [p,q] if and only if w
is the shuffle of some x and y, where δ1-hat(p0,x) =
p and δ2-hat(q0,y) = q.
Exercise 7.3.5
- a)
-
Consider the language of regular expression (0l)*.
Its permutations consist of all strings with an equal number of 0's and
1's, which is easily shown not regular.
In proof, use the pumping lemma for regular languages, let n be
the pumping-lemma constant, and consider string 0n1n.
- b)
-
The language of (012)* serves.
Its permutations are all strings with an equal number of 0's 1's, and
2's.
We can prove this language not to be a CFL by using the pumping lemma on
0n1n2n, where n is the pumping-lemma constant.
- c)
-
Assume the alphabet of regular language L is {0,1}.
We can design a PDA P to recognize perm(L), as follows.
P simulates the DFA A for L on an input string that
it guesses.
However, P must also check that its own input is a permutation of
the guessed string.
Thus, each time P guesses an input for A, it also reads
one of its own symbols.
P uses its stack to remember whether it has seen more 0's than it
has guessed, or seen more 1's than it has guessed.
It does so by keeping a stack string with a bottom-of-stack marker and
either as many more 0's as it has seen than guessed, or as many more 1's
as it has seen than guessed.
For instance, if P guesses 0 as an input for A but sees a
1 on its own input, then P:
- If 0 is the top stack symbol, then push another 0.
- If 1 is the top stack symbol, then pop the stack.
- If Z0, the bottom-of-stack marker is on top, push a 0.
In addition, if P exposes the bottom-of-stack marker, then it has
guessed, as input to A, a permutation of the input P has
seen.
Thus, if A is in an accepting state, P has a choice of
move to pop its stack on epsilon input, thus accepting by empty stack.
Return to Top
Solutions for Section 7.4
Exercise 7.4.1(a)
If there is any string at all that can be ``pumped,'' then the language
is infinite.
Thus, let n be the pumping-lemma constant.
If there are no strings as long as n, then surely the language is
finite.
However, how do we tell if there is some string of length n or
more?
If we had to consider all such strings, we'd never get done, and that
would not give us a decision algorithm.
The trick is to realize that if there is any string of length n
or more, then there will be one whose length is in the range n
through 2n-1, inclusive.
For suppose not.
Let z be a string that is as short as possible, subject to the
constraint that |z| >= n.
If |z| < 2n, we are done; we have found a string in the
desired length range.
If |z| >= 2n, use the pumping lemma to write z = uvwxy.
We know uwy is also in the language, but because |vwx| <=
n, we know |z| > |uwy| >= n.
That contradicts our assumption that z was as short as possible
among strings of length n or more in the language.
We conclude that |z| < 2n.
Thus, our algorithm to test finiteness is to test membership of all
strings of length between n and 2n-1.
If we find one, the language is infinite, and if not, then the language
is finite.
Exercise 7.4.3(a)
Here is the table:
{S,A,C}
{B} {B}
{B} {S,C} {B}
{S,C} {S,A} {S,C} {S,A}
{A,C} {B} {A,C} {B} {A,C}
------------------------------------------
a b a b a
Since S appears in the upper-left corner, ababa is in the
language.
Exercise 7.4.4
The proof is an induction on n that if A =>* w, for any
variable A, and |w|
= n, then all parse trees with A at the root and yield
w have 2n-1 interior nodes.
Basis:
n = 1.
The parse tree must have a root with variable A and a leaf with
one terminal.
This tree has 2n-1 = 1 interior node.
Induction:
Assume the statement for strings of length less than n, and let
n > 1.
Then the parse tree begins with A at the root and two children
labeled by variables B and C.
Then we can write w = xy, where B =>* x and C =>* y.
Also, x and y are each shorter than length n, so
the inductive hypothesis applies to them, and we know that the parse
trees for these derivations have, respectively, 2|x|-1 and
2|y|-1 interior nodes.
Thus, the parse tree for A =>* w has one (for the root) plus the
sum of these two quantities number of interior nodes, or
2(|x|+|y|-1) interior nodes.
Since |x|+|y| = |w| = n, we are done; the parse tree for A =>*
w has 2n-1 interior nodes.
Return to Top