Exercise 2.16

Show that the class of context-free languages is closed under the regular operations, union, concatenation, and star.

After the meeting with Calzavara, we'll rewrite the proofs using the grammars instead of PDAs

Union

Let A and B be CFLs. Then there exist CFGs G and H such that L(G) = A and L(H) = B. Let S₁ and S₂ be the start symbols of G and H, respectively.

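We can build a new CFG with a fresh start symbol S and the rule S → S₁ | S₂, together with all the productions of G and H.
Before merging the rule sets, we must ensure that G and H have no variables (nonterminals) in common, renaming any shared names. The terminals can stay shared, since they are the symbols of the strings themselves. The new grammar then derives a string exactly when G or H does, so it generates A ∪ B.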
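
The construction can be sketched in Python. This is only an illustration under assumptions that are not part of the exercise: the dictionary encoding of a grammar, the helper names `rename_variables` and `union_grammar`, the suffixes `_1` and `_2`, and the two sample grammars are all hypothetical. The renaming step is what keeps the variable names of G and H apart.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: each variable maps to its right-hand sides;
# a right-hand side is a tuple of symbols and () stands for ε.
Productions = Dict[str, List[Tuple[str, ...]]]
Grammar = Tuple[Productions, str]  # (productions, start variable)


def rename_variables(grammar: Grammar, suffix: str) -> Grammar:
    """Append a suffix to every variable; terminals are left untouched."""
    productions, start = grammar
    variables = set(productions)  # assumption: every variable has a rule

    def tag(symbol: str) -> str:
        return symbol + suffix if symbol in variables else symbol

    renamed = {
        tag(head): [tuple(tag(s) for s in body) for body in bodies]
        for head, bodies in productions.items()
    }
    return renamed, tag(start)


def union_grammar(g: Grammar, h: Grammar, new_start: str = "S") -> Grammar:
    """Build a grammar for L(G) ∪ L(H) by adding S → S_1 | S_2."""
    g_rules, s1 = rename_variables(g, "_1")
    h_rules, s2 = rename_variables(h, "_2")
    rules: Productions = {new_start: [(s1,), (s2,)]}
    rules.update(g_rules)
    rules.update(h_rules)
    return rules, new_start


# G generates a^n b^n (n >= 0), H generates c^n (n >= 1); both reuse the name "S".
G: Grammar = ({"S": [("a", "S", "b"), ()]}, "S")
H: Grammar = ({"S": [("c", "S"), ("c",)]}, "S")
UNION_RULES, UNION_START = union_grammar(G, H)
```

Both sample grammars call their start variable `S`, so merging them without renaming would mix their rules. After renaming, the result holds the three rule sets for `S`, `S_1` and `S_2` side by side.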

Concatenation

Let A and B be CFLs. Then there exist CFGs G and H such that L(G) = A and L(H) = B. Let S₁ and S₂ be the start symbols of G and H, respectively.

We can build a new CFG with a fresh start symbol S and the rule S → S₁ S₂, together with all the productions of G and H.
As for the union, we must ensure that G and H have no variables (nonterminals) in common; the terminals can stay shared.
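
A small worked example shows why the variable names of the two grammars must be kept apart. The grammars below are made up for illustration; both happen to use a variable named X.

```latex
\begin{align*}
G &: \; S_1 \to aX, \quad X \to \varepsilon
  & L(G) &= \{a\} \\
H &: \; S_2 \to bX, \quad X \to b
  & L(H) &= \{bb\} \\
\text{no renaming} &: \; S \to S_1 S_2, \quad S_1 \to aX, \quad S_2 \to bX, \quad X \to \varepsilon \mid b
  & L &= \{ab,\ abb,\ abbb\} \\
\text{with renaming} &: \; S \to S_1 S_2, \quad S_1 \to aX_1, \quad S_2 \to bX_2, \quad X_1 \to \varepsilon, \quad X_2 \to b
  & L &= \{abb\} = L(G) \circ L(H)
\end{align*}
```

Without renaming, the two X rules merge and each occurrence of X can derive either ε or b, which adds the strings ab and abbb to the language.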

Star

Let A be a CFL. Then there exists a CFG G such that L(G) = A. Let S₁ be the start symbol of G.

We can build a new CFG with a fresh start symbol S and the rule S → S S₁ | ε, together with all the productions of G.

The ε alternative guarantees that the recursion ends and that the new CFG generates the empty string.
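
As an illustration, here is how the new grammar derives a string made of three pieces w₁ w₂ w₃, each in A (the count three and the names wᵢ are chosen only for this example):

```latex
\begin{align*}
S &\Rightarrow S S_1 \Rightarrow S S_1 S_1 \Rightarrow S S_1 S_1 S_1
  && \text{apply } S \to S S_1 \text{ three times} \\
  &\Rightarrow S_1 S_1 S_1
  && \text{apply } S \to \varepsilon \\
  &\Rightarrow^{*} w_1 w_2 w_3
  && \text{each } S_1 \Rightarrow^{*} w_i \text{ in } G
\end{align*}
```

Applying S → S S₁ exactly k times before S → ε gives k copies of S₁, and k = 0 gives the empty string.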

OLD VERSION

Union

Proof idea

The union of two PDAs can be built like the union of two NFAs: a new start state points to both old start states with ε arrows. What about the stack? Since the stack is empty at the new start state (by the definition of a PDA), the two arrows must neither read from nor write to the stack, so that each old start state is entered with an empty stack.

The union of two NFAs, used to prove the closure for the CFLs

Proof

Let A₁, A₂ be two context-free languages. Since A₁ and A₂ are context-free, there exist two PDAs N₁, N₂ that recognize them. We define N₁, N₂ as follows (assuming Q₁ ∩ Q₂ = ∅, renaming states if needed):

N₁ = (Q₁, Σ, Γ₁, δ₁, q_1-start, F₁), with F₁ ⊆ Q₁

N₂ = (Q₂, Σ, Γ₂, δ₂, q_2-start, F₂), with F₂ ⊆ Q₂

Construct N that recognizes A₁ ∪ A₂ as
N = (Q, Σ, Γ, δ, q_start, F), with F ⊆ Q,
where

Q = Q₁ ∪ Q₂ ∪ {q_start}

Γ = Γ₁ ∪ Γ₂

q_start = "new" start state for N

F = F₁ ∪ F₂

δ(q, a, b) = (for a ∈ Σ ∪ {ε} and b ∈ Γ ∪ {ε})

δ₁(q, a, b) if q ∈ Q₁
δ₂(q, a, b) if q ∈ Q₂
{(q_1-start, ε), (q_2-start, ε)} if q = q_start and a = ε and b = ε
∅ if q = q_start and (a ≠ ε or b ≠ ε)

Note that the stack is managed independently for each automaton N₁ and N₂
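
The same construction can be sketched in Python. The encoding is an assumption made for this example only: a transition function is a dictionary from (state, input symbol, popped symbol) to a set of (next state, pushed symbol) pairs, the empty string stands for ε, and the function name `union_pda` and the two toy automata are hypothetical. The sketch also assumes that the two state sets are disjoint and that `q_start` is a fresh name.

```python
from typing import Dict, Set, Tuple

EPS = ""  # stands for ε, both as input symbol and as stack symbol

# (state, input or ε, popped or ε) -> set of (next state, pushed or ε)
Delta = Dict[Tuple[str, str, str], Set[Tuple[str, str]]]


def union_pda(delta1: Delta, start1: str, finals1: Set[str],
              delta2: Delta, start2: str, finals2: Set[str],
              new_start: str = "q_start"):
    """Combine two PDAs with disjoint state sets into one PDA for the union."""
    delta: Delta = {}
    delta.update(delta1)  # moves of N1 are kept unchanged
    delta.update(delta2)  # moves of N2 are kept unchanged
    # The only new move: read no input, pop nothing, push nothing.
    delta[(new_start, EPS, EPS)] = {(start1, EPS), (start2, EPS)}
    return delta, new_start, finals1 | finals2


# Two toy automata: N1 loops on "a" in state p0, N2 loops on "b" in state r0.
DELTA1: Delta = {("p0", "a", EPS): {("p0", EPS)}}
DELTA2: Delta = {("r0", "b", EPS): {("r0", EPS)}}
DELTA, START, FINALS = union_pda(DELTA1, "p0", {"p0"}, DELTA2, "r0", {"r0"})
```

Every key that mentions `q_start` with a non-ε input or stack symbol is simply absent from the dictionary, which plays the role of the ∅ case.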

Concatenation

Proof idea

As for the union, we can start from the concatenation proof for NFAs, which links every accept state of the "first" NFA to the start state of the "second" with an ε arrow.

The concatenation of two NFAs, used to prove the closure of the CFLs

Can we be sure that the stack is empty whenever the first PDA reaches an accept state? In general, no: a PDA accepts by reaching an accept state, and its definition places no condition on the stack contents at that point.

So we must add a state that loops over the stack in order to empty it. This state, reading no input, pops every symbol in Γ except $, where $ is the marker for the bottom of the stack; when $ is on top, it pops it and moves to the start state of the second PDA. This is a little tricky, and it assumes that the first PDA pushes $ before doing anything else, but it guarantees that the stack is empty when the second PDA starts.

Proof

Let A₁, A₂ be two context-free languages. Since A₁ and A₂ are context-free, there exist two PDAs N₁, N₂ that recognize them. We define N₁, N₂ as follows (assuming Q₁ ∩ Q₂ = ∅, renaming states if needed):

N₁ = (Q₁, Σ, Γ₁, δ₁, q_1-start, F₁), with F₁ ⊆ Q₁

N₂ = (Q₂, Σ, Γ₂, δ₂, q_2-start, F₂), with F₂ ⊆ Q₂

Construct N that recognizes A₁ ○ A₂ as
N = (Q, Σ, Γ, δ, q_start, F), with F ⊆ Q,
where

Q = Q₁ ∪ Q₂ ∪ {q_discard}

Γ = Γ₁ ∪ Γ₂

q_start = q_1-start

F = F₂

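δ(q, a, b) = (for a ∈ Σ ∪ {ε} and b ∈ Γ ∪ {ε})

δ₁(q, a, b) if q ∈ Q₁ and q ∉ F₁
δ₁(q, a, b) if q ∈ F₁ and a ≠ ε
δ₁(q, a, b) ∪ {(q_discard, ε)} if q ∈ F₁ and a = ε and b ≠ $
{(q_discard, ε)} if q = q_discard and a = ε and b ≠ $
δ₁(q, a, b) ∪ {(q_2-start, ε)} if q ∈ F₁ and a = ε and b = $
{(q_2-start, ε)} if q = q_discard and a = ε and b = $
∅ if q = q_discard and a ≠ ε
δ₂(q, a, b) if q ∈ Q₂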

Star

Proof idea

As for the union, we can start from the star proof for NFAs, which consists of two steps:

create a new start state (accepting) that goes to the old start state with an ε arrow
create an ε arrow from each accept state to the old start state

As for the concatenation, we must be sure that the stack is empty before going back to the old start state.

The star of an NFA, used to prove the closure of the CFLs

So, before returning to the old start state, we must add a new state q_discard that pops all the symbols in Γ except $.

Proof

Let A₁ be a context-free language. Since A₁ is context-free, there exists a PDA N₁ that recognizes it. We define N₁ as follows:

N₁ = (Q₁, Σ, Γ₁, δ₁, q_1-start, F₁), with F₁ ⊆ Q₁

Construct N that recognizes A₁* as
N = (Q, Σ, Γ, δ, q_start, F), with F ⊆ Q,
where

Q = Q₁ ∪ {q_start, q_discard}

Γ = Γ₁

q_start is the new start

F = F₁ ∪ {q_start}

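δ(q, a, b) = (for a ∈ Σ ∪ {ε} and b ∈ Γ ∪ {ε})

δ₁(q, a, b) if q ∈ Q₁ and q ∉ F₁
δ₁(q, a, b) if q ∈ F₁ and a ≠ ε
δ₁(q, a, b) ∪ {(q_discard, ε)} if q ∈ F₁ and a = ε and b ≠ $
{(q_discard, ε)} if q = q_discard and a = ε and b ≠ $
δ₁(q, a, b) ∪ {(q_1-start, ε)} if q ∈ F₁ and a = ε and b = $
{(q_1-start, ε)} if q = q_discard and a = ε and b = $
{(q_1-start, ε)} if q = q_start and a = ε and b = ε
∅ if q = q_start and (a ≠ ε or b ≠ ε), or if q = q_discard and a ≠ ε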