Probabilistic Languages and Developmental Systems Jerod Michel 2007

Introduction

In this paper we are concerned with how probabilistic constructs in formal language theory can be used to describe biological development. The constructs in question are those presented in 1969 by Clarence Ellis in Probabilistic Tree Automata [3]. Ellis generalized the structures of formal language theory to the probabilistic case and proposed probabilistic versions of finite state automata and grammars. Brief introductions to classical automata and grammars are given in sections 1 and 2, the equivalence of regular grammars and finite state automata in section 3, probabilistic grammars in section 4, and classical 0L systems in section 5. In section 6 we introduce a new construct, the probabilistic 0L system of definition 6.1. For formal language theory see [1], for 0L systems see [2], and for probabilistic automata and grammars see [3].

1 Finite State Automata

We begin with some basic definitions.

Definition 1.1 An alphabet is any finite set of symbols. A word over an alphabet Σ is any string of finite length composed of symbols from Σ. The length |w| of a word w is the number of symbols of which it is composed. The empty word ε is the word consisting of no symbols. If Σ is an alphabet, then Σ* denotes the set of all words over Σ, including the empty word. We use Σ+ to denote Σ* − {ε}, i.e., the set of all non-empty words. A language is any set of words over an alphabet.

Definition 1.2 A finite automaton is a 5-tuple M = (Q, Σ, δ, q0, F), where
1. Q is a finite set of states.
2. Σ is an alphabet.
3. δ : Q × Σ → 2^Q is a transition function, where 2^Q denotes the family of subsets of Q.
4. q0 ∈ Q is the initial state of M.
5. F ⊂ Q is the set of final states of M.

We say M is deterministic if δ(q, a) is a singleton for each q ∈ Q and each a ∈ Σ. Otherwise M is called nondeterministic.

A finite automaton M = (Q, Σ, δ, q0, F) may be expressed pictorially by a directed graph whose vertices are labelled by states in Q and whose directed arcs are labelled by symbols of Σ, such that (q, q′) is an arc with label a if q′ ∈ δ(q, a). All vertices labelled by elements of F are bold-circled, and the start state is indicated by an arrow pointing to it. Such a picture is referred to as the state diagram of M.

Example 1.1 Let M = ({S, A}, {a, b}, δ, S, {S}) be a finite automaton with δ defined by the following table.

  δ | a | b
  S | S | A
  A | A | S

The state diagram of M is given in figure 1.

Definition 1.3 Let M = (Q, Σ, δ, q0, F) be a finite automaton. A configuration of M is a pair (q, w) in Q × Σ*. A move by M, denoted by ⊢M, is a binary relation on configurations such that (q, aw) ⊢M (q′, w) if q′ ∈ δ(q, a).

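Figure 1: State diagram of the automaton M of Example 1.1 (loops labelled a at S and at A; arcs labelled b between S and A).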
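The move relation of Definition 1.3 can be sketched in a few lines of Python for the automaton of Example 1.1. This is only an illustration under stated assumptions: the names DELTA, moves, and recognizes are ours, and because this particular automaton is deterministic each table entry is stored as a single state rather than as a subset of Q.

```python
# Transition table of Example 1.1. The automaton is deterministic,
# so each entry is one state instead of a set of states.
DELTA = {
    ("S", "a"): "S", ("S", "b"): "A",
    ("A", "a"): "A", ("A", "b"): "S",
}
START, FINAL = "S", {"S"}


def moves(word, state=START):
    """Return the configurations (q, w) visited while reading word."""
    configs = [(state, word)]
    while word:
        # One move: (q, aw) |- (q', w) with q' = delta(q, a).
        state, word = DELTA[(state, word[0])], word[1:]
        configs.append((state, word))
    return configs


def recognizes(word):
    """True if the final configuration is (q, empty word) with q in F."""
    return moves(word)[-1][0] in FINAL


for q, w in moves("abba"):
    print((q, w or "ε"))
print(recognizes("abba"), recognizes("ab"))
```

Since S is both the start state and the only final state, and only the symbol b changes the state, this sketch accepts exactly the words containing an even number of b's.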

For the automaton in example 1.1, notice M reads the word aabbaa in the sequence of moves

(S, aabbaa) ⊢M (S, abbaa) ⊢M (S, bbaa) ⊢M (A, baa) ⊢M (S, aa) ⊢M (S, a) ⊢M (S, ε).

Definition 1.4 A word w ∈ Σ* is said to be recognized by a finite automaton M = (Q, Σ, δ, q0, F) if and only if there exists a sequence of moves (q0, w) ⊢M ... ⊢M (q, ε), for some q ∈ F. The language recognized by M, denoted Λ(M), is the set of all words in Σ* recognized by M. A language is regular if and only if it is recognized by some finite automaton.

It follows from the definition above that a word w is recognized by M if, starting in the initial state, M begins reading at the leftmost symbol of w and eventually moves past the rightmost symbol while entering a final state. Also note that a nondeterministic finite automaton recognizes a word w if there exists at least one sequence of moves from the initial configuration (q0, w) to a final configuration (q, ε) with q ∈ F. Deterministic and nondeterministic finite automata recognize the same family of languages, as the following theorem shows.

Theorem 1.1 For every finite automaton M, there is a deterministic finite automaton M′ with Λ(M) = Λ(M′).

Proof: Let M = (Q, Σ, δ, q0, F) be a (possibly nondeterministic) finite automaton. Consider the deterministic finite automaton M′ = (2^Q, Σ, δ′, {q0}, F′), where F′ consists of all subsets K of Q such that K ∩ F is nonempty, and δ′ is given by

\[
\delta'(K, a) = \bigcup_{q \in K} \delta(q, a) = \{\, q' \in Q : q' \in \delta(q, a) \text{ for some } q \in K \,\},
\]

which is ∅ when K = ∅ or when δ(q, a) is not defined for any q ∈ K.

One can see that Λ(M) = Λ(M′), and the proof is complete.

Finite automata describe regular languages by recognizing them. They are, in general, not effective in determining whether a given language is regular. The following theorem, referred to as the "pumping lemma", serves this purpose and is typically used to show that a language is not regular. It utilizes a property of regular languages: if a sufficiently long word is given in a regular language, then one can always select a subword in it which can be repeated (or pumped) indefinitely such that the resulting string is still in the language. The result is illustrated in the following diagram, in which xyz is a word in a regular language. Clearly, xy^i z is also in the language, for i ≥ 0.

Figure 2: State diagram with states q0, q1, q2 and arcs labelled x, y, z, illustrating the pumping of the word xyz.

Theorem 1.2 For each regular language Λ, there exists a nonnegative integer p such that if w ∈ Λ and |w| ≥ p, then w can be written as xyz, where 0 < |y| ≤ p and xy^i z ∈ Λ for all i ≥ 0.

Proof: Let M = (Q, Σ, δ, q1, F) be a deterministic finite automaton recognizing Λ and let p be the number of states of M. Let s = s1 s2 ... sn be a word in Λ of length n, where n ≥ p. Let r1, ..., rn+1 be the sequence of states that M enters when processing s, so r1 = q1 and ri+1 = δ(ri, si) for 1 ≤ i ≤ n. This sequence has length n + 1, which is at least p + 1. Among the first p + 1 elements in the sequence, two must be the same state, by the pigeonhole principle. We call the first of these rj and the second rl. Since rl occurs among the first p + 1 places in a sequence starting at r1, we have l ≤ p + 1. Now let x = s1 ... sj−1, y = sj ... sl−1, and z = sl ... sn. As x moves M from r1 to rj, y moves M from rj to rl = rj, and z moves M from rl to rn+1, which is an accept state, M must accept xy^i z for i ≥ 0. Since j ≠ l we have |y| > 0. We also have l ≤ p + 1, whence |xy| = l − 1 ≤ p and in particular |y| ≤ p. This completes the proof.

Example 1.2 We show Λ = {a^n b^n | n ≥ 1} is not regular. Suppose Λ is regular, and let p be as in theorem 1.2. Consider a word w = a^n b^n ∈ Λ such that |w| ≥ p. By theorem 1.2, w = xyz with y ≠ ε and xy^i z ∈ Λ for all i ≥ 0. Notice y can have only one of three forms: a^k, b^k, or a^i b^j, with i, j, k ≥ 1. If y has the form a^k or b^k, then xy^0 z ∉ Λ, since it contains unequal numbers of a's and b's. If y = a^i b^j, then xyyz ∉ Λ, since in it some b precedes an a. Thus Λ cannot be regular.

Context-free languages, which will be introduced in the next section, can also be characterized by recognizing devices. These are called push-down automata. Since context-free languages are not our primary concern, we refer interested readers to [1].

Automata describe languages by recognizing them. Sometimes it is more appropriate to describe languages by generating them. We now turn to grammars, which are objects of this second kind.
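The decomposition used in the proof of Theorem 1.2 can be sketched for the automaton of Example 1.1, which has p = 2 states. The function names run, accepts, and pump_split are ours and purely illustrative; the split is taken at the first repeated state, as in the proof.

```python
# Automaton of Example 1.1 (deterministic, two states).
DELTA = {
    ("S", "a"): "S", ("S", "b"): "A",
    ("A", "a"): "A", ("A", "b"): "S",
}
START, FINAL = "S", {"S"}


def run(word):
    """Sequence of states r1, ..., r(n+1) entered while reading word."""
    states = [START]
    for symbol in word:
        states.append(DELTA[(states[-1], symbol)])
    return states


def accepts(word):
    return run(word)[-1] in FINAL


def pump_split(word):
    """Split word into x, y, z at the first repeated state."""
    seen = {}
    for position, state in enumerate(run(word)):
        if state in seen:
            j, l = seen[state], position
            return word[:j], word[j:l], word[l:]
        seen[state] = position
    raise ValueError("word is shorter than the number of states")


x, y, z = pump_split("bab")
print(repr(x), repr(y), repr(z))
print(all(accepts(x + y * i + z) for i in range(6)))
```

For the word bab the first repeated state is A, so the pumped part is the single symbol a, and every word b a^i b still contains an even number of b's.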


2 Grammars

Definition 2.1 A grammar is a 4-tuple Γ = ⟨V, Σ, P, ω⟩, where V is a finite set of symbols, called variables, Σ is a finite set of symbols, called terminals, P is a finite set of productions, ω is a member of V, called the start symbol, and V and Σ share no symbols. A production is an expression of the form α → β, where α ∈ (V ∪ Σ)+ and β ∈ (V ∪ Σ)*.

Notice the restriction that V and Σ share no symbols. The construct called a 0L system, which we discuss in section 5, is similar to a grammar, except that it draws no such distinction between variables and terminals.

Definition 2.2 Given a grammar Γ = ⟨V, Σ, P, ω⟩ and words ρ, σ ∈ (V ∪ Σ)*, we say ρ directly derives σ in Γ if and only if the following conditions are satisfied:
1. ρ can be written as γαδ,
2. σ can be written as γβδ,
3. α → β is a production in P.

We denote this by

\[ \rho \overset{\Gamma}{\Rightarrow} \sigma. \]

Example 2.1 The following is an example of a grammar, which we call G. Grammar G consists of a set of variables {A, B}, a set of terminals {0, 1, ♦}, a set of productions {A → 0A1, A → B, B → ♦}, and a start symbol A. A brief way of writing {A → 0A1, A → B, B → ♦} is to use the symbol "|" as an "or" operator, that is, {A → 0A1 | B, B → ♦}. This grammar generates the word 000♦111. The sequence of substitutions used to obtain a word is called a derivation. A derivation of the word 000♦111 is

A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000♦111.

Derivations may also be represented pictorially with a parse tree. The vertices of a parse tree are labelled with variables and terminals of the grammar. If an interior vertex is labelled A and the offspring of A are labelled X1, ..., Xn from left to right, then A → X1 ... Xn is a production in the grammar. A parse tree for the derivation of 000♦111 from A is shown in Figure 3.

Figure 3: Parse tree for the derivation of 000♦111 from A in the grammar G of Example 2.1.
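Definition 2.2 can be checked mechanically. The sketch below, which is illustrative only, encodes the productions of grammar G from Example 2.1 as pairs of strings and tests whether each step of the derivation of 000♦111 is a direct derivation. The name directly_derives and the string encoding are our assumptions.

```python
# Productions of grammar G from Example 2.1, written as (alpha, beta).
PRODUCTIONS = [("A", "0A1"), ("A", "B"), ("B", "♦")]


def directly_derives(rho, sigma, productions=PRODUCTIONS):
    """True if rho = γαδ and sigma = γβδ for some production α → β."""
    for alpha, beta in productions:
        start = rho.find(alpha)
        while start != -1:
            # Replace this one occurrence of alpha by beta and compare.
            if rho[:start] + beta + rho[start + len(alpha):] == sigma:
                return True
            start = rho.find(alpha, start + 1)
    return False


chain = ["A", "0A1", "00A11", "000A111", "000B111", "000♦111"]
print(all(directly_derives(r, s) for r, s in zip(chain, chain[1:])))
```

Every occurrence of a left-hand side is tried in turn, so the check does not depend on which position in ρ the production is applied to.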

Definition 2.3 If Γ = ⟨V, Σ, P, ω⟩ is a grammar, the language generated by Γ is defined to be

\[
\Lambda(\Gamma) = \{\, \sigma \mid \sigma \in \Sigma^{*} \text{ and there exists a chain } \omega \overset{\Gamma}{\Rightarrow} \alpha_1 \overset{\Gamma}{\Rightarrow} \cdots \overset{\Gamma}{\Rightarrow} \alpha_m \overset{\Gamma}{\Rightarrow} \sigma \text{ with } m \ge 0 \,\}.
\]

Example 2.2 Consider the grammar Γa = ⟨{X}, {a, b}, {X → aX, X → Xb, X → a}, X⟩. This grammar generates the words a, aa, ab, aab, aabb, .... The language generated by Γa is Λ(Γa) = {a^n b^m | n > 0 and m ≥ 0}.

Definition 2.4 Let Γ = ⟨V, Σ, P, ω⟩ be a grammar. If for every production α → β ∈ P, |α| ≤ |β|, Γ is said to be context-sensitive. If for every production α → β ∈ P, α is a single variable and |β| ≥ 0, Γ is said to be context-free. If every production in P is of the form X → aY or X → a, where X and Y are variables and a ∈ Σ, then Γ is said to be regular.

Example 2.3 Let Γ = ⟨{S, A, B}, {a, b, c}, P, S⟩ where P = {S → abc | aAbc, Ab → bA, Ac → Bbcc, bB → Bb, aB → aa | aaA}. Grammar Γ generates the context-sensitive language {a^n b^n c^n | n ≥ 1}. Notice how the productions are used in the following derivation of a^3 b^3 c^3.

S ⇒ aAbc ⇒ abAc ⇒ abBbcc ⇒ aBbbcc ⇒ aaAbbcc ⇒ aabAbcc ⇒ aabbAcc ⇒ aabbBbccc ⇒ aabBbbccc ⇒ aaBbbbccc ⇒ aaabbbccc

The relationship between the classes of finite, regular, context-free, and context-sensitive languages is given in figure 4.

Figure 4: Nested language classes, from outermost to innermost: context-sensitive, context-free, regular, finite.
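Returning to Example 2.3, the words generated by a context-sensitive grammar can be enumerated up to a length bound by a breadth-first search over sentential forms. The following sketch is an illustration, not a method taken from the sources cited; the names successors and words_up_to are ours. It relies on the fact that no production of this grammar shortens a sentential form, so forms longer than the bound can be discarded.

```python
from collections import deque

# Productions of the grammar in Example 2.3, written as (alpha, beta).
PRODUCTIONS = [
    ("S", "abc"), ("S", "aAbc"), ("Ab", "bA"), ("Ac", "Bbcc"),
    ("bB", "Bb"), ("aB", "aa"), ("aB", "aaA"),
]
TERMINALS = set("abc")


def successors(form):
    """All sentential forms directly derivable from form."""
    for alpha, beta in PRODUCTIONS:
        start = form.find(alpha)
        while start != -1:
            yield form[:start] + beta + form[start + len(alpha):]
            start = form.find(alpha, start + 1)


def words_up_to(max_len, start="S"):
    """Terminal words of length <= max_len, by breadth-first search."""
    seen, queue, words = {start}, deque([start]), set()
    while queue:
        form = queue.popleft()
        if set(form) <= TERMINALS:
            words.add(form)          # only terminals left: a word
            continue
        for nxt in successors(form):
            # |alpha| <= |beta| for every production, so longer forms
            # can never shrink back below the bound.
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(words, key=len)


print(words_up_to(9))
```

With the bound 9 the search finds abc, aabbcc, and aaabbbccc, in agreement with the language {a^n b^n c^n | n ≥ 1}.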

3 Equivalence of Regular Grammars and Finite State Automata

To show that the class of languages generated by regular grammars is the class recognized by finite automata, we start with a language that is recognized by a deterministic finite automaton and show that there is a regular grammar that generates this language.

Theorem 3.1 If Λ is recognized by the deterministic finite automaton M = (Q, Σ, δ, q0, F), then there exists a regular grammar Γ = ⟨V, Σ, P, ω⟩ such that Λ = Λ(Γ).

Proof: We assume that Q = {q0, ..., qn} and Σ = {a1, ..., am}. Construct the regular grammar Γ = ⟨V, Σ, P, ω⟩ as follows. Let V = {q0, ..., qn} and ω = q0. For each transition δ(qi, aj) = qk we let qi → aj qk be in P. In addition, if qk is in F, we let qk → ε be in P.


We now show that Γ can generate every word in Λ. Consider w ∈ Λ of the form w = ai aj ··· ak al. For M to accept this string it must make the moves δ(q0, ai) = qp, δ(qp, aj) = qr, ..., δ(qs, ak) = qt, δ(qt, al) = qf ∈ F. Because of the way we have constructed the grammar, it has one production for each of these transitions, together with the production qf → ε. Thus we can make the derivation

q0 ⇒ ai qp ⇒ ai aj qr ⇒ ··· ⇒ ai aj ··· ak qt ⇒ ai aj ··· ak al qf ⇒ ai aj ··· ak al

in Γ, whence w ∈ Λ(Γ). Conversely, if w ∈ Λ(Γ), then its derivation has the above form. This implies that the moves δ(q0, ai) = qp, δ(qp, aj) = qr, ..., δ(qs, ak) = qt, δ(qt, al) = qf are made by M and that qf ∈ F, so M accepts w, completing the proof.

Theorem 3.2 Let Γ = ⟨V, Σ, P, ω⟩ be a regular grammar. There exists a finite automaton M such that Λ(Γ) = Λ(M).

Proof: We assume that V = {V0, V1, ...}, ω = V0, and P consists of productions of the form V0 → v1 Vi, Vi → v2 Vj, ... or Vn → vl, ... If w is a word in Λ(Γ), then because of the form of the productions in Γ, the derivation must have the form

V0 ⇒ v1 Vi ⇒ v1 v2 Vj ⇒ ··· ⇒ v1 v2 ··· vk Vn ⇒ v1 v2 ··· vk vl = w.


We construct the automaton M as follows. The initial state of the automaton is labeled V0, and for each variable Vi there is a non-accept state labeled Vi. For each production Vi → a1 a2 ··· an Vj, the automaton has transitions connecting Vi and Vj, i.e. δ is defined so that

δ(Vi, a1) = Vi+1, δ(Vi+1, a2) = Vi+2, ..., δ(Vi+n−1, an) = Vj.

For each production Vi → a1 a2 ··· an, the corresponding transitions of M are

δ(Vi, a1) = Vi+1, δ(Vi+1, a2) = Vi+2, ..., δ(Vi+n−1, an) = Vf,

where Vf is an accept state. Here the labels Vi+1, ..., Vi+n−1 denote new intermediate states rather than variables of Γ; the intermediate states needed are of no concern and can be labeled arbitrarily.

Suppose now that w ∈ Λ(Γ), so that w has a derivation of the form displayed above. Because of the way we constructed the automaton, there is a path from V0 to Vi labeled v1, a path from Vi to Vj labeled v2, and so on, ending with a path from Vn to Vf labeled vl. Thus M can reach Vf after reading w, and w is accepted by M.

Conversely, assume that w is accepted by M. Because of the way M was constructed, to accept w the automaton has to pass through a sequence of states V0, Vi, ... to Vf, using paths labeled v1, v2, ... Therefore w must have the form w = v1 v2 ··· vk vl, whence a derivation of v1 v2 ··· vk vl from V0 is possible. Thus w ∈ Λ(Γ) and we are done.
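The construction in the proof of Theorem 3.1 can be sketched for the automaton of Example 1.1. The code below is an illustration under our own assumptions: productions are stored as (variable, right-hand side) pairs, terminals are single characters, and the names grammar_from_dfa and generated_words are hypothetical. It builds the productions q → a q′ and q → ε and compares the generated words with the accepted ones up to length 4.

```python
from itertools import product

# Automaton of Example 1.1.
DELTA = {
    ("S", "a"): "S", ("S", "b"): "A",
    ("A", "a"): "A", ("A", "b"): "S",
}
START, FINAL = "S", {"S"}


def grammar_from_dfa(delta, final):
    """Productions q → a q' per transition, plus q → ε for final q."""
    productions = [(q, a + r) for (q, a), r in delta.items()]
    productions += [(q, "") for q in sorted(final)]
    return productions


def generated_words(productions, start, max_len):
    """Terminal words of length <= max_len derivable from start."""
    words, frontier = set(), [("", start)]   # (terminal prefix, variable)
    while frontier:
        prefix, var = frontier.pop()
        for left, right in productions:
            if left != var:
                continue
            if right == "":                  # q → ε ends the derivation
                words.add(prefix)
            elif len(prefix) < max_len:      # q → a q'
                frontier.append((prefix + right[0], right[1:]))
    return words


def accepts(word):
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in FINAL


P = grammar_from_dfa(DELTA, FINAL)
accepted = {"".join(w) for n in range(5)
            for w in product("ab", repeat=n) if accepts("".join(w))}
print(generated_words(P, START, 4) == accepted)
```

Each sentential form consists of a terminal prefix followed by one variable, which is why a pair (prefix, variable) is enough to represent it.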

4 Probabilistic Grammars

This section introduces the concepts of probabilistic grammars and probabilistic languages. The results of this section are taken from [3].

Definition 4.1 A probabilistic grammar is a 4-tuple Ξ = ⟨V, Σ, P, ∆⟩ where V is a finite set {A1, ..., An} of variables, Σ is an alphabet, P is a finite set of probabilistic productions of the form ψ →_p ζ with ψ ∈ (V ∪ Σ)+, ζ ∈ (V ∪ Σ)+ and p a nonzero real number, and ∆ is an n-dimensional vector (δ1, ..., δn) with δi being the probability that Ai is chosen as the start variable. The probability of a derivation of ζ from the start symbol Al is defined to be

\[ \sigma(A_l \Rightarrow \zeta) = \sum_{i=1}^{k} \prod_{j=1}^{k_i} p_{ij}, \]

where k is the number of derivations of ζ from Al, ki is the number of productions used in the i-th derivation, and pij is the probability associated with the j-th step of the i-th derivation. The derived probability of a word w ∈ Σ+ is defined to be

\[ \mu(w) = \sum_{i=1}^{n} \delta_i \, \sigma(A_i \Rightarrow w). \]

Definition 4.2 A normalized probabilistic grammar is a probabilistic grammar Ξ = ⟨V, Σ, P, ∆⟩ such that the entries in ∆ add up to 1 and the probabilities associated with productions that share the same left-hand side add up to 1, i.e., if the productions in P are ψi →_{pij} ζj, then we must have \sum_j p_{ij} = 1 for all i.

Definition 4.3 Let Ξ = ⟨V, Σ, P, ∆⟩ be a probabilistic grammar. We say Ξ is regular if each production in P has the form A →_p aB or A →_p a where A, B ∈ V and a ∈ Σ. We say Ξ is context free if each production in P has the form A →_p ζ where A ∈ V and ζ ∈ (V ∪ Σ)+. We say Ξ is admissible if there exists a derivation of some x ∈ Σ+ from each A ∈ V.

Definition 4.4 Let Ξ = ⟨V, Σ, P, ∆⟩ be a probabilistic grammar. The probabilistic language generated by Ξ is defined to be Λ(Ξ) = {(w, µ(w)) | w ∈ Σ+ and µ(w) exists and is nonzero}. A probabilistic language is normalized if µ is a probability measure, i.e. the sum of the derived probabilities of all w ∈ Σ+ such that |w| ≤ k tends to 1 as k → ∞.

Example 4.1 Consider Ξ = ⟨V, Σ, P, ∆⟩ where P = {A →_{1/3} aA, A →_{2/3} aB, B →_{1/3} bB, B →_{2/3} b}, V = {A, B}, Σ = {a, b}, and ∆ = (1, 0). The probabilistic language generated by Ξ is Λ(Ξ) = {(ab, 4/9), (aab, 4/27), (abb, 4/27), ...}.

One of the theorems that can be proven using these definitions is the following:

Theorem 4.1 Every normalized regular admissible probabilistic grammar generates a normalized probabilistic language.

Proof: Assume the conditions for the probabilistic grammar Ξ = ⟨V, Σ, P, ∆⟩. To show that µ is a probability measure on Λ(Ξ), it is sufficient to show that the sum of the derived probabilities of all w ∈ Σ+ such that |w| ≤ k tends to 1 as k → ∞. Let n = |V|. We first define an (n + 1) × (n + 1) matrix U = (uij) as follows:

\[
u_{ij} =
\begin{cases}
\sum_{(A_i \to aA_j) \in P} \sigma(A_i \to aA_j), & i \le n,\ j \le n,\ a \in \Sigma, \\
\sum_{(A_i \to b) \in P} \sigma(A_i \to b), & i \le n,\ j = n + 1,\ b \in \Sigma, \\
0, & i = n + 1,\ j \le n, \\
1, & i = j = n + 1.
\end{cases}
\]


Notice that u_{i,n+1} is, by definition 4.1, the total probability of a derivation from Ai of a word of length 1. We now consider powers of the matrix U. By definition 4.1 again, (U^k)_{i,n+1} gives the total probability of a derivation from Ai of a word of length ≤ k. If U^k is premultiplied by the row vector ∆ augmented by zero, ∆′ = (δ1, ..., δn, 0), then the (n + 1)-st entry of the resulting vector is the sum of the derived probabilities of all w ∈ Σ+ such that |w| ≤ k. Therefore,

\[
\sum_{w \in \Sigma^{+}} \mu(w) = \lim_{k \to \infty} \sum_{w \in \Sigma^{+},\, |w| \le k} \mu(w) = \lim_{k \to \infty} (\Delta' \cdot U^{k})_{n+1}.
\]

Since Ξ is normalized, U is stochastic. Thus, since Ξ is admissible, there must exist a positive integer k such that (U^k)_{i,n+1} > 0 for i = 1, 2, ..., n + 1. Using the theory of Markov chains [4], we write

\[
U^{k} = \begin{pmatrix} (U^{k})_1 \\ \vdots \\ (U^{k})_{n+1} \end{pmatrix},
\]

where each row vector (U^k)_i approaches a steady-state vector v as k approaches infinity. Since (U^k)_{n+1} = (0, 0, ..., 0, 1) for all positive integers k, we have v = (0, 0, ..., 0, 1), whence

\[
\lim_{k \to \infty} (\Delta' \cdot U^{k}) = \Delta' \lim_{k \to \infty} U^{k} = \Delta' (v, v, \dots, v)^{T}.
\]

Therefore, the (n + 1)-st entry is

\[
\left[ \Delta' (v, v, \dots, v)^{T} \right]_{n+1} = \sum_{i=1}^{n+1} \delta_i = 1,
\]

where δ_{n+1} = 0 is the augmented entry of ∆′, completing the proof.

We now turn to 0L systems, a generalization of grammars used to describe developmental systems.
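The matrix argument in the proof of Theorem 4.1 can be checked numerically on the grammar of Example 4.1. The sketch below is an illustration under our own assumptions: productions are encoded as tuples (variable, terminal, next variable or None, probability), exact arithmetic is done with Python's fractions module, and the names sigma, mu, matrix_u, and mass_up_to are ours.

```python
from fractions import Fraction as F
from itertools import product

# Example 4.1: (variable, terminal, next variable or None, probability).
P = [("A", "a", "A", F(1, 3)), ("A", "a", "B", F(2, 3)),
     ("B", "b", "B", F(1, 3)), ("B", "b", None, F(2, 3))]
V = ["A", "B"]
DELTA = [F(1), F(0)]            # start distribution ∆ = (1, 0)


def sigma(var, word):
    """Total probability of all derivations of word from var."""
    total = F(0)
    for left, a, nxt, p in P:
        if left != var or not word or word[0] != a:
            continue
        if nxt is None:         # production A → a: must finish the word
            total += p if len(word) == 1 else 0
        else:                   # production A → aB: continue from B
            total += p * sigma(nxt, word[1:])
    return total


def mu(word):
    """Derived probability: sum of δ_i σ(A_i ⇒ word)."""
    return sum(d * sigma(v, word) for d, v in zip(DELTA, V))


def matrix_u():
    """The (n+1) × (n+1) matrix U from the proof of Theorem 4.1."""
    n = len(V)
    u = [[F(0)] * (n + 1) for _ in range(n + 1)]
    for left, _, nxt, p in P:
        column = n if nxt is None else V.index(nxt)
        u[V.index(left)][column] += p
    u[n][n] = F(1)
    return u


def mass_up_to(k):
    """Last entry of the row vector ∆′ · U^k."""
    row, u = DELTA + [F(0)], matrix_u()
    for _ in range(k):
        row = [sum(row[i] * u[i][j] for i in range(len(row)))
               for j in range(len(row))]
    return row[-1]


brute = sum(mu("".join(w)) for n in range(1, 4)
            for w in product("ab", repeat=n))
print(mu("ab"), mu("aab"), mass_up_to(3) == brute)
print(float(mass_up_to(40)))
```

The brute-force sum over all words of length at most 3 agrees with the matrix entry, and the entry approaches 1 as k grows, as the theorem asserts for this grammar.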

5 0L Systems

In many multicellular organisms, especially plants, growth processes suggest branching patterns. When the same structure is repeated several times along an axis, we refer to this as simple branching. When the entire structure of a previous stage is repeated as part of the organism in later stages of development, we refer to this as compound branching. As motivation for the constructs presented formally later in this section, we give a biological example from Herman and Rozenberg [2]. Figure 5 represents the first 7 developmental stages of Callithamnion roseum (an alga). Note that cells need not be represented by boxes as in figure 5; one could use line segments instead. We may assign distinct symbols to the distinct states that a developing organism may assume. One representation of this kind is as follows:

Figure 5: The first seven developmental stages (1 to 7) of Callithamnion roseum.

1. c
2. cc
3. cccc
4. cc(c)cccc
5. cc(cc)cc(c)cccc
6. cc(ccc)cc(cc)cc(c)cccc
7. cc(cccc)cc(ccc)cc(cc)cc(c)cccc

In this representation each cell is denoted by a "c", the beginning of a branch by a "(" and the end of a branch by a ")". The side on which a branch lies is not distinguished in our description, but could be by using different types of brackets. Also note that branches within branches are denoted by nested parentheses. Zoologists refer to this type of development as mosaic development, which we will discuss later in more detail. This developmental system can be described by a construct similar to that of definition 2.1. We now give the basic definitions underlying the family of 0L languages.

Definition 5.1 A 0L scheme is a pair S = ⟨Σ, P⟩, where Σ is an alphabet and P ⊂ Σ × Σ* is a finite set of productions such that for every α ∈ Σ there exists β ∈ Σ* with α → β ∈ P.

Definition 5.2 A 0L scheme S = ⟨Σ, P⟩ is propagating if there is no production in P of the form α → ε; otherwise S is nonpropagating. A 0L scheme is deterministic if for every α ∈ Σ there exists exactly one β ∈ Σ* such that α → β ∈ P; otherwise S is nondeterministic.

Definition 5.3 A 0L system is a triple Γ = ⟨Σ, P, ω⟩ where S = ⟨Σ, P⟩ is a 0L scheme and ω is a word over Σ called the axiom (start symbol) of Γ. We say Γ is propagating if S is propagating and we say Γ is deterministic if S is deterministic.

Definition 5.4 Let Γ = ⟨Σ, P, ω⟩ be a 0L system, ρ = a1 ... am with m ≥ 0 and a1, ..., am ∈ Σ, and let σ ∈ Σ*. We say that ρ directly derives σ in Γ if there exist α1, ..., αm ∈ Σ* such that a1 → α1, ..., am → αm ∈ P and σ = α1 ... αm. We denote a direct derivation of σ from ρ as

\[ \rho \overset{\Gamma}{\Rightarrow} \sigma. \]

Definition 5.5 Let Γ = ⟨Σ, P, ω⟩ be a 0L system. Then the language generated by Γ is

\[
\Lambda(\Gamma) = \{\, \sigma \mid \sigma \in \Sigma^{*} \text{ and there exists a chain } \omega \overset{\Gamma}{\Rightarrow} \alpha_1 \overset{\Gamma}{\Rightarrow} \cdots \overset{\Gamma}{\Rightarrow} \alpha_m \overset{\Gamma}{\Rightarrow} \sigma \text{ with } m \ge 0 \,\}.
\]

We now give some examples. The following example was given in [2] as a possible description of the development of the cell sequence along the margin of a leaf.

Example 5.1 Consider the 0L system Γ = ⟨{S, a, b, c, d, e, f, g, h, i, j, k, m, 0, 1, 2}, {S → ab, a → dg, b → e0, c → 22, d → 0e, e → cf, f → 1c, g → hb, h → di, i → jk, j → m1, k → c0, m → 0c, 0 → 0, 1 → 1, 2 → 2}, S⟩. This system is both deterministic and propagating. Starting with the axiom S of Γ we have the derivation

S ⇒ ab ⇒ dge0 ⇒ 0ehbcf0 ⇒ 0cfdie0221c0 ⇒ 0221c0ejkcf0221220 ⇒ 0221220cfm1c0221c0221220 ⇒ 0221220221c0c1220221220221220 ⇒ 0221220221220221220221220221220.

Since the last string in the derivation derives itself, the language generated by Γ contains only the nine words listed above. A detailed description of Γ in its biological context is given in [2].

Example 5.2 Let Γ = ⟨Σ, P, ω⟩, where Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, (, )}, P = {0 → 10, 1 → 32, 2 → 3(4), 3 → 3, 4 → 56, 5 → 37, 6 → 58, 7 → 3(9), 8 → 50, 9 → 39, ( → (, ) → )}, and ω = 4. Then Γ is the propagating, deterministic 0L system that describes the compound branching of algal organisms discussed in the beginning of this section. Consider the following developmental sequence for Γ:

1. 4
2. 56
3. 3758
4. 33(9)3750
5. 33(39)33(9)3710
6. 33(339)33(39)33(9)3210
7. 33(3339)33(339)33(39)33(4)3210

If we replace each symbol, except for "(" and ")", by a c, then we obtain the same sequence given in the beginning of this section.

Notice in the above biological example that the left-hand side of each production is a single symbol. What happens to a cell in the next stage of development is not influenced by the neighboring cells of the current stage, i.e. there is no interaction between cells. This is called mosaic development.
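The parallel rewriting of Definition 5.4 is easy to sketch for a deterministic 0L system. The code below, an illustration with our own function names step, stages, and as_cells, reproduces the developmental sequence of Example 5.2 and the cell representation obtained by replacing every symbol other than the brackets by c.

```python
# Productions of the deterministic 0L system in Example 5.2.
PRODUCTIONS = {
    "0": "10", "1": "32", "2": "3(4)", "3": "3", "4": "56", "5": "37",
    "6": "58", "7": "3(9)", "8": "50", "9": "39", "(": "(", ")": ")",
}
AXIOM = "4"


def step(word, productions=PRODUCTIONS):
    """Rewrite every symbol of word simultaneously."""
    return "".join(productions[symbol] for symbol in word)


def stages(axiom, count):
    """The first count words of the developmental sequence."""
    words = [axiom]
    while len(words) < count:
        words.append(step(words[-1]))
    return words


def as_cells(word):
    """Replace every symbol except the brackets by the cell symbol c."""
    return "".join(ch if ch in "()" else "c" for ch in word)


for number, word in enumerate(stages(AXIOM, 7), start=1):
    print(number, word, as_cells(word))
```

Because the system is deterministic, a dictionary with one right-hand side per symbol suffices; a nondeterministic system would need a set of right-hand sides per symbol.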


Theorem 5.1 There exist finite languages that are not 0L languages.

Proof: We show that {a, aa} is not a 0L language. Throughout the proof every derivation step ⇒ is taken in Γ. It suffices to show the following for an arbitrary 0L system Γ = ⟨Σ, P, ω⟩ with 0L scheme S = ⟨Σ, P⟩, and a ∈ Σ.

1. If there exists a chain a ⇒ α1 ⇒ ... ⇒ αm ⇒ aa with m ≥ 0, then there must exist a chain a ⇒ β1 ⇒ ... ⇒ βn ⇒ aaaa with n ≥ 0.

2. If there exists a chain aa ⇒ α1 ⇒ ... ⇒ αm ⇒ a for some m ≥ 0, then there must exist a chain aa ⇒ β1 ⇒ ... ⇒ βm ⇒ ε.

Proof of 1: Assume the conditions. Then there is a chain a ⇒ α1 ⇒ ... ⇒ αm ⇒ aa for some m ≥ 0. Applying this chain to each of the two symbols of aa in parallel, there must exist a chain aa ⇒ β1 ⇒ ... ⇒ βm ⇒ aaaa. Thus there is a chain a ⇒ α1 ⇒ ... ⇒ αm ⇒ aa ⇒ β1 ⇒ ... ⇒ βm ⇒ aaaa.

Proof of 2: Assume the conditions. Then there is a chain aa ⇒ α1 ⇒ ... ⇒ αm ⇒ a for some m ≥ 0. Since each symbol is rewritten independently of its neighbors, one of the two occurrences of a derives a and the other derives ε in the same number of steps, so there is a chain a ⇒ γ1 ⇒ ... ⇒ γm ⇒ a and a chain a ⇒ δ1 ⇒ ... ⇒ δm ⇒ ε. Applying the second chain to both symbols of aa, there is a chain aa ⇒ κ1 ⇒ ... ⇒ κm ⇒ ε.

We now consider all possibilities for ω.

1. Suppose ω = a. If aa is not in Λ(Γ), then Λ(Γ) is not equal to {a, aa}. If aa ∈ Λ(Γ), then by (1) above aaaa ∈ Λ(Γ). Thus Λ(Γ) is not equal to {a, aa}.

2. Suppose ω = aa. If a is not in Λ(Γ), then Λ(Γ) is not equal to {a, aa}. If a ∈ Λ(Γ), then by (2) above we get ε ∈ Λ(Γ). Thus Λ(Γ) is not equal to {a, aa}.

3. Suppose ω is neither a nor aa. Since ω must be in Λ(Γ), Λ(Γ) is not equal to {a, aa}.

In every case Λ(Γ) is not equal to {a, aa}, so {a, aa} is not a 0L language, as there is no 0L system that generates it. This completes the proof.

The relationship between the classes of finite, regular, context-free, and 0L languages is given in figure 6. When describing developmental systems, it is often appropriate to incorporate stochasticity into constructs such as that of definition 5.3. Next we discuss the probabilistic analogue of a 0L system.

Figure 6: Relationship between the classes of finite, regular, context-free, context-sensitive, and 0L languages.

6 Probabilistic 0L Systems

We now extend Ellis's ideas to the basic family of 0L languages and propose probabilistic analogues of the definitions from section 5.

Definition 6.1 A probabilistic 0L system is a 3-tuple Ω = ⟨Σ, P, ∆⟩ where Σ is an alphabet {a1, ..., an}, P is a finite set of probabilistic productions of the form ai →_p ζ with ζ ∈ Σ* such that for all ai ∈ Σ there exists ζ ∈ Σ* with ai →_p ζ ∈ P, and ∆ is an n-dimensional vector (δ1, ..., δn) with δi being the probability that ai is chosen as the axiom. The probability of a derivation of ζ from the axiom al is defined to be

\[ \sigma(a_l \Rightarrow \zeta) = \sum_{i=1}^{k} \prod_{j=1}^{k_i} p_{ij}, \]

where k is the number of derivations of ζ from al, ki is the number of productions used in the i-th derivation, and pij is the probability associated with the j-th step of the i-th derivation. The derived probability of a word w ∈ Σ* is defined to be

\[ \mu(w) = \sum_{i=1}^{n} \delta_i \, \sigma(a_i \Rightarrow w). \]

Definition 6.2 Let Ω = ⟨Σ, P, ∆⟩ be a probabilistic 0L system. The probabilistic 0L language generated by Ω is defined to be Λ(Ω) = {(w, µ(w)) | w ∈ Σ+ and µ(w) exists and is nonzero}. A probabilistic 0L language is normalized if µ is a probability measure.

Example 6.1 Let Ω = ⟨{a}, P, 1⟩ where P = {a →_p a}. Notice Λ(Ω) is the singleton {a}, but there are infinitely many derivations for the word a. The derived probability of a is

\[ \mu(a) = \sum_{k=1}^{\infty} p^{k}, \]

which is the familiar geometric series and converges when p < 1.

Example 6.2 Consider Ω = ⟨{a}, P, 1⟩ where P = {a →_p aa}. Notice Λ(Ω) = {a^{2^n} | n ≥ 0} and µ(a^{2^n}) = p^{2^n − 1}. The tree in figure 7 illustrates how µ(a^{2^n}) can be calculated.

Figure 7: Derivation tree a → aa → aaaa → a^8, with the successive steps labelled p, p^2, and p^4.
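The derived probabilities in Examples 6.1 and 6.2 can be checked with a short sketch. The assumptions here are ours: a step rewrites every symbol in parallel and its probability is the product of the probabilities of the productions used, p is given the hypothetical value 1/2, only derivations with at least one step are counted (matching the sum in Example 6.1, which starts at k = 1), and the infinite sum is truncated at max_steps. The names step and derived_probability are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import prod


def step(dist, productions):
    """One parallel rewriting step on a dict {word: probability}."""
    out = {}
    for word, prob in dist.items():
        # For each symbol, the list of (right-hand side, probability).
        choices = [productions[symbol] for symbol in word]
        for combo in product(*choices):
            new_word = "".join(zeta for zeta, _ in combo)
            weight = prob * prod((q for _, q in combo), start=Fraction(1))
            out[new_word] = out.get(new_word, 0) + weight
    return out


def derived_probability(word, axiom, productions, max_steps):
    """Partial sum of µ(word) over derivations of 1..max_steps steps."""
    dist, total = {axiom: Fraction(1)}, Fraction(0)
    for _ in range(max_steps):
        dist = step(dist, productions)
        total += dist.get(word, 0)
    return total


p = Fraction(1, 2)                    # hypothetical value of p
IDENTITY = {"a": [("a", p)]}          # Example 6.1: a →_p a
DOUBLING = {"a": [("aa", p)]}         # Example 6.2: a →_p aa

print(derived_probability("a", "a", IDENTITY, 10))      # partial geometric sum
print(derived_probability("a" * 8, "a", DOUBLING, 5))   # compare with p ** 7
```

In Example 6.2 each word has a single derivation, so the truncation is harmless there; in Example 6.1 the partial sums only approach the value of the geometric series.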

7 Conclusion and Directions for Further Work

We have defined probabilistic versions of a 0L system and of a 0L language in analogy with the probabilistic grammars proposed by Ellis. We leave the reader with a set of questions as directions for further work.

1. How does the class of probabilistic 0L languages relate to that of probabilistic regular languages? To that of probabilistic context free languages?
2. What biological phenomena can be described by the quantum analogue of a 0L system?
3. Can one formulate definitions for probabilistic versions of ⟨k, l⟩ systems (see footnote 1) and ⟨k, l⟩ languages?
4. What is the role of the probability associated with a production in its biological context?
5. When can we be sure that each word generated by a 0L system has only finitely many derivations?

Our biggest hope is that (3) will be full of surprises.

Footnote 1: For ⟨k, l⟩ systems see [6].


References

1. J.E. Hopcroft and J.D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison Wesley, New York, 1979.
2. G.T. Herman and G. Rozenberg, Developmental Systems and Languages, North-Holland Publishing Co., Amsterdam, 1974.
3. C.A. Ellis, Probabilistic Tree Automata, PhD thesis, University of Illinois, Urbana, 1969.
4. W. Feller, An Introduction to Probability Theory and its Applications, vol. 1, Wiley, New York, 1957.
5. C. Moore and J.P. Crutchfield, "Quantum Automata and Quantum Grammars", Theoretical Computer Science, 237, 275-306, 2000.
6. W. Cheng and J. Wang, "Regular Quantum Grammars and Regular Grammars", Laboratory of Complex Systems and Intelligence Science, 75-78, Beijing, 2004.
7. G.T. Herman, "Using Formal Language Theory to Model Biological Processes", Applied Computation Theory: Analysis, Design, Modeling, 530-564, Prentice Hall, New Jersey, 1976.
8. M. Sipser, Introduction to the Theory of Computation, Second Edition, Massachusetts Institute of Technology, Thomson, 2006.


Probabilistic Languages

A treatment of natural developmental systems by probabilistic languages.