Probabilistic Languages

Probabilistic Languages and Developmental Systems
Jerod Michel
2007

Introduction

In this paper we are concerned with how probabilistic constructs from formal language theory can be used to describe biological development. The constructs in question are those presented in 1969 by Clarence Ellis in Probabilistic Tree Automata. Ellis generalized the structures of formal language theory to the probabilistic case, proposing probabilistic versions of finite state automata and grammars. Sections 1 and 2 give brief introductions to classical automata and grammars, section 3 covers the equivalence of regular grammars and finite state automata, section 4 covers probabilistic grammars, and section 5 covers classical 0L systems. In section 6 we introduce a new construct; see in particular Definition 6.1. For formal language theory see [1], for 0L systems see [2], and for probabilistic automata and grammars see [3].

1 Finite State Automata

We begin with some basic definitions.

Definition 1.1 An alphabet is any finite set of symbols. A word over an alphabet is any string of finite length composed of symbols from that alphabet. The length |w| of a word w is the number of symbols of which it is composed. The empty word ε is the word consisting of no symbols. If Σ is an alphabet, then Σ* denotes the set of all words over Σ, including the empty word. We use Σ⁺ to denote Σ* − {ε}, i.e., the set of all non-empty words. A language is any set of words over an alphabet.

Definition 1.2 A finite automaton is a 5-tuple M = (Q, Σ, δ, q0, F), where

1. Q is a finite set of states.
2. Σ is an alphabet.
3. δ : Q × Σ → 2^Q is a transition function, where 2^Q denotes the family of subsets of Q.
4. q0 ∈ Q is the initial state of M.
5. F ⊂ Q is the set of final states of M.

We say M is deterministic if δ(q, a) is a singleton for each q ∈ Q and each a ∈ Σ. Otherwise M is called nondeterministic.

A finite automaton M = (Q, Σ, δ, q0, F) may be expressed pictorially by a directed graph whose vertices are labelled by the states in Q and whose directed arcs are labelled by symbols of Σ, such that (q, q′) is an arc with label a if q′ ∈ δ(q, a). All vertices labelled by elements of F are bold-circled, and the
