Static Watson-Crick Context-Free Grammars

Sticker systems and Watson-Crick automata are two models of DNA molecules in DNA computing. A sticker system is a computational model coded with single- and double-stranded DNA molecules, while Watson-Crick automata are the automata counterpart of sticker systems and represent the biological properties of DNA. Both models use the feature of Watson-Crick complementarity. Previously, the grammar counterpart of Watson-Crick automata was introduced, known as Watson-Crick grammars, which are classified into three classes: Watson-Crick regular grammars, Watson-Crick linear grammars and Watson-Crick context-free grammars. In this research, a new variant of Watson-Crick grammar called a static Watson-Crick context-free grammar is introduced; it is a grammar counterpart of sticker systems that generates double-stranded strings and uses rules as in context-free grammars. The static Watson-Crick context-free grammar differs from a dynamic Watson-Crick context-free grammar in how it generates double-stranded strings, as is also the case for regular and linear grammars. The main result of the paper is the determination of the generative power of static Watson-Crick context-free grammars. In addition, the relationships between the families of languages generated by Chomsky grammars, sticker systems and Watson-Crick grammars are presented in terms of their hierarchy.


Introduction
The DNA (deoxyribonucleic acid) molecule plays an important role in DNA computing. DNA is a polymer constructed from monomers called deoxyribonucleotides. Each deoxyribonucleotide consists of three components: a sugar, a phosphate group, and a nitrogenous base. The four nitrogenous bases are adenine (A), thymine (T), guanine (G), and cytosine (C), which are paired as A-T and C-G according to Watson-Crick (WK) complementarity. DNA computing is a branch of biomolecular computing concerned with the use of DNA as an information carrier. The birth of this field was marked by Adleman [1] in 1994. By using DNA strands in his experiment, he was able to solve the Hamiltonian path problem for a simple graph with the sticker operation.
Sticker systems and Watson-Crick automata are DNA computing models based on different principles, but the complementarity relation is present in every computation or derivation step of both. In 1998, sticker systems were introduced by Kari et al. [2] as language generating devices based on the sticking operation, which is a model of the techniques used by Adleman [1]. The operation starts from a finite set of axioms and prolongs the generated sequences of (single or double) symbols to the right by using single-stranded strings, attached to either the upper or the lower strand, thereby matching the sequence according to the complementarity relation. Several variants of sticker systems have been defined, such as one-sided, regular, simple, simple one-sided and simple regular sticker systems [3]. In order to increase the generative power of sticker systems, additional restrictions have been imposed, such as assigning an element of a monoid to the sticker operation [4], introducing probabilistic sticker systems [5] and adding weights to the variants of sticker systems [6].
Watson-Crick automata were proposed by Freund et al. in 1997 [7] as an extension of finite automata with two reading heads on double-stranded sequences. Some restrictions and extensions of the basic model have been proposed in order to achieve higher generative power. Variants of Watson-Crick automata that have been proposed include Watson-Crick transducers [8], Watson-Crick omega-automata [9] and weighted Watson-Crick automata [10].
On the other hand, formal language theory is a natural framework for formalizing and investigating DNA computing models. Grammars act as language generators, while automata are devices for defining languages that work differently from grammars. Historically, the earlier grammar models introduced in DNA computing did not utilise the Watson-Crick complementarity of DNA molecules [11,12]. Following that, a grammar model that uses this feature was proposed, known as Watson-Crick grammars, which produce each stranded string independently [13]. In this research, a new variant of Watson-Crick grammars, called a static Watson-Crick context-free grammar, is introduced as an analytical counterpart of sticker systems. This paper is organized as follows: Section 1 introduces the background of the research. In Section 2, some necessary definitions and notations used in this research are presented. Next, the definition of Watson-Crick grammars, the concepts of sticker systems and the Watson-Crick Chomsky normal form are discussed in Section 3. In Section 4, the results on static Watson-Crick context-free grammars and their generative power are given.
In the next section, some preliminary concepts used in this paper are stated.

Literature Review
In this section, some information on grammars and on static Watson-Crick regular and linear grammars is stated. In this paper, the symbol ⊆ denotes inclusion while ⊂ denotes strict (proper) inclusion. The membership of an element in a set is denoted by ∈. Let $T$ be a finite alphabet. Then $T^*$ is the set of all finite strings (words) over $T$. The string with no symbols, called the empty string, is denoted by $\lambda$. The set $T^*$ always contains $\lambda$; to exclude the empty string, $T^+$ is defined as the set of all nonempty finite strings over $T$, where $T^+ = T^* - \{\lambda\}$. In formal language theory, a grammar is a mechanism for describing languages mathematically; in other words, it acts as a language generator. A Chomsky grammar (sometimes simply called a grammar) is a set of formation rules for rewriting strings. Chomsky grammars are classified according to the form of their production rules. The definition of a Chomsky grammar is stated as follows.
The family of languages generated by regular grammars is equal to the family of languages generated by right- or left-linear grammars. The families of context-sensitive, context-free, linear and regular languages are denoted by CS, CF, LIN and REG, respectively. In addition, RE and FIN denote the family of recursively enumerable languages (those generated by arbitrary grammars) and the family of finite languages, respectively. Hence, the following strict inclusions hold for the Chomsky hierarchy [3]: FIN ⊂ REG ⊂ LIN ⊂ CF ⊂ CS ⊂ RE. Next, we recall the definitions of static Watson-Crick regular and linear grammars. For the regular case, only static Watson-Crick right-linear grammars are stated, since the definition for left-linear grammars is similar.

Definition 2 [14] Static Watson-Crick right-linear grammar.
A static Watson-Crick right-linear grammar is a 5-tuple $G = (N, T, \rho, S, P)$ where $N$ and $T$ are disjoint nonterminal and terminal alphabets, respectively, $\rho \subseteq T \times T$ is a symmetric relation (Watson-Crick complementarity), $S \in N$ is a start symbol (axiom) and $P$ is a finite set of production rules in the form of

Definition 3 [15] Static Watson-Crick Linear Grammar.
A static Watson-Crick linear grammar is a 5-tuple $G = (N, T, \rho, S, P)$ where $N$ and $T$ are disjoint nonterminal and terminal alphabets, respectively, $\rho \subseteq T \times T$ is a symmetric relation (Watson-Crick complementarity), $S \in N$ is a start symbol (axiom) and $P$ is a finite set of production rules in the form of 1. → G K where ∈ − { } and I 9 9 K ∈ M * ( ).
In the next section, the concepts of sticker systems, the definition of Watson-Crick grammars and Watson-Crick Chomsky normal form are presented.

Methodology
In this research, static Watson-Crick context-free grammars are introduced by modifying Watson-Crick grammars and using the concept of sticker systems. In addition, the generative power of these grammars is classified through comparison with the Chomsky hierarchy and Watson-Crick languages. In the following subsections, some information on sticker systems and Watson-Crick grammars is stated.

Sticker Systems
Let $V$ be an alphabet and let $\rho \subseteq V \times V$ be a symmetric relation (of complementarity) over $V$. The symbol $V^*$ represents the set of all strings composed of elements of $V$, including the empty string denoted by $\lambda$, and $V^+$ is the set $V^* - \{\lambda\}$. The set of all pairs of strings over $V$ is denoted by $\binom{V^*}{V^*}$. To represent DNA molecules as strings, the elements $(x_1, x_2) \in V^* \times V^*$ are written in the form $\binom{x_1}{x_2}$.
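As an informal illustration of the notation in this subsection, the following Python sketch models a complementarity relation as a set of symbol pairs, checks whether two strands form a complete double-stranded string, and prolongs a partial molecule to the right in the spirit of the sticking operation. The names `DNA_COMPLEMENT`, `is_symmetric`, `is_complete_double_strand` and `stick_right` are hypothetical, the DNA alphabet is used only as an example, and `stick_right` is a simplification that assumes both strands are aligned at their left ends; it is not the formal definition of a sticker system.

```python
# Illustrative sketch only: all names and the left-aligned strand model
# are assumptions made for this example, not definitions from the paper.

# Example complementarity relation on the DNA alphabet {A, C, G, T}.
DNA_COMPLEMENT = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def is_symmetric(relation):
    """A relation is symmetric if (a, b) in it implies (b, a) in it."""
    return all((b, a) in relation for (a, b) in relation)


def is_complete_double_strand(upper, lower, relation):
    """True if the strands have equal length and pair up symbol by symbol."""
    return len(upper) == len(lower) and all(
        (a, b) in relation for a, b in zip(upper, lower)
    )


def stick_right(upper, lower, piece, to_upper, relation):
    """Append a single-stranded `piece` to the right end of one strand.

    Returns the new (upper, lower) pair, or None if some position where
    both strands are present would violate the complementarity relation.
    """
    new_upper = upper + piece if to_upper else upper
    new_lower = lower if to_upper else lower + piece
    overlap = min(len(new_upper), len(new_lower))
    if all((new_upper[i], new_lower[i]) in relation for i in range(overlap)):
        return new_upper, new_lower
    return None


# Start from a partial molecule (upper "ACT", lower "T") and extend the
# lower strand with "GA" so that every upper symbol gets a partner.
molecule = stick_right("ACT", "T", "GA", to_upper=False, relation=DNA_COMPLEMENT)
print(is_symmetric(DNA_COMPLEMENT))
print(molecule)
print(is_complete_double_strand(*molecule, DNA_COMPLEMENT))
```

Representing the relation as an explicit set of pairs keeps the symmetry requirement checkable and lets the same functions be reused with an alphabet other than the DNA bases.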

Watson-Crick Grammars
In this subsection, the definitions of Watson-Crick grammars are stated as follows.
Next, the definition of Watson-Crick Chomsky normal form is stated. A Watson-Crick grammar $G = (N, T, \rho, S, P)$ is said to be in Watson-Crick Chomsky normal form if all productions are of the form
In the next section, the definition and the generative power of static Watson-Crick context-free grammars are presented.

Results and Findings
In this research, the static Watson-Crick context-free grammar is introduced as a grammar counterpart of the sticker system that has rules as in context-free grammars.

Definition 7. A static Watson-Crick context-free grammar is a 5-tuple $G = (N, T, \rho, S, P)$ where $N$ and $T$ are disjoint nonterminal and terminal alphabets, respectively, $\rho \subseteq T \times T$ is a symmetric relation (Watson-Crick complementarity), $S \in N$ is a start symbol (axiom) and $P$ is a finite set of production rules in the form of 1.
The derivation step for the static Watson-Crick context-free grammar is as follows. Let $G = (N, T, \rho, S, P)$ be a static Watson-Crick context-free grammar. We say that derives in , denoted or written as ⇒ such that 1. = and = 9 9 ; ; q ⋯ e e e(9 where ⇒ ∈ ; 2. = and = 9 9 ; ; q ⋯ e e e(9 where , ∈ − { } , , ∈ r M ( ) ∪ s * and ⇒ 9 9 ; ; q ⋯ e e e(9 ∈ ; or
Hence, $G$ generates the language $L(G)$ = {~€ €~| , ≥ 1}.
In investigating the generative power of static Watson-Crick context-free grammars, we examine the relationships between the families of languages generated by static Watson-Crick grammars and the families of Chomsky languages and Watson-Crick languages.
The following two lemmas follow immediately from the definition of static Watson-Crick context-free grammars. Lemma 1 shows the inclusion between context-free languages and static Watson-Crick context-free languages, while Lemma 2 presents the inclusions among static Watson-Crick regular, linear and context-free languages.

Lemma 2.
The following inclusion holds:
iJOE, Vol. 15, No. 10, 2019
Proof. The inclusion follows from the definition of static Watson-Crick grammars and by referring to the Chomsky hierarchy.
Next, we show that static Watson-Crick context-free grammars can generate some non-context-free languages, in order to relate the generative power in Lemma 2 to the results in [14,15]. Thus, the derivation for each of the production rules is defined as follows: Step 1.