DNA and
splicing
(circular)
Paola Bonizzoni, Clelia De Felice, Giancarlo Mauri, Rosalba Zizza
Dipartimento di Informatica Sistemistica e Comunicazioni, Univ. di Milano - Bicocca ITALY
Dipartimento di Informatica e Applicazioni, Univ. di Salerno, ITALY
Circular splicing, definitions
State of the art
Our contributions
Works in progress
We apologize...
<<An important aspect of this year’s meeting can be summed
up us: SHOW ME THE EXPERIMENTAL RESULT! >>
(T. Amenyo, Informal Report on 3rd Annual
DIMACS Workshop on DNA Computing, 1997)
theoretical results
Before Adleman experiment (1994)...
Tom Head 1987
(Bull. of Math. Biology)
“ Formal Language Theory and DNA:
an analysis of the generative capacity of
specific recombinant behaviors”
SPLICING
Unconventional
models of computation
LINEAR
SPLICING
CIRCULAR
CIRCULAR SPLICING
restriction
enzyme 2
restriction
enzyme 1
ligase enzymes
Circular languages: definitions and examples
w, w A*, w
• Conjugacy relation on A*
Example
abaa, baaa, aaab,aaba
• A~ = A* ~ = set of all circular words
• Circular language
C A~
~w = [w] , w A*
~
set of equivalence classes
A* ~
L
Cir(L) = {~w | w L} (circularization of L)
L
C
(A linearization of C, i.e. Cir(L)=C )
{w A*| ~w C}= Lin(C)
(Full linearization of C)
w=xy, w = yx
are conjugate
A*
~ w
C
Definition:
FA~ ={ C A~ | L A*, Cir(L) = C, L FA, FA Chomsky hierarchy}
Theorem [Head, Paun, Pixton]
C Reg ~ Lin (C) Reg
Circular splicing systems
(A= finite alphabet, I A~ initial language)
Paun’s definition
~hu u ,
1 2
u2 hu1
SCPA = (A, I, R)
~ku u
3 4
u4ku3
A~
R A* | A* $ A* | A* rules
r = u1 | u2 $ u3 | u4 R
~ u hu u ku
2
1 4
3
Definition
A circular splicing language C(SCPA) (i.e. a circular language generated
by a splicing system SCPA ) is the smallest circular language containing
I and closed under the application of the rules in R
Other definitions of splicing systems
(A= finite alphabet, I A~ initial language)
Head’s definition
~hpxq ,
q hpx
SCH = (A, I, T)
h
(p, x, q ), ( u,x,v) T
~ hpx vkux q
vkux
Pixton’s definition
~h ,
A~
~kuxv
T A* A* A* triples
SCPI = (A, I, R)
~ h
h
A~
R A* A* A* rules
(, ; ), (, ; ) R
~ h h
Problem:
Characterize
C(Reg, Fin)
FA~ C(Fin, Fin)
class of circular languages C= C(SCPA)
generated by SCPA with I and R both
finite sets.
Theorem [ Paun96]
F{Reg~, CF~, RE~}
R +add. hyp. (symmetry, reflexivity, self-splicing)
C(F, Fin) F
Theorem [Pixton95-96] F{Reg~, CF~, RE~}
R Fin+add. hyp. (symmetry, reflexivity)
C(Reg~, Fin)Reg~,
C(F, Reg) F
Circular finite splicing languages
and Chomsky hierarchy
CS~
CF~
Reg~
~((aa)*b)
~(aa)*
I= ~aa ~1, R={aa | 1 $ 1 | aa}
~(an
bn)
I= ~ab ~1, R={a | b $ b | a}
Our contributions
Reg~ C(Fin, Fin)
Fingerprint closed
star languages
X*, X regular
group code
Reg~
cyclic
languages
Cir (X*)
X finite
weak cyclic,
other examples
~ (a*ba*)*
Our contributions (continued)
Comparing the three
definitions of splicing systems
C(SCH ) C(SCPA ) C(SCPI )
= ... ?
~ (a*ba*)*, ~ ((aa)*b)
Star languages
Definition
L A* is star language if L is regular, closed under
conjugacy relation and L=X*, with X regular
Proposition: SCPA=(A,I,R), I Cir(X*) C(SCPA) Cir (X*)
“Consistence” easily follows!!!
Examples
• (b*(ab*a)*)* = X*
X=b ab*a
• (a*ba*)* = X*
X= a*ba*
Fingerprint closed languages
Definition
For any cycle c, L contains the Fingerprints of c
Fingerprint of a cycle
c
cnc L
q0
q0
x
x’
y
y’
c=(x(y(zz’)jy’)i x’)nc
z
z’
power of the cycle, where the internal
cycles are crossed a finite number of times
i n y , j n x
Theorem
Fingerprint closed star languages C(Fin,Fin)
Sketch
Take SCPA = (A, I, R) with
I=Cir({successful path containing fingerprint of cycles})
R={1 | 1 $ 1 | ƒ
| ƒ fingerprint of cycle c, for any cycle c}
Star languages fingerprint closed
• X*, X regular group code
(for example X=b ab*a)
• X finite, Cir(X*)
(for example X=Ad )
Star languages not fingerprint closed
(a*ba*)*
but not generated!!!
new!
Not Star Languages in C(Fin, Fin)
Cyclic Languages
Definition
Cyclic(z) ={(~(z* p)) | p Pref (Lin( ~z))}
Example
z = abc A*
Lin ( ~ z) =Lin ( ~ abc) ={abc, bca,cab}
Pref(Lin ( ~ z)) =Pref(Lin ( ~ abc)) =Pref({abc, bca,cba}) = {a, ab, b, bc, c, ca}
Cyclic(abc)= ~(abc)*a ~(abc)*ab
~(abc)*b ~(abc)*bc
~(abc)*c ~(abc)*ca
Theorem
For any z, |z|>2, z unbordered word, then
Cyclic(z) C(Fin,Fin)
i.e. z uA* A*u
The proof is quite technical ...
Example (continued)
Cyclic (abc) is generated by SCPA = (A,I,R) where I,R are defined as follows
I={~ ((abc)i p | 0 i 3, p Pref(Lin(~ (abc))) }
R={z ab | z $ z | ca z,
z a | z $ z | b z,
z ab | z $ z b | c z,
zb|z$z$cz,
z ca | z $ z $ bc z,
zc|z$z|az }
Other circular regular splicing languages
• ~(abc)*a ~(abc)*ab ~(abc)*b
~(abc)*bc ~(abc)*c ~(abc)*ca
Cyclic(abc)
weak cyclic languages
• Cyclic (abca) ....
bordered word...
~(abc)*ac
Works in progress
• Characterize Reg~ C(Fin, Fin)
• Characterize FA~ C(Fin, Fin)
• C(SCPI) = Star languages
• Additional hypothesis
r= u1 | u2 $ u3 | u4 in R
• Reflexive: r’ = u1 | u2 $ u1 | u2
• Symmetric: r” = u3 | u4 $ u1 | u2
• Self-splicing: From ~ xu1u2yu3u4 ,
with r,r” as above, generates ~u4 xu1 , ~u2yu3 .
DNA6
auditorium
Thanks!