DNA and splicing (circular) Paola Bonizzoni, Clelia De Felice, Giancarlo Mauri, Rosalba Zizza Dipartimento di Informatica Sistemistica e Comunicazioni, Univ. di Milano - Bicocca ITALY Dipartimento di Informatica e Applicazioni, Univ. di Salerno, ITALY Circular splicing, definitions State of the art Our contributions Works in progress We apologize... <<An important aspect of this year’s meeting can be summed up us: SHOW ME THE EXPERIMENTAL RESULT! >> (T. Amenyo, Informal Report on 3rd Annual DIMACS Workshop on DNA Computing, 1997) theoretical results Before Adleman experiment (1994)... Tom Head 1987 (Bull. of Math. Biology) “ Formal Language Theory and DNA: an analysis of the generative capacity of specific recombinant behaviors” SPLICING Unconventional models of computation LINEAR SPLICING CIRCULAR CIRCULAR SPLICING restriction enzyme 2 restriction enzyme 1 ligase enzymes Circular languages: definitions and examples w, w A*, w • Conjugacy relation on A* Example abaa, baaa, aaab,aaba • A~ = A* ~ = set of all circular words • Circular language C A~ ~w = [w] , w A* ~ set of equivalence classes A* ~ L Cir(L) = {~w | w L} (circularization of L) L C (A linearization of C, i.e. Cir(L)=C ) {w A*| ~w C}= Lin(C) (Full linearization of C) w=xy, w = yx are conjugate A* ~ w C Definition: FA~ ={ C A~ | L A*, Cir(L) = C, L FA, FA Chomsky hierarchy} Theorem [Head, Paun, Pixton] C Reg ~ Lin (C) Reg Circular splicing systems (A= finite alphabet, I A~ initial language) Paun’s definition ~hu u , 1 2 u2 hu1 SCPA = (A, I, R) ~ku u 3 4 u4ku3 A~ R A* | A* $ A* | A* rules r = u1 | u2 $ u3 | u4 R ~ u hu u ku 2 1 4 3 Definition A circular splicing language C(SCPA) (i.e. a circular language generated by a splicing system SCPA ) is the smallest circular language containing I and closed under the application of the rules in R Other definitions of splicing systems (A= finite alphabet, I A~ initial language) Head’s definition ~hpxq , q hpx SCH = (A, I, T) h (p, x, q ), ( u,x,v) T ~ hpx vkux q vkux Pixton’s definition ~h , A~ ~kuxv T A* A* A* triples SCPI = (A, I, R) ~ h h A~ R A* A* A* rules (, ; ), (, ; ) R ~ h h Problem: Characterize C(Reg, Fin) FA~ C(Fin, Fin) class of circular languages C= C(SCPA) generated by SCPA with I and R both finite sets. Theorem [ Paun96] F{Reg~, CF~, RE~} R +add. hyp. (symmetry, reflexivity, self-splicing) C(F, Fin) F Theorem [Pixton95-96] F{Reg~, CF~, RE~} R Fin+add. hyp. (symmetry, reflexivity) C(Reg~, Fin)Reg~, C(F, Reg) F Circular finite splicing languages and Chomsky hierarchy CS~ CF~ Reg~ ~((aa)*b) ~(aa)* I= ~aa ~1, R={aa | 1 $ 1 | aa} ~(an bn) I= ~ab ~1, R={a | b $ b | a} Our contributions Reg~ C(Fin, Fin) Fingerprint closed star languages X*, X regular group code Reg~ cyclic languages Cir (X*) X finite weak cyclic, other examples ~ (a*ba*)* Our contributions (continued) Comparing the three definitions of splicing systems C(SCH ) C(SCPA ) C(SCPI ) = ... ? ~ (a*ba*)*, ~ ((aa)*b) Star languages Definition L A* is star language if L is regular, closed under conjugacy relation and L=X*, with X regular Proposition: SCPA=(A,I,R), I Cir(X*) C(SCPA) Cir (X*) “Consistence” easily follows!!! Examples • (b*(ab*a)*)* = X* X=b ab*a • (a*ba*)* = X* X= a*ba* Fingerprint closed languages Definition For any cycle c, L contains the Fingerprints of c Fingerprint of a cycle c cnc L q0 q0 x x’ y y’ c=(x(y(zz’)jy’)i x’)nc z z’ power of the cycle, where the internal cycles are crossed a finite number of times i n y , j n x Theorem Fingerprint closed star languages C(Fin,Fin) Sketch Take SCPA = (A, I, R) with I=Cir({successful path containing fingerprint of cycles}) R={1 | 1 $ 1 | ƒ | ƒ fingerprint of cycle c, for any cycle c} Star languages fingerprint closed • X*, X regular group code (for example X=b ab*a) • X finite, Cir(X*) (for example X=Ad ) Star languages not fingerprint closed (a*ba*)* but not generated!!! new! Not Star Languages in C(Fin, Fin) Cyclic Languages Definition Cyclic(z) ={(~(z* p)) | p Pref (Lin( ~z))} Example z = abc A* Lin ( ~ z) =Lin ( ~ abc) ={abc, bca,cab} Pref(Lin ( ~ z)) =Pref(Lin ( ~ abc)) =Pref({abc, bca,cba}) = {a, ab, b, bc, c, ca} Cyclic(abc)= ~(abc)*a ~(abc)*ab ~(abc)*b ~(abc)*bc ~(abc)*c ~(abc)*ca Theorem For any z, |z|>2, z unbordered word, then Cyclic(z) C(Fin,Fin) i.e. z uA* A*u The proof is quite technical ... Example (continued) Cyclic (abc) is generated by SCPA = (A,I,R) where I,R are defined as follows I={~ ((abc)i p | 0 i 3, p Pref(Lin(~ (abc))) } R={z ab | z $ z | ca z, z a | z $ z | b z, z ab | z $ z b | c z, zb|z$z$cz, z ca | z $ z $ bc z, zc|z$z|az } Other circular regular splicing languages • ~(abc)*a ~(abc)*ab ~(abc)*b ~(abc)*bc ~(abc)*c ~(abc)*ca Cyclic(abc) weak cyclic languages • Cyclic (abca) .... bordered word... ~(abc)*ac Works in progress • Characterize Reg~ C(Fin, Fin) • Characterize FA~ C(Fin, Fin) • C(SCPI) = Star languages • Additional hypothesis r= u1 | u2 $ u3 | u4 in R • Reflexive: r’ = u1 | u2 $ u1 | u2 • Symmetric: r” = u3 | u4 $ u1 | u2 • Self-splicing: From ~ xu1u2yu3u4 , with r,r” as above, generates ~u4 xu1 , ~u2yu3 . DNA6 auditorium Thanks!