17 December 2010, Trento

Overlapping languages: the splicing code
Piva F, Giulietti M, Principato G
Department of Biochemistry, Biology and Genetics, Polytechnic University of Marche, Ancona

One point mutation at a time
BRCA1 exon 18
[Figure: splicing isoforms 17-18-19 and 17-19, labelled 100%, 20% and 80%]
Goina E, Skoko N, Pagani F. Binding of DAZAP1 and hnRNPA1/A2 to an Exonic Splicing Silencer in a Natural BRCA1 Exon 18 Mutant. Mol Cell Biol 2008; 28: 3850–3860

Two point mutations at a time
BRCA1 exon 18
Complete exon 18 skipping
Decreased efficiency
Goina E, Skoko N, Pagani F. Binding of DAZAP1 and hnRNPA1/A2 to an Exonic Splicing Silencer in a Natural BRCA1 Exon 18 Mutant. Mol Cell Biol 2008; 28: 3850–3860

Effect of variations in CFTR exon 9
[Figure: % exon 9 inclusion (axis 0 to 100) for WT and for single-nucleotide substitutions along 5’-ACAGTTGTTGGCGGTTG-3’ (WT positions 144A 145C 146A 147G 148T 149T 150G 151T 153G 154G 155C 156G 157G); some variants are marked pathological; gel bands ex9+ and ex9-]
Pagani, F., Buratti, E., Stuani, C., and Baralle, F. E. (2003) J Biol Chem
Pagani, F., Stuani, C., Zuccato, E., Kornblihtt, A. R., and Baralle, F. E. (2003) J Biol Chem

The genetic code is degenerate, but it is not all robustness
. . . Ala Val Arg . . .
Ala: GCA GCC GCG GCT (4 codons)
Val: GTA GTC GTG GTT (4 codons)
Arg: CGA CGC CGG CGT AGA AGG (6 codons)
4 * 4 * 6 = 96
Three AAs specified by 96 synonymous words

GCAGTACGA GCAGTACGC GCAGTACGG GCAGTACGT GCAGTAAGA GCAGTAAGG
GCAGTCCGA GCAGTCCGC GCAGTCCGG GCAGTCCGT GCAGTCAGA GCAGTCAGG
GCAGTGCGA GCAGTGCGC GCAGTGCGG GCAGTGCGT GCAGTGAGA GCAGTGAGG
GCAGTTCGA GCAGTTCGC GCAGTTCGG GCAGTTCGT GCAGTTAGA GCAGTTAGG
GCCGTACGA GCCGTACGC GCCGTACGG GCCGTACGT GCCGTAAGA GCCGTAAGG
GCCGTCCGA GCCGTCCGC GCCGTCCGG GCCGTCCGT GCCGTCAGA GCCGTCAGG
GCCGTGCGA GCCGTGCGC GCCGTGCGG GCCGTGCGT GCCGTGAGA GCCGTGAGG
GCCGTTCGA GCCGTTCGC GCCGTTCGG GCCGTTCGT GCCGTTAGA GCCGTTAGG
GCGGTACGA GCGGTACGC GCGGTACGG GCGGTACGT GCGGTAAGA GCGGTAAGG
GCGGTCCGA GCGGTCCGC GCGGTCCGG GCGGTCCGT GCGGTCAGA GCGGTCAGG
GCGGTGCGA GCGGTGCGC GCGGTGCGG GCGGTGCGT GCGGTGAGA GCGGTGAGG
GCGGTTCGA GCGGTTCGC GCGGTTCGG GCGGTTCGT GCGGTTAGA GCGGTTAGG
GCTGTACGA GCTGTACGC GCTGTACGG GCTGTACGT GCTGTAAGA GCTGTAAGG
GCTGTCCGA GCTGTCCGC GCTGTCCGG GCTGTCCGT GCTGTCAGA GCTGTCAGG
GCTGTGCGA GCTGTGCGC GCTGTGCGG GCTGTGCGT GCTGTGAGA GCTGTGAGG
GCTGTTCGA GCTGTTCGC GCTGTTCGG GCTGTTCGT GCTGTTAGA GCTGTTAGG

An additional exonic constraint: the splicing code

NF1 gene: cryptic exon, exon 31
ttttatagTGAGAATA
A>G
[Figure: WT and MUT]
The mutation activates a cryptic exon (shown in red)
Raponi M, Upadhyaya M, Baralle D. Functional splicing assay shows a pathogenic intronic mutation in neurofibromatosis type 1 (NF1) due to intronic sequence exonization. Hum Mutat. 2006; 27(3):294-295.

NF1 gene: cryptic exon, exon 31
Disruption of 5’ss restores normal splicing
TAGataata TAGgtggga TAGgtaata CAGgtattg CAAgtattg CAAgtaagc CAAgtaagg
Sequence 2 has a weaker 5’ splice site than sequence 1. Sequence 3 lacks the site.
Raponi M, Upadhyaya M, Baralle D. Functional splicing assay shows a pathogenic intronic mutation in neurofibromatosis type 1 (NF1) due to intronic sequence exonization. Hum Mutat. 2006; 27(3):294-295.
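
The count of 96 synonymous words for Ala-Val-Arg given earlier can be reproduced with a short Python sketch. The codon sets come from the standard genetic code; the names `SYNONYMOUS_CODONS` and `synonymous_words` are illustrative and do not come from the slides.

```python
from itertools import product

# Synonymous codons from the standard genetic code (DNA alphabet).
# Only the three amino acids used on the slide are listed.
SYNONYMOUS_CODONS = {
    "Ala": ["GCA", "GCC", "GCG", "GCT"],
    "Val": ["GTA", "GTC", "GTG", "GTT"],
    "Arg": ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"],
}


def synonymous_words(peptide):
    """Return every DNA word that encodes the given list of amino acids."""
    codon_choices = [SYNONYMOUS_CODONS[aa] for aa in peptide]
    # The Cartesian product picks one codon per amino acid in every possible way.
    return ["".join(codons) for codons in product(*codon_choices)]


words = synonymous_words(["Ala", "Val", "Arg"])
print(len(words))  # 4 * 4 * 6 = 96
print(words[:6])   # the six Arg alternatives after GCA GTA
```

All 96 words translate to the same peptide, so any difference in their splicing behaviour has to come from a constraint other than the protein sequence.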
ATM gene
structure: [Figure: exons 20 and 21]
mutations:
WT:  GGCCAGGTAAGTGATA
DEL: GGCCAG____GTGATA
MUT: GGCCAGGTCTGTGATA
results: [Figure: gel lanes M, WT, del, mut; products containing exons 20 and 21]
Pagani F, Buratti E, Stuani C, Bendix R, Dörk T, Baralle FE. A new type of mutation causes a splicing defect in ATM. Nature Genetics 2002, 30: 426-429

AIM: pre-mRNA sequence → SPLICING PREDICTION TOOL → mRNA structure

A compact formalism, but…
score matrix

Compression and reconstruction of motifs
Experimentally assessed binding sites: AGG AGT CGT AGG CGT
zip → consensus sequence: [A/C] G [G/T]
unzip → AGG AGT CGG CGT
CGG is reconstructed although it is not among the assessed sites.

Elements promoting exons: ESE, ISS
Elements promoting introns: ESS, ISE

PROTEINS REGULATING SPLICING STORED IN SPLICEAID
9G8, CUG-BP1, DAZAP1, ETR-3, Fox-1, Fox-2, FMRP, hnRNP A0, hnRNP A1, hnRNP A2/B1, hnRNP C, hnRNP C1, hnRNP C2, hnRNP D, hnRNP D0, hnRNP DL, hnRNP E1, hnRNP E2, hnRNP F, hnRNP G, hnRNP H1, hnRNP H2, hnRNP I (PTB), hnRNP J, hnRNP K, hnRNP L, hnRNP LL, hnRNP M, hnRNP P (TLS), hnRNP Q, hnRNP U, HTra2alpha, HTra2beta1, HuB, HuD, HuR, KSRP, MBNL1, Nova-1, Nova-2, nPTB, PSF, RBM4, RBM25, Sam68, SAP155, SC35, SF1, SF2/ASF, SLM-1, SLM-2, SRp20, SRp30c, SRp38, SRp40, SRp54, SRp55, SRp75, TDP43, TIA-1, TIAL1, YB-1, ZRANB2 …

Some comparisons among literature data (SpliceAid) and prediction tools
Columns: sequence; SpliceAid (experimentally assessed binding); predictors ESE Finder, Rescue ESE, Splicing Rainbow

ACAAC
  SpliceAid: YB-1
  ESE Finder: no binding
  Rescue ESE: no ESE
  Splicing Rainbow: SRp40
GAAGAAGA
  SpliceAid: HTra2A, HTra2B1, SF2/ASF, SC35, SRp40, SRp55, SRp75
  ESE Finder: no binding
  Rescue ESE: 3 ESE
  Splicing Rainbow: Tra2B
CUGGCGUCGUCGC
  SpliceAid: no binding
  ESE Finder: SF2/ASF, SRp55
  Rescue ESE: 2 ESE
  Splicing Rainbow: SRp40, SRp55
UGACUG
  SpliceAid: hnRNP A1
  ESE Finder: no binding
  Rescue ESE: no ESE
  Splicing Rainbow: SRp40, SRp55
UUUUAGACAA
  SpliceAid: hnRNP C1, Sam68, hnRNP A1, hnRNP D, hnRNP E1, hnRNP E2, SRp38
  ESE Finder: no binding
  Rescue ESE: 1 ESE
  Splicing Rainbow: hnRNP A2/B1, hnRNP C1/C2, hnRNP E1/E2, SRp40, SRp55, U2AF65
UGUGUGUGUGUGUGUGUG
  SpliceAid: CUG-BP1, ETR-3, TDP43
  ESE Finder: SRp55
  Rescue ESE: no ESE
  Splicing Rainbow: hnRNP U

SpliceAid 2

Exon junctions in phase 0
[Figure: nucleotide pairs (e.g. A…A, T…T, G…C) linking the last positions (up to -1) of exon m with positions 1 to 10 of exon n]
Exon junctions in phase 1
[Figure: nucleotide pairs (e.g. G…G, A…A, A…T) linking positions -10 to -1 of exon m with positions 1 to 10 of exon n]

Exon junctions in phase 2
[Figure: nucleotide pairs (e.g. A…A, A…T, C…G) linking positions -10 to -1 of exon m with positions 1 to 10 of exon n]

Favoured correlations between the end of one exon and the start of the next

Exons that start in phase 0 and end in phase 0
[Figure: nucleotide pairs linking positions 1 to 10 with positions -10 to -1]

Phase 0 → 1 and phase 0 → 2
[Figure: two panels of nucleotide pairs linking positions 1 to 10 with positions -10 to -1]

[Figure: nucleotide pairs linking positions 1 to 10 with positions -10 to -1]
Correlations between the start and end of human introns

Computations are in progress at CASPUR, using multiprocessor ClustalW and multithreaded programming, to repeat the analyses on a set of genes with lower redundancy.
[Figure: step-by-step partition of Seq 1 … Seq N into redundant and non-redundant sequences, one sequence moved at each step]

$$n_{\text{alignments}} = \sum_{x=1}^{N-1} x = \frac{N(N-1)}{2}$$

Starting from a set of 10,000 sequences, with no pruning at all, at most 49,995,000 alignments would be needed.

Other papers published or accepted in 2010:
Piva F, Giulietti M, Nardi B, Bellantuono C, Principato G. An improved in silico selection of phenotype affecting polymorphisms in SLC6A4, HTR1A and HTR2A genes. Human Psychopharmacology 2010; 25: 153-61.
Piva F, Ciaprini F, Onorati F, Benedetti M, Fattorini D, Ausili A, Regoli F. Assessing sediment hazard through a Weight Of Evidence approach with bioindicator organisms: a practical model to elaborate data from sediment chemistry, bioavailability, biomarkers and ecotoxicological bioassays. Chemosphere 2010 (accepted).
Bianchi F, Raponi M, Piva F, Viel A, Bearzi I, Galizia E, Bracci R, Belvederesi L, Loretelli C, Brugiati C, Corradini F, Baralle D, Cellerino R. An intronic mutation in MLH1 associated with familial colon and breast cancer. Familial Cancer 2010 (published).
Nardi B, Turchi C, Piva F, Giulietti M, Castellucci G, Arimatea E, Rocchetti D, Rocchetti G, Principato G, Tagliabracci A, Bellantuono C. Searching for a relationship between the Serotonin Receptor 2A Gene variations and the development of Inward and Outward Personal Meaning Organisations. Psychiatric Genetics 2010 (accepted).

Papers submitted in 2010:
Piva F, Giulietti M, Ballone Burini A, Principato G. SpliceAid 2: a database of human splicing factors expression data and RNA target motifs.
Piva F, Giulietti M, Baldelli L, Nardi B, Bellantuono C, Armeni T, Saccucci F, Principato G. Bioinformatic analyses to select phenotype affecting polymorphisms in HTR2C gene.
Piva F, Giulietti M, Principato G. CLIP data to detect polymorphisms lying in splicing regulatory motifs: a method to refine SNP selection in association studies.
Turchi C, Piva F, Solito G, Principato G, Buscemi L, Tagliabracci A. ADH4 intronic variations are associated with alcohol dependence: results from an Italian case-control association study.
Lenzi L, Facchin F, Piva F, Giulietti M, Pelleri MC, Frabetti F, Vitale L, Casadei R, Canaider S, Bortoluzzi S, Coppe A, Danieli GA, Principato G, Ferrari S, Strippoli P. TRAM (Transcriptome Mapper): database-driven creation and analysis of transcriptome maps from multiple sources.
Facchin F, …, Piva F, … Complexity of bidirectional transcription and alternative splicing at human RCAN3 locus.

Giovanni Principato
Francesco Piva
Matteo Giulietti
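
The "compression and reconstruction of motifs" slide can be illustrated with a minimal Python sketch. It assumes that "zip" means collapsing equal-length binding sites into a per-position set of observed bases and that "unzip" means expanding that consensus back into every word it matches. The function names `zip_motifs` and `unzip_motifs` are hypothetical, and this is not necessarily how any of the prediction tools named in the talk builds its motifs.

```python
from itertools import product


def zip_motifs(sites):
    """Compress equal-length sites into a list of per-position base sets."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    # For each column, keep the sorted set of bases seen in that column.
    return [sorted({site[i] for site in sites}) for i in range(length)]


def unzip_motifs(consensus):
    """Expand a consensus into every word compatible with it."""
    return ["".join(bases) for bases in product(*consensus)]


# Experimentally assessed binding sites as listed on the slide.
observed = ["AGG", "AGT", "CGT", "AGG", "CGT"]

consensus = zip_motifs(observed)
reconstructed = unzip_motifs(consensus)
# Words that the consensus accepts but that were never observed.
spurious = sorted(set(reconstructed) - set(observed))

print(consensus)
print(reconstructed)
print(spurious)
```

Under these assumptions the consensus is compact but lossy: the expansion contains CGG, which is absent from the assessed sites, which is the caveat raised on the slide about compact formalisms.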