The coalition between Italian goats and Italian researchers: the Italian Goat Consortium

Paolo Ajmone Marsan and the Italian Goat Consortium
Institute of Zootechnics, Università Cattolica del Sacro Cuore, Piacenza, Italy
[email protected]
Cardiff, 17/06/2014

Outline
• The Italian Goat Consortium
• SNP diversity in Italian goats
• Perspectives

Italian Goat Consortium
• Hard times for the economy in Europe
  – Harder in Italy than in Central/Northern Europe
• Even harder for research funding
  – Very difficult for small ruminant research
    » No way for goat diversity!
• From crisis to opportunity
  – Modest seed funding of a project started in 2008 (Innovagen, funded by the Ministry of Agriculture)
• Coalition and coordination
  – Definitely a new model for Italian scientists

Italian Goat Consortium

Paola Crepaldi, Coordinator
Università degli Studi di Milano, ITALY

Fabio PILLA, Maria Silvia D'ANDREA
Università degli Studi del Molise, Campobasso, ITALY

Paolo AJMONE-MARSAN, Nicola BACCIU, Lorenzo BOMBA, Licia COLLI, Marco MILANESI
Università Cattolica del Sacro Cuore, Piacenza, ITALY

Antonello CARTA, Tiziana SECHI
AGRIS, Loc.
Bonassai, Sassari, ITALY

Stefania CHESSA, Bianca CASTIGLIONI
Consiglio Nazionale delle Ricerche, Lodi, ITALY

Donata MARLETTA, Salvatore BORDONARO
Università degli Studi di Catania, ITALY

Salvatore MURRU
Associazione Nazionale della Pastorizia, Roma, ITALY

Riccardo NEGRINI, Raffaele MAZZA
Associazione Italiana Allevatori, Roma, ITALY

Giulio PAGNACCO, Beatrice COIZET, Letizia NICOLOSO
Università degli Studi di Milano, ITALY

Alessio VALENTINI
Università degli Studi della Tuscia, Viterbo, ITALY

Paola Crepaldi
[email protected]
Pooling local efforts and resources for the genomic characterisation of Italian goat breeds
www.italiangoatconsortium.eu

Sampling
Camosciata delle Alpi, Saanen, Orobica, Bionda dell'Adamello, Val Passiria, Valdostana, Teramana, Grigia Ciociara, Nicastrese, Sarda, Aspromontana, Maltese, Girgentana, Argentata dell'Etna

50K Illumina goat SNP chip
• Discovery on 6 breeds (meat, mixed and milk)
• Detection of ~12 million variations, with > 10 million SNPs
• 60,000 SNPs (spaced on the genome, with > 0.2 MAF, > 0.8 Illumina ADT score…)
• 52,295 successful loci (tested with 288 goat DNA samples from 10 different breeds)
• Pseudochromosomes aligned on cattle
• Details on www.goatgenome.org
• Sequencing and a new de novo assembly ongoing at USDA

Dataset cleaning
• Filtering exclusion thresholds
  – MAF < 1%
  – Missing (SNP) > 5%
  – Missing (animal) > 5%
  – HW within breed FDR > 20%
• Working dataset
  – 15 breeds
  – 350 animals (15-32 per breed)
  – 51,136 SNPs
The SNP chip is affected by ascertainment bias (EU Nextgen project) but is highly informative for the Italian gene pool.

Within-breed MAF distribution
[Figure: distribution of SNPs across MAF classes (X = 0; 0 < X ≤ 0.05; 0.05 < X ≤ 0.1; 0.1 < X ≤ 0.2; 0.2 < X ≤ 0.3; 0.3 < X ≤ 0.4; 0.4 < X ≤ 0.5) for each breed (CAM, SAA, SAR, ARG, BIO, VPS, ASP, NIC, GCI, VAL, MAL, ORO, SAM, GIR, TER); axis from 0 to 52,000]

Expected vs observed heterozygosity
[Figure]

K=2
[Figure: ancestry plot at K=2, with breeds grouped as Alps, Center, South & Islands]

Cross-validation error plot
ADMIXTURE software (10 runs)
[Figure]

The best K (K=11)
[Figure: ancestry plot at K=11, with breeds grouped as Alps, Center, South & Islands]

Geographic distribution of 11 genomic components
[Figure]

Neighbour Net based on Reynolds distance
[Figure]

Reynolds distance

$$D_{Reynolds} = \frac{1}{2}\,\frac{\sum_i (x_i - y_i)^2}{1 - \sum_i x_i y_i}$$

where x_i and y_i are the frequencies of allele i in the two populations being compared.

Principal Component Analysis
Breed codes:
• CAM = Camosciata (Alpine)
• VAL = Valdostana
• SAA = Saanen
• BIO = Bionda
• ORO = Orobica
• VPS = Valpassiria
• GCI = Grigia Ciociara
• ARG = Argentata
• NIC = Nicastrese
• ASP = Aspromontana
• MAL = Maltese Siciliana
• GIR = Girgentana
• TER = Teramana
• SAM = Maltese Sarda
• SAR = Sarda
[Figure: PCA plot with annotations: Alps, Center-South, Teramana, Maltese]
[Figure: PCA plot with annotations: West-East Alps, Aspromontana, Girgentana, Maltese, Orobica]

LD in chromosome 6
[Figure: r² as a function of distance (Mb)]

Ne
Historical Ne of Italian goat breeds was estimated using SNeP. For each pair of SNPs within a chromosome, LD is calculated according to Hill & Robertson (1968); Ne is then estimated with the method of Sved (1971), correcting for sample size and mutation (Weir & Hill 1980, Hayes 2003, Corbin et al. 2012).
[Figure: effective population size (0 to 5,000) against generations ago (0 to 2,000), one curve per breed: ARG, ASP, BIO, CAM, GCI, GIR, MAL, NIC, ORO, SAA, SAM, SAR, TER, VAL, VPS]

Selection signatures
Lositan software
• Simulation of markers under the neutral model
• Detection of outliers having Fst higher or lower than expected under a neutral model at that value of heterozygosity
  – 456 markers under directional selection
  – 629 markers under balancing selection
What about breed diversity at these loci?
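As a rough illustration of the Ne estimation described in the Ne slide, the sketch below applies Sved's (1971) expectation E[r²] ≈ 1/(α + 4·Ne·c) to one distance bin of SNP pairs. It is not SNeP's implementation. The function name, the parameters alpha (mutation correction) and beta (sample-size correction, 2 for phased and 1 for unphased genotypes), and the linear 1 cM/Mb map used to turn physical distance into a recombination rate are all assumptions made for this example.

```python
def ne_from_ld(mean_r2, dist_bp, n_animals, alpha=1.0, beta=2.0, cm_per_mb=1.0):
    """Estimate (generations_ago, Ne) from the mean r2 of one distance bin.

    Hypothetical helper for illustration only.
    mean_r2    : mean r2 between SNP pairs separated by about dist_bp
    dist_bp    : physical distance between the SNPs, in base pairs
    n_animals  : number of animals used to compute r2
    alpha      : assumed mutation correction (1 = none, 2 = with mutation)
    beta       : assumed phase factor (2 = phased, 1 = unphased genotypes)
    cm_per_mb  : assumed constant recombination rate (cM per Mb)
    """
    # Recombination rate in Morgans under the assumed linear genetic map.
    c = dist_bp / 1e6 * cm_per_mb / 100.0
    # Sample-size correction of r2 (assumed form: r2 - 1 / (beta * n)).
    r2_adj = mean_r2 - 1.0 / (beta * n_animals)
    if c <= 0 or r2_adj <= 0:
        raise ValueError("distance and corrected r2 must both be positive")
    # Rearranged Sved expectation: Ne = (1 / r2 - alpha) / (4 c).
    ne = (1.0 / r2_adj - alpha) / (4.0 * c)
    # LD at recombination rate c reflects Ne about 1 / (2 c) generations ago.
    generations_ago = 1.0 / (2.0 * c)
    return generations_ago, ne


if __name__ == "__main__":
    t, ne = ne_from_ld(0.22, 1_000_000, 25)
    print(f"about {t:.0f} generations ago: Ne of about {ne:.0f}")
```

Under these assumptions, a bin with mean r² = 0.22 at 1 Mb in 25 animals corresponds to about 50 generations ago and an Ne of about 100. Repeating the calculation over bins of decreasing distance traces Ne further back in time, which is the kind of curve shown for each breed in the Ne figure.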
Markers under directional/balancing selection
Marker sets:
• Neutral = 50,051 markers
• Under_Sel = 456 markers
• Bal_Sel = 629 markers
• Neutral 1 = 456 markers
• Neutral 2 = 456 markers
[Figure: pairwise scatter plots of genetic distances between breeds for the Neutral, Under_sel, Bal_sel, Neutral1 and Neutral2 marker sets]

Pearson's correlation between genetic distances between breeds

        Ntr    Dir    Bal    Ntr1   Ntr2
Ntr     1      0.85   0.89   0.98   0.99
Dir            1      0.70   0.83   0.88
Bal                   1      0.85   0.88
Ntr1                         1      0.97
Ntr2                                1

Directional selection
[Figure: PCA, PC1 (13.509% of variance explained) and PC2 (4.794%), individuals labelled by breed; inset: variance explained (%) per PC]
[Figure: ADMIXTURE CV error for K = 2 to 30]
[Figure: Directional Selection Ancestry, ADMIXTURE trial at K=11; breeds SAR, SAM, MAL, GIR, ARG, ASP, NIC, TER, GCI, VAL, BIO, CAM, SAA, ORO, VPS; groups Alps, Center, South & Islands; panel label "Mostly Neutral"]

Balancing selection
[Figure: PCA, PC1 (0.884% of variance explained) and PC2 (0.846%), individuals labelled by breed; inset: variance explained (%) per PC]
[Figure: ADMIXTURE CV error for K = 2 to 30]
[Figure: Balancing Selection Ancestry, ADMIXTURE trial at K=11; breeds SAR, SAM, MAL, GIR, ARG, ASP, NIC, TER, GCI, VAL, BIO, CAM, SAA, ORO, VPS; groups Alps, Center, South & Islands; panel label "Mostly Neutral"]

Correlation between heterozygosities of breeds
[Figure: pairwise scatter plots of breed heterozygosities for the Neutral, Neutral1, Neutral2, Under_sel and Bal_sel marker sets]

              Neutral   Directional   Balancing   Neutral 1
Neutral       1         0.87          0.33        0.99
Directional             1             0.14        0.85
Balancing                             1           0.33
Neutral 1                                         1

Correlation between heterozygosities of individuals
[Figure: pairwise scatter plots of individual heterozygosities for the Neutral, Under_sel and Bal_sel marker sets]

              Neutral   Directional   Balancing
Neutral       1         0.76          0.66
Directional             1             0.33
Balancing                             1

Conclusions

SNPs vs others
• Much higher level of resolution (many thousands of markers vs a few)
• Robust and non-homoplasic
• Easier comparability across projects and data merging
• Suited to genome-wide analyses (ROH, Ne, selection signatures, GWAS, breeding applications)
• However, panels should be carefully prepared and evaluated (ascertainment bias)
Italian goats
• Little or no inbreeding.
• Variable level of admixture.
• Some distinct breeds: Girgentana, Teramana, Orobica, Maltese.
• Low Ne nowadays (bottlenecks, breeding management), higher in the past.
• Geographic partitioning of diversity at a small geographic scale (North-South, and East-West in the Alps).
• Markers under selection are valuable for conservation decisions.
• Neutral marker diversity is a reasonably good proxy for the diversity of markers under directional selection.

Breeding
• Breeding will be guided more and more by molecular analyses if costs continue to decrease
• Methods customised to populations (small vs large pop. improvement, inbreeding control, maintenance of diversity)
• Knowledge of population structure is needed in any kind of application to avoid false positives

International Networking
ADAPTMAP
Goat Adaptmap
Alessandra Stella
Traditional and novel approaches to study adaptation genomics:
• Selection signatures
• Spatial analysis
• Enriched SNP panel
• Detection of new variation
• To characterize the study
• Population studies based on mutation classification
• Mutation effect on protein structure and function used as a tag for adaptation

NextGen project
Pierre Taberlet, Francois Pompanon
Next generation methods to preserve farm animal biodiversity by optimizing present and future breeding options.

Interdisciplinarity and training
GIS Training

Future challenges
Copy Number Variations (CNVs)
Mefford and Eichler 2009
See following presentation of Fernando Garcia

Final consideration
• Very fast molecular tool development.
• Faster than our capacity to understand.
Under these circumstances, any loss of diversity before characterization is the loss of an invaluable opportunity for science and agriculture.

ACKNOWLEDGMENTS
• Marco Milanesi
• Elia Vajana
• Lorenzo Bomba
• Licia Colli
• Goat farmers
• International Goat Consortium (SNPChip)
• ASSONAPA (Italian small ruminant breeder association)