Università degli Studi di Bologna - PhD programme in Industrial Chemistry - 17/2/2004

Application of QSAR methodologies to environmental problems of organic pollutants

Prof. Paola Gramatica
QSAR and Environmental Chemistry Research Unit
DBSF - University of Insubria - Varese
http://www.qsar.it

Prof. Paola Gramatica - QSAR Research Unit - DBSF - University of Insubria - Varese (Italy)

QSAR: Quantitative Structure-Activity Relationships

Quantitative Structure-Activity Relationships (QSAR) and Quantitative Structure-Property Relationships (QSPR)
“The structure of a chemical influences its properties and biological activity”
“Similar compounds behave similarly” (Hansch 1964)
Activity or Property = f (Structure)
It is possible to find a relationship (f) between the structure and the behavior (activity or property) of a chemical; f is then used to obtain PREDICTED DATA.

Chemicals (synthesis, natural products, xenobiotics)
• Physico-chemical properties
• Environmental fate and behavior: degradation, persistence, bioaccumulation, partitioning
• Biological activity: toxicity, mutagenicity, carcinogenicity, endocrine disruption

THE CHEMICAL UNIVERSE
• 22,000,000 chemicals in C.A.S. (NEW: 1,000,000 / year)
• 100,000 on the market (EINECS, TSCA) (NEW: 2,000 / year)
• 5%: known data from experiments
• Environmental fate? Human effects? -> QSAR

(Q)SAR History
• STRUCTURE -> PHYSICO-CHEMICAL PROPERTIES: alkane m.p. and b.p. (Cros, 1863) and alcohol water solubility, both as a function of the number of carbon atoms (n.C) and M.W.
• PHYSICO-CHEMICAL PROPERTIES -> BIOLOGICAL ACTIVITY: alcohol toxicity as a function of the oil/water partition coefficient, log P (Meyer-Overton 1899-1901)
• STRUCTURE / PROPERTIES -> BIOLOGICAL ACTIVITY (Hansch 1964)

Classical Hansch equation:
“Toxicity” = a + b logP + c E + d S
• logP (or log Kow, partition coefficient between octanol and water): hydrophobicity term. It expresses the probability or ability of the chemical to reach the target site.
• E: electronic term
• S: steric term, related to bulk and shape
• E and S express the possibility of the chemical to interact with the target and to be active.
Congenericity principle

General scheme: CHEMICALS, MOLECULAR DESCRIPTORS, PHYSICO-CHEMICAL PROPERTIES, BIOLOGICAL ACTIVITIES
• M: experimental measures of properties
• A: experimental measures of activities
• D: theoretical procedures for descriptors
• R1, R2, R3: mathematical relationships

THE 3 NECESSITIES:
• GOOD INPUT DATA: high-quality experimental data as input to find the structure-activity relation
• MEANINGFUL STRUCTURAL INFORMATION: good representation of the chemical structure (molecular descriptors)
• PREDICTIVE MODELS: quantitative models with validated predictive performance

Experimental data set
There is a need for a “limited” number of HIGH-QUALITY experimental data on which to develop QSAR models!
The models will only be as good as the data used to develop them: “Garbage in, garbage out”.
EXPERIMENTAL DATA SHOULD BE:
• AS NUMEROUS AS POSSIBLE
• CORRECT
• REPRESENTATIVE
• HOMOGENEOUS (ideally, same lab, same method)
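The Hansch-type relationship above can be illustrated with a minimal sketch that fits only the hydrophobicity term, toxicity = a + b logP, by ordinary least squares. The data values, the function names `fit_line` and `predict`, and the restriction to a single descriptor are assumptions made for illustration; they do not come from the slides.

```python
# Sketch: least-squares fit of a one-term Hansch-type equation,
# toxicity = a + b * logP. All numbers below are hypothetical.

def fit_line(x, y):
    """Return intercept a and slope b of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical congeneric series: logP and toxicity expressed as log(1/C).
log_p = [0.5, 1.0, 1.5, 2.0, 2.5]
toxicity = [1.1, 1.6, 2.0, 2.6, 3.0]

a, b = fit_line(log_p, toxicity)

def predict(new_log_p):
    """Predicted toxicity of a new chemical of the same congeneric series."""
    return a + b * new_log_p

print(f"toxicity = {a:.2f} + {b:.2f} logP")
```

A full Hansch model would add the electronic and steric terms as further columns and use multiple linear regression; the single-descriptor case is kept here only because it has a closed-form solution.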
The “magic” molecular descriptor
Molecular fragments (OH, Cl) -> C log P software -> log P (or Kow) -> bioconcentration, sorption, water solubility, toxicity
“35 years of (ab)using of log P for everything modelling is enough!” (R. Schwarzenbach, SETAC 2000)
FROM PARTITION PROPERTY TO ACTUAL MOLECULAR STRUCTURE...

MOLECULAR DESCRIPTORS
Representation of a chemical by numerical indices
[Figure: 0D, 1D, 2D and 3D representations of a chlorinated organic molecule]

Chemometric Methods
CHEMICALS -> EXPERIMENTAL DATA (Y) and MOLECULAR DESCRIPTORS (x1, x2, ... xn) -> TRAINING SET
EXPLORATIVE ANALYSIS:
- Principal Component Analysis
- Cluster Analysis
CLASSIFICATION METHODS (quantitative models for qualitative responses):
- Classification Tree (CART)
- Discriminant Analysis
- Neural Networks
REGRESSION METHODS (quantitative models for quantitative responses):
- Multivariate Linear Regression (MLR)
- Partial Least Squares Regression (PLS)

DATA SET SPLITTING
Splitting criteria:
• Dimension
• Chemical composition
TRAINING SET -> REGRESSION MODEL: FITTING (R2), INTERNAL VALIDATION (Q2LOO, leave-one-out; Q2LMO, leave-many-out)
TEST SET -> EXTERNAL VALIDATION (Q2EXT)
PREDICTABILITY -> NEW DATA

QSAR MODEL
MOLECULAR DESCRIPTORS -> INFORMATION SELECTION -> model with relevant information
Y = f (selected descriptors), here with Y = TOXICITY

LIMITATIONS OF QSAR MODELS
• Statistical quality: fitting (R2), predictability (Q2)
• Chemical domain: outliers, prediction reliability
[Figure: predicted response vs. experimental response]

CHEMICALS -> EXPERIMENTAL DATA (Y) and MOLECULAR DESCRIPTORS (X) -> MODEL (maximum fitting, predictive power)
NEW CHEMICALS -> MOLECULAR DESCRIPTORS -> MODEL -> PREDICTION
REVERSIBLE DECODING ???

APPLICATIONS of QSAR PREDICTIONS
• Filling of data gaps
• Validation of experimental data
• Screening, ranking and priority setting
• Highlighting chemicals of concern (also before their synthesis)
PRIORITY LISTS: optimize industry resource allocation, minimize animal testing

ENVIRONMENTAL PARAMETERS
Priority setting / risk assessment parameters and quality of the QSAR models:
• Physico-chemical data (m.p.; b.p.; vapour pressure; Henry's law constant; water solubility; partition coefficients (Kow, Koc, …)): OPTIMUM
• Environmental fate and pathways (chemical-, photo- and bio-degradation; bioaccumulation; compartment partitioning): HIGH
• Ecotoxicity (algae; Daphnia; fish; …): MEDIUM-HIGH
• Mammalian toxicity (skin, eye, oral and inhalation acute toxicity; mutagenicity; carcinogenicity; toxicity to the reproductive system; ...): MEDIUM

QSAR in U.S.
Since 1979/80: wide use and “abuse”
• EPA / OPPT: Office of Pollution Prevention and Toxics
• TSCA: Toxic Substances Control Act inventory (~75,000 chem.)
• NCP: New Chemicals Program (PMN with QSAR data)

QSAR in E.U.
Since 1992/93 but, so far, limited use
EC Regulation on Evaluation and Control of Risks of Existing Substances
• IPS: Informal Priority Setting method
• EURAM: European Union Ranking Method

WHITE PAPER on the Strategy for a Future Chemicals Policy (Feb 2001)
• Same regulation for new and existing chemicals (1:15)
• Responsibility for testing and risk assessment moves from authorities to industries
• REACH system: Registration, Evaluation, Authorisation of Chemicals
  - Registration by companies for > 1 t production (30,000): by 2005 for 1000 t (HPV), by 2012 for all
  - Evaluation of information by authorities for > 100 t (5000) by 2008
  - Authorisation for carcinogenic, mutagenic, toxic to reproduction and POPs

QSAR in WHITE PAPER
Art. 3.2: ... “to keep animal testing to a minimum”
• Development and validation of alternative methods: ECVAM (European Centre for the Validation of Alternative Methods), JRC Ispra
• Inclusion in the Community legislation and OECD Test Guidelines Programme for international recognition: Task Force of Experts in QSAR
• Particular research efforts for developing and validating modelling (e.g. QSAR) and screening methods for assessing the potential adverse effects of chemicals: Setubal Principles
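The validation statistics named on the data set splitting slide (R2 for fitting, Q2LOO for internal validation) can be sketched for a one-descriptor model as follows. This is an illustrative sketch, not the procedure used in the talk: the data are hypothetical, the helper names are invented here, and Q2LOO is computed as 1 - PRESS/TSS with TSS taken around the mean of the full training set, which is one common convention.

```python
# Sketch: fitting (R2) versus leave-one-out cross-validation (Q2LOO)
# for a one-descriptor linear model. All numbers are hypothetical.

def fit_line(x, y):
    """Return intercept a and slope b of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b

def r2_and_q2_loo(x, y):
    """Return (R2, Q2LOO) as fractions between 0 and 1."""
    n = len(y)
    mean_y = sum(y) / n
    tss = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares

    # Fitting: residuals of the model built on all objects.
    a, b = fit_line(x, y)
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

    # Internal validation: each object is predicted by a model
    # that was built without it (PRESS = predictive residual sum of squares).
    press = 0.0
    for i in range(n):
        a_i, b_i = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a_i + b_i * x[i])) ** 2

    return 1 - rss / tss, 1 - press / tss

descriptor = [0.5, 1.0, 1.5, 2.0, 2.5]   # hypothetical descriptor values
response = [1.1, 1.6, 2.0, 2.6, 3.0]     # hypothetical experimental responses

r2, q2_loo = r2_and_q2_loo(descriptor, response)
print(f"R2 = {100 * r2:.1f}%  Q2LOO = {100 * q2_loo:.1f}%")
```

For a least-squares model Q2LOO cannot exceed R2, so a large gap between the two is a warning sign of overfitting; Q2EXT is obtained in the same way but on test-set objects never used to build the model.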
Classes of compounds studied
• POP (Persistent Organic Pollutants), PBT (Persistent Bioaccumulative Toxics)
• VOC (Volatile Organic Compounds) and HPV (High Production Volume)
• Pesticides: insecticides, herbicides, …
• Polycyclic aromatic hydrocarbons (PAH)
• Polychlorinated biphenyls (PCB), dioxins
• Substituted benzenes and phenols
• Products of the Italian chemical industry (for FEDERCHIMICA)
• EU Priority List 1

Environmental applications
Prediction of physico-chemical properties for studies of partitioning among environmental compartments:
- bioconcentration (BCF)
- volatility (log Koa, log H, Vp)
- soil adsorption coefficient (log Koc)
- mobility indices (leaching…)
Prediction of environmental persistence parameters:
- atmospheric reactivity (rate constants of reaction with OH and NO3 radicals and O3)
- environmental persistence indices (half-lives)
- biodegradability …
Prediction of biological activities:
- toxicity
- mutagenicity …

Principal Component Analysis (PCA)
PCA is an exploratory analysis of multivariate data. The principal components are:
• Linear combinations of the original variables
• Ordered according to the directions of maximum variance (PC1, PC2, ...)
• Uncorrelated
• Therefore new variables that condense and “clean” the information contained in the original data
• Representative of macro-properties of the original data set
PCA consists of a rotation of the original data space such that the individual components are mutually orthogonal. The data are thus “seen” in a different reference system, from viewpoints controlled for the quality and quantity of the information represented.

PERSISTENCE: ATMOSPHERIC DEGRADATION REACTIONS
Reaction rate constants for the degradation by tropospheric oxidants: OH• and NO3• radicals and ozone

GLOBAL ATMOSPHERIC PERSISTENCE INDEX (ATPIN)
Principal Component Analysis of 65 VOCs
PC1 score = ATPIN (Atmospheric Persistence Index)
Cum. E.V.% = 95.1% (PC1 = 88.3%)
[Figure: PCA biplot, PC1 = ATPIN (atmospheric persistence) vs. PC2, with loadings -logkOH, -logkNO3, -logkO3]

QSAR model of ATPIN (65 objects, 4 variables):
ATPIN = 20.37 + 1.94 HOMO - 2.34 nBnz - 0.29 AMW + 0.27 DELS
R2 = 93.76%, Q2LOO = 92.53%, Q2LMO(50%) = 91.44%, RMS = 0.163
DESCRIPTORS (in order of significance):
• HOMO: highest occupied molecular orbital (nucleophilicity)
• nBnz: number of aromatic rings
• AMW: average molecular weight
• DELS: molecular electrotopological variation (charge distribution)
[Figure: ATPIN calculated vs. ATPIN from PCA]

Example: QSAR Models for Degradation by NO3• (114 chemicals)

Obj.Tr.  Obj.Test  Var.N.  Variables         R2    Q2LOO  Q2LMO(50%)  Q2ext  RMS
114      -         3       HOMO nBnz MATS1m  92.9  92.3   92.1        -      0.58
77       37        3       HOMO nBnz MATS1m  90.3  91.2   89.6        95.9   0.59

DESCRIPTORS (in order of significance):
• HOMO: highest occupied molecular orbital (nucleophilicity of molecules)
• nBnz: number of aromatic rings
• MATS1m: 2D Moran autocorrelation (atomic distribution)
[Figure: -logk(NO3) predicted vs. -logk(NO3) experimental, training set and test set]
Gramatica et al., Atmos. Environ. 2003, 37, 3115-3124.

“Enlarged” GLOBAL ATMOSPHERIC PERSISTENCE INDEX (ATPIN): 399 VOCs
Principal Component Analysis of experimental + predicted data (399 obj.), overlaid on the PCA of experimental data only (65 obj.)
• 65 VOCs: Cum. E.V.% = 95.1% (PC1 = 88.3%)
• 399 VOCs: Cum. E.V.% = 95.3% (PC1 = 80.9%)
[Figure: PCA biplots, PC1 = ATPIN (atmospheric persistence) vs. PC2, with loadings -log k(OH), -log k(NO3), -log k(O3)]
P. Gramatica et al., SAR & QSAR Env. Res., 13, 2002, 743-753.

QSAR Modelling of the “enlarged” GLOBAL ATMOSPHERIC PERSISTENCE INDEX
PC1 score = ATPIN (399 chemicals, experimental + predicted data)

Obj.Tr.  Obj.Test  Var.N.  Variables        R2    Q2LOO  Q2LMO(50%)  Q2ext  RMS
399      -         3       HOMO nBnz BEHe4  93.3  93.2   93.2        -      0.42
255      174       3       HOMO nBnz BEHe4  93.7  93.5   93.4        92.7   0.41

DESCRIPTORS (in order of significance):
• HOMO: highest occupied molecular orbital (nucleophilicity of molecules)
• nBnz: number of aromatic rings
• BEHe4: descriptor weighted by electronegativity (charge distribution)
[Figure: ATPIN calculated vs. ATPIN from PCA, training set and test set]

Ranking of pesticides for environmental distribution, based on PCA
Principal Component Analysis (PCA) ON CHEMICAL-PHYSICAL PROPERTIES OF 54 PESTICIDES
Cum. E.V. = 94.6% (PC1 = 70.1%)
CLA 1 = MEDIUM COMP.; CLA 2 = VOLATILE COMP.; CLA 3 = SOLUBLE COMP.; CLA 4 = SORBED COMP.
[Figure: PCA biplot, PC1 vs. PC2, with loadings log H, log vp, logKow, logKoc, logS and the directions of volatility, solubility and sorption]

Clustering of pesticides for environmental distribution in 4 a priori classes
[Figure: dendrogram (similarity 0-100) with four clusters. 1: not-volatile/medium comp.; 2: volatile comp.; 3: soluble comp.; 4: sorbed comp.]

Definition of 4 a priori classes of pesticides for environmental distribution
PCA ON CHEMICAL-PHYSICAL PROPERTIES OF 54 PESTICIDES, Cum. E.V. = 94.6% (PC1 = 70.1%)
CLA 1 = NOT-VOLATILE COMP.; CLA 2 = VOLATILE COMP.; CLA 3 = SOLUBLE COMP.; CLA 4 = SORBED COMP.
[Figure: PCA score plot, PC 1 vs. PC 2, with the four classes outlined and the directions of volatility, solubility, sorption and leaching]

Classification of 54 pesticides for environmental distribution
Classification Tree with split values MW = 317.69, J = 2.53, nHD = 1
DESCRIPTORS:
• MW: molecular weight (size)
• J: Balaban topological index
• nHD: number of donor atoms in hydrogen bonds
Classes: 1 not volatile/medium comp.; 2 volatile comp.; 3 soluble comp.; 4 sorbed comp.
P. Gramatica, .., Int. J. Environ. Anal. Chem. 84, 65-74, 2004

Environmental Partitioning for 135 pesticides
The “FOUR-LEAVES” approach (P. Gramatica and A. Di Guardo, Chemosphere, 2002, 47, 947-956)
PC1 score = LIN (Leaching Index)
PC2 score = VIN (Volatility Index)
[Figure: PCA biplot, LIN vs. VIN, with loadings log VP, log H, log Kow, log Koc, log Sw and the direction of leaching]
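The idea of using a PC1 score as a global index (ATPIN, LIN) can be sketched for the simplest case of two autoscaled variables, where the principal components have a closed form. This is an illustrative sketch only: the function names and the data are hypothetical, and the real indices were built from more than two properties, which requires a general eigenvalue routine.

```python
import math

# Sketch: PCA on two autoscaled variables. For a 2x2 correlation matrix
# [[1, r], [r, 1]] the eigenvectors are (1, 1)/sqrt(2) and (1, -1)/sqrt(2),
# with eigenvalues 1 + r and 1 - r. All data are hypothetical.

def autoscale(values):
    """Centre to zero mean and scale to unit standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pca_two_variables(x1, x2):
    """Return PC1 scores, PC2 scores and the variance fraction explained by PC1."""
    z1, z2 = autoscale(x1), autoscale(x2)
    n = len(z1)
    r = sum(a * b for a, b in zip(z1, z2)) / (n - 1)  # correlation coefficient
    s = 1.0 if r >= 0 else -1.0   # PC1 follows the direction of maximum variance
    w = 1 / math.sqrt(2)
    pc1 = [w * (a + s * b) for a, b in zip(z1, z2)]
    pc2 = [w * (a - s * b) for a, b in zip(z1, z2)]
    return pc1, pc2, (1 + abs(r)) / 2

# Hypothetical -log k values of five chemicals for two oxidants.
minus_log_k_oh = [10.2, 11.0, 11.5, 12.1, 12.9]
minus_log_k_no3 = [12.5, 13.9, 14.2, 15.8, 16.4]

pc1, pc2, explained = pca_two_variables(minus_log_k_oh, minus_log_k_no3)
print(f"PC1 explains {100 * explained:.1f}% of the variance")
print("PC1 scores (index values):", [round(v, 2) for v in pc1])
```

Because both variables increase with persistence, the PC1 score orders the chemicals along a single persistence axis, which is the role played by ATPIN in the slides.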
Environmental Partitioning Indexes: Regression

LIN (Leaching Index) = -3.04 - 0.96 nX - 2.28 nNO + 3.42 Ms - 1.74 ICR - 0.45 nS
n = 135, R2 = 87.0%, Q2LOO = 85.8%, Q2LMO = 85.7%, SDEP = 0.68, SDEC = 0.65
[Figure: LIN experimental vs. LIN calculated]

VIN (Volatility Index) = 2.35 - 1.58 HY - 0.230Xv - 0.28 nBM - 0.40 nCIC - 1.42 ASP + 0.97 IVDE
n = 135, R2 = 77.1%, Q2LOO = 74.1%, Q2LMO = 72.6%, SDEP = 0.66, SDEC = 0.63
[Figure: VIN experimental vs. VIN calculated]
P. Gramatica and A. Di Guardo, Chemosphere, 2002, 47, 947-956.

WQO - Water Quality Objectives
EEC Priority List 1: toxicity tests on algae, Daphnia, fish

TOXICITY IN DAPHNIA
Log 1/EC50 = -3.57 + 4.05 nP - 0.39 nHA + 1.02 IDM + 0.67 E1m
n = 94, R2 = 84.2%, Q2LOO = 82.1%, Q2LMO = 81.7%
• nP: number of phosphorus atoms
• nHA: number of H-bond acceptors
• IDM: mean information content on the distance magnitude
• E1m: atomic distribution, directional WHIM descriptor
[Figure: calculated toxicity vs. experimental toxicity]

PRINCIPAL COMPONENT ANALYSIS (PCA)
- all toxicity data available for 37 chemicals (E.V.: 90%)
- experimental + predicted data for 97 chemicals (E.V.: 93.7%)
Ecotox and Environ Safety, 49, (2001) 206-220
[Figure: PCA biplot, PC1 (from low to high toxicity) vs. PC2, with loadings T. Daphnia, T. Fish, T. Algae; classes I, II, III; experimental and predicted data]

CART CLASSIFICATION
Obj. n.: 125; NoModel ER: 40.8%
Selected var.: ZM1, CHI0 (split values ZM1 = 72, CHI0 = 5.13); ER: 7.2%; cvER: 15.2%
• ZM1: Zagreb index
• CHI0: connectivity index
[Figure: experimental class vs. assigned class (I, II, III)]

SCREENING of POPs for overall persistence based on half-life in air, water, soil

HALF-LIFE in SOIL
Log h.l. soil = -3.46 + 0.58 IDM + 0.99 E2m + 0.48 G2e
obj. = 30, R2 = 83.2%, Q2LOO = 77.8%, Q2LMO = 76.9%
• IDM: mean information index on distance magnitude
• E2m, G2e: directional WHIMs
[Figure: predicted half-life vs. experimental half-life]

PRINCIPAL COMPONENT ANALYSIS (PCA)
- all half-life data available for 29 chemicals (Cum. E.V.: 87.6%)
- experimental + predicted data for 91 chemicals (Cum. E.V.: 79.5%)
[Figure: PCA biplots, PC 1 (persistence) vs. PC 2, with loadings half-life air, half-life surf. W., half-life groundW., half-life soil]

OVERALL PERSISTENCE INDEX (PC1)
PC1 = 9.22 + 3.14 AAC - 6.32 E2s - 17.49 E1e - 0.16 Tm
obj. = 91, R2 = 85.1, Q2LOO = 82.6, Q2LMO = 82.2
[Figure: predicted PC 1 scores vs. PC 1 scores]
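The error rates quoted for the CART classification (NoModel ER, ER) can be sketched as below. The sketch assumes the usual chemometric reading of these quantities: ER is the fraction of misclassified objects, and the no-model error rate is the error obtained by assigning every object to the most populated class. The class labels and assignments are hypothetical, and the function names are invented for this example.

```python
from collections import Counter

# Sketch: error rate of a classifier versus the "no-model" error rate.
# All class labels below are hypothetical.

def error_rate(true_classes, assigned_classes):
    """Fraction of objects whose assigned class differs from the true one."""
    wrong = sum(1 for t, p in zip(true_classes, assigned_classes) if t != p)
    return wrong / len(true_classes)

def no_model_error_rate(true_classes):
    """Error made by assigning every object to the largest class."""
    largest = max(Counter(true_classes).values())
    return 1 - largest / len(true_classes)

# Ten hypothetical chemicals in three toxicity classes.
experimental = ["I"] * 5 + ["II"] * 3 + ["III"] * 2
assigned = ["I", "I", "I", "I", "I", "II", "II", "III", "III", "III"]

no_model_er = no_model_error_rate(experimental)
er = error_rate(experimental, assigned)
print(f"NoModel ER = {100 * no_model_er:.1f}%  ER = {100 * er:.1f}%")
```

A classification model is informative only when its ER, and more importantly its cross-validated cvER, is clearly below the no-model value.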
[Figure: PCA of half-lives, PC 1 (persistence) vs. PC 2 (from SORBED to SOLUBLES and VOLATILES), with loadings half-life air, half-life surf. W., half-life groundW., half-life soil]

“SCREENING” of POPs for Long Range Transport (LRT)
To have minimum LRT potential:
• Minimize persistence (PC1 in the PCA of half-lives, previous figure)
• Minimize mobility (PC1 in the PCA of physico-chemical properties, example alongside)
[Figure: Principal Component Analysis of physico-chemical properties, PC1 (74.6%) = MOBILITY vs. PC2, with loadings mp, logH, logKow, logKoa, MW, logSol, logVP]
MCDM: Multicriteria Decision Making
Utility function: $F(x) = \sum_{i=1}^{k} \lambda_i \, f_i(x)$

LRT potential for 81 chemicals
[Figure: utility (0 to 1) vs. chemical ID, with three groups: low LRT potential (Class 3), medium LRT potential (Class 2), high LRT potential (Class 1)]
CART classification (nC, E1u): MRcv = 6.2%, No Model 42%
Regression: LRT = 0.08 - 0.04 nCl + 0.06 X2 - 0.10 SPAM + 0.44 E1u
Q2LOO = 86.4%, Q2LMO = 85.7%
[Figure: LRT potential predicted vs. LRT index from MCDM (utility function)]
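The MCDM utility function above, a weighted sum of criterion functions, can be sketched for the two criteria named on the slide (persistence and mobility, both to be minimized). Everything specific in this sketch is an assumption made for illustration: the linear scaling of each criterion to the 0-1 range, the equal weights, the score ranges, the chemical names and values, and the convention that a higher utility means a lower LRT potential.

```python
# Sketch: MCDM utility F(x) = sum_i lambda_i * f_i(x) with two criteria.
# Scaling, weights and data are hypothetical.

def desirability_minimize(value, worst, best):
    """Scale a criterion to 0-1: 1 at 'best' (lowest value), 0 at 'worst' (highest)."""
    f = (worst - value) / (worst - best)
    return min(1.0, max(0.0, f))

def utility(criteria, weights):
    """Weighted sum of the scaled criteria; the weights are assumed to sum to 1."""
    return sum(w * f for w, f in zip(weights, criteria))

# Hypothetical (persistence PC1 score, mobility PC1 score) per chemical.
chemicals = {"A": (-3.0, -2.0), "B": (0.0, 1.0), "C": (3.5, 4.0)}
weights = (0.5, 0.5)

scores = {}
for name, (persistence, mobility) in chemicals.items():
    f_persistence = desirability_minimize(persistence, worst=4.0, best=-4.0)
    f_mobility = desirability_minimize(mobility, worst=5.0, best=-5.0)
    scores[name] = utility((f_persistence, f_mobility), weights)

# Highest utility first, i.e. lowest LRT potential first under this convention.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, {k: round(v, 4) for k, v in scores.items()})
```

Changing the weights shifts the ranking towards one criterion or the other, which is why the choice of the weights needs to be stated whenever such an index is reported.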
Acknowledgements
Dr. Ester Papa
Dr. Pamela Pilutti
Dr. Francesca Battaini
Dr. Fulvio Villa

http://dipbsf.uninsubria.it/qsar
http://www.qsar.it