Università degli Studi di Bologna - PhD Programme in Industrial Chemistry - 17/2/2004
Application of QSAR methodologies to environmental problems of organic pollutants
Prof. Paola Gramatica
QSAR and Environmental Chemistry Research Unit
DBSF - University of Insubria - Varese
http://www.qsar.it
Prof. Paola Gramatica - QSAR Research Unit - DBSF - University of Insubria - Varese (Italy)
QSAR
Quantitative Structure-Activity Relationships (QSAR)
and
Quantitative Structure-Property Relationships (QSPR)
“The structure of a chemical influences its properties and biological activity”
“Similar compounds behave similarly”
(Hansch 1964)
Activity or Property = f (Structure)
It is possible to find a relationship (f) between Structure and behavior
(Activity or Property) of a chemical
PREDICTED DATA
Chemicals: synthesis, natural products, xenobiotics
Physico-chemical properties
Environmental fate and behavior: degradation, persistence, bioaccumulation, partitioning
Biological Activity: toxicity, mutagenicity, carcinogenicity, endocrine disruption
THE CHEMICAL UNIVERSE
22.000.000 chemicals in C.A.S. (NEW: 1.000.000 / year)
100.000 on market, EINECS and TSCA (NEW: 2.000 / year)
5%: known data from experiments
QSAR
Environmental fate?
Human effects?
(Q)SAR History
STRUCTURE (n. C, M.W.) -> PHYSICO-CHEMICAL PROPERTIES: alkane m.p. and b.p. (Cros, 1863); alcohol water solubility
PHYSICO-CHEMICAL PROPERTIES (oil/water part. coeff., Log P) -> BIOLOGICAL ACTIVITY: alcohol toxicity (Meyer-Overton 1899-1901)
STRUCTURE / PROPERTIES -> BIOLOGICAL ACTIVITY (Hansch 1964)
Classical Hansch equation:
“Toxicity” = a + b logP + c E + d S
logP (or log Kow, partition coefficient between octanol and water): hydrophobicity term.
The probability or ability of the chemical to reach the target site
E: electronic term
S: steric term related to bulk and shape
E and S: the possibility of the chemical to interact with the target and to be active
Congenericity principle
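The coefficients a, b, c and d of the Hansch equation are usually estimated by least squares on a congeneric series. The following is a minimal sketch of that step, assuming a small hypothetical data set (all numbers are invented for illustration) and using only the Python standard library; the function names `solve_linear_system` and `fit_hansch` are not from the original slides.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_hansch(log_p, e_term, s_term, toxicity):
    """Least-squares (a, b, c, d) for toxicity = a + b logP + c E + d S."""
    rows = [[1.0, lp, e, s] for lp, e, s in zip(log_p, e_term, s_term)]
    k = 4
    # Normal equations: (X'X) coeffs = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, toxicity)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical congeneric series: descriptor values are illustrative only.
log_p = [1.2, 2.0, 2.7, 3.1, 3.9, 4.4]
e_term = [0.10, -0.20, 0.30, 0.00, -0.10, 0.25]
s_term = [0.5, 1.1, 0.8, 1.6, 1.3, 2.0]
true_coeffs = (0.5, 0.9, -1.2, 0.3)  # used only to generate noise-free toy data
toxicity = [true_coeffs[0] + true_coeffs[1] * lp + true_coeffs[2] * e + true_coeffs[3] * s
            for lp, e, s in zip(log_p, e_term, s_term)]

a, b, c, d = fit_hansch(log_p, e_term, s_term, toxicity)
print(f"Toxicity = {a:.2f} + {b:.2f} logP + {c:.2f} E + {d:.2f} S")
```

Because the toy responses are generated without noise, the fit recovers the generating coefficients; with real data a numerical library and a validated model would be used instead of this hand-written solver.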
[Diagram: CHEMICALS are characterized by MOLECULAR DESCRIPTORS (D), PHYSICO-CHEMICAL PROPERTIES (M) and BIOLOGICAL ACTIVITIES (A); R1, R2 and R3 are the relationships among them]
M: experimental measures of properties
A: experimental measures of activities
D: theoretical procedures for descriptors
R1, R2, R3: mathematical relationships
THE 3 NECESSITIES:
GOOD INPUT DATA
High-quality experimental data as input data to find the Structure-Activity
Relation
MEANINGFUL STRUCTURAL INFORMATION
Good representation of the chemical structure: molecular descriptors
PREDICTIVE MODELS
Quantitative models with validated predictive performances
Experimental data set
There is a need for a “limited” number of HIGH-QUALITY
experimental data on which to develop QSAR models!
The models will only be as good as the data used to develop them!
“Garbage in, garbage out”
NEEDS FOR EXPERIMENTAL DATA:
AS NUMEROUS AS POSSIBLE
CORRECT
REPRESENTATIVE
HOMOGENEOUS
(ideally, same lab, same method)
The “magic” molecular descriptor: Log P (or Kow)
Calculated from molecular fragments (e.g. OH, Cl) by C log P software
Used to model: bioconcentration, sorption, water solubility, toxicity
“35 years of (ab)using of log P for everything modelling is enough!” (R. Schwarzenbach, SETAC 2000)
FROM PARTITION PROPERTY TO ACTUAL MOLECULAR STRUCTURE...
MOLECULAR DESCRIPTORS
Representation of a chemical by numerical indices
0D, 1D, 2D, 3D
[Figure: a chlorinated organic molecule (C, H, Cl atoms) represented at increasing levels of structural detail, from 0D to 3D]
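As an illustration of what "representation by numerical indices" means at the simplest (0D) level, the sketch below derives a few constitutional descriptors from a molecular formula alone. It is a hedged example: the function name `constitutional_descriptors`, the descriptor label `nAT` and the small atomic-mass table are assumptions made for this sketch, and the formula C12H6Cl4 is just an example of a chlorinated aromatic compound.

```python
import re

# Standard atomic masses (rounded); only the elements needed for the example.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "Cl": 35.45}


def constitutional_descriptors(formula):
    """Return simple 0D descriptors computed from a molecular formula string."""
    counts = {}
    # Each match is (element symbol, optional count), e.g. ("Cl", "4").
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    n_atoms = sum(counts.values())
    mw = sum(ATOMIC_MASS[s] * n for s, n in counts.items())
    return {
        "nAT": n_atoms,               # number of atoms (label assumed here)
        "MW": mw,                     # molecular weight
        "AMW": mw / n_atoms,          # average molecular weight
        "nCl": counts.get("Cl", 0),   # number of chlorine atoms
    }


descriptors = constitutional_descriptors("C12H6Cl4")
print(descriptors)
```

Higher-dimensional descriptors (1D fragments, 2D topological indices, 3D geometrical descriptors) need the connectivity or the coordinates of the molecule and are normally computed with dedicated software.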
CHEMICALS
TRAINING SET: MOLECULAR DESCRIPTORS (x1, x2, ..., xn) and EXPERIMENTAL DATA (Y)
Chemometric Methods
EXPLORATIVE ANALYSIS:
- Principal Component Analysis
- Cluster Analysis
CLASSIFICATION METHODS (quantitative models for qualitative responses):
- Classification Tree (CART)
- Discriminant Analysis
- Neural Networks
REGRESSION METHODS (quantitative models for quantitative responses):
- Multivariate Linear Regression (MLR)
- Partial Least Squares Regression (PLS)
DATA SET SPLITTING into TRAINING SET and TEST SET
•Dimension
•Chemical composition
TRAINING SET: REGRESSION MODEL
- FITTING: R2
- INTERNAL VALIDATION: Q2LOO, Q2LMO
TEST SET:
- EXTERNAL VALIDATION: Q2EXT
PREDICTABILITY on NEW DATA
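The three statistics named in this scheme (R2 for fitting, Q2LOO for internal validation, Q2EXT for external validation) can be sketched for the simplest case of a one-descriptor linear model. This is an illustrative sketch only: the data are hypothetical, the function names are not from the slides, and Q2EXT is computed with one common definition that refers the test-set residuals to the training-set mean (other definitions exist).

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r2_fitting(x, y):
    """R2 = 1 - RSS / TSS on the training set."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - rss / tss


def q2_loo(x, y):
    """Leave-one-out Q2 = 1 - PRESS / TSS: each object is predicted by a model fitted without it."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (intercept + slope * x[i])) ** 2
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / tss


def q2_ext(x_train, y_train, x_test, y_test):
    """External Q2: test-set objects predicted by the training-set model."""
    intercept, slope = fit_line(x_train, y_train)
    mean_train = sum(y_train) / len(y_train)
    press = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x_test, y_test))
    tss = sum((yi - mean_train) ** 2 for yi in y_test)
    return 1.0 - press / tss


# Hypothetical data: one descriptor, response = 1 + 2 x plus a small perturbation.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.10, -0.20, 0.15, -0.05, 0.20, -0.10, 0.05, -0.15]
y_train = [1.0 + 2.0 * xi + ni for xi, ni in zip(x_train, noise)]
x_test = [2.5, 5.5, 7.5]
y_test = [6.1, 11.9, 16.05]

r2 = r2_fitting(x_train, y_train)
q2 = q2_loo(x_train, y_train)
q2e = q2_ext(x_train, y_train, x_test, y_test)
print(f"R2 = {r2:.4f}  Q2LOO = {q2:.4f}  Q2EXT = {q2e:.4f}")
```

For a least-squares model Q2LOO can never exceed R2, so a large gap between the two is a warning sign of overfitting; Q2LMO follows the same idea with larger groups of objects left out at each step.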
MOLECULAR DESCRIPTORS
INFORMATION SELECTION
Model with relevant information: Y = f (selected descriptors)
QSAR MODEL for Y (e.g. TOXICITY)
LIMITATIONS OF QSAR MODELS
Statistical quality: Fitting R2, Predictability Q2
Chemical domain: Outliers, Prediction reliability
[Figure: Pred. response vs. Exp. response]
CHEMICALS: EXPERIMENTAL DATA (Y) and MOLECULAR DESCRIPTORS (X)
MODEL: MAXIMUM FITTING, PREDICTIVE POWER
REVERSIBLE DECODING
NEW CHEMICALS (???): MOLECULAR DESCRIPTORS -> PREDICTION
APPLICATIONS of QSAR PREDICTIONS
•Filling of data gaps
•Validation of experimental data
•Screening, ranking and priority setting
•Highlighting chemicals of concern (also before their synthesis)
PRIORITY LISTS
Optimize industry resource allocation
Minimize animal testing
ENVIRONMENTAL PARAMETERS
Priority setting / Risk Assessment
Parameters, with the quality of the corresponding QSAR models:
Physico-chemical data (m.p.; b.p.; vapour pressure; Henry law constant; water solubility; partition coefficients (Kow, Koc, …)): OPTIMUM
Environmental fate and pathways (chemical-, photo- and bio-degradation; bioaccumulation; compartment partitioning): HIGH
Ecotoxicity (algae; Daphnia; fish; ….): MEDIUM-HIGH
Mammalian toxicity (skin-, eye-, oral-, inhalation acute toxicity; mutagenicity; carcinogenicity; toxicity to reproduction system;...): MEDIUM
QSAR in U.S.
Since 1979/80 wide use and “abuse”
EPA / OPPT: Office of Pollution Prevention and Toxics
TSCA: Toxic Substances Control Act inventory (~75.000 chem.)
NCP: New Chemicals Program (PMN with QSAR data)
QSAR in E.U.
Since 1992/93 but, so far, limited use
EC Regulation on Evaluation and Control of Risks of Existing Substances
IPS: Informal Priority Setting method
EURAM: European Union RAnking Method
WHITE PAPER on the Strategy for a Future
Chemicals Policy (Feb 2001)
•Same regulation for new and existing chemicals (1:15)
•Responsibility from authorities to industries for testing and risk
assessment
•REACH system: Registration, Evaluation, Authorisation of Chemicals
- Registration by companies for > 1 t prod. (30.000): by 2005 for > 1000 t (HPV), by 2012 for all
- Evaluation of information by authorities for > 100 t (5000): by 2008
- Authorisation for carcinogenic, mutagenic, toxic to reproduction and POPs
QSAR in WHITE PAPER
Art. 3.2 ….”to keep animal testing to a minimum”
• Development and validation of alternative methods
ECVAM (Europ. Centre for Validation of Alternative
Methods) –JRC Ispra
• Inclusion in the Community legislation and OECD Test
Guidelines Programme for international recognition
Task Force of Experts in QSAR
•Particular research efforts for developing and validating
modelling (e.g. QSAR) and screening methods for
assessing the potential adverse effects of chemicals.
Setubal Principles
Classes of compounds studied
POP (Persistent Organic Pollutants), PBT (Persistent Bioaccumulative Toxics)
VOC (Volatile Organic Compounds) and HPV (High Production Volume)
Pesticides: insecticides, herbicides, …
Polycyclic aromatic hydrocarbons (PAH)
Polychlorinated biphenyls (PCB), dioxins
Substituted benzenes and phenols
Products of the Italian chemical industry (for FEDERCHIMICA)
EU Priority List 1
Applications in the environmental field
Prediction of physico-chemical properties for partitioning studies in the environmental compartments:
- bioconcentration (BCF)
- volatility (log Koa, log H, Vp)
- soil sorption coefficient (log Koc)
- mobility indices (leaching…….)
Prediction of environmental persistence parameters:
- atmospheric reactivity (reaction rate constants with OH, NO3 radicals and O3)
- environmental persistence indices (half-lives)
- biodegradability …….
Prediction of biological activities:
- toxicity
- mutagenicity ……
Principal Component Analysis (PCA)
PCA is an explorative analysis of multivariate data.
The principal components are:
•Linear combinations of the original variables
•Ordered according to the directions of maximum variance (PC1, PC2..)
•Uncorrelated
•Therefore new variables that condense and “clean” the information contained in the original data
•Representative of macro-properties of the original data set
PCA consists of a rotation in the space of the original data such that the individual components are orthogonal to each other.
The data are thus “seen” in a different reference system, from viewpoints controlled for the quality and quantity of the information represented.
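The idea of using the first principal component as a single index can be sketched as follows. This is a minimal illustration, not the procedure used in the cited studies: the data matrix is hypothetical (three correlated -log k style columns for six chemicals), the variables are autoscaled, and PC1 is obtained by power iteration on the covariance matrix of the autoscaled data using only the standard library.

```python
import math


def autoscale(data):
    """Centre each column and scale it to unit variance."""
    n = len(data)
    cols = list(zip(*data))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in data]


def covariance(z):
    """Covariance matrix of already centred data (here: the correlation matrix)."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)] for i in range(p)]


def first_component(cov, iterations=500):
    """Leading eigenvector (PC1 loadings) and eigenvalue by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue


# Hypothetical data: rows are chemicals, columns are three correlated properties.
data = [
    [10.2, 14.1, 16.0],
    [10.9, 15.0, 16.8],
    [11.5, 15.2, 17.9],
    [12.1, 16.3, 18.1],
    [12.8, 16.9, 19.2],
    [13.4, 17.8, 19.9],
]

z = autoscale(data)
loadings, eigenvalue = first_component(covariance(z))
explained = eigenvalue / len(loadings)  # trace of a correlation matrix = number of variables
scores = [sum(l * v for l, v in zip(loadings, row)) for row in z]  # PC1 score of each chemical
print(f"PC1 explains {100 * explained:.1f}% of the variance")
print("PC1 scores:", [round(s, 2) for s in scores])
```

The PC1 scores condense the three columns into one ranking variable, which is the role played by indices such as ATPIN in the slides that follow.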
PERSISTENCE
ATMOSPHERIC
DEGRADATION REACTIONS
Reaction rate constants for the degradation by
Tropospheric Oxidants:
OH•, NO3• radicals and Ozone
GLOBAL ATMOSPHERIC PERSISTENCE INDEX (ATPIN), 65 VOCs
Principal Component Analysis
PC1 score = ATPIN (Atmospheric Persistence Index)
Cum. E.V.% = 95.1% (PC1 = 88.3%)
[Figure: PCA biplot of the 65 VOCs, PC1 = ATPIN (Atmospheric Persistence) vs. PC2; loadings: -logkOH, -logkNO3, -logkO3]
ATPIN = 20.37 + 1.94 HOMO - 2.34 nBnz - 0.29 AMW + 0.27 DELS
65 objects, 4 variables
R2 = 93.76% Q2LOO = 92.53% Q2LMO(50%) = 91.44% RMS = 0.163
DESCRIPTORS (in order of significance):
• nBnz: number of aromatic rings
• AMW: average molecular weight
• DELS: molecular electrotopological variation (charge distribution)
• HOMO: highest occupied molecular orbital (nucleophilicity)
[Figure: ATPIN calculated vs. ATPIN from PCA]
Example: QSAR Models for Degradation by NO3• (114 chemicals)
Obj.Tr.  Obj.Test  Var.N.  VARIABLES         R2    Q2LOO  Q2LMO(50%)  Q2ext  RMS
114      -         3       HOMO nBnz MATS1m  92.9  92.3   92.1        -      0.58
77       37        3       HOMO nBnz MATS1m  90.3  91.2   89.6        95.9   0.59
DESCRIPTORS (in order of significance):
•HOMO: highest occupied molecular orbital (nucleophilicity of molecules)
•nBnz: number of aromatic rings
•MATS1m: Moran 2D-autocorrelation (atomic distribution)
[Figure: -logk(NO3) predicted vs. -logk(NO3) experimental, training set and test set]
Gramatica et al., Atmos. Environ. 2003, 37, 3115-3124.
“Enlarged” GLOBAL ATMOSPHERIC PERSISTENCE INDEX (ATPIN): from 65 VOCs to 399 VOCs
Principal Component Analysis
Exp. (65 obj.): Cum. E.V.% = 95.1% (PC1 = 88.3%)
Exp. + Pred. (399 obj.): Cum. E.V.% = 95.3% (PC1 = 80.9%)
[Figure: overlaid PCA biplots, PC1 = ATPIN (Atmospheric Persistence) vs. PC2; loadings: -log k(OH), -log k(NO3), -log k(O3)]
P. Gramatica et al., SAR & QSAR Env. Res., 13, 2002, 743-753.
QSAR Modelling of “enlarged” GLOBAL ATMOSPHERIC PERSISTENCE INDEX
PC1 score = ATPIN (399 chemicals, experimental + predicted data)
Obj.Tr.  Obj.Test  Var.N.  VARIABLES        R2    Q2LOO  Q2LMO(50%)  Q2ext  RMS
399      -         3       HOMO nBnz BEHe4  93.3  93.2   93.2        -      0.42
255      174       3       HOMO nBnz BEHe4  93.7  93.5   93.4        92.7   0.41
DESCRIPTORS (in order of significance):
•HOMO: highest occupied molecular orbital (nucleophilicity of molecules)
•nBnz: number of aromatic rings
•BEHe4: Burden eigenvalue weighted by electronegativity (charge distribution)
[Figure: ATPIN calculated vs. ATPIN from PCA, training set and test set]
Ranking of pesticides for environmental
distribution, based on PCA
Principal Component Analysis (PCA) ON CHEMICAL-PHYSICAL PROPERTIES OF 54 PESTICIDES
Cum. E.V. = 94.6% (PC1 = 70.1%)
CLA 1 = MEDIUM COMP.
CLA 2 = VOLATILE COMP.
CLA 3 = SOLUBLE COMP.
CLA 4 = SORBED COMP.
[Figure: PCA biplot of the 54 pesticides, PC1 vs. PC2; loadings: log H, log vp, logKow, logKoc, logS; regions labelled Volatility, Solubility and Sorption]
Clustering of pesticides for environmental
distribution in 4 a priori classes
[Figure: DENDROGRAM of the 54 pesticides (similarity scale from 0.00 to 100.00), grouped into four clusters: 1: Not-volatile/medium comp.; 2: Volatile comp.; 3: Soluble comp.; 4: Sorbed comp.]
Definition of 4 a priori classes of pesticides
for environmental distribution
PCA ON CHEMICAL-PHYSICAL PROPERTIES OF 54 PESTICIDES
Cum. E.V. = 94.6% (PC1 = 70.1%)
CLA 1 = NOT-VOLATILE COMP.
CLA 2 = VOLATILE COMP.
CLA 3 = SOLUBLE COMP.
CLA 4 = SORBED COMP.
[Figure: PCA biplot of the 54 pesticides, PC 1 vs. PC 2, with the four classes CLA 1 to CLA 4 outlined; regions labelled Volatility, Solubility, Sorption and Leaching]
Classification of 54 pesticides for environmental distribution
Classification Tree
[Figure: classification tree splitting on MW (317.69), J (2.53) and nHD (1) into the classes 1: Not Volatile/Med. comp.; 2: Volatile comp.; 3: Soluble comp.; 4: Sorbed comp.]
DESCRIPTORS:
MW: molecular weight (size)
J: Balaban topological index
nHD: number of donor atoms in Hydrogen bonds
P. Gramatica, .. Int. J. Environ. Anal. Chem. 84, 65-74, 2004
Environmental Partitioning for 135 pesticides
The “FOUR-LEAVES” approach
(P. Gramatica and A. Di Guardo, Chemosphere, 2002, 47, 947-956)
[Figure: PCA biplot of the 135 pesticides, LIN vs. VIN; loadings: log VP, log H, log Kow, log Koc, log Sw; Leaching direction indicated]
PC1 score = LIN (Leaching Index); PC2 score = VIN (Volatility Index)
Environmental Partitioning Indexes Regression
LIN (Leaching Index) = -3.04 - 0.96 nX - 2.28 nNO + 3.42 Ms - 1.74 ICR - 0.45 nS
n = 135 R2 = 87.0% Q2LOO = 85.8% Q2LMO = 85.7% SDEP = 0.68 SDEC = 0.65
[Figure: LIN experimental vs. LIN calculated]
VIN (Volatility Index) = 2.35 - 1.58 HY - 0.230Xv - 0.28 nBM - 0.40 nCIC1.42ASP + 0.97 IVDE
n = 135 R2 = 77.1% Q2LOO = 74.1% Q2LMO = 72.6% SDEP = 0.66 SDEC = 0.63
[Figure: VIN experimental vs. VIN calculated]
P. Gramatica and A. Di Guardo, Chemosphere, 2002, 47, 947-956.
WQO - Water Quality Objectives
EEC Priority List 1: toxicity test on algae, Daphnia, fish
TOXICITY IN Daphnia
Log 1/EC50 = - 3.57 + 4.05 nP - 0.39 nHA + 1.02 IDM + 0.67 E1m
n = 94 R2 = 84.2% Q2LOO = 82.1% Q2LMO = 81.7%
nP: n. of phosphorus atoms
nHA: n. of H bond acceptors
IDM: mean information content on the distance magnitude
E1m: atomic distribution, directional WHIM descriptor
[Figure: Calc. toxicity vs. Exp. toxicity]
PRINCIPAL COMPONENT ANALYSIS (PCA)
- all toxicity data available for 37 chemicals (E.V.: 90%)
- experimental + predicted data for 97 chemicals (E.V.: 93.7%)
Ecotox and Environ Safety, 49, (2001) 206-220
[Figure: PCA biplot, PC1 vs. PC2, experimental and predicted points; loadings: T. Daphnia, T. Fish, T. Algae; PC1 runs from low toxicity to high toxicity; classes I, II, III]
CART CLASSIFICATION
Obj. n.: 125
NoModel ER: 40.8%
Selected var.: ZM1 - CHI0 (split values: ZM1 72, CHI0 5.13)
ZM1: Zagreb index CHI0: connectivity index
ER: 7.2%
cvER: 15.2%
[Figure: classification tree, assigned class vs. experimental class (I, II, III)]
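The classification statistics quoted above can be read as follows: the "NoModel" error rate is the error made by assigning every object to the largest class, and the error rate (ER) is the fraction of objects misclassified by the tree. The sketch below shows the arithmetic on a hypothetical confusion matrix; the individual cell counts are invented for illustration and are only chosen so that the totals are consistent with 125 objects, NoModel ER = 40.8% and ER = 7.2%. The real class sizes are not given in the slides.

```python
def no_model_error_rate(class_sizes):
    """Error made when every object is assigned to the most populated class."""
    return 1.0 - max(class_sizes) / sum(class_sizes)


def error_rate(confusion):
    """Fraction of misclassified objects; rows = experimental class, columns = assigned class."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return 1.0 - correct / total


# HYPOTHETICAL confusion matrix for classes I, II, III (125 objects in total).
confusion = [
    [70, 3, 1],   # experimental class I
    [2, 27, 1],   # experimental class II
    [1, 1, 19],   # experimental class III
]

class_sizes = [sum(row) for row in confusion]
nomodel_er = no_model_error_rate(class_sizes)
er = error_rate(confusion)
print(f"NoModel ER = {100 * nomodel_er:.1f}%  ER = {100 * er:.1f}%")
```

The cross-validated error rate (cvER) is obtained in the same way, but from a confusion matrix in which each object is classified by a tree built without it.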
SCREENING of POPs for overall persistence based on half-life in air, water, soil
HALF-LIFE in SOIL
Log h.l. soil = - 3.46 + 0.58 IDM + 0.99 E2m + 0.48 G2e
obj. = 30 R2 = 83.2% Q2LOO = 77.8% Q2LMO = 76.9%
IDM: mean inf. index on distance magnitude
E2m, G2e: directional WHIMs
[Figure: Predicted half life vs. Experimental half life]
PRINCIPAL COMPONENT ANALYSIS (PCA)
- all half-life data available for 29 chemicals (Cum. E.V.: 87.6%)
- experimental + predicted data for 91 chemicals (Cum. E.V.: 79.5%)
[Figure: PCA biplots, PC 1 (PERSISTENCE) vs. PC 2; loadings: half-life air, half-life surf.W., half-life soil, half-life groundW; regions labelled SORBED and SOLUBLES and VOLATILES]
OVERALL PERSISTENCE INDEX (PC1)
PC1 = 9.22 + 3.14 AAC - 6.32 E2s - 17.49 E1e - 0.16 Tm
obj. = 91 R2 = 85.1 Q2LOO = 82.6 Q2LMO = 82.2
[Figure: Predicted PC 1 scores vs. PC 1 scores]
“SCREENING” of POPs for Long Range Transport (LRT)
To have minimum LRT potential:
• Minimize persistence (PC1 in the PCA of the half-lives, previous figure)
• Minimize mobility (PC1 in the PCA of the physico-chemical properties, example alongside)
Principal Component Analysis
[Figure: PCA biplot of the 81 chemicals, PC1 (74.6%, MOBILITY) vs. PC2; loadings: mp, logH, logKow, logKoa, MW, logSol, logVP]
MCDM: Multicriteria Decision Making
Utility function:
$F(x) = \sum_{i=1}^{k} \lambda_i f_i(x)$
LRT potential for 81 chemicals
[Figure: UTILITY vs. ID for the 81 chemicals; Low LRT potential (Class 3), Medium LRT potential (Class 2), High LRT potential (Class 1)]
CART classification (nC, E1u): MRcv = 6.2%, No Model 42%
Regression:
LRT = 0.08 - 0.04 nCl + 0.06 X2 - 0.10 SPAM + 0.44 E1u
Q2LOO = 86.4%, Q2LMO = 85.7%
[Figure: LRT potential predicted vs. LRT index from MCDM (utility function)]
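The MCDM utility function combines several criteria into one score as a weighted sum of criterion functions. The sketch below illustrates the idea for the two criteria named on this slide (persistence and mobility, both to be minimized). It rests on assumptions not stated in the slides: each criterion is rescaled linearly to the range 0 to 1 so that 1 corresponds to the minimum observed value, the weights are equal (0.5 each), and the scores of the three chemicals are hypothetical.

```python
def rescale_to_minimise(values):
    """Map a criterion to [0, 1]: 1 for the lowest (best) value, 0 for the highest."""
    lo, hi = min(values), max(values)
    return [(hi - v) / (hi - lo) for v in values]


def utility(criteria, weights):
    """F(x) = sum_i lambda_i * f_i(x), evaluated for every chemical x."""
    return [sum(w * f for w, f in zip(weights, per_chemical))
            for per_chemical in zip(*criteria)]


# Hypothetical PC1 scores for three chemicals (illustrative values only).
persistence = [-2.0, 0.5, 3.0]
mobility = [-1.0, 1.0, 4.0]

criteria = [rescale_to_minimise(persistence), rescale_to_minimise(mobility)]
weights = [0.5, 0.5]  # assumed equal weights summing to 1

u = utility(criteria, weights)
print([round(v, 2) for v in u])
```

In this sketch a utility close to 1 means that both persistence and mobility are near their minimum, which corresponds to a low LRT potential; a utility close to 0 indicates a persistent and mobile chemical.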
Acknowledgements
Dr. Ester Papa
Dr. Pamela Pilutti
Dr. Francesca Battaini
Dr. Fulvio Villa
http://dipbsf.uninsubria.it/qsar
http://www.qsar.it