Protein-Ligand Interactions
Docking, affinity calculation, and
development of new drugs
Bioinformatic methods for studying
protein-ligand interactions
• Determining the structure of the
protein-ligand complex ("docking")
• Determining the affinity of a ligand
• Searching for possible inhibitors
• Designing new drugs
The molecular docking problem
Given two molecules with 3D conformations in atomic
detail:
• Do the molecules bind to each other?
• What does the molecule-molecule complex look like?
• How strong is the binding affinity?
What do we dock?
The two molecules might be:
A protein (enzyme, receptor) and a
small molecule (substrates, ligands)
A protein and a DNA molecule
Two proteins
Why do we dock?
• Drug discovery costs are too high: ~$800 million,
8-14 years, ~10,000 compounds (DiMasi et al. 2003;
Dickson & Gagnon 2004)
• Drugs interact with their receptors in a highly specific
and complementary manner.
• Docking is the core of target-based, structure-based drug design
(SBDD) for lead generation and optimization.
• A lead is a compound that
  - shows biological activity,
  - is novel, and
  - has the potential of being structurally modified for
    improved bioactivity and selectivity
Rational Approach to Drug Discovery
• Identify target
• Clone gene encoding target
• Express target in recombinant form
• Crystal structures of target and target/inhibitor complexes
• Screen recombinant target with available inhibitors
• Identify lead compounds
• Synthesize modifications of lead compounds (repeated cycle)
• Toxicity & pharmacokinetic studies
• Preclinical trials
[Figure: lead optimization; labels: active site, lead]
Motivation for protein-protein docking
• Biological activity depends on the specific
recognition of proteins.
• Understand protein interaction networks in a cell
• Gain insight into the thermodynamics of molecular
recognition
• The experimental determination of protein-protein
complex structures remains difficult.
Three components of docking
• Representation of receptor binding site and ligand
  (before and/or during docking)
• Sampling of configuration space of the ligand-receptor complex
  (during docking)
• Evaluation of ligand-receptor interactions
  (during docking and scoring)
Bringing a New Drug to Market
• Discovery and preclinical testing: compounds are identified and evaluated
  in laboratory and animal studies for safety, biological activity, and
  formulation. 5,000 compounds evaluated.
• 5 compounds enter clinical trials.
• Phase I: evaluates safety and dosage in 20 to 100 healthy human volunteers.
• Phase II: assesses effectiveness and looks for side effects in 100 to 500
  patient volunteers.
• Phase III: confirms effectiveness and monitors adverse reactions from
  long-term use in 1,000 to 5,000 patient volunteers.
• Review and approval by the Food & Drug Administration: 1 compound approved.
[Figure: timeline of the stages above; horizontal axis in years, 0 to 16]
DIFFERENCE BETWEEN AN INHIBITOR AND A DRUG
Extra requirements of a drug compared to an inhibitor:
• Selectivity
• Less toxicity
• Bioavailability
• Slow clearance
• Reach the target
• Ease of synthesis
• Low price
• Slow or no development of resistance
• Stability upon storage as tablet or solution
• Pharmacokinetic parameters
• No allergies
LIPINSKI'S RULE OF FIVE
Poor absorption or permeation is more likely when:
- There are more than five H-bond donors
- The mol. wt. is over 500 Da
- The MlogP is over 4.15 (or CLogP > 5)
- The sum of N's and O's is over 10
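The Lipinski criteria on this slide can be written as a simple filter. The following is a minimal sketch, assuming the four descriptors have already been computed by some other tool; the `Descriptors` class, its field names, and the example values are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    h_bond_donors: int
    mol_weight: float  # in Da
    clogp: float
    n_plus_o: int      # total count of N and O atoms


def lipinski_violations(d: Descriptors) -> List[str]:
    """Return the list of rule-of-five criteria that the compound violates."""
    violations = []
    if d.h_bond_donors > 5:
        violations.append("more than 5 H-bond donors")
    if d.mol_weight > 500:
        violations.append("molecular weight over 500 Da")
    if d.clogp > 5:
        violations.append("CLogP over 5")
    if d.n_plus_o > 10:
        violations.append("more than 10 N and O atoms")
    return violations


# Made-up compounds, for illustration only.
print(lipinski_violations(Descriptors(2, 350.0, 3.1, 6)))
print(lipinski_violations(Descriptors(7, 620.0, 5.6, 12)))
```

An empty list means none of the four warning signs applies; it does not by itself make the compound a drug, since the other requirements listed above still have to be met.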
Thermodynamics of enzyme-inhibitor complexes:
$E + I \rightleftharpoons EI$
$K_d = \frac{[E][I]}{[EI]}$
$\Delta G_{bind} = RT \ln(K_d) = \Delta H_{bind} - T\Delta S_{bind} < 0$
• $\Delta H$: strength of the E-I interactions, < 0
• $\Delta S$: conformational disorder, > 0
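The relation between the dissociation constant and the binding free energy can be evaluated numerically. This is a minimal sketch, assuming Kd is expressed in mol/L relative to a 1 M standard state and a temperature of 298.15 K; the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar, temperature=298.15):
    """Binding free energy in kcal/mol from Kd in mol/L: dG = RT ln(Kd)."""
    return R_KCAL * temperature * math.log(kd_molar)


def kd_from_delta_g(delta_g, temperature=298.15):
    """Inverse relation: Kd = exp(dG / RT)."""
    return math.exp(delta_g / (R_KCAL * temperature))


# A nanomolar binder gives a clearly negative dG, a molar Kd gives dG = 0.
for kd in (1e-9, 1e-6, 1.0):
    print(kd, round(delta_g_from_kd(kd), 2))
```

Because of the logarithm, each tenfold improvement in Kd lowers the free energy by the same amount, RT ln(10), roughly 1.4 kcal/mol at room temperature.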
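Free Energy calculations
$\langle \Delta G_{bind,sol} \rangle = \langle \Delta H_{bind,vac} \rangle - T\Delta S_{bind,vac} - \langle \Delta G_{solv}^{E} \rangle - \langle \Delta G_{solv}^{I} \rangle + \langle \Delta G_{solv}^{EI} \rangle$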
Basic principles
• The association of molecules is based on interactions:
  - H-bonds
  - salt bridges
  - hydrophobic contacts
  - electrostatics
  - very strong repulsive (VdW) interactions at short distances
• Ligands are flexible
• Receptors are mostly rigid
Sampling of configuration space of the
ligand-receptor complex
• Descriptor matching: using pattern-recognizing geometric
  methods to match ligand and receptor site descriptors
  - geometric, chemical, and pharmacophore properties, such
    as distance pairs, triplets, volume, vector, hydrogen bond, hydrophobic, charged, etc.
• Molecular simulation: MD (molecular dynamics), MC
  (Monte Carlo)
• Others: GA (genetic algorithm), similarity, fragment-based
• No "best" method
Molecular simulation: MD & MC
• Two major components:
  - The description of the degrees of freedom
  - The energy evaluation
• The local movement of the atoms is performed
  - according to the forces present at each step in MD (Molecular Dynamics)
  - randomly in MC (Monte Carlo)
• Usually time consuming:
  - Search from a starting orientation to a low-energy configuration
  - Several simulations with different starting orientations
    must be performed to get a statistically significant result
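The Monte Carlo search with several starting orientations can be illustrated on a toy problem. This is a minimal sketch, assuming a one-dimensional made-up energy function in place of a real ligand-receptor energy and a Metropolis acceptance rule; all names and parameter values are hypothetical.

```python
import math
import random


def energy(x):
    """Toy 1D energy landscape with several minima (stand-in for a real energy)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def metropolis(start, rng, steps=2000, step_size=0.3, kT=0.2):
    """Random local moves; downhill moves are always accepted, uphill ones sometimes."""
    x, e = start, energy(start)
    best_x, best_e = x, e
    for _ in range(steps):
        trial = x + rng.uniform(-step_size, step_size)
        e_trial = energy(trial)
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kT):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def multi_start(n_starts=20, seed=0):
    """Run independent searches from random starting points and keep the best."""
    rng = random.Random(seed)
    runs = [metropolis(rng.uniform(-5.0, 5.0), rng) for _ in range(n_starts)]
    return min(runs, key=lambda run: run[1])


best_x, best_e = multi_start()
print(best_x, best_e)
```

A single run tends to stay in the minimum nearest to its starting point, which is why the search is repeated from different starts before the lowest-energy result is reported.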
Genetic algorithms
Consider a problem whose solution depends on n parameters, each taking
k possible values: an exhaustive exploration requires k^n operations.
However, if we know how the system can evolve to reach the result (because
we have a training set or we know the rules), we know that some steps are
impossible or have never been observed, and that some paths are
preferred over others.
If the algorithm is modeled to respect the observed patterns and a
FITNESS, i.e. a reliability value, is computed at every step, then within
a certain number of cycles we can reach a result whose fitness is optimal
for our expectations.
• We can simulate crossing-over between two sequences (chromosomes), since
  we know how crossing-over occurs.
• We can simulate mutagenesis, since we know the mutation frequencies
  and the mutagenic events that occur.
Coarse search step
• generation
• fitness evaluation
• selection of the chromosome with the highest fitness
• mutation and crossing-over
• new chromosomes replace the previous ones
Fine search step
stop
[Figure: flow chart of the steps above, with a ligand structure drawing]
Genetic algorithm docking
• Requires the generation of an initial population, whereas
  conventional MC and MD require a single starting
  structure in their standard implementation.
• The collection of genes (chromosome) is assigned a
  fitness based on a scoring function. There are three
  genetic operators:
  - mutation randomly changes the value of a gene;
  - crossover exchanges a set of genes from one
    parent chromosome to another;
  - migration moves individual genes from one subpopulation to another.
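The population, fitness, mutation, and crossover steps can be sketched in a few lines. This is a toy illustration, not the implementation of any docking program: the chromosome is a list of three numbers, the fitness is a made-up function peaking at a hypothetical `TARGET`, and the migration operator is omitted because a single population is used.

```python
import random

TARGET = [1.0, -2.0, 0.5]  # hypothetical optimal pose parameters


def fitness(chrom):
    """Toy fitness: higher is better, maximal (0) at TARGET."""
    return -sum((g - t) ** 2 for g, t in zip(chrom, TARGET))


def mutate(chrom, rng, rate=0.2, scale=0.5):
    """Randomly perturb each gene with probability `rate`."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g for g in chrom]


def crossover(a, b, rng):
    """Single-point crossover: head of parent a, tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # fitter half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children        # new chromosomes replace the worse half
    return max(pop, key=fitness)


best = evolve()
print(best, fitness(best))
```

Keeping the fitter half unchanged in each generation means the best fitness can never get worse from one cycle to the next; in a real docking run the fitness would come from a scoring function.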
How to combine conformational search
and affinity
• Several search algorithms exist: MC,
  FFT, GA, etc.
• Rigid or flexible docking
• Affinity is estimated with a scoring
  function that estimates ΔG
• The best-scoring result is not always
  the correct one
Fast Fourier Transform
[Figure: FFT-based docking scheme; labels: R, L, FFT, Correlation, IFFT; grid regions: Surface, Interior, Binding Site; axes: X, Y]
Increases the speed by a factor of 10^7
Use of grids: FFT
[Figure: grid representation; labels: ligand, protein]
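The idea behind grid-based FFT docking is that the overlap score for every translation of the ligand grid over the receptor grid is a correlation, which can be computed through Fourier transforms instead of one shift at a time. The following is a minimal one-dimensional sketch using a small pure-Python radix-2 FFT; the grid values (surface = 1, interior = -5) are made-up assumptions, and real programs work on 3D grids with an optimized FFT library.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT; the length of `a` must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def correlate_fft(receptor, ligand):
    """Circular correlation c[s] = sum_x R[x] * L[x + s] via the correlation theorem."""
    n = len(receptor)
    fr = fft([complex(v) for v in receptor])
    fl = fft([complex(v) for v in ligand])
    product = [fr[k].conjugate() * fl[k] for k in range(n)]
    return [v.real / n for v in fft(product, inverse=True)]


def correlate_direct(receptor, ligand):
    """Same correlation computed shift by shift, for comparison."""
    n = len(receptor)
    return [sum(receptor[x] * ligand[(x + s) % n] for x in range(n)) for s in range(n)]


# Toy 1D grids: receptor surface cells score +1, interior cells are penalized.
receptor = [0, 0, 1, 1, -5, -5, 1, 1]
ligand = [1, 1, 0, 0, 0, 0, 0, 0]
print(correlate_fft(receptor, ligand))
print(correlate_direct(receptor, ligand))
```

Both functions return the same scores; the direct version costs on the order of N^2 operations for N grid points, while the FFT route costs on the order of N log N, which is where the speed-up on large 3D grids comes from.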
Programs: AutoDock
http://autodock.scripps.edu/
Program: UCSF-Dock
http://dock.compbio.ucsf.edu/
Example 1
Design and evaluation of a
new inhibitor of the C-Kit kinase
What is known about C-Kit:
• C-Kit kinase: a therapeutic target in the
  treatment of gastrointestinal tumors
• Inhibitor imatinib: imatinib (or Gleevec) is a
  potent inhibitor of C-Kit
• The C-Kit mutation D816V confers
  resistance to imatinib, but keeps the kinase
  active
How does imatinib work?
It induces the activation loop to adopt an inactive conformation.
[Figure: inactive C-Kit; complex with imatinib; resistant mutant]
C-Kit and C-Kit D816V with and without imatinib, by MD
[Figure panels: D816V mutant, wild-type]
Calculated affinity for imatinib of C-Kit wt and
the D816V mutant (energies in kcal/mol)
Complex               Kd       ΔH       -TΔS     ΔG
wild-type+imatinib    21 nM    -57.69   40.70    -17.0
D816V+imatinib        12 µM    -59.15   54.69    -4.5
The problem lies in the entropic contribution...
Redesign of the inhibitor
Introduction of a chlorine atom
to destabilize F811
and increase the entropy of the
complex, thereby increasing the
affinity
[Figure panels: D816V mutant, wild-type]
Results (energies in kcal/mol):
Complex               Kd       ΔH       -TΔS     ΔG
D816V+prototype       22 nM    -60.17   45.29    -14.9
wild-type+prototype   47 nM    -57.25   39.02    -18.2
wild-type+imatinib    21 nM    -57.69   40.70    -17.0
D816V+imatinib        12 µM    -59.15   54.69    -4.5
The prototype, Imatinib-Cl, is able to inhibit both
C-Kit wt and the D816V mutant, while retaining
specificity for C-Kit.
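The reported values can be checked for internal consistency with ΔG = ΔH - TΔS. This is a small sketch, assuming that the entropy column of the slide lists the term -TΔS (the Δ symbols and possibly the sign were lost in the slide text, so this reading is an assumption); the dictionary layout is only for illustration.

```python
# (dH, -T*dS, dG) in kcal/mol, copied from the results above.
rows = {
    "D816V+prototype":     (-60.17, 45.29, -14.9),
    "wild-type+prototype": (-57.25, 39.02, -18.2),
    "wild-type+imatinib":  (-57.69, 40.70, -17.0),
    "D816V+imatinib":      (-59.15, 54.69, -4.5),
}


def recomputed_delta_g(delta_h, minus_t_delta_s):
    """dG = dH - T*dS, with the entropic term already given as -T*dS."""
    return delta_h + minus_t_delta_s


for name, (dh, mtds, dg) in rows.items():
    print(f"{name:22s} reported {dg:6.1f}  recomputed {recomputed_delta_g(dh, mtds):6.2f}")
```

Under this reading the enthalpies of the four complexes are similar, and the spread in ΔG comes almost entirely from the entropic term, in line with the remark that the problem lies in the entropic contribution.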
Types of scoring functions
• Force field based: nonbonded interaction terms as the
  score, sometimes in combination with solvation terms
• Empirical: multivariate regression methods to fit
  coefficients of physically motivated structural functions by
  using a training set of ligand-receptor complexes with
  measured binding affinity
• Knowledge-based: statistical atom pair potentials derived
  from structural databases as the score
• Other: scores and/or filters based on chemical properties,
  pharmacophore, contact, shape complementarity
• Consensus scoring functions approach
Force field based scoring functions
$E = \sum_{i=1}^{lig} \sum_{j=1}^{rec} \left( \frac{A_{ij}}{r_{ij}^{a}} - \frac{B_{ij}}{r_{ij}^{b}} + 332 \frac{q_i q_j}{D r_{ij}} \right)$
e.g. CHARMm in CDOCKER
• Advantages
  - FF terms are well studied and have some physical basis
  - Transferable, and fast when used on a precomputed grid
• Disadvantages
  - Only part of the relevant energies is covered, i.e., potential
    energies, sometimes enhanced by solvation or entropy terms
  - Electrostatics are often overestimated, leading to
    systematic problems in ranking complexes
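The double sum over ligand and receptor atoms can be written directly as two nested loops. This is a minimal sketch, assuming a single pair of hypothetical A and B parameters for all atom pairs (a real force field has type-dependent A_ij and B_ij), exponents a = 12 and b = 6, a constant dielectric D, and atoms given as (x, y, z, charge) tuples.

```python
import math


def ff_score(ligand, receptor, A=4096.0, B=64.0, a=12, b=6, dielectric=1.0):
    """Sum of van der Waals and Coulomb terms over all ligand-receptor atom pairs.

    Each atom is a tuple (x, y, z, charge); distances in Angstrom, charges in
    electron units, so the factor 332 gives the Coulomb term in kcal/mol.
    """
    total = 0.0
    for xi, yi, zi, qi in ligand:
        for xj, yj, zj, qj in receptor:
            r = math.dist((xi, yi, zi), (xj, yj, zj))
            vdw = A / r ** a - B / r ** b
            coulomb = 332.0 * qi * qj / (dielectric * r)
            total += vdw + coulomb
    return total


# Two opposite unit charges 2 Angstrom apart (made-up geometry).
print(ff_score([(0.0, 0.0, 0.0, 1.0)], [(2.0, 0.0, 0.0, -1.0)]))
```

Because every pair is evaluated explicitly, the cost grows with the product of the atom counts; precomputing the receptor contribution on a grid, as mentioned under the advantages, avoids the inner loop during docking.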
Empirical scoring functions
$\Delta G = \Delta G_0 + \Delta G_{rot} N_{rot} + \Delta G_{HB} \sum_{neutral\ H\text{-}bonds} f(\Delta R, \Delta\alpha) + \Delta G_{io} \sum_{ionic\ int.} f(\Delta R, \Delta\alpha) + \Delta G_{aro} \sum_{aro\ int.} f(\Delta R, \Delta\alpha) + \Delta G_{lipo} \sum_{lipo.\ cont.} f(\Delta R, \Delta\alpha)$
Examples: LUDI, PLP, LigScore, Jain
• Counts the number of interactions and assigns a score
  based on the number of occurrences
  - H-bonds, ionic interactions (easy to quantify)
  - Hydrophobic interactions (more difficult to assess
    and quantify)
  - Number of rotatable bonds frozen (linked to the entropic
    cost of binding, quite difficult to estimate)
• Advantages: fast & direct estimation of binding affinity
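An empirical score of this form is a weighted sum of geometry-penalized interaction counts. The following is a minimal sketch: the coefficients in `COEFF` and the tolerance values inside `f` are placeholders chosen for illustration, not the published parameters of LUDI or of any other function, which are obtained by regression against measured affinities.

```python
def ramp(deviation, tolerance, cutoff):
    """1 for near-ideal geometry, 0 beyond the cutoff, linear in between."""
    if deviation <= tolerance:
        return 1.0
    if deviation >= cutoff:
        return 0.0
    return (cutoff - deviation) / (cutoff - tolerance)


def f(delta_r, delta_alpha):
    """Penalty for deviation from ideal distance (Angstrom) and angle (degrees)."""
    return ramp(delta_r, 0.2, 0.6) * ramp(delta_alpha, 30.0, 80.0)


# Placeholder coefficients (hypothetical values).
COEFF = {"0": 1.0, "rot": 0.3, "hb": -1.0, "io": -2.0, "aro": -0.7, "lipo": -0.5}


def empirical_score(n_rot, hbonds, ionic, aromatic, lipo):
    """Each interaction list holds (delta_r, delta_alpha) deviations per contact."""
    total = COEFF["0"] + COEFF["rot"] * n_rot
    for key, contacts in (("hb", hbonds), ("io", ionic), ("aro", aromatic), ("lipo", lipo)):
        total += COEFF[key] * sum(f(dr, da) for dr, da in contacts)
    return total


# Two frozen rotatable bonds, one ideal H-bond, one distorted ionic contact.
print(empirical_score(2, [(0.0, 0.0)], [(0.4, 10.0)], [], []))
```

The rotatable-bond term enters with a positive coefficient because freezing a bond costs entropy, while the interaction terms are negative and therefore favorable.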
Knowledge-based potentials of mean
force scoring functions (PMF)
• Assumptions
  - An observed crystallographic complex represents the
    optimum placement of the ligand atoms relative to the
    receptor atoms
• Advantages
  - Similar to empirical, but more general (much more
    distance data than binding energy data)
• Disadvantages
  - PMF are typically pair-wise, while the probability to find
    atoms A and B at a distance r is non-pairwise and
    depends also on surrounding atoms
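A pair potential of this kind is commonly obtained by inverse Boltzmann statistics: distances observed more often than in a reference distribution get a favorable (negative) energy. This is a minimal sketch, assuming binned distance counts for one atom-pair type and a made-up reference distribution; the function names, bin width, and counts are hypothetical.

```python
import math


def pmf(observed_counts, reference_counts, kT=0.593):
    """w(r) = -kT * ln(p_obs(r) / p_ref(r)) per distance bin, in kcal/mol.

    Bins with no observed or no reference counts are left undefined (None).
    """
    n_obs = sum(observed_counts)
    n_ref = sum(reference_counts)
    potential = []
    for obs, ref in zip(observed_counts, reference_counts):
        if obs == 0 or ref == 0:
            potential.append(None)
        else:
            potential.append(-kT * math.log((obs / n_obs) / (ref / n_ref)))
    return potential


def pmf_score(distances, potential, bin_width=1.0):
    """Sum the pair potential over the ligand-receptor distances of a pose."""
    total = 0.0
    for d in distances:
        index = int(d / bin_width)
        if index < len(potential) and potential[index] is not None:
            total += potential[index]
    return total


# Made-up counts for one atom-pair type in 1 Angstrom bins.
potential = pmf([0, 5, 40, 30, 25], [20, 20, 20, 20, 20])
print(potential)
print(pmf_score([2.4, 3.1, 1.5], potential))
```

The score of a pose is a plain sum over atom pairs, which is exactly the pair-wise approximation named as a disadvantage above.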
Consensus Scoring
• Combination of several scoring functions
• The common top rankers get a higher consensus rank
  than single outliers
• False positives can be detected more easily than with a
  single scoring function
• Advisable to use 2-4 well-suited scoring functions for
  the consensus score
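One simple way to build a consensus is to average the rank each compound gets from each scoring function. This is a minimal sketch of that rank-averaging scheme, assuming that lower scores are better in every function; the ligand names and score values are made up.

```python
def ranks(scores):
    """Map each ligand to its rank (1 = best) within one scoring function."""
    ordered = sorted(scores, key=scores.get)  # lower score = better
    return {name: position + 1 for position, name in enumerate(ordered)}


def consensus(score_tables):
    """Order ligands by their mean rank across all scoring functions."""
    all_ranks = [ranks(table) for table in score_tables]
    ligands = list(score_tables[0])
    mean_rank = {
        name: sum(r[name] for r in all_ranks) / len(all_ranks) for name in ligands
    }
    return sorted(ligands, key=lambda name: mean_rank[name])


# Three hypothetical scoring functions; ligand C is top-ranked by only one of them.
tables = [
    {"A": -9.0, "B": -8.0, "C": -3.0},
    {"A": -7.0, "B": -7.5, "C": -2.0},
    {"A": -6.0, "B": -5.0, "C": -10.0},
]
print(consensus(tables))
```

Ligand C wins under the third function only, so its mean rank stays behind A and B, which is the "single outlier" effect described in the list above.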
Take home message
There is no best method!
• Try different methods, force fields, scoring functions
• Refer to your results as a suggestion
• Use the experimental data
SOME docking programs and
some dedicated web servers
ClusPro: http://nrc.bu.edu/cluster/
Z-Dock (http://zlab.bu.edu/zdock/) can be downloaded to your own computer
Z-Dock server: http://zdock.bu.edu/
http://bioinfo3d.cs.tau.ac.il/PatchDock/
http://vakser.bioinformatics.ku.edu/resources/gramm/grammx
http://www.csd.abdn.ac.uk/hex_server/