Molecular Dynamics Very often in Chemistry and Biochemistry, molecular structures are displayed as static, rigid entities with the exception of rotation about single bonds. However, a molecule is not a rigid, fixed struture at some minimum energy but is actually constantly vibrating at a higher energy somewhere above the relaxed potential energy surface. When this conceptual barrier is overcome, it is possible to understand that the very large structural changes taking place in biological systems can be broken down into many smaller changes. What we are in fact observing is a cumulative effect of many, small, thermally accessible changes resulting from the dynamical properties of proteins. Although these dynamical changes may not be observed experimentally, they may be calculated by theoretical methods such as Molecular Dynamics. The starting point for the Molecular Dynamics (MD) method is a well-defined microscopic description of a macroscopic physical system. This description can be a Hamiltonian or Lagrangian, or a direct expression in Newton's equations of motion. As the name suggests, the Molecular Dynamics method calculates properties using the equations of motion and allows for the calculation of both static and dynamic properties. On the other hand, the Monte Carlo method produces configurational properties. Molecular Dynamics simulations are sometimes referred to as deterministic in the sense that, given a set of initial coordinates and velocities with a force (F(r)) the evolution of the system over time is now determined. Also, the motion of the particles in Molecular Dynamics is collective because the collection of forces causes ALL the particles to move over time. This is distinct from the Monte Carlo method which requires external changes in the system by allowing random moves to produces the changes. The force field for a typical protein can be given as a sum of the various components including bond stretching and bending, torsional potentials, and non-bonded interactions. It is then possible to calculate the forces between the atoms by taking the derivative of the potential V(r) given above and leads to the following equation Integrazione numerica delle equazioni del moto 1 xt t xt x t t xt t 2 ..... 2! Simulation Times, The Time Step For a large protein consisting of ~15000 atoms a Molecular Dynamics is very computationally demanding and is usually limited to 300 - 400 picoseconds of simulation time although simulation times of 1 nanosecond for a few large proteins have been performed. (1 picosecond (ps) = 1 x 10-12 sec; 1 nanosecond (ns) = 1 x 10-9 sec; 1000 ps = 1 ns). Molecular Dynamics is extremely useful to study properties that depend on high frequency vibrations. It is difficult to study properties such as rotations, diffusion mechanisms, and viscosity using Molecular Dynamics where local events are equilibrated on the order of 100 ps and longer. Unless small time steps are used, very significant errors in the energy will be introduced. In practice the time step is taken in femtoseconds (1 femtosecond (fs) = 1 x 10-15 sec). To ensure numerical stability for the solution of the equations of motion, the time step must be at most 1/100 - 1/20 the period of fastest oscillation in the simulated system. The C-H, N-H, and O-H vibrations have periods of order 10-14 sec (= 10 fs). If the time unit is chosen as 1 tu = 10-14, then the time step becomes on the order of 0.2 fs (0.02 tu). However, it is possible in Molecular Dynamics simulations to constrain the C-H, N-H, and O-H distances allowing the use of time steps up to 2 fs. The procedure involved is called SHAKE in which the H atoms are initially moved according to the algorithms given above. Then, since each bond is considered independently of another, all the C-H, N-H, and O-H distances are corrected such that the bond length is constrained within a given tolerance. The `Leap Frog' algorithm 2 t t vi tn t vi tn vi tn vi tn ..... 2 2 2 2 2 t t vi tn t vi tn vi tn vi t n ..... 2 2 2 2 sottraendo membro a membro: t t vi t n vi t n vi t n t ..... 2 2 ovvero: t t Fi tn vi tn vi tn t 2 2 mi equazione 1 Analogamente per la posizione: t t t 1 t t ri tn t ri t n ri tn ri t n ..... 2 2 2 2 2 2 2 t t t 1 t t ri tn ri t n ri tn ri t n ..... 2 2 2 2 2 2 2 sottraendo membro a membro: t ri t n t ri t n ri t n t ..... 2 ovvero: t ri tn t ri tn vi tn t 2 equazione 2 condizioni iniziali Fi(tn) Fi(tn+t) ri(tn) ri(tn+t) vi(tn-t/2) | tn | tn+t/2 vi(tn+3t/2) | tn+t { { { | tn-t/2 vi(tn+t/2) passo 1 equazione 1 passo 2 equazione 2 passo 3 equazione 1 | tn+3t/2 | Verlet algorithm 1 1 2 ri t n t ri t n ri t n t ri t n t ri t n t 3 ..... 2 3! 1 1 ri t n t ri t n ri t n t ri t n t 2 rit n t 3 ..... 2 3! sommando membro a membro: ri tn t 2ri tn ri tn t ai tn t 2 ..... dove: L’espressione di Fi tn Vi tn ai tn mi mi ri tn t è corretta fino ai termini di ordine t 3 Per ricavare la velocità viene fatta l’approssimazione che vi rimanga costante durante ri tn t ri tn t vi tn t 2 t : Performing a Simulation: Minimization, Heating, Equilibration, and Production Initially, the system should be minimized to eliminate the close atom-atom contacts, since this could produce large forces and consequently local strain. After the system is minimized, it is gradually heated to the desired temperature by assigning velocities to the atoms according to a Maxwell distribution. This is done periodically to the system to ensure that there are no local heating problems. After the system attains the desired temperature, it is important to go through an equilibration phase to allow for a redistribution of energy and produce stability. However, it may still be necessary to periodically rescale the velocities to bring the system to the desired temperature. The final phase involves production dynamics. In the Microcanical Ensemble (NVE dynamics) the system undergoes dynamics with no rescaling of the velocities. The temperature may start to drift as a result of a short cut-off for the long-range electrostatic forces. coordinate di input minimizzazione 1 N 3 2 mi vi NkT 2 i1 2 assegnazione delle velocità: alcuni picosecondi alcuni picosecondi rinormalizzazione delle velocità per avere momento lineare nullo calcolo forze calcolo nuove posizioni ( t =.2 femtosecondi) normalizzazione delle velocità per mantenere T costante calcolo forze calcolo nuove posizioni ( t =.2 femtosecondi) il sistema è stabile? no si T finale? si no T T continuazione simulazione memorizzazione di coordinate e velocità Constrains Vtot = V campo di forza + V constrains V constrains = A (grandezza calcolata-grandezza osservata)2 rij NMR NOE Fh, k,l X ray grandezza 2 1 N Ecinetica mi ri 2 i 1 Ecinetica T 3 KN 2 V N i 1 i Etotale Ecinetica V Valutazione statistica dell’entropia S k pq lnpqdq 1 pq e Q Q e V q kT V q kT dq Matrice di covarianza configurazionale s s ij qi qi q j qj Funzione di crosscorrelation: h f t gt dt Autocorrelazione: h f t f t dt Autocorrelazione della velocità: h vi t vi t dt Monte Carlo and Dynamic Simulations Methods Similarities Both the Monte Carlo (MC) and Stochastic Dynamics (SD) methods are thermodynamically equivalent ways to generate conformational states. Both methods are stochastic techniques which incorporate randomness in searching for microstates in phase space. SD assigns random initial velocities to every atom followed by deterministic classical mechanics. On the other hand, MC continually assigns random changes to the system by modifying a few degrees of freedom at a time. Both methods incorporate the concept of temperature and thermal fluctuations allowing higher energy structures and producing a wide spectrum of conformational states which may accurately reflect the experimental conditions. Differences In SD the next configuration is generated by moving the atoms in a deterministic manner as a result of evaluating the force defined by energy gradients. Each new structure is a unique result based on the forces acting on the previous structure. In the MC method, the atoms are moved randomly, and the next conformation is only a guess. The main difference between SD and MC is that SD uses forces and requires that the derivatives of the energy be calculated while MC requires only the calculation of the energy, E. Because SD integrates the classical equations of motion, it is quite suitable for studying time dependent phenomena. On the other hand, only E is needed for MC, and there is no concept of time. In un insieme canonico, il valore medio di una qualsiasi grandezza fisica A si può ottenere dalla media su tutte le copie diverse del sistema, secondo l’espressione: A e Ej KT Aj j e Ej KT j Passando a variabili continue nello spazio delle fasi: A Ae e E q, p KT E q, p KT dqdp dqdp Eq, p V(q) T( p) quando si può scrivere: si ottiene: A Ae e V q KT V q KT dq e dq e T p KT T p KT dp dp Ae e V q KT V q KT dq dq La superficie di energia potenziale presenta ampie zone stericamente proibite, con profondi minimi Schematicamente si ha: e-V/KT Da qui l’importanza di una esplorazione delle zone a energia potenziale minore q Metodo Montecarlo Calcolo di un integrale b f(x) I f x dx a I f b a b a x Selezionando n numeri a caso nell’intervallo 1 n f a f x i n i 1 b Per n grandi: a a,b lim f a f n b a n f x dx b a f b a f a f xi n i1 Metropolis Monte Carlo The most common technique employed in MC simulations is the Metropolis algorithm and involves the following steps: 1. 2. 3. 4. 5. Construct an initial configuration for a molecule. Make a random trial change in one of the degrees of freedom (e.g. dihedral angle) in a random molecule. Compute the change in energy ( E) of the system due to the trial change. If ,accept the new configuration. If ,compute the transition probability, Generate a random number, r in the interval [0,1], then accept the move, if or retain the previous configuration, if By comparing the Boltzmann weighted energy difference to a random number, a few higher energy conformers will be accepted, and energy barriers can in principle be crossed. simulazione MonteCarlo Metropolis Il rapporto tra le popolazioni è determinato dalla legge di Boltzmann: Sia Vn l'energia potenziale della configurazione n-esima e Vn-1 quella relativa alla configurazione precedente (n-1)-esima. Ci si trovi in una situazione in cui Vn > Vn-1. In tal caso la nuova configurazione viene accettata con una probabilità: pe Vn Vn1 kT e V kT e ciò viene fatto tramite la generazione di un numero casuale compreso tra 0 e 1, e se tale numero è ≤ p, la configurazione viene accettata e memorizzata, se è > p, la configurazione non viene accettata, e viene memorizzata la configurazione (n-1)-esima. Distribuzione delle configurazioni Er Es Siano date 2 configurazioni diverse r e s del sistema in studio e sia La probabilità Prs che venga generata una catena di trasformazioni che porta il sistema dalla configurazione r alla s, è a priori = alla probabilità Psr che venga generata una catena di trasformazioni che porta il sistema dalla configurazione s alla r: Prs Psr r s numero di configurazioni r ottenute numero di configurazioni s ottenute Numero di passaggi Numero di passaggi rs sr vr Prs vs Psr e E KT vs Psr e e Er KT Es KT In una simulazione infinita, il numero di configurazioni varia fino a raggiungere l’equilibrio tra i passaggi nelle 2 direzioni: vr Prs vs Psr e e Er KT Es KT Er KT E i vr e Es vi e KT vs e KT Poiché la distribuzione è già canonica, è sufficiente fare la media dei valori che la proprietà assume in ogni singola configurazione: N F F i i1 N dove N è il n. totale di configurazioni.