Lecture notes on Tight-Binding Molecular Dynamics,
and Tight-Binding justification of classical potentials
Furio Ercolessi
Dipartimento di Fisica, Università di Udine
email: [email protected]
7 February 2005 (minor revision 4 November 2010)
Introduction
The purpose of these notes is:
1. to illustrate the main aspects of the tight-binding (TB) method;
2. to sketch how TB molecular dynamics is done in practice;
3. to show how TB supplies ideas useful for the construction of empirical
potentials.
In TB approaches one works within the one-electron, self-consistent field
approximation (Kohn-Sham equations in the density functional theory framework). There are several kinds of TB models, differing in the kind of approximations made to the full density functional total energy expression. The common
denominator is the use of a minimal basis of local orbitals. I will not consider
the more accurate approaches, which definitely fall in the class of first-principle
methods, and focus the attention on the more approximate approaches—which
are of course those that allow molecular dynamics calculations with large number of particles.
It should also be noted that TB treatments in elementary solid state books
are focused on perfect crystals, and usually show how the band structure arises
within this scheme when applied to situations with translational symmetry (introducing the k vector, constructing Bloch functions, etc.). I will not follow this
road here, as we are mainly interested in applying this formalism to molecular
dynamics calculations for systems that do not present in general any particular
symmetry. We will therefore focus on a real space description, and ignore the
aspects related with band theory. So, no k vector appears in these notes.
These notes are just an introduction to the subject. The reader interested
in going deeper may consult the references given at the end of this document
[1, 2, 3, 4].
The tight-binding formalism
In TB approaches one writes the (unknown) total wavefunction ψ(r) as a linear
combination of atomic orbitals (LCAO):
ψ(r) =
X
iα
ciα φiα (r)
(1)
Here i is the atomic index and α runs on the different orbitals (such as s, px ,
py , pz etc). The orbital φiα (r) will be centered on the position ri of atom i.
In the vast majority of cases bonding is dominated by valence electrons, so we
will not include all the orbitals but only those which are believed to form the
bonding by overlapping with each other and “hybridizing”. These are often the
s and p orbitals of the outer shell (sp-bonded materials), but d orbitals are also
essential in transition metals (sd-bonded materials).
In the general case orbitals will not be an orthonormal set1 and we define
the overlap integrals as
Siαjβ = hiα|jβi =
Z
φ∗iα φjβ dr
(2)
In orthogonal-TB schemes S reduces to the unit matrix.
Of course, the true optimal one-electron wavefunction will not necessarily
be of the form (1) as atomic orbitals, being a finite number of functions, can not
constitute a complete set. However, in the spirit of the variational principle,
the “best” wavefunction within this class will be the one that minimizes the
functional
Z
Z
G[ψ] = ψ ∗ Hψ dr − ψ ∗ ψ dr
(3)
It can be shown (the reader is referred for details to section 5.3 of the Meccanica Quantistica course notes) that minimizing this functional corresponds to
solving the generalized eigenvalue problem
Hc = Sc
(4)
where c is the vector of coefficients ciα , S is the overlap matrix defined above,
and H is the Hamiltonian matrix whose elements are
Hiαjβ = hiα|H|jβi =
Z
φ∗iα Hφjβ dr
(5)
Explicitly, we need to solve
X
(Hiαjβ − Siαjβ )cjβ = 0
(6)
jβ
for all values of i and α.
This is a standard problem that can be easily solved with linear algebra
methods. The crucial feature of the tight-binding method is that orbitals are
centered around atoms and decay rapidly. Therefore, the S and H matrix
elements will be non-zero only for on-site terms (i = j) and between atom pairs
i and j whose relative distance is less than a certain cutoff distance.
The solution will consist of a set of eigenvalues s giving the energy levels,
(s)
and the corresponding eigenvectors ciα giving the related wavefunctions. One
will then populate the energy levels with all the electrons that have to be
1
We indicate later why non-orthogonal schemes are to be preferred.
2
accommodated, starting from the lowest level and going upwards, two electrons
per level, implying a contribution to the total energy
X
Uband = 2
s = 2
occ
X
ns s
(7)
s
where ns is 0 or 12 and represents the occupancy of the state s with eigenvalue
s .
This energy term is called the band energy, and is usually attractive and
responsible for the cohesion. In fact, if atoms get closer their overlap increases,
the range of the eigenvalues increases and, since only the lowest energy states
are occupied, the energy decreases (bonding increases).
However, the TB formalism shown above describes only bonding due to the
outer electrons. If one brings two atoms close together, inner shells will start to
overlap and bring additional energy (in the form of a strong repulsion) that is
not included in the band energy term. The total energy will therefore be given
as
E = Urep + Uband
(8)
where Urep is an empirical repulsive energy term. In most cases this is modelled
simply as a sum of two-body repulsive potentials between atoms:
1X
vR (rij )
2 ij
Urep =
(9)
but many-body expressions such as
Urep =
X


X
g  Φ(rij )
i
(10)
j
(where g is a non-linear embedding function) have also been proposed.
Finally, let us note that the band energy term can be formally rewritten as
follows:
Uband = 2
X
ns s = 2
X
s
ns hψs |H|ψs i = 2
XX
s
(s)∗ (s)
ns ciα cjβ Hiαjβ
(11)
s iαjβ
since
|ψs i =
X (s)
ciα |iαi
(12)
iα
If we define
ρiαjβ =
X
(s)∗ (s)
ns ciα cjβ
(13)
s
then
Uband = 2
X
s
ns s = 2
X
ρiαjβ Hiαjβ = 2 Tr(ρH)
(14)
iαjβ
(which is of course invariant with respect to rotations of the reference system) ρ
is called the density matrix and it basically contains the information contained
2
At high temperatures one could also use, more accurately, a Fermi distribution.
3
in the eigenvectors. The band energy can be seen as a weighted sum of the
hamiltonian matrix elements, the weights being the density matrix elements.
The part of Uband including only terms with i 6= j in the summation is also
called the bond energy:
Ubond = 2
X
ρiαjβ Hiαjβ
(15)
iαjβ
i6=j
In some schemes the terms of Uband with i = j are modelled within Urep , so in
these schemes one has
E = Urep + Ubond
(16)
The bond order
The quantity ρiαjβ defined by (13) in the context of eq. (15) (that is, when
i 6= j) is also called the bond order. Bond orders are important indicators of the
strength of the bonding. To be non-zero, the state has to be occupied and must
involve orbitals of both atoms i and j; but these are only necessary conditions.
Cancellation effects may in fact reduce the bond order for states that are not
effective in binding atoms together.
Let use consider for instance the hydrogen molecule, with two s orbitals
centered on the two atoms, φ1 and φ2 . As well known, they give rise to a
bonding and an antibonding state:
ψb =
ψa =
1
√ (φ1 + φ2 )
2
1
√ (φ1 − φ2 )
2
The definition (13) reduces in this case to
(b)∗ (b)
(a)∗ (a)
c2
ρ12 = nb c1 c2 + na c1
1
= (nb − na )
2
(17)
√
(b)
(b)
(a)
(a)
having used c1 = c2 = c1 = −c2 = 1/ 2.
We see that if we place both electrons in the bonding state, nb = 1, na = 0
(we are using a convention where n = 1 means that the state is occupied by
two electrons with opposite spin) and ρ12 = 1/2. If we place one electron in the
bonding state and one in the antibonding state, nb = na = 1/2, then ρ12 = 0, in
agreement with the fact that there is no bonding in this case. The bond order
becomes negative if we place both electrons in the antibonding state.
In this case, coefficients with the same (opposite) sign between two atoms
indicate that the state tends to be bonding (antibonding), but this is true only
for s orbitals. For instance, for a diatomic molecule pz orbitals (where z is the
molecule axis) are bonding when the coefficients have opposite sign.
4
Empirical tight-binding
The elements of the overlap and hamiltonian matrices are of course a function
of the atomic positions. One can evaluate them by computing the integrals,
but this is of course a compute-intensive approach. In empirical tight-binding
such computations are not made, and there is not even the need to know the
atomic orbitals: in these schemes one uses parametrized analytical expressions
for the matrix elements as a function of the atomic coordinates.
Let us consider the matrix element
Hiαjβ = hiα|H|jβi = hiα|T |jβi + hiα|Veff |jβi
(18)
where the kinetic and potential terms in the Kohn-Sham hamiltonian are considered separately. It can be immediately seen that
hiα|T |jβi =
Z
1
φ∗iα (r) − ∇2 φjβ (r)
2
(19)
depends only on the relative distance rij between the two atoms, and not on
their environment: it is a two-body term (unless i = j, in that case it is an
on-site term). This is in general not true for the potential term. If one makes
the additional assumption that Veff can be expressed as a sum of atom-centered
contributions,
X (i)
veff (r − ri )
(20)
Veff (r) =
i
then it can be seen that, if i 6= j, we only get two- and three-centre integrals:
(i)
(j)
hiα|Veff |jβi = hiα|veff |jβi + hiα|veff |jβi +
X
(k)
hiα|veff |jβi
(21)
k6=i,j
and if i = j we only get one- and two-centre integrals:
(i)
hiα|Veff |iβi = hiα|veff |iβi +
X
(j)
hiα|veff |iβi
(22)
j6=i
In the most commonly used approximate TB schemes, it is assumed that
(20) is valid and three-centre integrals are neglected. Moreover, one assumes
that one- and two-centre integrals are not calculated explicitly, but directly
modelled through an explicit analytical form, which in the case of two-centre
integrals is a function of the distance rij between the two atoms. Sometimes
these schemes are also based on an orthonormal basis assumption (Siαjβ =
δij δαβ ), so only the Hiαjβ (r) matrix elements are given.
Clearly this approximation will lead to some errors. To some extent the lack
of three-centre integrals can be compensated when choosing the parameters to
fit one- and two-centre integrals, but this is certainly a serious approximation
that may be difficult to circumvent and may lead to transferability problems.
The number of required parameters seems to be rather large. For instance,
in a full spd model we would have 1 + 3 + 5 = 9 orbitals on each site, and
they would give rise to 9 × 9 = 81 hopping integrals between each pair of
atoms. However, in the two-centre approximation symmetry greatly reduces
5
the number of independent hopping integrals, as worked out by Slater and
Koster in 1954. They showed that in the generic case (different atoms) only 19
of those 81 integrals can be non-zero, 15 of them independent. The number of
independent integrals goes down to 11 if the two atoms involved are identical.
These symmetry considerations apply to both the bond integrals and the overlap
integrals (when present).
Force computation
Forces on atoms, which are the main ingredient for a MD calculation, are defined
as
Fi = −∇i E
(23)
where E is the energy given by eq. (8). While the repulsive term, expressed
as a classical potential, poses no problem, derivatives of the band energy term
require some attention.
As indicated by linear algebra methods, multiplying eq. (4) on the left by cT ∗
(where the T index means transposition) will give, using properly normalized
eigenvectors,
= cT ∗ Hc
(24)
or
s =
X (s)∗
(s)
ciα Hiαjβ cjβ
(25)
iαjβ
All the terms depend on the atomic positions, therefore
ds
=
dR iαjβ
X

(s)∗
dc
 iα
dR
(s)∗ dHiαjβ (s)
cjβ
(s)
Hiαjβ cjβ + ciα
dR
dcjβ

dR

(s)
(s)∗
+ ciα Hiαjβ
(26)
(here R means any component of the vector position of any atom). However,
the eigenstates are orthonormal:
(s0 )
X (s)∗
ciα Siαjβ cjβ = δss0
(27)
iαjβ
hence the derivative of this quantity has to be zero:

0=
X
iαjβ
(s)∗
dc
 iα
dR
(s)∗ dSiαjβ (s)
cjβ
(s)
Siαjβ cjβ + ciα
dR
dcjβ

dR

(s)
(s)∗
+ ciα Siαjβ
(28)
These two equations for the derivatives can be combined by subtracting (28)
multiplied by s from (26) and obtaining
X (s)∗ dHiαjβ
dSiαjβ
ds
=
ciα
− s
dR iαjβ
dR
dR
The first and third terms have vanished because of (6).
6
(s)
cjβ
(29)
We can now calculate the force:
X ds
X dHiαjβ
dSiαjβ
dE
= −2
ns
= −2
ρiαjβ −
wiαjβ
dR
dR
dR
dR
s
iαjβ
−
(30)
where ρiαjβ is the density matrix defined by (13), and
wiαjβ =
X
(s)∗ (s)
ns s ciα cjβ
(31)
s
is an energy-weighted density matrix.
Usually Hiαjβ = hαβ (rij ) and Siαjβ = sαβ (rij ) are known functions of the
distance between pairs of atoms, so one can use (30) to calculate the force on
each atom, provided that the vectors c are exact eigenvectors of H.
The actual force calculations will of course have to deal with the angular
dependences of the hopping and overlap integrals. There are efficient library
routines available that cope with this problem, so that it is not of practical
concern.
To summarize, this is how a TB MD calculation works in practice for each
time step:
1. obtain the bond integrals and, when present in the model, the overlap
integrals as parametrized analytical functions of the interatomic distances;
2. solve the eigenvalues problem to obtain the density matrix ρ and the
energy-weighted density matrix w (this is the compute-intensive part);
3. compute the forces using (30);
4. move the atoms to the new positions;
5. go to step 1 and repeat.
The transferability problem
As always the case with approximate interaction schemes, transferability can be
a major problem with TB. In other words, a parametrization appropriate for a
certain environment, such as for instance bulk atoms in a semiconductor with
a diamond-like structure, may not be appropriate for a different environment
where coordination is different. This issue is particularly important for TBMD
calculations, where atomic environments can change in unpredictable ways.
The transferability problem is less strong when the orbitals are more localized, because there is less overlap with neighboring atoms and therefore the
parameters are less dependent on the environment. Orbitals of isolated atoms
are not really good for TB, because they have a fairly long exponential tail.
Generally, the best transferability is achieved with the most localized basis
sets.
In this sense, non-orthogonal parametrizations (that is, where S is not the
unit matrix), even if computationally heavier, are preferable. The reason for
this is that an orthogonal basis set that is equivalent to a non-orthogonal one
7
(in the sense that it reproduces the same eigenvalues for a given structure) must
necessarily have orbital wavefunctions with a longer range. Such wavefunctions
usually have oscillatory tails, which make it possible to have all overlaps equal
to zero. However this feature “binds” the model to a certain crystal structure:
orthogonality can not be easily achieved by the same model in the context of a
different environment. As a consequence, transferability will be lost.
For this reason, the tendency of recent parametrizations is to use nonorthogonal schemes (and this is why I presented the slightly more complex
non-orthogonal theory in these notes).
O(N) Methods
Diagonalizing a N × N matrix is usually a O(N 3 ) task; N would be in this case
the number of valence electrons involved. Therefore, this is the most computeintensive part of a standard empirical TB calculation. Recently, approaches that
scale linearly with N have been proposed and used. These approaches are based
on the use of the density matrix, and assume that this matrix is short-ranged,
namely that only matrix elements between atoms within a certain cutoff radius
are nonzero. The physical justification behind this assumption is electronic
screening. Thanks to screening, a perturbation made in a point of the system
is not seen far from that point. The reader can consult [1] for more detail.
Tight-binding and classical potentials
The tight-binding formalism allows to establish a “bridge” between the firstprinciples interactions as obtained from electronic structure methods, and empirical classical potentials (where the electronic degrees of freedom have been
formally eliminated). This connection is useful to get hints about the analytical
forms best suited to model materials realistically.
Local density of states
We first need to define some concepts, putting emphasis on locality as our classical potentials must be able to change as the local environment changes—this
is the key to accurate modelling and good transferability. For sake of simplicity,
we will assume in this section that the basis of orbitals is orthonormal. This
is not a fundamental assumption; it is done just to simplify the treatment, but
the final conclusions remain valid in the general case.
Let us start by rewriting (11) introducing the Fermi distribution n() and
a density of states
X
D() =
δ( − s )
(32)
s
so that
Uband = 2
X
Z +∞
ns s = 2
s
8
n()D() d
−∞
(33)
In turn, we can consider D() to be the sum of local density of states, one
for each orbital in the system:
D() =
X
Diα ()
(34)
|hψs |iαi|2 δ( − s )
(35)
iα
This equation is satisfied if we choose
Diα =
X
s
as can be readily verified by summing on iα making use of iα |iαihiα| = 1.
The local density of states is an important quantity in the context of atomic
bonding. When atoms are brought together to form a condensed system, their
energy levels become “bands” and we can consider the band seen by each atom
as described by the local density of states. We can now say that
P
Uband = 2
X Z +∞
−∞
iα
n()Diα () d
(36)
and this is useful because when using classical potentials the total energy is
P
almost always written as a sum over atoms, E = i Ei . Therefore, the relation
above is our key to establish the bridge between the electronic and classical
worlds.
The moments
Diα () can be a rather complex function, but we can attempt a simplified description in terms of its moments, as proposed by the group of Cyrot-Lackmann.
The p-th moment of the distribution is defined as
µiα
p =
Z +∞
−∞
p Diα () d
(37)
that we can also write, using (35),
µiα
p =
X
|hψs |iαi|2 ps
(38)
hψs |iαihiα|H p |ψs i = hiα|H p |iαi
(39)
s
This can be expanded as
µiα
p =
X
s
This expression is particularly nice, because the orbitals are fixed and known,
so this is a real space function of the positions of the atoms i and j that can be
evaluated without solving any Schrœdinger equation.
iα
Note that µiα
0 = 1, and µ1 represents an average energy for the atomic
orbital, related with the choice of the zero for the energy scale. The really
relevant informations on the shape of Diα () are therefore contained in µiα
2 and
higher moments. Let us then concentrate on the second moment, which is the
most important one. If we assume that the average energy is zero, the second
9
moment is a variance and therefore represents the square of the energy width, or
“bandwidth” W of the local density. For instance, if one assumes that Diα () is
a rectangular distribution centered on 0, of width W and height 1/W , then the
2
second moment is W 2 /12. More generally, we can always consider µiα
2 ∝W .
We can write the second moment as
2
µiα
2 = hiα|H |iαi =
X
hiα|H|jβihjβ|H|iαi
(40)
jβ
This can be seen as the sum of all the possible two-hops paths that start from
iα and come back to iα. It is easy to see by the same argument that µiα
p can
be written as the sum of all the possible p-hops paths that start from iα and
come back to iα. So we have
µiα
3 =
X
hiα|H|jβihjβ|H|kγihkγ|H|iαi
(41)
jβkγ
and so on. This result is known as the moment theorem. The very remarkable
result is that all moments can be constructed out of the same building blocks
that are usually just a function of the distance between pairs of atoms. It is
precisely this result that allows us to connect the tight binding theory with
empirical potentials.
Glue and embedded-atom potentials
[For background information on these potentials please see ref. [5]].
Let us assume that we can neglect the third and higher order moments and
can describe a system just in terms of the second moments. Also, let us drop
the orbital indexes in this section of sake of simplicity, and also because we are
now moving to a very approximate, almost qualitative approach.
Consider eq. (36) for the tight-binding energy. Each term in the summation
will then depend only on µi2 , as this is by hypothesis the only quantity that
characterizes Di (). But µi2 is a quantity that has the dimensions of an energy
squared, so it must be necessarily
Uband ' −A
Xq
µi2
(42)
i
where A is a positive proportionality constant. This amounts to say that the
energy decrease is proportional to the bandwidth, which is understandable.
Think for instance to a situation where the band is half-populated: the average
energy will decrease proportionally to W when W is increased.
The moment theorem (40) tells us that
µi2 =
X
[h(rij )]2
(43)
j
where we have assumed that h(rij ) = |hi|H|ji| is a function of the distance
between i and j. Putting together the two results we have
Uband ' −A
X sX
i
10
j
[h(rij )]2
(44)
P
P
which is a glue potential term i U [ j ρ(rij )] with a glue function U (n) =
√
−A n and an “atomic density” function ρ(r) = [h(rij )]2 . n is here the “generalized coordination” of the glue scheme. The tight-binding approach suggests
that the energy should depend on the square root of the coordination (this
may be contrasted with the linear dependence upon coordination implied by
two-body interaction models).
In practice, in real parametrizations the functions U (n) and ρ(r), together
with the pair potential that enters the Urep part, are fitted to experimental and
simulation data. In this sense, the approach presented in this subsection does
not really have a quantitative validity. Its importance lies in the fact that it
shows how this kind of Hamiltonians arises “naturally” out of the tight-binding
framework, with a glue function U exhibiting a positive curvature.
Such potentials are also called pair functionals, as they not pairwise potentials but still constructed using only the distance between pairs of atoms.
They work fairly well for certain classes of metals such as noble metals, where
angular forces between atoms are weak. However, for transition models and
covalent materials such as semiconductors, higher order moments play a crucial
role and cannot be neglected. In particular, it has been found that second moment schemes cannot reproduce accurately energy differences between crystal
structures.
Pair functionals from higher order moments
If higher order moments have to be considered for a realistic modelling, we move
from pair functionals to cluster functionals. If we limit ourselves to fourth
moments, our theory suggests that the functional form for the energy of the
system will be of the type
Uband =
X
V (µi2 , µi3 , µi4 )
(45)
i
This can be computed once one has an appropriate parametrization of the local
density of states in terms of these three moments. The moment theorem tells
us that µi3 consists of three-body terms, and µi4 consists of four-body terms.
The moments will be combined to give an energy by means of a non-linear
expression.
Several models of this kind have been studied in the past years. Unfortunately, it was found that even fourth moment descriptions are not able to
explain subtle energy differences even in relatively simple sp-bonded materials
such as semiconductors. This has led researchers to explore alternative models.
Bond order potentials
In covalent materials there is an accumulation of electronic charge in the regions
between pair of atoms. In such cases, it may be more convenient to focus on
the concept of bond rather than on the concept of atom (as implied by (45))
when writing the analytical form for the classical potential.
11
To this end, let us go back to the bond order concept and consider in
particular eq. (14), which we rewrite again dropping the orbital indexes:
Uband = 2
X
ρij Hij
(46)
ij
ρij is what we called the bond order, and is a quantity that depends on the
environment. Therefore, it can not be modeled as a function of the distance
between i and j, and it will be influenced by the nature and position of all
the atoms in the surrounding of the i-j bond. In this sense we are still dealing
with a cluster functional, but of a slightly different nature with respect to those
described above.
To model ρij one typically considers bonding and antibonding orbitals,
that—as discussed for the H2 molecule case—bring contributions of different
signs to the bond order. The details of the models and fitting procedures can
be rather complex and will not be discussed here.
One of the most successful empirical potentials for semiconductors based on
the bond order concept is that proposed by Tersoff [6] for silicon, germanium
and carbon. The Tersoff scheme is based on an analytical form
E=
1X
[ΦR (rij ) + ρ(zij )ΦA (rij )]
2 i6=j
(47)
where ΦR (r) is a repulsive pair potential, ΦA (r) an attractive pair potential, and
ρ the bond order modelled as a decreasing function of a “bond coordination”
variable zij , in turn defined in terms of three-body terms dependent on distances
and angles. The choices made by Tersoff in his original implementation are
zij =
ΦR (r) = Ae−λ1 r
(48)
ΦA (r) = −Be−λ2 r
(49)
ρ(z) = (1 + β n z n )−1/2n
(50)
X
fc (rik )g(θijk )eλ
3 (r −r )3
ij
ik
(51)
k6=i,j
fc (r) =


 1
1
2


1
2
− sin
π r−R
2 D
if r < R − D
0
g(θ) = 1 +
if R − D < r < R + D
if r > R + D
c2
c2
−
d2 d2 + (h − cos θ)2
(52)
(53)
where A, B, λ1 , λ2 , β, n, λ, R, D, c, d, h are 12 fitting parameters.
These potentials are qualitatively correct, and definitely better than potentials from the previous generation of purely “geometrical” potentials with
three-body terms (Stillinger-Weber), in particular from the transferability point
of view. Some problems remain, which may be related to the parametrization
and fitting rather than to the general scheme.
12
References
[1] L. Colombo, Tight-binding molecular dynamics: A primer, Rivista del
Nuovo Cimento 10, 2005 (2006).
[2] A. E. Carlsson, Solid State Phys. 43, 1 (1990).
[3] M. Finnis, Interatomic forces in condensed matter, Oxford University Press,
Oxford (2003).
[4] K. Ohno, K. Esfarjani and Y. Kawazoe, Computational Materials Science,
Springer-Verlag, Berlin (1999) (chapter 3).
[5] F. Ercolessi, A molecular dynamics primer, http://www.fisica.uniud.
it/~ercolessi/md/, and references therein.
[6] J. Tersoff, Phys. Rev. B 37, 6991 (1988).
13