arXiv:1006.3254v1 [cond-mat.mtrl-sci] 16 Jun 2010
physica status solidi, 17 June 2010
Accuracy of Quantum Monte Carlo Methods for Point Defects in Solids
William D. Parker (1), John W. Wilkins (1), Richard G. Hennig (*, 1, 2)
(1) Department of Physics, The Ohio State University, 191 W. Woodruff Ave., Columbus, Ohio 43210, USA
(2) Department of Materials Science and Engineering, Cornell University, Ithaca, New York 14853, USA
Received XXXX, revised XXXX, accepted XXXX
Published online XXXX
PACS
(*) Corresponding author: email rhennig@cornell.edu, Phone: +016072546429
Quantum Monte Carlo approaches such as the diffusion Monte Carlo (DMC) method are among the most accurate many-body methods for extended systems. Their scaling makes them well suited for defect calculations in solids. We review the various approximations needed for DMC calculations of solids and the results of previous DMC calculations for point defects in solids. Finally, we present estimates of how approximations affect the accuracy of calculations for self-interstitial formation energies in silicon and predict DMC values of 4.4(1), 5.1(1) and 4.7(1) eV for the X, T and H interstitial defects, respectively, in a 16(+1)-atom supercell.
Copyright line will be provided by the publisher
1 Introduction
Point defects, such as vacancies, interstitials and anti-site defects, are the only thermodynamically stable defects at finite temperatures [1]. The infinite slope of the entropy of mixing at infinitesimally small defect concentrations results in an infinite driving force for defect formation. As a result, at small defect concentrations, the entropy of mixing always overcomes the enthalpy of defect formation. In addition to being present in equilibrium, point defects often control the kinetics of materials, such as diffusion and phase transformations, and are important for materials processing. The presence of point defects in materials can fundamentally alter the electronic and mechanical properties of a material. This makes point defects technologically important for applications such as doping of semiconductors [2,3], solid-solution hardening of alloys [4,5], controlling the transition temperature for shape-memory alloys [6] and the microstructural stabilization of two-phase superalloys.
However, the properties of defects, such as their structures and formation energies, are difficult to measure in some materials due to their small sizes, low concentrations, lack of suitable radioactive isotopes, etc. Quantum-mechanical first-principles or ab initio theories make predictions to fill in the gaps left by experiment [7].
The most widely used method for the calculation of defect properties in solids is density functional theory (DFT). DFT replaces explicit many-body electron interactions with quasiparticles interacting via a mean-field potential, i.e. the exchange-correlation potential, which is a functional of the electron density [8]. A universally true exchange-correlation functional is unknown, and DFT calculations employ various approximate functionals, either based on a model system or an empirical fit. The most commonly used functionals are based on DMC simulations [9] for the uniform electron gas at different densities, e.g., the local density approximation (LDA) [10,11] and gradient expansions, e.g., the generalized gradient approximation (GGA) [12,13,14,15,16]. These local and semi-local functionals suffer from a significant self-interaction error reflected in the variable accuracy of their predictions for defect formation energies, charge transition levels and band gaps [17,18]. Another class of functionals, called hybrid functionals, includes a fraction of exact exchange to improve accuracy [19,20].
The seemingly simple system of Si self-interstitials exemplifies the varied accuracy of different density functionals and many-body methods. The diffusion and thermodynamics of silicon self-interstitial defects dominate the doping and subsequent annealing processes of crystalline silicon for electronics applications [3,21,22]. The mechanism of self-diffusion in silicon is still under debate. Open questions [23] include: (1) are the interstitial atoms the prime mediators of self-diffusion, (2) what is the specific mechanism by which the interstitials operate, and
2 Parker et al.: Testing QMC on Defects in Solids
(3) what is the value of the interstitial formation energy? Quantum-mechanical methods are well suited to determine defect formation energies. LDA, GGA and hybrid functionals predict formation energies for these defects ranging from about 2 to 4.5 eV [24]. Quasiparticle methods such as the $GW$ approximation reduce the self-interaction error in DFT and are expected to improve the accuracy of the interstitial formation energies. Recent $G_0W_0$ calculations [25] predict formation energies of about 4.5 eV, in close agreement with HSE hybrid functionals [24] and previous DMC calculations [26,24]. Quantum Monte Carlo methods provide an alternative to DFT and a benchmark for defect formation energies [27,28].
In this paper, we review the approximations that are made in diffusion Monte Carlo (DMC) calculations for solids and estimate how these approximations affect the accuracy of point defect calculations, using the Si self-interstitial defects as an example. Section 2 describes the quantum Monte Carlo method and its approximations. Section 3 reviews previous quantum Monte Carlo calculations for defects in solids, and Section 4 discusses the results of our calculations for interstitials in silicon and the accuracy of the various approximations.
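
The entropy argument at the start of this section can be made explicit with the standard ideal-mixing model. The following is only a sketch under stated assumptions: $n$ non-interacting defects distributed over $N$ sites, a single formation energy $E_f$, and no vibrational entropy. The symbols $n$, $N$, $c$, $E_f$ and $G$ are introduced here for illustration and are not notation used elsewhere in the text.

```latex
\begin{align}
G(n) &= n E_f - k_B T \ln \binom{N}{n}, \\
\frac{\partial G}{\partial n} &\approx E_f + k_B T \ln \frac{c}{1-c},
  \qquad c = \frac{n}{N}, \\
\frac{\partial G}{\partial n} = 0 \;&\Rightarrow\;
  c_\mathrm{eq} = \frac{1}{1 + e^{E_f / k_B T}} \approx e^{-E_f / k_B T}.
\end{align}
```

The logarithmic term diverges to $-\infty$ as $c \to 0$, which is the infinite driving force for defect formation; within this model any finite $E_f$ therefore gives a nonzero equilibrium concentration at finite temperature.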
2 Quantum Monte Carlo method
Quantum Monte Carlo (QMC) methods are among the most accurate electronic structure methods available and, in principle, have the potential to outperform current computational methods in both accuracy and cost for extended systems. QMC methods scale as $O(N^3)$ with system size and can handle large systems. At the present time, calculations for as many as 1,000 electrons on 1,000 processors make effective use of available computational resources [24]. Work is under way to develop algorithms that extend the system size accessible to QMC methods on petascale computers [29].
Continuum electronic structure calculations primarily use two QMC methods [27]: the simpler variational Monte Carlo (VMC) and the more sophisticated diffusion Monte Carlo (DMC). In VMC, a Monte Carlo method evaluates the many-dimensional integral to calculate quantum mechanical expectation values. Accuracy of the results depends crucially on the quality of the trial wave function, which is controlled by the functional form of the wave function and the optimization of the wave function's parameters [30]. DMC removes most of the error in the trial wave function by stochastically projecting out the ground state using an integral form of the imaginary-time Schrödinger equation.
One of the most accurate forms of trial wave functions for quantum Monte Carlo applications to problems in electronic structure is a sum of Slater determinants of single-particle orbitals multiplied by a Jastrow factor and modified by a backflow transformation:
$$\Psi(\mathbf{r}_n) = e^{J(\mathbf{r}_n, \mathbf{R}_m)} \sum_i c_i \, \mathrm{CSF}_i(\mathbf{x}_n).$$
The Jastrow factor $J$ typically consists of a low-order polynomial and a plane-wave expansion in electron coordinates $\mathbf{r}_n$ and nuclear coordinates $\mathbf{R}_m$ that efficiently describe the dynamic correlations between electrons and nuclei. Static (near-degeneracy) correlations are described by a sum of Slater determinants. Symmetry-adapted linear combinations of Slater determinants, so-called configuration state functions (CSF), reduce the number of determinant parameters $c_i$. For extended systems, the lack of size consistency for a finite sum of CSFs makes this form of trial wave functions impractical, and a single determinant is used instead. Finally, the backflow transformation $\mathbf{r}_n \to \mathbf{x}_n$ allows the nodes of the trial wave function to be moved, which can efficiently reduce the fixed-node error [31]. Since the backflow-transformed coordinate of an electron $\mathbf{x}_n$ depends on the coordinates of all other electrons, the Sherman-Morrison formula used to efficiently update the Slater determinant does not apply, increasing the scaling of QMC to $O(N^4)$. If a finite cutoff for the backflow transformation is used, the Sherman-Morrison-Woodbury formula [32] applies and the scaling is reduced to $O(N^3)$.
Optimization of the many-body trial wave function is crucial because accurate trial wave functions reduce statistical and systematic errors in both VMC and DMC. Much effort has been spent on developing improved methods for optimizing many-body wave functions, and this continues to be the subject of ongoing research. Energy and variance minimization methods can effectively optimize the wave function parameters in VMC calculations [30,33].
Recently developed energy optimization methods enable the efficient optimization of CSF coefficients and orbital parameters in addition to the Jastrow parameters for small molecular systems, eliminating the dependence of the results on the input trial wave function [30].
VMC and DMC contain two categories of approximation to make the many-electron solution tractable: controlled approximations, whose errors can be made arbitrarily small through adjustable parameters, and uncontrolled approximations, whose errors are not known exactly. The controlled approximations include the finite DMC time step, the finite number of many-electron configurations that represent the DMC wave function, the basis set approximation, e.g., spline or plane-wave representation, for the single-particle orbitals of the trial wave function and the finite-sized simulation cell. The uncontrolled approximations include the fixed-node approximation, which constrains the nodes of the wave function in DMC to be the same as the ones for the trial wave function, the replacement of the core electrons around each atom with a pseudopotential to represent the core-valence electronic interaction and the locality approximation that uses the trial wave function to project the nonlocal angular momentum components of the pseudopotential.
2.1 Controlled approximations
pss header will be provided by the publisher 3
Time step
Diffusion Monte Carlo is based on the transformation of the time-dependent Schrödinger equation into an imaginary-time diffusion equation with a source-sink term. The propagation of the $3N$-dimensional electron configurations (walkers) that sample the wave function requires a finite imaginary time step, which introduces an error in the resulting energy [34,35].
Controlling the time-step error is simply a matter of performing calculations for a range of time steps, either to determine when the total energy or defect formation energy reaches the required accuracy or to perform an extrapolation to zero time step using a low-order polynomial fit of the energy as a function of time step. Smaller time steps, however, require a larger total number of steps to sample the probability space sufficiently. Thus, the optimal time step should be small enough to add no significant error to the average while large enough to keep the total number of Monte Carlo steps manageable. In addition, the more accurate the trial wave function is, the smaller the error due to the time step will be [35].
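
The extrapolation to zero time step mentioned above can be sketched as a least-squares fit. The example below assumes the simplest low-order polynomial, a straight line E(tau) = e0 + slope * tau, ignores statistical error bars, and uses made-up energies; the function name `extrapolate_to_zero_time_step` and all numbers are hypothetical.

```python
def extrapolate_to_zero_time_step(taus, energies):
    """Unweighted least-squares line through (tau, E); returns (e0, slope)."""
    n = len(taus)
    mean_t = sum(taus) / n
    mean_e = sum(energies) / n
    s_tt = sum((t - mean_t) ** 2 for t in taus)
    s_te = sum((t - mean_t) * (e - mean_e) for t, e in zip(taus, energies))
    slope = s_te / s_tt
    e0 = mean_e - slope * mean_t  # intercept = energy at zero time step
    return e0, slope


# Hypothetical energies with a purely linear time-step error.
taus = [0.01, 0.02, 0.04, 0.08]
energies = [-3.0 + 0.5 * t for t in taus]
e0, slope = extrapolate_to_zero_time_step(taus, energies)
```

In a real calculation each point carries a statistical uncertainty, so a weighted fit and a check of the fit order would be the natural next refinements.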
Configuration population
In DMC, a finite number of electron configurations represent the many-body wave function. These configurations are the time-independent Schrödinger equation's analogues to particles in the diffusion equation and have also been called psips [34] and walkers [27]. To improve the efficiency of sampling the many-body wave function, the number of configurations is allowed to fluctuate from time step to time step in DMC using a branching step. However, the total number of configurations needs to be controlled to prevent the configuration population from diverging or vanishing [35]. This population control introduces a bias in the energy. In practice, where tested [36], a few hundred configurations are sufficient to reduce the population control bias in the DMC total energy below the statistical uncertainty.
The VMC and DMC calculations parallelize easily over walkers. After an initial decorrelation run, the propagation of a larger number of walkers is computationally equivalent to performing more time steps. The variance of the total energy scales like
$$\sigma_E^2 \propto \frac{\tau_\mathrm{corr}}{N_\mathrm{conf} N_\mathrm{step}},$$
where $N_\mathrm{conf}$ denotes the number of walkers, $N_\mathrm{step}$ the number of time steps and $\tau_\mathrm{corr}$ the autocorrelation time.
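
The scaling of the statistical error with walkers and steps can be checked with a few lines of arithmetic. This sketch assumes the proportionality constant is 1 (so only ratios are meaningful); the function name `relative_error_bar` and the autocorrelation time of 10 steps are hypothetical.

```python
import math


def relative_error_bar(tau_corr, n_conf, n_step):
    """sqrt(tau_corr / (n_conf * n_step)), defined up to a constant prefactor."""
    return math.sqrt(tau_corr / (n_conf * n_step))


base = relative_error_bar(tau_corr=10.0, n_conf=1280, n_step=1000)
# Doubling the walkers or doubling the steps gives the same error bar.
more_walkers = relative_error_bar(10.0, 2 * 1280, 1000)
more_steps = relative_error_bar(10.0, 1280, 2 * 1000)
# Halving the error bar requires four times as many samples.
four_times = relative_error_bar(10.0, 4 * 1280, 1000)
```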
Basis set
A sum of basis functions with coefficients represents the single-particle orbitals in the Slater determinant. A DFT calculation usually determines these coefficients. Plane waves provide a convenient basis for calculations of extended systems since they form an orthogonal basis that systematically improves with increasing number of plane waves that span the simulation cell. Increasing the number of plane waves until the total energy converges within an acceptable threshold in DFT creates a basis set that has presumably the same accuracy in QMC.
Since the plane-wave basis functions are extended throughout the simulation cell, the evaluation of an orbital at a given position requires a sum over all plane waves. Furthermore, the number of plane waves is proportional to the volume of the simulation cell. The computational cost of orbital evaluation can be reduced significantly by using a local basis, such as B-splines, which replaces the sum over plane waves with a sum over a small number of local basis functions. The resulting polynomial approximation reduces the computational cost of orbital evaluation at a single point from the number of plane waves (hundreds to thousands depending on the basis set) to the number of nonzero polynomials (64 for cubic splines) [37]. The wavelength of the highest-frequency plane wave sets the resolution of the splines. Thus, the most important quantity to control in the basis set approximation is the size of the basis set.
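
The count of 64 nonzero polynomials follows from the locality of cubic B-splines: at any point only 4 basis functions per dimension are nonzero, so a three-dimensional tensor product touches 4 x 4 x 4 = 64 coefficients. The one-dimensional sketch below assumes a uniform periodic grid; the function names and the coefficient array are hypothetical and do not reflect the interface of any QMC code.

```python
import math


def cubic_bspline_weights(t):
    """The 4 nonzero uniform cubic B-spline weights at fractional offset t in [0, 1)."""
    return (
        (1.0 - t) ** 3 / 6.0,
        (3.0 * t ** 3 - 6.0 * t ** 2 + 4.0) / 6.0,
        (-3.0 * t ** 3 + 3.0 * t ** 2 + 3.0 * t + 1.0) / 6.0,
        t ** 3 / 6.0,
    )


def eval_periodic_spline(coeffs, x, length):
    """Evaluate a periodic 1D cubic B-spline with coefficients `coeffs` at x."""
    n = len(coeffs)
    u = (x / length) * n
    i = math.floor(u)          # index of the grid interval containing x
    t = u - i                  # fractional position inside the interval
    w = cubic_bspline_weights(t)
    # Only 4 coefficients contribute, independent of the grid size n.
    return sum(w[j] * coeffs[(i - 1 + j) % n] for j in range(4))


nonzero_terms_3d = 4 ** 3
constant_value = eval_periodic_spline([2.0] * 8, 3.7, 10.0)
```

The cost per evaluation is fixed by the spline order rather than by the number of plane waves, which is the source of the speedup described above.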
Simulation cell
Simulation cells with periodic boundary conditions are ideally suited to describe an infinite solid but result in undesirable finite-size errors that need correction. There are three types of finite-size errors. First, the single-particle finite-size error arises from the choice of a single $k$-point in the single-particle Bloch orbitals of the trial wave function. Second, the many-body finite-size error arises from the nonphysical self-image interactions between electrons in neighboring cells. Third, the defect creates a strain field that results in an additional finite-size error for small simulation cells.
The single-particle finite-size error is greatly reduced by averaging DMC calculations for single-particle orbitals at different $k$-points that sample the first Brillouin zone of the simulation cell, so-called twist averaging [38]. Alternatively, the single-particle finite-size error can be estimated from the DFT energy difference between a calculation with a dense $k$-point mesh and one with the same single $k$-point chosen for the orbitals of the QMC wave function.
For the many-body finite-size error, several methods aim to correct the fictitious periodic correlations between electrons in different simulation cells. The first approach, the model periodic Coulomb (MPC) interaction [39], revises the Ewald method [40] to account for the periodicity of the electrons by restoring the Coulomb interaction within the simulation cell and using the Ewald interaction to evaluate the Hartree energy. The second approach is based on the random phase approximation for long wavelengths. The resulting first-order finite-size correction term for both the kinetic and potential energies can be estimated from the electronic structure factor [41]. The third approach estimates the many-body finite-size error from the energy difference between DFT calculations using a finite-sized and an infinite-sized model exchange-correlation functional [42]. This approach relies on the exchange-correlation functional being a reasonable description of the system, whereas the other two approaches (MPC and structure factor) do not have this restriction. The MPC and structure-factor corrections are fundamentally related and often result in similar energy corrections [43].
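
The two treatments of the single-particle finite-size error described above reduce to simple arithmetic once the individual energies are available. The sketch below assumes that the DFT-based correction is added to the single-k-point QMC energy and that twists may carry symmetry weights; the function names and all energies are hypothetical.

```python
def twist_average(energies, weights=None):
    """Weighted mean of QMC energies computed at different twists (k-points)."""
    if weights is None:
        weights = [1.0] * len(energies)
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)


def dft_single_particle_correction(e_dft_dense_mesh, e_dft_single_k):
    """DFT estimate of the single-particle finite-size error."""
    return e_dft_dense_mesh - e_dft_single_k


# Hypothetical energies in arbitrary units.
averaged = twist_average([-1.0, -2.0, -3.0])
correction = dft_single_particle_correction(-10.25, -10.0)
corrected_qmc = -9.5 + correction
```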
The defect strain finite-size error can be estimated at the DFT level using extrapolations of large simulation cells. Also, since QMC force calculations are expensive and still under development [44], QMC calculations for extended systems typically start with DFT-relaxed structures. Energy changes due to small errors in the ionic positions as well as thermal disorder are expected to be quite small because of the quadratic nature of the minima and will largely cancel when taking energy differences for the defect energies.
2.2 Uncontrolled approximations
Fixed-node approximation
The Monte Carlo algorithm requires a probability distribution, which is non-negative everywhere, but fermions, such as electrons, are antisymmetric under exchange, and therefore any wave function of two or more fermions has regions of positive and negative value. For quantum Monte Carlo to take the wave function as the probability distribution, Anderson [34] fixed the zeros or nodes of the wave function and took the absolute value of the wave function as the probability distribution. If the trial wave function has the nodes of the ground state, then DMC projects out the ground state. However, if the nodes differ from those of the ground state, then DMC finds the lowest-energy state compatible with the inexact nodal surface imposed by the fixed-node condition. This inexact solution has an energy higher than that of the ground state.
Three methods estimate the size of the fixed-node error: (1) In the Slater-Jastrow form of the wave function, the single-particle orbitals in the Slater determinant set the zeros of the trial wave function. Since these orbitals come from DFT calculations, varying the exchange-correlation functional in DFT changes the trial wave function nodes and provides an estimate of the size of the fixed-node error. (2) López Ríos et al. [31] applied backflow to the nodes by modifying the interparticle distances, enhancing electron-electron repulsion and electron-nucleus attraction. The expense of the method has thus far limited its application in the literature to studies of second- and third-row atoms, the water dimer and the 1D and 2D electron gases. (3) Because the eigenfunction of the Hamiltonian has zero variance in DMC, a linear extrapolation from the variances of calculations with and without backflow to zero variance estimates the energy of the exact ground state of the Hamiltonian.
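
The third estimate, the linear extrapolation to zero variance, needs only two (variance, energy) pairs: one from the Slater-Jastrow wave function and one from the backflow wave function. The sketch below assumes an exactly linear energy-variance relation; the function name `zero_variance_extrapolation` and the numbers are hypothetical.

```python
def zero_variance_extrapolation(var_sj, e_sj, var_bf, e_bf):
    """Extrapolate a straight line through two (variance, energy) points to zero variance.

    var_sj, e_sj: variance and energy without backflow (Slater-Jastrow).
    var_bf, e_bf: variance and energy with backflow.
    """
    slope = (e_sj - e_bf) / (var_sj - var_bf)
    return e_bf - slope * var_bf


# Hypothetical values in arbitrary units.
e_extrapolated = zero_variance_extrapolation(0.4, -1.0, 0.2, -1.1)
fixed_node_error_estimate = -1.0 - e_extrapolated
```

With only two points the linearity cannot be tested, so the result is best read as an order-of-magnitude estimate of the fixed-node error rather than a corrected energy.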
Pseudopotential
Valence electrons play the most significant roles in determining a composite system's properties. The core electrons remain close to the nucleus and are largely inert. The separation of valence and core electron energy scales allows the use of a pseudopotential to describe the core-valence interaction without explicitly simulating the core electrons. However, there is often no clear boundary between core and valence electrons, and the core-valence interaction is more complicated than a simple potential can describe. Nonetheless, the computational demands of explicitly simulating the core electrons and the practical success of calculations with pseudopotentials in reproducing experimental values promote their continued use in QMC. Nearly all solid-state and many molecular QMC calculations to date rely on pseudopotentials to reduce the number of electrons and the time requirement of simulating the core-electron energy scales.
Comparing DMC energies using pseudopotentials constructed with different electronic structure methods (DFT and Hartree-Fock [HF]) provides an estimate of the error incurred by the pseudopotential approximation. Additionally, the difference between density functional pseudopotential and all-electron energies estimates the size of the error introduced by the pseudopotential and is used as a correction term.
Pseudopotential locality
DMC projects out the ground state of a trial wave function but does not produce a wave function, only a distribution of point-like configurations. However, the pseudopotential contains separate potentials (or channels) for different angular momenta of electrons. One channel, identified as local, does not require the wave function to evaluate, but the nonlocal channels require an angular integration to evaluate, and such an integration requires a wave function. Mitáš et al. [45] introduced use of the trial wave function to evaluate the nonlocal components requiring integration. This locality approximation has an error that varies in sign. While there are no good estimates of the magnitude of this error, Casula [46] developed a lattice-based technique that makes the total energy using a nonlocal potential an upper bound on the ground-state energy. Pozzo and Alfè [47] found that, in magnesium and magnesium hydride, the errors of the locality approximation and the lattice-regularized method are comparably small, but the lattice method requires a much smaller time step (0.05 Ha$^{-1}$ vs. 1.00 Ha$^{-1}$ in Mg and 0.01 Ha$^{-1}$ vs. 0.05 Ha$^{-1}$ in MgH$_2$) to achieve the same energy. Thus, they chose the nonlocal approximation.
While all-electron calculations would, in principle, make the pseudopotential and locality errors controllable, in practice, the increase in number of electrons, required variational parameters and variance of the local energy makes such calculations currently impractical for anything but small systems and light elements [48].
3 Review of previous DMC defect calculations
To date, there have been DMC calculations for defects in three materials: the vacancy in diamond, the Schottky defect in MgO and the self-interstitials in Si.
3.1 Diamond vacancy
Diamond's high electron and hole mobility and its tolerance to high temperatures and radiation make it a technologically important semiconductor material. Diffusion in diamond is dominated by vacancy diffusion [53], and the vacancy is also associated with radiation damage [54]. Table 1 shows the range of vacancy formation and migration energies calculated by LDA [50] and DMC [49]. DMC used structures from LDA relaxation and single-particle orbitals employing a Gaussian basis. An LDA pseudopotential removed the core electrons. The DMC calculations predict a lower formation energy than LDA. The DMC value for the migration energy is an upper bound on the actual number since the structures have not been relaxed in DMC. Furthermore, DMC estimates the experimentally observed dipole transition and provides an upper bound on the migration energy [49]. The GR1 optical transition is not a transition between one-electron states but between spin states $^1$E and $^1$T$_2$. DMC calculates a transition energy of 1.5(3) eV from $^1$E to $^1$T$_2$, close to the experimentally observed value of 1.673 eV. LDA cannot distinguish these states. For the cohesive energy, DMC predicts a value of 7.346(6) eV in excellent agreement with the experimental result of 7.371(5) eV, while LDA overbinds and yields 8.61 eV.

Table 1 DMC, $GW$ and DFT energies (in eV) for neutral defects in three materials. DMC and experimental values have an estimated uncertainty indicated by numbers in parentheses. For the diamond vacancy, DFT-LDA and DMC include a 0.36 eV Jahn-Teller relaxation energy. LDA relaxation produced the structures and transition path, so the DMC value for the migration energy is an upper bound on the true value. The Schottky energy in MgO is the energy to form a cation-anion vacancy pair. DFT-LDA produces a range from 6 to 7 eV depending on the representation of the orbitals and treatment of the core electrons. DMC using a plane-wave basis and pseudopotentials results in a value on the upper end of the experimental range. For Si interstitial defects, DFT values of the formation energy range from 2 eV below up to the DMC values, depending on the exchange-correlation functional (LDA, GGA [PBE] or hybrid [HSE]), and the $GW$ values lie within the two-standard-deviation confidence level of DMC.

Defect                   Energy type  DFT-LDA            DFT-GGA  DFT-Hybrid  GW    DMC               Exp.    Ref.
C diamond vacancy        Formation    6.98               7.51     -           -     5.96(34)          -       [49,50]
C diamond vacancy        Migration    2.83               -        -           -     4.40(36)          2.3(3)  [49,50]
MgO Schottky defect      Formation    5.97, 6.99, 6.684  -        -           -     7.50(53)          5-7     [51,52]
Si self-interstitial X   Formation    3.31               3.64     4.69        4.40  5.0(2), 4.94(5)   -       [26,24]
Si self-interstitial T   Formation    3.43               3.76     4.95        4.51  5.5(2), 5.13(5)   -       [26,24]
Si self-interstitial H   Formation    3.31               3.84     4.80        4.46  4.7(2), 5.05(5)   -       [26,24]
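
The parenthetical uncertainties quoted in this section, such as 1.5(3) eV, give the error in units of the last printed digit. The small helper below is a sketch of how such values can be parsed and compared with a reference; it assumes the reference value (here the experimental GR1 energy of 1.673 eV) has negligible uncertainty. The function names are hypothetical.

```python
def parse_value(text):
    """Parse concise notation: '1.5(3)' -> (1.5, 0.3), '5.96(34)' -> (5.96, 0.34)."""
    number, _, rest = text.partition("(")
    digits = rest.rstrip(")")
    decimals = len(number.split(".")[1]) if "." in number else 0
    return float(number), int(digits) / 10 ** decimals


def deviation_in_sigma(value, sigma, reference):
    """Distance between value and reference in units of the quoted uncertainty."""
    return abs(value - reference) / sigma


dmc_value, dmc_sigma = parse_value("1.5(3)")
gr1_deviation = deviation_in_sigma(dmc_value, dmc_sigma, 1.673)
```

For the GR1 transition the DMC value lies well within one quoted standard deviation of the experimental energy, consistent with the statement that the two are close.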
3.2 MgO Schottky defect
MgO is an important test material for understanding oxides. Its rock-salt crystal structure is simple, making it useful for computational study. Schottky defects are one of the main types of defects present after exposure to radiation, according to classical molecular dynamics simulations [55]. Table 1 shows that DMC predicts a Schottky defect formation energy in MgO at the upper end of the range of experimental values [51].
3.3 Si interstitial defects
Table 1 shows that DFT and DMC differ by up to 2 eV in their predictions of the formation energies of these defects [26,24]. We compare the DMC values with our results, including tests of the QMC approximations, in Section 4.
4 Results
We specifically test the time-step, pseudopotential and fixed-node approximations for the formation energies of three silicon self-interstitial defects, the split-<110> interstitial (X), the tetrahedral interstitial (T) and the hexagonal interstitial (H). The QMC calculations are performed using the CASINO [56] code. Density functional calculations in this work used the QUANTUM ESPRESSO [57] and WIEN2k [58] codes. The defect structures are identical to those of Batista et al. [24].
The orbitals of the trial wave function come from DFT calculations using the LDA exchange-correlation functional. The plane-wave basis set with a cutoff energy of 1,088 eV (60 Ha) converges the DFT total energies to 1 meV. A 7 × 7 × 7 Monkhorst-Pack $k$-point mesh centered at the L-point (0.5, 0.5, 0.5) converges the DFT total energy to 1 meV. A population of 1,280 walkers ensured that the error introduced by the population control is negligibly small. Due to the computational cost of backflow, we perform the simulations for a supercell of 16(+1) atoms and estimate the finite-size corrections using the structure factor method [41]. The final corrected DMC energies for the X, T and H defects are shown in the bottom line of Table 2.
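
For orientation, the formation energy of a self-interstitial in an N(+1)-atom supercell is commonly obtained from two total energies. The text does not state the expression used, so the sketch below assumes the standard convention E_f = E_defect(N+1) - (N+1)/N * E_bulk(N); the function name and the total energies are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def interstitial_formation_energy(e_defect, e_bulk, n_bulk_atoms):
    """Formation energy of a self-interstitial in an N(+1)-atom supercell.

    e_defect: total energy of the supercell with N + 1 atoms.
    e_bulk:   total energy of the perfect supercell with N atoms.
    """
    n = n_bulk_atoms
    return e_defect - (n + 1) / n * e_bulk


# Hypothetical total energies (eV) for a 16(+1)-atom cell.
e_bulk_16 = -1712.0      # 16 atoms at -107.0 eV per atom (made-up value)
e_defect_17 = -1814.5    # made-up defect-cell energy
formation_energy = interstitial_formation_energy(e_defect_17, e_bulk_16, 16)
```

Because the result is a small difference of two large total energies, the statistical and systematic errors of both calculations enter the formation energy directly.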
4.1 Time step
Figure 3 shows the total energies of bulk silicon and the X defect as a function of time step in DMC. A time step of 0.01 Ha$^{-1}$ reduces the time-step error to within the statistical uncertainty of the DMC total energy.
4.2 Pseudopotential
In our calculations, a Dirac-Fock (DF) pseudopotential represents the core electrons for each silicon atom [59,60,61]. To estimate the error introduced by the pseudopotential, we compare the defect formation energies in DFT using this pseudopotential with all-electron DFT calculations using the linearized augmented plane-wave method [58]. This comparison gives corrections of 0.083, −0.168 and 0.054 eV for the H, T and X defects, respectively.
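
As a bookkeeping illustration, the three corrections quoted above can be stored per defect and applied to an uncorrected formation energy. The sketch assumes the corrections are simply added to the uncorrected value, a sign convention the text does not spell out; the function name and the uncorrected energy of 5.00 eV are hypothetical.

```python
# Corrections in eV as quoted in the text for the H, T and X defects.
PSEUDOPOTENTIAL_CORRECTION_EV = {"H": 0.083, "T": -0.168, "X": 0.054}


def apply_pseudopotential_correction(uncorrected_ev, defect):
    """Add the DFT-derived pseudopotential correction (assumed additive)."""
    return uncorrected_ev + PSEUDOPOTENTIAL_CORRECTION_EV[defect]


# Hypothetical uncorrected formation energy of 5.00 eV for the T defect.
corrected_t = apply_pseudopotential_correction(5.00, "T")
```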