Accuracy of quantum Monte Carlo methods for point defects in solids

arXiv:1006.3254v1 [cond-mat.mtrl-sci] 16 Jun 2010

physica status solidi, 17 June 2010

William D. Parker (1), John W. Wilkins (1), Richard G. Hennig (*, 1, 2)

(1) Department of Physics, The Ohio State University, 191 W. Woodruff Ave., Columbus, Ohio 43210, USA
(2) Department of Materials Science and Engineering, Cornell University, Ithaca, New York 14853, USA

Received XXXX, revised XXXX, accepted XXXX
Published online XXXX

* Corresponding author: e-mail rhennig@cornell.edu, Phone: +01-607-2546429

Quantum Monte Carlo approaches such as the diffusion Monte Carlo (DMC) method are among the most accurate many-body methods for extended systems. Their scaling makes them well suited for defect calculations in solids. We review the various approximations needed for DMC calculations of solids and the results of previous DMC calculations for point defects in solids. Finally, we present estimates of how approximations affect the accuracy of calculations for self-interstitial formation energies in silicon and predict DMC values of 4.4(1), 5.1(1) and 4.7(1) eV for the X, T and H interstitial defects, respectively, in a 16(+1)-atom supercell.

Copyright line will be provided by the publisher

1 Introduction

Point defects, such as vacancies, interstitials and anti-site defects, are the only thermodynamically stable defects at finite temperatures [1]. The infinite slope of the entropy of mixing at infinitesimally small defect concentrations results in an infinite driving force for defect formation. As a result, at small defect concentrations, the entropy of mixing always overcomes the enthalpy of defect formation. In addition to being present in equilibrium, point defects often control the kinetics of materials, such as diffusion and phase transformations, and are important for materials processing.
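
The entropy argument above can be made concrete with a small sketch. It assumes ideal (non-interacting) mixing with one defect site per atom and ignores vibrational entropy and site multiplicity; the function names and the 1500 K temperature are illustrative choices, not taken from the paper. Only the 4.4 eV formation energy comes from the abstract.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def mixing_entropy_slope(c):
    """d(s/k_B)/dc for ideal mixing, s = -k_B [c ln c + (1 - c) ln(1 - c)].

    The slope ln((1 - c) / c) diverges as c -> 0, which is the
    'infinite driving force' for defect formation.
    """
    return math.log((1.0 - c) / c)

def equilibrium_concentration(e_form_ev, temperature_k):
    """Minimize f(c) = c * E_f - T * s(c), giving c_eq = 1 / (1 + exp(E_f / k_B T))."""
    return 1.0 / (1.0 + math.exp(e_form_ev / (K_B_EV * temperature_k)))

if __name__ == "__main__":
    # Illustrative temperature (an assumption), formation energy from the abstract.
    c_eq = equilibrium_concentration(4.4, 1500.0)
    print(f"equilibrium site fraction: {c_eq:.3e}")
```

At the equilibrium concentration the entropy slope times k_B T balances the formation energy, so even a formation energy of several eV yields a small but nonzero defect concentration at any finite temperature.
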
The presence of point defects in materials can fundamentally alter the electronic and mechanical properties of a material. This makes point defects technologically important for applications such as doping of semiconductors [2,3], solid solution hardening of alloys [4,5], controlling the transition temperature for shape-memory alloys [6] and the microstructural stabilization of two-phase superalloys.

However, the properties of defects, such as their structures and formation energies, are difficult to measure in some materials due to their small sizes, low concentrations, lack of suitable radioactive isotopes, etc. Quantum mechanical first-principles or ab initio theories make predictions to fill in the gaps left by experiment [7].

The most widely used method for the calculation of defect properties in solids is density functional theory (DFT). DFT replaces explicit many-body electron interactions with quasiparticles interacting via a mean-field potential, i.e. the exchange-correlation potential, which is a functional of the electron density [8]. A universally true exchange-correlation functional is unknown, and DFT calculations employ various approximate functionals, based either on a model system or on an empirical fit. The most commonly used functionals are based on DMC simulations [9] for the uniform electron gas at different densities, e.g., the local density approximation (LDA) [10,11], and gradient expansions, e.g., the generalized gradient approximation (GGA) [12,13,14,15,16]. These local and semi-local functionals suffer from a significant self-interaction error reflected in the variable accuracy of their predictions for defect formation energies, charge transition levels and band gaps [17,18]. Another class of functionals, called hybrid functionals, includes a fraction of exact exchange to improve accuracy [19,20].

The seemingly simple system of Si self-interstitials exemplifies the varied accuracy of different density functionals and many-body methods.
The diffusion and thermodynamics of silicon self-interstitial defects dominate the doping and subsequent annealing processes of crystalline silicon for electronics applications [3,21,22]. The mechanism of self-diffusion in silicon is still under debate. Open questions [23] include: (1) are the interstitial atoms the prime mediators of self-diffusion, (2) what is the specific mechanism by which the interstitials operate, and (3) what is the value of the interstitial formation energy? Quantum mechanical methods are well suited to determine defect formation energies. LDA, GGA and hybrid functionals predict formation energies for these defects ranging from about 2 to 4.5 eV [24]. Quasiparticle methods such as the $GW$ approximation reduce the self-interaction error in DFT and are expected to improve the accuracy of the interstitial formation energies. Recent $G_0W_0$ calculations [25] predict formation energies of about 4.5 eV, in close agreement with HSE hybrid functionals [24] and previous DMC calculations [26,24]. Quantum Monte Carlo methods provide an alternative to DFT and a benchmark for defect formation energies [27,28].

In this paper, we review the approximations that are made in diffusion Monte Carlo (DMC) calculations for solids and estimate how these approximations affect the accuracy of point defect calculations, using the Si self-interstitial defects as an example. Section 2 describes the quantum Monte Carlo method and its approximations. Section 3 reviews previous quantum Monte Carlo calculations for defects in solids, and Section 4 discusses the results of our calculations for interstitials in silicon and the accuracy of the various approximations.

2 Quantum Monte Carlo method

Quantum Monte Carlo (QMC) methods are among the most accurate electronic structure methods available and, in principle, have the potential to outperform current computational methods in both accuracy and cost for extended systems. QMC methods scale as $O(N^3)$ with system size and can handle large systems. At present, calculations for as many as 1,000 electrons on 1,000 processors make effective use of available computational resources [24]. Work is under way to develop algorithms that extend the system sizes accessible to QMC methods on petascale computers [29].

Continuum electronic structure calculations primarily use two QMC methods [27]: the simpler variational Monte Carlo (VMC) and the more sophisticated diffusion Monte Carlo (DMC). In VMC, a Monte Carlo method evaluates the many-dimensional integral to calculate quantum mechanical expectation values. Accuracy of the results depends crucially on the quality of the trial wave function, which is controlled by the functional form of the wave function and the optimization of the wave function's parameters [30]. DMC removes most of the error in the trial wave function by stochastically projecting out the ground state using an integral form of the imaginary-time Schrödinger equation.

One of the most accurate forms of trial wave function for quantum Monte Carlo applications to problems in electronic structure is a sum of Slater determinants of single-particle orbitals multiplied by a Jastrow factor and modified by a backflow transformation:

$$\Psi(\mathbf{r}_n) = e^{J(\mathbf{r}_n, \mathbf{R}_m)} \sum_i c_i \, \mathrm{CSF}_i(\mathbf{x}_n).$$

The Jastrow factor $J$ typically consists of a low-order polynomial and a plane-wave expansion in electron coordinates $\mathbf{r}_n$ and nuclear coordinates $\mathbf{R}_m$ that efficiently describe the dynamic correlations between electrons and nuclei. Static (near-degeneracy) correlations are described by a sum of Slater determinants.
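
As a toy illustration of the VMC idea described above, the following sketch samples $|\psi|^2$ for a hydrogen atom with the one-parameter trial function $\psi = e^{-\alpha r}$ (Hartree atomic units). This is a textbook stand-in, not the Slater-Jastrow-backflow wave function of this paper; the function names, step size and sample count are assumptions chosen for brevity.

```python
import numpy as np

def local_energy(r, alpha):
    """E_L = (H psi) / psi for psi = exp(-alpha * r), hydrogen atom, Hartree units."""
    return -0.5 * alpha**2 + (alpha - 1.0) / r

def vmc_energy(alpha, n_steps=20000, step=0.5, seed=0):
    """Metropolis sampling of |psi|^2; returns mean and variance of the local energy."""
    rng = np.random.default_rng(seed)
    pos = np.array([1.0, 0.0, 0.0])
    energies = np.empty(n_steps)
    for i in range(n_steps):
        trial = pos + step * rng.uniform(-1.0, 1.0, size=3)
        # Acceptance ratio |psi(trial)|^2 / |psi(pos)|^2 with |psi|^2 = exp(-2 alpha r)
        ratio = np.exp(-2.0 * alpha * (np.linalg.norm(trial) - np.linalg.norm(pos)))
        if rng.random() < ratio:
            pos = trial
        energies[i] = local_energy(np.linalg.norm(pos), alpha)
    return energies.mean(), energies.var()

if __name__ == "__main__":
    for alpha in (0.8, 1.0):
        mean, var = vmc_energy(alpha)
        print(f"alpha={alpha}: E={mean:.4f} Ha, variance={var:.4f}")
```

For $\alpha = 1$ the trial function is the exact ground state, so the local energy is constant at $-0.5$ Ha with zero variance. For any other $\alpha$ the energy lies above that value and the variance is nonzero, which is why the quality of the trial wave function controls both the systematic and the statistical error.
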
Symmetry-adapted linear combinations of Slater determinants, so-called configuration state functions (CSFs), reduce the number of determinant parameters $c_i$. For extended systems, the lack of size consistency for a finite sum of CSFs makes this form of trial wave function impractical, and a single determinant is used instead. Finally, the backflow transformation $\mathbf{r}_n \to \mathbf{x}_n$ allows the nodes of the trial wave function to be moved, which can efficiently reduce the fixed-node error [31]. Since the backflow-transformed coordinate $\mathbf{x}_n$ of an electron depends on the coordinates of all other electrons, the Sherman-Morrison formula used to efficiently update the Slater determinant does not apply, increasing the scaling of QMC to $O(N^4)$. If a finite cutoff for the backflow transformation is used, the Sherman-Morrison-Woodbury formula [32] applies and the scaling is reduced to $O(N^3)$.

Optimization of the many-body trial wave function is crucial because accurate trial wave functions reduce statistical and systematic errors in both VMC and DMC. Much effort has been spent on developing improved methods for optimizing many-body wave functions, and this continues to be the subject of ongoing research. Energy and variance minimization methods can effectively optimize the wave function parameters in VMC calculations [30,33]. Recently developed energy optimization methods enable the efficient optimization of CSF coefficients and orbital parameters in addition to the Jastrow parameters for small molecular systems, eliminating the dependence of the results on the input trial wave function [30].

VMC and DMC contain two categories of approximation to make the many-electron solution tractable: controlled approximations, whose errors can be made arbitrarily small through adjustable parameters, and uncontrolled approximations, whose errors are not exactly known.
The controlled approximations include the finite DMC time step, the finite number of many-electron configurations that represent the DMC wave function, the basis set approximation, e.g., spline or plane-wave representation, for the single-particle orbitals of the trial wave function and the finite-sized simulation cell. The uncontrolled approximations include the fixed-node approximation, which constrains the nodes of the wave function in DMC to be the same as those of the trial wave function, the replacement of the core electrons around each atom with a pseudopotential to represent the core-valence electronic interaction and the locality approximation, which uses the trial wave function to project the nonlocal angular momentum components of the pseudopotential.

2.1 Controlled approximations

Time step. Diffusion Monte Carlo is based on the transformation of the time-dependent Schrödinger equation into an imaginary-time diffusion equation with a source-sink term. The propagation of the $3N$-dimensional electron configurations (walkers) that sample the wave function requires a finite imaginary time step, which introduces an error in the resulting energy [34,35].

Controlling the time step error is simply a matter of performing calculations for a range of time steps, either to determine when the total energy or defect formation energy reaches the required accuracy or to extrapolate to zero time step using a low-order polynomial fit of the energy as a function of time step. Smaller time steps, however, require a larger total number of steps to sample the probability space sufficiently. Thus, the optimal time step should be small enough to add no significant error to the average while large enough to keep the total number of Monte Carlo steps manageable. In addition, the more accurate the trial wave function, the smaller the time-step error [35].
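
The zero-time-step extrapolation described above can be sketched with a weighted low-order polynomial fit. This is a minimal illustration using NumPy's `polyfit`; the function name and the numbers in the usage example are synthetic placeholders, not results from this paper.

```python
import numpy as np

def extrapolate_to_zero_time_step(time_steps, energies, errors, degree=1):
    """Weighted polynomial fit of E(tau); returns the fitted value at tau = 0.

    Weights are 1/sigma, as recommended for Gaussian uncertainties in np.polyfit.
    """
    weights = 1.0 / np.asarray(errors, dtype=float)
    coeffs = np.polyfit(time_steps, energies, deg=degree, w=weights)
    return float(np.polyval(coeffs, 0.0))

if __name__ == "__main__":
    # Synthetic data (made up for illustration): E(tau) = -3.0 + 0.5 * tau, in Ha.
    taus = np.array([0.01, 0.02, 0.05, 0.10])
    energies = -3.0 + 0.5 * taus
    errors = np.full_like(taus, 1e-3)
    print(extrapolate_to_zero_time_step(taus, energies, errors))
```

In practice the polynomial degree should be kept low and the fit checked against the statistical error bars, since a high-order fit to a handful of noisy points can extrapolate poorly.
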

Configuration population. In DMC, a finite number of electron configurations represent the many-body wave function. These configurations are the time-independent Schrödinger equation's analogues to particles in the diffusion equation and have also been called psips [34] and walkers [27]. To improve the efficiency of sampling the many-body wave function, the number of configurations is allowed to fluctuate from time step to time step in DMC using a branching step. However, the total number of configurations needs to be controlled to prevent the configuration population from diverging or vanishing [35]. This population control introduces a bias in the energy. In practice, where tested [36], a few hundred configurations are sufficient to reduce the population control bias in the DMC total energy below the statistical uncertainty.

The VMC and DMC calculations parallelize easily over walkers. After an initial decorrelation run, the propagation of a larger number of walkers is computationally equivalent to performing more time steps. The variance of the total energy scales like

$$\sigma_E^2 \propto \frac{\tau_\mathrm{corr}}{N_\mathrm{conf} N_\mathrm{step}},$$

where $N_\mathrm{conf}$ denotes the number of walkers, $N_\mathrm{step}$ the number of time steps and $\tau_\mathrm{corr}$ the autocorrelation time.

Basis set. A sum of basis functions with coefficients represents the single-particle orbitals in the Slater determinant. A DFT calculation usually determines these coefficients. Plane waves provide a convenient basis for calculations of extended systems since they form an orthogonal basis that systematically improves with increasing number of plane waves that span the simulation cell.
Increasing the number of plane waves until the total energy converges within an acceptable threshold in DFT creates a basis set that presumably has the same accuracy in QMC.

Since the plane-wave basis functions are extended throughout the simulation cell, the evaluation of an orbital at a given position requires a sum over all plane waves. Furthermore, the number of plane waves is proportional to the volume of the simulation cell. The computational cost of orbital evaluation can be reduced significantly by using a local basis, such as B-splines, which replaces the sum over plane waves with a sum over a small number of local basis functions. The resulting polynomial approximation reduces the computational cost of orbital evaluation at a single point from the number of plane waves (hundreds to thousands depending on the basis set) to the number of non-zero polynomials (64 for cubic splines) [37]. The wavelength of the highest-frequency plane wave sets the resolution of the splines. Thus, the most important quantity to control in the basis set approximation is the size of the basis set.

Simulation cell. Simulation cells with periodic boundary conditions are ideally suited to describe an infinite solid but result in undesirable finite-size errors that need correction. There are three types of finite-size errors. First, the single-particle finite-size error arises from the choice of a single $k$-point in the single-particle Bloch orbitals of the trial wave function. Second, the many-body finite-size error arises from the non-physical self-image interactions between electrons in neighboring cells.
Third, the defect creates a strain field that results in an additional finite-size error for small simulation cells.

The single-particle finite-size error is greatly reduced by averaging DMC calculations for single-particle orbitals at different $k$-points that sample the first Brillouin zone of the simulation cell, so-called twist averaging [38]. Alternatively, the single-particle finite-size error can also be estimated from the DFT energy difference between a calculation with a dense $k$-point mesh and one with the same single $k$-point chosen for the orbitals of the QMC wave function.

For the many-body finite-size error, several methods aim to correct the fictitious periodic correlations between electrons in different simulation cells. The first approach, the model periodic Coulomb (MPC) interaction [39], revises the Ewald method [40] to account for the periodicity of the electrons by restoring the Coulomb interaction within the simulation cell and using the Ewald interaction to evaluate the Hartree energy. The second approach is based on the random phase approximation for long wavelengths. The resulting first-order finite-size correction term for both the kinetic and potential energies can be estimated from the electronic structure factor [41]. The third approach estimates the many-body finite-size error from the energy difference between DFT calculations using a finite-sized and an infinite-sized model exchange-correlation functional [42]. This approach relies on the exchange-correlation functional being a reasonable description of the system, whereas the other two approaches (MPC and structure factor) do not have this restriction. The MPC and structure factor corrections are fundamentally related and often result in similar energy corrections [43].

The defect strain finite-size error can be estimated at the DFT level using extrapolations of large simulation cells.
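
The two single-particle corrections described above reduce to simple arithmetic, sketched below. The sketch assumes equal weights for all twists and statistically independent DMC runs at each twist; the function names and the numbers in the usage example are hypothetical.

```python
import math

def twist_average(energies, errors):
    """Equal-weight mean over twists (k-points) and its standard error.

    Assumes independent runs, so the errors add in quadrature.
    """
    n = len(energies)
    mean = sum(energies) / n
    err = math.sqrt(sum(e * e for e in errors)) / n
    return mean, err

def single_particle_correction(e_dft_dense_mesh, e_dft_single_k):
    """DFT estimate of the single-particle finite-size error.

    Added to a QMC energy computed with orbitals at the same single k-point.
    """
    return e_dft_dense_mesh - e_dft_single_k

if __name__ == "__main__":
    # Hypothetical energies in eV, for illustration only.
    print(twist_average([1.0, 3.0], [0.3, 0.4]))
    print(single_particle_correction(-10.0, -10.2))
```

If the twists come from a symmetry-reduced mesh, the equal-weight mean would need to be replaced by a mean weighted with the $k$-point multiplicities.
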
Also, since QMC force calculations are expensive and still under development [44], QMC calculations for extended systems typically start with DFT-relaxed structures. Energy changes due to small errors in the ionic positions as well as thermal disorder are expected to be quite small because of the quadratic nature of the minima and will largely cancel when taking energy differences for the defect energies.

2.2 Uncontrolled approximations

Fixed-node approximation. The Monte Carlo algorithm requires a probability distribution, which is non-negative everywhere, but fermions, such as electrons, are antisymmetric under exchange, and therefore any wave function of two or more fermions has regions of positive and negative value. For quantum Monte Carlo to take the wave function as the probability distribution, Anderson [34] fixed the zeros or nodes of the wave function and took the absolute value of the wave function as the probability distribution. If the trial wave function has the nodes of the ground state, then DMC projects out the ground state. However, if the nodes differ from those of the ground state, then DMC finds the closest ground state of the system within the inexact nodal surface imposed by the fixed-node condition. This inexact solution has an energy higher than that of the ground state.

Three methods estimate the size of the fixed-node error: (1) In the Slater-Jastrow form of the wave function, the single-particle orbitals in the Slater determinant set the zeros of the trial wave function. Since these orbitals come from DFT calculations, varying the exchange-correlation functional in DFT changes the trial wave function nodes and provides an estimate of the size of the fixed-node error. (2) López Ríos et al. [31] applied backflow to the nodes by modifying the interparticle distances, enhancing electron-electron repulsion and electron-nucleus attraction.
The expense of the method has thus far limited its application in the literature to studies of second- and third-row atoms, the water dimer and the 1D and 2D electron gases. (3) Because the eigenfunction of the Hamiltonian has zero variance in DMC, a linear extrapolation from the variances of calculations with and without backflow to zero variance estimates the energy of the exact ground state of the Hamiltonian.

Pseudopotential. Valence electrons play the most significant roles in determining a composite system's properties. The core electrons remain close to the nucleus and are largely inert. The separation of valence and core electron energy scales allows the use of a pseudopotential to describe the core-valence interaction without explicitly simulating the core electrons. However, there is often no clear boundary between core and valence electrons, and the core-valence interaction is more complicated than a simple potential can describe. Nonetheless, the computational demands of explicitly simulating the core electrons and the practical success of calculations with pseudopotentials in reproducing experimental values promote their continued use in QMC. Nearly all solid-state and many molecular QMC calculations to date rely on pseudopotentials to reduce the number of electrons and the time requirement of simulating the core-electron energy scales.

Comparing DMC energies using pseudopotentials constructed with different energy methods (DFT and Hartree-Fock [HF]) provides an estimate of the error incurred by the pseudopotential approximation. Additionally, the difference between density functional pseudopotential and all-electron energies estimates the size of the error introduced by the pseudopotential and is used as a correction term.

Pseudopotential locality. DMC projects out the ground state of a trial wave function but does not produce a wave function, only a distribution of point-like configurations.
However, the pseudopotential contains separate potentials (or channels) for different angular momenta of electrons. One channel, identified as local, does not require the wave function to evaluate, but the nonlocal channels require an angular integration to evaluate, and such an integration requires a wave function. Mitáš et al. [45] introduced use of the trial wave function to evaluate the nonlocal components requiring integration. This locality approximation has an error that varies in sign. While there are no good estimates of the magnitude of this error, Casula [46] developed a lattice-based technique that makes the total energy using a nonlocal potential an upper bound on the ground-state energy. Pozzo and Alfè [47] found that, in magnesium and magnesium hydride, the errors of the locality approximation and the lattice-regularized method are comparably small, but the lattice method requires a much smaller time step (0.05 Ha$^{-1}$ vs. 1.00 Ha$^{-1}$ in Mg and 0.01 Ha$^{-1}$ vs. 0.05 Ha$^{-1}$ in MgH$_2$) to achieve the same energy. Thus, they chose the locality approximation.

While all-electron calculations would, in principle, make the pseudopotential and locality errors controllable, in practice, the increase in the number of electrons, required variational parameters and variance of the local energy makes such calculations currently impractical for anything but small systems and light elements [48].

3 Review of previous DMC defect calculations

To date, there have been DMC calculations for defects in three materials: the vacancy in diamond, the Schottky defect in MgO and the self-interstitials in Si.

3.1 Diamond vacancy

Diamond's high electron and hole mobility and its tolerance to high temperatures and radiation make it a technologically important semiconductor material. Diffusion in diamond is dominated by vacancy diffusion [53], and the vacancy is also associated with radiation damage [54].

Table 1  DMC, $GW$ and DFT energies (in eV) for neutral defects in three materials. DMC and experimental values have an estimated uncertainty indicated by numbers in parentheses. For the diamond vacancy, DFT-LDA and DMC include a 0.36 eV Jahn-Teller relaxation energy. LDA relaxation produced the structures and transition path, so the DMC value for the migration energy is an upper bound on the true value. The Schottky energy in MgO is the energy to form a cation-anion vacancy pair. DFT-LDA produces a range from 6 to 7 eV depending on the representation of the orbitals and treatment of the core electrons. DMC using a plane-wave basis and pseudopotentials results in a value on the upper end of the experimental range. For Si interstitial defects, DFT values of the formation energy range from 2 eV below up to the DMC values, depending on the exchange-correlation functional (LDA, GGA [PBE] or hybrid [HSE]), and the $GW$ values lie within the two-standard-deviation confidence level of DMC.

Material / defect         Energy type   DFT-LDA             DFT-GGA   DFT-Hybrid   GW     DMC               Exp.     Ref.
C diamond vacancy         Formation     6.98                7.51      -            -      5.96(34)          -        [49,50]
C diamond vacancy         Migration     2.83                -         -            -      4.40(36)          2.3(3)
MgO Schottky defect       Formation     5.97, 6.99, 6.684   -         -            -      7.50(53)          5 - 7    [51,52]
Si self-interstitial X    Formation     3.31                3.64      4.69         4.40   5.0(2), 4.94(5)   -        [26,24]
Si self-interstitial T    Formation     3.43                3.76      4.95         4.51   5.5(2), 5.13(5)   -
Si self-interstitial H    Formation     3.31                3.84      4.80         4.46   4.7(2), 5.05(5)   -

Table 1 shows the range of vacancy formation and migration energies calculated by LDA [50] and DMC [49]. DMC used structures from LDA relaxation and single-particle orbitals employing a Gaussian basis. LDA pseudopotentials removed the core electrons. The DMC calculations predict a lower formation energy than LDA. The DMC value for the migration energy is an upper bound on the actual number since the structures have not been relaxed in DMC.
Furthermore, DMC estimates the experimentally observed dipole transition and provides an upper bound on the migration energy [49]. The GR1 optical transition is not a transition between one-electron states but between spin states $^1E$ and $^1T_2$. DMC calculates a transition energy of 1.5(3) eV from $^1E$ to $^1T_2$, close to the experimentally observed value of 1.673 eV. LDA cannot distinguish these states. For the cohesive energy, DMC predicts a value of 7.346(6) eV, in excellent agreement with the experimental result of 7.371(5) eV, while LDA overbinds and yields 8.61 eV.

3.2 MgO Schottky defect

MgO is an important test material for understanding oxides. Its rock-salt crystal structure is simple, making it useful for computational study. Schottky defects are one of the main types of defects present after exposure to radiation, according to classical molecular dynamics simulations [55]. Table 1 shows that DMC predicts a Schottky defect formation energy in MgO at the upper end of the range of experimental values [51].

3.3 Si interstitial defects

Table 1 shows that DFT and DMC differ by up to 2 eV in their predictions of the formation energies of these defects [26,24]. We compare the DMC values with our results, including tests on the QMC approximations, in Section 4.

4 Results

We specifically test the time-step, pseudopotential and fixed-node approximations for the formation energies of three silicon self-interstitial defects: the split-⟨110⟩ interstitial (X), the tetrahedral interstitial (T) and the hexagonal interstitial (H). The QMC calculations are performed using the CASINO [56] code. Density functional calculations in this work used the Quantum ESPRESSO [57] and WIEN2k [58] codes. The defect structures are identical to those of Batista et al. [24]. The orbitals of the trial wave function come from DFT calculations using the LDA exchange-correlation functional.
The plane-wave basis set with a cutoff energy of 1,088 eV (60 Ha) converges the DFT total energies to 1 meV. A 7 × 7 × 7 Monkhorst-Pack $k$-point mesh centered at the L-point (0.5, 0.5, 0.5) converges the DFT total energy to 1 meV. A population of 1,280 walkers ensured that the error introduced by the population control is negligibly small. Due to the computational cost of backflow, we perform the simulations for a supercell of 16(+1) atoms and estimate the finite-size corrections using the structure factor method [41]. The final corrected DMC energies for the X, T and H defects are shown in the bottom line of Table 2.

4.1 Time step

Figure 3 shows the total energies of bulk silicon and the X defect as a function of time step in DMC. A time step of 0.01 Ha$^{-1}$ reduces the time step error to within the statistical uncertainty of the DMC total energy.

4.2 Pseudopotential

In our calculations, a Dirac-Fock (DF) pseudopotential represents the core electrons for each silicon atom [59,60,61]. To estimate the error introduced by the pseudopotential, we compare the defect formation energies in DFT using this pseudopotential with all-electron DFT calculations using the linearized augmented plane-wave method [58]. This comparison gives corrections of 0.083, −0.168 and 0.054 eV for the H, T and X defects, respectively.
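
As a rough sketch of how such corrections enter a formation energy, the code below uses the standard supercell definition for a self-interstitial, $E_f = E_\mathrm{defect}(N+1) - \frac{N+1}{N} E_\mathrm{bulk}(N)$, which is assumed here rather than quoted from the text. The pseudopotential corrections are the values given above, and adding them to the formation energy is an assumed sign convention. The function names and the total energies in the usage example are hypothetical placeholders.

```python
# Pseudopotential corrections in eV, as quoted in Section 4.2.
PSEUDOPOTENTIAL_CORRECTION_EV = {"H": 0.083, "T": -0.168, "X": 0.054}

def interstitial_formation_energy(e_defect, e_bulk, n_bulk_atoms):
    """E_f = E_defect(N + 1) - (N + 1) / N * E_bulk(N) for a self-interstitial."""
    return e_defect - (n_bulk_atoms + 1) / n_bulk_atoms * e_bulk

def corrected_formation_energy(e_form, defect, finite_size_correction=0.0):
    """Add the pseudopotential correction and an optional finite-size correction (eV)."""
    return e_form + PSEUDOPOTENTIAL_CORRECTION_EV[defect] + finite_size_correction

if __name__ == "__main__":
    # Hypothetical total energies in eV for a 16(+1)-atom cell, illustration only.
    e_form = interstitial_formation_energy(e_defect=-1695.0, e_bulk=-1600.0, n_bulk_atoms=16)
    print(corrected_formation_energy(e_form, "T"))
```

Because the bulk energy is scaled by $(N+1)/N$, any error per atom in the bulk total energy is amplified in the formation energy, which is why the time-step and finite-size errors must be controlled consistently in both the bulk and the defect cells.
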