Basis Sets Patrick Briddon
Contents

- What is a basis set? Why do we need them?
- Gaussian basis sets
  - Uncontracted
  - Contracted
- Accuracy: a case study
- Some concluding thoughts

What is a basis set?

Solutions to the Schrödinger equation (in atomic units)

$$-\frac{1}{2}\frac{d^2\psi}{dx^2} + V\psi = E\psi$$

are continuous functions, ψ(x). These are not well suited to a modern computer, which is discrete.

Why a basis set?

Idea: write the solution in terms of a series of functions:

$$\psi(x) = \sum_i c_i \phi_i(x)$$

The function ψ is then "stored" as a number of coefficients: $c_1, c_2, c_3, \ldots$

A few questions …

- What shall I choose for the functions?
- How many of them do I need?
- How do I work out what the correct coefficients are?

Choosing basis functions

Try to imagine what the true wavefunction will be like.

[Figure: the potential V and the wavefunction ψ]

Choosing basis functions

[Figure: the wavefunction ψ and the basis states]

The coefficients

These are determined by using the variational principle of quantum mechanics. If we have a trial wave-function

$$\psi(x) = \sum_i c_i \phi_i(x)$$

we choose the coefficients to minimise the energy.

How many basis functions?

- The more the better (i.e. the more accurate). The energy is always greater than the true energy, but approaches it from above.
- The more you use, the slower the calculation! In fact the time depends on the number cubed.
- The better they are, the fewer you need.

Basis sets and LCAO/MO

There is a close relationship between chemistry ideas and basis sets. Think about the H2 molecule (up to normalisation):

$$\psi = \phi_{1s}(\mathrm{H}_1) + \phi_{1s}(\mathrm{H}_2), \qquad \psi^{*} = \phi_{1s}(\mathrm{H}_1) - \phi_{1s}(\mathrm{H}_2)$$

Basis sets and LCAO

- Physicists call this LCAO ("linear combination of atomic orbitals"). The basis functions are the atomic orbitals.
- Chemists call this "molecular orbital theory".
- There is a big difference though: in LCAO/MO the number of basis functions is equal to the number of MOs. There is no "variational freedom".

What about our basis functions?

Atomic orbitals are fine, but they are:

- Not well defined: you can't push a button on a calculator and get one!
- Cumbersome to use on a computer.

AIMPRO uses Gaussian orbitals. It is called a "Gaussian Orbital" code.
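The variational idea from the slides "The coefficients" and "How many basis functions?" can be tried out on a toy problem. The sketch below is not taken from the slides or from AIMPRO. It assumes a 1D harmonic oscillator, H = -1/2 d²/dx² + 1/2 x² in atomic units, whose exact ground-state energy is 0.5, and a basis of Gaussians exp(-a x²) centred at the origin. The function names are hypothetical. For a two-function basis, minimising the energy over the coefficients is equivalent to taking the lowest root of det(H - E S) = 0, which is a quadratic in E.

```python
import math

def matrix_elements(a, b):
    """Overlap S and Hamiltonian H between exp(-a x^2) and exp(-b x^2)
    for the assumed test problem H = -1/2 d^2/dx^2 + 1/2 x^2."""
    p = a + b
    root_pi = math.sqrt(math.pi)
    overlap = math.sqrt(math.pi / p)
    kinetic = a * b * root_pi / p ** 1.5      # (1/2) * integral of phi_a' phi_b'
    potential = 0.25 * root_pi / p ** 1.5     # (1/2) * integral of x^2 phi_a phi_b
    return overlap, kinetic + potential

def energy_one(a):
    """Energy of a single Gaussian: no variational freedom left."""
    s, h = matrix_elements(a, a)
    return h / s

def energy_two(a, b):
    """Lowest energy obtainable from c1*exp(-a x^2) + c2*exp(-b x^2), a != b.
    Solves det(H - E S) = 0, i.e. qa*E^2 + qb*E + qc = 0."""
    s11, h11 = matrix_elements(a, a)
    s22, h22 = matrix_elements(b, b)
    s12, h12 = matrix_elements(a, b)
    qa = s11 * s22 - s12 ** 2                 # positive when a != b
    qb = -(h11 * s22 + h22 * s11 - 2.0 * h12 * s12)
    qc = h11 * h22 - h12 ** 2
    disc = qb * qb - 4.0 * qa * qc
    return (-qb - math.sqrt(disc)) / (2.0 * qa)

if __name__ == "__main__":
    print("one Gaussian, exponent 0.3:", energy_one(0.3))
    print("one Gaussian, exponent 1.0:", energy_one(1.0))
    print("both Gaussians together   :", energy_two(0.3, 1.0))
    print("exact ground state        : 0.5")
```

With two functions the energy drops below either single-function result but stays above the exact value, which is the "approaches it from above" behaviour described on the slide.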
Gaussian Orbitals

The idea:

$$\psi(\mathbf{r}) = \sum_i c_i \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$

There are thus three ingredients:

- An "exponent", α, which controls the width of the Gaussian.
- A "centre", R, which controls the location.
- A coefficient, c, which is varied to minimise the energy.

The Exponents

- Typically vary between 0.1 and 10. Si: 0.12 up to 4; F: 0.25 up to 10.
- These are harder to find than coefficients.
- Small or large exponents are dangerous.
- Fixed in a typical AIMPRO run: determined for an atom or a reference solid, i.e. vary the exponents to get the lowest energy for bulk Si, put them into "hgh-pots", then keep them fixed when we look at other defect systems.

The Positions/Coefficients

- Positions: we put functions on all atoms.
- In the past we put them on bond centres too. Abandoned: what if a bond disappears during a run?
- You cannot put two identical functions on the same atom; the functions must all be different. That is why small exponents are dangerous.
- Coefficients: AIMPRO does that for you!

How good are Gaussians?

Problems near the nucleus? The true all-electron (AE) wave function has a cusp … but the pseudo wave function does not!

How good are Gaussians?

Problems at large distance?

- The true wave function decays exponentially: $\exp[-br]$.
- Our function will decay more quickly: $\exp[-br^2]$.
- Not ideal, but not usually important for chemical bonding.
- Could be important for VdW forces, but DFT doesn't get them right anyway.
- Only ever likely to be an issue for surfaces or molecules (our solution: ghost orbitals).

AIMPRO basis set

We do not only use s-orbitals of course. Modify Gaussians to form Cartesian Gaussian functions:

$$\phi_{p_x} = (x - R_{ix}) \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$
$$\phi_{p_y} = (y - R_{iy}) \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$
$$\phi_{p_z} = (z - R_{iz}) \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$

Alongside the s orbital, that gives 4 independent functions for each exponent.

What about d's?

We continue, multiplying by 2 pre-factors:

$$\phi_{x^2} = (x - R_{ix})^2 \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$
$$\phi_{y^2} = (y - R_{iy})^2 \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$
$$\phi_{z^2} = (z - R_{iz})^2 \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$
$$\phi_{xy} = (x - R_{ix})(y - R_{iy}) \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$
$$\phi_{xz} = (x - R_{ix})(z - R_{iz}) \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$
$$\phi_{yz} = (y - R_{iy})(z - R_{iz}) \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$

What about d's?

This introduces 6 further functions, i.e.
giving 10 including the s and p's.

- Of these 6 functions, 5 are the d-orbitals.
- One is an additional s-type orbital:

$$\phi_{x^2+y^2+z^2} = \left[(x - R_{ix})^2 + (y - R_{iy})^2 + (z - R_{iz})^2\right] \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right] = |\mathbf{r} - \mathbf{R}_i|^2 \exp\left[-\alpha_i |\mathbf{r} - \mathbf{R}_i|^2\right]$$

ddpp and all that

We often label basis sets as "ddpp". What does this mean?

- 4 letters means 4 different exponents.
- The first (smallest) has s/p/d functions (10).
- The next also has s/p/d functions (10).
- The last two (largest exponents) have s/p (4 each).
- Total of 28 functions.

Can we do better?

- Add more d-functions: "dddd" with 40 functions per atom. This can be important if states high in the conduction band are needed (EELS). Clearly crucial for elements like Fe!
- Add more exponents: ddppp, Pddppp.
- Put functions in extra places (bond centres). Not recommended.

How good is the energy?

- We can get the energy of an atom to 1 meV when the basis is fitted.
- BUT: larger errors are encountered when transferring that basis set to a defect.
- The energy is not well converged, but energy differences can be converged.
- So: ONLY SUBTRACT ENERGIES CALCULATED WITH THE SAME BASIS SET!

Other properties

- Structure converges fastest with basis set.
- Energy differences converge next fastest.
- The conduction band converges more slowly.
- Vibrational frequencies also require care.

It is important to be sure that the basis set you are using is good enough for the property that you are calculating!

Contracted basis sets

A way to reduce the number of functions whilst maintaining accuracy. Combine all four s-functions together to create a single combination:

$$\phi_s = 0.1\,e^{-0.1 r^2} + 0.2\,e^{-0.5 r^2} + 0.7\,e^{-1.4 r^2} + 0.3\,e^{-3.5 r^2}$$

The 0.1, 0.2, etc. are chosen to do the best for bulk Si. They are then frozen, i.e. kept the same for large runs. Do the same for the p-orbitals. This gives 4 contracted orbitals.

The C4G basis

- These 4 orbitals provide a very small basis set.
- How much faster than ddpp? Answer: $(28/4)^3$ or 343 times!
- Sadly: not good enough! You will probably never hear this spoken of!
- Chemistry equivalent: "STO-3G". Also regarded as rubbish!
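The counting rules from the "ddpp and all that" slide and the cubic cost estimate can be written down in a few lines. This is only an illustrative sketch. The helper names are hypothetical, and the cubic scaling is the rule of thumb quoted in the slides rather than a measured timing.

```python
# Functions contributed by one exponent, as counted in the slides:
# "p" means s + p (1 + 3 = 4); "d" means s + p + d in Cartesian form (4 + 6 = 10).
FUNCTIONS_PER_LETTER = {"p": 4, "d": 10}

def count_functions(label):
    """Number of basis functions per atom for a label such as 'ddpp'."""
    return sum(FUNCTIONS_PER_LETTER[letter] for letter in label)

def cubic_speedup(n_large, n_small):
    """Estimated speed-up from shrinking the basis, assuming time ~ N^3."""
    return (n_large / n_small) ** 3

if __name__ == "__main__":
    for label in ("ddpp", "dddd"):
        print(label, count_functions(label))
    # The contracted C4G basis has 4 functions per atom.
    print("ddpp -> C4G speed-up:", cubic_speedup(count_functions("ddpp"), 4))
```

The same two helpers reproduce the 28 and 40 function counts and the factor of 343 quoted for the 4-function contracted basis.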
The C44G basis

Next step up: choose two different s/p combinations:

$$s_1 = 0.1\,e^{-0.1 r^2} + 0.2\,e^{-0.5 r^2} + 0.7\,e^{-1.4 r^2} + 0.3\,e^{-3.5 r^2}$$
$$s_2 = 0.4\,e^{-0.1 r^2} + 0.2\,e^{-0.5 r^2} + 0.5\,e^{-1.4 r^2} + 0.4\,e^{-3.5 r^2}$$

- We will now have 8 functions per atom.
- $(8/4)^3$ or 8 times slower than C4G!
- $(28/8)^3$ or 43 times faster than ddpp.
- Sadly: still not good enough!

The C44G* basis

- Main shortcoming: change of shape of the s/p functions when the solid is formed.
- Need d-type functions. Add 5 of these.
- Gives 13 functions.
- What we call C44G* (again "PRB speak").
- Similar to chemists' 6-31G*.

The C44G* basis

- 13 functions: still $(28/13)^3 \approx 10$ times faster than ddpp.
- Diamond generally very good.
- Si: conduction band not converged; various approaches (Jon's article on Wiki).
- Chemists use 6-31G* for much routine work.

Results for Si (JPG)

Basis    Num   Etot/at (Ha)   Erel/at (eV)   a0 (au)   B0 (GPa)   Eg (eV)   Time (s), 216   Time (s), 512
Expt                                         10.263    97.9       1.17
dddd     40    -3.96667       0.000          10.175    95.7       0.47      25339
ddpp     28    -3.96431       0.064          10.195    96.9       0.52      8348            27173
C44G*    13    -3.96350       0.086          10.192    98.5       0.74      1149            4085
Si-C4G   4     -3.94271       0.652          10.390    92.1       2.28      107             411

The way forwards?

- 13 functions is still $(28/13)^3$ times faster than ddpp; 4 functions was $(28/4)^3$ times faster.
- Idea at Nantes: form combinations not just of functions on one atom.
- Be very careful how you do this.
- Accuracy can be "as good as" ddpp.

Plane Waves

Another common basis set is the set of plane waves: recall the nearly free electron model. We can form simple ideas about the band structure of solids by considering free electrons. Plane waves are the equivalent of "atomic orbitals" for free electrons.

$$\psi(\mathbf{r}) = \sum_{\mathbf{G}} c_{\mathbf{G}}\, e^{i\mathbf{G}\cdot\mathbf{r}}$$

Gaussians vs Plane Waves

- The number of Gaussians is very small. Gaussians: 20/atom. Plane waves: 1000/atom.
- Well written Gaussian codes are therefore faster.
- Plane waves are systematic: no assumption as to the true wave function.
- Assumptions are dangerous (they can be wrong!)
… but they enable more work if they are faster.

Gaussians vs Plane Waves

- Plane waves can be increased until the energy converges. In reality this is not possible for large systems.
- The number of Gaussians cannot be increased indefinitely.
- Gaussians are good when we have a single "difficult atom":
  - Carbon needs a lot of plane waves → SLOW!
  - 1 C atom in a 512 atom Si cell is as slow as diamond.
  - True for 2p elements (C, N, O, F) and 3d metals.
  - Gaussian codes are much faster for these.

In conclusion

- Basis set is fundamental to what we do.
- A quick look at the mysterious "hgh-pots".
- Uncontracted and contracted Gaussian bases.
- Rate of convergence depends on property.
- A good publication will demonstrate that results are converged with respect to basis.
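As a small illustration of checking convergence with respect to basis, the sketch below recomputes the relative energies per atom from the total energies per atom listed in the "Results for Si" table, taking dddd as the reference. The energies are copied from that table. The function name and the Hartree-to-eV conversion factor (27.211386 eV per Ha) are additions made here, not part of the slides.

```python
HARTREE_TO_EV = 27.211386  # conversion factor, assumed here; not given in the slides

# Total energy per atom (Ha) for bulk Si, copied from the results table.
ETOT_PER_ATOM = {
    "dddd": -3.96667,
    "ddpp": -3.96431,
    "C44G*": -3.96350,
    "Si-C4G": -3.94271,
}

def relative_energies_ev(etot, reference):
    """Energy per atom of each basis relative to the reference basis, in eV."""
    e_ref = etot[reference]
    return {name: (e - e_ref) * HARTREE_TO_EV for name, e in etot.items()}

if __name__ == "__main__":
    for name, e_rel in relative_energies_ev(ETOT_PER_ATOM, "dddd").items():
        print(f"{name:8s} {e_rel:6.3f} eV/atom above dddd")
```

The values agree with the Erel/at column of the table to within rounding, and show how the absolute energy rises as the basis shrinks from 40 to 4 functions per atom.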