All-electron full-potential DFT: The Jülich FLEUR family of codes

Member of the Helmholtz Association
Daniel Wortmann
Quantum Theory of Materials
Institute for Advanced Simulation
The Team (IAS-1/PGI-1)
Kohn-Sham Density Functional Theory
Standard model in many fields:
Physics, chemistry, materials science, biology …
Basic idea:
$$\left(\sum_i h_i + \sum_{i,j} U_{ij}\right)\Psi(\vec r_1,\dots,\vec r_N) = E\,\Psi(\vec r_1,\dots,\vec r_N)$$
Mapping of the many-electron system onto a system of non-interacting electrons described by an effective single-particle Hamiltonian:
$$h\,\psi_i(\vec r) = \epsilon_i\,\psi_i(\vec r)$$
Self-consistency problem
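Loop: n_init → Potential generation → Diagonalization → Fermi level → Construction of charge → Mixing of charge → back to potential generation
Potential generation: $h = -\frac{1}{2}\nabla^2 + V_0 + V_C[n] + V_{EX}[n]$
Diagonalization: $h\,\psi_i(r) = \epsilon_i\,\psi_i(r)$
Fermi level: $N = \sum_{\epsilon_i < \epsilon_F} 1$
Construction of charge: $n(r) = \sum_{\epsilon_i < \epsilon_F} |\psi_i(r)|^2$
Mixing of charge: $n = F[n_\mathrm{old}, n_\mathrm{new}]$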
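The self-consistency cycle can be illustrated on a toy problem. The sketch below is not FLEUR code: it assumes a hypothetical two-site model with one electron, a density-dependent on-site potential U*n_i, and plain linear mixing as the mixing functional F. All names and parameter values (scf, lowest_orbital, e1, e2, t, U, alpha) are illustrative assumptions. With a single electron, the Fermi-level step reduces to occupying the lowest orbital.

```python
import math

def lowest_orbital(a, d, b):
    """Lowest eigenpair of the real symmetric 2x2 matrix [[a, b], [b, d]] (b != 0)."""
    eps = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    v1, v2 = b, eps - a              # solves (a - eps) * v1 + b * v2 = 0
    norm = math.hypot(v1, v2)
    return eps, (v1 / norm, v2 / norm)

def scf(e1=0.0, e2=1.0, t=0.5, U=2.0, alpha=0.3, tol=1e-10, max_iter=500):
    """Toy self-consistency loop with linear mixing n = (1 - alpha) n_old + alpha n_new."""
    n = [0.5, 0.5]                                   # n_init
    for iteration in range(1, max_iter + 1):
        # potential generation: on-site energies depend on the current density
        a = e1 + U * n[0]
        d = e2 + U * n[1]
        # diagonalization; one electron, so only the lowest level is occupied
        eps, psi = lowest_orbital(a, d, -t)
        # construction of charge: n_i = |psi_i|^2
        n_new = [psi[0] ** 2, psi[1] ** 2]
        residual = max(abs(x - y) for x, y in zip(n_new, n))
        if residual < tol:
            return n_new, eps, iteration
        # mixing of charge
        n = [(1.0 - alpha) * old + alpha * new for old, new in zip(n, n_new)]
    raise RuntimeError("SCF did not converge")

density, eigenvalue, iterations = scf()
print(density, eigenvalue, iterations)
```

For these parameters the undamped iteration (alpha = 1) is expected to oscillate rather than converge, which is the usual motivation for a mixing step.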
Simulations
Different Algorithms
Variety of Codes
Parallelization
Analysis of Data
Density Functional Theory
$$\left(-\frac{1}{2}\nabla^2 + V_\mathrm{eff}\right)\psi = \epsilon\,\psi$$
Supercomputers
Clusters
Workstations
Inhomogeneous machines
Bandgaps
Strong correlations
Different energy scales
Structural relaxations
Electronic structure
Magnetic properties
Phonons & Magnons
Transport
Typical Applications
10x10x10
Atomic structure
Electronic structure
Magnetic structure
Zoo of methods
Basis sets:
•  Plane waves
•  Numerical/analytical localized basis sets
Local density approximation (LDA), GGA
LDA+U
Hybrid Functionals
GW-Approximation
Real space grids
Green functions
$$\left(-\frac{1}{2}\nabla^2 + V_\mathrm{eff}\right)\psi = \epsilon\,\psi$$
Finite difference approx.
Non-relativistic equation
Scalar-relativistic approx.
Spin-orbit coupling
Dirac equation
All-electron
Pseudo-potential
Shape approximations
Full potential
Spin-polarized calculations
Method development
Codes presented today:
•  FLEUR
•  juRS (P. Baumeister)
•  KKRnano (E. Rabel)
Further Codes developed in IAS-1:
•  Tight Binding code juTiBi
•  Various KKR codes
•  Spin-dynamic code juSpinX
FLEUR: the Jülich FLAPW codes
•  All-electron, full-potential DFT
•  All elements, open systems
Atomic structure
Total energies
Forces
Electronic structure
Bandgaps
Bandstructures
Charge density
Surface states
Linearized Augmented Plane Waves
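All-electron code:
V(r) contains a singularity due to the nuclei
Basis functions:
$$\phi_G(\mathbf r) = \begin{cases} e^{i(\mathbf k+\mathbf G)\cdot\mathbf r} & \text{Interstitial} \\ \sum_{lm}\left(a_{lm,G}\,u_l(r) + b_{lm,G}\,\dot u_l(r)\right) Y_{lm}(\hat{\mathbf r}) & \text{Muffin-Tin} \end{cases}$$
Numerical radial basis:
$$\left(-\frac{1}{2}\frac{\partial^2}{\partial r^2} + V_0(r)\right) r\,u_l(r) = \epsilon_l\, r\,u_l(r)$$
(for l > 0 the centrifugal term $l(l+1)/(2r^2)$ is understood as part of the radial potential)
Energy derivative: $\dot u = \partial u / \partial \epsilon$
+ Additional localized functions (local orbitals) can be added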
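The coefficients a and b follow from requiring that each basis function is continuous in value and radial derivative at the muffin-tin sphere boundary. A sketch of that matching, assuming a single atom at the origin with sphere radius R and writing K = k + G (R and K are notation introduced here, and the plane-wave normalization is omitted as in the basis definition): the plane wave is expanded in spherical harmonics, and a 2x2 linear system is solved for every l, m.

```latex
e^{i\mathbf K\cdot\mathbf r}
  = 4\pi \sum_{lm} i^{l}\, j_l(Kr)\, Y^{*}_{lm}(\hat{\mathbf K})\, Y_{lm}(\hat{\mathbf r})

\begin{pmatrix} u_l(R) & \dot u_l(R) \\ u_l'(R) & \dot u_l'(R) \end{pmatrix}
\begin{pmatrix} a_{lm,G} \\ b_{lm,G} \end{pmatrix}
  = 4\pi\, i^{l}\, Y^{*}_{lm}(\hat{\mathbf K})
\begin{pmatrix} j_l(KR) \\ K\, j_l'(KR) \end{pmatrix}
```

Here j_l is the spherical Bessel function and primes denote radial derivatives. For an atom away from the origin, an additional phase factor from the atomic position appears on the right-hand side.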
Where does the CPU time go?
Self-consistency loop: Potential generation → Hamiltonian setup → Diagonalization → Fermi level → Construction of charge → Mixing of charge
H,S     Diagonalization     Charge     Time, PE
50%     13%                 33%        28 min, 1 PE
27%     20%                 44%        36 min, 32 PE
33%     50%                 17%        10 min, 30 PE
23%     61%                 11%        22 min, 40 PE
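As a reading aid, the percentages can be converted into wall-clock minutes per step. The snippet below is only arithmetic on the numbers quoted above; the dictionary keys and the function name minutes_per_step are illustrative, and the systems behind each row are not named in the source.

```python
# (H,S setup %, diagonalization %, charge %, total minutes, processing elements)
ROWS = [
    (50, 13, 33, 28, 1),
    (27, 20, 44, 36, 32),
    (33, 50, 17, 10, 30),
    (23, 61, 11, 22, 40),
]

def minutes_per_step(rows=ROWS):
    """Convert the percentage shares of each run into minutes per step."""
    result = []
    for hs, diag, charge, minutes, pe in rows:
        result.append({
            "PE": pe,
            "H,S": hs / 100 * minutes,
            "Diagonalization": diag / 100 * minutes,
            "Charge": charge / 100 * minutes,
        })
    return result

for row in minutes_per_step():
    print(row)
```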
Parallelisation
Multiple level parallelism: k-loop + further loops
Self-consistency loop: Potential generation → Hamiltonian setup → Diagonalization → Fermi level → Construction of charge → Mixing of charge
Parallelization labels in the diagram: MPI+OMP, MPI+OMP, MPI+(OMP), MPI
k-point loop, MPI parallel, little communication
k-point loop, MPI parallel
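The k-point loop parallelizes well because each k-point is solved independently and only small results are collected afterwards. A minimal sketch of that pattern, using Python threads purely as a stand-in for MPI ranks (CPython threads will not give a real speed-up here) and a hypothetical two-band model. The names solve_kpoint, band_energy, t1 and t2 are illustrative and not part of FLEUR.

```python
import cmath
import math
from concurrent.futures import ThreadPoolExecutor

def solve_kpoint(k, t1=1.0, t2=0.6):
    """Eigenvalues of a hypothetical two-band model H(k) = [[0, f], [conj(f), 0]]."""
    f = t1 + t2 * cmath.exp(-1j * k)
    e = abs(f)
    return (-e, e)

def band_energy(nk=64, workers=4):
    """Average energy of the lower (occupied) band over a uniform k-mesh."""
    kpts = [2.0 * math.pi * i / nk for i in range(nk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # independent tasks: no data is exchanged between k-points
        eigenvalues = list(pool.map(solve_kpoint, kpts))
    # single reduction step: the only "communication" in this sketch
    return sum(e[0] for e in eigenvalues) / nk

print(band_energy())
```

In an MPI code the same structure appears as a distribution of k-points over ranks followed by one reduction of the k-summed quantities.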
Eigenvalue Problem
$$H c_i = \epsilon_i\, S c_i$$
For large systems this is the computationally most relevant problem
•  Generalized eigenvalue problem
•  Full-Matrix solver needed
•  Usually only about 5-10% of eigenvectors are needed
•  Many iterations with similar matrices
•  For machines with few processors:
need to store many solutions of the eigenvalue problem
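One standard way to treat a generalized eigenvalue problem with a positive definite overlap matrix S is a Cholesky reduction to standard form. This is shown as a sketch of the textbook procedure, not as a statement about which algorithm FLEUR or a particular library uses; L is the Cholesky factor introduced here.

```latex
S = L L^{\dagger}, \qquad
\tilde H = L^{-1} H L^{-\dagger}, \qquad
\tilde H\, \tilde c_i = \epsilon_i\, \tilde c_i, \qquad
c_i = L^{-\dagger}\, \tilde c_i
```

Since only a few percent of the eigenvectors are needed, the back-transformation in the last step only has to be applied to that subset.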
Hamiltonian Setup
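Interstitial contribution:
•  Plane wave part
•  Not a simple integration over all space!
Muffin-tin contribution:
•  For each atom
•  For each pair of basis functions i,j
Hamiltonian:
•  Kinetic energy, spherical potential
•  Non-spherical potential couples different l,m
$$H_{ij} = \langle \mathrm{lapw}_i | \hat H | \mathrm{lapw}_j \rangle = \frac{1}{V}\int_{INT} e^{-i(\mathbf k+\mathbf G_i)\cdot\mathbf r}\,\hat H\, e^{i(\mathbf k+\mathbf G_j)\cdot\mathbf r}\, d\mathbf r + \int_{MT} \left(\sum_{lm} Y_{lm}\,(a_{lm;i}\,u_l + b_{lm;i}\,\dot u_l)\right)^{*} \hat H \left(\sum_{lm} Y_{lm}\,(a_{lm;j}\,u_l + b_{lm;j}\,\dot u_l)\right) d\mathbf r$$
Similar for overlap matrix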
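The remark that the interstitial part is not a simple integration over all space can be made concrete with a step function that is 1 in the interstitial and 0 inside the spheres: the plane-wave overlap then becomes a Fourier coefficient of that step function. The sketch below assumes a simple cubic cell with lattice constant a and a single muffin-tin sphere of radius R at the origin. Both values and all function names are illustrative assumptions.

```python
import math

def j1_over_x(x):
    """Spherical Bessel function ratio j1(x)/x, using a series near x = 0."""
    if abs(x) < 1e-2:
        return 1.0 / 3.0 - x * x / 30.0
    return (math.sin(x) / x**2 - math.cos(x) / x) / x

def theta_G(m, a=1.0, R=0.4):
    """Fourier coefficient of the interstitial step function for G = (2*pi/a) * m."""
    omega = a**3                                   # unit cell volume
    g = 2.0 * math.pi / a * math.sqrt(sum(c * c for c in m))
    delta = 1.0 if all(c == 0 for c in m) else 0.0
    # whole cell minus the sphere contribution
    return delta - 4.0 * math.pi * R**3 / omega * j1_over_x(g * R)

def interstitial_overlap(mi, mj, **kw):
    """(1/V) * integral over the interstitial of exp(-i G_i r) exp(i G_j r)."""
    # real and even in G here because the single atom sits at the origin
    return theta_G(tuple(cj - ci for ci, cj in zip(mi, mj)), **kw)

print(interstitial_overlap((0, 0, 0), (0, 0, 0)))   # interstitial volume fraction
print(interstitial_overlap((1, 0, 0), (0, 0, 0)))   # off-diagonal element
```

The diagonal element equals the interstitial volume fraction, and the off-diagonal elements are non-zero, so plane waves are not orthogonal over the interstitial region alone.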
Eigenvalue Problem
ELPA Library provides OMP+MPI parallelism
But: only Juropa so far
Unified Interface for output quantities?
DFT codes face a lot of common tasks:
•  Generation of input, symmetry
•  Determination of relaxed positions
•  Plotting of output:
    •  Band structure
    •  Density of States
    •  Charge density
•  Charge density mixing
•  k-integration
Implementation of hybrid functionals
Martin Schlipf
Becke 1993, JCP 98, p.1372 and p.5648
[Figure: "Will it blend?" A blender labelled "Muffin-tin Recipes" mixing LDA exchange, LDA correlation, GGA exchange, GGA correlation and HF exchange]
•  hybrid functionals combine bare (or screened) nonlocal (NL) Hartree-Fock exchange with local (L) xc functionals
$$E_{xc}^{\mathrm{hyb.}} = E_{xc}^{\mathrm{LDA}} + a_0\,(E_x^{\mathrm{HF}} - E_x^{\mathrm{LDA}}) + a_x\,(E_x^{\mathrm{GGA}} - E_x^{\mathrm{LDA}}) + a_c\,(E_c^{\mathrm{GGA}} - E_c^{\mathrm{LDA}})$$
•  Kohn-Sham equation with an additional non-local operator
•  FLAPW basis:
$$V^{NL}_{x,GG'}(k) = -\sum_n^{\mathrm{occ.}}\sum_q^{\mathrm{BZ}} \iint \phi^{*}_{kG}(r)\,\psi_{nq}(r)\,v(r,r')\,\psi^{*}_{nq}(r')\,\phi_{kG'}(r')\,d^3r\,d^3r'$$
•  employ mixed product basis for
$$V^{NL}_{x,GG'}(k) = -\sum_n^{\mathrm{occ.}}\sum_q^{\mathrm{BZ}}\sum_{IJ} \langle \phi_{kG} | \psi_{nk-q}\, M_{q,I} \rangle\, v_{IJ}(q)\, \langle M_{q,J}\, \psi_{nk-q} | \phi_{kG'} \rangle$$
with the bare (screened) Coulomb matrix $v_{IJ}(q)$
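The three-parameter mixing of Becke's 1993 paper combines the energy contributions as E_xc(hyb.) = E_xc(LDA) + a0 (E_x(HF) - E_x(LDA)) + ax (E_x(GGA) - E_x(LDA)) + ac (E_c(GGA) - E_c(LDA)). A small sketch of that arithmetic follows. The default coefficients are the values commonly quoted from Becke's fit (a0 = 0.20, ax = 0.72, ac = 0.81), and the energies in the usage line are made-up numbers for illustration only.

```python
def hybrid_xc_energy(ex_lda, ec_lda, ex_gga, ec_gga, ex_hf,
                     a0=0.20, ax=0.72, ac=0.81):
    """Three-parameter hybrid exchange-correlation energy."""
    exc_lda = ex_lda + ec_lda
    return (exc_lda
            + a0 * (ex_hf - ex_lda)      # admixture of exact (HF) exchange
            + ax * (ex_gga - ex_lda)     # gradient correction to exchange
            + ac * (ec_gga - ec_lda))    # gradient correction to correlation

# hypothetical energies in Hartree, purely illustrative
print(hybrid_xc_energy(ex_lda=-8.0, ec_lda=-0.7, ex_gga=-8.8, ec_gga=-0.4, ex_hf=-9.0))
```

Setting all three coefficients to zero recovers the LDA energy, and a0 = 0 with ax = ac = 1 recovers the GGA energy, which is a convenient sanity check on any implementation.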
Hybrid functionals
•  hybrid functionals are computationally 10 to 100 times more expensive than conventional LDA or GGA calculations
•  more than 90% of the time is spent computing the matrix elements of the non-local exchange potential
•  by introducing an auxiliary basis $\{M_{Iq}\}$ the matrix elements are cast into a sum over vector-matrix-vector products
$$V_{n_1 n_2}(k) = \sum_q^{\mathrm{BZ}}\sum_n^{\mathrm{occ.}}\sum_{IJ} \langle \psi_{n_1 k} | \psi_{nq}\, M_{Iq} \rangle\, C_{IJ}(q)\, \langle M_{Jq}\, \psi_{nq} | \psi_{n_2 k} \rangle$$
Loop over k-points
  Loop over q-points
    Loop over bands n1
      Loop over occupied bands n
        Sparse matrix vector product
        Loop over bands n2
          scalar product
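The loop structure can be sketched for a single k-point as follows. This is a schematic with random placeholder data, not the FLEUR implementation: the projections P[q][n][n1][I] stand in for the overlaps of a band n1 at k with the product of an occupied state n at q and an auxiliary basis function I, the matrix C is stored densely rather than sparsely, and all sizes and names are assumptions.

```python
import random

def random_hermitian(nbas, rng):
    """Random Hermitian matrix as a placeholder for C_IJ(q)."""
    a = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(nbas)]
         for _ in range(nbas)]
    return [[a[i][j] + a[j][i].conjugate() for j in range(nbas)] for i in range(nbas)]

def random_projections(nb, nbas, rng):
    """Random placeholder for the projections, indexed [n1][I]."""
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(nbas)]
            for _ in range(nb)]

def exchange_matrix(nq=2, nocc=2, nb=3, nbas=4, seed=0):
    rng = random.Random(seed)
    C = [random_hermitian(nbas, rng) for _ in range(nq)]
    P = [[random_projections(nb, nbas, rng) for _ in range(nocc)] for _ in range(nq)]
    V = [[0j] * nb for _ in range(nb)]
    for q in range(nq):                          # loop over q-points
        for n1 in range(nb):                     # loop over bands n1
            for n in range(nocc):                # loop over occupied bands n
                # vector-matrix product: w_J = sum_I P[n1][I] * C[I][J]
                w = [sum(P[q][n][n1][I] * C[q][I][J] for I in range(nbas))
                     for J in range(nbas)]
                for n2 in range(nb):             # loop over bands n2
                    # scalar product with the conjugated projection of band n2
                    V[n1][n2] += sum(w[J] * P[q][n][n2][J].conjugate()
                                     for J in range(nbas))
    return V

V = exchange_matrix()
print(V[0][0])
```

Because the placeholder C is Hermitian, the resulting matrix is Hermitian as well, which gives a simple consistency check on the loop ordering.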
GW approximation (SPEX code)
Direct calculation of electronic excitation energies $E_{kn}$
Equation of motion of interacting particles (electrons or holes):
$$\hat h_0(r)\,\psi_{kn}(r) + \int \Sigma^{GW}(r, r'; E_{kn})\,\psi_{kn}(r')\,d^3r' = E_{kn}\,\psi_{kn}(r)$$
Numerical scaling: (Size)^4
Self-energy operator: depends on r, r', and E (complex)
→ 8 independent parameters
GW approximation:
$$\Sigma^{GW}(r, r'; \omega) = \frac{i}{2\pi}\int G^{KS}(r, r'; \omega + \omega')\,W(r, r'; \omega')\,e^{i\eta\omega'}\,d\omega'$$
For comparison DFT:
Equation of motion formally of noninteracting particles:
$$\hat h_0(r)\,\psi_{kn}(r) + v^{xc}(r)\,\psi_{kn}(r) = \epsilon_{kn}\,\psi_{kn}(r)$$
Numerical scaling: (Size)^3
Exchange-correlation potential: depends on r
→ only 3 independent parameters
Green function embedding
•  Density Functional Theory for broken symmetries
    •  Green function method
    •  Complex bandstructure, transport
[Figures: Bloch spectral function; interface transmission Ag/Pt]
Order-N scaling
•  Separate system into layers
•  Each step scales linearly with size:
1)  Green function for each isolated layer
2)  Propagate embedding potentials
3)  Green function for each layer with correct embedding potentials
Efficient distribution on many processors
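The three steps listed above have a simple analogue in a nearest-neighbour tight-binding chain, where each site plays the role of a layer. The sketch below shows that analogue only and is not the FLAPW embedding scheme itself. The function name, the scalar on-site energies and the coupling t are assumptions. Each of the three passes visits every layer once, so the cost grows linearly with the number of layers.

```python
def layer_green_functions(energies, t, z):
    """Diagonal Green function elements G_ii(z) of a nearest-neighbour chain.

    energies: on-site energy of each layer, t: coupling between neighbours,
    z: complex energy.
    """
    n = len(energies)
    # 1) each isolated layer alone would have g_i = 1 / (z - energies[i])
    # 2) propagate embedding potentials from the left end ...
    sigma_left = [0j] * n
    for i in range(1, n):
        sigma_left[i] = t * t / (z - energies[i - 1] - sigma_left[i - 1])
    #    ... and from the right end
    sigma_right = [0j] * n
    for i in range(n - 2, -1, -1):
        sigma_right[i] = t * t / (z - energies[i + 1] - sigma_right[i + 1])
    # 3) Green function of each layer with both embedding potentials
    return [1.0 / (z - energies[i] - sigma_left[i] - sigma_right[i])
            for i in range(n)]

print(layer_green_functions([0.0, 0.5, -0.3], t=1.0, z=0.2 + 0.1j))
```

For a short chain the result can be checked against a direct inversion of z - H, as in the three-site case used for testing this sketch.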
Additional features
•  Film setups
    •  Semi-infinite vacuum, no supercells
•  Wannier functions
    •  Wannier interpolation
    •  Construction of TB models
•  Non-collinear magnetism
    •  Spin-spirals
•  Electric fields
•  Spin-orbit interaction
•  LDA+U
    •  Calculation of U by constrained RPA
•  Optimized effective potential method+RPA
•  Interface to van der Waals code
•  …
Summary
FLEUR:
•  All-electron full-potential DFT
•  Features
•  CPU intensive parts
•  Parallelization
Possible Projects:
•  Eigenvalue problem
•  OMP parallelization
•  Unified input/output interface
•  Charge density mixing for large systems