cardwell lattice
Transcription
Ising Spin Model
Model of a ferromagnet: a collection of magnetic moments associated with the spins of atoms. We treat the spins as located at fixed points on a 2D lattice and ignore quantum mechanical effects (an active area of research). See: Computational Physics, N. Giordano, H. Nakanishi, 2nd Ed., McGraw Hill. We consider spin projections along either the +z or the -z direction, and only nearest-neighbor interactions:

$$E = -J \sum_{\langle ij \rangle} s_i s_j, \qquad s_{i,j} = \pm 1,$$

where $\langle ij \rangle$ runs over all nearest-neighbor pairs of spins.

Ising Spin Model
Each nearest-neighbor pair is a link: an anti-aligned pair ($s_i = -1$, $s_j = +1$) contributes $+J$ to the energy, an aligned pair ($s_i = s_j = +1$) contributes $-J$. On an open lattice there are

$$(N_{\mathrm{rows}}-1)N_{\mathrm{columns}} + (N_{\mathrm{columns}}-1)N_{\mathrm{rows}} = 2N_{\mathrm{columns}}N_{\mathrm{rows}} - N_{\mathrm{rows}} - N_{\mathrm{columns}}$$

links. Since simulation of large arrays is limited by computer time, we often employ periodic boundary conditions, where there is also a link between spins in rows 1 and $N_{\mathrm{rows}}$ and between spins in columns 1 and $N_{\mathrm{columns}}$. In this case there are $2N_{\mathrm{columns}}N_{\mathrm{rows}}$ links.

Statistical Mechanics
In the spirit of statistical mechanics, we consider the energy associated with a spin state (which corresponds to knowing the spin directions of all atoms on the lattice). For a system in equilibrium with a heat bath, the probability of being in a given state depends only on its energy:

$$P_\alpha \propto e^{-E_\alpha / kT}.$$

For a square lattice with $N$ sites per direction there are $N^2$ sites and a total of $2^{N^2}$ possible states. Even a small lattice with $N = 10$ is impossible to study by 'brute force'. On the other hand, such a small lattice is very far from typical real systems, where we deal with $O(10^{23})$ spins, so boundary conditions in the simulation become important.

Statistical Mechanics
Typical quantities of interest are

$$\langle E \rangle = \sum_\alpha P_\alpha E_\alpha, \qquad \langle M \rangle = \sum_\alpha P_\alpha M_\alpha, \qquad M_\alpha = \sum_i s_i.$$

Note that there are often many 'microstates' (here, arrangements of spins) with the same energy. The normalized probability is

$$P_\alpha = \frac{e^{-E_\alpha / kT}}{\sum_\alpha e^{-E_\alpha / kT}}.$$

There is a competition between the highest-probability state (minimum energy) and the large number of lower-probability states (maximum entropy). The balance is controlled by the temperature $T$.

Energy vs Entropy
(Figure: a small lattice with all spins up.) The minimum energy occurs when all spins are lined up: $E = -18$ with degeneracy $n_\alpha = 2$ (all up or all down). If one spin points in the opposite direction: $E = -14$ with $n_\alpha = 18$ (one down among ups, or one up among downs). Which is more likely? It depends on $T$.
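To make the energy and link bookkeeping concrete, here is a minimal sketch (not from the original slides) that evaluates $E = -J\sum_{\langle ij\rangle} s_i s_j$ for a random configuration on an $N\times N$ lattice with periodic boundary conditions, assuming $J = 1$. Summing only the right and down neighbor of every site counts each of the $2N^2$ links exactly once.

```fortran
! Sketch: total Ising energy of a random N x N configuration with
! periodic boundary conditions (J = 1).  Each link is counted once by
! visiting only the right and down neighbor of every site.
program lattice_energy
  implicit none
  integer, parameter :: n = 10
  integer :: s(n, n)            ! spins, +1 or -1
  integer :: i, j, ip, jp, e
  real    :: r(n, n)

  call random_number(r)
  s = merge(1, -1, r > 0.5)     ! random initial configuration

  e = 0
  do j = 1, n
     do i = 1, n
        ip = mod(i, n) + 1      ! right neighbor, wraps n -> 1
        jp = mod(j, n) + 1      ! down neighbor, wraps n -> 1
        e  = e - s(i, j) * (s(ip, j) + s(i, jp))
     end do
  end do
  print *, 'E =', e, ' from', 2*n*n, 'links'
end program lattice_energy
```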
Mean Field Theory
Before starting with the simulations, we get some guidance on the expected results from an approximate calculational approach called 'mean field theory'. For an infinitely large system, all spins are equivalent (edge effects are unimportant). Then, for example,

$$M = \sum_i \langle s_i \rangle = N^2 \langle s \rangle,$$

where we consider the average value of the spin projection for one spin, the average being taken over all states weighted by their probability. Imagine we add an external magnetic field $H$. Then the energy becomes

$$E = -J \sum_{\langle ij \rangle} s_i s_j - \mu H \sum_i s_i.$$

Mean Field Theory
Suppose for the moment we have only one spin in the external magnetic field. Then there are two possible states, with the spin aligned or anti-aligned:

$$s = +1:\;\; E_+ = -\mu H,\;\; P_+ = C e^{+\mu H / kT}; \qquad s = -1:\;\; E_- = +\mu H,\;\; P_- = C e^{-\mu H / kT}.$$

Requiring the total probability to be 1 gives

$$P_\pm = \frac{e^{\pm\mu H / kT}}{e^{+\mu H / kT} + e^{-\mu H / kT}},$$

so that

$$\langle s \rangle = \sum_{s = \pm 1} s\, P_s = \frac{e^{+\mu H / kT} - e^{-\mu H / kT}}{e^{+\mu H / kT} + e^{-\mu H / kT}} = \tanh(\mu H / kT).$$

Mean Field Theory
The mean field approximation consists of representing the interaction of a spin with its neighbors by the interaction with an effective magnetic field $H_{\mathrm{eff}}$:

$$E = -\mu H_{\mathrm{eff}} \sum_i s_i - \mu H \sum_i s_i, \qquad H_{\mathrm{eff}} = \frac{J}{\mu} \sum_{[j]} s_j,$$

where $[j]$ runs over the neighbors of the given spin, and further replacing the values of the spins by their average values:

$$H_{\mathrm{eff}} = \frac{zJ}{\mu}\,\langle s \rangle,$$

where $z$ is an effective number of spins producing $H_{\mathrm{eff}}$. If we now drop the external magnetic field, we get the implicit relation

$$\langle s \rangle = \tanh(zJ\langle s \rangle / kT).$$

Mean Field Theory
We need to find the crossing points, i.e. the roots of the equation

$$\langle s \rangle - \tanh(zJ\langle s \rangle / kT) = 0.$$

$\langle s \rangle = 0$ is always a solution (paramagnetic phase). For $T < T_C$ (with $T$ in units of $J/k$), other solutions are also possible (ferromagnetic phase); these have a net magnetization.

Root Finding
To find the solutions for the average spin, we need the roots of $\langle s \rangle - \tanh(zJ\langle s \rangle / kT) = 0$. Recall the Newton-Raphson method:

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}.$$

We can easily differentiate $f(x)$, so this method converges quickly:

$$f(x) = x - \tanh(zJx / kT) = 0, \qquad f'(x) = 1 - \frac{zJ}{kT}\,\frac{4}{e^{2zJx/kT} + e^{-2zJx/kT} + 2}.$$

Aside – Root Finding and Optimization
We want to solve problems of the sort

$$\max_{\theta \in \Theta} h(\theta).$$

We start with some standard numerical techniques for doing this (steepest descent, conjugate gradient, Newton-Raphson). They work well in a small number of dimensions, but not in high-dimensional spaces, and they require some analytic knowledge of the function to work well. Recall that optimization is very similar to root finding (a zero of the derivative). Later we consider Monte Carlo methods (following Monte Carlo Statistical Methods, C. Robert, G. Casella, 2nd Ed., Chapter 5).

Newton-Raphson Method for Root Finding
Here we use the slope (or also the 2nd derivative) at a guess position to extrapolate to the zero crossing. This is easily generalized to many parameters and many equations. However, it also has its drawbacks, as we will see. (Figure: what if this is our guess?)

Several Roots
Suppose we have a function which has several roots, e.g.

$$0 = -\kappa^3 + 8\kappa^2(a^2 + b^2) - \kappa(16a^4 + 39a^2b^2 + 16b^4) + 28a^2b^2(a^2 + b^2).$$

It is important to analyze the problem and set the bounds correctly. For the Newton-Raphson method, if you are not sure where the root is, you need to try several different starting values and see what happens.
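To illustrate the Newton-Raphson iteration for the mean-field equation, here is a minimal sketch (not from the slides) that searches for roots of $f(x) = x - \tanh(zJx/kT)$ from several starting values. The choices $z = 4$, $J = k = 1$ and $T = 3$ are assumptions, picked so that the system is below $T_C = 4$ and non-zero roots exist; different starting values converge to different roots.

```fortran
! Sketch: Newton-Raphson for f(x) = x - tanh(z*J*x/(k*T)) with z = 4,
! J = k = 1 and T = 3 (assumed values), started from several guesses.
! Uses d/dx tanh(a*x) = a / cosh(a*x)**2 for the derivative.
program mean_field_root
  implicit none
  real, parameter :: zj = 4.0, t = 3.0      ! z*J and T, in units of J/k
  real :: x, f, fp
  real :: x0(3) = (/ -1.0, 0.1, 1.0 /)      ! starting values to try
  integer :: i, iter

  do i = 1, 3
     x = x0(i)
     do iter = 1, 50
        f  = x - tanh(zj*x/t)
        fp = 1.0 - (zj/t) / cosh(zj*x/t)**2
        if (abs(f) < 1.0e-6) exit
        x = x - f/fp                        ! Newton-Raphson step
     end do
     print *, 'start =', x0(i), ' root =', x
  end do
end program mean_field_root
```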
Several Variables
The Newton-Raphson method is easily generalized to several functions of several variables:

$$\vec f(\vec x) = \begin{pmatrix} f_1(x_1,\ldots,x_n) \\ \vdots \\ f_n(x_1,\ldots,x_n) \end{pmatrix} = 0.$$

We form the derivative matrix

$$Df = \begin{pmatrix} \partial f_1/\partial x_1 & \cdots & \partial f_1/\partial x_n \\ \vdots & & \vdots \\ \partial f_n/\partial x_1 & \cdots & \partial f_n/\partial x_n \end{pmatrix}.$$

If the matrix is not singular, we solve the system of linear equations

$$0 = \vec f(\vec x^{(0)}) + Df(\vec x^{(0)})\,(\vec x^{(1)} - \vec x^{(0)}),$$

and the iteration is

$$\vec x^{(r+1)} = \vec x^{(r)} - \left(Df(\vec x^{(r)})\right)^{-1} \vec f(\vec x^{(r)}).$$

Optimization
We now move to the closely related problem of optimization (finding the zeroes of a derivative). This is a very widespread problem in physics (e.g., finding the minimum $\chi^2$, the maximum likelihood, the lowest energy state, ...). Instead of looking for zeroes of a function, we look for extrema. Finding global extrema is a very important and very difficult problem, particularly in the case of several variables. Many techniques have been invented, and we look at a few here. We look for the minimum of $h(\vec x)$; for a maximum, consider the minimum of $-h(\vec x)$. We assume the function is at least twice differentiable.

Optimization
First and second derivatives:

$$\vec g(\vec x)^t = \left(\frac{\partial h}{\partial x_1},\ldots,\frac{\partial h}{\partial x_n}\right) \;\text{(a vector)}, \qquad H(\vec x) = \begin{pmatrix} \dfrac{\partial^2 h}{\partial x_1 \partial x_1} & \cdots & \dfrac{\partial^2 h}{\partial x_1 \partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial^2 h}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 h}{\partial x_n \partial x_n} \end{pmatrix} \;\text{(the Hessian matrix)}.$$

General technique:
1. Start with an initial guess $\vec x^{(0)}$.
2. Determine a direction $\vec s$ and a step size $\lambda$.
3. Iterate $\vec x^{(r+1)} = \vec x^{(r)} + \lambda_r \vec s_r$ until $|\vec g| < \varepsilon$, or until no smaller $h$ can be found.

Steepest Descent
A reasonable try is steepest descent: $\vec s_r = -\vec g_r$, with the step length determined from

$$0 = \frac{d\,h(\vec x_r - \lambda \vec g_r)}{d\lambda}.$$

Note that consecutive steps are then in orthogonal directions. As an example, we consider minimizing a $\chi^2$:

$$\chi^2 = \sum_i \frac{(y_i - f(x_i;\vec\lambda))^2}{w_i^2},$$

where $\vec\lambda$ are the parameters of the function to be fit, $y_i$ are the measured points at values $x_i$, and $w_i$ is the weight given to point $i$. In our example $f(x; A, \vartheta) = A\cos(x + \vartheta)$ and $w_i = 1$ for all $i$; we want to minimize $\chi^2$ as a function of $A$ and $\vartheta$.

Steepest Descent
The data points are:

   x      y
  0.00   0.00
  1.26   0.95
  2.51   0.59
  3.77  -0.59
  5.03  -0.95
  6.28   0.00
  7.54   0.95
  8.80   0.59

$$h(A,\vartheta) = \chi^2 = \sum_{i=1}^{8} \big(y_i - A\cos(x_i + \vartheta)\big)^2.$$

Steepest Descent
To use our method, we need the derivatives:

$$\vec g(A,\vartheta) = \begin{pmatrix} \sum_{i=1}^{8} 2\big(y_i - A\cos(x_i + \vartheta)\big)\big(-\cos(x_i + \vartheta)\big) \\[4pt] \sum_{i=1}^{8} 2\big(y_i - A\cos(x_i + \vartheta)\big)\big(A\sin(x_i + \vartheta)\big) \end{pmatrix}.$$

Recall that the step length follows from

$$0 = \frac{d\,h(\vec x_r - \lambda \vec g_r)}{d\lambda_r} = \frac{d}{d\lambda_r} h(\vec x_{r+1}), \qquad \frac{d}{d\lambda_r} h(\vec x_{r+1}) = \nabla h(\vec x_{r+1})^T \cdot \frac{d\vec x_{r+1}}{d\lambda_r} = -\nabla h(\vec x_{r+1})^T\,\vec g_r.$$

Setting this to zero, we see that the step length is chosen so as to make the next step orthogonal to the current one: we proceed in a zig-zag pattern.

Steepest Descent
Determining the step size through the orthogonality condition is often difficult. It is easier to do it by trial and error:

```fortran
* Starting guesses for parameters
      A=0.5
      phase=1.
      step=1.
* Evaluate derivatives of function
      gA=dchisqdA(A,phase)
      gp=dchisqdp(A,phase)
*
      h=chisq(A,phase)
*
      Itry=0
 1    continue
      Itry=Itry+1
* update parameters for given step size
      A1=A-step*gA
      phase1=phase-step*gp
* reevaluate the chisquared
      h1=chisq(A1,phase1)
* change step size if chi squared increased
      If (h1.gt.h) then
         step=step/2.
         goto 1
      Endif
* Chi squared decreased, keep this update
      A=A1
      phase=phase1
      h=h1
      gA=dchisqdA(A,phase)
      gp=dchisqdp(A,phase)
* (the slide listing ends here; the loop would return to label 1
*  and repeat until the step size or the gradient is small enough)
```
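The listing calls three helper routines, chisq, dchisqdA and dchisqdp, which are not shown on the slide. A minimal sketch of what they might look like, using the model $f(x;A,\vartheta) = A\cos(x+\vartheta)$ and the eight data points from the table above, is given below as a free-form module; the fixed-form driver above could access it by adding a `use chisq_mod` statement at its top.

```fortran
! Sketch of the helper functions used by the steepest-descent listing.
! The data points are the eight (x, y) pairs from the table above.
module chisq_mod
  implicit none
  real, parameter :: xd(8) = (/ 0.00, 1.26, 2.51, 3.77, 5.03, 6.28, 7.54, 8.80 /)
  real, parameter :: yd(8) = (/ 0.00, 0.95, 0.59,-0.59,-0.95, 0.00, 0.95, 0.59 /)
contains
  real function chisq(a, phase)        ! chi^2 = sum_i (y_i - A cos(x_i + phase))^2
    real, intent(in) :: a, phase
    chisq = sum((yd - a*cos(xd + phase))**2)
  end function chisq

  real function dchisqdA(a, phase)     ! d chi^2 / dA
    real, intent(in) :: a, phase
    dchisqdA = sum(2.0*(yd - a*cos(xd + phase))*(-cos(xd + phase)))
  end function dchisqdA

  real function dchisqdp(a, phase)     ! d chi^2 / d phase
    real, intent(in) :: a, phase
    dchisqdp = sum(2.0*(yd - a*cos(xd + phase))*(a*sin(xd + phase)))
  end function dchisqdp
end module chisq_mod
```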
Steepest Descent
With the starting guesses $A = 0.5$, phase $= 1$, step $= 1$, the minimum found is $A = -1$, phase $= \pi/2$. Previously we had $A = 1$, phase $= -\pi/2$; the two minima describe the same curve, since $A\cos(x + \vartheta) = (-A)\cos(x + \vartheta + \pi)$.

Other Optimization Techniques: Conjugate Gradient
Similar to steepest descent, but with a slightly different way of choosing the direction of the next step:

$$\vec x_{r+1} = \vec x_r + \lambda_r \vec s_r, \qquad \vec s_0 = -\vec g_0, \qquad \vec s_{r+1} = -\vec g_{r+1} + \beta_{r+1}\vec s_r,$$

where $\beta_{r+1}\vec s_r$ is the new term: we allow a further step in the $\vec s_r$ direction. $\lambda_r$ is chosen to minimize $h(\vec x_{r+1})$; this yields $\vec g_{r+1}^{\;t}\,\vec g_r = 0$. One choice (Fletcher-Reeves) for $\beta_{r+1}$ is

$$\beta_{r+1} = \frac{g_{r+1}^2}{g_r^2}.$$

Newton-Raphson
Assume the function that we want to minimize is twice differentiable. Then a Taylor expansion gives

$$h(\vec x + \vec\alpha) \approx a + \vec b^{\,t}\vec\alpha + \tfrac{1}{2}\vec\alpha^{\,t} C \vec\alpha,$$

where

$$a = h(\vec x), \qquad \vec b = \nabla h(\vec x) = \vec g(\vec x), \qquad C = \frac{\partial^2 h}{\partial x_i \partial x_j} = H.$$

Now, because $C$ is symmetric (check),

$$\nabla h(\vec x + \vec\alpha) \approx \vec b + C\vec\alpha.$$

For an extremum we have $\vec b + C\vec\alpha = 0$, i.e. $\vec\alpha = -C^{-1}\vec b$, or

$$\vec x_{r+1} = \vec x_r - H(\vec x_r)^{-1}\,\vec g(\vec x_r).$$

Newton-Raphson
$$\vec x_{r+1} = \vec x_r - H(\vec x_r)^{-1}\,\vec g(\vec x_r),$$

i.e. the search direction is $\vec s = -H(\vec x_r)^{-1}\,\vec g(\vec x_r)$ and $\lambda = 1$. This converges quickly (if you start with a good guess), but the penalty is that the Hessian needs to be calculated (usually numerically). Again, convergence is reached when $\vec s$ is sufficiently small. How would we calculate the Hessian numerically? Use a Lagrange polynomial in several dimensions and work it out.

Bounded Regions
The standard tool for minimization in particle physics is the MINUIT program (CERN library), written by Fred James. It has also made its way well outside the particle physics community. MINUIT handles bounded search regions by transforming the parameter to be optimized as follows:

$$\lambda' = \arcsin\!\left(2\,\frac{\lambda - a}{b - a} - 1\right), \qquad \lambda = a + \frac{b - a}{2}\left(\sin\lambda' + 1\right),$$

where $\lambda$ is the external (user) parameter and $\lambda'$ is the internal parameter. MINUIT is available within PAW, ROOT, ...

MINUIT
MINUIT uses a (variable metric) conjugate gradient search algorithm (along with others). The basic idea:
• assume that the function to minimize can be approximated by a quadratic form near the minimum;
• build up iteratively an approximation for the inverse of the Hessian matrix.

Recall

$$h(\vec x + \vec\alpha) \approx h(\vec x) + \nabla h(\vec x)\cdot\vec\alpha + \tfrac{1}{2}\vec\alpha^{\,t} H \vec\alpha.$$

The approximation $H_i$ (to the inverse Hessian) is updated as follows:

$$H_{i+1} = H_i + \frac{(\vec x_{i+1} - \vec x_i)\otimes(\vec x_{i+1} - \vec x_i)}{(\vec x_{i+1} - \vec x_i)\cdot(\nabla h_{i+1} - \nabla h_i)} - \frac{\big[H_i\cdot(\nabla h_{i+1} - \nabla h_i)\big]\otimes\big[H_i\cdot(\nabla h_{i+1} - \nabla h_i)\big]}{(\nabla h_{i+1} - \nabla h_i)\cdot H_i\cdot(\nabla h_{i+1} - \nabla h_i)},$$

where the $\otimes$ symbol represents the outer product of two vectors (a matrix): $(\vec a \otimes \vec b)_{ij} = a_i b_j$.

Stochastic Exploration
Brute force: generate values of $\theta$ using a uniform distribution over $\Theta$, and find the maximum using the approximation

$$\max_{\theta\in\Theta} h(\theta) \approx h^* = \max\big(h(u_1), h(u_2), \ldots, h(u_m)\big), \qquad u_i \sim U_\Theta.$$

If $h^* = h(u^*)$, then $\theta^* \approx u^*$. This will always work, but it may be extremely slow. Obviously, if we can sample according to $h(\theta)$ we will be much more efficient. Let's try it out on our old friend

$$h(x) = \big[\cos(50x) + \sin(20x)\big]^2.$$
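A minimal sketch of this brute-force search (not from the slides), assuming the search interval is $[0,1]$ and using $m = 10^5$ uniform draws:

```fortran
! Sketch: brute-force stochastic exploration of h(x) = (cos(50x)+sin(20x))**2
! with uniform draws on [0,1] (assumed search interval), keeping the best value.
program brute_force_max
  implicit none
  integer, parameter :: m = 100000
  real :: u, h, hstar, xstar
  integer :: i

  hstar = -huge(1.0)
  do i = 1, m
     call random_number(u)                   ! u ~ U(0,1)
     h = (cos(50.0*u) + sin(20.0*u))**2
     if (h > hstar) then
        hstar = h                            ! keep the running maximum h*
        xstar = u
     end if
  end do
  print *, 'approximate maximum', hstar, 'at x =', xstar
end program brute_force_max
```

The accuracy improves only slowly with $m$, which is the motivation for the smarter sampling discussed later.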
Stochastic Exploration
Let's look at a somewhat more complicated function:

$$h(x,y) = \big[x\sin(20y) + y\sin(20x)\big]^2 \cosh\!\big(\sin(10x)\,x\big) + \big[x\cos(10y) - y\sin(10x)\big]^2 \cosh\!\big(\cos(20y)\,y\big).$$

This function has many local minima; the global minimum is at $(0,0)$.

Stochastic Exploration
About 40000 iterations are needed to reach the real minimum, and the result may not be stable. Can we do better? We have the basic problem that many of the local minima have values very close to the absolute minimum.

Mean Field Theory - Average Spin
A ferromagnetic material shows spontaneous magnetization, with a 2nd order phase transition (discontinuous slope) at $T_C = zJ/k = 4J/k$ (four nearest neighbors on the square lattice). $M$ (or $\langle s\rangle$) is an order parameter: it tells us what phase the system is in. At high $T$ the system is disordered, at low $T$ it is ordered. A paramagnetic material has $M = 0$.

Order Parameter & Critical Exponent
Writing $x = \langle s\rangle$, we have

$$x = \tanh(zJx/kT) \;\longrightarrow\; x \approx \frac{zJx}{kT} - \frac{1}{3}\left(\frac{zJx}{kT}\right)^3 \quad\text{as } x \to 0,$$

with the solutions

$$x = 0 \qquad\text{and}\qquad x = \sqrt{3}\,\frac{T}{T_C}\sqrt{\frac{T_C - T}{T_C}} \;\propto\; T\,(T_C - T)^{1/2} \;\propto\; (T_C - T)^{\beta},$$

where $T_C = zJ/k$; $\beta \approx 1/2$ is the 'critical exponent'.

Metropolis-Hastings
We now look at the Ising spin system with a simulation. As before, $E = -J\sum_{\langle ij\rangle} s_i s_j$, with each nearest-neighbor pair forming a link. We use a Markov chain approach (the Metropolis algorithm). The Metropolis algorithm has the desired properties: aperiodic, recurrent, irreducible. We know the form of the target distribution up to a normalization, and Markov chain theory guarantees sampling according to this distribution (asymptotically).

Metropolis-Hastings
Algorithm:
1. Generate a random assignment of spins on a square $N\times N$ lattice.
2. Calculate the starting energy and magnetization:
$$E_0 = -\sum_{\text{links}\,ij} s_i s_j, \qquad M = \sum_{i=1}^{N^2} s_i.$$
3. Loop over temperatures. There is a unique probability distribution of the states at each $T$; we therefore want to evolve the Markov chain until we have reached a stationary distribution for each $T$.
   a) Loop over 'time', where 1 unit of time is defined as $N^2$ attempted spin flips.
      i. Select a site at random.
      ii. Flip the spin at that site and calculate the new energy. We apply periodic boundary conditions, so that sites at $i=1$ connect to $i=N$ and $j=1$ connects to $j=N$.
      iii. If $E_{\mathrm{new}} \le E_0$, accept. If $E_{\mathrm{new}} > E_0$, accept with probability $e^{-(E_{\mathrm{new}} - E_0)/kT}$. If the flip is accepted, set $E_0 = E_{\mathrm{new}}$.
   b) Keep track of the mean energy and its rms over the last $t_{\mathrm{min}}$ time steps. Declare convergence if $t > t_{\mathrm{min}}$ and $E_{\mathrm{rms}} < T$.

Example 3x3
(Figure: 3x3 spin configurations before and after the first attempted flip.) The starting configuration has $E_0 = +2$, $M_0 = +1$. Start with $T = 5$. First attempt: flip the spin in row 3 (counting from the bottom) and column 2. The energy decreases by 4 units (2 for every net link sign change), so the flip is accepted, and now $E_0 = -2$, $M_0 = +3$.
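One 'time unit' of the algorithm above ($N^2$ attempted spin flips) can be sketched as follows. This is an illustrative implementation, not the original course code; it assumes $J = 1$, $k = 1$ and an integer-valued energy. Only the four links touching the chosen site change, so the energy difference of a flip can be computed locally.

```fortran
! Sketch: one Metropolis 'time unit' (n*n attempted single-spin flips)
! with periodic boundary conditions, J = 1 and T in units of J/k.
subroutine metropolis_sweep(s, n, t, e, m)
  implicit none
  integer, intent(in)    :: n
  integer, intent(inout) :: s(n, n), e, m   ! spins, current energy, magnetization
  real,    intent(in)    :: t               ! temperature
  integer :: k, i, j, de, nb
  real    :: r(2), u
  logical :: accept

  do k = 1, n*n
     call random_number(r)
     i = 1 + int(r(1)*n)                    ! pick a site at random
     j = 1 + int(r(2)*n)
     nb = s(mod(i,n)+1, j) + s(mod(i-2+n,n)+1, j) &
        + s(i, mod(j,n)+1) + s(i, mod(j-2+n,n)+1)
     de = 2 * s(i, j) * nb                  ! energy change if s(i,j) is flipped
     if (de <= 0) then
        accept = .true.                     ! downhill moves are always accepted
     else
        call random_number(u)
        accept = (u < exp(-real(de)/t))     ! uphill moves with Boltzmann probability
     end if
     if (accept) then
        s(i, j) = -s(i, j)
        e = e + de
        m = m + 2*s(i, j)
     end if
  end do
end subroutine metropolis_sweep
```

A driver would call this repeatedly at each temperature, accumulating the mean energy and magnetization once the chain looks stationary.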
Example 3x3
At lower $T$, the system starts to get stuck in particular configurations for many time cycles.

Example 3x3
The transition occurs near $T = 2$, not $T = 4$. The analytic calculation for the square lattice gives

$$T_C = \frac{2}{\ln(1 + \sqrt{2})} \approx 2.27$$

(in units of $J/k$).

Example 10x10
See what happens when we go to a larger lattice: the transitions get sharper for larger lattices. There are big fluctuations near the critical point, so the simulation needs to run for a sufficient time.

Comparison to Mean Field Theory
The value of $T_C$ from the simulation is much closer to the exact value than the mean field theory result. Why? Also, the behavior of $M$ vs $T$ is very different (sharper in the simulation). Check why this happens. First, look at the critical exponent. Mean field theory gave $\beta = 0.5$, whereas the exact value for an Ising model on a square lattice is $1/8$. What does the simulation yield? As we have seen, we need a large enough grid to have sharp predictions: go to a 20x20 grid, with a minimum of $10^5$ iterations per temperature and a maximum of $10^6$.

Example 20x20
Note that $E_{\mathrm{avg}}$ does not go to zero as $M$ goes to zero for $T > T_C$. This indicates that there is a correlation between the spins of nearest neighbors; more on this later. To get the critical exponent, we need to extract the exponent of $(T_C - T)^\beta$ in the region near $T_C$. This is difficult (time consuming) because the fluctuations are huge in this region.

Critical Exponent
Fit of the function $M_{\mathrm{avg}} = A\,(T_C - T)^\beta$, with fit parameters $A$, $T_C$ and $\beta$. The errors are chosen proportional to $M_{\mathrm{rms}}$. Try with more $T$ steps near $T_C$.

Critical Exponent
Fit of the function $M_{\mathrm{avg}} = A\,T\,(T_C - T)^\beta$, with the same fit parameters.

Specific Heat
Recall the relation

$$C = \frac{\langle(\Delta E)^2\rangle}{kT^2} = \frac{E_{\mathrm{rms}}^2}{kT^2}.$$

The specific heat becomes discontinuous at $T_C$; this is the sign of a phase transition from a disordered to an ordered system. The peak becomes sharper for larger grids.

Correlations between Spins
As we have seen, the average energy can be non-zero even when the net magnetization is zero. The reason is that neighboring spins are correlated. Let's look at the correlation (in the figure, a black square is a spin up): we see blocks of spins oriented in one direction. Let's calculate

$$\langle s_i\, s_{i+n}\rangle(T),$$

where $i$ and $i+n$ are in the same column, but separated by $n$ rows.

Correlations 20x20
These correlations are not properly handled in mean field theory, in particular the rapid change near the critical temperature.
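A minimal sketch (not from the slides) of the column-wise correlation $\langle s_i s_{i+n}\rangle$ for a single stored configuration; in the simulation this would additionally be averaged over Monte Carlo time at each temperature.

```fortran
! Sketch: spin-spin correlation c(n) = < s(i,j) s(i+n,j) > along columns
! of an L x L configuration, averaged over all sites, with periodic
! boundary conditions.
subroutine spin_correlation(s, l, nmax, c)
  implicit none
  integer, intent(in)  :: l, nmax
  integer, intent(in)  :: s(l, l)
  real,    intent(out) :: c(nmax)
  integer :: i, j, n

  do n = 1, nmax
     c(n) = 0.0
     do j = 1, l
        do i = 1, l
           c(n) = c(n) + s(i, j) * s(mod(i+n-1, l)+1, j)   ! same column, n rows apart
        end do
     end do
     c(n) = c(n) / real(l*l)
  end do
end subroutine spin_correlation
```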