Changes for the Table of Contents

The title of Appendix CC should be "More Accurate Numerical Solutions of the Eigenvalue Problem". Appendix II is missing. The appendices with double Roman letters should be

Appendix AA  The Gradient and Laplacian Operators
Appendix BB  Solution of the Schrödinger Equation in Spherical Coordinates
Appendix CC  More Accurate Numerical Solutions of the Eigenvalue Problem
Appendix DD  The Angular Momentum Operators
Appendix EE  The Radial Equation for Hydrogen
Appendix FF  Transition Probabilities for z-Polarized Light
Appendix GG  Transitions with x- and y-Polarized Light
Appendix HH  Derivation of the Distribution Laws
Appendix II  Derivation of Bloch's Theorem
Appendix JJ  The Band Gap
Appendix KK  Vector Spaces and Matrices
Appendix LL  Algebraic Solution of the Oscillator

In the current web pages of the book the word "Bloch" is misspelled. The correct title of Appendix II is given above.

Changes for Preface

Page xi, three lines from the top of the section on THE NEW EDITION:
Replace "two new sections; I have written" with "two new sections I have written" (remove the semicolon).

Page xi, four lines from the bottom of the page:
Replace "Another MATLAB programs" with "Another MATLAB program" (the subject should be singular).

Page xv, seven lines from the top of the page:
Replace "Ovtave" with "Octave".

Changes for Introduction

Page xxiii, equation following Equation (I.19):
Replace "where k = π/L and" with "where k = 2π/L and" (add a factor of 2).

Page xxiii, Equation (I.22):
Replace "A" with "−A" (place a minus sign before the right-hand side of the equation).

Page xxxvii, first line after the equation in Problem 4:
Replace "where" with "where k = 2π/L and" (add the definition of k).

Page xxxvii, Problem 17:
Replace "Fig. i.12(a)" with "Fig. I.12(a)" (the letter "i" should be upper case).

APPENDIX AA
The Gradient and Laplacian Operators

The gradient operator

The gradient of a function is a vector that points in the direction in which the function changes most rapidly and has a magnitude equal to the rate of change of the function in that direction. It is the natural three-dimensional generalization of the derivative with respect to a single variable. In Cartesian coordinates, the gradient of the function ψ can be written

\nabla \psi = \mathbf{i}\frac{\partial \psi}{\partial x} + \mathbf{j}\frac{\partial \psi}{\partial y} + \mathbf{k}\frac{\partial \psi}{\partial z},    (1)

where i, j and k are unit vectors pointing along the x-, y- and z-axes, respectively. In spherical coordinates, which are defined by eq. (4.3), the gradient of the function ψ is

\nabla \psi = \hat{\mathbf{r}}\frac{\partial \psi}{\partial r} + \hat{\boldsymbol{\theta}}\frac{1}{r}\frac{\partial \psi}{\partial \theta} + \hat{\boldsymbol{\phi}}\frac{1}{r \sin\theta}\frac{\partial \psi}{\partial \phi},    (2)

where r̂, θ̂ and φ̂ are unit vectors pointing in the direction in which the point at r moves when r, θ and φ increase. These three unit vectors are shown in Fig. AA.1. We note that the factor r ∂θ which occurs in the denominator of the second term is the distance the point at r would move if the angle θ increased by an amount ∂θ with r and φ held fixed, while the factor r sin θ ∂φ which occurs in the denominator of the third term is the distance the point at r would move if φ increased by ∂φ with r and θ held fixed. Each of the terms of the gradient operator gives the rate of change of the function on which the gradient operator acts with respect to the displacement associated with a particular spherical coordinate.
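The following short Octave/MATLAB check of eq. (2) is an addition to this transcription, not part of the original appendix; the trial function ψ = r² sin θ cos φ and the sample point are arbitrary choices. It compares the magnitude of the gradient computed from the spherical form (2) with the magnitude computed from the Cartesian form (1) using centered finite differences; the two must agree, since the length of a vector does not depend on the coordinate system.

% Check of eq. (2): gradient magnitude in spherical vs. Cartesian coordinates.
% Trial function (arbitrary): psi = r^2 sin(theta) cos(phi) = x*sqrt(x^2+y^2+z^2).
r = 1.3; th = 0.7; ph = 0.4; h = 1e-6;
psi = @(r,th,ph) r.^2 .* sin(th) .* cos(ph);
g_r  = (psi(r+h,th,ph) - psi(r-h,th,ph))/(2*h);               % d(psi)/dr
g_th = (psi(r,th+h,ph) - psi(r,th-h,ph))/(2*h) / r;           % (1/r) d(psi)/d(theta)
g_ph = (psi(r,th,ph+h) - psi(r,th,ph-h))/(2*h) / (r*sin(th)); % (1/(r sin th)) d(psi)/d(phi)
x = r*sin(th)*cos(ph); y = r*sin(th)*sin(ph); z = r*cos(th);
psic = @(x,y,z) x .* sqrt(x.^2 + y.^2 + z.^2);
gx = (psic(x+h,y,z) - psic(x-h,y,z))/(2*h);
gy = (psic(x,y+h,z) - psic(x,y-h,z))/(2*h);
gz = (psic(x,y,z+h) - psic(x,y,z-h))/(2*h);
fprintf('|grad psi| spherical: %.8f  Cartesian: %.8f\n', ...
        norm([g_r g_th g_ph]), norm([gx gy gz]));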
The equation defining the gradient of a function for any orthogonal set of coordinates is similar to eq. (2) defining the gradient of a function for spherical coordinates. For a system of orthogonal coordinates (q1, q2, q3), a change of the first coordinate q1 by an amount dq1 causes a spatial point to move a distance ds1 = h1 dq1. Similarly, a change of the second coordinate by dq2 and a change of the third coordinate by dq3 cause the point to move distances ds2 = h2 dq2 and ds3 = h3 dq3, respectively. The weights (h1, h2, h3) determine how far the point moves when the corresponding coordinate changes. The gradient of a function ψ in a system with coordinates (q1, q2, q3) is defined by the equation

\nabla \psi = \hat{e}_1 \frac{1}{h_1}\frac{\partial \psi}{\partial q_1} + \hat{e}_2 \frac{1}{h_2}\frac{\partial \psi}{\partial q_2} + \hat{e}_3 \frac{1}{h_3}\frac{\partial \psi}{\partial q_3}.    (3)

Here, as for the spherical coordinates (r, θ, φ), the quantities h1 ∂q1, h2 ∂q2, and h3 ∂q3 occurring in the denominators of the above equation give the distances a point will move if the three coordinates change by the amounts ∂q1, ∂q2, and ∂q3, respectively. The three weight factors for spherical coordinates are

h_r = 1, \quad h_\theta = r, \quad h_\phi = r \sin\theta.    (4)

The divergence of a vector

To find the divergence of a vector A, we consider the infinitesimal volume dV = ds1 ds2 ds3 shown in Fig. AA.2. The volume is bounded by surfaces for which the first coordinate has the values q1 and q1 + dq1, the second coordinate has the values q2 and q2 + dq2, and the third coordinate has the values q3 and q3 + dq3. Gauss's theorem for the vector field A(q1, q2, q3) is

\int \nabla \cdot \mathbf{A} \, dV = \int \mathbf{A} \cdot d\mathbf{S}.

The integral of the outgoing normal component of A over the two surfaces for which the first coordinate has the values q1 and q1 + dq1 is

(A_1 \, ds_2 \, ds_3)_{q_1 + dq_1} - (A_1 \, ds_2 \, ds_3)_{q_1} = \frac{\partial (A_1 \, ds_2 \, ds_3)}{\partial q_1} dq_1.

Using the fact that the displacements ds1, ds2, and ds3 are equal to h1 dq1, h2 dq2, and h3 dq3, respectively, this last equation can be written

(A_1 \, ds_2 \, ds_3)_{q_1 + dq_1} - (A_1 \, ds_2 \, ds_3)_{q_1} = \frac{1}{h_1 h_2 h_3} \frac{\partial (h_2 h_3 A_1)}{\partial q_1} dV.

Analogous expressions hold for the other two sets of surfaces. According to Gauss's theorem, the sum of these three terms is equal to ∇ · A dV. Hence the divergence of A is given by the equation

\nabla \cdot \mathbf{A} = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial (h_2 h_3 A_1)}{\partial q_1} + \frac{\partial (h_3 h_1 A_2)}{\partial q_2} + \frac{\partial (h_1 h_2 A_3)}{\partial q_3} \right].    (5)

The Laplacian of a function

The Laplacian of a scalar function ψ is the divergence of the gradient of the function. We can write

\nabla^2 \psi = \nabla \cdot \nabla \psi.

Using eq. (5) for the divergence of a vector and eq. (3) for the gradient of a function, this last equation can be written

\nabla^2 \psi = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial q_1}\left( \frac{h_2 h_3}{h_1} \frac{\partial \psi}{\partial q_1} \right) + \frac{\partial}{\partial q_2}\left( \frac{h_3 h_1}{h_2} \frac{\partial \psi}{\partial q_2} \right) + \frac{\partial}{\partial q_3}\left( \frac{h_1 h_2}{h_3} \frac{\partial \psi}{\partial q_3} \right) \right].    (6)

For spherical polar coordinates, the three weight factors are given by eq. (4), and the equation for the Laplacian operator in spherical coordinates is

\nabla^2 \psi = \frac{1}{r^2}\frac{\partial}{\partial r}\left( r^2 \frac{\partial \psi}{\partial r} \right) + \frac{1}{r^2 \sin\theta}\frac{\partial}{\partial \theta}\left( \sin\theta \frac{\partial \psi}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2 \psi}{\partial \phi^2}.    (7)

The angular momentum operators

The operator associated with the angular momentum of a particle can be obtained by writing the angular momentum in terms of the momentum,

\mathbf{l} = \mathbf{r} \times \mathbf{p},

and then making the replacement p → −iħ∇ to obtain

\mathbf{l} = -i\hbar \, \mathbf{r} \times \nabla.    (8)

The angular momentum operator in spherical coordinates can be obtained by using eq. (2) and the relations

\hat{\mathbf{r}} \times \hat{\boldsymbol{\theta}} = \hat{\boldsymbol{\phi}}, \qquad \hat{\mathbf{r}} \times \hat{\boldsymbol{\phi}} = -\hat{\boldsymbol{\theta}}

to give

\mathbf{l} = -i\hbar \left( \hat{\boldsymbol{\phi}} \frac{\partial}{\partial \theta} - \hat{\boldsymbol{\theta}} \frac{1}{\sin\theta} \frac{\partial}{\partial \phi} \right).

The operator corresponding to the z-component of the angular momentum can then be obtained by taking the dot product of this last expression with the unit vector k̂ pointing along the z-axis to obtain

l_z = -i\hbar \frac{\partial}{\partial \phi}.    (9)

The square of the angular momentum operator is related to the second term on the right-hand side of eq. (7) by the equation

\mathbf{l}^2 = -\hbar^2 \left[ \frac{1}{\sin\theta}\frac{\partial}{\partial \theta}\left( \sin\theta \frac{\partial}{\partial \theta} \right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial \phi^2} \right].    (10)

Using eqs. (7) and (10), the Laplacian operator in spherical coordinates can be written simply

\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left( r^2 \frac{\partial}{\partial r} \right) - \frac{\mathbf{l}^2}{\hbar^2 r^2}.    (11)

The angular part of the Laplacian operator in spherical coordinates is thus given by the term −l²/(ħ²r²).
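As a one-line illustration, added here and not part of the original appendix, eq. (9) shows that the azimuthal functions e^{imφ} are eigenfunctions of lz:

l_z \, e^{i m \phi} = -i\hbar \frac{\partial}{\partial \phi} e^{i m \phi} = m\hbar \, e^{i m \phi}.

The requirement that the wave function be single-valued under φ → φ + 2π then restricts m to integer values.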
APPENDIX BB
Solution of the Schrödinger Equation in Spherical Coordinates

Separation of the Schrödinger Equation

The Schrödinger equation for an electron with mass m moving about a nucleus with mass M and charge Ze can be written

-\frac{\hbar^2}{2\mu} \nabla^2 \psi(\mathbf{r}) - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \psi(\mathbf{r}) = E \psi(\mathbf{r}),    (12)

where the reduced mass µ is defined by the equation

\mu = \frac{mM}{m + M}.    (13)

Using the expression for the Laplacian operator in spherical coordinates given in Appendix AA, the Schrödinger equation can be written

-\frac{\hbar^2}{2\mu} \left[ \frac{1}{r^2}\frac{\partial}{\partial r}\left( r^2 \frac{\partial \psi}{\partial r} \right) + \frac{1}{r^2 \sin\theta}\frac{\partial}{\partial \theta}\left( \sin\theta \frac{\partial \psi}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2 \psi}{\partial \phi^2} \right] - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \psi = E \psi.    (14)

This last equation can be solved by the method of separation of variables by writing the wave function as a product of functions of the radial and angular coordinates

\psi(r, \theta, \phi) = R(r) Y(\theta, \phi).

Substituting the product function into eq. (14) and dividing by −ħ²/2µr² times the product function, we obtain

\frac{1}{R}\frac{d}{dr}\left( r^2 \frac{dR}{dr} \right) + \frac{2\mu r^2}{\hbar^2}\left( E + \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \right) = -\frac{1}{Y}\left[ \frac{1}{\sin\theta}\frac{\partial}{\partial \theta}\left( \sin\theta \frac{\partial Y}{\partial \theta} \right) + \frac{1}{\sin^2\theta}\frac{\partial^2 Y}{\partial \phi^2} \right].    (15)

Since the left-hand side of this last equation depends only on r and the right-hand side depends only on θ and φ, both sides must be equal to a constant that we call λ. The resulting radial equation can be written

\frac{1}{r^2}\frac{d}{dr}\left( r^2 \frac{dR}{dr} \right) + \left[ \frac{2\mu}{\hbar^2}\left( E + \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \right) - \frac{\lambda}{r^2} \right] R = 0,    (16)

and the angular equation is

-\frac{1}{\sin\theta}\frac{\partial}{\partial \theta}\left( \sin\theta \frac{\partial Y}{\partial \theta} \right) - \frac{1}{\sin^2\theta}\frac{\partial^2 Y}{\partial \phi^2} = \lambda Y.    (17)

Using eq. (10) of Appendix AA, we may identify the second of these two equations as the eigenvalue equation of the angular momentum operator l² with eigenvalue ħ²λ. We shall use a purely algebraic line of argument in Appendix CC to show that the eigenvalues of the orbital angular momentum operator l² are ħ²l(l+1), where l is the angular momentum quantum number. We may thus identify the separation constant λ as being l(l+1) and write the radial equation

\frac{1}{r^2}\frac{d}{dr}\left( r^2 \frac{dR}{dr} \right) + \left[ \frac{2\mu}{\hbar^2}\left( E + \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \right) - \frac{l(l+1)}{r^2} \right] R = 0.    (18)

We consider now more fully the radial equation. Evaluating the derivatives of the first term in the equation, we obtain

\frac{d^2 R}{dr^2} + \frac{2}{r}\frac{dR}{dr} + \left[ \frac{2\mu}{\hbar^2}\left( E + \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \right) - \frac{l(l+1)}{r^2} \right] R = 0.    (19)

This last equation can be simplified by introducing the change of variables

\rho = \alpha r,    (20)

where α is a constant yet to be specified. The equation defining the change of variables can be written

r = \alpha^{-1} \rho.    (21)

The derivatives of the radial function R can then be expressed in terms of the new variable ρ by using the chain rule. We have

\frac{dR}{dr} = \frac{d\rho}{dr}\frac{dR}{d\rho} = \alpha \frac{dR}{d\rho}.    (22)

Similarly, the second derivative can be written

\frac{d^2 R}{dr^2} = \alpha^2 \frac{d^2 R}{d\rho^2}.    (23)

Substituting eqs. (21)-(23) into eq. (19) and dividing the resulting equation by α², we obtain

\frac{d^2 R}{d\rho^2} + \frac{2}{\rho}\frac{dR}{d\rho} + \left[ \frac{2\mu Z e^2}{\hbar^2 4\pi\epsilon_0 \alpha}\frac{1}{\rho} + \frac{2\mu E}{\hbar^2 \alpha^2} - \frac{l(l+1)}{\rho^2} \right] R = 0.    (24)

To simplify this last equation, we now make the following choice of α

\alpha^2 = \frac{8\mu |E|}{\hbar^2},    (25)

and we define a new parameter ν by the equation

\nu = \frac{2\mu Z e^2}{\hbar^2 4\pi\epsilon_0 \alpha}.    (26)

The radial equation then becomes simply

\frac{d^2 R}{d\rho^2} + \frac{2}{\rho}\frac{dR}{d\rho} + \left[ \frac{\nu}{\rho} - \frac{1}{4} - \frac{l(l+1)}{\rho^2} \right] R = 0,    (27)

where the new variable ρ is related to the radial distance r by eq. (20). In this appendix we shall solve the radial equation (27) using the power series method. We first note that the first two terms in the equation and the last term depend upon the −2 power of ρ, while the third term in the equation depends upon the −1 power of ρ and the fourth term depends upon the 0 power. Since the equation depends upon more than two powers of ρ, it cannot be solved directly by the power series method. To overcome this difficulty, we examine the behavior of the equation for large values of ρ, for which the fourth term in the equation dominates over the third and last terms. The function e^{−ρ/2}, which is everywhere finite, is a solution of the radial equation for large ρ.
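To make the last statement explicit (this intermediate step is added to the transcription), drop the ν/ρ and l(l+1)/ρ² terms of eq. (27) for large ρ; the (2/ρ) dR/dρ term also becomes negligible, leaving

\frac{d^2 R}{d\rho^2} - \frac{1}{4} R \approx 0,

whose solutions are e^{±ρ/2}. Only e^{−ρ/2} remains finite as ρ → ∞.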
This suggests we look for an exact solution of eq. (27) of the form

R(\rho) = F(\rho) e^{-\rho/2},    (28)

where F(ρ) is a function of ρ. We shall substitute this representation of R(ρ) into eq. (27) and in this way derive an equation for F(ρ). Using eq. (28), the first and second derivatives of R(ρ) can be written

\frac{dR}{d\rho} = -\frac{1}{2} e^{-\rho/2} F + e^{-\rho/2} \frac{dF}{d\rho},

\frac{d^2 R}{d\rho^2} = \frac{1}{4} e^{-\rho/2} F - e^{-\rho/2} \frac{dF}{d\rho} + e^{-\rho/2} \frac{d^2 F}{d\rho^2}.    (29)

Substituting eqs. (28) and (29) into eq. (27) leads to the equation

\frac{d^2 F}{d\rho^2} + \left( \frac{2}{\rho} - 1 \right) \frac{dF}{d\rho} + \left[ \frac{\nu - 1}{\rho} - \frac{l(l+1)}{\rho^2} \right] F = 0.    (30)

We note that the first, second, and last terms of the new radial equation depend upon the −2 power of ρ, while the remaining two terms in the equation depend upon the −1 power of ρ. Since eq. (30) only involves two powers of ρ, it is amenable to a power series solution. We now look for a solution for F(ρ) of the form

F(\rho) = \rho^s L(\rho),    (31)

where the function L(ρ) can be expressed as a power series

L(\rho) = \sum_{k=0}^{\infty} a_k \rho^k.    (32)

We shall suppose that the coefficient a0 in the expansion of L(ρ) is not equal to zero and that the function ρ^s gives the dependence of the function F(ρ) near the origin. The requirement that F(ρ) be finite can be satisfied if s has integer values equal to or greater than zero. Substituting eq. (31) into eq. (30) gives the following equation for L

\rho^2 \frac{d^2 L}{d\rho^2} + \rho \left[ 2(s+1) - \rho \right] \frac{dL}{d\rho} + \left[ \rho (\nu - s - 1) + s(s+1) - l(l+1) \right] L = 0.

Setting ρ = 0 in this last equation leads to the condition

s(s+1) - l(l+1) = 0.

This quadratic equation has two roots: s = l and s = −(l+1). Only the root s = l is consistent with the boundary condition that the radial function R(ρ) be finite at ρ = 0. The equation for L then becomes

\rho \frac{d^2 L}{d\rho^2} + \left[ 2(l+1) - \rho \right] \frac{dL}{d\rho} + (\nu - l - 1) L = 0.    (33)

Notice that the first and second terms of this last equation depend upon the −1 power of ρ, while the remaining terms in the equation depend upon the 0 power of ρ. To obtain a power series solution of eq. (33), we take the first two derivatives of eq. (32) to obtain

\frac{dL}{d\rho} = \sum_{k=1}^{\infty} k a_k \rho^{k-1}, \qquad \frac{d^2 L}{d\rho^2} = \sum_{k=2}^{\infty} k(k-1) a_k \rho^{k-2}.

We substitute these expressions for L(ρ) and its derivatives into eq. (33) to obtain

\sum_{k=2}^{\infty} k(k-1) a_k \rho^{k-1} + 2(l+1) \sum_{k=1}^{\infty} k a_k \rho^{k-1} - \sum_{k=1}^{\infty} k a_k \rho^k + (\nu - l - 1) \sum_{k=0}^{\infty} a_k \rho^k = 0.

Notice that the first and second summations involve ρ^{k−1}, while the third and fourth sums involve ρ^k. Because of the factor k − 1 in the first sum, the first summation can be extended down to k = 1, and because of the factor k in the third sum, the third summation can be extended down to k = 0, and the equation may be written

\sum_{k=1}^{\infty} \left[ k(k-1) + 2k(l+1) \right] a_k \rho^{k-1} - \sum_{k=0}^{\infty} \left[ k - (\nu - l - 1) \right] a_k \rho^k = 0.

The first summation in this last equation has the variable ρ raised to the power k − 1, while the second summation has ρ raised to the power k. In order to bring these different contributions together so that they contain terms corresponding to the same power of ρ, we make the following substitution in the first summation

k = k' + 1,    (34)

and we simplify the terms within the two summations to obtain

\sum_{k'=0}^{\infty} (k'+1)(k'+2l+2) a_{k'+1} \rho^{k'} - \sum_{k=0}^{\infty} (k - \nu + l + 1) a_k \rho^k = 0.

As with a change of variables for a problem involving integrals, the lower limit of the first summation is obtained by substituting the value k = 1 into eq. (34) defining the change of variables.
We now replace the dummy variable k' with k in the first summation and draw all of the terms together within a single summation to obtain

\sum_{k=0}^{\infty} \left[ (k+1)(k+2l+2) a_{k+1} - (k - \nu + l + 1) a_k \right] \rho^k = 0.

This equation can hold for all values of ρ only if the coefficient of every power of ρ is equal to zero. This leads to the following recursion formula

a_{k+1} = \frac{k + l + 1 - \nu}{(k+1)(k+2l+2)} a_k.    (35)

The recursion formula gives a1, a2, a3, . . . in terms of a0. We may thus regard L(ρ) as being defined in terms of the single constant a0. We must examine, however, the behavior of L(ρ) as ρ approaches infinity. Since the behavior of L(ρ) for large values of ρ will depend upon the terms far out in the power series, we consider the recursion formula (35) for large values of k. This gives

\frac{a_{k+1}}{a_k} \to \frac{k}{k^2} = \frac{1}{k}.

We now compare this result with the Taylor series expansion of the function e^ρ

e^{\rho} = 1 + \rho + \frac{1}{2!}\rho^2 + \dots + \frac{1}{k!}\rho^k + \frac{1}{(k+1)!}\rho^{k+1} + \dots .

The ratio of the coefficients of this series for large values of k is

\frac{1/(k+1)!}{1/k!} = \frac{k!}{(k+1)!} = \frac{1}{k+1} \to \frac{1}{k}.

The ratio of successive terms for these two series is the same for large values of k. This means that the power series representation of L(ρ) has the same dependence upon ρ for large values of ρ as the function e^ρ. Recall now that the radial function R(ρ) is related to F(ρ) by eq. (28), and F(ρ) is related to L(ρ) by eq. (31) with s = l. Setting L(ρ) = e^ρ leads to the following behavior of the radial function for large ρ

R(\rho) = e^{-\rho/2} \rho^l e^{\rho} = \rho^l e^{\rho/2} \quad \text{as } \rho \to \infty.

The radial function we have obtained from the series expansion thus becomes infinite as ρ → ∞, which is unacceptable. There is only one way of avoiding this consequence, and that is to terminate the infinite series. The series can be terminated by letting ν be equal to an integer n such that

\nu = n = n' + l + 1.    (36)

The recursion formula (35) then implies that the coefficient a_{n'+1} is equal to zero, and the function L(ρ) will be a polynomial of degree n'. Eqs. (28) and (31) with s = l then imply that the radial function R(ρ) is equal to a polynomial times the function e^{−ρ/2}, which means that R approaches zero as ρ → ∞. Setting ν = n and solving eqs. (25) and (26) for the energy, we obtain

|E| = \frac{1}{2(4\pi\epsilon_0)^2} \frac{\mu Z^2 e^4}{\hbar^2 n^2}.

The energy of the nth bound state of a hydrogen-like ion with nuclear charge Ze is

E_n = -\frac{1}{2(4\pi\epsilon_0)^2} \frac{\mu Z^2 e^4}{\hbar^2 n^2}.    (37)

Using the reduced mass µ in place of the electron mass takes into account the finite mass of the nucleus. For the hydrogen atom with Z = 1 and with the reduced mass µ set equal to m, eq. (37) reduces to the expression for the energy En of the hydrogen atom given in Chapter 1. The polynomials L(ρ) may be identified as associated Laguerre polynomials L_q^p, which satisfy the equation

\rho \frac{d^2 L_q^p}{d\rho^2} + \left[ p + 1 - \rho \right] \frac{dL_q^p}{d\rho} + (q - p) L_q^p = 0.    (38)

Equating the coefficients in eqs. (33) and (38), we see that p = 2l + 1 and q = n + l. The appropriate polynomials are given by the equation

L_{n+l}^{2l+1}(\rho) = \sum_{k=0}^{n-l-1} (-1)^{k+1} \frac{[(n+l)!]^2}{(n-l-1-k)! \, (2l+1+k)! \, k!} \rho^k.    (39)

The normalized radial wave functions for a hydrogen-like ion can be written

R_{nl}(r) = -A_{nl} \, e^{-\rho/2} \rho^l L_{n+l}^{2l+1}(\rho),    (40)

where the normalization coefficients are given by the equation

A_{nl} = \sqrt{ \left( \frac{2Z}{n a_0} \right)^3 \frac{(n-l-1)!}{2n[(n+l)!]^3} },    (41)

where

a_0 = \frac{4\pi\epsilon_0 \hbar^2}{\mu e^2} \quad \text{and} \quad \rho = \frac{2Z}{n a_0} r.
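As a numerical aside (added to this transcription), eqs. (13) and (37) can be evaluated directly. The sketch below uses rounded CODATA values for the constants and reproduces the familiar hydrogen ground-state energy of about −13.6 eV.

% Hydrogen ground-state energy from eqs. (13) and (37) with Z = 1, n = 1.
m    = 9.10938e-31;    % electron mass (kg)
M    = 1.67262e-27;    % proton mass (kg)
e    = 1.60218e-19;    % elementary charge (C)
eps0 = 8.85419e-12;    % vacuum permittivity (F/m)
hbar = 1.05457e-34;    % reduced Planck constant (J s)
mu = m*M/(m + M);                          % eq. (13)
E1 = -mu*e^4/(2*(4*pi*eps0)^2*hbar^2);     % eq. (37) with Z = 1, n = 1
fprintf('E_1 = %.4f eV\n', E1/e);          % about -13.60 eV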
The first three radial functions, which can be found using eqs. (39) and (40), are

R_{10}(r) = \left( \frac{Z}{a_0} \right)^{3/2} 2 \, e^{-Zr/a_0}

R_{20}(r) = \left( \frac{Z}{2a_0} \right)^{3/2} \left( 2 - \frac{Zr}{a_0} \right) e^{-Zr/2a_0}

R_{21}(r) = \left( \frac{Z}{2a_0} \right)^{3/2} \frac{Zr}{\sqrt{3}\, a_0} \, e^{-Zr/2a_0}
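A quick check, added here and not part of the original appendix, that the radial functions listed above are normalized so that the integral of R²(r) r² dr from 0 to ∞ equals 1. The sketch tests R10 in units where Z = 1 and a0 = 1.

% Normalization check of R10(r) = 2 e^(-r) in units with Z = 1, a0 = 1.
R10 = @(r) 2*exp(-r);
q = integral(@(r) R10(r).^2 .* r.^2, 0, Inf);   % in older Octave use quadgk
fprintf('integral of R10^2 r^2 dr = %.6f\n', q); % prints 1.000000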
APPENDIX CC
The Angular Momentum Operators

Generalization of the quantum rules

The quantum rules given in Chapter 3 may be generalized to three dimensions. The position of a particle in three dimensions can be represented by a vector r, which extends from the origin to the particle, while the momentum of a particle moving in three-dimensional space is represented by a vector p, which points in the direction of the particle's motion. For a particle moving in three dimensions, the operator associated with the momentum, which we denote by p̂, is defined to be

\hat{\mathbf{p}} = -i\hbar \nabla,    (42)

where ∇ is the gradient operator discussed in Appendix AA. The gradient of a function is a vector that points in the direction in which the function changes most rapidly and has a magnitude equal to the rate of change of the function in that direction. It is the natural generalization of the concept of the derivative to three dimensions. The expression for the energy in three dimensions is

E = \frac{1}{2m} \mathbf{p}^2 + V(\mathbf{r}),    (43)

where p and r are the momentum and radius vectors. Substituting the momentum operator (42) into eq. (43) for the energy leads to the following Hamiltonian operator

\hat{H} = -\frac{\hbar^2}{2m} \nabla^2 + V(\mathbf{r}),    (44)

where ∇² is the Laplacian operator discussed in Appendix AA. Once one has constructed an operator O corresponding to a variable of a microscopic system, the wave function of the system and the possible values that can be obtained by measuring the variable are determined by forming the eigenvalue equation

O \psi(\mathbf{r}) = \lambda \psi(\mathbf{r}).    (45)

As for problems in one dimension, the values of λ for which there is a solution of eq. (45) satisfying the boundary conditions are the possible values that can be obtained in a measurement of the variable. The wave function ψ(r) describes the system when it is in a state corresponding to the eigenvalue λ.

Commutation relations

The operators used in quantum mechanics to represent physical variables satisfy certain algebraic relations. Recall that the commutator of two operators, A and B, was defined in the second section of Chapter 3 by the equation

[A, B] = AB - BA.    (46)

In this appendix, we shall discard the caret symbol (^) associated with operators, denoting the operator corresponding to a variable with the same symbol used to denote the variable itself. The operator corresponding to the x-component of the momentum, for instance, will be denoted simply as px. With this notation the commutation relation between x and px, which is given in the second section of Chapter 3, is written

[x, p_x] = i\hbar.    (47)

Other commutation relations can be obtained from this relation by making the cyclic replacements, x → y, y → z, z → x. This leads to the additional commutation relations

[y, p_y] = i\hbar, \qquad [z, p_z] = i\hbar.    (48)

We recall that the commutators that can be formed with one component of the position vector r and another component of the momentum operator p are equal to zero. For instance, we have

[x, p_y] = 0.    (49)

The commutators formed from two components of the position vector or two components of the momentum are also zero. As discussed in Appendix AA, the operator corresponding to the orbital angular momentum can be obtained by replacing the momentum p in the defining equation for the angular momentum,

\mathbf{l} = \mathbf{r} \times \mathbf{p},    (50)

with the operator (42) to obtain

\mathbf{l} = -i\hbar \, \mathbf{r} \times \nabla.    (51)

We shall now derive commutation relations for the angular momentum operators. These commutation relations will then be used to derive the spectrum of eigenvalues of the angular momentum. To make it easier for us to evaluate the commutators of the angular momentum operators, we first derive a few general properties of the commutation relations of operators, which we denote by A, B, and C. Using the definition of the commutator (46), we may write

[A, (B + C)] = A(B + C) - (B + C)A = (AB - BA) + (AC - CA).    (52)

The two terms appearing within parentheses on the right may be identified as the commutators [A, B] and [A, C]. We thus have

[A, (B + C)] = [A, B] + [A, C].    (53)

One may prove in a similar fashion that

[(A + B), C] = [A, C] + [B, C].    (54)

The commutation relations can thus be said to be linear. We next consider the commutator [A, BC]. Again, we use the definition of the commutator (46) to obtain

[A, BC] = ABC - BCA.    (55)

Subtracting and adding the term BAC after the first term on the right, we have

[A, BC] = ABC - BAC + BAC - BCA = (AB - BA)C + B(AC - CA).    (56)

Again identifying the two terms within parentheses on the right as the commutators [A, B] and [A, C], we have

[A, BC] = [A, B]C + B[A, C].    (57)

Similarly, one may prove that

[AB, C] = A[B, C] + [A, C]B.    (58)

These last two commutation relations can be described in simple terms. The commutator of a single operator with the product of two operators can be written as a sum of two terms involving the commutator of the single operator with each of the operators of the product. For each of these terms, the operator not appearing in the commutator is pulled to the front or the back to preserve the order of the operators within the product. In the first term on the right-hand side of eq. (57), the operator C is pulled to the back, while in the second term on the right, the operator B is pulled to the front. In the two terms of the resulting equation, B appears before C. Similarly, in the first term on the right-hand side of eq. (58), the operator A is pulled to the front so that it appears before B, while in the second term, B is pulled to the back so that it appears after A.

We now evaluate the commutator [lx, ly] involving the x- and y-components of the angular momentum operator l. Using the definition of the orbital angular momentum given by eq. (50), the x-component of the angular momentum operator can be seen to be lx = y pz − z py, and the y-component may be seen to be ly = z px − x pz. We may thus use eq. (53) to obtain

[l_x, l_y] = [l_x, z p_x - x p_z] = [(y p_z - z p_y), z p_x] - [(y p_z - z p_y), x p_z].    (59)

Using eq. (54), this becomes

[l_x, l_y] = [y p_z, z p_x] - [z p_y, z p_x] - [y p_z, x p_z] + [z p_y, x p_z].    (60)

We now note that of the commutators that can be formed from the operators on the right-hand side of this last equation, the commutators [x, px], [y, py] and [z, pz] are each equal to iħ. All other commutators are equal to zero. Omitting the second and third terms on the right-hand side of eq. (60), which do not contain operators having a nonzero commutator, the equation becomes

[l_x, l_y] = [y p_z, z p_x] + [z p_y, x p_z].    (61)

The commutators on the right-hand side of this equation may be evaluated as we have described following eqs. (57) and (58). For the first term on the right-hand side, we pull y to the front and px to the back, giving y[pz, z]px.
Similarly, for the second term on the right, we pull x toward the front and py toward the back, giving x[z, pz]py. Eq. (61) then becomes

[l_x, l_y] = y[p_z, z]p_x + x[z, p_z]p_y.    (62)

Like the commutator [x, px], the commutator [z, pz] is equal to iħ. The commutator [pz, z], for which the operators pz and z are interchanged, is equal to −iħ. We thus obtain

[l_x, l_y] = i\hbar (x p_y - y p_x).    (63)

The term within parentheses on the right may be identified as lz, and hence the equation may be written

[l_x, l_y] = i\hbar \, l_z.    (64)

The commutation relations (64) assume a more simple form if the angular momentum is measured in units of ħ. Commutation relations for the new angular momentum operators can be obtained by dividing eq. (64) by ħ² to obtain

[(l_x/\hbar), (l_y/\hbar)] = i (l_z/\hbar).    (65)

If the orbital angular momentum is measured in units of ħ, the angular momentum operators thus satisfy the commutation relations

[l_x, l_y] = i l_z.    (66)

The orbital angular momentum is then represented by the operator ħl. Other commutation relations can be obtained from eq. (66) by making the cyclic replacements, x → y, y → z and z → x. We consider now the commutation relation involving lz and the operator

\mathbf{l}^2 = l_x l_x + l_y l_y + l_z l_z.    (67)

Using eq. (53), we may write

[l_z, \mathbf{l}^2] = [l_z, l_x l_x] + [l_z, l_y l_y] + [l_z, l_z l_z].    (68)

We now use eq. (57) and take advantage of the fact that lz commutes with itself to write

[l_z, \mathbf{l}^2] = l_x [l_z, l_x] + [l_z, l_x] l_x + l_y [l_z, l_y] + [l_z, l_y] l_y.    (69)

We now note that the commutators in the first and second terms are in cyclic order, while the commutators in the third and fourth terms are not in cyclic order. The above equation thus becomes

[l_z, \mathbf{l}^2] = i (l_x l_y + l_y l_x - l_y l_x - l_x l_y) = 0.    (70)

We thus find that the operators l² and lz commute with each other. In quantum theory, commuting operators correspond to variables that can be accurately measured simultaneously. It is generally possible to find simultaneous eigenfunctions for such variables. The common eigenfunctions represent states of the system in which the variables corresponding to the operators have definite values. It is easy to find physical examples of these results. For the hydrogen atom, the states of the electron are described by the quantum numbers l and ml, corresponding to well-defined values of both l² and lz. In a magnetic field the magnetic moment and the angular momentum of an electron precess about the direction of the magnetic field, with the magnitude of the angular momentum and the projection of the angular momentum upon the direction of the magnetic field having constant values.

The definition of the angular momentum by eq. (50) does not apply to the spin. We shall require, though, that the components of the spin angular momentum satisfy commutation relations analogous to eq. (66). We have

[s_x, s_y] = i s_z.    (71)

It is thus convenient to regard the commutation relations between the components of the angular momentum operators as the definition of the angular momentum. Analogous commutation relations apply to the components of the orbital and spin angular momenta and to the components of the total angular momentum.
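A numerical illustration, added to this transcription: the spin-1/2 matrices s = σ/2 built from the standard Pauli matrices satisfy the commutation relation (71), and sz commutes with s², anticipating the general result derived below.

% Spin-1/2 check of eq. (71) and of [s_z, s^2] = 0 (angular momentum in units of hbar).
sx = [0 1; 1 0]/2;  sy = [0 -1i; 1i 0]/2;  sz = [1 0; 0 -1]/2;
comm = @(A,B) A*B - B*A;
disp(comm(sx,sy) - 1i*sz)     % zero matrix: [s_x, s_y] = i s_z
s2 = sx^2 + sy^2 + sz^2;      % equals (1/2)(3/2) times the identity
disp(comm(sz,s2))             % zero matrix: [s_z, s^2] = 0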
Spectrum of eigenvalues

We conclude this appendix by showing that the commutation relations of the angular momentum operators determine the spectrum of eigenvalues of these operators. Our arguments will be very general, applying to any angular momentum operator. Using the symbol j to denote an angular momentum operator, the commutation relations satisfied by the components of the angular momentum operators may be obtained by writing j in place of l in eq. (66) or j in place of s in eq. (71), giving

[j_x, j_y] = i j_z.    (72)

The symbol j denotes the angular momentum operator in units of ħ. As for the orbital angular momentum operators considered previously, the operator jz commutes with the operator

\mathbf{j}^2 = j_x j_x + j_y j_y + j_z j_z.    (73)

We have

[j_z, \mathbf{j}^2] = 0.    (74)

Eq. (74) can be derived as before from the commutation relation (72) and the other commutation relations obtained from this basic equation by making the cyclic replacements, jx → jy, jy → jz, and jz → jx. Since the operators j² and jz commute, they have a common set of eigenfunctions. Denoting a typical eigenvalue of j² by µ and an eigenvalue of jz by µz, we denote the simultaneous eigenfunctions of j² and jz by ψ(µ, µz). These functions satisfy the eigenvalue equations

\mathbf{j}^2 \psi(\mu, \mu_z) = \mu \, \psi(\mu, \mu_z)    (75)

and

j_z \psi(\mu, \mu_z) = \mu_z \, \psi(\mu, \mu_z).    (76)

In order to study the properties of the angular momentum eigenfunctions, we introduce new operators by the equations

j_+ = j_x + i j_y    (77)

and

j_- = j_x - i j_y.    (78)

Since j+ and j− are linear combinations of jx and jy, and since jx and jy commute with j², j+ and j− must also commute with j². One may easily confirm this result using the defining eqs. (77) and (78) together with eq. (54). In order to evaluate the commutation relation of jz with j+, we first use eqs. (77) and (53) to write the commutator as follows

[j_z, j_+] = [j_z, (j_x + i j_y)] = [j_z, j_x] + i [j_z, j_y].    (79)

The commutation relations satisfied by the components of the angular momentum may then be used to obtain

[j_z, j_+] = i j_y + j_x.    (80)

This last equality can be written

[j_z, j_+] = j_+.    (81)

The commutator of jz with j− may be evaluated in a similar way. We obtain

[j_z, j_-] = -j_-.    (82)

In order to evaluate the operator product j−j+, we first use the definitions (77) and (78) to write

j_- j_+ = (j_x - i j_y)(j_x + i j_y) = j_x^2 + j_y^2 + i [j_x, j_y] = \mathbf{j}^2 - j_z^2 + i [j_x, j_y].    (83)

We then use the commutation relations (72) to obtain

j_- j_+ = \mathbf{j}^2 - j_z^2 - j_z.    (84)

Similarly, the product j+j− may be evaluated, giving

j_+ j_- = \mathbf{j}^2 - j_z^2 + j_z.    (85)

Operating now on the eigenvalue equation (75) with j+ and using the fact that j+ and j² commute, we obtain

\mathbf{j}^2 j_+ \psi(\mu, \mu_z) = \mu \, j_+ \psi(\mu, \mu_z).    (86)

The function j+ψ(µ, µz) is thus also an eigenfunction of j² corresponding to the eigenvalue µ. Similarly, multiplying the eigenvalue equation (76) by the operator j+ gives

j_+ j_z \psi(\mu, \mu_z) = \mu_z \, j_+ \psi(\mu, \mu_z).    (87)

We may now use the definition of the commutator of two operators and eq. (81) to write

j_+ j_z = j_z j_+ - [j_z, j_+] = j_z j_+ - j_+.    (88)

Substituting this expression for j+jz into eq. (87) and bringing the term with j+ over to the right-hand side of the equation, we then obtain

j_z j_+ \psi(\mu, \mu_z) = (\mu_z + 1) \, j_+ \psi(\mu, \mu_z).    (89)

The function j+ψ(µ, µz) is thus an eigenfunction of jz corresponding to the eigenvalue µz + 1. So, operating on the function ψ(µ, µz) with j+ gives a new eigenfunction belonging to the same eigenvalue of j² but to the eigenvalue (µz + 1) of jz. By repeatedly operating with j+ on ψ(µ, µz), we can generate a whole series of eigenfunctions of jz belonging to the eigenvalues µz, µz + 1, µz + 2, . . . and all belonging to the eigenvalue µ of j². For this reason, j+ is called a step-up operator.
Equations similar to eqs. (86) and (89) may be derived by multiplying the eigenvalue equations (75) and (76) by j−. We have

\mathbf{j}^2 j_- \psi(\mu, \mu_z) = \mu \, j_- \psi(\mu, \mu_z)    (90)

j_z j_- \psi(\mu, \mu_z) = (\mu_z - 1) \, j_- \psi(\mu, \mu_z).    (91)

The function j−ψ(µ, µz) is an eigenfunction of j² corresponding to the eigenvalue µ and an eigenfunction of the operator jz corresponding to the eigenvalue (µz − 1). The operator j− may thus be thought of as a step-down operator.

We now determine the possible values of µ and µz. For a definite value of µ, there must be a limit to how large or how small µz can become. The eigenvalue µ gives the square of the length of the vector j, while µz is the projection of the vector j upon the z-axis. The projection of a vector upon the z-axis cannot be larger than the length of the vector itself. We denote the maximum eigenvalue of jz by j and the eigenfunction corresponding to the maximum eigenvalue by ψ(µ, j). Multiplying the function ψ(µ, j) by j+ must give zero

j_+ \psi(\mu, j) = 0,    (92)

for otherwise j+ψ(µ, j) would be an eigenfunction of jz corresponding to the eigenvalue j + 1. We note that setting j+ψ(µ, j) equal to zero gives a solution of eqs. (86) and (89). Operating on eq. (92) with j− gives

j_- j_+ \psi(\mu, j) = 0.    (93)

Using eq. (84), this equation can be written

(\mu - j^2 - j) \, \psi(\mu, j) = 0.    (94)

Since the function ψ(µ, j) cannot vanish at all points, it follows that

\mu = j(j+1).    (95)

Similarly, let (j − r) be the least eigenvalue of jz. Then it follows that

j_- \psi(\mu, j - r) = 0    (96)

and

j_+ j_- \psi(\mu, j - r) = 0.    (97)

Using eq. (85) then leads to the equation

[\mu - (j-r)^2 + (j-r)] \, \psi(\mu, j - r) = 0,    (98)

and we must have

\mu - (j-r)^2 + (j-r) = 0.    (99)

Substituting the value of µ given by eq. (95) into this last equation leads to the quadratic equation

r^2 - r(2j - 1) - 2j = 0,    (100)

which has only one positive root, r = 2j. Thus, the least eigenvalue of jz is equal to j − r = −j. This means that for a particular eigenvalue µ = j(j+1) of j², there are 2j + 1 eigenfunctions ψ(µ, m) of jz corresponding to the eigenvalues

m = j, \, j-1, \, \dots, \, -j+1, \, -j.    (101)

It is also clear from the above argument that 2j must be an integer, which means that the quantum number j of the angular momentum must be an integer or a half-integer. Using the commutation relations of the angular momentum operators, we have thus shown that the eigenvalues of j² are j(j+1), where j may be an integer or half-integer. For a particular value of j, the eigenvalues of jz are m = j, j−1, . . . , −j. Denoting the simultaneous eigenfunctions of j² and jz by the quantum numbers j and m, the eigenvalue equations become

\mathbf{j}^2 \psi(jm) = j(j+1) \, \psi(jm)    (102)

j_z \psi(jm) = m \, \psi(jm).    (103)

The operators j² and jz give the square of the angular momentum and the z-component of the angular momentum in units of ħ. The operators (ħj)² and ħjz, which represent the angular momentum in an absolute sense, have eigenvalues j(j+1)ħ² and mħ. The general results we have obtained for the angular momentum can be applied to the orbital and spin angular momenta. The operator corresponding to the square of the orbital angular momentum, which we have denoted previously by l², has eigenvalues l(l+1)ħ². For a given value of l, the z-component of the orbital angular momentum, which we denote by lz, has the values mlħ, where ml = −l, −l+1, . . . , l. The spin quantum number of the electron has the value s = 1/2, with the eigenvalues of the spin operator s² being (1/2)·(3/2)ħ² and the spin operator sz having eigenvalues ±(1/2)ħ.
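The following sketch, added here, illustrates the results of this appendix for j = 1 using the conventional matrix representation of jz and j+ in the basis of eigenfunctions ψ(jm) with m = 1, 0, −1. The square-root matrix elements of j+ are the standard normalization, which is not derived in this appendix.

% Matrices for j = 1 in the basis m = 1, 0, -1; check eq. (81): [j_z, j_+] = j_+.
j = 1; m = j:-1:-j;
jz = diag(m);
jp = zeros(2*j+1);
for k = 2:2*j+1
    mm = m(k);                              % j_+ raises m(k) to m(k) + 1
    jp(k-1,k) = sqrt(j*(j+1) - mm*(mm+1));  % standard matrix element of j_+
end
disp(jz*jp - jp*jz - jp)                    % zero matrix confirms eq. (81)
j2 = jp'*jp + jz^2 + jz;                    % j^2 = j_- j_+ + j_z^2 + j_z, from eq. (84)
disp(j2)                                    % j(j+1) = 2 times the identity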
APPENDIX DD
The Radial Equation for Hydrogen

The operators describing atomic electrons can generally be separated into radial and angular parts. As we have seen in Appendix AA, the angular part of the Laplacian operator is related to the angular momentum operator l². Eq. (11) expresses the Laplacian operator in terms of its radial and angular parts. We consider now the effect of the Laplacian operator (11) upon the wave function of the hydrogen atom. Using the form of the atomic wave function and the spherical harmonics introduced in Section 4.1.3, the wave function of hydrogen can be written

\psi(r, \theta, \phi) = \frac{P(r)}{r} Y_{l m_l}(\theta, \phi).    (104)

The spherical harmonic Ylml(θ, φ) is an eigenfunction of the angular momentum operator l² corresponding to the eigenvalue l(l+1)ħ². We consider first the effect of the first term on the right-hand side of eq. (11) upon the wave function (104). Taking the partial derivative with respect to r does not affect the spherical harmonic. So we must evaluate the result of operating successively upon the function P(r)/r with the partial derivative, then with r² and the partial derivative, and finally multiplying with 1/r². Operating upon P(r)/r with the partial derivative gives

\frac{\partial}{\partial r}\left( \frac{P}{r} \right) = \frac{\partial (r^{-1} P)}{\partial r} = \frac{1}{r}\frac{dP}{dr} - \frac{1}{r^2} P.    (105)

Multiplying this equation by r² and taking the partial derivative with respect to r then gives

\frac{\partial}{\partial r}\left( r^2 \frac{\partial}{\partial r} \frac{P}{r} \right) = r \frac{d^2 P}{dr^2}.    (106)

Finally, dividing by r², we obtain

\frac{1}{r^2}\frac{\partial}{\partial r}\left( r^2 \frac{\partial}{\partial r} \frac{P}{r} \right) = \frac{1}{r}\frac{d^2 P}{dr^2}.    (107)

Operating with the first term on the right-hand side of eq. (11) upon the wave function (104) thus gives

\frac{1}{r^2}\frac{\partial}{\partial r}\left( r^2 \frac{\partial}{\partial r} \right) \psi(r, \theta, \phi) = \frac{1}{r}\frac{d^2 P}{dr^2} Y_{l m_l}(\theta, \phi).    (108)

Similarly, operating with the second term on the right-hand side of eq. (11) upon the wave function (104) and taking advantage of the fact that Ylml(θ, φ) is an eigenfunction of l² corresponding to the eigenvalue l(l+1)ħ², we obtain

-\frac{\mathbf{l}^2}{\hbar^2 r^2} \psi(r, \theta, \phi) = -\frac{l(l+1)}{r^2} \frac{P(r)}{r} Y_{l m_l}(\theta, \phi).    (109)

The effect of the Laplacian operator (11) upon the hydrogen wave function (104) can then be evaluated using eqs. (108) and (109). This leads to the result

\nabla^2 \psi(r, \theta, \phi) = \frac{1}{r}\frac{d^2 P}{dr^2} Y_{l m_l}(\theta, \phi) - \frac{l(l+1)}{r^2} \frac{P(r)}{r} Y_{l m_l}(\theta, \phi).    (110)

The Schrödinger equation for the electron of a hydrogen atom is given by eq. (4.2). Substituting the hydrogen wave function (104) into the Schrödinger equation and using eq. (110) to evaluate the effect of the Laplacian operator upon the wave function, we obtain

-\frac{\hbar^2}{2m} \frac{1}{r}\frac{d^2 P}{dr^2} Y_{l m_l}(\theta, \phi) + \frac{\hbar^2}{2m} \frac{l(l+1)}{r^2} \frac{P(r)}{r} Y_{l m_l}(\theta, \phi) - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \frac{P(r)}{r} Y_{l m_l}(\theta, \phi) = E \frac{P(r)}{r} Y_{l m_l}(\theta, \phi).    (111)

The radial equation (4.5) is obtained from this last equation by deleting the factor (1/r)Ylml(θ, φ) from each term.

APPENDIX EE
Transition Probabilities for z-Polarized Light

We suppose that the electromagnetic radiation incident upon an atom is a superposition of plane waves. For each of these waves, the electric field can be written

\mathbf{E} = 2 \mathbf{E}_0 \sin(\mathbf{k} \cdot \mathbf{r} - \omega t).    (112)

The energy per unit volume of the radiation field associated with this monochromatic wave is

W = \epsilon_0 \mathbf{E}^2 = 4 \epsilon_0 E_0^2 \sin^2(\mathbf{k} \cdot \mathbf{r} - \omega t).

Since the time average of the sine squared function in this last equation is 1/2, the average energy per volume is

W_{av} = 2 \epsilon_0 E_0^2.    (113)

Using the representation of the sine function given by eq. (3.10), eq. (112) can be written

\mathbf{E} = -i \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + i \mathbf{E}_0 e^{-i(\mathbf{k} \cdot \mathbf{r} - \omega t)}.    (114)

For most applications, the coupling between the electrons and the radiation field is rather weak.
The interaction can then be described by the Hamiltonian

H_{int} = \mathbf{E} \cdot (-e \mathbf{r}),    (115)

where (−e r) is the dipole moment of the electron. We shall consider radiation for which the electric field vector E is directed along the z-axis. Using eq. (114), the interaction Hamiltonian can then be written

H_{int} = -i(-ez) E_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + i(-ez) E_0 e^{-i(\mathbf{k} \cdot \mathbf{r} - \omega t)}.    (116)

The wave function of a hydrogen-like ion exposed to a time-dependent radiation field may be described by eq. (4.17)

\psi(\mathbf{r}, t) = \sum_n c_n(t) \, \phi_n(\mathbf{r}) \, e^{-i E_n t/\hbar},    (117)

where the coefficients cn(t) depend on time. The wave functions φn are eigenfunctions of the stationary atomic Hamiltonian

H_0 = -\frac{\hbar^2}{2m} \nabla^2 - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r}.    (118)

For simplicity, we assume that the eigenvalues En are nondegenerate. In order to be in a position to calculate the probability that the atom makes a transition from a level i to a level j, we suppose that at time t = 0 the coefficient ci(0) is equal to one and all the other coefficients cj(0) are equal to zero. We wish to calculate the probability |cj(t)|² that at a later time t the atom is in the state j. Substituting eq. (117) into the time-dependent Schrödinger equation,

i\hbar \frac{\partial \psi(\mathbf{r}, t)}{\partial t} = H \psi(\mathbf{r}, t),    (119)

we obtain the following first-order differential equation for the coefficients cn(t)

\sum_n \left( i\hbar \frac{dc_n}{dt} + E_n c_n \right) \phi_n(\mathbf{r}) \, e^{-i E_n t/\hbar} = (H_0 + H_{int}) \sum_n c_n \phi_n(\mathbf{r}) \, e^{-i E_n t/\hbar}.    (120)

On the right-hand side of this equation, we have written the Hamiltonian H as the sum of a stationary term H0 and a dynamic term Hint corresponding to the interaction of the electron with an oscillating electromagnetic field. Since φn is an eigenfunction of H0 corresponding to the eigenvalue En, the second term on the left-hand side of the equation cancels with the first term on the right to give

i\hbar \sum_n \frac{dc_n}{dt} \phi_n(\mathbf{r}) \, e^{-i E_n t/\hbar} = H_{int} \sum_n c_n \phi_n(\mathbf{r}) \, e^{-i E_n t/\hbar}.    (121)

The assumption that Hint is small means that the coefficients cn(t) evolve slowly with time. It is thus reasonable to approximate the coefficients cn on the right side of this equation with their initial values. Since ci(0) = 1 and all the other coefficients are zero, we get

i\hbar \sum_n \frac{dc_n}{dt} \phi_n(\mathbf{r}) \, e^{-i E_n t/\hbar} = H_{int} \, \phi_i(\mathbf{r}) \, e^{-i E_i t/\hbar}.    (122)

We may now single out the term on the left-hand side corresponding to the level j by multiplying the equation through on the left by the function φj*(r) and integrating to obtain

i\hbar \sum_n \frac{dc_n}{dt} e^{-i E_n t/\hbar} \int \phi_j^*(\mathbf{r}) \phi_n(\mathbf{r}) \, dV = e^{-i E_i t/\hbar} \int \phi_j^*(\mathbf{r}) H_{int} \phi_i(\mathbf{r}) \, dV.    (123)

The eigenfunctions of H0 have the property that they form an orthogonal set of functions. This means that if n is not equal to j, the integral ∫φj*(r)φn(r)dV, which appears on the left, is equal to zero. For the case n = j, the functions can be normalized so that the integral is equal to one. Using this property of the wave functions, eq. (123) can be written

i\hbar \frac{dc_j}{dt} = e^{i(E_j - E_i)t/\hbar} \int \phi_j^*(\mathbf{r}) H_{int} \phi_i(\mathbf{r}) \, dV.    (124)

The factor (Ej − Ei)/ħ, which appears in the exponential term, may be identified as the angular frequency of the transition

\omega_{ij} = \frac{E_j - E_i}{\hbar}.    (125)

Multiplying eq. (124) through by −i dt/ħ and integrating from 0 to t, we obtain the following equation for the coefficient cj as a function of time

c_j(t) = -\frac{i}{\hbar} \int_0^t e^{i \omega_{ij} t'} \int \phi_j^*(\mathbf{r}) H_{int} \phi_i(\mathbf{r}) \, dV \, dt'.    (126)

In order to solve this last equation for cj, we must use the explicit form of the interaction Hamiltonian.
Substituting eq. (116) into eq. (126) and performing the integrations over t', we obtain

c_j(t) = -\left[ \frac{1 - e^{i(\omega_{ij} - \omega)t}}{\hbar(\omega_{ij} - \omega)} \right] i E_0 \int \phi_j^*(\mathbf{r})(-ez) e^{i\mathbf{k} \cdot \mathbf{r}} \phi_i(\mathbf{r}) \, dV + \left[ \frac{1 - e^{i(\omega_{ij} + \omega)t}}{\hbar(\omega_{ij} + \omega)} \right] i E_0 \int \phi_j^*(\mathbf{r})(-ez) e^{-i\mathbf{k} \cdot \mathbf{r}} \phi_i(\mathbf{r}) \, dV.    (127)

For the case Ej < Ei, the angular frequency ωij, which is given by (125), is negative, and the transition i → j corresponds to stimulated emission. When the frequency ω of the incident radiation is near −ωij, the denominator of the second term in eq. (127) will become very small, and the second term will be much larger than the first. It is usually true that for emission processes the first term may be neglected. Similarly, the first term in eq. (127) provides a good approximate description of absorption.

We shall consider stimulated emission in some detail. Factoring e^{i(ωij + ω)t/2} from the second term of eq. (127), we may write this contribution to cj as

c_j(t) = e^{i(\omega_{ij} + \omega)t/2} \left[ \frac{e^{i(\omega_{ij} + \omega)t/2} - e^{-i(\omega_{ij} + \omega)t/2}}{\hbar(\omega_{ij} + \omega)} \right] i E_0 \int \phi_j^*(\mathbf{r})(-ez) e^{-i\mathbf{k} \cdot \mathbf{r}} \phi_i(\mathbf{r}) \, dV.    (128)

The representation of the sine function in terms of exponentials given by eq. (3.10) may then be used to write eq. (128) in the following way

c_j(t) = e^{i(\omega_{ij} + \omega)t/2} \, \frac{\sin[(\omega_{ij} + \omega)t/2]}{\hbar[(\omega_{ij} + \omega)/2]} \, E_0 \int \phi_j^*(\mathbf{r})(-ez) e^{-i\mathbf{k} \cdot \mathbf{r}} \phi_i(\mathbf{r}) \, dV.    (129)

The transition probability per time is |cj(t)|²/t. Using eq. (129), the transition probability per time may be written

|c_j(t)|^2 / t = \frac{t}{\hbar^2} \frac{\sin^2[(\omega_{ij} + \omega)t/2]}{[(\omega_{ij} + \omega)t/2]^2} \, E_0^2 \left| \int \phi_j^*(\mathbf{r})(-ez) e^{-i\mathbf{k} \cdot \mathbf{r}} \phi_i(\mathbf{r}) \, dV \right|^2.    (130)

Eq. (130) gives the probability per time that radiation of a single frequency ω will be emitted. According to eq. (113), the energy per volume of the wave is equal to 2ε0E0². In order to be in a position to integrate over the entire spectrum of frequencies, we set this expression for the energy equal to the amount of energy of a continuous spectrum in the range between ω and ω + dω

2 \epsilon_0 E_0^2 = \rho(\omega) \, d\omega,    (131)

where, as before, ρ(ω) is the energy density per frequency range. Solving this equation for E0² gives

E_0^2 = \frac{1}{2\epsilon_0} \rho(\omega) \, d\omega.    (132)

We now substitute this expression for E0² into eq. (130) and integrate over a range of frequencies that includes the resonant frequency −ωij to obtain

|c_j(t)|^2 / t = \frac{1}{2\epsilon_0 \hbar^2} \left| \int \phi_j^*(\mathbf{r})(-ez) e^{-i\mathbf{k} \cdot \mathbf{r}} \phi_i(\mathbf{r}) \, dV \right|^2 \int_{\omega_1}^{\omega_2} \rho(\omega) \left[ \frac{\sin((\omega_{ij} + \omega)t/2)}{(\omega_{ij} + \omega)t/2} \right]^2 t \, d\omega.    (133)

The term occurring in the denominator of the integrand will be zero when the frequency ω is equal to −ωij. This frequency, which makes the largest contribution to the transition probability, will be denoted by ω*. Using eq. (125), we may write

\omega^* = -\omega_{ij} = \frac{E_i - E_j}{\hbar}.    (134)

For an emission process, Ei will be greater than Ej, and ω* will be positive. According to eq. (44) of Chapter 4, ω* is then equal to the transition frequency. Substituting ω* for −ωij in eq. (133), we get

|c_j(t)|^2 / t = \frac{1}{2\epsilon_0 \hbar^2} \left| \int \phi_j^*(\mathbf{r})(-ez) e^{-i\mathbf{k} \cdot \mathbf{r}} \phi_i(\mathbf{r}) \, dV \right|^2 \int_{\omega_1}^{\omega_2} \rho(\omega) \left[ \frac{\sin((\omega - \omega^*)t/2)}{(\omega - \omega^*)t/2} \right]^2 t \, d\omega.    (135)

The function within square brackets in this last equation is similar to the function occurring within square brackets in eq. (3.44), which is represented by the dotted line in Fig. 3.13. Both functions have well-defined maxima. The function within square brackets in eq. (135) has its maximum value for ω = ω*, and the function is zero when the frequency ω differs from ω* by an integral number of multiples of 2π/t

\omega - \omega^* = n \frac{2\pi}{t}.    (136)

For large values of t, the function within square brackets becomes very sharply peaked. The function ρ(ω) can then be approximated by its value at the transition frequency ω = ω* and brought outside the integral to give

|c_j(t)|^2 / t = \frac{1}{2\epsilon_0 \hbar^2} \rho(\omega^*) \left| \int \phi_j^*(\mathbf{r})(-ez) e^{-i\mathbf{k} \cdot \mathbf{r}} \phi_i(\mathbf{r}) \, dV \right|^2 \int_{\omega_1}^{\omega_2} \left[ \frac{\sin((\omega - \omega^*)t/2)}{(\omega - \omega^*)t/2} \right]^2 t \, d\omega.    (137)

In the limit of large t, the integral has the value 2π, and we obtain

|c_j(t)|^2 / t = \frac{\pi}{\epsilon_0 \hbar^2} \rho(\omega^*) \left| \int \phi_j^*(\mathbf{r})(-ez) e^{-i\mathbf{k} \cdot \mathbf{r}} \phi_i(\mathbf{r}) \, dV \right|^2.    (138)
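The large-t value of the integral can be checked numerically (this check is an addition to the transcription). With the substitution x = (ω − ω*)t/2 we have t dω = 2 dx, so the integral in eq. (137) becomes twice the integral of (sin x/x)² over the whole line, and that integral equals π.

% Check: the integral of (sin x / x)^2 over the real line is pi, so the
% t-integral in eq. (137) approaches 2*pi for large t.
x = linspace(1e-8, 500, 2000000);   % even integrand: integrate x > 0 and double
I = 2*trapz(x, (sin(x)./x).^2);     % approximately pi (tail beyond 500 is tiny)
fprintf('integral = %.4f, so t-integral = %.4f (2*pi = %.4f)\n', I, 2*I, 2*pi);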
In deriving this result, we have not made any assumptions concerning the wavelength of the light. A very useful approximation can be obtained by taking advantage of the fact that the size of the atom is much smaller than the wavelength of visible or even ultraviolet light. The wavelength of visible light is between 400 and 700 nm, while the wavelength of ultraviolet light is between 10 and 400 nm. By comparison, the size of an atom is about 0.1 nm. The dependence of the incident wave upon the spatial coordinates occurs through the factor e^{−ik·r} in eq. (138). Since the magnitude of the wave vector k is 2π/λ, k·r will not change appreciably over the size of the atom. It follows that we can approximate the exponential function by the first term in its Taylor series expansion

e^{-i\mathbf{k} \cdot \mathbf{r}} = 1 - i\mathbf{k} \cdot \mathbf{r} + \dots .    (139)

This is called the electric dipole approximation. Replacing the exponential function with 1 in eq. (138) and denoting the transition frequency by ω as in the text, we obtain

|c_j(t)|^2 / t = \frac{\pi}{\epsilon_0 \hbar^2} \rho(\omega) \left| \int \phi_j^*(\mathbf{r})(-ez) \phi_i(\mathbf{r}) \, dV \right|^2.    (140)

APPENDIX FF
Transitions with x- and y-Polarized Light

We would now like to calculate transition probabilities for light polarized in the x- and y-directions. For this purpose, we introduce linear combinations of the x- and y-coordinates having simple transition integrals. We define r+ = x + iy and r− = x − iy. Using eq. (4.3) and Euler's formula, one may derive the following equations for r+ and r−

r_+ = r \sin\theta \, e^{i\phi}    (141)

r_- = r \sin\theta \, e^{-i\phi}.    (142)

The significance of the variables r+ and r− may be understood in physical terms. If we were to multiply r+ by the time dependence e^{−iωt} corresponding to an oscillating electric field, we would obtain

r_+(t) = r \sin\theta \, e^{i(\phi - \omega t)}.    (143)

r+(t) has the same dependence upon the polar angle and time as the components of a polarization vector rotating about the z-axis. We may thus associate the variable r+ with circularly polarized light. Similarly, r− may be associated with circularly polarized light for which the polarization vector rotates about the z-axis in the opposite direction. We now use the variables r+ and r− to evaluate transition integrals for x- and y-polarized light.

Example
Again for the 2p → 1s transition of the hydrogen atom, calculate the transition integrals for x- and y-polarized light.

Solution
For the transition 2p−1 to 1s0, the integral of r+ is

\int \phi_{1s0}^* \, r_+ \, \phi_{2p-1} \, dV = R_i \int_0^{\pi} \frac{1}{\sqrt{2}} \cdot \sin\theta \cdot \frac{\sqrt{3}}{2}\sin\theta \cdot \sin\theta \, d\theta \int_0^{2\pi} \frac{1}{\sqrt{2\pi}} e^{i\phi} \cdot \frac{1}{\sqrt{2\pi}} e^{-i\phi} \, d\phi,    (144)

where Ri is the radial integral given by eq. (4.25). The integration over φ gives 1, and the θ integration can be written

\sqrt{\frac{1}{2}} \cdot \frac{\sqrt{3}}{2} \int_0^{\pi} \sin^3\theta \, d\theta = \sqrt{\frac{1}{2}} \cdot \frac{\sqrt{3}}{2} \cdot \frac{4}{3} = \sqrt{\frac{2}{3}}.    (145)

The transition integral for the operator r+ is thus

\int \phi_{1s0}^* \, r_+ \, \phi_{2p-1} \, dV = \sqrt{\frac{2}{3}} \, R_i.    (146)

The integrals of r+ for the transitions 2p0 → 1s0 and 2p+1 → 1s0 may be shown to be equal to zero. The integral of r− for the transition 2p+1 to 1s0 can be shown to be

\int \phi_{1s0}^* \, r_- \, \phi_{2p+1} \, dV = -\sqrt{\frac{2}{3}} \, R_i,    (147)

and the other integrals of r− are zero.
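The angular factors in the example can be verified numerically; this check is an addition to the transcription.

% Check of the angular integrations in eqs. (144)-(146).
Ith = sqrt(1/2)*(sqrt(3)/2)*integral(@(t) sin(t).^3, 0, pi);   % eq. (145)
Iph = integral(@(p) exp(1i*p).*exp(-1i*p)/(2*pi), 0, 2*pi);    % phi factors of eq. (144)
fprintf('theta integral = %.6f (sqrt(2/3) = %.6f)\n', Ith, sqrt(2/3));
fprintf('phi integral   = %.6f\n', real(Iph));                 % equals 1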
The defining equations for r+ and r− can be solved for x and y to obtain

x = \frac{1}{2}(r_+ + r_-)    (148)

y = \frac{1}{2i}(r_+ - r_-).    (149)

These equations can be used to evaluate the transition integrals for x- and y-polarized light. The transition rates for x- and y-polarized light can be calculated using eqs. (4.20) and (4.21) with x and y in place of z. The calculation of these transition rates is left as an exercise. (See Problem 13.)

APPENDIX GG
Derivation of the Distribution Laws

The distribution laws are derived by maximizing the statistical weight for a perfect gas with respect to the occupation numbers nr. In the following, we shall find it more convenient to require that the natural log of the weight be a maximum rather than the weight itself. Since the natural logarithm of the weight, ln W, increases monotonically with W, the weight will have a maximum when the logarithm of the weight has its maximum. We thus require that the following condition be satisfied

\delta \ln W = 0    (150)

for small variations of the occupation numbers consistent with the equations

N = \sum_r n_r    (151)

and

E = \sum_r n_r \epsilon_r.    (152)

The condition (150) for changes in the occupation numbers consistent with eqs. (151) and (152) may be shown to be equivalent to the condition

\delta \left[ \ln W - \alpha \sum_{r=1}^{\infty} n_r - \beta \sum_{r=1}^{\infty} \epsilon_r n_r \right] = 0    (153)

for all changes in the occupation numbers. This variational condition may be written

\delta \ln W - \alpha \sum_{r=1}^{\infty} \delta n_r - \beta \sum_{r=1}^{\infty} \epsilon_r \, \delta n_r = 0.    (154)

This equation may be used to derive the distribution laws for classical and quantum statistics. Since the expression for the weight W depends upon the particular form of statistics, we must consider each kind of statistics separately.

Maxwell-Boltzmann Statistics

For a perfect classical gas, the statistical weight is given by eq. (7.6). Using the explicit form of this equation with the product notation, the natural logarithm of W(n1, n2, . . . , nr, . . . ) may be written

\ln W = \ln N! + \sum_{r=1}^{\infty} (n_r \ln g_r - \ln n_r!).    (155)

For a macroscopic sample, the occupation numbers nr are very large, and the natural logarithm of the factorial ln nr! may be approximated by Stirling's formula

\ln n! = n(\ln n - 1), \quad \text{for large } n.    (156)

Eq. (155) then becomes

\ln W = \ln N! + \sum_{r=1}^{\infty} (n_r \ln g_r - n_r \ln n_r + n_r).    (157)

Using eq. (157), the change in ln W due to changes in the occupation numbers δnr may be written

\delta \ln W = \sum_{r=1}^{\infty} (\ln g_r - \ln n_r) \, \delta n_r.    (158)

Substituting this last equation into the variational condition (154), we obtain the condition

\sum_{r=1}^{\infty} (\ln g_r - \ln n_r - \alpha - \beta \epsilon_r) \, \delta n_r = 0,

which can be true for all variations δnr only if the factors appearing within parentheses are equal to zero. We thus have

\ln g_r - \ln n_r - \alpha - \beta \epsilon_r = 0.

This last condition can be written

\ln \frac{n_r}{g_r} = -\alpha - \beta \epsilon_r.    (159)

Taking the exponential of each side of eq. (159), we obtain finally

\frac{n_r}{g_r} = e^{-\alpha - \beta \epsilon_r}.    (160)

The distribution law (160) may be cast into a more convenient form by expressing the constant α in terms of another constant Z by the equation

e^{-\alpha} = \frac{N}{Z}.    (161)

We then have

\frac{n_r}{g_r} = \frac{N}{Z} e^{-\beta \epsilon_r}.    (162)

As shown in the book by McGervey, which is cited at the end of Chapter 7, the constant β is equal to 1/kT, with the constant k being called the Boltzmann constant. We thus obtain

\frac{n_r}{g_r} = \frac{N}{Z} e^{-\epsilon_r / kT}.    (163)

This last equation is known as the Maxwell-Boltzmann distribution law.
Bose-Einstein Statistics

The statistical weight for Bose-Einstein statistics is given by eq. (7.54). Using this formula, the natural logarithm of W(n1, n2, . . . , nr, . . . ) may be written

\ln W = \sum_{r=1}^{\infty} \left[ \ln(n_r + g_r - 1)! - \ln n_r! - \ln(g_r - 1)! \right].    (164)

Stirling's formula (156) may again be used to evaluate the first two natural logarithms, and we obtain

\ln W = \sum_{r=1}^{\infty} \left[ (n_r + g_r - 1) \ln(n_r + g_r - 1) - n_r \ln n_r - (g_r - 1) - \ln(g_r - 1)! \right].    (165)

Using eq. (165), the change in ln W due to changes in the occupation numbers δnr may be written

\delta \ln W = \sum_{r=1}^{\infty} \left[ \ln(n_r + g_r - 1) - \ln n_r \right] \delta n_r.    (166)

Since nr is much larger than one, the number −1 in the first term may be omitted, and this last equation becomes

\delta \ln W = \sum_{r=1}^{\infty} \left[ \ln(n_r + g_r) - \ln n_r \right] \delta n_r.    (167)

Substituting eq. (167) into the variational condition (154), we obtain

\sum_{r=1}^{\infty} \left[ \ln(n_r + g_r) - \ln n_r - \alpha - \beta \epsilon_r \right] \delta n_r = 0.    (168)

Again setting the factor multiplying δnr equal to zero, we get

\ln(n_r + g_r) - \ln n_r - \alpha - \beta \epsilon_r = 0.    (169)

This last equation can be written

\ln \frac{n_r}{n_r + g_r} = -\alpha - \beta \epsilon_r.    (170)

Taking the exponential of each side of eq. (170) and collecting together the terms depending upon nr, we obtain

n_r \left( 1 - e^{-\alpha - \beta \epsilon_r} \right) = g_r \, e^{-\alpha - \beta \epsilon_r}.    (171)

We may again take β = 1/kT, and this equation may be written

\frac{n_r}{g_r} = \frac{1}{e^{\alpha} e^{\epsilon_r / kT} - 1}.    (172)

Eq. (172) is known as the Bose-Einstein distribution law.

Fermi-Dirac Statistics

The statistical weight for Fermi-Dirac statistics is given by eq. (7.55). Using this formula, the natural logarithm of W(n1, n2, . . . , nr, . . . ) may be written

\ln W = \sum_{r=1}^{\infty} \left[ \ln g_r! - \ln n_r! - \ln(g_r - n_r)! \right].    (173)

Again using Stirling's formula (156) to evaluate the last two terms in the summation, we obtain

\ln W = \sum_{r=1}^{\infty} \left[ \ln g_r! - n_r \ln n_r - (g_r - n_r) \ln(g_r - n_r) + g_r \right].    (174)

Using eq. (174), the change in ln W due to changes in the occupation numbers δnr may be written

\delta \ln W = \sum_{r=1}^{\infty} \left[ \ln(g_r - n_r) - \ln n_r \right] \delta n_r.    (175)

As before, we substitute eq. (175) into the variational condition (154) to obtain

\sum_{r=1}^{\infty} \left[ \ln(g_r - n_r) - \ln n_r - \alpha - \beta \epsilon_r \right] \delta n_r = 0.    (176)

The factor multiplying δnr may again be set equal to zero to give the following equation

\ln \frac{n_r}{g_r - n_r} = -\alpha - \beta \epsilon_r.    (177)

Taking the exponential of each side of eq. (177) and collecting together the terms depending upon nr, we obtain

n_r \left( 1 + e^{-\alpha - \beta \epsilon_r} \right) = g_r \, e^{-\alpha - \beta \epsilon_r}.    (178)

Again setting β = 1/kT, this equation may be written

\frac{n_r}{g_r} = \frac{1}{e^{\alpha} e^{\epsilon_r / kT} + 1},    (179)

which is known as the Fermi-Dirac distribution law.
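To compare the three results at a glance, the sketch below (added to this transcription) plots the mean occupation per state, nr/gr, as a function of εr/kT; the factor e^α (and the factor N/Z for Maxwell-Boltzmann) is set to one purely for illustration. The Bose-Einstein curve lies above the Maxwell-Boltzmann curve, the Fermi-Dirac curve lies below it, and all three agree when εr is much larger than kT.

% Mean occupation per state for the three distribution laws, with e^alpha = 1
% and N/Z = 1 chosen purely for illustration.
x  = linspace(0.5, 6, 200);      % x = epsilon_r / kT
mb = exp(-x);                    % Maxwell-Boltzmann, eq. (163)
be = 1./(exp(x) - 1);            % Bose-Einstein, eq. (172)
fd = 1./(exp(x) + 1);            % Fermi-Dirac, eq. (179)
semilogy(x, mb, x, be, x, fd);
legend('Maxwell-Boltzmann', 'Bose-Einstein', 'Fermi-Dirac');
xlabel('\epsilon_r / kT'); ylabel('n_r / g_r');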
APPENDIX HH
Derivation of Bloch's Theorem

The wave function of an electron moving in a periodic potential is a solution of the Schrödinger equation, which is given by eq. (8.41). Evaluating the Schrödinger equation at the coordinate point r + l, we obtain

\left[ -\frac{\hbar^2}{2m} \nabla^2 + V(\mathbf{r} + \mathbf{l}) \right] \psi(\mathbf{r} + \mathbf{l}) = E \psi(\mathbf{r} + \mathbf{l}).    (180)

We may use eq. (8.42) to replace the potential energy term in this equation with its value at the point r, giving

\left[ -\frac{\hbar^2}{2m} \nabla^2 + V(\mathbf{r}) \right] \psi(\mathbf{r} + \mathbf{l}) = E \psi(\mathbf{r} + \mathbf{l}).    (181)

The functions ψ(r) and ψ(r + l) are thus both solutions of the Schrödinger equation corresponding to the energy E. If the energy eigenvalue E is nondegenerate, the function ψ(r + l), which is obtained from ψ(r) by a displacement by a lattice vector l, must be proportional to ψ(r). This is true for any l. We consider first the function ψ(r + a1), which corresponds to a single step in the direction a1. For this function, the appropriate relation of proportionality can be written

\psi(\mathbf{r} + \mathbf{a}_1) = \lambda_1 \psi(\mathbf{r}).    (182)

Since the functions ψ(r + a1) and ψ(r) are both normalized, we must have

|\lambda_1|^2 = 1.    (183)

We can thus write λ1 in the form

\lambda_1 = e^{i k_1},    (184)

where k1 is a real number. Eq. (182) then becomes

\psi(\mathbf{r} + \mathbf{a}_1) = e^{i k_1} \psi(\mathbf{r}).    (185)

Similar equations can be derived for displacements in the a2 and a3 directions

\psi(\mathbf{r} + \mathbf{a}_2) = e^{i k_2} \psi(\mathbf{r}), \qquad \psi(\mathbf{r} + \mathbf{a}_3) = e^{i k_3} \psi(\mathbf{r}).    (186)

The effect of a general translation can be obtained by applying eqs. (185) and (186) successively for translations in the a1, a2 and a3 directions

\psi(\mathbf{r} + \mathbf{l}) = \psi(\mathbf{r} + l_1 \mathbf{a}_1 + l_2 \mathbf{a}_2 + l_3 \mathbf{a}_3) = e^{i k_1} \psi(\mathbf{r} + (l_1 - 1)\mathbf{a}_1 + l_2 \mathbf{a}_2 + l_3 \mathbf{a}_3) = e^{i k_1 l_1} \psi(\mathbf{r} + l_2 \mathbf{a}_2 + l_3 \mathbf{a}_3) = e^{i(k_1 l_1 + k_2 l_2 + k_3 l_3)} \psi(\mathbf{r}).    (187)

To express this result in more general terms, we define a wave vector k

\mathbf{k} = k_1 \frac{\mathbf{b}_1}{2\pi} + k_2 \frac{\mathbf{b}_2}{2\pi} + k_3 \frac{\mathbf{b}_3}{2\pi},    (188)

where b1, b2 and b3 are the reciprocal vectors corresponding to the unit vectors a1, a2 and a3. Using eq. (8.1) and also eq. (8.22), eq. (187) can be written

\psi(\mathbf{r} + \mathbf{l}) = e^{i \mathbf{k} \cdot \mathbf{l}} \psi(\mathbf{r}),    (189)

which is a mathematical expression for the theorem. The proof of the theorem depends upon the potential energy being periodic. A more general proof of Bloch's theorem, including the case for which the energy E is degenerate, can be found in the book by Ziman, which is cited at the end of Chapter 8.

APPENDIX II
The Band Gap

To find the form of the wave function and the energy near the zone boundary, we consider the Schrödinger equation for the diffracted electron

\left[ -\frac{\hbar^2}{2m} \nabla^2 + V(\mathbf{r}) \right] \psi_{\mathbf{k}} = E(\mathbf{k}) \psi_{\mathbf{k}}.    (190)

According to eqs. (8.63) and (8.58), the wave function and the potential energy may be expressed in terms of plane waves as follows

\psi_{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{g}'} \alpha_{\mathbf{k}-\mathbf{g}'} \, e^{i(\mathbf{k} - \mathbf{g}') \cdot \mathbf{r}}    (191)

and

V(\mathbf{r}) = \sum_{\mathbf{g}''} V_{\mathbf{g}''} \, e^{i \mathbf{g}'' \cdot \mathbf{r}},    (192)

where the summations over g' and g'' extend over the vectors of the reciprocal lattice. Substituting these equations into eq. (190) and using eq. (8.63), we get

\sum_{\mathbf{g}'} E^0(\mathbf{k} - \mathbf{g}') \, \alpha_{\mathbf{k}-\mathbf{g}'} \, e^{i(\mathbf{k} - \mathbf{g}') \cdot \mathbf{r}} + \sum_{\mathbf{g}', \mathbf{g}''} V_{\mathbf{g}''} \, \alpha_{\mathbf{k}-\mathbf{g}'} \, e^{i(\mathbf{k} - \mathbf{g}' + \mathbf{g}'') \cdot \mathbf{r}} = E(\mathbf{k}) \sum_{\mathbf{g}'} \alpha_{\mathbf{k}-\mathbf{g}'} \, e^{i(\mathbf{k} - \mathbf{g}') \cdot \mathbf{r}},    (193)

where E0(k − g') is the kinetic energy of a free electron with wave vector k − g'. The term on the right-hand side of this last equation may be grouped together with the first term on the left-hand side, giving

\sum_{\mathbf{g}'} \left[ E^0(\mathbf{k} - \mathbf{g}') - E(\mathbf{k}) \right] \alpha_{\mathbf{k}-\mathbf{g}'} \, e^{i(\mathbf{k} - \mathbf{g}') \cdot \mathbf{r}} + \sum_{\mathbf{g}', \mathbf{g}''} V_{\mathbf{g}''} \, \alpha_{\mathbf{k}-\mathbf{g}'} \, e^{i(\mathbf{k} - \mathbf{g}' + \mathbf{g}'') \cdot \mathbf{r}} = 0.    (194)

Multiplying eq. (194) through from the left by the function

\psi_{\mathbf{k}}^*(\mathbf{r}) = \frac{1}{V^{1/2}} e^{-i \mathbf{k} \cdot \mathbf{r}},    (195)

integrating over all space, and using eq. (8.56), we obtain

\sum_{\mathbf{g}'} \left[ E^0(\mathbf{k} - \mathbf{g}') - E(\mathbf{k}) \right] \alpha_{\mathbf{k}-\mathbf{g}'} \, \delta_{\mathbf{k}, \mathbf{k}-\mathbf{g}'} + \sum_{\mathbf{g}', \mathbf{g}''} V_{\mathbf{g}''} \, \alpha_{\mathbf{k}-\mathbf{g}'} \, \delta_{\mathbf{k}, \mathbf{k}-\mathbf{g}'+\mathbf{g}''} = 0.    (196)

The Kronecker delta function δ_{k,k−g'} occurring in the first term is equal to one if k = k − g' and is otherwise equal to zero. This Kronecker delta thus has the effect of reducing the first summation to a single term for which g' = 0. Similarly, the Kronecker delta in the second summation is one if g' = g'', being zero otherwise, and reduces the double summation to a single summation. We have

\left[ E^0(\mathbf{k}) - E(\mathbf{k}) \right] \alpha_{\mathbf{k}} + \sum_{\mathbf{g}'} V_{\mathbf{g}'} \, \alpha_{\mathbf{k}-\mathbf{g}'} = 0.    (197)

As illustrated in Fig. 8.22, the crystal field causes the wave function with wave vector k to interact with the state with wave vector k − g. For a particular reciprocal lattice vector g, we thus ignore all coefficients except αk and αk−g. Eq. (197) then becomes

\left[ E^0(\mathbf{k}) - E(\mathbf{k}) \right] \alpha_{\mathbf{k}} + V_0 \, \alpha_{\mathbf{k}} + V_{\mathbf{g}} \, \alpha_{\mathbf{k}-\mathbf{g}} = 0.    (198)

We note that the Fourier coefficient V0 corresponds to a constant term in the potential energy V(r) and may thus be taken to be zero. This eliminates the second term in the above equation, and we obtain

\left[ E^0(\mathbf{k}) - E(\mathbf{k}) \right] \alpha_{\mathbf{k}} + V_{\mathbf{g}} \, \alpha_{\mathbf{k}-\mathbf{g}} = 0.    (199)
APPENDIX JJ
The Band Gap

To find the form of the wave function and the energy near the zone boundary, we consider the Schrödinger equation for the diffracted electron

\left[ -\frac{\hbar^2}{2m} \nabla^2 + V(r) \right] ψ_k = E(k) ψ_k. (190)

According to eqs. (8.63) and (8.58), the wave function and the potential energy may be expressed in terms of plane waves as follows

ψ_k(r) = \sum_{g'} α_{k-g'} e^{i(k - g') · r}, (191)

and

V(r) = \sum_{g''} V_{g''} e^{i g'' · r}, (192)

where the summations over g' and g'' extend over the vectors of the reciprocal lattice. Substituting these equations into eq. (190) and using eq. (8.63), we get

\sum_{g'} E^0(k - g') α_{k-g'} e^{i(k - g') · r} + \sum_{g', g''} V_{g''} α_{k-g'} e^{i(k - g' + g'') · r} = E(k) \sum_{g'} α_{k-g'} e^{i(k - g') · r}, (193)

where E^0(k - g') is the kinetic energy of a free electron with wave vector k - g'. The term on the right-hand side of this last equation may be grouped together with the first term on the left-hand side, giving

\sum_{g'} \left[ E^0(k - g') - E(k) \right] α_{k-g'} e^{i(k - g') · r} + \sum_{g', g''} V_{g''} α_{k-g'} e^{i(k - g' + g'') · r} = 0. (194)

Multiplying eq. (194) through from the left by the function

ψ_k^*(r) = \frac{1}{V^{1/2}} e^{-i k · r}, (195)

integrating over all space, and using eq. (8.56), we obtain

\sum_{g'} \left[ E^0(k - g') - E(k) \right] α_{k-g'} δ_{k, k-g'} + \sum_{g', g''} V_{g''} α_{k-g'} δ_{k, k-g'+g''} = 0. (196)

The Kronecker delta δ_{k,k-g'} occurring in the first term is equal to one if k = k - g' and is otherwise equal to zero. This Kronecker delta thus has the effect of reducing the first summation to a single term for which g' = 0. Similarly, the Kronecker delta in the second summation is one if g' = g'', being zero otherwise, and reduces the double summation to a single summation. We have

\left[ E^0(k) - E(k) \right] α_k + \sum_{g'} V_{g'} α_{k-g'} = 0. (197)

As illustrated in Fig. 8.22, the crystal field causes the wave function with wave vector k to interact with the state with wave vector k - g. For a particular reciprocal lattice vector g, we thus ignore all coefficients except α_k and α_{k-g}. Eq. (197) then becomes

\left[ E^0(k) - E(k) \right] α_k + V_0 α_k + V_g α_{k-g} = 0. (198)

We note that the Fourier coefficient V_0 corresponds to a constant term in the potential energy V(r) and may thus be taken to be zero. This eliminates the second term in the above equation, and we obtain

\left[ E^0(k) - E(k) \right] α_k + V_g α_{k-g} = 0. (199)

A second equation for the coefficients α_k and α_{k-g} can be obtained by multiplying eq. (194) through from the left by the function

ψ_{k-g}^*(r) = \frac{1}{V^{1/2}} e^{-i(k - g) · r}, (200)

and integrating over all space as before. We get

\left[ E^0(k - g) - E(k) \right] α_{k-g} + \sum_{g'} V_{g'-g} α_{k-g'} = 0. (201)

Limiting ourselves again to the terms depending upon the coefficients α_k and α_{k-g}, we note that the term in the sum for which g' = g vanishes, since we have supposed that the Fourier coefficient V_0 is equal to zero. Setting g' = 0 in the above summation leads to the equation

V_{-g} α_k + \left[ E^0(k - g) - E(k) \right] α_{k-g} = 0. (202)

The Fourier coefficient V_{-g} appearing in eq. (202) is equal to the coefficient V_g^*, which appears in the Fourier expansion of V(r)^*.

A trivial solution of eqs. (199) and (202) can be obtained by taking the coefficients α_k and α_{k-g} equal to zero. If the determinant of eqs. (199) and (202) is not equal to zero, this is the only solution of the equations. In order to find a physically meaningful description of the diffracted electron, we thus set the determinant of the coefficients equal to zero. We have

\begin{vmatrix} E^0(k) - E(k) & V_g \\ V_g^* & E^0(k - g) - E(k) \end{vmatrix} = 0. (203)

At the zone boundary, the two free-electron states e^{i k · r} and e^{i(k - g) · r} have the same energy, E^0(k) = E^0(k - g). If we denote this common value by E^0, the quadratic equation resulting from eq. (203) can be written

\left[ E(k) - E^0 \right]^2 - |V_g|^2 = 0. (204)

Eq. (204) has two solutions

E(k)_± = E^0 ± |V_g|. (205)

The interaction of the two free-electron states with wave vectors k and k - g thus causes a discontinuity in the energy at the zone boundary. The magnitude of the discontinuity depends upon the Fourier coefficients V_g which occur in the expansion of the potential energy. An expression for the Fourier coefficients can be obtained by applying eq. (8.26) to the periodic function V(r), which gives

V_g = \frac{1}{v_{cell}} \int V(r) e^{-i g · r} \, dV. (206)

We now divide eq. (206) into real and imaginary parts by using Euler's formula

V_g = \frac{1}{v_{cell}} \int V(r) \cos(g · r) \, dV - \frac{i}{v_{cell}} \int V(r) \sin(g · r) \, dV. (207)

Most of the common three-dimensional lattices are symmetric with respect to the inversion r → -r. The body- and face-centered cubic structures and the hexagonal close-packed structure described in Chapter 8 are invariant with respect to inversion provided that a suitable choice is made of the origin. If the potential V(r) is symmetric with respect to inversion, the second integral in eq. (207) vanishes, and the equation for the Fourier coefficients becomes

V_g = \frac{1}{v_{cell}} \int V(r) \cos(g · r) \, dV. (208)

Since the electrons are attracted to the ion cores, the potential energy function V(r) is negative in the neighborhood of each atom. It thus follows from eq. (208) that the Fourier coefficients V_g are negative real numbers.

As discussed in Chapter 8, the crystal field mixes the two free-electron states e^{i k · r} and e^{i(k - g) · r} to produce states having energies E^- and E^+. We can derive a condition for the coefficients of the lower state by substituting the value of E^- given by eq. (205) into eq. (199) to obtain

|V_g| α_k + V_g α_{k-g} = 0. (209)

Since V_g is negative, |V_g| is equal to -V_g. Using this result together with eq. (209), one may readily show that the coefficient α_{k-g} is equal to the coefficient α_k. Similarly, we can derive a condition for the coefficients of the upper state by substituting the value of E^+ given by eq. (205) into eq. (199) and using the relation |V_g| = -V_g to obtain

V_g α_k + V_g α_{k-g} = 0. (210)

This equation implies that for the upper state the mixing coefficient α_{k-g} is equal to -α_k. The spatial form of the wave functions for the lower and upper states is discussed in the text.
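The two-level problem posed by eqs. (199), (202) and (203) can be explored numerically. The following Octave/MATLAB sketch, which is not part of the original appendix, diagonalizes the 2 x 2 matrix of the determinant condition along a line of k values in one dimension; the units and the values of g and V_g are illustrative assumptions.

% The two-level model of eq. (203) in one dimension, arbitrary units.
hbar = 1; m = 1;                     % assumed units
g = 2*pi;                            % reciprocal lattice vector for a = 1
Vg = -0.5;                           % Fourier coefficient, negative and real
E0 = @(k) hbar^2 * k.^2 / (2*m);     % free-electron kinetic energy
kvals = linspace(0.6*g/2, 1.4*g/2, 201);
E = zeros(2, length(kvals));
for j = 1:length(kvals)
  kj = kvals(j);
  H = [E0(kj), Vg; Vg, E0(kj - g)];  % matrix of the condition (203)
  E(:, j) = sort(eig(H));
end
plot(kvals, E(1,:), kvals, E(2,:))
xlabel('k'), ylabel('E(k)')

At k = g/2 the two eigenvalues are E^0 - |V_g| and E^0 + |V_g|, in agreement with eq. (205), so the plotted branches are separated there by the band gap 2|V_g|.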
APPENDIX KK
Vector Spaces and Matrices

A vector space is defined to be a set of elements called vectors for which vector addition and scalar multiplication are defined. Vector addition assigns to every pair of vectors x and y a sum x + y in such a way that (1) addition is commutative, x + y = y + x, (2) addition is associative, x + (y + z) = (x + y) + z, (3) there is in the vector space a unique vector 0 (called the origin) such that x + 0 = x for every vector x, and (4) to every vector x in the space there corresponds a unique vector -x such that x + (-x) = 0. The multiplication of a scalar α times a vector x assigns to every pair, α and x, a vector αx in the vector space called the product of α and x. Scalar-vector multiplication is such that (1) multiplication by scalars is associative, α(βx) = (αβ)x, (2) 1x = x for every vector x, (3) multiplication by scalars is distributive with respect to vector addition, α(x + y) = αx + αy, and (4) multiplication by scalars is distributive with respect to scalar addition, (α + β)x = αx + βx.

An elementary example of a vector space is the set of ordered sets of n numbers

x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}

with vector addition and scalar multiplication defined by the equations

x + y = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{pmatrix}, \qquad αx = \begin{pmatrix} αx_1 \\ αx_2 \\ \vdots \\ αx_n \end{pmatrix}.

If the components of the vectors in this space are complex numbers, the vector space is denoted C^n. The vector space consisting of ordered sets of real numbers is denoted R^n. Another example of a vector space is the set of polynomials p_n(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n with complex coefficients a_n. The addition of two polynomials and the multiplication of a polynomial by a complex number are defined in the ordinary way.

A finite set {x_i} of vectors is said to be linearly dependent if there is a corresponding set {α_i} of numbers, not all zero, such that

\sum_i α_i x_i = 0,

where the zero on the right-hand side of the equation corresponds to the zero vector. If, on the other hand, \sum_i α_i x_i = 0 implies that α_i = 0 for each i, the set {x_i} is linearly independent. To illustrate the idea of linear dependence, we consider the following equation involving two vectors

α_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + α_2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.

Using the rules we have given previously for scalar multiplication and vector addition in C^n, we can combine the two vectors on the left-hand side of the equation to obtain

\begin{pmatrix} α_1 \\ α_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.

This last equation can only be true if the coefficients α_1 and α_2 are both equal to zero. We thus conclude that the vectors [1 0]^T and [0 1]^T are linearly independent.

A basis in a vector space V is a set X of linearly independent vectors such that every vector in the space can be expressed as a linear combination of members of X. For instance, the vectors

\begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix}

form a basis in the space C^2.
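Linear independence is easy to test numerically: the columns of a matrix M are independent exactly when the only solution of M α = 0 is α = 0, that is, when M has full column rank. The following Octave/MATLAB lines, which are not part of the original appendix, illustrate this; the dependent example is an assumption chosen for illustration.

% Testing linear independence of column vectors by the rank.
M = [1 0; 0 1];     % the two basis vectors of C^2 as columns
rank(M)             % equals 2: the columns are linearly independent
M2 = [1 2; 2 4];    % an assumed example of dependent columns
rank(M2)            % equals 1: here 2*[1 2]' - [2 4]' = 0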
An inner product in a vector space is a complex- or real-valued function of the ordered pair of vectors x and y such that (1) (x, y) = \overline{(y, x)}, where the line over the ordered pair on the right indicates complex conjugation, (2) (x, α_1 y_1 + α_2 y_2) = α_1 (x, y_1) + α_2 (x, y_2), and (3) (x, x) ≥ 0, with (x, x) = 0 if and only if x = 0. The condition (1) implies that (x, x) is always real, so that the inequality in (3) makes sense. In a space with an inner product defined, the norm of a vector ||x|| is defined

||x|| = \sqrt{(x, x)}.

The inner product thus makes it possible to associate a norm or length with every vector in the space. For the space C^n, the inner product of two vectors x and y with components x_1, ..., x_n and y_1, ..., y_n is defined

(x, y) = \sum_{i=1}^{n} \overline{x_i} y_i.

Now for a little terminology. The vectors x and y are said to be orthogonal if the inner product of the two vectors is equal to zero, (x, y) = 0. A vector x is said to be normalized if its norm is equal to one, ||x|| = 1, and a basis of vectors {φ_i} is said to be orthonormal if each basis vector is orthogonal to the other members of the basis and if each basis vector is normalized.

The wave functions representing the states of a physical system may be thought of as vectors in a function space. The inner product of two wave functions, ψ_1 and ψ_2, which depend upon a single variable x is defined

(ψ_1, ψ_2) = \int \overline{ψ_1(x)} ψ_2(x) \, dx,

and the inner product for wave functions depending on two or three variables is defined accordingly. For a particle moving in three dimensions, the inner product of the wave functions ψ_1 and ψ_2 is

(ψ_1, ψ_2) = \int \overline{ψ_1(r)} ψ_2(r) \, dV,

where dV is the volume element.

The presence of a basis in a vector space makes it possible to associate a column vector in C^n with every vector in the space and to associate a matrix with every operator acting on vectors in the space. Let V be a vector space and let X = φ_1, φ_2, ..., φ_n be a basis of V. Using the basis, a vector x may be expressed

x = \sum_{i=1}^{n} x_i φ_i, (211)

and we may associate the column vector

[x] = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}

with each vector x. The product of an operator A and a vector x is a vector which may also be expressed as a linear combination of the basis vectors φ_i. This will be true when A acts on the members of the basis itself

A φ_j = \sum_{i=1}^{n} a_{ij} φ_i, (212)

for j = 1, ..., n. The set {a_{ij}} of numbers, indexed with the double subscript ij, is the matrix corresponding to A. We shall generally denote the matrix of A by [A]. The matrix may be written more explicitly in the form of a square array

[A] = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{pmatrix}. (213)

If the basis is orthonormal, an explicit expression can be derived for the matrix elements a_{ij} by using the inner product. Taking the inner product of eq. (212) from the left with φ_k gives

(φ_k, A φ_j) = \left( φ_k, \sum_{i=1}^{n} a_{ij} φ_i \right) = \sum_{i=1}^{n} a_{ij} (φ_k, φ_i) = a_{kj}.

To derive this last equation, we have used the second property of the inner product and the fact that the basis is orthonormal. The last result may be written

a_{kj} = (φ_k, A φ_j).

The result of acting with an operator A upon a vector x can be obtained from the matrix associated with A and the column vector associated with x. Using eqs. (211) and (212), we obtain

A x = A \left( \sum_{i=1}^{n} x_i φ_i \right) = \sum_{i=1}^{n} x_i A φ_i = \sum_{i=1}^{n} x_i \sum_{j=1}^{n} a_{ji} φ_j = \sum_{j=1}^{n} \left( \sum_{i=1}^{n} a_{ji} x_i \right) φ_j.

We may write

(A x)_j = \sum_{i=1}^{n} a_{ji} x_i.

The j-th component of the vector Ax may thus be obtained by forming the sum of the elements of the j-th row of the matrix [A] times the components of the column vector [x]. As an example of this rule, we give the result of the matrix-vector multiplications in Problem 7 of Chapter 10.

\begin{pmatrix} 1 & 2 & 0 \\ 1 & 1 & 2 \\ 1 & 3 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \\ 4 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \\ 4 \end{pmatrix}, \qquad \begin{pmatrix} 2 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 5 \\ 2 \\ 3 \end{pmatrix}.
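The rule (Ax)_j = \sum_i a_{ji} x_i is precisely what the built-in matrix-vector product computes. The following Octave/MATLAB lines, which are not part of the original appendix, check the first product quoted above.

% A check of the rule (Ax)_j = sum_i a_{ji} x_i for the first
% matrix-vector product of Problem 7 of Chapter 10.
A = [1 2 0; 1 1 2; 1 3 1];
x = [1; 1; 0];
A * x               % returns [3; 2; 4]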
The matrix corresponding to the product of two operators, A and B, may be obtained by allowing AB to act upon an element of the basis

(AB) φ_j = A (B φ_j) = A \left( \sum_{k=1}^{n} b_{kj} φ_k \right) = \sum_{k=1}^{n} b_{kj} A φ_k = \sum_{k=1}^{n} b_{kj} \left( \sum_{i=1}^{n} a_{ik} φ_i \right) = \sum_{i=1}^{n} \left( \sum_{k=1}^{n} a_{ik} b_{kj} \right) φ_i.

We thus define the matrix product [A][B] by the equation

([A][B])_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}. (214)

The process of forming the product of two matrices can be described in terms of the individual matrices. To obtain the ij-th element of the product matrix, one forms the sum of the products of the elements of the i-th row of [A] with the elements of the j-th column of [B]. The matrix multiplication will be well defined only if the matrix [A] has as many columns as the matrix [B] has rows. As an example of the rule (214) for multiplying matrices, we give the result of the matrix-matrix multiplications in Problem 8 of Chapter 10.

\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix} = \begin{pmatrix} 0 & -2 \\ 2 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 0 \\ 1 & 1 & 2 \\ 1 & 3 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 4 & 1 & 3 \\ 5 & 3 & 2 \\ 6 & 2 & 4 \end{pmatrix}.
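The rule (214) is likewise what the built-in matrix product computes. The following Octave/MATLAB lines, which are not part of the original appendix, check the first product quoted above.

% A check of the rule (214) for the first matrix-matrix product
% of Problem 8 of Chapter 10.
A = [0 1; 1 0];
B = [0 -1i; 1i 0];
A * B               % returns [1i 0; 0 -1i]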
APPENDIX LL
Algebraic Solution of the Oscillator

We now return to the harmonic oscillator, using operator methods to obtain the wave functions and the energy. The harmonic oscillator is used in quantum field theory as a model for the behavior of ions vibrating in a crystal and as a model for the quantum mechanical treatment of the electromagnetic field.

An expression for the energy of the harmonic oscillator is obtained by adding the kinetic energy p^2/2m to the general expression for the potential energy of the oscillator provided by eq. (1.4) of the introduction to obtain

E = \frac{1}{2m} p^2 + \frac{1}{2} mω^2 x^2.

Following the procedure described in Section 3.1, the energy operator is obtained by replacing the momentum p and the position x in this last equation by the operators p̂ and x̂. We obtain

Ĥ = \frac{1}{2m} p̂^2 + \frac{1}{2} mω^2 x̂^2. (1)

We recall that the momentum operator is given by eq. (3.2) and the position operator x̂ is equal to the position coordinate x.

To see how we might find the wave functions for the harmonic oscillator by algebraic methods, we first note that if the momentum operator p̂ were an ordinary number, we could factor the Hamiltonian operator for the oscillator given by eq. (1) as follows

Ĥ = \frac{p̂^2}{2m} + \frac{1}{2} mω^2 x̂^2 = \hbar ω \left[ \sqrt{\frac{mω}{2\hbar}} \left( x̂ - \frac{i p̂}{mω} \right) \right] \left[ \sqrt{\frac{mω}{2\hbar}} \left( x̂ + \frac{i p̂}{mω} \right) \right]. (2)

This factorization may readily be confirmed using the identity (a - ib)(a + ib) = a^2 + b^2. Such a factorization of the Hamiltonian cannot actually be carried out, because the momentum and the position are represented in quantum mechanics by operators, and the order of two operators cannot generally be interchanged as the order of two numbers can.

The effect of changing the order of the momentum and position operators can be determined by using the explicit form of the momentum operator given by eq. (3.2). The product of the momentum and position operators is

p̂ x̂ = -i\hbar \frac{d}{dx} x.

We can study the properties of this operator product by allowing it to act on an arbitrary wave function ψ(x), giving

-i\hbar \frac{d}{dx} \left( x ψ \right).

The derivative of xψ can now be evaluated using the ordinary product rule to obtain

-i\hbar \frac{d}{dx} (x ψ) = -i\hbar ψ - i\hbar x \frac{dψ}{dx}.

Bringing the last term on the right-hand side of the above equation over to the left-hand side and multiplying the whole equation through by -1, we obtain

\left( -i\hbar x \frac{d}{dx} + i\hbar \frac{d}{dx} x \right) ψ = i\hbar ψ.

Since this last equation is true for any arbitrary function ψ, we can write it as an operator identity

-i\hbar x \frac{d}{dx} + i\hbar \frac{d}{dx} x = i\hbar,

or

x̂ p̂ - p̂ x̂ = i\hbar. (3)

The expression on the left-hand side of this last equation involving the position and momentum operators x̂ and p̂ may be written more simply using the idea of a commutator. The commutator of two operators, A and B, is defined by the equation

[A, B] = AB - BA. (4)

Using this notation, eq. (3) becomes

[x̂, p̂] = i\hbar. (5)
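The identity (3) can be illustrated numerically by representing d/dx with a finite-difference derivative. In the following Octave/MATLAB sketch, which is not part of the original appendix, the grid, the value of ħ and the test function are illustrative assumptions.

% A numerical illustration of the identity xp - px = i*hbar on a
% sample function, using a finite-difference derivative.
hbar = 1;
x = linspace(-5, 5, 2001);  dx = x(2) - x(1);
psi = exp(-x.^2);                      % an arbitrary smooth test function
p = @(f) -1i*hbar*gradient(f, dx);     % p = -i*hbar*d/dx, eq. (3.2)
xp = x .* p(psi);                      % x (p psi)
px = p(x .* psi);                      % p (x psi)
max(abs(xp - px - 1i*hbar*psi))        % small, of the order of dx^2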
Even though the Hamiltonian operator (1) cannot be factored as simply as indicated in eq. (2), we would like to evaluate the product of the "factors" occurring in this equation. For this purpose, we define the two operators

a^† = \sqrt{\frac{mω}{2\hbar}} \left( x̂ - \frac{i p̂}{mω} \right) (6)

and

a = \sqrt{\frac{mω}{2\hbar}} \left( x̂ + \frac{i p̂}{mω} \right). (7)

We consider now the operator product

\hbar ω \, a^† a, (8)

which is equal to the expression appearing after the second equality in eq. (2). Using the commutation relation (5), the operator product (8) can be written

\hbar ω \, a^† a = \frac{mω^2}{2} \left[ x̂^2 + \frac{p̂^2}{m^2 ω^2} + \frac{i}{mω} [x̂, p̂] \right] = \frac{p̂^2}{2m} + \frac{1}{2} mω^2 x̂^2 - \frac{1}{2} \hbar ω. (9)

The first two terms appearing after the second equality can be identified as the oscillator Hamiltonian Ĥ given by eq. (1). Bringing the last term on the right-hand side of eq. (9) over to the left-hand side and then interchanging the two sides of the equation, we obtain

Ĥ = \hbar ω \left( a^† a + \frac{1}{2} \right). (10)

This equation may be regarded as the corrected version of the naive factorization (2). We note that the order of the operators a^† and a in eq. (9) is important. The same line of argument with the operators a^† and a interchanged leads to the equation

\hbar ω \, a a^† = \frac{p̂^2}{2m} + \frac{1}{2} mω^2 x̂^2 + \frac{1}{2} \hbar ω. (11)

We can obtain a commutation relation for the operators a and a^† by subtracting eq. (9) from eq. (11), giving

[a, a^†] = a a^† - a^† a = 1. (12)

The commutation relation between a and a^† may be used to derive a relation between successive eigenfunctions of the oscillator Hamiltonian. Suppose that the wave function ψ is an eigenfunction of the Hamiltonian (10) corresponding to the eigenvalue E. Then ψ satisfies the equation

\hbar ω \left( a^† a + \frac{1}{2} \right) ψ = E ψ. (13)

Multiplying this equation from the left with a^† gives

\hbar ω \left( a^† a^† a + \frac{1}{2} a^† \right) ψ = E a^† ψ.

The first term on the left-hand side of this last equation may be rewritten by using the commutation relation (12) to replace a^† a with a a^† - 1, giving

\hbar ω \left( a^† a - 1 + \frac{1}{2} \right) a^† ψ = E a^† ψ.

Finally, we take the second term on the left-hand side over to the right-hand side to obtain

\hbar ω \left( a^† a + \frac{1}{2} \right) a^† ψ = (E + \hbar ω) a^† ψ.

[Figure 1: The effect of the raising and lowering operators on the states of the harmonic oscillator. Successive applications of a^† carry a state of energy E to states of energies E + \hbar ω, E + 2\hbar ω, E + 3\hbar ω, ..., while successive applications of a carry it to states of energies E - \hbar ω, E - 2\hbar ω, ....]

Hence, the wave function a^† ψ is an eigenfunction of the oscillator Hamiltonian corresponding to the eigenvalue E + \hbar ω. We shall thus refer to a^† as a raising operator or step-up operator. It transforms an eigenfunction of Ĥ into an eigenfunction corresponding to the next higher eigenvalue. In the same way, the operator a may be shown to be a lowering operator or step-down operator, which transforms an eigenfunction of Ĥ into an eigenfunction corresponding to the next lower eigenvalue. The effect of the raising and lowering operators on the states of the harmonic oscillator is illustrated in Fig. 1.
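The ladder structure can also be made concrete with matrices. In the following Octave/MATLAB sketch, which is not part of the original appendix, the operators a and a^† are represented in a truncated basis of oscillator eigenstates with the standard matrix elements a_{n,n+1} = \sqrt{n+1}; the truncation size N is an assumption, and the last diagonal element of the commutator is an artifact of the truncation.

% Matrix representation of a and a^dagger in a truncated number basis.
N = 6;
a = diag(sqrt(1:N-1), 1);   % lowering operator
ad = a';                    % raising operator
comm = a*ad - ad*a          % the identity, except the (N,N) entry
H = ad*a + eye(N)/2;        % H/(hbar*omega), eq. (10)
diag(H)'                    % energies (n + 1/2), n = 0, 1, 2, ...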
By repeatedly operating on an eigenfunction of the harmonic oscillator with the lowering operator, one can produce eigenfunctions corresponding to lower and lower eigenvalues. Allowed to continue indefinitely, this process would eventually lead to energy eigenvalues less than zero. However, one may show that the harmonic oscillator, which has a potential energy that is positive everywhere, cannot have a bound state with a negative eigenvalue. The lowering process must therefore end in some way. This can only happen if acting with the lowering operator a on the wave function produces a function that is identically zero. Denoting the lowest bound state by ψ_0, we must have

a ψ_0 = 0. (14)

Since every term in an eigenvalue equation depends upon the eigenfunction, an eigenvalue equation will always be satisfied by a function which is equal to zero everywhere, and the chain of eigenfunctions generated by the lowering operator thus terminates consistently.

The energy of the lowest state can be determined by substituting the function ψ_0 into eq. (13) to obtain

\hbar ω \left( a^† a + \frac{1}{2} \right) ψ_0 = E_0 ψ_0.

Since a ψ_0 = 0, the first term on the left-hand side of this equation must be equal to zero. We must have

E_0 = \frac{1}{2} \hbar ω.

A state of the oscillator with \hbar ω more energy can be obtained by operating with the operator a^† upon the lowest state, and this process can be continued, producing states with greater and greater energy. The energy levels of the oscillator are thus given by the formula

E = \hbar ω (n + 1/2).

Eq. (14) can be solved for the wave function corresponding to the lowest eigenvalue. More energetic states can then be obtained by operating upon ψ_0 with the step-up operator a^†. Substituting eq. (7) into eq. (14), we obtain

\left( x̂ + \frac{i p̂}{mω} \right) ψ_0 = 0.

This equation may be written as a differential equation by using the explicit expression for the momentum operator given by eq. (3.2) to get

\frac{\hbar}{mω} \frac{dψ_0}{dx} + x ψ_0 = 0,

or

\frac{dψ_0}{ψ_0} = -\frac{mω}{\hbar} x \, dx.

Integrating this last equation, we obtain

\ln ψ_0 = -\frac{mω}{2\hbar} x^2 + \ln A_0,

where we have denoted the arbitrary integration constant as ln A_0. Taking the term ln A_0 over to the left-hand side of the equation and using the properties of the natural logarithm, we obtain

\ln \frac{ψ_0}{A_0} = -\frac{mω}{2\hbar} x^2.

The lowest eigenfunction ψ_0 can then be obtained by taking the antilogarithm of both sides of the equation and rearranging terms to get

ψ_0 = A_0 e^{-(mω/2\hbar) x^2}. (15)

All of the results obtained in this section have been obtained before by solving the Schrödinger equation for the oscillator. The energy and the wave function of the oscillator are given by eqs. (2.44) and (2.45). We leave it as an exercise to use the methods developed in this section to obtain the wave functions of the first two excited states of the oscillator.

The algebraic solution of this problem provides its own insights into the oscillator. Since the harmonic oscillator can only vibrate with a single frequency, all of its states can be regarded as excitations of a single state. The first excited state of the oscillator corresponds to a state for which the oscillator has absorbed a single photon with energy \hbar ω, and higher excited states correspond to states for which the oscillator has absorbed additional photons. This way of thinking about the oscillator is used in quantum field theory as a model for the quantization of the electromagnetic field.
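As a starting point for the exercise mentioned above, the step-up operator of eq. (6) can be applied to the ground state (15) numerically. The following Octave/MATLAB sketch is not part of the original appendix; the units ħ = m = ω = 1, the grid and the choice A_0 = 1 are illustrative assumptions.

% Apply the step-up operator a^dagger of eq. (6) to the ground
% state of eq. (15); the result is proportional to x*psi0, the
% first excited state of the oscillator.
hbar = 1; m = 1; w = 1;                  % assumed units
x = linspace(-6, 6, 1201);  dx = x(2) - x(1);
psi0 = exp(-(m*w/(2*hbar)) * x.^2);      % eq. (15) with A0 = 1
p = @(f) -1i*hbar*gradient(f, dx);       % momentum operator, eq. (3.2)
psi1 = sqrt(m*w/(2*hbar)) * (x.*psi0 - (1i/(m*w))*p(psi0));
max(abs(imag(psi1)))                     % essentially zero
plot(x, psi0, x, real(psi1))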