Changes of Table of Contents
The Title of Appendix CC should be “More Accurate Numerical Solutions of
the Eigenvalue Problem”
Appendix II is missing
The appendices with double Roman Letters should be
Appendix AA The Gradient and Laplacian Operators
Appendix BB Solution of the Schrödinger Equation in Spherical Coordinates
Appendix CC More Accurate Numerical Solutions of the Eigenvalue Problem
Appendix DD The Angular Momentum Operators
Appendix EE The Radial Equation for Hydrogen
Appendix FF Transition Probabilities for z-Polarized Light
Appendix GG Transitions with x- and y-Polarized Light
Appendix HH Derivation of the Distribution Laws
Appendix II Derivation of Bloch’s Theorem
Appendix JJ The Band Gap
Appendix KK Vector Spaces and Matrices
Appendix LL Algebraic Solution of the Oscillator
In the current web pages of the book the word “Bloch” is misspelled. The correct title of Appendix II is given above.
Changes for Preface
Page xi, Three lines from top of section on THE NEW EDITION
Replace “two new sections; I have written” with “two new sections I have written” (remove semicolon)
Page xi, Four lines from the bottom of the page
Replace “Another MATLAB programs” with “Another MATLAB program” (the subject should be singular)
Page xv, Seven lines from top of page
Replace “Ovtave” with “Octave”
Changes for Introduction
Page xxiii, Equation following Equation (I.19)
Replace “where k = π/L and” with “where k = 2π/L and” (add a factor of 2)
Page xxiii, Equation (I.22)
Replace “A” with “−A” (Place a minus sign before the right-hand side of the
equation)
Page xxxvii, First line after equation in Problem 4
Replace “where” with “where k = 2π/L and” (add definition of k)
Page xxxvii, Problem 17
Replace “Fig. i.12(a)” with “Fig. I.12(a)” (letter “i” should be upper case)
APPENDIX AA
The Gradient and Laplacian Operators
The gradient operator
The gradient of a function is a vector that points in the direction in which
the function changes most rapidly and has a magnitude equal to the rate of
change of the function in that direction. It is the natural three-dimensional
generalization of the derivative with respect to a single variable. In Cartesian
coordinates, the gradient of the function ψ can be written
$$\nabla\psi = \mathbf{i}\,\frac{\partial\psi}{\partial x} + \mathbf{j}\,\frac{\partial\psi}{\partial y} + \mathbf{k}\,\frac{\partial\psi}{\partial z}, \tag{1}$$
where i, j and k are unit vectors pointing along the x, y and z axes respectively.
In spherical coordinates, which are defined by eq. (4.3), the gradient of
the function ψ is
$$\nabla\psi = \hat{r}\,\frac{\partial\psi}{\partial r} + \hat{\theta}\,\frac{1}{r}\frac{\partial\psi}{\partial\theta} + \hat{\phi}\,\frac{1}{r\sin\theta}\frac{\partial\psi}{\partial\phi}, \tag{2}$$

where r̂, θ̂ and φ̂ are unit vectors pointing in the direction in which r moves
when r, θ and φ increase. These three unit vectors are shown in Fig. AA.1.
We note that the factor r∂θ which occurs in the denominator of the second
term is the distance the point at r would move if the angle θ increased by an
amount ∂θ with r and φ held fixed, while the factor r sin θ∂φ which occurs
in the denominator of the third term is the distance the point at r would
move if φ increased by ∂φ with r and θ held fixed. Each of the terms of
the gradient operator gives the rate of change of the function on which the
gradient operator acts with respect to the displacement associated with a
particular spherical coordinate.
The equation defining the gradient of a function for any orthogonal set of
coordinates is similar to eq. (2) defining the gradient of a function for spherical coordinates. For a system of orthogonal coordinates (q1 , q2 , q3 ), a change
of the first coordinate q1 by an amount dq1 causes a spatial point to move
a distance ds1 = h1 dq1. Similarly, a change of the second coordinate by dq2
and a change of the third coordinate by dq3 cause the point to move distances ds2 = h2 dq2 and ds3 = h3 dq3, respectively. The weights (h1, h2, h3)
determine how far the point moves when the corresponding coordinate changes.
The gradient of a function ψ in a system with coordinates (q1, q2, q3) is defined
by the equation
$$\nabla\psi = \hat{e}_1\,\frac{1}{h_1}\frac{\partial\psi}{\partial q_1} + \hat{e}_2\,\frac{1}{h_2}\frac{\partial\psi}{\partial q_2} + \hat{e}_3\,\frac{1}{h_3}\frac{\partial\psi}{\partial q_3}. \tag{3}$$
Here as for the spherical coordinates (r, θ, φ) the quantities h1 ∂q1 , h2 ∂q2 , and
h3 ∂q3 occurring in the denominators of the above equation give the distances
a point will move if the three coordinates change by the amounts ∂q1 , ∂q2 ,
and ∂q3 , respectively. The three weight factors for spherical coordinates are
$$h_r = 1, \qquad h_\theta = r, \qquad h_\phi = r\sin\theta. \tag{4}$$
The divergence of a vector
To find the divergence of a vector A, we consider the infinitesimal volume dV = ds1 ds2 ds3 shown in Fig. AA.2. The volume is bounded by surfaces
for which the first coordinate has the values, q1 and q1 + dq1 , the second
coordinate has the values, q2 and q2 + dq2 , and the third coordinate has the
values, q3 and q3 + dq3 . Gauss’s theorem for the vector field A(q1 , q2 , q3 ) is
$$\int \nabla\cdot\mathbf{A}\, dV = \int \mathbf{A}\cdot d\mathbf{S}.$$
The integral of the out-going normal of A over the two surfaces for which
the first coordinate has the values q1 and q1 + dq1 is
$$(A_1\, ds_2\, ds_3)_{q_1+dq_1} - (A_1\, ds_2\, ds_3)_{q_1} = \frac{\partial(A_1\, ds_2\, ds_3)}{\partial q_1}\, dq_1.$$
Using the fact that the displacements, ds1 , ds2 , and ds3 , are equal to h1 dq1 ,
h2 dq2 , and h3 dq3 , respectively, this last equation can be written
$$(A_1\, ds_2\, ds_3)_{q_1+dq_1} - (A_1\, ds_2\, ds_3)_{q_1} = \frac{1}{h_1 h_2 h_3}\frac{\partial(h_2 h_3 A_1)}{\partial q_1}\, dV.$$
Analogous expressions hold for the other two sets of surfaces. According to
Gauss’s theorem, the sum of these three terms is equal to ∇ · A dV. Hence
the divergence of A is given by the equation
$$\nabla\cdot\mathbf{A} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial(h_2 h_3 A_1)}{\partial q_1} + \frac{\partial(h_3 h_1 A_2)}{\partial q_2} + \frac{\partial(h_1 h_2 A_3)}{\partial q_3}\right]. \tag{5}$$
The Laplacian of a function
The Laplacian of a scalar function ψ is the divergence of the gradient of
the function. We can write
$$\nabla^2\psi = \nabla\cdot\nabla\psi.$$
Using eq. (5) for the divergence of a vector and eq. (3) for the gradient of a
function, this last equation can be written
$$\nabla^2\psi = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}\!\left(\frac{h_2 h_3}{h_1}\frac{\partial\psi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\!\left(\frac{h_3 h_1}{h_2}\frac{\partial\psi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\!\left(\frac{h_1 h_2}{h_3}\frac{\partial\psi}{\partial q_3}\right)\right]. \tag{6}$$
For spherical polar coordinates, the three weight factors are given by
eq. (4) and the equation for the Laplacian operator in spherical coordinates
is
$$\nabla^2\psi = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}\right]. \tag{7}$$
The angular momentum operators
The operator associated with the angular momentum of a particle can be
obtained by writing the angular momentum in terms of the momentum
$$\mathbf{l} = \mathbf{r}\times\mathbf{p}$$

and then making the replacement p → −iħ∇ to obtain

$$\mathbf{l} = -i\hbar\,\mathbf{r}\times\nabla. \tag{8}$$
The angular momentum operator in spherical coordinates can be obtained
by using eq. (2) and the relations
$$\hat{r}\times\hat{\theta} = \hat{\phi} \qquad \text{and} \qquad \hat{r}\times\hat{\phi} = -\hat{\theta}$$

to give

$$\mathbf{l} = -i\hbar\left(\hat{\phi}\,\frac{\partial}{\partial\theta} - \hat{\theta}\,\frac{1}{\sin\theta}\frac{\partial}{\partial\phi}\right).$$
The operator corresponding to the z-component of the angular momentum
can then be obtained by taking the dot product of this last expression with
the unit vector k̂ pointing along the z-axis to obtain

$$l_z = -i\hbar\,\frac{\partial}{\partial\phi}. \tag{9}$$
The square of the angular momentum operator is related to the second
term on the right-hand side of eq. (7) by the equation
$$\mathbf{l}^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right]. \tag{10}$$
Using eqs. (7) and (10), the Laplacian operator in spherical coordinates can
be written simply

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial}{\partial r}\right) - \frac{1}{r^2}\frac{\mathbf{l}^2}{\hbar^2}. \tag{11}$$

The angular part of the Laplacian operator in spherical coordinates is equal
to −l²/(ħ²r²).
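As a quick consistency check of eq. (7), the spherical Laplacian can be compared with the Cartesian Laplacian for a concrete function. The following sketch (Python with SymPy, not one of the book's MATLAB programs; the test function and evaluation point are arbitrary choices) evaluates both forms for ψ = e^{−r} sin θ cos φ, which equals e^{−R} x/R in Cartesian coordinates.

```python
import sympy as sp

# Sketch: numerical check of the spherical Laplacian, eq. (7), against
# the Cartesian Laplacian for the test function
# psi = exp(-r)*sin(theta)*cos(phi) = exp(-R)*x/R.
r, th, ph = sp.symbols('r theta phi', positive=True)
psi = sp.exp(-r) * sp.sin(th) * sp.cos(ph)

lap_sph = (sp.diff(r**2 * sp.diff(psi, r), r) / r**2
           + sp.diff(sp.sin(th) * sp.diff(psi, th), th) / (r**2 * sp.sin(th))
           + sp.diff(psi, ph, 2) / (r**2 * sp.sin(th)**2))

x, y, z = sp.symbols('x y z')
R = sp.sqrt(x**2 + y**2 + z**2)
lap_cart = sum(sp.diff(sp.exp(-R) * x / R, v, 2) for v in (x, y, z))

# Compare the two at an arbitrary point (r, theta, phi) = (1.3, 0.7, 0.4).
point = {r: 1.3, th: 0.7, ph: 0.4}
xyz = {x: 1.3 * sp.sin(0.7) * sp.cos(0.4),
       y: 1.3 * sp.sin(0.7) * sp.sin(0.4),
       z: 1.3 * sp.cos(0.7)}
print(sp.N(lap_sph.subs(point)), sp.N(lap_cart.subs(xyz)))  # equal values
```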
APPENDIX BB
Solution of the Schrödinger Equation in Spherical Coordinates
Separation of the Schrödinger Equation
The Schrödinger equation for an electron with mass m moving about a
nucleus with mass M and charge eZ can be written
$$-\frac{\hbar^2}{2\mu}\nabla^2\psi(\mathbf{r}) - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r}\,\psi(\mathbf{r}) = E\,\psi(\mathbf{r}), \tag{12}$$

where the reduced mass µ is defined by the equation

$$\mu = \frac{mM}{m+M}. \tag{13}$$
Using the expression for the Laplacian operator in spherical coordinates given
in Appendix AA, the Schrödinger equation can be written

$$-\frac{\hbar^2}{2\mu}\left[\frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}\right] - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r}\,\psi = E\,\psi. \tag{14}$$
This last equation can be solved by the method of separation of variables
by writing the wave function as a product of functions of the radial and
angular coordinates

$$\psi(r,\theta,\phi) = R(r)\,Y(\theta,\phi).$$
Substituting the product function into eq. (14) and dividing by −ħ²/2µr²
times the product function, we obtain

$$\frac{1}{R}\frac{d}{dr}\!\left(r^2\frac{dR}{dr}\right) + \frac{2\mu r^2}{\hbar^2}\left(E + \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r}\right) = -\frac{1}{Y}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial Y}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 Y}{\partial\phi^2}\right]. \tag{15}$$
Since the left-hand side of this last equation depends only on r and the right-hand side depends only on θ and φ, both sides must be equal to a constant
that we call λ. The resulting radial equation can be written
$$\frac{1}{r^2}\frac{d}{dr}\!\left(r^2\frac{dR}{dr}\right) + \frac{2\mu}{\hbar^2}\left(E + \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r}\right)R - \frac{\lambda}{r^2}\,R = 0, \tag{16}$$
and the angular equation is

$$-\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial Y}{\partial\theta}\right) - \frac{1}{\sin^2\theta}\frac{\partial^2 Y}{\partial\phi^2} = \lambda\,Y. \tag{17}$$
Using eq. (10) of Appendix AA, we may identify the second of these two last
equations as the eigenvalue equation of the angular momentum operator l²
with eigenvalue ħ²λ. We shall use a purely algebraic line of argument in
Appendix CC to show that the eigenvalues of the orbital angular momentum
operator l² are ħ²l(l+1), where l is the angular momentum quantum number.
We may thus identify the separation constant λ as being l(l + 1) and write
the radial equation

$$\frac{1}{r^2}\frac{d}{dr}\!\left(r^2\frac{dR}{dr}\right) + \frac{2\mu}{\hbar^2}\left(E + \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r}\right)R - \frac{l(l+1)}{r^2}\,R = 0. \tag{18}$$
We consider now more fully the radial equation. Evaluating the derivatives of the first term in the equation, we obtain
$$\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} + \frac{2\mu}{\hbar^2}\left(E + \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r}\right)R - \frac{l(l+1)}{r^2}\,R = 0. \tag{19}$$
This last equation can be simplified by introducing the change of variables

$$\rho = \alpha r, \tag{20}$$
where α is a constant yet to be specified. The equation defining the change
of variables can be written
$$r = \alpha^{-1}\rho. \tag{21}$$
The derivatives of the radial function R can then be expressed in terms of
the new variable ρ by using the chain rule. We have
$$\frac{dR}{dr} = \frac{d\rho}{dr}\frac{dR}{d\rho} = \alpha\,\frac{dR}{d\rho}. \tag{22}$$
Similarly, the second derivative can be written
$$\frac{d^2R}{dr^2} = \alpha^2\,\frac{d^2R}{d\rho^2}. \tag{23}$$
Substituting eqs. (21)-(23) into eq. (19) and dividing the resulting equation
by α², we obtain

$$\frac{d^2R}{d\rho^2} + \frac{2}{\rho}\frac{dR}{d\rho} + \left[\frac{2\mu Ze^2}{\hbar^2 4\pi\epsilon_0\alpha}\,\frac{1}{\rho} + \frac{2\mu E}{\hbar^2\alpha^2} - \frac{l(l+1)}{\rho^2}\right]R = 0. \tag{24}$$
To simplify this last equation, we now make the following choice of α
$$\alpha^2 = \frac{8\mu|E|}{\hbar^2}, \tag{25}$$

and we define a new parameter ν by the equation

$$\nu = \frac{2\mu Ze^2}{\hbar^2 4\pi\epsilon_0\alpha}. \tag{26}$$
The radial equation then becomes simply
$$\frac{d^2R}{d\rho^2} + \frac{2}{\rho}\frac{dR}{d\rho} + \left[\frac{\nu}{\rho} - \frac{1}{4} - \frac{l(l+1)}{\rho^2}\right]R = 0, \tag{27}$$
where the new variable ρ is related to the radial distance r by eq. (20).
In this appendix we shall solve the radial equation (27) using the power
series method. We first note that the first two terms in the equation and
the last term depend upon the −2 power of ρ, while the third term in the
equation depends upon the −1 power of ρ and the fourth term depends upon
the 0 power. Since the equation depends upon more than two powers of
ρ, it cannot be solved directly by the power series method. To overcome
this difficulty, we examine the behavior of the equation for large values of r
for which the fourth term in the equation dominates over the third and last
terms. The function e−ρ/2 , which is everywhere finite, is a solution of the
radial equation for large r. This suggests we look for an exact solution of
eq. (27) of the form

$$R(\rho) = F(\rho)\,e^{-\rho/2}, \tag{28}$$
where F(ρ) is a function of ρ. We shall substitute this representation of R(ρ)
into eq. (27) and in this way derive an equation for F(ρ). Using eq. (28), the
first and second derivatives of R(ρ) can be written
$$\frac{dR}{d\rho} = -\frac{1}{2}e^{-\rho/2}F + e^{-\rho/2}\frac{dF}{d\rho}, \qquad \frac{d^2R}{d\rho^2} = \frac{1}{4}e^{-\rho/2}F - e^{-\rho/2}\frac{dF}{d\rho} + e^{-\rho/2}\frac{d^2F}{d\rho^2}. \tag{29}$$
Substituting eqs. (28) and (29) into eq. (27) leads to the equation

$$\frac{d^2F}{d\rho^2} + \left(\frac{2}{\rho} - 1\right)\frac{dF}{d\rho} + \left[\frac{\nu - 1}{\rho} - \frac{l(l+1)}{\rho^2}\right]F = 0. \tag{30}$$
We note that the first, second, and last terms of the new radial equation depend upon the −2 power of ρ, while the remaining two terms in the equation
depend upon the −1 power of ρ. Since eq. (30) only involves two powers of
ρ, it is amenable to a power series solution.
We now look for a solution for F (ρ) of the form
$$F(\rho) = \rho^s\,L(\rho), \tag{31}$$
where the function L(ρ) can be expressed as a power series
$$L(\rho) = \sum_{k=0}^{\infty} a_k\,\rho^k. \tag{32}$$
We shall suppose that the coefficient a0 in the expansion of L(ρ) is not equal
to zero and that the function ρ^s gives the dependence of the function F(ρ)
near the origin. The requirement that F(ρ) be finite can be satisfied if s has
integer values equal to or greater than zero. Substituting eq. (31) into eq. (30)
gives the following equation for L
$$\rho^2\frac{d^2L}{d\rho^2} + \rho\,[2(s+1) - \rho]\frac{dL}{d\rho} + [\rho(\nu - s - 1) + s(s+1) - l(l+1)]\,L = 0.$$

Setting ρ = 0 in this last equation leads to the condition

$$s(s+1) - l(l+1) = 0.$$
This quadratic equation has two roots: s = l and s = −(l + 1). Only
the root s = l is consistent with the boundary condition that the radial
function R(ρ) be finite for ρ = 0. The equation for L then becomes
$$\rho\,\frac{d^2L}{d\rho^2} + [2(l+1) - \rho]\frac{dL}{d\rho} + (\nu - l - 1)\,L = 0. \tag{33}$$
Notice that the first and second terms of this last equation depend upon the
−1 power of ρ, while the remaining terms in the equation depend upon the
0 power of ρ.
To obtain a power series solution of eq. (33), we take the first two derivatives of eq. (32) to obtain

$$\frac{dL}{d\rho} = \sum_{k=1}^{\infty} k\,a_k\,\rho^{k-1}, \qquad \frac{d^2L}{d\rho^2} = \sum_{k=2}^{\infty} k(k-1)\,a_k\,\rho^{k-2}.$$
We substitute these expressions for L(ρ) and its derivatives into eq. (33) to
obtain

$$\sum_{k=2}^{\infty} k(k-1)a_k\rho^{k-1} + 2(l+1)\sum_{k=1}^{\infty} k\,a_k\rho^{k-1} - \sum_{k=1}^{\infty} k\,a_k\rho^{k} + (\nu - l - 1)\sum_{k=0}^{\infty} a_k\rho^{k} = 0.$$
Notice that the first and second summations involve ρk−1 , while the third
and fourth sums involve ρk . Because of the factor k − 1 in the first sum, the
first summation can be extended down to k = 1 and because of the factor k
in the third sum, the third summation can be extended down to k = 0, and
the equation may be written
$$\sum_{k=1}^{\infty}\left[k(k-1) + 2k(l+1)\right]a_k\rho^{k-1} - \sum_{k=0}^{\infty}\left[k - (\nu - l - 1)\right]a_k\rho^{k} = 0.$$
In the first summation of this last equation the variable ρ is raised to the
power k − 1, while in the second summation ρ is raised to the power k. In
order to bring these different contributions together so that they contain terms
corresponding to the same power of ρ, we make the following substitution in
the first summation

$$k = k' + 1, \tag{34}$$
and we simplify the terms within the two summations to obtain
$$\sum_{k'=0}^{\infty}(k'+1)(k'+2l+2)\,a_{k'+1}\rho^{k'} - \sum_{k=0}^{\infty}(k - \nu + l + 1)\,a_k\rho^{k} = 0.$$
As with a change of variables for a problem involving integrals, the lower
limit of the first summation is obtained by substituting the value k = 1 into
eq. (34) defining the change of variables. We now replace the dummy variable
k 0 with k in the first summation and draw all of the terms together within a
single summation to obtain
$$\sum_{k=0}^{\infty}\left[(k+1)(k+2l+2)\,a_{k+1} - (k - \nu + l + 1)\,a_k\right]\rho^{k} = 0.$$
This equation can hold for all values of ρ only if the coefficient of every power
of ρ is equal to zero. This leads to the following recursion formula

$$a_{k+1} = \frac{k + l + 1 - \nu}{(k+1)(k+2l+2)}\,a_k. \tag{35}$$
The recursion formula gives a1, a2, a3, . . . in terms of a0. We may thus regard L(ρ) to be defined in terms of the single constant a0. We must examine,
however, the behavior of L(ρ) as ρ approaches infinity. Since the behavior of
L(ρ) for large values of ρ will depend upon the terms far out in the power
series, we consider the recursion formula (35) for large values of k. This gives

$$\frac{a_{k+1}}{a_k} \to \frac{k}{k^2} = \frac{1}{k}.$$
We now compare this result with the Taylor series expansion of the function e^ρ

$$e^{\rho} = 1 + \rho + \frac{1}{2!}\rho^2 + \dots + \frac{1}{k!}\rho^k + \frac{1}{(k+1)!}\rho^{k+1} + \dots$$

The ratio of the coefficients of this series for large values of k is

$$\frac{1/(k+1)!}{1/k!} = \frac{k!}{(k+1)!} = \frac{1}{k+1} \to \frac{1}{k}.$$
The ratio of successive terms for these two series is the same for large values
of k. This means that the power series representation of L(ρ) has the same
dependence upon ρ for large values of ρ as the function e^ρ. Recall now that
the radial function R(ρ) is related to F(ρ) by eq. (28), and F(ρ) is related to
L(ρ) by eq. (31) with s = l. Setting L(ρ) = e^ρ leads to the following behavior
of the radial function for large ρ

$$R(\rho) = e^{-\rho/2}\rho^l e^{\rho} = \rho^l e^{\rho/2} \qquad \text{as } \rho \to \infty.$$
The radial function we have obtained from the series expansion thus becomes infinite as ρ → ∞, which is unacceptable. There is only one way of
avoiding this consequence and that is to terminate the infinite series. The
series can be terminated by letting ν be equal to an integer n, such that

$$\nu = n = n' + l + 1. \tag{36}$$
The recursion formula (35) then implies that the coefficient a_{n′+1} is equal
to zero, and the function L(ρ) will be a polynomial of degree n′. Eqs. (28)
and (31) with s = l then imply that the radial function R(ρ) is equal to
a polynomial times the function e^{−ρ/2}, which means that R approaches zero
as ρ → ∞. Setting ν = n and solving eqs. (25) and (26) for the energy, we
obtain

$$|E| = \frac{\mu Z^2 e^4}{2(4\pi\epsilon_0)^2\hbar^2}\,\frac{1}{n^2}.$$
The energy of the nth bound state of a hydrogen-like ion with nuclear
charge Ze is

$$E_n = -\frac{\mu Z^2 e^4}{2(4\pi\epsilon_0)^2\hbar^2}\,\frac{1}{n^2}. \tag{37}$$
Using the reduced mass µ in place of the electron mass takes into account
the finite mass of the nucleus. For the hydrogen atom with Z = 1 and
with the reduced mass µ set equal to m, eq. (37) reduces to the expression for
the energy En of the hydrogen atom in Chapter 1.
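The termination of the series can be made concrete with a few lines of code. The sketch below (Python; the quantum numbers n = 3, l = 0 are arbitrary choices for illustration) generates the coefficients from the recursion formula (35) with ν = n and shows that they vanish from k = n − l onward, leaving a polynomial of degree n − l − 1.

```python
from fractions import Fraction

# Sketch: coefficients of L(rho) from the recursion formula (35) with
# nu = n. For n = 3, l = 0 the numerator k + l + 1 - n vanishes at
# k = n - l - 1 = 2, so a_3 and all later coefficients are zero and
# L(rho) is a polynomial of degree 2.
def series_coefficients(n, l, kmax=8):
    a = [Fraction(1)]                      # a0 is arbitrary; take a0 = 1
    for k in range(kmax):
        a.append(a[-1] * Fraction(k + l + 1 - n, (k + 1) * (k + 2 * l + 2)))
    return a

print(series_coefficients(3, 0))   # [1, -1, 1/6, 0, 0, 0, 0, 0, 0]
```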
The polynomials L(ρ) may be identified as associated Laguerre polynomials L^p_q, which satisfy the equation

$$\rho\,\frac{d^2L^p_q}{d\rho^2} + [p + 1 - \rho]\frac{dL^p_q}{d\rho} + (q - p)\,L^p_q = 0. \tag{38}$$
Equating the coefficients in eqs. (33) and (38), we see that p = 2l + 1 and
q = n + l. The appropriate polynomials are given by the equation

$$L^{2l+1}_{n+l}(\rho) = \sum_{k=0}^{n-l-1}(-1)^{k+1}\,\frac{[(n+l)!]^2\,\rho^k}{(n-l-1-k)!\,(2l+1+k)!\,k!}. \tag{39}$$
The normalized radial wave functions for a hydrogen-like ion can be written

$$R_{nl}(r) = -A_{nl}\,e^{-\rho/2}\rho^l\,L^{2l+1}_{n+l}(\rho), \tag{40}$$

where the normalization coefficients are given by the equation

$$A_{nl} = \sqrt{\left(\frac{2Z}{na_0}\right)^3\frac{(n-l-1)!}{2n[(n+l)!]^3}}, \tag{41}$$

where

$$a_0 = \frac{4\pi\epsilon_0\hbar^2}{\mu e^2} \qquad \text{and} \qquad \rho = \frac{2Z}{na_0}\,r.$$
The first three radial functions, which can be found using eqs. (39) and
(40), are

$$R_{10}(r) = \left(\frac{Z}{a_0}\right)^{3/2} 2\,e^{-Zr/a_0}$$

$$R_{20}(r) = \left(\frac{Z}{2a_0}\right)^{3/2}\left(2 - \frac{Zr}{a_0}\right)e^{-Zr/2a_0}$$

$$R_{21}(r) = \left(\frac{Z}{2a_0}\right)^{3/2}\frac{Zr}{a_0\sqrt{3}}\,e^{-Zr/2a_0}$$
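As a numerical check, the following sketch (Python with SciPy; Z and a0 are set to one, which amounts to working in atomic units) confirms that each of the three radial functions listed above satisfies the normalization condition ∫₀^∞ R_nl(r)² r² dr = 1.

```python
import numpy as np
from scipy.integrate import quad

# Sketch: check that the radial functions listed above are normalized,
# i.e. that the integral of R_nl(r)**2 * r**2 from 0 to infinity is 1.
a0 = Z = 1.0

def R10(r): return (Z / a0)**1.5 * 2 * np.exp(-Z * r / a0)
def R20(r): return (Z / (2 * a0))**1.5 * (2 - Z * r / a0) * np.exp(-Z * r / (2 * a0))
def R21(r): return (Z / (2 * a0))**1.5 * (Z * r / (a0 * np.sqrt(3))) * np.exp(-Z * r / (2 * a0))

for R in (R10, R20, R21):
    norm, _ = quad(lambda r: (R(r) * r)**2, 0, np.inf)
    print(R.__name__, round(norm, 10))    # each prints 1.0
```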
APPENDIX CC
The Angular Momentum Operators
Generalization of the quantum rules
The quantum rules given in Chapter 3 may be generalized to three dimensions. The position of a particle in three dimensions can be represented by a
vector r, which extends from the origin to the particle, while the momentum
of a particle moving in three-dimensional space is represented by a vector p,
which points in the direction of the particle’s motion.
For a particle moving in three dimensions, the operator associated with
the momentum, which we denote by p̂, is defined to be

$$\hat{\mathbf{p}} = -i\hbar\nabla, \tag{42}$$
where ∇ is the gradient operator discussed in Appendix AA. The gradient
of a function is a vector that points in the direction in which the function
changes most rapidly and has a magnitude equal to the rate of change of the
function in that direction. It is the natural generalization of the concept of
the derivative to three dimensions.
The expression for the energy in three dimensions is
$$E = \frac{1}{2m}\mathbf{p}^2 + V(\mathbf{r}), \tag{43}$$

where p and r are the momentum and radius vectors. Substituting the
momentum operator (42) into eq. (43) for the energy leads to the following
Hamiltonian operator

$$\hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r}), \tag{44}$$
where ∇² is the Laplacian operator discussed in Appendix AA.
Once one has constructed an operator O corresponding to a variable of a
microscopic system, the wave function of the system and the possible values
that can be obtained by measuring the variable are determined by forming
the eigenvalue equation
$$O\,\psi(\mathbf{r}) = \lambda\,\psi(\mathbf{r}). \tag{45}$$
As for problems in one dimension, the values of λ, for which there is a solution
of eq. (45) satisfying the boundary conditions, are the possible values that
can be obtained in a measurement of the variable. The wave function ψ(r)
describes the system when it is in a state corresponding to the eigenvalue λ.
Commutation relations
The operators used in quantum mechanics to represent physical variables
satisfy certain algebraic relations. Recall that the commutator of two operators, A and B, was defined in the second section of Chapter 3 by the
equation
$$[A, B] = AB - BA. \tag{46}$$
In this appendix, we shall discard the caret symbol (ˆ) and denote the operator corresponding to a variable by the same symbol used to denote the
variable itself. The operator corresponding to the x-component of the momentum, for instance, will be denoted simply as px. With this notation the
commutation relation between x and px, which is given in the second section
of Chapter 3, is written

$$[x, p_x] = i\hbar. \tag{47}$$
Other commutation relations can be obtained from this relation by making
the cyclic replacements, x → y, y → z, z → x. This leads to the additional
commutation relations
$$[y, p_y] = i\hbar, \qquad [z, p_z] = i\hbar. \tag{48}$$
We recall that the commutators that can be formed with one component of
the position vector r and another component of the momentum operator p
are equal to zero. For instance, we have
$$[x, p_y] = 0. \tag{49}$$
The commutators formed from two components of the position vector or two
components of the momentum are also zero.
As discussed in Appendix AA, the operator corresponding to the orbital
angular momentum can be obtained by replacing the momentum p in the
defining equation for the angular momentum,
$$\mathbf{l} = \mathbf{r}\times\mathbf{p}, \tag{50}$$

with the operator (42) to obtain

$$\mathbf{l} = -i\hbar\,\mathbf{r}\times\nabla. \tag{51}$$
We shall now derive commutation relations for the angular momentum
operators. These commutation relations will then be used to derive the
spectra of eigenvalues of the angular momentum. To make it easier for us to
evaluate the commutators of the angular momentum operators, we first derive
a few general properties of the commutation relations of operators, which we
denote by A, B, and C. Using the definition of the commutator (46), we
may write
$$[A, (B + C)] = A(B + C) - (B + C)A = AB + AC - BA - CA = (AB - BA) + (AC - CA). \tag{52}$$
The two terms appearing within parentheses on the right may be identified
as the commutators, [A, B] and [A, C]. We thus have
$$[A, (B + C)] = [A, B] + [A, C]. \tag{53}$$
One may prove in a similar fashion that
$$[(A + B), C] = [A, C] + [B, C]. \tag{54}$$
The commutation relations can thus be said to be linear.
We next consider the commutator [A, BC]. Again, we use the definition
of the commutator (46) to obtain

$$[A, BC] = ABC - BCA. \tag{55}$$
Subtracting and adding the term BAC after the first term on the right, we
have
$$[A, BC] = ABC - BAC + BAC - BCA = (AB - BA)C + B(AC - CA). \tag{56}$$
Again, identifying the two terms within parentheses on the right as the commutators, [A, B] and [A, C], we have
$$[A, BC] = [A, B]C + B[A, C]. \tag{57}$$
Similarly, one may prove that
$$[AB, C] = A[B, C] + [A, C]B. \tag{58}$$
These last two commutation relations can be described in simple terms. The
commutator of a single operator with the product of two operators can be
written as a sum of two terms involving the commutator of the single operator
with each of the operators of the product. For each of these terms, the
operator not appearing in the commutator is pulled to the front or the back
to preserve the order of the operators within the product. In the first term
on the right-hand side of eq. (57), the operator C is pulled to the back, while
in the second term on the right, the operator B is pulled to the front. In the
two terms of the resulting equation, B appears before C. Similarly, in the
first term on the right-hand side of eq. (58), the operator A is pulled to the
front so that it appears before B, while in the second term, B is pulled to
the back so that it appears after A.
We now evaluate the commutator [lx, ly] involving the x- and y-components of the angular momentum operator l. Using the definition of the
orbital angular momentum given by eq. (50), the x-component of the angular
momentum operator can be seen to be lx = ypz − zpy and the y-component
may be seen to be ly = zpx − xpz. We may thus use eq. (53) to obtain

$$[l_x, l_y] = [l_x, z p_x - x p_z] = [l_x, z p_x] - [l_x, x p_z] = [(y p_z - z p_y), z p_x] - [(y p_z - z p_y), x p_z]. \tag{59}$$
Using eq. (54), this becomes
$$[l_x, l_y] = [y p_z, z p_x] - [z p_y, z p_x] - [y p_z, x p_z] + [z p_y, x p_z]. \tag{60}$$
We now note that of the commutators that can be formed from the operators
on the right-hand side of this last equation, the commutators [x, px], [y, py]
and [z, pz] are each equal to iħ. All other commutators are equal to zero.
Omitting the second and third terms on the right-hand side of eq. (60), which
do not contain operators having a nonzero commutator, the equation becomes

$$[l_x, l_y] = [y p_z, z p_x] + [z p_y, x p_z]. \tag{61}$$
The commutators on the right-hand side of this equation may be evaluated
as we have described following eqs. (57) and (58). For the first term on
the right-hand side, we pull y to the front and px to the back giving y[pz, z]px.
Similarly, for the second term on the right, we pull x toward the front and
py toward the back giving x[z, pz]py. Eq. (61) then becomes
$$[l_x, l_y] = y[p_z, z]p_x + x[z, p_z]p_y. \tag{62}$$
Like the commutator [x, px], the commutator [z, pz] is equal to iħ. The
commutator [pz, z], for which the operators pz and z are interchanged, is
equal to −iħ. We thus obtain

$$[l_x, l_y] = i\hbar\,(x p_y - y p_x). \tag{63}$$

The term within parentheses on the right may be identified as lz and hence
the equation may be written

$$[l_x, l_y] = i\hbar\, l_z. \tag{64}$$
The commutation relations (64) assume a simpler form if the angular
momentum is measured in units of ħ. Commutation relations for the new
angular momentum operators can be obtained by dividing eq. (64) by ħ² to
obtain

$$[(l_x/\hbar), (l_y/\hbar)] = i\,(l_z/\hbar). \tag{65}$$

If the orbital angular momentum is measured in units of ħ, the angular
momentum operators thus satisfy the commutation relations

$$[l_x, l_y] = i\,l_z. \tag{66}$$
The orbital angular momentum is then represented by the operator ħl. Other
commutation relations can be obtained from eq. (66) by making the cyclic
replacements, x → y, y → z and z → x.
We consider now the commutation relation involving lz and the operator

$$\mathbf{l}^2 = l_x l_x + l_y l_y + l_z l_z. \tag{67}$$

Using eq. (53), we may write

$$[l_z, \mathbf{l}^2] = [l_z, l_x l_x + l_y l_y + l_z l_z] = [l_z, l_x l_x] + [l_z, l_y l_y] + [l_z, l_z l_z]. \tag{68}$$
We now use eq. (57) and take advantage of the fact that lz commutes with
itself to write
$$[l_z, \mathbf{l}^2] = l_x[l_z, l_x] + [l_z, l_x]l_x + l_y[l_z, l_y] + [l_z, l_y]l_y. \tag{69}$$
We now note that the commutators in the first and second terms are in cyclic
order, while the commutators in the third and fourth terms are not in cyclic
order. The above equation thus becomes
$$[l_z, \mathbf{l}^2] = i\,[l_x l_y + l_y l_x - l_y l_x - l_x l_y] = 0. \tag{70}$$
We thus find that the operators l² and lz commute with each other. In
quantum theory, commuting operators correspond to variables that can
be accurately measured simultaneously. It is generally possible to find simultaneous eigenfunctions for such variables. The common eigenfunctions
represent states of the system in which the variables corresponding to the
operators have definite values. It is easy to find physical examples of these
results. For the hydrogen atom, the states of the electron are described by
the quantum numbers, l and ml , corresponding to well-defined values of both
l2 and lz . In a magnetic field the magnetic moment and the angular momentum of an electron precess about the direction of the magnetic field with
the magnitude of the angular momentum and the projection of the angular
momentum upon the direction of the magnetic field having constant values.
The definition of the angular momentum by eq. (50) does not apply to
the spin. We shall require, though, that the components of the spin angular
momentum satisfy commutation relations analogous to eq. (66). We have

$$[s_x, s_y] = i\,s_z. \tag{71}$$

It is thus convenient to regard the commutation relations between the components of the angular momentum operators as the definition of the angular
momentum. Analogous commutation relations apply to the components of
the orbital and spin angular momenta and to the components of the total
angular momentum.
Spectrum of eigenvalues
We conclude this appendix by showing that the commutation relations of
the angular momentum operators determine the spectrum of eigenvalues of
these operators. Our arguments will be very general, applying to any angular
momentum operator. Using the symbol j to denote an angular momentum
operator, the commutation relations satisfied by the components of the angular momentum operators may be obtained by writing j in place of l in
eq. (66) or j in place of s in eq. (71), giving

$$[j_x, j_y] = i\,j_z. \tag{72}$$
The symbol j denotes the angular momentum operator in units of ħ. As for
the orbital angular momentum operators considered previously, the operator jz commutes with the operator

$$\mathbf{j}^2 = j_x j_x + j_y j_y + j_z j_z. \tag{73}$$
We have

$$[j_z, \mathbf{j}^2] = 0. \tag{74}$$

Eq. (74) can be derived as before from the commutation relation (72) and
the other commutation relations obtained from this basic equation by making
the cyclic replacements, jx → jy, jy → jz, and jz → jx.
Since the operators j² and jz commute, they have a common set of
eigenfunctions. Denoting a typical eigenvalue of j² by µ and an eigenvalue of
jz by µz, we denote the simultaneous eigenfunctions of j² and jz by ψ(µ, µz).
These functions satisfy the eigenvalue equations

$$\mathbf{j}^2\,\psi(\mu, \mu_z) = \mu\,\psi(\mu, \mu_z) \tag{75}$$

and

$$j_z\,\psi(\mu, \mu_z) = \mu_z\,\psi(\mu, \mu_z). \tag{76}$$
In order to study the properties of the angular momentum eigenfunctions,
we introduce new operators by the equations
$$j_+ = j_x + i\,j_y \tag{77}$$

and

$$j_- = j_x - i\,j_y. \tag{78}$$
Since j+ and j− are linear combinations of jx and jy, and since jx and jy commute with j², j+ and j− must also commute with j². One may easily confirm
this result using the defining eqs. (77) and (78) together with eq. (54).
In order to evaluate the commutation relation of jz with j+ , we first use
eqs. (77) and (53) to write the commutator as follows
$$[j_z, j_+] = [j_z, (j_x + i\,j_y)] = [j_z, j_x] + i\,[j_z, j_y]. \tag{79}$$

The commutation relations satisfied by the components of the angular momentum may then be used to obtain

$$[j_z, j_+] = i\,j_y + j_x. \tag{80}$$
This last equality can be written
$$[j_z, j_+] = j_+. \tag{81}$$
The commutator of jz with j− may be evaluated in a similar way. We obtain
$$[j_z, j_-] = -j_-. \tag{82}$$
In order to evaluate the operator product j−j+, we first use the definitions
(77) and (78) to write

$$j_- j_+ = (j_x - i j_y)(j_x + i j_y) = j_x^2 + j_y^2 + i\,[j_x, j_y] = \mathbf{j}^2 - j_z^2 + i\,[j_x, j_y]. \tag{83}$$

We then use the commutation relations (72) to obtain

$$j_- j_+ = \mathbf{j}^2 - j_z^2 - j_z. \tag{84}$$

Similarly, the product j+j− may be evaluated giving

$$j_+ j_- = \mathbf{j}^2 - j_z^2 + j_z. \tag{85}$$
Operating now on the eigenvalue equation (75) with j+ and using the fact
that j+ and j² commute, we obtain

$$\mathbf{j}^2\, j_+\psi(\mu, \mu_z) = \mu\, j_+\psi(\mu, \mu_z). \tag{86}$$
The function j+ψ(µ, µz) is thus also an eigenfunction of j² corresponding to
the eigenvalue µ. Similarly, multiplying the eigenvalue equation (76) by the
operator j+ gives

$$j_+ j_z\,\psi(\mu, \mu_z) = \mu_z\, j_+\psi(\mu, \mu_z). \tag{87}$$
We may now use the definition of the commutator of two operators and
eq. (81) to write
$$j_+ j_z = j_z j_+ - [j_z, j_+] = j_z j_+ - j_+. \tag{88}$$
Substituting this expression for j+ jz into eq. (87) and bringing the term with
j+ over to the right-hand side of the equation, we then obtain
$$j_z\, j_+\psi(\mu, \mu_z) = (\mu_z + 1)\, j_+\psi(\mu, \mu_z). \tag{89}$$
The function j+ψ(µ, µz) is thus an eigenfunction of jz corresponding to the
eigenvalue µz + 1. So, operating on the function ψ(µ, µz) with j+ gives a new
eigenfunction belonging to the same eigenvalue of j² but to the eigenvalue
(µz + 1) of jz. By repeatedly operating with j+ on ψ(µ, µz), we can generate
a whole series of eigenfunctions of jz belonging to the eigenvalues µz, µz +
1, µz + 2, . . . and all belonging to the eigenvalue µ of j². For this reason, j+
is called a step-up operator.
Equations similar to eqs. (86) and (89) may be derived by multiplying
the eigenvalue equations (75) and (76) by j− . We have
$$\mathbf{j}^2\, j_-\psi(\mu, \mu_z) = \mu\, j_-\psi(\mu, \mu_z) \tag{90}$$

$$j_z\, j_-\psi(\mu, \mu_z) = (\mu_z - 1)\, j_-\psi(\mu, \mu_z). \tag{91}$$
The function j−ψ(µ, µz) is an eigenfunction of j² corresponding to the eigenvalue µ and an eigenfunction of the operator jz corresponding to the eigenvalue (µz − 1). The operator j− may thus be thought of as a step-down
operator.
We now determine the possible values of µ and µz . For a definite value
of µ, there must be a limit to how large or how small µz can become. The
eigenvalue µ gives the square of the length of the vector j, while µz is the
projection of the vector j upon the z-axis. The projection of a vector upon
the z-axis cannot be larger than the length of the vector itself. We denote
the maximum eigenvalue of jz by j and the eigenfunction corresponding to
the maximum eigenvalue by ψ(µ, j). Multiplying the function ψ(µ, j) by j+
must give zero
$$j_+\,\psi(\mu, j) = 0. \tag{92}$$
For, otherwise, j+ψ(µ, j) would be an eigenfunction of jz corresponding to
the eigenvalue j + 1. We note that setting j+ψ(µ, j) equal to zero gives a
solution of eqs. (86) and (89). Operating on eq. (92) with j− gives
$$j_- j_+\,\psi(\mu, j) = 0. \tag{93}$$
Using eq. (84), this equation can be written
$$(\mu - j^2 - j)\,\psi(\mu, j) = 0. \tag{94}$$
Since the function ψ(µ, j) cannot vanish at all points, it follows that
$$\mu = j(j + 1). \tag{95}$$
Similarly, let (j − r) be the least eigenvalue of jz . Then it follows that
$$j_-\,\psi(\mu, j - r) = 0 \tag{96}$$

and

$$j_+ j_-\,\psi(\mu, j - r) = 0. \tag{97}$$
Using eq. (85) then leads to the equation

$$[\mu - (j - r)^2 + (j - r)]\,\psi(\mu, j - r) = 0, \tag{98}$$

and we must have

$$\mu - (j - r)^2 + (j - r) = 0. \tag{99}$$

Substituting the value of µ given by eq. (95) into this last equation leads to
the quadratic equation

$$r^2 - r(2j - 1) - 2j = 0, \tag{100}$$
which has only one positive root, r = 2j. Thus, the least eigenvalue of jz is
equal to j − r = −j. This means that for a particular eigenvalue µ = j(j + 1)
of j², there are 2j + 1 eigenfunctions ψ(µ, m) of jz corresponding to the
eigenvalues
$$m = j,\ j - 1,\ \dots,\ -j + 1,\ -j. \tag{101}$$
It is also clear from the above argument that 2j must be an integer, which
means that the quantum number j of the angular momentum must be an
integer or a half-integer.
Using the commutation relations of the angular momentum operators, we
have thus shown that the eigenvalues of j² are j(j + 1), where j may be an
integer or half-integer. For a particular value of j, the eigenvalues of jz are
m = j, j − 1, . . . , −j. Denoting the simultaneous eigenfunctions of j² and jz
by the quantum numbers j and m, the eigenvalue equations become

$$\mathbf{j}^2\,\psi(jm) = j(j+1)\,\psi(jm) \tag{102}$$

$$j_z\,\psi(jm) = m\,\psi(jm). \tag{103}$$
The operators j² and jz give the square of the angular momentum operator
and the z-component of the angular momentum in units of ħ. The operators
(ħj)² and ħjz, which represent the angular momentum in an absolute sense,
have eigenvalues j(j + 1)ħ² and mħ.
The general results we have obtained for the angular momentum can be
applied to the orbital and spin angular momenta. The operator corresponding to the square of the orbital angular momentum, which we have denoted
previously by l², has eigenvalues l(l + 1)ħ². For a given value of l, the z-component of the orbital angular momentum, which we denote by lz, has the
values mlħ, where ml = −l, −l + 1, . . . , l. The spin quantum number of the
electron has the value s = 1/2, with the eigenvalues of the spin operator s²
being (1/2) · (3/2)ħ² and the spin operator sz having eigenvalues ±(1/2)ħ.
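The spectrum derived above can be illustrated with explicit matrices. The Python sketch below builds jz and the step operators for j = 3/2 (an arbitrary choice) and verifies eq. (72) and the eigenvalue j(j + 1) of j². The nonzero matrix elements of j+ are taken from the standard normalization j+ψ(j, m) = √(j(j+1) − m(m+1)) ψ(j, m+1), which is assumed here rather than derived in this appendix.

```python
import numpy as np

# Sketch: matrix representation of the angular momentum for j = 3/2,
# with hbar = 1 as in eq. (72).
j = 1.5
ms = np.arange(j, -j - 1, -1)            # m = j, j-1, ..., -j, eq. (101)
jz = np.diag(ms)
jp = np.zeros((len(ms), len(ms)))
for col in range(1, len(ms)):            # j+ steps m up by one
    m = ms[col]
    jp[col - 1, col] = np.sqrt(j * (j + 1) - m * (m + 1))
jm = jp.T
jx, jy = (jp + jm) / 2, (jp - jm) / (2 * 1j)

assert np.allclose(jx @ jy - jy @ jx, 1j * jz)          # eq. (72)
j2 = jx @ jx + jy @ jy + jz @ jz
assert np.allclose(j2, j * (j + 1) * np.eye(len(ms)))   # eigenvalue j(j+1)
print("2j + 1 =", len(ms), "states with m =", ms)
```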
APPENDIX DD
The Radial Equation for Hydrogen
The operators describing atomic electrons can generally be separated into
radial and angular parts. As we have seen in Appendix AA, the angular part
of the Laplacian operator is related to the angular momentum operator l².
Eq. (11) expresses the Laplacian operator in terms of its radial and angular
parts.
We consider now the effect of the Laplacian operator (11) upon the wave
function of the hydrogen atom. Using the form of the atomic wave function
and the spherical harmonics introduced in Section 4.1.3, the wave function
of hydrogen can be written
$$\psi(r, \theta, \phi) = \frac{P(r)}{r}\,Y_{l m_l}(\theta, \phi). \tag{104}$$
The spherical harmonic Y_{l m_l}(θ, φ) is an eigenfunction of the angular momentum operator l² corresponding to the eigenvalue l(l + 1)ħ².
We consider first the effect of the first term on the right-hand side of
eq. (11) upon the wave function (104). Taking the partial derivative with
respect to r does not affect the spherical harmonic. So we must evaluate the
result of operating successively upon the function P(r)/r with the partial
derivative, then with r² and the partial derivative, and finally multiplying
with 1/r². Operating upon P(r)/r with the partial derivative gives
$$\frac{\partial}{\partial r}\!\left(\frac{P}{r}\right) = \frac{\partial(r^{-1}P)}{\partial r} = \frac{1}{r}\frac{dP}{dr} - \frac{1}{r^2}\,P. \tag{105}$$
Multiplying this equation by r2 and taking the partial derivative with respect
to r, then gives
$$\frac{\partial}{\partial r}\!\left(r^2\frac{\partial}{\partial r}\frac{P}{r}\right) = r\,\frac{d^2P}{dr^2}. \tag{106}$$
Finally, dividing by r2 , we obtain
$$\frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial}{\partial r}\frac{P}{r}\right) = \frac{1}{r}\frac{d^2P}{dr^2}. \tag{107}$$
Operating with the first term on the right-hand side of eq. (11) upon the
wave function (104) thus gives
$$\frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial}{\partial r}\right)\psi(r,\theta,\phi) = \frac{1}{r}\frac{d^2P}{dr^2}\,Y_{l m_l}(\theta,\phi). \tag{108}$$
Similarly, operating with the second term on the right-hand side of eq. (11)
upon the wave function (104) and taking advantage of the fact that Y_{l m_l}(θ, φ)
is an eigenfunction of l² corresponding to the eigenvalue l(l + 1)ħ², we obtain

$$-\frac{\mathbf{l}^2}{\hbar^2 r^2}\,\psi(r,\theta,\phi) = -\frac{l(l+1)}{r^2}\,\frac{P(r)}{r}\,Y_{l m_l}(\theta,\phi). \tag{109}$$
The effect of multiplying the Laplacian operator (11) upon the hydrogen
wave function (104) can be evaluated using eqs. (108) and (109). This leads
to the result
$$\nabla^2\psi(r,\theta,\phi) = \frac{1}{r}\frac{d^2P}{dr^2}\,Y_{l m_l}(\theta,\phi) - \frac{l(l+1)}{r^2}\,\frac{P(r)}{r}\,Y_{l m_l}(\theta,\phi). \tag{110}$$
The Schrödinger equation for the electron of a hydrogen atom is given by
eq. (4.2). Substituting the hydrogen wave function (104) into the Schrödinger
equation and using eq. (110) to evaluate the effect of the Laplacian operator
upon the wave function, we obtain

$$-\frac{\hbar^2}{2m}\frac{1}{r}\frac{d^2P}{dr^2}\,Y_{l m_l}(\theta,\phi) + \frac{\hbar^2 l(l+1)}{2m r^2}\,\frac{P(r)}{r}\,Y_{l m_l}(\theta,\phi) - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r}\,\frac{P(r)}{r}\,Y_{l m_l}(\theta,\phi) = E\,\frac{P(r)}{r}\,Y_{l m_l}(\theta,\phi). \tag{111}$$
The radial equation (4.5) is obtained from this last equation by deleting
(1/r)Y_{l m_l}(θ, φ) from each term.
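The chain of steps in eqs. (105)-(107) reduces to a one-line symbolic check. The sketch below (Python with SymPy) verifies that acting with the radial part of the Laplacian on P(r)/r gives (1/r) d²P/dr².

```python
import sympy as sp

# Sketch: symbolic check of eq. (107). The radial part of the Laplacian
# acting on P(r)/r equals (1/r) * d^2P/dr^2.
r = sp.symbols('r', positive=True)
P = sp.Function('P')(r)
radial = sp.diff(r**2 * sp.diff(P / r, r), r) / r**2
print(sp.simplify(radial - sp.diff(P, r, 2) / r))   # prints 0
```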
APPENDIX EE
Transition Probabilities for z-Polarized Light
We suppose that the electromagnetic radiation incident upon an atom is
a superposition of plane waves. For each of these waves, the electric field can
be written
$$\mathbf{E} = 2\mathbf{E}_0 \sin(\mathbf{k}\cdot\mathbf{r} - \omega t). \tag{112}$$
The energy per unit volume of the radiation field associated with this monochromatic wave is

$$W = \epsilon_0\,\mathbf{E}^2 = 4\epsilon_0 E_0^2 \sin^2(\mathbf{k}\cdot\mathbf{r} - \omega t).$$

Since the time average of the sine squared function in this last equation is
1/2, the average energy per volume is

$$W_{av} = 2\epsilon_0 E_0^2. \tag{113}$$
Using the representation of the sine function given by eq. (3.10), eq. (112)
can be written
$$\mathbf{E} = -i\mathbf{E}_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} + i\mathbf{E}_0\, e^{-i(\mathbf{k}\cdot\mathbf{r} - \omega t)}. \tag{114}$$
For most applications, the coupling between the electrons and the radiation
field is rather weak. The interaction can then be described by the Hamiltonian
$$H_{\rm int} = \mathbf{E}\cdot(-e\,\mathbf{r}), \tag{115}$$
where (−e r) is the dipole moment of the electron. We shall consider radiation for which the electric field vector E is directed along the z-axis. Using
eq. (114), the interaction Hamiltonian can then be written
$$H_{\rm int} = -i(-ez)E_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} + i(-ez)E_0\, e^{-i(\mathbf{k}\cdot\mathbf{r} - \omega t)}. \tag{116}$$
The wave function of a hydrogen-like ion exposed to a time-dependent
radiation field may be described by eq. (4.17)
$$\psi(\mathbf{r}, t) = \sum_n c_n(t)\,\phi_n(\mathbf{r})\, e^{-iE_n t/\hbar}, \tag{117}$$
where the coefficients cn (t) depend on time. The wave functions φn are
eigenfunctions of the stationary atomic Hamiltonian
$$H_0 = -\frac{\hbar^2}{2m}\nabla^2 - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r}. \tag{118}$$
For simplicity, we assume that the eigenvalues En are nondegenerate. In
order to be in a position to calculate the probability that the atom makes
a transition from a level i to a level j, we suppose that at time, t = 0, the
coefficient ci (0) is equal to one and all the other coefficients cj (0) are equal
to zero. We wish to calculate the probability |cj (t)|2 that at a later time
t the atom is in the state j. Substituting eq. (117) into the time-dependent
Schrödinger equation,

$$i\hbar\,\frac{\partial\psi(\mathbf{r}, t)}{\partial t} = H\,\psi(\mathbf{r}, t), \tag{119}$$
we obtain the following first-order differential equation for the coefficients
cn(t)

$$\sum_n \left(i\hbar\,\frac{dc_n}{dt} + E_n c_n\right)\phi_n(\mathbf{r})\, e^{-iE_n t/\hbar} = (H_0 + H_{\rm int})\sum_n c_n\,\phi_n(\mathbf{r})\, e^{-iE_n t/\hbar}. \tag{120}$$
On the right-hand side of this equation, we have written the Hamiltonian H
as the sum of a stationary term H0 and a dynamic term Hint corresponding
to the interaction of the electron with an oscillating electromagnetic field.
Since φn is an eigenfunction of H0 corresponding to the eigenvalue En , the
second term on the left-hand side of the equation cancels with the first term
on the right to give
$$i\hbar\sum_n \frac{dc_n}{dt}\,\phi_n(\mathbf{r})\, e^{-iE_n t/\hbar} = H_{\rm int}\sum_n c_n\,\phi_n(\mathbf{r})\, e^{-iE_n t/\hbar}. \tag{121}$$
The assumption that Hint is small means that the coefficients cn (t) evolve
slowly with time. It is thus reasonable to approximate the coefficients cn on
the right side of this equation with their initial values. Since ci (0) = 1 and
all the other coefficients are zero, we get
$$i\hbar\sum_n \frac{dc_n}{dt}\,\phi_n(\mathbf{r})\, e^{-iE_n t/\hbar} = H_{\rm int}\,\phi_i(\mathbf{r})\, e^{-iE_i t/\hbar}. \tag{122}$$
We may now single out the term on the left-hand side corresponding to the
level j by multiplying the equation through on the left by the function φ∗j (r)
and integrating to obtain
$$i\hbar\sum_n \frac{dc_n}{dt}\, e^{-iE_n t/\hbar}\int \phi_j^*(\mathbf{r})\,\phi_n(\mathbf{r})\,dV = e^{-iE_i t/\hbar}\int \phi_j^*(\mathbf{r})\,H_{\rm int}\,\phi_i(\mathbf{r})\,dV. \tag{123}$$
The eigenfunctions of H0 have the property that they form an orthogonal
set of functions. This means that if n is not equal to j, the integral
∫φj*(r)φn(r)dV, which appears on the left, is equal to zero. For the case,
n = j, the functions can be normalized so that the integral is equal to one.
Using this property of the wave functions, eq. (123) can be written
$$i\hbar\,\frac{dc_j}{dt} = e^{i(E_j - E_i)t/\hbar}\int \phi_j^*(\mathbf{r})\,H_{\rm int}\,\phi_i(\mathbf{r})\,dV. \tag{124}$$
The factor (Ej − Ei)/ħ, which appears in the exponential term, may be
identified as the angular frequency of the transition

$$\omega_{ij} = \frac{E_j - E_i}{\hbar}. \tag{125}$$
Multiplying eq. (124) through by −i dt/ħ and integrating from 0 to t, we
obtain the following equation for the coefficient cj as a function of time

$$c_j(t) = -\frac{i}{\hbar}\int_0^t e^{i\omega_{ij} t'}\left[\int \phi_j^*(\mathbf{r})\,H_{\rm int}\,\phi_i(\mathbf{r})\,dV\right]dt'. \tag{126}$$
In order to solve this last equation for cj, we must use the explicit form
of the interaction Hamiltonian. Substituting eq. (116) into eq. (126) and
performing the integrations over t′, we obtain
$$c_j(t) = -\left[\frac{1 - e^{i(\omega_{ij} - \omega)t}}{\hbar(\omega_{ij} - \omega)}\right] iE_0\int \phi_j^*(\mathbf{r})(-ez)\, e^{i\mathbf{k}\cdot\mathbf{r}}\,\phi_i(\mathbf{r})\,dV + \left[\frac{1 - e^{i(\omega_{ij} + \omega)t}}{\hbar(\omega_{ij} + \omega)}\right] iE_0\int \phi_j^*(\mathbf{r})(-ez)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\,\phi_i(\mathbf{r})\,dV. \tag{127}$$
"
For the case Ej < Ei , the angular frequency ωij , which is given by (125), is
negative and the transition i → j corresponds to stimulated emission. When
the frequency ω of the incident radiation is near −ωij , the denominator of the
second term in eq. (127) will become very small and the second term will be
much larger than the first. It is usually true that for emission processes the
first term may be neglected. Similarly, the first term in eq. (127) provides a
good approximate description of absorption.
We shall consider stimulated emission in some detail. Factoring e^{i(ω_ij + ω)t/2}
from the second term of eq. (127), we may write this contribution to cj as

$$c_j(t) = e^{i(\omega_{ij} + \omega)t/2}\left[\frac{e^{i(\omega_{ij} + \omega)t/2} - e^{-i(\omega_{ij} + \omega)t/2}}{\hbar(\omega_{ij} + \omega)}\right] iE_0\int \phi_j^*(\mathbf{r})(-ez)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\,\phi_i(\mathbf{r})\,dV. \tag{128}$$
The representation of the sine function in terms of exponentials given by
eq. (3.10) may then be used to write eq. (128) in the following way
$$c_j(t) = e^{i(\omega_{ij} + \omega)t/2}\,\frac{\sin[(\omega_{ij} + \omega)t/2]}{\hbar[(\omega_{ij} + \omega)/2]}\, E_0\int \phi_j^*(\mathbf{r})(-ez)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\,\phi_i(\mathbf{r})\,dV. \tag{129}$$
The transition probability per time is |cj(t)|²/t. Using eq. (129), the transition probability per time may be written

$$|c_j(t)|^2/t = \frac{t}{\hbar^2}\,\frac{\sin^2[(\omega_{ij} + \omega)t/2]}{[(\omega_{ij} + \omega)t/2]^2}\, E_0^2\,\left|\int \phi_j^*(\mathbf{r})(-ez)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\,\phi_i(\mathbf{r})\,dV\right|^2. \tag{130}$$
Eq. (130) gives the probability per time that radiation of a single frequency
ω will be emitted. According to eq. (113), the energy per volume of the
wave is equal to 2ε₀E₀². In order to be in a position to integrate over the
entire spectrum of frequencies, we set this expression for the energy equal to
the amount of energy of a continuous spectrum in the range between ω and
ω + dω

$$2\epsilon_0 E_0^2 = \rho(\omega)\,d\omega, \tag{131}$$
where, as before, ρ(ω) is the energy density per frequency range. Solving this
equation for E₀² gives

$$E_0^2 = \frac{1}{2\epsilon_0}\,\rho(\omega)\,d\omega. \tag{132}$$
We now substitute this expression for E₀² into eq. (130) and integrate over a
range of frequencies that includes the resonant frequency −ω_ij to obtain

$$|c_j(t)|^2/t = \frac{1}{2\epsilon_0\hbar^2}\,\left|\int \phi_j^*(\mathbf{r})(-ez)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\,\phi_i(\mathbf{r})\,dV\right|^2 \int_{\omega_1}^{\omega_2} \rho(\omega)\left[\frac{\sin((\omega_{ij} + \omega)t/2)}{(\omega_{ij} + \omega)t/2}\right]^2 t\,d\omega. \tag{133}$$
The term occurring in the denominator of the integrand will be zero when
the frequency ω is equal to −ωij . This frequency, which makes the largest
contribution to the transition probability, will be denoted by ω ∗ . Using
eq. (125), we may write
$$\omega^* = -\omega_{ij} = \frac{E_i - E_j}{\hbar}. \tag{134}$$
For an emission process, Ei will be greater than Ej and ω* will be positive. According to eq. (44) of Chapter 4, ω* is then equal to the transition
frequency. Substituting ω* for −ω_ij in eq. (133), we get

$$|c_j(t)|^2/t = \frac{1}{2\epsilon_0\hbar^2}\,\left|\int \phi_j^*(\mathbf{r})(-ez)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\,\phi_i(\mathbf{r})\,dV\right|^2 \int_{\omega_1}^{\omega_2} \rho(\omega)\left[\frac{\sin((\omega - \omega^*)t/2)}{(\omega - \omega^*)t/2}\right]^2 t\,d\omega. \tag{135}$$
The function within square brackets in this last equation is similar to the
function occurring within square brackets in eq. (3.44), which is represented
by the dotted line in Fig. 3.13. Both functions have well-defined maxima.
The function within square brackets in eq. (135) has its maximum value for
ω = ω*, and the function is zero when the frequency ω differs from ω* by an
integral number of multiples of 2π/t

$$\omega - \omega^* = n\,\frac{2\pi}{t}. \tag{136}$$
For large values of t, the function within square brackets becomes very
sharply peaked. The function ρ(ω) can then be approximated by its value at
the transition frequency ω = ω ∗ and brought outside the integral to give
$$|c_j(t)|^2/t = \frac{1}{2\epsilon_0\hbar^2}\,\rho(\omega^*)\,\left|\int \phi_j^*(\mathbf{r})(-ez)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\,\phi_i(\mathbf{r})\,dV\right|^2 \int_{\omega_1}^{\omega_2}\left[\frac{\sin((\omega - \omega^*)t/2)}{(\omega - \omega^*)t/2}\right]^2 t\,d\omega. \tag{137}$$
In the limit of large t, the integral has the value 2π, and we obtain

$$|c_j(t)|^2/t = \frac{\pi}{\epsilon_0\hbar^2}\,\rho(\omega^*)\left|\int \phi_j^*(\mathbf{r})(-ez)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\,\phi_i(\mathbf{r})\,dV\right|^2. \tag{138}$$
In deriving this result, we have not made any assumptions concerning
the wavelength of the light. A very useful approximation can be obtained
by taking advantage of the fact that the size of the atom is much smaller
than the wavelength of visible or even ultraviolet light. The wavelength of
visible light is between 400 and 700 nm, while the wavelength of ultraviolet
light is between 10 and 400 nm. By comparison, the size of an atom is about
0.1 nm. The dependence of the incident wave upon the spatial coordinates
occurs through the factor e^{−ik·r} in eq. (138). Since the magnitude of the
wave vector k is 2π/λ, k · r will not change appreciably over the size of the
atom. It follows that we can approximate the exponential function by the
first term in its Taylor series expansion

$$e^{-i\mathbf{k}\cdot\mathbf{r}} = 1 - i\mathbf{k}\cdot\mathbf{r} + \dots \tag{139}$$
This is called the electric dipole approximation. Replacing the exponential
function with 1 in eq. (138) and denoting the transition frequency by ω as in
the text, we obtain

$$|c_j(t)|^2/t = \frac{\pi}{\epsilon_0\hbar^2}\,\rho(\omega)\left|\int \phi_j^*(\mathbf{r})(-ez)\,\phi_i(\mathbf{r})\,dV\right|^2. \tag{140}$$
APPENDIX FF
Transitions with x- and y-Polarized Light
We would now like to calculate transition probabilities for light polarized
in the x- and y-directions. For this purpose, we introduce linear combinations
of the x- and y- coordinates having simple transition integrals. We define
r+ = x + iy and r− = x − iy. Using eq. (4.3) and Euler’s formula, one may
derive the following equations for r+ and r−

$$r_+ = r\sin\theta\, e^{i\phi} \tag{141}$$

$$r_- = r\sin\theta\, e^{-i\phi}. \tag{142}$$
The significance of the variables r+ and r− may be understood in physical
terms. If we were to multiply r+ by the time dependence e^{−iωt} corresponding
to an oscillating electric field, we would obtain

$$r_+(t) = r\sin\theta\, e^{i(\phi - \omega t)}. \tag{143}$$
r+ (t) has the same dependence upon the polar angle and time as the components of a polarization vector rotating about the z-axis. We may thus
associate the variable r+ with circularly polarized light. Similarly, r− may
be associated with circularly polarized light for which the polarization vector
rotates about the z-axis in the opposite direction. We now use the variables
r+ and r− to evaluate transition integrals for x- and y-polarized light.
Example
Again for the 2p → 1s transition of the hydrogen atom, calculate the
transition integrals for x- and y-polarized light.
Solution
For the transition 2p−1 to 1s0, the integral of r+ is

$$\int \phi_{1s0}^*\, r_+\,\phi_{2p-1}\,dV = R_i \int_0^{\pi}\frac{1}{\sqrt 2}\cdot\frac{\sqrt 3}{2}\,\sin\theta\cdot\sin\theta\cdot\sin\theta\, d\theta \int_0^{2\pi}\frac{1}{\sqrt{2\pi}}\, e^{i\phi}\cdot\frac{1}{\sqrt{2\pi}}\, e^{-i\phi}\, d\phi, \tag{144}$$
where Ri is the radial integral given by eq. (4.25). The integration over φ
gives 1, and the θ integration can be written

$$\frac{1}{\sqrt 2}\cdot\frac{\sqrt 3}{2}\int_0^{\pi}\sin^3\theta\, d\theta = \frac{1}{\sqrt 2}\cdot\frac{\sqrt 3}{2}\cdot\frac{4}{3} = \sqrt{\frac{2}{3}}. \tag{145}$$
The transition integral for the operator r+ is thus

$$\int \phi_{1s0}^*\, r_+\,\phi_{2p-1}\,dV = \sqrt{\frac{2}{3}}\, R_i. \tag{146}$$
The integrals of r+ for the transitions 2p0 → 1s0 and 2p+1 → 1s0 may be
shown to be equal to zero. The integral of r− for the transition 2p+1 → 1s0
can be shown to be

$$\int \phi_{1s0}^*\, r_-\,\phi_{2p+1}\,dV = -\sqrt{\frac{2}{3}}\, R_i, \tag{147}$$
and the other integrals of r− are zero. The defining equations for r+ and r−
can be solved for x and y to obtain

$$x = \frac{1}{2}(r_+ + r_-) \tag{148}$$

$$y = \frac{1}{2i}(r_+ - r_-). \tag{149}$$

These equations can be used to evaluate the transition integrals for x- and
y-polarized light.
The transition rates for x- and y-polarized light can be calculated using
eqs. (4.20) and (4.21) with x and y in place of z. The calculation of these
transition rates is left as an exercise. (See Problem 13.)
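The angular factor √(2/3) in eqs. (145)-(146) can be verified with a short symbolic computation. The sketch below (Python with SymPy) evaluates the θ integral exactly.

```python
import sympy as sp

# Sketch: check of eq. (145). The theta integral in the 2p(-1) -> 1s0
# transition integral for r+ evaluates to sqrt(2/3).
theta = sp.symbols('theta')
value = (1 / sp.sqrt(2)) * (sp.sqrt(3) / 2) \
        * sp.integrate(sp.sin(theta)**3, (theta, 0, sp.pi))
print(value, sp.simplify(value - sp.sqrt(sp.Rational(2, 3))))  # sqrt(6)/3, 0
```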
APPENDIX GG
Derivation of the Distribution Laws
The distribution laws are derived by maximizing the statistical weight for
a perfect gas with respect to the occupation numbers nr . In the following,
we shall find it more convenient to require that the natural log of the weight
be a maximum rather than the weight itself. Since the natural logarithm
of the weight ln W increases monotonically with W , the weight will have
a maximum when the logarithm of the weight has its maximum. We thus
require that the following condition be satisfied

$$\delta \ln W = 0, \tag{150}$$

for small variations of the occupation numbers consistent with the equations

$$N = \sum_r n_r \tag{151}$$

and

$$E = \sum_r n_r\,\epsilon_r. \tag{152}$$
The condition (150) for changes in the occupation numbers consistent
with eqs. (151) and (152) may be shown to be equivalent to the condition
$$\delta\left[\ln W - \alpha\sum_{r=1}^{\infty} n_r - \beta\sum_{r=1}^{\infty}\epsilon_r\, n_r\right] = 0, \tag{153}$$
for all changes in the occupation numbers. This variational condition may
be written
$$\delta \ln W - \alpha\sum_{r=1}^{\infty}\delta n_r - \beta\sum_{r=1}^{\infty}\epsilon_r\,\delta n_r = 0. \tag{154}$$
This equation may be used to derive the distribution laws for classical and
quantum statistics. Since the expression for the weight W depends upon
the particular form of statistics, we must consider each kind of statistics
separately.
Maxwell-Boltzmann Statistics
For a perfect classical gas, the statistical weight is given by eq. (7.6). Using the explicit form of this equation with the product notation, the natural
logarithm of W(n1, n2, . . . , nr, . . . ) may be written

$$\ln W = \ln N! + \sum_{r=1}^{\infty}\left(n_r \ln g_r - \ln n_r!\right). \tag{155}$$
For a macroscopic sample, the occupation numbers nr are very large and the
natural logarithm of the factorial ln nr! may be approximated by Stirling’s
formula

$$\ln n! = n(\ln n - 1), \quad \text{for large } n. \tag{156}$$
Eq. (155) then becomes

$$\ln W = \ln N! + \sum_{r=1}^{\infty}\left(n_r \ln g_r - n_r \ln n_r + n_r\right). \tag{157}$$
Using eq. (157), the change in ln W due to changes in the occupation
numbers δnr may be written
$$\delta \ln W = \sum_{r=1}^{\infty}\left(\ln g_r - \ln n_r\right)\delta n_r. \tag{158}$$
Substituting this last equation into the variational condition (154), we obtain
the condition

$$\sum_{r=1}^{\infty}\left(\ln g_r - \ln n_r - \alpha - \beta\epsilon_r\right)\delta n_r = 0,$$

which can be true for all variations δnr only if the factors appearing within
parentheses are equal to zero. We thus have

$$\ln g_r - \ln n_r - \alpha - \beta\epsilon_r = 0.$$

This last condition can be written

$$\ln\frac{n_r}{g_r} = -\alpha - \beta\epsilon_r. \tag{159}$$
Taking the exponential of each side of eq. (159), we obtain finally

$$\frac{n_r}{g_r} = e^{-\alpha - \beta\epsilon_r}. \tag{160}$$
The distribution law (160) may be cast into a more convenient form by
expressing the constant α in terms of another constant Z by the equation

$$e^{-\alpha} = \frac{N}{Z}. \tag{161}$$

We then have

$$\frac{n_r}{g_r} = \frac{N}{Z}\, e^{-\beta\epsilon_r}. \tag{162}$$
As shown in the book by McGervey, which is cited at the end of Chapter 7,
the constant β is equal to 1/kT with the constant k being called the Boltzmann
constant. We thus obtain

$$\frac{n_r}{g_r} = \frac{N}{Z}\, e^{-\epsilon_r/kT}. \tag{163}$$

This last equation is known as the Maxwell-Boltzmann distribution law.
Bose-Einstein Statistics
The statistical weight for Bose-Einstein statistics is given by eq. (7.54).
Using this formula, the natural logarithm of W(n1, n2, . . . , nr, . . . ) may be
written

$$\ln W = \sum_{r=1}^{\infty}\left[\ln(n_r + g_r - 1)! - \ln n_r! - \ln(g_r - 1)!\right]. \tag{164}$$
Stirling’s formula (156) may again be used to evaluate the first two natural
logarithms, and we obtain

$$\ln W = \sum_{r=1}^{\infty}\left[(n_r + g_r - 1)\ln(n_r + g_r - 1) - n_r \ln n_r - (g_r - 1) - \ln(g_r - 1)!\right]. \tag{165}$$
Using eq. (165), the change in ln W due to changes in the occupation
numbers δnr may be written

$$\delta \ln W = \sum_{r=1}^{\infty}\left[\ln(n_r + g_r - 1) - \ln n_r\right]\delta n_r. \tag{166}$$
Since nr is much larger than one, the number −1 in the first term may be
omitted and this last equation becomes

$$\delta \ln W = \sum_{r=1}^{\infty}\left[\ln(n_r + g_r) - \ln n_r\right]\delta n_r. \tag{167}$$
Substituting eq. (167) into the variational condition (154), we obtain

$$\sum_{r=1}^{\infty}\left[\ln(n_r + g_r) - \ln n_r - \alpha - \beta\epsilon_r\right]\delta n_r = 0. \tag{168}$$
Again, setting the factor multiplying δnr equal to zero, we get

$$\ln(n_r + g_r) - \ln n_r - \alpha - \beta\epsilon_r = 0. \tag{169}$$
This last equation can be written

$$\ln\frac{n_r}{n_r + g_r} = -\alpha - \beta\epsilon_r. \tag{170}$$
Taking the exponential of each side of eq. (170) and collecting together the
terms depending upon nr, we obtain

$$n_r\left(1 - e^{-\alpha - \beta\epsilon_r}\right) = g_r\, e^{-\alpha - \beta\epsilon_r}. \tag{171}$$

We may again take β = 1/kT, and this equation may be written

$$n_r = g_r\,\frac{1}{e^{\alpha}\, e^{\epsilon_r/kT} - 1}. \tag{172}$$
Eq. (172) is known as the Bose-Einstein distribution law.
Fermi-Dirac Statistics
The statistical weight for Fermi-Dirac statistics is given by eq. (7.55).
Using this formula, the natural logarithm of W(n1, n2, . . . , nr, . . . ) may be
written

$$\ln W = \sum_{r=1}^{\infty}\left[\ln g_r! - \ln n_r! - \ln(g_r - n_r)!\right]. \tag{173}$$
Again using Stirling’s formula (156) to evaluate the last two terms in the
summation, we obtain

$$\ln W = \sum_{r=1}^{\infty}\left[\ln g_r! - n_r \ln n_r - (g_r - n_r)\ln(g_r - n_r) + g_r\right]. \tag{174}$$
Using eq. (174), the change in ln W due to changes in the occupation
numbers δnr may be written

$$\delta \ln W = \sum_{r=1}^{\infty}\left[\ln(g_r - n_r) - \ln n_r\right]\delta n_r. \tag{175}$$
As before, we substitute eq. (175) into the variational condition (154) to obtain

$$\sum_{r=1}^{\infty}\left[\ln(g_r - n_r) - \ln n_r - \alpha - \beta\epsilon_r\right]\delta n_r = 0. \tag{176}$$
The factor multiplying δnr may again be set equal to zero to give the following
equation

$$\ln\frac{n_r}{g_r - n_r} = -\alpha - \beta\epsilon_r. \tag{177}$$
Taking the exponential of each side of eq. (177) and collecting together the
terms depending upon nr, we obtain

$$n_r\left(1 + e^{-\alpha - \beta\epsilon_r}\right) = g_r\, e^{-\alpha - \beta\epsilon_r}. \tag{178}$$

Again setting β = 1/kT, this equation may be written

$$n_r = g_r\,\frac{1}{e^{\alpha}\, e^{\epsilon_r/kT} + 1}, \tag{179}$$

which is known as the Fermi-Dirac distribution law.
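The three distribution laws (163), (172) and (179) can be compared side by side with a few lines of code. In the Python sketch below, kT = 1 and α = 1 are arbitrary choices made only for illustration; for ε_r much larger than kT the three occupations per state coincide, while they differ markedly where the occupations are large.

```python
import numpy as np

# Sketch: occupation per state n_r/g_r for the three statistics,
# eqs. (163), (172) and (179), with N/Z = exp(-alpha).
kT, alpha = 1.0, 1.0
eps = np.linspace(0.0, 5.0, 6)

mb = np.exp(-alpha - eps / kT)                 # Maxwell-Boltzmann
be = 1.0 / (np.exp(alpha + eps / kT) - 1.0)    # Bose-Einstein
fd = 1.0 / (np.exp(alpha + eps / kT) + 1.0)    # Fermi-Dirac

for row in zip(eps, mb, be, fd):
    print("eps = %.1f   MB %.4f   BE %.4f   FD %.4f" % row)
```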
APPENDIX HH
Derivation of Bloch’s Theorem
The wave function of an electron moving in a periodic potential is a
solution of the Schrödinger equation, which is given by eq. (8.41). Evaluating
the Schrödinger equation at the coordinate point r + l, we obtain

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r} + \mathbf{l})\right]\psi(\mathbf{r} + \mathbf{l}) = E\,\psi(\mathbf{r} + \mathbf{l}). \tag{180}$$
We may use eq. (8.42) to replace the potential energy term in this equation
with its value at the point r giving

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\psi(\mathbf{r} + \mathbf{l}) = E\,\psi(\mathbf{r} + \mathbf{l}). \tag{181}$$
The functions ψ(r) and ψ(r + l) are thus both solutions of the Schrödinger
equation corresponding to the energy E. If the energy eigenvalue E is nondegenerate, the function ψ(r + l), which is obtained from ψ(r) by a displacement by a lattice vector l, must be proportional to ψ(r). This is true for
any l. We consider first the function ψ(r + a1), which corresponds to a single step in the direction a1. For this function, the appropriate relation of
proportionality can be written

$$\psi(\mathbf{r} + \mathbf{a}_1) = \lambda_1\,\psi(\mathbf{r}). \tag{182}$$
Since the functions ψ(r + a1) and ψ(r) are both normalized, we must have

$$|\lambda_1|^2 = 1. \tag{183}$$

We can thus write λ1 in the form

$$\lambda_1 = e^{ik_1}, \tag{184}$$

where k1 is a real number. Eq. (182) then becomes

$$\psi(\mathbf{r} + \mathbf{a}_1) = e^{ik_1}\,\psi(\mathbf{r}). \tag{185}$$
Similar equations can be derived for displacements in the a2 and a3 directions

$$\psi(\mathbf{r} + \mathbf{a}_2) = e^{ik_2}\,\psi(\mathbf{r}), \qquad \psi(\mathbf{r} + \mathbf{a}_3) = e^{ik_3}\,\psi(\mathbf{r}). \tag{186}$$
The effect of a general translation can be obtained by applying eqs. (185)
and (186) successively for translations in the a1, a2 and a3 directions

$$\psi(\mathbf{r} + \mathbf{l}) = \psi(\mathbf{r} + l_1\mathbf{a}_1 + l_2\mathbf{a}_2 + l_3\mathbf{a}_3) = e^{ik_1}\,\psi(\mathbf{r} + (l_1 - 1)\mathbf{a}_1 + l_2\mathbf{a}_2 + l_3\mathbf{a}_3)$$
$$= e^{ik_1 l_1}\,\psi(\mathbf{r} + l_2\mathbf{a}_2 + l_3\mathbf{a}_3) = e^{i(k_1 l_1 + k_2 l_2 + k_3 l_3)}\,\psi(\mathbf{r}). \tag{187}$$
To express this result in more general terms, we define a wave vector k
$$\mathbf{k} = k_1\,\frac{\mathbf{b}_1}{2\pi} + k_2\,\frac{\mathbf{b}_2}{2\pi} + k_3\,\frac{\mathbf{b}_3}{2\pi}, \tag{188}$$
where b1 , b2 and b3 are the reciprocal vectors corresponding to the unit
vectors a1 , a2 and a3 . Using eq. (8.1) and also eq. (8.22), eq. (187) can be
written
$$\psi(\mathbf{r} + \mathbf{l}) = e^{i\mathbf{k}\cdot\mathbf{l}}\,\psi(\mathbf{r}), \tag{189}$$
which is a mathematical expression for the theorem. The proof of the theorem
depends upon the potential energy being periodic.
A more general proof of Bloch’s theorem including the case for which the
energy E is degenerate can be found in the book by Ziman which is cited at
the end of Chapter 8.
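The translation property (189) is easy to check numerically in one dimension. The following Python sketch (my own illustration, not taken from the text) builds a Bloch function ψ(x) = e^{ikx} u(x), with u periodic of period a, and verifies that ψ(x + a) = e^{ika} ψ(x); the lattice constant, the wave number, and the particular periodic function u are all arbitrary choices.

    import numpy as np

    # Bloch function psi(x) = exp(ikx) u(x) with u(x + a) = u(x).
    a = 1.0                              # lattice constant (arbitrary)
    k = 0.7 * np.pi / a                  # wave number (arbitrary)
    x = np.linspace(0.0, 3.0 * a, 601)

    u = lambda s: 1.0 + 0.3 * np.cos(2.0 * np.pi * s / a)
    psi = np.exp(1j * k * x) * u(x)
    psi_shifted = np.exp(1j * k * (x + a)) * u(x + a)

    # One-dimensional form of eq. (189): psi(x + a) = exp(ika) psi(x).
    print(np.allclose(psi_shifted, np.exp(1j * k * a) * psi))   # True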
APPENDIX JJ
The Band Gap
To find the form of the wave function and the energy near the zone boundary, we consider the Schrödinger equation for the diffracted electron

\left[ -\frac{\hbar^2}{2m} \nabla^2 + V(r) \right] \psi_k = E(k)\, \psi_k.    (190)
According to eqs. (8.63) and (8.58), the wave function and the potential energy may be expressed in terms of plane waves as follows

\psi_k(r) = \sum_{g'} \alpha_{k - g'}\, e^{i (k - g') \cdot r},    (191)

and

V(r) = \sum_{g''} V_{g''}\, e^{i g'' \cdot r},    (192)
where the summations over g' and g'' extend over the vectors of the reciprocal lattice. Substituting these equations into eq. (190) and using eq. (8.63), we get

\sum_{g'} E^0(k - g')\, \alpha_{k - g'}\, e^{i (k - g') \cdot r} + \sum_{g', g''} V_{g''}\, \alpha_{k - g'}\, e^{i (k - g' + g'') \cdot r} = E(k) \sum_{g'} \alpha_{k - g'}\, e^{i (k - g') \cdot r},    (193)
where E^0(k - g') is the kinetic energy of a free electron with wave vector k - g'. The term on the right-hand side of this last equation may be grouped together with the first term on the left-hand side, giving

\sum_{g'} \left[ E^0(k - g') - E(k) \right] \alpha_{k - g'}\, e^{i (k - g') \cdot r} + \sum_{g', g''} V_{g''}\, \alpha_{k - g'}\, e^{i (k - g' + g'') \cdot r} = 0.    (194)

Multiplying eq. (194) through from the left by the function

\psi_k^{*}(r) = \frac{1}{V^{1/2}}\, e^{-i k \cdot r},    (195)
integrating over all space, and using eq. (8.56), we obtain

\sum_{g'} \left[ E^0(k - g') - E(k) \right] \alpha_{k - g'}\, \delta_{k,\, k - g'} + \sum_{g', g''} V_{g''}\, \alpha_{k - g'}\, \delta_{k,\, k - g' + g''} = 0.    (196)
The Kronecker delta δ_{k, k - g'} occurring in the first term is equal to one if k = k - g' and is otherwise equal to zero. This Kronecker delta thus has the effect of reducing the first summation to a single term, for which g' = 0. Similarly, the Kronecker delta in the second summation is one if g' = g'' and zero otherwise, and it reduces the double summation to a single summation. We have

\left[ E^0(k) - E(k) \right] \alpha_k + \sum_{g'} V_{g'}\, \alpha_{k - g'} = 0.    (197)
As illustrated in Fig. 8.22, the crystal field causes the wave function with wave vector k to interact with the state with wave vector k - g. For a particular reciprocal lattice vector g, we thus ignore all coefficients except α_k and α_{k-g}. Eq. (197) then becomes

\left[ E^0(k) - E(k) \right] \alpha_k + V_0\, \alpha_k + V_g\, \alpha_{k - g} = 0.    (198)
We note that the Fourier coefficient V_0 corresponds to a constant term in the potential energy V(r) and may thus be taken to be zero. This eliminates the second term in the above equation, and we obtain

\left[ E^0(k) - E(k) \right] \alpha_k + V_g\, \alpha_{k - g} = 0.    (199)
A second equation for the coefficients, α_k and α_{k-g}, can be obtained by multiplying eq. (194) through from the left by the function

\psi_{k - g}^{*}(r) = \frac{1}{V^{1/2}}\, e^{-i (k - g) \cdot r},    (200)

and integrating over all space as before. We get

\left[ E^0(k - g) - E(k) \right] \alpha_{k - g} + \sum_{g'} V_{g' - g}\, \alpha_{k - g'} = 0.    (201)
Limiting ourselves again to the terms depending upon the coefficients, α_k and α_{k-g}, we note that the term in the sum for which g' = g vanishes, since we have supposed that the Fourier coefficient V_0 is equal to zero. Setting g' = 0 in the above summation leads to the equation

V_{-g}\, \alpha_k + \left[ E^0(k - g) - E(k) \right] \alpha_{k - g} = 0.    (202)
The Fourier coefficient V_{-g} appearing in eq. (202) is equal to the coefficient V_g^{*}, which appears in the Fourier expansion of V(r)^{*}.

A trivial solution of eqs. (199) and (202) can be obtained by taking the coefficients, α_k and α_{k-g}, equal to zero. If the determinant of eqs. (199) and (202) is not equal to zero, this is the only solution of the equations. In order to find a physically meaningful description of the diffracted electron, we thus set the determinant of the coefficients equal to zero. We have

\begin{vmatrix} E^0(k) - E(k) & V_g \\ V_g^{*} & E^0(k - g) - E(k) \end{vmatrix} = 0.    (203)
At the zone boundary, the two free-electron states e^{i k \cdot r} and e^{i (k - g) \cdot r} have the same energy, E^0(k) = E^0(k - g). If we denote this common value by E^0, the quadratic equation resulting from eq. (203) can be written

\left[ E(k) - E^0 \right]^2 - |V_g|^2 = 0.    (204)

Eq. (204) has two solutions

E(k)_{\pm} = E^0 \pm |V_g|.    (205)
The interaction of the two free-electron states with wave vectors k and k - g thus causes a discontinuity in the energy at the zone boundary. The magnitude of the discontinuity depends upon the Fourier coefficients, V_g, which occur in the expansion of the potential energy.
An expression for the Fourier coefficients can be obtained by applying eq. (8.26) to the periodic function V(r), which gives

V_g = \frac{1}{v_{cell}} \int V(r)\, e^{-i g \cdot r}\, dV.    (206)
We now divide eq. (206) into real and imaginary parts by using Euler's formula

V_g = \frac{1}{v_{cell}} \int V(r) \cos(g \cdot r)\, dV - \frac{i}{v_{cell}} \int V(r) \sin(g \cdot r)\, dV.    (207)
Most of the common three-dimensional lattices are symmetric with respect to inversion r → -r. The body- and face-centered cubic structures and the hexagonal close-packed structures described in Chapter 8 are invariant with respect to inversion provided that a suitable choice is made of the origin. If the potential V(r) is symmetric with respect to inversion, the second integral in eq. (207) vanishes, and the equation for the Fourier coefficients becomes

V_g = \frac{1}{v_{cell}} \int V(r) \cos(g \cdot r)\, dV.    (208)
Since the electrons are attracted to the ion cores, the potential energy function V(r) is negative in the neighborhood of each atom. It thus follows from eq. (208) that the Fourier coefficients V_g are negative real numbers.
As discussed in Chapter 8, the crystal field mixes the two free-electron states e^{i k \cdot r} and e^{i (k - g) \cdot r} to produce states having energies E^{-} and E^{+}. We can derive a condition for the coefficients of the lower state by substituting the value of E^{-} given by eq. (205) into eq. (199) to obtain

|V_g|\, \alpha_k + V_g\, \alpha_{k - g} = 0.    (209)
Since V_g is negative, |V_g| is equal to -V_g. Using this result together with eq. (209), one may readily show that the coefficient α_{k-g} is equal to the coefficient α_k. Similarly, we can derive a condition for the coefficients of the upper state by substituting the value of E^{+} given by eq. (205) into eq. (199) and using the relation |V_g| = -V_g to obtain

V_g\, \alpha_k + V_g\, \alpha_{k - g} = 0.    (210)

This equation implies that for the upper state the mixing coefficient α_{k-g} is equal to -α_k. The spatial forms of the wave functions for the lower and upper states are discussed in the text.
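The two-level problem defined by eqs. (199) and (202) can also be treated numerically. The following Python sketch (an added illustration with assumed numbers, not taken from the text) diagonalizes the matrix of the coefficients at the zone boundary, where E^0(k) = E^0(k - g), and reproduces the energies (205) and the mixing coefficients implied by eqs. (209) and (210); the values of E^0 and V_g are arbitrary, with V_g chosen real and negative as argued after eq. (208).

    import numpy as np

    # 2 x 2 matrix of eqs. (199) and (202) at the zone boundary.
    E0 = 2.0      # common free-electron energy (arbitrary units)
    Vg = -0.3     # Fourier coefficient, real and negative

    H = np.array([[E0, Vg],
                  [Vg, E0]])

    energies, vectors = np.linalg.eigh(H)
    print(energies)                    # [1.7  2.3], i.e. E0 -+ |Vg|
    print(energies[1] - energies[0])   # 0.6 = 2|Vg|, the band gap

    # Lower state: alpha_{k-g} = alpha_k; upper state: alpha_{k-g} = -alpha_k.
    print(vectors[:, 0])               # proportional to [1, 1]
    print(vectors[:, 1])               # proportional to [1, -1]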
APPENDIX KK
Vector Spaces and Matrices
A vector space is defined to be a set of elements called vectors for which vector addition and scalar multiplication are defined. Vector addition assigns to every pair of vectors, x and y, a sum, x + y, in such a way that
(1) addition is commutative, x + y = y + x,
(2) addition is associative, x + (y + z) = (x + y) + z,
(3) there is in the vector space a unique vector 0 (called the origin) such that x + 0 = x for every vector x, and
(4) to every vector x in the space there corresponds a unique vector -x such that x + (-x) = 0.
The multiplication of a scalar α times a vector x assigns to every pair,
α and x, a vector αx in the vector space called the product of α and x.
Scalar-vector multiplication is such that
(1) multiplication by scalars is associative, α(βx) = (αβ)x,
(2) 1x = x for every vector x,
(3) multiplication by scalars is distributive with respect to vector addition,
α(x + y) = αx + αy, and
(4) multiplication by scalars is distributive with respect to scalar addition,
(α + β)x = αx + βx.
An elementary example of a vector space is the set of ordered n-tuples of numbers

x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}

with vector addition and scalar multiplication defined by the equations

x + y = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}, \qquad \alpha x = \begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix}.
If the components of the vectors in this space are complex numbers, the vector space is denoted C^n. The vector space consisting of ordered sets of real numbers is denoted R^n. Another example of a vector space is the set of polynomials

p_n(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n

with complex coefficients a_n. The addition of two polynomials and multiplication by a complex number are defined in the ordinary way.
A finite set \{x_i\} of vectors is said to be linearly dependent if there is a corresponding set \{\alpha_i\} of numbers, not all zero, such that

\sum_i \alpha_i x_i = 0,

where the zero on the right-hand side of the equation corresponds to the zero vector. If, on the other hand, \sum_i \alpha_i x_i = 0 implies that \alpha_i = 0 for each i, the set \{x_i\} is linearly independent. To illustrate the idea of linear dependence, we consider the following equation involving two vectors

\alpha_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \alpha_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
Using the rules we have given previously for scalar multiplication and vector addition in C^n, we can combine the two vectors on the left-hand side of the equation to obtain

\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.

This last equation can only be true if the coefficients, \alpha_1 and \alpha_2, are both equal to zero. We thus conclude that the vectors, [1\ 0]^T and [0\ 1]^T, are linearly independent. A basis in a vector space V is a set of linearly independent vectors X such that every vector in the space can be expressed as a linear combination of members of X. For instance, the vectors

\begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 \\ 1 \end{bmatrix}

form a basis in the space C^2.
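Linear independence can also be tested numerically by placing the vectors in the columns of a matrix and computing its rank; the columns are independent exactly when the rank equals the number of vectors. The short Python sketch below (my own illustration) confirms this for the two basis vectors above.

    import numpy as np

    # Columns are linearly independent iff the rank equals their number.
    vectors = np.array([[1, 0],
                        [0, 1]])
    print(np.linalg.matrix_rank(vectors) == vectors.shape[1])   # True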
An inner product in a vector space is a complex- or real-valued function of the ordered pair of vectors x and y such that
(1) (x, y) = \overline{(y, x)}, where the line over the ordered pair on the right indicates complex conjugation,
(2) (x, \alpha_1 y_1 + \alpha_2 y_2) = \alpha_1 (x, y_1) + \alpha_2 (x, y_2), and
(3) (x, x) \geq 0; (x, x) = 0 if and only if x = 0.
The condition (1) implies that (x, x) is always real, so that the inequality in (3) makes sense. In a space with an inner product defined, the norm of a vector ||x|| is defined

||x|| = \sqrt{(x, x)}.
The inner product thus makes it possible to associate a norm or length with every vector in the space. For the space C^n, the inner product of two vectors

x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}

is defined

(x, y) = \sum_{i=1}^{n} \overline{x_i}\, y_i.
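This inner product is available directly in numerical libraries. The following Python sketch (my own illustration) computes (x, y) and the norm ||x|| for two vectors in C^2; numpy's vdot conjugates its first argument, which matches the convention used here.

    import numpy as np

    # Inner product and norm in C^2.
    x = np.array([1.0 + 1.0j, 2.0])
    y = np.array([0.5, 1.0j])

    inner = np.vdot(x, y)                 # (x, y) = sum of conj(x_i) y_i
    norm = np.sqrt(np.vdot(x, x).real)    # ||x|| = sqrt((x, x))

    print(inner)   # (0.5+1.5j)
    print(norm)    # sqrt(6) = 2.449...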
Now for a little terminology. The vectors, x and y, are said to be orthogonal
if the inner product of the two vectors is equal to zero
(x, y) = 0.
A vector x is said to be normalized if its norm is equal to one
||x|| = 1,
and a basis of vectors {φi } is said to be orthonormal if each basis vector
is orthogonal to the other members of the basis and if each basis vector is
normalized.
The wave functions representing the states of a physical system may be thought of as vectors in a function space. The inner product of two wave functions, \psi_1 and \psi_2, which depend upon a single variable x is defined

(\psi_1, \psi_2) = \int \psi_1^{*}(x)\, \psi_2(x)\, dx,

and the inner product for wave functions depending on two or three variables is defined accordingly. For a particle moving in three dimensions, the inner product of the wave functions, \psi_1 and \psi_2, is

(\psi_1, \psi_2) = \int \psi_1^{*}(r)\, \psi_2(r)\, dV,
where dV is the volume element.
The presence of a basis in a vector space makes it possible to associate a column vector in C^n with every vector in the space and to associate a matrix with operators acting on vectors in the space. Let V be a vector space and let X = \phi_1, \phi_2, \ldots, \phi_n be a basis of V. Using the basis, a vector x may be expressed

x = \sum_{i=1}^{n} x_i \phi_i,    (211)

and we may associate a column vector,

x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix},

with each vector x.
The product of an operator A and a vector x is a vector which may also be expressed as a linear combination of the basis vectors \phi_i. This will be true when A acts on the members of the basis itself

A \phi_j = \sum_{i=1}^{n} a_{ij}\, \phi_i,    (212)

for j = 1, \ldots, n. The set \{a_{ij}\} of numbers, indexed with the double subscript i, j, is the matrix corresponding to A. We shall generally denote the matrix of A by [A]. The matrix may be written more explicitly in the form of a square array

[A] = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix}.    (213)
If the basis is orthonormal, an explicit expression can be derived for the matrix elements a_{ij} by using the inner product. Taking the inner product of eq. (212) from the left with \phi_k gives

(\phi_k, A \phi_j) = \left( \phi_k, \sum_{i=1}^{n} a_{ij} \phi_i \right) = \sum_{i=1}^{n} a_{ij}\, (\phi_k, \phi_i) = a_{kj}.
To derive this last equation, we have used the second property of the inner product and the fact that the basis is orthonormal. The last result may be written

a_{kj} = (\phi_k, A \phi_j).
The result of acting with an operator A upon a vector x can be obtained from the matrix associated with A and the column vector associated with x. Using eqs. (211) and (212), we obtain

A x = A \left( \sum_{i=1}^{n} x_i \phi_i \right) = \sum_{i=1}^{n} x_i\, A \phi_i = \sum_{i=1}^{n} x_i \sum_{j=1}^{n} a_{ji}\, \phi_j = \sum_{j=1}^{n} \left( \sum_{i=1}^{n} a_{ji}\, x_i \right) \phi_j.

We may write

(A x)_j = \sum_{i=1}^{n} a_{ji}\, x_i.
The j th component of the vector Ax may thus be obtained by forming the sum of the elements of the j th row of the matrix [A] times the components of the column vector [x]. As an example of this rule, we give the result of the matrix-vector multiplications in Problem 7 of Chapter 10.

\begin{bmatrix} 1 & 2 & 0 \\ 1 & 1 & 2 \\ 1 & 3 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \\ 4 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 4 \end{bmatrix}, \qquad \begin{bmatrix} 2 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 5 \\ 2 \\ 3 \end{bmatrix}.
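These products are easy to check numerically. The following Python sketch (my own illustration) reproduces the first matrix-vector product above using the rule (Ax)_j = \sum_i a_{ji} x_i, which is numpy's @ product.

    import numpy as np

    A = np.array([[1, 2, 0],
                  [1, 1, 2],
                  [1, 3, 1]])
    x = np.array([1, 1, 0])

    print(A @ x)   # [3 2 4], the first example above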
The matrix corresponding to the product of two operators, A and B, may be obtained by allowing AB to act upon an element of the basis

(AB) \phi_j = A (B \phi_j) = A \left( \sum_{k=1}^{n} b_{kj} \phi_k \right) = \sum_{k=1}^{n} b_{kj}\, A \phi_k = \sum_{k=1}^{n} b_{kj} \left( \sum_{i=1}^{n} a_{ik} \phi_i \right) = \sum_{i=1}^{n} \left( \sum_{k=1}^{n} a_{ik} b_{kj} \right) \phi_i.
We thus define the matrix product [A][B] by the equation

([A][B])_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}.    (214)
The process of forming the product of two matrices can be described in terms of the individual matrices. To obtain the ij th element of the product matrix, one forms the sum of the products of the elements of the i th row of [A] with the elements of the j th column of [B]. The matrix multiplication will be well defined only if the matrix [A] has as many columns as the matrix [B] has rows. As an example of the rule (214) for multiplying matrices, we give the result of the matrix-matrix multiplications in Problem 8 of Chapter 10.
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} = \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix}, \qquad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 0 & -2 \\ 2 & 0 \end{bmatrix},

\begin{bmatrix} 1 & 2 & 0 \\ 1 & 1 & 2 \\ 1 & 3 & 1 \end{bmatrix} \begin{bmatrix} 2 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 4 & 1 & 3 \\ 5 & 3 & 2 \\ 6 & 2 & 4 \end{bmatrix}.
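The rule (214) can be checked the same way. The following Python sketch (my own illustration) reproduces the last matrix-matrix product above.

    import numpy as np

    A = np.array([[1, 2, 0],
                  [1, 1, 2],
                  [1, 3, 1]])
    B = np.array([[2, 1, 1],
                  [1, 0, 1],
                  [1, 1, 0]])

    print(A @ B)   # [[4 1 3], [5 3 2], [6 2 4]], as above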
APPENDIX LL
Algebraic Solution of the Oscillator
We now return to the harmonic oscillator, using operator methods to obtain the wave functions and the energy. The harmonic oscillator is used in quantum field theory as a model for the behavior of ions vibrating in a crystal and as a model for the quantum mechanical treatment of the electromagnetic field. An expression for the energy of the harmonic oscillator is obtained by adding the kinetic energy p^2/2m to the general expression for the potential energy of the oscillator provided by eq. (1.4) of the introduction to obtain

E = \frac{1}{2m} p^2 + \frac{1}{2} m \omega^2 x^2.

Following the procedure described in Section 3.1, the energy operator is obtained by replacing the momentum p and the position x in this last equation by the operators, \hat{p} and \hat{x}. We obtain

\hat{H} = \frac{1}{2m} \hat{p}^2 + \frac{1}{2} m \omega^2 \hat{x}^2.    (1)
We recall that the momentum operator is given by eq. (3.2) and the position operator \hat{x} is equal to the position coordinate x.
To see how we might find the wave functions for the harmonic oscillator by algebraic methods, we first note that if the momentum operator \hat{p} were an ordinary number, we could factor the Hamiltonian operator for the oscillator given by eq. (1) as follows

\hat{H} = \frac{\hat{p}^2}{2m} + \frac{1}{2} m \omega^2 \hat{x}^2 = \hbar\omega \sqrt{\frac{m\omega}{2\hbar}} \left( \hat{x} - \frac{i \hat{p}}{m\omega} \right) \sqrt{\frac{m\omega}{2\hbar}} \left( \hat{x} + \frac{i \hat{p}}{m\omega} \right).    (2)
This factorization may readily be confirmed using the identity

(a - ib)(a + ib) = a^2 + b^2.

Such a factorization of the Hamiltonian cannot be carried out, however, because the momentum and the position are represented in quantum mechanics by operators, and the order of two operators cannot generally be interchanged as can the order of products of numbers.
The effect of changing the order of the momentum and position operators can be determined by using the explicit form of the momentum operator given by eq. (3.2). The product of the momentum and position operators is

\hat{p}\, \hat{x} = -i\hbar\, \frac{d}{dx}\, x.
We can study the properties of this operator product by allowing it to act on an arbitrary wave function ψ(x), giving

-i\hbar\, \frac{d}{dx}\, x\, \psi = -i\hbar\, \frac{d}{dx} (x \psi).

The derivative of (xψ) on the right can now be evaluated using the ordinary product rule to obtain

-i\hbar\, \frac{d}{dx} (x \psi) = -i\hbar\, \psi - i\hbar\, x\, \frac{d\psi}{dx}.
Bringing the last term on the right-hand side of the above equation over to the left-hand side and multiplying the whole equation through by -1, we obtain

\left( -i\hbar\, x\, \frac{d}{dx} + i\hbar\, \frac{d}{dx}\, x \right) \psi = i\hbar\, \psi.

Since this last equation is true for any arbitrary function ψ, we can write it as an operator identity

-i\hbar\, x\, \frac{d}{dx} + i\hbar\, \frac{d}{dx}\, x = i\hbar,

or

\hat{x}\, \hat{p} - \hat{p}\, \hat{x} = i\hbar.    (3)
The expression on the left-hand side of this last equation involving the position and momentum operators, \hat{x} and \hat{p}, may be written more simply using the idea of a commutator. The commutator of two operators, A and B, is defined by the equation

[A, B] = AB - BA.    (4)

Using this notation, eq. (3) becomes

[\hat{x}, \hat{p}] = i\hbar.    (5)
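The commutation relation (5) can be checked numerically by applying the operators to a sample wave function on a grid. The following Python sketch (my own illustration) does this with ħ = 1 and an arbitrary Gaussian test function; the agreement holds in the interior of the grid, away from the boundary points where the numerical derivative is least accurate.

    import numpy as np

    hbar = 1.0
    x = np.linspace(-5.0, 5.0, 2001)
    psi = np.exp(-x**2)                  # arbitrary smooth test function

    def p(f):
        # momentum operator p = -i hbar d/dx via a numerical derivative
        return -1j * hbar * np.gradient(f, x)

    lhs = x * p(psi) - p(x * psi)        # (x p - p x) psi
    rhs = 1j * hbar * psi
    print(np.allclose(lhs[10:-10], rhs[10:-10], atol=1e-3))   # True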
Even though the Hamiltonian operator (1) cannot be factored as simply as indicated in eq. (2), we would like to evaluate the product of the “factors” occurring in this equation. For this purpose, we define the two operators,

a^{\dagger} = \sqrt{\frac{m\omega}{2\hbar}} \left( \hat{x} - \frac{i \hat{p}}{m\omega} \right)    (6)

and

a = \sqrt{\frac{m\omega}{2\hbar}} \left( \hat{x} + \frac{i \hat{p}}{m\omega} \right).    (7)

We consider now the operator product

\hbar\omega\, a^{\dagger} a,    (8)

which is equal to the expression appearing after the second equality in eq. (2).
Using the commutation relation (5), the operator product (8) can be written

\hbar\omega\, a^{\dagger} a = \frac{1}{2} m \omega^2 \left( \hat{x}^2 + \frac{\hat{p}^2}{m^2 \omega^2} \right) + \frac{i\omega}{2}\, [\hat{x}, \hat{p}] = \frac{\hat{p}^2}{2m} + \frac{1}{2} m \omega^2 \hat{x}^2 - \frac{1}{2} \hbar\omega.    (9)
The first two terms appearing after the second equality can be identified as the oscillator Hamiltonian \hat{H} given by eq. (1). Bringing the last term on the right-hand side of eq. (9) over to the left-hand side and then interchanging the two sides of the equation, we obtain

\hat{H} = \hbar\omega \left( a^{\dagger} a + \frac{1}{2} \right).    (10)
This equation may be regarded as the corrected version of the naive factorization (2).
We note that the order of the operators a^{\dagger} and a in eq. (9) is important. The same line of argument with the operators, a^{\dagger} and a, interchanged leads to the equation

\hbar\omega\, a a^{\dagger} = \frac{\hat{p}^2}{2m} + \frac{1}{2} m \omega^2 \hat{x}^2 + \frac{1}{2} \hbar\omega.    (11)

We can obtain a commutation relation for the operators a and a^{\dagger} by subtracting eq. (9) from eq. (11), giving

[a, a^{\dagger}] = a a^{\dagger} - a^{\dagger} a = 1.    (12)
The commutation relation between a and a^{\dagger} may be used to derive a relation between successive eigenfunctions of the oscillator Hamiltonian. Suppose that the wave function ψ is an eigenfunction of the Hamiltonian (10) corresponding to the eigenvalue E. Then ψ satisfies the equation

\hbar\omega \left( a^{\dagger} a + \frac{1}{2} \right) \psi = E \psi.    (13)
Multiplying this equation from the left with a^{\dagger} gives

\hbar\omega \left( a^{\dagger} a^{\dagger} a + \frac{1}{2} a^{\dagger} \right) \psi = E\, a^{\dagger} \psi.

The first term on the left-hand side of this last equation may be rewritten by using the commutation relation (12) to replace a^{\dagger} a with a a^{\dagger} - 1, giving

\hbar\omega \left( a^{\dagger} a - 1 + \frac{1}{2} \right) a^{\dagger} \psi = E\, a^{\dagger} \psi.

Finally, we take the second term on the left-hand side over to the right-hand side to obtain

\hbar\omega \left( a^{\dagger} a + \frac{1}{2} \right) a^{\dagger} \psi = (E + \hbar\omega)\, a^{\dagger} \psi.
[Figure 1: The effect of the raising and lowering operators on the states of the harmonic oscillator. The figure shows a ladder of levels at energies . . . , E - 2\hbar\omega, E - \hbar\omega, E, E + \hbar\omega, E + 2\hbar\omega, E + 3\hbar\omega, . . . , connected upward by a^{\dagger}, (a^{\dagger})^2, (a^{\dagger})^3 and downward by a, a^2.]
Hence, the wave function a^{\dagger} \psi is an eigenfunction of the oscillator Hamiltonian corresponding to the eigenvalue E + \hbar\omega. We shall thus refer to a^{\dagger} as a raising operator or a step-up operator. It transforms an eigenfunction of \hat{H} into an eigenfunction corresponding to the next higher eigenvalue. In the same way, the operator a may be shown to be a lowering operator or step-down operator, which transforms an eigenfunction of \hat{H} into an eigenfunction corresponding to the next lower eigenvalue. The effect of the raising and lowering operators on the states of the harmonic oscillator is illustrated in Fig. 1.
By repeatedly operating on an eigenfunction of the harmonic oscillator with
the lowering operator, one can produce eigenfunctions corresponding to lower
and lower eigenvalues. Allowed to continue indefinitely, this process would eventually lead to energy eigenvalues less than zero. However, one may show that the
harmonic oscillator, which has a potential energy that is positive everywhere,
cannot have a bound state with a negative eigenvalue.
The lowering process must end in some way. This can only happen by the
product of the lowering operator a and the wave function producing a zero
function. Denoting the lowest bound state by ψ0 , we must have
a \psi_0 = 0.    (14)
Since every term in an eigenvalue equation depends upon the eigenfunction, an
eigenvalue equation will always be satisfied by a function which is equal to zero
everywhere.
The energy of the lowest state can be determined by substituting the function \psi_0 into eq. (13) to obtain

\hbar\omega \left( a^{\dagger} a + \frac{1}{2} \right) \psi_0 = E_0\, \psi_0.

Since a \psi_0 = 0, the first term on the left-hand side of this equation must be equal to zero. We must have

E_0 = \frac{1}{2} \hbar\omega.
A state of the oscillator with \hbar\omega more energy can be obtained by operating with the operator a^{\dagger} upon the lowest state, and this process can be continued, producing states with greater and greater energy. The energy levels of the oscillator are thus given by the formula

E = \hbar\omega (n + 1/2).
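The ladder-operator algebra can be made concrete with finite matrices. In the basis of oscillator eigenstates the lowering operator a has the matrix elements \sqrt{n} on its first superdiagonal, a standard result that follows from the raising and lowering relations above (it is not derived in this appendix). The following Python sketch (my own illustration) truncates the operators to N states and recovers the eigenvalues \hbar\omega(n + 1/2) of eq. (10); the commutator [a, a^{\dagger}] comes out equal to one except in the last diagonal entry, an artifact of the truncation.

    import numpy as np

    N = 6
    a = np.diag(np.sqrt(np.arange(1, N)), k=1)   # lowering operator
    a_dag = a.conj().T                           # raising operator

    # [a, a+] = 1 except in the last diagonal entry (truncation artifact).
    print(np.diag(a @ a_dag - a_dag @ a))        # [ 1  1  1  1  1 -5]

    # H / (hbar omega) = a+ a + 1/2 has eigenvalues n + 1/2.
    H = a_dag @ a + 0.5 * np.eye(N)
    print(np.linalg.eigvalsh(H))                 # [0.5 1.5 2.5 3.5 4.5 5.5]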
Eq. (14) can be solved for the wave function corresponding to the lowest eigenvalue. More energetic states can then be obtained by operating upon \psi_0 with the step-up operator a^{\dagger}. Substituting eq. (7) into eq. (14), we obtain

\left( \hat{x} + \frac{i \hat{p}}{m\omega} \right) \psi_0 = 0.
This equation may be written as a differential equation by using the explicit expression for the momentum operator given by eq. (3.2) to get

\frac{\hbar}{m\omega}\, \frac{d\psi_0}{dx} + x\, \psi_0 = 0,

or

\frac{d\psi_0}{\psi_0} = -\frac{m\omega}{\hbar}\, x\, dx.
Integrating this last equation, we obtain

\ln \psi_0 = -\frac{m\omega}{2\hbar}\, x^2 + \ln A_0,

where we have denoted the arbitrary integration constant as \ln A_0. Taking the term \ln A_0 over to the left-hand side of the equation and using the properties of the natural logarithm, we obtain

\ln \frac{\psi_0}{A_0} = -\frac{m\omega}{2\hbar}\, x^2.
The lowest eigenfunction \psi_0 can then be obtained by taking the antilogarithm of both sides of the equation and rearranging terms to get

\psi_0 = A_0\, e^{-(m\omega / 2\hbar)\, x^2}.    (15)
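Eq. (15) can be spot-checked numerically. The following Python sketch (my own illustration) verifies on a grid that the Gaussian satisfies the first-order equation d\psi_0/dx = -(m\omega/\hbar)\, x\, \psi_0, using units with ħ = m = ω = 1 and the normalization A_0 = 1, all arbitrary choices made for the check.

    import numpy as np

    x = np.linspace(-4.0, 4.0, 801)
    psi0 = np.exp(-0.5 * x**2)               # eq. (15) with m*omega/hbar = 1

    residual = np.gradient(psi0, x) + x * psi0
    print(np.max(np.abs(residual)) < 1e-3)   # True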
All of the results obtained in this section have been obtained before by solving the Schrödinger equation for the oscillator. The energy and the wave function of the oscillator are given by eqs. (2.44) and (2.45). We leave it as an exercise to use the methods developed in this section to obtain the wave functions of the first two excited states of the oscillator. The algebraic solution of this problem provides its own insights into the oscillator. Since the harmonic oscillator can only vibrate with a single frequency, all of its states can be regarded as excitations of a single state. The first excited state of the oscillator corresponds to a state for which the oscillator has absorbed a single photon with energy \hbar\omega, and higher excited states correspond to states for which the oscillator has absorbed additional photons. This way of thinking about the oscillator is used in quantum field theory as a model for the quantization of the electromagnetic field.