Quantum Physics Lecture Notes
Understanding the Schrödinger Equation

Sebastian de Haro
Amsterdam University College, Fall Semester, 2014

Cover illustration: Wikipedia
Contents

Introduction
1 Motivating the Schrödinger Equation
  1.1 Classical waves
  1.2 Enter the quantum
  1.3 The wave function
2 Four Steps to Solve the Schrödinger Equation
  2.1 The Problem
  2.2 Step 1: Reduce to TISE
  2.3 Step 2: General Solution of TDSE
  2.4 Step 3: Impose Initial Condition Ψ0(x)
  2.5 Step 4: Plug Back into Ψ(x, t)
  2.6 Example: Gaussian Wave Function
3 One-Dimensional Potentials
  3.1 General Theorems
  3.2 The TISE with one-dimensional potentials
4 Fourier Integrals and the Dirac Delta
  4.1 The Dirac delta
  4.2 Fourier transformations
5 The Formalism of Quantum Mechanics
  5.1 Why?
  5.2 The Postulates of Quantum Mechanics
  5.3 Linear Algebra
  5.4 Continuous spectra
6 Dirac Notation
  6.1 Base-free notation
  6.2 Closure
  6.3 Bras and kets
7 The Interpretation of Quantum Mechanics
  7.1 EPR and Hidden Variables
  7.2 Bohr's Reply to EPR
  7.3 The Measurement Problem
  7.4 Other Approaches
    7.4.1 Decoherence
    7.4.2 Many Worlds
A Mathematical Formulas and Tricks
  A.1 Gaussian integration
B Technicalities of Quantum Mechanical Measurements
  B.1 Time Evolution Operators
  B.2 The Measurement Operator U
References
Introduction

Welcome to the fascinating world of quantum mechanics! We are going to learn how to do computations in quantum mechanics and how to interpret our results physically. I say quantum mechanics is fascinating because it is so different from any other physical theory we have ever seen before. In classical physics, particles travel along trajectories that can be drawn in space and time. In quantum mechanics, particles don't have trajectories, and sometimes we can't even say where they are located in space! In fact, quantum mechanics is so weird that some of the scientists who made major contributions to it (notoriously, Albert Einstein and Erwin Schrödinger) never believed the interpretation of quantum theory that has now become standard. Richard Feynman famously said, "I think I can safely say that nobody understands quantum mechanics." So, remember: if you get confused, you are in good company. However, I don't think Feynman was entirely right on this one. We don't seem to have quantum mechanical intuition wired into our brains and bodies (as we do learn to appreciate weight, height, etc., intuitively); Feynman was certainly right about that. But we can learn to work with the theory and progressively develop a rational intuition for it, based on what we learn from the formulas and the physical principles they embody. By logical thinking and some guesswork, by looking at the experimental facts (black body radiation, photoelectric effect, double slit experiment, etc.), we can try to find the necessary physical principles that will allow us to construct the theory of quantum mechanics. We can't derive quantum mechanics, let me be clear about that, just like we can't derive, from sheer thinking about the concept of mass, the fact that Newton's law of gravity decreases with the square of the distance. But we can look for the minimal set of principles and equations that will reproduce and unify all of the facts we know about quantum mechanics. This is what Schrödinger's equation does for us. Once we have motivated why it should be true, we have to learn how to work with it and to interpret it. Indeed this is the right order: first learn to work with it, then gain a deeper understanding of what it means. Remember that in the short period in which the `new' quantum theory was developed (basically, Christmas of 1925 to the summer of 1926), Heisenberg and Schrödinger worked on the math, and only later on did they focus on its meaning. We will be doing both things as simultaneously as possible: developing the theory while discussing its interpretation.
These lecture notes are to be used as a complement to a textbook on quantum mechanics such as Griffiths' [1]. I will focus on a number of selected issues that I think are important to understand quantum mechanics. In the first chapter I motivate how the Schrödinger equation comes to be in the first place, rather than just throwing it at you and asking you to trust quantum mechanics. Whereas the Schrödinger equation cannot be derived from classical mechanics, it can be motivated from semi-classical considerations. I also expand, in later sections, on mathematical explanations and include study tools.

Writing down a mathematical theory of quantum mechanics assumes knowledge of the basic experiments that led to this theory and the broad principles that were derived from them. Therefore I recommend that, before you start reading these lecture notes, you refresh your memory (or catch up, whichever may be the case) on the most relevant historical experiments and the physical principles that were drawn from them. I recommend chapter 37 of Giancoli [2]. If you want a more thorough exposition, you can read chapter 1 of [3]. I will refer to these experiments regularly.
1 Motivating the Schrödinger Equation

1.1 Classical waves

We recall some concepts from classical physics that will both serve to fix notation and motivate the way we introduce waves in quantum mechanics. Consider a one-dimensional wave, say:

    y(x) = A sin kx .    (1)

This is a static, sinusoidal wave extending in the x-direction. A, the maximum height, is called the amplitude of the wave. The wavelength is the distance between two identical points on the wave (e.g. two crests, two troughs, two nodes), and in the above example it is given by λ = 2π/k, because this is the smallest number for which the wave repeats itself, i.e. y(x + λ) = y(x). k is usually called the wave number. Consider now a wave traveling at speed v:

    y(x, t) = A sin(k(x − vt)) = A sin((2π/λ)(x − vt)) .    (2)

The period T of this wave is the time it takes for the wave to go back to itself in time, analogously to the wavelength: y(x, t + T) = y(x, t), hence T = λ/v. You see that the interpretation of v as the speed is right, as v is indeed the wavelength divided by the period. The frequency is defined as ν = 1/T, and the angular frequency is ω = 2πν = 2π/T. The advantage of using the wave number and the angular frequency is that they include the periodicity of the sine and cosine, and the factors of 2π nicely disappear:

    y(x, t) = A sin(kx − ωt) .    (3)

This is a notation that we will use quite often. I am deliberately using ν for the frequency here and not f, because this is the standard notation in more advanced texts. We will use the Greek alphabet extensively in this course, so you had better get used to it!: α, β, γ, δ, ε, ...
Complex waves. In quantum mechanics we will always use complex waves, for reasons that will become clear later. A complex wave looks like:

    Ψ(x, t) = A e^{i(kx−ωt)} .    (4)
Now recall the following result:

Euler's formula. e^{iϕ} = cos ϕ + i sin ϕ.

Proof. This can be proven by using the following relations for the sine and cosine, which you should know:

    sin ϕ = (e^{iϕ} − e^{−iϕ}) / 2i
    cos ϕ = (e^{iϕ} + e^{−iϕ}) / 2 .    (5)

From this, we get:

    cos ϕ + i sin ϕ = (1/2) e^{iϕ} + (1/2) e^{−iϕ} + i (1/2i) e^{iϕ} − i (1/2i) e^{−iϕ} = e^{iϕ} ,    (6)

which proves Euler's formula.
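The proof is also easy to check numerically. The following Python snippet (not part of the original notes, just a sanity check) verifies Euler's formula and the relations (5) at a few angles:

```python
import cmath
import math

def check_euler(phi: float) -> None:
    """Verify Euler's formula and the relations (5) at the angle phi."""
    # Euler's formula: e^{i phi} = cos(phi) + i sin(phi)
    assert abs(cmath.exp(1j * phi) - (math.cos(phi) + 1j * math.sin(phi))) < 1e-12

    # Relations (5): sine and cosine as combinations of e^{i phi} and e^{-i phi}
    sin_from_exp = (cmath.exp(1j * phi) - cmath.exp(-1j * phi)) / (2j)
    cos_from_exp = (cmath.exp(1j * phi) + cmath.exp(-1j * phi)) / 2
    assert abs(sin_from_exp - math.sin(phi)) < 1e-12
    assert abs(cos_from_exp - math.cos(phi)) < 1e-12

for phi in [0.0, 0.3, 1.0, math.pi / 2, 2.5]:
    check_euler(phi)
```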
Remark. If you don't recognise formula (5) at all, try to prove it using the Taylor expansions (Taylor series) of the sine, cosine, and exponential functions, which you should know:

    sin x = Σ_{n=0}^{∞} (−1)^n x^{2n+1} / (2n+1)!
    cos x = Σ_{n=0}^{∞} (−1)^n x^{2n} / (2n)!
    e^x = Σ_{n=0}^{∞} x^n / n! .    (7)

As you can see, the cosine is an even function of x, hence it only has terms with even powers, x^{2n}, whereas the sine is odd and only has odd powers x^{2n+1}. By taking combinations of e^{iϕ} and e^{−iϕ} with plus and minus sign, respectively, in (5), we cancel the odd/even terms and are left with the cosine or sine, respectively.
Because of Euler's formula, we see that the cosine is the real part of the wave while the sine is its imaginary part:

    cos ϕ = Re e^{iϕ}
    sin ϕ = Im e^{iϕ} .    (8)

Hence, we can always replace sinusoidal waves such as (3) by complex waves (4) and take the real or imaginary part as needed.
Having made these remarks on complex waves, we now go back to the physical meaning of the quantities involved in a wave. I want to make the following point about the amplitude A. When we have a wave of the type A sin x or A e^{ix}, the quantity that is of physical interest is not A itself, but |A|^2 ≡ AA*. Mathematically, the reason is that A could be made to be negative by simply changing our coordinate by x → −x in sin x, and obviously that shouldn't affect the amplitude, which is independent of the orientation of the coordinate. Also, in a complex wave e^{iϕ} we could shift ϕ → ϕ + α by some constant α and generate a complex phase e^{iα}, which would then become part of A. But that too is a change of variables (namely, choosing a different zero point for ϕ) which should not affect the amplitude. For these reasons, the physical quantities of interest depend on the absolute value (squared) of the amplitude, |A|^2, and not on A itself.
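Both invariances are easy to see numerically. A minimal Python sketch (the amplitude and phase values are arbitrary, chosen only for illustration):

```python
import cmath

A = 1.5 - 0.8j    # some complex amplitude (illustrative value)
alpha = 0.7       # an arbitrary constant phase shift

# Shifting phi -> phi + alpha moves a factor e^{i alpha} into the amplitude:
A_shifted = A * cmath.exp(1j * alpha)

# |A|^2 = A A* is unchanged by the phase shift, and also by A -> -A:
assert abs(abs(A_shifted) ** 2 - abs(A) ** 2) < 1e-12
assert abs(abs(-A) ** 2 - abs(A) ** 2) < 1e-12
```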
1.2 Enter the quantum

Experiments such as the two-slit experiment make clear that particles sometimes behave as waves. This is true not only for light, but also for material particles such as electrons. Indeed, in a flash of genius Louis de Broglie hypothesized that not only does electromagnetic radiation have a dual nature as waves and as particles (photons), but that matter, usually believed to be of corpuscular nature, should also possess wave-like properties.

For photons, Planck's 1900 hypothesis (clarified and generalized by Einstein's 1905 description of the photoelectric effect) had been that the energy is quantized in units of h (the Planck constant): E = hν = hc/λ. The quantity p = E/c = h/λ had been identified as the photon's momentum, which played an important role in the impressive Compton effect, where a single photon could hit an electron at rest and impart it non-zero momentum.

De Broglie now proposed that, analogously to the case of light, matter waves also have a frequency ν and a wavelength λ, given by:

    ν = E/h
    λ = h/p .    (9)

Indeed it is one of the great features of Planck's constant h that, by introducing a fundamental constant with units J·s, such relations between frequency and energy and between wavelength and momentum can be written down. For the same reasons for which one introduces the angular frequency and the wave number, we will often use the reduced Planck constant

    ℏ = h/2π ,    (10)

which is usually simply referred to as `Planck's constant'. In terms of these, the above read:

    E = ℏω
    p = ℏk .    (11)

Furthermore, remember the classical relation E = T + V. If we ignore the potential, we have, classically:

    E = p^2/2m = ℏ^2 k^2/2m ,    (12)

where m is the mass of the electron. We will use this expression a lot in what follows.

Remark. The above formulas (11)-(12) also imply a relation between the frequency and the wavelength of the wave (viz. between the angular frequency and the wave number): ω(k) = ℏk^2/2m. Such a relation is called a dispersion relation because it determines the effect of dispersion of the wave as it travels through a medium (namely, by giving the speed as a function of the wavelength). The above dispersion relation holds for a matter particle. A photon, on the other hand, is massless and has the dispersion relation E = pc.
The central question that Schrödinger asked, then, is as follows. We know that, at least semi-classically, electrons have energy given by (12). But they are also waves of the type:

    Ψ(x, t) = A e^{i(kx−ωt)} .    (13)

So how can such a wave have energy (12)? The answer from optics would be: write down a wave equation! So following de Broglie's lead, Schrödinger had the idea of writing down a wave equation that would reproduce (12). The equation in question is the following:

    iℏ ∂Ψ/∂t = −(ℏ^2/2m) ∂^2Ψ/∂x^2 .    (14)

As you can see by filling (13) into this equation, (14) looks like:

    iℏ (−iω) A e^{i(kx−ωt)} = −(ℏ^2/2m)(−k^2) A e^{i(kx−ωt)} ,    (15)

which holds if ω = ℏk^2/2m, which is precisely the condition (12). So the expression for the energy does appear as a consequence of Schrödinger's wave equation (14): the wave equation forces on us a relation between the angular frequency ω and the wave number k, and this relation is nothing but the classical energy relation (12). This is nice: waves seem to describe electrons too!
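You can confirm the substitution of (13) into (14) numerically as well as by hand. A minimal Python sketch, using finite differences for the derivatives (ℏ, m and k are set to illustrative values, with ℏ = m = 1):

```python
import numpy as np

hbar, m, k = 1.0, 1.0, 2.0
omega = hbar * k**2 / (2 * m)   # the dispersion relation (12)

def Psi(x, t):
    """The plane wave (13) with A = 1."""
    return np.exp(1j * (k * x - omega * t))

x, t, eps = 0.4, 0.1, 1e-4

# Finite-difference approximations of the derivatives appearing in (14):
dPsi_dt = (Psi(x, t + eps) - Psi(x, t - eps)) / (2 * eps)
d2Psi_dx2 = (Psi(x + eps, t) - 2 * Psi(x, t) + Psi(x - eps, t)) / eps**2

lhs = 1j * hbar * dPsi_dt
rhs = -hbar**2 / (2 * m) * d2Psi_dx2
assert abs(lhs - rhs) < 1e-6    # (14) is satisfied because omega = hbar k^2 / 2m
```

If you change `omega` away from ℏk²/2m, the assertion fails: the wave equation really does enforce the dispersion relation.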
This is the one-dimensional Schrödinger equation for a free particle (free electron), i.e. one with zero potential energy, hence satisfying (12). Now if there is potential energy V around, (12) will change to:

    E = p^2/2m + V ,    (16)

again a result from classical mechanics. So it is natural to modify the Schrödinger equation as follows:

    iℏ ∂Ψ/∂t = −(ℏ^2/2m) ∂^2Ψ/∂x^2 + V(x)Ψ(x) ≡ HΨ(x) .    (17)

Here, V(x) is the potential function, that is, the function from which (in classical mechanics) the force can be derived. And the Hamiltonian H was defined as:

    H = −(ℏ^2/2m) ∂^2/∂x^2 + V(x) .    (18)

Footnote: Ironically, quantum theory originated from considerations of the wave vs. particle nature of photons, but the Schrödinger equation only describes matter particles such as electrons and protons which, because they are massive, travel at low speeds. Photons travel at the speed of light, and one needs to take relativistic effects into account in order to describe them quantum mechanically, which in turn means that one needs to generalize the Schrödinger equation and replace it by an appropriate equation incorporating special relativity. The reason the Schrödinger equation works for massive particles such as the electron is that it is based on the non-relativistic limit of the energy (12), which can be conveniently `quantized', as I show below.
The Hamiltonian is an operator (a kind of derivative) that represents energy in quantum mechanics. If you have studied the Hamiltonian formalism in classical mechanics, you might remember that classically the Hamiltonian is the sum of the kinetic and the potential energy functions, T + V. The Hamiltonian we just defined is not a function, but an operator: just like the derivative ∂/∂x (also an operator), it acts on functions: Hψ(x) ≡ −(ℏ^2/2m) ∂^2ψ/∂x^2 + V(x)ψ(x). That is, given a (wave) function ψ(x), an operator assigns to it a new wave function Hψ(x), defined by the equation above. In quantum mechanics, classical quantities such as the energy are replaced by operators. A quantum system may not always have a definite energy, but the energy operator can always be defined. We will see later on under what conditions we reproduce classical formulas like (16).
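The statement "H maps a function ψ to a new function Hψ" can be made concrete on a computer: represent ψ on a grid and apply (18) via finite differences. A sketch (units with ℏ = m = 1; the harmonic potential and Gaussian are chosen purely for illustration):

```python
import numpy as np

hbar, m = 1.0, 1.0                 # illustrative units
x = np.linspace(-5, 5, 2001)
dx = x[1] - x[0]

def apply_H(psi, V):
    """Return H psi = -(hbar^2/2m) psi'' + V psi on the grid, as in (18)."""
    d2psi = np.zeros_like(psi)
    d2psi[1:-1] = (psi[2:] - 2 * psi[1:-1] + psi[:-2]) / dx**2
    return -hbar**2 / (2 * m) * d2psi + V * psi

V = 0.5 * x**2                     # a harmonic potential, for illustration
psi0 = np.exp(-x**2 / 2)           # Gaussian test function

Hpsi = apply_H(psi0, V)            # H maps psi0 to a new function on the grid

# For this particular choice, H psi0 = (1/2) psi0 (energy 1/2 in these units),
# so the new function is proportional to the old one (away from the edges):
ratio = Hpsi[500:-500] / psi0[500:-500]
assert np.allclose(ratio, 0.5, atol=1e-4)
```

For a generic ψ, Hψ is not proportional to ψ; the special functions for which it is are exactly the "definite energy" states we meet in the next chapter.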
The above is not a derivation of the Schrödinger equation. All I have done is motivate that, given classical formulas such as (12), and given the assumption that particles are also waves like (13), we can write down a formula that reproduces (12) and hence encompasses both principles. But I have not derived it. Then, based on classical mechanics, I made it plausible that (17) is the right generalization to the case of non-vanishing potential. Much of what we will do in this course is solving the Schrödinger equation (17) for various forms of the potential V(x), and finding out precisely under what conditions we can make predictions of the type (16).
1.3 The wave function

We will now look in more detail at the interpretation of the wave function. Before we proceed, let me remark that the wave function Ψ is necessarily a complex quantity. We cannot stick to its real or to its imaginary part (as we would do with, for instance, classical electromagnetic waves) because the time evolution will normally develop an imaginary part, even if we start with a purely real wave function. The reason is in the form of the Schrödinger equation (17). Taking the complex conjugate of this equation, we get a different equation:

    −iℏ ∂Ψ*/∂t = −(ℏ^2/2m) ∂^2Ψ*/∂x^2 + V Ψ* .    (19)

Notice the difference in sign on the left-hand side compared to (17). Thus, Ψ and Ψ* satisfy different equations, which means that the real and the imaginary part of the wave function contain different information. Roughly speaking, Ψ* is the mirror image of Ψ under t → −t, and if Ψ propagates `forward' in time, we can say that Ψ* propagates `backward'. In other words, it is not enough to look for real solutions of the Schrödinger equation, as we would be losing essential information.
Interpretation. Schrödinger originally thought (although he was careful enough not to write this in his paper) that the absolute value squared of the wave function, |Ψ|^2 = ΨΨ*, could be interpreted as some kind of charge density or particle density distributed over space. This was analogous to the classical theory of light, where the intensity of the light is proportional to the square of the field. As it turns out, this interpretation does not sit well with quantum mechanics. Consider doing the double-slit experiment with a single particle. Repeating the experiment many times, the pattern that appears is described by the distribution |Ψ(x, t)|^2. Since every time we do the experiment there is just one particle in the set-up, this means that the wave function is not the average particle density in a single experiment, but rather an average over many experiments. This is usually called an ensemble average. It is Max Born who is responsible for the interpretation of |Ψ|^2 as a probability density: |Ψ(x, t)|^2 dx is the probability to find, upon detection, a particle within a small neighbourhood (x, x + dx) of x at time t. Notice the insistence on detection: this is necessary because before we measure the particle we cannot make any assumptions about its location. In the double slit experiment that we discussed earlier, we cannot say which slit the particle went through unless we measure it. This is another reason why |Ψ(x, t)|^2 cannot be interpreted as a `particle density': the particles cannot be localized until we measure them.
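The ensemble reading of |Ψ|² is easy to simulate: draw many independent "detections" from the distribution |Ψ(x)|² and check that their statistics match it. A Python sketch (the Gaussian wave function and sample size are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

x = np.linspace(-5, 5, 1001)
psi = np.pi**-0.25 * np.exp(-x**2 / 2)   # a normalized Gaussian wave function
prob = np.abs(psi)**2                     # Born probability density |psi|^2
prob /= prob.sum()                        # discretize to a probability mass

# Each 'experiment' detects the single particle at one random position:
detections = rng.choice(x, size=100_000, p=prob)

# Over many repetitions the empirical statistics match |psi|^2:
assert abs(detections.mean()) < 0.05                   # <x> = 0 here
assert abs(detections.std() - 1 / np.sqrt(2)) < 0.05   # std of |psi|^2 is 1/sqrt(2)
```

Any single run gives one dot on the "screen"; only the histogram of many runs reproduces |Ψ|², which is exactly the point made above.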
Let me now further motivate why the probability grows like the square of the wave function, instead of the wave function itself. I have already mentioned that, for waves, only amplitudes such as |Ψ|, and not Ψ itself, can have a physical meaning. This is because Ψ can become negative, even complex (something we don't want for a probability). Now we also must explain why |Ψ|^2, and not |Ψ|, is the relevant quantity. This follows from the wavelike nature of the wave function, and I will motivate it with three examples:

1. The electromagnetic field. The intensity of radiation (the intensity of the radiation emitted by, say, an antenna, or the amount of light emitted by a light bulb, both manifestations of the electromagnetic field) grows proportionally to the square of the electromagnetic field E, not linearly with the field. And it is the electromagnetic field which satisfies the differential equations of electromagnetism (called Maxwell's equations), something analogous to our Schrödinger equation. This intensity of light is proportional to the number of photons with the given frequency (and, hence, to the density of these photons). It makes sense that the probability for a particle to have a given frequency be the quantum mechanical quantity corresponding to the number of particles with a given frequency in classical electromagnetism.
Footnote (on the Born rule): It is incorrect to say that |ψ(x)|^2 is the probability to find the particle at point x. The probability to find a particle exactly at point x is in fact zero. On the other hand, the probability to find the particle in the neighborhood (x, x + dx) is infinitesimal and non-zero. The probability to find a particle in a finite interval (a, b) is P(a, b) = ∫_a^b |ψ(x)|^2 dx. From the latter formula we see that, for any point x, P(a, x) = ∫_a^x |ψ(y)|^2 dy (notice the different name for the dummy variable y), from which it follows that ∂P/∂x = |ψ(x)|^2. Hence |ψ(x)|^2 itself can be interpreted as the spatial rate of change, or gradient, of the probability at point x.

Footnote (on intensity): The intensity of radiation is defined as the amount of energy emitted per time per unit area, that is, the power transported across a given unit area perpendicular to the flow. It is measured in units of W/m^2.

2. The harmonic oscillator. We have seen that the Schrödinger equation could be interpreted as reproducing the relation between the energy and the momentum (12) (for the case of a simple wave like (13)). We extended it to (17) in the case of non-zero potential. So let us actually compute this energy for a familiar classical system with non-zero potential, the harmonic oscillator. The potential is V = (1/2)kx^2 and the energy function is H = (1/2)mẋ^2 + (1/2)kx^2. Solving Newton's equation gives x = A sin(ωt + φ); plugging this back into the energy function and using ω = √(k/m), we get the familiar result: E = (1/2)kA^2, indeed the square of the amplitude. So, again, classical energies behave like squares of amplitudes.

3. Conservation of probability. Taking P ∝ |Ψ|^2, one can prove a `probability conservation theorem' completely analogous to the charge conservation theorem in electrodynamics (if charge disappears, it must be taken away by a current, i.e. the time derivative of the charge density is minus the divergence of the current: ∂ρ/∂t = −∇·J). The analogous result for probabilities involves the time derivative of the probability (see problem 1.14 of [1]). This only works if one takes the square of the amplitude to be the probability.
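The classical computation in example 2 is easy to verify numerically: evaluate the energy function along the solution x(t) = A sin(ωt + φ) and confirm that it is constant and equal to (1/2)kA². A short Python sketch (the mass, spring constant, amplitude and phase are arbitrary illustrative values):

```python
import numpy as np

m, k = 2.0, 8.0                  # illustrative mass and spring constant
A, phi = 0.5, 0.3                # amplitude and phase of the solution
omega = np.sqrt(k / m)           # omega = sqrt(k/m)

t = np.linspace(0, 10, 1000)
x = A * np.sin(omega * t + phi)          # solution of Newton's equation
v = A * omega * np.cos(omega * t + phi)  # its time derivative

E = 0.5 * m * v**2 + 0.5 * k * x**2      # the energy function H = T + V

# E is the same at every time, and equals (1/2) k A^2:
assert np.allclose(E, 0.5 * k * A**2)
```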
4. Interference. If, in view of these arguments, we accept the identification of the probability distribution with the square of the wave function, we still have to explain how interference appears. What I will show now is that interference appears when we add up the wave functions, ψ = ψ1 + ψ2, rather than the probabilities. Consider again the two-slit experiment, where we now close one of the slits. Call ψ1(x) the wave function at point x on the screen when only one slit is open (the left one, say), and ψ2(x) when only the other is open. According to the superposition principle of waves, when both slits are open the wave function is ψ(x) = ψ1(x) + ψ2(x). Now because P ∝ |ψ1(x) + ψ2(x)|^2, this is different from the sum of the probabilities. There is a cross term in the square, and this cross term is responsible for interference. You can easily simulate this yourself by writing the following Mathematica program:
    a = 0.5; b = 5;
    psi1[x_] := Exp[-(x + a)^2/2];
    psi2[x_] := Exp[-(x - a)^2/2];
    psi[x_] := psi1[x] - psi2[x];
    Plot[psi1[x], {x, -b, b}]
    Plot[psi2[x], {x, -b, b}]
    Plot[psi[x]^2, {x, -b, b}]
    Plot[psi1[x]^2 + psi2[x]^2, {x, -b, b}]
The outcome of the program is depicted in Figure 1. As you see from the picture, there is an interference pattern that appears when we add up the wave functions rather than the probabilities themselves. The result of adding up the probabilities is in the last picture. You see that in that case interference completely disappears. Hence, it is important that we add the wave functions, ψ = ψ1 + ψ2 (corresponding to P ∝ |ψ1 + ψ2|^2), rather than the probabilities (P ≠ P1 + P2).
Hopefully the above arguments have convinced you that we should interpret |Ψ(x, t)|^2 dx as the probability of finding the particle within a region (x, x + dx) at time t. These arguments of course do not provide a derivation; they only make the assertion plausible. As when we set up the Schrödinger equation, we are now constructing a new theory, and we cannot give derivations from first principles. Otherwise this would be no new theory at all! So we write down tentative equations and try to uncover the fundamental principles. In the end, of course, we will know that the structure is right because it works: it solves many problems that would otherwise be intractable, and it agrees with the results of experiment. Both things tell us that quantum mechanics is right!
Figure 1: Two-slit experiment (from left to right, and from top to bottom): 1) with only the left slit open; 2) with only the right slit open; 3) interference pattern when both slits are open; 4) classical result one obtains by simply adding up the probabilities.
2 Four Steps to Solve the Schrödinger Equation

In this section I will summarize and further explain the steps you have to take to solve the time-dependent Schrödinger equation, as outlined in [1].
2.1 The Problem

The basic problem is to solve the time-dependent Schrödinger equation

    iℏ ∂Ψ(x, t)/∂t = H Ψ(x, t) ,    H ≡ −(ℏ^2/2m) ∂^2/∂x^2 + V(x) ,    (20)

given a known initial wave function (also called the `initial condition') at time t = 0 (one may as well choose the initial condition at any other initial time t = t_in):

    Ψ(x, 0) = Ψ0(x) ,    (21)

where Ψ0(x) is a given function. I make the distinction between Ψ and Ψ0 because Ψ(x, t) is the function we want to solve for (we only know it at one particular point in time, t = 0), whereas Ψ0 is a given function of x. Griffiths doesn't use Ψ0, only Ψ(x, 0). The solution to the above problem consists in a four-step procedure.

2.2 Step 1: Reduce to TISE

Look for special solutions of the type

    Ψ(x, t) = ϕ(t) ψ(x) .    (22)
These solutions are called separable. Filling this ansatz into the Schrödinger equation, we completely solve for ϕ and find another equation for ψ:

    ϕ(t) = e^{−iEt/ℏ}    (23)
    H ψ(x) = E ψ(x)  ⇒  −(ℏ^2/2m) d^2ψ(x)/dx^2 + V(x) ψ(x) = E ψ(x) .    (24)

Notice that I write partial derivatives ∂/∂t for functions such as Ψ(x, t) that depend on several variables, and regular derivatives d/dx for functions that depend on a single variable. Equation (24) is called the time-independent Schrödinger equation. Filling the ansatz (22) into the time-dependent equation, we were able to solve for ϕ in (23). This gave us an integration constant E, which reappears in (24) and can be interpreted as an energy (though not `the energy of the system', a concept which may not always be well defined). Solving (24) will give us ψ(x) as well as the allowed values of the energy E, which is called the spectrum of the Hamiltonian H. The spectrum depends on the potential V(x).

In general, (24) has many solutions (usually an infinite number of them), for instance ψ(x) = sin nx for positive integers n = 1, 2, 3, ... In this case, the spectrum is called discrete (as there are discrete energy levels labeled by n = 1, 2, 3, ...). Accordingly, we label both the energies and the wave functions by n in this case:

    H ψn(x) = En ψn(x) .    (25)

The main problem now is to solve this equation, imposing the appropriate boundary conditions (this will be dealt with in chapter 3 for different types of potentials). For instance, for the infinite square well it turns out that En = ℏ^2 π^2 n^2 / (2ma^2).

If the spectrum is continuous rather than discrete, the energies are labeled by a continuous variable that we usually call k, the wave number. For instance, for the free particle: Ek = ℏ^2 k^2/2m, and the wave functions are ψk(x) = e^{ikx}. In this case, k is the wave number, with k > 0 for right-moving, k < 0 for left-moving waves. Remember that the wave number relates to the momentum of the wave by the de Broglie formula (11). In this case, the energy is given by (12).

To summarize this step, we have reduced the time-dependent Schrödinger equation (a partial differential equation in t and x) to the time-independent Schrödinger equation (a second-order ordinary differential equation in x). It is useful to think of (25) as an eigenvalue problem: H is an operator (an object which, like a matrix, acts on a vector space) and the ψn are the eigenfunctions, with corresponding eigenvalues En. Solving this eigenvalue problem gives us, as in linear algebra, both the eigenvalues and the eigenfunctions. We will develop this point of view further when we discuss the formalism.
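The eigenvalue point of view lends itself directly to computation: discretize H on a grid and diagonalize the resulting matrix. A sketch for the infinite square well (units with ℏ = m = 1 and well width a = 1; the grid size is an illustrative choice), whose exact levels quoted above are En = ℏ²π²n²/(2ma²):

```python
import numpy as np

hbar, m, a = 1.0, 1.0, 1.0
N = 500                          # number of interior grid points (illustrative)
dx = a / (N + 1)
coef = hbar**2 / (2 * m * dx**2)

# Finite-difference Hamiltonian for V = 0 inside the well, with psi = 0
# at the walls (the infinite square well boundary conditions):
H = (np.diag(np.full(N, 2 * coef))
     + np.diag(np.full(N - 1, -coef), 1)
     + np.diag(np.full(N - 1, -coef), -1))

E = np.linalg.eigvalsh(H)        # eigenvalues E_n, in ascending order

# Compare the lowest three levels with the exact spectrum:
E_exact = hbar**2 * np.pi**2 * np.arange(1, 4)**2 / (2 * m * a**2)
assert np.allclose(E[:3], E_exact, rtol=1e-3)
```

The corresponding eigenvectors (from `np.linalg.eigh`) approximate the eigenfunctions ψn; this "matrix picture" of (25) is exactly the linear-algebra point of view described above.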
2.3 Step 2: General Solution of TDSE

Once you have the solutions ψn(x) and En of the TISE (25), you can move on to get the general solution of the TDSE (20). It is given by:

    Ψ(x, t) = Σ_n cn e^{−iEn t/ℏ} ψn(x) .    (26)

It is straightforward to show that, if the individual solutions (22) solve (20), then a superposition of them like in (26) is also a solution. The reason is that (20) is a linear equation, and its solutions obey the superposition principle: if we have a set of solutions (22), we can add them together with arbitrary coefficients cn and the result is also a solution. But the key point to understand here is that (26) in fact gives us the most general solution of the TDSE (20). Indeed, whereas the separation of variables (22) was an ansatz (an assumption, in other words: we only obtained a specific solution), one can show that any solution of the TDSE is in fact of the form (26).

To show this in general is a little tricky because it actually depends on the form of the potential function V(x). But the idea, which we will later on substantiate for specific potentials V(x), is as follows. (20) is a first-order partial differential equation in t, which we are solving in a two-step procedure of first solving for t and then for x. So we regard this equation as an ordinary, linear differential equation in t (i.e. we treat x as a constant). Since the equation is first order (only one t derivative), it depends on a single `integration constant', which is our initial function Ψ0(x) (which I have not yet specified; therefore, for the time being Ψ0 is a generic function). The theory of first-order differential equations tells us that, if I can choose the cn in (26) such that Ψ(x, t) satisfies the initial condition Ψ0(x), then (26) is the unique solution associated with that Ψ0(x). Remember, a first-order linear differential equation has only one integration constant/boundary condition (in this case, Ψ0). Since Ψ0(x) was generic to start with, this is the same as saying that (26) is the most general solution (i.e. for any boundary condition) of the Schrödinger equation regarded as a first-order differential equation in t.

In the continuous case, we replace n → k and cn → (1/√(2π)) φ(k), and the sum becomes an integral:

    Ψ(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{−iEk t/ℏ} ψk(x) .    (27)

For the simple case of a free particle, ψk(x) = e^{ikx} and this becomes:

    Ψ(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{i(kx − ℏk^2 t/2m)} .    (28)
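For a Gaussian φ(k), the integral (28) can be evaluated numerically. The sketch below (units with ℏ = m = 1; the choice of φ(k) and the grids are purely illustrative) checks two physical properties of the resulting Ψ(x, t): the norm is conserved in time, and the packet's centre moves at the group velocity ℏk0/m.

```python
import numpy as np

hbar, m = 1.0, 1.0
k = np.linspace(-10, 10, 4001)           # k-grid for the integral in (28)
dk = k[1] - k[0]
phi = np.exp(-(k - 2.0)**2)              # a Gaussian phi(k) centred at k0 = 2

x = np.linspace(-30, 30, 1201)
dx = x[1] - x[0]

def Psi(t):
    """Evaluate eq. (28) by direct numerical integration over k."""
    phase = np.exp(1j * (np.outer(x, k) - hbar * k**2 * t / (2 * m)))
    return (1 / np.sqrt(2 * np.pi)) * phase @ (phi * dk)

psi0 = Psi(0.0)
psi3 = Psi(3.0)

norm0 = np.sum(np.abs(psi0)**2) * dx
norm3 = np.sum(np.abs(psi3)**2) * dx
assert abs(norm3 - norm0) < 1e-4         # time evolution preserves the norm

x_mean = np.sum(x * np.abs(psi3)**2) * dx / norm3
assert abs(x_mean - 6.0) < 0.05          # centre at v t = (hbar k0 / m) t = 6
```

Plotting |Ψ(x, t)|² at successive times also shows the packet spreading, which is the dispersion relation ω(k) = ℏk²/2m at work.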
2.4
Step 3: Impose Initial Condition
In this step, we choose the coefficients c_n in (26) such that our initial condition (21), Ψ0(x), is satisfied. If we can do this, then we are left with the unique solution of the Schrödinger equation. Physically, what we are doing is imposing that the most general solution of the Schrödinger equation Ψ(x, t) agrees with our experimental situation at time t = 0. We have prepared our system (e.g. a system of electrons with spin up) in a particular state Ψ0(x) at time t = 0. Given this particular state, Ψ(x, t) tells us, via the Schrödinger equation, how the system evolves in time. For instance, H might contain an interaction between the spins and an external magnetic field, and so some of the spin states of the electrons may change in time. So, H encodes the dynamics of the system.
All we have to do is set t = 0 in (26) or (27), for the discrete/continuous case respectively. We get:

    Ψ0(x) = Σ_n c_n ψ_n(x) ,    (discrete spectrum)
    Ψ0(x) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) ψ_k(x) ,    (continuous spectrum)    (29)

So in this step, the task is to find c_n or φ(k) such that (29) is satisfied. It is here where we have to use the specific knowledge about ψ_n and ψ_k. Once we have done this, we have shown that (26)-(27) satisfy the given boundary condition.
Before I show how to do this in practice, we must reflect on the following question: does (29) always have a solution? Can we always be sure that this is the case? Can I always find c_n and φ(k) to match Ψ0(x) on the l.h.s.? This is the same as asking whether Ψ(x, t) was in fact the most general solution of the TDSE, and I already anticipated that the answer was yes. Now we see how this works: indeed, the reason I can always find c_n and φ(k) to solve (29) is that ψ_n(x) and ψ_k(x) form what is called a complete set of functions. This means that any (nice enough) function Ψ0(x) can be written as a linear superposition of them, precisely in the form (29). Mathematicians have shown this⁶, and we will see this in some specific cases as well during the course.
Remember that to solve (29) explicitly we need to know ψ_n(x) and ψ_k(x), which means we have made a choice of potential function V(x) and solved (24). A very useful example to work out is the free particle, where ψ_k(x) = e^{ikx}. Equation (29) then becomes the Fourier decomposition of Ψ0(x):

    Ψ0(x) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{ikx} .    (30)

We say that φ(k) is the Fourier transform of Ψ0(x). Fourier theory now tells us, via Plancherel's theorem, how to find φ(k) given Ψ0(x):

Plancherel's Theorem. Let f(x) and F(k) be two square integrable functions over the real line, i.e. the integrals ∫_{−∞}^{∞} dx |f(x)|² and ∫_{−∞}^{∞} dk |F(k)|² are finite. Then the following holds:

    f(x) = (1/√(2π)) ∫_{−∞}^{∞} dk F(k) e^{ikx}   ⇔   F(k) = (1/√(2π)) ∫_{−∞}^{∞} dx f(x) e^{−ikx} .    (31)
This condition precisely allows us to solve for φ(k) in (30): assuming that Ψ0 and φ(k) are square integrable (in fact, since |Ψ0(x)|² is a probability density, it should integrate to one!), we see that (30) is of the form of the l.h.s. of (31) if we identify f(x) = Ψ0(x) and φ(k) = F(k). Therefore Plancherel's theorem tells us that the r.h.s. is true:

    φ(k) = (1/√(2π)) ∫_{−∞}^{∞} dx Ψ0(x) e^{−ikx} .    (32)
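The transform pair (31)-(32) is easy to check by direct numerical quadrature. A sketch (not from the notes; the sample function and grid are arbitrary choices): compute F(k) from the r.h.s. of (31), then reconstruct f(x) from the l.h.s. and compare.

```python
import numpy as np

x = np.linspace(-8, 8, 1201)
k = np.linspace(-8, 8, 1201)
dx, dk = x[1] - x[0], k[1] - k[0]

f = np.exp(-x**2)  # a sample square-integrable function

# F(k) = (1/sqrt(2 pi)) Integral dx f(x) e^{-ikx}   (r.h.s. of (31))
F = (dx / np.sqrt(2 * np.pi)) * np.exp(-1j * np.outer(k, x)) @ f

# f(x) = (1/sqrt(2 pi)) Integral dk F(k) e^{+ikx}   (l.h.s. of (31))
f_rec = (dk / np.sqrt(2 * np.pi)) * np.exp(1j * np.outer(x, k)) @ F

err = np.max(np.abs(f_rec - f))
print(err)  # tiny: the pair (31) round-trips
```

For this particular f, the transform can also be compared with its analytic value e^{−k²/4}/√2, obtained from the Gaussian integral used later in the notes.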
⁶ This is valid under certain conditions for the potential V(x), which we will normally assume to be polynomial.
This is how step 3 looks for a free particle. Given any boundary condition Ψ0(x) at t = 0, we find the unique infinite set of coefficients φ(k) that solve the boundary condition (29).
2.5 Step 4: Plug Back into Ψ(x, t)
In the previous step we found the coefficients c_n or φ(k) such that the boundary condition (29) is satisfied. We now have to plug this back into the general solution of the Schrödinger equation, (26) or (27), to get the full solution. Of course, to obtain a completely explicit solution one has to carry out the summation or the integral.

Don't take the above four-step procedure as if it were written in stone. Sometimes you may simply be interested in the wave functions ψ_n(x) and will not bother to calculate the coefficients c_n; sometimes step 4 is trivial because only a few terms contribute; or sometimes you will jump directly to step 3 because you already know the wave functions.
2.6 Example: Gaussian Wave Function
We will now apply this procedure to the case of a free particle, which means V(x) = 0. We will take as our initial wave function a Gaussian distribution:

    Ψ0(x) = (2a/π)^{1/4} e^{−ax²} .    (33)

Some comments about the physics before we start. This wave function represents a particle localized around x = 0 with standard deviation σ = 1/(2√a). So when a is large, the particle is well localized around x = 0, whereas if a → 0, the probability distribution flattens and the particle becomes more and more delocalized in space. Using the basic Gaussian integral (85), we check that the wave function is indeed normalized to one: ∫_{−∞}^{∞} dx |Ψ0(x)|² = 1.
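These two statements, unit normalization and σ = 1/(2√a), can be verified numerically; a quick sketch (the value of a is an arbitrary choice, not from the notes):

```python
import numpy as np

a = 3.0                                   # any positive width parameter
x = np.linspace(-10, 10, 200001)
dx = x[1] - x[0]

psi0 = (2 * a / np.pi) ** 0.25 * np.exp(-a * x**2)   # eq. (33)
rho = np.abs(psi0) ** 2                              # probability density

norm = np.sum(rho) * dx
sigma = np.sqrt(np.sum(x**2 * rho) * dx)             # <x> = 0 by symmetry

print(norm, sigma, 1 / (2 * np.sqrt(a)))   # norm is 1; sigma matches 1/(2 sqrt(a))
```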
Step 1. The TISE (24) reduces to:

    −(ħ²/2m) d²ψ(x)/dx² = E ψ(x) .    (34)

This equation is readily solved; the solutions are sinusoidal. We can write them as ψ(x) = A sin kx + B cos kx, or alternatively as complex exponentials, as we saw in section 1.1. We will take exponentials:

    ψ_k(x) = e^{ikx} ,    k = ±√(2mE_k)/ħ  ⇔  E_k = ħ²k²/2m .    (35)

The second line follows from filling the wave function into (34).
Step 2. We fill this into the general solution of the Schrödinger equation, which gives us the formula I wrote earlier:

    Ψ(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{i(kx − ħk²t/2m)} .    (36)

Here, we took both the k > 0 and the k < 0 solutions into account⁷, by integrating from minus infinity to plus infinity with arbitrary coefficients φ(k).

Step 3. Impose initial condition Ψ0(x).
We use Plancherel's theorem directly for the initial wave function at hand:

    φ(k) = (1/√(2π)) ∫_{−∞}^{∞} dx Ψ0(x) e^{−ikx} = (1/√(2π)) (2a/π)^{1/4} ∫_{−∞}^{∞} dx e^{−ax²} e^{−ikx}
         = (a/2π³)^{1/4} ∫_{−∞}^{∞} dx e^{−ax² − ikx} .    (37)
The integral can once again be done using the basic Gaussian integral (85). To bring it to this form, we need to cancel the −ikx in the exponent. This can be done by a change of variables. One readily checks that the following does the job⁸:

    x = y − ik/2a .    (38)

One way to see why this trick works is to rewrite the term in the exponent as follows:

    −ax² − ikx = −ax (x + ik/a) ≡ −a (y² + C) .    (39)

We impose the last equality because we want to get a Gaussian integral, up to a constant C, but with no linear term in x. From this form it is more or less obvious that we can complete the square as follows:

    −ax (x + ik/a) = −a (x + ik/2a − ik/2a)(x + ik/2a + ik/2a) = −a (y − ik/2a)(y + ik/2a)
                   = −a (y² − (ik/2a)²) = −a (y² + k²/4a²) = −ay² − k²/4a .    (40)

In other words, to find the trick (38) I just add a constant to x such that in the end I get (y − c)(y + c) = y² − c². This way of reducing integrals of exponentials containing a Gaussian piece to a pure Gaussian is an important trick. We also have dx = dy, and the integration range is the same. Hence:

    φ(k) = (a/2π³)^{1/4} ∫_{−∞}^{∞} dx e^{−ax² − ikx} = (a/2π³)^{1/4} e^{−k²/4a} ∫_{−∞}^{∞} dy e^{−ay²} = (a/2π³)^{1/4} e^{−k²/4a} √(π/a)

    ⇒ φ(k) = (1/2πa)^{1/4} e^{−k²/4a} .    (41)

⁷ As in the sinusoidal representation, the general solution for given k is actually ψ_k(x) = A e^{ikx} + B e^{−ikx}. However, as mentioned in the text, the negative solution is automatically taken into account by the fact that we integrate over both positive and negative k. It is also unnecessary to include the normalization constant A, as overall normalizations are taken care of by φ(k), with which this wave function is multiplied (in other words, A can always be reabsorbed in φ(k)).
⁸ One should worry about the fact that this change of variables involves an imaginary shift of the integration variable, by a factor of −ik/2a. However, for real and positive a the integral is everywhere finite, and it follows from complex analysis that this change of variables can be done.
To summarize the trick, we have derived the following generalized Gaussian formula:

    ∫_{−∞}^{∞} dx e^{−αx² − iβx} = √(π/α) e^{−β²/4α} .    (42)

Step 4. Plug back into Ψ(x, t). Now we can plug this back into the solution of the TDSE (28):
    Ψ(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{i(kx − ħk²t/2m)} = (1/8aπ³)^{1/4} ∫_{−∞}^{∞} dk e^{−k²/4a} e^{i(kx − ħk²t/2m)}
            = (1/8aπ³)^{1/4} ∫_{−∞}^{∞} dk e^{ikx − k²(1/4a + iħt/2m)} .    (43)

This last integral is of the same type as the one before: it is a generalized Gaussian integral. The only difference is that we now integrate over k instead of x, but we can apply formula (42) all the same, since x and k are dummy variables. We get (see exercise 3.2):

    Ψ(x, t) = (2a/π)^{1/4} (1/√(1 + 2aiħt/m)) e^{−ax²/(1 + 2aiħt/m)} .    (44)

This is the final result for the wave function for the given boundary condition (33). As you see, it is a completely explicit function of time. For its physical interpretation, see exercise 3.2.
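Two physical features of (44) can be checked numerically (a sketch, not part of the notes, in units ħ = m = 1 with a = 1): the norm stays equal to one for all t, while the width of |Ψ|² grows, i.e. the packet spreads.

```python
import numpy as np

hbar = m = 1.0
a = 1.0
x = np.linspace(-40, 40, 80001)
dx = x[1] - x[0]

def Psi(t):
    """Closed-form free Gaussian evolution, eq. (44), in units hbar = m = 1."""
    z = 1 + 2j * a * hbar * t / m
    return (2 * a / np.pi) ** 0.25 * np.exp(-a * x**2 / z) / np.sqrt(z)

norms, widths = [], []
for t in (0.0, 1.0, 5.0):
    rho = np.abs(Psi(t)) ** 2
    norms.append(np.sum(rho) * dx)
    widths.append(np.sqrt(np.sum(x**2 * rho) * dx))

print(norms, widths)  # norms stay at 1; widths grow with t
```

The growth of the width with time is the delocalization of the free particle discussed in exercise 3.2.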
3 One-Dimensional Potentials

3.1 General Theorems
There are some useful theorems that save you work when solving the time-dependent Schrödinger equation with one-dimensional potentials V(x). These theorems are presented in problems 2.1 and 2.2 of [1]. I summarize them here.

1) The assumption that (whenever this is possible) solutions of the Schrödinger equation should be normalizable implies the following:

a) E ∈ ℝ. That is, we can always take the energy to be real.

b) E > V_min. The energy is bounded from below.
Remark. Normalizable solutions do not always exist. For instance, plane waves ψ(x) = e^{ikx} are not normalizable on the range x ∈ (−∞, ∞), because ∫_{−∞}^{∞} dx |ψ(x)|² = ∫_{−∞}^{∞} dx = ∞. But we use normalizable solutions whenever they exist. Whenever a non-normalizable solution is used, one must have a physical motivation for doing that. For instance, in the case of plane waves, it is not surprising that they are non-normalizable, because they fill all of space. We regard them as approximations to more realistic situations that can be built by superpositions of plane waves. Superposing plane waves, we can get a normalizable solution, as in (28). Remember that we always only need to normalize the total wave function Ψ(x, t). In this case, plane waves are like `fundamental building blocks' of more physical solutions of the TDSE, and as such they are useful.
2) In the TISE, ψ(x) can always be taken to be real. A complex solution of the TISE is always a linear combination of real solutions.

3) If the potential is even, V(x) = V(−x), then the even and odd solutions of the TISE can be analyzed separately. In this case, any solution ψ(x) of the TISE can be written as a linear combination of an even and an odd solution. This is useful when we consider bound states, because it means that we can solve the Schrödinger equation on one side of the potential (x > 0, say) and we automatically obtain the other side by use of the symmetry. On the other hand, we often cannot apply this to scattering states, because there is an explicit breaking of the symmetry by the boundary conditions (which are generically different at plus and minus infinity). See the next section.
3.2 The TISE with one-dimensional potentials
As we saw in chapter 2, the first step in solving the TDSE is reducing to the TISE. Here we concentrate on this step. There are a few things to remember:

• Normalizability imposes E > V_min (1b).

Further, one distinguishes bound states and scattering states, because their physical behavior is completely different and they give independent contributions to the final wave function Ψ(x, t). Also, we solve the Schrödinger equation separately in each region in which the potential is a continuous, differentiable function (a smooth function).

• Bound states have the following generic behavior, which you should check:

✓ Exponential behavior (damped/exponential growth) in the exterior regions.

✓ The behavior in the interior regions (if there are any) depends on the details of the potential. It can be oscillatory (sine, cosine or plane wave) or exponential. See the example in Figure 2.

✓ Symmetric potential: in that case, it is useful to separate even/odd solutions, which leads to sines and cosines (or hyperbolic sines and cosines) in the interior region. Use exponentials in the exterior regions.

✓ Discrete spectrum: in that case, the values of E are limited (sometimes only a handful of solutions or none, or an infinite number of them).

✓ Solve the energy equation graphically to obtain the spectrum.

Figure 2: Potential symmetric around x = 0, with narrow spikes at x = a and x = −a.

• Scattering states have the following generic properties:

✓ Oscillatory behavior. Use plane waves.

✓ Transmission and reflection coefficients are interesting quantities.

✓ There is an interesting physical interpretation in terms of waves with which one can probe a potential.

✓ Algebraic manipulations can be heavy if one has to match different regions.

• Impose (dis-)continuity conditions at each point at which the potential is non-differentiable (it has a kink, or it is singular at those points):

✓ If the potential is everywhere finite, then ψ and dψ/dx are continuous. Such a potential can be non-smooth, though.

✓ If the potential has singularities (points where it is infinite), then ψ is continuous but dψ/dx will have discontinuities at the singular points. The discontinuities are known and can be obtained by integrating the Schrödinger equation around the singular points, as in the steps leading to [2.125]. Obviously, an infinite potential is only an idealization of a very large one.

✓ If the potential is infinite in a whole region (more than a single point), the wave function must vanish there. For example: the infinite potential well.
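To illustrate the point about solving the energy equation graphically, here is a sketch (not from the notes; V0 and L are made-up values) for the standard finite square well of [1], V = −V0 for |x| < L and zero outside. The even bound states solve z tan z = √(z0² − z²), with z = L√(2m(E + V0))/ħ and z0 = L√(2mV0)/ħ; each branch of tan z contributes at most one crossing, found below by bisection rather than by drawing:

```python
import numpy as np

hbar = m = 1.0
V0, L = 20.0, 1.0                        # made-up well depth and half-width
z0 = L * np.sqrt(2 * m * V0) / hbar      # fixes how many crossings exist

def f(z):
    # Even-parity energy equation: z tan z - sqrt(z0^2 - z^2) = 0
    return z * np.tan(z) - np.sqrt(z0**2 - z**2)

def bisect(g, lo, hi, n=100):
    for _ in range(n):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# One root of f in each branch of tan z between its poles (the graphical picture)
zs = []
for j in range(int(z0 / np.pi) + 1):
    lo = j * np.pi + 1e-9
    hi = min(j * np.pi + np.pi / 2 - 1e-9, z0 - 1e-9)
    if lo < hi and f(lo) * f(hi) < 0:
        zs.append(bisect(f, lo, hi))

energies = [(hbar * z / L) ** 2 / (2 * m) - V0 for z in zs]
print(energies)  # a handful of discrete even-parity levels, all between -V0 and 0
```

This matches the generic behavior listed above: the spectrum is discrete, with only a handful of solutions, and their number is fixed by z0 (i.e. by the depth and width of the well).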
4 Fourier Integrals and the Dirac Delta

4.1 The Dirac delta
Consider the following Gaussian function with unit area:

    f(x) = (1/√(πε)) e^{−(x−x₀)²/ε} .    (45)

Using (85), one easily sees that ∫_{−∞}^{∞} dx f(x) = 1. The standard deviation of this Gaussian is σ = √(ε/2) (see Exercise 1.3b) of [1]), so clearly √ε determines the width of the distribution, and f(x₀) = 1/√(πε) its height.

Consider what happens when we make ε smaller and smaller: the peak clearly becomes higher and higher (in order for the function to preserve its area 1) and the distribution becomes narrower and narrower. In the limit, the height of the distribution is infinite and its width goes to zero, but the area stays equal to one. We can picture what happens in the limit as follows. Replace the Gaussian by a square of width √ε and height 1/√ε, such that the total area under the square is √ε × 1/√ε = 1, i.e. a finite limit. Now, since in the limit the height of the distribution is infinite but the width goes to zero, we can think of a function δ(x) that takes the following values:

    δ(x − x₀) = 0 for x ≠ x₀ ,  δ(x − x₀) = ∞ for x = x₀ ,  with ∫_{−∞}^{∞} dx δ(x − x₀) = 1 .    (46)

This is called the Dirac delta distribution. It is called a distribution because it is not a proper mathematical function (it has an explicit divergence at one point; this divergence, however, is mild enough that it can be treated in the more general theory of distributions, which goes beyond the scope of these notes). Our treatment here has been mathematically heuristic, but sufficient. We argued that this function is the following limit:

    δ(x − x₀) = lim_{ε→0} (1/√(πε)) e^{−(x−x₀)²/ε} .    (47)

To check the integration property (46), we first perform the integral and then take the limit ε → 0.
It now also follows that, for any function f(x):

    ∫_{−∞}^{∞} dx δ(x − x₀) f(x) = f(x₀) .    (48)

From the fact that the Dirac delta vanishes everywhere except at x = x₀, we have: δ(x − x₀) f(x) = δ(x − x₀) f(x₀) ∀x. Now we can take f(x₀) outside the integral, and evaluate the integral using (46). The result is (48).
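The limit (47) and the sifting property (48) can be watched converging numerically; a sketch (the test function and grid are arbitrary choices, not from the notes):

```python
import numpy as np

x = np.linspace(-10, 10, 400001)
dx = x[1] - x[0]
x0 = 1.5
f = np.cos(x)          # a smooth test function

vals = []
for eps in (1e-1, 1e-2, 1e-3):
    # The Gaussian (47) before taking the limit
    delta_eps = np.exp(-(x - x0) ** 2 / eps) / np.sqrt(np.pi * eps)
    vals.append(np.sum(delta_eps * f) * dx)

print(vals)  # approaches f(x0) = cos(1.5) as eps decreases
```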
The Dirac delta is paramount to many applications in physics, where we want to mimic situations with `spikes'. If we identify t = ε as time, it turns out that (45) satisfies the heat equation. This Gaussian, for instance, describes heating up an aluminium bar. Initially, at t = 0, the temperature is very high at the point where the material is being heated (x = x₀), but as the material goes to thermal equilibrium the temperature quickly drops, and the temperature profile looks like a blob centered around the origin but spreading in time, as heat exits the center. See Figure 3.

Figure 3: Heat distribution, all energy concentrated at x = 0 at t = 0, spreading in space as time increases according to a Gaussian distribution.
4.2 Fourier transformations
Plancherel's theorem (31) tells us that, for every square-integrable function f(x), there is a unique square-integrable function F(k) from which f(x) can be reconstructed. F(k) is called the Fourier transform of f(x). Its physical significance is that it gives an expansion of the function f(x) in terms of plane waves e^{ikx} with a well-defined wavelength λ = 2π/k. Thus, f(x) is written as a superposition of (an infinite number of) monochromatic waves, one for each frequency. Fourier analysis is used in many branches of physics. Decomposing a sound signal into a spectrum of frequencies of sound is an example of Fourier analysis.
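The sound-signal example can be made concrete with a discrete Fourier transform (using numpy's FFT; the sampling rate and the two tones are made-up values, not from the notes): a signal built from two frequencies is decomposed, and the two frequencies are recovered from the peaks of the spectrum.

```python
import numpy as np

fs = 1000                      # sampling rate in Hz (arbitrary)
t = np.arange(0, 1, 1 / fs)   # one second of signal
signal = 1.0 * np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), 1 / fs)

# The two largest peaks sit exactly at the two input tones
peaks = freqs[np.argsort(spectrum)[-2:]]
print(sorted(peaks))   # [50.0, 120.0]
```

The discrete spectrum here is the finite-grid analogue of the continuous decomposition (30).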
Filling the second of (31) into the first, we find the following relation:

    f(x) = (1/2π) ∫_{−∞}^{∞} dk e^{ikx} ∫_{−∞}^{∞} dx′ f(x′) e^{−ikx′} = (1/2π) ∫_{−∞}^{∞} dk ∫_{−∞}^{∞} dx′ f(x′) e^{ik(x−x′)}
         = (1/2π) ∫_{−∞}^{∞} dx′ f(x′) ∫_{−∞}^{∞} dk e^{ik(x−x′)} .    (49)

The integral shifts are allowed because we were careful about distinguishing x from the dummy variable x′ that is integrated over (and anything to the right of an integral sign is integrated over).
Now comparing (49) with (48), we conclude that⁹

    δ(x − x′) = (1/2π) ∫_{−∞}^{∞} dk e^{ik(x−x′)} ,    (50)

where in comparing the two we have relabeled x ↔ x₀, x′ ↔ x. Let us evaluate this expression. When x = x′, the integrand is one, and the integral diverges. When x ≠ x′, we can evaluate the integral explicitly:

    δ(x − x′) = (1/2π) [e^{ik(x−x′)}/(i(x − x′))]_{k=−∞}^{∞} = 0 .    (51)

Hence this agrees with our previous definition (46).

The expression (50) we have obtained for the Dirac delta is a very useful one. Comparing it to Plancherel's theorem (31), taking f(x − x′) = δ(x − x′), we find from the left-hand side of (31) that:

    F(k) = 1/√(2π) .    (52)

In other words, the Dirac delta is the Fourier transform of a constant! This agrees with our intuition about position vs. momentum space: a function that is completely localized at some point in space corresponds, under Fourier analysis, to a constant function, i.e. a superposition of all wavelengths, all with the same amplitude.

⁹ This equation certainly makes (49) and (48) compatible. But is this the unique solution? In fact, it is. Assume that the two sides of (50) differed by some function g(x − x′). This function would have to be such that ∫_{−∞}^{∞} dx′ f(x′) g(x − x′) = 0 ∀f ∈ L²(ℝ). But the only such function is g(x) = 0. Hence the solution (50) is unique (even if, for the reasons explained, the Dirac delta is not a proper function).

5 The Formalism of Quantum Mechanics

5.1 Why?

If you have read a text on the formalism of quantum mechanics, you might have been left with the question: how is this useful? Why study it?

There are some important reasons why it pays off to learn quantum mechanics in this high-brow language of Hilbert spaces:

1) For the simple case of finite Hilbert spaces, the formalism simply reduces to the algebra of matrices (diagonalization of matrices, etc.).

2) In general, the formalism allows for a unified conceptual treatment of continuous and discrete cases by means of linear algebra.

3) The formalism is independent of the particular basis you choose. Describing your system in terms of its positions or in terms of its momenta does not make any difference, just as you can describe a wave either by specifying its spatial profile or by specifying the elementary frequencies that the wave is built up from. It is a choice of basis, and the formalism shows that quantum mechanics does not depend on a choice of basis.

4) It is conceptually simple: it allows us to formulate the postulates of quantum mechanics in a clear and concise way, and in particular it allows for a natural introduction of the measurement postulate. Some approaches to quantum mechanics do not need a measurement postulate (or they claim they don't), but in the standard Copenhagen interpretation this is how the connection between theory and experiment is made.
5.2 The Postulates of Quantum Mechanics
Von Neumann gave an axiomatic formulation of quantum mechanics, similarly to what Einstein had done for special relativity. The great advantage of this approach is that, if you want to generalize the theory, all you have to do is modify one or several of its postulates. Similarly, if something turns out to be `wrong' with the theory, the axiomatic structure makes it easier to trace the inconsistency back to one or several of the axioms. By explicitly including a projection postulate, we are not only making it possible to interpret measurements, but also making explicit something that might turn out to be a weakness of the theory, rather than hiding it under the rug of the theoretical machinery. So here are the postulates¹⁰ [3]:

Postulate 1. State of the System:
An ensemble of physical systems is completely described by a wave function or state function. The wave function lives in a Hilbert space, and it may be multiplied by an arbitrary complex number without altering its physical significance.

Postulate 2. Observables:
Observables are represented by Hermitian (or self-adjoint) operators, Q̂ = Q̂†.
Justification: Hermiticity is the natural notion of `reality' for operators. An Hermitian operator has real eigenvalues, hence real expectation values (see the next postulate).

Postulate 3. Measurement postulate:
The only result of a precise measurement of Q̂ is one of the eigenvalues q_n of Q̂.
Justification: when we measure any quantity in the laboratory, we never observe an actual superposition¹¹. We always find a unique value for the observed outcome of the experiment (the energy, say). Since the eigenvalues are well-defined real numbers associated with an Hermitian operator, they are good candidates for measurement outcomes. The probabilities that, upon measurement, we find q_n, are given by |c_n|².

¹⁰ Different authors sometimes rank the postulates in a different order.
¹¹ Occasionally you will find a text, even in prestigious journals, where it is claimed that a superposition has been observed. Beware of such metaphorical phrasing! So far, no superposition has been observed as an actual experimental outcome. We observe interference patterns after repeating an experiment many times, from which we infer that the state of the system was a superposition. The difference is subtle, but important. The detector either clicks, or it doesn't!
Postulate 4. Schrödinger postulate:
The time evolution of the system is governed by the Schrödinger equation.

Postulate 5. Projection postulate:
After a precise measurement of Q̂ with outcome q_n, the system is (shortly after measurement) in the state ψ_{q_n}.

The justification of this postulate is that repeated measurements of Q̂ should give the same result if the time interval between them is small. The only state with P(q_n) = 1 is ψ_{q_n} itself, therefore after measurement the system must be in state ψ_{q_n}.

This postulate introduces what we call the `collapse of the wave function'. If prior to measurement the system is in the state:

    |Ψ(t)⟩ = Σ_n c_n(t) |ψ_{q_n}⟩ ,    (53)

and after measurement it is in the state:

    |Ψ⟩ = |ψ_{q_n}⟩ ,    (54)

we see that we have projected onto one of the components of (53) and lost all of the information about the c_n's prior to measurement. We can neither predict with certainty which state we will find upon measurement, nor can we, after measurement, use the Schrödinger equation to `trace back' the state (53).
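The measurement and projection postulates are easy to mimic numerically. A sketch (the 3-dimensional Hilbert space and the coefficients c_n are made up for illustration): sample outcomes with the Born-rule probabilities |c_n|², then collapse onto the observed basis state.

```python
import numpy as np

rng = np.random.default_rng(0)

# State |Psi> = sum_n c_n |psi_n> in some orthonormal eigenbasis of Q
c = np.array([0.6, 0.8j, 0.0])
c = c / np.linalg.norm(c)             # normalize
probs = np.abs(c) ** 2                # Born rule: P(q_n) = |c_n|^2

# Repeat the experiment many times on identically prepared systems
outcomes = rng.choice(len(c), size=100000, p=probs)
freqs = np.bincount(outcomes, minlength=len(c)) / len(outcomes)
print(freqs)                          # close to [0.36, 0.64, 0.0]

# Projection postulate: after finding q_1, the state is |psi_1>
collapsed = np.zeros_like(c)
collapsed[1] = 1.0
```

Note that the probabilistic statement only emerges from the ensemble of repeated runs; each individual run yields a single definite outcome, exactly as in footnote 11.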
5.3 Linear Algebra
In the table below I summarize the main concepts from linear algebra that are used in the description of quantum systems, including a detailed comparison with inner products in ℂ^N, with their respective notations. Each row lists the concept in ℂ^N, then for wave functions, then in Hilbert-space (Dirac) notation:

linear algebra in ℂ^N  ↔  wave function  ↔  Hilbert space

vector:  α (or |α⟩)  ↔  Ψ(x, t)  ↔  |Ψ(t)⟩

basis e_i:  e_1 = (1, 0, …, 0)^T, …, e_N = (0, …, 0, 1)^T  ↔  ψ_n(x) (n = 1, 2, …, ∞)  ↔  |ψ_n⟩ or |n⟩

vector in a basis:  α = Σ_{i=1}^{N} e_i α_i = (α_1, …, α_N)^T  ↔  Ψ(x, t) = Σ_n c_n(t) ψ_n(x)  ↔  |Ψ(t)⟩ = Σ_n c_n(t) |ψ_n⟩

inner product:  ⟨α|β⟩ = Σ_{i=1}^{N} α_i* β_i  ↔  ∫_a^b dx Ψ_1*(x) Ψ_2(x)  ↔  ⟨Ψ_1|Ψ_2⟩

orthonormal basis:  ⟨e_i|e_j⟩ = δ_ij  ↔  ∫_a^b dx ψ_n*(x) ψ_m(x) = δ_nm  ↔  ⟨ψ_n|ψ_m⟩ = δ_nm

linear transformation (matrix):  β = T · α  ↔  operator Q̂: Ψ_1 = Q̂ Ψ_2  ↔  |Ψ_1⟩ = Q̂ |Ψ_2⟩

coefficients in a basis:  α_i = ⟨e_i|α⟩  ↔  c_n(t) = ∫_a^b dx ψ_n*(x) Ψ(x, t)  ↔  c_n(t) = ⟨ψ_n|Ψ(t)⟩

dual vector:  ⟨α|  ↔  Ψ*(x, t)  ↔  ⟨Ψ(t)|

dual basis:  e_i^T = (0, …, 0, 1, 0, …, 0) (1 on the i-th place)  ↔  ψ_n*(x)  ↔  ⟨ψ_n|

vector in the dual basis:  α^{T*} = Σ_{i=1}^{N} e_i^T α_i* = (α_1*, …, α_N*)  ↔  Ψ*(x, t) = Σ_n c_n*(t) ψ_n*(x)  ↔  ⟨Ψ(t)| = Σ_n c_n*(t) ⟨ψ_n|

The Hilbert space considered here is L²(a, b), that is, the square integrable functions on (a, b):

    ∫_a^b dx |Ψ(x)|² < ∞ ,    (55)

where a, b can be finite or infinite (−∞, ∞).

5.4 Continuous spectra
With appropriate modifications, the above formulas also hold in the case of continuous spectra. For definiteness, we will consider the momentum operator p̂ = −iħ d/dx and the associated wave functions of a free particle moving along x ∈ (−∞, ∞). This is a continuous spectrum, labeled by p = ħk. The eigenfunctions of p̂ are:

    ψ_p(x) = (1/√(2πħ)) e^{(i/ħ)px} ,    (56)

where the reason for the normalization will become clear in a moment.

The above wave functions are not normalizable, because ∫_{−∞}^{∞} dx |ψ_p(x)|² = ∞. However, as long as we consider different p's, we do have the following orthonormality condition:

    ∫_{−∞}^{∞} dx ψ_{p′}*(x) ψ_p(x) = (1/2πħ) ∫_{−∞}^{∞} dx e^{i(p−p′)x/ħ} = δ(p − p′) ,    (57)

where in the last equality we used the representation of the Dirac delta function introduced in (50), after interchanging x and k and introducing p = ħk (this also explains the reason for the 1/√ħ in the normalization of (56)). So we get:

    ⟨ψ_{p′}|ψ_p⟩ = δ(p − p′) ,    (58)

which is analogous to the orthonormality condition in the discrete case.
Let us generalize this picture to the eigenfunctions of an arbitrary observable Q̂. Let us assume that the eigenvalues are labeled by q(z), where z is a continuous variable (z = k and q(z) = ħk in the previous example). They come from solving the eigenvalue problem:

    Q̂ ψ_z(x) = q(z) ψ_z(x) .    (59)

The probability of finding a result q(z) in the range (z, z + dz) at time t is then given by |c(z, t)|² dz, where:

    Ψ(x, t) = ∫ dz c(z, t) ψ_z(x) ,    c(z, t) = ⟨ψ_z|Ψ⟩ .    (60)

Notice that c(z, t) contains as much information as Ψ(x, t) does. There is a one-to-one correspondence between the two, which is brought out by Plancherel's theorem. In the specific case z = k, c(z, t) has a special name, Φ(p, t), and it is called the `wave function in momentum space':

    Ψ(x, t) = (1/√(2πħ)) ∫_{−∞}^{∞} dp e^{ipx/ħ} Φ(p, t) ,
    Φ(p, t) = (1/√(2πħ)) ∫_{−∞}^{∞} dx e^{−ipx/ħ} Ψ(x, t) .    (61)
Examples

a) If we take Q̂ = H for a free particle, then z = p (the momentum) and q(z) = E(p) = p²/2m. So these wave functions are the usual plane waves, ψ_p(x) = (1/√(2πħ)) e^{ipx/ħ}. Alternatively, we use the wave number instead of the momentum, z = k and E(k) = ħ²k²/2m. The result is the same.

b) If we take Q̂ = p̂ for a particle allowed to move with any momentum p (for instance, a free particle), then the eigenvalue equation is p̂ ψ_p(x) = p ψ_p(x). Hence z = p and the eigenvalues are the momenta themselves, q(z) = p. Hence also ψ_p(x) = (1/√(2πħ)) e^{ipx/ħ}.

c) If we take Q̂ = x̂, the eigenvalue equation is x̂ ψ_y(x) = y ψ_y(x). Then z = y and q(z) = y. The solutions of the eigenvalue equation are delta functions, ψ_y(x) = δ(x − y).
Once we have the basis ψ_z(x) that solves (59), we can expand the wave function in this basis:

    Ψ(x, t) = ∫ dz c(z, t) ψ_z(x) .    (62)

In the above examples, this formula reproduces known results:

a) Ψ(x, t) = ∫ dp c(p, t) ψ_p(x) = (1/√(2πħ)) ∫ dp c(p, t) e^{ipx/ħ}, which is the Fourier transform (61).

b) Idem.

c) Ψ(x, t) = ∫ dy c(y, t) ψ_y(x) = ∫ dy c(y, t) δ(x − y) = c(x, t). This is a `diagonal' basis: the expansion of Ψ(x, t) contains just one term, namely c(x, t) = Ψ(x, t) itself.

6 Dirac Notation

6.1 Base-free notation
In the above examples we have been expanding the same wave function in different bases of wave functions (eigenfunctions of some Hermitian operator). The Hilbert space formalism allows us to do this in a basis-free way, as announced. To that end we introduce the vector |S(t)⟩ ∈ H = L²(−∞, ∞), where we now think of the Hilbert space as an abstract space, without the need to specify a basis. The only requirement is that |S(t)⟩ satisfies the TDSE. We define for this vector space the inner product:

    ⟨S(t)|S′(t)⟩ = ∫_{−∞}^{∞} dx Ψ*(x, t) Ψ′(x, t) ,    (63)

where Ψ(x, t) is the wave function that corresponds to the state |S(t)⟩ in the position representation. We can show, using Fourier transformation, that the above inner product is independent of the basis. Filling in the first of (61), we can rewrite the above inner product as (try to show this):

    ⟨S(t)|S′(t)⟩ = ∫_{−∞}^{∞} dp Φ*(p, t) Φ′(p, t) .    (64)

In the discrete case (for example, the harmonic oscillator), we find:

    ⟨S(t)|S′(t)⟩ = Σ_n c_n*(t) c′_n(t) .    (65)

So this inner product is independent of the basis. Using this, we also find:
    ⟨ψ_x|S(t)⟩ = ∫_{−∞}^{∞} dy ψ_x*(y) Ψ(y, t) = Ψ(x, t) ,
    ⟨ψ_p|S(t)⟩ = (1/√(2πħ)) ∫_{−∞}^{∞} dx e^{−ipx/ħ} Ψ(x, t) = Φ(p, t) ,
    ⟨ψ_n|S(t)⟩ = c_n e^{−(i/ħ)E_n t} = c_n(t) .    (66)

This gives a nice interpretation: Ψ(x, t), Φ(p, t) and c_n(t) are the overlaps of the abstract vector |S(t)⟩ with the respective basis, i.e. the projections of that vector onto a particular basis. These bases are often simply denoted |x⟩, |p⟩, |n⟩, hence we can write:

    Ψ(x, t) = ⟨x|S(t)⟩ ,    Φ(p, t) = ⟨p|S(t)⟩ ,    c_n(t) = ⟨n|S(t)⟩ .    (67)

In other words, these are the coefficients c(z, t) that determine `how much q(z) is in |S(t)⟩' (for the continuous/discrete case).
With this new notation, we can also write: ψ_p(x) = ⟨x|ψ_p⟩ = ⟨x|p⟩ and ψ_y(x) = ⟨x|ψ_y⟩ = ⟨x|y⟩. This makes the notation much more symmetric and simple. ψ_p(x) is simply the overlap ⟨x|p⟩, `the amount of x in the state |p⟩'. ⟨x|y⟩ is `the amount of x in y', given by a Dirac delta function. Now it is clear that we also have: ψ_x(p) = ⟨p|x⟩ = (⟨x|p⟩)* = ψ_p*(x), as it should be (notice that in Plancherel's theorem we have +ipx in one exponential and −ipx in the other).

The overlaps ⟨x|p⟩ contain the information about the spectrum of the particular operator. For a free particle, ⟨x|p⟩ = (1/√(2πħ)) e^{ipx/ħ}. In other cases, however, this can take a completely different form.
Examples

a) Harmonic oscillator: in this case, the spectrum of the Hamiltonian is discrete, E_n = ħω(n + 1/2), whereas the spectrum of x̂ is continuous and infinite, x ∈ (−∞, ∞). The overlaps are now ⟨x|n⟩ = ψ_n(x) (the harmonic oscillator wave functions).

b) Free particle on a circle, φ ∈ [0, 2π]. Here, the spatial coordinate is continuous (x = aφ, with a the radius of the circle) but momenta are quantized, p_n = nħ/a. The overlaps are: ⟨φ|n⟩ = (1/√(2π)) e^{inφ}.
6.2 Closure
Since we now have a basis-free notation, we can notice that, for f, g ∈ H, the following holds:

    ⟨f|g⟩ = ∫ dx f*(x) g(x) = ∫ dx ⟨f|x⟩⟨x|g⟩ = ⟨f| ( ∫ dx |x⟩⟨x|g⟩ ) .    (68)

In the last step, we pulled the integral inside the inner product because f is basis-free, i.e. it does not depend on x; as well, we regard ∫ dx |x⟩⟨x|g⟩ = ∫ dx g(x) |x⟩ as a new vector, which we obtain by multiplying the vector |x⟩ with the number g(x) = ⟨x|g⟩ and integrating over x. Now let us define:

    ⟨f|Q̂|g⟩ ≡ ⟨f|Q̂ g⟩ .    (69)

(68) can now be written as:

    ⟨f|g⟩ = ⟨f| ( ∫ dx |x⟩⟨x| ) |g⟩ .    (70)
We can regard the part in the middle,
R
dx |xihx|, as an operator that acts on
produces another vector. Since this equation holds for all
f, g ∈ H,
|gi
and
this operator must be
the identity:
Z
dx |xihx|
This is called the
=1.
(71)
closure relation and indicates that the basis is complete.
Multiplying
0
|pi and hp |, this gives:
Z
Z
0
0
0
∗
hp |
dx |xihx| |pi =
dx ψp0 (x) ψp (x) = hp |pi = δ(p − p ) ,
on the right and left with
(72)
0
which is indeed true for instance for a free particle. If we multiply with |yi, hy | instead,
R
R
∗
0
0
we get:
dx ψy 0 (x) ψy (x) =
dx δ(x − y ) δ(x − y) = δ(y − y ) which is again true.
In the same way we can derive closure for p:
Z
dp |pihp|
Multiplying both sides with
6.3
|xi, |x0 i,
=1.
(73)
this is the same as
R
∗
dp ψx0 (p) ψx (p)
= δ(x − x0 ).
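As a numerical sanity check (a sketch added here, not from the notes), the discrete analog of the closure relation, $\sum_n |n\rangle\langle n| = 1$ for a finite orthonormal basis, can be verified in Python; the dimension and the random basis are arbitrary choices:

```python
import numpy as np

# Discrete analog of the closure relation (71): for an orthonormal basis
# {|n>} of C^d, the sum of the projectors |n><n| is the identity.
d = 4
rng = np.random.default_rng(0)
# Random orthonormal basis from the QR decomposition of a random complex matrix.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
basis = [Q[:, n] for n in range(d)]

closure = sum(np.outer(v, np.conj(v)) for v in basis)
assert np.allclose(closure, np.eye(d))
print("discrete closure relation holds")
```

The continuous relation (71) is the $d \to \infty$ idealization of this, with the sum replaced by an integral and the Kronecker delta by a Dirac delta.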
6.3 Bras and kets

Something funny has happened. The notation (69) gave rise to operators such as $\int dx\, |x\rangle\langle x|$. These operators act on a state to give a new state. But what do we really mean by $\langle x|$ or, more generally, $\langle\alpha|$? The definition of these states was given in (69). We can in fact think of $\langle\Psi|$ as the `complex conjugate' of $|\Psi\rangle$. For obvious reasons, $|\Psi\rangle$ is called a `ket' and $\langle\Psi|$ is called a `bra', so that we can form an inner product (a `bra-ket') by multiplying the two as in a dot product. Here our analogy with $\mathbb{R}^3$ and $\mathbb{C}^3$ comes in handy. In the usual $\mathbb{R}^3$, the standard inner product is in fact the dot product:

$$\langle w|v\rangle = w_x v_x + w_y v_y + w_z v_z = (w_x, w_y, w_z) \cdot \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} . \qquad (74)$$

In $\mathbb{C}^3$:

$$\langle w|v\rangle = w_x^* v_x + w_y^* v_y + w_z^* v_z = (w_x^*, w_y^*, w_z^*) \cdot \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} . \qquad (75)$$

In the same way, via (69) (with $Q = 1$, the identity operator), we regard $\langle\Psi_1|\Psi_2\rangle$ as the `dot product' of $\langle\Psi_1|$ and $|\Psi_2\rangle$: $\langle\Psi_1| \cdot |\Psi_2\rangle \equiv \langle\Psi_1|1|\Psi_2\rangle = \langle\Psi_1|\Psi_2\rangle$. The bra corresponds to the (complex conjugate) row vector, the ket to the column vector. They are each other's `duals'.
Remark (optional). One can show that the bras form a vector space, in the same way the kets $|\alpha\rangle$ do. In fact, this is the vector space dual to $\mathcal{H}$, and it is often denoted $\mathcal{H}^*$. By definition, the dual vector space $\mathcal{H}^*$ of a vector space $\mathcal{H}$ consists of the linear (non-degenerate) mappings from $\mathcal{H} \to \mathbb{C}$. Indeed, $\langle\alpha|$ maps vectors in $\mathcal{H}$ to $\mathbb{C}$ by the inner product: $\langle\alpha|\beta\rangle \in \mathbb{C}$. These mappings themselves form a vector space.

Now when we put together $|\alpha\rangle\langle\beta|$, this is an operator, as it maps a vector $|\gamma\rangle$ to another vector: $(|\alpha\rangle\langle\beta|)\, |\gamma\rangle = |\alpha\rangle\langle\beta|\gamma\rangle = \langle\beta|\gamma\rangle\, |\alpha\rangle$. Here, $\langle\beta|\gamma\rangle$ is a number, so it does not matter whether you write it on the left or on the right. On the other hand, the order does matter in $|\alpha\rangle\langle\beta|$: $|\alpha\rangle\langle\beta| \neq |\beta\rangle\langle\alpha|$!
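The ket-bra rules just stated can be made concrete in $\mathbb{C}^3$, where $|\alpha\rangle\langle\beta|$ is just an outer-product matrix. The following is a small illustrative sketch (the random vectors are arbitrary choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = rng.normal(size=3) + 1j * rng.normal(size=3)
beta  = rng.normal(size=3) + 1j * rng.normal(size=3)
gamma = rng.normal(size=3) + 1j * rng.normal(size=3)

# The ket-bra |alpha><beta| as a matrix: column vector times conjugated row vector.
ketbra = np.outer(alpha, np.conj(beta))

# (|alpha><beta|) |gamma> = <beta|gamma> |alpha>
lhs = ketbra @ gamma
braket = np.vdot(beta, gamma)   # <beta|gamma>; np.vdot conjugates its first argument
assert np.allclose(lhs, braket * alpha)

# The order matters: |alpha><beta| is not |beta><alpha| for generic vectors.
assert not np.allclose(ketbra, np.outer(beta, np.conj(alpha)))
print("ket-bra acts as <beta|gamma> |alpha>")
```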
7 The Interpretation of Quantum Mechanics

We have seen that quantum mechanics gives us a probabilistic interpretation of physical quantities: it is not always possible to determine the outcome of a measurement with absolute certainty, but we can predict the possible measurement outcomes and their respective probabilities in measurements on an ensemble of identical systems. The only case where the outcome of a measurement is unique is when the system is in a determinate state and we measure the corresponding observable. If the system is in an eigenstate of an operator $\hat{Q}$ with eigenvalue $q_m$, the formalism tells us that the probability to find the value $q_n$ upon measurement is 1 for $n = m$ and 0 for all other states. If $\hat{Q}$ is the position operator, then the particle is well localized; but the momentum is not well defined in that case, as the uncertainty principle expresses. This is because the operators $\hat{x}$ and $\hat{p}$ do not commute. So the formalism tells us that there is no state that describes all possible observables simultaneously, i.e. no state in which all possible measurable quantities have well-defined values.
We can get used to the above interpretation of quantum mechanics, but there is something unsatisfactory about it. Imagine carrying out a two-slit experiment with photons. Given knowledge of the initial wave function and of the interactions, we can predict the intensity distribution of light on the detection screen. If we decrease the intensity of the source so that only one photon at a time goes through the slits, we can predict the probability that the detection of the photon will take place in a particular region on the screen. But there is something counterintuitive about this. Which slit did the particle actually go through? Quantum mechanics does not tell us the answer. If we try to measure which slit the photon goes through, the interference pattern disappears: knowledge of which slit the particle went through destroys the quantum superposition. How can the act of measurement be decisive here? Is a measurement any different from other physical interactions? If the photon did not have a position before it was measured, but it does have a well-defined position when it is detected, it would seem as if the act of measurement is able to give particles properties they did not possess before. How can a measurement be so decisive as to the presence of a physical property? What makes it so unlike other physical interactions?

This set of questions, and their proposed solutions, which we will turn to next, is what people usually call the problem of the `interpretation of quantum mechanics'.¹² It is seen as problematic because it defies our classical intuitions about how physical systems are supposed to work.

¹² In writing this chapter, I have drawn ideas from [4].
7.1 EPR and Hidden Variables

In a 1935 article that was meant to be a death blow to quantum theory as a fundamental theory of reality, Einstein, Podolsky, and Rosen claimed that quantum mechanics was incomplete. Imagine a pair of particles that are initially in contact, at rest at the origin, and then go their separate ways until their mutual distance is very large. For instance, you could think of an isotope that decays into two sub-particles that shoot off to different sides, preserving the total momentum. Notice that, quantum mechanically, we cannot determine the position and momentum of the individual particles, because $[\hat{Q}_1, \hat{P}_1] = i\hbar$ and $[\hat{Q}_2, \hat{P}_2] = i\hbar$. However, we can determine the center-of-mass position of the system, $\hat{Q}_1 + \hat{Q}_2$, as well as the relative momentum, $\hat{P}_1 - \hat{P}_2$, because these operators do commute:

$$[\hat{Q}_1 + \hat{Q}_2,\ \hat{P}_1 - \hat{P}_2] = [\hat{Q}_1, \hat{P}_1] - [\hat{Q}_2, \hat{P}_2] = i\hbar - i\hbar = 0$$

(where we used the fact that the operators of the two particles commute). Now imagine that we measure the position of the first particle, $q_1$, once they are far apart. Since we know the location of the center of mass, we can infer from this measurement $q_2$, the location of the second particle. But if we simultaneously measure $p_2$, the momentum of the second particle, we may also infer $p_1$, in virtue of the fact that we know the difference in momenta. But now, EPR concluded, since the two particles are very far apart, one measurement cannot influence the other: that would violate the result of relativity that there can be no influences that travel faster than the speed of light (in Einstein's words, there can be no `spooky action at a distance'). Hence we can, by means of independent measurements, have complete knowledge of the positions and momenta of the two particles. Since the outcome of the location of particle 1 cannot affect (by relativity) the location of particle 2, this means that particle 2 must have had a well-defined position prior to measurement, even though quantum mechanics tells us it didn't. A similar argument can be made for the momentum. Of course, this is all in contradiction with the predictions of quantum mechanics. So measurements can give us complete knowledge about these properties of particles, but the theory doesn't. Therefore, EPR concluded, quantum mechanics is an incomplete theory.
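The algebraic step in the EPR argument, that operators belonging to different particles commute and hence so do $\hat{Q}_1 + \hat{Q}_2$ and $\hat{P}_1 - \hat{P}_2$, can be sketched numerically with finite-dimensional matrices (an illustration added here, not from the notes; the specific value $i\hbar$ of the canonical commutator needs infinite dimensions, but the identity below holds for any matrices):

```python
import numpy as np

# Tensor-product operators: q1, p1 act on particle 1; q2, p2 on particle 2.
d = 3
rng = np.random.default_rng(2)
q1, p1, q2, p2 = (rng.normal(size=(d, d)) for _ in range(4))
I = np.eye(d)

Q1, P1 = np.kron(q1, I), np.kron(p1, I)   # particle-1 operators
Q2, P2 = np.kron(I, q2), np.kron(I, p2)   # particle-2 operators

def comm(A, B):
    return A @ B - B @ A

# [Q1 + Q2, P1 - P2] = [Q1, P1] (x) 1  -  1 (x) [Q2, P2]:
# the cross terms drop out because different particles commute.
lhs = comm(Q1 + Q2, P1 - P2)
rhs = np.kron(comm(q1, p1), I) - np.kron(I, comm(q2, p2))
assert np.allclose(lhs, rhs)
assert np.allclose(comm(Q1, P2), np.zeros_like(lhs))  # different particles commute
print("commutator identity verified")
```

With the canonical commutators $[\hat{Q}_i, \hat{P}_i] = i\hbar$ substituted on the right-hand side, the difference vanishes, exactly as in the text.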
Before trying to rebut the EPR argument, it is natural to ask the following question: could it be that quantum mechanics is indeed missing some piece of information? Is it possible to extend quantum mechanics into a more predictive theory? Such attempts are called `hidden variable theories': assuming that the positions and momenta of particles have precise values before measurement amounts to introducing some variable that remains hidden to quantum mechanics, but determines those values before we measure them. One such attempt was carried out by David Bohm in 1952. In Bohm's theory, particles have well-defined values of positions and momenta, and the predictions of quantum mechanics are statistically reproduced. However, in order to achieve this, Bohm has to add a non-local interaction potential between the particles. This potential is called non-local because it acts at a distance; when a particle changes momentum or position, its interaction with all other particles changes. Whereas it is not clear whether there is a contradiction with special relativity (as the interactions assumed by Bohm cannot be directly used to transmit information faster than light), it is clearly an unwarranted feature of the theory.

For this reason, theorists have looked for local hidden variable theories, that is, hidden variable theories where interactions do not propagate faster than light. The main criticism of Bohm's theory, however, has been that it merely adds theoretical structure without any predictive gain, as its predictions are compatible with quantum mechanics and all experimental results agree with quantum mechanics. Furthermore, in order for the theory to reproduce the results of quantum mechanics, ad hoc assumptions about the distribution of the hidden variables have to be added.

In 1964 John Bell showed that local hidden variable theories are inconsistent with the results of quantum mechanics. That is, any hidden variable theory of the local type makes experimental predictions which subtly deviate from those of quantum mechanics. The corresponding experiments were carried out in 1982 by Alain Aspect, and the predictions of quantum mechanics, rather than those of hidden variable theories, were found to hold. These results have been confirmed in more refined experiments later on.

Hidden variable theories are sometimes identified with realist interpretations (see e.g. section 1.2 of [1]). This is a misnomer: whereas a hidden variable theory can certainly be regarded as an (extreme) realist position, there are realist positions that do not require particles to have well-defined properties (see the next section).
7.2 Bohr's Reply to EPR

Bohr's reply to the EPR article is anything but a clear-cut physical argument. Instead, it is a piece of philosophical discourse (and rather obscure at that), the main message of which seems to be that what we call `position' and `momentum' cannot be determined a priori, but essentially depends on the measurement context in which these concepts are defined. The measurement process has, in Bohr's view, an essential influence on the conditions for the definition of physical quantities. Since the conditions of measurement play an essential role in defining what we call the `physical reality' (and, as part of that, the concepts of position and momentum), one cannot infer a conclusion about the supposed incompleteness of quantum mechanics: as far as the physical phenomena are concerned, there simply is nothing else to describe than either position or momentum (but not both). In more practical terms, one could say that Bohr's position amounts to saying that the particular experimental context determines whether the concepts `position' or `momentum' make sense. If we measure the position, there is no sense in which we can meaningfully talk about the `momentum' of a particle: this concept is simply not defined.

Second, whereas Bohr denies that there are `spooky actions at a distance', he remarks that EPR's conclusion that one can infer simultaneously the position and the momentum of the particle is incoherent. Once position is measured on the first particle, this concept is applicable to the second particle as well. A measurement of the momentum of the second particle creates a new measurement context, which automatically introduces an uncertainty in the position, which is then no longer applicable. We have therefore not determined the position and momentum of particle two, but simply measured uncorrelated properties of the second particle.

By 1927, Bohr's position had blended with the views of his younger colleagues Heisenberg, Pauli, and Dirac into the dominant paradigm in the interpretation of quantum mechanics, known as the `Copenhagen interpretation'. The fact that Bohr's texts on this matter are rather oracle-like and obscure, and that his pupils developed related but not entirely coinciding accounts of quantum mechanics,¹³ has contributed to a lack of clarity as to how `the' Copenhagen interpretation should be understood.

Differences in interpretation between Bohr, Heisenberg, Pauli, and Dirac notwithstanding, the Copenhagen interpretation does seem to offer a genuine reply to the EPR argument, this being the reason that it is widely accepted. The interpretation, however, does not come without a cost. At least two issues in the Copenhagen interpretation require further thought:

1. As a philosophical interpretation, it talks about the measurement context as posing restrictions on the class of macroscopic concepts (such as position, momentum, etc.) that can be applied to a microscopic system at any given time. However, the philosophical interpretation does not tell us how this should work. This necessity of invoking the macroscopic context in defining the concepts that are applicable to the microscopic phenomena, whilst upholding that the quantum mechanical description is complete, seems to imply the existence of a fundamental boundary between the macroscopic and the microscopic. It is not at all clear what physical principle, if any, defines this boundary.

2. Bohr's sketched solution to the EPR paradox does not imply action at a distance, but does require an explanation of how measurement of particle 1 influences the applicability of the concept of `position' to particle 2, and the actual determination of this position. So measurement still seems to play a special role here; it is not simply regarded as an ordinary physical interaction.
7.3 The Measurement Problem

When trying to translate Bohr's philosophical account of quantum mechanics into a physically workable model, one runs, as mentioned, into the fact that it seems hard to regard measurement as an ordinary physical interaction. Indeed, in Bohr's view, a measurement does much more than simply `give the value of a variable': it sets the conditions under which one can meaningfully talk about that variable. In order to get a grasp of how deep this problem runs, we will here give a model of classical and quantum measurements and compare them to each other.¹⁴

Measurement in Classical Mechanics

Consider a system $S$ (for instance, you), a measuring device $M$ (a scale to measure your weight), and the readings $R$ on this device (the value of the scale pointer). Consider also a quantity $A$ corresponding to $S$ (your mass), with possible values $a \in \{a_1, \cdots, a_n\} \in \mathbb{R}$ ($a_i$ for short). Corresponding to these values, there are readings $r_i$ on the scale which indicate your weight. If the scale works properly, then there is an invertible function $m$ that gives the reading $r_i$ for a given mass $a_i$: $r_i = m(a_i)$.

You are standing in front of the scale. Since there is nothing on the scale, the pointer is in its rest position, which we call $r_0$. We have a pair of numbers describing this situation, your weight $a_i$ and the flat scale pointer: $(a_i, r_0)$. Once you step on the scale, these numbers change. Your weight does not change, but the pointer position does: we have a pair $(a_i, r_i) = (a_i, m(a_i))$. Now the values of $a_i$ and $r_i$ are correlated via the function $m$. Reading off your weight $r_i$ from the scale, and applying the function $m^{-1}$ to it, you find your mass: $a_i = m^{-1}(r_i) = 70$ kg. Notice that, as an abstract property about you, your mass is unmeasurable; we measure your weight on the scale and use it to infer the mass.

¹³ For instance, Heisenberg used the Aristotelian concepts of act and potency in his accounts of the measurement problem. This is quite different from the Kantian flavor of Bohr's remarks, as well as from the instrumentalist interpretation that the `Copenhagen interpretation' has often been given.

¹⁴ This model goes back to von Neumann, but I am very much indebted to Jos Uffink for the formulation presented here.
Measurement in Quantum Mechanics

Consider now a quantum system $S$ with a property $A$ (e.g. energy, or spin) to be measured by a measuring apparatus $M$ (a detector, a phosphor plate, a photon camera). To $A$ corresponds an operator $\hat{A}$ with eigenvalues $\{a_1, \cdots, a_n\}$. To these eigenvalues correspond states $|a_1\rangle, \cdots, |a_n\rangle \in \mathcal{H}_S$, the Hilbert space of the system. Since we want to model measurement physically, we are going to associate an operator $R$ to the measuring apparatus. $R$ gives the possible `readings' of the detector (numbers on a computer screen, dots on a phosphor surface, etc.), which can take values $r_0, r_1, \ldots, r_n$ (the value $r_0$ denoting, as before, `no detection'). So the Hilbert space $\mathcal{H}_M$ of the measuring apparatus contains one more basis state: $|r_0\rangle, |r_1\rangle, \cdots, |r_n\rangle$. We regard $|r_0\rangle$ as the ground state, the state that indicates that nothing is being measured. We assume $|a_i\rangle$ ($i = 1, \cdots, n$) and $|r_i\rangle$ ($i = 0, 1, \cdots, n$) to form orthonormal bases of $\mathcal{H}_S$ and $\mathcal{H}_M$, respectively.

Before detection, $S$ is in a state $|a_i\rangle$ (for some $i$) and $M$ in the state $|r_0\rangle$. So the total state of the system is the product of states $|a_i\rangle|r_0\rangle$; this is the quantum analog of the statement that you and the scale are described by the pair $(a_i, r_0)$.

Now we want to measure $A$ using $R$. By the measurement interaction, the system $M$ will undergo a change from $|r_0\rangle$ to $|r_i\rangle$, and the latter should be indicative of the state $|a_i\rangle$ of $S$. For an ideal measurement, the system $S$ itself should not change. So we have the transition:

$$|a_i\rangle|r_0\rangle \to |a_i\rangle|r_i\rangle . \qquad (76)$$

Our task of finding out `whether measurement is a physical interaction' now amounts to asking whether there is a Hamiltonian $H$ that describes (76) as a transition from an initial state $|a_i\rangle|r_0\rangle$ to a final state $|a_i\rangle|r_i\rangle$. In order to eliminate technical details, we will go about this question as follows. It turns out that it can be translated into the question whether there is a linear operator $U$¹⁵ such that:

$$U\,(|a_i\rangle|r_0\rangle) = |a_i\rangle|r_i\rangle . \qquad (77)$$

We can think of this operator as effecting the Schrödinger time evolution with Hamiltonian $H$. If $U$ exists, there is a Hamiltonian $H$ that contains the interaction between the system $S$ and the measuring apparatus $M$.

¹⁵ As it turns out, such an operator also has to be unitary, meaning $U U^\dagger = 1$.
In fact, the answer to this question is: yes, such an operator $U$ exists (see the appendix for more details). In principle this is very nice, because it means that:

1. We have succeeded in describing measurement as a physical interaction via a unitary operator.

2. No measurement postulate is needed.

3. There is no distortion of the state $|a_i\rangle$ of the system.

However, the fact that $U$ is linear has some strange implications. The above really only works if $S$ is in an eigenstate $|a_i\rangle$ of $A$, as we have assumed above. What if the system $S$ is in a superposition?

$$|\psi\rangle = \sum_j c_j |a_j\rangle . \qquad (78)$$
Let us apply our operator to this state. After measurement, we get a final state:

$$|\psi_{\rm final}\rangle = U\,(|\psi\rangle|r_0\rangle) = U \left( \sum_j c_j |a_j\rangle|r_0\rangle \right) = \sum_j c_j\, U\,(|a_j\rangle|r_0\rangle) = \sum_j c_j\, |a_j\rangle|r_j\rangle , \qquad (79)$$

where we have used the linearity of $U$. You see the important consequence: the system is no longer in a product state $|a_i\rangle|r_i\rangle$ with a definite value of $i$ (corresponding to one eigenvalue of $A$) after the measurement! Instead, the final state is a superposition of the measured system and the measuring apparatus! This means that you and the scale are forever entangled after you step on it! So which state is this system in? We simply cannot tell: it might be any of the $|a_j\rangle|r_j\rangle$, with probabilities $|c_j|^2$.

Could we solve this problem by bringing in another measuring device that will find out what state you and the scale are in? It is not hard to see that that second measuring device will again become entangled with $S$ and $M$ in this way. The inescapable conclusion is that, once we regard measurement interactions to be described by operators $U$ acting as in (77), the whole universe becomes entangled. Of course, this is nothing but Schrödinger's cat in disguise. Once we start with a superposition somewhere in the universe, and if all interactions are given by `nice' linear operators, it will not be long before the whole visible universe finds itself in a superposition that includes $S$ and $M$. So this does not give us a way of explaining how measurements give definite values after all.
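The entangling effect of a linear measurement interaction can be illustrated with a small two-level toy model (a sketch added here; the particular unitary below is one arbitrary way to realize the transition (77) on a finite space, not the notes' construction):

```python
import numpy as np

# Toy measurement: U maps |a_i>|r_0> to |a_i>|r_i> (i = 1, 2), as in (77).
# Apparatus basis: |r_0>, |r_1>, |r_2>; system basis: |a_1>, |a_2>.
dS, dM = 2, 3
a = [np.eye(dS)[i] for i in range(dS)]   # a[i] encodes |a_{i+1}>
r = [np.eye(dM)[j] for j in range(dM)]   # r[j] encodes |r_j>

# Build U as a block permutation: in the a_i sector, swap r_0 <-> r_i.
U = np.zeros((dS * dM, dS * dM))
for i in range(dS):
    U += np.outer(np.kron(a[i], r[i + 1]), np.kron(a[i], r[0]))  # |a_i r_i><a_i r_0|
    U += np.outer(np.kron(a[i], r[0]), np.kron(a[i], r[i + 1]))  # swap back (unitarity)
    for j in range(dM):                                          # fix the rest
        if j != 0 and j != i + 1:
            v = np.kron(a[i], r[j])
            U += np.outer(v, v)
assert np.allclose(U @ U.T, np.eye(dS * dM))     # U is unitary

# Superposition input: |psi> = (|a_1> + |a_2>)/sqrt(2), apparatus in |r_0>.
psi = (a[0] + a[1]) / np.sqrt(2)
final = U @ np.kron(psi, r[0])

# The output is entangled: the reshaped coefficient matrix has Schmidt rank 2,
# so the final state is not a product state -- exactly the consequence of (79).
rank = np.linalg.matrix_rank(final.reshape(dS, dM))
assert rank == 2
print("linear U turns a superposition into an entangled state")
```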
This further motivates von Neumann's description of measurement not as a `normal' unitary operator $U$, but as a `special' projection operator $P_i$ that projects onto some particular state: if we start with

$$|\psi\rangle = \sum_j c_j |a_j\rangle , \qquad (80)$$

then after measurement of the eigenvalue $a_i$ the system will project to the following state:

$$P_i |\psi\rangle = c_i |a_i\rangle . \qquad (81)$$

Since the $|a_j\rangle$ are linearly independent, we see that in order for this to work the projection operator $P_i$ has to act on a basis as:

$$P_i |a_j\rangle = \delta_{ij} |a_i\rangle . \qquad (82)$$

We then also get for our final state:

$$P_i |\psi_{\rm final}\rangle = c_i |a_i\rangle|r_i\rangle . \qquad (83)$$

Notice that the $P_i$'s are not unitary, because the final state is not normalized to 1: $c_i |a_i\rangle|r_i\rangle$ has norm $|c_i|$ (and probability $|c_i|^2$), which is in general not 1; hence this is not a regular physical interaction of the type described by the Schrödinger equation, because it does not preserve total probability. We have to rescale the state again in order to normalize it to 1. It can also be shown that projections do not conserve energy.
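The non-unitarity of the von Neumann projection can be seen directly in a small numerical sketch (added here as an illustration; the coefficients are arbitrary choices):

```python
import numpy as np

# Von Neumann projection (81)-(82): P_i |a_j> = delta_ij |a_i>. Acting on a
# normalized superposition, the output has norm |c_i| < 1, so P_i is not unitary.
d = 3
c = np.array([0.6, 0.8j, 0.0])           # normalized: 0.36 + 0.64 = 1
assert abs(np.vdot(c, c) - 1.0) < 1e-12

i = 1
P = np.zeros((d, d))
P[i, i] = 1.0                            # projector onto |a_i> in the {|a_j>} basis

out = P @ c                              # = c_i |a_i>
norm = np.linalg.norm(out)
assert abs(norm - abs(c[i])) < 1e-12     # norm |c_i| = 0.8, not 1
assert not np.allclose(P.conj().T @ P, np.eye(d))   # hence P is not unitary
print("projection shrinks the norm to", norm)
```

Renormalizing `out` by `norm` then reproduces the usual post-measurement state.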
7.4 Other Approaches

7.4.1 Decoherence

The basic idea of decoherence is quite simple: the reason for the seemingly non-unitary evolution (81) lies in the interactions with the environment. The evolution (81) looks as if it were non-unitary, but in reality (i.e. if we were able to include all the very complicated interactions with the environment) we would see that it is unitary. The wave function appears to collapse, but in fact it satisfies the Schrödinger equation. Some parts of the wave function, which should be present in (81), have been projected out simply because their coefficients $c_j$ are very small. So we are dealing with a kind of thermodynamic irreversibility here, in which the `thermal bath' provided by the environment washes away some of the information.

Another way of saying this is that, after measurement, the wave function (after rescaling by $c_i$) does not look like (81), but actually looks like:

$$|a_i\rangle|r_i\rangle + \ldots , \qquad (84)$$

where the dots denote terms with low probability. The wave function is actually entangled, but for all practical purposes it looks as if just one term in the superposition contributes. The reason for this is that we neglect interactions with the environment in the Hamiltonian.

In my opinion (but some of my colleagues may disagree on this), this is not strictly speaking a solution to the fundamental problem we posed of why we never see superpositions. The magic words above are `for all practical purposes': decoherence is a solution that works in practice, but, in fact, the whole universe is entangled in the state (84) (since the environment includes the whole visible universe). So it does not solve the matter of principle we raised: how do we pass from a formalism which predicts universal entanglement to states in which particles have unique properties, i.e. determinate states? Simple decoherence just tells us that the determinate states are the most probable ones, but it does not explain how to actually obtain them. Unfortunately, long discussions in texts on decoherence fail to address this fundamental point.
Perhaps the answer to this is that we should not interpret the wave function as describing the world as it actually is, but only as a collection of possibilities. The wave function tells us what is possible and probable, but not what is actual.
7.4.2 Many Worlds

I just suggested that the wave function maybe only describes possibilities, not actualities. At the other extreme of the spectrum there is the many-worlds interpretation, which assigns objective actuality to all of the terms in the wave function (whether they decohere or not), i.e. to all the summands in (84). It replaces the collapse of the wave function by its branching off: at each single measurement, the world branches off into distinct possibilities, so that every possible outcome is realized in a different world. This reconciles the appearance of non-deterministic events (such as the random decay of an atom) with deterministic equations such as the Schrödinger equation.

This may sound crazy, but you would be surprised by the growing number of physicists who actually support this interpretation. There is some theoretical support for the many-worlds interpretation coming from the `histories approach' to quantum mechanics, as well as from some recent puzzles in quantum cosmology. Perhaps the fact that physicists are willing to support such a crazy idea is simply an indication of how badly the measurement problem sits with them.

I see a simple puzzle for the many-worlds interpretation which never seems to be discussed in the literature. Consider a harmonic oscillator. It is intuitively clear how the many-worlds interpretation would work for this system, since its spectrum is discrete. If the wave function is in a superposition $\sum_n c_n \psi_n$, there is a world with wave function $\psi_1$, a world with wave function $\psi_2$, etc. (up to infinity). But this only works because the spectrum is discrete. If the spectrum is continuous and labeled by, say, the wave number $k$, then there is a continuum of worlds between any two finite values of $k$. This seems nonsensical, because it is well known from mathematics that we cannot label the real numbers using the natural numbers. The latter is an infinite but countable set. The former is not only an infinite, but an uncountably infinite set! If we cannot even count the number of worlds, how can they all have separate existence? The concept of `continuum' seems counter to the idea of all these worlds having `separate existence'. So to me, this seems like nonsense. A way to resolve this puzzle could be to turn all continuous spectra in quantum mechanics into discrete ones, by introducing a new scale so small that we cannot see it, so that particles `seem to have continuous momentum $k$' but are actually always quantized, with energy levels so dense that we cannot tell them apart. But then again: 1) this is never discussed in the many-worlds literature; and 2) it amounts to changing quantum mechanics into a different fundamental theory, which runs counter to our initial aim of simply interpreting quantum mechanics (which we assumed to be a complete theory)!
A Mathematical Formulas and Tricks

A.1 Gaussian integration

• The basic Gaussian integral is:

$$\int_{-\infty}^{\infty} dx\, e^{-\alpha x^2} = \sqrt{\frac{\pi}{\alpha}} . \qquad (85)$$

To show this, we compute the integral in two different ways. Consider the area under the function $e^{-r^2}$ on the plane, in plane polar coordinates:

$$\int dA\, e^{-r^2} = \int_0^\infty dr \int_0^{2\pi} d\theta\, r\, e^{-r^2} = \int_0^\infty \frac{du}{2}\, 2\pi\, e^{-u} = \pi \int_0^\infty du\, e^{-u} = -\pi\, e^{-u} \Big|_0^\infty = \pi , \qquad (86)$$

where we defined $u = r^2$. Now compute this same integral in Cartesian coordinates, $x = r \sin\theta$, $y = r \cos\theta$:

$$\pi = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{-(x^2 + y^2)} = \int_{-\infty}^{\infty} dx\, e^{-x^2} \int_{-\infty}^{\infty} dy\, e^{-y^2} = \left( \int_{-\infty}^{\infty} dx\, e^{-x^2} \right)^2 . \qquad (87)$$

Taking the square root, it follows that:

$$\int_{-\infty}^{\infty} dx\, e^{-x^2} = \sqrt{\pi} . \qquad (88)$$

To obtain the Gaussian integral with generic width, we simply rescale $x \to \sqrt{\alpha}\, x$ in the integrand, which gives back (85).
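As a quick numerical check of (85) (a sketch added here, not part of the derivation; the truncation of the integration range to $|x| \le 20$ is an arbitrary but harmless choice, since the tails are negligible):

```python
import numpy as np

# Numerical check of the basic Gaussian integral (85):
# the integral of exp(-alpha x^2) over the real line equals sqrt(pi / alpha).
def gaussian_integral(alpha, L=20.0, n=200001):
    x = np.linspace(-L, L, n)            # tails beyond |x| = 20 are negligible
    return np.sum(np.exp(-alpha * x**2)) * (x[1] - x[0])

for alpha in (0.5, 1.0, 2.0):
    exact = np.sqrt(np.pi / alpha)
    assert abs(gaussian_integral(alpha) - exact) < 1e-8
print("Gaussian integral (85) verified")
```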
• From (85) we can also compute the following integral:

$$\int_{-\infty}^{\infty} dx\, e^{-\alpha x^2 - i\beta x} = \sqrt{\frac{\pi}{\alpha}}\, e^{-\frac{\beta^2}{4\alpha}} . \qquad (89)$$

For the proof of this, see section 2.4.

• Using (85), we can also compute integrals of the following type:

$$\int_{-\infty}^{\infty} dx\, x^n\, e^{-\alpha x^2} . \qquad (90)$$

First of all, we notice that this integral vanishes if $n$ is odd. The reason is that in that case the integrand is an odd function under $x \to -x$. The integral of an odd function over a range $(-a, a)$ is zero. Since we integrate over $(-\infty, \infty)$, the result of this integral is zero.
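Both statements so far, the shifted Gaussian integral (89) and the vanishing of the odd moments of (90), can be verified numerically (a sketch added here; the grid and tolerances are arbitrary choices):

```python
import numpy as np

# Numerical checks: odd moments of (90) vanish by symmetry, and the shifted
# Gaussian (89) equals sqrt(pi/alpha) * exp(-beta^2 / (4 alpha)).
x = np.linspace(-20.0, 20.0, 400001)
dx = x[1] - x[0]
alpha, beta = 1.3, 0.7

odd_moment = np.sum(x**3 * np.exp(-alpha * x**2)) * dx
assert abs(odd_moment) < 1e-8                       # odd integrand -> zero

shifted = np.sum(np.exp(-alpha * x**2 - 1j * beta * x)) * dx
exact = np.sqrt(np.pi / alpha) * np.exp(-beta**2 / (4 * alpha))
assert abs(shifted - exact) < 1e-8
print("checks of (89) and the odd moments pass")
```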
When $n$ is even, we use the partial integration formula:

$$\int_D u\, dv = uv \Big|_D - \int_D du\, v , \qquad (91)$$

where $D$ denotes the domain of integration. Noting that $d(e^{-\alpha x^2}) = -2\alpha x\, e^{-\alpha x^2}\, dx$, we can take $u = x$ and $v = e^{-\alpha x^2}/(-2\alpha)$. Applying (91), we get

$$\int_{-\infty}^{\infty} dx\, x^2\, e^{-\alpha x^2} = \frac{x\, e^{-\alpha x^2}}{-2\alpha} \Bigg|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} dx\, \frac{e^{-\alpha x^2}}{-2\alpha} = \frac{1}{2\alpha} \sqrt{\frac{\pi}{\alpha}} = \frac{\sqrt{\pi}}{2\alpha^{3/2}} , \qquad (92)$$

where the boundary term vanishes because the Gaussian function vanishes at infinity more rapidly than any polynomial. We can get the result for higher powers of $x$ by successive partial integrations. The result is:

$$\int_{-\infty}^{\infty} dx\, x^{2n}\, e^{-\alpha x^2} = \frac{(2n)!\, 2\sqrt{\pi}}{n!\, (4\alpha)^{n + \frac{1}{2}}} . \qquad (93)$$

B Technicalities of Quantum Mechanical Measurements

B.1 Time Evolution Operators
Remember that the problem of quantum mechanics amounts to finding a solution $\Psi(x,t)$ of the TDSE (20), given an initial condition $\Psi_0(x)$. We can recast this problem in the language of operators, as follows: find a linear operator $U$ such that:

$$U: \Psi_0(x) \mapsto \Psi(x,t) = U(t)\, \Psi_0(x) \qquad (94)$$

satisfies the TDSE. Here, $U$ is a time-dependent operator called the evolution operator. In a particular basis, and in the discrete case, it maps:

$$U: \sum_n c_n\, \psi_n(x) \mapsto \sum_n c_n\, e^{-\frac{i}{\hbar} E_n t}\, \psi_n(x) . \qquad (95)$$

The operator $U$ is easy to find:

$$U(t) \equiv e^{-\frac{i}{\hbar} H t} \equiv \sum_{n=0}^{\infty} \frac{\left( -\frac{i}{\hbar} H t \right)^n}{n!} , \qquad (96)$$

where the exponential, which includes an operator (the $n$th power of the Hamiltonian), has been defined by its Taylor expansion. Although this requires some care, Taylor expansions can indeed be defined for linear operators (as for matrices), analogously to how they are defined for numbers:

$$e^{\hat{Q}} \equiv \sum_{n=0}^{\infty} \frac{(\hat{Q})^n}{n!} . \qquad (97)$$

Now the claim is that $U(t)$, as defined in (96), does the job in (94): when $U(t)$ is applied to $\Psi_0(x)$, it gives the solution of the TDSE with that initial condition. Let us check this. If $\Psi(x,t)$ is to be given by (94) with $U$ as in (96), it must solve the TDSE:

$$i\hbar\, \frac{\partial \Psi(x,t)}{\partial t} = i\hbar\, \frac{\partial U(t)}{\partial t}\, \Psi_0(x) . \qquad (98)$$
Let us compute the time derivative:

$$\frac{\partial U}{\partial t} = \frac{\partial}{\partial t}\, e^{-\frac{i}{\hbar} H t} = -\frac{i}{\hbar}\, H\, e^{-\frac{i}{\hbar} H t} = -\frac{i}{\hbar}\, H\, U(t) . \qquad (99)$$

Filling this back into (98):

$$i\hbar\, \frac{\partial \Psi(x,t)}{\partial t} = i\hbar \left( -\frac{i}{\hbar}\, H \right) U(t)\, \Psi_0(x) = H\, \Psi(x,t) , \qquad (100)$$

which is precisely the TDSE. Hence, we have shown that (96) does the job of solving the TDSE for us.
Let us now see what these formulas mean in practice. If we are given the initial condition $\Psi_0(x) = \sum_n c_n \psi_n(x)$, then:

$$U(t)\, \Psi_0(x) = U(t) \left( \sum_n c_n\, \psi_n(x) \right) = \sum_n c_n\, \left( U(t)\, \psi_n(x) \right) = \sum_n c_n\, e^{-\frac{i}{\hbar} H t}\, \psi_n(x) = \sum_n c_n\, e^{-\frac{i}{\hbar} E_n t}\, \psi_n(x) = \Psi(x,t) , \qquad (101)$$

where we have used the fact that $\psi_n$ satisfies the TISE, $H \psi_n = E_n \psi_n$, and hence we can replace any powers of $H$ by powers of $E_n$ when they act on $\psi_n$ (also in the exponential: by using the Taylor expansion we can see this term by term). Now the last expression in (101) is indeed the usual solution of the TDSE.

$U$ happens to be a unitary operator, i.e. $U U^\dagger = U^\dagger U = 1$. This can be shown by computing the adjoint:

$$U^\dagger = \left( e^{-\frac{i}{\hbar} H t} \right)^\dagger = e^{\left( -\frac{i}{\hbar} H t \right)^\dagger} = e^{\frac{i}{\hbar} H t} , \qquad (102)$$

where we have used the fact that $H$ is Hermitian and that, again, to compute the adjoint of the exponential, we can use the Taylor expansion, compute the adjoint term by term, and resum the Taylor series. Since we get a plus sign in the exponential, when we multiply it with $U$, which contains a minus sign, the two cancel out and we get the unit matrix. Because $U$ is unitary, it preserves amplitudes (and probabilities):

$$\langle \Psi | \Psi \rangle = \langle U \Psi_0 | U \Psi_0 \rangle = \langle \Psi_0 | U^\dagger U\, \Psi_0 \rangle = \langle \Psi_0 | \Psi_0 \rangle . \qquad (103)$$
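These properties can be checked numerically for a finite-dimensional Hamiltonian (a sketch added here, not from the notes; the random Hermitian $H$ and $\hbar = 1$ are arbitrary choices, and $U(t)$ is built from the spectral decomposition of $H$ rather than the Taylor series):

```python
import numpy as np

# Check that U(t) = exp(-i H t / hbar) is unitary and that Psi(t) = U(t) Psi_0
# solves the TDSE, for a random Hermitian H (hbar = 1).
rng = np.random.default_rng(3)
d = 4
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (A + A.conj().T) / 2                 # Hermitian Hamiltonian

E, V = np.linalg.eigh(H)                 # H = V diag(E) V^dagger

def U(t):
    return V @ np.diag(np.exp(-1j * E * t)) @ V.conj().T

t = 0.8
assert np.allclose(U(t) @ U(t).conj().T, np.eye(d))      # unitarity, cf. (102)

psi0 = rng.normal(size=d) + 1j * rng.normal(size=d)
psi0 /= np.linalg.norm(psi0)

# TDSE check by a symmetric finite difference: i d/dt Psi = H Psi.
eps = 1e-6
dpsi = (U(t + eps) - U(t - eps)) @ psi0 / (2 * eps)
assert np.allclose(1j * dpsi, H @ U(t) @ psi0, atol=1e-6)

# Norm conservation, cf. (103).
assert abs(np.linalg.norm(U(t) @ psi0) - 1.0) < 1e-12
print("U(t) is unitary and solves the TDSE")
```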
B.2 The Measurement Operator

The measurement operator $U$ we were looking for in (77) can be explicitly written as follows:

$$U = \sum_{jk} |a_k\rangle |r_{j+k}\rangle \langle r_j| \langle a_k| . \qquad (104)$$

It is not hard to show that $U$ indeed satisfies (77). In fact, this operator is a combination of projectors onto $A$ and onto $R$:

$$U = \sum_{jk} P^{(a)}_{k,k}\, P^{(r)}_{j,j+k} , \qquad (105)$$

where

$$P^{(r)}_{i,j}\, |r_l\rangle = \delta_{il}\, |r_j\rangle , \qquad (106)$$

and $P^{(a)}$ is defined similarly to act on the $|a_l\rangle$: $P^{(a)}_{i,j}\, |a_l\rangle = \delta_{il}\, |a_j\rangle$.

Proof: We simply let $U$ act on the state $|\psi\rangle|r_0\rangle$, with $|\psi\rangle$ the superposition (78):

$$U\,(|\psi\rangle|r_0\rangle) = \sum_j c_j\, U\,(|a_j\rangle|r_0\rangle) . \qquad (107)$$

We compute the r.h.s. separately, using the decomposition into projectors (we do some relabeling of indices):

$$\sum_j c_j\, U\,(|a_j\rangle|r_0\rangle) = \sum_l c_l \sum_{jk} P^{(a)}_{k,k}\, P^{(r)}_{j,j+k}\, |a_l\rangle |r_0\rangle = \sum_{jkl} c_l\, P^{(a)}_{k,k} |a_l\rangle\, P^{(r)}_{j,j+k} |r_0\rangle = \sum_k c_k\, |a_k\rangle |r_k\rangle , \qquad (108)$$

where in the second equality sign we used the fact that $P^{(r)}$ acts only on eigenstates of $R$ and $P^{(a)}$ only on eigenstates of $A$. In the last formula we used the property (106). This is precisely what we wanted to show.
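The construction (104) and the result (108) can be checked numerically (a sketch added here, not from the notes; to make $U$ a well-defined unitary on the finite-dimensional space, the reading index $j + k$ is taken modulo $n + 1$, which is an assumption added for this illustration):

```python
import numpy as np

# Sketch of the measurement operator (104): U = sum_{jk} |a_k>|r_{j+k}><r_j|<a_k|,
# with the reading index j + k taken mod (n + 1) on the finite space.
n = 3                                    # system states a_1 .. a_n
dM = n + 1                               # readings r_0 .. r_n
a = [np.eye(n)[k] for k in range(n)]     # a[k] encodes |a_{k+1}>
r = [np.eye(dM)[j] for j in range(dM)]   # r[j] encodes |r_j>

U = np.zeros((n * dM, n * dM))
for k in range(n):
    for j in range(dM):
        # term |a_k>|r_{j+k}> <r_j|<a_k|, with k = 1..n encoded as k + 1
        U += np.outer(np.kron(a[k], r[(j + k + 1) % dM]), np.kron(a[k], r[j]))
assert np.allclose(U @ U.T, np.eye(n * dM))          # U is unitary

# (77): U |a_k>|r_0> = |a_k>|r_k>
for k in range(n):
    out = U @ np.kron(a[k], r[0])
    assert np.allclose(out, np.kron(a[k], r[k + 1]))

# (108): on a superposition, U gives sum_k c_k |a_k>|r_k>
c = np.array([0.5, 0.5, 1 / np.sqrt(2)])
psi = sum(c[k] * a[k] for k in range(n))
final = U @ np.kron(psi, r[0])
expected = sum(c[k] * np.kron(a[k], r[k + 1]) for k in range(n))
assert np.allclose(final, expected)
print("measurement operator (104) verified")
```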
References

[1] D.J. Griffiths, Introduction to Quantum Mechanics, Pearson, 2nd Edition.

[2] D.C. Giancoli, Physics: Principles with Applications, Pearson, 6th Edition.

[3] B.H. Bransden and C.J. Joachain, Quantum Mechanics, Addison-Wesley, 2nd Edition.

[4] D. Dieks, Filosofie/grondslagen van de natuurkunde, Utrecht University, 2008-2009.