Quantum Physics Lecture Notes
Understanding the Schrödinger Equation

Sebastian de Haro
Amsterdam University College, Fall Semester, 2014

Cover illustration: Wikipedia

Contents

Introduction
1 Motivating the Schrödinger Equation
1.1 Classical waves
1.2 Enter the quantum
1.3 The wave function
2 Four Steps to Solve the Schrödinger Equation
2.1 The Problem
2.2 Step 1: Reduce to TISE
2.3 Step 2: General Solution of TDSE
2.4 Step 3: Impose Initial Condition Ψ0(x)
2.5 Step 4: Plug Back into Ψ(x, t)
2.6 Example: Gaussian Wave Function
3 One-Dimensional Potentials
3.1 General Theorems
3.2 The TISE with one-dimensional potentials
4 Fourier Integrals and the Dirac Delta
4.1 The Dirac delta
4.2 Fourier transformations
5 The Formalism of Quantum Mechanics
5.1 Why?
5.2 The Postulates of Quantum Mechanics
5.3 Linear Algebra
5.4 Continuous spectra
6 Dirac Notation
6.1 Base-free notation
6.2 Closure
6.3 Bras and kets
7 The Interpretation of Quantum Mechanics
7.1 EPR and Hidden Variables
7.2 Bohr's Reply to EPR
7.3 The Measurement Problem
7.4 Other Approaches
7.4.1 Decoherence
7.4.2 Many Worlds
A Mathematical Formulas and Tricks
A.1 Gaussian integration
B Technicalities of Quantum Mechanical Measurements
B.1 Time Evolution Operators U
B.2 The Measurement Operator
References

Introduction

Welcome to the fascinating world of quantum mechanics! We are going to learn how to do computations in quantum mechanics and how to interpret our results physically. I say quantum mechanics is fascinating because it is so different from any other physical theory we have ever seen before. In classical physics, particles travel along trajectories that can be drawn in space and time. In quantum mechanics, particles don't have trajectories and sometimes we can't even say where they are located in space! In fact, quantum mechanics is so weird that some of the scientists who made major contributions to it (notoriously, Albert Einstein and Erwin Schrödinger) never believed the interpretation of quantum theory that has now become standard. Richard Feynman famously said, "I think I can safely say that nobody understands quantum mechanics." So, remember: if you get confused, you are in good company.
However, I don't think Feynman was entirely right on this one. We don't seem to have quantum mechanical intuition wired into our brains and bodies (as we do learn to appreciate weight, height, etc., intuitively). Feynman was certainly right about that. But we can learn to work with the theory and progressively develop a rational intuition for it, based on what we learn from the formulas and the physical principles they embody. By logical thinking and some guesswork, and by looking at the experimental facts (black body radiation, photoelectric effect, double slit experiment, etc.), we can try to find the necessary physical principles that will allow us to construct the theory of quantum mechanics. We can't derive quantum mechanics (let me be clear about that), just like we can't derive, from sheer thinking about the concept of mass, the fact that Newton's law of gravity decreases with the square of the distance. But we can look for the minimal set of principles and equations that will reproduce and unify all of the facts we know about quantum mechanics. This is what Schrödinger's equation does for us. Once we have motivated why it should be true, we have to learn how to work with it and to interpret it. Indeed this is the right order: first learn to work with it, then gain deeper understanding of what it means. Remember that in the short period in which the `new' quantum theory was developed (basically, Christmas of 1925 to the summer of 1926) Heisenberg and Schrödinger worked on the math, and that only later on did they focus on its meaning. We will be doing both things as simultaneously as possible: developing the theory while discussing its interpretation. These lecture notes are to be used as a complement to a textbook on quantum mechanics such as Griffiths' [1]. I will focus on a number of selected issues that I think are important to understand quantum mechanics.
In the first chapter I motivate how the Schrödinger equation comes to be in the first place, rather than just throwing it at you and asking you to trust quantum mechanics. Whereas the Schrödinger equation cannot be derived from classical mechanics, it can be motivated from semi-classical considerations. I also expand, in later sections, on mathematical explanations and include study tools. Writing down a mathematical theory of quantum mechanics assumes knowledge of the basic experiments that led to this theory and the broad principles that were derived from them. Therefore I recommend that, before you start reading these lecture notes, you refresh your memory (or catch up, whichever may be the case) on the most relevant historical experiments and the physical principles that were drawn from them. I recommend chapter 37 of Giancoli [2]. If you want a more thorough exposition, you can read chapter 1 of [3]. I will refer to these experiments regularly.

1 Motivating the Schrödinger Equation

1.1 Classical waves

We recall some concepts from classical physics that will serve both to fix notation and to motivate the way we introduce waves in quantum mechanics. Consider a one-dimensional wave, say:

y(x) = A sin kx .   (1)

This is a static, sinusoidal wave extending in the x-direction. A, the maximum height, is called the amplitude of the wave. The wavelength λ is the distance between two identical points on the wave (e.g. two crests, two troughs, two nodes), and in the above example it is given by λ = 2π/k, because this is the smallest number for which the wave repeats itself, i.e. y(x + λ) = y(x). k is usually called the wave number. Consider now a wave traveling at speed v:

y(x, t) = A sin(k(x − vt)) = A sin((2π/λ)(x − vt)) .   (2)

The period T of this wave is the time it takes for the wave to go back to itself in time, analogously to the wavelength: y(x, t + T) = y(x, t), hence T = λ/v. You see that the interpretation of v as the speed is right, as v is indeed the wavelength divided by the period.
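The relation T = λ/v can be checked directly: shifting t by one period shifts the argument of the sine by exactly 2π. A quick numerical sketch (the values of A, λ, and v are arbitrary illustrative choices):

```python
import math

# traveling wave (2); check the periodicity y(x, t + T) = y(x, t) with T = lambda/v
A, lam, v = 1.0, 2.0, 3.0
k = 2 * math.pi / lam        # wave number
T = lam / v                  # period

def y(x, t):
    return A * math.sin(k * (x - v * t))

diff = y(0.4, 0.1 + T) - y(0.4, 0.1)
print(diff)  # ~0: shifting t by T shifts the sine's argument by 2*pi
```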
The frequency is defined as ν = 1/T, and the angular frequency is ω = 2πν = 2π/T. The advantage of using the wave number and the angular frequency is that they absorb the periodicity of the sine and cosine, and the factors of 2π nicely disappear:

y(x, t) = A sin(kx − ωt) .   (3)

I am deliberately using ν for the frequency here, and not f, because this is the standard notation in more advanced texts. We will use the Greek alphabet extensively in this course, so you had better get used to it!: α, β, γ, δ, ε, . . . This is a notation that we will use quite often.

Complex waves. In quantum mechanics we will always use complex waves, for reasons that will become clear later. A complex wave looks like:

Ψ(x, t) = A e^{i(kx−ωt)} .   (4)

Now recall the following result:

Euler's formula. e^{iϕ} = cos ϕ + i sin ϕ.

Proof. This can be proven by using the following relations for the sine and cosine, which you should know:

sin ϕ = (e^{iϕ} − e^{−iϕ}) / (2i)
cos ϕ = (e^{iϕ} + e^{−iϕ}) / 2 .   (5)

From this, we get:

cos ϕ + i sin ϕ = (1/2) e^{iϕ} + (1/2) e^{−iϕ} + i (1/(2i)) e^{iϕ} − i (1/(2i)) e^{−iϕ} = e^{iϕ} ,   (6)

which proves Euler's formula.

Remark. If you don't recognise formula (5) at all, try to prove it using the Taylor expansions (Taylor series) of the sine, cosine, and exponential functions, which you should know:

sin x = Σ_{n=0}^{∞} ((−1)^n / (2n+1)!) x^{2n+1}
cos x = Σ_{n=0}^{∞} ((−1)^n / (2n)!) x^{2n}
e^x = Σ_{n=0}^{∞} x^n / n! .   (7)

As you can see, the cosine is an even function of x, hence it only has terms with even powers, x^{2n}, whereas the sine is odd and only has odd powers x^{2n+1}. By taking combinations of e^{iϕ} and e^{−iϕ} with plus and minus sign, respectively, in (5), we cancel the odd/even terms and are left with the cosine or sine, respectively. Because of Euler's formula, we see that the cosine is the real part of the wave while the sine is its imaginary part:

cos ϕ = Re e^{iϕ}
sin ϕ = Im e^{iϕ} .   (8)

Hence, we can always replace sinusoidal waves such as (3) by complex waves (4) and take the real or imaginary part as needed.
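Euler's formula and the relations (5) are easy to verify numerically; a small Python check (the value of ϕ is an arbitrary test point):

```python
import cmath
import math

phi = 0.8  # arbitrary test angle

# Euler's formula: e^{i phi} = cos(phi) + i sin(phi)
euler_diff = cmath.exp(1j * phi) - (math.cos(phi) + 1j * math.sin(phi))

# the combinations (5): sin and cos built from complex exponentials
sin_phi = (cmath.exp(1j * phi) - cmath.exp(-1j * phi)) / (2j)
cos_phi = (cmath.exp(1j * phi) + cmath.exp(-1j * phi)) / 2

print(abs(euler_diff), abs(sin_phi - math.sin(phi)), abs(cos_phi - math.cos(phi)))
```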
Having made these remarks on complex waves, we now go back to the physical meaning of the quantities involved in a wave. I want to make the following point about the amplitude A. When we have a wave of the type A sin x or A e^{ix}, the quantity that is of physical interest is not A itself, but |A|² ≡ AA*. Mathematically, the reason is that A could be made to be negative by simply changing our coordinate by x → −x in sin x, and obviously that shouldn't affect the amplitude, which is independent of the orientation of the coordinate. Also, in a complex wave e^{iϕ} we could shift ϕ → ϕ + α by some constant α and generate a factor e^{iα}, which then would become part of A. But also that complex phase is a change of variables (namely, choosing a different zero point for ϕ) which should not affect the amplitude. For these reasons, the physical quantities of interest depend on the absolute value (squared) of the amplitude |A|² and not on A itself.

1.2 Enter the quantum

Experiments such as the two-slit experiment make clear that particles sometimes behave as waves. This is true not only for light, but also for material particles such as electrons. Indeed, in a flash of genius Louis de Broglie hypothesized that not only electromagnetic radiation has a dual nature as waves and as particles (photons), but that also matter, usually believed to be of corpuscular nature, should possess wave-like properties. For photons, Planck's 1900 hypothesis (clarified and generalized by Einstein's 1905 description of the photoelectric effect) had been that the energy is quantized in units of h (the Planck constant): E = hν = hc/λ. The quantity p = E/c = h/λ had been identified as the photon's momentum, which played an important role in the impressive Compton effect, where a single photon could hit an electron at rest and impart it non-zero momentum. de Broglie now proposed that, analogously to the case of light, matter waves also have a frequency ν and a wavelength λ, given by:

ν = E/h
λ = h/p .   (9)
Indeed it is one of the great features of the Planck constant h that, by introducing a fundamental constant with units J · s, such relations between frequency and energy and between wavelength and momentum can be written down. For the same reasons for which one introduces the angular frequency and the wave number, we will often use the reduced Planck constant

ℏ = h/2π ,   (10)

which is usually simply referred to as `Planck's constant'. In terms of these, the above read:

E = ℏω
p = ℏk .   (11)

Furthermore, remember the classical relation E = T + V. If we ignore the potential, we have, classically:

E = p²/2m = ℏ²k²/2m ,   (12)

where m is the mass of the electron. We will use this expression a lot in what follows.

Remark. The above formulas (11)-(12) also imply a relation between the frequency and the wavelength of the wave (viz. between the angular frequency and the wave number): ω(k) = ℏk²/2m. Such a relation is called a dispersion relation because it determines the effect of dispersion of the wave as it travels through a medium¹. The above dispersion relation holds for a matter particle. A photon, on the other hand, is massless and has the dispersion relation E = pc.²

¹ Namely, by giving the speed as a function of the wavelength.

The central question that Schrödinger asked, then, is as follows. We know that, at least semi-classically, electrons have energy given by (12). But they are also waves of the type:

Ψ(x, t) = A e^{i(kx−ωt)} .   (13)

So how can such a wave have energy (12)? The answer from optics would be: write down a wave equation! So following de Broglie's lead, Schrödinger had the idea of writing down a wave equation that would reproduce (12). The equation in question is the following:

iℏ ∂Ψ/∂t = −(ℏ²/2m) ∂²Ψ/∂x² .   (14)

As you can see by filling (13) into this equation, (14) looks like:

iℏ (−iω) A e^{i(kx−ωt)} = −(ℏ²/2m) (−k²) A e^{i(kx−ωt)} ,   (15)

which holds if: ω = ℏk²/2m, which is precisely the condition (12). So the expression for
So the expression for 2m the energy does appear as a consequence of Schrödinger's wave equation (13): the wave which holds if: ω = equation forces on us a relation between the angular frequency k, ω and the wave number and this relation is nothing but the classical energy relation (12). This is nice: waves seem to describe electrons too! The is the one-dimensional Schrödinger equation for a free particle (free electron), i.e. one with zero potential energy hence satisfying (12). Now if there is potential energy V around, (12) will change to: E= p2 +V , 2m (16) again a result from classical mechanics. So it is natural to modify the Schrödinger equation as follows: i~ Here, ∂Ψ ~2 ∂ 2 Ψ =− + V (x)Ψ(x) ≡ HΨ(x) . ∂t 2m ∂x2 V (x) is the potential function, (17) that is the function from which (in classical mechan- ics) the force can be derived. And the Hamiltonian H=− H ~2 ∂ 2 + V (x) . 2m ∂x2 2 Ironically, was dened as: (18) quantum theory originated from considerations of the wave vs. particle nature of photons, but the Schrödinger equation only describes matter particles such as electrons and protons which, because they are massive, travel at low speeds. Photons travel at the speed of light and one needs to take relativistic eects into account in order to describe them quantum mechanically, which in turn means that one needs to generalize the Schrödinger equation and replace it by an appropriate equation incorporating special relativity. The reason the Schrödinger equation works for massive particles such as the electron is that it is based on the non-relativistic limit of the energy (12), which can be conveniently `quantized', as I show below. 7 The Hamiltonian is an tum mechanics. operator (a kind of derivative) that represents energy in quan- If you have studied the Hamiltonian formalism in classical mechanics, you might remember that classically the Hamiltonian is the sum of the kinetic and the potential energy but an operator: Hψ(x) ≡ to it a functions, T + V . 
The Hamiltonian we just defined is not a function but an operator: just like the derivative ∂/∂x (also an operator), it acts on functions: Hψ(x) ≡ −(ℏ²/2m) ∂²ψ/∂x² + V(x)ψ(x). That is, given a (wave) function ψ(x), an operator assigns to it a new wave function Hψ(x), defined by the equation above. In quantum mechanics, classical quantities such as the energy are replaced by operators. A quantum system may not always have a definite energy, but the energy operator can always be defined. We will see later on under what conditions we reproduce classical formulas like (16). The above is not a derivation of the Schrödinger equation. All I have done is motivate that, given classical formulas such as (12), and given the assumption that particles are also waves like (13), we can write down a formula that reproduces (12) and hence encompasses both principles. But I have not derived it. Then, based on classical mechanics, I made it plausible that (17) is the right generalization to the case of non-vanishing potential. Much of what we will do in this course is solving the Schrödinger equation (17) for various forms of the potential V(x), and finding out precisely under what conditions we can make predictions of the type (16).

1.3 The wave function

We will now look in more detail at the interpretation of the wave function. Before we proceed, let me remark that the wave function Ψ is necessarily a complex quantity. We cannot stick to its real or to its imaginary part (as we would do with, for instance, classical electromagnetic waves) because the time evolution will normally develop an imaginary part, even if we start with a purely real wave function. The reason is in the form of the Schrödinger equation (17). Taking the complex conjugate of this equation, we get a different equation:

−iℏ ∂Ψ*/∂t = −(ℏ²/2m) ∂²Ψ*/∂x² + V Ψ* .   (19)

Notice the difference in sign on the left-hand side compared to (17).
Thus, Ψ and Ψ* satisfy different equations, which means that the real and the imaginary part of the wave function contain different information. Roughly speaking, Ψ* is the mirror image of Ψ under t → −t, and if Ψ propagates `forward' in time, we can say that Ψ* propagates `backward'. In other words, it is not enough to look for real solutions of the Schrödinger equation, as we would be losing essential information.

Interpretation. Schrödinger originally thought (although he was careful enough not to write this in his paper) that the absolute value squared of the wave function, |Ψ|² = ΨΨ*, could be interpreted as some kind of charge density or particle density distributed over space. This was analogous to the classical theory of light, where the intensity of the light is proportional to the square of the field. As it turns out, this interpretation does not sit well with quantum mechanics. Consider doing the double-slit experiment with a single particle. Repeating the experiment many times, the pattern that appears is described by the distribution |Ψ(x, t)|². Since every time we do the experiment there is just one particle in the set-up, this means that the wave function is not the average particle density in a single experiment, but rather an average over many experiments. This is usually called an ensemble average. It is Max Born who is responsible for the interpretation of |Ψ|² as a probability density. |Ψ(x, t)|² dx is the probability to find, upon detection, a particle within a small neighbourhood (x, x + dx) of x at time t.³ Notice the insistence on detection: this is necessary because before we measure the particle we cannot make any assumptions about its location. In the double slit experiment that we discussed earlier, we cannot say which slit the particle went through unless we measure it. This is another reason why |Ψ(x, t)|² cannot be interpreted as a `particle density': the particles cannot be localized until we measure them.
Let me now further motivate why the probability grows like the square of the wave function, instead of the wave function itself. I have already mentioned that, for waves, only amplitudes such as |Ψ|, and not Ψ itself, can have a physical meaning. This is because Ψ can become negative, even complex (something we don't want for a probability). Now we also must explain why |Ψ|², and not |Ψ|, is the relevant quantity. This follows from the wavelike nature of the wave function, and I will motivate it with the following examples:

1. The electromagnetic field. The intensity of radiation (the intensity of the radiation emitted by, say, an antenna, or the amount of light emitted by a light bulb; both manifestations of the electromagnetic field) grows proportionally to the square of the electromagnetic field E, not linearly with the field⁴. And it is the electromagnetic field which satisfies the differential equations of electromagnetism (called Maxwell's equations), something analogous to our Schrödinger equation. This intensity of light is proportional to the number of photons with the given frequency (and, hence, to the density of these photons). It makes sense that the probability for a particle to have a given frequency be the quantum mechanical quantity corresponding to the number of particles with a given frequency in classical electromagnetism.

2. The harmonic oscillator. We have seen that the Schrödinger equation could be interpreted as reproducing the relation between the energy and the momentum (12) (for the case of a simple wave like (13)). We extended it to (17) in the case of non-zero potential. So let us actually compute this energy for a familiar classical system with non-zero potential, the harmonic oscillator. The potential is V = ½kx² and the energy function is H = ½mẋ² + ½kx². Solving Newton's equation gives x = A sin(ωt + φ); plugging this back into the energy function and using ω = √(k/m), we get the familiar result: E = ½kA², indeed proportional to the square of the amplitude. So, again, classical energies behave like squares of amplitudes.

3. Conservation of probability. Taking P ∝ |Ψ|², one can prove a `probability conservation theorem' completely analogous to the charge conservation theorem in electrodynamics (if charge disappears, it must be taken away by a current, i.e. the time derivative of the charge density is minus the divergence of the current, ∂ρ/∂t = −∇ · J). The analogous result for probabilities involves the time derivative of the probability (see problem 1.14 of [1]). This only works if one takes the square of the amplitude to be the probability.

4. Interference. If, in view of these arguments, we accept the identification of the probability distribution with the square of the wave function, we still have to explain how interference appears. What I will show now is that interference appears when we add up the wave functions, ψ = ψ1 + ψ2, rather than the probabilities. Consider again the two-slit experiment, where we now close one of the slits.

³ It is incorrect to say that |ψ(x)|² is the probability to find the particle at point x. The probability to find a particle exactly at point x is in fact zero. On the other hand, the probability to find the particle in the neighborhood (x, x + dx) is infinitesimal and non-zero. The probability to find a particle in a finite interval (a, b) is P(a, b) = ∫_a^b |ψ(x)|² dx. From the latter formula we see that, for any point x, P(a, x) = ∫_a^x |ψ(y)|² dy (notice the different name for the dummy variable y), from which it follows that ∂P/∂x = |ψ(x)|². Hence |ψ(x)|² itself can be interpreted as the spatial rate of change, or gradient, of the probability at point x.

⁴ The intensity of radiation is defined as the amount of energy emitted per time per unit area, that is, the power transported across a given unit area perpendicular to the flow. It is measured in units of W/m².
Call ψ1(x) the wave function at point x on the screen when only one slit is open (the left one, say), and ψ2(x) when only the other is open. According to the superposition principle of waves, when both slits are open the wave function is ψ(x) = ψ1(x) + ψ2(x). Now because P ∝ |ψ1(x) + ψ2(x)|², this is different from the sum of the probabilities. There is a cross term in the square, and this cross term is responsible for interference. You can easily simulate this yourself by writing the following Mathematica program:

a = 0.5; b = 5;
psi1[x_] := Exp[-(x+a)^2/2];
psi2[x_] := Exp[-(x-a)^2/2];
psi[x_] := psi1[x] - psi2[x];
Plot[psi1[x], {x,-b,b}]
Plot[psi2[x], {x,-b,b}]
Plot[psi[x], {x,-b,b}]
Plot[psi1[x]^2 + psi2[x]^2, {x,-b,b}]

The outcome of the program is depicted in Figure 1. As you see from the picture, there is an interference pattern that follows when we add up the wave functions rather than the probabilities themselves. The result of adding up the probabilities is in the last picture. You see that in that case interference completely disappears. Hence, it is important that we add the wave functions, ψ = ψ1 + ψ2 (corresponding to P ∝ |ψ1 + ψ2|²), rather than the probabilities (P ≠ P1 + P2). Hopefully the above arguments have convinced you that we should interpret |Ψ(x, t)|² as the probability of finding the particle within a region (x, x + dx) at time t. These arguments of course do not provide a derivation; they only make the assertion plausible. Like when we set up the Schrödinger equation, we are now constructing a new theory and we cannot give derivations from first principles. Otherwise this would be no new theory at all! So we write down tentative equations and try to uncover the fundamental principles. In the end, of course, we will know that the structure is right because it works: it solves many problems that would otherwise be intractable, and it agrees with the results of experiment. Both things tell us that quantum mechanics is right!
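The cross term can also be exhibited without Mathematica. Here is a minimal Python sketch with the same Gaussians (a = 0.5 as in the program above), showing that the probability with both slits open equals the sum of the single-slit probabilities plus a cross term, and hence differs from P1 + P2:

```python
import math

a = 0.5  # slit offset parameter, as in the Mathematica program above

def psi1(x):
    return math.exp(-(x + a)**2 / 2)

def psi2(x):
    return math.exp(-(x - a)**2 / 2)

x = 1.0  # sample point on the screen
P_interference = (psi1(x) - psi2(x))**2    # probability with both slits open
P_sum = psi1(x)**2 + psi2(x)**2            # naive sum of single-slit probabilities
cross = -2 * psi1(x) * psi2(x)             # the interference cross term

print(P_interference - (P_sum + cross))    # ~0: the square contains the cross term
print(P_interference - P_sum)              # clearly non-zero: interference matters
```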
Figure 1: Two slit experiment (from left to right, and from top to bottom): 1) with only the left slit open; 2) with only the right slit open; 3) interference pattern when both slits are open; 4) classical result one obtains by simply adding up the probabilities.

2 Four Steps to Solve the Schrödinger Equation

In this section I will summarize and further explain the steps you have to take to solve the time-dependent Schrödinger equation as outlined in [1].

2.1 The Problem

The basic problem is to solve the time-dependent Schrödinger equation

iℏ ∂Ψ(x, t)/∂t = H Ψ(x, t) ,  H ≡ −(ℏ²/2m) ∂²/∂x² + V(x) ,   (20)

given a known initial wave function (also called the `initial condition') at time⁵ t = 0:

Ψ(x, 0) = Ψ0(x) ,   (21)

where Ψ0(x) is a given function. I make the distinction between Ψ and Ψ0 because Ψ(x, t) is the function we want to solve for (we only know it at one particular point in time, t = 0), whereas Ψ0 is a given function of x. Griffiths doesn't use Ψ0, only Ψ(x, 0). The solution to the above problem consists in a four-step procedure.

⁵ One may as well choose the initial condition at any other initial time t = t_in.

2.2 Step 1: Reduce to TISE

Look for special solutions of the type

Ψ(x, t) = ϕ(t) ψ(x) .   (22)

These solutions are called separable. Filling this ansatz into the Schrödinger equation, we can completely solve for ϕ and find another equation for ψ:

ϕ(t) = e^{−(i/ℏ)Et}   (23)

H ψ(x) = E ψ(x) ⇒ −(ℏ²/2m) d²ψ(x)/dx² + V(x) ψ(x) = E ψ(x) .   (24)

Notice that I write partial derivatives ∂/∂t for functions such as Ψ(x, t) that depend on several variables, and regular derivatives d/dx for functions such as ψ(x) that depend on a single variable. Equation (24) is called the time-independent Schrödinger equation. Filling the ansatz (22) into the time-dependent equation, we were able to solve for ϕ in (23). This gave us an integration constant E, which reappears in (24) and can be interpreted as an energy (though not `the energy of the system', a concept which may not always be well defined).
Solving (24) will give us ψ(x) as well as the allowed values of the energy E, which is called the spectrum of the Hamiltonian H. The spectrum depends on the potential V(x). In general, (24) has many solutions (usually an infinite number of them). For instance, ψ(x) = sin nx for positive integers n = 1, 2, 3, . . . In this case, the spectrum is called discrete (as there are discrete energy levels labeled by n = 1, 2, 3, . . .). Accordingly, we label both the energies and the wave functions by n in this case:

H ψn(x) = En ψn(x) .   (25)

The main problem now is to solve this equation, imposing the appropriate boundary conditions (this will be dealt with in chapter 3 for different types of potentials). For instance, for the infinite square well it turns out that En = ℏ²π²n²/(2ma²). If the spectrum is continuous rather than discrete, the energies are labeled by a continuous variable that we usually call k, the wave number. For instance, for the free particle: Ek = ℏ²k²/(2m), and the wave functions are ψk(x) = e^{ikx}. In this case, k is the wave number, with k > 0 for right-moving, k < 0 for left-moving waves. Remember that the wave number relates to the momentum of the wave by the de Broglie formula (11). In this case, the energy is given by (12). To summarize this step, we have reduced the time-dependent Schrödinger equation (a partial differential equation in t and x) to the time-independent Schrödinger equation (a second order differential equation in x). It is useful to think of (25) as an eigenvalue problem. H is an operator (an object which, like a matrix, acts on a vector space) and ψn are the eigenfunctions, with corresponding eigenvalues En. Solving this eigenvalue problem gives us, like in linear algebra, both the eigenvalues and the eigenfunctions.
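As a concrete check of this eigenvalue point of view, the sketch below verifies numerically that ψn(x) = sin(nπx/a) is an eigenfunction inside the infinite square well (where V = 0), with the quoted energy En = ℏ²π²n²/(2ma²). The units ℏ = m = a = 1 and the level n = 2 are arbitrary illustrative choices:

```python
import math

hbar, m, a, n = 1.0, 1.0, 1.0, 2  # illustrative units and level number

def psi(x):
    # infinite-square-well eigenfunction (up to normalization)
    return math.sin(n * math.pi * x / a)

x0, h = 0.3, 1e-4
second_deriv = (psi(x0 + h) - 2 * psi(x0) + psi(x0 - h)) / h**2

# H psi = E psi with V = 0 inside the well, so E = -(hbar^2/2m) psi''/psi
E_from_tise = -hbar**2 / (2 * m) * second_deriv / psi(x0)
E_formula = hbar**2 * math.pi**2 * n**2 / (2 * m * a**2)
print(E_from_tise, E_formula)
```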
We will develop this point of view further when we discuss the formalism.

2.3 Step 2: General Solution of TDSE

Once you have the solutions ψn(x) and En of the TISE (25), you can move on to get the general solution of the TDSE (20). It is given by:

Ψ(x, t) = Σn cn e^{−(i/ℏ)En t} ψn(x) .   (26)

It is straightforward to show that, if the individual solutions (22) solve (20), then a linear superposition of them like in (26) is also a solution. The reason is that (20) is a linear equation, and its solutions obey the superposition principle: if we have a set of solutions (22), we can add them together with arbitrary coefficients cn and the result is also a solution. But the key point to understand here is that (26) in fact gives us the most general solution of the TDSE (20). Indeed, whereas the separation of variables (22) was an ansatz (an assumption; in other words, we only obtained a specific solution), one can show that any solution of the TDSE is in fact of the form (26). To show this in general is a little tricky because it actually depends on the form of the potential function V(x). But the idea, which we will later on substantiate for specific potentials V(x), is as follows. (20) is a first-order partial differential equation in t, which we are solving in a two-step procedure of first solving for x and then for t. So we regard this equation as an ordinary, linear differential equation in t (i.e. we treat x as a constant). Since the equation is first order in t (only one t derivative), it depends on a single `integration constant', which is our initial condition Ψ0(x) (which I have not yet specified; therefore, for the time being, Ψ0 is a generic function). The theory of first-order differential equations tells us that, if I can choose cn in (26) such that Ψ(x, t) satisfies the initial condition Ψ0(x), then (26) is the unique solution associated with that Ψ0(x). Remember, a first order linear differential equation has only one integration constant/boundary condition (in this case, Ψ0).
Since Ψ0(x) was generic to start with, this is the same as saying that (26) is the most general solution (i.e. for any boundary condition) of the Schrödinger equation regarded as a first-order differential equation in t. In the continuous case, we replace n → k, cn → φ(k), and the sum becomes an integral:

Ψ(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{−(i/ℏ)Ek t} ψk(x) .   (27)

For the simple case of a free particle, ψk(x) = e^{ikx} and this becomes:

Ψ(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{i(kx − ℏk²t/2m)} .   (28)

2.4 Step 3: Impose Initial Condition Ψ0(x)

In this step, we choose the coefficients cn in (26) such that our initial condition (21) is satisfied. If we can do this, then we are left with the unique solution of the Schrödinger equation. Physically, what we are doing is imposing that the most general solution of the Schrödinger equation Ψ(x, t) agrees with our experimental situation at time t = 0. We have prepared our system (e.g. a system of electrons with spin up) in a particular state Ψ0(x) at time t = 0. Given this particular state, Ψ(x, t) tells us, via the Schrödinger equation, how the system evolves in time. For instance, H might contain an interaction between the spins and an external magnetic field, and so some of the spin states of the electrons may change in time. So, H encodes the dynamics of the system. All we have to do is set t = 0 in (26) or (27) for the discrete/continuous case, respectively. We get:

Ψ0(x) = Σn cn ψn(x) ,  discrete spectrum
Ψ0(x) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) ψk(x) ,  continuous spectrum   (29)

So in this step, the task is to find cn or φ(k) such that (29) is satisfied. It is here where we have to use the specific knowledge about ψn and ψk. Once we have done this, we have shown that (26)-(27) satisfy the given boundary condition. Before I show how to do this in practice, we must reflect on the following question: does (29) always have a solution? Can we always be sure that this is the case? Can I always find
This is the same as asking whether Ψ(x, t) in (26)-(27) was in fact the most general solution of the TDSE, and I already anticipated that the answer was yes. Now we see how this works: the reason I can always find cn and φ(k) to solve (29) is that the functions ψ_n(x) and ψ_k(x) form what is called a complete set of functions. This means that any (nice enough) function Ψ0(x) can be written as a linear superposition of them, precisely in the form (29). Mathematicians have shown this⁶, and we will see it in some specific cases during the course.

Remember that to solve (29) explicitly we need to know ψ_n(x) and ψ_k(x), which means we have made a choice of potential V(x) and solved (24). A very useful example to work out is the free particle, where ψ_k(x) = e^{ikx}. Equation (29) then becomes the Fourier decomposition of Ψ0(x):

Ψ0(x) = 1/√(2π) ∫_{−∞}^{∞} dk φ(k) e^{ikx} .  (30)

We say that φ(k) is the Fourier transform of Ψ0(x). Fourier theory now tells us, via Plancherel's theorem, how to find φ(k) given Ψ0(x):

Plancherel's Theorem. Let f(x) and F(k) be two square integrable functions over the real line, i.e. the integrals ∫_{−∞}^{∞} dx |f(x)|² and ∫_{−∞}^{∞} dk |F(k)|² are finite. Then the following holds:

f(x) = 1/√(2π) ∫_{−∞}^{∞} dk F(k) e^{ikx}  ⇔  F(k) = 1/√(2π) ∫_{−∞}^{∞} dx f(x) e^{−ikx} .  (31)

This precisely allows us to solve for φ(k) in (30): assuming that Ψ0 and φ(k) are square integrable (in fact, since |Ψ0(x)|² is a probability density, it should integrate to one!), we see that (30) is of the form of the l.h.s. of (31) if we identify f(x) = Ψ0(x) and φ(k) = F(k). Therefore Plancherel's theorem tells us that the r.h.s. is true:

φ(k) = 1/√(2π) ∫_{−∞}^{∞} dx Ψ0(x) e^{−ikx} .  (32)

⁶ This is valid under certain conditions for the potential V(x), which we will normally assume to be polynomial.

This is how step 3 looks for a free particle: given any boundary condition Ψ0(x) at t = 0, we find the unique infinite set of coefficients φ(k) that solve the boundary condition (29).
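The Fourier pair (31)-(32) can be checked numerically. The sketch below is our own illustration (it is not part of the notes; the test function, the grids, and the cutoffs are arbitrary choices, in units with ħ = 1): it computes φ(k) from a sample Ψ0 by direct quadrature, transforms back, and verifies that Ψ0 is recovered and that the norm is preserved.

```python
import numpy as np

# Hypothetical numerical check of the Fourier pair (31)-(32).
def integrate(fvals, h):
    # plain Riemann sum; accurate here because the integrands decay
    # to (numerically) zero well before the endpoints of the grid
    return fvals.sum(axis=-1) * h

x = np.linspace(-12.0, 12.0, 1201)
k = np.linspace(-12.0, 12.0, 1201)
hx, hk = x[1] - x[0], k[1] - k[0]

psi0 = np.exp(-x**2) * (1 + 0.5 * np.sin(2 * x))       # some smooth L^2 function
psi0 = psi0 / np.sqrt(integrate(np.abs(psi0)**2, hx))  # normalize to 1

# phi(k) = (1/sqrt(2 pi)) Int dx Psi0(x) e^{-ikx}   -- cf. (32)
phi = integrate(psi0 * np.exp(-1j * np.outer(k, x)), hx) / np.sqrt(2 * np.pi)
# reconstruct Psi0 from phi(k)                       -- cf. (31)
psi_back = integrate(phi * np.exp(1j * np.outer(x, k)), hk) / np.sqrt(2 * np.pi)

print(np.max(np.abs(psi_back - psi0)) < 1e-8)            # round trip recovers Psi0
print(abs(integrate(np.abs(phi)**2, hk) - 1.0) < 1e-8)   # norm is preserved
```

The second check is Plancherel's theorem in action: if Ψ0 has unit norm, so does its Fourier transform.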
2.5 Step 4: Plug Back into Ψ(x, t)

In the previous step we found the coefficients cn or φ(k) such that the boundary condition (29) is satisfied. We now have to plug this back into the general solution of the Schrödinger equation, (26) or (27), to get the full solution. Of course, to obtain a completely explicit solution one has to carry out the summation or the integral.

Don't take the above four-step procedure as if it were written in stone. Sometimes you may simply be interested in the wave functions ψ_n(x) and will not bother to calculate the coefficients cn; sometimes step 4 is trivial because only a few terms contribute; or sometimes you will jump directly to step 3 because you already know the wave functions.

2.6 Example: Gaussian Wave Function

We will now apply this procedure to the case of a free particle, which means V(x) = 0. We will take as our initial wave function a Gaussian distribution:

Ψ0(x) = (2a/π)^{1/4} e^{−ax²} .  (33)

Some comments about the physics before we start. This wave function represents a particle localized around x = 0 with standard deviation σ = 1/(2√a). So when a is large, the particle is well localized around x = 0, whereas if a → 0, the probability distribution flattens out and the particle becomes more and more delocalized in space. Using the basic Gaussian integral (85), we check that the wave function is indeed normalized to one: ∫_{−∞}^{∞} dx |Ψ0(x)|² = 1.

Step 1. The TISE (24) reduces to:

−(ħ²/2m) d²ψ(x)/dx² = E ψ(x) .  (34)

This equation is readily solved; the solutions are sinusoidal. We can write them as ψ(x) = A sin kx + B cos kx, or alternatively as complex exponentials, as we saw in section 1.1. We will take exponentials:

ψ_k(x) = e^{ikx} ,  k = ±√(2mE_k)/ħ  ⇔  E_k = ħ²k²/2m .  (35)

The second line follows from filling the wave function into (34).
Step 2. We fill this into the general solution of the Schrödinger equation, which gives us the formula I wrote earlier:

Ψ(x, t) = 1/√(2π) ∫_{−∞}^{∞} dk φ(k) e^{i(kx − ħk²t/2m)} .  (36)

Here, we took both the k > 0 and the k < 0 solutions into account⁷ by integrating from minus infinity to plus infinity with arbitrary coefficients φ(k).

Step 3. Impose the initial condition Ψ0(x). We use Plancherel's theorem directly for the initial wave function at hand:

φ(k) = 1/√(2π) ∫_{−∞}^{∞} dx Ψ0(x) e^{−ikx} = 1/√(2π) (2a/π)^{1/4} ∫_{−∞}^{∞} dx e^{−ax²} e^{−ikx}
     = (a/2π³)^{1/4} ∫_{−∞}^{∞} dx e^{−ax² − ikx} .  (37)

The integral can once again be done using the basic Gaussian integral (85). To bring it to this form, we need to cancel the −ikx in the exponent. This can be done by a change of variables. One readily checks that the following does the job⁸:

x = y − ik/2a .  (38)

One way to see why this trick works is to rewrite the term in the exponent as follows:

−ax² − ikx = −ax (x + ik/a) ≡ −a (y² + C) .  (39)

We impose the last equality because we want to get a Gaussian integral, up to a constant C but with no linear term in x. From this form it is more or less obvious that we can complete the square as follows:

−ax (x + ik/a) = −a (y − ik/2a)(y + ik/2a) = −a (y² − (ik/2a)²) = −a (y² + k²/4a²) = −ay² − k²/4a .  (40)

In other words, to find the trick (38) I just add a constant to x such that in the end I get (y − c)(y + c) = y² − c².

⁷ As in the sinusoidal representation, the general solution for given k is actually ψ_k(x) = A e^{ikx} + B e^{−ikx}. However, as mentioned in the text, the negative solution is automatically taken into account by the fact that we integrate over both positive and negative k. It is also unnecessary to include the normalization constant A, as overall normalizations are taken care of by the φ(k) with which this wave function is multiplied (in other words, A can always be reabsorbed in φ(k)).
⁸ One should worry about the fact that this change of variables involves an imaginary shift of the integration variable, by −ik/2a. However, for real and positive a the integrand is everywhere finite, and it follows from complex analysis that this change of variables can be done.

This way of reducing integrals of exponentials containing a Gaussian piece to a pure Gaussian is an important trick. We also have dx = dy, and the integration range is the same. Hence:

φ(k) = (a/2π³)^{1/4} ∫_{−∞}^{∞} dx e^{−ax² − ikx} = (a/2π³)^{1/4} e^{−k²/4a} ∫_{−∞}^{∞} dy e^{−ay²} = (a/2π³)^{1/4} √(π/a) e^{−k²/4a}
⇒ φ(k) = (1/2πa)^{1/4} e^{−k²/4a} .  (41)

To summarize the trick, we have derived the following generalized Gaussian formula:

∫_{−∞}^{∞} dx e^{−αx² − iβx} = √(π/α) e^{−β²/4α} .  (42)

Step 4. Plug back into Ψ(x, t). Now we can plug this back into the solution of the TDSE (28):

Ψ(x, t) = 1/√(2π) ∫_{−∞}^{∞} dk φ(k) e^{i(kx − ħk²t/2m)} = (1/8aπ³)^{1/4} ∫_{−∞}^{∞} dk e^{−k²/4a} e^{i(kx − ħk²t/2m)}
        = (1/8aπ³)^{1/4} ∫_{−∞}^{∞} dk e^{ikx − k²(1/4a + iħt/2m)} .  (43)

This last integral is of the same type as the one before: it is a generalized Gaussian integral. The only difference is that we now integrate over k instead of x, but we can apply formula (42) all the same, since x and k are dummy variables. We get (see exercise 3.2):

Ψ(x, t) = (2a/π)^{1/4} · 1/√(1 + 2aiħt/m) · e^{−ax²/(1 + 2aiħt/m)} .  (44)

This is the final result for the wave function for the given boundary condition (33). As you see, it is a completely explicit function of time. For its physical interpretation, see exercise 3.2.

3 One-Dimensional Potentials

3.1 General Theorems

There are some useful theorems that save you work when solving the time-dependent Schrödinger equation with one-dimensional potentials V(x). These theorems are presented in problems 2.1 and 2.2 of [1]. I summarize them here.

1) The assumption that (whenever this is possible) solutions of the Schrödinger equation should be normalizable implies the following:
a) E ∈ ℝ. That is, we can always take the energy to be real.
b) E > V_min. The energy is bounded from below.

Remark.
Normalizable solutions do not always exist. For instance, plane waves ψ(x) = e^{ikx} are not normalizable on the range x ∈ (−∞, ∞), because ∫_{−∞}^{∞} dx |ψ(x)|² = ∫_{−∞}^{∞} dx = ∞. But we use normalizable solutions whenever they exist. Whenever a non-normalizable solution is used, one must have a physical motivation for doing so. For instance, in the case of plane waves, it is not surprising that they are non-normalizable, because they fill all of space. We regard them as approximations to more realistic situations that can be built by superpositions of plane waves. Superposing plane waves, we can get a normalizable solution, as in (28). Remember that we always only need to normalize the total wave function Ψ(x, t). In this sense, plane waves are like `fundamental building blocks' of more physical solutions of the TDSE, and as such they are useful.

2) In the TISE, ψ(x) can always be taken to be real. A complex solution of the TISE is always a linear combination of real solutions.

3) If the potential is even, V(x) = V(−x), then the even and odd solutions of the TISE can be analyzed separately. In this case, any solution ψ(x) of the TISE can be written as a linear combination of an even and an odd solution. This is useful when we consider bound states, because it means that we can solve the Schrödinger equation on one side of the potential (x > 0, say) and we automatically obtain the other side by use of the symmetry. On the other hand, we often cannot apply this to scattering states, because there is an explicit breaking of the symmetry by the boundary conditions (which are generically different at plus and minus infinity). See the next section.

3.2 The TISE with one-dimensional potentials

As we saw in chapter 2, the first step in solving the TDSE is reducing it to the TISE. Here we concentrate on this step. There are a few things to remember:

• Normalizability imposes E > V_min (1b).
• Further, one distinguishes bound states and scattering states, because their physical behavior is completely different and they give independent contributions to the final wave function Ψ(x, t).

• Also, we solve the Schrödinger equation separately in each region in which the potential is a continuous, differentiable function (a smooth function).

• Bound states have the following generic behavior, which you should check:
  ✓ Exponential behavior (damped or exponential growth) in the exterior regions.
  ✓ The behavior in the interior regions (if there are any) depends on the details of the potential. It can be oscillatory (sine, cosine or plane wave) or exponential. See the example in Figure 2.
  ✓ Symmetric potential: in that case, it is useful to separate even/odd solutions, which leads to sines and cosines (or hyperbolic sines and cosines) in the interior region. Use exponentials in the exterior regions.
  ✓ Discrete spectrum: in that case, the values of E are limited (sometimes only a handful of solutions, or none, or an infinite number of them).
  ✓ Solve the energy equation graphically to obtain the spectrum.

Figure 2: Potential symmetric around x = 0, with narrow spikes at x = a and x = −a.

• Scattering states have the following generic properties:
  ✓ Oscillatory behavior. Use plane waves.
  ✓ Transmission and reflection coefficients are interesting quantities.
  ✓ There is an interesting physical interpretation in terms of waves with which one can probe a potential.
  ✓ Algebraic manipulations can be heavy if one has to match different regions.

• Impose (dis-)continuity conditions at each point at which the potential is non-differentiable (it has a kink, or it is singular at those points):
  ✓ If the potential is everywhere finite, then ψ and dψ/dx are continuous. Such a potential can be non-smooth, though.
  ✓ If the potential has singularities (points where it is infinite), then ψ is continuous but dψ/dx will have discontinuities at the singular points.
The discontinuities are known, and can be obtained by integrating the Schrödinger equation around the singular points, as in the steps leading to [2.125]. Obviously, an infinite potential is only an idealization of a very large one.
  ✓ If the potential is infinite in a whole region (more than a single point), the wave function must vanish there. For example: the infinite potential well.

4 Fourier Integrals and the Dirac Delta

4.1 The Dirac delta

Consider the following Gaussian function with unit area:

f(x) = 1/√(πε) e^{−(x−x0)²/ε} .  (45)

Using (85), one easily sees that ∫_{−∞}^{∞} dx f(x) = 1. The standard deviation of this Gaussian is σ = √(ε/2) (see Exercise 1.3b) of [1]), so clearly √ε determines the width of the distribution, and f(x0) = 1/√(πε) its height.

Consider what happens when we make ε smaller and smaller: the peak clearly becomes higher and higher (in order for the function to preserve its unit area) and the distribution becomes narrower and narrower. In the limit, the height of the distribution is infinite and its width goes to zero, but the area stays equal to one. We can picture what happens in the limit as follows. Replace the Gaussian by a square of width √ε and height 1/√ε, such that the total area under the square is √ε × 1/√ε = 1, i.e. a finite limit. Now since in the limit the height of the distribution is infinite but the width goes to zero, we can think of a function δ(x) that takes the following values:

δ(x − x0) = 0 for x ≠ x0 , ∞ for x = x0 ;   ∫_{−∞}^{∞} dx δ(x − x0) = 1 .  (46)

This is called the Dirac delta distribution. It is called a distribution because it is not a proper mathematical function (it has an explicit divergence at one point; this divergence, however, is mild enough that it can be treated in the more general theory of distributions, which goes beyond the scope of these notes). Our treatment here has been mathematically heuristic, but sufficient. We argued that this function is the following limit:

δ(x − x0) = lim_{ε→0} 1/√(πε) e^{−(x−x0)²/ε} .  (47)

To check the integration property (46), we first perform the integral and then take the limit ε → 0. It now also follows that for any function f(x):

∫_{−∞}^{∞} dx δ(x − x0) f(x) = f(x0) .  (48)

From the fact that the Dirac delta vanishes everywhere except at x = x0, we have δ(x − x0) f(x) = δ(x − x0) f(x0) for all x. Now we can take f(x0) outside the integral, and evaluate the integral using (46). The result is (48).

The Dirac delta is paramount in many applications in physics, where we want to mimic situations with `spikes'. If we identify t = ε as time, it turns out that (45) satisfies the heat equation. This Gaussian for instance describes heating up an aluminium bar. Initially, at t = 0, the temperature is very high at the point where the material is being heated (x = x0), but as the material goes to thermal equilibrium the temperature quickly drops, and the temperature profile looks like a blob centered around the origin but spreading in time, as heat exits the center. See Figure 3.

Figure 3: Heat distribution, all energy concentrated at x = 0 at t = 0, spreading in space as time increases according to a Gaussian distribution.

4.2 Fourier transformations

Plancherel's theorem (31) tells us that, for every square-integrable function f(x), there is a unique square-integrable function F(k) from which f(x) can be reconstructed. F(k) is called the Fourier transform of f(x). Its physical significance is that it gives an expansion of the function f(x) in terms of plane waves e^{ikx} with a well-defined wavelength λ = 2π/k. Thus, f(x) is written as a superposition of (an infinite number of) monochromatic waves, one for each frequency. Fourier analysis is used in many branches of physics. Decomposing a sound signal into a spectrum of frequencies of sound is an example of Fourier analysis.
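The limiting construction of section 4.1 can be checked numerically. The following sketch is our own illustration (not part of the notes; the test function f and the point x0 are arbitrary choices): it integrates the regulated Gaussian delta (47) against a smooth f and watches the result converge to f(x0), as the sifting property (48) demands.

```python
import numpy as np

# Sifting property (48) with the regulated delta (47):
# Int dx delta_eps(x - x0) f(x) -> f(x0) as eps -> 0.
f = lambda x: np.cos(x) + 0.3 * x**2   # an arbitrary smooth test function
x0 = 0.7

x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]

errors = []
for eps in (1.0, 0.1, 0.01):
    delta_eps = np.exp(-(x - x0)**2 / eps) / np.sqrt(np.pi * eps)  # unit area
    smeared = (delta_eps * f(x)).sum() * dx                        # Int dx delta_eps * f
    errors.append(abs(smeared - f(x0)))

# the error shrinks roughly linearly in eps
# (the leading correction is (eps/4) f''(x0), since the variance is eps/2)
print(errors[0] > errors[1] > errors[2] and errors[2] < 1e-3)
```

The same experiment makes the heat-equation picture above concrete: the smeared value is just the average of f over a Gaussian temperature profile of width √(ε/2).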
Filling the second equation of (31) into the first, we find the following relation:

f(x) = 1/√(2π) ∫_{−∞}^{∞} dk e^{ikx} · 1/√(2π) ∫_{−∞}^{∞} dx′ f(x′) e^{−ikx′} = 1/2π ∫_{−∞}^{∞} dk ∫_{−∞}^{∞} dx′ f(x′) e^{ik(x−x′)}
     = ∫_{−∞}^{∞} dx′ f(x′) · 1/2π ∫_{−∞}^{∞} dk e^{ik(x−x′)} .  (49)

The integral shifts are allowed because we were careful about distinguishing x from the dummy variable x′ that is integrated over (and anything to the right of the integral sign is integrated over). Now comparing (49) with (48), we conclude that⁹

δ(x − x′) = 1/2π ∫_{−∞}^{∞} dk e^{ik(x−x′)} ,  (50)

where in comparing the two we have relabeled x0 ↔ x′. Let us evaluate this expression. When x = x′, the integrand is one, and the integral diverges. When x ≠ x′, we can evaluate the integral explicitly:

δ(x − x′) = 1/2π [ e^{ik(x−x′)} / i(x − x′) ]_{−∞}^{∞} = 0 .  (51)

(Strictly, the boundary term oscillates rather than converging; it vanishes in the distributional sense, i.e. after smearing against a test function.) Hence this agrees with our previous definition (46).

The expression (50) we have obtained for the Dirac delta is a very useful one. Comparing it to Plancherel's theorem (31), taking f(x − x′) = δ(x − x′), we find from the left-hand side of (31) that:

F(k) = 1/√(2π) .  (52)

In other words, the Dirac delta is the Fourier transform of a constant! This agrees with our intuition about position vs. momentum space: a function that is completely localized at some point in space corresponds, under Fourier analysis, to a constant function, i.e. a superposition of all wavelengths, all with the same amplitude.

5 The Formalism of Quantum Mechanics

5.1 Why?

If you have read a text on the formalism of quantum mechanics, you might have been left with the question: how is this useful? Why study it? There are some important reasons why it pays off to learn quantum mechanics in this high-brow language of Hilbert spaces:

1) For the simple case of finite Hilbert spaces, the formalism simply reduces to the algebra of matrices (diagonalization of matrices, etc.).

2) In general, the formalism allows for a unified conceptual treatment of continuous and discrete cases by means of linear algebra.
3) The formalism is independent of the particular basis you choose. Describing your system in terms of its positions or in terms of momenta does not make any difference, just as you can describe a wave either by specifying its spatial profile or by specifying the elementary frequencies that the wave is built up from. It is a choice of basis, and the formalism shows that quantum mechanics does not depend on a choice of basis.

4) It is conceptually simple: it allows us to formulate the postulates of quantum mechanics in a clear and concise way, and in particular it allows for a natural introduction of the measurement postulate. Some approaches to quantum mechanics do not need a measurement postulate (or they claim they don't), but in the standard Copenhagen interpretation this is how the connection between theory and experiment is made.

⁹ This equation certainly makes (49) and (48) compatible. But is it the unique solution? In fact, it is. Assume that the two sides of (50) differed by some function g(x − x′). This function would have to be such that ∫_{−∞}^{∞} dx′ f(x′) g(x − x′) = 0 for all f ∈ L²(ℝ). But the only such function is g(x) = 0. Hence the solution (50) is unique (even if, for the reasons explained, the Dirac delta is not a proper function).

5.2 The Postulates of Quantum Mechanics

Von Neumann gave an axiomatic formulation of quantum mechanics, similar to what Einstein had done for special relativity. The great advantage of this approach is that, if you want to generalize the theory, all you have to do is modify one or several of its postulates. Similarly, if something turns out to be `wrong' with the theory, the axiomatic structure makes it easier to trace the inconsistency back to one or several of the axioms.
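Point 1) of section 5.1 is worth making concrete before stating the postulates. The sketch below is our own illustration (not from the notes; the 3×3 `observable' is a random Hermitian matrix): in a finite Hilbert space, an observable is a Hermitian matrix, its eigenvalues are real, the Born probabilities |c_n|² = |⟨ψ_n|Ψ⟩|² sum to one, and the expectation value is Σ_n |c_n|² q_n.

```python
import numpy as np

# Finite-dimensional Hilbert space C^3: observables are Hermitian matrices.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
Q = A + A.conj().T                       # Hermitian by construction: Q = Q^dagger

qn, vecs = np.linalg.eigh(Q)             # real eigenvalues, orthonormal eigenvectors
psi = rng.normal(size=3) + 1j * rng.normal(size=3)
psi /= np.linalg.norm(psi)               # normalized state |Psi>

c = vecs.conj().T @ psi                  # expansion coefficients c_n = <psi_n|Psi>
probs = np.abs(c)**2                     # Born probabilities
expval = (psi.conj() @ Q @ psi).real     # <Q> = <Psi|Q|Psi>

print(np.allclose(qn.imag, 0),                    # eigenvalues are real
      np.isclose(probs.sum(), 1.0),               # probabilities sum to one
      np.isclose(expval, (probs * qn).sum()))     # <Q> = sum_n |c_n|^2 q_n
```

Diagonalizing the matrix is exactly "solving the eigenvalue problem" of the infinite-dimensional case; everything in the postulates below has this finite-dimensional shadow.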
By explicitly including a projection postulate we are not only making it possible to interpret measurements, but also making explicit something that might turn out to be a weakness of the theory, rather than hiding it under the rug of the theoretical machinery. So here are the postulates¹⁰ [3]:

Postulate 1. State of the system: An ensemble of physical systems is completely described by a wave function or state function. The wave function lives in a Hilbert space, and it may be multiplied by an arbitrary complex number without altering its physical significance.

Postulate 2. Observables: Observables are represented by Hermitian (or self-adjoint) operators, Q̂ = Q̂†.
Justification: Hermiticity is the natural notion of `reality' for operators. A Hermitian operator has real eigenvalues, hence real expectation values (see the next postulate).

Postulate 3. Measurement postulate: The only result of a precise measurement of Q̂ is one of the eigenvalues q_n of Q̂. The probabilities that, upon measurement, we find q_n are given by |c_n|².
Justification: when we measure any quantity in the laboratory, we never observe an actual superposition¹¹. We always find a unique value for the observed outcome of the experiment, the energy, say. Since the eigenvalues are well-defined real numbers associated with a Hermitian operator, they are good candidates for measurement outcomes.

¹⁰ Different authors sometimes rank the postulates in a different order.
¹¹ Occasionally you will find a text, even in prestigious journals, where it is claimed that a superposition has been observed. Beware of such metaphorical phrasing! So far no superposition has been observed as an actual experimental outcome. We observe interference patterns after repeating an experiment many times, from which we infer that the state of the system was a superposition. The difference is subtle, but important. The detector either clicks, or it doesn't!
Postulate 4. Schrödinger postulate: The time evolution of the system is governed by the Schrödinger equation.

Postulate 5. Projection postulate: After a precise measurement of Q̂ with outcome q_n, the system is (shortly after measurement) in the state ψ_{q_n}.
The justification of this postulate is that repeated measurements of Q̂ should give the same result if the time interval between them is small. The only state with P(q_n) = 1 is ψ_{q_n} itself; therefore, after measurement, the system must be in the state ψ_{q_n}.

This postulate introduces what we call the `collapse of the wave function'. If prior to measurement the system is in the state:

|Ψ(t)⟩ = Σ_n c_n(t) |ψ_{q_n}⟩ ,  (53)

and after measurement it is in the state:

|Ψ⟩ = |ψ_{q_n}⟩ ,  (54)

we see that we have projected onto one of the components of (53) and lost all of the information about the c_n's prior to measurement. We can neither predict with certainty which state we will find upon measurement, nor can we, after measurement, use the Schrödinger equation to `trace back' the state (53).

5.3 Linear Algebra

In the table below I summarize the main concepts from linear algebra that are used in the description of quantum systems, including a detailed comparison with inner products in ℂ^N, with their respective notations:

linear algebra in ℂ^N | wave function | Hilbert space
vector: |α⟩ | Ψ(x, t) | |Ψ(t)⟩
basis: e_1 = (1, 0, …, 0)ᵀ, …, e_N = (0, …, 0, 1)ᵀ | ψ_n(x) (n = 1, 2, …, ∞) | |ψ_n⟩ or |n⟩
vector in a basis: α = Σ_{i=1}^N e_i α_i = (α_1, …, α_N)ᵀ | Ψ(x, t) = Σ_n c_n(t) ψ_n(x) | |Ψ(t)⟩ = Σ_n c_n(t) |ψ_n⟩
inner product: ⟨α|β⟩ = Σ_{i=1}^N α_i* β_i | ∫_a^b dx Ψ_1*(x) Ψ_2(x) | ⟨Ψ_1|Ψ_2⟩
orthonormal basis: ⟨e_i|e_j⟩ = δ_ij | ∫_a^b dx ψ_n*(x) ψ_m(x) = δ_nm | ⟨ψ_n|ψ_m⟩ = δ_nm
operator Q̂: linear transformation (matrix): β = T·α | Ψ_1 = Q̂ Ψ_2 | |Ψ_1⟩ = Q̂ |Ψ_2⟩
coefficients in a basis: α_i = ⟨e_i|α⟩ | c_n(t) = ∫_a^b dx ψ_n*(x) Ψ(x, t) | c_n(t) = ⟨ψ_n|Ψ(t)⟩
dual vector: ⟨α| | Ψ*(x, t) | ⟨Ψ(t)|
dual basis: e_iᵀ = (0, …, 0, 1, 0, …, 0) (1 in the i-th place) | ψ_n*(x) | ⟨ψ_n|
vector in the dual basis: Σ_{i=1}^N e_iᵀ α_i* = (α_1*, …, α_N*) | Ψ*(x, t) = Σ_n c_n*(t) ψ_n*(x) | ⟨Ψ(t)| = Σ_n c_n*(t) ⟨ψ_n|

The Hilbert space considered here is L²(a, b), that is, the square-integrable functions on (a, b):

∫_a^b dx |Ψ(x)|² < ∞ ,  (55)

where a, b can be finite or infinite (−∞, ∞).

5.4 Continuous spectra

With appropriate modifications, the above formulas also hold in the case of continuous spectra. For definiteness, we will consider the momentum operator p̂ = −iħ d/dx and the associated wave functions of a free particle moving along x ∈ (−∞, ∞). This is a continuous spectrum labeled by p = ħk. The eigenfunctions of p̂ are:

ψ_p(x) = 1/√(2πħ) e^{(i/ħ)px} ,  (56)

where the reason for the normalization will become clear in a moment. The above wave functions are not normalizable, because ∫_{−∞}^{∞} dx |ψ_p(x)|² = ∞. However, as long as we consider different p's, we do have the following orthonormality condition:

∫_{−∞}^{∞} dx ψ_{p′}*(x) ψ_p(x) = 1/2πħ ∫_{−∞}^{∞} dx e^{i(p−p′)x/ħ} = δ(p − p′) ,  (57)

where in the last equality we used the representation of the Dirac delta function introduced in (50), after interchanging x and k and introducing p = ħk (this also explains the reason for the 1/√ħ in the normalization of (56)). So we get:

⟨ψ_{p′}|ψ_p⟩ = δ(p − p′) ,  (58)

which is analogous to the orthonormality condition in the discrete case.

Let us generalize this picture to the eigenfunctions of an arbitrary observable Q̂. Let us assume that the eigenvalues are labeled q(z), where z is a continuous variable (z = k and q(z) = ħk in the previous example). They come from solving the eigenvalue problem:
Q̂ ψ_z(x) = q(z) ψ_z(x) .  (59)

The probability of finding a result q(z) in the range (z, z + dz) at time t is then given by |c(z, t)|² dz, where:

Ψ(x, t) = ∫ dz c(z, t) ψ_z(x) ,   c(z, t) = ⟨ψ_z|Ψ⟩ .  (60)

Notice that c(z, t) contains as much information as Ψ(x, t) does. There is a one-to-one correspondence between the two, which is brought out by Plancherel's theorem. In the specific case z = k, c(z, t) has a special name, Φ(p, t), and it is called the `wave function in momentum space':

Ψ(x, t) = 1/√(2πħ) ∫_{−∞}^{∞} dp e^{ipx/ħ} Φ(p, t)
Φ(p, t) = 1/√(2πħ) ∫_{−∞}^{∞} dx e^{−ipx/ħ} Ψ(x, t) .  (61)

Examples

a) If we take Q̂ = H for a free particle, then z = p (the momentum) and q(z) = E(p) = p²/2m. So these wave functions are the usual plane waves, ψ_p(x) = 1/√(2πħ) e^{ipx/ħ}. Alternatively, we use the wave number instead of the momentum, z = k and E(k) = ħ²k²/2m. The result is the same.

b) If we take Q̂ = p̂ for a particle allowed to move with any momentum p (for instance, a free particle), then the eigenvalue equation is p̂ ψ_p(x) = p ψ_p(x). Hence z = p and the eigenvalues are the momenta themselves, q(z) = p. Hence also ψ_p(x) = 1/√(2πħ) e^{ipx/ħ}.

c) If we take Q̂ = x̂, the eigenvalue equation is x̂ ψ_y(x) = y ψ_y(x). Then z = y and q(z) = y. The solutions of the eigenvalue equation are delta functions, ψ_y(x) = δ(x − y).

Once we have the basis ψ_z(x) that solves (59), we can expand the wave function in this basis:

Ψ(x, t) = ∫ dz c(z, t) ψ_z(x) .  (62)

In the above examples, this formula reproduces known results:

a) Ψ(x, t) = ∫ dp c(p, t) ψ_p(x) = 1/√(2πħ) ∫ dp c(p, t) e^{ipx/ħ}, which is the Fourier transform (61).
b) Idem.
c) Ψ(x, t) = ∫ dy c(y, t) ψ_y(x) = ∫ dy c(y, t) δ(x − y) = c(x, t). This is a `diagonal' basis: the expansion of Ψ(x, t) contains just one term, namely c(x, t) = Ψ(x, t) itself.

6 Dirac Notation

6.1 Base-free notation

In the above examples we have been expanding the same wave function in different bases of wave functions (eigenfunctions of some Hermitian operator). The Hilbert space formalism allows us to do this in a basis-free way, as announced. To that end we introduce the vector |S(t)⟩ ∈ H, where we now think of the Hilbert space as an abstract space, without the need to specify a basis. The only requirement is that |S(t)⟩ satisfies the TDSE. We define for this vector space the inner product:

⟨S(t)|S′(t)⟩ = ∫_{−∞}^{∞} dx Ψ*(x, t) Ψ′(x, t) ,  (63)

where Ψ(x, t) is the wave function that corresponds to the state |S(t)⟩ in the position representation. We can show, using Fourier transformation, that the above inner product is independent of the basis. Filling in the first equation of (61), we can rewrite the above inner product as (try to show this):

⟨S(t)|S′(t)⟩ = ∫_{−∞}^{∞} dp Φ*(p, t) Φ′(p, t) .  (64)

In the discrete case (for example, the harmonic oscillator), we find:

⟨S(t)|S′(t)⟩ = Σ_n c_n*(t) c′_n(t) .  (65)

So this inner product is independent of the basis. Using this, we also find:

⟨ψ_x|S(t)⟩ = ∫_{−∞}^{∞} dy ψ_x*(y) Ψ(y, t) = Ψ(x, t)
⟨ψ_p|S(t)⟩ = 1/√(2πħ) ∫_{−∞}^{∞} dx e^{−ipx/ħ} Ψ(x, t) = Φ(p, t)
⟨ψ_n|S(t)⟩ = c_n e^{−(i/ħ)E_n t} = c_n(t) .  (66)

This gives a nice interpretation: Ψ(x, t), Φ(p, t) and c_n(t) are the overlaps of the abstract wave function |S(t)⟩ with the respective bases, i.e. the projections of that vector onto a particular basis. These bases are often simply denoted |x⟩, |p⟩, |n⟩, hence we can write:

Ψ(x, t) = ⟨x|S(t)⟩ ,   Φ(p, t) = ⟨p|S(t)⟩ ,   c_n(t) = ⟨n|S(t)⟩ .  (67)

In other words, these are the coefficients c(z, t) that determine `how much q(z) is in |S(t)⟩' (for the continuous/discrete case). With this new notation, we can also write ψ_p(x) = ⟨x|ψ_p⟩ = ⟨x|p⟩ and ψ_y(x) = ⟨x|ψ_y⟩ = ⟨x|y⟩. This makes the notation much more symmetric and simple. ψ_p(x) is simply the overlap ⟨x|p⟩, `the amount of x in the state |p⟩'.
Likewise, ⟨x|y⟩ is `the amount of x in y', given by a Dirac delta function. Now it is clear that we also have ψ_x(p) = ⟨p|x⟩ = (⟨x|p⟩)* = ψ_p*(x), as it should be (notice that in Plancherel's theorem we have +ipx in one exponential and −ipx in the other).

The overlaps ⟨x|p⟩ contain the information about the spectrum of the particular operator. For a free particle, ⟨x|p⟩ = 1/√(2πħ) e^{ipx/ħ}. In other cases, however, this can take a completely different form.

Examples

a) Harmonic oscillator: in this case, the spectrum of the Hamiltonian is discrete, E_n = ħω(n + 1/2), whereas the spectrum of x̂ is continuous and infinite, x ∈ (−∞, ∞). The overlaps are now ⟨x|n⟩ = ψ_n(x) (the harmonic oscillator wave functions).

b) Free particle on a circle, φ ∈ [0, 2π]. Here, the spatial coordinate is continuous (x = aφ, with a the radius of the circle) but momenta are quantized, p_n = nħ/a. The overlaps are: ⟨φ|n⟩ = 1/√(2π) e^{inφ}.

6.2 Closure

Since we now have a basis-free notation, we can notice that, for f, g ∈ H, the following holds:

⟨f|g⟩ = ∫ dx f*(x) g(x) = ∫ dx ⟨f|x⟩⟨x|g⟩ = ⟨f| ( ∫ dx |x⟩⟨x|g⟩ ) .  (68)

In the last step, we pulled the integral inside the inner product because f is base-free, i.e. it does not depend on x; and we regard ∫ dx |x⟩⟨x|g⟩ = ∫ dx g(x) |x⟩ as a new vector, obtained by multiplying the vector |x⟩ with the number g(x) = ⟨x|g⟩ and integrating over x. Now let us define:

⟨f|Q̂|g⟩ ≡ ⟨f|Q̂ g⟩ .  (69)

Then (68) can be written as:

⟨f|g⟩ = ⟨f| ( ∫ dx |x⟩⟨x| ) |g⟩ .  (70)

We can regard the part in the middle, ∫ dx |x⟩⟨x|, as an operator that acts on |g⟩ and produces another vector. Since this equation holds for all f, g ∈ H, this operator must be the identity:

∫ dx |x⟩⟨x| = 1 .  (71)

This is called the closure relation, and it indicates that the basis is complete. Multiplying on the right and left with |p⟩ and ⟨p′|, this gives:

⟨p′| ( ∫ dx |x⟩⟨x| ) |p⟩ = ∫ dx ψ_{p′}*(x) ψ_p(x) = ⟨p′|p⟩ = δ(p − p′) ,  (72)

which is indeed true, for instance, for a free particle.
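A finite-dimensional analogue of the closure relation can be verified directly. The sketch below is our own illustration (not from the notes): for any orthonormal basis of ℂ^N, the sum of projectors Σ_n |n⟩⟨n| is the identity matrix, so inserting it into an inner product, as in (70), changes nothing.

```python
import numpy as np

# Closure in C^N: sum_n |n><n| = 1 for any orthonormal basis {|n>}.
N = 4
rng = np.random.default_rng(1)
M = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
basis, _ = np.linalg.qr(M)             # columns of `basis` = orthonormal |n>

# build sum_n |n><n| as a sum of rank-one projectors v v^dagger
closure = sum(np.outer(basis[:, n], basis[:, n].conj()) for n in range(N))
print(np.allclose(closure, np.eye(N)))          # it is the identity

# inserting the identity leaves <f|g> unchanged, as in (70)
f = rng.normal(size=N) + 1j * rng.normal(size=N)
g = rng.normal(size=N) + 1j * rng.normal(size=N)
lhs = f.conj() @ g                              # <f|g>
rhs = sum((f.conj() @ basis[:, n]) * (basis[:, n].conj() @ g) for n in range(N))
print(np.isclose(lhs, rhs))                     # <f|g> = sum_n <f|n><n|g>
```

The continuum relations (71) and (73) are the same statement with the sum over n replaced by an integral over x or p.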
If we multiply with |y⟩ and ⟨y′| instead, we get: ∫ dx ψ_{y′}*(x) ψ_y(x) = ∫ dx δ(x − y′) δ(x − y) = δ(y − y′), which is again true. In the same way we can derive closure for p:

∫ dp |p⟩⟨p| = 1 .  (73)

Multiplying both sides with |x⟩ and ⟨x′|, this is the same as ∫ dp ψ_{x′}*(p) ψ_x(p) = δ(x − x′).

6.3 Bras and kets

Something funny has happened. The notation (69) gave rise to operators such as ∫ dx |x⟩⟨x|. These operators act on a state to give a new state. But what do we really mean by ⟨x| or, more generally, ⟨α|? The definition of these states was given in (69). We can in fact think of ⟨Ψ| as the `complex conjugate' of |Ψ⟩. For obvious reasons, |Ψ⟩ is called a `ket' and ⟨Ψ| is called a `bra', so that we can form an inner product, a bra-ket, by multiplying the two as in a dot product. Here our analogy with ℝ³ and ℂ³ comes in handy. In ℝ³, the standard inner product is in fact the dot product:

⟨w|v⟩ = w_x v_x + w_y v_y + w_z v_z = (w_x, w_y, w_z) · (v_x, v_y, v_z)ᵀ .  (74)

In ℂ³:

⟨w|v⟩ = w_x* v_x + w_y* v_y + w_z* v_z = (w_x*, w_y*, w_z*) · (v_x, v_y, v_z)ᵀ .  (75)

In the same way, via (69) (with Q̂ = 1, the identity operator), we regard ⟨Ψ1|Ψ2⟩ as the `dot product' of ⟨Ψ1| and |Ψ2⟩: ⟨Ψ1|·|Ψ2⟩ ≡ ⟨Ψ1|1|Ψ2⟩ = ⟨Ψ1|Ψ2⟩. The bra corresponds to the (complex conjugate) row vector, the ket to the column vector. They are each other's `duals'.

Remark (optional). One can show that the bras form a vector space, in the same way the kets |α⟩ do. In fact, this is the dual vector space to H, and it is often denoted H*. By definition, the dual vector space H* of a vector space H consists of the linear (non-degenerate) mappings from H → ℂ. Indeed, ⟨α| maps vectors in H to ℂ by the inner product: ⟨α|β⟩ ∈ ℂ. These mappings themselves form a vector space.

Now when we put together a ket and a bra as |α⟩⟨β|, this is an operator, as it maps a vector |γ⟩ to another vector: (|α⟩⟨β|) |γ⟩ = |α⟩⟨β|γ⟩ = ⟨β|γ⟩ |α⟩. Here, ⟨β|γ⟩ is a number, so it doesn't matter whether you write it on the left or on the right.
On the other hand, the order does matter in $|\alpha\rangle\langle\beta|$: $|\alpha\rangle\langle\beta| \neq |\beta\rangle\langle\alpha|$!

7 The Interpretation of Quantum Mechanics

We have seen that quantum mechanics gives us a probabilistic interpretation of physical quantities: it is not always possible to determine the outcome of a measurement with absolute certainty, but we can predict the possible measurement outcomes and their respective probabilities in measurements on an ensemble of identical systems. The only case where the outcome of a measurement is unique is when the system is in a determinate state and we measure the corresponding observable. If the system is in an eigenstate of an operator $\hat{Q}$ with eigenvalue $q_m$, the formalism tells us that the probability to find the value $q_n$ upon measurement is 1 for $n = m$ and 0 for all other states. If $\hat{Q}$ is the position operator, then the particle is well localized; but the momentum is not well defined in that case, as the uncertainty principle expresses. This is because the operators $\hat{x}$ and $\hat{p}$ do not commute. So the formalism tells us that there is no state that describes all possible observables simultaneously, i.e. no state in which all possible measurable quantities have well-defined values.

We can get used to the above interpretation of quantum mechanics, but there is something unsatisfactory about it. Imagine carrying out a two-slit experiment with photons. Given knowledge of the initial wave function and of the interactions, we can predict the intensity distribution of light on the detection screen. If we decrease the intensity of the source so that only one photon at a time goes through the slits, we can predict the probability that detection of the photon will take place in a particular region on the screen. But there is something counterintuitive about this. Which slit did the particle actually go through? Quantum mechanics does not tell us the answer.
If we try to measure which slit the photon goes through, the interference pattern disappears: knowledge of which slit the particle went through destroys the quantum superposition. How can the act of measurement be decisive here? Is a measurement any different from other physical interactions? If the photon did not have a position before it was measured, but it does have a well-defined position when it is detected, it would seem as if the act of measurement is able to give particles properties they did not possess before. How can a measurement be so decisive as to the presence of a physical property? What makes it so unlike other physical interactions?

This set of questions, and their proposed solutions, which we will turn to next, is what people usually call the problem of the `interpretation of quantum mechanics'.¹² It is seen as problematic because it defies our classical intuitions about how physical systems are supposed to work.

7.1 EPR and Hidden Variables

In a 1935 article that was meant to be a death blow to quantum theory as a fundamental theory of reality, Einstein, Podolsky, and Rosen claimed that quantum mechanics is incomplete. Imagine a pair of particles that are initially in contact, at rest at the origin, and then go separate ways until their mutual distance is very large. For instance, you could think of an isotope that decays into two sub-particles that shoot off to different sides, preserving the total momentum. Notice that, quantum mechanically, we cannot determine the position and momentum of the individual particles, because $[\hat{Q}_1, \hat{P}_1] = i\hbar$ and $[\hat{Q}_2, \hat{P}_2] = i\hbar$. However, we can determine the center-of-mass position of the system, $\hat{Q}_1 + \hat{Q}_2$, as well as the relative momentum, $\hat{P}_1 - \hat{P}_2$, because these operators do commute: $[\hat{Q}_1 + \hat{Q}_2, \hat{P}_1 - \hat{P}_2] = [\hat{Q}_1, \hat{P}_1] - [\hat{Q}_2, \hat{P}_2] = i\hbar - i\hbar = 0$ (where we used the fact that the operators of the two particles commute with each other).

¹² In writing this chapter, I have drawn ideas from [4].
Now imagine that we measure the position of the first particle, $q_1$, once they are far apart. Since we know the location of the center of mass, we can infer from this measurement $q_2$, the location of the second particle. But if we simultaneously measure $p_2$, the momentum of the second particle, we may also infer $p_1$, in virtue of the fact that we know the difference in momenta. But now, EPR concluded, since the two particles are very far apart, one measurement cannot influence the other: such an influence would violate the result of relativity that there can be no influences that travel faster than the speed of light (in Einstein's words, there can be no `spooky action at a distance'). Hence we can, by means of independent measurements, have complete knowledge of the positions and momenta of the two particles. Since the outcome of the position measurement on particle 1 cannot (by relativity) affect the location of particle 2, this means that particle 2 must have had a well-defined position prior to measurement, even though quantum mechanics tells us it didn't. A similar argument can be made for the momentum. Of course, this is all in contradiction with the predictions of quantum mechanics. So measurements can give us complete knowledge about these properties of particles, but the theory doesn't. Therefore, EPR concluded, quantum mechanics is an incomplete theory.

Before trying to rebut the EPR argument, it is natural to ask the following question: could it be that quantum mechanics is indeed missing some piece of information? Is it possible to extend quantum mechanics into a more predictive theory? Such attempts are called `hidden variable theories': assuming that the positions and momenta of particles have precise values before measurement amounts to introducing some variable that remains hidden to quantum mechanics, but determines those values before we measure them. One such attempt was carried out by David Bohm in 1952.
In Bohm's theory, particles have well-defined values of positions and momenta, and the predictions of quantum mechanics are statistically reproduced. However, in order to achieve this, Bohm has to add a non-local interaction potential between the particles. This potential is called non-local because it acts at a distance: when a particle changes momentum or position, its interaction with all other particles changes. Whereas it is not clear whether there is a contradiction with special relativity (as the interactions assumed by Bohm cannot be directly used to transmit information faster than light), it is clearly an unwarranted feature of the theory. For this reason, theorists have looked for local hidden variable theories, that is, hidden variable theories where interactions do not propagate faster than light. The main criticism of Bohm's theory, however, has been that it merely adds theoretical structure without any predictive gain, as its predictions are compatible with quantum mechanics, and all experimental results agree with quantum mechanics. Furthermore, in order for the theory to reproduce the results of quantum mechanics, ad hoc assumptions about the distribution of the hidden variables have to be added.

In 1964, John Bell showed that local hidden variable theories are inconsistent with the results of quantum mechanics. That is, any hidden variable theory of the local type makes experimental predictions which subtly deviate from those of quantum mechanics. The corresponding experiments were carried out in 1982 by Alain Aspect, and the predictions of quantum mechanics, rather than those of hidden variable theories, were found to hold. These results have been confirmed in more refined experiments later on.

Hidden variable theories are sometimes identified with realist interpretations (see e.g. section 1.2 of [1]).
This is a misnomer: whereas a hidden variable theory can certainly be regarded as an (extreme) realist position, there are realist positions that do not require particles to have well-defined properties (see the next section).

7.2 Bohr's Reply to EPR

Bohr's reply to the EPR article is anything but a clear-cut physical argument. Instead, it is a piece of philosophical discourse (and rather obscure at that), the main message of which seems to be that what we call `position' and `momentum' cannot be determined a priori, but essentially depends on the measurement context in which these concepts are defined. The measurement process has, in Bohr's view, an essential influence on the conditions for the definition of physical quantities. Since the conditions of measurement play an essential role in defining what we call `physical reality' (and, as part of that, the concepts of position and momentum), one cannot infer a conclusion about the supposed incompleteness of quantum mechanics: as far as the physical phenomena are concerned, there simply is nothing else to describe than either position or momentum (but not both).

In more practical terms, one could say that Bohr's position amounts to saying that the particular experimental context determines whether the concepts `position' or `momentum' make sense. If we measure the position, there is no sense in which we can meaningfully talk about the `momentum' of a particle: this concept is simply not defined. Second, whereas Bohr denies that there are `spooky actions at a distance', he remarks that EPR's conclusion that one can simultaneously infer the position and the momentum of the particle is incoherent. Once position is measured on the first particle, this concept is applicable to the second particle as well. A measurement of the momentum of the second particle creates a new measurement context, which automatically introduces an uncertainty in the position, which is now no longer applicable.
We have therefore not determined the position and momentum of particle two, but simply measured uncorrelated properties of the second particle.

By 1927, Bohr's position had blended with the views of his younger colleagues Heisenberg, Pauli, and Dirac into the dominant paradigm in the interpretation of quantum mechanics, known as the `Copenhagen interpretation'. The fact that Bohr's texts on this matter are rather oracle-like and obscure, and that his pupils developed related but not entirely coinciding accounts of quantum mechanics,¹³ has contributed to a lack of clarity as to how `the' Copenhagen interpretation should be understood.

Differences in interpretation between Bohr, Heisenberg, Pauli, and Dirac notwithstanding, the Copenhagen interpretation does seem to offer a genuine reply to the EPR argument, this being the reason that it is widely accepted. The interpretation, however, does not come without a cost. At least two issues in the Copenhagen interpretation require further thought:

1. As a philosophical interpretation, it talks about the measurement context as posing restrictions on the class of macroscopic concepts (such as position, momentum, etc.) that can be applied to a microscopic system at any given time. However, the philosophical interpretation does not tell us how this should work. This necessity of invoking the macroscopic context in defining the concepts that are applicable to the microscopic phenomena, whilst upholding that the quantum mechanical description is complete, seems to imply the existence of a fundamental boundary between the macroscopic and the microscopic. It is not at all clear what physical principle, if any, defines this boundary.

2. Bohr's sketched solution to the EPR paradox does not imply action at a distance, but it does require an explanation of how the measurement on particle 1 influences the applicability of the concept of `position' to particle 2, and the actual determination of this position.
So measurement still seems to play a special role here: it is not simply regarded as an ordinary physical interaction.

7.3 The Measurement Problem

When trying to translate Bohr's philosophical account of quantum mechanics into a physically workable model, one runs, as mentioned, into the fact that it seems hard to regard measurement as an ordinary physical interaction. Indeed, in Bohr's view, a measurement does much more than simply `giving the value of a variable': it sets the conditions under which one can meaningfully talk about that variable. In order to get a grasp of how deep this problem runs, we will here give a model of classical and of quantum measurements and compare them to each other.¹⁴

Measurement in Classical Mechanics

Consider a system $S$ (for instance, you), a measuring device $M$ (a scale to measure your weight), and the readings $R$ on this device (the value of the scale pointer). Consider also a quantity $A$ corresponding to $S$ (your mass) with possible values $a \in \{a_1, \cdots, a_n\} \subset \mathbb{R}$ ($a_i$ for short). Corresponding to these values there are readings $r_i$ on the scale which indicate your weight. If the scale works properly, then there is an invertible function $m$ that gives the reading $r_i$ for a given mass $a_i$: $r_i = m(a_i)$.

You are standing in front of the scale. Since there is nothing on the scale, the pointer is in its rest position, which we call $r_0$. We have a pair of numbers describing this situation, your weight $a_i$ and the scale pointer at rest: $(a_i, r_0)$. Once you step on the scale, these numbers change.

¹³ For instance, Heisenberg used the Aristotelian concepts of act and potency in his accounts of the measurement problem. This is quite different from the Kantian flavor of Bohr's remarks, as well as from the instrumentalist interpretation that the `Copenhagen interpretation' has often been given.

¹⁴ This model goes back to von Neumann, but I am very much indebted to Jos Uffink for the formulation presented here.
Your weight does not change, but the pointer position does: we have a pair $(a_i, r_i) = (a_i, m(a_i))$. Now the values of $a_i$ and $r_i$ are correlated via the function $m$. Reading off your weight $r_i$ from the scale, and applying the function $m^{-1}$ to it, you find your mass: $a_i = m^{-1}(r_i) = 70$ kg. Notice that, as an abstract property about you, your mass is unmeasurable; we measure your weight on the scale and use it to infer the mass.

Measurement in Quantum Mechanics

Consider now a quantum system $S$ with a property $A$ (e.g. energy, or spin) to be measured by a measuring apparatus $M$ (a detector, a phosphor plate, a photon camera). To $A$ there corresponds an operator $\hat{A}$ with eigenvalues $\{a_1, \cdots, a_n\}$. To these eigenvalues correspond states $|a_1\rangle, \cdots, |a_n\rangle \in \mathcal{H}_S$, the Hilbert space of the system. Since we want to model measurement physically, we are going to associate an operator $R$ to the measuring apparatus. $R$ gives the possible `readings' of the detector (numbers on a computer screen, dots on a phosphor surface, etc.), which can take the values $r_0, r_1, \ldots, r_n$ (the value $r_0$ denoting, as before, `no detection'). So the Hilbert space $\mathcal{H}_M$ of the measuring apparatus contains one more basis state: $|r_0\rangle, |r_1\rangle, \cdots, |r_n\rangle$. We regard $|r_0\rangle$ as the ground state, the state that indicates that nothing is being measured. We assume $|a_i\rangle$ ($i = 1, \cdots, n$) and $|r_i\rangle$ ($i = 0, 1, \cdots, n$) to form orthonormal bases of $\mathcal{H}_S$ and $\mathcal{H}_M$, respectively.

Before detection, $S$ is in a state $|a_i\rangle$ (for some $i$) and $M$ is in the state $|r_0\rangle$. So the total state of the system is the product of states $|a_i\rangle|r_0\rangle$; this is the quantum analog of the statement that you and the scale are described by the pair $(a_i, r_0)$.

Now we want to measure $A$ using $R$. Through the measurement interaction, the system $M$ will undergo a change from $|r_0\rangle$ to $|r_i\rangle$, and the latter should be indicative of the state $|a_i\rangle$ of $S$. For an ideal measurement, the system $S$ itself should not change.
So we have the transition:

$$|a_i\rangle|r_0\rangle \to |a_i\rangle|r_i\rangle . \qquad (76)$$

Our task of finding out `whether measurement is a physical interaction' now amounts to asking whether there is a Hamiltonian $H$ that describes (76) as a transition from an initial state $|a_i\rangle|r_0\rangle$ to a final state $|a_i\rangle|r_i\rangle$. In order to eliminate technical details, we will go about this question as follows. It turns out that it can be translated into the question whether there is a linear operator¹⁵ $U$ such that:

$$U\,(|a_i\rangle|r_0\rangle) = |a_i\rangle|r_i\rangle . \qquad (77)$$

We can think of this operator as effecting the Schrödinger time evolution with Hamiltonian $H$. If $U$ exists, there is a Hamiltonian $H$ that contains the interaction between the system $S$ and the measuring apparatus $M$.

In fact, the answer to this question is: yes, such an operator $U$ exists (see the appendix for more details). In principle this is very nice, because it means that:

1. We have succeeded in describing measurement as a physical interaction via a unitary operator.

2. No measurement postulate is needed.

3. There is no distortion of the state of the system $S$.

However, the fact that $U$ is linear has some strange implications. The above really only works if $S$ is in an eigenstate $|a_i\rangle$ of $A$, as we have assumed above. What if the system $S$ is in a superposition?

$$|\psi\rangle = \sum_j c_j |a_j\rangle . \qquad (78)$$

Let us apply our operator to this state. After measurement, we get a final state:

$$|\psi_{\rm final}\rangle = U\,(|\psi\rangle|r_0\rangle) = U\left(\sum_j c_j |a_j\rangle|r_0\rangle\right) = \sum_j c_j\, U\,(|a_j\rangle|r_0\rangle) = \sum_j c_j |a_j\rangle|r_j\rangle , \qquad (79)$$

where we have used linearity of $U$. You see the important consequence: the system is no longer in a product state $|a_i\rangle|r_i\rangle$ with a definite value of $i$ (corresponding to one eigenvalue of $A$) after the measurement! Instead, the final state is a superposition of the measured system and the measuring apparatus! This means that you and the scale are forever entangled after you step on it!

¹⁵ As it turns out, such an operator also has to be unitary, meaning $U U^\dagger = 1$.
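The step from (77) to (79) can be made concrete in a small toy model. The sketch below (illustrative; the particular shift action on the pointer states is one concrete choice, not uniquely fixed by the notes) builds a linear, unitary $U$ obeying (77) for two system states and three pointer states, applies it to a superposition, and verifies that the result is the entangled state (79) rather than a product state:

```python
import numpy as np

# Toy model of (77)-(79): two system states |a_1>,|a_2>, three pointer states
# |r_0>,|r_1>,|r_2>. U acts by shifting the pointer, |a_i>|r_j> ->
# |a_i>|r_{j+i mod 3}>, so that U(|a_i>|r_0>) = |a_i>|r_i> as in (77).
nS, nM = 2, 3
a = np.eye(nS)      # rows a[0], a[1] are |a_1>, |a_2>
r = np.eye(nM)      # rows r[0], r[1], r[2] are |r_0>, |r_1>, |r_2>

U = np.zeros((nS * nM, nS * nM))
for k in range(nS):                  # k = 0, 1 labels a_{k+1}
    for j in range(nM):
        ket = np.kron(a[k], r[(j + k + 1) % nM])
        bra = np.kron(a[k], r[j])
        U += np.outer(ket, bra)
assert np.allclose(U @ U.T, np.eye(nS * nM))     # U is unitary (a permutation)

c = np.array([0.6, 0.8])                         # c_1, c_2 (normalized)
psi0 = np.kron(c[0] * a[0] + c[1] * a[1], r[0])  # (c_1|a_1> + c_2|a_2>)|r_0>
psi_final = U @ psi0

expected = c[0] * np.kron(a[0], r[1]) + c[1] * np.kron(a[1], r[2])
assert np.allclose(psi_final, expected)          # this is (79)

# Entanglement check: the number of nonzero singular values of the reshaped
# state (its Schmidt rank) is 2, so psi_final is not a product state.
sv = np.linalg.svd(psi_final.reshape(nS, nM), compute_uv=False)
assert np.sum(sv > 1e-12) == 2
```

The Schmidt-rank check at the end is the quantitative version of the claim that "you and the scale are entangled": a product state would have exactly one nonzero singular value.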
So which state is this system in? We simply cannot tell: it might be any of the $|a_j\rangle|r_j\rangle$, with probabilities $|c_j|^2$. Could we solve this problem by bringing in another measuring device that will find out what state you and the scale are in? It is not hard to see that that second measuring device will in turn become entangled with $S$ and $M$ in this way. The inescapable conclusion is that, once we regard measurement interactions as being described by operators $U$ acting as in (77), the whole universe becomes entangled. Of course, this is nothing but Schrödinger's cat in disguise. Once we start with a superposition somewhere in the universe, and if all interactions are given by `nice' linear operators, it will not be long before the whole visible universe finds itself in a superposition that includes $S$ and $M$. So this does not give us a way of explaining how measurements give definite values after all.

This further motivates von Neumann's description of measurement not as a `normal' unitary operator $U$, but as a `special' projection operator $P_i$ that projects onto some particular state: if we start with

$$|\psi\rangle = \sum_j c_j |a_j\rangle , \qquad (80)$$

then after measurement of the eigenvalue $a_i$ the system will be projected onto the following state:

$$P_i\,|\psi\rangle = c_i\,|a_i\rangle . \qquad (81)$$

Since the $|a_j\rangle$ are linearly independent, we see that in order for this to work the projection operator $P_i$ has to act on the basis as:

$$P_i\,|a_j\rangle = \delta_{ij}\,|a_i\rangle . \qquad (82)$$

We then also get for our final state:

$$P_i\,|\psi_{\rm final}\rangle = c_i\,|a_i\rangle|r_i\rangle . \qquad (83)$$

Notice that the $P_i$'s are not unitary, because the final state is not normalized to 1: $c_i\,|a_i\rangle|r_i\rangle$ has norm $|c_i|$. Hence this is not a regular physical interaction of the type described by the Schrödinger equation, because it does not preserve total probability. We have to rescale the state again in order to normalize it to 1. It can also be shown that projections do not conserve energy.
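The non-unitarity of the projection can also be seen in a few lines of linear algebra. The sketch below (illustrative; the coefficients are arbitrary) implements $P_1 = |a_1\rangle\langle a_1|$, checks the action (81)-(82), and confirms that the projected state has norm $|c_1| < 1$, so total probability is not preserved:

```python
import numpy as np

# Sketch of (80)-(82): the projector P_1 = |a_1><a_1| picks out one term of a
# superposition and shrinks its norm to |c_1|, so it cannot be unitary.
a = np.eye(3)                                # orthonormal basis |a_1>, |a_2>, |a_3>
c = np.array([0.5, 0.5j, np.sqrt(0.5)], dtype=complex)
psi = sum(ci * ai for ci, ai in zip(c, a))   # |psi> = sum_j c_j |a_j>, normalized
assert np.isclose(np.vdot(psi, psi).real, 1.0)

P1 = np.outer(a[0], a[0].conj())             # P_1, acting as in (82)
projected = P1 @ psi
assert np.allclose(projected, c[0] * a[0])                # (81): P_1|psi> = c_1|a_1>
assert np.isclose(np.linalg.norm(projected), abs(c[0]))   # norm |c_1| < 1
assert not np.allclose(P1.conj().T @ P1, np.eye(3))       # P_1 is not unitary
```

Renormalizing `projected` by `abs(c[0])` recovers the unit-norm post-measurement state, which is exactly the rescaling step mentioned above.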
7.4 Other Approaches

7.4.1 Decoherence

The basic idea of decoherence is quite simple: the reason for the seemingly non-unitary evolution (81) lies in the interactions with the environment. The evolution (81) looks as if it were non-unitary, but in reality (i.e. if we were able to include all the very complicated interactions with the environment) we would see that it is unitary. The wave function appears to collapse, but in fact it satisfies the Schrödinger equation. Some parts of the wave function, which should be present in (81), have been projected out simply because their coefficients $c_j$ are very small. So we are dealing with a kind of thermodynamic irreversibility here, in which the `thermal bath' provided by the environment washes away some of the information. Another way of saying this is that, after measurement, the wave function (after rescaling by $c_i$) does not look like (81), but actually looks like:

$$|a_i\rangle|r_i\rangle + \ldots , \qquad (84)$$

where the dots denote terms with low probability. The wave function is actually entangled, but for all practical purposes it looks as if just one term in the superposition contributes. The reason for this is that we neglect interactions with the environment in the Hamiltonian.

In my opinion (but some of my colleagues may disagree on this), this is not, strictly speaking, a solution to the fundamental problem we posed of why we never see superpositions. The magic words above are `for all practical purposes': decoherence is a solution that works in practice but, in fact, the whole universe is entangled in the state (84) (since the environment includes the whole visible universe). So it does not solve the matter of principle we raised: how do we pass from a formalism which predicts universal entanglement to states in which particles have unique properties, i.e. determinate states? Simple decoherence just tells us that the determinate states are the most probable ones, but it does not explain how to actually obtain them.
Unfortunately, long discussions in texts on decoherence fail to address this fundamental point.

Perhaps the answer to this is that we should not interpret the wave function as describing the world as it actually is, but only as a collection of possibilities. The wave function tells us what is possible and probable, but not what is actual.

7.4.2 Many Worlds

I just suggested that the wave function maybe only describes possibilities, not actualities. At the other extreme of the spectrum there is the many worlds interpretation, which assigns objective actuality to all of the terms in the wave function (whether they decohere or not), i.e. to all the summands in (84). It replaces the collapse of the wave function by its branching off: at each single measurement, the world branches off into distinct possibilities, so that every possible outcome is realized in a different world. This reconciles the appearance of non-deterministic events (such as the random decay of an atom) with deterministic equations such as the Schrödinger equation.

This may sound crazy, but you would be surprised by the growing number of physicists who actually support this interpretation. There is some theoretical support for the many-worlds interpretation coming from the `histories approach' to quantum mechanics, as well as from some recent puzzles in quantum cosmology. Perhaps the fact that physicists are willing to support such a crazy idea is simply an indication of how badly the measurement problem sits with them.

I see a simple puzzle for the many worlds interpretation which never seems to be discussed in the literature. Consider a harmonic oscillator. It is intuitively clear how the many worlds interpretation would work for this system, since its spectrum is discrete. If the wave function is in a superposition $\sum_n c_n \psi_n$, there is a world with wave function $\psi_1$, a world with wave function $\psi_2$, etc. (up to infinity). But this only works because the spectrum is discrete.
If the spectrum is continuous and labeled by, say, the wave number $k$, then there is a continuum of worlds between any two finite values of $k$. This seems nonsensical, because it is well known from mathematics that we cannot label the real numbers using the natural numbers. The latter form an infinite but countable set. The former form not only an infinite, but an uncountably infinite set! If we cannot even count the number of worlds, how can they all have separate existence? The concept of `continuum' seems counter to the idea of all these worlds having `separate existence'. So to me, this seems like nonsense. A way to resolve this puzzle could be to turn all continuous spectra in quantum mechanics into discrete ones, by introducing a new scale so small that we cannot see it, so that particles `seem to have continuous momentum $k$' but are actually always quantized, with energy levels so dense that we cannot tell them apart. But then again: 1) this is never discussed in the many worlds literature; and 2) it amounts to changing quantum mechanics into a different fundamental theory, which runs counter to our initial aim of simply interpreting quantum mechanics (which we assumed to be a complete theory).

A Mathematical Formulas and Tricks

A.1 Gaussian integration

• The basic Gaussian integral is:

$$\int_{-\infty}^{\infty} dx\, e^{-\alpha x^2} = \sqrt{\frac{\pi}{\alpha}} . \qquad (85)$$

To show this, we compute the integral in two different ways. Consider the area under the function $e^{-r^2}$ on the plane, in plane polar coordinates:

$$\int dA\, e^{-r^2} = \int_0^\infty dr \int_0^{2\pi} d\theta\, r\, e^{-r^2} = 2\pi \int_0^\infty \frac{du}{2}\, e^{-u} = \pi \int_0^\infty du\, e^{-u} = -\pi\, e^{-u}\Big|_0^\infty = \pi , \qquad (86)$$

where we defined $u = r^2$. Now compute this same integral in Cartesian coordinates $x = r\sin\theta$, $y = r\cos\theta$:

$$\pi = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{-(x^2+y^2)} = \int_{-\infty}^{\infty} dx\, e^{-x^2} \int_{-\infty}^{\infty} dy\, e^{-y^2} = \left(\int_{-\infty}^{\infty} dx\, e^{-x^2}\right)^2 . \qquad (87)$$

Taking the square root, it follows that:

$$\int_{-\infty}^{\infty} dx\, e^{-x^2} = \sqrt{\pi} . \qquad (88)$$

To obtain the Gaussian integral with generic width, we simply rescale $x \to \sqrt{\alpha}\, x$ in the integrand, which gives back (85).
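The Gaussian integral (85) is easy to check numerically. The following sketch (illustrative, not from the notes) approximates the integral by a simple Riemann sum on a wide grid, for a few values of $\alpha$; the tails beyond $|x| = 20$ are negligible:

```python
import numpy as np

# Numerical check of (85): integrate e^{-alpha x^2} on a wide grid and compare
# with sqrt(pi/alpha). A plain Riemann sum suffices for this smooth integrand.
x = np.linspace(-20.0, 20.0, 400001)
dx = x[1] - x[0]
results = {}
for alpha in (0.5, 1.0, 2.0):
    results[alpha] = np.exp(-alpha * x**2).sum() * dx
    assert np.isclose(results[alpha], np.sqrt(np.pi / alpha))

# alpha = 1 reproduces (88): the integral of e^{-x^2} equals sqrt(pi)
assert np.isclose(results[1.0], np.sqrt(np.pi))
```

The agreement for several widths confirms the rescaling argument $x \to \sqrt{\alpha}\, x$ used to pass from (88) back to (85).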
• From (85) we can also compute the following integral:

$$\int_{-\infty}^{\infty} dx\, e^{-\alpha x^2 - i\beta x} = \sqrt{\frac{\pi}{\alpha}}\, e^{-\frac{\beta^2}{4\alpha}} . \qquad (89)$$

For the proof of this, see section 2.4.

• Using (85), we can also compute integrals of the following type:

$$\int_{-\infty}^{\infty} dx\, x^n\, e^{-\alpha x^2} . \qquad (90)$$

First of all, we notice that this integral vanishes if $n$ is odd. The reason is that in that case the integrand is an odd function under $x \to -x$. The integral of an odd function over a range $(-a, a)$ is zero. Since we integrate over $(-\infty, \infty)$, the result of this integral is zero.

When $n$ is even, we use the partial integration formula:

$$\int_D u\, dv = uv\Big|_D - \int_D du\, v , \qquad (91)$$

where $D$ denotes the domain of integration. Noting that $d(e^{-\alpha x^2}) = -2\alpha x\, e^{-\alpha x^2}\, dx$, we can take $u = x$ and $v = e^{-\alpha x^2}/(-2\alpha)$. Applying (91), we get:

$$\int_{-\infty}^{\infty} dx\, x^2\, e^{-\alpha x^2} = \frac{x\, e^{-\alpha x^2}}{-2\alpha}\Bigg|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} dx\, \frac{e^{-\alpha x^2}}{-2\alpha} = \frac{1}{2\alpha}\sqrt{\frac{\pi}{\alpha}} = \frac{\sqrt{\pi}}{2\alpha^{3/2}} , \qquad (92)$$

where the boundary term vanished because the Gaussian function vanishes at infinity more rapidly than any polynomial. We can get the result for higher powers of $x$ by successive partial integrations. The result is:

$$\int_{-\infty}^{\infty} dx\, x^{2n}\, e^{-\alpha x^2} = \frac{(2n)!}{n!}\, \frac{\sqrt{\pi}}{4^n\, \alpha^{n+\frac{1}{2}}} . \qquad (93)$$

B Technicalities of Quantum Mechanical Measurements

B.1 Time Evolution Operators

Remember that the problem of quantum mechanics amounts to finding a solution $\Psi(x,t)$ of the TDSE (20), given an initial condition $\Psi_0(x)$. We can recast this problem in the language of operators, as follows: find a linear operator $U$ such that:

$$U : \Psi_0(x) \mapsto \Psi(x, t) = U(t)\, \Psi_0(x) \qquad (94)$$

satisfies the TDSE. Here, $U(t)$ is a time-dependent operator called the evolution operator. In a particular basis, and in the discrete case, it maps:

$$\sum_n c_n\, \psi_n(x) \mapsto \sum_n c_n\, e^{-\frac{i}{\hbar} E_n t}\, \psi_n(x) . \qquad (95)$$

The operator $U$ is easy to find:

$$U(t) \equiv e^{-\frac{i}{\hbar} H t} \equiv \sum_{n=0}^{\infty} \frac{\left(-\frac{i}{\hbar} H t\right)^n}{n!} , \qquad (96)$$

where the exponential, which includes an operator (the $n$th power of the Hamiltonian), has been defined by its Taylor expansion.
Although this requires some care, Taylor expansions can indeed be defined for linear operators (as for matrices), analogously to how they are defined for numbers:

$$e^{\hat{Q}} \equiv \sum_{n=0}^{\infty} \frac{(\hat{Q})^n}{n!} . \qquad (97)$$

Now the claim is that $U(t)$, as defined in (96), does the job in (94): when $U(t)$ is applied to $\Psi_0(x)$, it gives the solution of the TDSE with that initial condition. Let us check this. If $\Psi(x, t)$ is to be given by (94) with $U$ as in (96), it must solve the TDSE:

$$i\hbar\, \frac{\partial \Psi(x,t)}{\partial t} = i\hbar\, \frac{\partial U(t)}{\partial t}\, \Psi_0(x) . \qquad (98)$$

Let us compute the time derivative:

$$\frac{\partial U}{\partial t} = \frac{\partial}{\partial t}\, e^{-\frac{i}{\hbar} H t} = -\frac{i}{\hbar}\, H\, e^{-\frac{i}{\hbar} H t} = -\frac{i}{\hbar}\, H\, U(t) . \qquad (99)$$

Filling this back into (98):

$$i\hbar\, \frac{\partial \Psi(x,t)}{\partial t} = i\hbar \left(-\frac{i}{\hbar}\right) H\, U(t)\, \Psi_0(x) = H\, \Psi(x,t) , \qquad (100)$$

which is precisely the TDSE. Hence, we have shown that (96) does the job of solving the TDSE for us.

Let us now see what these formulas mean in practice. If we are given the initial condition $\Psi_0(x) = \sum_n c_n\, \psi_n(x)$, then:

$$U(t)\, \Psi_0(x) = U(t)\left(\sum_n c_n\, \psi_n(x)\right) = \sum_n c_n\, (U(t)\, \psi_n(x)) = \sum_n c_n\, e^{-\frac{i}{\hbar} H t}\, \psi_n(x) = \sum_n c_n\, e^{-\frac{i}{\hbar} E_n t}\, \psi_n(x) = \Psi(x, t) , \qquad (101)$$

where we have used the fact that $\psi_n$ satisfies the TISE, $H \psi_n = E_n \psi_n$, and hence we can replace any power of $H$ by the corresponding power of $E_n$ when it acts on $\psi_n$ (also in the exponential, where by using the Taylor expansion we can see this term by term). Now the last expression in (101) is indeed the usual solution of the TDSE.

$U$ happens to be a unitary operator, i.e. $U U^\dagger = U^\dagger U = 1$. This can be shown by computing the adjoint:

$$U^\dagger = \left(e^{-\frac{i}{\hbar} H t}\right)^\dagger = e^{\left(-\frac{i}{\hbar} H t\right)^\dagger} = e^{\frac{i}{\hbar} H t} , \qquad (102)$$

where we have used the fact that $H$ is Hermitian and that, again, to compute the adjoint of the exponential, we can use the Taylor expansion, compute the adjoint term by term, and resum the Taylor series. Since we get a plus sign in the exponential, when we multiply it with $U$, which contains a minus sign, the two cancel out and we get the unit matrix. Because $U$ is unitary, it preserves amplitudes (and probabilities):

$$\langle \Psi|\Psi\rangle = \langle U \Psi_0|U \Psi_0\rangle = \langle \Psi_0|U^\dagger U\, \Psi_0\rangle = \langle \Psi_0|\Psi_0\rangle . \qquad (103)$$
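The properties (96)-(103) can be verified directly for a finite-dimensional toy Hamiltonian. The sketch below (illustrative; the random Hermitian matrix and the choice $\hbar = 1$ are not from the notes) builds $U(t) = e^{-iHt}$ from the eigendecomposition of $H$ and checks unitarity, the phase action on eigenstates as in (101), and norm preservation as in (103):

```python
import numpy as np

# Sketch of (96)-(103) for a 4x4 Hermitian toy Hamiltonian (hbar = 1).
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2                    # Hermitian Hamiltonian
E, V = np.linalg.eigh(H)                    # H psi_n = E_n psi_n (TISE)

t = 1.3
U = V @ np.diag(np.exp(-1j * E * t)) @ V.conj().T   # U(t) = e^{-iHt}, cf. (96)

assert np.allclose(U @ U.conj().T, np.eye(4))       # unitarity, cf. (102)

# On an eigenstate, U(t) just multiplies by the phase e^{-i E_n t}, cf. (101)
psi_n = V[:, 0]
assert np.allclose(U @ psi_n, np.exp(-1j * E[0] * t) * psi_n)

# Norm preservation (103) for a generic superposition
c = np.array([0.5, 0.5, 0.5, 0.5])
psi0 = V @ c
assert np.isclose(np.vdot(U @ psi0, U @ psi0).real, np.vdot(psi0, psi0).real)
```

Diagonalizing $H$ and exponentiating the eigenvalues is exactly the matrix version of replacing powers of $H$ by powers of $E_n$ in the Taylor series.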
B.2 The Measurement Operator $U$

The measurement operator we were looking for in (77) can be explicitly written as follows:

$$U = \sum_{jk} |a_k\rangle |r_{j+k}\rangle \langle r_j| \langle a_k| . \qquad (104)$$

It is not hard to show that $U$ indeed satisfies (77). In fact, this operator is a combination of projectors onto $A$ and onto $R$:

$$U = \sum_{jk} P^{(a)}_{k,k}\, P^{(r)}_{j,j+k} , \qquad (105)$$

where

$$P^{(r)}_{i,j}\, |r_l\rangle = \delta_{il}\, |r_j\rangle , \qquad (106)$$

and $P^{(a)}$ is defined similarly, acting on the $|a_l\rangle$: $P^{(a)}_{i,j}\, |a_l\rangle = \delta_{il}\, |a_j\rangle$.

Proof: We simply let $U$ act as in (77):

$$U\,(|\psi\rangle|r_0\rangle) = \sum_j c_j\, U\,(|a_j\rangle|r_0\rangle) . \qquad (107)$$

We compute the right-hand side separately, using the decomposition into projectors (we do some relabeling of indices):

$$\sum_j c_j\, U\,(|a_j\rangle|r_0\rangle) = \sum_l c_l \sum_{jk} P^{(a)}_{k,k}\, P^{(r)}_{j,j+k}\, |a_l\rangle|r_0\rangle = \sum_{jkl} c_l\, P^{(a)}_{k,k}\, |a_l\rangle\, P^{(r)}_{j,j+k}\, |r_0\rangle = \sum_k c_k\, |a_k\rangle|r_k\rangle , \qquad (108)$$

where in the second equality sign we used the fact that $P^{(r)}$ acts only on eigenstates of $R$ and $P^{(a)}$ only on eigenstates of $A$. In the last formula we used the property (106). This is precisely what we wanted to show.

References

[1] D.J. Griffiths, Introduction to Quantum Mechanics, Pearson, 2nd Edition.

[2] D.C. Giancoli, Physics: Principles with Applications, Pearson, 6th Edition.

[3] B.H. Bransden and C.J. Joachain, Quantum Mechanics, Addison-Wesley, 2nd Edition.

[4] D. Dieks, Filosofie/grondslagen van de natuurkunde, Utrecht University, 2008-2009.