Excerpt. - Dover Publications

Chapter 10
PROOFS OF IMPOSSIBILITY
AND POSSIBILITY PROOFS
There are proofs of impossibility and possibility proofs, a situation that
sometimes produced a bit of confusion in the debates on hidden-variables
theories and sometimes led to something worse than confusion. So, we
have to clarify it.
Proofs of Impossibility
von Neumann
Brief Beginning of the History
In his famous book, providing an axiomatization of quantum mechanics,
von Neumann [26] introduced in 1932 a proof of impossibility of hidden-variables theories, the famous “von Neumann’s proof.” This was a few
years after the Solvay Congress of 1927 during which de Broglie’s pilot
wave was defeated. Although de Broglie had already renounced his work,
and rejoined the camp of orthodoxy, the publication of the proof certainly
confirmed that he had lost time with his heretical detour. Furthermore, in
the words of Pinch [154], “the proof was accepted and welcomed by the
physics elite.” The continuation of the story shows that the acceptance was
not based on a careful examination of the proof but rather on other elements
such as a psychological pressure exerted by the very high reputation of its
author or the unconscious desire to definitively eliminate an ill-timed and
embarrassing issue. Afterward, for a long time, the rejection of hidden
variables could be made just by invoking von Neumann (the man himself
more than the proof). As stated by Belinfante [1], “the truth, however,
happens to be that for decades nobody spoke up against von Neumann’s
arguments, and that his conclusions were quoted by some as the gospel …
the authority of von Neumann’s over-generalized claim for nearly two
decades stifled any progress in the search for hidden variables theories.”
Belinfante also [1] remarked that “the work of von Neumann (1932) was
mainly concerned with the axiomatization of the mathematical methods
of quantum theory. His side remarks on hidden variables were merely an
unfortunate step away from the main line of reasoning.” This was more
than unfortunate, however; it was erroneous, and even deeply erroneous,
“because of the obviousness of inapplicability of one of von Neumann’s
axioms to any realistic hidden variables theory [1].”
As far as I know, the first person to challenge the validity of von Neumann’s proof was the philosopher Grete Hermann in 1935 [117], three
years after von Neumann’s book, followed nine years after, in 1944, by
another philosopher, Hans Reichenbach [134], according to Pinch [154].
As noted by Bitbol [133], and as can be checked in the original reference,
recently republished with enlightening complementary discussions by
Léna Soler, Hermann correctly identified the weakest point of von Neumann’s proof, namely, the misleading use of an additivity condition of
average values (which are soon to be discussed), pointed out by Bell again
in 1966 [188]. Furthermore, she charged von Neumann’s proof with the
accusation of circularity.
Likely because they were philosophers, Hermann and Reichenbach
seemingly did not have much influence on the community of physicists.
Very often, they are even forgotten or dismissed. For instance, Pipkin [156]
could erroneously write that it was Bell, in 1966, “who first pointed out
the axiom by which von Neumann’s formulation violated the elementary
principles of any realistic hidden variables theory,” a statement strongly
and strangely in agreement with Belinfante [1] when he claimed, five
years before, that “The first to publicly pinpoint the axiom by which von
Neumann’s formulation violated the elementary principles of any realistic
hidden variables theory was Bell (1966).”
As far as I know, the first physicist to challenge von Neumann was
Bohm, twenty years after the proof was proposed, in one of his 1952
papers [30]. The best rebuttal of the early Bohm has obviously been the
fact that he produced a logically consistent hidden-variables theory. This
was a constructivist rebuttal, explicitly showing that what was demonstrated as being impossible was in fact possible. It definitely established
that something was wrong with von Neumann’s proof; but what exactly
was wrong? This is something that was compulsory to establish. The formal answer to this issue by Bohm was rather poorly conceived, but he
correctly identified the most important point, namely, that von Neumann
implicitly restricted himself, in his proof, to an excessively narrow class of
hidden-variables theories. It just so happened that the early Bohm’s pilot
wave did not pertain to this class. Louis de Broglie then also manifested
Hidden Worlds in Quantum Physics
a reluctance against von Neumann’s proof, e.g., in [62, 64]. The fact that
Bohm’s rebuttal was unsatisfactory, and required more clarification, was
also pointed out by Bell [188], who commented, “The analysis of Bohm
seems to lack clarity, or else accuracy ….”
Bell’s Rebuttal of von Neumann
Accepting the idea that von Neumann’s proof has been rebutted, PEDESTRIANS (∼∼∼∼) are allowed to skip this subsubsection.
Bell [188] was able to make clear in a fairly laconic way what was
unclear in Bohm’s attack against von Neumann. In essence, his identification of the reason why von Neumann failed is in agreement with what
Hermann had already understood, some thirty years before. It was, however, the influence of Bell’s paper, about thirty-five years after the proof
was announced to the world, a long Middle Age for hidden variables, that
was the most powerful. The dragon was struck down. At that time there was
a dragon still living and it had to be destroyed; this was not obvious to all
scientists, as we can infer from the following quotation [188]: “The present
paper … is addressed to those who do find the question interesting, and
more particularly to those among them who believe that ‘the question
concerning the existence of such hidden variables received an early and
decisive answer in the form of von Neumann’s proof on the mathematical impossibility of such variables in quantum theory’.” The influence of
Bell’s paper is acknowledged by Bitbol [133] when he stated that the very
sociological turnaround of the community of physicists regarding von Neumann’s theorem nevertheless only took place from 1966 on, the date of the
publication of Bell’s paper …
Bell’s attack pointed out a faulty additivity postulate used by von
Neumann that, when applied to hidden-variables theories, was “one of
the weakest points in the proof,” in the words of Jammer [24]. The essential
assumption used by von Neumann, the one we called above the “additivity
condition on average values,” from now on called the additivity postulate,
is described in the following statement: “Any linear combination of any
two Hermitian operators represents an observable, and the same linear combination of expectation values is the expectation value of the combination.”
In the words of Mermin [197], this was “von Neumann’s silly assumption.”
The sentence quoted above actually contains two statements. The first
one says that any linear combination of any two Hermitian operators represents an observable. This seems obvious, but it actually is not guaranteed
at all. Let us, however, forget this problem as irrelevant for our purpose
and focus on the second statement, the one we properly call the additivity
postulate, saying that the same linear combination of expectation values is
the expectation value of the combination.
Let us consider two Hermitian operators A and B, and let us assume
that the sum of the two Hermitian operators A and B is another Hermitian
operator, defining an observable, which we denote by C. Let us remark
that we have been here a bit more careful than von Neumann. We do not
claim that C is an observable for any couple (A, B). We just consider a
couple such that C is indeed an observable. We then have
$$C = A + B \tag{10.1}$$
The additivity postulate tells us that
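$$\langle C \rangle = \langle A \rangle + \langle B \rangle \tag{10.2}$$
where ⟨X⟩ is the expectation value of X. In quantum mechanics, the expectation value of X for a given wave function ψ can be evaluated according
to the following rule:
$$\langle X \rangle = \int \psi^{*} X \psi \, dr \tag{10.3}$$
where ψ∗ is the complex conjugate of ψ. Then, the additivity postulate is seen to be always true in quantum mechanics, since
$$\langle C \rangle = \int \psi^{*} C \psi \, dr = \int \psi^{*} (A + B) \psi \, dr = \int \psi^{*} A \psi \, dr + \int \psi^{*} B \psi \, dr = \langle A \rangle + \langle B \rangle \tag{10.4}$$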
In particular, it is true regardless of the value of the commutator [A, B];
that is, it does not depend on whether or not the operators A and B commute.
As obvious as it seems, the additivity postulate is not trivial. This can be
exemplified by noting that it does not necessarily apply to eigenvalues.
To see this, let us consider a counterexample taken from Jammer [24].
Let us consider the spin component of an electron in the direction along
the bisector line between the x axis and the y axis. This observable is
represented by the operator
$$S_{45^\circ} = \frac{1}{\sqrt{2}} (\sigma_x + \sigma_y) \tag{10.5}$$
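The outcome of the measurement of S45◦ for the electron, a spin-1/2
particle, may be either h̄/2 or −h̄/2; that is, the eigenvalues are ±1, in units
of h̄/2. This is different from (±1 ± 1)/√2, which is what we would obtain
if we applied the additivity postulate to the right-hand side of Eq. 10.5.
Therefore, indeed, eigenvalues do not combine linearly. Nevertheless, we
still have, as we should,
$$\left\langle \frac{1}{\sqrt{2}} (\sigma_x + \sigma_y) \right\rangle = \langle \pm 1 \rangle = 0 \tag{10.6}$$
$$\frac{1}{\sqrt{2}} \left( \langle \sigma_x \rangle + \langle \sigma_y \rangle \right) = \frac{1}{\sqrt{2}} \left( \langle \pm 1 \rangle + \langle \pm 1 \rangle \right) = 0 \tag{10.7}$$
that is,
$$\langle S_{45^\circ} \rangle = \frac{1}{\sqrt{2}} \left( \langle \sigma_x \rangle + \langle \sigma_y \rangle \right) \tag{10.8}$$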
Thus the additivity postulate (which has to be true for commuting
as well as for noncommuting observables) is satisfied by Eq. 10.8. This
example is particularly interesting, because σx and σy precisely do not
commute. For commuting variables, the additivity postulate may hold for
eigenvalues having well-defined values for common eigenstates. In such
a case, it follows immediately that it holds too for expectation values. In
contrast, we just have exhibited an example, with noncommuting variables,
where it does not hold for eigenvalues, although it holds for expectation
values.
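The contrast between eigenvalues and expectation values in this spin example can be checked numerically. What follows is only a minimal sketch, assuming units of h̄/2 and the standard Pauli matrices; the helper names (combine, eigenvalues, expectation) and the particular state psi are my own illustrative choices, not part of the original argument.

```python
import math

SQRT2 = math.sqrt(2.0)

# Pauli matrices (units of hbar/2), written as nested lists.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]


def combine(a, m1, b, m2):
    """Return the 2x2 matrix a*m1 + b*m2."""
    return [[a * m1[i][j] + b * m2[i][j] for j in range(2)] for i in range(2)]


def eigenvalues(m):
    """Eigenvalues of a 2x2 Hermitian matrix, in ascending order."""
    a = complex(m[0][0]).real
    d = complex(m[1][1]).real
    mean = (a + d) / 2
    half_gap = math.sqrt(((a - d) / 2) ** 2 + abs(m[0][1]) ** 2)
    return (mean - half_gap, mean + half_gap)


def expectation(m, psi):
    """<psi| m |psi> for a normalized two-component state psi."""
    total = 0j
    for i in range(2):
        for j in range(2):
            total += complex(psi[i]).conjugate() * m[i][j] * psi[j]
    return total.real


# Operator of Eq. 10.5.
S45 = combine(1 / SQRT2, SIGMA_X, 1 / SQRT2, SIGMA_Y)

# Actual eigenvalues of S45 versus the naive sums (+-1 +- 1)/sqrt(2).
s45_eigs = eigenvalues(S45)
naive_sums = sorted({(a + b) / SQRT2 for a in (-1, 1) for b in (-1, 1)})

# Additivity of expectation values in one arbitrary normalized state
# (the state is an assumption made for illustration only).
psi = [0.6, 0.8j]
lhs = expectation(S45, psi)
rhs = (expectation(SIGMA_X, psi) + expectation(SIGMA_Y, psi)) / SQRT2

print("eigenvalues of S45:", s45_eigs)
print("naive eigenvalue sums:", naive_sums)
print("<S45> =", lhs, " (<sx> + <sy>)/sqrt(2) =", rhs)
```

The eigenvalues of the combined operator are not among the naive sums, whereas the two expectation values agree, as linearity of the operator requires.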
As stated by Bell [188],
A measurement of a sum of noncommuting observables cannot
be made by combining trivially the results of separate observation on the two terms—it requires a quite distinct experiment. For example, the measurement of σx for a magnetic
particle might be made with a suitably oriented Stern–Gerlach
magnet. The measurement of σy would require a different
orientation, and of (σx + σy ) a third and different orientation. But this explanation of the nonadditivity of allowed
values also established the nontriviality of the additivity of
expectation values. The latter is a quite peculiar property
of quantum mechanical states, not to be expected a priori.
There is no reason to demand it individually of the hypothetical dispersion-free states, whose function is to reproduce the
measurable peculiarities of quantum mechanics when averaged over.
Another example, similar to the one above (picked up from Jammer),
but with a different flavor, is available from Bohm and Hiley [68]. Let us
explain it because of its complementary interest. Consider a particle whose
observables (or operators) for the components of the orbital angular
momentum are denoted as Lx , Ly , and Lz , in the usual notation. We also
restrict ourselves to the case in which the eigenvalues are h̄, 0, and −h̄. For
A and B of Eq. 10.1, let us take A = Lx/√2 and B = Ly/√2. Let now
L45◦ be the operator associated with a measurement of the orbital angular
momentum component along the bisector line between the x axis and the
y axis. We have
$$L_{45^\circ} = \frac{1}{\sqrt{2}} (L_x + L_y) \tag{10.9}$$
so that L45◦ is the operator C of Eq. 10.1. The additivity postulate, which is
valid for expectation values in quantum mechanics, tells us that we indeed
may write
$$\langle L_{45^\circ} \rangle = \frac{1}{\sqrt{2}} \left( \langle L_x \rangle + \langle L_y \rangle \right) \tag{10.10}$$
Now, let us assume the existence of dispersion-free subensembles in
which the observables Lx , Ly , and L45◦ have well-defined values V (Lx ),
V (Ly ), and V (L45◦ ), respectively. If these values satisfy the additivity
postulate, we should have
$$V(L_{45^\circ}) = \frac{1}{\sqrt{2}} \left[ V(L_x) + V(L_y) \right] \tag{10.11}$$
Now, let us assume for instance that V (L45◦ ) = h̄, one of the allowed
values indeed. Next, we have V (Lx ) = h̄, 0, −h̄ and also V (Ly ) =
h̄, 0, −h̄. We can then see that there is no way to satisfy Eq. 10.11. Hence,
there are no dispersion-free states, and hidden variables do not exist. This
is substantially von Neumann’s proof specified for a convenient example.
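The counting argument behind this example is small enough to be enumerated exhaustively. The following is a sketch under the assumption that values are expressed in units of h̄, so that the allowed values are −1, 0, and 1; the function name additive_assignments is hypothetical and introduced only for this illustration.

```python
import itertools
import math

# Allowed values of Lx, Ly and L45 in units of hbar.
ALLOWED = (-1, 0, 1)


def additive_assignments(target):
    """All pairs (V(Lx), V(Ly)) for which Eq. 10.11 gives V(L45) == target."""
    return [
        (vx, vy)
        for vx, vy in itertools.product(ALLOWED, repeat=2)
        if math.isclose((vx + vy) / math.sqrt(2), target, abs_tol=1e-12)
    ]


# For each allowed value of V(L45), list the value pairs compatible with it.
solutions = {v45: additive_assignments(v45) for v45 in ALLOWED}

for v45, pairs in solutions.items():
    print("V(L45) =", v45, "->", pairs)
```

The lists for V(L45) = ±1 come out empty, which is the impossibility invoked in the text; only V(L45) = 0 admits assignments compatible with Eq. 10.11.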
The rebuttal, again specified for this example, goes as follows. First,
in the words of Bohm, “As is well known (and as von Neumann agrees),
there is really no meaning to combine the results of noncommuting operators such as Lx , Ly , and L45◦ . These measurements are incompatible and
mutually exclusive.” Nevertheless, Eq. 10.10 for expectation values is still
true, owing to the validity of the additivity postulate in quantum mechanics, even if it requires the use of three separate mutually exclusive series
of experiments. Now, from the fact that Eq. 10.11 cannot be satisfied for
dispersion-free sets, we should not conclude that dispersion-free sets do not
exist. Rather, we have to conclude that the additivity postulate does
not apply to such sets. Indeed, the true values V are not of a quantum
nature. The additivity postulate is a peculiarity of quantum mechanics that
need not be satisfied by the true values of hidden-variables theories.
From the previous arguments, the mistake of von Neumann should
now be clear: He unduly extended an additivity postulate from quantum
mechanics, where it is valid but nontrivial, to the realm of hidden variables
where it is trivially nonvalid. The same issue of an undue extension is also
put forward by Mugur-Schächter, although in the context of an argumentation that is more logical than physical, and sheds complementary light to
illuminate the landscape.
Mugur-Schächter’s Rebuttal of von Neumann
I believe that PEDESTRIANS (∼∼∼∼) are allowed to skip this subsubsection too, just accepting the fact that Mugur-Schächter provided a rebuttal
to von Neumann’s proof. They will find a very short comment, worth
visiting, at the end of it.
Two years before Bell’s rebuttal in 1966 [188], another rebuttal had
already been provided in 1964 by Mugur-Schächter in her thesis [291],
more precisely in the first part of her thesis. This work has been less influential than Bell’s, but I must confess, without any disrespect concerning
Bell, that I found it particularly impressive and convincing. What Mugur-Schächter did is to analyze the logical structure of von Neumann’s proof
to demonstrate that the logic used is inconsistent. We may say that the
deep physical content of Mugur-Schächter’s rebuttal is similar to Bell’s,
but it provides a quite different, but complementary, point of view by
insisting on the flawed logical organization of von Neumann’s proof. The
thesis, interestingly enough, is prefaced by De Broglie, who stated that now
for several years there were already doubts rising against von Neumann’s
proof and that he already came to the conviction that von Neumann’s reasoning was misleading and circular. The circularity of von Neumann’s
proof lies in the fact that he implicitly introduced in his premises the
result that he intended to demonstrate. De Broglie also stated that Mugur-Schächter’s work achieved a genuine logical dissection of von Neumann’s
proof and rigorously demonstrated its fallacious character. (For Jammer,
however [24], the charge of circularity is not justified.)
I am now going to present the argument of Mugur-Schächter. However,
I shall simplify it a bit without, I hope, destroying the gist of it. Also, I shall
do a bit of rewording to better match the terminology used today. Mugur-Schächter starts with a logical analysis of von Neumann’s proof, which
can be summarized in a few steps.
To set the stage, let us consider a wave function ψ and N realizations of ψ, i.e., N copies of ψ named ψ1, ψ2, . . . , ψN (in other words, we
consider a statistical set {ψ1, ψ2, . . . , ψN}). Let us also consider an observable R (any observable) that we can measure on ψ. From the copies, we
may then evaluate the expectation value ⟨R⟩ of R, which actually can also
be quantum mechanically evaluated by using Eq. 10.3, or its more general
formulation expressed in Dirac notation as
$$\langle R \rangle = \langle \psi | R | \psi \rangle \tag{10.12}$$
We say that the statistical set is dispersion free if, for any R, we have
$$\langle R^2 \rangle = \langle R \rangle^2 \tag{10.13}$$
Now, the steps to be considered are as follows.
(1) In quantum mechanics, there is no dispersion-free set (as is clear if we
just think of the Heisenberg uncertainty relations). This is, in particular,
true for pure states when, and only when, the density operator is a
projector [43].
(2) Assume that we possess hidden variables associated with the behavior
of quantum objects. Then, a pure state set can be decomposed into
dispersion-free subsets by sorting them according to the values of the
hidden variables.
(3) However, by (1), there is no dispersion-free set, and hence we have a
contradiction, which establishes that hidden variables do not exist. As
a corollary, quantum mechanics is intrinsically indeterministic.
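Step (1) can be made concrete with a spin-1/2 pure state. The sketch below, in units of h̄/2, computes the dispersion ⟨R²⟩ − ⟨R⟩² of Eq. 10.13 for two observables in the same state; the choice of state (spin up along z) and the helper names are assumptions made for illustration only.

```python
# Pauli matrices (units of hbar/2), written as nested lists.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def expectation(m, psi):
    """<psi| m |psi> for a normalized two-component state psi."""
    total = 0j
    for i in range(2):
        for j in range(2):
            total += complex(psi[i]).conjugate() * m[i][j] * psi[j]
    return total.real


def dispersion(m, psi):
    """<R^2> - <R>^2 for the observable m in the state psi."""
    return expectation(matmul(m, m), psi) - expectation(m, psi) ** 2


# Pure state: spin up along z.
spin_up_z = [1, 0]

disp_z = dispersion(SIGMA_Z, spin_up_z)
disp_x = dispersion(SIGMA_X, spin_up_z)

print("dispersion of sigma_z:", disp_z)
print("dispersion of sigma_x:", disp_x)
```

The state is sharp for σz but not for σx, so it fails the "for any R" requirement of a dispersion-free set.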
The rebuttal may proceed as follows. Surely, items (1) and (2) are
correct. The point is that item (2) does not apply to quantum mechanics.
Indeed, hidden variables are not necessarily of a quantum nature; i.e., they
are not necessarily compatible with the postulates of quantum mechanics. In particular, they are not necessarily distributed according to the
probability rules of quantum mechanics. For instance, in the pilot wave
theory hidden variables are of a classical kind, insofar as they are associated with classical (or pseudoclassical) deterministic trajectories whose
existence is rejected in the framework of quantum mechanics. They are
therefore not distributed according to Born’s postulate, which applies to
the guiding wave ψ, or, say, to the observed values of positions, and not
to the hidden trajectories, or, say, to the objective values of positions.
We may also say that the hidden variables, being of a classical kind,
are defined outside of the logical framework of quantum mechanics in
which measured values of observables are of a quantum type. In other
words, the theory T_HV consisting of quantum mechanical theory T_QM and
its completion with hidden variables is larger than quantum mechanics
(T_HV > T_QM). This is a convenient way to say that T_HV does not operate
inside T_QM or that it is not internal to T_QM. Therefore, item (1) pertains
to T_QM and item (2) to T_HV. Then, item (3) is faulty because it unduly
transports an element of a smaller theory into the structure of a larger
theory.
PEDESTRIANS (∼∼∼∼), start here.
Borrowing a question from the title of a paper from Pinch [154], we then
may ask, “What does a proof do if it does not prove?” Well, the answer
might be that von Neumann’s proof indeed did not prove what it was
aiming to prove, but yet it is proving something. It is proving that if hidden
variables do exist, they cannot be of a quantum type, because, if they were,
then von Neumann’s proof would apply to them. This restricts the class
of admissible hidden-variables theories, but it does not rule them out. The
restriction of the class of admissible hidden-variables theories, as we shall
see, will require other conditions: They will have to be contextualist and
nonlocal, as is the case for the early Bohm’s pilot wave.
Mugur-Schächter went on discussing other proofs against the proof,
from Bohm, Weizel, Fenyes, Bocchieri and Loinger, and de Broglie,
pointing out their inadequacies. I can, without any loss of completeness,
leave them out. Complementary discussions on von Neumann’s proof are
available from de Broglie [62, 63, 65], Pauli [164], Bohm [60], Bohm and
Bub [297], Bohm and Hiley [68], Siegel [303], Ballentine [304], Belinfante [1],
Jammer [24], Flato et al. [305], Pinch [154], Pipkin [156], Bell [142], Wheeler
and Zurek [100], Squires [13], Mermin [197], Bitbol [133], and d’Espagnat
[84]. This rather significant list of references is certainly a testimony to
the fame of von Neumann’s impossibility proof for the debate on hidden
variables.
von Neumann Still Worrying about Hidden Variables
Once things are published, they are published. Some authors may feel,
sooner or later, that they did not present things in the best possible way.
They may find that a particular sentence is not properly phrased or that a
certain demonstration could have been made shorter and more elegant,
or they may even discover an error—a real nightmare for theoretical
physicists who crave perfection. However, we must accept the fact that
everything produced in theoretical physics will ultimately be shown to be
wrong, in a long process of mistakes being pinpointed and corrected. If
there is only one truth, it is a matter of elementary computations in probability theory to demonstrate that the probability of having reached the
truth is zero. This paragraph applies in particular to von Neumann, who
did publish an erroneous proof.
As far as I know, we do not possess any information concerning what
von Neumann eventually thought of von Neumann’s proof. It seems well
established, however, that he went on worrying about hidden-variables
theories. In any case, according to Wigner [74, 306], von Neumann possessed another (although unpublished) objection against hidden variables.
To explain this objection, let us consider a Stern–Gerlach experiment, or,
more precisely, an indefinite number of repetitions of Stern–Gerlach experiments. We begin, for instance, with a measurement of a spin component
along the z direction, followed by a measurement of a spin component in
the x direction, followed by still another measurement of a spin component along the z direction, followed again by another measurement of a
spin component along the x direction, and so on. Let us now assume again,
for instance, that we are dealing with spin-1/2 particles.
The first measurement is used to prepare the system in a pure state,
say, spin up in the z direction. Once this is done, the outcomes of the
second measurement can be spin forward or spin backward along the x
direction, both with probability 1/2. If these values are deterministically
determined by hidden variables, the result obtained (say, + or − along
the axis 0x) provides restrictive information on the values of the hidden
variables. The next measurement, now with probability 1/2 for both spin
up or spin down (along the z direction), will further restrict the values
of the hidden variables. Eventually, after N measurements (where N is
very large but finite), the hidden variables will become restricted to a very
narrow range. Yet, this narrow range should still be able to completely
determine all further measurements, after the N first measurements, even
if the number of further measurements is assumed to be infinite. (We will
not worry about the fact that it is in practice impossible to carry out an
infinite number of measurements.) Is this possible? Well, for von Neumann
it seemed unreasonable.
According to Wigner [306], it seems that von Neumann was thinking
in terms of a variety of hidden variables to determine the outcomes of the
measurements, not in terms of a single variable. Then von Neumann would
have pointed “to the unreasonable large variety of hidden variables which
must be assumed if one wishes to account for the postulate (implicit in
quantum mechanical theory) that no matter how many successive measurements we undertake of a system, the distribution of the hidden variables
remains sufficiently unsharp so that the outcomes of measurements are as
unpredictable as they were to begin with.”
It is difficult to know why von Neumann did not publish his proof.
A likely possibility is that, although the proof seems intuitively convincing, von Neumann did not succeed in giving it a satisfactory mathematical
shape. Also, we may advance as a guess that when trying to make it mathematically safe, he found that the proof was basically flawed. I am daring
to propose a rebuttal that maybe von Neumann eventually discovered.
Let us begin by assuming that each hidden variable in the variety of hidden variables is continuous. Then, we may invoke the Cantor–Bernstein
theorem, telling us that R^n is equipotent to R^m, whatever the values of
n and m, which are positive integers [307]. Let n denote the number of
hidden variables in the von Neumann variety of hidden variables, and let
m be equal to 1. Then, the Cantor–Bernstein theorem implies that we may
reduce the von Neumann variety to a single hidden variable ranging over
R. This reduction is not strictly necessary for the rebuttal, but it is useful
to support the intuition. Now the cardinal of R is infinity (not a countable infinity, however, but the infinity of the power of the continuum).
Therefore, there is plenty of room in R to deal with an infinite number of
successive measurements, and von Neumann’s lingering worry is rebutted.
The full real line is not even necessary to develop the argument. For
instance, let us assume that the single hidden variable does not spread
over R but is instead distributed on the open segment ]0, 1[. After N successive measurements, the hidden variable may have been restricted to a
number of open segments located inside the original unit open segment,
with children segments narrowing more and more when the number N of
measurements increases. However, it is a property of the power of the continuum that, whatever N, whatever the number of children segments, and
whatever the narrowness of these segments, the cardinal of the set formed
by all children open segments remains strictly equal to the cardinal of the
original unit open segment.
As another possibility, let us assume that each hidden variable in the
variety of von Neumann may take an infinite countable number of values. The argument used above for continuous variables may be, mutatis
mutandis, repeated for this new case by relying on the equipotence between
N^n and N^m. Finally, the rebuttal may be made complete, in a similar way,
if we assume that the variety of von Neumann consists of a mixture of
continuous and countable hidden variables. I believe that von Neumann,
although he had a new objection in mind, soon realized that it could not
be mathematically defended.
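For the countable case, the equipotence can be exhibited explicitly. The sketch below uses the Cantor pairing function, a standard bijection between N × N and N (with N taken here to include 0, which is an assumption of this illustration); iterating it reduces any finite number of countable hidden variables to a single one. It is offered as an illustration of the equipotence, not as a reconstruction of any argument von Neumann may have had in mind.

```python
import math


def pair(x, y):
    """Cantor pairing: a bijection from N x N onto N."""
    return (x + y) * (x + y + 1) // 2 + y


def unpair(z):
    """Inverse of the Cantor pairing function."""
    # Index w of the diagonal x + y = w on which z lies.
    w = (math.isqrt(8 * z + 1) - 1) // 2
    y = z - w * (w + 1) // 2
    return (w - y, y)


def encode(values):
    """Fold a non-empty tuple of natural numbers into a single natural number."""
    code = values[0]
    for v in values[1:]:
        code = pair(code, v)
    return code


def decode(code, n):
    """Recover the n natural numbers folded by encode."""
    values = []
    for _ in range(n - 1):
        code, v = unpair(code)
        values.append(v)
    values.append(code)
    return tuple(reversed(values))


# Three countable "hidden variables" packed into one, then unpacked.
triple = (3, 0, 7)
single = encode(triple)
print(single, decode(single, 3))
```

No information is lost in the reduction: decode recovers the original tuple exactly from the single number.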
Early Other Proofs of Impossibility
There are also other early proofs of impossibility that were never as famous
or disputed as von Neumann’s proof. One year after von Neumann’s
proof appeared in the literature, Solomon [308] also examined the possibility of hidden-variables theories and concluded that any deterministic
hidden-variables theory is incompatible with the intrinsic indeterminacy
of quantum mechanics. Another demonstration of the fact that quantum
mechanics is intrinsically indeterministic was given by Destouches-Fevrier
in 1945 [309, 310], who reached the conclusion that it is no longer possible
to return to determinism in the microphysics realm. One year after Bohm’s
1952 publication of his early papers on the pilot wave, Destouches [202]
took as guaranteed the results of von Neumann and Solomon, gave a rebuttal to a recent objection from de Broglie [152], and concluded that quantum
mechanics is essentially indeterministic. He affirmed that any future theory that would replace the present quantum theories would have to be
indeterministic too.
I am not aware of any attempt to refute these works, a kind of indifference that might be easy to explain. First, these new proofs of impossibility,
dated 1933, 1945, or 1953, when von Neumann was still largely dominating
the stage, were likely found to be neither interesting nor worthwhile enough
to be deeply examined. Second, later on and today, it may be simply sufficient to state that we possess a convincing rebuttal of von Neumann’s proof
of impossibility and, more importantly, that we possess a consistent example of deterministic hidden-variables theories, namely, the early Bohm’s
pilot wave theory. Therefore, without any doubt, these other early proofs
of impossibility must be flawed somewhere. I shall be content in this book
with this expedient point of view, although a more careful examination of
these early other proofs, to identify where the shoe pinches, might deserve
a bit of effort (if we had plenty of time to do it…).
Late Other Proofs of Impossibility
Even after Bohm’s pilot wave of 1952, and after the criticisms already
addressed against von Neumann, other proofs of impossibility have
been published. In presenting these new proofs, some authors considered themselves in the filiation of von Neumann and were aiming to
improve an imperfect proof. After all, maybe von Neumann’s demonstration was wrong, but this does not mean that the conclusion was wrong
too. It could be right, and an improved demonstration could be able to
establish this point correctly. The persistence of such an effort to get rid
of hidden variables might seem welcome, particularly to those
who were not aware of the existence of Bohm’s pilot wave theory—or
who simply ignored it. Indeed, if these damned hidden variables do not
exist, we had better be able to prove it and to eliminate this long-lasting and
irritating issue. However, Bohm was performing on the center stage and,
simply, could not be ignored. The effort to find new proofs of impossibility,
which, given Bohm’s influence, was doomed to failure, seemed weird to
Bell. For instance, he would ask [142]: “… extraordinarily, why did people
go on producing impossibility proofs, after 1952 …?” and also speak of
“the strange story of the von Neumann impossibility proof and of the even
stranger story of later impossibility proofs.”
The topic of late other proofs of impossibility is not an easy one.
It does not make for easy reading or easy reporting, particularly when
one needs to invoke logical-mathematical approaches and the propositional calculus of quantum mechanics, and when one has to use swear words
such as lattice of propositions or orthocomplemented lattice or even swear
words that are more familiar to mathematicians than to physicists. Furthermore, as we shall illustrate, the literature is often confusing, polemical,
and contradictory. The reader wanting to quickly sample a flavor of the
topic may refer to a collection of papers on the logico-algebraic approach to
quantum mechanics, edited by Hooker [136]. The second paper, by Strauss
[311], leads to the result that “all attempts at interpreting the quantum
mechanical formalism in terms of classical probabilities … are doomed
to failure,” a way to dismiss hidden variables. Conversely, in a subsequent
paper of the same collection, Kamber [312] concluded that “the existence
of hidden parameters … cannot … be excluded … mathematically.” These
two quotations illustrate the contradictions that we may have to face.
The story goes on in the same style. An early work among the late
works is by Gleason [313] on the basis of which Jauch and Piron [314]
thought that they had dismissed the possibility of hidden variables. The
works of Gleason, and of Jauch and Piron, are discussed in a simplified
manner by Belinfante [1]. Also, other comments on these works are available from Shimony [93]. However, Belinfante, considering that Gleason’s
proof was rather abstract, preferred to derive Gleason’s result from a later,
more understandable work by Kochen and Specker, to which we shall
return soon. Bohm and Bub [315] provided a rebuttal to Jauch and Piron,
which was rebutted by Jauch and Piron, followed by a rebuttal of the rebuttal of the rebuttal by Bohm and Bub again [316]. The refutation of Jauch
and Piron by Bohm and Bub is also discussed by Gudder [317]. In 1966,
Bell, in the same paper in which he pointed out the faulty axiom in von
Neumann’s proof [188], also criticized the proofs of Gleason and of
Jauch and Piron, but made clear, relying on Gleason’s work, that one class of
hidden-variables theories, which may be called noncontextual theories (and which
we shall discuss later), is inconsistent, more specifically inconsistent for
quantum systems whose Hilbert spaces are more than two dimensional
(i.e., of dimension three or greater). In his paper, Bell took the opportunity to
provide a new proof of Gleason’s result, which is independent of the scheme
used by Gleason. This is similar to the result that was soon to be obtained by
Kochen and Specker [28]. These impossibility proofs are actually “no-go”
theorems (i.e., they tell us not to go toward certain forbidden classes of hidden variables). In
the words of Mermin [197], the results obtained, “by demonstrating that a
hidden-variable program necessarily requires outcomes for certain experiments that disagree with the data predicted by the quantum theory, are
called no-hidden-variables theorems (or, vulgarly, ‘no-go theorems’).”
It should now be clear to the reader that the so-called proofs of impossibility cannot prove the impossibility of something that, with Bohm, has
been constructively shown to be possible. What they can, however, do when
properly examined is to exclude some classes of hidden-variables theories.
Focusing on this point of view, Bell examined which classes of hidden-variables theories are actually ruled out by these proofs. He emphasized
that these classes form a rather small subset of all possible hidden-variables
theories. Jauch and Piron’s proof, like von Neumann’s proof, required the
additivity of expectation values of noncommuting variables, which, as
we have seen, is valid in quantum mechanics but need not hold for
hidden variables. Gleason considered theories in which the outcome of a
measurement is independent of which compatible observables were simultaneously measured. Such theories are called noncontextual theories. Here
again, however, this was an undue restriction. Hidden-variables theories
need not be noncontextual theories. On the contrary, they have to be contextual. The rather subtle issue of contextuality will be examined and discussed
in the next chapter. As stated by Shimony [93], “Bell (1966) noted however
that Gleason’s theorem does not preclude a more complex type of hidden-variables theory, called ‘contextual’ according to which a complete state
assigns a definite truth value to a proposition only relative to a specific context.” Bohm and Bub [315] also contradicted Jauch and Piron, saying that
they actually prove nothing at all and that the argument of Jauch and Piron
is circular, an accusation already addressed previously to von Neumann’s
proof. In a subsequent paper, Jauch and Piron [318] provided a rebuttal to
this rebuttal, something like turning down the objection point blank, being
content with a “restatement of our result in a nontechnical language.” See
also the book by Jauch [319] in which Chapter 7 is dedicated to hidden
variables. In this chapter, Jauch states a theorem according to which
the existence of hidden variables (of a certain kind) is in contradiction
with empirical facts. This claim is mild enough, owing to the cautious expression “of a certain kind.” Indeed, what occurs most generally for proofs of
impossibility is that they refer to certain definitions or premises that imply
the impossibility of hidden variables when these definitions or premises
are accepted. However, the use of other definitions or premises might be
allowed, so that proofs of impossibility only refer to a certain kind of
hidden variables.
A second somewhat parallel line of exposition starts with Kochen and
Specker [28], who give a proof of the nonexistence of hidden variables.
Actually, once again, they did not prove what they claimed to prove, but
they proved that hidden-variables theories must be contextual, a result
already obtained ten years before by Gleason, although by different means.
As stated by Freedman and Holt [300], they “considered only theories in
which the outcome of a measurement was independent of which compatible observables were simultaneously measured,” that is, noncontextual
theories. However, it is acknowledged that the proof of Kochen and
Specker is complicated (and, indeed, it is). For an easier path, the interested
reader would do better to turn to Belinfante [1], who provided a simple-enough
discussion concerning contextuality. In plain terms, Belinfante remarked
that the argument used by Kochen and Specker (and also by Jauch and
Piron) simply denies that the way in which a measurement is made could
have an effect on the result of the measurement; this is a very strong hypothesis indeed! A conclusion may be borrowed from Jammer [24]: “Kochen
and Specker have proved once again that noncontextual hidden variables
do not exist in quantum mechanics.”
For another line of exposition, let us consider a work from Gudder
[320]. Relying on previous works, in particular on Jauch and Piron, Gudder
examined proofs of impossibility in the framework of formal logic, using
the concept of an orthocomplemented partially ordered set that is complete with
respect to compatible elements. Although the content makes for frightening
reading, the conclusion can fortunately be easily delivered as follows:
A quantum system admits hidden variables if and only if it acts physically
as an ideal classical system. Hence, because it is well known that quantum
systems do not behave like classical systems (e.g., consider Heisenberg’s
uncertainty principle), hidden variables must be excluded from quantum
mechanics. We identify here again something that looks like a circular argument, as in
von Neumann’s proof. A heuristic rebuttal can be constructed by referring
to the example of Bohm’s pilot wave. In this pilot wave, there are two
levels: the higher level of quantum mechanics and the lower level of hidden
variables. The higher level, the one of quantum mechanics, being not of
a classical nature, indeed does not receive hidden variables. The lower
level is the one of the hidden trajectories. What Bohm succeeded in doing is
connecting the two levels, each of which can, in a sense, be considered
in its own right. At least this is the result of my effort to dismiss Gudder
in a simple and I hope essentially correct way, avoiding any technicality.
A bit later on, Gudder [321] himself remarked, taking into account the fact
that Bohm’s theory had never been shown to be inconsistent, that “clearly,
there is something wrong here. One obviously cannot have a HV [hidden-variables] theory if it is impossible.” Going on further, Gudder eventually
demonstrated that hidden variables are always possible.
Bub [322], facing such a confusion, took the expedient but rather arbitrary point of view that “the term hidden-variables theory is justifiably
used to denote the kind of theories rejected by von Neumann, Jauch and
Piron, Kochen and Specker, [but] it is suggested that the term should not
be used as a label for the theories considered by Bohm and other workers
in the field. Such theories should be regarded as fundamentally compatible with the original Copenhagen interpretation of the quantum theory, as
expressed by Bohr,” adding, “The conclusion of this paper will therefore
be that there are no hidden variables theories of quantum phenomena
in the usual sense, that the term hidden variables theory for the kind of
theory considered by Bohm and his collaborators is unfortunate and misleading, and that this latter approach might well be characterized as an
extension of Bohr’s conception of wholeness as opposed to von Neumann
philosophy.” This is a kind of normative attitude introducing a dichotomy
between what deserves and what does not deserve to be called hidden
variables, if not a kind of propaganda to shield Bohm from attacks from
hidden-variables opponents. In any case, even if the concept of wholeness
is somehow common to Bohr and Bohm, it is extraordinarily difficult to
see Bohm as a continuator of Bohr.
In a clever and amusing manner, Greechie and Gudder [137] considered
two hidden variable proofs in the quantum logic framework: one a proof
that they do not exist and one a proof that they do. This is indeed reminiscent of a Jesuitical exercise in rhetoric, in which the same man can
prove that God does not exist and, just after, with the same cleverness in
persuasion, that He does exist.
At this point, is the reader confused by a literature that is indeed
confusing? There are, however, a small number of simple key points that
can provide some insight: (1) Proofs of impossibility can never prove
impossibility. (2) They can only prove that some classes of hidden variables are impossible. (3) Among the set of impossible hidden-variables
theories, there are those theories that are noncontextual. This last item
needs clarification, which will require a chapter of its own, namely the next one.
Further details are available in Jauch [319], Belinfante [1], Jammer [24],
Freedman and Holt [300], Flato et al. [323], Gréa [324], Pinch [154], Fine and
Teller [325], Peres [326], and Bohm and Hiley [68].
Possibility Proofs
Once again, the easiest possibility proof is simply the very existence of
Bohm’s pilot wave, a constructivist proof in which an edifice is constructed
so that we can simply say, “Just look at it.” It is logically consistent, and
this consistency has never been contradicted. Once we understand Bohm’s
theory, and understand that it is logically consistent, there is really no need
to further investigate the issue. If we just want to know whether or not
hidden-variables theories are possible, the issue is indeed closed. However,
we are going to dig a bit further with formal approaches, to provide a still
better understanding. Actually, we have many other things to learn.
PEDESTRIANS (∼∼∼∼) may skip the next subsection, being content with the conclusive last lines, but they should be able to deal with the
next one after it.
A Simple Particular Formal Approach
In this subsection, we are going to provide a simple example due to Bell
[188], an example that is also discussed by Flato et al. [323]. The reader might
find that this example is too particular to be convincing. In the words of
Flato et al., it may be too simple. However, even if it is too simple, it is
enlightening and sufficiently illustrating to support later discussions.
Following Bell, we consider a system with a two-dimensional state
space, to be specific, a particle of spin 1/2 without any translational motion,
or, more generally, a two-dimensional system that can be reduced to a
spin-1/2 system (see, e.g., several examples in [43]). In such a case, any
quantum mechanical state may be represented by a spinor ψ, which is a
two-component wave function. Any observable R in the two-dimensional
state space may be represented by a 2 × 2 Hermitian matrix,
which can always be decomposed into the form
\[
R = \alpha I + \boldsymbol{\beta} \cdot \boldsymbol{\sigma} \tag{10.14}
\]
where I is the unit matrix, the “supervector” σ has as its components the Pauli
matrices σx, σy, and σz, α is a real constant, and β is
a real vector (as Bell indeed takes them to be). Explicitly, Eq. 10.14
can be rewritten as
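\[
R = \alpha \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
  + \beta_x \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
  + \beta_y \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}
  + \beta_z \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{10.15}
\]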
Let us consider the rather simple case in which βx = βy = 0. Then
\[
R = \alpha \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
  + \beta_z \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{10.16}
\]
from which we see that the eigenvalues are
\[
E_v = \alpha \pm |\beta_z| \tag{10.17}
\]
Physics should not depend on the orientation of the vector β. Therefore, in
the most general case of Eq. 10.15, eigenvalues should be
\[
E_v = \alpha \pm |\boldsymbol{\beta}| \tag{10.18}
\]
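The orientation argument can also be checked directly: because the Pauli matrices anticommute and each squares to the unit matrix, (β · σ)² = |β|² I, so that β · σ has eigenvalues ±|β|. The following is a minimal numerical sketch of that check in Python with NumPy. The helper name `observable` and the numerical values of α and β are illustrative assumptions, not taken from Bell's paper.

```python
import numpy as np

# Pauli matrices and the 2x2 unit matrix.
I2 = np.eye(2, dtype=complex)
SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)


def observable(alpha, beta):
    """Return R = alpha*I + beta . sigma (Eq. 10.14) for real alpha and beta."""
    bx, by, bz = beta
    return alpha * I2 + bx * SIGMA_X + by * SIGMA_Y + bz * SIGMA_Z


# Arbitrary illustrative values (hypothetical, chosen so that |beta| = 3).
alpha = 0.3
beta = np.array([1.0, -2.0, 2.0])

R = observable(alpha, beta)
eigenvalues = np.linalg.eigvalsh(R)              # real, in ascending order
expected = np.array([alpha - 3.0, alpha + 3.0])  # alpha -/+ |beta|, Eq. 10.18

# (beta . sigma)^2 should be |beta|^2 times the unit matrix.
b_sigma = R - alpha * I2
square_ok = np.allclose(b_sigma @ b_sigma, 9.0 * I2)

print(eigenvalues, square_ok)
```

Any other real α and β may be substituted; only the length of β enters the eigenvalues, as Eq. 10.18 states.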
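According to the basic rules of quantum mechanics, the associated expectation values are
\[
\langle \alpha I + \boldsymbol{\beta} \cdot \boldsymbol{\sigma} \rangle
  = \langle \psi, (\alpha I + \boldsymbol{\beta} \cdot \boldsymbol{\sigma})\, \psi \rangle \tag{10.19}
\]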
Let us now complement quantum mechanics with a real hidden variable, denoted λ, pertaining to the interval [−1/2, 1/2]. A microstate is
then defined by the couple (ψ, λ) in which ψ is the spinor. By a rotation
of coordinates, the spinor can always be given the simple form
\[
\psi = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{10.20}
\]
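For such a spinor, by using Eq. 10.19, quantum mechanics computation
rules allow one to find
\[
\langle \alpha I + \boldsymbol{\beta} \cdot \boldsymbol{\sigma} \rangle
  = \begin{pmatrix} 1 & 0 \end{pmatrix} (\alpha I + \boldsymbol{\beta} \cdot \boldsymbol{\sigma}) \begin{pmatrix} 1 \\ 0 \end{pmatrix}
  = \alpha + \beta_z \tag{10.21}
\]
Indeed,
\[
\begin{aligned}
&\begin{pmatrix} 1 & 0 \end{pmatrix}
\left[ \alpha \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
  + \beta_x \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
  + \beta_y \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}
  + \beta_z \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \right]
\begin{pmatrix} 1 \\ 0 \end{pmatrix} \\
&\quad = \begin{pmatrix} 1 & 0 \end{pmatrix}
\left[ \alpha \begin{pmatrix} 1 \\ 0 \end{pmatrix}
  + \beta_x \begin{pmatrix} 0 \\ 1 \end{pmatrix}
  + \beta_y \begin{pmatrix} 0 \\ i \end{pmatrix}
  + \beta_z \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right]
  = \alpha + \beta_z
\end{aligned}
\tag{10.22}
\]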
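Now, let us set
\[
\left.
\begin{aligned}
&(1)\quad X = \beta_z && \text{if } \beta_z \neq 0 \\
&(2)\quad X = \beta_x && \text{if } \beta_z = 0,\ \beta_x \neq 0 \\
&(3)\quad X = \beta_y && \text{if } \beta_z = 0,\ \beta_x = 0
\end{aligned}
\right\}
\tag{10.23}
\]
and also
\[
\operatorname{sign} X =
\begin{cases}
+1 & \text{if } X \geq 0 \\
-1 & \text{if } X < 0
\end{cases}
\tag{10.24}
\]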
Now, let us introduce microstate eigenvalues denoted as E(ψ,λ)
defined by
\[
E_{(\psi,\lambda)} = \alpha + |\boldsymbol{\beta}| \,
  \operatorname{sign}\!\left( \lambda |\boldsymbol{\beta}| + \tfrac{1}{2} |\beta_z| \right)
  \operatorname{sign} X \tag{10.25}
\]
where the vector β is still given by (βx, βy, βz), but in the new coordinate system in which the spinor ψ reduces to the form of Eq. 10.20. This
vector being given, the dispersion-free microstate (ψ, λ) completely determines the microstate eigenvalues E(ψ,λ). If we were able to measure the
microstate, particularly the value of λ, the probability for the associated
microstate eigenvalue would be exactly equal to 1, in contrast with the
probabilities associated with quantum eigenvalues.
The philosophy of hidden-variables theories is that quantum expectation values are obtained by averaging over hidden variables. Using a
uniform averaging over λ, we are then led to the evaluation of an integral
over the microstate eigenvalues, according to
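\[
\langle \alpha I + \boldsymbol{\beta} \cdot \boldsymbol{\sigma} \rangle
  = \int_{-1/2}^{+1/2} \left[ \alpha + |\boldsymbol{\beta}| \,
  \operatorname{sign}\!\left( \lambda |\boldsymbol{\beta}| + \tfrac{1}{2} |\beta_z| \right)
  \operatorname{sign} X \right] d\lambda \tag{10.26}
\]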
After a simple evaluation (starting with case (3) of Eq. 10.23 to check), we
obtain
\[
\langle \alpha I + \boldsymbol{\beta} \cdot \boldsymbol{\sigma} \rangle = \alpha + \beta_z \tag{10.27}
\]
that is, the same result as found in quantum mechanics.
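The evaluation of the integral may be made explicit. The first sign function equals +1 on a fraction 1/2 + |βz|/(2|β|) of the interval of λ and −1 on the remaining fraction, so that its uniform average is |βz|/|β|; the integral is therefore α + |βz| sign X, which equals α + βz in each of the three cases of Eq. 10.23. The following Python sketch (with NumPy) reproduces this numerically. The function names, the midpoint grid, and the numerical values are illustrative assumptions, not part of Bell's paper.

```python
import numpy as np


def sign(x):
    """Sign convention of Eq. 10.24: +1 for x >= 0, -1 for x < 0."""
    return np.where(x >= 0, 1.0, -1.0)


def choose_x(beta):
    """The quantity X of Eq. 10.23."""
    bx, by, bz = beta
    if bz != 0:
        return bz
    if bx != 0:
        return bx
    return by


def microstate_value(alpha, beta, lam):
    """E(psi, lambda) of Eq. 10.25 for the spinor psi = (1, 0)."""
    norm = np.linalg.norm(beta)
    inner = sign(lam * norm + 0.5 * abs(beta[2]))
    return alpha + norm * inner * sign(choose_x(beta))


def hidden_variable_average(alpha, beta, n=200_000):
    """Uniform average over lambda in [-1/2, 1/2] (Eq. 10.26), midpoint rule."""
    lam = (np.arange(n) + 0.5) / n - 0.5
    return float(np.mean(microstate_value(alpha, beta, lam)))


# Hypothetical test values: one example for each case of Eq. 10.23.
alpha = 0.3
for beta in ([1.0, -2.0, 2.0], [1.0, -2.0, -2.0], [2.0, 1.0, 0.0], [0.0, -3.0, 0.0]):
    beta = np.array(beta)
    print(beta, hidden_variable_average(alpha, beta), alpha + beta[2])
```

For each β the two printed numbers should agree up to the discretization error of the grid. Note also that every individual microstate value is one of the two eigenvalues α ± |β| of Eq. 10.18, as a dispersion-free state requires.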
PEDESTRIANS (∼∼∼∼), reconnect here.
We therefore possess a deterministic hidden-variables model, based
on a hidden variable λ, that reproduces quantum mechanical predictions.
There is however an important difference with respect to Bohm’s pilot
wave, namely, the fact that λ does not receive any physical meaning, so
that this hidden-variables model does not allow one to propose any new
interpretation of quantum mechanics. Such hidden variables may be called
dummy hidden variables. (Some people might prefer to call them artificial
hidden variables, and this is actually the terminology we shall use later.)
A Simple General Formal Approach
The reader might worry that the previous model has been developed in a
very special case (spin 1/2) and that perhaps it would be impossible to generalize it. As it is, it might even look a bit ad hoc and contrived. However,
it is a fact that one formal example is sufficient to demonstrate that hidden
variables are possible, simultaneously ruining all pretensions of proofs of
impossibility. We may even have a stronger result: Not only are hidden
variables possible, but they are always possible. Bell [327] stated this result
as follows: “If no restrictions whatever are imposed on the hidden variables, or on the dispersion-free states, it is trivially clear that such schemes
can be found to account for any experimental results whatever. Ad hoc
schemes of this kind are devised every day when experimental physicists,
to optimize the design of their equipment, simulate the expected results
by deterministic computer programs drawing on a table of random numbers.” He added, however, that “such schemes … are not very interesting.
Certainly what Einstein wanted was a comprehensive account of physical processes evolving continuously and locally in ordinary space and
time.” The same argument was put forward again in 1976 [328]: “That the
apparent indeterminism of quantum phenomena can be simulated deterministically is well known to every experimenter. It is now quite usual,
in designing an experiment, to construct a Monte Carlo computer programme to simulate the expected behavior … Every such programme is
effectively an ad hoc deterministic theory, for a particular set-up, giving
the same statistical predictions as quantum mechanics.”
According to Wigner [306], “it is rather obvious that, given any quantum mechanical measurement represented by the operator Q one can introduce a hidden-variable model.” This can be done as follows. Together with
any operator Q, and associated measurements, let us introduce a hidden
variable denoted q. Let us demand that the statistical distribution of q reproduce the probabilities for the various possible outcomes (eigenvalues) λ1, λ2, . . . of
the quantum measurements of Q, and let us examine whether this is always possible. The answer is yes. To demonstrate this, we associate, in a one-to-one manner,
domains D(λ1), D(λ2), . . . of q with the possible measurement outcomes
λ1, λ2, . . . . Let us now consider a state |ψ⟩ and a distribution P(q).
We then postulate that the distribution P(q) assigns a probability to the
domain D(λi) that is equal to the probability that the measurement of Q
on |ψ⟩ yields the value λi. We are done!
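The construction just described can be sketched in a few lines of Python with NumPy. This is only one possible realization, under stated assumptions: the domains are taken to be the unit intervals D(λi) = [i, i + 1), the operator is assumed to have a nondegenerate discrete spectrum, and the operator, the state, and all function names are hypothetical choices made for the illustration.

```python
import numpy as np


def born_probabilities(Q, psi):
    """Eigenvalues of a Hermitian Q and their quantum probabilities in state psi."""
    eigenvalues, eigenvectors = np.linalg.eigh(Q)
    probabilities = np.abs(eigenvectors.conj().T @ psi) ** 2
    return eigenvalues, probabilities


def sample_hidden_variable(probabilities, size, rng):
    """Draw q from P(q): weight p_i on the fixed domain D(lambda_i) = [i, i + 1)."""
    domain = rng.choice(len(probabilities), size=size, p=probabilities)
    return domain + rng.random(size)


def outcome(q, eigenvalues):
    """The value of q alone fixes the outcome: q in [i, i + 1) gives lambda_i."""
    return eigenvalues[np.floor(q).astype(int)]


# Hypothetical operator and normalized state, chosen only for illustration.
Q = np.array([[1.0, 1.0], [1.0, -1.0]])
psi = np.array([1.0, 0.0])
eigenvalues, probabilities = born_probabilities(Q, psi)

rng = np.random.default_rng(0)
q = sample_hidden_variable(probabilities, 200_000, rng)
results = outcome(q, eigenvalues)
frequencies = np.array([np.mean(results == ev) for ev in eigenvalues])

print(probabilities, frequencies)
```

Each individual outcome is fixed by q, while the relative frequencies approach the quantum probabilities. The state enters only through the distribution P(q), which is exactly why such a q carries no physical meaning of its own.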
This may be generalized to the case of several operators Q1 , Q2 , . . ..
It suffices to introduce a hidden variable qj for each operator Qj . Assume,
for instance, that the spectra of all operators are discrete. Hence, let us
call λij the ith eigenvalue (i = 1, 2, . . . , Nj ) of the j th operator (j =
1, 2, . . . , N). To each discrete value λij , we can associate a hidden variable
qij in such a way that the value of qij determines λij . If we associate a
probability P (qij ) to each qij , we may adjust the P (qij )’s to recover
the various probabilities for the λij ’s. Of course, for each j , the sum of all
probabilities over i must be equal to 1.
Let Nhv be the number of hidden variables required for the process.
We may use Nj hidden variables for the j th operator having Nj eigenvalues. The summation of all the Nj ’s over j = 1, 2, . . . , N, that is, for
all operators, will provide a value for Nhv . We then have a list of hidden
variables qij , which may be arranged by using a single integer s ranging
from 1 to Nhv . This means that the number of hidden variables can actually be reduced to 1, denoted by s, taking Nhv possible different discrete
values. If we are dealing with continuous spectra, j still takes on discrete
values, but the discrete index i is to be replaced by a continuous index.
We are then dealing with a number N of continuous segments that, by
invoking again the Cantor–Bernstein theorem, can be merged into one single segment. Once again, the number of hidden variables may be reduced
to 1. Similar considerations may be used for mixed spectra. Then, one can
reduce all discrete spectra to one discrete hidden variable and reduce all
continuous spectra to one continuous hidden variable. The number of hidden variables has then been reduced to 2. Whether it is possible to achieve
a further reduction from 2 to 1 is something that I am unable to answer.
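For the discrete case, the relabeling of the double list q_ij by a single integer s can be made concrete as follows. This is a minimal Python sketch under the assumption that the values are simply listed operator after operator; the function names and the sizes N_j used in the example are hypothetical.

```python
from itertools import accumulate


def build_offsets(sizes):
    """sizes[j - 1] = N_j; offsets[j - 1] counts the values listed before operator j."""
    return [0] + list(accumulate(sizes))


def to_single_index(i, j, offsets):
    """Map (i, j), with 1 <= i <= N_j and 1 <= j <= N, to s in 1..N_hv."""
    return offsets[j - 1] + i


def from_single_index(s, offsets):
    """Inverse map: recover (i, j) from the single integer s."""
    j = next(k for k in range(1, len(offsets)) if s <= offsets[k])
    return s - offsets[j - 1], j


sizes = [3, 2, 4]               # illustrative N_1, N_2, N_3
offsets = build_offsets(sizes)  # cumulative sums: [0, 3, 5, 9]
n_hv = offsets[-1]              # N_hv is the sum of the N_j
pairs = [from_single_index(s, offsets) for s in range(1, n_hv + 1)]

print(n_hv, pairs)
```

The map is one-to-one, so a single discrete variable s taking N_hv values carries exactly the same information as the whole list of the q_ij.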
The previous discussion is similar to the one we used when considering von Neumann’s consecutive measurements. Let us remark that the
procedure we used here is the one also used for the construction of
dispersion-free statistical sets, which, according to von Neumann’s proof,
are forbidden. Indeed, in his book [26], von Neumann was already aware
of such a possibility for a decomposition of a statistical set into dispersion-free subsets. Ironically enough, though, he did not use this fact to conclude
that hidden variables are always possible. Instead, as we know, he ended
with the claim that such decompositions are actually not possible and that
hidden variables must always be excluded.
Wigner [306] stated that the number of hidden variables increases enormously when we assume an increasing number of operators and of consecutive measurements. I hope I have correctly proven that such is not
the case. It remains, however, true that hidden variables introduced as
above do not receive any physical interpretation. Therefore, they do not
permit alternative interpretations of quantum mechanics and look ad hoc
and artificial. Certainly, this is a kind of hidden variables that most physicists would not like to consider. I do not know whether this difficulty can
be overcome. It might simply be “a lack of imagination,” in the words of
Bell [142]. In any case, as stated by Flato et al. [323], “… it is quite trivial
to construct theories containing hidden variables which do not have any
physical meaning. The real problem will of course be to construct theories
with hidden variables having physical meaning, capable of reproducing
all known quantum-mechanical results and possessing also predictions in
domains not yet covered by quantum mechanics. No example of such a
theory is known to our day” (and, as far as I know, still today).
Gudder [321] explains the contradiction between proofs of impossibility and possibility proofs in the following terms: “The proponents of
hidden variable theories have an idea of what these theories should be
and have given examples of such theories. The antagonists have a different idea of what a hidden variable theory should be and have proved that
such theories are impossible in the present general framework of quantum
mechanics. These proofs are irrelevant since they do not refer to the hidden variable theories as formulated by the advocates of these theories.”
To avoid any ambiguity, he then provided a definition of what constitutes a
hidden-variables theory, which is in Bohm’s spirit. Now that we have several examples of hidden-variables theories, it is appropriate to define precisely
what a hidden-variables theory is. Following Gudder, this can be done concisely: “The state m of a
quantum mechanical system is not complete in the sense that another variable ξ can be adjoined to m so that the pair (m, ξ ) completely determines
the system. That is, a knowledge of (m, ξ ) enables one to predict precisely the outcome of any single measurement. Furthermore, an average
of (m, ξ ) over the values of ξ gives the usual quantum state m.” Gudder
afterward demonstrated that, in a certain sense, a hidden-variables theory
is always possible (although the usual, even clever, physicist might feel
distressed by some mathematical statements, heavily relying on a calculus
of propositions, used to establish the result). Gudder’s work is acknowledged by Holland [40], saying, “It proved possible to show that quantum
mechanics can always be supplemented by hidden variables.” The paper
of Gudder has been complemented by Greechie and Gudder [137]. See
also Fine [329], who agreed with the statement that we may always build a
deterministic hidden-variables model that reproduces quantum mechanical
results.
Chapter 11
CONTEXTUALITY
We have already met the concept of contextuality in several places, when
we stated and loosely explained that Bohm’s pilot wave is a contextual
theory, but also regarding the proofs of impossibility of Gleason, and
Kochen and Specker, when we restated their results, saying that what they
proved is that admissible hidden-variables theories have to be contextual.
This result may be viewed as a theorem, often called the Kochen–Specker
(KS) theorem or, for reasons that we shall make clear, the Bell–KS theorem. There is another important result: that admissible hidden-variables
theories must be nonlocal. This result also may be viewed as a theorem,
called Bell’s theorem (which is much more famous than the KS theorem).
Nonlocality will be discussed in the next chapter. In the present chapter,
we will deal with contextuality (a concept that we have now to understand
clearly) and with the proof that hidden-variables theories must be contextual; that is, the properties assigned to hidden variables must be determined
by the experimental context of the measurement.
Kochen and Specker for Everyday Cyclists
Cyclists go faster than PEDESTRIANS (∼∼∼∼) but climb lower than
mountaineers. I believe that cyclists can handle this section, but I am not
sure about the fate of PEDESTRIANS (∼∼∼∼). Anyway, they may try.
Otherwise, given that the next section is somehow in the same mathematical mood as the present one, they had better go directly to the next chapter, just
taking a bit of time to gain some specific information from a few useful
comments.
As we stated, Kochen and Specker’s proof is difficult. Belinfante produced a simplified proof but it is still too involved for an easy presentation, in a reasonable number of pages (at least in the framework of this
book). Fortunately, we also possess a still simpler version as presented by
Bitbol in Annex III of one of his books [133]. This is akin to a kind of green
lane for everyday cyclists.
Following the green lane, let us consider spin-1 particles. When we
measure a component of the spin in any direction, the possible outcomes
are −1, 0, or +1 (in units of ħ), producing three spots in a Stern–Gerlach
experiment. This is in particular true if measurements are made along
the x, y, and z directions, with associated observables denoted $S_x$, $S_y$,
and $S_z$, respectively. Hence, if we measure the squares of the components
in any direction, in particular $S_x^2$, $S_y^2$, and $S_z^2$, there are only two possible
outcomes, namely, 0 and 1. Let us now consider the observable
\[
\Sigma = S_x^2 + S_y^2 + S_z^2 \tag{11.1}
\]
For a spin s = 1, the result of the measurement of $\Sigma$ is s(s + 1) = 2. If
we admit hidden variables, these hidden variables must determine the values (a concise way to designate the values of outcomes of measurements)
of $S_x$, $S_y$, and $S_z$. Therefore, they must also determine $S_x^2$, $S_y^2$, and $S_z^2$. Then,
because each component $S_i$ (i = x, y, z) may have the values −1, 0, or 1,
and $S_i^2$ (i = x, y, z) may have the values 0 or 1, we conclude that two of
the $S_i^2$ must receive a value equal to 1, and the third one must
receive a value equal to 0 so that, as required, the sum of the three values
(the value of $\Sigma$) is equal to 2.
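These statements about a spin-1 particle can be checked with explicit matrices. The sketch below, in Python with NumPy, uses the standard spin-1 matrices in the basis in which the z component is diagonal (in units of ħ); the variable names are illustrative assumptions. It verifies that the sum of the three squares in Eq. 11.1 is twice the unit matrix, that each square has only the eigenvalues 0 and 1, and that the three squares commute, so that they may be measured together.

```python
import numpy as np

# Standard spin-1 matrices (units of hbar), basis ordered as m = +1, 0, -1.
inv_sqrt2 = 1 / np.sqrt(2)
SX = inv_sqrt2 * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
SY = inv_sqrt2 * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
SZ = np.diag([1, 0, -1]).astype(complex)

squares = [M @ M for M in (SX, SY, SZ)]

# Sum of the three squared components: should be s(s + 1) = 2 times the unit matrix.
total = sum(squares)

# Each squared component has the eigenvalues 0, 1, 1.
square_spectra = [np.round(np.linalg.eigvalsh(M), 12) for M in squares]

# The three squared components commute pairwise.
commute = all(
    np.allclose(A @ B, B @ A) for A in squares for B in squares
)

print(np.real(np.diag(total)), square_spectra, commute)
```

The components themselves do not commute; only their squares do, and it is this special feature of spin 1 that the argument exploits.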
Let us now rotate our coordinates about the x direction and, instead
of $S_x$, $S_y$, and $S_z$, consider $S_{x\mathrm{new}}$, $S_{y\mathrm{new}}$, and $S_{z\mathrm{new}}$, with xnew, ynew,
and znew the new directions after rotation. We then may consider the new
observable
\[
\Sigma_{\mathrm{new}} = S_x^2 + S_{y\mathrm{new}}^2 + S_{z\mathrm{new}}^2 \tag{11.2}
\]
Here, the reader should not worry: On the right-hand side (r.h.s.) of Eq. 11.2,
the first term is indeed $S_x^2$, not $S_{x\mathrm{new}}^2$. We can now follow the same kind of
reasoning as above, leading to the following observations. If we measure $\Sigma$ with an apparatus M, or $\Sigma_{\mathrm{new}}$ with an apparatus $M_{\mathrm{new}}$, then in both cases
the outcome of the measurement must be equal to 2. Again, among the
three observables $S_x^2$, $S_{y\mathrm{new}}^2$, and $S_{z\mathrm{new}}^2$, two of them must receive a measurement value equal to 1, and the third one a measurement value equal to 0.
We are now going to introduce an assumption that we call the noncontextual assumption, which looks most reasonable and from which we shall
draw dramatic consequences concerning quantum mechanics and hidden-variables theories possibly underlying quantum measurements. To begin,
take notice of the fact that the observable $S_x^2$ pertains to the r.h.s. summations of both Eqs. 11.1 and 11.2. Now, let us assume that $S_x^2$ receives
one of its possible values, say 1, determined by the hidden variables. The
most reasonable noncontextual assumption tells us the following: If we
have 1 for $S_x^2$ with apparatus M for $\Sigma$, then we also have 1 with apparatus $M_{\mathrm{new}}$
for $\Sigma_{\mathrm{new}}$. In other words, the value 1 does not depend on the context, that
is, on whether the experimental setup is M or a rotated setup $M_{\mathrm{new}}$. Similarly, if $S_x^2$ yields its other possible value 0, it must be 0 independently of
the experimental setup being M or $M_{\mathrm{new}}$. Our most reasonable assumption
therefore provides an additional constraint concerning the observable $S_x^2$,
which is shared by both $\Sigma$ and $\Sigma_{\mathrm{new}}$. However, we have the following extraordinary result: This constraint is incompatible with quantum
numerical predictions, meaning that the noncontextual assumption must be
rejected. Therefore, hidden-variables theories have to be contextual, as is
indeed Bohm’s pilot wave.
We now complete the proof of the KS theorem by demonstrating that
quantum mechanical predictions cannot be recovered from our aforementioned noncontextual hidden-variables theory. For this, we make a geometrical transposition of our constraint (or, more precisely, two constraints:
one for the value 1 and the other for the value 0).
Let us consider a set of three orthogonal vectors in Newtonian space,
called a trio. Next, consider a family of trios. We assign the value 1 or 0 to
each vector of a trio. Our algebraic constraints may then be converted into
geometrical constraints as follows:
(1) For any trio, one vector is assigned the value 0 and the two other
vectors are assigned the value 1.
(2) Let us consider the value assigned to a given vector. This vector may
pertain to several trios. Then the value assigned to the vector does not
depend on the trio to which it pertains.
Next, consider two orthogonal vectors a = (xa , ya , za ) and b =
(xb , yb , zb ) with components taken from a Cartesian coordinate system
(x, y, z). Let us assign the value 1 to each of these vectors, and we denote
this as
\[
[\mathbf{a}] = [(x_a, y_a, z_a)] = 1 \tag{11.3}
\]
\[
[\mathbf{b}] = [(x_b, y_b, z_b)] = 1 \tag{11.4}
\]
Up to a scale factor, there is one and only one vector c orthogonal to both a and b. Item
(1) implies that this vector must be assigned the value 0. Any vector orthogonal to c belongs to a trio containing c. According
to item (2), c keeps the value 0 in that trio, so that, by item (1), any vector orthogonal to c must be assigned the value 1. Any
vector of this kind lies in the plane defined by a and b and therefore can
be written as (αa + βb) with α, β ∈ R. Hence, we have the following
rule: If a and b are two orthogonal vectors and [a] = [b] = 1, then [αa +
βb] = 1 for any α and β in R (not both zero).
Now, let us consider the trio formed by (1, 0, 0), (0, 1, 0), and (0, 0, 1), and
let us assume [(1, 0, 0)] = 0. By item (1), we then must have [(0, 1, 0)] =
[(0, 0, 1)] = 1. We next examine the vector (2, 1, 0), which lies in the
plane defined by (1, 0, 0) and (0, 1, 0) and makes an acute angle with
respect to (1, 0, 0). We now show that this vector cannot receive the value 1,
by establishing that it would lead to a contradiction.
For this, we keep [(1, 0, 0)] = 0, [(0, 1, 0)] = [(0, 0, 1)] = 1,
and we assume [(2, 1, 0)] = 1. Applying the aforementioned rule several times,
we deduce a set of relations:
\begin{align}
[(2, 1, 0) + (0, 0, 1)] &= [(2, 1, 1)] = 1 \tag{11.5} \\
[-(0, 1, 0) + (0, 0, 1)] &= [(0, -1, 1)] = 1 \tag{11.6} \\
[(2, 1, 0) - (0, 0, 1)] &= [(2, 1, -1)] = 1 \tag{11.7} \\
[-(0, 1, 0) - (0, 0, 1)] &= [(0, -1, -1)] = 1 \tag{11.8} \\
[(2, 1, 1) + (0, -1, 1)] &= [(2, 0, 2)] = 1 \tag{11.9} \\
[(2, 1, -1) + (0, -1, -1)] &= [(2, 0, -2)] = 1 \tag{11.10}
\end{align}
as can be seen by noting that the two vectors present on the left-hand side
(l.h.s.) of any of these relations are orthogonal.
Now, the vectors (2, 0, 2), (2, 0, −2), and (0, 1, 0) are three orthogonal
vectors; i.e., they form a trio. From one of our starting relations, namely,
[(0, 1, 0)] = 1, and from the results obtained in Eqs. 11.9 and 11.10, each
of the vectors in the above trio is assigned the value 1. By item (1), which
demands that one of the vectors should be assigned the value 0, this is
impossible. Therefore, if we have [(1, 0, 0)] = 0, implying [(0, 1, 0)] =
[(0, 0, 1)] = 1, we cannot simultaneously have [(2, 1, 0)] = 1. Hence, we
must have [(2, 1, 0)] = 0. So we have the following easily stated result:
[(1, 0, 0)] = 0 implies [(2, 1, 0)] = 0.
In other words, what we have obtained is the following: Given items
(1) and (2), if a certain vector is assigned the value 0, then there exists at
least one other vector, making an acute angle with the first one,
that must be assigned the value 0 too.
Let us now translate this result back to our original problem concerning spin-1
particles. For this, we consider a component of the spin in a certain direction. If the square of this component receives the value 0, a true value
determined by hidden variables, then there is at least one other component
in a direction making an acute angle with the previous one whose square
must receive the value 0 too.
However, this is in deep contradiction with the predictions of quantum mechanics. Indeed, if $S_u$ in direction u is measured to be 0, so that $S_u^2$
is measured to be 0 too, then quantum mechanics predicts that $S_v$, in a direction v making an acute angle with u, has a nonzero probability of being
measured as $S_v = \pm 1$, so that $S_v^2$ has a nonzero probability of being measured as $S_v^2 = 1$. In other words, among the particles having $S_u^2 = 0$,
a certain fraction will be measured as $S_v^2 = 1$. This does not
agree with our previous result, which tells us that there exists at least one direction
v for which all of these particles must receive the value 0.
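To put a number on this disagreement, here is a small sketch for the two directions used above, u = (1, 0, 0) and v = (2, 1, 0). It relies on an assumption that is not stated in the text: the standard spin-1 result that a particle prepared with $S_u = 0$ is found with $S_v = 0$ with probability cos²θ, where θ is the angle between u and v. The function name prob_square_is_one is hypothetical.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def prob_square_is_one(u, v):
    """P(S_v^2 = 1) for a spin-1 particle prepared with S_u = 0.

    Assumes the standard spin-1 result P(S_v = 0 | S_u = 0) = cos^2(theta),
    theta being the angle between u and v. The vectors need not be
    normalized; integer components keep the arithmetic exact.
    """
    cos2 = Fraction(dot(u, v) ** 2, dot(u, u) * dot(v, v))
    return 1 - cos2

p = prob_square_is_one((1, 0, 0), (2, 1, 0))
print(p)  # 1/5
```

Under that assumption, one particle in five with $S_u^2 = 0$ would be measured with $S_v^2 = 1$ along (2, 1, 0), whereas the noncontextual assignment requires the value 0 for every one of them.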
Therefore, we must reject the noncontextual assumption. Hence we
have the Kochen–Specker theorem: Any admissible hidden-variables theory must be contextual.
Mermin, Just by Himself
We now examine another argument published by Mermin in 1993 [197],
about twenty-five years after the work by Kochen and Specker. This long
gap reflects the fact that the argument is the outcome of a lengthy historical development,
starting, more or less arbitrarily but significantly, with the 1964 work of Bell [189] on
the concepts of locality and nonlocality, which are discussed in the next chapter.
All through this period, many things were learned, organized, and reorganized, and a better understanding was gained through many efforts of
several contributors that both deepened and often simplified the argument.
As a result, we do not need to find a version of the Mermin argument for
everyday cyclists, as required with Kochen and Specker. We may examine
Mermin, just by himself (but resorting to some minor rewordings), although
it may still be fairly difficult reading for PEDESTRIANS (∼∼∼∼). Before
doing this, let us mention another paper by Mermin [330], providing two
examples that significantly simplify the no-go theorem of Kochen and
Specker. Preferably, both papers should be examined together.
Following Mermin [197], we now consider a four-dimensional state
space generated by two spin-1/2 particles, particle 1 being associated with
Pauli matrices $\boldsymbol{\sigma}^1_\mu$ and particle 2 with Pauli matrices $\boldsymbol{\sigma}^2_\nu$. Any observable
in the state space may conveniently be represented in terms of these Pauli
matrices (and of unity matrices; see, e.g., [43]). They enjoy a certain number
of properties worth recalling.
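Property 1 The square of each matrix is unity, and therefore each of
them has eigenvalues equal to ±1:
$$(\boldsymbol{\sigma}^1_\mu)^2 = (\boldsymbol{\sigma}^2_\nu)^2 = 1 \tag{11.11}$$
Property 2 Any component $\boldsymbol{\sigma}^1_\mu$ of particle 1 commutes with any component
$\boldsymbol{\sigma}^2_\nu$ of particle 2.
Property 3 Let μ and ν specify orthogonal directions (μ = x, y, z and
ν = x, y, z); then $\boldsymbol{\sigma}^i_\mu$ anticommutes with $\boldsymbol{\sigma}^i_\nu$ for i = 1, 2, and for i = 1, 2,
$$\boldsymbol{\sigma}^i_x \boldsymbol{\sigma}^i_y = i\,\boldsymbol{\sigma}^i_z \tag{11.12}$$
(on the right-hand side, the prefactor i is the imaginary unit, not the particle index).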
It is convenient, for either i = 1 or 2, to recapitulate the properties in
a series of formulas as follows [14]:
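$$
\left.
\begin{aligned}
\sigma_x^2 = \sigma_y^2 = \sigma_z^2 &= 1 \\
\sigma_x \sigma_y = -\sigma_y \sigma_x &= i\sigma_z \\
\sigma_y \sigma_z = -\sigma_z \sigma_y &= i\sigma_x \\
\sigma_z \sigma_x = -\sigma_x \sigma_z &= i\sigma_y
\end{aligned}
\right\}
\tag{11.13}
$$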
where we have omitted the superscripts and simplified the matrix notation
from bold σ to normal σ . One can then build nine observables conveniently
arranged in the following table:
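$$
\begin{array}{ccc}
\boldsymbol{\sigma}^1_x & \boldsymbol{\sigma}^2_x & \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_x \\
\boldsymbol{\sigma}^2_y & \boldsymbol{\sigma}^1_y & \boldsymbol{\sigma}^1_y \boldsymbol{\sigma}^2_y \\
\boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_y & \boldsymbol{\sigma}^2_x \boldsymbol{\sigma}^1_y & \boldsymbol{\sigma}^1_z \boldsymbol{\sigma}^2_z
\end{array}
\tag{11.14}
$$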
(returning to bold notation, with superscripts for particles 1 and 2). These
observables exhibit the following properties:
Property A The observables in each of the three rows and each of the
three columns are mutually commuting. This is immediately evident for
the top two rows and for the first two columns from the left. It is also true
for the bottom row and rightmost column, because in every case we can
use a pair of anticommutations.
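Here is an example concerning the first row:
$$
[\boldsymbol{\sigma}^1_x,\ \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_x]
= \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_x - \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_x \boldsymbol{\sigma}^1_x
= \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_x - \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_x
= 0
\tag{11.15}
$$
Here is an example using anticommutations:
$$
\begin{aligned}
[\boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_x,\ \boldsymbol{\sigma}^1_y \boldsymbol{\sigma}^2_y]
&= \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_x \boldsymbol{\sigma}^1_y \boldsymbol{\sigma}^2_y - \boldsymbol{\sigma}^1_y \boldsymbol{\sigma}^2_y \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_x \\
&= \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^1_y \boldsymbol{\sigma}^2_x \boldsymbol{\sigma}^2_y - \boldsymbol{\sigma}^1_y \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_y \boldsymbol{\sigma}^2_x \\
&= (i\boldsymbol{\sigma}^1_z)(i\boldsymbol{\sigma}^2_z) - (-i\boldsymbol{\sigma}^1_z)(-i\boldsymbol{\sigma}^2_z) = 0
\end{aligned}
\tag{11.16}
$$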
Property B The product of the three observables in the column on the
right is (−1). The product of the three observables in the other two columns
and all three rows is (+1).
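Here is a demonstration for the first product:
$$
\boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_x\, \boldsymbol{\sigma}^1_y \boldsymbol{\sigma}^2_y\, \boldsymbol{\sigma}^1_z \boldsymbol{\sigma}^2_z
= \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^1_y \boldsymbol{\sigma}^1_z\, \boldsymbol{\sigma}^2_x \boldsymbol{\sigma}^2_y \boldsymbol{\sigma}^2_z
= i\,(\boldsymbol{\sigma}^1_z)^2\; i\,(\boldsymbol{\sigma}^2_z)^2
= -1
\tag{11.17}
$$
For the second product, here is an example:
$$
\boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^2_y\, \boldsymbol{\sigma}^2_x \boldsymbol{\sigma}^1_y\, \boldsymbol{\sigma}^1_z \boldsymbol{\sigma}^2_z
= \boldsymbol{\sigma}^1_x \boldsymbol{\sigma}^1_y \boldsymbol{\sigma}^1_z\, \boldsymbol{\sigma}^2_y \boldsymbol{\sigma}^2_x \boldsymbol{\sigma}^2_z
= i\,(\boldsymbol{\sigma}^1_z)^2\; (-i)\,(\boldsymbol{\sigma}^2_z)^2
= +1
\tag{11.18}
$$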
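Properties A and B can also be checked numerically for all six rows and columns of the table of nine observables (11.14), not only for the examples worked out above. The following is a sketch under stated assumptions: it uses the usual 2 × 2 representation of the Pauli matrices and a Kronecker product to embed them in the four-dimensional space, and the helper names (matmul, kron, scale, commute, product) are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker product of two 2x2 matrices (particle 1 factor first)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

# Single-particle operators embedded in the two-particle space.
X1, Y1, Z1 = kron(X, I2), kron(Y, I2), kron(Z, I2)
X2, Y2, Z2 = kron(I2, X), kron(I2, Y), kron(I2, Z)
I4 = kron(I2, I2)

# The nine observables, arranged as in table 11.14.
rows = [
    [X1, X2, matmul(X1, X2)],
    [Y2, Y1, matmul(Y1, Y2)],
    [matmul(X1, Y2), matmul(X2, Y1), matmul(Z1, Z2)],
]
cols = [[rows[r][c] for r in range(3)] for c in range(3)]

def commute(a, b):
    return matmul(a, b) == matmul(b, a)

def product(triple):
    return matmul(matmul(triple[0], triple[1]), triple[2])

# Property A: observables in the same row or column commute pairwise.
property_a = all(commute(a, b)
                 for line in rows + cols for a in line for b in line)

# Property B: every row and the first two columns multiply to +1,
# the rightmost column multiplies to -1.
property_b = (all(product(r) == I4 for r in rows)
              and product(cols[0]) == I4
              and product(cols[1]) == I4
              and product(cols[2]) == scale(-1, I4))

print(property_a, property_b)  # True True
```

All matrix entries here are 0, ±1, or ±i, so the floating-point arithmetic is exact and direct equality tests are safe.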
Now, if we consider mutually commuting observables, then any identity satisfied by the observables must also be obeyed by the values they
receive. This means in particular that Property B may be converted to a
corollary in which “observables” is replaced by “values of the observables.” Hence, the product of the values assigned to the three observables
in each row must be (+1), and the product of the values assigned to the
three observables must be (+1) for the first two columns and must be (−1)
for the rightmost column.
Mermin pointed out that this “is impossible to satisfy, since the
row identities require the product of all nine values to be 1, while the
column identities require it to be −1.” The impossibility may also be shown
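from Eqs. 11.17 and 11.18. Indeed, let V(X) denote the value of the observable X. Then, by recalling that values are numbers, that is, commuting
quantities, Eq. 11.17 may be converted to
$$
V(\boldsymbol{\sigma}^1_x)\,V(\boldsymbol{\sigma}^2_x)\,V(\boldsymbol{\sigma}^1_y)\,V(\boldsymbol{\sigma}^2_y)\,V(\boldsymbol{\sigma}^1_z)\,V(\boldsymbol{\sigma}^2_z) = -1
\tag{11.19}
$$
and Eq. 11.18 to
$$
\begin{aligned}
&V(\boldsymbol{\sigma}^1_x)\,V(\boldsymbol{\sigma}^2_y)\,V(\boldsymbol{\sigma}^2_x)\,V(\boldsymbol{\sigma}^1_y)\,V(\boldsymbol{\sigma}^1_z)\,V(\boldsymbol{\sigma}^2_z) \\
&\quad = V(\boldsymbol{\sigma}^1_x)\,V(\boldsymbol{\sigma}^2_x)\,V(\boldsymbol{\sigma}^1_y)\,V(\boldsymbol{\sigma}^2_y)\,V(\boldsymbol{\sigma}^1_z)\,V(\boldsymbol{\sigma}^2_z) = +1
\end{aligned}
\tag{11.20}
$$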
Hence there is a contradiction. Mermin then examined in the same way,
and with the same conclusion, the case of an eight-dimensional state space
(instead of a four-dimensional state space) consisting of three independent
spin-1/2 particles (instead of only two particles), a case that is even simpler
and more versatile, as stated by Mermin.
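For the four-dimensional case, the impossibility can also be confirmed by exhaustive search: there are only 2⁹ = 512 ways of assigning values ±1 to the nine observables of table 11.14. The sketch below indexes the table entries 0 to 8 row by row (an illustrative convention, as is the function name satisfies) and looks for an assignment obeying the six product constraints on the values.

```python
from itertools import product

# Table entries indexed 0..8, row by row.
rows = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
cols = [(0, 3, 6), (1, 4, 7), (2, 5, 8)]

def satisfies(v):
    """True if the values v[0..8] (each +1 or -1) obey all six constraints."""
    rows_ok = all(v[a] * v[b] * v[c] == 1 for a, b, c in rows)
    cols_ok = (all(v[a] * v[b] * v[c] == 1 for a, b, c in cols[:2])
               and v[2] * v[5] * v[8] == -1)   # rightmost column gives -1
    return rows_ok and cols_ok

solutions = [v for v in product((1, -1), repeat=9) if satisfies(v)]
print(len(solutions))  # 0
```

No assignment survives, in agreement with the parity argument quoted from Mermin: the row constraints force the product of all nine values to be +1, the column constraints force it to be −1.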
What has been proven here is a Bell–KS theorem on the impossibility of some kinds of hidden variables, if these hidden variables have to
satisfy quantum mechanical predictions. To see that hidden variables are
indeed concerned, we just need to imagine that they determine the values
V(X) discussed above. The theorem has something to do with locality
since the particles are far apart but, to justify its discussion in this chapter,
it must be noted that it also has something to do with contextuality.
To understand this, let us return to Mermin’s words: “In all these
cases … we have tacitly assumed that the measurement of an observable
must yield the same value independently of what other measurement must
be made simultaneously ….” In particular, in the four-dimensional example
above, and also in the eight-dimensional example that we did not discuss,
We required each observable to have a value in an individual
system that would give the result of its measurement, regardless of which of two sets of mutually commuting observables
we chose to measure it with. But since the additional observables in one of those sets do not commute with the additional
observables in the other, the two cases are incompatible. These
different possibilities require different experimental arrangements: There is no a priori reason to believe that the results …
should be the same. The result of an observation may reasonably depend not only on the state of the system (including
hidden variables) but also on the complete disposition of the
apparatus.
Here is now the explicit introduction of the concept of contextuality by
Mermin: “This tacit assumption that a hidden variable theory has to assign
to an observable A the same value whether A is measured as part of the
mutually commuting set A, B, C, . . . or as part of a second mutually commuting set A, L, M, . . . even when some of the L, M, . . . fail to commute
with some of the B, C, . . . is called noncontextuality.” Therefore, what
has been proven is that the tacit assumption must be rejected: Quantum mechanics is contextualist, and hidden-variables theories have to be
contextualist. This result might appear to be strange, at least sufficiently
strange to motivate asking: Is the Bell–KS theorem silly?
Mermin further commented, “It is surely an important fact that the
impossibility of embedding quantum mechanics in a noncontextual hidden-variables theory rests not only on Bohr’s doctrine of the inseparability of the
objects and the measuring instruments, but also on a straightforward contradiction, independent of one’s philosophic point of view, between some
quantitative consequences of noncontextuality and the quantitative predictions of quantum mechanics.” Also, according to Mermin, the previous
remark may be expressed another way, in connection with the fact that the
KS or Bell–KS theorem is less famous than Bell’s theorems discussed
in the next chapter, as follows: “One reason the Bell–KS theorem is … less
celebrated … is that the assumptions made by the hidden-variables theories
it prohibits can only be formulated within the formal structure of quantum
mechanics.” This statement can only be fully appreciated when we have
in mind sufficient material concerning nonlocality, the EPR paradox, and
Bell’s inequalities and theorems, all topics to which we now turn.