What is... Linear Logic?
Jonathan Skowera
Transcription
Introduction

Let us assume for the moment an old definition of logic as the art "directive of the acts of reason themselves so that humans may proceed orderly, easily and without error in the very act of reason itself."¹ It is a definition that reflects everyday usage: people usually call reasoning logical or illogical according to the apparent quality of the reasoning. So assuming there is a human art of logic, and asking what domain of human activity it governs, a reasonable guess would be acts of reasoning.

With this in mind, it is not clear at first glance that linear logic falls in the domain of logic. The logic is called linear logic, and its rules of inference treat formulas as much like finite resources as propositions. That is a radical difference. With intuitionistic logic, you can rephrase calculus as non-standard analysis and then define enigmatic (yet consistent!) infinitesimals. But an analogous "linear analysis" based on linear logic seems unlikely. The logic lacks inferences that any proposition should satisfy.

On the other hand, logicians certainly recognize linear logic immediately as just another symbolic logic, with a formal language, rules of inference and a theory of its models. It is nontrivial, with interesting categorical models. Even better, linear logic has plenty of uses for computer scientists. That's a strong selling point in the tradition of the Curry-Howard isomorphism. It is this counter-intuitive, "what is it?" nature of linear logic which will hopefully fuel your interest, even if it lies outside your everyday working experience.

One caveat: the definitions of logic ground the definitions of mathematics, but the logician remains free to use any (sound) deductive reasoning whatever to infer properties of the logic. Logicians are not mathematicians attempting to turn their eyeballs inwards; that is not an effective way to know oneself. So it is perfectly consistent to use mathematical results to prove theorems about the foundations of mathematics (e.g., the compactness theorem in logic follows from Tychonoff's theorem in topology) or to analyse their content (e.g., categories as a semantics for logic), provided that those properties are unnecessary for grounding mathematics.

¹ St. Thomas Aquinas, Expositio libri Posteriorum, quoted in "Did St. Thomas Aquinas justify the transition from 'is' to 'ought'?" by Piotr Lichacz.

More to the point (or to the comma)

Linear logic arises by suppressing the rules of weakening and contraction. These are rules of sequent calculus. A sequent looks like

    A, B ⊢ C, D

where A, B, C and D are variables standing in for unspecified formulas. If you're seeing this for the first time, I suggest for this talk to secretly read it as "A and B imply C or D". That's just a stopgap solution, since it leads to interpreting two different symbols as implication: a formula can itself contain an implication symbol, e.g.,

    A₁, A₁ → A₂ ⊢ A₂

is a sequent in some logic. A better reading of the turnstile ⊢ might be "proves". For example,

    ((∀g) ge = g), ((∀g) gg = e) ⊢ ((∀g)(∀h) gh = hg)

where e is an atomic formula. This could easily be viewed as a lemma in the theory of groups which says that g² = e for all g ∈ G implies G is commutative (not directly, but in the sense that there exists a proof).
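Sequents are concrete enough to represent directly in a program, and later examples will build on such a representation. Here is a minimal Python sketch; the class names and encoding are my own invention, not part of the talk:

```python
from dataclasses import dataclass

# Formulas of a toy language: atoms and an implication arrow.
@dataclass(frozen=True)
class Atom:
    name: str
    def __str__(self):
        return self.name

@dataclass(frozen=True)
class Implies:
    left: object   # the arrow *inside* formulas, e.g. A1 -> A2
    right: object
    def __str__(self):
        return f"({self.left} → {self.right})"

@dataclass(frozen=True)
class Sequent:
    left: tuple    # Gamma: the formulas before the turnstile
    right: tuple   # Delta: the formulas after it
    def __str__(self):
        return ", ".join(map(str, self.left)) + " ⊢ " + ", ".join(map(str, self.right))

# The sequent  A1, A1 -> A2 ⊢ A2  from the text:
A1, A2 = Atom("A1"), Atom("A2")
print(Sequent((A1, Implies(A1, A2)), (A2,)))   # prints: A1, (A1 → A2) ⊢ A2
```

Nothing here is specific to linear logic yet; the same datatype serves classical, intuitionistic and linear sequents alike.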
To combine valid statements together into more complex statements requires rules of inference. A (sound) rule of inference infers one sequent (the conclusion) from another (the premise). This is written by placing the premise over the conclusion in fraction style:

    ((∀g) ge = g), ((∀g) gg = e) ⊢ ((∀g)(∀h) gh = hg)
    ───────────────────────────────────────────────────
    ((∀g) ge = g) ∧ ((∀g) gg = e) ⊢ ((∀g)(∀h) gh = hg)

where A ∧ B symbolizes "A and B"; resembling the set intersection symbol A ∩ B, it selects the points which are in both A and B. This inference could be viewed as combining the two "theorems" on the left-hand side into a single theorem which continues to imply the theorem on the right-hand side.

The small size of the above logical step admittedly makes the inference mathematically uninteresting. The steps of a proof usually make much bigger leaps. But for the study of the proof system itself, restricting to a few simple rules of inference greatly simplifies the analysis of the proof system, and as long as the simple rules of inference of the logician compose to form the complex inferences of the mathematician, there is no loss of generality.

The three forms of inference could be viewed as sorting mathematics into three levels:

    A → B   is the sort of implication within the statement of a proposition, e.g., if A is true, then B is true.

    A ⊢ B   is the sort of implication between propositions, e.g., assuming A, one can prove B.

    A ⊢ B
    ─────
    C ⊢ D   is the sort of implication performed in the process of proving, e.g., given a proof of B from A, one can prove D from C.

In this sense, the rules of inference are rules for rewriting. Newcomers should be warned that this innocuous-looking interpretation actually relates to a contentious philosophy called intuitionism, which views a proposition A especially as the claim that one has a proof of A (that one has reduced A to steps which intuition can verify). This leads to contention because the age-old law of the excluded middle² does not satisfy this interpretation, i.e., it is not true that one either has a proof of A or a proof of ¬A. For example, if A is independent of the axioms, then neither A nor ¬A is provable, and one may freely decide whether to assume A or ¬A.

Let Γ, Γ′, ∆, ∆′, etc. be finite sequences of formulas³, and A, B, C, ... be formulas. A certain number of sequents should be true under the above natural-language interpretation:

    ───── ax
    A ⊢ A

i.e., a formula proves itself, and also

    Γ, A ⊢ ∆                     Γ ⊢ ∆, A    Γ ⊢ ∆, B
    ───────────── L∧             ───────────────────── R∧
    Γ, A ∧ B ⊢ ∆                 Γ ⊢ ∆, A ∧ B

    Γ, A, B ⊢ ∆                  Γ ⊢ ∆, A    Γ′ ⊢ ∆′, B
    ───────────── L∧′            ────────────────────────── R∧′
    Γ, A ∧ B ⊢ ∆                 Γ, Γ′ ⊢ ∆, ∆′, A ∧ B

There should also be a few banal rules about the commas,

    Γ, A, B, Γ′ ⊢ ∆              Γ ⊢ ∆              Γ, A, A ⊢ ∆
    ──────────────── exch        ────────── weak    ───────────── contr
    Γ, B, A, Γ′ ⊢ ∆              Γ, A ⊢ ∆           Γ, A ⊢ ∆

and their right-side analogues. The first is called the exchange rule. The last two are called weakening and contraction, respectively.

² The law of the excluded middle says that either a proposition or its negation holds: ⊢ A ∨ ¬A.
³ This follows the very well-written notes, Lectures on Linear Logic, by A. S. Troelstra.
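Continuing the sketch above, the three structural rules become functions from a premise sequent to its conclusion; dropping weakening and contraction from such a toolkit is precisely the move linear logic will make. The helper names are mine:

```python
def exchange(p: Sequent, i: int) -> Sequent:
    # exch: swap the adjacent formulas at positions i and i+1 of Gamma.
    g = list(p.left)
    g[i], g[i + 1] = g[i + 1], g[i]
    return Sequent(tuple(g), p.right)

def weakening(p: Sequent, a) -> Sequent:
    # weak: from  Gamma ⊢ Delta  infer  Gamma, A ⊢ Delta
    # (the formula A appears out of nowhere).
    return Sequent(p.left + (a,), p.right)

def contraction(p: Sequent) -> Sequent:
    # contr: from  Gamma, A, A ⊢ Delta  infer  Gamma, A ⊢ Delta
    # (one of two identical copies of A disappears).
    if len(p.left) < 2 or p.left[-1] != p.left[-2]:
        raise ValueError("premise must end in two copies of the same formula")
    return Sequent(p.left[:-1], p.right)
```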
Keeping track of formulas

In order to treat formulas more like resources, their spontaneous appearance and disappearance should be restricted. In other words, the weakening and contraction rules should be omitted. This naturally affects the other rules. Assuming weakening and contraction, the rules L∧ and R∧ are equivalent respectively to L∧′ and R∧′. For example, R∧′ and R∧ are equivalent in the presence of the weakening and contraction rules (∆ is omitted for simplicity):

    Γ ⊢ A                 Γ′ ⊢ B
    ────────── w          ────────── w
    Γ, Γ′ ⊢ A             Γ, Γ′ ⊢ B
    ──────────────────────────────── R∧
    Γ, Γ′ ⊢ A ∧ B

and, in the other direction,

    Γ ⊢ A    Γ ⊢ B
    ──────────────── R∧′
    Γ, Γ ⊢ A ∧ B
    ──────────────── c
    Γ ⊢ A ∧ B

Without weakening and contraction, these are no longer equivalent, so we could legitimately distinguish between the ∧ of the first two rules and the connective of the second two rules, call it ∧′. That is just what we'll do, but the first connective will instead be called & (with), and the second connective will be called ⊗ (tensor). The ⊗ rules apply in any context (they are context-free), while the & rule R∧ requires the contexts of the two premises to agree (it is context-sensitive).

Another way to divide the rules?

We might attempt to divvy up the rules among the connectives differently. For example, we might define a connective with rules R∧ and L∧′. In that case, as soon as we add the cut rule (see below), we can derive the rule of contraction. Weakening follows from the complementary choice. So if the ∧ is either replaced by two connectives in the wrong way, or if it is not kept as it is, the rules of weakening and contraction can be derived. Hence linear logic is forced upon us by the single choice to suppress weakening and contraction.

To sum up, we have arrived at the rules

    Γ, A ⊢ ∆                     Γ ⊢ ∆, A    Γ ⊢ ∆, B
    ───────────── L&             ───────────────────── R&
    Γ, A & B ⊢ ∆                 Γ ⊢ ∆, A & B

    Γ, A, B ⊢ ∆                  Γ ⊢ ∆, A    Γ′ ⊢ ∆′, B
    ───────────── L⊗             ────────────────────────── R⊗
    Γ, A ⊗ B ⊢ ∆                 Γ, Γ′ ⊢ ∆, ∆′, A ⊗ B
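The different treatment of contexts is mechanical enough to enforce in code. Continuing the earlier sketch, here are the two right rules, with R& checking that the premises share a context and R⊗ concatenating contexts (encoding mine, not from the talk):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class With:
    left: object
    right: object
    def __str__(self):
        return f"({self.left} & {self.right})"

@dataclass(frozen=True)
class Tensor:
    left: object
    right: object
    def __str__(self):
        return f"({self.left} ⊗ {self.right})"

def r_with(p1: Sequent, p2: Sequent) -> Sequent:
    # R&, context-sensitive: from  Γ ⊢ ∆, A  and  Γ ⊢ ∆, B  (same Γ, ∆!)
    # infer  Γ ⊢ ∆, A & B.
    if p1.left != p2.left or p1.right[:-1] != p2.right[:-1]:
        raise ValueError("R& requires the two premises to share their context")
    return Sequent(p1.left,
                   p1.right[:-1] + (With(p1.right[-1], p2.right[-1]),))

def r_tensor(p1: Sequent, p2: Sequent) -> Sequent:
    # R⊗, context-free: from  Γ ⊢ ∆, A  and  Γ′ ⊢ ∆′, B
    # infer  Γ, Γ′ ⊢ ∆, ∆′, A ⊗ B  (contexts are concatenated).
    return Sequent(p1.left + p2.left,
                   p1.right[:-1] + p2.right[:-1] + (Tensor(p1.right[-1], p2.right[-1]),))
```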
Linear logic was proposed by the French logician Jean-Yves Girard in 1986. It seems to have been a rare case of a logician single-handedly proposing an entire theory. Though published, the principal paper [Gir87] was not subjected to the normal review process because of its length (102 pp.) and novelty.

Does it have any negatives?

The modification of classical logic multiplied the connectives, but it does not multiply negatives. There will still be a single negation operator, called dual and written using the orthogonal vector-subspace notation A⊥. At this point, a few assumptions will simplify the presentation without affecting the expressiveness of the logic.

1. Applications of the exchange rule are ignored by letting Γ, ∆, etc. denote multisets instead of sequences. (A multiset is a function f : S → Z_{>0} assigning a positive number to each element of a set.)

2. The right-hand side of a sequent suffices to express all possible sequents:

       Γ ⊢ A, ∆                  Γ, A ⊢ ∆
       ──────────                ──────────
       Γ, A⊥ ⊢ ∆                 Γ ⊢ A⊥, ∆

   So all sequents will be assumed to be of the form ⊢ ∆.

3. The negation operator does not appear in the signature of the logic. Instead of being a connective, the dual is an involution on atomic formulas which induces an involution on all formulas by a few simple rules (see below) [?, p. 68]. This removes the unnecessary bookkeeping of moving between equivalent expressions of a single formula.

New connectives

The introduction of negation doubles the number of connectives, in analogy with the relation ¬(A ∧ B) = ¬A ∨ ¬B: negating and's should give or's. Negating ⊗ gives a connective which we will write ⅋ (par, as in "parallel or"), and negating & gives ⊕ (plus). In other words, negation satisfies

    A⊥⊥ = A
    (A ⊗ B)⊥ = A⊥ ⅋ B⊥          (A ⅋ B)⊥ = A⊥ ⊗ B⊥
    (A & B)⊥ = A⊥ ⊕ B⊥          (A ⊕ B)⊥ = A⊥ & B⊥

These properties determine the rules of inference for ⅋ and ⊕ (just switch the rules for ⊗ and & to the left-hand side of the turnstile). They will be summarized below.

The connectives differ chiefly in how they handle the context of a formula. The context of the conclusion of the & and ⊕ rules agrees with the context of each premise. These are called additive connectives. On the other hand, the rules for ⊗ and ⅋ do not restrict the contexts of the affected formulas, and the conclusion concatenates the contexts of the premises. These are called multiplicative connectives.

The multiplicatives ⊗ (multiplicative ∧) and ⅋ (multiplicative ∨) give an embedding of classical logic. The additives & (additive ∧) and ⊕ (additive ∨) give an embedding of intuitionistic logic; cf. the section "Back to classical logic" below.

As for the choice of symbols, it might at first glance seem better to label the negative of ⊗ as ⊕ instead of ⅋. The symbols are assigned so that the relation between ⊗ and ⊕ is U ⊗ (V ⊕ W) ≅ (U ⊗ V) ⊕ (U ⊗ W). The choice of symbols reflects this relation, but proving it requires a complete list of the rules of inference, and we aren't quite there yet.

Implication

Using the negative, implication can be defined in analogy with the classical case, where A → B = ¬A ∨ B. In other words, the implication is true when either the premise A is false or the conclusion B is true. If that seems odd to you, you are not the first to think so. It implies, for example, that if A is true, then B → A is true for all B. This could make irrelevant events seem to have cosmic ramifications: "If you will give me some Güetzli, then 2 and 2 will make 4," a child might say, and be correct! The truth of the statement A → (B → A) lies at the root of the strangeness. In any case, mimicking the definition ¬A ∨ B in linear logic gives the definition

    A ⊸ B := A⊥ ⅋ B

So the implication ⊸ is not a connective of the logic but a shorthand. Since the ∨ split into ⊕ as well as ⅋, one might also try defining implication as A ⊸ B = A⊥ ⊕ B. This does not work, for the philosophical reason that the additives & and ⊕ have intuitionistic properties, whereas the negation ⊥ used in the definition of the implication has classical properties (namely A⊥⊥ = A). This mismatch would produce undesirable results.
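Since the dual is pure bookkeeping, the induced involution is easy to spell out in code. A sketch continuing the earlier Python examples (NegAtom, Par and Plus are my own names; Python 3.10+ is assumed for match):

```python
@dataclass(frozen=True)
class NegAtom:
    name: str   # the dual a⊥ of an atomic formula a
    def __str__(self):
        return f"{self.name}⊥"

@dataclass(frozen=True)
class Par:
    left: object
    right: object
    def __str__(self):
        return f"({self.left} ⅋ {self.right})"

@dataclass(frozen=True)
class Plus:
    left: object
    right: object
    def __str__(self):
        return f"({self.left} ⊕ {self.right})"

def dual(f):
    # The involution A ↦ A⊥: flip atoms, swap ⊗ ↔ ⅋ and & ↔ ⊕.
    match f:
        case Atom(n):      return NegAtom(n)
        case NegAtom(n):   return Atom(n)                 # A⊥⊥ = A
        case Tensor(a, b): return Par(dual(a), dual(b))   # (A⊗B)⊥ = A⊥ ⅋ B⊥
        case Par(a, b):    return Tensor(dual(a), dual(b))
        case With(a, b):   return Plus(dual(a), dual(b))  # (A&B)⊥ = A⊥ ⊕ B⊥
        case Plus(a, b):   return With(dual(a), dual(b))

def lollipop(a, b):
    # A ⊸ B is not a connective, merely shorthand for A⊥ ⅋ B.
    return Par(dual(a), b)

print(lollipop(Tensor(Atom("A"), Atom("B")), Atom("C")))  # ((A⊥ ⅋ B⊥) ⅋ C)
```

Note how the ⊗ on the left of the lollipop surfaces as a ⅋ of duals once the shorthand is unfolded; this is the "give the left side" reading of the next section.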
Interpretation

As promised, these symbols result in a resource-aware logic, and with all of them in hand a menu can be written:

    (10CHF) ⊗ (10CHF) ⊸ (Salad) ⊗ ((Pasta) & (Rice)) ⊗ ((Apricots) ⊕ (CarrotCake))

The interpretation:

    ⊗ : receive both
    ⊸ : give the left side and receive the right side (convert resources)
    & : receive one or the other (you choose)
    ⊕ : receive one or the other (you don't choose), e.g., apricots in spring and summer, otherwise cake

When a formula appears negated, its interpretation should be reversed (receive ↔ give). Since moving formulas across ⊸ negates them, the ⊗ joining the two francs should be read as giving instead of receiving.

The interpretations come from reading the rules of inference for each connective backwards (from bottom to top). The multiplicative A ⊗ B saves the context of both variables, so reading upwards, we have enough resources (the contexts) to follow both branches. But the additives & and ⊕ save only one copy of the context, so there are only enough resources to follow one or the other branch upward.

In that case, why not view the & as a type of or? The menu interpretation concerns the contexts of a & connective, whereas the interpretation of & as a type of and concerns the interpretation of A & B in terms of true/false values. Since the formula A & B ⊸ A is provable, & cannot be a type of or. It might look like the ⅋ connective was omitted, but it hides in A ⊸ B, which is A⊥ ⅋ B by definition.

Exponentials

What if we would like to have our cake and eat it too? This is the raison d'être of the so-called exponentials, ! and ?:

    ⊢ cake
    ⊢ cake, ?(cake ⊸ ⊥)

The exponentials are unary operators resembling modals. They allow controlled application of the weakening and contraction rules. In other words, they allow the inferences

    ⊢ ?A, ?A                 ⊢ A
    ────────── c             ────────── w
    ⊢ ?A                     ⊢ A, ?B

Though it is simple, the idea works. The negation operator ensures there will be two different exponentials, which are related by (!A)⊥ = ?A⊥. Their eccentric names are of course for ! and why not for ?. But two questions still need to be addressed.

1. How should the rules handle the context of A, i.e., other formulas in the same sequent?
2. In what ways should one be permitted to introduce exponentials?

These are called exponentials because they convert additive connectives into multiplicative ones:

    !(A & B) ≡ (!A ⊗ !B)
    ?(A ⊕ B) ≡ (?A ⅋ ?B)

where the equivalence C ≡ D means that the formula (C ⊸ D) & (D ⊸ C) is derivable.

Units

Not just the connectives of classical logic but also the constants T and F multiply. The equation T ∧ T = T becomes

    1 ⊗ 1 = 1
    ⊤ & ⊤ = ⊤

The notation is again confusing to the newcomer:

    1⊥ = ⊥
    ⊤⊥ = 0

Cutting

The cut rule puts a finishing touch on the rules of inference. Just as the axiom rule allows the introduction of formulas into a proof, so the cut rule allows their removal:

    ⊢ ∆, A    ⊢ ∆′, A⊥
    ─────────────────── cut
    ⊢ ∆, ∆′

Given the above rules, cut implies (but is not equivalent to) the familiar rule of modus ponens. Even if the name sounds unfamiliar, you use this rule every day. It says: if A implies B and A is true, then B is true.

    ⊢ A ⊸ B    ⊢ A
    ────────────────
    ⊢ B

The proof of modus ponens using the cut rule:

                          ───────── ax
                ⊢ A       ⊢ B⊥, B
                ────────────────── ⊗
    ⊢ A ⊸ B     ⊢ A ⊗ B⊥, B
    ──────────────────────── cut
    ⊢ B

where the identification A ⊗ B⊥ = (A⊥ ⅋ B)⊥ = (A ⊸ B)⊥ has been used.

Now the rules of inference for linear logic can be listed (Table 1).

          ────────── ax           ⊢ Γ, A    ⊢ ∆, A⊥
    LL    ⊢ A, A⊥                 ─────────────────── cut
                                  ⊢ Γ, ∆

          ⊢ Γ, A, B               ⊢ Γ, A    ⊢ ∆, B                        ⊢ Γ
    M     ─────────── ⅋           ────────────────── ⊗       ───── 1     ──────── ⊥
          ⊢ Γ, A ⅋ B              ⊢ Γ, ∆, A ⊗ B              ⊢ 1         ⊢ Γ, ⊥

          ⊢ Γ, A    ⊢ Γ, B        ⊢ Γ, A         ⊢ Γ, B
    A     ────────────────── &    ─────────── ⊕₁ ─────────── ⊕₂          ──────── ⊤
          ⊢ Γ, A & B              ⊢ Γ, A ⊕ B     ⊢ Γ, A ⊕ B              ⊢ Γ, ⊤

          ⊢ ?∆, A                 ⊢ A, ∆         ⊢ ∆, ?A, ?A             ⊢ ∆
    E     ───────── !             ───────── ?    ──────────── c          ──────── w
          ⊢ ?∆, !A                ⊢ ?A, ∆        ⊢ ∆, ?A                 ⊢ ∆, ?A

    Table 1: Rules of inference for linear logic. Γ and ∆ are multisets of formulas. The labels on the left specify fragments of the logic; for example, the LL, M and E rows comprise the MELL fragment of linear logic. A superscript minus sign indicates that units should be excluded from the logic; for example, MLL⁻ is the LL and M rows without the rules for 1 and ⊥.

The cut rule is the only way of removing formulas when proving, but it adds no inferential power to the proof system. In fact, all the cuts can be removed by a simple process called cut elimination. Figure 1 shows the two cut elimination operations for proofs in MLL⁻. It has two rules: one for cuts of atomic formulas and one for cuts of composite formulas.

Proposition 1. [?, Section 3.14] Linear logic permits cut elimination.

As an algorithm applicable to all linear logic proofs, cut elimination serves as a model of computation. In this regard, it resembles modus ponens, which corresponds to currying in λ-calculus. The contexts play no active role in cut elimination; it is entirely local. Cut elimination terminates in linear time (linear in the number of premises).

    [Figure 1: Cut elimination for MLL⁻, where m = ⊗, ⅋ is a multiplicative connective. A cut between A m B and A⊥ m⊥ B⊥ is replaced by two cuts, one between A and A⊥ and one between B and B⊥; a cut between A and the axiom ⊢ A, A⊥ is removed entirely, leaving just A.]
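The axiom case of cut elimination is simple enough to sketch in code, assuming (this encoding is mine, not from the talk) that a proof is stored as a tree recording the rule used, the sequent concluded, and the sub-proofs:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proof:
    rule: str            # "ax", "cut", "⊗", "⅋", ...
    conclusion: tuple    # the one-sided sequent ⊢ ∆ proved at this node
    premises: tuple = () # the sub-proofs of the premises

def remove_axiom_cuts(p: Proof) -> Proof:
    # The axiom reduction of Figure 1: a proof of ⊢ Γ, A cut against
    # the axiom ⊢ A, A⊥ reduces to the proof of ⊢ Γ, A itself.
    premises = tuple(remove_axiom_cuts(q) for q in p.premises)
    if p.rule == "cut":
        left, right = premises
        if right.rule == "ax":
            return left
        if left.rule == "ax":
            return right
    return Proof(p.rule, p.conclusion, premises)
```

The composite-formula reduction would similarly rewrite a cut on A m B into two smaller cuts; since each step strictly shrinks the cut formulas, the process terminates.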
Proof nets

Proof nets surpass sequent calculus proofs by eliminating "the bureaucracy of proofs", to use Girard's dramatic phrase. For example, the following two proofs differ by a necessary but arbitrary choice (applications of the exchange rule, as above, are left implicit):

    ⊢ A, B, C    ⊢ D                    ⊢ A, B, C
    ───────────────── ⊗                 ──────────── ⅋
    ⊢ A, B, C ⊗ D                       ⊢ A ⅋ B, C     ⊢ D
    ───────────────── ⅋                 ──────────────────── ⊗
    ⊢ A ⅋ B, C ⊗ D                      ⊢ A ⅋ B, C ⊗ D

These correspond to the same proof net, pictured in Figure 2. A proof net is a particular type of proof structure, and a proof structure is a sort of graph. Their precise graphical representation varies slightly from place to place (sometimes they are even defined as a set of "links", e.g., a ⊗-link joining A and B to A ⊗ B, with the graphical expression left implicit), so the below is an improvised definition made to fit the mathematical definition of a graph (every edge connects exactly two vertices) and to remain simple (there is exactly one label for each edge).

    [Figure 2: A single proof structure which corresponds to multiple proofs: input vertices (i) for A, B, C and D, a ⅋-vertex producing A ⅋ B, a ⊗-vertex producing C ⊗ D, and output vertices (o) for A ⅋ B and C ⊗ D.]

Definition 2 (Proof structure). A proof structure is a graph with vertices labelled by i (input), o (output), ⅋, ⊗, a (axiom) or c (cut), and edges labelled by formulas. The edge labels must be compatible with the vertex labels, e.g., the edges of a ⅋ should be labelled A, B and A ⅋ B. Exactly three edges must be attached to ⅋ and ⊗ vertices, exactly two edges to a and c vertices, and exactly one edge to i and o vertices. The only exception is the i, which may also be connected by one or two edges to other i-vertices to symbolize formulas appearing in a single sequent (these edges distinguish ⊢ A, B from the pair ⊢ A and ⊢ B).

The immediate question arises: which proof structures come from sequent calculus proofs? Correctness criteria answer this question. A proof structure satisfying a correctness criterion is called a proof net and comes from a sequent calculus proof. MLL admits a rather simple correctness criterion, but it requires a further definition.

Definition 3 (Switch). Given a proof structure P, a switch of P is an unlabelled graph associated to P in which every ⅋-vertex keeps only one of its two premise edges (see Figure 3).

    [Figure 3: The par link in (a) is replaced by one of the two switches in (b), each retaining a single premise edge of the ⅋-vertex.]

Hence if N is the number of ⅋-vertices in a proof structure, then there are 2^N switches associated to the proof structure. Some proof structures are not the image of any MLL sequent proof.

Definition 4. A proof net is a proof structure whose switches are all connected and acyclic.

Proposition 5. Proof nets are exactly those proof structures which come from proofs.

From a proof net, a proof can be produced. Not surprisingly, this process is named sequentialization.
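The criterion is directly machine-checkable: enumerate the 2^N switches and test each one for connectedness and acyclicity. A minimal sketch, assuming the proof structure is given as vertex and edge sets with the two premise edges of each ⅋-vertex listed separately (encoding mine):

```python
from itertools import product

def switches(edges, par_premises):
    # Yield one edge set per switch: for each ⅋-vertex keep exactly one
    # of its two premise edges and drop the other.
    pars = list(par_premises)
    for choice in product((0, 1), repeat=len(pars)):
        dropped = {par_premises[v][1 - c] for v, c in zip(pars, choice)}
        yield [e for e in edges if e not in dropped]

def is_tree(vertices, edges):
    # Connected and acyclic: |E| = |V| - 1 plus every vertex reachable.
    if len(edges) != len(vertices) - 1:
        return False
    adj = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)
        adj[w].append(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)

def is_proof_net(vertices, edges, par_premises):
    # Danos-Regnier style check: every switch must be connected and acyclic.
    return all(is_tree(vertices, s) for s in switches(edges, par_premises))
```

For the structure of Figure 2 there is a single ⅋-vertex, hence only two switches to check.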
How does this compare with classical logic?

Back to classical logic

It would be nice to have a way to allow weakening and contraction for particular formulas as desired. A unary operator resembling a modality enables the transition back to intuitionistic logic. The operator ! is called the exponential modality. The embedding of intuitionistic logic is generated by

    A ∧ B ↦ A & B
    A ∨ B ↦ !A ⊕ !B
    A → B ↦ !A ⊸ B
    T ↦ ⊤
    F ↦ 0

where A and B are atomic formulas. Then provability in intuitionistic logic coincides with provability in (exponential) linear logic [?, Section 1.3.3].

Categorical models

A model of classical and linear logic is an adjoint pair of functors

    l : (C, ×, e) ⇄ (L, ⊗, 1) : c

where

• Classical logic: (C, ×, e) is a cartesian category with finite products and terminal object e.
• Linear logic: (L, ⊗, 1) is a symmetric monoidal closed category with product ⊗ and unit 1.
• c is symmetric monoidal and takes a linear formula and makes it classical.
• l is symmetric monoidal and takes a classical formula and makes it linear.
• l is the left adjoint to c: L(l(φ), λ) ≅ C(φ, c(λ)).
• The exponential modality is the composition l ∘ c : L → L.

Given a model of the linear logic, there are at least three ways to form an associated classical logic:

1. (comonads) In this case C is the category of monoids in the functor category End(L)^op:

       C := comon(L, ⊗, e) = {(κ : L → L, δ : κ ⟹ κ ∘ κ, ε : κ ⟹ id)}

2. (algebras of the comonad) Given a comonad T on L, C := L_T, where

       Ob(L_T) = Ob(L),    L_T(A, B) = L(TA, B)

3. (coalgebras of the comonad) Given a comonad T on L, C := L^T, whose objects are the coalgebras,

       Ob(L^T) = ⋃_{A ∈ L} L(A, TA)

   and whose morphisms L^T(A → TA, B → TB) are the morphisms in L(A, B) commuting with the structure morphisms and the comultiplication of T.

Bibliography

[Gir87] J.-Y. Girard, Linear logic, Theoretical Computer Science 50 (1987), no. 1, 1–102.