cryptanalysis and design of synchronous stream ciphers
Transcription
cryptanalysis and design of synchronous stream ciphers
KATHOLIEKE UNIVERSITEIT LEUVEN FACULTEIT INGENIEURSWETENSCHAPPEN DEPARTEMENT ELEKTROTECHNIEK–ESAT Kasteelpark Arenberg 10, 3001 Leuven-Heverlee CRYPTANALYSIS AND DESIGN OF SYNCHRONOUS STREAM CIPHERS Promotor: Prof. Dr. ir. Bart Preneel Proefschrift voorgedragen tot het behalen van het doctoraat in de ingenieurswetenschappen door Joseph LANO Juni 2006 KATHOLIEKE UNIVERSITEIT LEUVEN FACULTEIT INGENIEURSWETENSCHAPPEN DEPARTEMENT ELEKTROTECHNIEK–ESAT Kasteelpark Arenberg 10, 3001 Leuven-Heverlee CRYPTANALYSIS AND DESIGN OF SYNCHRONOUS STREAM CIPHERS Jury: Prof. Yves Willems, voorzitter Prof. Bart Preneel, promotor Prof. André Barbé Prof. Lars Knudsen (TU Denmark) Prof. Frank Piessens Dr. Matthew Robshaw (France Telecom) Prof. Joos Vandewalle Prof. Luc Van Eycken U.D.C. 681.3*D46 Proefschrift voorgedragen tot het behalen van het doctoraat in de ingenieurswetenschappen door Joseph Lano Juni 2006 c Katholieke Universiteit Leuven – Faculteit Ingenieurswetenschappen ° Arenbergkasteel, B-3001 Heverlee (Belgium) Alle rechten voorbehouden. Niets uit deze uitgave mag vermenigvuldigd en/of openbaar gemaakt worden door middel van druk, fotokopie, microfilm, elektronisch of op welke andere wijze ook zonder voorafgaande schriftelijke toestemming van de uitgever. All rights reserved. No part of the publication may be reproduced in any form by print, photoprint, microfilm or any other means without written permission from the publisher. D/2006/7515/59 ISBN 90-5682-725-1 Abstract This thesis is a contribution to the ongoing research in the cryptanalysis and design of stream ciphers, an important class of algorithms used to protect the confidentiality of data in the digital world. We focus on synchronous stream ciphers as these appear to offer the best combination of security and performance. We present a framework that describes the most important classes of attacks on synchronous stream ciphers. We stress the importance of linear cryptanalysis, a statistical attack related to the nonlinearity properties of the building blocks, by developing a simple framework, which we apply to the nonlinear filters and combiners. Algebraic attacks have an important impact on the cryptanalysis of stream ciphers with a linear update function. We present a coherent framework for these attacks, quantify the properties that the stream cipher building blocks should satisfy, present a new and more efficient algorithm and theoretical bounds. Next to statistical and algebraic properties, some structural issues also are of importance, including trade-off attacks and Guess-and-Determine attacks. We also describe and illustrate the relevance of these approaches to cryptanalysis. Resynchronization attacks are very powerful attacks on the initialization of stream ciphers. We extend the resynchronization attack to overcome some limitations of the original attack by further refining the original resynchronization attack and by combining the attack with other cryptanalysis methods. We subsequently discuss the implementation of these mathematical building blocks in practice. We discuss efficient and secure Boolean functions for the filter generator and elaborate on the issue of secure implementation of stream ciphers, by introducing differential power analysis of synchronous stream ciphers. To support our theoretical treatment, we also present practical applications of our analysis to actual primitives. This includes the NESSIE candidate Sobert32, the eSTREAM candidates SFINKS and WG, the Bluetooth encryption algorithm E0, the GSM stream cipher A5/1 and the Alleged SecurID Hash Function. i ii Beknopte Samenvatting Deze thesis is een bijdrage tot het onderzoek naar de cryptanalyse en het ontwerp van stroomcijfers. Stroomcijfers zijn een belangrijke klasse encryptiealgoritmen en worden gebruikt om gegevens te vercijferen in de digitale wereld. Synchrone stroomcijfers, die de beste mix van veiligheid en efficiëntie bieden, worden behandeld. Er wordt een raamwerk voorgesteld dat de voornaamste aanvallen op synchrone stroomcijfers omvat. De aandacht wordt eerst gevestigd op het belang van lineaire cryptanalyse, een statistische aanval die gebruik maakt van de niet-lineaire eigenschappen van de bouwblokken. Een eenvoudig raamwerk wordt voorgesteld, dat wordt toegepast op niet-lineaire filter- en combinatiegeneratoren. Algebraı̈sche aanvallen hebben een aanzienlijke impact op de cryptanalyse van stroomcijfers die een lineair variërende toestand hebben. Een coherent raamwerk voor deze aanvallen wordt voorgesteld, de eigenschappen waaraan de bouwblokken van het stroomcijfer moeten voldoen worden gekwantificeerd, nieuwe en efficiëntere algoritmen en theoretische grenzen worden voorgesteld. Naast de statistische en algebraı̈sche eigenschappen zijn er ook structurele vereisten. Dit omhelst trade-off aanvallen en Gok-en-Determineer aanvallen. Het belang van deze cryptanalytische aanvallen wordt besproken en geı̈llustreerd. Resynchronizatie-aanvallen zijn erg krachtige aanvallen op de initialisatie van stroomcijfers. De oorspronkelijke aanval wordt uitgebreid en beperkingen van de aanval worden weggenomen. Dit wordt bereikt door de aanval verder te verfijnen en door de aanval te combineren met andere methodes van cryptanalyse. Vervolgens bespreken we de implementatie van dergelijke wiskundige bouwblokken. We bestuderen efficiënte en veilige Booleaanse functies voor de filtergenerator en bestuderen beveiligde implementaties van stroomcijfers, waartoe we differentiële vermogensanalyse van synchrone stroomcijfers voorstellen. Onze theoretische uiteenzetting wordt geı̈llustreerd door toepassing op enkele algoritmen. Zo wordt cryptanalyse van de NESSIE kandidaat Sober-t32, de eSTREAM kandidaten SFINKS en WG, het Bluetooth stroomcijfer E0, het GSM stroomcijfer A5/1 en de Alleged SecurID Hash Function besproken. iii iv Notation <S > AA AES AI ANF ASHF BG BS CI ct D DES deg f or df d(f, g) (dg , dh ) − R DPA E ECRYPT p. p. p. p. p. p. p. p. p. p. p. p. p. p. p. p. p. p. 85 45 2 12 10 119 64 64 12 15 15 2 11 11 13 114 15 3 eSTREAM FAA Fq Fnq GD GE GSM In IV Kc lsb p. p. p. p. p. p. p. p. p. p. p. 3 45 7 9 68 104 18 15 15 15 72 Linear Span of the set S Algebraic Attack Advanced Encryption Standard Algebraic Immunity Algebraic Normal Form Alleged SecurID Hash Function Babbage-Golic trade-off Biryukov-Shamir trade-off Correlation Immune Ciphertext digit Decryption function Data Encryption Standard Degree of the Boolean function f Hamming distance between f and g (dg , dh )-Relation Differential Power Analysis Encryption function ECRYPT - European Network of Excellence for Cryptology ECRYPT Stream Cipher Project Fast Algebraic Attack Finite field with q elements n-dimensional vector space over Fq Guess-and-Determine attack Gate Equivalent Global System for Mobile Communications Initialization function Initialization Vector Cipher key least significant bit v LFSR MAC ma (x) Mud p. p. p. p. 7 19 10 22 msb MSB NESSIE p. 72 p. 69 p. 2 Nf NIST O O pt RF RFID S S-Box SCA SPA U Wf (ω) XOR zt p. p. p. p. p. p. p. p. p. p. p. p. p. p. p. 11 2 15 22 15 105 104 15 13 113 114 15 10 35 15 Linear Feedback Shift Register Message Authentication Code aϕ−1 A monomial xa0 0 . . . xϕ−1 Number of monomials with u variables and degree up to d most significant bit Most Significant Byte New European Schemes for Signatures, Integrity, and Encryption The nonlinearity of the Boolean function f National Institute of Standards and Technology Output function Order of complexity Plaintext digit Resynchronization Factor Radio Frequency Identification Internal state Substitution Box (or vectorial Boolean function) Side Channel Attack Simple Power Analysis State update function Walsh spectrum of the Boolean function f eXclusive OR, ⊕ Key stream digit vi Contents Abstract i Beknopte Samenvatting iii Notation v List of Figures xi List of Tables xiv 1 Introduction 1.1 Cryptology . . . . . . . . . . . . . . . 1.2 Focus and Background for This Thesis 1.3 This Thesis . . . . . . . . . . . . . . . 1.3.1 Outline of the Thesis . . . . . . . . . . 2 Basic Concepts 2.1 Linear Feedback Shift Registers . . . . . 2.2 Boolean Functions and S-Boxes . . . . . 2.2.1 Definition and Representations of 2.2.2 Properties of Boolean Functions 2.2.3 S-Boxes . . . . . . . . . . . . . . 2.3 Symmetric Algorithms . . . . . . . . . . 2.3.1 Block Ciphers . . . . . . . . . . . 2.3.2 Stream Ciphers . . . . . . . . . . 2.3.3 Message Authentication Codes . 2.4 Some Important Cryptanalytic Tools . . 2.4.1 Differential Cryptanalysis . . . . 2.4.2 Linear Cryptanalysis . . . . . . . 2.4.3 Algebraic Cryptanalysis . . . . . vii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Boolean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 2 3 4 . . . . . . . . . . . . Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 7 9 9 10 13 14 14 15 19 19 20 21 21 . . . . . . . . . . . . . . . . . . . . 2.5 2.4.4 Application in This Thesis . . . . . . . . . . . . . . . . . . . 23 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 3 Linear Attacks 3.1 General Approach . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 Filter and Combination Generators . . . . . . . . . . . . . . . . . 3.2.1 Distinguishing Attacks . . . . . . . . . . . . . . . . . . . . 3.2.2 Fast Correlation Attacks . . . . . . . . . . . . . . . . . . . 3.3 Linear Distinguishing Attack on Sober-t32 . . . . . . . . . . . . 3.3.1 Extending the Attack on Unstuttered Sober-t32 to Full Sober-t32 . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.2 Extending the Attack on Sober-t16 to Sober-t32 . . . . 3.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 . 39 . 44 4 Algebraic Attacks 4.1 Algebraic Attacks and Fast Algebraic Attacks . . . . . . . . 4.1.1 Algebraic Attacks . . . . . . . . . . . . . . . . . . . 4.1.2 Fast Algebraic Attacks . . . . . . . . . . . . . . . . . 4.2 Algorithms for Computing AI and (dg , dh ) − R . . . . . . . 4.2.1 Computing the AI of a Boolean Function . . . . . . 4.2.2 Computing the (dg , dh ) − R for a Boolean Function 4.3 Theoretical Bounds . . . . . . . . . . . . . . . . . . . . . . . 4.3.1 Theoretical Bounds on Algebraic Immunity . . . . . 4.3.2 Theoretical Bounds on the Existence of (dg , dh ) − R 4.4 Application to Concrete Designs . . . . . . . . . . . . . . . 4.4.1 SFINKS . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.2 WG . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.3 Discussion of the Current Analysis of the Resistance 4.5 Conclusions and Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 45 46 47 51 51 52 54 54 55 57 57 59 61 62 5 Structural Attacks 5.1 Overview of Some Attacks . . . . . . . . . . 5.1.1 Trade-off Attacks . . . . . . . . . . . 5.1.2 Inversion Attacks . . . . . . . . . . . 5.1.3 Guess-and-Determine Attacks . . . . 5.2 Guess-And-Determine Attack on Sober-t32 5.2.1 Attack without Carry Bits . . . . . . 5.2.2 Taking Account of the Carry Bits . . 5.2.3 Further Improvements . . . . . . . . 5.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 63 63 68 68 69 69 71 73 74 viii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 25 27 27 33 34 6 Resynchronization Attacks 6.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1.1 Used Framework for Synchronous Stream Ciphers . . . . 6.1.2 Some Special Properties of Boolean Functions and Related Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1.3 Resynchronization Attacks . . . . . . . . . . . . . . . . . 6.2 The Daemen et al. Resynchronization Attack . . . . . . . . . . . 6.2.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . 6.3 Real-time Attack . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.4 Attack with a Small Number of Resynchronizations . . . . . . . . 6.4.1 Computational Approach . . . . . . . . . . . . . . . . . . 6.4.2 Using Algebraic Attacks . . . . . . . . . . . . . . . . . . . 6.5 Resynchronization Attacks with Large ϕ . . . . . . . . . . . . . . 6.5.1 A Chosen IV Attack . . . . . . . . . . . . . . . . . . . . . 6.5.2 A Known IV Attack . . . . . . . . . . . . . . . . . . . . . 6.5.3 An Algebraic Attack . . . . . . . . . . . . . . . . . . . . . 6.5.4 The Degree-1 Attack . . . . . . . . . . . . . . . . . . . . . 6.6 Attacks on Combiners with Memory . . . . . . . . . . . . . . . . 6.6.1 A Standard Resynchronization Attack . . . . . . . . . . . 6.6.2 Using Ad-hoc Equations . . . . . . . . . . . . . . . . . . . 6.7 Nonlinear Resynchronization . . . . . . . . . . . . . . . . . . . . 6.7.1 A Chosen IV Attack . . . . . . . . . . . . . . . . . . . . . 6.7.2 A Known IV Attack . . . . . . . . . . . . . . . . . . . . . 6.7.3 Implications of the Attacks . . . . . . . . . . . . . . . . . 6.8 Application to some Key Stream Generators . . . . . . . . . . . . 6.8.1 Attacks on E0 . . . . . . . . . . . . . . . . . . . . . . . . 6.8.2 Attacks on Toyocrypt . . . . . . . . . . . . . . . . . . . . 6.8.3 The Summation Generator with 4 Inputs . . . . . . . . . 6.9 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Implementation 7.1 Efficient Hardware Implementation . . . . . . 7.1.1 Area, Performance and Energy . . . . 7.1.2 Efficient and Secure Boolean Functions erator . . . . . . . . . . . . . . . . . . 7.2 Secure Implementation . . . . . . . . . . . . . 7.2.1 Description of Power Analysis Attacks 7.2.2 Power Analysis of Stream Ciphers . . 7.3 Conclusion . . . . . . . . . . . . . . . . . . . ix . . . . for . . . . . . . . . . . . . . . . . . . . . . . . . . . . the Filter Gen. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 . 76 . 76 . . . . . . . . . . . . . . . . . . . . . . . . . . 76 78 79 79 80 80 81 82 83 86 86 87 89 89 89 90 91 92 93 93 94 95 95 98 100 100 103 . 104 . 104 . . . . . 108 113 114 115 117 8 Cryptanalysis of Alleged SecurID 8.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.2 Description of Alleged SecurID Hash Function . . . . . . . . . 8.2.1 The Expansion Function . . . . . . . . . . . . . . . . . 8.2.2 The Key-Dependent Initial and Final Permutation . . 8.2.3 The Key-Dependent Round . . . . . . . . . . . . . . . 8.2.4 The Key-Dependent Conversion to Decimal . . . . . . 8.3 Cryptanalysis of the Key-Dependent Rounds . . . . . . . . . 8.3.1 A Linear Property . . . . . . . . . . . . . . . . . . . . 8.3.2 Avalanche of One-Bit Differences . . . . . . . . . . . . 8.3.3 Differential Characteristics . . . . . . . . . . . . . . . 8.3.4 Vanishing Differentials . . . . . . . . . . . . . . . . . . 8.4 Extending the Attack . . . . . . . . . . . . . . . . . . . . . . 8.4.1 The Attack on the Full ASHF . . . . . . . . . . . . . . 8.4.2 The Attack on the Full ASHF with the Time Format . 8.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 119 122 123 123 124 125 126 126 127 129 131 132 133 133 138 9 Conclusions and Open Problems 139 9.1 Contributions of the Thesis . . . . . . . . . . . . . . . . . . . . . . 139 9.2 Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 A Description of Algorithms A.1 Description of the GSM Stream Cipher A5/1 . . . A.2 Description of the Bluetooth Stream Cipher E0 . . A.3 Description of the SFINKS Stream Cipher . . . . . A.3.1 Key, IV and MAC Length . . . . . . . . . . A.3.2 LFSR . . . . . . . . . . . . . . . . . . . . . A.3.3 Filter Function . . . . . . . . . . . . . . . . A.3.4 Resynchronization Mechanism . . . . . . . A.3.5 MAC Algorithm . . . . . . . . . . . . . . . A.4 Description of the Sober-t32 Stream Cipher . . . A.4.1 The Linear Feedback Shift Register (LFSR) A.4.2 The Non-Linear Function (NLF) . . . . . . A.4.3 The Stuttering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143 143 144 145 146 146 147 147 148 148 150 150 151 Bibliography 152 Nederlandse Samenvatting 167 List of Publications 175 Curriculum Vitae 178 x List of Figures 2.1 2.2 2.3 Representation of an LFSR . . . . . . . . . . . . . . . . . . . . . . 8 Synchronous stream encryption . . . . . . . . . . . . . . . . . . . . 16 Self-synchronizing stream encryption . . . . . . . . . . . . . . . . . 18 5.1 5.2 BS multiple-key trade-off for 128-bit secret key . . . . . . . . . . . 66 BS multiple-key trade-off for 80-bit secret key . . . . . . . . . . . . 67 6.1 6.2 Model for a two-level combiner . . . . . . . . . . . . . . . . . . . . 92 Upper bound for the number of resyncs as a function of the resilience ρ with as parameter the number ϕ of input bits of the Boolean function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 7.1 RF as a function of the packet length for A5/1, E0 and RC4 8.1 8.2 8.3 Overview of the ASHF hash function . . . . . . . . . . . . . . . . . 122 Sub-round i defined by the operations R and S. . . . . . . . . . . . 125 Percentage of keys that have at least one VD . . . . . . . . . . . . 134 A.1 A.2 A.3 A.4 A.5 Layout of the stream cipher Layout of the stream cipher Layout of the stream cipher Layout of the stream cipher Structure of the function f A5/1 . . . . E0 . . . . . SFINKS . . Sober-t32 . . . . . . . xi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 . . . . . . . . . . . . . . . . . . . . 144 145 146 149 151 xii List of Tables 3.1 3.2 3.3 3.4 4.1 4.2 4.3 Optimal complexities of the distinguishing attacks on the filter generator (for bent functions) . . . . . . . . . . . . . . . . . . . . Optimal complexities of the distinguishing attacks on the filter generator (for bent functions), when the length of a single key stream is restricted to 240 bits . . . . . . . . . . . . . . . . . . . . Optimal complexities (for the cryptanalyst) of the distinguishing attacks on the combination generator for optimal functions (for the cryptographer) . . . . . . . . . . . . . . . . . . . . . . . . . . Optimal complexities (for the cryptanalyst) of the distinguishing attacks on the combination generator for optimal functions (for the cryptographer, when the length of a single key stream is restricted to 240 bits) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 . 30 . 32 . 32 48 50 4.7 Logarithm of complexities of the algebraic attack . . . . . . . . . . Logarithm of complexities of the fast algebraic attack . . . . . . . log2 of complexities of FAA for a 256-bit state if f satisfies (dg , dh )− R. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . log2 of complexities for the Relation Search Step for SFINKS with the new algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . log2 of complexities of FAA for a 319-bit state if f satisfies (dg , dh )− R. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . log2 of complexities for the Relation Search Step for WG with new algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Complexity of Relation Search Step for WG with the old algorithm 6.1 6.2 6.3 6.4 6.5 The situation of a frequently reinitialized key stream generator The complexities of the practical resynchronization attacks . . Attacks on the E0 key stream generator . . . . . . . . . . . . . Theoretical probability of having a basis . . . . . . . . . . . . . Experimental probability of having a basis . . . . . . . . . . . . 78 83 97 97 98 4.4 4.5 4.6 xiii . . . . . . . . . . 58 59 60 61 61 6.6 6.7 6.8 Experimental probability of having a basis for Daemen’s attack . . 98 Computer simulations of the Degree-1 attack and the Daemen et al. attack using ad-hoc equations on the E0 key stream generator . 99 The Degree-d -1 Attack, the Degree-e attack and the Daemen et al. attack on the summation generator with 4 LFSRs and a key size of 128 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 7.1 7.2 7.3 Gate count of some common stream cipher building blocks . . . . . 104 Known highly nonlinear power functions on F2ϕ of degree ≥ 9 . . . 111 AI of known highly nonlinear power functions on F2ϕ of degree ≥ 9 113 8.1 8.2 The attacks presented in this chapter . . . . . . . . . . . . . . . . . 121 Unrestricted differential trail for the first 50 subrounds (avalanche demonstration). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 A Differential characteristic with probability 2−15 for the first round130 8.3 A.1 The possible actions of the stuttering unit as a function of the dibit151 xiv Chapter 1 Introduction 1.1 Cryptology In the past decades, we have witnessed an explosive growth of the digital storage and communication of data, triggered by some important breakthroughs such as the Internet and the expansive growth of wireless communications. These new information and communication technologies require adequate security. Cryptology is the science that aims to provide information security in the digital world. A thorough overview of cryptology can be found in the Handbook of Applied Cryptography by Menezes et al. [125]. Information security comprises many aspects, the most important of which are confidentiality and authenticity. Confidentiality means keeping the information secret from all except those who are authorized to learn or know it. Authenticity involves both ensuring that data have not been modified by an unauthorized person (data integrity) and being able to verify who is the author of the data (data origin authentication). Cryptology is usually split up into two closely related fields, cryptography and cryptanalysis. Cryptography studies the design of algorithms and protocols for information security. The ideal situation would be to develop algorithms which are provably secure, but this is only possible in very limited cases. Therefore, the best way to assess the security of a cryptographic algorithm or protocol is by testing all possible known attacks on it. Cryptanalysis is concerned with this study of mathematical techniques that attempt to break cryptographic primitives. The cryptographic algorithms are usually split up into two families, symmetric algorithms and public-key algorithms. Symmetric algorithms (also called secretkey algorithms) require that the same secret key is shared by the communicating parties. In public-key algorithms (also called private-key algorithms), the public key e is made public and the private key d is kept secret by a single entity. 1 2 1.2 CHAPTER 1. INTRODUCTION Focus and Background for This Thesis In this thesis, we study a particular class of symmetric algorithms that aim to ensure confidentiality. Such algorithms can efficiently encrypt and decrypt data, and are hence widely used in practice. There are two main types of symmetric encryption algorithms, namely block ciphers and stream ciphers. It is the last category which is the main study topic of this work. We explain the distinction between block and stream ciphers into more detail in Sect. 2.3. Block ciphers have been the most widely studied symmetric encryption algorithms so far. A very important step has been the adoption of DES, the Data Encryption Standard, as a U.S. Federal Information Processing Standard by the National Institute of Standards and Technology (NIST) [130] in 1977. This started an extensive study of the security of block ciphers and resulted in some important cryptanalytic breakthroughs such as differential cryptanalysis [22] and linear cryptanalysis [118, 119]. In 2001, AES, the Advanced Encryption Standard, became the new U.S. Federal Information Processing Standard [131] for encryption of sensitive (unclassified) information. The block cipher Rijndael developed by Joan Daemen and Vincent Rijmen was selected to be the AES after an open selection process. The research in the past decades has led to a relatively good understanding of the security of block ciphers. Despite the success of block ciphers, there also is a need for stream ciphers, which seem to offer advantages in a number of scenarios. In the 60’s, 70’s and 80’s, research has focused on hardware-oriented stream ciphers, see the work of Rueppel [149] for an overview. In the 90’s, several software-oriented stream ciphers such as RC4 have been developed, see Preneel et al. [140] for an overview. The security of stream ciphers has only recently received increasing attention. Two European projects that have had influence on this evolution are the NESSIE and ECRYPT projects. NESSIE. NESSIE [133] stands for New European Schemes for Signatures, Integrity, and Encryption. It was a project within the Information Society Technologies (IST) Programme of the European Commission which started in 2000 and ended in 2004. The main objective of NESSIE has been to put forward a portfolio of strong cryptographic primitives obtained after an open call and evaluated using a transparent and open process. Forty-two cryptographic primitives of different categories were submitted to NESSIE. These were studied by all interested researchers during several years, resulting in a portfolio of twelve algorithms, complemented by five existing standards. No stream cipher made it to the final portfolio, as weaknesses had been discovered in all candidate stream ciphers. This illustrated that the study of stream ciphers is not as mature as the study of block ciphers. 1.3. THIS THESIS 3 ECRYPT. ECRYPT is a Network of Excellence within the Information Societies Technology (IST) Programme of the European Commission; it was launched in February 2004. Seeing an increased interest in stream cipher cryptanalysis and design, ECRYPT decides to launch a call for primitives. In April 2005, ECRYPT received 34 candidate stream ciphers for its eSTREAM project [73]. The eSTREAM call encompassed two different profiles, for which ECRYPT believes stream ciphers may offer significant advantages compared to block ciphers: • Profile 1: Stream ciphers for software applications with high throughput requirements having a long-term security goal. Here a 128-bit secret key should be accommodated. • Profile 2: Stream ciphers for hardware applications with restricted resources having a medium-term security goal. Here a 80-bit secret key should be accommodated. The submitters also had the possibility to propose an associated authentication mechanism, resulting in a primitive that could ensure both confidentiality and data integrity. 1.3 This Thesis This thesis is a contribution to the ongoing research in the cryptanalysis and design of stream ciphers. We focus on synchronous stream ciphers as these appear to offer the best combined security and performance, as we will motivate in this work. Our contributions to the field are the following. The thesis first presents a framework describing the most important classes of attacks on synchronous stream ciphers. This notably include linear cryptanalysis, algebraic attacks and resynchronization attacks. Special attention is given to stream ciphers with linear update function and a Boolean output function, as for these we can establish a relation between the mathematical properties of its building blocks and the security of the design. We stress the importance of linear cryptanalysis, a statistical attack related to the nonlinearity properties of the building blocks, by developing a simple framework, which we apply to the nonlinear filters and combiners. The recent algebraic and fast algebraic attacks introduced by Courtois et al. have had an important impact on the cryptanalysis of stream ciphers with a linear update function. We present a coherent framework for these attacks, and specifically quantify the properties that the stream cipher building blocks should satisfy to resist these attacks. In order to evaluate the resistance of stream ciphers against fast algebraic attacks, we present a new and more efficient algorithm and also derive theoretical bounds. 4 CHAPTER 1. INTRODUCTION Next to statistical and algebraic properties, some structural issues are also of importance, including trade-off attacks and Guess-and-Determine attacks. We also describe and illustrate the relevance of these approaches to cryptanalysis. Resynchronization attacks introduced by Daemen et al. are very strong attacks on the initialization of stream ciphers, but received little attention so far. In this work we extend the resynchronization attack to overcome some limitations of the original attack of Daemen et al. and present new results. We achieve this by further refining the original resynchronization attack and by combining the attack with the other attack methodologies described. In a second part of this thesis, we discuss the implementation of these mathematical building blocks in practice, motivating our preference for stream cipher designs which are easy to analyze for a cryptanalyst, and which consist of small building blocks that are easy to analyze and that have good cryptographic and implementation properties. As an illustration, we discuss efficient and secure Boolean functions for the filter generator. We also elaborate on the issue of secure implementation of stream ciphers, introducing differential power analysis of synchronous stream ciphers. In order to support our theoretical treatment, we also present new cryptanalytic results on actual primitives. This includes the NESSIE candidate Sobert32, the eSTREAM candidates SFINKS and WG, the Bluetooth encryption algorithm E0, the GSM stream cipher A5/1 and the Alleged SecurID Hash Function. 1.3.1 Outline of the Thesis The outline of this thesis is as follows: • Chapter 2 introduces some specialized mathematical concepts that will be used in this thesis. This includes Linear Feedback Shift Registers, Boolean functions, S-boxes, block ciphers, stream ciphers and cryptanalytic tools. • Chapter 3 presents results on cryptanalysis of synchronous stream ciphers using the tools of linear cryptanalysis. This includes a general description of the technique, application to nonlinear filters and combiners, and an attack on the stream cipher Sober-t32. • Chapter 4 discusses algebraic attacks. We present an overview of the attacks, and describe a new algorithm and a theoretical framework to assess the resistance against the most powerful class of fast algebraic attacks currently known. • Chapter 5 presents structural attacks. We discuss the impact of trade-off attacks on stream ciphers, and briefly present some other attack strategies. We also present a Guess-and-Determine attack on Sober-t32. 1.3. THIS THESIS 5 • Chapter 6 presents resynchronization attacks. We extend and refine the original resynchronization attack and combine resynchronization attacks with several attack methodologies of previous chapters. • Chapter 7 gives an overview of the implementation of synchronous stream ciphers. We discuss both the efficiency and the security of the implementation. • Chapter 8 presents an application of the cryptanalytic tools on a particular primitive, the Alleged SecurID Hash Function. • Chapter 9 concludes this thesis and presents some open problems. 6 CHAPTER 1. INTRODUCTION Chapter 2 Basic Concepts For the study of stream ciphers, some general and more specific mathematical concepts are required. Essential mathematical courses notably include linear algebra, finite fields and statistics. For these general topics, we refer respectively to the books of Lay [113], Lidl and Niederreiter [115], and Freedman et al. [80]. In this chapter we introduce some more specific mathematical concepts necessary to understand this thesis. We introduce several stream cipher building blocks, notably Linear Feedback Shift Registers, Boolean functions and S-boxes. We then give a definition of symmetric algorithms and describe the different classes. Finally we describe some substantial tools that will be used during the mathematical analysis, namely linear, differential and algebraic tools. 2.1 Linear Feedback Shift Registers First, an overview of Linear Feedback Shift Registers and some mathematical results that are relevant for this thesis are presented. For more information and background on the subject, we refer the reader to Golomb [90] and to the other references in this section. Let Fq be a finite field with q elements. A Linear Feedback Shift Register is defined as follows. Definition 2.1.1 A Linear Feedback Shift Register (LFSR) of length L over Fq . At each time t the is a collection of L q-bit memory elements s0t , s1t , . . . , sL−1 t memory is updated as follows: ½ sit = si+1 for i = 0, . . . , L − 2 t−1 PL L−1 st = i=1 ci · sL−i t−1 . 7 (2.1) 8 CHAPTER 2. BASIC CONCEPTS where the ci are fixed coefficients that define the feedback equation of the LFSR. The LFSR stream (st )t≥0 consists of the successive values in the memory element s0 . A representation of an LFSR is shown in Fig. 2.1, where + denotes the field addition and ci means field multiplication with ci . L-1 St+1 C1 stL-1 C2 stL-2 C3 stL-3 CL-1 st1 CL st0 Figure 2.1: Representation of an LFSR Associated with an L-stage LFSR is its feedback polynomial P(X) of degree L: P (X) = 1 − L X ci · X i . (2.2) i=1 The weight of the feedback polynomial is equal to its number of nonzero terms. In practical designs, a feedback polynomial is chosen to be primitive. This implies that every nonzero initial state produces an output sequence that has the maximal period q L −1. Such a sequence is called a maximal-length sequence, m-sequence or pn-sequence. Algorithms for finding primitive polynomials are described in [115]. LFSRs are used as a building block in many applications. The sequences generated have a large period and good statistical properties. They can be very efficiently implemented in hardware, where dedicated constructions most often use binary LFSRs. In software a finite field corresponding with the word-size can be used efficiently, for instance F232 on a 32-bit processor. For ease of notation, the LFSRs considered below will always be defined over F2 and have a primitive polynomial P (X), unless otherwise stated. For many cryptanalytic attacks, it is useful to search low-weight multiples of the feedback polynomial P (x), see [38, 92, 84]. The number m(D, w) of multiples PD Q(X) = 1 + i=1 ci · X i of the polynomial P (X), with degree less than or equal to D and with weight w, can be approximated by [38]: m(D, w) ≈ Dw−1 . (w − 1)! · 2L (2.3) 2.2. BOOLEAN FUNCTIONS AND S-BOXES 9 It is interesting to know for which value Dmin we can expect a first multiple Q(X) of weight w to exist with non-negligible probability. It follows from (2.3) that: 1 Dmin (w) ≈ (2L · (w − 1)!) w−1 . (2.4) Currently, the most efficient approach to search for these low-weight multiples is a birthday-like approach, see [84, 44]. The precomputation complexity P needed to find all multiples Q(X) of weight w and degree at most D can be approximated by: w−1 Dd 2 e P (D, w) ≈ w−1 . (2.5) d 2 e! Note that these numbers are also valid if the polynomial is the product of several primitive polynomials with co-prime degree [92]. 2.2 Boolean Functions and S-Boxes Boolean functions and S-boxes are building blocks for stream ciphers and block ciphers. In this section we give the most significant definitions and properties of these building blocks. We refer the interested reader to the references in this section for more background. 2.2.1 Definition and Representations of Boolean Functions Let x = (x0 , x1 , . . . xϕ−1 ) denote an element of the vector space Fϕ 2 . When no confusion is possible, such an element will also be denoted sometimes as x to improve readability. Then a Boolean function can be defined as follows: Definition 2.2.1 A ϕ-variable Boolean function f is a mapping from the vector space Fϕ 2 into F2 . Truth table. Every Boolean function can easily be represented by a truth table, which lists the 2ϕ function values of f corresponding to all inputs. It is hence ϕ clear that there exist 22 ϕ-variable Boolean functions. The truth table will be lexicographically ordered by the arguments x. This is a very simple idea: with every x we associate a natural number x0 · 2ϕ−1 + x1 · 2ϕ−2 + . . . + xϕ−1 , (2.6) and order the x values in the same way as the natural ordering of the associated natural numbers. 10 CHAPTER 2. BASIC CONCEPTS Algebraic normal form. A second often-used unique representation of a Boolean function is its algebraic normal form (ANF): f (x) = f (x0 , . . . , xϕ−1 ) = M a ϕ−1 cf (a0 , . . . , aϕ−1 ) xa0 0 . . . xϕ−1 . (2.7) (a0 ,...,aϕ−1 )∈Fϕ 2 Here cf (a) are the binary coefficients of the ANF and they are defined by cf (a) = M f (x) , (2.8) x¹a a ϕ−1 where x ¹ a means that xi ≤ ai for all i ∈ {0, . . . , n−1}. Every term xa0 0 . . . xϕ−1 is a monomial and will also be denoted as ma (x). Walsh spectrum. spectrum: A third representation of a Boolean function is its Walsh Wf (ω) = X (−1)f (x)⊕x·ω , (2.9) x∈Fϕ 2 where x · ω = x0 ω0 ⊕ x1 ω1 ⊕ · · · ⊕ xϕ−1 ωϕ−1 is the dot product of x and ω. The Walsh spectrum can be efficiently computed from the truth table by a Fast Walsh Transform with complexity O(ϕ · 2ϕ ), see [144] for an explanation of the algorithm. As we will see below, the Walsh spectrum is a very useful tool for analyzing the cryptographic behavior of a Boolean function. Trace representation. A fourth representation is the trace representation of the Boolean function. Here we use the isomorphism between the vector space Fϕ 2 and the finite field F2ϕ . This representation is based on the trace function from F2ϕ to F2 , which is defined as follows: 2 3 tr(x) = x + x2 + x2 + x2 + . . . + x2 ϕ−1 . (2.10) Every Boolean function can be represented as the trace of a polynomial P (x): f (x) = tr(P (x)). Here x is an element of F2ϕ and P (x) is a polynomial with degree at most 2ϕ − 1. 2.2.2 Properties of Boolean Functions Several properties of Boolean functions are relevant from a cryptographic viewpoint. We now give an overview of these notions, which are related to the representations given in the previous section. 2.2. BOOLEAN FUNCTIONS AND S-BOXES 11 Affine equivalence. Two functions f1 , f2 on Fϕ 2 are said to be affine equivalent if f1 (x) = f2 (xA ⊕ b) ⊕ x · c ⊕ d , (2.11) where A is an invertible ϕ × ϕ matrix over F2 , b, c ∈ Fϕ 2 , d ∈ F2 . Two functions are said to be affine equivalent in the input variables if and only if c = 0, d = 0 in the above equation. Algebraic degree. The algebraic degree of f , denoted by df or deg f , is defined aϕ−1 as the highest number of variables in a monomial xa0 0 . . . xϕ−1 in the ANF of f having nonzero coefficient. A function which has df ≤ 1 is called affine. An affine function for which f (0) = 0 is called linear. Balancedness. The support of f is defined as sup(f ) = {x ∈ Fϕ 2 : f (x) = 1} and corresponds to the inputs to f that give a 1 as output. The cardinality of sup(f ) represents the weight wt(f ) of the function, it is the number of ones in the truth table of f . A function is said to be balanced if wt(f ) = 2ϕ−1 . The balancedness can be easily derived from the Walsh spectrum: a function is balanced if and only if Wf (0) = 0. Nonlinearity. We first introduce the Hamming distance d(f, g) between two Boolean functions f and g as the number of inputs for which the two functions differ: d(f, g) = #{x ∈ Fϕ (2.12) 2 |f (x) 6= g(x)} . The nonlinearity Nf of the function f is defined as the minimum distance between f and any affine function. It is very meaningful to note that the nonlinearity can be easily derived from the Walsh spectrum. Indeed, it follows from the definition of the Walsh spectrum that: Wf (ω) = 2ϕ−1 − 2 · wt(f ⊕ x · ω) . (2.13) Therefore, the nonlinearity can be calculated as follows: Nf = 2ϕ−1 − 1 · max |Wf (ω)| . 2 ω∈Fϕ2 (2.14) The best affine approximation l(x) is associated with the notion of nonlinearity. f has bias ² if it has the same output as its best affine approximation with probability 0.5 + ². It is easy to derive the bias of the best affine approximation from the Walsh spectrum: ²= maxω∈Fϕ2 |Wf (ω)| Nf − 0.5 = . 2ϕ 2ϕ+1 (2.15) 12 CHAPTER 2. BASIC CONCEPTS Every Boolean function has a best affine approximation with nonzero ². This follows from Parseval’s theorem [139]: X Wf2 (ω) = 22·ϕ , (2.16) ω∈Fϕ 2 from which the following bound on ² immediately follows: ² ≥ 2−ϕ/2−1 . Another related equation we will use is X Wf (ω) = ±2ϕ . (2.17) (2.18) ω∈Fϕ 2 It is easy to see that ² will be as small as possible when all values of Wf2 (ω) are equal. Such a function with all Wf (ω) = ±2ϕ/2 is called a bent function [146]. Since Wf (ω) is integer-valued, bent functions only exist for even number of inputs ϕ. Note that bent functions are not balanced: Wf (0) = ±2ϕ/2 6= 0. Therefore, stream ciphers typically do not use bent functions. The size of the smallest bias to be found in balanced Boolean functions is still an open problem, but some bounds have been presented, see [79] for an overview. In this work, we will often consider bent functions as the best achievable construction with respect to some attacks. Correlation-immunity and resiliency. A function f is said to be correlationimmune(CI) [155] of order ρ, CI(ρ), if and only if its Walsh transform Wf satisfies Wf (ω) = 0, for 1 ≤ wt(ω) ≤ ρ. If the function is also balanced, then the function is called ρ-resilient. Siegenthaler’s inequality [155] states that if a ϕ-input Boolean function is ρresilient and has degree d, then it holds that ρ ≤ ϕ − d − 1. If a ϕ-input Boolean function is ρ-CI and has degree d, then it holds that ρ ≤ ϕ − d. There is a meaningful trade-off between resiliency and nonlinearity of a Boolean function, as described by [41]. A bound that follows from this work and that we will use here is: ² ≥ 2ρ+1−ϕ , (2.19) Algebraic immunity. The lowest degree of a function g from Fϕ 2 into F2 for which f (x) · g(x) = 0 for all x ∈ Fϕ or (f (x) ⊕ 1) · g(x) = 0 for all x ∈ Fϕ 2 2 is called the algebraic immunity (AI) [123] of the function f . A function g is said to be an annihilator of f if f (x) · g(x) = 0 for all x ∈ Fϕ 2 . It has been shown [55] that any function f with ϕ inputs has algebraic immunity at most d ϕ2 e. 2.2. BOOLEAN FUNCTIONS AND S-BOXES 13 (dg , dh )-relation. We now introduce a notion that is relevant to fast algebraic attacks. A Boolean function f on Fϕ 2 satisfies a (dg , dh )-relation (shortly (dg , dh ) − R) if a tuple of Boolean functions (g, h) of degree (dg , dh ) exists such that f (x) · g(x) = h(x) for all x ∈ Fϕ 2 . This notion is only relevant in this thesis when dg < dh . 2.2.3 S-Boxes A vectorial Boolean function or (n, m) Substitution Box or S-Box F is a mapping from Fn2 into Fm 2 . An S-Box can be represented by an m-tuple (f1 , . . . , fm ) of Boolean functions fi on Fn2 (corresponding to the output bits). Consequently, many of the notions and the analysis methods introduced for Boolean functions can be extended to S-Boxes by looking at its component functions and their linear combinations; see Braeken [31] for an overview. In cryptanalysis, three important properties of S-boxes are its differential, linear and algebraic behavior. The differential behavior with respect to the linear operation ⊕ is represented by its difference distribution table. This is a 2n × 2m table in which every row represents an input difference and every column represents an output difference. Every element δa,b in the table then counts how often an input difference a leads to an output difference b: δa,b = #{x ∈ Fn2 : F (x) ⊕ F (x ⊕ a) = b} ∀ a ∈ Fn2 , ∀ b ∈ Fm 2 . (2.20) In cipher design, a relevant design criterium for S-boxes will often be that the maximum value of δa,b must be small. The linear behavior is represented by the linear approximation table, a 2n ×2m table in which every row represents an input mask a and every column represents an output mask b. Every element λa,b in the table then counts how often a · x equals b · F (x): λa,b = #{x ∈ Fn2 : a · x = b · F (x)} ∀ a ∈ Fn2 , ∀ b ∈ Fm 2 . (2.21) It is also an essential design criterium in cipher design that all values in the linear approximation table are close to 2n−1 . Concerning the algebraic behavior, we can write a set of equations relating the n input and m output bits describing the S-Box. We are interested in finding the minimum positive integer d such that deterministic equations of degree at most d relating input and output bits exist. An upper bound on d is the smallest value dmax that satisfies ¶ dX max µ n+m 2n < . (2.22) i i=0 14 CHAPTER 2. BASIC CONCEPTS Of course smaller values of d may exist. Armknecht [7] gives an overview of methods to find such algebraic relations, and a practical approach for small Sboxes is also illustrated by Biryukov et al. [23]. 2.3 Symmetric Algorithms In symmetric algorithms, sender and receiver share a secret key K prior to the communication. Symmetric algorithms used for encryption are block ciphers and stream ciphers. The sender encrypts the plaintext P = p0 p1 p2 . . . using K and the encryption algorithm, and obtains the ciphertext C = c0 c1 c2 . . .. The receiver then decrypts the ciphertext C using K and the decryption algorithm and recovers the plaintext P . A symmetric algorithm used for authentication is called a Message Authentication Code or MAC. We now describe block ciphers, stream ciphers and MACs. 2.3.1 Block Ciphers An n-bit block cipher divides the plaintext into blocks of n bits, and encrypts one block at a time with a fixed key-dependent invertible transformation. In practice, the block size n is often 64 or 128 bits. A block cipher is a function E : P × K → C, with P, K and C the sets of all possible plaintext blocks, keys and ciphertext blocks. M = C = {0, 1}n and K = {0, 1}k . For every key K ∈ K, the encryption function E(·, K) = EK (·) is invertible and its inverse is denoted D(·, K) = DK (·). In a secure block cipher it is infeasible to determine K from a number of plaintext/ciphertext pairs faster than trying all possible K, and the permutation on all n-bit words defined by EK (·) should be indistinguishable from a random permutation. Practical block ciphers are complex invertible transformations, but these are obtained from some very simple building blocks. An iterated block cipher encrypts plaintext by encrypting it r times sequentially with a round function. Every round function receives a subkey which is derived from the actual key K, and performs confusion and diffusion (cf. Shannon [154]) of its input. The advantages of such an iterated structure are that the round function can be chosen to be efficiently implementable, and that the design can be easy to analyze. A block cipher is just a component for encryption and needs to be used in conjunction with a mode of operation. For confidentiality, NIST [132] has standardized five modes of operation, named ECB, CBC, CFB, OFB and CTR modes. 2.3. SYMMETRIC ALGORITHMS 2.3.2 15 Stream Ciphers Most textbooks describe a stream cipher as being either synchronous or selfsynchronizing. However, some recent designs, notably the eSTREAM candidate Phelix [161], fall in neither of both categories. This is why we will give a more general definition of a stream cipher in this section, where we will see a stream cipher as an entity having a time-varying internal state, and consider synchronous and self-synchronizing stream ciphers as two special cases. This definition is inspired by Daemen [57] and by our experience in the eSTREAM Call for Primitives [73]. A stream cipher encrypts one digit of plaintext at a time with a time-varying transformation. This implies that a stream cipher should have an internal state S. The digit that is encrypted at each time can either be a bit, a byte, a 32-bit word, . . . A stream cipher is first loaded with an s-bit initial state S0 and a kc-bit cipher key Kc. At each time t, the output function O of the stream cipher generates a key stream digit zt . This key stream digit is used to encrypt a plaintext digit pt with an encryption function E and we obtain the ciphertext ct . Then, the state St is updated by the state update function U . The encryption process is thus governed by the following three equations: zt = O(St , Kc) ct = E(pt , zt ) (2.23) St+1 = U (pt , St , Kc) , where the encryption function E is such that it is easy to construct a decryption function D; the decryption process can be described as follows: zt = O(St , Kc) pt = D(ct , zt ) (2.24) St+1 = U (pt , St , Kc) . In many stream ciphers, the encryption operation E is modular addition. Those stream ciphers are called additive. In practice we often work in a field F2n , and E (and therefore also the decryption operation D) is the bitwise exclusive-or operation ⊕. We will from now on only consider such additive stream ciphers for ease of notation, unless otherwise stated. The initial state and cipher key pair (S0 , Kc) should be properly initialized at time t = 0. In practice, they are often derived from a secret k-bit key K and an iv-bit initialization vector IV by an initialization function In as follows: (S0 , Kc) = In(K, IV ) . (2.25) 16 CHAPTER 2. BASIC CONCEPTS The user of the stream cipher should make sure that the key stream generated from a single (K, IV ) pair is never used to encrypt more than one plaintext, otherwise the security of the stream cipher is highly compromised. Synchronous Stream Ciphers In a synchronous stream cipher, the state update mechanism is updated independently of the plaintext. The only considerable restriction is that the plaintext no longer appears in the third equation of (2.23). We hence have the following system of equations for an additive synchronous stream cipher: zt = O(St , Kc) ct = pt ⊕ zt (2.26) St+1 = U (St , Kc) . A synchronous stream cipher during encryption/decryption is depicted in Fig. 2.2. U U St+1 Kc St+1 St Kc O pt St O zt ct zt pt Figure 2.2: Synchronous stream encryption The implications of the state being independent of the data are substantial. • The key stream generator is now an autonomous finite state machine which acts like a cryptographically secure pseudo-random number generator. In this way, it is a practical (but not provably secure) alternative to the provably secure (but impractical) one-time pad or Vernam cipher [159]. • Synchronization requirement: sender and receiver need to be synchronized for proper decryption. As soon as a digit is inserted or deleted during 2.3. SYMMETRIC ALGORITHMS 17 transmission, sender and receiver need to be resynchronized. Typically this is done by starting a new key stream with a new unique IV , using the initialization mechanism from (2.25). The initialization mechanism is hence called resynchronization mechanism for synchronous stream ciphers. • No error propagation: If a digit is modified due to a transmission error, only this digit will be decrypted erroneously. The transmission error will not affect the decryption of other digits. This is a useful property, for example ensuring robustness in wireless communications, where bits are often flipped due to channel noise. • Particular security model: the fact that the plaintext does not affect the key stream generator makes the security analysis of synchronous stream ciphers radically different from the analysis of non-synchronous stream ciphers and block ciphers. For the two latter, the attacker can mount chosen plaintext/ciphertext attacks whereby he can influence the internal state or intermediate variables. For synchronous stream ciphers, this is of no use. The only attack scenario on the cipher during key stream generation that is feasible is a known plaintext attack, which is equivalent to a known key stream attack and a chosen plaintext attack here. Because of these properties, synchronous stream ciphers are the most widely studied type of stream ciphers. For instance, in the eSTREAM project, 31 out of the 34 candidates are synchronous stream ciphers. The cryptanalysis and design of synchronous stream ciphers is also the main subject of this thesis. There exist a very large number of design principles for synchronous stream ciphers. An important distinction is between ciphers that have a linear state update mechanism and those that have a nonlinear state update mechanism. We now give a very rough categorization of synchronous stream ciphers. • Linear state update mechanism: in these stream ciphers the update function U is linear and can hence be represented by a multiplication by a matrix. Two basic models of such key stream generators are the combination and the filter generator. A combination generator uses several LFSRs in parallel and combines their outputs in a nonlinear way, by means of a Boolean combining function, to obtain one key stream digit at every clock. If the internal state only consists of a single LFSR, and a key stream digit is computed at each clock by taking some bits at several positions of the LFSR and inputting them to a nonlinear Boolean function (called filter function here), a filter generator is obtained. Examples of such stream ciphers include the eSTREAM candidates SFINKS [35] and WG [91]. • Nonlinear state update mechanism: here part (or all) of the nonlinearity is due to the state update mechanism. 18 CHAPTER 2. BASIC CONCEPTS Kc ··· ¾ - © HH · · · ©© ?? ? fc pt zt ? -L ct ··· Kc HH H· · · ©© ? ?? fc zt ? -L - pt Figure 2.3: Self-synchronizing stream encryption – LFSR based: many of such designs also employ the widely used LFSRs as a building block, but they destroy the linearity of the state by either: ∗ Adding some nonlinear memory in addition to the LFSRs. This is the strategy adopted in the summation generator [148] and in E0, the stream cipher used in Bluetooth [157]. ∗ Clock control: such designs are also based on LFSRs, but depending on some state values it is decided whether a register is clocked and how often. Such approach is used by the A5 ciphers used in GSM [4] mobile phones, the shrinking [49] and self-shrinking [124] generators, and by the eSTREAM candidates Decim [17], Mickey [12] and Pomaranch [103]. – Various other designs: there are plenty of other design strategies; this includes stream ciphers based on random shuffles such as RC4 [153], Tfunctions [107] and various types of nonlinear feedback shift registers (NFSRs). Self-Synchronizing Stream Ciphers Figure 2.3 shows a simplified representation of a self-synchronizing stream cipher. In such a design, the next key stream digit zt is fully determined by the last nm ciphertext digits and the cipher key Kc which is derived from the secret key K upon initialization. This can be modelled as the key stream symbol being computed by a keyed cipher function fc operating on a shift register that contains the last nm ciphertext digits. This conceptual model can be implemented in various ways. For the first nm plaintext or ciphertext symbols, the previous nm ciphertexts do not exist. Hence the self-synchronizing stream cipher must be initialized by loading nm dummy ciphertext symbols, called the initialization vector IV . 2.4. SOME IMPORTANT CRYPTANALYTIC TOOLS 19 Some important properties of self-synchronizing stream ciphers include the following: • Self-synchronization: if a ciphertext is deleted, inserted or flipped, the stream cipher will automatically resume proper decryption after nm digits. There is hence no need to resynchronize, as this will be done automatically. • Limited error propagation: if, during transmission, a ciphertext digit is inserted, deleted or flipped, only the next nm ciphertext digits will be wrongly decrypted. • Security model: contrary to the case of synchronous stream ciphers, ciphertext does enter the state. This makes chosen plaintext and chosen ciphertext attacks on these designs possible. This last property makes the design of self-synchronizing stream ciphers radically different from the design of synchronous stream ciphers, and this topic falls outside the scope of this thesis. We believe however that the cryptanalysis and design of self-synchronizing stream ciphers is a very interesting research topic. Since the publication of the attacks on the eSTREAM candidates SSS by us [59] and on Mosquito by Joux et al. [105], no self-synchronizing stream cipher remains unbroken. We believe it would be useful to have a dedicated self-synchronizing stream cipher as a more efficient alternative to AES in CFB mode. AES in CFB mode is indeed quite slow, as we will only use one digit (typically one byte or one bit) out of the 128 bits generated at every iteration. 2.3.3 Message Authentication Codes A third class of symmetric algorithms, Message Authentication Codes or MACs, are used for authentication of messages. A MAC function h takes as input an arbitrary-length message M and a secret key K, and computes a n-bit MAC value h(M, K), where the length n is e.g. 64 bit or 128 bit. Given h and M , an attacker may not be able to compute h(M, K) with probability significantly higher than 1/2n . Also, it should be hard for an attacker to compute h(M, K) even when given a large set of pairs {Mi , h(Mi , K)} for any Mi 6= M . Very few dedicated MACs exist, and most MAC algorithms are based on block ciphers or hash functions. In Chapter 8 we will describe and attack one such MAC algorithm, the Alleged SecurID Hash Funtion. 2.4 Some Important Cryptanalytic Tools In our treatment of cryptanalysis of synchronous stream ciphers in the following chapters, some important cryptanalytic techniques will be frequently used. These 20 CHAPTER 2. BASIC CONCEPTS techniques are based on a combination of algebra, statistics, and numerical techniques. We will describe in this section the three techniques that have had the greatest influence in the cryptanalysis of symmetric algorithms. These techniques are differential cryptanalysis, linear cryptanalysis and algebraic cryptanalysis. We also explain how we will apply these and other cryptanalytic techniques in the following sections. 2.4.1 Differential Cryptanalysis Differential cryptanalysis [22] was developed by Biham and Shamir around 1990. Differential cryptanalysis was developed for application to block ciphers, but can also be applied to many other cryptographic primitives. The central idea of differential cryptanalysis is to choose a difference for many pairs of inputs to the primitive, and to observe the corresponding differences in the outputs. Usually the difference is defined with respect to the ⊕ operation. We now give a brief description of differential cryptanalysis for block ciphers. Let P and P 0 be two chosen plaintexts with corresponding ciphertexts C = EK (P ) and C 0 = EK (P 0 ). We denote the corresponding differences by ∆P = P ⊕ P 0 and ∆C = C ⊕ C 0 . In differential cryptanalysis, we will study how the difference ∆P propagates through the cipher. First, we need to look to the individual building blocks of the cipher. For affine transformations, it can be easily shown that the differences propagate in a deterministic way. For the nonlinear S-boxes in the design, the probability that a given input difference leads to a given output difference can be calculated from the difference distribution table as defined in Sect. 2.2.3. Second, we need to concatenate the differentials of the building blocks to cover (several rounds of) the block cipher. Here we try to find a path of differences with high probability through the cipher, which we call a characteristic of the cipher. In practice it is often a good approximation to assume that the probabilities are independent, and the probability of the characteristic is the product of the probabilities of the building blocks through which the characteristic passes. In the end, we want to find a differential (∆P, ∆C) that has high probability. Such a differential can be built from one or more characteristics that have the same input and output difference, and the probability of the differential is the sum of the probabilities of the underlying characteristics. Using such a highprobability differential of (some rounds of) the block cipher, we can then mount a cryptanalytic attack. A concrete application of differential cryptanalysis will be presented in Chapter 8 on SecurID. For synchronous stream ciphers, differential cryptanalysis can be used to attack the resynchronization mechanism, as will be explained in Chapter 6. 2.4. SOME IMPORTANT CRYPTANALYTIC TOOLS 2.4.2 21 Linear Cryptanalysis Linear Cryptanalysis [118, 119] was introduced by Matsui in 1993. Its original application has also been on block ciphers but it can be applied to many other cryptographic primitives as well. Its central idea is to find affine approximations of the cipher, resulting in one or more equations that relate the input, output and key material and hold with probability different from one half. We now give a brief description of linear cryptanalysis for block ciphers. Let ΓP , ΓC ∈ Fn2 and ΓK ∈ Fk2 be the input, output and key masks respectively. Then our aim is to find an equation ΓP · P ⊕ ΓC · C ⊕ ΓK · K = 0 (2.27) which holds with probability 1 + 2 + ε where the absolute value of ε is as large as possible. To find such an equation, we first need to find affine approximations for the building blocks of the cipher. Evidently, all affine parts need not be approximated. For the nonlinear S-boxes, we can deduce the bias ² of any affine approximation from the linear approximation table as defined in Sect. 2.2.3. In a second step, we need to concatenate linear approximations of the building blocks to find an approximation of type (2.27) for the whole cipher. The probability of the total equation is usually very hard to compute, but in practice a good approximation is often obtained by assuming independence and using the piling-up lemma. Lemma 2.4.1 (Piling-up lemma [118]) Let X1 , X2 , . . . , Xn be n independent random binary variables. Then the bias ε of the sum X = X1 ⊕ X2 . . . ⊕ Xn is given by: n Y ε = 2n−1 · ²i , (2.28) i=1 where ²1 , ²2 . . . ²n are the biases of X1 , X2 , . . . , Xn respectively. Matsui [120] has presented a recursive algorithm that can find the best approximation for DES-like ciphers. Similar techniques can be applied to other ciphers, but the efficiency varies greatly. A concrete application of linear cryptanalysis will be presented in Chapter 8 on SecurID. For synchronous stream ciphers, linear cryptanalysis will be treated in Chapter 3 and will also be used to attack the resynchronization mechanism in Chapter 6. 2.4.3 Algebraic Cryptanalysis Differential and linear cryptanalysis use a statistical approach for cryptanalysis. In algebraic cryptanalysis, a different approach is adopted. Any cryptographic 22 CHAPTER 2. BASIC CONCEPTS primitive can be expressed as a large system of multivariate equations. In algebraic attacks, the cryptanalyst will first try to find a set of multivariate equations that is as simple as possible. In a second phase, he will try to solve this system of equations. The first step, finding a simple set of equations describing the primitive, is a complex problem. What is the simplest set of equations and how to find it depends on the type of cipher we are analyzing and the solution method to be used. In particular, systems that are of low degree, sparse, overdefined and have special structure will be easier to solve. We will discuss strategies to find systems of equations that are easier to solve than expected for stream ciphers in Chapter 4. In the second step, we need to solve the system of multivariate equations obtained in the first step. A number of methods exist that achieve this. A detailed description of these algorithms and their complexities falls outside of the scope of this thesis, but we give a brief overview necessary to understand some attacks in this work. • Linearization: in linearization, we replace each monomial in the system of equations by a new unknown. If we have a system of equations with u unknowns of degree d, the total number of possible monomials, which we denote by Mud , is given by Mud = d µ ¶ X u i=0 i . (2.29) To solve this linearized system, the system of equations should be overdetermined so that we have more than Mud equations. Then, the linear system can be solved through Gaussian elimination and similar techniques. The asymptotic complexity of this Gaussian elimination is of the order of (Mud )ω , which we will denote by O((Mud )ω ). For a straightforward implementation of the algorithm, ω = 3. Better algorithms exist, but the actual asymptotical behavior of the problem is unknown. It has been shown that ω < 2.376 in theory [50]. However, the best currently known algorithm is Strassen’s algorithm [158], which has ω = log2 7 ≈ 2.807. • Gröbner bases algorithms. Algorithms using Gröbner bases are the most widely used algorithms for solving multivariate algebraic systems. The most substantial algorithms in this class are Buchberger’s algorithm [37] and Faugères improved algorithms F4 [76] and F5 [77]. The asymptotical behavior of these algorithms depends on how the number of unknowns is related with the number of equations, see [14]. 2.4. SOME IMPORTANT CRYPTANALYTIC TOOLS 23 • XL and XSL algorithms. The XL [54] and XSL [56] algorithms have been recently presented and claim they may be more suited for solving the systems of equations that arise in cryptanalysis of symmetric algorithms. However, recent work on XL [67] and XSL [45] indicates that the complexity of these algorithms is higher than originally expected, and that XL actually turns out to be a special case of F4. 2.4.4 Application in This Thesis In this thesis, we will apply differential, linear, algebraic and other techniques to synchronous stream ciphers. Our attacks can be categorized in various ways. A first distinction regards the aim of the attacks: • Key (or state) recovery attack: here the cryptanalyst tries to recover the secret key K or the entire secret state (St , Kc) at some time t. An attack is successful if the attack is more efficient than trying all 2k secret keys exhaustively. • Distinguishing attack: as explained in Sect. 2.3.2, a synchronous stream cipher is a cryptographically secure pseudo-random number generator, and it makes sense to analyze whether this generator can be distinguished from a random function. This is the purpose of a distinguishing attack. In the NESSIE project, a distinguishing attack was considered successful if it required less than 2k data and time complexity. This was perceived as a too harsh requirement [145] by some authors, who questioned the relevance of such attacks. It is yet unclear how eSTREAM will consider distinguishing attacks. Authors within eSTREAM seem all to have made their own claims on the resistance against such attacks. In this thesis, we are going to take distinguishing attacks into account, mainly for two reasons: – Distinguishing attacks may compromise the privacy of the users in some practical scenarios. If for instance only two plaintexts are possible in a certain communication, a distinguishing attack may reveal which plaintext was most likely to be communicated. – Distinguishing attacks on synchronous stream ciphers mostly indicate a weakness in the design. Very often, a distinguishing attack can later be improved to become a key recovery attack. A second distinction we make is between attacks on the key stream generation and attacks on the resynchronization mechanism: • Attacks on the key stream generation: in such attack we assume that the stream cipher is loaded with a random initial state (S0 , Kc) and starts gen- 24 CHAPTER 2. BASIC CONCEPTS erating key stream. We try to find weaknesses resulting from the combined action of the update function U and the output function O in (2.26). • Attacks on the resynchronization (or initialization) mechanism: these attacks also take into account the initialization function In from (2.25). Typically, such attacks search for relations between the secret key K, a large number of IV values and short strings of key streams generated at the beginning of the corresponding frames. In the following four chapters, we present a framework for cryptanalysis of stream ciphers in which we include our own results in the field. The first three chapters describe attacks on the key stream generation: linear attacks in Chapter 3, algebraic attacks in Chapter 4 and several other attacks in Chapter 5. The last of these, Chapter 6, presents attacks on the resynchronization mechanism. We will then use these insights to discuss actual implementation of stream ciphers in Chapter 7. 2.5 Conclusions In this chapter we have presented several basic concepts used in this thesis. We first introduced some important building blocks in stream cipher design, namely Linear Feedback Shift Registers (LFSRs), Boolean functions and S-Boxes. The classes of symmetric algorithms were presented and the different types of stream ciphers were discussed. We motivated why synchronous stream ciphers are a very interesting topic to be investigated. We then introduced differential, linear and algebraic cryptanalysis as the most substantial tools used by the cryptanalyst, and explained how these tools will be used in the following chapters of this thesis. Chapter 3 Linear Attacks In this chapter we present some results on cryptanalysis of synchronous stream ciphers using the tools of linear cryptanalysis, which have already been briefly introduced in Sect. 2.4.2. First, a general overview of the aim and approach of linear attacks on stream ciphers is presented in Sect. 3.1. We then study two applications of linear attacks on synchronous stream ciphers in more detail. First, a framework for evaluating the resistance of a class of stream ciphers, the filters and combiners, against linear attacks is developed Sect. 3.2. This has previously been published by us in [33]. Then, we describe linear distinguishing attacks on the stream cipher Sober-t32, which we have published in [11]. 3.1 General Approach Linear attacks on stream ciphers also use the basic principle of linear cryptanalysis: the cryptanalyst tries to linearly approximate the nonlinear building blocks of the stream cipher, and uses this approximation to mount an actual attack on the stream cipher. As explained in Sect. 2.4.4, the attacker will either try to mount a distinguishing attack or try to recover the secret state of the cipher. • The idea of distinguishing the stream cipher from random by using linear approximations of the building blocks that hold with nonzero bias ² dates back to the work of Ding et al. [68] and Golic [83]. We first simplify the system of equations for an additive synchronous stream cipher (2.26) by not considering a separate cipher key (if it exists, we just 25 26 CHAPTER 3. LINEAR ATTACKS include it into the state): ½ zt = O(St ) St+1 = U (St ) . (3.1) Then each output bit is determined by zt = O(U ◦ U . . . ◦ U (S0 )) = O(U t (S0 )) and should be a balanced function of S0 . In [83], it is first argued that if you take m bits of key stream where m is strictly larger than the number s of internal state bits, then the output bits (zt , zt+1 , . . . zt+m−1 ) cannot be a balanced function of S0 , since S0 only contains s bits. Therefore, there must exist a mask Γz such that ΓZ · (zt , zt+1 , . . . zt+m−1 ) (3.2) is equal to zero with probability 1/2 + ² and ² 6= 0. It can be shown that O(1/²2 ) samples are then needed to distinguish the key stream from a truly random stream. This idea has been extensively used in several attacks on stream ciphers. A comprehensive framework for such attacks on stream ciphers with linear masking has been presented by Coppersmith et al. in [48]. We do not describe this framework into more detail here but the attack in Sect. 3.3 can be seen as an application of this framework. Distinguishing attacks based on linearization have been extensively used in the literature; some recent results include attacks on SNOW 2.0 [160] and the eSTREAM candidate NLS [43]. • In a distinguishing attack, we need biased equations relating only the key stream bits. If the aim is to recover the internal state instead, biased equations relating the state bits with the key stream bits will first be needed. The idea to mount such attacks on stream ciphers was first introduced by Siegenthaler with the correlation attack on combination generators [156]. In most cases, recovering the internal state from the biased equations is not an easy task and depends on the details of the cipher. Some proposed methods include divide-and-conquer approaches, decoding techniques, the use of parity checks, . . . Attacks by linearization are a very powerful method of cryptanalysis, and every new design should provide a rationale for its resistance against such attacks. The methods for searching the best linear approximations for synchronous stream ciphers are quite often ad hoc, because it is mostly impossible to search for the best approximation with a search algorithm such as proposed by Matsui for DES [120]. 3.2. FILTER AND COMBINATION GENERATORS 27 A simple design strategy is clearly preferable. Ciphers built from simple building blocks are easy to analyze with respect to such attacks. We believe that this is preferable to a strategy where the analysis is made very difficult because of the use of complicated building blocks and interconnections, but where unexpected biases may arise eventually. We will now show two applications of such attacks, on the general filter and combiner model and on the stream cipher Sober-t32. 3.2 Linear Cryptanalysis of Filter and Combination Generators In this section, we investigate linear attacks on the filter and combination generator, and will derive minimal requirements that the LFSRs and Boolean functions should satisfy to resist these attacks. Our goal is to investigate whether it is possible to construct practical and secure filter or combination generators with 80-bit key security (the eSTREAM requirement for hardware-oriented stream ciphers [73]) and low hardware cost, which implies that we should keep it close to the edge of the minimal requirements while at the same time keeping a reasonable security margin. We will cover both distinguishing and state recovery attacks, the latter known as correlation attacks. Our treatment of distinguishing attacks combines and extends the ideas of the recent work of Molland and Helleseth in [127] and of Englund and Johansson in [75] to concrete distinguishing attacks on all filter and combination generators. It follows that distinguishing attacks are very similar to correlation attacks but are often more efficient, as they can use many linear approximations simultaneously and do not need a decoding algorithm. 3.2.1 Distinguishing Attacks The distinguishing attack we describe here is based on the framework developed in [75] combined with the mathematical results from [127]. We extend the framework for the filter generator and also develop the attack for the combination generator. Filter generator The idea of the attack is the following. The LFSR stream has very good statistical properties, but of course there are linear relations, which are determined by the feedback polynomial P (X) and its multiples Q(X), where the cryptanalyst first needs to find the latter in a precomputation step. 28 CHAPTER 3. LINEAR ATTACKS Given a linear relation Q(X) of weight w in the LFSR stream, the same relation for the corresponding bits of the key stream will hold with some probability different from one half, because the Boolean function f does not destroy all linear properties. This probability is determined by the combined action of all affine approximations of the Boolean function. This interaction is governed by the piling-up lemma (2.28), and can be expressed as [127]: P2ϕ −1 (Wf (ω))w 0 ε = ω=0 ϕ·w+1 . (3.3) 2 Some implications of this formula are: • The size of the bias will decrease rapidly with the weight of the linear relation. Hence, the LFSR polynomial should not have a low weight or have low-weight multiples of low degree. • For even weight, the biases will all work together to the advantage of the cryptanalyst, whereas for odd weight there is partial compensation due to the varying signs in the sum. • To minimize the observed bias, the Boolean function should have a flat Walsh spectrum. This incorporates a high nonlinearity (as the largest terms will dominate in (3.3), especially for increasing weight) but also other factors have to be taken into account: the number of times the highest Walsh value is reached, the signs of the highest values, . . . For instance, one can see that the plateaued functions [167] are not an optimal choice against these attacks: they have high nonlinearity, but the maximal Walsh value occurs very often. The function with the best resistance to this attack is the bent function, which has an entirely flat Walsh spectrum. We will hence use this function here to establish upper bounds for the complexity of these attacks. To distinguish the key stream from random, the number of samples needed is O(1/ε02 ), which is also the time complexity of the attack. The data complexity is the sum of the degree of the relation Q(X) and the number of samples needed. It is possible to decrease the data complexity to some extent by using multiple relations simultaneously. We now study the impact of this attack on filter generators with bent functions, as these offer the best resistance against this distinguishing attack. We assume that the LFSR has an internal state of 256 bits and investigate the data complexity of the attack as a function of the number of inputs. By combining (2.16), (2.17), (2.18), (2.19) and (3.3), we can calculate that the bias for bent functions can be written as: ε0 = 2−(dw/2e−1)·ϕ−1 . (3.4) 3.2. FILTER AND COMBINATION GENERATORS 29 Table 3.1: Optimal complexities of the distinguishing attacks on the filter generator (for bent functions) Inputs 4 6 8 10 12 14 16 18 20 Weight 10 8 8 6 6 6 6 6 6 log2 (ε0 ) -17.00 -19.00 -25.00 -21.00 -25.00 -29.00 -33.00 -37.00 -41.00 log2 (Dmin ) 30.50 38.33 38.33 52.58 52.58 52.58 52.58 52.58 52.58 log2 (attack compl.) 34.12 39.17 50.00 52.58 52.80 58.03 66.00 74.00 82.00 A cryptanalyst is interested in finding the weight w for which the attack complexity is minimal. For very low weight w, the degree of Q(X) will be prohibitively high as shown by (2.4). For very high weight w, the bias (3.4) will be too small. We hence expect an optimal trade-off somewhere in between. These optimal values for some bent functions with an even number of inputs are shown in Table 3.1. The conclusion of this table is that, if we take into account the NESSIE requirements, no 256-bit LFSR with a Boolean function with less than 20 inputs can be made secure against this distinguishing attack! However, we have to make two observations: • The precomputation time to find the necessary multiples Q(X) is very high (approximately O(2150 ), governed by (2.4) and (2.5)). Which amount of precomputation we may allow is an interesting topic of discussion. Note that this also applies to other attacks such as trade-off attacks, correlation attacks and algebraic attacks. • There has been some debate on the relevance of distinguishing attacks requiring long key streams during the NESSIE stream cipher competition [145]. Whereas time complexity is only a matter of resources at the attacker’s side, available data depends on what has been encrypted by the sending party. Hence, we propose to limit the maximum amount of key stream generated from a single key/IV pair to 240 bits (practical assumption), after which the scheme should resynchronize. This measure prevents the appearance of low-weight multiples: from (2.4) it follows that normally no multiples of weight less than 8 exist with degree less than 240 . Now, we 30 CHAPTER 3. LINEAR ATTACKS Table 3.2: Optimal complexities of the distinguishing attacks on the filter generator (for bent functions), when the length of a single key stream is restricted to 240 bits Inputs 4 6 8 10 12 14 16 18 20 Weight 10 8 8 8 8 8 8 8 8 log2 (ε0 ) -17.00 -19.00 -25.00 -31.00 -37.00 -43.00 -49.00 -55.00 -61.00 log2 (Dmin ) 30.50 38.33 38.33 38.33 38.33 38.33 38.33 38.33 38.33 log2 (attack compl.) 34.12 39.17 50.00 62.00 74.00 86.00 98.00 110.00 122.00 can recalculate the best attack complexities, by adding the extra constraint log2 (Dmin ) < 40. We now obtain the values in Table 3.2. It follows that, under this practical restriction, nonlinear filters can be built which are secure against distinguishing attacks starting from 14 inputs. Note that the restriction of a single key stream to 240 bits also provides protection against other cryptanalytic attacks, as explained below. Combination generator For combination generators, the attack can be improved by using a divide and conquer approach. Let us assume we have a combiner with ϕ LFSRs and that the Boolean function has resiliency ρ. The average length of one LFSR is thus 256/ϕ, where 256 is the total size of the internal state proposed here. Note that in practical designs, the lengths of the LFSRs have to be distinct. An attacker will of course try to attack the smallest LFSRs first. We here take the average length of the LFSRs to keep the analysis simple. The implication is that the complexities given here are an upper bound for the actual attack complexity. The attacker now mounts the same distinguishing attack, restricting himself to r LFSRs, where r must be of course strictly greater than the order of resiliency ρ. Again, we first search for a low-weight multiple of this equivalent 256 · r/ϕlength LFSR, and then try to detect the bias that remains of this relation after the Boolean function. From the piling-up lemma, it follows that this bias is as follows: P (Wf (ω))w 0 ε = ω∈Sϕ·w+1 , (3.5) 2 3.2. FILTER AND COMBINATION GENERATORS 31 where the set S is defined as: S = {ω | wt(ω) > ρ and ω < 2r } , (3.6) assuming, without loss of generality, that the attacked LFSRs are numbered 0, 1, . . . r − 1. This divide and conquer approach gives the attacker some advantages compared to the case of the nonlinear filter: • If the resiliency of the Boolean function is low, the length of the attacked equivalent LFSR can be low. First, this will allow the attacker to perform the precomputation step governed by (2.4) and (2.5) much faster. Second, the length of the low-weight multiples will be much lower, making the data complexities feasible for much lower weights, where the detectable biases will be much larger as shown by (3.5). Note that when 256 · r/ϕ is very small, we will even be able to mount a powerful weight-2 attack without precomputation step: this will just correspond to the period of the small equivalent LFSR. As shown in [75], such an attack can be easily turned into a key recovery attack. • If the resiliency of the Boolean function is high, we still have the advantages explained in the above point, but to a lesser extent. But here the trade-off between resiliency and nonlinearity (2.19) will come into play, resulting in higher Walsh values in (3.5) and hence a higher bias. It follows that it is much harder to make the combiner model resistant to this distinguishing attack. We show the cases where the optimal trade-off is achieved in Table 3.3. The setting in this table is as follows: the cryptographer can, for a given number of inputs, choose optimal Boolean functions, namely such that either the bound (2.17) or the bound (2.19) holds. Note that this is again an optimal case for the cryptographer. The complexities given in the table are hence upper bounds. On real functions the complexities will be lower. The cryptographer tries to choose ρ such that he maximizes the resistance to the distinguishing attack. The attacker chooses the optimal number r of LFSRs he attacks, as well as the optimal weight of the multiples he will use. We see that it is even harder to obtain security here. In fact, we would need 36 inputs (not shown in the table) and an optimal Boolean function to achieve full security. Again, we propose to limit the maximum amount of key stream obtained from a single key/IV pair to 240 bits. This gives us the results shown in Table 3.4. We now see that, under ideal assumptions for the cryptographer, he can make a combiner which is secure starting from 18 inputs. It is important to note that these lower bounds will be very hard to approach in reality due to the difficulty of finding practical functions that have properties 32 CHAPTER 3. LINEAR ATTACKS Table 3.3: Optimal complexities (for the cryptanalyst) of the distinguishing attacks on the combination generator for optimal functions (for the cryptographer) Inputs 4 6 8 10 12 14 16 18 20 Resil. 1 2 3 4 5 6 7 8 9 r LFSRs 2 4 6 5 6 8 10 12 14 Wt 6 6 6 4 4 4 4 4 4 log2 (ε0 ) -13.00 -16.68 -20.54 -21.00 -25.00 -25.83 -27.19 -28.78 -30.48 log2 (Dmin ) 26.98 35.51 39.78 43.53 43.53 49.62 54.19 57.75 60.59 log2 (compl.) 27.57 35.81 41.57 43.96 50.02 51.97 55.29 58.65 61.79 Table 3.4: Optimal complexities (for the cryptanalyst) of the distinguishing attacks on the combination generator for optimal functions (for the cryptographer, when the length of a single key stream is restricted to 240 bits) Inputs 4 6 8 10 12 14 16 18 20 Resil. 1 2 3 4 5 6 7 8 9 r LFSRs 2 4 6 7 9 10 12 13 15 Wt 6 6 6 6 6 6 6 6 6 log2 (ε0 ) -13.00 -16.68 -20.54 -26.14 -29.98 -35.54 -39.37 -44.91 -48.73 log2 (Dmin ) 26.98 35.51 39.78 37.22 39.78 37.95 39.78 38.36 39.78 log2 (compl.) 27.57 35.81 41.57 52.28 59.96 71.08 78.73 89.81 97.46 3.2. FILTER AND COMBINATION GENERATORS 33 close to the optimal case described here, and due to the fact that our lower bounds consider equal LFSR lengths. In reality all LFSR lengths need to be distinct, which will allow the attack to improve his attack significantly by attacking the shortest LFSRs. The larger the number LFSRs, the more serious this problem becomes. As a result of this, we believe it is not possible to design a practical nonlinear combiner, with 256 bits internal state, that can resist this distinguishing attack. This in comparison to the case of the filter generator, where we can succeed to get close to the bounds with implementable functions, as evidenced by the power functions described in Chapter 7. We now develop a similar reasoning for the case of fast correlation attacks. 3.2.2 Fast Correlation Attacks Correlation [156] and fast correlation [122] attacks exploit the correlation between the LFSR stream st and the key stream zt . These attacks can be seen as a decoding problem, since the key stream zt can be considered as the transmitted LFSR stream st through a binary symmetric channel with error probability p = Pt≥0 (zt 6= st ). For the nonlinear filter generator, p is determined by 0.5 + ². For the combination generator, a divide-and-conquer attack as in the case of a distinguishing attack is possible. Consequently, it is sufficient to restrict the attack to a subset of LFSRs {i0 , . . . , iρ }, such that the following holds: P (f (x) 6= 0|xi0 = a0 , . . . , xiρ = aρ , ∀(a0 , . . . , aρ ) ∈ Fρ+1 ) 6= 1/2 . 2 (3.7) Here, as defined above, the parameter ρ corresponds with the order of resiliency of the Boolean function. Then, the attack consists of a fast decoding method for any LFSR code C of length N (the amount of available key stream) and dimension L (the length of the LFSR), where the length N of the code is lower bounded by Shannon’s channel coding theorem: N≥ L L ln(2)L = ≈ . C(p) 1 + p log2 p + (1 − p) log2 (1 − p) 2²2 (3.8) If we want to achieve perfect security against this attack for a 256-bit LFSR and allowing at most 240 bits in a single key stream, the bias ² should be less than or equal to 2−17 . To achieve this, we would need a highly nonlinear Boolean functions with more than 34 inputs. This can never be implemented efficiently in hardware. However, the above criterion is far too stringent as actual decoding algorithms are not able to achieve a good time complexity. We now look at the current complexities of these decoding algorithms. Besides the maximum-likelihood (ML) decoding, which has very high complexity of L · 2L , mainly two different approaches 34 CHAPTER 3. LINEAR ATTACKS have been proposed in the literature. In the first approach, the existence of sparse parity check equations for the LFSR code are exploited. These parity check equations correspond with the low weight multiples of the connection polynomial. In this way, the LFSR code can be seen as a low-density parity-check (LDPC) code and has several efficient iterative decoding algorithms. In the second approach, a smaller linear [n, l] code with l < L and n > N is associated to the LFSR on which ML decoding is performed. The complexity of both approaches strongly depends on the existence of low degree multiples. As shown above, the precomputation time for finding these low degree multiples is very high. Moreover, the approach requires long key streams and suffers from a high decoding complexity. To give an idea of the complexity of these attacks, we concentrate on the attack of Canteaut and Trabbia [38], since the complexity of the second approach is based on the Shannon’s channel coding theorem. When parity-check equations with weight w are used, the required key stream N is approximated by: µ 1 2² ¶ 2(w−2) w−1 L 2 w−1 , (3.9) and the complexity for the attack can be roughly estimated by µ 1 2² ¶ 2w(w−2) w−1 L 2 w−1 . (3.10) Note the very strong resemblance between this classical framework for the correlation attacks and the framework we developed in the previous subsection for distinguishing attacks. The main difference between the two is that in the correlation attacks we need a decoding method, whereas in the distinguishing attacks we just need to apply a simple distinguisher. Analysis of the above formulae learns that the complexity of the decoding procedure is much higher than for the distinguishing procedure. Even with huge improvements of the existing decoding procedures, we do not expect this situation to change. Hence we can conclude that a choice of parameters that makes the design resistant against distinguishing attacks (explained above), will also make it resistant against correlation attacks. 3.3 Linear Distinguishing Attack on Sober-t32 Sober-t32 is a software-oriented synchronous stream cipher designed by Rose and Hawkes [93] which has been submitted to the NESSIE competition. The cipher uses a Linear Feedback Shift Register (LFSR), a Non-Linear Function (NLF) and a irregular stuttering mechanism for the generation of the pseudorandom key stream. Sober-t32 is described in Appendix A.4. 3.3. LINEAR DISTINGUISHING ATTACK ON SOBER-T32 35 In [74], Ekdahl and Johansson mount distinguishing attacks on full Sobert16 and on Sober-t32 without the so-called stuttering unit. It will now be shown how the attacks from [74] can be extended to describe distinguishing attacks on full Sober-t32. 3.3.1 Extending the Attack on Unstuttered Sober-t32 to Full Sober-t32 In [74], Ekdahl and Johansson presented a distinguishing attack on unstuttered Sober-t32. In this section, we first briefly summarize this attack, and then explain how it can be extended to full Sober-t32. The Attack on Unstuttered Sober-t32 The attack starts by linearizing the equation of the NLF (A.11): h i vt = st ⊕ st+1 ⊕ st+6 ⊕ st+13 ⊕ st+16 ⊕ wt = Ωt ⊕ wt . (3.11) Then it will be argued that the noise wt , introduced by this approximation, has a biased distribution. In the next step, a new linear recurrence is obtained by repetitive squaring of the LFSR equation (A.10): st+τ5 ⊕ st+τ4 ⊕ st+τ3 ⊕ st+τ2 ⊕ st+τ1 ⊕ st = 0 , (3.12) with τ1 = 11, τ2 = 13, τ3 = 4 · 232 − 4, τ4 = 15 · 232 − 4 and τ5 = 17 · 232 − 4. This linear recurrence is valid for each bit position individually. Then the XOR between two adjacent bits in the stream vt are considered: vt [i] ⊕ vt [i − 1] = Ωt [i] ⊕ Ωt [i − 1] ⊕ wt [i] . (3.13) The distribution F [i] of wt [i] is then calculated. Simulation indicates that this distribution is quite biased. The largest bias was found for the XOR of bit 29 and 30. The bias is dependent of the corresponding bits of K, and for bit 29 and bit 30 it is at least ²30 = 0.0052. Now, given the NLF-stream v0 , v1 , . . . vN −1 , the linear recurrence (3.12) can be used to calculate Ωt+τ5 ⊕ wt+τ5 ⊕ Ωt+τ4 ⊕ wt+τ4 ⊕ vt+τ5 ⊕ vt+τ4 ⊕ vt+τ3 ⊕ vt+τ2 ⊕ vt+τ1 ⊕ vt = Ωt+τ3 ⊕ wt+τ3 ⊕ Ωt+τ2 ⊕ wt+τ2 ⊕ Ωt+τ1 ⊕ wt+τ1 ⊕ Ωt ⊕ wt , (3.14) 36 CHAPTER 3. LINEAR ATTACKS where the sum of all Ωj terms is zero because of (3.12). This equation can be rewritten as: vt+τ5 ⊕ vt+τ4 ⊕ vt+τ3 ⊕ vt+τ2 ⊕ vt+τ1 ⊕ vt = 5 M wt+τj . (3.15) j=0 The left hand side of this equation is noted as Vt , the right hand side as Wt . It is now possible to calculate the following probability: P (Vt [i] ⊕ Vt [i − 1] = 0) = P (Wt [i] ⊕ Wt [i − 1] = 0) = 1 + 25 ²6i . 2 (3.16) The final correlation probability for the six independent key stream positions can then be obtained for i = 30: p0 = P (Vt [30] ⊕ Vt [29] = 0) = 1 1 + 25 (0.0052)6 ≈ + 2−40.5 . 2 2 (3.17) In order to distinguish this nonuniform distribution P0 from a uniform source PU , the Chernoff information between the two distributions is calculated: X C(P0 , PU ) = − min log2 P0λ (x)PU1−λ (x) ≈ 2−81.5 . (3.18) 0≤λ≤1 x In order to obtain an error probability of Pe = 2−32 , N = 286.5 samples from the key stream are needed. Each sample spans τ5 = 17 · 232 ≈ 236 bits, so in total N + τ5 ≤ 287 words from the NLF-stream are needed to distinguish unstuttered Sober-t32 from a uniform source. The Attack on Full Sober-t32 In this section the attack above will be extended to Sober-t32 with the stuttering unit. The attack requires that the words vt from (3.14) survive the stuttering, and it should be known where they appear in the key stream zj . One might think that the probability of guessing the right positions for the large values τ3 = 4 · 232 − 4, τ4 = 15 · 232 − 4 and τ5 = 17 · 232 − 4 will be so small that the distinguishing attack will no longer be successful. In this section we will show that this is not the case. First of all, an expression is derived for the probability that a particular word will be at its most probable position in the key stream. A key stream word zi is taken, and this word comes from the word vt in the NLF-stream. Then the probability is calculated that the following words appear at their most probable position in the key stream. The most probable position of vt+11 in the key stream is zi+6 . Simulations indicate that the probability that this is a correct guess is 21.7%. The most 3.3. LINEAR DISTINGUISHING ATTACK ON SOBER-T32 37 probable position of vt+13 in the key stream is zi+7 . Simulations indicate that the probability that this is a correct guess is 19.8%. For the remaining three words, the situation is more complex. The probabilities will be calculated through a theoretical deduction. For the n-th word that goes to the stuttering unit, it can be expected that bn/25c stutter control words (SCW) have been used before. Of all the remaining (non-SCW) words, 50% are expected to go to the key stream. The most probable position in the key stream of the word vn is thus: E[position(vn )] = n c n − b 25 . 2 (3.19) In order to calculate the probability that the word vn will be at this most probable position, we will first calculate the probability that the n-th SCW appears at its most probable position in the NLF-stream. This probability is easier to calculate theoretically, and it is easy to see that the asymptotical behavior of both values will be the same. Two dibits, 00 and 11, determine what is going to happen with the next word. The two other dibits, 01 and 10, determine the stuttering of the two following words. As every dibit appears with the same probability, it is expected that a dibit uses 1.5 words of the NLF-stream on average. A SCW gives 16 dibits and uses thus an average of 24 words. Because the SCW is also coming from the NLF-stream, it is expected that the n-th SCW is at the position 25n of the NLF-stream. The probability that the n-th SCW is indeed on this position, is equal to the probability that n SCW’s determine the stuttering of exactly 24n words from the NLF-stream. This means that half of the 16n dibits should determine the stuttering of one word, the other half should determine the stuttering of two words. This probability can be expressed as follows: µ ¶ 16n 1 8n+8n 16n! P (position(SCW [n]) = 25n) = ( ) = . (3.20) (8n!)2 · 216n 8n 2 This equation can be approximated by using the Stirling equation for large faculties: √ n! ≈ 2πn · nn · e−n . (3.21) This gives the following equation: √ 1 2π16n · (16n)16n · e−16n √ =√ . P (position(SCW [n]) = 25n) ≈ 8n −8n 2 16n ( 2π8n · (8n) · e ) ·2 8πn (3.22) Simulation shows that this equation is a very good approximation. 38 CHAPTER 3. LINEAR ATTACKS When this line of reasoning is reversed, we see that the probability that the 25n-th √ word of the NLF-stream will be the n-th SCW, is also proportional to 1/ n. The probability for the following words to be at their most probable position in the key stream will be similar. It can be written as: P (position(vn ) = n n − b 25 c λ )= q , 2 8πn (3.23) 25 where λ is a constant. Simulation shows that this hypothesis is indeed correct. From the simulations it follows that λ ' 0.84. The probability that v17·232 −4 , v15·232 −4 , v4·232 −4 appear in the key stream at their most probable position can now be calculated with (3.23): • The probability that v4·232 −4 is at its most probable position (this position can be calculated with (3.19)) is q λ 8π·4·(232 −4) 25 = 2−17.3 . (3.24) • To calculate the probability that v15·232 −4 is at its most probable position, we can exploit the fact that we know that v4·232 −4 is at its most probable position. The probability for v15·232 −4 is then: q λ 8π·(15−4)·232 25 = 2−18.0 . (3.25) • In the same way we know for v17·232 −4 that v15·232 −4 is at its most probable position. The probability for v17·232 −4 is: q λ 8π·(17−15)·232 25 = 2−16.8 . (3.26) The total probability p0 can now be calculated: p0 = 0.217 · 0.198 · 2−17.3 · 2−18.0 · 2−16.8 = 2−56.6 (3.27) With probability p0 , v17·232 −4 , v15·232 −4 , v4·232 −4 , v13 and v11 are present in the key stream at their most probable position. They will appear in the key stream, XORed with 0, C or C 0 . This will however not affect the distribution: we are considering the XOR of bit 29 and bit 30, and for C both bits are 1. Thus, 3.3. LINEAR DISTINGUISHING ATTACK ON SOBER-T32 39 for C, C 0 and for 0, the XOR of bit 29 and 30 is always zero, so the constant values are always eliminated. This yields the following for the calculation of the Chernoff information between the distribution of the key stream PY and the uniform distribution PU : C(PY , PU ) ≈ p20 · C(PW , PU ) = 22·−56.6 · 22·−9.5 · 2−81.5 = 2−194.7 . (3.28) PW is the distribution of the NLF-stream. For an error probability of Pe = 2−32 this means we need 32 · 2194.7 = 2199.7 samples. The attack requires 2199.7 sequences of 17 · 232 words from the key stream, this is a total of L = 2200 ≥ 2199.7 + 17 · 232 words. 3.3.2 Extending the Attack on Sober-t16 to Sober-t32 In [74], Ekdahl and Johansson present a distinguishing attack on the full Sobert16 (with stuttering). As the paper mentions, the proposed methods are expected to be applicable against Sober-t32 as well, but no complexity expression is given due to computational limitations. In this section, we derive the expected complexity by making a number of (realistic) probabilistic assumptions. Distributions The distinguishing attack in [74] starts by approximating the equation of the NLF (A.11) of the cipher with a linear function and analyzing the distribution of the noise wt for different values of K. h i vt = st ⊕ st+1 ⊕ st+6 ⊕ st+13 ⊕ st+16 ⊕ wt = Ωt ⊕ wt (3.29) In the case of Sober-t16, the noise wt can take on 216 values and the probability of each of these values can easily be estimated by simulations (in [74], an accurate estimation is obtained by taking 238 samples). A similar simulation for Sober-t32, however, would require an impractical amount of memory and a huge processing time. This motivates us to derive a complexity expression based on the average non-uniformity of wt , instead of on its full distribution. First let us introduce some notation: 1 Uniform distribution: Pu (w) = p = = 2−32 (3.30) N ¡ ¢ Noise distribution: Pw (w) = p · 1 + ²w (w) (3.31) X 1 Non-uniformity: σ²2w = ²w (w)2 . (3.32) N w To estimate the non-uniformity of the noise distribution, we will now simulate ²w (w) for a limited number of values w and assume that these samples are representative for the full distribution. 40 CHAPTER 3. LINEAR ATTACKS Estimating parts of ²w . A first straightforward way to find an estimate of ²w (w) for a given value of w would consist in randomly choosing n sets (st , st+1 , st+6 , st+13 , st+16 ), computing wt for each of them and analyzing the frequency at which wt equals w. For a uniform (or almost uniform) distribution, the error of the estimation is expected to converge to zero with increasing n: r 1 1−p ≈√ . (3.33) error = p·n p·n As the magnitude of the ²w will appear to be in the order of 2−5 , we would not be able to detect them with this method until the number of samples n exceeds 242 by at least an order of magnitude. In order to speed up the convergence, we will follow a somewhat different approach. We first uniformly choose n sets (st , st+1 , st+6 , st+16 ) and compute ¡ ¢ at = fw (st ¢ st+16 ) ¢ st+1 ¢ st+6 ⊕ K (3.34) bt = st ⊕ st+1 ⊕ st+6 ⊕ st+16 . (3.35) Then for each set we count the number of st+13 for which wt = (at ¢ st+13 ) ⊕ (bt ⊕ st+13 ) = w . (3.36) This number can directly be derived from the bits of at and bt , i.e., it is not needed to run through all possible values of st+13 . The estimation for ²w (w) obtained this way is expected to converge more rapidly, as each step takes into account 232 possible values for (st , st+1 , st+6 , st+13 , st+16 ). Indeed, in case at and bt were uncorrelated and uniformly distributed, one would find that s¡ ¢ 3 32 1 4 error = ≈p . (3.37) p·n 213 · p · n The required number of steps n is thus reduced by a factor 213 . For both Sober-t16 and Sober-t32, we performed the simulations for different values of K and different sets of 256 consecutive w. From the estimated values of ²w (w), we calculated the non-uniformity σ²2w . Eventually, the minimal and average non-uniformity were found to be 2−10 and 2−8 for Sober-t16 and 2−9 and 2−7 for Sober-t32. Combining Distributions In the next step of the attack described in [74], different shifted versions of wt are combined in order to eliminate the unknown LFSR words st . Wt = wt+17 ⊕ wt+15 ⊕ wt+4 ⊕ α · wt . (3.38) 3.3. LINEAR DISTINGUISHING ATTACK ON SOBER-T32 41 Next, we need to calculate the non-uniformity of the full noise Wt . In the following two subsections, we derive a general expression for the non-uniformity of z = x ⊕ y given σ²2x and σ²2y . The sum of two independent distributions. Let x an y be drawn from two independent distributions with non-uniformity σ²2x and σ²2y . ¡ ¢ Px (x) = p · 1 + ²x (x) ¡ ¢ Py (y) = p · 1 + ²y (y) . (3.39) (3.40) The distribution of z = x ⊕ y can be written as: Pz (z) = X Px (x) · Py (y) (3.41) x⊕y=z ¡ ¢ = p · 1 + ²z (z) . (3.42) Exploiting the fact that the sum of all ² equals zero and that the distributions of x and y are independent, we obtain: ¡ ¢ ¢ 1 X ¡ E σ²2z = E ²z (z)2 N z " #2 1 X −1 X = E p · Px (x) · Py (y) − 1 N z x⊕y=z " #2 1 X 1 X 1 X 1 X = E ²x (x) + ²y (y) + ²x (x) · ²y (y) N z N x N y N x⊕y=z " #2 1 X 1 X = E ²x (x) · ²y (y) N z N x⊕y=z # " ¢ 1 X ¡ ¢ 1 X 1 X ¡ E ²x (x) · ²x (x ⊕ d) · E ²y (y) · ²y (y ⊕ d) . = N N x N y d 42 CHAPTER 3. LINEAR ATTACKS To calculate the expression between the square¡brackets we distinguish the cases ¢ d = 0 and d 6= 0. When d = 0, we have E ²x (x) · ²x (x ⊕ 0) = σ²2x . In all other cases we may assume that the expected value of ²x (x) · ²x (x ⊕ d), over all possible distributions with non-uniformity σ²2x , is independent of d and equal to −σ²2x /(N − 1) (because ²x sums to zero). The same applies for ²y and hence E ¡ σ²2z ¢ " # −σ²2y −σ²2x 1 2 2 σ²x · σ²y + (N − 1) · · = N N −1 N −1 = 1 · σ2 · σ2 . N − 1 ²x ²y (3.43) The sum of two identical distributions. A similar expression can be derived for the non-uniformity of z = x⊕x0 with x and x0 drawn from a single distribution. ¡ ¢ ¢ 1 X ¡ E σ²2z = E ²z (z)2 N z " #2 1 X −1 X = E p · Px (x) · Px (x0 ) − 1 N z 0 x⊕x =z " #2 1 X 2 X 1 X = E ²x (x) + ²x (x) · ²x (x0 ) N z N x N 0 x⊕x =z " #2 1 X 1 X E ²x (x) · ²x (x0 ) = N z N 0 x⊕x =z " # X ¡ ¢ 1 X 1 X 1 0 00 000 = E ²x (x) · ²x (x ) · ²x (x ) · ²x (x ) N z N N 00 000 x⊕x0 =z x ⊕x =z · ¸ N · σ²4x − τ²4x 3 · N · σ²4x − 6 · τ²4x 1 τ²4x = +3· + N N N N · (N − 3) £ ¤ 1 = 3 · (N − 2) · σ²4x − 2 · τ²4x , (3.44) N · (N − 3) with 1 X ²x (x)4 N x ¡ ¢ = O σ²4x . τ²4x = (3.45) (3.46) 3.3. LINEAR DISTINGUISHING ATTACK ON SOBER-T32 43 Distinguishing Distributions Chernoff information. The number of samples required to distinguish a distribution Px (x) from the uniform distribution is determined by the Chernoff information [74]. This quantity can easily be derived from the non-uniformity σ²2x : − log2 Xp 1 Xp 1 + ²x (x) N x ¸ · 1 1 1 X ≈ − log2 1 + · ²x (x) − · ²x (x)2 N x 2 8 µ ¶ 1 = − log2 1 − · σ²2x 8 1 ≈ · σ2 . 8 · ln 2 ²x Px (x) · p = − log2 x (3.47) (3.48) (3.49) (3.50) Complexity of the Attack on Sober-t32 Using the formulae in the previous sections we are now able to estimate the complexity of a distinguishing attack on Sober-t32. • The simulated approximation for the minimal non-uniformity of the noise wt is σ²2w ≈ 2−9 . (3.51) • Using (3.43) and (3.44) we find the expected non-uniformity of the full noise Wt : ¢4 3 ¡ σ²2W ≈ 3 · σ²2w (3.52) N ≈ 2−130 . (3.53) In order to derive this result, the last two sums of the expression Wt = (wt+17 ⊕ wt+15 ) ⊕ (wt+4 ⊕ α · wt ) are considered to be sums of independent distributions. • The stuttering adds an additional unknown constant Ct to the noise Wt . This constant can be written as Ct = ct+17 ⊕ ct+15 ⊕ ct+4 ⊕ α · ct with ct ∈ {0, C, C}. Assuming that all 12 possible values of Ct are equiprobable (which is the worst case), we find σ²2C ≈ N/12 and 1 · σ2 12 ²W ≈ 2−134 . σ²2W ⊕C ≈ (3.54) (3.55) 44 CHAPTER 3. LINEAR ATTACKS • As explained in [74], we will only get this non-uniform distribution if we made correct guesses for the positions of the wt in the key stream. Simulations show that this happens with a probability p0 = 2−5.5 . The final non-uniformity will therefore be reduced to: σ²2 ≈ p20 · σ²2W ⊕C ≈2 −145 . (3.56) (3.57) • In order to obtain a sufficiently small probability of error (say 2−16 ), even after applying the distinguisher for all 232 possible values of K, we need a stream of at least 48 times the inverse of the Chernoff information. This yields the final complexity: 48 · 3.4 8 · ln 2 ≈ 2153 . σ²2 (3.58) Conclusions In this chapter we have discussed linear attack methods on synchronous stream ciphers during key stream generation. Linear attacks are a very powerful method for both distinguishing and key recovery attacks, and an analysis of such attacks is of great importance when designing a stream cipher. First, we gave a general overview of linear attacks and we have stressed the importance of a simple design that can be easily analyzed with respect to these attacks. Then, we have shown how a simple but important class of stream ciphers, the nonlinear filters and combiners, can be analyzed with respect to these attacks. In a third part of this chapter, we have described linear distinguishing attacks on the NESSIE candidate Sober-t32 as an illustration. Chapter 4 Algebraic Attacks Algebraic attacks (AA) are a recent development in the cryptanalysis of symmetric encryption techniques and some other primitives. While their applicability to some other primitives is a topic of discussion for the moment, it is clear that algebraic attacks on stream ciphers with linear feedback are practical and require scrutiny by the cryptographic community. In this chapter we will study this class of attacks. In Sect. 4.1, a description of algebraic attacks on stream ciphers is presented, mainly based on the work of Courtois and Meier [55] and Courtois [51]. We will focus on the requirements that stream ciphers should satisfy in order to resist these attacks, and relate this to the mathematical properties of the building blocks of the stream cipher. This leads us to a description of algorithms that evaluate the resistance of Boolean functions against algebraic attacks in Sect. 4.2. In particular, we present a new algorithm that is faster than previously known algorithms for verifying the resistance against fast algebraic attacks (FAA). If practical verification is still computationally infeasible, theoretical deductions may help the cryptographer. An overview of these results is given in Sect. 4.3. Finally, we apply the methodology developed in this chapter on some actual stream cipher designs. This chapter is based on our papers [33, 36]. 4.1 Algebraic Attacks and Fast Algebraic Attacks In this section we describe the algebraic and fast algebraic attacks, as developed in [55, 51] by Courtois and Meier. We consider stream ciphers with a linear update function and a Boolean output function, such as the nonlinear filters and combiners. Using a simplified model as in the previous chapter, the equations 45 46 CHAPTER 4. ALGEBRAIC ATTACKS governing the stream cipher are ½ zt = f (St ) St+1 = UL (St ) , (4.1) where we use f instead of O to denote that the output function is a Boolean function, and the subscript to indicate that the update function UL is linear. Algebraic attacks also apply to some other classes of stream ciphers, for instance stream ciphers using vectorial output functions [52] and stream ciphers with a small nonlinear memory [8]. Algebraic attacks are very straightforward attacks. We write the system of nonlinear equations between the initial state S0 and the key stream (zt )t≥0 and try to solve it. The system of equations we are trying to solve is of the following format: f (S0 ) = z0 f (UL (S0 )) = z1 (4.2) f (UL2 (S0 )) = z2 ... Clearly, this original system of equations has s unknowns corresponding with the s bits of the initial state. Because UL is linear, also all iterations ULi are linear and the degree of the system of equations is therefore equal to df , the degree of f . For most decent stream cipher designs, solving this system is infeasible. However, the recent discovery of algebraic attacks presented ways to transform this system of equations into another system of equations that is much easier to solve. 4.1.1 Algebraic Attacks In a standard algebraic attack we try to reduce the degree of the above system of equations before we solve it. This lowers the attack complexity considerably. First, we need to perform a relation search step. Here we search for nonzero functions g and/or h of lower degree dg and/or dh both smaller than df , such that the following holds for all x ∈ Fϕ 2: ½ f (x) · g(x) = 0 (f (x) ⊕ 1) · h(x) = 0 . (4.3) It is easy to see that we can then replace each equation in (4.2) by a lower degree equation as follows: g(ULi (S0 )) = 0, ∀ i where zi = 1 h(ULi (S0 )) = 0, ∀ i where zi = 0 . (4.4) 4.1. ALGEBRAIC ATTACKS AND FAST ALGEBRAIC ATTACKS 47 This new system of equations can be solved much faster than (4.2) because of its lower degree. This is done in the second step, the solving step. As explained in Sect. 2.4.3, several techniques exist to solve this system of equations. To facilitate the complexity analysis, we use linearization (replacing each monomial by a new variable) followed by Gaussian elimination of this overdetermined system of equations. Consequently, the time complexity of solving the system of equations (4.4) is of order ω dg µ ¶ X s , (Msdg )ω = (4.5) i i=0 with ω = log2 (7) ≈ 2.807 (Strassen’s exponent) corresponding to the fastest known method of Gaussian elimination [158], and where we use the M notation as in (2.29). The data complexity is about Msdg but can be lowered if multiple functions g and/or h exist. From (4.5), it follows that the complexity of the algebraic attack is polynomial in the state size and exponential in the degree of g or h. Hence the efficiency of an algebraic attack depends heavily on the lowest degree of any multiple. This corresponds to the algebraic immunity of the Boolean function, as defined above. So in essence to determine the resistance of a design against the algebraic attack, all that needs to be done is to make sure that the AI of its output function is sufficiently high. We explain how this can be computed in Sect. 4.2. As an example, Table 4.1 shows the numerical values for the algebraic attack on a stream cipher with internal state of 256 bits and AI between 4 and 8. All data in the table are base 2 logarithms. From the table and taking the eSTREAM Profile 2 requirement of 80-bit security, it follows that an AI of 5 appears to be currently sufficient to withstand the standard algebraic attack in our framework. An interesting question is how such a design can be made resistant to the standard algebraic attack. The implications for Boolean functions are the following. It has been shown [55] that the AI of a Boolean function with ϕ inputs can be at most d ϕ2 e. A Boolean function hence needs to have at least 9 inputs. Of course, we need to check that the AI of the Boolean function is large enough, as explained in Sect. 4.2. 4.1.2 Fast Algebraic Attacks Fast algebraic attacks can be much more efficient than the standard algebraic attacks. In the fast algebraic attacks introduced by Courtois [51], the attacker tries to decrease the degree df of the system of equations (4.2) even further by searching for relations between the initial state and several bits of the output function simultaneously, i.e., equations of the form F (S0 , zt , zt+1 , . . . , zt+T −1 ) = 0, ∀t , (4.6) 48 CHAPTER 4. ALGEBRAIC ATTACKS Table 4.1: Logarithm of complexities of the algebraic attack AI 4 5 6 7 8 Data complexity 27.40 33.07 38.46 43.62 48.59 Time complexity 76.92 92.84 107.97 112.46 136.41 with F (S0 , z) = g(S0 , z) ⊕ h(S0 ) and dg < dh . See [96] for a general description. Let us now restrict to the case T = 1 in (4.6) and thus obtain the equations zi · g(ULi (S0 )) ⊕ h(ULi (S0 )) = 0 , i ≥ 0 , (4.7) where (g, h) represents the tuple of functions with low degrees (dg , dh ) and dg < dh for which f · g = h. So far, fast algebraic attacks mounted on actual stream ciphers with linear update mechanism and a Boolean output function all use equations of type (4.7). These are the fast algebraic attacks we are going to consider here. It is an open problem whether better attacks of type (4.6) can be found for specific stream ciphers. The considered attack can be described by the following simple framework, as briefly described by Courtois in [51] and more explicitly in [53]. 1. Relation search step: we need to find a (dg , dh )−R for our Boolean function. As defined above, f satisfies (dg , dh )−R if an equation exists of the following form, where g and h are nonzero and dg < dh : f (x) · g(x) = h(x) Multiplying (4.2) by g, our system h(S0 ) h(UL (S0 )) h(UL2 (S0 )) ... ∀x ∈ Fϕ 2 . (4.8) of equations now has degree dh : = z0 · g(S0 ) = z1 · g(L(S0 )) = z2 · g(L2 (S0 )) (4.9) 2. Precomputation step: we now reduce the degree of (4.9) further from dh to dg in a precomputation step. Because the linear combination of linear recurring sequences is again a linear recurring sequence, there always exists 4.1. ALGEBRAIC ATTACKS AND FAST ALGEBRAIC ATTACKS 49 a single linear combination of every Msdh + 1 consecutive bits such that for every t ≥ 0: Msd Mh αi · h(ULi+t (S0 )) = 0, (4.10) i=0 where the coefficients αi determine the linear recurrence. Therefore our system of equations (4.9) becomes of degree dg and has the following form: L s Md i h = 0 i=0 αi · zi · g(UL (S0 )) LMsdh i+1 i=0 αi · zi+1 · g(UL (S0 )) = 0 (4.11) LMsdh i+2 α · z · g(U (S )) = 0 i i+2 0 i=0 L ... The complexity of this precomputation step has been shown to be Msdh · (log2 Msdh )2 (see Hawkes and Rose, [96]). 3. Substitution step: when we obtain an actual key stream and use it in a fast algebraic attack, we need to substitute this key stream into (4.11). The best algorithm to do this so far has been described by Hawkes and Rose [96] and has complexity 2 · Msdg · Msdh · log2 Msdh . 4. Solving step: we solve the system (4.11) by linearization. The time complexity to do so is O((Msdg )ω ). The data needed is Msdg + Msdh ≈ Msdh . This data complexity can be reduced by a constant factor if we find several linear combinations, but then the substitution complexity will increase by the same factor. Note that, in order to evaluate the resistance of a stream cipher with linear feedback and Boolean output function against this fast algebraic attack, all we need to do is to perform step 1 (the relation search) for its Boolean function. Namely, we need to find out which (dg , dh ) − R the Boolean function satisfies. In Sect. 4.2, we will show a new algorithm which performs this step much faster than previous ones, and which hence makes it possible to evaluate the resistance of stream cipher with linear feedback using relatively large Boolean functions against such a fast algebraic attack (see Sect. 4.4 for some examples). Consider again the previous example, namely a stream cipher with 256-bit linear internal state and desired security level of 80 bits. Table 4.2 shows the numerical values for fast algebraic attacks if the Boolean function f satisfies (dg , dh ) − R, for some relevant values of dh and dg given in columns 1 and 2. Columns 3 to 7 of Table 4.2 present the precomputation, substitution, solving and data complexities of the fast algebraic attack and determine whether this 50 CHAPTER 4. ALGEBRAIC ATTACKS Table 4.2: Logarithm of complexities of the fast algebraic attack dh 6 7 8 9 10 11 12 dg 4 5 3 4 5 2 3 4 1 2 3 1 2 3 1 2 1 Prec. 49.0 49.0 54.5 54.5 54.5 59.8 59.8 59.8 64.9 64.9 64.9 69.7 69.7 69.7 74.4 74.4 79.0 Subs. 72.1 77.8 71.5 77.5 83.1 70.2 76.6 82.6 68.1 75.1 81.5 72.9 79.9 86.3 77.5 84.4 81.9 Solve 76.9 92.8 60.1 76.9 92.8 42.1 60.1 76.9 22.5 42.1 60.1 22.5 42.1 60.1 22.5 42.1 22.5 Data 38.5 38.5 43.6 43.6 43.6 48.6 48.6 48.6 53.4 53.4 53.4 58.0 58.0 58.0 62.5 62.5 66.9 Att.? yes no yes yes no yes yes no yes yes no yes yes no yes no no Res. Subs. 72.1 77.8 75.1 81.1 86.7 78.8 85.2 91.2 81.5 88.5 94.9 90.9 97.9 104.3 100.0 106.9 108.8 Eq. 0 0 3.6 3.6 3.6 8.6 8.6 8.6 13.4 13.4 13.4 18.0 18.0 18.0 22.5 22.5 26.9 Att? yes no yes no no yes no no no no no no no no no no no attack is valid (i.e., takes less than 280 operations). Columns 8 to 10 of Table 4.2 show the complexities if the number of key stream bits from a single key stream is limited to 240 . In this case, the substitution complexity may be higher (column 8) and an additional requirement is that enough low degree equations should be available (column 9). The column Att.? says whether an attack faster than exhaustive key search is possible. It follows from the table that there are two ways to ensure that such a stream cipher resists this fast algebraic attack: • Verify (practically or theoretically) that its Boolean function does not satisfy any of the (dg , dh ) − R for which the attack complexity is less than 280 in the table. • Verify (practically or theoretically) that its Boolean function has an algebraic immunity of at least 12. In that case fast algebraic attacks will also become infeasible because the substitution complexity is now the dominating step. If we accept to restrict the length of any frame to 240 bits, an algebraic immunity of 9 is sufficient to resist this fast algebraic attack. It is important to stress that these suggestions are not at all conservative 4.2. ALGORITHMS FOR COMPUTING AI AND (DG , DH ) − R 51 as they do not account for a security margin against fast algebraic attacks. In practice, it is advised to make sure a design has a considerable security margin against these attacks. Algebraic and fast algebraic attacks have been developed very recently, and it is likely that better methods can be found. See Sect. 4.5 for a discussion. 4.2 Algorithms for Computing AI and (dg , dh )−R As we explained in Sect. 4.1, in order to evaluate the resistance of a generator against the considered classes of (fast) algebraic attacks, all that needs to be done is to perform the relation search step for its Boolean function. We now describe the algorithms for this. In Sect. 4.2.1 we describe the algorithms from [123] to find relations for mounting algebraic attacks, but explain them in a more elegant way. This will help to understand the new algorithm we will develop in Sect. 4.2.2, and why this algorithm is much more efficient than previous ones. 4.2.1 Computing the AI of a Boolean Function The aim is to find or to disprove the existence of a function g(x) of degree dg such that f (x) · g(x) = 0 for all inputs x. The unknowns we need to find are the coefficients cg (a) with wt(a) ≤ dg in the ANF of g: g(x) = M a ϕ−1 cg (a) xa0 0 . . . xϕ−1 . (4.12) a∈Fϕ 2 wt(a)≤dg Hence the number of unknown coefficients is Mϕ dg . The equations we can use to find these unknowns are found as follows: the requirement f (x) · g(x) = 0 can be easily seen to correspond to the requirement that g(x) = 0 whenever f (x) = 1. When we consider the case of balanced functions, we have exactly 2ϕ−1 such equations. As we are looking for dg < d ϕ2 e, the number of equations 2ϕ−1 is always larger than the number of unknowns Mϕ dg . Solving the above system of equations corresponds to algorithm 1 of Meier 2 et al. [123]. The complexity of solving this system naively is O(2ϕ−1 · (Mϕ dg ) ). ϕ However, we can also just solve a subsystem of Mdg equations and check whether 3 its solution fits the other equations. Then we get a complexity O((Mϕ dg ) ). Algorithm 2 of [123] improves this complexity. In this algorithm, a specific choice is made for the subsystem of Mϕ dg equations that we will solve. Denote the system we are solving by M · cg = 0 , (4.13) 52 CHAPTER 4. ALGEBRAIC ATTACKS where cg is the Mϕ dg × 1 column matrix containing the ANF coefficients cg (a) ϕ ordered lexicographically on a, and M is an Mϕ dg × Mdg matrix of which we choose the rows as follows: the first row corresponds with the equation g(x) = 0 with wt(x) = 0 (if it exists, i.e., if f (0) = 1). The following rows correspond to the equations (if they exist) where the Hamming weight of the input vector is 1, 2, . . . , until Hamming weight dg , all ordered lexicographically. By now we have about 1/2 · Mϕ dg rows of our matrix M . We fill up the remaining rows of M by choosing other values x at random. Now we look at the upper part of this matrix, corresponding to the rows up to Hamming weight dg . It is easy to see that this part of the matrix is in echelon ϕ form and its size is approximately 1/2 · Mϕ dg × Mdg . From the structure of the problem presented, it is easy to see that this echelon form has O(1/2 · Mϕ dg ) pivot ¡ϕ¢ ϕ columns, 1/2 · dg zero columns (which we can ignore) and 1/2 · Mdg −1 nonzero nonpivot columns. The solution of this underdetermined system of equations ϕ ϕ can be found efficiently, with a complexity of O(1/4 · Mϕ dg · (Mdg + Mdg−1 )) operations. We now substitute these solutions into the remaining 1/2 · Mϕ dg equations. Only the nonzero pivot columns need to be substituted into the system, resulting ϕ 2 in a substitution complexity O(1/8 · (Mϕ dg ) · Mdg−1 ), and we obtain a system ϕ ϕ of 1/2 · Mdg equations in 1/2 · Mdg unknowns, which can be solved by Gaussian 3 elimination in O(1/8 · (Mϕ dg ) ) operations. This approach corresponds to algorithm 2 in [123]. The last step will be the dominating step in terms of time complexity. To make the algorithm memory efficient we will process the matrix row by row and only store the nonzero nonpivot ϕ elements for later use, resulting in a memory complexity O(1/4 · Mϕ dg Mdg −1 ). 4.2.2 Computing the (dg , dh ) − R for a Boolean Function The gain of algorithm 2 compared to algorithm 1 for computing the AI is only modest because the final system we need to solve still has half the number of unknowns as the original system. We will now show that for the search of (dg , dh ) − R, we can do much better. Our aim is to find out whether the Boolean function satisfies (dg , dh ) − R for certain values dg < dh at which the complexity of a fast algebraic attack would be less than exhaustive search. The unknowns we need to find are the coefficients cg (a) with wt(a) ≤ dg and ch (a) with wt(a) ≤ dh in the ANFs of g and h: g(x) = h(x) = L a∈Fϕ 2 wt(a)≤d g L a ϕ−1 cg (a) xa0 0 . . . xϕ−1 ch (a) xa0 0 a∈Fϕ 2 wt(a)≤dh a ϕ−1 . . . xϕ−1 . (4.14) 4.2. ALGORITHMS FOR COMPUTING AI AND (DG , DH ) − R 53 ϕ Hence the number of unknown coefficients is Mϕ dh + Mdg . Previously, it was thought [53] that the complexity of solving this system 3 of equations is O((Mϕ dh ) ), making it hard to evaluate the resistance against fast algebraic attacks even for functions with modest input dimension. We will now show a new algorithm that can solve the problem in time complexϕ 2 ϕ 2 ity O(Mϕ dh · (Mdg ) + (Mdh ) ). This makes the evaluation of the resistance against fast algebraic attacks achievable for larger Boolean functions, as shown by the examples in Sect. 4.4. The equations used to find the unknown coefficients of the ANFs of g and h are as follows: the requirement f (x) · g(x) = h(x) for all x ∈ Fϕ 2 gives us a condition on the ANF coefficients of g and h for every value of x. This results in an equation that can be of one of the following two types: g(x) = h(x) h(x) = 0 whenever whenever f (x) = 1 f (x) = 0 . (4.15) The number of equations obtained, 2ϕ , will always be larger than the number of ϕ unknowns Mϕ dh + Mdg in situations of practical interest. The naive solution is again to solve this system of equations, which has comϕ 2 plexity O(2ϕ · (Mϕ dh + Mdg ) ). However, we can also just solve a subsystem of ϕ ϕ Mdh + Mdg equations and see if its solution fits the other equations. Then we ϕ 3 get a complexity O((Mϕ dh + Mdg ) ). It is possible to significantly reduce the complexity further by a good choice of the subsystem to be solved and by making use of its structure. Denote again the system we are solving by M · cg,h = 0 , (4.16) ϕ where cg,h is the (Mϕ dh +Mdg )×1 column matrix containing the ANF coefficients ch (a) ordered lexicographically, where we intercalate the coefficients cg (a) right before the corresponding ch (a) as long as such coefficients exist – recall that ϕ ϕ ϕ dg < dh . The matrix M is a (Mϕ dh + Mdg ) × (Mdh + Mdg ) matrix of which we choose the rows as follows: the first row corresponds to the equation (4.15) with wt(x) = 0. The following ϕ rows correspond to¡the ¢ equations (4.15) with wt(x) = 1 ordered lexicographically, followed by the ϕ2 rows corresponding ¡ to¢ the equations (4.15) with wt(x) = 2. We continue this procedure until the dϕh rows corresponding to the equations (4.15) with wt(x) = dh . The remaining Mϕ dg rows of M are then filled up by choosing other values x at random for obtaining additional equations (4.15). Again, we can reduce the complexity significantly by looking at the structure of this M . Consider the Mϕ dh first rows of this matrix. It is easy to see that this ϕ ϕ part of the matrix is in echelon form and its size is exactly Mϕ dh × (Mdh + Mdg ). The solution of this underdetermined system of equations can be found efficiently, 54 CHAPTER 4. ALGEBRAIC ATTACKS ϕ ϕ ϕ 2 ϕ 2 requiring O((Mϕ dh ) + Mdh Mdg ) ≈ O((Mdh ) ) operations. The Mdh pivot ϕ columns correspond to the coefficients ch (a) with wt(a) ≤ dh , the Mdg nonpivot columns correspond to the coefficients cg (a) with wt(a) ≤ dg . We now substitute this into the remaining equations. We need to substitute ϕ ϕ Mϕ dh coefficients by Mdg coefficients and do this in Mdg rows, resulting in a ϕ ϕ 2 substitution complexity O(Mdh · (Mdg ) ). We can then solve this system to 3 obtain the coefficients cg (a) through Gaussian elimination, requiring O((Mϕ dg ) ) operations. Finally, we introduce the solution for the coefficients cg (a) with wt(a) ≤ dg into the upper part of the matrix and can efficiently find the solution for the coefficients ch (a) with wt(a) ≤ dh . The gain in complexity is much higher here than in the case of the previous section. Adding up the complexities of the steps and discarding terms that will not be relevant for the analysis, the time complexity will be dominated by either finding the solution of the echelon matrix or by the substitution complexity. ϕ 2 ϕ 2 Our algorithm hence has a time complexity O(Mϕ dh · (Mdg ) + (Mdh ) ). To make the algorithm memory efficient, we will process the matrix row by row and only store the nonpivot elements for later use, resulting in a memory complexity ϕ O(Mϕ dh Mdg ). 4.3 Theoretical Bounds In the previous section, we have presented algorithms to compute the actual resistance of a Boolean function against algebraic and fast algebraic attacks. For larger Boolean functions, it may not be possible to calculate these. Therefore it is interesting to perform a theoretical study of the AI and the existence of (dg , dh ) − R, and relate these to other cryptographic properties of the Boolean function. 4.3.1 Theoretical Bounds on Algebraic Immunity In this section, we will show relations between the AI of a Boolean function and its number of inputs ϕ, its degree df and its nonlinearity Nf . Relation between affine invariance and AI Dalai et al. [60] have shown that the AI of affine equivalent functions can differ with one. Only for functions that are affine equivalent in the input variables, the AI is invariant. 4.3. THEORETICAL BOUNDS 55 Relation between ϕ and AI It has been shown by Courtois [51] that AI(f ) ≤ d ϕ2 e. Relation between df and AI It is straightforward from the definition of AI that AI ≤ df . Relation between Nf and AI Dalai et al. [60] have shown a bound which was later improved by Lobanov [116]: AI(f )−2 µ Nf ≥ 2 · X i=0 ¶ n−1 . i (4.17) This formula shows that high nonlinearity and high algebraic immunity can go hand in hand. 4.3.2 Theoretical Bounds on the Existence of (dg , dh ) − R In this section we will relate the existence of (dg , dh )−R to the number of inputs, the degree and the AI of the Boolean function, and try to find out if there is any logic in the structure of existing (dg , dh ) − R for a particular function. Relation between ϕ and (dg , dh ) − R We will first look into the relation between the existence of (dg , dh ) − R and the number ϕ of inputs of the Boolean function. To do so, we first investigate the affine invariance of (dg , dh ) − R [32]. We now show that (dg , dh ) − R satisfies a stronger invariance than AI, which implicitly shows that fast algebraic attacks are more general than algebraic attacks: Theorem 4.3.1 The (dg , dh ) − R (with dg < dh ) of Boolean functions is an affine invariant property. Proof. Let f (x) satisfy a (dg , dh ) − R with dg < dh : f (x) · g(x) = h(x). (4.18) Then for any function f 0 that is affine equivalent to f expressed by Equation (2.11), the following relation holds: f 0 (x) · g(xA ⊕ b) = (f (xA ⊕ b) ⊕ x · c ⊕ d) · g(xA ⊕ b) = [f (xA ⊕ b) · g(xA ⊕ b)] ⊕ [(x · c ⊕ d)g(xA ⊕ b)] (4.19) 56 CHAPTER 4. ALGEBRAIC ATTACKS On the right hand side, the first term is equal to h(xA ⊕ b) because of (4.18), and the second term has degree ≤ dh because dg < dh . So (4.19) shows that f 0 satisfies (dg , dh ) − R. ¤ A bound on the relation between (dg , dh ) − R and ϕ has already been given in [51, Theorem 7.2.1] by Courtois: Theorem 4.3.2 A Boolean function f always satisfies (dg , dh ) − R for any dg and dh such that dg + dh ≥ ϕ Due to a condition on the Walsh spectrum of the function, we can strengthen this bound. Theorem 4.3.3 Any Boolean function on Fϕ 2 , for which there exists at least one vector v ∈ Fϕ 2 with d µ ¶ X ϕ ϕ |Wf (v)| > 2 − 2 i i=0 § ¨ satisfies an (d, d + 1) − R with d < ϕ2 − 1. Proof. For a Boolean function f on Fϕ 2 , it is well known [60] that if d µ ¶ d µ ¶ X X ϕ ϕ ≤ wt(f ) ≤ 2ϕ − i i i=0 i=0 (4.20) then the AI(f ) ≤ d. Note that Wf (x)⊕s·x⊕s0 (0) = (−1)s0 Wf (s) for all s0 ∈ F2 , s ∈ ϕ c0 Fϕ 2 and Wf (0) = 2 − 2wt(f ). Suppose (−1) Wf (c) satisfies the requirement ϕ given in the theorem for c0 ∈ F2 , c ∈ F2 , then f (x) ⊕ c · x ⊕ c0 has AI less than or equal to d due to (4.20). This means that there exists a function g on Fϕ 2 with degree less than or equal to d for which (f (x) ⊕ c · x ⊕ c0 )g(x) = 0, or also f (x)g(x) = (c · x ⊕ c0 )g(x), for all x ∈ Fϕ 2 . Consequently, f satisfies an (d, d + 1) − R. ¤ ϕ Theorem 4.3.3 §ϕ¨ § ϕ ¨ implies that any function f on F2 , with ϕ odd, satisfies at least ( 2 − 1, 2 ) − R, which corresponds with Theorem 4.3.2. Relation between df and (dg , dh ) − R Second, we investigate the bounds imposed by the degree of the Boolean function on the existence of (dg , dh ) − R. The following straightforward relation exists between (dg , dh ) − R and the degree (also mentioned in [51, Theorem 7.1.1]) . Theorem 4.3.4 If f has degree df , then f satisfies (i, df + i) − R for every i < df . 4.4. APPLICATION TO CONCRETE DESIGNS 57 Relation between AI and (dg , dh ) − R We now show that if f satisfies (dg , dh ) − R, the parameter dh is greater than or equal to AI(f ), due to the following relation with the annihilators of f and f ⊕ 1. Lemma 4.3.5 Let f, g, h be three Boolean functions on Fϕ 2 . The equation f · g = h is satisfied if and only if both f ·(g⊕h) = 0 and (f ⊕1)·h = 0 hold simultaneously. In other words, if and only if g ⊕ h is an annihilator of f and h is an annihilator of f ⊕ 1. Proof. Let f · g = h. Multiplying both sides of the equation with f leads to f 2 · g = f · h. Consequently, f · g = f · h and f · h = h. The other side of the implication is trivial. ¤ As a consequence, for every dh greater than or equal to AI(f ), the lowest possible dg such that f satisfies (dg , dh ) − R could be found by searching the lowest degree function which is a linear combination of functions belonging to the set of annihilators with degree dh of f and f ⊕ 1. We have implemented an algorithm that uses this methodology. This algorithm is less efficient than the one described in Sect. 4.2.2, but the implementation effort of this new algorithm is more complex and is work in progress. 4.4 Application to Concrete Designs We now show how to apply our methodology to concrete designs. We investigate the strength of two stream ciphers, SFINKS [35] and WG [91], proposed for eSTREAM, the ECRYPT stream cipher project [73]. Both stream ciphers are simple nonlinear filter generators using different internal size and filter function. We also give some general comments on the practicality of our approach. 4.4.1 SFINKS SFINKS is our submission to eSTREAM. A brief description can be found in Appendix A.3. For a more detailed description, we refer to [35]. Resistance to algebraic attacks Theoretical bounds. The Boolean function of SFINKS has 17 inputs and degree 15. From this it follows that an upper bound is AI(f ) ≤ 9. 58 CHAPTER 4. ALGEBRAIC ATTACKS Searching for relations. The algebraic immunity of the 17-bit Boolean function has been computed using an implementation of the algorithm of Sect. 4.2.1, and it was found that the AI is 6. There are four such degree-6 annihilators. As the internal state of SFINKS has 256 bits, Table 4.1 applies and an algebraic attack on SFINKS has complexity of O(2108 ) operations. Resistance to fast algebraic attacks To analyze the resistance against fast algebraic attacks, we need to check whether the function satisfies (dg , dh ) − R in all cases where the corresponding attack complexity is smaller than 280 . Table 4.3 presents the total complexity (i.e., the precomputation, substitution and solving complexities together) of the fast algebraic attack for an arbitrary filter generator with internal size 256 and filter function which satisfies (dg , dh ) − R, only when dh ≥ 6 because of Lemma 4.3.5. Table 4.3: log2 of complexities of FAA for a 256-bit state if f satisfies (dg , dh )−R dg \dh 1 2 3 4 5 6 52.73 59.73 66.14 76.94 92.83 7 58.10 65.10 71.48 77.47 92.83 8 63.20 70.20 76.61 82.60 92.83 9 68.12 75.12 81.53 87.52 93.18 10 72.88 79.88 86.28 92.27 97.94 11 77.47 84.47 90.88 96.87 102.53 12 81.93 88.93 95.34 101.33 106.99 Theoretical bounds. From Theorem 4.3.2, it follows that f satisfies (1, 16) − R, (2, 15)−R, (3, 14)−R, . . . but none of these are a threat. From Theorem 4.3.4, it follows that f satisfies (1, 16) − R, (2, 17) − R,. . . which is not a problem as well. Searching for relations. It is possible that we can improve on the theoretical bounds by actually searching for relations. With our algorithm of Sect. 4.2.2, this can be done easily. Table 4.4 shows the complexity of our relation search step. We only need to verify the existence of these relations for which the attack complexity is less than 280 in Table 4.3. In all other situations a zero for the complexity is put in Table 4.4. Courtois [53] already checked for the existence of such relations, using the old algorithm but which found the relations in a few days. We repeated his experiments with the old algorithm and confirm his simulations. 4.4. APPLICATION TO CONCRETE DESIGNS 59 Table 4.4: log2 of complexities for the Relation Search Step for SFINKS with the new algorithm dg \dh 1 2 3 4 6 28.84 29.88 33.86 37.71 7 30.67 31.31 34.82 38.63 8 32.01 32.44 35.54 0 9 32.92 33.25 0 0 10 33.48 33.75 0 0 11 33.79 0 0 0 12 0 0 0 0 In particular, Courtois found that f satisfies (4, 6)−R, (3, 7)−R and (2, 8)−R. If one restricts the length of a single key stream to 240 bits, as required in [35], some of these attacks can not be mounted. However, the (4, 6) − R and (3, 7) − R attacks can be mounted and have data, time and memory complexities below 280 . Remark. Due to the particular form of the filter function f in SFINKS, i.e., f (x, x16 ) : F17 2 → F2 : x 7→ f1 (x) ⊕ x16 , where f1 is a Boolean function on 16 F16 2 affine equivalent to the trace function of the inverse function on F2 , we can restrict us to the computation of the (dg , dh )−R with dg < dh of f1 . This follows from the observation that f1 · g = h f1 · g = h ⊕ x16 · g ⇒ (f1 ⊕ x16 ) · g = h ⊕ x16 · g ⇐ (f1 ⊕ x16 ) · g = h . In general, we have the following theorem. Theorem 4.4.1 Let fi be a Boolean function on Fi2 that satisfies (ei , di ) − R for i = 1, 2. Then f1 ⊕ f2 on Fn2 1 +n2 satisfies also (e1 , max{deg(f2 ) + e1 , d1 }) − R and (e2 , max{deg(f1 ) + e2 , d2 }) − R. 4.4.2 WG The stream cipher WG [91] is also a simple filter generator with 80-bit secret key, 319-bit internal state and a filter function with 29 inputs and degree 11. Resistance to algebraic attacks From the theory it immediately follows that the AI of the Boolean function of WG is at most 11. The authors did not compute the AI of WG, but they argue that its AI is most likely 11 in practice and that WG hence has a good resistance to standard algebraic attacks. 60 CHAPTER 4. ALGEBRAIC ATTACKS Resistance to fast algebraic attacks For the fast algebraic attack, we should again check whether f satisfies (dg , dh ) − R for all cases where the attack complexity is less than 280 . Table 4.5 presents these complexities. Table 4.5: log2 of complexities of FAA for a 319-bit state if f satisfies (dg , dh )−R dg \dh 1 2 3 4 3 36.2 43.9 62.8 0 4 42.9 50.2 62.8 80.5 5 49.1 56.4 63.1 80.5 6 55.0 62.4 69.0 80.5 7 60.7 68.0 74.8 81.1 8 66.2 73.5 80.2 86.5 9 71.4 78.7 85.4 91.8 10 76.5 83.8 90.5 96.8 11 81.4 88.7 95.4 101.8 Theoretical bounds. From Theorem 4.3.2, it follows that f satisfies (1, 28) − R, (2, 27)−R, (3, 26)−R, . . . but none of these are a threat. From Theorem 4.3.4, it follows that f satisfies (1, 12) − R, (2, 13) − R,. . . Let us investigate this (1, 12) − R into more detail. We know that there exist at most 230 equations of this type. These equations are simply found by multiplying the function with an arbitrary affine function which increases the degree with at most one. The required key stream length is equal to 270.73 and the attack complexities are as follows. • Precomputation complexity: 281.958 • Substitution complexity: 286.19 • Solving complexity: 223.36 In the proposal of the stream cipher WG, the key stream has been restricted to sequences of length 245 . A fast algebraic attack may still exist if at least 225.73 out of the 230 equations are linearly independent. In this situation, the substitution complexity will increase with a factor of 225.73 , leading to a complexity of 2111.4 . Searching for relations. It may also be that better fast algebraic attacks can be performed on WG. In order to check this, we need to run our algorithm and search for relations. Table 4.6 presents the complexity of our algorithm for determining if f is (dg , dh )−R. We expect that most of these values are attainable when implementing the algorithm on a modern PC. In order to show the efficiency of the algorithm, we compare Table 4.6 with Table 4.7 containing the complexity 4.4. APPLICATION TO CONCRETE DESIGNS 61 Table 4.6: log2 of complexities for the Relation Search Step for WG with new algorithm dg \dh 1 2 3 3 24.3 29.6 36.0 4 29.6 32.5 38.8 5 34.3 35.5 41.2 6 38.5 38.9 43.3 7 42.1 42.2 45.2 8 45.2 45.3 0 9 47.9 48.0 0 10 50.2 0 0 11 0 0 0 using the old approach. Consequently, the new algorithm still makes it possible to check the parameters (dg , dh ) which may allow fast algebraic attacks, while the old algorithm would never get results in reasonable time complexity. Table 4.7: Complexity of Relation Search Step for WG with the old algorithm dg \dh 1 2 3 4.4.3 3 36.0 36.0 36.0 4 44.3 44.3 44.3 5 51.5 51.5 51.5 6 57.7 57.7 57.7 7 63.2 63.2 63.2 8 67.9 67.9 0 9 71.9 71.9 0 10 75.4 0 0 11 0 0 0 Discussion of the Current Analysis of the Resistance Our relation search algorithm seems to make it possible to analyze the resistance of stream ciphers with linear feedback and with Boolean functions having a relatively high number of inputs (e.g. until 24 input bits for a filter function on an LFSR with length 256 with computation power of 245 ). If we are trying to assess whether a design resists these fast algebraic attacks, we should only run the algorithm for the right corners of the boldfaced area in Tables 4.3 and 4.5. This is because evidently if f does not satisfy (dg , dh ) − R, it also doesn’t satisfy (dg − i, dh − j) − R for positive i, j. For larger functions, our algorithm will not be able to check the resistance, and we will hence only have some theoretical results to study the resistance. These cannot for the moment guarantee resistance against fast algebraic attacks. Further study may reveal other interesting relations, either in general or for particular classes of Boolean functions. 62 CHAPTER 4. ALGEBRAIC ATTACKS 4.5 Conclusions and Open Problems In this chapter, we have presented algebraic and fast algebraic attacks against stream ciphers with a linear update function. We have restricted ourselves to the case where the output function is a Boolean function (and not an S-Box). After describing the state of the art in algebraic attacks, we have explained how the resistance of stream cipher designs against these attacks can be evaluated. For this, we have presented search algorithms, namely a more simple description of the algorithm to compute the AI and a new and more efficient algorithm to calculate the (dg , dh )−relations. We also presented some first bounds that relate the resistance against (fast) algebraic attacks to some properties of the Boolean function. Finally, we have illustrated the application of our approach on two eSTREAM candidates, SFINKS and WG. The results in this chapter are by no means the final words about assessing whether a stream cipher with linear update mechanism resists (fast) algebraic attacks. These attacks are very novel and many problems are still open in this respect: • In evaluating the complexity of the attacks, we have assumed that we use linearization to solve the system of equations. Further investigation is necessary as to whether special techniques, such as those described in Sect. 2.4.3, can be used to mount these attacks more efficiently. • Our analysis only looks at a certain class of fast algebraic attacks, namely those that follow from the existence of (dg , dh ) − R for the Boolean output function. It is unknown whether we can find better fast algebraic attacks of the general type described by (4.6). Finding such relations will require analysis of both the update and output function, and appears to be a complex task. • A thorough study of the resistance of some classes of Boolean functions against the fast algebraic attacks described here can be very interesting. There is a need for Boolean functions that combine good resistance to (fast) algebraic attacks with other desirable cryptographic properties and good implementation properties. The advice to cryptographers is that they should at least make sure that their design resists the analysis made in this chapter, and should include a sufficiently high security margin accounting for the many possible future improvements of the attack techniques. Chapter 5 Structural Attacks The cryptanalytic tools described in the previous chapters were based on rigorous mathematical analysis of the building blocks of the synchronous stream cipher. In this chapter, we will present some attacks which take another approach, and are mainly based on the size of the various building blocks of the stream cipher and on how they are interconnected. This chapter consists of two parts. In Sect. 5.1 trade-off attacks, inversion attacks and Guess-and-Determine attacks are discussed. Section 5.2 presents an illustration of an attack, namely a Guess-and-Determine attack on Sober-t32. Our treatment of trade-off attacks is an improved version of our comments in [62]; the attack on Sober-t32 has been previously presented by us in [11]. 5.1 5.1.1 Overview of Some Attacks Trade-off Attacks We start by treating trade-off attacks on stream ciphers. We limit our treatment to the impact of the trade-off attacks on stream ciphers. A detailed description of the algorithms to mount these attacks can be found in the references below. Our treatment in this section is an adapted version of our note [62] that addresses the relevance of trade-off attacks on the eSTREAM primitives. Standard Scenario In the standard scenario, an attacker is given a frame of known plaintext of length 2d . He wants to recover the k-bit secret key or the s-bit internal state of the stream cipher, using 2t time and 2m memory. The attacker will also need 63 64 CHAPTER 5. STRUCTURAL ATTACKS to perform a precomputation step of 2p operations to prepare for the trade-off attack. For the moment, it is assumed that the IV is just a counter, which is initialized to zero in the first frame with any key. Hellman. The standard time-memory trade-off is described by Hellman in [98] on block ciphers. It can also be applied to stream ciphers to recover the secret key. The bounds on the orders of the attack complexities are as follows: t + 2m = 2k p=k (5.1) d = 1. Babbage-Golic. The Babbage-Golic (BG) trade-off attack on stream ciphers has been independently discovered by Babbage [10] and Golic [86]. It trades time and data complexity on the one hand versus memory and precomputation complexity on the other, to recover the internal state of the stream cipher. The bounds on the orders of the attack complexities are as follows for the BG attack: t+m=s d=t (5.2) p = m. Biryukov-Shamir. The Biryukov-Shamir (BS) trade-off [28] combines the two approaches above into a more flexible time/memory/data trade-off, aimed at recovering the internal state of a synchronous stream cipher. The bounds on the orders of the attack complexities are as follows for the BS attack: t + 2m + 2d = 2s t ≥ 2d (5.3) p = s − d. Multiple-Key Scenario A different security model is the following: the attacker observes many key streams, each having a different secret key. The aim of the attacker is to recover one of these secret keys. Such attacks have been studied before, e.g. by Biham in [21]. Recently, new attention has been given to this scenario due to the work of Hong and Sarkar [101] and by Biryukov et al. [27]. We now describe these attacks and show their impact for stream ciphers using the eSTREAM key lengths, 80 bit and 128 bit. The attacks are similar to the BG and BS attacks described above, but now the secret key is attacked directly in the multiple key scenario. In order to emphasize this we denote the data requirement 5.1. OVERVIEW OF SOME ATTACKS 65 with 2dk , to stress that we need data corresponding to different keys. We need data the size of the key length for every single frame. It is significant to note that these attacks are also applicable to several modes of operation of block ciphers with the same key size. Babbage-Golic. The BG attack is now a simple birthday attack. The attacker generates key stream for 2m key pairs and stores them in a table. He then observes 2dk different key streams. Because of the birthday paradox, he will be able to recover the key for one of these key streams with high probability if m + dk = k. The bounds on the orders of the attack complexities are as follows: m + dk = k t = dk (5.4) p = m. Biryukov-Shamir. The BS attack can now also be applied to recover one of the secret keys, giving the following complexities: t + 2m + 2dk = 2k t ≥ 2dk (5.5) p = k − dk . We now show the implications for the two key sizes in the eSTREAM project, 80 bits and 128 bits. • For a 128-bit secret key, a graph of the trade-off is shown in Fig. 5.1. This figure represents, on a logarithmic scale, the trade-off that minimizes the maximum of t and m as a function of dk , following (5.5). A point on the curve is e.g. the following: after observing frames from 240 different keys, an attacker is expected to recover one of these, using 288 precomputation, and then an online attack requiring 280 time and 248 memory. • For a 80-bit secret key, a graph of the trade-off is shown in Fig. 5.2. Again, this figure represents, on a logarithmic scale, the trade-off that minimizes the maximum of t and m as a function of dk , following (5.5). A point on the curve is e.g. the following: after observing frames from 225 different keys, an attacker is expected to recover one of these, using 255 precomputation, and then an online attack requiring 250 time and 230 memory. Discussion of these trade-offs A first point about this attack is the practicality of the trade-offs proposed, especially when huge amounts of memory are needed. It may indeed be better to CHAPTER 5. STRUCTURAL ATTACKS Time t (solid), Memory m (dotted), Precomputation p (dashed) 66 120 100 80 60 40 20 0 0 10 20 30 40 Number of key streams observed 50 60 Figure 5.1: BS multiple-key trade-off for 128-bit secret key invest your resources in parallelization of the operations instead of in this huge memory for a trade-off attack, as already noted by Amirazizi and Hellman [3] and later by Wiener [163] and Bernstein [20]. Wiener defines the full cost of a cryptanalytic attack to compare the cost of key recovery for the various attacks. More research is needed to effectively compare the various attacks. This requires the combined knowledge of some refined techniques for trade-off attacks, such as Rivest’s distinguished points technique [66] or Oechslin’s rainbow tables [136], and of hardware design. A second point is the impact of these attacks. The standard Hellman tradeoff and a parallel attack both are ways of performing exhaustive search. In their yearly report on algorithms and key sizes [72], ECRYPT studies these various attack scenarios and concludes that an 80-bit level appears to be the smallest general-purpose level as with current knowledge it protects against the most reasonable and threatening attack scenarios. (. . . ) The choices 112/128 for the high and very high levels are a bit conservative based on extrapolation of current trends. The standard BG and BS trade-offs are specific to stream ciphers and attack the internal state directly. Babbage [10] proposes to make the state of the stream cipher at least twice as large as the key, a recommendation which has been followed in the literature. This indeed makes the BS and BG trade-offs more expensive than exhaustive key search. The impact of the multiple-key scenario, which holds both for stream ciphers and for block ciphers in several common modes, became only recently under increased scrutiny. It is advisable that system designers make an assessment Time t (solid), Memory m (dotted), Precomputation p (dashed) 5.1. OVERVIEW OF SOME ATTACKS 67 100 90 80 70 60 50 40 30 20 10 0 0 5 10 15 20 25 30 Number of key streams observed 35 40 Figure 5.2: BS multiple-key trade-off for 80-bit secret key to determine whether these attacks are a problem in the intended applications. A possible remedy is to add entropy to the stream cipher initialization. As we explained in [62], two scenarios could be considered: • An attacker should not be able to mount an attack where precomputation, online time, memory and data requirements are all less than 2k . In this case, entropy should be added equivalent to half the key size. • An attacker should not be able to mount an attack requiring online time, memory and data all less than 2k , and a precomputation which is even longer than exhaustive search. In this case, entropy should be added equivalent to the key size. Two methods for adding entropy can be considered: either we add entropy through full-entropy IV s [101], or by using a secret key which is longer than the actual security level [20]. We see the following advantages and disadvantages to both approaches: • The use of longer keys requires these longer keys to be securely distributed and stored, which may require some additional resources. • The use of full-entropy IV s requires that these IV s are generated and transmitted. • Longer secret keys may be confusing, as the security level no longer corresponds to the key size. 68 CHAPTER 5. STRUCTURAL ATTACKS • Longer secret keys make brute-force search more expensive, whereas the full-entropy IV s only provide protection against trade-off attacks. • Both approaches can be used in actual stream ciphers, as the state is already twice the security level and can hence receive more entropy. However, if the entropy gets close to the internal state size, resynchronization may become a bit more delicate due to ad hoc attacks such as slide attacks [29] and related techniques. Again, it will depend on the application which approach will be preferable. It is advisable that stream cipher designers make their specifications flexible enough to accommodate various key and IV sizes. 5.1.2 Inversion Attacks Inversion attacks, introduced by Golic [85], are a simple class of attacks using the structure of the state update and output function to determine the entire internal state of a stream cipher. It applies when the state is a shift register and the distance between the first and last tap (called the memory size M) is smaller than the shift register length. The attack first guesses the M bits corresponding to the memory size, and then determines the following bits at each iteration. Some techniques to perform this efficiently have been described in [87, 88]. These attacks can even be stronger if the cryptographer has made an unfortunately regular choice for the output taps. To prevent this structural attack, Golic proposed that shift register-based stream ciphers should have a large memory size M (i.e., that values from the beginning up to the end of the shift register should be combined to generate key stream) and that the output taps should have an irregular structure. Golic suggests the use of a full positive difference set, better known as a Golomb ruler [164]. 5.1.3 Guess-and-Determine Attacks Guess-and-Determine (GD) attacks are basically a very simple class of attacks, but which can be surprisingly powerful and can be combined with other ad hoc attack techniques. The basic idea is a divide and conquer approach. The attacker guesses a part of the internal state, and recovers the remaining part of the internal state by using the relations induced by the update and output function of the stream cipher. GD can be seen as a name for a whole set of attacks on various designs. For example, the attacks on RC4 by Knudsen et al. [108], on A5/1 by Golic [86], on SNOW 1.0 by Hawkes and Rose [95] and on Sober-t32 by us (see below) all use GD-type methods. Some more advanced techniques for GD-like attacks are 5.2. GUESS-AND-DETERMINE ATTACK ON SOBER-T32 69 Hawkes and Rose’s use of multiples of connection polynomials for word-oriented LFSRs [94] and Krause’s BDD-based cryptanalysis [110]. Usually GD-attacks are quite ad hoc and the best attacks are often found using some heuristic techniques, though some recent research tries to perform them in a more systematic way [42]. We now describe our attack on Sober-t32 as an example of a Guess-and-Determine Attack in the next section. 5.2 A Guess-and-Determine Atttack on Unstuttered Sober-t32 Sober-t32 is a software-oriented synchronous stream cipher designed by Rose and Hawkes [93] which has been submitted to the NESSIE competition. The cipher uses a Linear Feedback Shift Register (LFSR), a Non-Linear Function (NLF) and an irregular stuttering mechanism for the generation of the pseudorandom key stream. Sober-t32 is described in Appendix A.4. We now present a Guess-and-Determine attack against unstuttered Sobert32. This attack exploits a probabilistic factor in the design and appears to be faster than exhaustive search. This is not expected by the designers, as they state in [93]: Our analysis indicates that the combination of the LFSR and NLF appears to be sufficient to resist GD-attacks. First, a simplified attack will be presented, where the carry bits are not taken into account. Next the real attack will be presented. Finally, the attack will be improved in different ways. 5.2.1 Attack without Carry Bits First, we will rewrite the equation of the NLF (A.11) by separating the Most Significant Byte (MSB) and the three other bytes and by using (A.12). This yields the following equation: (((SBOX(st,M SB ¢ st+16,M SB ¢ o1 ) ⊕ (0kst,R ¢ st+16,R )) ¢ st+1 ¢ st+6 )⊕ K) ¢ st+13 . (5.6) In this equation, o1 represents the carry bit towards the MSB. Under the assumption that the MSB of K is zero, this equation can be split into two separate vt = 70 CHAPTER 5. STRUCTURAL ATTACKS parts: ((SBOX1 (st,M SB ¢ st+16,M SB ¢ o1 )) ¢ st+1,M SB ¢ st+6,M SB ) vt,M SB = ¢st+13,M SB ¢ o2 (((SBOX2 (st,M SB ¢ st+16,M SB ¢ o1 ) ⊕ (st,R ¢ st+16,R )) vt,R = ¢st+1,R ¢ st+6,R ) ⊕ KR ) ¢ st+13,R . (5.7) In this equation, SBOX1 represents the MSB of the output of the S-box, and SBOX2 the three remaining bytes of the S-box output. o2 represents the carry bits from the additions. Especially the first equation is interesting. Given the value of the MSB of st , st+1 , st+6 and st+13 , the value of the MSB of the key stream vt and the value of the carry bits o1 and o2 , it is possible to calculate the MSB of st+16 . For the moment, the carry bits are not taken into account. Now the attack starts. First, the MSB of the first 16 words of the LFSR (st , st+1 , st+2 . . . st+15 ) are guessed, a total of 128 bits. Knowing these, together with the key stream vt , it is possible to calculate the MSB of st+16 , st+17 , st+18 , st+19 , . . . The LFSR will now be clocked a few times. We get a system of linear equations in bits, where the unknowns are the 24 least significant bits of every word that appears in the LFSR. Meanwhile, at every iteration a number of linear equations is obtained. These equations are the following: • At every iteration, we get 32 new linear equations from the derivation of the new LFSR word. In fact, the linear recurrence over GF (232 ) is equivalent to 32 bit-wise LFSRs (see [99]). These LFSRs all have the same linear recurrence, which is given in [152]. • We also get an extra linear equation per iteration in the least significant bit. As the input to the S-box is known, one can easily see that the following equation holds for the least significant bit: vt0 = SBOX 0 (st,M SB ¢st+16,M SB ⊕o1 )⊕s0t ⊕s0t+16 ⊕s0t+1 ⊕s0t+6 ⊕K 0 ⊕s0t+13 . (5.8) In this equation, the superscript 0 stands for the least significant bit of the word. This equation is perfectly linear. We also get one such equation before iterating: before the LFSR is clocked, we already get such an equation from the initial state. After clocking the LFSR k times, 32 · k + k + 1 = 33 · k + 1 linear equations are obtained. The remaining unknowns in these equations are: • The value of the 24 least significant bits of st , st+1 , st+2 . . . st+16 . • The value of the least significant bit of K. 5.2. GUESS-AND-DETERMINE ATTACK ON SOBER-T32 71 • The value of the 24 least significant bits of the new word at each iteration. After k iterations, the total is 24 · 17 + 1 + 24 · k = 24 · (17 + k) + 1 unknowns. In order to obtain a solvable set of equations, the number of equations must be larger than the number of unknowns: 33 · k + 1 ≥ 24 · (17 + k) + 1 ⇐⇒ k ≥ 45.33 . (5.9) Remark that the attack recovers the whole state of the cipher, except for the 23 remaining bits of K. However, once the attack is finished, these can be recovered easily from the second equation of (5.7). In order to evaluate the complexity of the attack, one should consider the number of bits that has to be guessed. The following bits should be guessed: • The MSB of st , st+1 , st+2 . . . st+15 , a total of 128 bits. • The MSB of K. In fact, this MSB will always be assumed to be zero. This assumption can be seen to be equivalent to guessing the MSB of K: It is possible to mount the attack on a number of different key streams until we get a key stream in which the MSB of K is zero. This means guessing 136 bits in total, which implies a complexity of 2136 . It should be noted that solving the set of equations does not increase the complexity of the attack: We know in advance the linear equations relating the deduced bits to the initial state bits, so we can precompute the inverse matrix and all we have to do is a matrix multiplication. Of course, we have not taken into account the carry bits that are unknown to us. In the next section, we will take the carry bits into account and show the complexity of the full attack. 5.2.2 Taking Account of the Carry Bits In the previous section, we have assumed that the carry bits are known. In reality we will also have to guess these carry bits. Carry bits are not purely random, the distribution of their value is non-uniform. One can take advantage of this by first trying to guess the more probable values. The number of times we have to guess on average will be well approximated using the entropy, as this equals the amount of information that is present in the carry bits. The number of guesses will typically be 2entropy−1 . In this section, we will first calculate the entropy for the two carry bits o1 and o2 . In the second part, we will evaluate the complexity of the full attack, now considering the carry bits. 72 CHAPTER 5. STRUCTURAL ATTACKS The Entropy of the Carry Bits In the following, pji (and also qij and rij ) stands for: the probability that the value of the carry bit to the i-th bit is equal to j. The bits are numbered from least to most significant. Bit 1 is the least significant bit (lsb), bit 32 is the most significant bit (msb ). The notations pji , qij and rij are used to distinguish between the three different scenarios which are discussed below: • Two random 32-bit words are added. One can see that p02 equals 3/4, and that p12 equals 1/4. The values of the subsequent carry bits can be obtained with the following recursion: ( p0i+1 = 3 4 · p0i + 1 4 · p1i p1i+1 = 1 4 · p0i + 3 4 · p1i . (5.10) The values of p0i and p1i can now be calculated for all i. Both values converge rapidly to 1/2. For i = 25, the carry bit of interest here, this is a very good approximation. The entropy H for this carry value is thus: H=− X 1 1 1 1 pi · log(pi ) = − log( ) − log( ) = 1 . 2 2 2 2 (5.11) The carry bit o1 corresponds to this case. The entropy of the carry bit o1 is 1 bit. • Now three words are added up. The carry value can now be 0, 1 or 2. The probabilities for the first carry value are q20 = 12 , q21 = 12 and q22 = 0, and the recursion formula is: 0 qi+1 = 12 · qi0 + 18 · qi1 1 qi+1 = 21 · qi0 + 34 · qi1 + 21 · qi2 (5.12) q2 = 1 · q1 + 1 · q2 . i+1 i i 8 2 For increasing i, the values converge rapidly towards q 0 = q 2 = 16 . 1 1 6, q = 2 3 and The entropy H converges thus towards the value: H=− X 1 1 2 2 1 1 q i · log(q i ) = − log( ) − log( ) − log( ) = 1.25 . (5.13) 6 6 3 3 6 6 5.2. GUESS-AND-DETERMINE ATTACK ON SOBER-T32 73 • In a third scenario, we consider the carry value for the sum ((x + y) ⊕ C) + z + u. As an extra constraint, all bits of C that are more significant (i.e. more to the left) than the carry value considered must be zero. It can be seen that this case is a combination of the two previously considered cases: the total carry value is the sum of the carry value from the addition of two words (x and y) and of the carry value from the addition of three words (x + y ⊕ C, z and u). It is then easy to see that the probabilities converge towards the following values: 0 1 r = p0 · q 0 = 12 . 16 = 12 r1 = p0 · q 1 + p1 · q 0 = 12 · r2 = p0 · q 2 + p1 · q 1 = 12 · 3 1 r = p1 · q 2 = 12 · 16 = 12 . 2 3 + 1 2 · 1 6 = 5 12 1 6 + 1 2 · 2 3 = 5 12 (5.14) The entropy H converges thus towards the following value: X 1 1 5 5 5 5 1 1 log( )− log( )− log( )− log( ) = 1.65 . 12 12 12 12 12 12 12 12 (5.15) The carry value o2 corresponds to this case. The entropy of the carry value o2 is 1.65 bit. H=− ri ·log(ri ) = − Complexity of the Full Attack The full attack requires guessing the following bits. • The 136 bits that had to be guessed in the attack without carry bits. • The carry bits, a total of 47 · (1 + 1.65) = 124.55 bits. (That is, the number of iterations plus one. This extra guess comes from determining st+16 : there is no iteration here, but the carry bits have also to be guessed.) This means a total of 260.55 bits. The complexity of the attack is thus 2260.55 . This is only a little above the complexity of doing an exhaustive search for the 256-bit key. In the following sections, a number of improvements will be presented in order to get the complexity below that of exhaustive key search. 5.2.3 Further Improvements Instead of assuming that the 8 most significant bits of K are zeros, one could also assume that the nine most significant bits of K are zeros. This will lower 74 CHAPTER 5. STRUCTURAL ATTACKS the entropy of the carry bit o2 . It can be calculated that the new entropy for o2 is 1.48 bit. Now we can recalculate the total complexity for the attack with the procedure described above. We get a complexity of 2253.56 . We can also assume that the 10 most significant bits of K are zero. The entropy of o2 is then 1.43 bit, and the complexity of the attack is 2252.21 . Another improvement would be to guess the carry bits of several rounds together. The entropy of the carry bit in (s0 + s16 ) is 1 bit; and the entropy of the carry bit in (s16 + s32 ) is also 1 bit. The entropy of the two together is 1.92 bit however. If we take (s0 + s16 ), (s16 + s32 ) and (s32 + s48 ) together, the entropy is 2.83 bit. There is scope for more improvements of this type. Another possible improvement would be to use all relations in the NLF, so not only the linear ones. The complexity of this approach is not well understood and requires further research. 5.3 Conclusion In this section we have described some more structural approaches in the cryptanalysis of stream ciphers. We have described the trade-off attacks, analyzed their impact and evaluated the countermeasures to these attacks. We also described some attacks using structural properties of the stream ciphers to mount key recovery attacks, notably inversion and Guess-and-Determine attacks. As an example, we mounted a Guess-and-Determine attack on the stream cipher Sober-t32. Chapter 6 Resynchronization Attacks In the three previous chapters, attacks have been described on synchronous stream ciphers during key stream generation. It was assumed in these sections that the stream cipher has been initialized with a random initial state, and we only analyzed the update and output function. In this chapter, we will also analyze the initialization (or resynchronization) mechanism of the stream cipher in more detail. Daemen, Govaerts and Vandewalle [58] observed that this resynchronization mechanism can lead to a new type of attacks on synchronous stream ciphers. They also showed an efficient attack on nonlinearly filtered systems using a linear resynchronization mechanism and using an output Boolean function with few inputs. Golic and Morgari [89] extended this attack to the case where the output function is unknown. Borissov et al. [30] showed that a ciphertext-only attack is also possible in some cases. In this chapter, we extend the resynchronization attack to overcome some limitations of the original attack of Daemen et al. We achieve this by further refining the original resynchronization attack and by combining the attack with the other attack methodologies described in the previous chapters. In Sect. 6.1, we introduce some necessary theory: we explain to which classes of stream ciphers our analysis applies, give some necessary theorems on Boolean functions, and present a framework for resynchronization attacks. In Sect. 6.2 we present the Daemen et al. attack and its limitations. In Sect. 6.3 we show how to perform the Daemen et al. attack in real-time by precomputation. In Sect. 6.4 we describe several methods to mount a resync attack when the number of resyncs is small. Section 6.5 describes methods to mount attacks when the output function has many inputs. In Sect. 6.6, we describe attacks on stream ciphers with memory and in Sect. 6.7 we discuss attacks on stream ciphers with nonlinear resynchronization mechanism. In Sect. 6.8, we apply some of the attacks 75 76 CHAPTER 6. RESYNCHRONIZATION ATTACKS described on some stream ciphers: E0 from the Bluetooth standard, Toyocrypt and the summation generator. We conclude in Sect. 6.9. 6.1 6.1.1 Preliminaries Used Framework for Synchronous Stream Ciphers Again, we consider a subcase of the general framework for a synchronous stream cipher given by (2.26). We consider a synchronous stream cipher with s-bit linear state S updated by a linear function represented by a matrix UL (e.g., one or more LFSRs) over F2 , and with a nonlinear Boolean output function f that takes ϕ input bits coming from S to produce one output bit zt . Some designs (e.g., the combiners with memory) also include a m-bit memory M that has a nonlinear update function h. This results in the following framework: zt = f (St , Mt ) St+1 = UL · St (6.1) Mt+1 = h(St , Mt ) . The initial state (S0 , M0 ) is determined by the resynchronization mechanism, which combines a k-bit secret key K and a iv-bit known initialization vector IV i with an initialization function In: (S0 , M0 ) = In(K, IV i ) . 6.1.2 (6.2) Some Special Properties of Boolean Functions and Related Inputs In this section we provide some theorems about Boolean functions and related inputs. All calculations are done over the finite field F2 . Lemma 6.1.1 Let α, δ (1) , δ (2) ∈ {0, 1}ϕ be arbitrary. For i = 1, 2 it holds that M mα (x ⊕ δ (i) ) = mα0 (x)mα−α0 (δ (i) ) , (6.3) α0 ≤α where mα is a monomial as defined in Sect. 2.2.1. Also, deg mα (x ⊕ δ (1) ) ⊕ mα (x ⊕ δ (2) ) ≤ deg mα − 1. (6.4) Proof. The first equation is obvious. The second one is because of: mα (xP+ δ (1) )¡ + mα (x + δ (2) ) ¢ = α0 ≤α mα0 (x)mα−α0 (δ (1) ) + mα0 (x)mα−α0 (δ (2) ) ¡ ¢ P = mα (x) + mα (x) + α0 <α mα0 (x)mα−α0 (δ (1) ) + mα0 (x)mα−α0 (δ (2) ) . | {z } =0 6.1. PRELIMINARIES 77 ¤ Theorem 6.1.2 Let f be a Boolean function with deg f = d. Then deg f (x ⊕ δ (1) ) ⊕ f (x ⊕ δ (2) ) ≤ d − 1 . (6.5) P Proof. We can write out f (x) by its ANF α,|α|≤d cα mα (x). Then, lemma 6.1.1 implies that ³ ´ M f (x ⊕ δ (1) ) ⊕ f (x ⊕ δ (2) ) = cα mα (x ⊕ δ (1) ) ⊕ mα (x ⊕ δ (2) ) . | {z } α,|α|≤d deg≤d−1 ¤ The following corollary is obvious: Corollary 6.1.3 For any even integer m and any vectors δ (1) , . . . , δ (m) : deg m M f (x ⊕ δ (i) ) ≤ deg f − 1 . (6.6) i=1 Theorem 6.1.4 is a special case of theorem 6.1.2 and will be of use in this chapter: Theorem 6.1.4 Let f be a Boolean function with deg f = d. Let ei ∈ {0, 1}ϕ be the unit vector with a 1 in position i and 0 elsewhere. Then the function f1 (x0 ) = f (x) ⊕ f (x ⊕ ei ) has degree ≤ d − 1, where x0 = (x1 , . . . , xi−1 , xi+1 , . . . , xn ) ∈ {0, 1}ϕ−1 . Proof. We first split the function f (x) into two parts, the first part consisting of the monomials containing xi , and the second part consisting of the monomials not containing xi as a factor: f (x) = xi · f1 (x0 ) ⊕ f2 (x0 ) , (6.7) where it is straightforward to see that deg f1 ≤ d − 1 and deg f2 ≤ d. We do the same for the function f (x ⊕ ei ): f (x ⊕ ~ei ) = (xi ⊕ 1) · f1 (x0 ) ⊕ f2 (x0 ) . (6.8) Taking the XOR of the equations (6.7) and (6.8) and eliminating terms occurring twice yields: f (x) ⊕ f (x ⊕ ~ei ) = f1 (x0 ). ¤ 78 CHAPTER 6. RESYNCHRONIZATION ATTACKS Table 6.1: The situation of a frequently reinitialized key stream generator Frame 0 1 .. . R-1 6.1.3 Clock 0 State (S00 , M00 ) = In(K, IV 0 ) (S01 , M01 ) = In(K, IV 1 ) .. . (S0R−1 , M0R−1 ) = In(K, IV R−1 ) Output z00 ... ... Clock T − 1 State Output (ST0 −1 , MT0 −1 ) zT0 −1 z01 ... (ST1 −1 , MT1 −1 ) zT1 −1 .. . ... .. . .. . z0R−1 ... R−1 (STR−1 −1 , MT −1 0) zTR−1 −1 Resynchronization Attacks For a synchronous stream cipher, perfect synchronization between sender and receiver is required. The aim of the resynchronization mechanism is to achieve this in a secure fashion. A first solution is fixed resync. In this scenario, the message is divided into frames, and each frame i is encrypted with a unique IV i , a frame counter updated in a deterministic way. An attack under this scenario is called a known IV resynchronization attack. The frequency at which resynchronization should occur depends on the risk of synchronization loss. Examples of stream ciphers that use fixed resync are the E0 algorithm used in the Bluetooth [157] wireless communication standard, and A5 [4] used in GSM cellular voice and data communication. A second scenario is that the receiver sends a resynchronization request to the sender as soon as synchronization loss is detected. This is called requested resync. In this scenario, the receiver may be allowed to choose the nonce IV i used in the frame. This may enable a chosen IV resynchronization attack, as described e.g. by Joux and Muller [104]. Security under the chosen IV attack scenario implies security under the known IV attack scenario. Hence, a good resynchronization mechanism should be resistant against a chosen IV attack. Table 6.1 shows the general setting for a resync attack as used throughout this chapter. For R frames, the attacker knows the first T key stream bits. For simplicity we use successive key stream bits, although this is not a requirement for most attacks described in this chapter. Where this is really a requirement, we will specify this. The attacker also knows the initial value IV i of each frame. 6.2. THE DAEMEN ET AL. RESYNCHRONIZATION ATTACK 79 We now describe a first resynchronization attack by Daemen et al. on a simplified version of this framework and point out its limitations. 6.2 6.2.1 The Daemen et al. Resynchronization Attack Description The resynchronization attack of Daemen et al. [58] is a known plaintext attack for the special case of a simple memoryless combiner with a linear resynchronization mechanism. The framework of the attack can be described as follows: S0i = A · K ⊕ B · IV i (6.9) z i = f (Π · Sti ) ti St+1 = UL · Sti . In these equations, A, B, UL and Π are known binary matrices, A ∈ Fs×k , 2 B ∈ Fs×iv , UL ∈ Fs×s and Π ∈ Fϕ×s . The matrices A and B represent the 2 2 2 fact that the resync mechanism is linear, the projection matrix Π shows that the output function f uses only a subset of ϕ bits of Sti . The initialization vector of the ith frame is IV i , and zti is the key stream bit at time t of the ith frame. We introduce the key-dependent unknown values kt and the known values ivit , both ϕ-bit long, as follows: ½ kt = Π · ULt · A · K (6.10) ivit = Π · ULt · B · IV i . Using the setting from Table 6.1, the attacker can then rewrite this system into a system of equations built as follows: zti = f (kt ⊕ ivit ) for 0 ≤ i ≤ R − 1, 0 ≤ t ≤ T − 1. (6.11) Now we try to find a solution of this system of equations for each t. Assume without loss of generality that t = 0. If ϕ is not too large, we can perform an exhaustive search over k0 , and check whether the guess satisfies the R equations for t = 0. If R ≥ ϕ, it is expected that a unique solution for k0 exists. Hence the correct value of k0 has been found, and thus ϕ linear equations in the bits of the secret key. This is repeated for t = 1, 2, ..., p − 1, such that p = dk/ϕe, until the entire secret key is deduced. The complete attack requires on average dk/ϕe · 2ϕ evaluations of the function f , at least ϕ resyncs and about k bits of key stream in total (ϕ frames of length dk/ϕe). Note that the computational complexity of the attack increases exponentially with the number ϕ of inputs of the Boolean function f . 80 CHAPTER 6. RESYNCHRONIZATION ATTACKS 6.2.2 Limitations The Daemen et al. attack can be seen as a divide and conquer attack. Standard cryptanalytic attacks such as correlation and algebraic attacks work chronologically on a key stream, which corresponds to a row in Table 6.1. On the contrary, the Daemen et al. attack tries to solve the system column by column. One of the motivations of this chapter is to combine both approaches. Of special interest is the combination of the resynchronization attack with algebraic attacks. In many cases, it is not obvious whether the column by column approach by Daemen et al. works or not. We have identified the following limitations: 1. The attack does not work in real time. 2. The number of resyncs R has to be at least the number ϕ of input bits of the output function f . 3. The complexity is prohibitively large for large values of ϕ. 4. The divide-and-conquer approach does not work if the key stream generator uses additional memory. 5. The initialization function In has to be linear. In the following, we will address each of these limitations and show how to overcome them. 6.3 Real-time Attack The attack of Daemen et al. shows that ciphers that use a linear resynchronization mechanism and that have a Boolean function f with few inputs are insecure. This enables a passive attack on such designs. A real-time attack in which the attacker can discover the plaintext immediately (and even modify it in a controlled way) is not possible, because the time required to perform the p exhaustive searches will be too high. Here we show how to easily replace this iterated exhaustive search by a precomputation step. This enables real-time active attacks on such ciphers. We start from the realistic assumption that the IV i s are chosen in a deterministic way, for instance by a counter or a fixed update mechanism. In the precomputation step, we first calculate the values ivit for 0 ≤ i ≤ ϕ − 1 and 0 ≤ t ≤ T − 1. Then we calculate the following ϕ bits, and repeat this for all 6.4. ATTACK WITH A SMALL NUMBER OF RESYNCHRONIZATIONS 81 values of gt (a guess for kt ) going from 0 to 2ϕ − 1, and this for all t. f (gt ⊕ iv0 ) = b0 f (g ⊕ ivt1 ) = bg1 t ,t t t gt ,t ... f (gt ⊕ ivϕ−1 ) = bϕ−1 t gt ,t . (6.12) For each time t, we obtain 2ϕ sequences b0gt ,t b1gt ,t . . . bϕ−1 gt ,t . Because the length of these sequences is ϕ, every value of gt is expected to correspond with a unique sequence b0gt ,t b1gt ,t . . . bϕ−1 gt ,t . We then sort the gt values based on the numerical value of the corresponding sequence, and store this in memory. The attack now goes as follows. We group the outputs observed at say t = 0 in a sequence z00 z01 ...z0ϕ−1 . We jump to this position in our table built for t = 0, and the value found there is the correct value of k0 . We do the same for the times t = 1, 2, ...p − 1, and we have then found the necessary ki to directly determine the secret key K. The total complexity of the precomputation step is about k · 2ϕ evaluations of f (but this can of course be replaced by 2ϕ evaluations of f and k · 2ϕ table lookups). The memory requirement is about k · 2ϕ bits, which is feasible for many stream ciphers (e.g., for a secret key of k = 256 bits, and a Boolean function with ϕ = 20 inputs, 32 Mbyte is required). 6.4 Attack with a Small Number of Resynchronizations In [58], the authors made the assumption that the number of solutions converges to 1 if R & ϕ. Actually, the number of required resyncs depends on the cipher and the observed public parameters IV i . In [89], Golic and Morgari discussed the number of IV s that are needed for the Daemen et al. attack to work. They showed that with a non-negligible probability more than ϕ known IV s are necessary. This results in an increased attack complexity, both for the original attack and for the precomputation attack. We will here follow different approaches. We want the attacks to work in any case and with a minimal number of known IV s. Simulations on various Boolean functions have confirmed that the standard resynchronization attack does not always work in practice with ϕ IV s. This is due to two reasons, the first being imperfect behavior of the function f . However, this effect is not significant because in most stream ciphers the function f has good statistical properties, which typically include balancedness and high nonlinearity. A second reason is that sometimes collisions occur between two values ivat and ivbt where a 6= b. We will show several ways to overcome this problem. 82 6.4.1 CHAPTER 6. RESYNCHRONIZATION ATTACKS Computational Approach Two-phase attack We implement the algorithm in two steps. The first step, the resynchronization attack, retains a set of values for each of the k0 , k1 . . . kp−1 . In a second step, we then search through all possible combinations until we have found the correct secret key. Simulations have shown that for ϕ (or more) known IV s, the time complexity of the second step is negligible. In other words, the resynchronization attack (extended with the fast search step) is always successful under realistic assumptions with ϕ known IV s. Using this two-step algorithm, one can also mount a resynchronization attack with R < ϕ known IV s. We can show that the time complexity of the second k step is approximately 2(ϕ−R)· ϕ . Even if this complexity increases exponentially with decreasing R, this shows that a resynchronization attack is still feasible for R smaller than (but close to) ϕ: if R = α · ϕ with 0 < α ≤ 1, the complexity becomes 2(1−α)·k . Overlapping bits There is also another interesting way to perform a resynchronization attack when R < ϕ. Let us take the case R = ϕ − 1. For kt , we will get two possibilities after the exhaustive search. But looking at the bits of these two possibilities kt,1 and kt,2 , about half of these will be equal, and will therefore certainly be the correct values for these bits of kt . This implies that we have still found ϕ/2 linear equations in K, and we will just need frames that are twice as long as in the standard attack, i.e., have length T ≥ 2k/ϕ each. This is still very realistic in most cases. We can develop a similar reasoning for smaller values of R, but the length of the frames and the complexity required increases rapidly: they can ϕ−R ϕ−R −1 −1 be shown to be 22 · k/ϕ and 22 · k/ϕ · 2ϕ respectively as follows. Suppose we have done R resynchronizations, where R < ϕ. This implies that on average 2ϕ−R solutions remain for kt . The probability that one specific ϕ−R −1 bit is the same in all these solutions is (1/2)2 . We have found one bit of information if this is the case, else we didn’t learn anything. This holds for all ϕ bits of kt . The average frame length required is now equal to the amount of information required (the k bits of the key) divided by the average information learned at each time step: k frame length = ϕ 22 ϕ−R −1 = k 2ϕ−R −1 ·2 . ϕ (6.13) 6.4. ATTACK WITH A SMALL NUMBER OF RESYNCHRONIZATIONS 83 Table 6.2: The complexities of the practical resynchronization attacks Attack Daemen et al. Resyncs ϕ 30 frame length k/ϕ 10 bit Total complexity k/ϕ · 2ϕ 233 Two-phase attack R(< ϕ) 29 28 27 26 k/ϕ 10 bit 10 bit 10 bit 10 bit k/ϕ · 2ϕ + 2(ϕ−R)·k/ϕ 233 233 233 240 Overlapping bits R(< ϕ) 29 28 27 26 22 ϕ−R −1 · k/ϕ 20 bit 80 bit 1280 bit 328 kbit 22 ϕ−R −1 · k/ϕ · 2ϕ 34 2 236 240 248 The time complexity of the attack now consists of performing an exhaustive search over 2ϕ possibilities at each time t. Hence the time complexity is equal to ϕ−R −1 2ϕ times the frame length, which gives s/ϕ · 22 · 2ϕ . In Table 6.2, we summarize the complexities of the attacks. We also give a realistic example for a secret key of k = 300 bits and ϕ = 30. This clearly shows that limiting the number of allowed resyncs per secret key to less than ϕ will not prevent a resynchronization attack. We will also give an example of this later on when we discuss the attack on Bluetooth. 6.4.2 Using Algebraic Attacks As we have pointed out, Table 6.1 implies the following system of equations: zti = f (kt ⊕ ivit ), 0 ≤ i ≤ R − 1, 0 ≤ t ≤ T − 1 . (6.14) Hence, another possibility is to try to solve it as a whole instead of “columnby-column”, by using techniques from linear algebra. The linearization method requires that the number of linearly independent equations exceeds the number of occurring monomials Mkdf . This requires T · R ≥ Mkdf . Hence, the lower the degree of the equations the faster the attack. As already explained in Chapter 4, several conditions and methods are described in the literature [55, 51, 123, 7] for transforming (6.14) into a new system of equations g(kt ⊕ ivit , zti ) = 0, 0 ≤ i ≤ R − 1, 0 ≤ t ≤ T − 1 . (6.15) 84 CHAPTER 6. RESYNCHRONIZATION ATTACKS with dg < df . Next, we will show how to use the resync setting to decrease the degree of (6.15) further. The Degree-d -1 attack The first approach is to construct new equations of degree ≤ d − 1. We express g by M gj (kt ⊕ ivit ) · g̃j (zti ) . (6.16) g(kt ⊕ ivit , zti ) = j Observe that the functions gj and g̃j depend only on g and are all known to the attacker. The idea is to find appropriate linear combinations of (6.16) to reduce the degree. Let I := {j| deg gj = d} and rewrite (6.16) to M M g= gj · g̃j ⊕ gj · g̃j . (6.17) j∈I | {z } deg gj = d j6∈I | {z } deg gj <d Theorem 6.4.1 provides a method for decreasing the degree at least by 1: Theorem 6.4.1 Let g be expressed as described in (6.16). For any known iv0t ,. . . , |I| ivt and corresponding known outputs zti , coefficients c0 , . . . , c|I| ∈ {0, 1} with at least one ci 6= 0 can be computed such that deg |I| M (i) ci · g(kt ⊕ ivit , zt ) ≤ deg g − 1 . (6.18) i=0 Proof. We set gji := gj (kt ⊕ ivit ) and g̃ji := g̃j (zti ) ∈ {0, 1}. With (6.16), we can write |I| M i=0 ci · g(kt ⊕ ivit , zti ) = |I| MM j∈I i=0 ci · g̃ji · gji ⊕ |I| MM ci · g̃ji · gji . j6∈I i=0 The second part of the right hand side has a degree ≤ d − 1 by definition of I. The idea is to find coefficients c0 , . . . , c|I| ∈ {0, 1} such that the first part of the right hand side has degree ≤ d − 1 too. By Corollary 6.1.3, it is sufficient that P|I| i i=0 ci · g̃j (treated as an integer) is an even number for all j ∈ I. We ³show now that ´ it is always possible. For each i we define the vector − → (i) (i) Vi := g̃1 , . . . , g̃|I| ∈ {0, 1}|I| . Then the assumption above is equivalent to L − → → − i ci · Vi = 0 . By the theory of linear algebra, the |I| + 1 vectors of the |I|-dimensional vector space {0, 1}|I| are linearly dependent. Therefore, such coefficients ci exist. ¤ 6.4. ATTACK WITH A SMALL NUMBER OF RESYNCHRONIZATIONS 85 The attack complexity is as follows. First we have to calculate (for a fixed clock t) the coefficients ci . This requires O(|I|3 ) operations. Then, the computation of the function of degree ≤ d − 1 is equivalent to the summation of (several) vectors of size Mkd−1 . The two steps have to be repeated about Mkd−1 times to get enough linearly independent equations of degree ≤ d − 1. The final step is to use Gaussian elimination to solve the linearized system of equations ≈ (Mkd−1 )3 . Therefore the overall number of operations is about ¢ ¡ 3 |I| + Mkd−1 · Mkd−1 + (Mkd−1 )3 . (6.19) L Note that it may happen that i ci · g(kt ⊕ ivit , zti ) is equal to zero for some t. For example, this can not be avoided if g is linear. But in this case, the cipher is weak anyhow. As opposed to fast algebraic attacks [51, 6], this approach does not require the highest-degree monomials to be independent of the key stream bits. Moreover, the number of key stream bits required is ≤ Mkd−1 + |I| instead of ≤ Mkd . On the other hand, fast algebraic attacks benefit from the fact that the most time consuming part can be sourced out in a precomputation step. This is not possible here. Another advantage is that its applicability is independent of the values of ivit and zti and that it does not require ϕ to be low. The Degree-e attack So far, we concentrated only on decreasing the degree by 1. But clever combinations may reduce the degree even further. In the worst case these combinations are linear, even if the degree of g is high. This is for example the case for the E0 key stream generator. The best other result was a system of equations of degree 3 found by Courtois [51]. We develop now the theory explaining how to compute the lowest possible degree. In the following, we treat k as ϕ unknown bits. Definition 6.4.2 We set S(g) := {g(k ⊕ iv, z) | iv ∈ {0, 1}ϕ , z ∈ {0, 1}} and define by < g >:=< S(g) > the linear span of S(g) (i.e., all possible linear combinations). < g > is a vector space over the finite field F2 . By dim g we define the dimension of < g > and by B(g) an arbitrary basis of < g >. Further on we set Md (g) to be all monomials of degree ≤ d which occur in S(g). From the theory of linear algebra, the following theorem is obvious: Theorem 6.4.3 A function of degree ≤ e exists in < g > only if the vectors in B(g) ∪ Me (g) are linearly dependent. Let S be a set of Boolean functions. We now describe an algorithm to compute a linearly independent set of functions < S > with the lowest possible degree e: We treat the functions in S as rows of a matrix where each column reflects one 86 CHAPTER 6. RESYNCHRONIZATION ATTACKS occurring monomial. Then, we apply Gaussian elimination in such way that the monomials with the highest degree are eliminated first and so on. Finally, we just pick those functions in the result with the lowest degree. If S = B(g), the algorithm computes the lowest possible value for e. Let B̃ := {g(kt ⊕ ivit , zti ) | 0 ≤ i ≤ R − 1} be the set of functions available to the attacker. < B̃ > might be only a subset of < g >. In this case, the lowest possible degree can be higher. We try now to estimate the complexity of the Degree-e attack. The first step is to find an appropriate linear combination in B̃. The effort is about (dim g)2 · |Md (g)| to find the linear combination and about Mke to compute the corresponding vector of size Mke . This has to be repeated at least Mke times. Finally, a system of equations in Mke has to be solved (≈ (Mke )3 ). Hence, the overall number of operations is about ((dim g)2 · |Md (g)| + Mke ) · Mke + (Mke )3 . (6.20) In the individual case, the applicability of this attack depends on many parameters: the function g, the number R of accessible frames and the corresponding values of ivi and zti . Hence the attack does not work in every case. On the other hand, it puts on the designer the responsibility of making sure that these attacks are not feasible. Moreover, if the set B̃ is a basis of < g >, then an equation of the lowest possible degree can be constructed. What is the probability that this happens? Let sg := dim g and R ≥ sg . If we assume that each expression g(kt ⊕ ivit , zti ) is a random vector in {0, 1}sg then by [166], the probability that B̃ is a basis of < g > is µ ¶ m Y 1 Prob = 1− i . (6.21) 2 i=R−sg +1 6.5 Resynchronization Attacks with Large ϕ The Daemen et al. attack only works when the number ϕ of inputs to the Boolean function is not too large. However, we will show in this section that using a linear resynchronization mechanism will inevitably induce weaknesses into stream ciphers, even when ϕ is very large. We will show a chosen IV attack, a known IV attack and an algebraic attack. 6.5.1 A Chosen IV Attack The standard attack has a large time complexity of dk/ϕe · 2ϕ evaluations of the function f , but it only requires ϕ resyncs. We will now show that a trade-off is possible. 6.5. RESYNCHRONIZATION ATTACKS WITH LARGE ϕ 87 Let kt in (6.11) consist of the bits k0 , k1 , . . . kϕ−1 . We make the reasonable assumption that in the chosen IV attack, the attacker can control the values of ivit , consisting of the bits iv 0 , iv 1 , . . . iv ϕ−1 . We now start the chosen IV attack. We first take a constant C. We then perform resyncs with all the values ivit = C ⊕ i, where we let i take all values going from 0 (00 . . . 0) to 2u − 1 (00. . . 011. . . 1) for some u. The impact of this choice of i is that the last u input bits of f will take all possible values. Of course we can do the same with any combination of u bits by choosing i as needed. Let us consider the first two values of our resynchronization attack. We denote ki ⊕ iv i as xi . We know that: ½ f (x0 , x1 , . . . xϕ−1 ) = z00 (6.22) f (x0 , x1 , . . . xϕ−1 ⊕ 1) = z01 . By XORing both equations and using Theorem 6.1.4 we get: f1 (x0 , x1 , . . . xϕ−2 ) = z00 ⊕ z01 , (6.23) where the Boolean function f1 has many properties that are desirable for an attacker. The degree of f1 is lower, it has fewer monomials and it depends on less variables than f . This makes many attacks much easier. In our attack, we will apply this method with 2u chosen IV s in an iterative way. As an illustration, these are the equations for u = 2. ¾ f (. . . xϕ−2 , xϕ−1 ) = z00 f1 (. . . xϕ−2 ) = ⇒ 1 0 ⊕ z f (. . . xϕ−2 , xϕ−1 ⊕ 1) = z01 z 0 0 ¾ ⇒ ... 2 f (. . . xϕ−2 ⊕ 1, xϕ−1 ) = z0 f1 (. . . xϕ−2 ⊕ 1) = ⇒ f (. . . xϕ−2 ⊕ 1, xϕ−1 ⊕ 1) = z03 z12 ⊕ z03 (6.24) The basic attack requires at every time 2u · ϕ resyncs, in order to obtain ϕ − u equations in the Boolean function fu (x0 . . . xϕ−u−1 ) which can then be used in a normal resynchronization attack. In practice, however, we note that the number of monomials, variables and the degree of the equation decreases very rapidly, making the attack work with very small complexity. We will give an example using the Boolean function of Toyocrypt in Sect. 6.8.2. 6.5.2 A Known IV Attack We now describe another attack, which shows that the linear resynchronization mechanisms introduces weaknesses in the fixed resync setting for all Boolean functions. The principle of the attack is similar to the linear cryptanalysis method, developed by Matsui for attacking block ciphers [118]. First we search for a 88 CHAPTER 6. RESYNCHRONIZATION ATTACKS linear expression for the Boolean function that holds with probability p 6= 0.5. We then collect sufficiently many resyncs such that we can determine key bits using a maximum likelihood method. We will now describe this in more detail. Our starting point is the fact that for any ϕ-input nonlinear Boolean function f , we can always find a subset S ⊂ {0, 1, . . . ϕ − 1} for which the equation M xi = f (x0 , . . . xϕ−1 ) (6.25) i∈S holds with probability 0.5 + ², where ² 6= 0. Suppose that the best bias we have found is ² (0 < ² ≤ 0.5). For each time t, with R known IV s, we get the following equations: L L 0 1 i∈S ki = i∈S iv i ⊕zt .. (6.26) . L L R−1 R−1 ⊕zt , i∈S ki = i∈S iv i each of which holds with probability 0.5 + ². We now count for how many of these equations the right hand side is 1 respectively 0. We assume then that the correct right hand side is the value (0 or 1) that occurs most if ² > 0, and the value that occurs least if ² < 0. We now have found one linear equation in the state bits that is true with some probability. This probability increases with the value of R and is dependent on the magnitude of ². As in [118], the probability that the equation is correct, given R resyncs and a bias ², is equal to: Z ∞ √ 2 1 √ · e−x /2 · dx = 0.5 + 0.5 · erf ( 2 · R · |²|) , (6.27) √ 2·π −2· R·|²| where erf is the error function. If we want the probability of correctness to approach one, we need the number of resyncs R to be c · ²−2 for a small constant value c. As explained in Sect. 2.2.2, the output of a Boolean function is correlated to at least one linear function of the inputs. The bias is at least the bias of bent functions, which is given by the equality in (2.17). In the following, we will use the nonlinearity of bent functions to establish an upper bound on the number of resynchronizations needed in practice to mount the attack. From (2.17), it follows that any linear resynchronization mechanism with a ϕ-input Boolean function f can be broken by the resynchronization attack using at most about 2ϕ+2 known IV s. The biases found in actual Boolean functions used in stream ciphers will be much higher than the lower bound given by (2.17) . This is due to several reasons: the functions have to be balanced, they have to be easily implementable, 6.6. ATTACKS ON COMBINERS WITH MEMORY 89 and for combiners they will also have to take into account the trade-off that has to be made between nonlinearity and resilience such as given by (2.19). It can be expected that most Boolean functions used in practice are vulnerable to this known IV attack on a linear resynchronization mechanism. 6.5.3 An Algebraic Attack As said in Sect. 6.4.2, the goal is to find a solution to the equations: g(kt ⊕ ivit , zti ), 0 ≤ i ≤ R − 1, 0 ≤ t ≤ T − 1 (6.28) Again, the approaches to reduce the degree as described in Sect. 6.4.2 can be also applied here. If all bits of kt are uniquely specified by the equations, one can use Gröbner bases or the linearization method to solve (6.28) clock by clock. If the degree of the equations is low (e.g., Toyocrypt), it may be faster and require less IV s than the approach described in Sect. 6.5.2. 6.5.4 The Degree-1 Attack Another approach is to apply the methods described in Sect. 6.4.2 if e = 1 is possible. In this case, we get at least one linear equation in the bits of kt directly. If we repeat this for enough values of t the corresponding K can be reconstructed by solving a system of linear equations. The exact effort depends on many parameters: the function g, the number R of frames, the corresponding values of ivi and zti and so on (see also Sect. 6.4.2). Due to (6.20), the number of operations is about or more generally ((dim g)2 · |Md (g)| + k) · k + k 3 (6.29) 3 3 ((Mϕ dg ) + k) · k + k (6.30) Expression (6.30) shows that the approach may be feasible if the degree is small. 6.6 Attacks on Combiners with Memory Many stream ciphers use their linear state in conjunction with a (small) nonlinear memory in order to avoid the trade-off between correlation immunity and nonlinearity for the combining function, see Rueppel [148]. In this section we demonstrate that resynchronization attacks can also be performed on stream ciphers with memory. Note that the known IV attack of Sect. 6.5 can also be applied on combiners with memory. 90 6.6.1 CHAPTER 6. RESYNCHRONIZATION ATTACKS A Standard Resynchronization Attack We will use the following model based on the general case of the combiners with memory: S0i M0i i St+1 i Mt+1 zti = A · K ⊕ B · IV i (6.31) = const = C · Sti (6.32) (6.33) = h(D · Sti , Mti ) = h(kt ⊕ ivit , Mti ) = f (D · Sti , E · Mti ) = f (kt ⊕ ivit , E (6.34) · Mti ) . (6.35) In practice, some designs only start outputting key stream when t = µ; this results in an improved diffusion of the key and the initialization vector into the nonlinear state. We will discuss both the cases µ = 0 and µ > 0. Note that in some designs M0i is also dependent on the key and the IV . This can in our model be treated in the same way as the case µ > 0. µ=0 In the case µ = 0, the resynchronization attack can easily be adapted to work also with combiners with memory. We again describe the attack with ϕ resyncs. The first series of outputs can be written as z0i = f (k0 ⊕ ivi0 , M0i ). Because M0i is known, the attacker can recover k0 by exhaustive search. He can then determine M1i for all the resyncs using (6.34). Now he knows all inputs to (6.35) for t = 1 except k1 , which he can again recover through exhaustive search, and so on. All complexities of this attack are exactly the same as for the case without memory. The only difference is that each step now consists of one evaluation of f and of ϕ evaluations of h. µ>0 The attacker does not know the initial contents of the m-bit memory M . Moreover, this memory is different for each resync. The attack now works exactly as above, except that the attacker will first have to guess the contents of M at t = µ. The time complexity of the attack now becomes 2ϕ·m · dk/ϕe · 2ϕ evaluations of the f and h function. As ϕ and m are quite small in most actual designs, the attacks are feasible. For larger memories (e.g., 32 bits), the complexity gets prohibitively large. Let us take as an example a combiner consisting of 5 LFSRs with total length 320 bits, and with 5 memory bits (i.e., ϕ = 5, k = 320 and m = 5 respectively). The complexity of the resynchronization attack is then equal to 236 function evaluations. 6.6. ATTACKS ON COMBINERS WITH MEMORY 91 Practical considerations of the attack As for the case without memory, we would like that the attack always works with ϕ resyncs. At some times, we will have several possible values for ki . In a second phase, we cannot use the exhaustive search method of the memoryless case, because we would then have to try all possible values for updating the memory bits, which would increase the complexity enormously. This problem can be easily overcome by implementing the algorithm with a depth-first search. When at some time t we have several possibilities for kt , we pick the first one and go to t + 1. If we have no solution at time t, we go back to t − 1 and try the next possibility there. When we have arrived at time dk/ϕe, we have found a sufficient number of values and we check if we have found the correct key. Simulations indicate that when the number of resyncs R is equal to or larger than ϕ, the attack will find the correct values very quickly and has to search very few states. The same approach can also be used when the attacker disposes of less than ϕ resyncs, i.e., R < ϕ. In the case µ > 0 this may even be advantageous from a complexity viewpoint, because we have to perform an exhaustive search over less than ϕ initial memory states. But again the complexity of the search algorithm will increase exponentially with decreasing R, making this attack feasible only when R is close to ϕ. A particularity is the case when the Boolean output function f is linear. In that case we don’t get new information at each new resync, because all equations z0i = f (k0 ⊕ ivi0 , M0i ) are equivalent. This problem can be easily overcome by using the memory update function h to do the checks during the search. An example of such a linear output function is E0, on which we will describe some attacks in Sect. 6.8.1. 6.6.2 Using Ad-hoc Equations Another possibility is the use of ad-hoc equations which have been introduced in [8]. The authors showed that for a combiner with memory with m memory bits Mt , an equation i i F (Sti , . . . , St+m , zti , . . . , zt+m )=0 (6.36) of degree ≤ d ϕ·(m+1) e always exists which is completely independent of the mem2 ory bits. They also propose an algorithm to find ad-hoc equations i i G(Sti , . . . , St+r−1 , zti , . . . , zt+r−1 )=0 (6.37) with the lowest possible degree d if ϕ · r is not too large. For example, an ad-hoc equation of degree 4 using r = 4 successive clocks exists for the E0 key stream generator. 92 CHAPTER 6. RESYNCHRONIZATION ATTACKS As (6.37) is independent of the memory bits, these equations can be used for all attacks described in the previous sections. An additional requirement is now that the attacker knows enough successive key stream bits. If ϕ · r is small the Daemen et al. attack is applicable. The number of operations is about dk/(ϕ · r)e · 2ϕ·r + k 3 (6.38) The methods described in Sect. 6.4.2 to reduce the degree of the equations can be easily adapted to the case of ad-hoc equations. 6.7 Attacks on Stream Ciphers with Nonlinear Resynchronization In this section, we will show that stream ciphers with a nonlinear resynchronization mechanism can also be vulnerable to resynchronization attacks. A first attack is a chosen IV attack; its principle is similar to that of Daemen et al. The second attack is a known IV attack that uses the principle of linear approximations as used in the known IV attack of Sect. 6.5.2. We will demonstrate these attacks on the two-level memoryless combiner. The framework of the attack is shown in Fig. 6.1. In a first level, the key and an IV are linearly loaded into the LFSRs. The input to f at time t of the level 1 initialization is denoted xt . The following holds: xt = (x0t , x1t , . . . , xϕ−1 ) = kt ⊕ivt = (kt0 ⊕iv 0t , kt1 ⊕iv 1t , . . . , ktϕ−1 ⊕iv ϕ−1 ) . (6.39) t t The output yi of level 1 is collected and is used as the initial state for the level 2 generator. We take here the simplifying assumption that this is done as shown in the figure. Level 2 generates the key stream zi . LEVEL 2 LEVEL 1 LFSR0 ...x20 x10 x00 LFSR1 ...x12 x11 x10 ... f y0 y1 y 2 ... LFSR0 ... y2ϕ yϕ y0 LFSR1 ... y2ϕ +1 yϕ +1 y1 ... ϕ −1 ϕ −1 ϕ −1 LFSRij-1 ...x2 x1 x0 LFSRij-1 ... y3ϕ −1 y2ϕ −1 yϕ −1 Figure 6.1: Model for a two-level combiner f z0 z1 z 2 ... 6.7. NONLINEAR RESYNCHRONIZATION 6.7.1 93 A Chosen IV Attack We will show an attack scenario on this construction which holds under the assumption that the attacker can choose the value of the ivs which go into the Boolean function (this is for instance the case when the initial state equals the XOR of the key and the initialization vector). We start with t = 0. We let iv0 take R different values, while keeping the other ivs constant. We obtain the following equations: 1 1 f (f (k0 ⊕ iv02 ), y1 , . . . yϕ−1 ) = z02 f (f (k0 ⊕ iv0 ), y1 , . . . yϕ−1 ) = z0 (6.40) ... R f (f (k0 ⊕ ivR 0 ), y1 , . . . yϕ−1 ) = z0 . Denote the vector (y1 , . . . yϕ−1 ) by v. In half of the cases, we will see that all z0i are equal (either all 0 or all 1). This is due to the fact that f (0, v) = f (1, v). In the other half of the cases, which is of interest here, the z0i take both the values 0 and 1 in a random-looking way; which is due to the fact that f (0, v) 6= f (1, v). Assume that we are in the latter case. We now guess the ϕ − 1 bits of v, let us call this guess g. If f (0, g) = f (1, g), then our guess is certainly not correct and we proceed to the next guess. If f (0, g) 6= f (1, g), the guess is possible; we now get R equations in k0 of the form: f (k0 ⊕ ivi0 ) = z0i ⊕ f (0, g) . (6.41) The equations we have obtained are exactly the same as in the case of the linear resynchronization mechanism. If R is large enough, we expect to find a unique solution for k0 over all the guesses for g. It can be shown that we need about 2 · ϕ chosen IV s to achieve this. In the same way, we can also recover k1 , k2 , . . . which gives us the whole secret key of the system. The attack requires a total of about 2 · k chosen IV s and has a time complexity of dk/ϕe · 22·ϕ−1 . This attack has been implemented on various 8-bit Boolean functions, and we can easily recover the key. 6.7.2 A Known IV Attack In this attack, we extend the approach in which we search a bias in the ϕ-input Boolean function (see the known IV attack in Sect. 6.5) in a straightforward way to the case of the two-level combiners. Each bit now goes twice through the function f , but we can show that a bias still persists. The equation we want to hold is as follows: M M kj = iv ij ⊕ z0i , (6.42) j∈BS j∈BS 94 CHAPTER 6. RESYNCHRONIZATION ATTACKS where the set BS consists of the bits of the set S involved in the linear bias, for all times t ∈ S of the first level of the combiner. Similar equations can be written for the next iterations. Let us denote the cardinality of the set S by cS . Of course it holds that cs ≤ ϕ. The cardinality of the set BS is then evidently c2S . The piling-up lemma (2.28) learns that the probability that this equation holds is equal to 1/2+2cS ·²cS +1 , where ² is the bias of the Boolean function. The bias of this equation is thus 2cS ·²cS +1 , which means we need R = 2−2·cS ·²−2·cS −2 resyncs to break this system by a known IV attack. We will show what this implies for actual Boolean functions. We take Boolean functions with ϕ inputs and resilience ρ, and use the two bounds given by (2.17) and by (2.19). We conjecture that we will find the bias for ρ+1 in practice. We will certainly find a bias for ρ + 1, as f is ρ-th order correlation immune but not ρ + 1-th order correlation immune. As the cases we discuss are optimal from a designer’s point of view, we expect the Walsh spectrum to be flattened as much as possible over the values with Hamming weight > ρ and therefore to find a bias (very close) to ² for Hamming weight ρ + 1. This conjecture can be easily verified for a popular class of functions, the plateaued functions [167]. The cardinality of the set S is then cS = ρ + 1. We can now calculate an upper bound for the number of resyncs needed for a successful attack as a function of ϕ and of the resilience ρ. This is shown in Fig. 6.2 for some values of ϕ. These graphs show that memoryless combiners with few inputs cannot be made resistant against resynchronization attacks on a twolevel combiner. For larger functions it should be checked whether the Boolean function is strong enough to withstand the above attack. The bounds given here may be refined by a more careful examination of the properties of the various classes of Boolean functions. 6.7.3 Implications of the Attacks The known IV attack described above for the two-level memoryless combiner can be easily extended to other nonlinear resynchronization mechanisms. It is also possible to apply the attack on other designs, such as combiners with memory, nonlinearly filtered generators and irregularly clocked shift registers. We can use techniques as described by Golic [82, 83] to find suitable linear approximations. Our attack can be used to evaluate the strength of any resynchronization mechanism, and resistance against this attack is a minimum requirement for any design. 6.8. APPLICATION TO SOME KEY STREAM GENERATORS 95 250 phi=5 phi=10 phi=15 phi=20 log2(Maximum number of resyncs needed) 200 150 100 50 0 0 2 4 6 8 10 12 resiliency of the function 14 16 18 20 Figure 6.2: Upper bound for the number of resyncs as a function of the resilience ρ with as parameter the number ϕ of input bits of the Boolean function 6.8 Application to some Key Stream Generators In this section we will describe some actual implementations of the attacks described in the previous sections. We will describe attacks on E0, Toyocrypt and the summation generator. 6.8.1 Attacks on E0 A brief description of E0, the stream cipher used in the Bluetooth standard, is given in Appendix A.2. We will now show a resynchronization attack on one level of E0. We want to stress that the result below does not imply an immediate weakness. The reason is that the Bluetooth encryption consists of a two level encryption. Therefore any known differences at the beginning of level 1 do not produce known differences after level 2. On the other hand, the Bluetooth encryption scheme would be in danger if an ad-hoc equation would be known which depends only on the input of level 1 and the output of level 2 (= the key stream). The methods developed in [8] are not practical in this case, because the output of level 1 is permuted before it is used as the input of level 2. Still other ways to find such an equation may exist. Further on, our attacks show that if a feasible attack on level 2 of E0 can be found, it is very easy to recover the actual secret key used in level 1. 96 CHAPTER 6. RESYNCHRONIZATION ATTACKS Known IV attack on level 1 of E0 In this section, we describe a resynchronization attack on level 1 of E0 such as described in Sect. 6.6.1. This attack is similar to the attack of S. Fluhrer described in [78]. We show that this attack can be put into the general framework of resynchronization attacks and hence gets very efficient once R ≥ ϕ. In this attack we assume that we have recovered the state of E0 for some instances of level 2. This could have been done by implementing attacks on frames of level 2, e.g., an algebraic attack as described by [6]. We now want to recover the actual secret key which is used in level 1. In our attack, we will use two properties of E0. The first is the above mentioned fact that the final state of level 1 of the blender is also the initial state of level 2. This means that we will know the final state of the blender in level 1 in our resynchronization attack. The second is that the blender can be run backwards [150]. Namely, apart from i the normal equation Mt+1 = h(kt ⊕ ivit , Mti ), we can also easily determine the 0 i i ). function h such that Mt = h0 (kt ⊕ ivit , Mt+1 Our attack is now a regular resynchronization attack on combiners with memory and µ = 0, with as only particularity that the attack is executed on the level 1 generator running backwards in time. The attack has been implemented in the depth-first search manner described above. At each iteration we take the next state that fulfills all equations generated by the LFSRs, by f and by h0 . This attack has been implemented in C and turns out to be very efficient. Three resyncs already enable us to recover the secret key in several minutes. But it is only when the number of resyncs is equal to 4 (the number of LFSRs) that the attack gets very efficient. This is because the search that has to be done is very limited, as can be predicted by our framework for the resynchronization attack. The attack runs in a few milliseconds on a PC. The search algorithm only needs to go through a couple of thousands guesses for the states. If the number of resyncs is larger than 4, the algorithm is even faster because even more conditions are imposed and even less values need to be searched. Actual algebraic attacks The secret key K = (a0 , . . . , d38 ) is the initial state of four LFSRs A, B C and D. Let at , bt , ct and dt denote their output bits at clock t. We define σt1 := at ⊕ bt ⊕ ct ⊕ dt , σt2 := at bt ⊕ at ct ⊕ at dt ⊕ bt ct ⊕ bt dt ⊕ ct dt , σt3 := at bt ct ⊕ at bt dt ⊕ at ct dt ⊕ bt ct dt , σt4 := at bt ct dt . 6.8. APPLICATION TO SOME KEY STREAM GENERATORS 97 Table 6.3: Attacks on the E0 key stream generator Attack Equation Effort Degree-d -1 Attack (6.19) 254.25 Degree-1 attack Attack (6.29) 225.38 Daemen et al. with ad-hoc eqs. (6.38) 223.32 Table 6.4: Theoretical probability of having a basis R Prob 39 28.9 40 57.8 41 77 42 88 43 93.9 44 96.9 45 98.4 46 99.2 Then the ad-hoc equation found in [8] is G(at−1 , . . . , dt−1 , . . . , at+2 , . . . , dt+2 , zt−1 , . . . , zt+2 ) = (zt−1 ⊕ zt ⊕ zt+1 ⊕ zt+2 ) ⊕ σt1 · (zt−1 ⊕ zt+1 ⊕ zt+2 ) · (1 ⊕ zt )⊕ ⊕σt2 · (zt−1 ⊕ zt ⊕ zt+1 ⊕ zt+2 )⊕ 1 1 · σt1 · (1 ⊕ zt ) · (zt+1 )⊕ · (1 ⊕ σt2 ) · (zt+1 ) ⊕ σt+1 ⊕σt+1 2 1 1 2 · σt2 . ) · (1 ⊕ zt ) ⊕ σt4 ⊕ σt+1 ⊕ σt−1 ⊕ σt+2 ⊕σt1 · (σt+1 Obviously, we have r = d = 4 and |I| = 1 . We calculated that dim G = 39, |Md (G)| = 213, e = 1 (the minimum possible degree) and M1 = n = 128, M4 ≈ 223.07 and M3 ≈ 218.32 . Table 6.3 shows the estimated efforts for the attacks. i i i )| 0 ≤ i ≤ R − 1} be the set of , zt+3 , zt+2 Let B̃ := {G(kt ⊕ ivit , zti , zt+1 functions available to the attacker. Using (6.21), we calculated the probability of B̃ being a basis of V (G), depending on the number R of resyncs as shown in Table 6.4. Additionally, we’ve checked the estimation by repeating the following test 3000 times: for a fixed random value of k0 we generated randomly ivi0 and the corresponding expressions G(k0 ⊕ ivi0 , z0i , z1i , z2i , z3i ), until the number of linearly independent expressions was equal to dim G = 39. It turned out that the probabilities computed with (6.21) were too optimistic. Table 6.5 shows the results of our test. In more than 90% of all cases, the number R of frames was between 39 and 50, and in more than 99% between 39 and 59. Hence, we assume that the Degree-e attack and the Degree-1 attack work against E0 with a high probability for about 59 resynchronizations. Additionally, we expect similar effects for other key stream generators. 98 CHAPTER 6. RESYNCHRONIZATION ATTACKS Table 6.5: Experimental probability of having a basis R≤ % 39 1.2 40 6.5 41 16 42 28.2 43 41.7 ... ... 49 89.1 50 91.8 ... ... 58 98.8 59 99.1 Table 6.6: Experimental probability of having a basis for Daemen’s attack R≤ % 19 0.1 20 0.3 21 1 22 1.9 ... ... 33 55.3 ... ... 43 89.5 44 91 ... ... 58 98.5 59 99.2 We did similar tests concerning the Daemen et al. attack using ad-hoc equations. I.e., we did 6000 times the following test: we reinitialized the cipher for random iv’s until 9 bits of information of k0 were uniquely specified. In fact, due to the form of G, only 8 of the ϕ · r = 16 bits can be reconstructed directly. The other 8 bits occur only as a linear combination. It turned out that R was always ≥ 19. Table 6.6 shows the results of our test. In more than 90% of all cases, the R was between 19 and 44, and in more than 99% between 19 and 59. As expected, the Daemen et al. attack needed less frames than the Degree-1 attack. We simulated the Degree-1 attack and the Daemen et al. attack on the whole E0 key stream generator independently 500 times on a computer. I.e., for a fixed unknown key K we generated new frames with random known differences until the bits a1 , b1 , c1 , d1 ,. . . ,a39 , b39 , c39 , d39 could be reconstructed. The attack could be improved by using the fact that a1 , . . . , a25 uniquely determine a26 , . . . , a39 etc. or by reconstructing only a part of the key and guessing the missing bits. Table 6.7 displays the results. It shows how many times a given number of frames was needed. E.g., in 12 resp. 39 cases out of 500, the Degree-1 attack resp. the Daemen et al. attack needed 47 frames. In our simulations, the Degree-1 attack needed on average more frames. The same holds for the time effort: Degree-1 attack needed on average 239 s, the Daemen et al. attack only 91 s. 6.8.2 Attacks on Toyocrypt Description of Toyocrypt Toyocrypt was submitted to Cryptrec, the Japanese government call for cryptographic primitives, and was accepted to the second phase but has finally been rejected. A description can be found in [126]. At each clock t the key stream bit 6.8. APPLICATION TO SOME KEY STREAM GENERATORS 99 Table 6.7: Computer simulations of the Degree-1 attack and the Daemen et al. attack using ad-hoc equations on the E0 key stream generator # Frames Deg-1 att. Daemen # Frames Deg-1 att. Daemen # Frames Deg-1 att. Daemen 36 0 1 49 49 36 62 10 9 37 0 4 50 41 31 63 9 3 38 0 6 51 55 23 64 9 7 39 0 8 52 48 23 65 7 7 40 0 9 53 44 28 66 3 3 41 0 14 54 41 14 67 1 3 42 0 13 55 26 9 68 5 4 43 0 20 56 22 19 69 2 3 44 0 30 57 18 9 70 3 2 45 1 21 58 12 8 71 2 5 46 2 26 59 15 8 72 3 5 47 12 39 60 10 9 73 1 2 48 36 29 61 6 6 74 0 2 zt is produced by P62 t t t kt kt kt ⊕ ⊕ i=0 kit kαt i ⊕ k10 ) = k127 = f (ULt (K)) = f (k0t , . . . , k127 Q6223 t32 42 t t t t t t t t t t t t t t t t ⊕k1 k2 k9 k12 k18 k20 k23 k25 k26 k28 k33 k41 k42 k51 k53 k59 ⊕ i=0 ki , (6.43) with {α0 , . . . , α62 } being some permutation of the set {63, . . . , 125}. zt Algebraic attacks t t ) provides two ) resp. (1 ⊕ k42 As noticed in [55], multiplying f with (1 ⊕ k23 ad-hoc equations G1 and G2 of degree 3. Due to the large number ϕ · r of key bits used in G1 and G2 (ϕ = 126, r = 1), the computation of dim G1 resp. dim G2 in the naive way is infeasible. Therefore, ≈ 218.42 and we concentrate only on the Degree-d -1 attack. As |I| = 1, M128 3 13.01 128 , the number of operations is about M2 ≈ 2 ¡ 3 ¢ 3 18.42 |I| + M128 · M128 + (M128 ) · 213.01 + (213.01 )3 ≈ 239.04 . 3 2 2 ) ≤ (1 + 2 Chosen IV attack We can see, for instance, that only the monomial x24 · x63 in (6.43) contains the variable x63 . We now perform a chosen IV attack as in Sect. 6.5.1 with two IV s that only differ in bit x63 . Eliminating all terms as in (6.22) gives us the following simple equation for t = 0: x24 = z00 ⊕ z01 , (6.44) 100 CHAPTER 6. RESYNCHRONIZATION ATTACKS Table 6.8: The Degree-d -1 Attack, the Degree-e attack and the Daemen et al. attack on the summation generator with 4 LFSRs and a key size of 128 Attack Equation Effort Degree-d -1 Attack (6.19) 255.25 Degree-eAttack (6.29) 222.34 Daemen et al. using ad-hoc eqs. (6.38) 223.32 which is a simple linear equation in the linear state bits. We need just n such equations to determine the entire linear state of the algorithm. We can obtain these easily by performing more resyncs as above for other times, and by doing resyncs at the same time but flipping another bit that also only appears in monomials of degree 2 (there are many of them here). 6.8.3 The Summation Generator with 4 Inputs The summation generator is a simple combiner with memory proposed by Rueppel [148]. The number ϕ of inputs (i.e., the number of LFSRs) can be chosen arbitrarily. In [114], the authors proved that an ad-hoc equation of degree ≤ 2dlog2 ϕe always exists if r = dlog2 ϕe + 1 clocks are considered. We consider only the case of ϕ = 4 inputs and use the ad-hoc function G of degree 4 given in [114]: 0 = 1 2 1 σt4 ⊕ (1 ⊕ zt ) · σt3 ⊕ σt2 · σt+1 ⊕ (1 ⊕ zt+1 ) · σt2 ⊕ σt+1 ⊕ σt+2 ⊕ zt+2 1 1 1 1 ⊕(1 ⊕ zt ) · σt · σt+1 ⊕ (1 ⊕ zt ) · (1 ⊕ zt+1 ) · σt ⊕ (1 ⊕ zt+1 ) · σt+1 . We calculated that |I| = 1, dim G = 19, |Md (G)| = 69, e = 1 and appreciated by ≈ 223.32 resp. ≈ 218.43 . Table 6.8 displays our estimations resp. M128 M128 3 4 for some attacks. To be comparable with the other results in this chapter, we have chosen a key size of 128 bits. 6.9 Conclusions In [58], Daemen, Govaerts and Vandewalle presented the original resynchronization attack on synchronous stream ciphers. In this chapter, we have extended this resynchronization attack in several directions, by using new attack methods and by combining the attack with cryptanalytic techniques such as algebraic attacks and linear cryptanalysis. 6.9. CONCLUSIONS 101 Our attacks on linear resynchronization mechanisms show that a linear resynchronization mechanism should never be used in practice. Even if the system uses few resyncs, has an input function with many inputs and has a non-linear memory, it will still very likely contain weaknesses that can be exploited by one of our attack scenarios. Nowadays, resynchronization mechanisms are typically designed in an ad hoc manner, by making them nonlinear to an extent that seems to be sufficient. Our attacks on nonlinear resynchronization mechanisms lead to a better understanding of the strength of such mechanisms and can be used to provide a lower bound for the nonlinearity required from a secure resynchronization mechanism. This allows designers to consider the trade-offs between the speed and the security of a resynchronization mechanism. 102 CHAPTER 6. RESYNCHRONIZATION ATTACKS Chapter 7 Implementation of Synchronous Stream Ciphers In the previous chapters, we discussed several classes of cryptanalytic attacks on synchronous stream ciphers, and deduced the cryptographic properties that the stream cipher building blocks should satisfy in order to withstand this cryptanalysis. Our treatment so far has been mainly theoretical. In this chapter, we will take a more practical perspective. The mathematical building blocks need to be implemented in practical applications. This implementation has to be both efficient and secure. In Sect. 7.1, we discuss efficient implementation of stream ciphers, focusing on hardware. We discuss the area, performance and energy consumption of the building blocks in Sect. 7.1.1, and present a case study, namely the search for efficient Boolean functions that can be used in a nonlinear filter generator, in Sect. 7.1.2. Section 7.2 then discusses secure implementations of a cryptanalytic design in hardware. This chapter is based on results previously published by us in the articles [16, 33, 112]. 103 104 CHAPTER 7. IMPLEMENTATION 7.1 Efficient Hardware Implementation of Synchronous Stream Ciphers 7.1.1 Area, Performance and Energy Area An important criterion in the implementation of stream ciphers, is the area that is occupied by the primitive in CMOS standard cell based hardware. Especially in applications with limited resources, such as for instance RFID tags, restrictions on the primitives that can be implemented on the device can be very severe and AES may not be adequate. This is why eSTREAM launched a call for hardwareoriented stream ciphers that are suited for applications with restricted resources such as limited storage, gate count, or power consumption. The area needed to implement a stream cipher will clearly depend largely on the technology (e.g. the 0.15 micron and 65 nanometer technologies), and therefore a measure that is more independent of the technology is used: a NAND gate, the standard cell in CMOS design, is considered to have unit area, and the area of the stream cipher and its building blocks is expressed in Gate Equivalents (GE). Table 7.1 lists the number of GE for some common building blocks in stream cipher design. There can be some variations on these numbers depending on the library used. When the building blocks are combined into actual stream ciphers in practice, some overhead is introduced such as buffers and wiring area. Table 7.1 can be used as a guideline to find the gate count of the operations used in existing and newly proposed stream ciphers. This method is also useful to compare different designs. Table 7.1: Gate count of some common stream cipher building blocks Operation XOR gate AND gate OR gate Register (Flip-Flop) RAM (per byte) n-bit LFSR 4-bit Adder 8-bit Adder GE 2.5 1.5 1.5 8 to 12 12 8 · n to 12 · n 20.5 91.5 7.1. EFFICIENT HARDWARE IMPLEMENTATION 105 Performance Performance for bulk encryption. The most common measure for the performance of a stream cipher is its throughput, the number of bits it can produce per second. In general the throughput can be defined as follows: throughput = N · CF , (7.1) where N is the number of bits produced in every clock cycle, and CF represents the clock frequency. Traditional stream cipher designs, such as nonlinear filters and combiners typically only produce one bit at every clock cycle. So to have good throughput they typically need a very high clock frequency, which can be achieved by making sure that the critical path of the design is very short. A second way to achieve a high clock frequency is to use a pipeline structure. Stream ciphers with tight feedback loops (meaning that the current input depends on recent outputs) cannot be pipelined. Hence it is essential to avoid tight feedback loops in stream cipher algorithms. Because the bit serial design clearly poses limitations on the throughput, designers have started moving away from bit serial design and have adopted a word serial approach, where a whole word (8, 32 or 64 bits) of bits is output instead of a single bit. Some interesting proposals, notably Trivium [65] and Grain [97] have been presented at the eSTREAM project, but further cryptanalytic assessment is required. Performance in applications. In many applications (GSM, Bluetooth, Internet, . . . ), frequent resynchronization of a synchronous stream cipher is needed: the ciphertext is divided into small packets which are encrypted and sent separately. In GSM such a packet is 224 bits, in Bluetooth it is at most 2745 bits. Half of the packets circulating on the Internet are only 512 bits long. To prevent the reuse of key stream, at every resynchronization the secret key is combined with an IV unique to every packet. As explained in Chapter 6, this resynchronization needs to be performed in a sufficiently nonlinear way. Resynchronization thus induces some overhead: the algorithm has to operate for several clocks before it can start outputting information. The shorter the packets, the higher this overhead becomes. To quantify this overhead, we define the Resynchronization Factor (RF) as follows: RF = total number of clocks performed . number of clocks performed to generate key stream (7.2) The RF will increase the clock frequency required from the design to achieve 106 CHAPTER 7. IMPLEMENTATION a certain throughput in the following way compared to Eq. (7.1): CF = throughput · RF N (7.3) In order to be usable in applications, stream ciphers should achieve a good trade-off resulting in a secure and fast resynchronization mechanism. In Fig. 7.1, we present the RF as a function of the length of the packets for the three commonly used stream ciphers A5/1, E0 and RC4. For RC4, we have assumed that the first 256 bytes of key stream are discarded. This is done in most applications to overcome some weaknesses of the RC4 algorithm. 10 RC4 E0 A5/1 9 Resynchronization Factor (RF) 8 7 6 5 4 3 2 1 0 2 10 3 10 Length of the packet in bits 4 10 Figure 7.1: RF as a function of the packet length for A5/1, E0 and RC4 We make the following observations: • For equally secure resynchronization mechanisms, there is a strong correlation between the size of the internal state of the stream cipher and the 7.1. EFFICIENT HARDWARE IMPLEMENTATION 107 RF. Stream ciphers with a large internal state will typically require more time for initialization. • As follows from our cryptanalytic treatment of resynchronization attacks in Chapter 6, a good resynchronization mechanism should have properties that are very similar to a good block cipher. As an illustration, the eSTREAM candidate Sosemanuk [18] uses a block cipher for its resynchronization mechanism. Therefore, before it starts generating key stream, a stream cipher should perform about the same work as a block cipher encryption of data that has the size of its internal state. It follows that the performance of a stream cipher for short packets (less than a small multiple of its internal state size) will be slower than that of block ciphers. It is therefore paradoxical that Bluetooth and GSM use stream ciphers for the encryption of their typically small packets. A5/1 needs 414 clocks to encrypt a packet, of which 186 clocks (almost half) are needed for resynchronization. An explanation why these standards opted for a stream cipher primitive lies in the hardware complexity. Some stream ciphers can be implemented in less area than block ciphers. Due to the resynchronization factor they will pay a price in performance, but this was apparently a lesser problem to the designers. Energy consumption The power consumption in CMOS circuits is primarily attributed to the voltage changes on load capacitors and is given by, 2 PSW IT CHIN G = α0←1 CL Vdd fclk (7.4) where fclk is the clock frequency, Vdd is the operating supply voltage, CL is output load capacitance and α0←1 is the node transition activity factor, the fraction of the time of the node makes a power consumption transition (0 ← 1 and 1 ← 0 transitions) inside the clock period. The energy consumed by a circuit can be divided into two main categories: the fundamental energy, which is the energy necessary to calculate the intended result. Then there is the overhead energy, this is the energy associated with all overhead functions, e.g. moving data around, logic that switches but does not create outputs, etc. Two common measures Derived from the three main characteristics of the implementation – area, performance and energy consumption – we can further derive two notions that are 108 CHAPTER 7. IMPLEMENTATION importance when comparing various designs: Throughput Area • bits ( sec·m 2 ): This measure gives a good indication of the area versus performance trade-off of the stream cipher. Typically, several trade-offs are possible for a design, where more throughput can be achieved at the expense of a larger area. Comparing the graphs that give the throughput as a function of the area for various stream ciphers permits a fair comparison of these designs. Of course the relation between throughput and area is largely dependent on the technology, so the same benchmark should be used when comparing the data. • Data Energy bits ( Joule ): Many new applications will end up into battery operated devices. For these devices, the instantaneous power consumption is less relevant, but the amount of encryptions per energy unit (Joule) will determine the lifetime of the device. 7.1.2 Efficient and Secure Boolean Functions for the Filter Generator From the analysis in Chapters 3 to 6, it follows that two substantial properties a good Boolean filter function should satisfy are high algebraic immunity and high nonlinearity. Besides, to be used in practice, they should be efficiently implementable in hardware. In this section, two classes of functions that have a low-cost hardware implementation are presented. The first is a class of symmetric functions that have an optimal algebraic immunity, but very low nonlinearity. The second class, the power functions, have a high nonlinearity and reasonably high algebraic immunity. Symmetric Functions Symmetric functions [39] are functions with the very interesting property that their hardware complexity is linear in the number of variables. A symmetric function is a function of which the output is completely determined by the weight of the input vector. Therefore, the truth table vf = (v0 , . . . , vϕ ), also called value vector, of the symmetric function f on Fϕ 2 reduces to a vector of length ϕ + 1, corresponding with the function values vi of the vectors of weight i with 0 ≤ i ≤ ϕ. We have identified the following class of symmetric functions with maximum AI: Theorem 7.1.1 The symmetric function in Fϕ 2 with value vector § ¨ ½ 0 for i < ϕ2 vf (i) = 1 else (7.5) 7.1. EFFICIENT HARDWARE IMPLEMENTATION 109 has maximum § ¨ AI. Let us denote this function by Fk where k is equal to the threshold ϕ2 . Proof. First we show that the function Fd ϕ e ⊕ 1 only has annihilators of degree 2 § ¨ greater than or equal to ϕ2 . The annihilators of Fd ϕ e ⊕ 1 are 0 in all vectors of 2 § ¨ weight less than or equal to ϕ2 . Consequently, the terms which appear in the ANF § ϕ ¨ of the function correspond with vectors of weight greater than or equal to 2 by definition of the ANF. Thus, no linear combination can be found in order to decrease the degree of the resulting function. The transformation (x1 , . . . , xϕ ) 7→ (x1 ⊕ 1, . . . , xϕ ⊕ 1) for all ϕ will map a symmetric function f with value vector vf to a symmetric function with value vector equal to the reverse of this value vector, i.e., vfr . Consequently, the function Fd ϕ e and Fd ϕ e ⊕1 are affine equivalent under affine transformation (complemen2 2 tation) in the input variables for ϕ odd. For ϕ even, the function Fd ϕ e is affine 2 equivalent with Fd ϕ e+1 ⊕1. The proof explained above can also be applied on the 2 annihilators of the function Fd ϕ e+1 ⊕ 1 for ϕ even. Finally, the theorem follows 2 from the fact that functions which are affine equivalent in the input variables have the same number of annihilators of fixed degree. ¤ By Proposition 2 and Proposition 4 of [39], the degree of these functions are determined as follows. Theorem 7.1.2 The degree of the symmetric function Fd ϕ e on Fϕ 2 is equal to 2 blog2 ϕc 2 . If ϕ is odd, ¥ ¦these functions are trivially balanced because vf (i) = vf (ϕ−i)⊕1 for 0 ≤ i ≤ ϕ2 . As shown in Proposition 5 of [39], trivially balanced functions satisfy the following properties: • The derivative D1 f with respect to 1 is the constant function: f (x)⊕f (x⊕1) is constant. • Wf (x) = 0 for all x with wt(x) even. For ϕ even, the functions are not balanced. But by XORing with an extra input variable from the LFSR, this property is immediately obtained. Another problem is that the nonlinearity of these functions is not high. In particular, µ ¶ ϕ−1 maxϕ |Wf (w)| = 2 ϕ−1 (7.6) w∈F2 2 110 CHAPTER 7. IMPLEMENTATION ¡ ϕ ¢ for odd ϕ and equal to ϕ/2 for even ϕ. Therefore, ² ≈ 2−3.15 , 2−3.26 , 2−3.348 for ϕ = 13, 15, 17 respectively. Note that the nonlinearity increases very slowly with the number of inputs ϕ: even for 255 input bits, we only get a bias of 2−5.33 . Remark. The nonlinearity of this class of symmetric functions corresponds to the nonlinearity of the functions with maximum AI that are obtained by means of the construction described in [61]. This construction has the best nonlinearity with respect to other constructions that have a provable lower bound on the AI presented in literature so far. An extensive study on the AI and nonlinearity of symmetric Boolean functions is performed in [31]. It has been shown that there exists for some even dimensions other classes of symmetric functions with maximum AI which have slightly better nonlinearity, but still far too small to be resistant against the distinguishing attack. Also, no symmetric function with better bias in nonlinearity compared with the AI has been found in [31]. To summarize, we have identified a class of symmetric functions with very low hardware requirements and with maximal algebraic immunity. However, the nonlinearity of the design is not good and there may be other problems related to the trivial balancedness. One may consider to use a symmetric function as a building block to increase the algebraic immunity (by direct sum) of a design at a low implementation cost. Power Functions The idea of using a filter function f derived from a power function P on Fϕ 2 is as follows: we consider the ϕ input bits to the function P as a word x in F2ϕ . We then compute the p-th power, y = xp , of this word. The output of our Boolean function f is then one bit yi for i ∈ {0, . . . , ϕ − 1} of this output word y = (y0 , . . . , yϕ−1 ). Note that all these functions for i ∈ {0, . . . , ϕ − 1} are linearly equivalent to the trace of the power function P [81]. We now discuss the nonlinearity, algebraic immunity and implementation complexity of some interesting power functions. Nonlinearity. We will investigate Boolean functions derived from highly nonϕ linear bijective power functions. These functions have bias ² = 2− 2 for ϕ even ϕ 1 and 2− 2 − 2 for ϕ odd, which is very close to the ideal case, the bent functions. An overview of the different known classes of highly nonlinear power functions on F2ϕ of degree ≥ 9 is presented in Table 7.2, where the degree of the power function is the Hamming weight of the power p [135]. It is an open problem whether this table is complete or not. Algebraic immunity. In [40], the AI of the Boolean functions derived from the highly nonlinear power functions (see Table 7.2) is computed up to dimension 7.1. EFFICIENT HARDWARE IMPLEMENTATION 111 Table 7.2: Known highly nonlinear power functions on F2ϕ of degree ≥ 9 Exponent 2ϕ − 2 22k − 2k + 1 with k < and gcd(k, ϕ) = 2 22k − 2k + 1 with k < and gcd(k, ϕ) = 1 P ϕ2 ik with k < ϕ2 i=0 2 and gcd(k, ϕ) = 1 ϕ−1 ϕ−1 2 2 +2 4 −1 3ϕ−1 ϕ−1 2 2 +2 2 −1 ϕ 2 Condition on ϕ all ϕ ϕ mod 2 ≡ 0 Name + Reference Inverse [135] Kasami class [106] ϕ 2 ϕ mod 2 ≡ 1 Kasami class [106] ϕ mod 4 ≡ 0 Dobbertin class[69] ϕ mod 4 ≡ 1 ϕ mod 4 ≡ 3 Niho 1 class [71] Niho 2 class [70] less than or equal to 14. These results together with our simulations for higher dimensions indicate that most of the highly nonlinear bijective power functions we study do not achieve the optimal AI, but they do reasonably well for this criterion. For instance, the AI of the Boolean function derived from the inverse power function on F216 is equal to 6 (where 8 would be the maximum attainable). Implementation. We now study the implementation complexity of some concrete functions and give the nonlinearity and AI of these practical functions. Efficient implementations of the inverse function in the field F2i for i ≥ 3 has been studied by several authors, since this function is used in the Advanced Encryption Standard. An efficient approach can be obtained by working in composite fields as described in [138]. Based on recursion, the inverse function is decomposed into operations in the field F22 . A minimal hardware implementation for ϕ = 16 requires 398 XOR gates and 75 AND gates, and thus consists of about 1107.5 NAND gates as computed by us in [35]. It is also possible to increase the clock frequency if necessary by pipelining the design. Hardware implementation of exponentiation with general exponents in F2ϕ has been well studied [1]. However, it turns out that all classes of bijective highly nonlinear power functions with degree greater than or equal to 7 have a very regular pattern in their exponent, which can be exploited for a more efficient implementation. As we can see in Table 7.2, all exponents p satisfy the property that the vector p = (p0 , . . . , pϕ−1 ) defining its binary representation, Pϕ−1 i.e., p = i=0 pi 2i , contains a regular sequence consisting of ones together with at most one bit which is separated of this sequence. The distance between two consecutive elements in the sequence is equal to one except in the case of the 112 CHAPTER 7. IMPLEMENTATION Dobbertin functions for k > 1. If the weight of this sequence is equal to a power of 2, than this property can be exploited leading to a more efficient implementation. We will demonstrate this improved implementation on the power function X 511 in F16 2 . The exponent 511 has binary representation 1111111112 . Consequently, it contains a sequence of weight 9, or also a sequence of weight 8 with ϕ−1 one extra digit. First, consider the normal basis {α, α2 , α4 , · · · , α2 } of Fϕ 2 for ϕ = 16 (a normal basis exists for every ϕ ≥ 1, see [137]). Computing the power function in this basis will not change the properties of nonlinearity, degree, AI and Walsh spectrum of the output functions, since power functions in different bases are linearly equivalent. Squaring in this basis represents simply a cyclic shift of the vector representation of that element. Consequently, computing the power 511 of an element x ∈ F216 , can be computed as follows: x511 = = = (x · x2 ) · (x4 · x8 ) · (x16 · x32 ) · (x64 · x128 ) · x256 (y · y 4 ) · (y 16 · y 64 ) · x256 with y = x · x2 z · z 16 · x256 with z = y · y 4 . (7.7) Therefore, we only need to perform some shifts together with 4 multiplications in 2 4 the normal basis of F16 2 . These multiplications correspond with (x · x ), (y · y ), and (z · z 16 · x256 ). The hardware complexity of such multiplication depends on the basis used to represent the field elements, or more precisely, on the number of ones Cϕ in the multiplication matrix. It is known that Cϕ ≥ 2ϕ − 1 with equality if and only if the normal basis is optimal [129]. Note that optimal bases do not exist for any dimension ϕ. The overall gate count of the multiplication is lower bounded by ϕCϕ ≥ 2ϕ2 − ϕ AND gates and (ϕ − 1)Cϕ ≥ 2ϕ2 − 3ϕ + 1 XOR gates [117]. Other implementations may provide even lower complexity. Also several algorithms exist for performing normal basis multiplication in software efficiently [134, 143]. For ϕ = 16, 17, 19 (corresponding with dimensions in Table 7.3), there is no optimal basis. Therefore, the number of NAND gates for a multiplication in normal basis is lower bounded by 1906.5 for ϕ = 16, 2161.5 for ϕ = 17, and 2719.5 for ϕ = 19 respectively. If the vector containing the binary representation of the exponent consists of a regular sequence with weight 2i , then the number of multiplications is equal to i, or i + 1 if there is an additional digit defining the complete exponent. Table 7.3 represents the bijective highly nonlinear power function which behaves optimally with respect to the above described implementation for dimensions ϕ between 14 and 32. We also computed the AI of the Boolean functions associated to these power functions by means of the algorithm described in Sect. 4.2. These power functions seem to offer good nonlinearity combined with reasonably good algebraic immunity. Moreover, we believe that the implementation of a complete S-box for the filter function has several potential advantages. In the first place, we can increase the throughput of the generator by outputting more 7.2. SECURE IMPLEMENTATION 113 Table 7.3: AI of known highly nonlinear power functions on F2ϕ of degree ≥ 9 Class Inverse Kasami Dobbertin Dobbertin Dobbertin Dobbertin Niho Class(2) Exponent 217 − 2 65281 511 (k = 1) 37741(k = 3) 51001(k = 5) 21931(k = 7) 214 + 511 ϕ 17 17 16 16 16 16 19 # multiplications 4 4 4 4 4 4 5 AI 7 8 7 8 8 7 ≥7 Degree 16 9 9 9 9 9 10 bits m instead of outputting 1 bit. Therefore, a careful study on the best bias in the affine approximation and the AI of all linear and nonlinear combinations of m output bits need to be performed. Another possibility, which makes the analysis harder but may increase the security, is to destroy the linearity of the state. We could consider a filter generator with memory by simply feeding some remaining bits from the S-box into a nonlinear memory. Another possibility is to feed back bits of the S-box into the LFSR during key stream generation. In both cases, it seems that the added nonlinearity may allow us to increase the throughput. Finally, resynchronization can be performed faster by using all bits of the S-box to destroy as rapidly as possible the linear relations between the bits. 7.2 Secure Implementation of Stream Ciphers Research towards the security of cryptographic algorithms has been essentially concerned with classical cryptanalysis. The algorithm is regarded as an abstract mathematical model on which a variety of attacks are investigated. In reality an algorithm will always be implemented and will run on some specific hardware. Hence, an attacker can also mount an attack on the implementation of the algorithm. Even if the cipher is mathematically sound, he may in this way be able to recover the secret key. Such attacks are often called side-channel attacks (SCA). We can roughly divide these attacks into active attacks and passive attacks. In active attacks one tries to deviate from the proper functioning of the device, for instance by fault induction. Passive attacks only require the observation of the side-channels of the device during normal operation. Passive attacks include timing analysis, power analysis and electromagnetic analysis. Concerning symmetric algorithms, most research has been directed towards attacks on block ciphers, namely DES and the AES finalists. Very little research 114 CHAPTER 7. IMPLEMENTATION has been done on the resistance of stream ciphers against side-channel attacks. An example of such research is a theoretical discussion on fault analysis, see [100]. In this section we will develop a theoretical framework for the power analysis of synchronous stream ciphers with resynchronization mechanism. Power analysis attacks measure the power consumption of the device and try to extract the secret key of the algorithm during its operation. Two types of analysis exist: the simple power analysis (SPA), and the much more powerful differential power analysis (DPA), introduced by Kocher, Jaffe and Jun in [109]. Successful DPA attacks have been implemented against block ciphers such as DES and the AES candidates and against several public-key cryptosystems. First, a description of power analysis attacks will be given. We then discuss power analysis of stream ciphers and describe two attacks: one on the irregularly clocked A5/1 algorithm used in GSM, and one on the regularly clocked E0 stream cipher from the Bluetooth standard. 7.2.1 Description of Power Analysis Attacks Simple Power Analysis (SPA) In simple power analysis, one measures the power consumption of the algorithm and tries to extract the secret key from this measurement. This is possible if some operations are directly dependent on the value of the key. This is for instance the case if the algorithm has some key-dependent conditional jumps, etc. Simple power analysis can normally be prevented quite easily by some simple implementational tricks. Differential Power Analysis (DPA) DPA attacks were introduced in [109]. DPA exploits the fact that the power consumption of an operation is influenced by the value of the bits being manipulated. This is impossible to see on a single trace (i.e., the measurement of the power consumption of the algorithm during one encryption), but by combining the knowledge obtained from many traces one is able to extract the secret key. Different consumption models have been developed that describe this effect, and although they have their limitations (see e.g. [2]), the attacks appear to work in practice. DPA attacks require many (thousands) measurements, in which both known (usually plaintexts, ciphertexts) and unknown (usually a secret key) bits are used. The central idea of the DPA attack is to guess a small subset of key bits. The knowledge of these key bits allows us to determine the value of one bit b that appears during the computation. We then divide our measurements into two subsets depending on the value of this bit. The consumption of the system 7.2. SECURE IMPLEMENTATION 115 allows us, on average, to distinguish the consumption when computing with b = 0 from the case b = 1. So if we see a bias in the difference of our two sets, our guess of the key bits was correct. In case our guess of the key bits was wrong, we have divided the measurements into two random sets and do not see any bias. 7.2.2 Power Analysis of Stream Ciphers Power analysis attacks have not been applied to synchronous stream ciphers. The problem with DPA attacks against synchronous stream ciphers is that the key stream is computed independently from the plaintext to be encrypted. So the interference between known and unknown values required for the DPA attack is not present here. However, in many applications stream ciphers require frequent resynchronization, to prevent synchronization loss between sender and receiver. In this case, the state of the stream cipher is frequently reinitialized with the same secret key K and with a different initial value IV . This resynchronization should be highly nonlinear in order to prevent resynchronization attacks [9, 58]. An interesting observation is that the resynchronization inserts known values into the system. This enables an attacker to perform consecutive power measurements on many frames with the same secret key but with many different IV s. It seems that all conditions are present to perform a successful DPA attack on a stream cipher with reinitialization. We will now describe theoretical power analysis on two very common types of LFSR-based stream ciphers. The first is the GSM algorithm A5/1, which is an example of an irregularly clocked stream cipher. The second is the Bluetooth algorithm E0, which is an example of a combiner with memory. Power Analysis Attack on A5/1 A5/1 is the stream cipher used for encrypting GSM communications. A description of A5/1 can be found in Appendix A.1. The idea for the Power Analysis of A5/1 is very simple: the attacker does not know whether 2 or 3 LFSRs are clocked at a time t. However, the power consumption of A5/1 while clocking 3 LFSRs in parallel can be expected to be higher than when clocking only 2 LFSRs. We will now show how to deduce the secret key from power measurements by using power analysis techniques. Assume we have measured a large number of power traces, obtained from a single secret session key K. For example, 1000 traces would correspond to about 4.6 seconds of telephone conversation. We now focus first on t = 86, the time of the first irregular clock. For each trace i, we know that the three bits used for the majority are a combination 116 CHAPTER 7. IMPLEMENTATION of unknown but constant key material and known frame-dependent values as follows: i c1 = k1 ⊕ iv1i ci = k2 ⊕ iv2i (7.8) 2i c3 = k3 ⊕ iv3i Now we proceed as follows: we guess the triplet (k1 , k2 , k3 ). Based on this guess, we divide our power traces into two sets S2 and S3 . A trace goes into set S3 if the three values in (7.8) are equal and into S2 otherwise. If our guess was correct, all traces in S3 represent instances that had 3 LFSRs clocked at t = 86, and all traces in S2 had only 2 LFSRs clocked at t = 86. By computing mean(S3 ) − mean(S2 ), we can thus expect to see a significant peak in this differential trace. If the guess were not correct, our division into two sets will be wrong and the difference between the two means will be much lower. Note that our division into sets will also be correct for the complement of the triplet, so we will retain two possible values for the three bits. In other words, we have learnt 2 bits of information on the key. Now that we know which LFSRs have been updated at t = 86, we can calculate which bits are in the majority bits positions at t = 87. By continuing to work iteratively we can deduce the secret key of A5/1. Power Analysis Attack on E0 The stream cipher E0 is used in the Bluetooth wireless standard [157] to encrypt data. A description can be found in Appendix A.2. The case of the two-level key stream generator is interesting for power analysis: the linearly loaded first level key stream generator is shielded from an attacker who tries to do a resynchronization attack, but it remains exposed to an attacker who is doing a SCA attack. In the DPA attack, our aim is to recover the encryption key KC . In a first phase, power measurements have been performed for N (typically 1000) frames, each encrypted with the same KC but with a different (known) clock counter value. In order to simplify the description of the DPA attack, we will for now assume that the initial value of each bit of the LFSRs is influenced by the resynchronization. This is not the case in E0, but the attack can be adapted quite easily to circumvent this. In this first simple attack, we assume the Hamming weight model: the power consumption is correlated positively with the Hamming weight of the inputs. We focus on the XOR output function, which takes as inputs the five bits x1t , x2t , x3t , x4t and c0t . In the first iteration of level 1, we know that c01 is equal to zero. 7.3. CONCLUSION 117 This is due to the structure of the blender update function. For each frame i, we know the initial values involved in this first iteration. We thus get the following equations: 1 x = k 1 ⊕ ivi1 12 x1 = k 2 ⊕ ivi2 (7.9) x31 = k 3 ⊕ ivi3 4 x1 = k 4 ⊕ ivi4 We now guess the four bits k 1 , k 2 , k 3 and k 4 . Based on this guess, we divide our power traces into five sets S0 , S1 , S2 , S3 and S4 . For instance, set S0 contains all measurements for which the tuple (x1 x2 x3 x4 ) has Hamming weight 0 for the current guess of the key. We then compute the mean of these sets, and the differences of these means. It can be expected that the guess of the four key bits for which the bias detected is largest will be the correct guess of these four key bits. By knowing the four key bits, we can now calculate the next state of the blender. We thus know c02 for all our measurements, and we can again perform the DPA attack on the second iteration of the first level. We continue proceeding iteratively until we have recovered the entire key KC . Many other attack scenarios are imaginable. For instance, one could focus on the update functions of the individual LFSR during the first level and perform a similar attack. As the update function of each LFSR has weight 5, the complexity of this approach should not be higher. It would also not require to keep track of the blender state. A third alternative would be to attack the addition performed at the entrance of the blender. In all scenarios the attack would be similar as described above. 7.3 Conclusion In this chapter, we have taken a more practical perspective and studied actual implementation of stream ciphers. We reviewed the most important properties for efficient hardware implementation of stream ciphers, discussing area, performance and energy consumption, and related these to the properties of the various mathematical building blocks. As an illustration, we discussed some examples in the search for efficient and secure Boolean filter functions. We also discussed the security of a stream cipher implementation against side-channel attacks. In particular, we developed a model for mounting DPA attacks against stream ciphers. The study of such side-channel attacks on stream ciphers has received little attention until now. It can be expected that such analysis will receive increased attention during the second phase of the eSTREAM project. 118 CHAPTER 7. IMPLEMENTATION Chapter 8 Cryptanalysis of Alleged SecurID In this chapter, we will demonstrate the results of our cryptanalysis of the Alleged SecurID Hash Function (ASHF), an ad hoc design of a symmetric-key MAC algorithm, which is used for authenticating users to a corporate computer infrastructure. The outline of this chapter is as follows. In Sect. 8.1, some background is presented. Section 8.2 gives a detailed description of the ASHF. In Sect. 8.3, we show attacks on the block cipher-like structure, the heart of the ASHF. In Sect. 8.4, we show how to use these weaknesses for the cryptanalysis of the full ASHF construction. Section 8.5 presents the conclusions. The results presented in this chapter have previously been presented in our papers [25, 26]. 8.1 Background The SecurID authentication infrastructure was developed by SDTI (now RSA Security). A SecurID token is a hand-held hardware device issued to authorized users of a company. Every minute (or every 30 seconds), the device generates a new pseudo-random token code. By combining this code with his PIN, the user can gain access to the computer infrastructure of his company. Software-based tokens also exist, which can be used on a PC, a mobile phone, Blackberry or PDA. More than 12 million employees in more than 8000 companies worldwide now use SecurID tokens. Institutions that use SecurID include the White House, the U.S. Senate, NSA, CIA, The U.S. Department of Defense, and a majority of the 119 120 CHAPTER 8. CRYPTANALYSIS OF ALLEGED SECURID Fortune 100 companies. The core of the authenticator is the proprietary SecurID hash function, developed by John Brainard in 1985. This function takes as an input the 64-bit secret key, unique to one single authenticator, and the current time (expressed in seconds since 1986). It generates an output by computing a pseudo-random function on these two input values. SDTI/RSA has never made this function public. However, according to V. McLellan [121], the author of the SecurID FAQ, SDTI/RSA has always stated that the secrecy of the SecurID hash function is not critical to its security. The hash function has undergone cryptanalysis by academics and customers under a non-disclosure agreement, and has been added to the NIST “evaluated products”list, which implies that it is approved for U.S. government use and recommended as cryptographically sound to industry. The source code for an alleged SecurID token emulator was posted on the Internet in December 2000 [162]. This code was obtained by reverse engineering, and the correctness of this code was confirmed in some reactions on this post. From now on, this implementation will be called the “Alleged SecurID Hash Function” (ASHF). Although the source code of the ASHF has been available for more than two years, no article discussing this function has appeared in the cryptographic literature before this work, except for a brief note [128]. This may be explained by a huge number of subrounds (256) and by the fact that the function looses information and thus standard techniques for analysis of block ciphers may not be applicable. In this chapter, we will analyze the ASHF and show that the block cipher at the core of this hash function is very weak. In particular this function is non-bijective and allows for an easy construction of vanishing differentials (i.e. collisions) which leak information on the secret key. The most powerful attacks that we show are differential [22] adaptive chosen plaintext attacks which result in a complete secret key recovery in just a few milliseconds on a PC (See Table 8.1 for a summary of attacks). For the real token, vanishing differentials do occur for 10% of all keys given only two months of token output and for 35% of all keys given a year of token output. We describe an algorithm that recovers the secret key given one single vanishing differential. The complexity of this algorithm is 248 SecurID encryptions. These attacks motivate a replacement of ASHF by a more conventional hash function, for example one based on the AES. 8.1. BACKGROUND 121 Table 8.1: The attacks presented in this chapter Function attacked Block cipher (256 subr.) Full ASHF Attack Time Compl.?? 210 Result Sect. ACP Data Compl. (#texts∗∗ ) 70 full key 8.3.4 ACP 217 217 few keybits 8.4.1 Full ASHF CP & ACP 224 + 213 224 full key 8.4.1 Full ASHF+ time dupl. Full ASHF+ time dupl. KP? 216 − 218 248 8.4.2 218 218 full key, for 10 − 35% Possible intrusion KP∗ + monitoring 8.4.2 ? A full output of the 60-second token for 2-12 months. ∗ A full output of the 60-second token for one year. ∗∗ Data complexity is measured in full hash outputs which would correspond to two consecutive outputs of the token. ?? Time complexity is measured in equivalent SecurID encryptions. KP: Known Plaintext - CP: Chosen Plaintext - ACP: Adaptively Chosen Plaintext 122 8.2 CHAPTER 8. CRYPTANALYSIS OF ALLEGED SECURID Description of Alleged SecurID Hash Function The alleged SecurID hash function (ASHF) is, despite its name, not a hash function evidently, as it possesses a secret key. ASHF can be viewed as a symmetrickey encryption algorithm built from hash function design principles, like the eSTREAM submission Salsa20 [19]. Indeed, it can be said that ASHF fits into our general definition of a stream cipher proposed in Sect. 2.3.2. ASHF outputs a 6 to 8-digit word every minute (or 30 seconds). All complexity estimates further in this chapter, if not specified otherwise, will correspond to the 60-second token described in [162] though we expect that our results apply to 30-second tokens with twice shorter observation period. The design of ASHF is shown in Fig. 8.1. The secret information contained in ASHF is a 64-bit secret 32 bit time Expansion 64 bit secret key 64 bit time Permutation Round 1 Round 2 Round 3 Round 4 Permutation Conversion two 8-digit codes Figure 8.1: Overview of the ASHF hash function 8.2. DESCRIPTION OF ALLEGED SECURID HASH FUNCTION 123 key. This key is stored securely and remains the same during the lifetime of the authenticator. Every 30 or 60 seconds, the time is read from the clock. This time is then manipulated by the expansion function (which is not key-dependent), to yield a 64-bit time. This 64-bit time is the input to the key-dependent transformation, which consists of an initial permutation, four rounds of a block cipher-like function, a final permutation and a conversion to decimal. After every round the key is changed by XORing the key with the output of that round, hence providing a simple key scheduling algorithm. The different building blocks of the hash function will now be discussed in more detail. In the following, we will denote a 64-bit word by b, consisting of bytes B0 , B1 , . . . B7 , of hexadecimal nibbles b0 b1 . . .b15 and of bits b0 b1 . . . b63 . B0 is the Most Significant Byte (MSB) of the word, b0 is the most significant bit (msb). 8.2.1 The Expansion Function We consider the case where the token code changes every 60 seconds. The time is read from the clock into a 32-bit number, and converted first into the number of minutes since January 1, 1986, 0.00 GMT. Then this number is shifted to the left one position and the two least significant bits are set to zero. We now have a 32-bit number B0 B1 B2 B3 . This is converted into the 64-bit number by repeating the three least significant bytes several times. The final time t is then of the form B1 B2 B3 B3 B1 B2 B3 B3 . As the two least significant bits of B3 are always zero, this also applies to some bits of t. The time t will only repeat after 223 minutes (about 16 years), which is clearly longer than the lifetime of the token. One can notice that the time only changes every two minutes, due to the fact that the last two bits are set to zero during the conversion. In order to get a different one-time password every minute, the ASHF will at the end take the 8 (or 6) least significant digits at even minutes, and the 8 (or 6) next digits at odd minutes. This may have been done to economize on computing power, but on the other hand this also means that the hash function outputs more information. From an information-theoretic point of view, the ASHF gives 53 bits (40 bits) of information if the attacker can obtain two consecutive 8 digit (6 digit) outputs. This is quite high considering the 64-bit secret key. 8.2.2 The Key-Dependent Initial and Final Permutation The key-dependent bit permutation is applied before the core rounds of the hash function and after them. It takes a 64-bit value u and a key k as input and outputs a permutation of u, called y = P (u, k). The description of the permutation given here is more transparent than the actual implementation. This equivalent 124 CHAPTER 8. CRYPTANALYSIS OF ALLEGED SECURID representation and the observation on the number of permutations below has also been found independently by Contini [46]. This is how the permutation works: The key is split into 16 nibbles k0 k1 . . .k15 , and the 64 bits of u are put into an array. A pointer p jumps to bit uK0 . The four bits strictly before p will become the output bits y60 y61 y62 y63 . The bits that have already been picked are removed from the array, and the pointer is increased (modulo the size of the array) by the next key nibble k1 . Again the four bits strictly before p become the output bits y56 y57 y58 y59 . This process is repeated until the 16 nibbles of the key are used up and the whole output word y is determined. An interesting observation is that the 264 keys only lead to 260.58 possible permutations. This can be easily seen: when the array is reduced to 12 values, the key nibble can only lead to 12 different outputs, and likewise for an array of length 8 or 4. This permutation is not strong: given one input-output pair (u, y), one gets a lot of information on the key. An algorithm has been written that searches all remaining keys given one pair (u, y). On average, 212 keys remain, which is a significant reduction. Given two input-output pairs, the average number of remaining keys is 17. In Sect. 8.3 and Sect. 8.4, we will express the complexity of attacks in the number of SecurID encryptions. In terms of processing time, one permutation is equivalent to 5% of a SecurID encryption. 8.2.3 The Key-Dependent Round One round takes as input the 64-bit key k and a 64-bit value b0 . It outputs a 64bit value b64 , and the key k is transformed to k = k ⊕b64 . This ensures that every round (and also the final permutation) gets a different key. The round consists of 64 subrounds, in each of these one bit of the key is used. The superscript i denotes the word after the i-th round. Subround i (i = 1 . . . 64) transforms bi−1 into bi and works as follows (see Fig. 8.2): 1. Check if the key bit ki−1 equals bi−1 0 . • if the answer is yes: ½ B0i = ((((B0i−1 ≫ 1) − 1) ≫ 1) − 1) ⊕ B4i−1 Bji = Bji−1 for j = 1, 2, 3, 4, 5, 6, 7 (8.1) where ≫ i denotes a cyclic shift to the right by i positions and ⊕ denotes an exclusive or. This function will be called R. 8.2. DESCRIPTION OF ALLEGED SECURID HASH FUNCTION 125 i−1 i−1 if ki−1= b0 if ki−1= b0 B0 B1 B2 B3 B4 B5 B6 B7 B0 B1 B2 B3 B4 B5 B6 B7 1 FFx 1 FFx FFx 65 x 1 1 R S Figure 8.2: Sub-round i defined by the operations R and S. • if the answer is no: i i−1 B0 = B4 i B = 100 − B0i−1 mod 256 4i Bj = Bji−1 for j = 1, 2, 3, 5, 6, 7 (8.2) This function will be called S. 2. Cyclically shift all bits one position to the left: bi = bi ≪ 1 . (8.3) After these 64 iterations, the output b64 of the round is found, and the key is changed according to k = k ⊕ b64 . In terms of processing time, one round is equivalent to 0.21 SecurID encryptions. 8.2.4 The Key-Dependent Conversion to Decimal After the initial permutation, the 4 rounds and the final permutation, the 16 nibbles of the data are converted into 16 decimal symbols. These will be used to form the two next 8 (or 6)-digit codes. This is done in a very straightforward way: the nibbles 0x , 1x , . . . 9x are simply transformed into the digits 0, 1, . . . 9. Ax , Cx and Ex are transformed into 0, 2, 4, 6 or 8, depending on the key; and Bx , Dx and Fx are transformed into 1, 3, 5, 7 or 9, also depending on the key. 126 CHAPTER 8. CRYPTANALYSIS OF ALLEGED SECURID 8.3 Cryptanalysis of the Key-Dependent Rounds The four key-dependent rounds are the major factor for mixing the secret key and the current time. The ASHF could have used any block cipher here (for example DES), but a dedicated design has been used instead. The designers had the advantage that the rounds do not have to be reversible, as the cipher is only used to encrypt. In this section we will show some weaknesses of this dedicated design. We first describe a linear attack, and then discuss some differential attacks, with the attack using vanishing differentials being of most interest. 8.3.1 A Linear Property An interesting property of the S and R function is that the msb of B4 , b32 , will be biased after applying S or R. Moreover, this bias is dependent on the value of the key bit. This can be seen as follows: suppose that after i − 1 subrounds, B0 and B4 are random. Without taking the rotation to the left into account, these are the possible cases: • ki−1 = 0: – bi−1 = 0: The word will undergo the R operation. B4 remains un0 changed through the R operation, and the probabilities are p(bi32 = 0) = 1/2 and p(bi32 = 1) = 1/2. – bi−1 = 1: The word will undergo the S operation. B0i−1 is uniformly 0 distributed within the interval [128, 255], thus B4i is uniformly distributed within the interval [101, 228]. Hence p(bi32 = 0) = 101/128 and p(bi32 = 1) = 27/128. • ki−1 = 1: – bi−1 = 1: The word will undergo the R operation. B4 remains un0 changed through the R operation, and the probabilities are p(bi32 = 0) = 1/2 and p(bi32 = 1) = 1/2. – bi−1 = 0: The word will undergo the S operation. B0i−1 is uniformly 0 distributed within the interval [0, 128], thus B4i is uniformly distributed within the interval [0, 100] ∪ [228, 255]. Hence p(bi32 = 0) = 27/128 and p(bi32 = 1) = 101/128. This means that overall, after the R or S function, p(ki−1 = bi32 ) ≈ 35.55%. This property can be used to break one round of the block cipher using ciphertextonly: the bias will simply propagate to the left after the rotation step, and will then remain unchanged until it enters B0 after 25 more subrounds. The attack scenario is as follows: given 100 ciphertexts, we count the number of times b64 7 , 8.3. CRYPTANALYSIS OF THE KEY-DEPENDENT ROUNDS 127 64 b64 8 . . . b31 are zero. It is easy to see that these values correspond to the value of bit b32 after mixing in the key bit k39 , k40 . . . k63 respectively. So if b7 is equal to 0 more than 50 times, this means that k39 = 1, and so on for the other values. 25 bits of the key can be determined with a very high probability (when using 100 ciphertexts, the probability of error is only 0.05%). The other 39 key bits can now be determined by an exhaustive search. It would be easy to extend the attack to four rounds if the key schedule of ASHF was not data-dependent. The data-dependent key schedule does not allow to add up the occurrences for different ciphertexts, because the key in the final round will be different for every ciphertext. It is possible that the designers were aware of this property and therefore chose the data-dependent key schedule. However, we can mount the following key-recovery attack on the full fourround function: given a ciphertext, we know that the most probable value for k39 . . . k63 of the key in the final round, is the complement of b7 . . . b31 . The probability that this guess for the 25 bits is correct is (165/256)25 ≈ 2−16 . With 239 more guesses, we can determine the other bits of the key in the final round. It is then possible, knowing the plaintext-ciphertext pair, to work our way back to the secret key. This attack requires on average 215 plaintexts and corresponding ciphertexts, and 255 steps. Finding the secret key given the plaintext, the ciphertext and the key in the final round is not straightforward, as the round function is not a bijection. An ad hoc algorithm has been implemented to perform this. This means that each step is expensive in terms of CPU usage. 8.3.2 Avalanche of One-Bit Differences It is instructive to look at how a differential propagates through the rounds. As has been explained in Sect. 8.2.3, in every subround one bit of the key is mixed in by first performing one R or S-operation and then shifting the word one position to the left. A standard differential trail, for an input difference in b63 , is shown in Table 8.2. As long as the difference does not enter B0 or B4 , there is no avalanche at all: the difference is just shifted one position to the left after each subround. But once the difference enters into B0 and B4 , avalanche occurs quite rapidly. This is due to two factors: S and R are nonlinear (though very close to linear) and mainly due to a choice between the two operations S or R, conditional on the most significant bit b0 of the text. Another complication is that, unlike in regular block ciphers, the round output is XORed into the subkey of the following round. This leads to difference propagation into the subkeys which makes the mixing even more complex. Because of the many subrounds (256), it seems that avalanche will be more than 128 CHAPTER 8. CRYPTANALYSIS OF ALLEGED SECURID Table 8.2: Unrestricted differential trail for the first 50 subrounds (avalanche demonstration). Subround 0 1 2 3 4 5 6 7 ... 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 ... B0 00 00 00 00 00 00 00 00 B1 00 00 00 00 00 00 00 00 B2 00 00 00 00 00 00 00 00 B3 00 00 00 00 00 00 00 00 B4 00 00 00 00 00 00 00 00 B5 00 00 00 00 00 00 00 00 B6 00 00 00 00 00 00 00 00 B7 01 02 04 08 10 20 40 80 00 02 1a 08 38 f4 68 4c 22 00 98 2c 10 50 c0 c2 0e e4 c2 72 a0 9c 14 4a e0 b6 3e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 02 05 0a 15 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 02 05 0a 15 2b 56 ac 58 b1 63 c7 8f 00 00 00 00 00 00 01 02 05 0a 15 2b 56 ac 58 b1 63 c7 8f 1e 3c 78 f0 e1 c2 84 08 01 02 04 1c 70 e0 40 80 00 cc 98 88 28 e0 a0 10 ec d8 3e 50 64 9c b8 70 94 0a 0b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 02 04 08 11 22 45 8a 15 00 00 00 00 00 00 00 00 00 00 01 02 04 08 11 22 45 8a 15 2a 55 aa 55 ab 56 ac 59 00 00 01 02 04 08 11 22 45 8a 15 2a 55 aa 55 ab 56 ac 59 b2 64 c9 92 25 4a 94 29 8.3. CRYPTANALYSIS OF THE KEY-DEPENDENT ROUNDS 129 sufficient to resist differential cryptanalysis. 8.3.3 Differential Characteristics We now search for interesting differential characteristics. Two texts are encrypted and the differential trail is inspected. A difference in the lsb of B7 will be denoted by 1x , in the second bit of B7 by 2x , etc. . . Three scenarios can occur in a subround: • If the difference is not in B0 or B4 , it will undergo a linear operation: the difference will always propagate by shifting one position to the left: 1x will become 2x , 2x will become 4x , etc. • If the difference is in B0 and/or B4 , and the difference in the msb of B0 equals the difference in the key, both texts will undergo the same operation (R or S). These operations are not completely linear, but a difference can only go to a few specific differences. A table has been built that shows for every differential the differentials it can propagate to together with the probability. • If the difference is in B0 and/or B4 , and the difference in the msb of B0 does not equal the difference in the key, one text will undergo the R operation and the other text will undergo the S operation. This results in a uniform distribution for the output differences: each difference has probability 2−16 . Our approach has been to use an input differential in the least significant bits of B3 and B7 . This differential will then propagate “for free” for about 24 subrounds. Then it will enter B0 and B4 . We will pay a price in the second scenario (provided the msb of the differential in B0 is not 1) for a few subrounds, after which we want the differential to leave B0 and B4 and jump to B7 or B3 . Then it can again propagate for free for 24 rounds, and so on. That way we want to have a difference in only 1 bit after 64 subrounds. This is the only difference which will propagate into the key, and so we can hope to push the differentials through more rounds. The best differential for one round that we have found has an input difference of 2x in B3 and 5x in B7 , and gives an output difference of 1x in B6 with probability 2−15 . As an illustration, such a differential trail is shown in Table 8.3. It is possible that better differentials exist. One could try to push the characteristic through more rounds, but because of the difference in the keys other paths will have to be searched. For now, it is unclear whether a path through the four rounds can be found with probability larger than 2−64 , but it may well be feasible. In any case, this approach is far less efficient than the approach we will discuss next. 130 CHAPTER 8. CRYPTANALYSIS OF ALLEGED SECURID Table 8.3: A Differential characteristic with probability 2−15 for the first round Subround 0 1 2 3 . . . 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 B0 00 00 00 00 B1 00 00 00 00 B2 00 00 00 00 B3 02 04 08 10 B4 00 00 00 00 B5 00 00 00 00 B6 00 00 00 00 B7 05 0a 14 28 00 00 00 03 00 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 08 04 06 00 00 00 00 00 00 00 00 00 20 40 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 04 00 00 00 00 00 00 00 00 00 00 00 00 50 a0 40 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 02 04 08 10 20 40 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 02 04 08 10 20 40 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 01 02 04 08 10 20 40 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 02 04 08 10 20 40 80 00 8.3. CRYPTANALYSIS OF THE KEY-DEPENDENT ROUNDS 8.3.4 131 Vanishing Differentials Because of the specific application in which the ASHF is used, the rounds do not have to be bijections. A consequence of this non-bijective nature is that it is possible to choose an input differential to a round, such that the output differential is zero, i.e., an internal collision. An input differential that causes an internal collision will be called a vanishing differential. Notice that the ASHF scheme can also be viewed both as a block cipher with partially non-bijective round function or as a MAC where current time is authenticated with the secret key. The fact that internal collisions in MACs may lead to forgery or key-recovery attacks was noted and exploited against popular MACs in [141, 142, 102]. This can be derived as follows. Suppose that we have two words at the input of the i-th subround, bi−1 and b0i−1 . A first condition is that: bi−1 6= b0i−1 . 0 0 (8.4) i−1 Then one word, assume b , will undergo the R operation, and the other word b0i−1 will undergo the S operation. We impose a second condition, namely: Bji−1 = Bj0i−1 for j = 1, 2, 3, 5, 6, 7. (8.5) We now want to find an input differential in B0 and B4 that vanishes. This is easily achieved by simply setting bi equal to b0i after S or R: ½ i B0 = B00i ⇒ B40i−1 = ((((B0i−1 ≫ 1) − 1) ≫ 1) − 1) ⊕ B4i−1 (8.6) B4i = B40i ⇒ B00i−1 = 100 − B4i−1 , As both R and S are bijections, there will be 216 such tuples (B0i−1 , B4i−1 , B00i−1 , B40i−1 ). The extra condition that the msb of B0i−1 and B00i−1 have to be different will filter out half of these, still leaving 215 such pairs. This observation leads to a very easy attack on the full block cipher. We choose two plaintexts b0 and b00 that satisfy the above conditions. We encrypt these with the block cipher (the 4 rounds), and look at the difference between the outputs. Two scenarios are possible: • b0 undergoes the R operation and b00 undergoes the S operation in the first subround: this means that b1 will be equal to b01 . As the difference is equal to zero, this will of course propagate to the end and we will observe this at the output. This also implies that k0 = b00 , because that is the condition for b0 to undergo the R operation. • b0 undergoes the S operation and b00 undergoes the R operation in the first subround: this means that b1 will be different from b01 . This difference will propagate further (with very high probability), and we will observe a nonzero output difference. This also implies that k0 = b00 0 , because that is the condition for b00 to undergo the R operation. 132 CHAPTER 8. CRYPTANALYSIS OF ALLEGED SECURID To summarize, this gives us a very easy way to find the first bit of the secret key with two chosen plaintexts: choose b0 and b00 according to (8.4), (8.5) and (8.6) and encrypt these. If outputs will be equal, then k0 = b00 . If not, then k0 = b00 0 ! This approach can be easily extended to find the second, third, . . . bit of the secret key by working iteratively. Suppose we have determined the first i − 1 bits of the key. Then we can easily find a pair (b0 , b00 ), that will be transformed by the already known key bits of the first i − 1 subrounds into a pair (bi−1 , b0i−1 ) that satisfies (8.4), (8.5) and (8.6). The search for an adequate pair is performed offline, and once we have found a good pair offline it will always be suitable for testing on the real cipher. The procedure we do is to encrypt a random text with the key bits we have deduced so far, then we choose the corresponding second text there for a vanishing differential, and for this text we then try to find a preimage. Correct preimages can be found with high probabilities (75% for the 1st bit, 1.3% for 64th bit), hence few text pairs need to be tried before an adequate pair is found. Now we can deduce ki in the same way we deduced k0 above. This algorithm has been implemented in C on a Pentium. Finding the whole secret key requires at most 128 adaptively chosen plaintexts. This number will in practice mostly be reduced, because we can often reuse texts from previous subrounds in the new rounds. Simulation shows that on average only 70 adaptively chosen plaintexts are needed. The complexity of this attack can be calculated to be 210 SecurID encryptions. Finding the 64-bit secret key for the whole block cipher, with the search for the appropriate plaintexts included, requires only a few milliseconds. Of course, it is possible to perform a trade-off by only searching the first i key bits. The remaining 64 − i key bits can then be found by doing a reduced exhaustive search. 8.4 Extending the Attack with Vanishing Differentials to the Whole ASHF So far, we have shown how to break the four key-dependent rounds of the ASHF. In this paragraph, we will first try to extend the attack of Sect. 8.3.4 to the full ASHF, which consists of a weak initial key-dependent permutation, the four key-dependent rounds, a weak final key-dependent permutation and the keydependent conversion to decimal. Then we will mount the attack when taking the symmetrical time format into account. 8.4. EXTENDING THE ATTACK 8.4.1 133 The Attack on the Full ASHF The attack with vanishing differentials of Sect. 8.3.4 is not defeated by the final permutation. Even a perfect key-dependent permutation would not help. The only thing an attacker needs to know is whether the outputs are equal, and this is not changed by the permutation. The lossy conversion to decimal also doesn’t defeat the attack. There still are 53 bits (40 bits) of information, so the probability that two equal outputs are caused by chance (and thus are not caused by a vanishing differential) remains negligibly small. However, the initial key-dependent permutation causes a problem. It prevents the attacker from choosing the inputs to the block cipher rounds. A new approach is needed to extract information from the occurrence of vanishing differentials. The attack starts by searching a vanishing differential by randomly choosing 2 inputs that differ in 1 bit, say in bit i. Approximately 217 pairs need to be tested before such a pair can be found. Information on the initial permutation (and thus on the secret key) can then be learned by flipping a bit j in the pair and checking whether a vanishing differential is still observed. If this is so, this is an indication that bit j is further away from B0 and B4 than bit i. An algorithm has been implemented that searches for a vanishing pair with one bit difference in all bits i, and then flips a bit for all bits j. This requires 224 chosen and 213 adaptively chosen plaintexts. A 64 × 64 matrix is obtained that carries a lot of information on the initial permutation. Every element of the matrix establishes an order relationship, every row i shows which bits j can be flipped for an input differential in bit i, and every column shows how often flipping bit i will preserve the vanishing differential. This information can be used to extract the secret key. The first nibble of the key determines which bits go to positions b60 , . . . b63 . By inspecting the matrix, one can see that this should be the four successive bits between bit 60 and bit 11 that have the lowest row weight and the highest column weight. Once we have determined the first nibble, we can work recursively to determine the following nibbles of the secret key. The matrix will not always give a unique solution, so then it will be necessary to try all likely scenarios. Simulation indicates that the uncertainty seems to be lower than the work required to find the vanishing differentials. The data complexity of this attack may be further reduced by using fewer vanishing differentials, but thereby introducing more uncertainty. 8.4.2 The Attack on the Full ASHF with the Time Format The attack is further complicated by the symmetrical time format. A one-bit difference in the 32-bit time is expanded into a 2-bit difference (if the one-bit 134 CHAPTER 8. CRYPTANALYSIS OF ALLEGED SECURID difference is in the second or third byte of the 32-bit time) or into a 4-bit difference (if the one-bit difference is in least significant byte of the 32-bit time). A 4-bit difference caused by a one-bit difference in the 32-bit time will yield a vanishing differential with negligible probability, because the initial permutation will seldom put the differences in an ordered way that permits a vanishing differential to occur. A 2-bit difference, however, will lead to a vanishing differential with high probability: taking a random key, an attacker observes two equal outputs for one given 2-bit difference with probability 2−19 . This can be used to distinguish the SecurID stream from a random stream, as two equal outputs would normally only be detected with probability 2−53 (2−26.5 if only one 8-digit tokencode is observed). In the following, we assume that an attacker can observe the output of the 60-second token during some time period. This attack could simulate a “curious user” who wants to know the secret key of his own card, or a malicious internal user who has access to the SecurID cards before they are deployed. The attacker could put his token(s) in front of a PC camera equipped with OCR software, automating the process of reading the numbers from the card. The attacker then would try to detect vanishing differentials in the sampled data. Figure 8.3 shows the percentage of the tokens for which at least one vanishing differential is detected, as a function of the observation time. Figure 8.3: Percentage of keys that have at least one VD 8.4. EXTENDING THE ATTACK 135 Two interesting points are the following: after two months, about 10% of the keys have one or more vanishing differentials; after one year, 35% of keys have at least one vanishing differential. Below, we show two ways in which an attacker can use the information obtained by the observation of the token output. Full key recovery In this attack, we recover the key from the observation of a single vanishing differential. In the following, it is assumed that we are given one pair of 64bit time inputs (t, t0 ) that leads to a vanishing differential, and that there is a 2-bit difference between the two inputs (the attack can also be mounted with other differences). We know that the known pair (t, t0 ) will be permuted into an unknown pair (b, b0 ), and that the difference will vanish after an unknown number of subrounds N . The outputs of the hash, o and o0 , are also known and are of course equal. The idea of the attack is to first guess the subround N in which the vanishing differential occurs, for increasing N = 1, 2, . . . For each value of N , a filtering algorithm is used that searches the set of keys that make such a vanishing differential possible. We then search this set to find the correct key. We will now describe the filtering algorithm for the case N = 1. For other values of N the algorithm will be similar. When guessing N = 1, we assume the vanishing differential to happen in the first subround. This means that the only bits that play a role in the vanishing differential are the key bit k0 , and the bytes B0 , B4 , B00 and B40 of the two words after the permutation. Out of the 233 possible values, we are only interested in those combinations that have a 2-bit difference between the texts and lead to a vanishing differential. We construct a list of these values during a precomputation phase, which has to be performed only once for all attacks. In the precomputation phase, we let a guess of the word b, called bv , take on all values in bytes Bv0 and Bv4 (the other values are set to a fixed value), and let k0 be 0 or 1. We perform one subround for all these values and store the outputs. For each value of k0 , we then search for equal outputs for which the inputs bv , b0v have 2-bit difference. We then have found a valid combination, which we store as a triplet (k0 , bv , b0v ). Note that in bv and b0v only two bytes are of importance, we refer to such words as “partially filled”. Only 30 such valid triplets are found. The list of scenarios found in the precomputation phase will be used as the start for a number of guessing and filtering steps. In the first step, for each scenario (k0 , bv , b0v ), we try all possible values of k1 , . . . k27 . Having guessed the first 28 key bits, we know the permutation of the last 28 bits. We call these partially guessed values bp = P (t, k) and b0p = P (t0 , k). Under this scenario, we now have two partial guesses for both b and b0 , namely bv , bp and b0v , b0p . One can see that bv and bp overlap in the nibble b9 . If bv9 =bp9 and b’v9 =b’p9 , the 136 CHAPTER 8. CRYPTANALYSIS OF ALLEGED SECURID scenario (k0 . . . k27 , bv , b0v ) is possible and will go to the next iteration. If there’s a contradiction, the scenario is impossible and can be filtered out. In the second step, a scenario that survived the first step is taken as input and the key bits k28 . . . k31 are guessed. Filtering is done based on the overlap between the two guesses for nibble b8 . In the third step we guess key bits k32 . . . k59 and filter for the overlap in nibble b1 . Finally, in the fourth step the key bits k60 . . . k63 are guessed and the filtering is based on the overlap in nibble b0 . Every output of this fourth step is a key value that leads to a vanishing differential in the first subround for the given inputs t and t0 . Each key that survives the filtering algorithm will then be checked to see if it is the correct key. The time complexity of the attack itself is dependent on the strength of the filtering. This may vary from input to input, but simulations indicate the following averages for N = 1: the 30 scenarios go to 227 possibilities after the first step. These are reduced to 225 after the second step, go to 245 after the third step and reduce to 241 after the fourth step. The most expensive step in this algorithm is the third step: 28 bits of the key are guessed in 225 scenarios. At every iteration, two partial permutations (28 bits in each) have to be done. This means that the total complexity in terms of SecurID encryptions of this step is: 225 · 228 · 28 · 2 · 0.05 ≈ 248.5 , 64 (8.7) where the factor 0.05 is the relative complexity of the permutation. The time complexity of the third (and forth) step can be further improved by using the fact that there are only 260.58 possible permutations, as described in Sect. 8.2.2. By searching for the permutations instead of the key, we reduce the complexity of the attack by a factor 12/16 · 8/16 to 247 . The memory requirements, apart from the storage of the precomputed tables, are negligible. The total attack consists of performing the above algorithm for N = 1, 2, . . . until we have a success probability of 50%. Simulation on 10000 samples indicated that more than 50% of the vanishing differentials (observed in two-month periods) occur before the 12th subround. So it suffices to run the attack for N up to 12. One can see that the complexity of the attack will be lower for N ≥ 2 than for N = 1. Let us take the example of N = 2. Now the only bits that play a role in the vanishing differential are the key bits k0 and k1 , and the bits b0 . . . b8 and b32 . . . b40 . During the precomputation phase, we construct a list for these 20 bits. Only 350 possibilities remain. The complexity of the attack is now reduced for two reasons. First, the overlap in the filtering algorithms is now 18 bits in total instead of 16 bits for N = 1. Our filtering will thus be stronger. Second, we can now do six filtering steps instead of four filtering steps. For instance, in the first step we will only have to guess k1 . . . k23 to have a first overlap. Because of this, we will see that the complexity of the algorithm decreases with 8.4. EXTENDING THE ATTACK 137 increasing N . The total complexity of the attack is the sum of the complexities of the steps for N = 1, 2, . . .. It can be calculated that the time complexity is equivalent to 248 SecurID encryptions. We expect that further refinements of the filtering algorithm are possible which reduce to 245 the time complexity of the attack. Indeed, Contini and Yin published a nice follow-up [47] on this work which proves that this estimate is correct. They improved the time complexity of the attack to 244 , and show further refinements which reduce the complexity to 240 . Another factor in the attack is the complexity of the precomputation step. The time complexity of the precomputation step increases with N . One can see that for a vanishing differential in subround N , 2·N +14 bits of the text and N bits of the key are of importance. It can be calculated that the total time complexity to precompute the tables for N up to 12 will be 244 SecurID encryptions. The size of the precomputed tables is increasing with N . Our results show that the size increases with N by a factor of 6 to 8. Our conservative estimate is that about 500 GB of hard disk is required to store the tables for N up to 12. Network intrusion An attacker can also use the one year output without trying to recover the secret key of the token. In this attack scenario it is assumed that the attacker can monitor the traffic on the network in the following year. When an attacker monitors a token at the (even) time t, he will compare this with the value of the token 219 minutes (about 1 year) earlier. If both values are equal, this implies that a vanishing differential is occurring. This vanishing differential will also hold at the next (odd) time t + 1. The attacker can thus predict at time t the value of the token at time t + 1 by looking it up in his database, and can use this value to log into the system. Simulation shows that 4% of the keys will have such vanishing differentials within a year, and that these keys will have 6 occurrences of vanishing differentials on average. The decreased key fraction is due to the restriction of the input difference to a specific one year difference. Higher percentages of such weak keys are possible if we assume the attacker is monitoring several years. The total probability of success of an attacker depends on the number of cards from which he could collect the output during one year, and on the number of logins of the users in the next year. It is clear that the attack will not be very practical, since the success probability for a single user with 1000 login attempts per year will be 2−19 · 1000 · 6 · 0.04/2 ≈ 2−12 . This is however, much higher than what would be expected from the password length (i.e., 2−26.5 and 2−20 for 8and 6-digit tokens respectively). Note that this attack does not produce failed login attempts and requires only passive monitoring. 138 8.5 CHAPTER 8. CRYPTANALYSIS OF ALLEGED SECURID Conclusion In this chapter, we have performed a cryptanalysis of the Alleged SecurID Hash Function (ASHF). It has been shown that the core of ASHF, the four keydependent block cipher rounds, is very weak and can be broken in a few milliseconds with an adaptively chosen plaintext attack, using vanishing differentials. The full ASHF is also vulnerable: by observing the output during two months, we have a 10% chance to find at least one vanishing differential. In our main attack, we use this vanishing differential to recover the secret key of the token with expected time complexity 248 . In a follow-up on this work, Contini and Yin [47] confirm this result and reduce the complexity of the attack with a factor 16. We can also use the observed output to significantly increase our chances of network intrusion. The ASHF does not deliver the security that one would expect from a presentday cryptographic algorithm. See Table 8.1 for an overview of the attacks presented. We thus would recommend to replace the ASHF-based tokens by tokens implementing another algorithm that has undergone extensive open review and which has 128-bit internal secret. The AES seems to be a good candidate for such a replacement and indeed, from February 2003 RSA has started phasing in AES-based tokens [147]. Chapter 9 Conclusions and Open Problems 9.1 Contributions of the Thesis In this thesis, we have studied the cryptanalysis and design of synchronous stream ciphers. Stream ciphers are a widely studied and used class of encryption algorithms, of which synchronous stream ciphers appear to offer the best combination of security and efficiency. We have approached the cryptanalysis and design of synchronous stream ciphers by thoroughly analyzing the mathematical properties of their building blocks. This analysis should motivate designers to build stream ciphers from mathematical building blocks that are easy to analyze and for which it is easy to assess the resistance against known cryptanalytic attacks. The first four chapters of this thesis deal with these cryptanalytic attacks: • Linear cryptanalysis is a very important tool in the analysis of stream ciphers. We have showed how many recent attacks fit into the framework of linear attacks, and have quantified the resistance of nonlinear filters and combiners against linear attacks. • The recent algebraic and fast algebraic attacks introduced by Courtois et al. have a significant impact on the cryptanalysis of stream ciphers with a linear update function. We have presented a coherent framework for these attacks, and have specifically quantified the properties that the stream cipher building blocks should satisfy to resist these attacks. In order to evaluate the resistance of stream ciphers against fast algebraic attacks, we 139 140 CHAPTER 9. CONCLUSIONS AND OPEN PROBLEMS have presented a new and more efficient algorithm and also have derived theoretical bounds. • Many structural properties should also be inspected when designing a synchronous stream cipher. We have described and evaluated the impact of these attacks, notably trade-off attacks and Guess and Determine attacks. • Resynchronization attacks are a very important class of attacks, which until recently have received little attention. We have extended the framework for resynchronization attacks, by further refining the original resynchronization attack and by combining the attack with other attack methodologies. This helps us to establish the properties a good resynchronization mechanism should satisfy. After presenting the various cryptanalytic results, we have discussed how to implement synchronous stream ciphers in practical applications in an efficient and secure way. As an illustration, we have discussed efficient and secure Boolean functions for the filter generator. We have also covered the issue of secure implementation of stream ciphers, by introducing differential power analysis of synchronous stream ciphers. In order to illustrate our cryptographic approach, we have presented practical applications of our analysis to concrete primitives. This included the NESSIE candidate Sober-t32, the eSTREAM candidates SFINKS and WG, the Bluetooth encryption algorithm E0, the GSM stream cipher A5/1 and the Alleged SecurID Hash Function. The results in this thesis have been published in the papers [13, 64, 11, 25, 9, 112, 16, 62, 26, 33, 36, 34]. The remaining papers [24, 165, 63, 35, 59, 15] that were published during our doctoral research were not included in this thesis. 9.2 Open Problems There still remain many open problems in the search for efficient and secure synchronous stream ciphers. In the 90’s, the research on stream ciphers has been less intensive than the research on block ciphers. In the past years the interest of the research community in stream ciphers has increased considerably, as evidenced by the huge interest for ECRYPT’s eSTREAM project. It can therefore be hoped that many remaining open problems can be solved in the coming years, and that a unification starts to emerge in the field of stream cipher design. These are some of the interesting open problems: • There is a lack of good Boolean functions for the filter generator which are efficient and also resist the cryptanalytic attacks described in this thesis, 9.2. OPEN PROBLEMS 141 in particular algebraic and fast algebraic attacks. Now that efficient tools are available to evaluate the resistance of Boolean functions against these attacks (see our discussion in Chapter 4), the search for such functions for practical usage becomes possible. • Classical filter generators only output one bit of key stream at each clock, and this puts a limit on the throughput of these algorithms. A more efficient design would use an (n, m) S-box. The search for such efficient and secure S-Box constructions is still an open problem, even if the tools to analyze these building blocks are now available. • In order to fully analyze the security of synchronous stream ciphers, the combination of mathematical (linear and algebraic) attacks with structural attacks should be investigated. This amounts to the study of the so-called augmented functions, which relate the internal state to a sequence of key stream bits. This concept was already introduced by Anderson in 1994 [5] to study correlation attacks. It could also be used to study the general case of linear attacks and the search for the lowest-degree relations in general fast algebraic attacks. • The study of stream ciphers with a nonlinear update function was until recently little studied, and very diverse strategies exist. As we believe that introducing more nonlinearity in the update function can have considerable advantages for the security of the cipher, it is important that such mechanisms are studied in a more focused way. We believe that the eSTREAM project can focus the attention on some promising design strategies. In particular, we like the approach that has been taken by the designers of Trivium [65] and Grain [97], but further study is certainly required. • Side-Channel Analysis of stream ciphers has received little attention so far. We believe it will receive more attention once the cryptographic community has focused its attention on a few selected primitives and proceeds to actual usage of these. It is likely that then also methods to protect against SCA will be studied. We already presented a first possible approach for protecting a filter generator in [35]. 142 CHAPTER 9. CONCLUSIONS AND OPEN PROBLEMS Appendix A Description of Algorithms A.1 Description of the GSM Stream Cipher A5/1 A5/1 [4] is the stream cipher used for encrypting GSM communications in Europe and the United States. In other countries, the even weaker stream cipher A5/2 is used. A5/1 consists of three LFSRs, as depicted in Fig. A.1. Every 4.6 milliseconds, a frame is encrypted by A5/1 using the fixed secret key K and a frame counter Fn . The procedure for generating each frame is as follows: 1. t = 0 . . . 63: The LFSRs are set to zero and then clocked 64 times. At each clock one bit of the secret key is XORed in parallel into the least significant bits of the three LFSRs. 2. t = 64 . . . 85: The LFSRs are clocked 22 more times. At each clock one bit of the frame counter Fn is XORed in parallel into the least significant bits of the three LFSRs. 3. t = 86 . . . 413: From now on the LFSRs are clocked irregularly as follows: at each clock a majority bit m = majority(c1 , c2 , c3 ) is calculated. LFSR Ri (1 ≤ i ≤ 3) is now updated only if m = ci . This implies that at each clock either 2 or 3 LFSRs are updated. In this way 328 bits of output are generated. The first 100 bits are discarded, the next 228 bits are used to encrypt the GSM conversation. 143 144 APPENDIX A. DESCRIPTION OF ALGORITHMS Figure A.1: Layout of the stream cipher A5/1 A.2 Description of the Bluetooth Stream Cipher E0 The stream cipher E0 is used in the Bluetooth wireless standard [157] to encrypt data. A full description of E0 can be found in [157], Vol. 2, pp. 763-772. A working C implementation of E0 can be found at [151]. When two Bluetooth devices want to communicate, they undergo a key exchange protocol whereby they agree on a 128-bit encryption key KC (it may be smaller than 128 bits, but we will only describe the 128-bit case here). When encrypting a packet, KC is linearly combined with the 48-bit Bluetooth address of the master device and with the 26-bit clock to form the initial state of the LFSRs for the two-level key stream generator. The general design of E0 is inspired by the summation combiner of Rueppel. It is depicted in Fig. A.2. E0 consists of 4 LFSRs (ϕ = 4), with total length of 128 bits (25 bits for LFSR A, 31 for LFSR B, 33 for LFSR C, 39 for LFSR D). E0 also contains 4 nonlinear memory bits, the so-called blender. At each iteration the LFSRs are clocked, the new state of the blender is calculated as a nonlinear function of its current state and of the 4 output bits of the LFSRs, and A.3. DESCRIPTION OF THE SFINKS STREAM CIPHER 145 one bit of key stream is calculated as the XOR of the 4 output bits of the LFSRs and of the output bit of the blender. Figure A.2: Layout of the stream cipher E0 In level 1 the LFSRs are linearly loaded with the key and an initialization vector. The blender is set to zero. Then, E0 is clocked 200 times. The first 72 outputs are discarded, the next 128 outputs will be the initial state of the LFSRs in level 2. The initial state of the blender in the second level is set equal to the final state of the blender after the first level. A.3 Description of the SFINKS Stream Cipher SFINKS [35] is a synchronous stream cipher based on a nonlinear filtered key stream generator with an associated authentication mechanism. The layout of the design is presented in Fig. A.3. The internal state consists of an LFSR of 256 bits. The Boolean function is the 16-bit inversion S-box from which one bit is taken as output after linear masking. One output bit of the inversion function is also used for the generation of the 64-bit MAC. We now discuss the different building blocks of the algorithm. 146 APPENDIX A. DESCRIPTION OF ALGORITHMS Figure A.3: Layout of the stream cipher SFINKS A.3.1 Key, IV and MAC Length The algorithm is aimed at offering medium-term security, which is reflected in the length of 80 bits of the secret key K = k79 . . . k0 . Also, the length of the initialization vector IV = iv79 . . . iv0 is 80 bits. The MAC has a length of 64 bits. A.3.2 LFSR An LFSR of length L = 256 bits is used. It is a Fibonacci LFSR with the following weight 17 feedback polynomial: P (X) = 1 + X 44 + X 62 + X 64 + X 69 + X 93 + X 105 + X 131 + X 141 + X 149 + X 171 + X 190 + X 192 + X 204 + X 208 + X 242 + X 256 . (A.1) A.3. DESCRIPTION OF THE SFINKS STREAM CIPHER 147 In other words, the recursion formula for the LFSR stream is as follows: st+256 A.3.3 = st+212 ⊕ st+194 ⊕ st+192 ⊕ st+187 ⊕ st+163 ⊕ st+151 ⊕ st+125 ⊕ st+115 ⊕st+107 ⊕ st+85 ⊕ st+66 ⊕ st+64 ⊕ st+52 ⊕ st+48 ⊕ st+14 ⊕ st . (A.2) Filter Function The nonlinear filter function used for generating the key stream is a Boolean function f from F17 2 into F2 . At each time t, 17 bits are taken from the LFSR, put into the word xt ∈ F17 2 and input to the Boolean function to calculate the key stream zt as follows: 0 16 1 0 zt = f (xt ) = f (x16 t , . . . , xt ) = (IN V (xt , . . . , xt )&1) ⊕ xt . (A.3) In this formula, the input word xt is selected as follows from the LFSR stream: xt (st+255 , st+244 , st+227 , st+193 , st+161 , st+134 , st+105 , st+98 , st+74 , st+58 , st+44 , st+21 , st+19 , st+9 , st+6 , st+1 , st ) . (A.4) The main building block is the vectorial Boolean function IN V () from F16 2 into F16 2 . IN V () calculates the inverse of the input in the field F216 modulo the primitive polynomial X 16 + X 5 + X 3 + X 2 + 1. Let us denote the output of this vectorial Boolean function by y t ∈ F16 2 . Note that only one bit of this inverse S-box is used for the key stream generation. Some important cryptographic properties of the filter function on F17 2 are as follows: the function is balanced, the degree of the function is 15, the nonlinearity is 65024, its best affine approximation has bias ² = 2−8 and its algebraic immunity is equal to 6. A.3.4 = Resynchronization Mechanism First all internal state bits are set to zero and the algorithm starts at t = −128. 176 K and IV are loaded into the LF SR. We set s255 −128 = iv79 , . . . , s−128 = iv0 , 175 96 95 s−128 = k79 , . . . , s−128 = k0 . and s−128 = 1. We also set y −134 to y −128 equal to zero. We then start 128 resynchronization steps, for t = −127, . . . − 1, 0, as follows. The LFSR is clocked in a special way, where 16 bits of the LFSR are also XORed with bits from the inversion function as follows: j mod 16 sjt = sj+1 , t−1 ⊕ y t−7 (A.5) for the 16 values of j = {11, 17, 41, 52, 66, 80, 111, 118, 142, 154, 173, 179, 204, 213, 232, 247}. 148 APPENDIX A. DESCRIPTION OF ALGORITHMS At each time, we then need to calculate y t , as we will use these 16 bits to be fed back into the LFSR as in the equation above. We will also shift the bit yt1 into the shift register for MAC generation as during normal operation, and always XOR the register contents into the MAC state, see below. At t = 0, the resynchronization is finished and we can start the key stream generation. A.3.5 MAC Algorithm The MAC generation algorithm is based on concepts similar to [111]. If no authentication method is required, SFINKS can also be implemented without this authentication method. For MAC generation, we use the shift register Rt = rt63 . . . rt0 . During the frame encryption, the value of the MAC will be accumulated in the register 63 Mt = m63 t . . . m0 which is set to zero at the start of each frame. For all times t = −127, . . . f ramelength, the shift register is updated as follows: ½ i i+1 rt = rt−1 for i = 0, . . . , 62 (A.6) 63 rt = y 1t . For every plaintext bit pt , t = 1, . . . f ramelength, we update the MAC state by Mt = Mt−1 ⊕ pt · Rt−1 1 . At the end of the frame, the 64 bits of M are sent to the receiver, after they have been encrypted during 64 final clocks of the key stream generator. A.4 Description of the Sober-t32 Stream Cipher Sober-t32 is a word-oriented synchronous stream cipher. It uses 32-bit words and has a secret key of 256 bits. In a synchronous stream cipher such as Sober-t32, the key stream is generated independently from the plaintext. The sender encrypts the plaintext by performing an XOR (exclusive or, ⊕) operation between plaintext and key stream. The recipient can, if he knows the secret key, recreate the key stream and recover the plaintext by performing an XOR operation between the ciphertext and the key stream. Sober-t32 is based on 32-bit operations within the Finite Field F232 . Every word a = (a31 , a30 ...a1 , a0 ) is represented by a polynomial of degree less than 32: A = a31 x31 + ... + a0 x0 . (A.7) 1 note that, as mentioned in the resynchronization mechanism, the MAC is updated by Mt = Mt−1 ⊕ Rt−1 during the resynchronization steps A.4. DESCRIPTION OF THE SOBER-T32 STREAM CIPHER 149 Figure A.4: Layout of the stream cipher Sober-t32 When adding two words in GF (232 ), the polynomials are added and their coefficients are reduced modulo 2. This is the same as a bitwise XOR. For a multiplication, the polynomials are multiplied, the coefficients reduced modulo 2, and the resulting polynomial is reduced modulo a polynomial of degree 32. For Sober-t32 the polynomial is: x32 + (x24 + x16 + x8 + 1)(x6 + x5 + x2 + 1) . (A.8) Sober-t32 consists of three main building blocks. First there is a Linear Feedback Shift Register (LFSR), which uses a recursion formula to produce a state sequence sn . Next a Non-Linear Function (NLF) combines these words in a non-linear way to produce the NLF-stream vn . Finally, the so-called stuttering produces the key stream zj by decimating the NLF-stream in an irregular fashion. All three parts are explained in detail below. An overview of the general structure of Sober-t32 is shown in Fig. A.4. 150 A.4.1 APPENDIX A. DESCRIPTION OF ALGORITHMS The Linear Feedback Shift Register (LFSR) The LFSR is a shift register of length 17, where every register contains one word. The internal memory is thus 544 bits for Sober-t32. The state of the LFSR at a certain time t is respresented by the vector → − St = (st , st+1 , st+2 , . . . st+16 ) = (r0 , r1 , r2 , . . . r16 ) . (A.9) The next state of the LFSR is calculated by shifting the previous state one step, and calculating a new word st+17 as a linear combination of the words in the LFSR. In Sober-t32, st+17 is calculated as follows: st+17 = st+15 ⊕ st+4 ⊕ α · st , (A.10) with α = C2DB2AA3x . A.4.2 The Non-Linear Function (NLF) At any time t, the NLF takes five words from the LFSR state and calculates one output, called vt . This output can be written as: vt = ((f (st ¢ st+16 ) ¢ st+1 ¢ st+6 ) ⊕ K) ¢ st+13 . (A.11) In this equation, K is a word that is determined during the initialization of the LFSR, ¢ denotes addition modulo 232 and f is a non-linear function. The structure of the function f is shown in Fig. A.5. First, the word is partitioned into the Most Significant Byte (MSB) and the three remaining bytes. The MSB is used as an input to a substitution box (S-box), which outputs 32 bits. Of these, the MSB becomes the MSB of the output, and the three remaining bytes are XORed with the three remaining bytes of the input word to form the rest of the output word. It is important to notice that the S-box only uses the MSB as an input. This implies that most of the non-linearity is caused by the MSB. One can see that any differential in the less significant bits goes straight through the f -function, and that the MSB of the output is solely determined by the MSB of the input (and vice-versa). The f -function can be written as: f (a) = SBOX(aM SB ) ⊕ (0kaR ) . (A.12) In this equation, aR represents the three remaining bytes of a 32-bit word a, thus without the MSB. A.4. DESCRIPTION OF THE SOBER-T32 STREAM CIPHER 151 Figure A.5: Structure of the function f Table A.1: dibit 00 01 10 11 A.4.3 The possible actions of the stuttering unit as a function of the dibit action The next word is excluded from the key stream. The next word is XORed with C, and goes to the key stream, the word after that is excluded from the key stream. The next word is excluded from the key stream, the word after that goes to the key stream. The next word is XORed with C 0 , and goes to the key stream. The Stuttering The output of the NLF is used as the input for the stuttering phase. The stuttering decimates the stream in an irregular fashion. The first output of the NLF is taken as the first Stutter Control Word (SCW). This SCW is divided into pairs of bits, called dibits. The value of these dibits determines what happens with the following words, as explained in Table A.1. The constant C is 6996C53Ax for Sober-t32, and C 0 is the bitwise complement of C. When all dibits have been used, the next word from the NLF-stream is taken as the next SCW. In this way, the stuttering allows only about 48% of the NLF-stream words to go to the key stream. 152 APPENDIX A. DESCRIPTION OF ALGORITHMS Bibliography [1] Gordon Agnew, Thomas Beth, Ronald Mullin, and Scott Vanstone. Arithmetic operations in GF(2m ). Journal of Cryptology, 6(1):3–13, 1993. [2] Mehdi-Laurent Akkar, Regis Bevan, Paul Dischamp, and Didier Moyart. Power analysis, what is now possible... In T. Okamoto, editor, Advances in Cryptology - ASIACRYPT 2000, number 1976 in Lecture Notes in Computer Science, pages 489–516. Springer-Verlag, 2000. [3] Hamid Amirazizi and Martin Hellman. Time-memory-processor trade-offs. IEEE Transactions on Information Theory, IT-34(3):505–512, 1988. [4] Ross Anderson. A5 – the GSM encryption algorithm. sci.crypt post, 1994. http://groups.google.be/groups?hl=nl&lr=&selm=2ts9a0\%2495r\ %40lyra.csx.cam.ac.uk. [5] Ross Anderson. Searching for the optimum correlation attack. In B. Preneel, editor, Fast Software Encryption, FSE 1994, number 1008 in Lecture Notes in Computer Science, pages 137–143. Springer-Verlag, 1994. [6] Frederik Armknecht. Improving fast algebraic attacks. In B. Roy, editor, Fast Software Encryption, FSE 2004, number 3017 in Lecture Notes in Computer Science, pages 65–82. Springer-Verlag, 2004. [7] Frederik Armknecht. On the existence of low-degree equations for algebraic attacks. In State of the Art of Stream Ciphers, pages 175–189, 2004. [8] Frederik Armknecht and Matthias Krause. Algebraic attacks on combiners with memory. In D. Boneh, editor, Advances in Cryptology - CRYPTO 2003, number 2729 in Lecture Notes in Computer Science, pages 145–161. Springer-Verlag, 2003. [9] Frederik Armknecht, Joseph Lano, and Bart Preneel. Extending the resynchronization attack. In H. Handschuh and A. Hasan, editors, Selected Areas in Cryptography, SAC 2004, number 3357 in Lecture Notes in Computer Science, pages 19–38. Springer-Verlag, 2004. [10] Steve Babbage. Space/time trade-off in exhaustive search attacks on stream ciphers. Eurocrypt Rump session, 1996. [11] Steve Babbage, Christophe De Cannière, Joseph Lano, Bart Preneel, and Joos Vandewalle. Cryptanalysis of SOBER-t32. In T. Johansson, editor, Fast Software 153 154 Bibliography Encryption, FSE 2003, number 2887 in Lecture Notes in Computer Science, pages 111–128. Springer-Verlag, 2003. [12] Steve Babbage and Matthew Dodd. The stream cipher MICKEY (version 1). eSTREAM, ECRYPT Stream Cipher Project, Report 2005/015, 2005. http: //www.ecrypt.eu.org/stream. [13] Steve Babbage and Joseph Lano. A probabilistic factor in the SOBER-t stream ciphers. In Proceedings of the 3rd NESSIE Workshop, page 12, 2002. [14] Magali Bardet, Jean-Charles Faugère, and Bruno Salvy. Complexity of Gröbner basis computation for semi-regular overdetermined sequences over f2 with solutions in f2 . INRIA Research Report 5049, 2003. http://www.inria.fr/rrrt/ rr-5049.html. [15] Lejla Batina, Sandeep Kumar, Joseph Lano, Kerstin Lemke, Nele Mentens, Christof Paar, Bart Preneel, Kazuo Sakiyama, and Ingrid Verbauwhede. Testing framework for eSTREAM profile ii candidates. In SASC 2006 - Stream Ciphers Revisited, pages 104–112. ECRYPT, 2006. [16] Lejla Batina, Joseph Lano, Nele Mentens, Siddika Berna Ors, Bart Preneel, and Ingrid Verbauwhede. Energy, performance, area versus security trade-offs for stream ciphers. In ECRYPT, editor, State of the Art of Stream Ciphers, pages 302–310, 2004. [17] Come Berbain, Olivier Billet, Anne Canteaut, Nicolas Courtois, Blandine Debraize, Henri Gilbert, Louis Goubin, Aline Gouget, Louis Granboulan, Cedric Lauradoux, Marine Minier, Thomas Pornin, and Herve Sibert. Decim - a new stream cipher for hardware applications. eSTREAM, ECRYPT Stream Cipher Project, Report 2005/004, 2005. http://www.ecrypt.eu.org/stream. [18] Come Berbain, Olivier Billet, Anne Canteaut, Nicolas Courtois, Henri Gilbert, Louis Goubin, Aline Gouget, Louis Granboulan, Cedric Lauradoux, Marine Minier, Thomas Pornin, and Herve Sibert. Sosemanuk, a fast software-oriented stream cipher. eSTREAM, ECRYPT Stream Cipher Project, Report 2005/027, 2005. http://www.ecrypt.eu.org/stream. [19] Daniel Bernstein. Salsa20. eSTREAM, ECRYPT Stream Cipher Project, Report 2005/025, 2005. http://www.ecrypt.eu.org/stream. [20] Daniel Bernstein. Understanding brute force. eSTREAM, ECRYPT Stream Cipher Project, Report 2005/036, 2005. http://www.ecrypt.eu.org/stream. [21] Eli Biham. How to decrypt or even substitute DES-encrypted messages in 22 8 steps. Information Processing Letters, 84(3):117–124, 2002. [22] Eli Biham and Adi Shamir. Differential cryptanalysis of DES-like cryptosystems. Journal of Cryptology, 4(1):3–72, 1991. [23] Alex Biryukov and Christophe De Cannière. Block ciphers and systems of quadratic equations. In T. Johansson, editor, Fast Software Encryption, FSE 2003, number 2887 in Lecture Notes in Computer Science, pages 274–289. Springer-Verlag, 2003. Bibliography 155 [24] Alex Biryukov, Christophe De Cannière, Joseph Lano, Siddika Berna Ors, and Bart Preneel. Security and performance analysis of ARIA. COSIC Report, 2004. [25] Alex Biryukov, Joseph Lano, and Bart Preneel. Cryptanalysis of the alleged SecurID hash function. In M. Matsui and R. Zuccherato, editors, Selected Areas in Cryptography, SAC 2003, number 3006 in Lecture Notes in Computer Science, pages 130–144. Springer-Verlag, 2003. [26] Alex Biryukov, Joseph Lano, and Bart Preneel. Recent attacks on alleged SecurID and their practical implications. Computers and Security, 24(5):364–370, 2005. [27] Alex Biryukov, Sourav Mukhopadhyay, and Palash Sarkar. Improved timememory trade-offs with multiple data. In B. Preneel and S. Tavares, editors, Selected Areas in Cryptography, SAC 2005, number 3897 in Lecture Notes in Computer Science, pages 110–127. Springer-Verlag, 2005. [28] Alex Biryukov and Adi Shamir. Cryptanalytic time/memory/data tradeoffs for stream ciphers. In T. Okamoto, editor, Advances in Cryptology - ASIACRYPT 2000, number 1976 in Lecture Notes in Computer Science, pages 1–13. SpringerVerlag, 2000. [29] Alex Biryukov and David Wagner. Slide attacks. In L. Knudsen, editor, Fast Software Encryption, FSE 1999, number 1636 in Lecture Notes in Computer Science, pages 245–259. Springer-Verlag, 1999. [30] Yuri Borissov, Svetla Nikova, Bart Preneel, and Joos Vandewalle. On a resynchronization weakness in a class of combiners with memory. In T. Helleseth, editor, Security in Communication Networks, SCN 2002, number 2576 in Lecture Notes in Computer Science, pages 165–177. Springer-Verlag, 2002. [31] An Braeken. Cryptographic Properties of Boolean Functions and S-Boxes. PhD thesis, Katholieke Universiteit Leuven, March 2006. [32] An Braeken, Yuri Borissov, Svetla Nikova, and Bart Preneel. Classification of Boolean functions of 6 variables or less with respect to cryptographic properties. In M. Yung, G.F. Italiano, and C. Palamidessi, editors, International Colloquium on Automata, Languages and Programming ICALP 2005, volume 3580 of Lecture Notes in Computer Science, pages 324–334. Springer-Verlag, 2005. [33] An Braeken and Joseph Lano. On the (im)possibility of practical and secure nonlinear filters and combiners. In B. Preneel and S. Tavares, editors, Selected Areas in Cryptography, SAC 2005, number 3897 in Lecture Notes in Computer Science, pages 159–174. Springer-Verlag, 2005. [34] An Braeken and Joseph Lano. On the security of nonlinear LFSR-based filters and combiners. Academy Contact Forum on Coding Theory and Cryptography, 2005. [35] An Braeken, Joseph Lano, Nele Mentens, Bart Preneel, and Ingrid Verbauwhede. SFINKS: A synchronous stream cipher for restricted hardware environments. In Symmetric Key Encryption Workshop, page 19, 2005. 156 Bibliography [36] An Braeken, Joseph Lano, and Bart Preneel. Evaluating the resistance of stream ciphers with linear feedback against fast algebraic attacks. In L. Batten and R. Safavi-Naini, editors, Australasian Conference on Information Security and Privacy, ACISP 2006, number 4058 in Lecture Notes in Computer Science, pages 40–51. Springer-Verlag, 2006. [37] Bruno Buchberger. Ein Algorithmus zum Auffinden der Basiselemente des Restklassenringes nach einem Nulldimensionalen Polynomideal. PhD thesis, Universitat Innsbruck, 1965. [38] Anne Canteaut and Michael Trabbia. Improved fast correlation attacks using parity-check equations of weight 4 and 5. In B. Preneel, editor, Advances in Cryptology - EUROCRYPT 2000, number 1807 in Lecture Notes in Computer Science, pages 573–588. Springer-Verlag, 2000. [39] Anne Canteaut and Marion Videau. Symmetric Boolean functions. IEEE Transactions on Information Theory, IT-51(8):2791–2811, 2005. [40] Claude Carlet and Philippe Gaborit. On the construction of balanced Boolean functions with a good algebraic immunity. Proceedings of First Workshop on Boolean Functions : Cryptography and Applications, Mars 2005, Rouen, 2005. [41] Claude Carlet and Palash Sarkar. Spectral domain analysis of correlation immune and resilient Boolean functions. Finite fields and Applications, 8:120–130, 2002. [42] Mohammadi Chambolbol. Cryptanalysis of word-oriented stream ciphers (in persian). Master’s thesis, Sharif University of Technology, 2004. [43] Joo Yeon Cho and Josef Pieprzyk. Linear distinguishing attack on NLS. eSTREAM, ECRYPT Stream Cipher Project, Report 2006/018, 2006. http: //www.ecrypt.eu.org/stream. [44] Philippe Chose, Antoine Joux, and Michel Mitton:. Fast correlation attacks: An algorithmic point of view. In L. Knudsen, editor, Advances in Cryptology EUROCRYPT 2002, number 2332 in Lecture Notes in Computer Science, pages 209–221. Springer-Verlag, 2002. [45] Carlos Cid and Gaetan Leurent. An analysis of the XSL algorithm. In B. Roy, editor, Advances in Cryptology - ASIACRYPT 2004, number 3788 in Lecture Notes in Computer Science, pages 333–352. Springer-Verlag, 2005. [46] Scott Contini. The effect of a single vanishing differential on ASHF. sci.crypt post, 2003. [47] Scott Contini and Yiqun Lisa Yin. Fast software-based attacks on SecurID. In B. Roy, editor, Fast Software Encryption, FSE 2004, number 3017 in Lecture Notes in Computer Science, pages 454–471. Springer-Verlag, 2004. [48] Don Coppersmith, Shai Halevi, and Charanjit Jutla. Cryptanalysis of stream ciphers with linear masking. In M. Yung, editor, Advances in Cryptology CRYPTO 2002, number 2442 in Lecture Notes in Computer Science, pages 515– 532. Springer-Verlag, 2002. Bibliography 157 [49] Don Coppersmith, Hugo Krawczyk, and Yishay Mansour. The shrinking generator. In D. Stinson, editor, Advances in Cryptology - CRYPTO 1993, number 773 in Lecture Notes in Computer Science, pages 1–11. Springer-Verlag, 1994. [50] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. In Annual ACM Symposium on Theory of Computing, pages 1–6. ACM, 1987. [51] Nicolas Courtois. Fast algebraic attacks on stream ciphers with linear feedback. In D. Boneh, editor, Advances in Cryptology - CRYPTO 2003, number 2729 in Lecture Notes in Computer Science, pages 176–194. Springer-Verlag, 2003. [52] Nicolas Courtois. Algebraic attacks on combiners with memory and several outputs. In C. Park and S. Chee, editors, Information Security and Cryptology ICISC 2004, number 3506 in Lecture Notes in Computer Science, pages 3–20. Springer-Verlag, 2004. [53] Nicolas Courtois. Cryptanalysis of SFINKS. In M. Rhee and B. Lee, editors, Information Security and Cryptology - ICISC 2005, number 3935 in Lecture Notes in Computer Science. Springer-Verlag, to appear. [54] Nicolas Courtois, Alexander Klimov, Jacques Patarin, and Adi Shamir. Efficient algorithms for solving overdefined systems of multivariate polynomial equations. In B. Preneel, editor, Advances in Cryptology - EUROCRYPT 2000, number 1807 in Lecture Notes in Computer Science, pages 392–407. Springer-Verlag, 2000. [55] Nicolas Courtois and Willi Meier. Algebraic attacks on stream ciphers with linear feedback. In E. Biham, editor, Advances in Cryptology - EUROCRYPT 2003, number 2656 in Lecture Notes in Computer Science, pages 345–359. SpringerVerlag, 2003. [56] Nicolas Courtois and Josef Pieprzyk. Cryptanalysis of block ciphers with overdefined systems of equations. In Y. Zheng, editor, Advances in Cryptology - ASIACRYPT 2002, volume 2501 of Lecture Notes in Computer Science, pages 267– 287. Springer-Verlag, 2002. [57] Joan Daemen. Cipher and Hash Function Design. Strategies Based on Linear and Differential Cryptanalysis. PhD thesis, Katholieke Universiteit Leuven, March 1995. [58] Joan Daemen, Rene Govaerts, and Joos Vandewalle. Resynchronization weaknesses in synchronous stream ciphers. In T. Helleseth, editor, Advances in Cryptology - EUROCRYPT 1993, number 765 in Lecture Notes in Computer Science, pages 159–167. Springer-Verlag, 1993. [59] Joan Daemen, Joseph Lano, and Bart Preneel. Chosen ciphertext attack on SSS. In SASC 2006 - Stream Ciphers Revisited, pages 45–51. ECRYPT, 2006. [60] Deepak Dalai, Kishan Gupta, and Subhamoy Maitra. Results on algebraic immunity for cryptographically significant Boolean functions. In A. Canteaut and K. Viswanathan, editors, Progress in Cryptology - INDOCRYPT 2004, volume 3348 of Lecture Notes in Computer Science, pages 92–106. Springer-Verlag, 2004. 158 Bibliography [61] Deepak Dalai, Kishan Gupta, and Subhamoy Maitra. Cryptographically significant Boolean functions: Construction and analysis in terms of algebraic immunity. In H. Gilbert and H. Handschuh, editors, Fast Software Encryption, FSE 2005, number 2887 in Lecture Notes in Computer Science. Springer-Verlag, 2005. [62] Christophe De Cannière, Joseph Lano, and Bart Preneel. Comments on the rediscovery of time memory data tradeoffs. eSTREAM, ECRYPT Stream Cipher Project, Report 2005/040, 2005. http://ecrypt.eu.org/stream. [63] Christophe De Cannière, Joseph Lano, and Bart Preneel. Cryptanalysis of the two-dimensional circulation encryption algorithm (TDCEA). EURASIP Journal on Applied Signal Processing, 2005(12):1923–1927, 2005. [64] Christophe De Cannière, Joseph Lano, Bart Preneel, and Joos Vandewalle. Distinguishing attacks on SOBER-t32. In Proceedings of the 3rd NESSIE Workshop, page 14, 2002. [65] Christophe De Cannière and Bart Preneel. Trivium - a stream cipher construction inspired by block cipher design principles. eSTREAM, ECRYPT Stream Cipher Project, Report 2005/030, 2005. http://www.ecrypt.eu.org/stream. [66] Dorothy Denning. Cryptography and Data Security. Addison-Wesley Longman, 1982. ISBN 3-540-54973-0, 0-387-54973-0. [67] Claus Diem. The XL-algorithm and a conjecture from commutative algebra. In P. Lee, editor, Advances in Cryptology - ASIACRYPT 2004, number 3329 in Lecture Notes in Computer Science, pages 323–337. Springer-Verlag, 2004. [68] Cunsheng Ding, Guozhen Xiao, and Weijuan Shan. Stability Theory of Stream Ciphers. Number 561 in Lecture Notes in Computer Science. Springer-Verlag, 1991. ISBN 3-540-54973-0, 0-387-54973-0. [69] Hans Dobbertin. One-to-one highly nonlinear power functions on GF(2n ). Applicable Algebra in Engineering, Communication, and Computation, 9:139–152, 1998. [70] Hans Dobbertin. Almost perfect nonlinear power functions on GF(2n ): The Niho case. Information and Computation, 151(1-2):57–72, 1999. [71] Hans Dobbertin, Tor Helleseth, Vijay Kumar, and Halvard Martinsen. Ternary m-sequences with three-valued crosscorrelation: New decimations of Welch and Niho type. IEEE Transactions on Information Theory, IT-47:1473–1481, 2001. [72] ECRYPT. ECRYPT yearly report on algorithms and keysizes, revision 1.1, 2005. http://www.ecrypt.eu.org/. [73] ECRYPT. eSTREAM, the ECRYPT Stream Cipher Project, 2005-2008. http: //www.ecrypt.eu.org/stream. [74] Patrik Ekdahl and Thomas Johansson. Distinguishing attacks on SOBER-t16 and t32. In J. Daemen and V. Rijmen, editors, Fast Software Encryption, FSE 2002, number 2365 in Lecture Notes in Computer Science, pages 210–224. SpringerVerlag, 2002. Bibliography 159 [75] Hakan Englund and Thomas Johansson. A new simple technique to attack filter generators and related ciphers. In H. Handschuh and A. Hasan, editors, Selected Areas in Cryptography, SAC 2004, number 3357 in LNCS, pages 39–53. Springer, 2004. [76] Jean-Charles Faugère. A new efficient algorithm for computing Gröbner bases (F4 ). Journal of Pure and Applied Algebra, 139:61–88, 1999. [77] Jean-Charles Faugère. A new efficient algorithm for computing Gröbner bases without reduction to zero (F5 ). In International Symposium on Symbolic and Algebraic Computation — ISSAC 2002, pages 75–83. ACM Press, 2002. [78] Scott Fluhrer. Improved key recovery of level 1 of the Bluetooth encryption system. Cryptology ePrint Archive, Report 2002/068, 2002. http://eprint. iacr.org/. [79] Caroline Fontaine. Contribution à la Recherche de Fonctions Booléennes Hautement Non Linéaires, et au Marquage d’Images en Vue de la Protection des Droits d’Auteur. PhD thesis, Paris University, 1998. [80] David Freedman, Robert Pisani, and Roger Purves. Statistics, Third Edition. Norton, 1997. ISBN 0-393-97083-3. [81] Joanne Fuller and William Millan. Linear redundancy in S-boxes. In T. Johansson, editor, Fast Software Encryption, FSE 2003, number 2887 in Lecture Notes in Computer Science, pages 74–86. Springer-Verlag, 2003. [82] Jovan Golic. Correlation via linear sequential circuit approximation of combiners with memory. In R. Rueppel, editor, Advances in Cryptology - EUROCRYPT 1992, number 658 in Lecture Notes in Computer Science, pages 113–123. SpringerVerlag, 1992. [83] Jovan Golic. Linear cryptanalysis of stream ciphers. In B. Preneel, editor, Fast Software Encryption, FSE 1994, number 1008 in Lecture Notes in Computer Science, pages 154–169. Springer-Verlag, 1994. [84] Jovan Golic. Computation of low-weight parity-check polynomials. Electronics Letters, 32(21):1981–1982, 1996. [85] Jovan Golic. On the security of nonlinear filter generators. In D. Gollman, editor, Fast Software Encryption, FSE 1996, number 1039 in Lecture Notes in Computer Science, pages 173–188. Springer-Verlag, 1996. [86] Jovan Golic. Cryptanalysis of alleged A5 stream cipher. In W. Fumy, editor, Advances in Cryptology - EUROCRYPT 1997, number 1233 in Lecture Notes in Computer Science, pages 239–255. Springer-Verlag, 1997. [87] Jovan Golic, Andrew Clark, and Ed Dawson. Inversion attack and branching. In J. Pieprzyk, R. Safavi-Naini, and J. Seberry, editors, Australasian Conference on Information Security and Privacy, ACISP 1999, number 1587 in Lecture Notes in Computer Science, pages 88–102. Springer-Verlag, 1999. [88] Jovan Golic, Andrew Clark, and Ed Dawson. Generalized inversion attack on nonlinear filter generators. IEEE Transactions on Computers, 49(10):1100–1109, 2000. 160 Bibliography [89] Jovan Golic and Guglielmo Morgari. On the resynchronization attack. In T. Johansson, editor, Fast Software Encryption, FSE 2003, number 2887 in Lecture Notes in Computer Science, pages 100–110. Springer-Verlag, 2004. [90] Solomon Golomb. Shift Register Sequences. Aegean Park Press, 1981. ISBN 0-894-12048-4. [91] Guang Gong and Yassir Nawaz. The WG stream cipher. eSTREAM, ECRYPT Stream Cipher Project, Report 2005/033, 2005. http://www.ecrypt.eu.org/ stream. [92] Kishan Gupta and Subhamoy Maitra. Multiples of primitive polynomials over GF (2). In C. Rangan and C. Ding, editors, Progress in Cryptology - INDOCRYPT 2001, number 2247 in Lecture Notes in Computer Science, pages 62–72. SpringerVerlag, 2001. [93] Philip Hawkes and Greg Rose. Primitive specification and supporting documentation for SOBER-t32 submission to nessie. In Proceedings of the 1st NESSIE Workshop, page 39, 2000. [94] Philip Hawkes and Gregory Rose. Exploiting multiples of the connection polynomial in word-oriented stream ciphers. In T. Okamoto, editor, Advances in Cryptology - ASIACRYPT 2000, number 1976 in Lecture Notes in Computer Science, pages 303–316. Springer-Verlag, 2000. [95] Philip Hawkes and Gregory Rose. Guess-and-determine attacks on SNOW. In K. Nyberg and H. Heys, editors, Selected Areas in Cryptography, SAC 2002, number 2595 in Lecture Notes in Computer Science, pages 37–46. Springer-Verlag, 2002. [96] Philip Hawkes and Gregory Rose. Rewriting variables: The complexity of fast algebraic attacks on stream ciphers. In M. Franklin, editor, Advances in Cryptology - CRYPTO 2004, number 3152 in Lecture Notes in Computer Science, pages 390–406. Springer-Verlag, 2004. [97] Martin Hell, Thomas Johansson, and Willi Meier. Grain - a stream cipher for constrained environments. eSTREAM, ECRYPT Stream Cipher Project, Report 2005/010, 2005. http://www.ecrypt.eu.org/stream. [98] Martin Hellman. A cryptanalytic time-memory trade-off. IEEE Transactions on Information Theory, IT-26(4):401–406, 1980. [99] Tore Herlestam. On functions of linear shift register sequences. In F. Pichler, editor, Advances in Cryptology - EUROCRYPT 1985, number 219 in Lecture Notes in Computer Science, pages 119–129. Springer-Verlag, 1985. [100] Jonathan Hoch and Adi Shamir. Fault analysis of stream ciphers. In M. Joye and J. Quisquater, editors, Cryptographic Hardware and Embedded Systems CHES 2004, number 3156 in Lecture Notes in Computer Science, pages 240–253. Springer-Verlag, 2004. [101] Jin Hong and Palash Sarkar. New applications of time memory data tradeoffs. In B. Roy, editor, Advances in Cryptology - ASIACRYPT 2005, number 3788 in Lecture Notes in Computer Science, pages 353–372. Springer-Verlag, 2005. Bibliography 161 [102] Authentication Internet Security, Applications and Berkeley Cryptography, University of California. GSM cloning, 1998. http://www.isaac.cs.berkeley.edu/ isaac/gsm-faq.html. [103] Cees Jansen and Alexander Kolosha. Cascade jump controlled sequence generator (CJCSG)(Pomaranch). eSTREAM, ECRYPT Stream Cipher Project, Report 2005/022, 2005. http://www.ecrypt.eu.org/stream. [104] Antoine Joux and Frederic Muller. A chosen IV attack against Turing. In M. Matsui and R. Zuccherato, editors, Selected Areas in Cryptography, SAC 2003, number 3006 in LNCS, pages 194–207. Springer, 2003. [105] Antoine Joux and Frederic Muller. Chosen-ciphertext attacks against MOSQUITO. In M. Robshaw, editor, Fast Software Encryption, FSE 2006, Lecture Notes in Computer Science. Springer-Verlag, 2006. [106] Tadao Kasami. The weight enumerators for several classes of subcodes of the second order binary Reed-Muller codes. Information and Control, 18:369–394, 1971. [107] Alexander Klimov and Adi Shamir. New cryptographic primitives based on multiword T-functions. In B. Roy and W. Meier, editors, Fast Software Encryption, FSE 2004, number 3017 in Lecture Notes in Computer Science, pages 1–14. Springer-Verlag, 2004. [108] L. Knudsen, W. Meier, B. Preneel, V. Rijmen, and S. Verdoolaege. Analysis methods for (alleged) RC4. In K. Ohta and D. Pei, editors, Advances in Cryptology - ASIACRYPT 1998, number 1514 in Lecture Notes in Computer Science, pages 327–341. Springer-Verlag, 1998. [109] Paul Kocher, Joshua Jaffe, and Benjamin Jun. Differential power analysis. In M. Wiener, editor, Advances in Cryptology - CRYPTO 1999, number 1666 in Lecture Notes in Computer Science, pages 388–397. Springer-Verlag, 1999. [110] Matthias Krause. BDD-based cryptanalysis of keystream generators. In L. Knudsen, editor, Advances in Cryptology - EUROCRYPT 2002, number 2332 in Lecture Notes in Computer Science, pages 222–237. Springer-Verlag, 2002. [111] Hugo Krawczyk. LFSR-based hashing and authentication. In Y. Desmedt, editor, Advances in Cryptology - CRYPTO 1994, number 839 in Lecture Notes in Computer Science, pages 129–139. Springer-Verlag, 1994. [112] Joseph Lano, Nele Mentens, Bart Preneel, and Ingrid Verbauwhede. Power analysis of synchronous stream ciphers with resynchronization mechanism. In ECRYPT, editor, State of the Art of Stream Ciphers, pages 327–333, 2004. [113] David Lay. Linear Algebra and Its Applications. Addison-Wesley, 2003. ISBN 0-201-70970-8. [114] Dong Hoon Lee, Jaeheon Kim, Jin Hong, Jae Woo Han, and Dukjae Moon. Algebraic attacks on summation generators. In B. Roy, editor, Fast Software Encryption, FSE 2004, number 3017 in Lecture Notes in Computer Science, pages 34–48. Springer-Verlag, 2004. 162 Bibliography [115] Rudolf Lidl and Harald Niederreiter. Introduction to Finite Fields and their Applications. Cambridge, 2000. ISBN 0-521-46094-8. [116] Mikhail Lobanov. Tight bound between nonlinearity and algebraic immunity. Cryptology ePrint Archive, Report 2005/441, 2005. http://eprint.iacr.org/. [117] James Massey and Jimmy Omura. Computational method and apparatus for finite field arithmetic. US Patent No. 4, 587,627, 1986. [118] Mitsuru Matsui. Linear cryptanalysis method for DES cipher. In T. Helleseth, editor, Advances in Cryptology - EUROCRYPT 1993, number 765 in Lecture Notes in Computer Science, pages 386–397. Springer-Verlag, 1993. [119] Mitsuru Matsui. The first experimental cryptanalysis of the data encryption standard. In Y. Desmedt, editor, Advances in Cryptology - CRYPTO 1994, number 839 in Lecture Notes in Computer Science, pages 1–11. Springer-Verlag, 1994. [120] Mitsuru Matsui. On correlation between the order of S-boxes and the strength of DES. In A. De Santis, editor, Advances in Cryptology - EUROCRYPT 1994, number 950 in Lecture Notes in Computer Science, pages 366–375. SpringerVerlag, 1994. [121] Vince McLellan. Re: Securid token emulator, 2001. uni-stuttgart.de/archive/bugtraq/2001/01/msg00090.html. http://cert. [122] W. Meier and O. Staffelbach. Fast correlation attacks on stream ciphers. In C. Günther, editor, Advances in Cryptology - EUROCRYPT 1988, number 330 in Lecture Notes in Computer Science, pages 301–314. Springer-Verlag, 1988. [123] Willi Meier, Enes Pasalic, and Claude Carlet. Algebraic attacks and decomposition of Boolean functions. In C. Cachin and J. Camenisch, editors, Advances in Cryptology - EUROCRYPT 2004, volume 3027 of Lecture Notes in Computer Science, pages 474–491. Springer-Verlag, 2004. [124] Willi Meier and Othmar Staffelbach. The self-shrinking generator. In A. De Santis, editor, Advances in Cryptology - EUROCRYPT 1994, volume 950 of Lecture Notes in Computer Science, pages 205–214. Springer-Verlag, 1994. [125] Alfred Menezes, Paul van Oorschot, and Scott Vanstone. Handbook of Applied Cryptography. CRC Press, 1997. ISBN 0-8493-8523-7. [126] Miodrag Mihaljevic and Hideki Imai. Cryptanalysis of Toyocrypt-HS1 stream cipher. IEICE Transactions on Fundamentals, E85-A:66–73, 2002. [127] Havard Molland and Tor Helleseth. An improved correlation attack against irregular clocked and filtered keystream generators. In M. Franklin, editor, Advances in Cryptology - CRYPTO 2004, number 3152 in Lecture Notes in Computer Science, pages 373–389. Springer-Verlag, 2004. [128] Mudge and Kingpin. Initial cryptanalysis of the RSA SecurID algorithm, 2001. www.atstake.com/research/reports/acrobat/initial securid analysis.pdf. [129] Ronald Mullin, I. Onyszchuk, and Scott Vanstone. Optimal normal bases in GF(pn ). Discrete Applied Mathematics, 22:149–161, 1989. Bibliography 163 [130] National Institute of Standards and Technology. FIPS-46: Data Encryption Standard, January 1977. http://csrc.nist.gov/publications/fips. [131] National Institute of Standards and Technology. Advanced encryption standard. Federal Information Processing Standard (FIPS) 197, November 2001. http: //csrc.nist.gov/publications/fips. [132] National Institute of Standards and Technology. Recommendation for block cipher modes of operation - methods and techniques, 2001. NIST Special Publication 800-38A, http://csrc.nist.gov/CryptoToolkit/modes/. [133] NESSIE. New European Schemes for Signatures, Integrity, and Encryption, 20002003. http://www.cryptonessie.org. [134] Peng Ning and Yiqun Lisa Yin. Efficient software implementation for finite field multiplication in normal basis. In S. Qing, T. Okamoto, and J. Zhou, editors, Information and Communications Security, ICICS 2001, number 2229 in Lecture Notes in Computer Science, pages 177–188. Springer-Verlag, 2001. [135] Kaisa Nyberg. Differentially uniform mappings for cryptography. In T. Helleseth, editor, Eurocrypt 1993, volume 950 of Lecture Notes in Computer Science, pages 55–64. Springer-Verlag, 1993. [136] P. Oechslin. Making a faster cryptanalytic time-memory trade-off. In D. Boneh, editor, Advances in Cryptology - CRYPTO 2003, number 2729 in Lecture Notes in Computer Science, pages 617–630. Springer-Verlag, 2003. [137] Oystein Ore. On a special class of polynomials. Trans. Amer. Math.Soc., 35:559– 584, 1933. [138] Christopher Paar. Efficient VLSI Architectures for Bit-Parallel Computation in Galois Fields. Doctoral dissertation, Institute for Experimental Mathematics, University of Essen, Germany, 1994. [139] Marc-Antoine Parseval. Memoire sur les series et sur l’integration complete d’une equation aux differences partielle lineaires du second ordre, a coefficients constants, 1799. [140] Bart Preneel, Vincent Rijmen, and Antoon Bosselaers. Recent developments in the design of conventional cryptographic algorithms. In B. Preneel and V. Rijmen, editors, State of the Art in Applied Cryptography, number 1528 in Lecture Notes in Computer Science, pages 106–131. Springer-Verlag, 1998. [141] Bart Preneel and Paul van Oorschot. MDx-MAC and building fast MACs from hash functions. In D. Coppersmith, editor, Advances in Cryptology - CRYPTO 1995, number 963 in Lecture Notes in Computer Science, pages 1–14. SpringerVerlag, 1995. [142] Bart Preneel and Paul van Oorschot. On the security of two MAC algorithms. In U. Maurer, editor, Advances in Cryptology - EUROCRYPT 1996, number 1070 in Lecture Notes in Computer Science, pages 19–32. Springer-Verlag, 1996. [143] Arash Reyhani-Masoleh and Anwar Hasan. Fast normal basis multiplication using general purpose processors. IEEE Transaction on Computers, 52(3):1379–1390, 2003. 164 Bibliography [144] Terry Ritter. Measuring boolean function nonlinearity by Walsh transform, 1998. http://www.ciphersbyritter.com/ARTS/MEASNONL.HTM. [145] Greg Rose and Philip Hawkes. On the applicability of distinguishing attacks against stream ciphers. In Proceedings of the 3rd NESSIE Workshop, page 6, 2002. [146] Oscar Rothaus. On bent functions. Journal of Combinatorial Theory (A), 20:300– 305, 1976. [147] RSA. RSA security web site, 2003. release.asp?doc id=1543&id=1034. http://www.rsasecurity.com/press [148] Rainer Rueppel. Correlation immunity and the summation generator. In H. Williams, editor, Advances in Cryptology - CRYPTO 1985, number 218 in Lecture Notes in Computer Science, pages 260–272. Springer-Verlag, 1985. [149] Rainer Rueppel. Stream ciphers. In G. Simmons, editor, Contemporary Cryptology. The Science of Information Integrity, pages 65–134. IEEE Press, 1991. [150] Markku-Juhani Saarinen. Re: Bluetooth und E0. sci.crypt post, 2000. http://groups.google.be/groups?hl=nl&lr=&frame=right&th= f1962bbd829be9d8&seekm=pgpmoose.200002041140.28114\%40servo#link2. [151] Markku-Juhani Saarinen. A software implementation of the Bluetooth encryption algorithm E0, 2000. http://web.archive.org/web/20041016145100/http: //www.cc.jyu.fi/∼mjos/e0.c. [152] Marcus Schafheutle. First report on the stream ciphers SOBER-t16 and SOBERt32. NESSIE document NES/DOC/SAG/WP3/025/02, 2001. http://www. cryptonessie.org. [153] Bruce Schneier. Applied Cryptography - Protocols, Algorithms, and Source code in C. John Wiley & Sons, Inc., 2nd edition, 1996. ISBN 0-471-12845-7 or 0-47111709-9. [154] Claude Shannon. Communication theory of secrecy systems. Bell System Technical Journal, 28-4:656–715, 1949. [155] Thomas Siegenthaler. Correlation-immunity of nonlinear combining functions for cryptographic applications. IEEE Transactions on Information Theory, IT30(5):776–780, 1984. [156] Thomas Siegenthaler. Decrypting a class of stream ciphers using ciphertext only. IEEE Transactions on Computers, C-34(1):81–85, January 1985. [157] Bluetooth S.I.G. Specification of the Bluetooth system, v. 1.2, 2003. http: //www.bluetooth.org/spec. [158] Volker Strassen. Gaussian elimination is not optimal. Numerische Mathematik, 13:354–356, 1969. [159] Gilbert Vernam. Cipher printing telegraph systems for secret wire and radio telegraphic communications. Journal American Institute of Electrical Engineers, 55:109–115, 1926. Bibliography 165 [160] Dai Watanabe, Alex Biryukov, and Christophe De Cannière. A distinguishing attack of SNOW 2.0 with linear masking method. In M. Matsui and R. Zuccherato, editors, Selected Areas in Cryptography, SAC 2003, number 3006 in Lecture Notes in Computer Science, pages 222–233. Springer-Verlag, 2003. [161] Doug Whiting, Bruce Schneier, Stefan Lucks, and Frederic Muller. Phelix - fast encryption and authentication in a single cryptographic primitive. eSTREAM, ECRYPT Stream Cipher Project, Report 2005/020, 2005. http://www.ecrypt. eu.org/stream. [162] I.C. Wiener. Sample SecurID token emulator with token secret import. post to BugTraq, 2000. http://archives.neohapsis.com/archives/bugtraq/2000-12/ 0428.html. [163] Michael Wiener. The full cost of cryptanalytic attacks. Journal of Cryptology, 17(2):105–124, 2004. [164] Wikipedia. Golomb ruler, 2005. http://en.wikipedia.org/wiki/Golomb ruler. [165] Hirotaka Yoshida, Alex Biryukov, Christophe De Cannière, Joseph Lano, and Bart Preneel. Non-randomness of the full 4 and 5-pass HAVAL. In C. Blundo and S. Simato, editors, Security in Communication Networks, SCN 2004, number 3352 in Lecture Notes in Computer Science, pages 324–336. Springer-Verlag, 2004. [166] Kencheng Zeng, Chung-Huang Yang, and T. R. N. Rao. On the linear consistency test (LCT) in cryptanalysis with applications. In G. Brassard, editor, Advances in Cryptology - CRYPTO 1989, number 435 in Lecture Notes in Computer Science, pages 164–174. Springer-Verlag, 1989. [167] Yulian Zheng and Xian-Mo Zhang. On plateaued functions. IEEE Transactions on Information Theory, 47(3):1215–1223, 2001. Samenvatting : Cryptanalyse en Ontwerp van Synchrone Stroomcijfers Hoofdstuk 1: Inleiding De voorbije jaren waren we getuige van een enorme groei in de digitale opslag en overdracht van informatie. Dit werd versterkt door belangrijke doorbraken zoals het internet en de toename van draadloze communicatie. Deze nieuwe informatieen communicatietechnologieën hebben nood aan beveiliging. Cryptologie is de wetenschap die poogt om de digitale wereld te voorzien van informatiebeveiliging. Dit omvat heel verscheidene vereisten, onder meer de geheimhouding en de authenticiteit van de informatie. Cryptologie wordt meestal onderverdeeld in twee deelgebieden, cryptografie en cryptanalyse. Cryptografie bestudeert het ontwerp van algoritmen en protocollen voor informatiebeveiliging. Gezien het meestal onmogelijk is om bewijsbaar veilige systemen te ontwerpen, dient men de veiligheid te testen door aanvallen uit te voeren. De studie van deze aanvalsmodellen is het domein van de cryptanalyse. In deze thesis bestuderen we stroomcijfers, die de efficiënte geheimhouding van informatie mogelijk maken. De aandacht gaat voornamelijk uit naar synchrone stroomcijfers aangezien zij de meest interessante eigenschappen blijken te hebben. Deze thesis is als volgt opgebouwd: • Hoofdstuk 2 bespreekt enkele wiskundige concepten die in deze thesis zullen worden gebruikt. Dit omvast lineaire doorschuifregisters, Booleaanse functies, S-boxen, blokcijfers, stroomcijfers en cryptanalytische methodieken. • Hoofdstuk 3 stelt resultaten voor met betrekking tot de lineaire cryptanalyse van synchrone stroomcijfers. Dit omvat een algemene beschrijving van 167 de techniek, toepassing op de niet-lineaire filter- en combinatiegeneratoren, en een aanval op het stroomcijfer Sober-t32. • Hoofstuk 4 bestudeert algebraı̈sche aanvallen. Er wordt een overzicht van de aanvallen gegeven, en nieuwe algoritmen en een theoretisch raamwerk om snelle algebraı̈sche aanvallen te weerstaan worden voorgesteld. • Hoofdstuk 5 bespreekt structurele aanvallen. De impact van trade-off aanvallen wordt bestudeerd, en enkele andere aanvalsmodellen worden behandeld. Er wordt ook een Gok-en-Determineer aanval op Sober-t32 beschreven. • Hoofdstuk 6 stelt de resynchronizatieaanval voor. Het oorspronkelijke model van de aanval wordt uitgebreid en verfijnd, en de aanval wordt gecombineerd met verscheidene aanvalsmodellen uit de voorgaande hoofdstukken. • Hoofdstuk 7 geeft een overzicht van implementatieaspecten. Zowel de efficiëntie als de veiligheid van de implementatie komen aan bod. • Hoofdstuk 8 bespreekt de toepassing van de cryptanalytische methodieken op een bestaand algoritme, met name de Alleged SecurID Hash Function. • Hoofstuk 9 vat de resultaten van deze thesis samen, en omvat conclusies en open problemen die interessant zijn voor verder onderzoek. Hoofdstuk 2: Basisbegrippen Dit hoofdstuk beschrijft de basisbegrippen die nodig zijn om de thesis te begrijpen. Ten eerste beschrijven we de bouwblokken voor stroomcijfers. Dit omvat een beschrijving van lineaire doorschuifregisters (LFSRs), een bouwblok dat efficiënt getalreeksen met goede statistische eigenschappen en een lange periode kan produceren. De voornaamste representaties en eigenschappen van Booleaanse functies (afbeeldingen van ϕ naar 1 bits) en van S-Boxen (afbeeldingen van n naar m bits) worden behandeld. Dit omvat essentiële cryptografische eigenschappen zoals graad, niet-lineariteit, resiliëntie en algebraı̈sche immuniteit. Vervolgens worden symmetrische algoritmen besproken. Dit omvat blokcijfers, stroomcijfers en MAC algoritmen. Uiteraard gaat de meeste aandacht uit naar de synchrone stroomcijfers, die het hoofdonderwerp van deze thesis vormen. De definities en voordelen van synchrone stroomcijfers worden voorgesteld. Tenslotte worden de drie meest gebruikte cryptanalytische technieke geı̈ntroduceerd. Dit zijn differentiële, lineaire en algebraı̈sche cryptanalyse. Deze drie methodes zullen de basis vormen voor het cryptanalytisch raamwerk voor stroomcijfers dat zal worden voorgesteld in de volgende vier hoofdstukken. Hoofdstuk 3: Lineaire Aanvallen Dit hoofstuk behandelt lineaire cryptanalyse. Deze techniek bestaan erin om alle niet-lineaire bouwblokken van het algoritme lineair te benaderen, en dit te gebruiken om het algoritme aan te vallen. Deze techniek werd vooral bekend door de toepassing op blokcijfers door Matsui, maar is ook een heel krachtige methode van cryptanalyse voor andere algoritmen. Voor stroomcijfers gebruiken we het lineaire model om een onderscheidingsaanval uit te voeren of de sleutel van het stroomcijfer te vinden. Het hoofdstuk begint met de voorstelling van een algemeen model voor lineaire cryptanalyse van stroomcijfers. Vervolgens wordt dit model toegepast op twee veel gebruikte synchrone stroomcijfers, met name de niet-lineaire filter- en combinatiegeneratoren. De impact van lineaire cryptanalyse op deze twee types synchrone stroomcijfers wordt geanalyseerd, en we argumenteren waarom de filtergenerator beter geschikt is om de aanval te weerstaan dan de combinatiegenerator. Een tweede toepassing van lineaire cryptanalyse in dit hoofdstuk is een onderscheidingsaanval op de NESSIE kandidaat Sober-t32. Er wordt getoond hoe de aanvallen van Ekdahl en Johansson kunnen worden uitgebreid tot een aanval op Sober-t32 inclusief de stotteraar die deel uitmaakt van het algoritme. Hoofdstuk 4: Algebraı̈sche Aanvallen Algebraı̈sche cryptanalyse is een methode die recentelijk werd (her)ontdekt, en die voornamelijk op de veiligheid van synchrone stroomcijfers met lineair variërende toestand een belangrijke impact heeft. Het basisidee van de aanval is heel eenvoudig. Elke bit van de sleutelstroom kan worden geschreven als een niet-lineaire functie van de begintoestand. De begintoestand kan vervolgens worden berekend door dit stelsel op te lossen. Het oplossen van dit stelsel is vaak heel moeilijk, maar algebraı̈sche aanvallen slagen erin dit stelsel om te vormen tot een eenvoudiger probleem. Bij algebraı̈sche aanvallen wordt dit gedaan door een annihilator van de Booleaanse uitgangsfunctie te zoeken, bij snelle algebraı̈sche aanvallen wordt dit bereikt door te zoeken naar zogeheten (dg , dh )−relaties. In beide gevallen zijn het dus de eigenschappen van de Booleaanse uitgangsfunctie die de weerstand van het stroomcijfer tegen algebraı̈sche aanvallen bepalen. Het is dus van groot belang om deze eigenschappen te kunnen analyseren. Hiertoe stellen we een nieuw algoritme voor dat op efficiënte wijze het bestaan van (dg , dh )−relaties verifieert. Dit wordt gedaan door het bestaande algoritme voor het zoeken van annihilatoren vereenvoudigd te beschrijven en aan te passen voor het geval van (dg , dh )−relaties. 169 Naast het verifiëren van de eigenschappen van een specifieke functie in de praktijk, is het ook van belang om de theoretische verbanden tussen de eigenschappen gerelateerd met algebraı̈sche aanvallen en andere cryptografische eigenschappen te begrijpen. Daartoe bespreken we het verband tussen annihilatoren en (dg , dh )−relaties enerzijds en het aantal ingangen, de niet-lineariteit en de graad van de Booleaanse uitgangsfunctie anderzijds. Tenslotte illustreren we onze aanpak met de algebraı̈sche cryptanalyse van twee eSTREAM kandidaten, SFINKS en WG. We tonen de impact van zowel de praktische berekening als van de theoretische grenzen op de veiligheid van de algoritmen. Hoofdstuk 5: Structurele Aanvallen In dit hoofdstuk bespreken we een derde soort aanvallen op stroomcijfers. Deze maken gebruik van de grootte van de bouwblokken van het stroomcijfer en de manier waarop die bouwblokken geconnecteerd zijn. In een eerste deel van dit hoofstuk bespreken we drie soorten structurele aanvallen. Een eerste zijn de trade-off aanvallen. We bespreken het klassieke scenario voor trade-off aanvallen op een enkele sleutelstroom, en ook het scenario waarbij meerdere sleutels tegelijk worden aangevallen. De impact van de tradeoff aanvallen wordt geı̈llustreerd voor de eSTREAM primitieven, en er wordt besproken hoe de impact van de trade-off aanvallen kan worden gereduceerd. Een tweede klasse aanvallen die wordt besproken zijn de inversie-aanvallen, en een derde klasse zijn de zogeheten Gok-en-Determineer aanvallen. In het tweede deel van dit hoofdstuk wordt een voorbeeld van een Gok-enDetermineer aanval besproken. We stellen een Gok-en-Determineer aanval op de NESSIE kandidaat Sober-t32 voor. Hoofdstuk 6: Resynchronizatie-Aanvallen In dit hoofdstuk worden resynchronizatie-aanvallen op stroomcijfers besproken. Bij dit soort aanvallen wordt niet langer een enkele sleutelstroom gebruikt om de aanval uit te voeren, maar worden meerdere sleutelstromen, elk met dezelfde sleutel maar met een verschillende IV , gebruikt. Eerst wordt de oorspronkelijke aanval van Daemen et al. voorgesteld en wordt het raamwerk voor resynchronizatieaanvallen verduidelijkt. Vervolgens worden enkele tekortkomingen van de oorspronkelijke aanval weggenomen. Een eerste verbetering maakt het mogelijk om de aanval in reële tijd uit te voeren, wat actieve aanvallen mogelijk maakt. Een tweede verbetering stelt de aanvaller in staat om de aanval uit te voeren met een kleiner aantal resynchronizaties. Dit wordt bereikt door een computationele aanpak of door het gebruik van algebraı̈sche technieken. Een derde verbetering maakt het ook mogelijk om stroomcijfers die heel grote Booleaanse functies gebruiken aan te vallen. Hier maken we eveneens gebruik van methodes uit de algebraı̈sche en lineaire cryptanalyse. De uitbreiding van de aanval tot stroomcijfers met geheugen is een vierde verbetering, en ten vijfde tonen we ook hoe een resynchronizatie-aanval kan worden uitgevoerd op stroomcijfers met een niet-lineair resynchronizatiemechanisme. Ter illustratie passen we onze nieuwe methodiek toe op een aantal stroomcijfers, met name E0, Toyocrypt en de sommatie-generator. Hoofdstuk 7: Implementatie van Stroomcijfers In de vorige hoofdstukken werd de veiligheid van stroomcijfers louter theoretisch behandeld. In dit hoofdstuk wordt de praktische kant van stroomcijfers behandeld, met name de implementatie van het algoritme in een werkend geheel. In een eerste deel van het hoofdstuk behandelen we efficiënte implementatie van stroomcijfers in hardware. Eerst introduceren we de belangrijkste criteria waaraan we de implementatie dienen te toetsen, namelijk de oppervlakte, de performantie en het energie-verbruik. Vervolgens wordt het vinden van een efficiënte en veilige Booleaanse functie voor een filtergenerator behandeld. De symmetrische functies en de machtsfuncties worden behandeld, en hun voor- en nadelen worden besproken. In een tweede deel van het hoofdstuk wordt de veilige implementatie van stroomcijfers in hardware besproken. We introduceren differentiële vermogensanalyse en tonen aan dat de techniek ook kan worden toegepast op synchrone stroomcijfers met resynchronizatiemechanisme. Dit wordt geı̈llustreerd door toepassing op twee stroomcijfers, met name het GSM stroomcijfer A5/1 en het Bluetooth stroomcijfer E0. Hoofdstuk 8: Cryptanalyse van ASHF In dit hoofdstuk bespreken we de cryptanalyse van ASHF, Alleged SecurID Hash Function. SecurID is een algoritme dat in vele duizenden bedrijven wordt gebruikt om de gebruikers te authentiseren. De aanval die wordt voorgesteld is de eerste praktische aanval op dit systeem. Vooreerst wordt de beschrijving van ASHF gegeven, en worden de klassieke cryptanalytische technieken op ASHF uitgeprobeerd. Een eerste aanval maakt gebruik van botsingen in het algoritme en geeft ons een krachtige aanval op het onderliggende blokcijfer. Die aanval wordt vervolgens uitgebreid tot de volledi171 ge ASHF door de aanval te combineren met andere cryptanalytische technieken. Vervolgens wordt uiteengezet hoe de cryptanalyse kan gebruikt worden om aanvallen uit te voeren. We behandelen aanvallen die de geheime sleutel willen vinden en ook aanvallen die enkel netwerk-intrusie op het oog hebben. De conclusie luidt dat ASHF aan vervanging toe is. Hoofdstuk 9: Besluiten en Open Problemen In deze thesis werden de cryptanalyse en het ontwerp van synchrone stroomcijfers bestudeerd. Stroomcijfers zijn een veel bestudeerde en gebruikte klasse van encryptie-algoritmen; synchrone stroomcijfers bieden de beste combinatie van veiligheid en efficiëntie. De cryptanalyse van synchrone stroomcijfers werd benaderd door een grondige analyse van de wiskundige eigenschappen van de bouwblokken van het algoritme. Deze analyse dient ontwerpers ertoe aan te zetten om stroomcijfers te bouwen die gebruik maken van eenvoudige wiskundige bouwblokken, waarvan men eenvoudig de weerstand tegen cryptanalytische aanvallen kan nagaan. De eerste vier hoofdstukken van de thesis behandelen cryptanalytische aanvallen. • Lineaire cryptanalyse is een heel belangrijke methode bij de analyse van stroomcijfers. We hebben aangetoond op welke wijze veel aanvallen in het raamwerk van lineaire cryptanalyse passen, en we hebben de weerstand van de niet-lineaire filter en combinator geëvalueerd tegen lineaire cryptanalyse. • De recente algebraı̈sche en snelle algebraı̈sche aanvallen hebben een belangrijke impact gehad op de cryptanalyse van stroomcijfers met een lineair variërende toestand. Een coherent raamwerk voor deze aanvallen werd voorgesteld, en de eigenschappen waaraan de bouwblokken van het stroomcijfer dienen te voldoen om de aanval te weerstaan werden gekwantificeerd. Om de weerstand van stroomcijfers tegen snelle algebraı̈sche aanvallen te evalueren, werd een nieuw en efficiënt algoritme voorgesteld. • Tevens dienen veel structurele eigenschappen van het stroomcijfers in rekening te worden gebracht. Deze aanvalsmethodes werden beschreven en hun impact werd geëvalueerd, en dit met name voor trade-off aanvallen, inversie-aanvallen en Gok-en-Determineer aanvallen. • Resynchronizatie-aanvallen zijn een vierde belangrijke klasse aanvallen, die tot voor kort weinig bestudeerd werden. We hebben het raamwerk voor deze aanvallen grondig uitgebreid. Dit werd bereikt door het oorspronkelijke raamwerk te verfijnen en het te combineren met andere aanvalsmethodes. Dit maakt het mogelijk om te evalueren welke eigenschappen nodig zijn om de aanvallen te weerstaan. Aansluitend op de behandeling van deze cryptanalytische resultaten, werd vervolgens uiteengezet hoe synchrone stroomcijfers op efficiënte en veilige wijze kunnen worden geı̈mplementeerd. Ter illustratie werd de zoektocht naar efficiënte en veilige Booleaanse functies voor de filtergenerator behandeld. Tevens werd de problematiek van de veilige implementatie van stroomcijfers behandeld, waarbij differentiële vermogensanalyse op stroomcijfers geı̈ntroduceerd werd. Ter illustratie van onze cryptografische aanpak werden enkele praktische toepassingen van onze analyse behandeld. Dit omvat cryptanalyse van de NESSIE kandidaat Sober-t32, de eSTREAM kandidaten SFINKS en WG, het Bluetooth stroomcijfer E0, het GSM stroomcijfer A5/1 en de Alleged SecurID Hash Function. In het ontwerp van veilige en efficin̈te synchrone stroomcijfers zijn er nog steeds vele open problemen. In de jaren 90 was het onderzoek naar stroomcijfers veel minder intens dan het onderzoek naar blokcijfers. Gedurende de voorbije jaren is de interesse van de onderzoekswereld in stroomcijfers echter sterk toegenomen, onder impuls van de NESSIE en eSTREAM projecten. We mogen dus hopen dat veel open problemen in de komende jaren zullen worden opgelost, en dat er een eenmaking komt in het ontwerp van synchrone stroomcijfers. Enkele interessante open problemen zijn de volgende: • Er is een tekort aan goede Booleaanse functies voor de filtergenerator, die zowel efficiënt zijn als weerstand bieden tegen de cryptanalytische aanvallen die in deze thesis werden beschreven, in het bijzonder de algebraı̈sche en snelle algebraı̈sche aanvallen. Gezien er nu efficiënte algoritmen voorhanden zijn om de weerstand tegen deze aanvallen te evalueren, wordt het mogelijk om zulke functies te vinden. • De klassieke stroomcijferontwerpen genereren slecht één bit sleutelstroom per klok. Dit beperkt de efficiëntie van deze algoritmen. Een efficiënter ontwerp zou een (n, m) S-box gebruiken. Ook het vinden van een veilige en efficiënte S-box is een open probleem, hoewel de algoritmen om deze bouwblokken te analyzeren nu beschikbaar zijn. • Om de veiligheid van synchrone stroomcijfers volledig te analyseren, dient de combinatie van de wiskundige en de structurele aanvallen verder onderzocht te worden. Dit komt overeen met de studie van de zogeheten geaugmenteerde functie, die de interne toestand relateert met de sleutelstroom. Dit concept werd reeds geı̈ntroduceerd in 1994 door Anderson, maar kan ook gebruikt worden voor het verder bestuderen van lineaire en snelle algebraı̈sche aanvallen. • De studie van synchrone stroomcijfers met niet-lineair variërende toestand werd relatief weinig bestudeerd, en heel diverse strategieën bestaan hier173 voor. We denken dat de introductie van meer niet-lineariteit in de toestand belangrijke voordelen kan hebben voor de veiligheid van het stroomcijfer, en het is bijgevolg belangrijk om dergelijke stroomcijfers te bestuderen. Het eSTREAM project kan de aandacht vestigen op enkele interessante ontwerpen. Vooral de aanpak van Trivium en Grain lijkt ons interessant, maar verdere studie is absoluut noodzakelijk. • Nevenkanaalaanvallen op stroomcijfers hebben slechts weinig aandacht gekregen. We denken dat dit meer aandacht zal krijgen in de nabije toekomst, eens de cryptografische gemeenschap zich focust op enkele veelbelovende ontwerpen. Vermoedelijk zullen dan ook methoden om nevenkanaalaanvallen te weerstaan bestudeerd worden. List of Publications Lecture Notes in Computer Science 1. Steve Babbage, Christophe De Cannière, Joseph Lano, Bart Preneel, and Joos Vandewalle. Cryptanalysis of SOBER-t32. In T. Johansson, editor, Fast Software Encryption, FSE 2003, number 2887 in Lecture Notes in Computer Science, pages 111–128. Springer-Verlag, 2003. 2. Alex Biryukov, Joseph Lano, and Bart Preneel. Cryptanalysis of the alleged SecurID hash function. In M. Matsui and R. Zuccherato, editors, Selected Areas in Cryptography, SAC 2003, number 3006 in Lecture Notes in Computer Science, pages 130–144. Springer-Verlag, 2003. 3. Hirotaka Yoshida, Alex Biryukov, Christophe De Cannière, Joseph Lano, and Bart Preneel. Non-randomness of the full 4 and 5-pass HAVAL. In C. Blundo and S. Simato, editors, Security in Communication Networks, SCN 2004, number 3352 in Lecture Notes in Computer Science, pages 324– 336. Springer-Verlag, 2004. 4. Frederik Armknecht, Joseph Lano, and Bart Preneel. Extending the resynchronization attack. In H. Handschuh and A. Hasan, editors, Selected Areas in Cryptography, SAC 2004, number 3357 in Lecture Notes in Computer Science, pages 19–38. Springer-Verlag, 2004. 5. An Braeken and Joseph Lano. On the (im)possibility of practical and secure nonlinear filters and combiners. In B. Preneel and S. Tavares, editors, Selected Areas in Cryptography, SAC 2005, number 3897 in Lecture Notes in Computer Science, pages 159–174. Springer-Verlag, 2005. 6. An Braeken, Joseph Lano and Bart Preneel. Evaluating the resistance of stream ciphers with linear feedback against fast algebraic attacks. In L. Batten and R. Safavi-Naini, editors, Australasian Conference on Information Security and Privacy, ACISP 2006, number 4058 in Lecture Notes in Computer Science, pages 40–51. Springer-Verlag, 2006. 175 Other International Journals 1. Alex Biryukov, Joseph Lano, and Bart Preneel. Recent attacks on alleged SecurID and their practical implications. Computers and Security, 24(5):364–370, 2005. 2. Christophe De Cannière, Joseph Lano, and Bart Preneel. Cryptanalysis of the two-dimensional circulation encryption algorithm (TDCEA). EURASIP Journal on Applied Signal Processing, 2005(12):1923–1927, 2005. International Conferences/Workshops 1. Steve Babbage and Joseph Lano. A probabilistic factor in the SOBER-t stream ciphers. In Proceedings of the 3rd NESSIE Workshop, 12 pages, 2002. 2. Christophe De Cannière, Joseph Lano, Bart Preneel, and Joos Vandewalle. Distinguishing attacks on SOBER-t32. In Proceedings of the 3rd NESSIE Workshop, 14 pages, 2002. 3. Joseph Lano, Nele Mentens, Bart Preneel, and Ingrid Verbauwhede. Power analysis of synchronous stream ciphers with resynchronization mechanism. In State of the Art of Stream Ciphers, pages 327–333, 2004. 4. Lejla Batina, Joseph Lano, Nele Mentens, Siddika Berna Ors, Bart Preneel, and Ingrid Verbauwhede. Energy, performance, area versus security tradeoffs for stream ciphers. In State of the Art of Stream Ciphers, pages 302– 310, 2004. 5. An Braeken, Joseph Lano, Nele Mentens, Bart Preneel, and Ingrid Verbauwhede. SFINKS: A synchronous stream cipher for restricted hardware environments. In Symmetric Key Encryption Workshop, 19 pages, 2004. 6. Joan Daemen, Joseph Lano and Bart Preneel Chosen Ciphertext Attack on SSS. In SASC 2006 - Stream Ciphers Revisited, pages 45–51, 2006. 7. Lejla Batina, Sandeep Kumar, Joseph Lano, Kerstin Lemke, Nele Mentens, Christof Paar, Bart Preneel, Kazuo Sakiyama and Ingrid Verbauwhede. Testing Framework for eSTREAM Profile II Candidates. In SASC 2006 Stream Ciphers Revisited, pages 104–112, 2006. Miscellaneous 1. Alex Biryukov, Christophe De Cannière, Joseph Lano, Siddika Berna Ors, and Bart Preneel. Security and performance analysis of ARIA. COSIC Report, 2004. 2. Christophe De Cannière, Joseph Lano, and Bart Preneel. Comments on the rediscovery of time memory data tradeoffs. eSTREAM, ECRYPT Stream Cipher Project, Report 2005/040, 2005. http://ecrypt.eu.org/stream. 3. An Braeken and Joseph Lano. On the Security of Nonlinear LFSR-Based Filters and Combiners. Academy Contact Forum on Coding Theory and Cryptography, 2005. 177 Joseph Lano was born in Kortrijk (Belgium), on December 24th, 1979. He received the Master’s Degree of Electromechanical Engineering, option Electronics, from the Katholieke Universiteit Leuven in July 2002. He started working as a PhD researcher in the COSIC (COmputer Security and Industrial Cryptography) group of the Department of Electrical Engineering - ESAT of the Katholieke Universiteit Leuven in September 2002. His research is financed by a Ph.D. grant of the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT-Vlaanderen).