Database Security - smis
Luc Bouganim
SMIS project (Secured and Mobile Information Systems), INRIA Paris-Rocquencourt, France
Slides mainly from L. Bouganim, P. Pucheral, A. Canteaut

Outline
• Are we paranoid? (No!)
• Cryptographic tools and protocols
  – Basic primitives: encryption, cryptographic hash
  – Combining cryptographic primitives (correctly)
• Overview of database specific tools

Cryptography: the characters
• Alice (A): she wants to communicate with Bob.
• Bob (B): he wants to communicate with Alice.
• Marvin (M): Marvin is malicious.
• Trent (T): Trent is trusted.

Passive attacks
• Marvin has access to unauthorized data.
[Figure: Alice sends a message to Bob; Marvin listens]

Active attacks
• Marvin modifies the transmitted data.
[Figure: Alice sends a message to Bob; Marvin alters it]

Passive and active attacks
• Passive attacks threaten confidentiality.
• Active attacks threaten integrity:
  – Identity theft: Marvin sends a message to Bob using Alice's identity.
  – Data alteration: Marvin modifies the content of Alice's message.
  – Replay attack: Marvin captures a message and sends it to Bob several times.
  – Repudiation: Alice sends a message to Bob and denies having sent it.
  – Message forging: Marvin forges a fake message for Bob or Alice.
  – Destruction: Marvin selectively destroys some messages sent to Bob.
  – Delays / reordering: Marvin introduces communication delays or reorders messages.

Symmetric encryption
• Sender and receiver share a secret that allows symmetric encryption and decryption.
[Figure: Alice encrypts m into c with the shared secret; Bob decrypts c back into m; Marvin only sees c]

Kerckhoffs' principle
In 1883, Auguste Kerckhoffs published his most famous work, La Cryptographie Militaire (military cryptography).
This book set forth desiderata for encryption systems, including:
• The encryption system must not be required to be secret, and it must be able to fall into the hands of the enemy without inconvenience.
• The system must be practically, if not mathematically, indecipherable.

Symmetric encryption: secret key encryption
• All the details of the system, including the encryption and decryption functions, are known, except the key.
• Security is based only on the secrecy of the key K.
• Encryption: c = E_K(m). Decryption: m = D_K(c) = D_K(E_K(m)).
[Figure: Alice and Bob share the secret key K; Alice sends c = E_K(m); Bob computes m = D_K(c); Marvin only sees c]

Attacking a symmetric cipher
• The attacker knows the ciphertext c → find the plaintext m or, better, the key K.
• The attacker knows plaintext/ciphertext pairs (m, c) → find the key K, or at least be able to decrypt other messages.

Attacking a symmetric cipher (2)
• Example: the Caesar cipher (used for military purposes by Julius Caesar)
  – Plaintext: attackatonce
  – Ciphertext: exxegoexsrgi
• Very easy to attack! Given the small number of possible shifts (26), the key can be found in 13 operations on average.
• The problem here is the size of the key (less than 5 bits).

Attacking a simple alphabetic substitution
• Assume a random alphabetic substitution [substitution table not recoverable from the transcript].
• Can we easily retrieve the key (i.e., the table) knowing a ciphertext? (NB: the message is in French.)

(1) Frequency analysis: a single letter
[Figure: letter frequencies in the ciphertext compared with letter frequencies in French]

(2) Frequency analysis: bigrams
[Figure: bigram frequencies in the ciphertext compared with bigram frequencies in French]

(3) Using some common words
(4) ... and more common words
• Partially decrypted words suggest the remaining substitutions: "soulent" (L → V, giving "souvent"), "dommes" (D → H), "ehuipage" (H → Q), "aleatros" (E → B), "mompagnons" (M → C), etc.

Finally!
[Figure: the fully decrypted message]

Simple alphabetic substitution
The simple alphabetic substitution cipher is:
• Class of cipher: block cipher. The message is encrypted by blocks of fixed size (n bits).
• Mode of operation: ECB (Electronic Code Book). The message is split into n-bit blocks, each encrypted separately.
• Why was the attack so easy?
• Is the algorithm too weak? Is the encryption key too short? Is the encryption block too small? Other reasons?
• Let's check stronger algorithms.

Some symmetric block ciphers
• DES: Data Encryption Standard (1976-1997)
  – Encrypts 64-bit blocks
  – Encryption and decryption use the same algorithm
  – The encryption key is 56 bits
• 3DES
  – Used as a replacement for DES between 1997 and 2001
  – Uses three 56-bit keys
  – tripleDES(k1k2k3, M) = DES(k3, DES(k2, DES(k1, M)))
• AES (Rijndael)
  – Used since 2001, encryption standard since 2002
  – Authors: Joan Daemen and Vincent Rijmen (hence "Rijndael"), winners of the AES competition
  – 128-bit blocks with keys of 128, 192 or 256 bits
  – Fast and requires little memory

Attacks on DES, 3DES and AES
• DES
  – 1997: 39 days on 10,000 Pentium machines
  – 1998: Deep Crack breaks a DES key in 56 hours (US$250,000)
  – 2007: 6.4 days on a $10,000 parallel machine
• 3DES
  – The best known attack on 3-key 3DES requires around 2^32 known plaintexts, 2^113 steps, 2^90 single DES encryptions, and 2^88 memory!
  – 3DES is computationally secure (for now).
• AES
  – Only side-channel attacks have been successful on AES.
  – See http://www.cryptosystem.net/aes/ for more information.
  – AES is computationally secure (for now).
→ So, can I use 3DES or AES without problems?

Unconditional vs computational security
Unconditional security:
"the uncertainty in the plaintext, after observing the ciphertext, must be equal to the a priori uncertainty about the plaintext – observation of the ciphertext provides no information whatsoever to an adversary" [Menezes]
→ The only possible attack is exhaustive key search.
To reach unconditional security, the secret key must be as long as the plaintext!
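Exhaustive key search is only as expensive as the key space is large. As a minimal sketch, the search can be run against the Caesar example from the slides (plaintext "attackatonce", ciphertext "exxegoexsrgi"), assuming the attacker holds one known plaintext/ciphertext pair. The function names are illustrative and not part of the original material.

```python
import string

ALPHABET = string.ascii_lowercase  # the 26-letter alphabet; the key is a shift in 0..25


def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Shift every lowercase letter by `shift` positions, modulo 26."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + shift) % 26] for ch in plaintext)


def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Decryption is encryption with the opposite shift."""
    return caesar_encrypt(ciphertext, -shift)


def exhaustive_key_search(plaintext: str, ciphertext: str) -> int:
    """Known-plaintext attack: try the 26 possible keys until one maps m to c."""
    for shift in range(26):
        if caesar_encrypt(plaintext, shift) == ciphertext:
            return shift
    raise ValueError("no Caesar shift maps this plaintext to this ciphertext")


key = exhaustive_key_search("attackatonce", "exxegoexsrgi")
recovered = caesar_decrypt("exxegoexsrgi", key)
print(key, recovered)
```

With at most 26 candidate keys the loop ends almost immediately. The same loop over a 56-bit DES key space would need up to 2^56 trials, which is why key size matters.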
Reference for the key-length requirement: [Shannon 49].

Computational security:
"A proposed technique is said to be computationally secure if the perceived level of computation required to defeat it (using the best attack known) exceeds, by a comfortable margin, the computational resources of the hypothesized adversary." [Menezes]

Triple DES encryption with ECB mode
• Can I use 3DES or AES without problems? NO!
[Figure: data encrypted with Triple DES in ECB mode]

Encryption mode: Electronic Code Book (ECB)
• The plaintext is cut into blocks P1, P2, ... (bytes 0-8, 8-16, ...).
• Each block is encrypted separately: C1 = E_K(P1), C2 = E_K(P2), ...

Triple DES encryption with CBC mode
• Can I use 3DES or AES without problems? Yes, if I take care!
[Figure: the same data encrypted with Triple DES in CBC mode]

Cipher Block Chaining (CBC)
• C1 = E_K(IV ⊕ P1), C2 = E_K(C1 ⊕ P2), ..., where IV is the initialization vector.

Other modes of operation
• Several modes of operation exist; the adequate one depends on the desired properties of the ciphertext: error propagation, partial decryption, etc.
• See [Menezes] p. 288.
• Example: the counter mode
  – C1 = E_K(IV+1) ⊕ P1, C2 = E_K(IV+2) ⊕ P2, ..., where IV is the initialization vector.
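The difference between ECB and CBC can be sketched with a toy block cipher. The cipher below is an assumption made for illustration: a 4-round Feistel network over 8-byte blocks whose round function is a truncated SHA-256. It stands in for DES or AES, which are not in the Python standard library, and must not be used to protect real data. The chaining formulas are the ones given above.

```python
import hashlib

BLOCK = 8          # toy block size in bytes (64 bits, as for DES)
HALF = BLOCK // 2


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    """Toy round function: keyed hash truncated to half a block."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:HALF]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    """E_K: 4-round Feistel network (illustrative stand-in for a real cipher)."""
    left, right = block[:HALF], block[HALF:]
    for rnd in range(4):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    """D_K: the same rounds, undone in reverse order."""
    left, right = block[:HALF], block[HALF:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right


def _blocks(data: bytes) -> list:
    assert len(data) % BLOCK == 0, "this sketch does no padding"
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """ECB: Ci = E_K(Pi), every block on its own."""
    return b"".join(encrypt_block(key, p) for p in _blocks(plaintext))


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """CBC: C1 = E_K(IV xor P1), Ci = E_K(C(i-1) xor Pi)."""
    out, prev = [], iv
    for p in _blocks(plaintext):
        prev = encrypt_block(key, _xor(prev, p))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for c in _blocks(ciphertext):
        out.append(_xor(prev, decrypt_block(key, c)))
        prev = c
    return b"".join(out)


key, iv = b"toy key", b"initvect"
message = b"SAMEDATA" * 3                      # three identical plaintext blocks
ecb_blocks = _blocks(ecb_encrypt(key, message))
cbc_blocks = _blocks(cbc_encrypt(key, iv, message))
print(len(set(ecb_blocks)), len(set(cbc_blocks)))
```

In ECB the three identical plaintext blocks give three identical ciphertext blocks, which is the repetition that frequency analysis exploits. In CBC each block depends on the previous ciphertext block, so the repetition disappears.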
Asymmetric (public key) encryption
• Each user has a public key, publicly available in a directory, and a private key, kept secret.
• No need for a shared secret.
• Encryption: c = E_KPub(m). Decryption: m = D_KPriv(c) = D_KPriv(E_KPub(m)).
[Figure: Alice encrypts m with Bob's public key KPub(bob); Bob decrypts c with his private key KPriv(bob); Marvin only sees c]

Secret key vs public/private keys
• Secret key (a safe)
  – Alice and Bob have the key.
  – Alice uses the key to deposit a message into the safe.
  – Bob uses the key to retrieve the message from the safe.
  → Only Alice and Bob can exchange messages.
• Public/private keys (a mailbox)
  – Alice uses the public key (Bob's address) to send a message to Bob.
  – Bob uses his private key (the mailbox key) to retrieve the message.
  → Anyone can send a message to Bob (his address being public).
  → Only Bob can retrieve the messages.

The RSA asymmetric cipher [Rivest - Shamir - Adleman 78]
• Security: difficulty of factoring large numbers
• Algorithm
  – Pick two large random primes p and q
  – Let N = p × q
  – Pick a large integer d relatively prime to (p-1)×(q-1)
  – Find the integer e such that e × d = 1 (mod (p-1)×(q-1))
  – The (public) encryption key is (e, N): C = M^e (mod N)
  – The decryption key is (d, N): M = C^d (mod N); d must be kept secret!
• Performance
  – Much slower than symmetric algorithms (2-3 orders of magnitude)
  – Generally combined with symmetric algorithms
• Recommended key size: 1024 bits (at the time of these slides; at least 2048 bits is recommended today)

Hybrid encryption
• Combines the advantages of both encryption methods
  – Performance (symmetric)
  – No shared secret (asymmetric): anyone can send a message to Bob!
[Figure: Alice encrypts the message m with AES under a key Ks and encrypts Ks with RSA under Bob's public key KPub(bob); Bob recovers Ks with his private key KPriv(bob), then decrypts the message with AES]
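The RSA key setup and the relations C = M^e (mod N) and M = C^d (mod N) can be checked on deliberately tiny numbers. The primes 61 and 53, the exponent d = 2753 and the message 65 are illustrative assumptions. Real keys use primes hundreds of digits long, and real systems add padding, which this sketch omits.

```python
from math import gcd

p, q = 61, 53                  # two (tiny) primes, for illustration only
N = p * q                      # modulus, 3233
phi = (p - 1) * (q - 1)        # (p-1) x (q-1) = 3120

d = 2753                       # private exponent, relatively prime to phi
assert gcd(d, phi) == 1
e = pow(d, -1, phi)            # public exponent with e*d = 1 (mod phi); needs Python 3.8+


def rsa_encrypt(m: int, e: int, N: int) -> int:
    """C = M^e (mod N), computed with the public key (e, N)."""
    return pow(m, e, N)


def rsa_decrypt(c: int, d: int, N: int) -> int:
    """M = C^d (mod N), computed with the private key (d, N)."""
    return pow(c, d, N)


M = 65                         # the message must be an integer smaller than N
C = rsa_encrypt(M, e, N)
recovered = rsa_decrypt(C, d, N)
print(e, C, recovered)
```

Anyone who can factor N into p and q can recompute phi and then d from the public e, which is why the security rests on the difficulty of factoring large numbers.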
Cryptographic hash functions
• Goal: ensure data integrity, potentially combined with encryption (see later).
• A hash function h must satisfy the following properties:
  – it maps an input x of arbitrary bit length to an output h(x) of fixed bit length n;
  – h(x) must be easy to compute.
• A hash function h is a cryptographic hash function if it satisfies (some of) the following properties:
  – preimage resistance: given a hash value h, it is computationally infeasible to find any x such that h(x) = h;
  – 2nd preimage resistance: given x and h such that h(x) = h, it is computationally infeasible to find any second input y such that h(y) = h(x) = h;
  – collision resistance: it is computationally infeasible to find any two distinct inputs x and y such that h(x) = h(y).
• Thus, a cryptographic hash function h ensures that any bit change in the input will impact the output.
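The two basic properties (fixed-length output, and sensitivity to any bit change in the input) can be observed with the standard library. SHA-256 is used here as an assumption, because it is readily available in `hashlib`; the slides do not prescribe it. The sample message is hypothetical.

```python
import hashlib


def digest_bits(data: bytes) -> str:
    """SHA-256 digest of `data`, written as a string of 256 bits."""
    value = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return format(value, "0256b")


original = b"pay 100 euros to Bob"
tampered = bytes([original[0] ^ 0x01]) + original[1:]   # flip a single input bit

h1, h2 = digest_bits(original), digest_bits(tampered)
changed = sum(a != b for a, b in zip(h1, h2))            # number of output bits that differ

print(len(h1), len(digest_bits(b"x" * 10_000)))          # same length whatever the input size
print(changed)
```

A single flipped input bit typically changes about half of the output bits, so a receiver who recomputes the hash detects the alteration.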
Cryptographic hash functions (2)
• Examples:
  – Message Digest 5 (MD5) [Riv92]
  – Secure Hash Algorithm 1 (SHA-1) [NIS95]
• Both process the input message by blocks of 512 bits.
• The result of the hash function is 160 bits for SHA-1 and 128 bits for MD5.
• Security of SHA-1:
  – finding a 2nd preimage requires 2^160 operations;
  – finding a collision requires 2^80 operations.
  (These are the generic bounds; faster collision attacks are now known for both MD5 and SHA-1.)
• Hash functions are publicly known.
• Keyed hash functions also exist (hash functions taking a cryptographic key as a parameter).

Combining cryptographic primitives (1)
• Goal: resist passive and active attacks
  – Passive attacks threaten confidentiality (1)
  – Active attacks threaten integrity: replay attack (2), destruction (2), delays / reordering (2), message forging (3), data alteration (4), identity theft (5), repudiation (6)
• Approach followed
  – Describe basic constructs and their rationale
  – Do not describe complete protocols (too boring) → book ch. 26 [KBL], Wikipedia…

Confidentiality: encryption alone
• Ensures confidentiality. Recall that:
  – symmetric encryption is efficient but needs a shared secret;
  – asymmetric encryption is slow but does not need a shared secret → it does need a PKI (Public Key Infrastructure), however;
  – combining symmetric and asymmetric encryption is a good option.
• Both must be used carefully:
  – mode of operation: ECB should not be used for messages with repetitive patterns (frequency analysis);
  – only use known algorithms.
• Encryption does not ensure integrity.
  – Assuming that it does is a common mistake (even encrypted data can be (randomly) altered).
  – The exception is a very specific use, e.g., encrypting blocks with high redundancy to ensure integrity (e.g., a 128-bit block storing 64 bits of data, replicated).
  – This may lower the algorithm's resistance to attacks.

(2) Replay attack
• Problem: Marvin copies the message and resends it to Bob.
• Solution: include a unique timestamp (or sequence number) in the message. The receiver keeps the timestamps of recently received messages and does not accept a duplicate.
• Remark: obviously, the timestamp integrity must be preserved (see point 4).
• Remark (2): using sequence numbers may also protect from reordering/destruction attacks.

(3) Message forging: nonce
• Problem: Alice and Bob share a session key K. Alice sends a message M1 to Bob and gets M2 back. How can Alice be sure that M2 came from Bob and not from Marvin?
• Marvin might:
  – send a random string that Alice decrypts (using K) to another random string that looks like a correct response;
  – replay an earlier message sent by Bob, encrypted with K, that is a possible response (Alice is not a server that maintains a list of timestamps).
• Solution: include a nonce N in M1.
  – N is a random string generated by Alice, long enough that Marvin cannot guess it.
  – If M2 contains N+1, then it can only have been generated by Bob (since only Bob, besides Alice, knows K), and it cannot be a replay.

(4) Data alteration + confidentiality
• Symmetric encryption is assumed.
[Figure: several ways of combining Hash and Encrypt on a message m, e.g. encrypting m || h, or sending c || h or c || h']
• One construction is wrong.
• The others have different properties.

(5) Identity theft: Authentication-1
• Is asymmetric encryption sufficient to ensure authentication?
• Alice and Bob have no shared secret but want to establish a secured communication. They exchange their public keys and a session key encrypted with Alice's public key:
  1. Alice: "Hi Bob, I am Alice, here is my public key. Could you send me yours so that we can talk securely?"
  2. Bob: "Hi Alice, I am Bob, here is my public key and a session key encrypted with your public key."
  3. Alice: "Fine, Bob, here is my secret message encrypted with the session key…"
  4. Etc.
• Is this protocol secure?

(5) Authentication-2: attacking the simple protocol
• A "man in the middle" attack can be conducted!
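A man-in-the-middle attack on the simple protocol above can be sketched as follows. Marvin replaces Alice's public key with one of his own, reads the session key that Bob sends, and re-encrypts it for Alice so that nobody notices. The sketch assumes toy RSA key pairs built from tiny primes and an integer session key; all names and values are illustrative. For brevity Marvin reuses a single generated key pair in both directions.

```python
def make_keypair(p: int, q: int, e: int):
    """Toy RSA key pair from two small primes (illustration only)."""
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)                 # modular inverse; needs Python 3.8+
    return (e, n), (d, n)               # (public key, private key)


def encrypt(value: int, public_key) -> int:
    e, n = public_key
    return pow(value, e, n)


def decrypt(value: int, private_key) -> int:
    d, n = private_key
    return pow(value, d, n)


alice_pub, alice_priv = make_keypair(61, 53, 17)
bob_pub, bob_priv = make_keypair(89, 97, 5)
marvin_pub, marvin_priv = make_keypair(101, 113, 3)

# Message 1, Alice -> Bob: "here is my public key".
# Marvin intercepts it and substitutes his own public key.
key_seen_by_bob = marvin_pub

# Message 2, Bob -> Alice: Bob's public key plus a session key encrypted
# with what Bob believes to be Alice's public key.
session_key = 1234
sent_by_bob = encrypt(session_key, key_seen_by_bob)

# Marvin intercepts again: he decrypts the session key, re-encrypts it with
# Alice's real public key (kept from message 1) and swaps Bob's public key.
marvin_session_key = decrypt(sent_by_bob, marvin_priv)
received_by_alice = encrypt(marvin_session_key, alice_pub)
key_seen_by_alice = marvin_pub

# Alice decrypts normally and notices nothing.
alice_session_key = decrypt(received_by_alice, alice_priv)
print(alice_session_key, marvin_session_key)
```

Alice, Bob and Marvin all end up holding the same session key, so Marvin can read or forge every later message. Nothing in the exchange lets Bob check that the key he received really belongs to Alice, which is the gap a trusted third party fills.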
[Figure: Marvin sits between Alice and Bob]
• Marvin intercepts Alice's message and replaces her public key with a public key he generated (Marvin knows the corresponding private key).
• Then he intercepts Bob's message, decrypts the session key, re-encrypts it with Alice's public key (kept from the first message), and replaces Bob's public key with another generated public key.
• Alice and Bob do not notice anything; Marvin intercepts all messages and can even create fake messages!
• Need for a trusted party delivering public keys (PKI) or secret keys (e.g., Kerberos).

(5) Authentication-3: trusted party & public keys
• Alice has previously been registered by Trent: she showed proofs of identity and sent her public key.
• In return, she received a certificate containing her public key, certified by Trent.
• Trent's certificate is "well known" (certification hierarchy).
  1. Bob asks Alice for her certificate.
  2. Alice sends her certificate to Bob; Bob checks its validity.
  3. Bob sends Alice a session key encrypted with Alice's public key.
  4. Alice decrypts the session key and starts communicating with Bob.
• SSL is based on this scheme (remark: Bob need not be registered).
[Figure: registration of Alice with Trent, then messages 1 to 4 between Alice and Bob]

(5) Authentication-4: trusted party & secret keys
• Alice and Bob have previously been registered by Trent. During this registration, they exchanged a secret key known only by Trent and Alice (TA) or by Trent and Bob (TB).
  1. Alice contacts Trent, sending a nonce, for a communication with Bob.
  2. Trent sends the nonce and the session key, both encrypted with TA. A ticket, including the session key encrypted with the TB key, is also sent.
  3. Alice checks the nonce, decrypts the session key, and sends the ticket to Bob.
  4. Bob decrypts the ticket, retrieves the session key, and starts communicating with Alice.
• Kerberos is based on this scheme.
[Figure: registration of Alice and Bob with Trent, then messages 1 to 4]

(5) Authentication-5: secret vs public
• Both need trusted third parties.
• Secret (Kerberos)
  – Distributes symmetric keys
  – Operates on-line, when the interaction takes place, since it creates a new symmetric key for each session
• Public (SSL, certification authority)
  – Distributes public keys (certificates)
  – Operates off-line, prior to the interaction, since the public key is fixed
  – Once the certificate is created, intervention by the CA is no longer required

(6) Repudiation: Digital signature-1
• Digital signatures can be used for:
  – proof of authorship;
  – non-repudiation by the author;
  – guarantee of message integrity.
• They do not guarantee confidentiality (orthogonal): if confidentiality is needed, the message must be encrypted.
• Important for many Internet applications.
• Based on public key cryptography.

(6) Digital signature-2: Signing
• Current systems use the RSA algorithm.
• Basic idea:
  (1) Hash the message.
  (2) Encrypt the hash with the sender's private key.
  (3) Add to the message the sender's public key and a certificate (for checking).

(6) Digital signature-3: Verifying
[Figure not recoverable from the transcript]

Overview of database specific tools
[Figure: users and a DBMS server, annotated with 1-Authentication, 2-Protection of communications, 3-Authorizations, 4-Database encryption, 5-Audit, 6-Usage control (delivered data), 7-Limited data retention, 8-Data anonymization (statistical use)]

(1) Authentication, (2) Communications
• Authentication
  – Basic: login + password
  – OS authentication can be used by the DBMS
  – DBMSs support most authentication protocols (SW & HW)
  – Attacks: default passwords (600 available on the web)
  – Problem: indirect authentication
    · The user is authenticated by the application
    · The application is authenticated by the DBMS
• Communications
  – Use of classical protocols (e.g. SSL)
  – A lot of known attacks on the Oracle Listener Service…

(3) Authorizations-1: Access control
• Basically: Discretionary Access Control (DAC)
  – Subjects: authenticated users/processes
  – Objects: database objects to be protected (tables, views, stored procedures, etc.)
  – Actions: actions that are authorized (e.g., read, update)
• Basic syntax
  – GRANT <Actions> ON <object> TO <Subject>
  – REVOKE <Actions> ON <object> FROM <Subject>
• Role Based Access Control (RBAC)
  – Roles are recipients of authorizations
  – Roles are assigned to users

(3) Authorizations-2: Access control
• View: virtual table defined by an SQL query
• The DBMS transforms queries on views into queries on base tables
[Figure: query on views → access control → view manager (view definitions) → rewritten query on base tables]
• View:
    Create View Stats as
    Select service, count(*) Nbpatients, sum(expense) Total_Exp
    From Patients
    Group By service
• Query:
    Select Total_Exp From Stats Where service = "immuno"
• Final query:
    Select sum(expense) Total_Exp From Patients Where service = "immuno"

(3) Authorizations-3: Access control with views
Example: base table Employees

  Id-E  LName   FName  Fone  Address  City        Salary
  1     Ricks   Jim    5485  ……….     Paris       230
  2     Trock   Jack   1254  ……….     Versailles  120
  3     Lerich  Zoe    5489  ……….     Chartres    380
  4     Doe     Joe    4049  ……….     Paris       170

View for Human resources:

  Id-E  LName   FName  Fone
  1     Ricks   Jim    5485
  2     Trock   Jack   1254
  3     Lerich  Zoe    5489
  4     Doe     Joe    4049

View for the Statistician:

  Number of employees  Average salary
  4                    225
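The view mechanism can be tried with the `sqlite3` module of the Python standard library. The sketch recreates the Stats view over a Patients table and runs both the query on the view and the rewritten query on the base table. The table columns other than service and expense, and the three sample rows, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patients (name TEXT, service TEXT, expense REAL);
    INSERT INTO Patients VALUES
        ('p1', 'immuno', 120.0),
        ('p2', 'immuno', 80.0),
        ('p3', 'cardio', 300.0);

    -- The view only exposes per-service aggregates, not individual patients.
    CREATE VIEW Stats AS
        SELECT service, count(*) AS Nbpatients, sum(expense) AS Total_Exp
        FROM Patients
        GROUP BY service;
""")

# Query written by the user against the view.
on_view = conn.execute(
    "SELECT Total_Exp FROM Stats WHERE service = ?", ("immuno",)
).fetchall()

# Equivalent query on the base table, as the DBMS would rewrite it.
on_base = conn.execute(
    "SELECT sum(expense) AS Total_Exp FROM Patients WHERE service = ?", ("immuno",)
).fetchall()

print(on_view, on_base)
```

Both queries return the same total. In a DBMS that supports GRANT, the statistician would be granted SELECT on Stats only, so individual rows of Patients stay out of reach. SQLite itself has no GRANT statement, so that step is not shown.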
(3) Authorizations-4: Virtual Private Database
[Figure: the query and contextual information are passed to a procedure for adding conditions; running the procedure yields the query with added conditions]

(4) Database encryption
• Oracle Obfuscation toolkit
• Same kind of tools for other DBMSs
• Protegrity Secure.Data

(5) Database auditing
• Use specific auditing features of your database system
• Use classical triggers for personalized audit
• Triggers: E-C-A rules
  – When the EVENT happens: on insertion, update or deletion of a tuple in a given table
  – If the CONDITION is fulfilled: any kind of SQL predicate
  – Do the ACTION: specific code to execute (SQL, PL/SQL, other languages)
• Triggers allow recording who has modified what and when

(6) Usage control, (7) Limited data retention
• Access control / usage control
  – Access control defines the rules for accessing the data
  – Usage control regulates the usage of delivered data
  – Limited data retention is one example of usage control
  – Digital Rights Management is usage control: you can access a video and watch it, but not redistribute it
• Limited data retention principle
  – Attach a lifetime to the data
  – Compliant with its acquisition purpose
  – After which it must be withdrawn from the system
  – Examples:
    · Google: cookies kept 2 years instead of 30 as before (!)
    · Ask and IxQuick keep user information for only two days.

(8) Data anonymization: on-going work

Bibliography
[MPO] Alfred Menezes, Paul van Oorschot, Scott Vanstone: Handbook of Applied Cryptography. Available online: http://www.cacr.math.uwaterloo.ca/hac/
[Sch] Bruce Schneier: Applied Cryptography: Protocols, Algorithms, and Source Code in C.
[KBL] Kifer, Bernstein, Lewis: Database Systems, An Application-Oriented Approach (chapter 26).
[SL98] Stefan Lucks: Attacking Triple Encryption, Fast Software Encryption 1998, pp. 239-253.
[AKS02] Agrawal R., Kiernan J., Srikant R., Xu Y.: "Hippocratic Databases", VLDB, 2002.
[FBI] Computer Security Institute: "CSI/FBI Computer Crime and Security Survey" (http://www.gocsi.com/forms/fbi/pdf.html).
[Ora] Oracle Unbreakable: http://www.techtv.com/news/securityalert/story/0,24195,3364291,00.html
[Mat] U. Mattsson: Secure.Data Functional Overview, Protegrity Technical Paper TWP-0011 (http://www.protegrity.com/White_Papers.html).
[Orab] Oracle Corp.: "Advanced Security Administrator Guide".

C-SDA
• Data hosted on an unsecured server
  – Confidentiality → encrypted data
  – Performance → delegate as much processing as possible to the server
• Sensitive processing + rights management → smart card
• The smart card acts as an incorruptible mediator between the client and the server
• Limits
  – Tradeoff: confidentiality vs performance. Both depend on the granularity and the method of encryption.
[Figure: user, C-SDA (query manager and rights manager on the card) and server holding the encrypted data]

Tradeoff: confidentiality vs performance
[Figure: a traditional DBMS compared with a setting where a small company's database is stored encrypted and an attacker (pirate) can reach the server; caption: maximum confidentiality, minimum performance]

Alternative: server approach
[Figure: database server and security server]
• The database footprint is encrypted
  – Oracle Obfuscation toolkit, ...
• Maximum restriction of the administrator's rights
  – Protegrity's Secure.Data, [He et al 01] …
• Weakness: decryption takes place on the server, but the server is not a trusted site.

Alternative: client approach
• Decryption on the client → who manages the keys?
• For private data (exclusive access)
  – The client manages the keys
  – Tradeoff confidentiality / performance [Hacigumus et al 02]
• For shared data
  – The rights (and keys) manager is needed on the client side
  – Weakness: the client can attack the rights manager

Database encryption
• Data and metadata are encrypted
  – Initial assumption: encryption at attribute granularity
• Partial encryption is possible
• Caution: encryption and rights management must not interfere
[Figure: the plaintext table and its encrypted counterpart stored on the server]

  products (plaintext)              lqskdqs (encrypted)
  ref   name  type      price       sdz   azds   sdeefa    zze
  d300  Dell  Pentium3  9800        zszd  dedef  zarevgzd  Fffe
  I260  IBM   Pentium2  6400        df'g  Sde    iukèefsa  dgss
  ...   ...   ...       ...         ...   ...    ...       ...

Processing a "simple" query
• C-SDA intercepts the Select queries and translates them
• Only clients are entitled to create/delete/modify data
[Figure: the terminal asks "Find the products with type = "Pentium3""; C-SDA sends the server "Find the lqskdqs with sdeefa = "zarevgzd""; the DBMS returns the encrypted rows; C-SDA decrypts them]

  ref   name  type      price
  d300  Dell  Pentium3  9800
  I330  IBM   Pentium3  9000

Processing a more complex query
[Figure: the terminal asks "Find the average price of the products with type = "Pentium3""; C-SDA sends the server "Find the zze of the lqskdqs with sdeefa = "zarevgzd""; the server returns the encrypted values Fffe and zrzer; C-SDA computes the average: 9400]
• The part of a query that cannot be evaluated on encrypted data is evaluated on the card (predicates <, >, computation functions, etc.)

Constraints related to the smart card
• Security
  – tiny chip (< 25 mm2)
  – cryptography
• Very powerful RISC processor (32 bits)
• Limited memory capacity
• Persistent memory = EEPROM
  – fast reads (≈ 60 ns), very slow writes (> 1 ms)
  – capacity increasing (1 MB soon?)
[Figure: chip layout with 32-bit processor, I/O, security blocks, ROM 96 KB, RAM 4 KB, EEPROM 128 KB]

Decomposition of a query
• Q = Qterm ∘ Qcard ∘ Qserver
[Figure labels: equi-predicates (server); inequality predicates, aggregations, ... (card)]
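The split between Qserver and Qcard can be sketched on the average-price example. The server filters on an equality predicate directly over ciphertexts, and the card decrypts the few returned values and computes the aggregate. The cipher is an assumption for illustration: a toy deterministic XOR stream built from SHA-256, chosen only because it is reversible and satisfies a = b ⇔ encrypt(a) = encrypt(b). It is not the encryption used by C-SDA and it is not secure.

```python
import hashlib


def _keystream(key: bytes, n: int) -> bytes:
    """Deterministic pseudo-random bytes derived from the key (toy construction)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def encrypt(key: bytes, value: str) -> bytes:
    """Toy deterministic cipher: equal plaintexts give equal ciphertexts."""
    data = value.encode()
    return bytes(x ^ y for x, y in zip(data, _keystream(key, len(data))))


def decrypt(key: bytes, blob: bytes) -> str:
    return bytes(x ^ y for x, y in zip(blob, _keystream(key, len(blob)))).decode()


KEY = b"key held by the smart card"          # never given to the server

PRODUCTS = [                                  # (ref, name, type, price), from the slides
    ("d300", "Dell", "Pentium3", "9800"),
    ("I260", "IBM", "Pentium2", "6400"),
    ("I330", "IBM", "Pentium3", "9000"),
]

# What the untrusted server stores: every attribute encrypted separately.
server_table = [tuple(encrypt(KEY, v) for v in row) for row in PRODUCTS]


def q_server(table, encrypted_type: bytes) -> list:
    """Equi-predicate, evaluated by the server directly on ciphertexts."""
    return [row[3] for row in table if row[2] == encrypted_type]


def q_card(encrypted_prices: list) -> float:
    """Aggregate, evaluated on the card after decryption."""
    prices = [int(decrypt(KEY, p)) for p in encrypted_prices]
    return sum(prices) / len(prices)


average = q_card(q_server(server_table, encrypt(KEY, "Pentium3")))
print(average)
```

The server never sees "Pentium3" or any price in the clear, yet it does the selection. Only the two matching prices travel to the card, which matters given the card's small memory and slow writes.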
Example of execution and optimization
• Query: name and card number of the customers, and amounts to be invoiced, for the orders placed after 1/1/05

    SELECT C.Id, C.name, sum(O.amount)
    FROM Customers C, Orders O
    WHERE C.Id = O.CustId and O.date > 1996
    GROUP BY C.Id, C.name
    HAVING count(*) >= 10
    ORDER BY C.name

[Figure: alternative execution plans over the Customers and Orders tables, combining selection (date > 1/1/05), join, decryption, encryption, grouping, sum computation and final sort; one plan works on the order dates without duplicates]

Rights management
• Database rights / system rights
  – Database rights: read, modify, delete data
  – System rights: create a partition, etc.
• Database rights are part of the security kernel …
  → Database rights must therefore be checked by C-SDA!
• Database rights are based on views …
  → Views must therefore be managed by C-SDA!
• But rights and views are shared data …
  → Rights and views (definitions) must be stored on the server!

Encryption robustness
• Idea: use several encryption keys (or algorithms)
• Constraint: a = b ⇔ encrypt(a) = encrypt(b)
• Vertical fragmentation
  – Use different keys for attributes that are not comparable
  [Figure: attributes A, B, C, D, E]
• Horizontal fragmentation
  – Several keys for different values of the same attribute
  – Apply a hash function h to the value a.
  – Use the key Key(h(a)) as the encryption key.
  – The pair (h(a), Encrypt_Key(h(a))(a)) is stored.

"Isolating" the most sensitive data
• Isolate data without complicating the evaluation
  – Replace the sensitive data in the database with indices into a "sensitive domain" stored on the smart card
• Can be seen as an unbreakable encryption mode
• The property a = b ⇔ encrypt(a) = encrypt(b) is preserved => same evaluation strategy

  Original table                  Table stored in the database    Sensitive domain (on the card)
  No  Name     Card code          No  Name     Card code          Index  Card code
  4   Dupond   0454255782         4   Dupond   1                  1      0454255782
  8   Durand   0450500609         8   Durand   2                  2      0450500609
  9   Dutronc  0456589413         9   Dutronc  3                  3      0456589413
  13  Duval    0454547898         13  Duval    4                  4      0454547898
  15  Dussol   0455121236         15  Dussol   5                  5      0455121236

Encryption robustness
• Deactivate a corrupted card
[Figure: tables Visits and Patients encrypted with keys K1 to K7; each user (User1, User2, User3) holds a public key and the set of secret keys matching his rights, e.g. the secret key K5 encrypted with the public key of User2]

  Visits                             Patients
  Id-V  Id-P  Date     Price         Id-P  LName   FName    City
  1     2     15 June  250           1     Leau    Jacques  Paris
  2     1     12 Aug.  180           2     Troger  Zoe      Evry
  3     2     13 June  350           3     Doe     John     Paris
  4     3     1 March  250           4     Perry   Paule    Valenton

Final architecture
[Figure not recoverable from the transcript]

Exercise
• Suppose that Joe is only allowed to access the view "MyCustomers", which contains the customers located in Nice
  – MyCustomers: Select * from Customer where City = 'Nice'
• Joe is malicious and wants to access all the customers
• How can Joe attack the system (statistical attacks excluded) to retrieve information about all the customers, knowing that
  – Joe has a card, since he has some rights on the database
  – The server is not secured

[Hacigumus et al, SIGMOD'02], Univ. Irvine, CA
• Alternatives to the assumption a = b ⇔ E(a) = E(b)
• Encryption granularity = the tuple as a whole
• Addition of attribute indexes
  – An index indicates that an attribute of a tuple belongs to a range of values
  – It allows approximate processing on the server
[Figure: the client encrypts and decrypts the data and runs a query manager; the server stores the encrypted data and runs its own query manager]

  Row:            id, name, age, salary
  Encrypted row:  encrypted row, Iid, Iname, Iage, Isalary

Numeric attributes
• Partition the domain of variation of an attribute
• Client knowledge: partition of Age with boundaries 20, 25, 30, 35, 40, 45, 50, 55 and identifiers h(1)=17, h(2)=4, h(3)=12, h(4)=3, h(5)=6, h(6)=1, h(7)=9
• Server knowledge:

  Encrypted row     IAge
  E(R1) (Age=37)    3
  E(R2) (Age=53)    9
  E(R3) (Age=26)    4

• Query translation:
  – 32 < Age < 40 becomes IAge = 12 or IAge = 3
  – Age = 53 becomes IAge = 9

"String" attributes
• String signatures (n-grams)
• Client knowledge: N = {"g", "re", "ma"}

  String          Signature
  'Greencar'      110
  'Bigrecordman'  111
  'Bigman'        101

• Server knowledge:

  Encrypted row   IName
  E(R1)           110
  E(R2)           111
  E(R3)           101

• Query translation: name LIKE '%green%' becomes IName in (110, 111)

Service Provider Architecture
[Figure: on the client site, the user submits the original query; a query translator uses metadata to produce a server side query and a client side query; on the server site, the service provider evaluates the server side query on the encrypted user database and returns encrypted results; a query executor on the client turns these temporary results into the actual results]
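The two attribute indexes above can be reproduced in a few lines: the bucket index on Age and the n-gram signature on strings. The partition boundaries, the identifiers h(1)..h(7), the n-gram set and the sample strings come from the slides. The assumptions are that each partition includes its lower bound, that matching is case-insensitive, and the function names.

```python
BOUNDS = [20, 25, 30, 35, 40, 45, 50, 55]                      # partition of the Age domain
BUCKET_ID = {1: 17, 2: 4, 3: 12, 4: 3, 5: 6, 6: 1, 7: 9}       # h(i) for partition i
NGRAMS = ["g", "re", "ma"]                                     # client-side n-gram set N


def age_index(age: int) -> int:
    """IAge value stored next to the encrypted row."""
    for i in range(len(BOUNDS) - 1):
        if BOUNDS[i] <= age < BOUNDS[i + 1]:                   # assumed: lower bound included
            return BUCKET_ID[i + 1]
    raise ValueError("age outside the partitioned domain")


def translate_range(low: int, high: int) -> set:
    """Identifiers of all partitions that may contain a value with low < Age < high."""
    return {BUCKET_ID[i + 1] for i in range(len(BOUNDS) - 1)
            if BOUNDS[i] < high and BOUNDS[i + 1] > low}


def signature(text: str) -> str:
    """One bit per n-gram: 1 if the n-gram occurs in the string (case-insensitive)."""
    return "".join("1" if g in text.lower() else "0" for g in NGRAMS)


def translate_like(pattern: str) -> set:
    """Signatures compatible with name LIKE '%pattern%'."""
    required = signature(pattern)
    width = len(NGRAMS)
    candidates = [format(i, f"0{width}b") for i in range(2 ** width)]
    return {s for s in candidates
            if all(r == "0" or b == "1" for r, b in zip(required, s))}


print([age_index(a) for a in (37, 53, 26)])
print(translate_range(32, 40), translate_like("green"))
print([signature(s) for s in ("Greencar", "Bigrecordman", "Bigman")])
```

Both translations return a superset of the true answer. The server cannot tell Age = 36 from Age = 39, nor 'Greencar' from any other string containing "g" and "re", so the client must decrypt the returned rows and apply the original predicate again.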
Query Decomposition
• Q: SELECT name, pname FROM emp, proj WHERE emp.pid = proj.pid AND salary > 100k
[Figure: operator tree. Client query: π name,pname over the join e.pid = p.pid, with σ salary > 100k on EMP, each table being decrypted (D) first. Server query: returns Encrypted(EMP) and Encrypted(PROJ)]

Query Decomposition (2)
[Figure: the selection is partly pushed to the server. Server query: σ s_id = 1 ∨ s_id = 2 on E_EMP, plus E_PROJ. Client query: decryption (D), σ salary > 100k, join e.pid = p.pid, π name,pname]

Query Decomposition (3)
[Figure: the join is also pushed to the server. Server query: join e.p_id = p.p_id of σ s_id = 1 ∨ s_id = 2 (E_EMP) with E_PROJ. Client query: decryption (D), σ salary > 100k, join e.pid = p.pid, π name,pname]

Query Decomposition (4)
• Q: SELECT name, pname FROM emp, proj WHERE emp.pid = proj.pid AND salary > 100k
• QS (server query): SELECT e.etuple, p.etuple FROM e_emp e, e_proj p WHERE e.p_id = p.p_id AND (s_id = 1 OR s_id = 2)
• QC (client query): SELECT name, pname FROM temp WHERE emp.pid = proj.pid AND salary > 100k

Effect of Number of Buckets in Non-Join Query
[Figure: "Cost Factors for Query Response Time": client side, network and server side costs against the number of buckets]
• Client and communication costs decrease as the number of buckets increases, due to better filtering at the server.
• Server cost does not decrease as much; a table scan remains the best choice for the optimizer.

Effect of Number of Buckets in Join Query
[Figures: "Client, Server, and Total Response Times" and "Effect of Decryption Time" (client with and without decryption), query response time against the number of buckets, from 75 to 1500]
• Sharp decrease in query response time as the number of buckets increases, due to better filtering at the server.
• Client side query response time is greater than server side query response time, due to the dominant decryption cost on the query (second graph).
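The link between the number of buckets and the filtering done at the server can be illustrated with a small simulation. The data is hypothetical: 1000 rows whose attribute takes the values 0 to 999, equal-width buckets, and the predicate value > 700. The server only sees bucket numbers, so it returns every row whose bucket may contain a match, and the client discards the false positives after decryption.

```python
def rows_returned_by_server(values, threshold: int, n_buckets: int, domain: int = 1000):
    """Rows the server must ship for 'value > threshold' when it only knows bucket numbers."""
    def bucket(v: int) -> int:
        return v * n_buckets // domain          # equal-width buckets over [0, domain)

    first_useful = bucket(threshold)            # the bucket holding the threshold may hold matches
    return [v for v in values if bucket(v) >= first_useful]


values = list(range(1000))                      # hypothetical attribute values
threshold = 700
true_matches = [v for v in values if v > threshold]

returned = {n: len(rows_returned_by_server(values, threshold, n)) for n in (2, 8, 100)}
false_positives = {n: returned[n] - len(true_matches) for n in returned}
print(returned, false_positives)
```

With 2 buckets the server ships 500 rows for 299 real matches; with 100 buckets it ships 300. Fewer false positives mean less data on the network and fewer rows to decrypt on the client, which matches the trend reported in the graphs.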