Database Security - smis

Database Security
Luc Bouganim
SMIS project (Secured and Mobile Information Systems)
INRIA, Paris-Rocquencourt - France
Slides mainly from L. Bouganim, P. Pucheral, A. Canteaut

Outline
• Are we paranoid? (no!)
• Cryptographic tools and protocols
  ƒ Basic primitives
    – Encryption
    – Cryptographic hash
  ƒ Combining cryptographic primitives (correctly)
• Overview of database specific tools
Cryptography: the characters
• Alice (A): she wants to communicate with Bob!
• Bob (B): he wants to communicate with Alice!
• Marvin (M): Marvin is Malicious
• Trent (T): Trent is Trusted

Passive attacks
• Marvin has access to unauthorized data
[Figure: Alice, Bob and Marvin]

Active attacks
• Marvin modifies the transmitted data
[Figure: Alice, Bob and Marvin]
Passive and active attacks
• Passive attacks: threaten confidentiality
• Active attacks: threaten integrity
  ƒ Identity theft
    – Marvin sends a message to Bob with Alice's IDs
  ƒ Data alteration
    – Marvin modifies the content of Alice's message
  ƒ Replay attack
    – Marvin captures a message and sends it to Bob several times
  ƒ Repudiation
    – Alice sends a message to Bob and denies having sent this message
  ƒ Message forging
    – Marvin forges a fake message for Bob or Alice
  ƒ Destruction
    – Marvin selectively destroys some messages sent to Bob
  ƒ Delays / reordering
    – Marvin introduces communication delays or reorders messages

Symmetric encryption
• Sender and receiver share a secret allowing symmetric encryption and decryption
[Figure: using the shared secret, Alice encrypts m into c; Marvin can observe c; Bob decrypts c back into m]
Kerckhoffs' principle
In 1883, the most famous work by Auguste Kerckhoffs was
published: La Cryptographie Militaire (military cryptography).
This book set forth desiderata for encryption systems…
ƒ The encryption system must not be required to be
secret, and it must be able to fall into the hands
of the enemy without inconvenience;
ƒ The system must be practically, if not
mathematically, indecipherable;
Symmetric encryption: Secret key encryption
• All the details of the system, including the encryption and decryption functions, are known, except the key
• Security is based only on the secrecy of the key K
[Figure: Alice and Bob share the secret key K. Alice computes c = E_K(m) and sends c; Marvin can observe c; Bob computes m = D_K(c) = D_K(E_K(m))]

Attacking a symmetric cipher
• The attacker knows the ciphertext c
  → Find the plaintext m, or better the key K
• The attacker knows plaintext/ciphertext pairs (m, c)
  → Find the key K, or at least be able to decrypt other messages
Attacking a symmetric cipher (2)
• Example:
  ƒ Caesar cipher (used for military purposes by Julius Caesar)
  ƒ Plaintext: attackatonce
  ƒ Ciphertext: exxegoexsrgi
  → Very easy to attack!
• Given the small number of possible shifts (26), the key can be found in 13 operations on average
• The problem here is the size of the key (less than 5 bits)!

Attacking a simple alphabetic substitution
• Assume a random alphabetic substitution such as
[Figure: substitution table and ciphertext (NB: French message)]
• Can we easily retrieve the key (i.e., the table) knowing a ciphertext?
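The Caesar example can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function names `caesar` and `brute_force` are illustrative, and the attacker is assumed to recognize a plausible plaintext by spotting a known word.

```python
import string

ALPHABET = string.ascii_lowercase


def caesar(text: str, shift: int) -> str:
    """Shift every lowercase letter of `text` by `shift` positions (mod 26)."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + shift) % 26] for ch in text)


def brute_force(ciphertext: str, known_word: str):
    """Try the 26 possible shifts; keep those whose decryption contains `known_word`."""
    candidates = []
    for shift in range(26):
        plaintext = caesar(ciphertext, -shift)
        if known_word in plaintext:
            candidates.append((shift, plaintext))
    return candidates


ciphertext = caesar("attackatonce", 4)
print(ciphertext)                         # exxegoexsrgi
print(brute_force(ciphertext, "attack"))  # [(4, 'attackatonce')]
```

The key space holds only 26 values, so the loop always finishes almost instantly; this is the "less than 5 bits" problem stated on the slide.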
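(1) Frequency analysis: a single letter
[Figure: letter frequency in the ciphertext compared with letter frequency in French]

(2) Frequency analysis: bigrams
[Figure: bigram frequencies in the ciphertext compared with bigram frequencies in French, and conclusion]

(3) Using some common words
[Figure: partially deciphered text]

(4) and more common words…
Partially deciphered word    Deduced substitution
soulent                      L → V
dommes                       D → H
ehuipage                     H → Q
aleatros                     E → B
mompagnons                   M → C
etc.

Finally !
[Figure: deciphered text]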
Simple alphabetic substitution
• The simple alphabetic substitution cipher is:
  ƒ Class of cipher: block cipher
    – The message is encrypted by blocks of fixed size (n bits)
  ƒ Mode of operation: ECB (Electronic Code Book)
    – The message is split into n-bit blocks, each encrypted separately
• Why was the attack so easy?
  ƒ The algorithm is too weak?
  ƒ The encryption key is too short?
  ƒ The encryption block is too small?
  ƒ Other reasons?
• Let’s check stronger algorithms
Some symmetric block ciphers
• DES
  ƒ Data Encryption Standard (1976 - 1997)
  ƒ Encrypts 64-bit blocks
  ƒ Encryption and decryption use the same algorithm
  ƒ The encryption key is 56 bits
• 3DES
  ƒ Used as a replacement for DES between 1997 and 2001
  ƒ Uses three 56-bit keys
  ƒ tripleDES(k1k2k3, M) = DES(k3, DES(k2, DES(k1, M)))
• RIJNDAEL (AES)
  ƒ Used since 2001 and encryption standard since 2002
  ƒ Authors: Joan Daemen and Vincent Rijmen → Rijndael
  ƒ Winners of the AES competition
  ƒ 128-bit blocks with keys of 128, 192 or 256 bits
  ƒ Fast and requires little memory

Attacks on DES, 3DES and AES
• DES
  ƒ 1997: 39 days on 10 000 Pentium
  ƒ 1998: Deep Crack breaks a DES key in 56 hours (250 000 US$)
  ƒ 2007: 6.4 days on a $10,000 parallel machine
• 3DES
  ƒ The best attack known on 3-key 3DES requires around 2^32 known plaintexts, 2^113 steps, 2^90 single DES encryptions, and 2^88 memory!
  ƒ 3DES is computationally secure (now)
• AES
  ƒ Only side-channel attacks were successful on AES
  ƒ See http://www.cryptosystem.net/aes/ for more information
  ƒ AES is computationally secure (now)
→ Thus, can I use 3DES or AES without problems?
Unconditional vs Computational security
• Unconditional security
  ƒ “the uncertainty in the plaintext, after observing the ciphertext, must be equal to the a priori uncertainty about the plaintext – observation of the ciphertext provides no information whatsoever to an adversary” [Menezes]
  → The unique possible attack is exhaustive key search
  ƒ To reach unconditional security, the secret key must be as long as the plaintext! [Shannon 49]
• Computational security
  ƒ “A proposed technique is said to be computationally secure if the perceived level of computation required to defeat it (using the best attack known) exceeds, by a comfortable margin, the computational resources of the hypothesized adversary.” [Menezes]

Triple DES encryption with ECB mode …
• Can I use 3DES or AES without problems?
[Figure: example of Triple DES encryption in ECB mode]
NO!
Encryption mode: Electronic Code Book (ECB)
[Figure: the plaintext is split into 8-byte blocks P1, P2, … (byte offsets 0, 8, 16, …); each block is encrypted on its own: C1 = E_K(P1), C2 = E_K(P2), …]

Triple DES encryption with CBC mode …
• Can I use 3DES or AES without problems?
[Figure: example of Triple DES encryption in CBC mode]
• Yes, if I take care!
Cipher Block Chaining (CBC)
[Figure: 8-byte plaintext blocks P1, P2, … are chained, starting from an initialization vector (IV): C1 = E_K(IV ⊕ P1), C2 = E_K(C1 ⊕ P2), …]

Other modes of operations
• Several modes of operation exist; the adequate one depends on the desired properties of the ciphertext
  ƒ Error propagation
  ƒ Partial decryption
  ƒ Etc.
• See [Menezes] pp 288
• Ex. The counter mode
[Figure: a counter derived from the initialization vector (IV) is incremented for each 8-byte block (+1, +2, +3, …): C1 = E_K(IV+1) ⊕ P1, C2 = E_K(IV+2) ⊕ P2, …]
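The three modes can be compared on a message that contains a repeated block. The sketch below is only an illustration: Python's standard library has no DES or AES, so E_K is replaced by a toy 4-round Feistel network built from SHA-256 (an assumption made for this example, not a cipher to use in practice). The key, IV and message are arbitrary, and padding is not handled.

```python
import hashlib

BLOCK = 8  # block size in bytes, matching the 8-byte blocks drawn on the slides


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    """Feistel round function built from SHA-256 (illustration only)."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:BLOCK // 2]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy E_K: a 4-round Feistel network over one 8-byte block."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy D_K: the same rounds undone in reverse order."""
    left, right = block[:4], block[4:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def split(data: bytes) -> list:
    """Split data into 8-byte blocks (length must be a multiple of 8)."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key, plaintext):
    """ECB: C_i = E_K(P_i), every block on its own."""
    return b"".join(encrypt_block(key, p) for p in split(plaintext))


def cbc_encrypt(key, iv, plaintext):
    """CBC: C_1 = E_K(IV xor P_1), C_i = E_K(C_{i-1} xor P_i)."""
    out, prev = [], iv
    for p in split(plaintext):
        prev = encrypt_block(key, xor(prev, p))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key, iv, ciphertext):
    """CBC decryption: P_i = D_K(C_i) xor C_{i-1}, with C_0 = IV."""
    out, prev = [], iv
    for c in split(ciphertext):
        out.append(xor(decrypt_block(key, c), prev))
        prev = c
    return b"".join(out)


def ctr_crypt(key, iv, data):
    """Counter mode: C_i = E_K(IV + i) xor P_i; the same call decrypts."""
    start = int.from_bytes(iv, "big")
    out = []
    for i, p in enumerate(split(data), start=1):
        counter = ((start + i) % 2 ** 64).to_bytes(BLOCK, "big")
        out.append(xor(encrypt_block(key, counter), p))
    return b"".join(out)


key = b"demo key"
iv = bytes(BLOCK)  # all-zero IV, for reproducibility only
message = b"SAMEBLK!" + b"SAMEBLK!" + b"OTHERBLK"

ecb = ecb_encrypt(key, message)
cbc = cbc_encrypt(key, iv, message)
ctr = ctr_crypt(key, iv, message)

print(ecb[:8] == ecb[8:16])  # True: ECB repeats the ciphertext of a repeated block
print(cbc[:8] == cbc[8:16])  # chaining hides the repetition
print(ctr[:8] == ctr[8:16])  # False: each block is masked with a different counter
```

The repeated ciphertext block in ECB is what makes patterns in the plaintext visible, which is the weakness the "NO!" slide points at.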
Asymmetric (Public key) encryption
• Each user has
  ƒ a public key, publicly available in a directory,
  ƒ a private key, kept secret
• No need for a shared secret
[Figure: Alice encrypts m with Bob's public key KPub(bob): c = E_KPub(m). Marvin can observe c. Bob decrypts with his private key KPriv(bob): m = D_KPriv(c) = D_KPriv(E_KPub(m))]

Secret Key vs Public/Private keys
• Secret key (a safe)
  ƒ Alice & Bob have the key
  ƒ Alice uses the key to deposit a message into the safe
  ƒ Bob uses the key to retrieve the message from the safe
  → Only Alice & Bob can exchange messages
• Public/private keys (a mailbox)
  ƒ Alice uses the public key (Bob’s address) to send a message to Bob
  ƒ Bob uses his private key (the mailbox key) to retrieve the message
  → Anyone can send a message to Bob (his address being public)
  → Only Bob can retrieve the messages
The RSA asymmetric cipher [Rivest - Shamir - Adleman 78]
• Security: difficulty of factoring large numbers
• Algorithm
  ƒ Pick two large random primes p and q
  ƒ Let N = p × q
  ƒ Pick a large integer d relatively prime to (p-1) × (q-1)
  ƒ Find the integer e such that e × d = 1 (mod (p-1) × (q-1))
  ƒ (Public) encryption key is (e, N): C = M^e (mod N)
  ƒ Decryption key is (d, N): M = C^d (mod N); d must be kept secret!
• Performance
  ƒ Much slower than symmetric algorithms (2-3 orders of magnitude)
  ƒ Generally combined with symmetric algorithms
• Recommended key size: 1024 bits (at the time of these slides; 2048 bits or more is recommended today)

Hybrid encryption
• Combines the advantages of both encryption methods
  ƒ Performance (symmetric)
  ƒ No shared secret (asymmetric): anyone can send a message to Bob!
[Figure: Alice encrypts the message m with AES under a session key Ks, giving c, and encrypts Ks with RSA under Bob's public key KPub(bob), giving K’. She sends c and K’. Bob recovers Ks from K’ with RSA and his private key KPriv(bob), then decrypts c with AES to get m. Marvin only sees c and K’]
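The RSA steps listed above can be followed with deliberately tiny numbers. This is a sketch for intuition only: the primes 61 and 53 and the message 65 are arbitrary illustrative values, and real keys use primes hundreds of digits long together with a padding scheme that is not shown here.

```python
from math import gcd

# Pick two (here: tiny) primes p and q, and let N = p * q.
p, q = 61, 53
N = p * q                    # 3233
phi = (p - 1) * (q - 1)      # 3120

# Pick d relatively prime to (p-1)*(q-1); d is the secret exponent.
d = 2753
assert gcd(d, phi) == 1

# Find e such that e * d = 1 (mod (p-1)*(q-1)). Requires Python 3.8+.
e = pow(d, -1, phi)

M = 65                       # message, must be smaller than N
C = pow(M, e, N)             # encryption with the public key (e, N)
recovered = pow(C, d, N)     # decryption with the private key (d, N)

print(e)          # 17
print(C)          # 2790
print(recovered)  # 65
```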
Cryptographic hash functions
• Goal: ensure data integrity
  ƒ potentially combined with encryption (see after)
• A hash function h must satisfy the following properties
  ƒ it maps an input x of arbitrary bit length to an output h(x) of fixed bit length n
  ƒ h(x) must be easy to compute
• A hash function h is a cryptographic hash function if it satisfies (some of) the following properties
  ƒ preimage resistance: given a hash value v, it is computationally infeasible to find any x such that h(x) = v
  ƒ 2nd preimage resistance: given x and v such that h(x) = v, it is computationally infeasible to find any second input y ≠ x such that h(y) = h(x) = v
  ƒ collision resistance: it is computationally infeasible to find any two distinct inputs x and y such that h(x) = h(y)
• Thus, a cryptographic hash function h ensures that any bit change in the input will impact the output.
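The last point can be observed with Python's `hashlib`. This is a small sketch: the two messages are arbitrary examples that differ in a single bit ('e' is 0x65 and 'd' is 0x64), and SHA-1 is used only because it is one of the functions discussed in these slides.

```python
import hashlib

a = b"attack at once"
b = b"attack at oncd"   # differs from `a` by one bit in the last byte

ha = hashlib.sha1(a).digest()
hb = hashlib.sha1(b).digest()

# Fixed output length whatever the input length: 160 bits for SHA-1.
print(len(ha) * 8)                                    # 160
print(len(hashlib.sha1(b"x" * 10_000).digest()) * 8)  # 160

# Count how many of the 160 output bits differ between the two digests.
diff = bin(int.from_bytes(ha, "big") ^ int.from_bytes(hb, "big")).count("1")
print(diff)   # typically close to half of the 160 bits
```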
Cryptographic hash functions (2)
• Examples:
  ƒ Message Digest 5 (MD5) [Riv92]
  ƒ Secure Hash Algorithm 1 (SHA-1) [NIS95]
• Both process the input message by blocks of 512 bits
• The result of the hash function is
  ƒ 160 bits for SHA-1
  ƒ 128 bits for MD5
• Security: SHA-1
  ƒ Finding a 2nd preimage requires 2^160 operations
  ƒ Finding a collision requires 2^80 operations
• Hash functions are publicly known
• Keyed hash functions also exist
  ƒ (hash functions taking a cryptographic key as a parameter)
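A keyed hash can be tried with Python's `hmac` module. The sketch below uses HMAC with SHA-256, which is one common keyed-hash construction and a choice made for this example rather than something the slides prescribe; the key and messages are hypothetical.

```python
import hashlib
import hmac

key = b"shared secret key"            # hypothetical key known to both parties
message = b"transfer 100 to Bob"

# The sender computes a keyed hash (tag) and sends it with the message.
tag = hmac.new(key, message, hashlib.sha256).digest()


def is_authentic(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the keyed hash and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


print(is_authentic(key, message, tag))                 # True
print(is_authentic(key, b"transfer 900 to Bob", tag))  # False: content altered
print(is_authentic(b"another key", message, tag))      # False: wrong key
```

Unlike a plain hash, the tag cannot be recomputed by someone who alters the message without knowing the key.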
Combining cryptographic primitives
• Goal: resist passive and active attacks
• Passive attacks: threaten confidentiality → (1)
• Active attacks: threaten integrity
  ƒ Identity theft → (5)
  ƒ Data alteration → (4)
  ƒ Message forging → (3)
  ƒ Replay attack → (2)
  ƒ Repudiation → (6)
  ƒ Destruction → (2)
  ƒ Delays / reordering → (2)
• Approach followed
  ƒ Describe basic constructs and their rationale
  ƒ Do not describe complete protocols (too boring) → book ch. 26, wikipedia…

(1) Confidentiality: Encryption alone
• Ensures confidentiality; remember that
  ƒ Symmetric encryption is efficient but needs a shared secret
  ƒ Asymmetric encryption is slow but does not need a shared secret
    → need for a PKI however (Public Key Infrastructure)
  ƒ Combining symmetric and asymmetric is a good option
  ƒ Both must be carefully used
    – Mode of operation: ECB should not be used for messages with repetitive patterns (frequency analysis)
    – Only use known algorithms
• Encryption does not ensure integrity
  ƒ This is a common mistake (even encrypted data can be (randomly) altered)
  ƒ Unless used in a very specific way
    – e.g., encrypting blocks with high redundancy to ensure integrity (e.g., a 128-bit block storing 64 bits of data, replicated)
    – This may lower the algorithm's resistance to attacks
(2) Replay Attack
• Problem:
  ƒ Marvin copies the message and resends it to Bob
• Solution:
  ƒ Include a unique timestamp (or sequence number) in the message
  ƒ The receiver keeps timestamps of recently received messages
  ƒ He does not accept a duplicate
• Remark:
  ƒ Obviously, the timestamp integrity must be preserved (see point 4)
• Remark (2):
  ƒ Using sequence numbers may protect from reordering/destruction attacks

(3) Message forging: Nonce
• Problem:
  ƒ Alice and Bob share a session key, K
  ƒ Alice sends a message M1 to Bob and gets M2 back
  ƒ How can Alice be sure that M2 came from Bob and not from Marvin?
• Marvin might:
  ƒ send a random string that Alice decrypts (using K) to another random string that looks like a correct response
  ƒ replay an earlier message sent by Bob encrypted with K, that is a possible response (Alice is not a server that maintains a list of timestamps)
• Solution: Include a nonce, N, in M1
  ƒ A random string generated by Alice
  ƒ Long enough so that Marvin cannot guess it
  ƒ If M2 contains N+1 then it can only have been generated by Bob (since only Bob knows K) and it cannot be a replay
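Both countermeasures can be sketched in a few lines. The code below is an illustration under stated assumptions, not the protocol of the slides: a keyed hash (HMAC-SHA-256) stands in for "encrypted with K", the key `K`, the class `Receiver` and the function names are hypothetical, and the messages themselves are not encrypted.

```python
import hashlib
import hmac
import secrets

K = b"session key shared by Alice and Bob"  # hypothetical shared secret


def tag(data: bytes) -> bytes:
    """Keyed hash of data under K (stands in for 'only Bob knows K')."""
    return hmac.new(K, data, hashlib.sha256).digest()


# --- (2) Replay attack: sequence numbers -------------------------------
def make_message(seq: int, payload: bytes):
    """Sender side: bind the sequence number to the payload."""
    return seq, payload, tag(seq.to_bytes(8, "big") + payload)


class Receiver:
    """Accepts each authenticated sequence number at most once, in order."""

    def __init__(self):
        self.last_seq = 0

    def accept(self, seq: int, payload: bytes, mac: bytes) -> bool:
        expected = tag(seq.to_bytes(8, "big") + payload)
        if not hmac.compare_digest(mac, expected):
            return False          # altered payload or altered sequence number
        if seq <= self.last_seq:
            return False          # duplicate or out-of-order message: reject
        self.last_seq = seq
        return True


# --- (3) Message forging: nonce ----------------------------------------
def bob_reply(nonce: int) -> bytes:
    """Bob proves knowledge of K by answering with a function of N + 1."""
    return tag((nonce + 1).to_bytes(17, "big"))


def alice_checks(nonce: int, reply: bytes) -> bool:
    """Alice recomputes the expected answer for the nonce she just sent."""
    return hmac.compare_digest(reply, bob_reply(nonce))


bob = Receiver()
msg = make_message(1, b"pay 10 euros")
print(bob.accept(*msg))   # True: first delivery
print(bob.accept(*msg))   # False: Marvin's replay is rejected

old_nonce = secrets.randbits(128)
old_reply = bob_reply(old_nonce)   # recorded by Marvin in an earlier exchange
new_nonce = old_nonce ^ 1          # any fresh value; normally secrets.randbits(128)
print(alice_checks(new_nonce, bob_reply(new_nonce)))  # True
print(alice_checks(new_nonce, old_reply))             # False: stale reply
```

The sequence number is covered by the tag, which reflects the remark that its integrity must be preserved.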
(4) Data alteration + Confidentiality
[Figure: several constructions combining Hash and Encrypt on a message m. The outputs are labeled: m || h passed through Encrypt; c || h (shown for two constructions, marked ≈); and c || h’]
• Symmetric encryption assumed
• One construction is wrong
• Others have different properties

(5) Identity theft: Authentication-1
• Is asymmetric encryption sufficient to ensure authentication?
  ƒ Alice and Bob have no shared secret but want to establish a secured communication.
  ƒ They exchange their public keys and a session key encrypted with Alice's public key
1 – Alice to Bob: Hi, Bob, I am Alice, here is my public key, could you send me yours so that we can talk securely?
2 – Bob to Alice: Hi Alice, I am Bob, here is my public key and a session key encrypted with your public key
3 – Alice to Bob: Fine, Bob, here is my secret message encrypted with the session key….
4 – Etc…
• Is this protocol secure?
(5) Authentication-2: attacking the simple protocol
• A “Man in the middle” attack can be conducted!
  ƒ Marvin intercepts Alice’s message and replaces her public key with a generated public key (Marvin knows the corresponding private key)
  ƒ Then, he intercepts Bob’s message, decrypts the session key, re-encrypts it with Alice’s public key (kept from the 1st message) and replaces Bob’s public key with another generated public key
  ƒ Alice and Bob do not notice anything; Marvin intercepts all messages and can even create fake messages!
• Need for a trusted party delivering
  ƒ public keys (PKI)
  ƒ or secret keys (e.g., Kerberos)

(5) Authentication-3: trusted party & public keys
• Alice has previously been registered by Trent, showing proof of identity, and has sent him her public key.
• She received in return a certificate containing her public key, certified by Trent.
• Trent's certificate is “well known” (certification hierarchy)
1 – Bob asks Alice for her certificate
2 – Alice sends her certificate to Bob, Bob checks its validity
3 – Bob sends Alice a session key encrypted with Alice's public key
4 – Alice decrypts the session key and starts communicating with Bob
• SSL is based on this scheme (remark: Bob need not be registered)
[Figure: Registration of Alice with Trent, then steps 1 to 4 between Alice and Bob]

(5) Authentication-4: trusted party & secret keys
• Alice & Bob have been previously registered by Trent
• During this registration, they exchanged a secret key only known by Trent/Alice (TA) or Trent/Bob (TB)
1 - Alice contacts Trent, sending a nonce for a communication with Bob
2 - Trent sends the nonce and the session key, both encrypted with TA. A ticket, including the session key encrypted with the TB key, is also sent
3 - Alice checks the nonce, decrypts the session key, sends the Ticket to Bob
4 - Bob decrypts the Ticket, retrieves the session key, and starts communicating with Alice
• KERBEROS is based on this scheme
[Figure: Registration of Alice and Bob with Trent, then steps 1 to 4 between Alice, Trent and Bob]

(5) Authentication-5: Secret vs Public
• Secret (Kerberos)
  ƒ Distributes symmetric keys
  ƒ Operates on-line, when the interaction takes place, since it creates a new symmetric key for each session
• Public (SSL, Certification authority)
  ƒ Distributes public keys (certificates)
  ƒ Operates off-line, prior to the interaction, since the public key is fixed
  ƒ Once the certificate is created, intervention by the CA is no longer required
• Both need trusted third parties.
(6) Repudiation: Digital Signature-1
• Digital Signatures can be used for
  ƒ Proof of authorship
  ƒ Non-repudiation by author
  ƒ Guarantee of message integrity
  ƒ Does not guarantee confidentiality (orthogonal)
    – If confidentiality is needed, encryption of the message must be used
• Important for many Internet applications
• Based on public key cryptography
  ƒ Current systems use the RSA algorithm

(6) Digital signature-2: Signing
• Basic idea:
  (1) Hash the message
  (2) Encrypt the hash with the sender's private key
  (3) Add to the message the sender's public key and a certificate (for checking)
(6) Digital Signature-3: Verifying
[Figure: verification of a digital signature]
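Signing steps (1) and (2) and the corresponding check can be sketched with a toy RSA key. Everything here is illustrative: the key (N = 3233, E = 17, D = 2753) is far too small for real use, the hash is reduced modulo N only so that it fits this tiny modulus, SHA-256 is an arbitrary choice of hash, and step (3), the certificate, is not modeled.

```python
import hashlib

# Toy RSA key pair built from p = 61 and q = 53.
N, E, D = 3233, 17, 2753   # public key (E, N), private key (D, N)


def digest(message: bytes) -> int:
    """Step (1): hash the message, reduced mod N to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Step (2): 'encrypt' the hash with the sender's private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Verifier: decrypt the signature with the public key, compare hashes."""
    return pow(signature, E, N) == digest(message)


message = b"I owe Bob 10 euros. Alice"
signature = sign(message)

print(verify(message, signature))            # True
print(verify(message, (signature + 1) % N))  # False: forged signature
```

With a real key size, changing the message would also make verification fail; with this 12-bit modulus, two messages can share the same reduced hash by accident, which is why the toy key must not be reused elsewhere.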
Overview of database specific tools
[Figure: users connected to a DBMS server, annotated with the tools covered next: 1-Authentication, 2-Protection of communications, 3-Authorizations, 4-Database encryption, 5-Audit, 6-Usage control (delivered data), 7-Limited data retention, 8-Data anonymization (statistical use)]

(1) Authentication, (2) Communications
• Authentication
  ƒ Basic: Login + Password
  ƒ OS authentication can be used by the DBMS
  ƒ DBMSs support most authentication protocols (SW & HW)
  ƒ Attacks: default passwords (600 available on the web)
  ƒ Problems: indirect authentication:
    – The user is authenticated by the application
    – The application is authenticated by the DBMS
• Communications
  ƒ Use of classical protocols (e.g. SSL)
  ƒ A lot of known attacks on the Oracle Listener Service…
(3) Authorizations-1: Access control
• Basically: Discretionary Access Control (DAC)
  ƒ Subject: Authenticated users/processes
  ƒ Objects: Database objects to be protected
  ƒ Actions: Actions that are authorized (e.g., read, update)
• Basic syntax
  ƒ GRANT <Actions> ON <object> TO <Subject>
  ƒ REVOKE <Actions> ON <object> FROM <Subject>
• Role Based Access Control (RBAC)
  ƒ Roles are recipients of authorizations
  ƒ Roles are assigned to users
• Objects
  ƒ Tables
  ƒ Views
  ƒ Stored procedures
  ƒ etc.

(3) Authorizations-2: Access control
• View: virtual table defined by an SQL query
• The DBMS transforms queries on views into queries on base tables
[Figure: in the DBMS server, a query on views goes through access control; the view manager uses the view definition to rewrite the query on base tables]
(3) Authorizations-3: Access control with views
VIEW
  Create View Stats as
  Select service, count(*) Nbpatients, sum(expense) Total_Exp
  From Patients
  Group By service
QUERY
  Select Total_Exp
  From Stats
  Where service = "immuno"
FINAL QUERY
  Select sum(expense) Total_Exp
  From Patients
  Where service = "immuno"

Example
Employees (base table)
Id-E  LName   FName  Fone  Address  City        Salary
1     Ricks   Jim    5485  ……….     Paris       230
2     Trock   Jack   1254  ……….     Versailles  120
3     Lerich  Zoe    5489  ……….     Chartres    380
4     Doe     Joe    4049  ……….     Paris       170

View for Human resources
Id-E  LName   FName  Fone
1     Ricks   Jim    5485
2     Trock   Jack   1254
3     Lerich  Zoe    5489
4     Doe     Joe    4049

View for the Statistician
Number of employees  Average Salary
4                    225
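The two views of the example can be reproduced with SQLite through Python's `sqlite3` module. This is a sketch: SQLite has no GRANT/REVOKE, so only the view mechanism is shown; the view and column names are illustrative, and the Address column is left out because its values are elided in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (id_e INTEGER PRIMARY KEY, lname TEXT, fname TEXT,
                        fone TEXT, city TEXT, salary INTEGER);
INSERT INTO Employees VALUES
  (1, 'Ricks',  'Jim',  '5485', 'Paris',      230),
  (2, 'Trock',  'Jack', '1254', 'Versailles', 120),
  (3, 'Lerich', 'Zoe',  '5489', 'Chartres',   380),
  (4, 'Doe',    'Joe',  '4049', 'Paris',      170);

-- Human resources see identities and phone numbers, not salaries.
CREATE VIEW HumanResources AS
  SELECT id_e, lname, fname, fone FROM Employees;

-- The statistician sees aggregates only.
CREATE VIEW Statistician AS
  SELECT count(*) AS nb_employees, avg(salary) AS avg_salary FROM Employees;
""")

hr = conn.execute("SELECT * FROM HumanResources ORDER BY id_e").fetchall()
stats = conn.execute("SELECT * FROM Statistician").fetchall()

print(hr[0])   # (1, 'Ricks', 'Jim', '5485')
print(stats)   # [(4, 225.0)]
```

In a DBMS with access control, each user group would be granted access to its view only, never to the base table.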
(3) Authorizations-4: Virtual Private Database
[Figure: the query and contextual information are given to a procedure for adding conditions; running the procedure produces the query with added conditions]

(4) Database encryption
• Oracle Obfuscation toolkit
• Same kind of tools for other DBMSs
• Protegrity Secure.Data
(5) Database auditing
• Use specific auditing features of your database system
• Use classical triggers for personalized audit
• Triggers: E-C-A rules
  ƒ When the EVENT happens
    – On insertion, update or deletion of a tuple in a given table
  ƒ If the CONDITION is fulfilled
    – Any kind of SQL predicate
  ƒ Do the ACTION
    – Specific code to execute (SQL, PLSQL, other languages)
• Triggers allow recording who has modified what and when

(6) Usage control, (7) Limited Data retention
• Access control / Usage control:
  ƒ Access control defines the rules for accessing the data
  ƒ Usage control regulates the usage of delivered data
  ƒ Limited data retention is one example of usage control
  ƒ Digital Rights Management is usage control
    – You can access a video, watch it, but not redistribute it
• Limited data retention principle
  ƒ Attach a lifetime to the data
  ƒ Compliant with its acquisition purpose
  ƒ After which it must be withdrawn from the system
  ƒ Examples
    – Google: Cookies kept 2 years instead of 30 as before (!)
    – Ask and IxQuick keep user information for only two days.
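An audit trigger following the Event-Condition-Action pattern can be tried with SQLite through Python's `sqlite3` module. This is a minimal sketch with hypothetical table and trigger names; SQLite has no notion of a connected user, so the "who" part of the audit is not recorded here, only what changed and when.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (id INTEGER PRIMARY KEY, lname TEXT, salary INTEGER);
CREATE TABLE Audit (emp_id INTEGER, old_salary INTEGER,
                    new_salary INTEGER, changed_at TEXT);

CREATE TRIGGER audit_salary
AFTER UPDATE OF salary ON Employees          -- EVENT
WHEN NEW.salary <> OLD.salary                -- CONDITION
BEGIN                                        -- ACTION
  INSERT INTO Audit VALUES (OLD.id, OLD.salary, NEW.salary, datetime('now'));
END;

INSERT INTO Employees VALUES (1, 'Ricks', 230), (2, 'Trock', 120);
""")

conn.execute("UPDATE Employees SET salary = 250 WHERE id = 1")  # real change
conn.execute("UPDATE Employees SET salary = 120 WHERE id = 2")  # same value

rows = conn.execute(
    "SELECT emp_id, old_salary, new_salary FROM Audit").fetchall()
print(rows)   # [(1, 230, 250)]
```

Only the first update is logged, because the condition filters out the update that leaves the salary unchanged.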
(8) Data Anonymization: on-going works
Bibliography
[MPO] Alfred Menezes, Paul van Oorschot, Scott Vanstone: Handbook of Applied Cryptography. Available online: http://www.cacr.math.uwaterloo.ca/hac/
[Sch] Bruce Schneier: Applied Cryptography: Protocols, Algorithms, and Source Code in C
[KBL] Kifer, Bernstein, Lewis: Database Systems, an application oriented approach (chapter 26)
[SL98] Stefan Lucks: Attacking Triple Encryption, Fast Software Encryption 1998, pp 239–253.
[AKS02] Agrawal R., Kiernan J., Srikant R., Xu Y., “Hippocratic Databases”, VLDB, 2002.
[FBI] Computer Security Institute, "CSI/FBI Computer Crime and Security Survey" (http://www.gocsi.com/forms/fbi/pdf.html)
[Ora] Oracle Unbreakable, http://www.techtv.com/news/securityalert/story/0,24195,3364291,00.html
[Mat] U. Mattsson, Secure.Data Functional Overview, Protegrity Technical Paper TWP-0011 (http://www.protegrity.com/White_Papers.html)
[Orab] Oracle Corp., “Advanced Security Administrator Guide”
C-SDA
• Data hosted on an unsecured server
  – Confidentiality → encrypted data
  – Performance → delegate as much processing as possible to the server
• Sensitive processing + rights management → smart card
• The smart card acts as an incorruptible mediator between the client and the server
• Limits
  – Tradeoff: confidentiality vs performance
[Figure: architecture diagrams with the labels User, C-SDA, traditional DBMS, Server, Query Mgr, Rights Mgr, Data, Attacker, DB, encrypted DB, small business (P.M.E.). Captions: "Tradeoff: confidentiality vs performance"; "Confidentiality and performance depend on the granularity and on the encryption method"; "Maximal confidentiality, minimal performance"]

Alternative: server-based approach
[Figure: database server and security server, with the labels User, Query Mgr, Rights Mgr, Data]
• The database footprint is encrypted
  – Oracle Obfuscation toolkit, ...
• Maximal restriction of the administrator's rights
  – Protegrity’s Secure.Data, [He et al 01] …
Weakness: decryption takes place on the server, yet the server is not a trusted site

Alternative: client-based approach
[Figure: client and server, with the labels User, Query Mgr, Rights Mgr, Data]
• Decryption on the client → who manages the keys?
• For private data (exclusive access)
  – The client manages the keys
  – Confidentiality / performance tradeoff [Hacigumus et al 02]
• For shared data
  – The rights (and keys) manager is needed on the client side
Weakness: the client can attack the rights manager
Database encryption
• Data and metadata are encrypted
  – Initial assumption: encryption at attribute granularity
• Only clients are entitled to create/delete/modify data
• Partial encryption is possible
• Caution: encryption and rights management must not interfere

Plaintext table (as seen by the client)
produits
ref   nom   type      prix
d300  Dell  Pentium3  9800
I260  IBM   Pentium2  6400
...   ...   ...       ...

Encrypted table (as stored on the server)
lqskdqs
sdz   azds   sdeefa    zze
zszd  dedef  zarevgzd  Fffe
df’g  Sde    iukèefsa  dgss
...   ...    ...       ...

Processing a "simple" query
• C-SDA intercepts Select queries and translates them
  1. Terminal: find the products of type = "Pentium3"
  2. C-SDA to the server DBMS: find the lqskdqs with sdeefa = "zarevgzd"
  3. The server returns the encrypted rows
       zszd  dedef  zarevgzd  Fffe
       tger  Sde    zarevgzd  zrzer
  4. C-SDA decrypts the result for the terminal
       ref   nom   type      prix
       d300  Dell  Pentium3  9800
       I330  IBM   Pentium3  9000

Processing a more complex query
  1. Terminal: find the average price of the products of type = "Pentium3"
  2. C-SDA to the server DBMS: find the zze of the lqskdqs with sdeefa = "zarevgzd"
  3. The server returns the encrypted values Fffe and zrzer
  4. C-SDA decrypts them and computes the average
  5. The terminal receives the average: 9400
• The part of a query that cannot be evaluated on encrypted data is evaluated on the card (predicates <, >, computation functions, etc.)

Smart card constraints
• Security
  – tiny chip (< 25 mm2)
  – cryptography
• Very powerful RISC processor (32 bits)
• Limited memory capacity
• Persistent memory = EEPROM
  – fast reads (≈ 60 ns), very slow writes (> 1 ms)
  – growing capacity (1 MB soon?)
[Figure: chip layout with a 32-bit processor, I/O, security blocks, ROM 96KB, RAM 4KB, EEPROM 128KB]

Decomposing a query
• Q = Qterm ° Qcard ° Qserver
  – Equi-predicates
  – Inequality predicates, aggregations ...
SELECT    C.Id, C.name, sum(O.amount)
FROM      Customers C, Orders O
WHERE     C.Id = O.CustId and O.date > 1996
GROUP BY  C.Id, C.name
HAVING    count(*) >= 10
ORDER BY  C.name

Execution example and optimization
• Query: name and card number of the customers, and amounts to invoice, for the orders placed after 1/1/05
[Figure: alternative execution plans over the Client (customers) and Commandes (orders) tables, built from the operators: selection (date > 1/1/05), join, decryption, encryption, grouping, computation of the sum, final sort; one plan works on the order dates without duplicates]

Rights management
• Database rights / system rights
  – Database rights: read, modify, delete data
  – System rights: create a partition, etc.
• Database rights are part of the security kernel …
  → Database rights must therefore be checked by C-SDA!
• Database rights are based on views …
  → Views must therefore be managed by C-SDA!
• But rights and views are shared data …
  → rights and views (definitions) must be stored on the server!

Encryption robustness
• Idea: use several encryption keys (or algorithms)
• Constraint: a = b ⇔ encrypt(a) = encrypt(b)
• Vertical fragmentation
  – use different keys for attributes that are not comparable
• Horizontal fragmentation
  – Several keys for different values of the same attribute
    • Apply a hash function h to the value a
    • Use the key Key(h(a)) as the encryption key
    • The pair (h(a), Encrypt_Key(h(a))(a)) is stored
[Figure: a table with attributes A, B, C, D, E (vertical fragmentation) and a single attribute A (horizontal fragmentation)]
"Isolating" the most sensitive data
• Isolate data without complicating the evaluation
  – Replace the sensitive data of the database with indices into a "sensitive domain" stored on the smart card
• Can be seen as an unbreakable encryption mode
• The property a = b ⇔ encrypt(a) = encrypt(b) is preserved
  => same evaluation strategy

Original table
N°  Name     Card code (Code CB)
4   Dupond   0454255782
8   Durand   0450500609
9   Dutronc  0456589413
13  Duval    0454547898
15  Dussol   0455121236

Table stored in the database
N°  Name     Card code (Code CB)
4   Dupond   1
8   Durand   2
9   Dutronc  3
13  Duval    4
15  Dussol   5

Sensitive domain stored on the smart card
Index  Card code (Code CB)
1      0454255782
2      0450500609
3      0456589413
4      0454547898
5      0455121236

Encryption robustness
• Deactivating a corrupted card
[Figure: keys K1 to K7 attached to parts of the Visites and Patients tables; for the users User1, User2 and User3, lists of keys stored under each user's public key, e.g. the secret key K5 encrypted with the public key of User2]

Visites
Id-V  Id-P  Date       Price
1     2     June 15    250
2     1     August 12  180
3     2     June 13    350
4     3     March 1    250

Patients
Id-P  Name    First name  City
1     Leau    Jacques     Paris
2     Troger  Zoe         Evry
3     Doe     John        Paris
4     Perry   Paule       Valenton
….    …….     …….         …….

Final architecture
[Figure: final architecture]
Exercise
• Suppose Joe is only allowed to access the view "MyCustomers", which contains the customers located in Nice
  – MyCustomers: Select * from Customer where City = ‘Nice’
• Joe is malicious and wants to access all the customers
• How can Joe attack the system (statistical attacks aside) to retrieve information about all the customers, given that
  – Joe has a card, since he has some rights on the database
  – The server is not secured
[Hacigumus et al, SIGMOD’02], Univ. Irvine, CA
[Figure: client and server, with the labels User, Query Mgr, Encryption, Decryption, Data]

Alternatives to the assumption a = b ⇔ E(a) = E(b)
• Encryption granularity = the whole tuple
• Attribute indexes are added
  – An index tells which range of values an attribute of a tuple belongs to
  – It enables approximate processing on the server
Row:            id    name    age    salary
Encrypted row:  Encrypted row    Iid    Iname    Iage    Isalary

Numeric attributes
• Partition the domain of values of an attribute
Client knowledge
  Partition  1      2      3      4      5      6      7
  Range      20-25  25-30  30-35  35-40  40-45  45-50  50-55
  h          17     4      12     3      6      1      9
Server knowledge
  Encrypted row  IAge
  E(R1)          3     (Age=37)
  E(R2)          9     (Age=53)
  E(R3)          4     (Age=26)
Query translation
  32 < Age < 40  →  IAge = 12 or IAge = 3
  Age = 53       →  IAge = 9

"String" attributes
• String signatures (n-grams)
Client knowledge: N = {"g", "re", "ma"}
Example:
  string          signature
  'Greencar'      110
  'Bigrecordman'  111
  'Bigman'        101
Server knowledge
  Encrypted row  IName
  E(R1)          110
  E(R2)          111
  E(R3)          101
Query translation
  name LIKE '%green%'  →  IName in (110, 111)

Service Provider Architecture
[Figure: at the client site, the user's original query goes to a Query Translator, which uses Metadata to produce a server-side query and a client-side query; at the server site, the service provider evaluates the server-side query on the encrypted user database and returns encrypted results; the Query Executer applies the client-side query to the temporary results and returns the actual results to the user]
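The two index types can be reproduced with the values of the slides. This is a sketch under assumptions: the partitions are taken as half-open ranges [20, 25), [25, 30), …, which the slides do not specify, the n-gram test is case-insensitive, and the function names are illustrative.

```python
# --- Numeric attributes: bucket index -----------------------------------
BOUNDS = [20, 25, 30, 35, 40, 45, 50, 55]                  # partition limits
BUCKET_ID = {1: 17, 2: 4, 3: 12, 4: 3, 5: 6, 6: 1, 7: 9}   # h(1)=17, h(2)=4, ...


def bucket(age: int) -> int:
    """Client side: opaque identifier of the partition containing `age`."""
    for i in range(len(BOUNDS) - 1):
        if BOUNDS[i] <= age < BOUNDS[i + 1]:
            return BUCKET_ID[i + 1]
    raise ValueError("age outside the partitioned domain")


def server_condition(low: int, high: int) -> list:
    """Bucket ids the server must return for the predicate low < Age < high."""
    return sorted({bucket(a) for a in range(low + 1, high)})


print([bucket(a) for a in (37, 53, 26)])  # [3, 9, 4]
print(server_condition(32, 40))           # [3, 12]

# --- String attributes: n-gram signatures -------------------------------
NGRAMS = ["g", "re", "ma"]


def signature(s: str) -> str:
    """One bit per n-gram: '1' if the n-gram occurs in the string."""
    s = s.lower()
    return "".join("1" if g in s else "0" for g in NGRAMS)


def like_candidates(substring: str, stored: list) -> list:
    """Signatures compatible with name LIKE '%substring%': every n-gram
    present in the substring must also be set in the stored signature."""
    need = signature(substring)
    return [sig for sig in stored
            if all(n == "0" or b == "1" for n, b in zip(need, sig))]


stored = [signature(s) for s in ("Greencar", "Bigrecordman", "Bigman")]
print(stored)                            # ['110', '111', '101']
print(like_candidates("green", stored))  # ['110', '111']
```

In both cases the server answer is a superset of the true result: the client still has to decrypt the returned rows and apply the exact predicate.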
Query Decomposition
Q: SELECT name, pname
   FROM emp, proj
   WHERE emp.pid = proj.pid AND salary > 100k
[Figure: query tree evaluated as the Client Query: π name,pname over σ salary > 100k over the join e.pid = p.pid, on the decrypted (D) relations Encrypted(EMP) and Encrypted(PROJ)]

Query Decomposition (2)
[Figure: the Server Query applies σ s_id = 1 ∨ s_id = 2 on E_EMP; the Client Query decrypts (D) E_EMP and E_PROJ, then applies σ salary > 100k, the join e.pid = p.pid and π name,pname]

Query Decomposition (3)
[Figure: the Server Query applies σ s_id = 1 ∨ s_id = 2 on E_EMP and the join e.p_id = p.p_id with E_PROJ; the Client Query decrypts (D), then applies σ salary > 100k, the join e.pid = p.pid and π name,pname]

Query Decomposition (4)
[Figure: final plan; the Client Query applies π name,pname over σ salary > 100k ∧ e.pid = p.pid on the decrypted (D) result of the Server Query]
QS: SELECT e_emp.etuple, e_proj.etuple
    FROM e_emp, e_proj
    WHERE e.p_id = p.p_id AND (s_id = 1 OR s_id = 2)
QC: SELECT name, pname
    FROM temp
    WHERE emp.pid = proj.pid AND salary > 100k
Effect of Number of Buckets in Non-Join Query
[Figure: "Cost Factors for Query Response Time": query response time split into client side, network and server side, versus number of buckets (1 to 1500)]
• Client and communication costs decrease with an increasing number of buckets, due to better filtering at the server
• Server cost doesn’t decrease as much; table scan remains the best choice in the optimizer

Effect of Number of Buckets in Join Query
[Figure: "Client, Server, and Total Response Times": query response time for client, server and total, versus number of buckets (1 to 1500)]
[Figure: "Effect of Decryption Time": query response time for client with decryption and client without decryption, versus number of buckets]
• Sharp decrease in query response time with an increase in the number of buckets, due to better filtering at the server
• Client-side query response time is greater than server-side query response time, due to the dominant decryption cost on the query (second graph)