6 Obfuscation project

The Project of this year
Mariano Ceccato
FBK - Fondazione Bruno Kessler
Traditional computer security
 Most computer security research:
    - Protect the integrity of a benign host (and its data) from attacks by malicious client programs
 Basis of the Java security model
    - Downloaded applet or virus-infested application
    - Restrict the actions that the client is allowed to perform
        - A program is not able to write outside of a designated area (sandbox)
 Software isolation
More recent computer security

 Interest in mobile agents changed the view of computer security
    - Benign client code is threatened by the host on which it has been downloaded/installed
 Defending a client is much more difficult than defending a host.
    - To defend the host, all that is needed is to restrict the client
    - Once the client code is on the host, the host can use any technique to violate its integrity.
 Software piracy
 Reverse Engineering
 Software tampering
Problem: Malicious Reverse Engineering
 A valuable piece of code is extracted from an application and incorporated into a competitor’s code.
Obfuscation
 Obfuscation transforms a program into a new program which:
    - Has the same semantics
    - Is harder to reverse engineer
Example
public class Fibonacci {
    public int fib(int n) {
        if (n <= 2)
            return 1;
        else
            return fib(n - 1) + fib(n - 2);
    }
}
Example: Obfuscation
public class x {public int x ( int x ) {
return x <=2 ? 1 : x(x-1)+x(x-2);
}}
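One quick sanity check for the claim that the renamed class keeps the same semantics is to run both versions on the same inputs. The sketch below does that; the harness class `FibEquivalenceCheck` and the method `sameOutputs` are names assumed for illustration and do not come from the slides. Agreement on a finite range is evidence of equivalence, not a proof.

```java
// Hypothetical harness (not from the slides): compares the original
// Fibonacci class with its identifier-renamed version.
class FibEquivalenceCheck {

    // Original version from the slide.
    static class Fibonacci {
        public int fib(int n) {
            if (n <= 2)
                return 1;
            else
                return fib(n - 1) + fib(n - 2);
        }
    }

    // Layout-obfuscated version from the slide: class, method and
    // parameter are all renamed to x.
    static class x {
        public int x(int x) {
            return x <= 2 ? 1 : x(x - 1) + x(x - 2);
        }
    }

    // Returns true if both versions agree on every input in 1..maxN.
    static boolean sameOutputs(int maxN) {
        Fibonacci original = new Fibonacci();
        x obfuscated = new x();
        for (int n = 1; n <= maxN; n++) {
            if (original.fib(n) != obfuscated.x(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("same outputs for n = 1..25: " + sameOutputs(25));
    }
}
```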
What is obfuscation?
 It is a software protection technique.
 Transforms the application into one that is functionally
identical to the original but is more difficult to reverse
engineer.
 Can never completely protect an application from
malicious reverse engineering.
 Given sufficient time and resources, an adversary can
reverse engineer any obfuscated code.
Potential application domains
 Good ones …
    - Obscure program logic.
    - Hide ownership information (e.g. watermarks, discussed by Mariano)
 Bad ones …
    - Development of polymorphic viruses or code that contains an obfuscated malicious payload.
    - Code plagiarism!
Defining Obfuscation
 Let P → P’ be a transformation from source program P to target program P’.
 P → P’ is an obfuscating transformation if P and P’ have the same observable behaviour; i.e. the following two conditions hold (Collberg and Thomborson):
    - If P fails to terminate or terminates with an error, then P’ may or may not terminate.
    - Otherwise, P’ must terminate and produce the same output as P.
 Two important conditions that need to be preserved:
    - functionality: the obfuscated program should have the same input/output behaviour as the input program (semantics-preserving transformation), and
    - unintelligibility: the obfuscated program should be unintelligible to the adversary in some sense.
Goals of obfuscation …
 Ideal obfuscator (Ehud Barak, PhD, 2004): should simulate the “black box” property, i.e. the obfuscated code reveals nothing beyond what can be learned by running the program.
    - The method fails if there exists at least one program that it cannot obfuscate; i.e. an adversary can learn something from examining the obfuscated version of this program that cannot be learned by merely executing the program repeatedly.
 Practical obfuscator (what we have now): uses transformations that are too expensive, in time and resources, for attackers to undo.
Taxonomy of Obfuscations
 Layout obfuscation: Changes or removes useful
information from the IL without affecting real
instructions. E.g. comment stripping, identifier
renaming.
 Data Obfuscation: Targets data and data structures in
the program. E.g. changing data encoding,
splitting/merging arrays.
 Control-flow obfuscation: Affects the control-flow
within the code. E.g. Reordering statements,
introducing dummy control-flow.
Layout Obfuscation
 Changes or removes useful information from
the IL without affecting real instructions.
E.g. comment stripping, identifier renaming.
 Used in commercial obfuscators like DashO
for Java and Dotfuscator for MSIL … both
from PreEmptive Corp.
Data Obfuscations
 Variable Encoding
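As an illustration of variable encoding, the sketch below stores a loop index i in the encoded form i' = 8*i + 3 and decodes it only where the value is used. The class name `VariableEncodingDemo`, the method names, and the constants 8 and 3 are arbitrary choices made for this example, not taken from the slides.

```java
// Hypothetical example of variable encoding: i is stored as 8*i + 3.
class VariableEncodingDemo {

    // Original loop: sums a[1] .. a[n-1] using a plain index i.
    static int sumPlain(int[] a) {
        int sum = 0;
        int i = 1;
        while (i < a.length) {
            sum += a[i];
            i++;
        }
        return sum;
    }

    // Same loop with the index kept in encoded form.
    // Assumes a.length is small enough that 8 * a.length + 3 does not overflow.
    static int sumEncoded(int[] a) {
        int sum = 0;
        int i = 11;                       // encoding of 1: 8*1 + 3
        while (i < 8 * a.length + 3) {    // i < n  <=>  8*i + 3 < 8*n + 3
            sum += a[(i - 3) / 8];        // decode only at the point of use
            i += 8;                       // i++ in the encoded domain
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 2, 3, 4};
        System.out.println(sumPlain(a) + " " + sumEncoded(a));
    }
}
```

Both methods visit the same elements in the same order; only the representation of the counter differs.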
Data Obfuscations
 Variable splitting and merging
    - An array can be split into several sub-arrays, merged with other arrays into one bigger array, folded to increase its number of dimensions, or flattened to decrease it.
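Array splitting can be sketched as follows: one logical array is stored as two physical sub-arrays (even and odd indices), and every access goes through an index mapping. The class `ArraySplitDemo` and its methods are hypothetical names used only for this example.

```java
// Hypothetical example of array splitting: element i of the logical array
// lives in even[i / 2] when i is even and in odd[i / 2] when i is odd.
class ArraySplitDemo {
    private final int[] even;
    private final int[] odd;

    ArraySplitDemo(int[] a) {
        even = new int[(a.length + 1) / 2];
        odd = new int[a.length / 2];
        for (int i = 0; i < a.length; i++) {
            set(i, a[i]);
        }
    }

    // Reads logical element i from whichever sub-array holds it.
    int get(int i) {
        return (i % 2 == 0) ? even[i / 2] : odd[i / 2];
    }

    // Writes logical element i to whichever sub-array holds it.
    void set(int i, int value) {
        if (i % 2 == 0) {
            even[i / 2] = value;
        } else {
            odd[i / 2] = value;
        }
    }

    int length() {
        return even.length + odd.length;
    }

    public static void main(String[] args) {
        ArraySplitDemo s = new ArraySplitDemo(new int[] {10, 11, 12, 13, 14});
        System.out.println(s.get(3) + " " + s.length());
    }
}
```

Merging, folding and flattening follow the same pattern with a different index mapping.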
Control-flow Obfuscations
 Aggregation/De-Aggregation: The original control-flow logic is disturbed by coalescing unrelated methods or splitting related methods. E.g. DOJ (Design Obfuscator for Java). Method inlining, outlining, cloning, and loop transformations also fall in this class.
 Ordering: This category reorders statements, loops, and expressions to disturb the locality of related information.
 Spurious Computations: This type of obfuscation modifies the real control-flow by adding spurious computation blocks. E.g. opaque predicates.
Opaque Predicates
 An opaque predicate (P):
    - is a conditional expression (thus called a predicate),
    - has a value that is known to the obfuscator,
    - has a value that is difficult for the adversary to deduce by statically analysing the code (thus called opaque).
 The opacity property of predicates determines the resilience of control-flow transformations, i.e.
    - the more opaque a predicate, the greater the difficulty in determining its outcome by static analysis.
Opaque Predicates
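 P^T / P^F: P always evaluates to T/F (opaquely true/false predicate)
 P^?: P may sometimes evaluate to T and sometimes to F (opaquely unknown predicate)
[Figure: three branch diagrams. if (P)^T always takes the T branch, if (P)^F always takes the F branch, if (P)^? may take either branch.]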
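The three kinds of predicate can be illustrated with simple number-theoretic tests. The sketch below is only an example under stated assumptions: the class `OpaquePredicateDemo`, its method names, and the specific arithmetic identities are chosen for illustration and are not from the slides. Predicates this simple are easy for an analyser to recognise, so they show the idea rather than a resilient construction.

```java
// Hypothetical examples of opaquely true, false and unknown predicates.
class OpaquePredicateDemo {

    // Opaquely true: the product of two consecutive integers is always even.
    // Wrap-around in int arithmetic does not change the parity.
    static boolean opaquelyTrue(int x) {
        return (x * (x + 1)) % 2 == 0;
    }

    // Opaquely false: a square is congruent to 0 or 1 modulo 4, never 2.
    // Wrap-around preserves the residue modulo 4.
    static boolean opaquelyFalse(int x) {
        return (x * x) % 4 == 2;
    }

    // Opaquely unknown: the outcome depends on the run-time value of x.
    static boolean opaquelyUnknown(int x) {
        return x % 2 == 0;
    }

    // Dummy code insertion: the else branch is dead code that is never run.
    static int squareObfuscated(int n, int x) {
        if (opaquelyTrue(x)) {
            return n * n;        // real block
        } else {
            return n * n + x;    // bogus block, unreachable at run time
        }
    }

    public static void main(String[] args) {
        System.out.println(squareObfuscated(7, 12345));
    }
}
```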
Embedding of opaque predicates
(Dummy Code insertion)
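[Figure: three ways of embedding an opaque predicate after block A: if (P)^T guarding B; if (P)^? choosing between B and B’, where f(B) = f(B’); if (P)^T choosing between B and B’’, where f(B) ≠ f(B’’).]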
Embedding of opaque predicates
(Loop condition extension)
i = 1;
while (i < 100) {
    …
    i++;
}
Can be transformed into:
i = 1; j = 100;
while ((i < 100) && (j*j*(j+1)*(j+1)%4 == 0)^T) {
    …
    i++;
    j = j*i+3;
}
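A runnable version of the two loops makes the equivalence easy to check. In the sketch below the elided loop body is replaced by a simple counter, which is an assumption made only so that the loops have an observable result; the class `LoopExtensionDemo` and its method names are likewise illustrative. The extra condition always holds because j*(j+1) is even, so its square is divisible by 4, and int wrap-around preserves divisibility by 4.

```java
// Hypothetical harness for the loop condition extension shown above.
class LoopExtensionDemo {

    // Original loop; the body just counts iterations.
    static int countPlain() {
        int count = 0;
        int i = 1;
        while (i < 100) {
            count++;
            i++;
        }
        return count;
    }

    // Loop with the opaquely true predicate added to the condition.
    static int countExtended() {
        int count = 0;
        int i = 1;
        int j = 100;
        while ((i < 100) && (j * j * (j + 1) * (j + 1) % 4 == 0)) {
            count++;
            i++;
            j = j * i + 3;   // keeps j changing so the predicate looks data-dependent
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countPlain() + " " + countExtended());
    }
}
```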
Opaque Predicates based on aliasing
 Aliasing occurs when two variables refer to the same memory location.
 In the presence of aliasing, inter-procedural static analysis is intractable.
 This intractability property of pointer aliasing can be used to construct
opaque predicates.
 Construction based on the fact that it is impossible for approximate static
analysers to detect all aliases all of the time.
 The basic idea:
    - Construct a dynamic data structure and maintain a set of pointers into it.
    - Make opaque predicates from these pointers.
    - Insert code for manipulating these pointer locations, yet maintain the invariant condition.
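The basic idea can be sketched with two disjoint circular lists. Pointer g only ever moves inside the first list and h only inside the second, so `g == h` is opaquely false no matter how the pointers are updated. Everything in this sketch (the class `AliasPredicateDemo`, the `Node` type, the ring sizes and the random step counts) is an assumption chosen for illustration, not the data structure used in the project.

```java
import java.util.Random;

// Hypothetical sketch of an alias-based opaque predicate.
class AliasPredicateDemo {

    static final class Node {
        Node next;
    }

    // Builds a circular list with the given number of nodes.
    static Node ring(int size) {
        Node first = new Node();
        Node cur = first;
        for (int i = 1; i < size; i++) {
            cur.next = new Node();
            cur = cur.next;
        }
        cur.next = first;
        return first;
    }

    // Update statement: moves a pointer 0 to 4 steps along its own ring.
    static Node move(Node p, Random rnd) {
        int steps = rnd.nextInt(5);
        for (int i = 0; i < steps; i++) {
            p = p.next;
        }
        return p;
    }

    // Runs the updates and counts how often g == h was observed.
    // The invariant (g and h live in disjoint rings) makes the result 0.
    static int countAliasHits(long seed, int rounds) {
        Random rnd = new Random(seed);
        Node g = ring(4);
        Node h = ring(3);
        int hits = 0;
        for (int r = 0; r < rounds; r++) {
            g = move(g, rnd);
            h = move(h, rnd);
            Node inserted = new Node();   // grow h's ring; the rings stay disjoint
            inserted.next = h.next;
            h.next = inserted;
            if (g == h) {                 // opaquely false predicate
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("alias hits: " + countAliasHits(42L, 1000));
    }
}
```

A static analyser has to prove that the two rings never share a node to resolve the predicate, which is the alias analysis problem the slide refers to.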
Opaque Predicates based on aliasing
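Node g, h;
Method P(…, Node f)
{
    g = g.Move();
    h = h.Move();
    h = h.Insert(new Node);
    …
    if (f == g)^? …
    if (g == h)^F …
    …
    f.Token = False;
    g.Token = True;
    if (f.Token)^? …
    …
}
[Figure: pointer graph with two components, G and H; the labels f, g, g.Move() and h mark pointers into them.]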
Alias-based opaque predicates
Aliases:
    f == g
    g != h
Update:
    g = g.left()
    f = g.left().move()
Original:
class A {
    int f1;
    int f2;
    void m() {
        f1 = 1;
        f2 = f1++;
        int tmp = f1;
        tmp = tmp - f1;
        f1 = f1 + f2;
    }
}
Obfuscated:
class A {
    int f1;
    int f2;
    void m() {
        int tmp;
        if (f == g) {
            f1 = 1;
            g = g.left();
            f2 = f1++;
        }
        else {
            g = g.left();
            tmp = f1 + f2 / 5;
            f1 = f2 - tmp;
        }
        if (g != h) {
            f = g.left().move();
            tmp = f1;
            tmp = tmp - f1;
            g = g.left();
            f1 = f1 + f2;
        }
        else {
            f1 = tmp / f2;
            tmp = f2 % 59 + f2;
            f = g.left().move();
        }
    }
}
JSnapScreen 0.1
 http://sourceforge.net/projects/jsnapscreen
 Open source Java project (2k LoC)
 It takes a snapshot of the current screen
Resources
 Java grammar for Txl
 JSnapScreen code
    - Separated sources
    - All the sources in a single file (merged)
 JSnapScreen class diagram
 Pointer-intensive data structure
 List of update expressions
 List of opaque predicates
Mandatory requirements
 Work on the merged file
 Break basic blocks into many sub-parts
 Add opaque predicates
 Add random code
 Add update statements
 Txl rules must be briefly commented
 Deliver a “readme” describing how to run the obfuscator
Optional requirements
 Work on separated source files
 Transformation is non-deterministic
    - If applied twice, it gives different results
 The changed code compiles
 The changed code runs
Delivery
 The project must be delivered one week (7 days) before the date of the exam
