Abstract

Transcription

Anomaly detection for online risk assessment
When data is cheap and streaming, labels are expensive, and customers want control, performance and transparency
Boris Gorelik, Ph.D.
Marcelo Blatt, Ph.D.
Alon Kaufman, Ph.D.
Yael Vila, Ph.D.
Liron Liptz
RSA CTO Israel
© Copyright 2014 EMC Corporation. All rights reserved.
≈20,000,000
active users worldwide
≈300,000,000
protected devices
http://www.flickr.com/photos/banksimple/6149390684/sizes/o/
≈300,000,000
end-users protected
April 23rd 2014 20:00 IST
https://www.flickr.com/photos/enigmabadger/12609229435
Key factors behind the success of RSA Adaptive Authentication are:
… wealth of input data
… and feedback from trained analysts, inherent to the risk assessment workflow,
which is not possible in some use cases
What do we need?
We need a
risk assessment algorithm that does not rely on manual feedback
with enhanced control
that works from day one
that is also
modular, adaptive and extensible
and accurate
TAnDeM
Time-based Anomaly Detection model
• No feedback (there is no CM)
• Online
• Enhanced control
• Enhanced visibility & simplicity of the policy manager
TAnDeM provides a modular and configurable risk score
[Diagram: Risk, built from three components: Risky pattern, Organization Anomaly, User Anomaly]
Example: risk associated with
geographical location of a given user
$$R(\text{Boris} \mid \text{Israel}) = 1 - \frac{\text{times Boris came from Israel}}{\text{times Boris came from any country}} \approx 0$$
Example: risk associated with
geographical location of a given user
$$R(\text{Boris} \mid \text{USA}) = 1 - \frac{\text{times Boris came from USA}}{\text{times Boris came from any country}} \approx 1$$
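The two examples score a location as one minus the fraction of the user's past logins that came from that country. A minimal sketch of that ratio follows, assuming plain event counts; the function name `location_risk`, the sample history, and the choice to return maximal risk for a user with no history are illustrative assumptions, not part of the deck.

```python
from collections import Counter


def location_risk(history, country):
    """Return 1 - (logins from `country`) / (all logins) for one user.

    `history` is a list of country names, one per past login.
    Assumption: a user with no history gets the maximal risk of 1.0.
    """
    counts = Counter(history)
    total = sum(counts.values())
    if total == 0:
        return 1.0
    return 1.0 - counts[country] / total


# Hypothetical history: Boris almost always logs in from Israel.
boris_history = ["Israel"] * 99 + ["USA"]

print(location_risk(boris_history, "Israel"))  # close to 0: familiar location
print(location_risk(boris_history, "USA"))     # close to 1: rare location
```

A country that never appears in the history scores exactly 1, because `Counter` returns 0 for missing keys.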
Counting events – the
streaming way
• Allows learning and forgetting
• Allows online (streaming) and offline (batch) modes
• Smooth behavior
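One common way to get a counter with these three properties is exponential decay: every event adds weight, and old weight fades smoothly with time. The deck does not say which scheme TAnDeM uses, so the sketch below is only an illustration under that assumption; the class name `DecayedCounter` and the `half_life` parameter are hypothetical.

```python
import math


class DecayedCounter:
    """Event counter whose past contributions fade with a chosen half-life.

    Assumption: timestamps passed to add() are non-decreasing.
    """

    def __init__(self, half_life):
        self.rate = math.log(2) / half_life  # decay rate per time unit
        self.value = 0.0
        self.last_time = None

    def add(self, now, weight=1.0):
        """Learn: decay the stored count up to `now`, then add the new event."""
        if self.last_time is not None:
            self.value *= math.exp(-self.rate * (now - self.last_time))
        self.last_time = now
        self.value += weight

    def read(self, now):
        """Forget: return the count as seen at `now`, without changing state."""
        if self.last_time is None:
            return 0.0
        return self.value * math.exp(-self.rate * (now - self.last_time))


# Ten daily events, then a quiet period of one half-life (30 days).
counter = DecayedCounter(half_life=30.0)
for day in range(10):
    counter.add(day)

print(counter.read(9))   # count right after the last event
print(counter.read(39))  # half of the value above, one half-life later
```

Because the state is just a value and a timestamp, the same `add` calls work on a live stream or on a stored log replayed in time order, which is one way to read the online and batch modes mentioned above.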
Scoring Scheme: Feed Forward Network
[Diagram: feed-forward network with layers Raw fact or calculated predictor → Category → Group → Risk]
For example: user id, IP address, device age, geo-location
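The layered structure can be pictured as scores flowing from predictors up through categories and groups to a single risk value. The deck does not state how each layer combines its inputs, so the sketch below assumes a weighted average at every level; the group, category and predictor names, the weights, and the config layout are all hypothetical.

```python
def weighted_average(scores, weights):
    """Combine named scores using the matching named weights."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def risk_score(predictors, config):
    """Propagate predictor scores upward: predictor -> category -> group -> risk."""
    group_scores = {}
    for group, group_cfg in config["groups"].items():
        category_scores = {}
        for category, cat_cfg in group_cfg["categories"].items():
            inputs = {name: predictors[name] for name in cat_cfg["predictors"]}
            category_scores[category] = weighted_average(inputs, cat_cfg["predictors"])
        category_weights = {c: cfg["weight"] for c, cfg in group_cfg["categories"].items()}
        group_scores[group] = weighted_average(category_scores, category_weights)
    group_weights = {g: cfg["weight"] for g, cfg in config["groups"].items()}
    return weighted_average(group_scores, group_weights)


# Hypothetical configuration: every weight is a tunable knob.
config = {
    "groups": {
        "user_anomaly": {
            "weight": 2.0,
            "categories": {
                "location": {"weight": 1.0,
                             "predictors": {"geo_location": 1.0, "ip_address": 1.0}},
                "device": {"weight": 1.0, "predictors": {"device_age": 1.0}},
            },
        },
        "organization_anomaly": {
            "weight": 1.0,
            "categories": {
                "location": {"weight": 1.0, "predictors": {"org_geo_location": 1.0}},
            },
        },
    }
}

# Hypothetical per-predictor anomaly scores in [0, 1].
predictors = {"geo_location": 0.9, "ip_address": 0.5,
              "device_age": 0.1, "org_geo_location": 0.2}

risk = risk_score(predictors, config)
print(risk)
```

Changing a single weight in `config` shifts the balance between components without touching the scoring code, which is the kind of low-level control a modular structure makes possible.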
Scoring Scheme: Modular and configurable
The model structure enables low-level balancing of its components
http://www.flickr.com/photos/ntr23/4650249185
Results time
Plot structure of a
typical TAnDeM simulation
We use fraud markings as
a proxy for an anomaly
http://www.flickr.com/photos/wiredforsound23/6862675420
More use cases are possible and some
are being examined right now
Typical TAnDeM simulation
TAnDeM performs as expected on training set data
• Fraud transactions have significantly higher scores
• Case separation: useful information from day one
• Learning pace is fast and controllable
• >99% of the transactions have low or medium risk
http://www.flickr.com/photos/minifig/3174009125
Based on real data of
one of our customers
Good performance in versatile data sets
Nice separation between fraud and
non-fraud transaction scores
– How is your wife?
– Compared to what?
Henny Youngman
vs. Supervised algorithm with feedback
vs. Supervised algorithm without feedback
TAnDeM performs better than feedback-deprived model

Corrected partial AUC (5%)

Name              Supervised learning   Supervised learning
                  with feedback         w/o feedback
Trading company   1.00                  0.83
Bank 1            0.88                  0.81
Bank 2            0.97                  0.78
Investment firm   0.72                  0.68
TAnDeM performs better than feedback-deprived model

Corrected partial AUC (5%)

Name              Supervised learning   TAnDeM   Supervised learning
                  with feedback                  w/o feedback
Trading company   1.00                  0.89     0.83
Bank 1            0.88                  0.81     0.81
Bank 2            0.97                  0.93     0.78
Investment firm   0.72                  0.77     0.68
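The metric reported here is a partial AUC restricted to false-positive rates of at most 5%. The deck does not define "corrected"; the sketch below assumes it means the McClish standardization, which rescales the partial area so that a chance-level model scores 0.5 and a perfect one scores 1.0. All function names are hypothetical, and this is an illustration rather than the evaluation code behind the table.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) lists, sweeping the threshold from high to low score."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    positives = sum(labels)
    negatives = len(labels) - positives
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Tied scores move the curve as one step.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / negatives)
        tpr.append(tp / positives)
    return fpr, tpr


def partial_auc(labels, scores, max_fpr=0.05):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    fpr, tpr = roc_points(labels, scores)
    area = 0.0
    for k in range(1, len(fpr)):
        x0, x1, y0, y1 = fpr[k - 1], fpr[k], tpr[k - 1], tpr[k]
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the last segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def corrected_partial_auc(labels, scores, max_fpr=0.05):
    """McClish-standardized partial AUC (assumed meaning of 'corrected')."""
    raw = partial_auc(labels, scores, max_fpr)
    chance = max_fpr ** 2 / 2.0  # area under the diagonal up to max_fpr
    perfect = max_fpr            # area when TPR is 1 across the whole range
    return 0.5 * (1.0 + (raw - chance) / (perfect - chance))


# Hypothetical toy data: 1 marks fraud, 0 marks a genuine transaction.
labels = [1, 1, 0, 0]
scores = [0.9, 0.8, 0.2, 0.1]
print(corrected_partial_auc(labels, scores))
```

Restricting the area to low false-positive rates reflects the point made earlier in the deck that more than 99% of transactions should receive a low or medium risk score, so only the top of the ranking matters in practice.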
http://www.flickr.com/photos/paolomazzoleni/436307747
TAnDeM can provide
better service when
feedback is not
feasible
Risk assessment in “unsupervised” scenarios:
• Web portals
• VPN access
• Authentication as a service
http://www.flickr.com/photos/pedrito_shot/1699098510
Questions?
Data scientist? Join us now!
mail: [email protected]
TAnDeM performs better than feedback-deprived supervised model
[Plot with identity line]