Stochastic Gradient Boosting Approach to Daily Attrition Scoring

Based on High-dimensional RFM Features
Dr. Gerald Fahner
Senior Director Analytic Science, FICO
© 2015 Fair Isaac Corporation. Confidential.
This presentation is provided for the recipient only and cannot be reproduced or shared without Fair Isaac Corporation’s express consent.
Agenda
• Ultra-dynamic Attrition Scoring
• Case Study: Credit Card Attrition
• Category Attrition
Ultra-Dynamic (Daily) Attrition Scoring Approach
[Figure: timeline of consecutive days (… Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We …) marking the days on which the customer uses the card, with the daily attrition risk score plotted alongside.]
• Prolonged inactivity signals higher risk and drives up the attrition risk score
⇒ Re-engage customer when attrition risk exceeds some threshold
Transaction Dynamics Hold Key Information
• Given information at time of scoring, who is more likely to attrite?
─ Which measures are most informative?
• How to combine Recency and Frequency into predicting attrition risk?
[Figure: card-use timelines for two customers, Spence and Attila, over the Observation Period up to the Time of Scoring, with Recency and Frequency marked; for each customer the open question after the Time of Scoring is "Attrite?"]
Recency: Days since last card use
Frequency: Fraction of days card used during obs. period
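The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the deck's implementation: the function name `recency_frequency`, the half-open observation window, and the convention for accounts with no card use are all assumptions.

```python
from datetime import date

def recency_frequency(use_dates, obs_start, scoring_date):
    """Return (recency, frequency) for one account.

    recency   = days since last card use, measured at scoring_date
    frequency = fraction of days in [obs_start, scoring_date) with card use
    """
    obs_days = (scoring_date - obs_start).days
    # Distinct days with at least one card use inside the observation period.
    active = {d for d in use_dates if obs_start <= d < scoring_date}
    if not active:
        # Assumption: with no use at all, recency is capped at the period length.
        return obs_days, 0.0
    recency = (scoring_date - max(active)).days
    frequency = len(active) / obs_days
    return recency, frequency

# Made-up example: 10-day observation period, card used on 3 distinct days.
uses = [date(2015, 1, 1), date(2015, 1, 1), date(2015, 1, 4), date(2015, 1, 6)]
r, f = recency_frequency(uses, date(2015, 1, 1), date(2015, 1, 11))
print(r, f)  # 5 0.3
```

Counting distinct days rather than transactions keeps Frequency a fraction between 0 and 1, as the slide defines it.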
How Machine Learning Complements Domain Expertise
Domain Expertise
• Good at intuiting key predictors (step 1)
• Doesn't scale to many variables
• Poor at combining multiple predictors
• Poor at quantifying uncertainty
• Need story behind the numbers (step 4)
Machine Learning
• Lacks intuition
• Excels at combining many features into accurate probabilistic predictions (step 2)
• Diagnose and visualize models to gain insight into effects (step 3)
Steps 1 to 4 mark the recommended path.
Key Elements of Approach
Featurization of transaction events
• Based on Recencies, Frequencies, Monetary values
• High-dimensional feature space of complex events
Machine learning / classification tools
• Stochastic Gradient Boosting
• Partial dependence visualization
Performance evaluation
• Lift related to portfolio profit gain
• Out-of-sample / Out-of-time evaluation
Stochastic Gradient Boosting[1]
Combines predictions from 100's or 1000's of shallow CARTs
[Figure: Training Data (Predictors, Outcomes) is used to fit CART 1, CART 2, …, CART M; their weighted average forms the Prediction Function, which maps the Predictors of a new case to a Score.]
The resulting model cannot be understood by direct inspection
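To make the idea concrete, here is a minimal pure-Python sketch of stochastic gradient boosting for a binary outcome. It is a simplified illustration, not the method used in the case study: it uses depth-1 trees (stumps), fits each stump by least squares to the log-loss gradient without the leaf-wise refinements of reference [1], and all names, hyperparameters, and the toy data are assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, target, rows):
    """Least-squares regression stump (depth-1 CART) fitted on the sampled rows."""
    best = None
    for j in range(len(X[0])):
        values = sorted({X[i][j] for i in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [target[i] for i in rows if X[i][j] <= thr]
            right = [target[i] for i in rows if X[i][j] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    if best is None:  # no split possible: constant prediction
        mean = sum(target[i] for i in rows) / len(rows)
        return (0, math.inf, mean, mean)
    return best[1:]

def stump_predict(stump, x):
    j, thr, left_value, right_value = stump
    return left_value if x[j] <= thr else right_value

def fit_sgb(X, y, n_trees=100, learning_rate=0.1, subsample=0.5, seed=0):
    """Boosting for a binary outcome; assumes both classes occur in y."""
    rng = random.Random(seed)
    n = len(X)
    base_rate = sum(y) / n
    f0 = math.log(base_rate / (1 - base_rate))  # initial log-odds
    scores = [f0] * n
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of the log-loss with respect to the score.
        residuals = [y[i] - sigmoid(scores[i]) for i in range(n)]
        # "Stochastic": each tree sees only a random subsample of the rows.
        rows = rng.sample(range(n), max(2, int(subsample * n)))
        stump = fit_stump(X, residuals, rows)
        stumps.append(stump)
        for i in range(n):
            scores[i] += learning_rate * stump_predict(stump, X[i])
    return f0, learning_rate, stumps

def predict_proba(model, x):
    f0, learning_rate, stumps = model
    return sigmoid(f0 + learning_rate * sum(stump_predict(s, x) for s in stumps))

# Toy data (made up): rows are [recency_days, frequency]; outcome 1 = attrited.
X = [[r, f] for r in range(0, 60, 3) for f in (0.05, 0.25, 0.45, 0.65, 0.85)]
y = [1 if r * f > 5 else 0 for r, f in X]
model = fit_sgb(X, y)
p_lapsed = predict_proba(model, [57, 0.85])  # long lapse for a frequent user
p_recent = predict_proba(model, [0, 0.05])   # card used on the scoring day
print(p_lapsed > p_recent)
```

Stumps yield a purely additive model; capturing interactions between predictors requires trees with at least two levels of splits.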
Agenda
• Ultra-dynamic Attrition Scoring
• Case Study: Credit Card Attrition
─ Machine Learning for Higher Profit
• Category Attrition
Credit Card Case Study
Data and Project Design
• ~5 million accounts. More than 1 billion transactions over 3 years
• Transaction information: Date, Merchant Code, Amount, Authorized Flag
[Figure: project timeline. Model development: Observation period (2 years) up to the Time of Scoring, followed by the Performance period (6 months). Out-of-Time validation: a later Observation period followed by its own Performance period.]
Attrition Performance Definition: Binary indicator of card activity during Performance period
Scoring Exclusions: Inactive
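Statistical Measures of Model Performance
Lift and Precision
[Figure: accounts ranked from High Scores to Low Scores. Target top α % (the High Scores) with retention offer; the targeted group contains Would-be attriters and Non-attriters.]
Lift at α % operating point:
$$\lambda = \frac{\text{Fraction of Attriters Among Targeted}}{\text{Base Attrition Rate}} = \frac{\text{Precision}}{\text{Base Attrition Rate}} = \frac{\#\,\text{Attriters Among Targeted} \,/\, \#\,\text{Targeted}}{\#\,\text{Attriters Total} \,/\, \#\,\text{Total}}$$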
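As a minimal sketch, lift and precision at a given operating point can be computed from scores and observed outcomes as follows. The function name `lift_at`, the rounding of the targeted count, and the toy data are assumptions; the base attrition rate is assumed to be non-zero.

```python
def lift_at(scores, attrited, alpha):
    """Lift and precision when targeting the top-alpha fraction by score.

    scores   : attrition risk scores (higher = riskier)
    attrited : 1 if the account attrited in the performance period, else 0
    alpha    : targeting fraction, e.g. 0.05 for the top 5 %
    """
    n = len(scores)
    n_targeted = max(1, round(alpha * n))
    # Rank accounts from highest to lowest score and target the top ones.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    targeted = ranked[:n_targeted]
    precision = sum(attrited[i] for i in targeted) / n_targeted
    base_rate = sum(attrited) / n
    return precision / base_rate, precision

# Made-up example: 20 accounts, 5 attriters, target the top 20 %.
scores = [20 - i for i in range(20)]  # account 0 has the highest score
attrited = [1 if i in (0, 2, 7, 12, 18) else 0 for i in range(20)]
lift, precision = lift_at(scores, attrited, 0.20)
print(lift, precision)  # 2.0 0.5
```

Here two of the four targeted accounts are attriters (precision 0.5) against a base rate of 0.25, so targeting by score finds attriters twice as often as random targeting.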
Profit from a Retention Campaign
Actual Behavior of Targeted Customer | Profit Contribution per Customer | Fraction of Targeted Customers with this Behavior
Would-be attriter we persuade to stay | (CLV Gain – Contact Cost – Incentive Cost) | Precision * Persuasion Rate
Unpersuadable attriter | (No CLV Gain – Contact Cost) | Precision * (1 – Persuasion Rate)
Non-attriter, erroneously targeted | (No CLV Gain – Contact Cost – Incentive Cost) | 1 – Precision
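Weighting each profit contribution by its fraction gives the expected profit per targeted customer. The sketch below works through that arithmetic; the function names are illustrative and the contact cost of $1 is a made-up value, since the deck does not state one. Summing the three rows simplifies to Precision * (Persuasion Rate * CLV + Incentive Cost * (1 – Persuasion Rate)) – Contact Cost – Incentive Cost, so only the first term depends on the model.

```python
def profit_per_targeted(precision, persuasion_rate, clv, contact_cost, incentive_cost):
    """Expected profit per targeted customer, summed over the three behaviors."""
    persuaded = precision * persuasion_rate * (clv - contact_cost - incentive_cost)
    unpersuadable = precision * (1 - persuasion_rate) * (-contact_cost)
    non_attriter = (1 - precision) * (-contact_cost - incentive_cost)
    return persuaded + unpersuadable + non_attriter

def profit_closed_form(precision, persuasion_rate, clv, contact_cost, incentive_cost):
    """Same quantity after collecting terms in precision."""
    model_term = precision * (persuasion_rate * clv + incentive_cost * (1 - persuasion_rate))
    return model_term - contact_cost - incentive_cost

# Made-up inputs: precision 50 %, persuasion rate 20 %, CLV $1,000,
# contact cost $1 (assumption), incentive cost $100.
value = profit_per_targeted(0.5, 0.2, 1000, 1, 100)
print(round(value, 2))  # 39.0
```

Because the contact and incentive cost terms do not depend on precision, they cancel when two models are compared at the same targeting fraction.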
Profit Gain From Attrition Model Improvement[2]
$$\text{Gain} = (\lambda_B - \lambda_A)\, N\, \alpha\, \beta_0 \left(\gamma\, \text{CLV} + \delta\,(1 - \gamma)\right)$$
is the Portfolio Profit Gain from improving model B over model A, where:
Symbol | Meaning | Value
λA | Lift from model A | Will benchmark alternative models
λB | Lift from model B | Will benchmark alternative models
α | Targeting Fraction | 5%
β0 | Base Attrition Rate | 8%
N | Portfolio Size | 5 million
CLV | Customer Lifetime Value | $1,000
δ | Incentive Cost | $100
γ | Persuasion Rate | 20%
The values for α, β0, N, CLV, δ and γ are portfolio-specific assumptions.
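The formula is easy to evaluate directly. The sketch below plugs in the portfolio assumptions listed above together with the lift values the deck reports for its benchmark models (6.03, 6.54 and 7.52); the function name `portfolio_gain` is an assumption.

```python
def portfolio_gain(lift_a, lift_b, n, alpha, beta0, clv, delta, gamma):
    """Gain = (lambda_B - lambda_A) * N * alpha * beta0 * (gamma*CLV + delta*(1 - gamma))."""
    return (lift_b - lift_a) * n * alpha * beta0 * (gamma * clv + delta * (1 - gamma))

# Portfolio assumptions from the slide.
assumptions = dict(n=5_000_000, alpha=0.05, beta0=0.08, clv=1000, delta=100, gamma=0.20)

gain_2_vs_1 = portfolio_gain(6.03, 6.54, **assumptions)
gain_3_vs_1 = portfolio_gain(6.03, 7.52, **assumptions)
print(f"${gain_2_vs_1 / 1e6:.2f} MM")  # $2.86 MM
print(f"${gain_3_vs_1 / 1e6:.2f} MM")  # $8.34 MM
```

Under these assumptions each additional 0.1 of lift is worth about $0.56 MM, since N * α * β0 * (γ CLV + δ (1 − γ)) = 20,000 * $280 = $5.6 MM per unit of lift.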
Benchmarking Predictive Models of Increasing Complexity
• How much can we gain by making models more complex?
• Are complex models robust over time?
[Figure: Models 1 to 3 arranged by increasing Dimensionality of Feature Space.]
Model 1: Additive model in R and F of card use
Model 2: Interaction model in R and F of card use
Model 3: Interaction model in R and F of complex events
R: Recency, F: Frequency
Complex Event Examples
• Recent restaurant visit and frequent hotels
• More than $1,000 spent on travel last week
• Recent car deal and frequently at the pump
Interaction Detection Experiment
⇒ Should Capture (Recency × Frequency) Interactions
• Predictors: Recency and Frequency of card use
─ Model 1: Additive, nonlinear in R and F
─ Model 2: Captures interaction between R and F
Out-of-sample / Out-of-time validation: λ1 = 6.03, λ2 = 6.54
⇒ Gain = $2.86 MM s.t. portfolio assumptions
• Interaction effect in agreement with research by Fader and Hardie[3]
Interaction Visualization Tells Story
Two-dimensional Partial Dependence Function[4]
[Figure: surface of the probability to use the card during the next 6 months (= 1 – Pr(Attrition)) over Recency (days since last card use) and Frequency (fraction of days card used). Marked points: Spence (R=20, F=0.05) and Attila (R=20, F=0.55).]
Attila is at higher risk of attrition because his card use has lapsed for an unusually long time interval
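A two-dimensional partial dependence function averages the model's prediction over the data while two chosen predictors are forced to fixed grid values. The sketch below shows that computation for an arbitrary `predict` callable. The stand-in model `toy_predict`, its third predictor, and the data rows are made up for illustration and are not the fitted model from the case study.

```python
def partial_dependence_2d(predict, X, j, k, grid_j, grid_k):
    """Average prediction over rows of X with features j and k set to grid values."""
    surface = []
    for vj in grid_j:
        row = []
        for vk in grid_k:
            total = 0.0
            for x in X:
                x_mod = list(x)              # copy so the data is not modified
                x_mod[j], x_mod[k] = vj, vk  # force the two features of interest
                total += predict(x_mod)
            row.append(total / len(X))
        surface.append(row)
    return surface

def toy_predict(x):
    """Hypothetical stand-in for Pr(card use in next 6 months); not a fitted model."""
    recency, frequency, monthly_spend = x
    lapse = recency * frequency  # recency measured in typical gaps between uses
    return (1.0 / (1.0 + 0.2 * lapse)) * (0.8 + 0.2 * min(monthly_spend / 1000, 1.0))

# Made-up data rows: [recency_days, frequency, monthly_spend].
X = [[3, 0.40, 200], [10, 0.10, 1500], [45, 0.02, 50], [1, 0.90, 800]]
surface = partial_dependence_2d(toy_predict, X, 0, 1, [20], [0.05, 0.55])
p_spence, p_attila = surface[0]  # both at R=20; F=0.05 versus F=0.55
print(p_spence > p_attila)  # True
```

With this stand-in, the same 20-day lapse lowers the predicted probability of card use much more for the frequent user, which is the kind of Recency × Frequency interaction the slide illustrates.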
Featurization Experiment
⇒ Should Capture Complex Events in Your Models
• Define R and F features for complex events
• Model 3: Candidate predictors include:
Card use events
+ Hundreds of merchant category events
+ Monetary events defined by spending bands
+ No-authorization events
Out-of-sample / Out-of-time validation: λ3 = 7.52 (Recall: λ1 = 6.03, λ2 = 6.54)
⇒ Gain over Model 1 (simple, additive) = $8.34 MM s.t. portfolio assumptions
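One way to build such features is to map each transaction to the events it triggers and then compute Recency and Frequency per event type. The sketch below assumes transactions are tuples of the four fields listed in the case study (date, merchant code, amount, authorized flag). The event names, the spending band edges, the merchant codes, and the choice to treat a declined transaction only as a no-authorization event are illustrative assumptions.

```python
from datetime import date

def event_names(txn, bands=(100, 1000)):
    """Map one transaction to the names of the events it triggers."""
    day, merchant_code, amount, authorized = txn
    if not authorized:
        # Assumption: a declined transaction counts only as a no-authorization event.
        return ["no_authorization"]
    band = sum(amount >= edge for edge in bands)  # index of the spending band
    return ["card_use", f"mc_{merchant_code}", f"spend_band_{band}"]

def rf_features(txns, obs_start, scoring_date):
    """Recency (R_) and Frequency (F_) features for every event type observed."""
    obs_days = (scoring_date - obs_start).days
    days_by_event = {}
    for txn in txns:
        if obs_start <= txn[0] < scoring_date:
            for name in event_names(txn):
                days_by_event.setdefault(name, set()).add(txn[0])
    features = {}
    for name, days in days_by_event.items():
        features[f"R_{name}"] = (scoring_date - max(days)).days
        features[f"F_{name}"] = len(days) / obs_days
    return features

# Made-up transactions with hypothetical merchant codes.
txns = [
    (date(2015, 1, 2), "5812", 40.0, True),
    (date(2015, 1, 5), "5812", 250.0, True),
    (date(2015, 1, 8), "5541", 30.0, False),
]
features = rf_features(txns, date(2015, 1, 1), date(2015, 1, 11))
print(features["R_card_use"], features["F_card_use"])  # 6 0.2
```

Event types that never occur for an account produce no keys here; a production pipeline would need a convention for those missing values.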
Learning Curves Experiment
⇒ Should Exploit Larger Samples to Develop More Complex Models
[Figure: learning curves. Vertical axis: Lift (O-o-S / O-o-T), ticks at 5, 6 and 7. Horizontal axis: # Training Samples, ticks at 1,000, 10,000 and 100,000. Curves: Model 3 (high-dim complex events) and Model 2 (card R and F only).]
Agenda
• Ultra-dynamic Attrition Scoring
• Case Study: Credit Card Attrition
• Category Attrition
─ Detecting Subtle Forms of Attrition
Merchant Category (MC) Attrition
• Hundreds of credit card MC's
• Performance definition for a specific MC:
─ Stop buying from this MC while continuing card use for other MC's
• May signal competitive influence or early belt-tightening before total attrition occurs. Quick detection informs rapid intervention
[Figure: one model per status. Card-level model: Overall customer status. Grocery model: Grocery status. Travel model: Travel status. Gas station model: Gas station status.]
Possible interventions: Offer incentives at service stations, or start customer dialogue
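The MC-specific performance definition can be expressed as a small labeling function. This is a sketch under assumptions: the function name, the category strings, and the decision to return `None` for customers with no card use at all (left to the card-level model) are illustrative, and the customer is assumed to have bought from the category during the observation period.

```python
def mc_attrition_label(perf_merchant_codes, mc):
    """Label for one customer and one merchant category (MC).

    perf_merchant_codes : merchant codes of the customer's transactions
                          during the performance period
    Returns 1 if the customer stopped buying from `mc` while still using the
    card elsewhere, 0 if `mc` is still used, None if there is no card use at all.
    """
    codes = set(perf_merchant_codes)
    if not codes:
        return None  # total attrition: handled by the card-level model
    return 0 if mc in codes else 1

# Made-up customers, all of whom bought gas during the observation period.
print(mc_attrition_label(["grocery", "travel"], "gas"))  # 1
print(mc_attrition_label(["gas", "grocery"], "gas"))     # 0
print(mc_attrition_label([], "gas"))                     # None
```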
Summary
• Daily attrition scoring quickly detects emergent attrition, signaled by an unusually long time lapse since last transaction
• With large transaction volumes, more complex models are more profitable
• Machine learning helps with insight, automation, scale
References
[1] Greedy Function Approximation: A Gradient Boosting Machine, by Jerome Friedman, The Annals of Statistics, 29(5), 2001, 1189-1232.
[2] Defection Detection: Measuring and Understanding the Predictive Accuracy of Customer Churn Models, by Scott Neslin et al., Journal of Marketing Research, 43(2), 2006, 204-211.
[3] RFM and CLV: Using Iso-Value Curves for Customer Base Analysis, by Peter Fader, Bruce Hardie, and Ka Lok Lee, Journal of Marketing Research, 42(4), 2005, 415-430.
[4] Predictive learning via rule ensembles, by Jerome Friedman et al., The Annals of Applied Statistics, 2(3), 2008, 916-954.
Thank You
Dr. Gerald Fahner
+1 512 5323621
[email protected]