CON6545_Jagannath
Transcription
Market Basket & Advanced Analytics at Dunkin Brands
Mahesh Jagannath, Prasanna Palanisamy
Oct 1, 2014

Confidential information: Copying, dissemination or distribution of this information is strictly prohibited.

Agenda
• About Dunkin Brands Inc.
• BI Program at Dunkin Brands
• BI Architecture at Dunkin Brands
• Advanced Analytics Architecture & Methodology
• Advanced Analytics Use Cases at Dunkin
  • Market Basket
  • Customer Analytics
• Q&A

Disclaimer
All data used is sample data for presentation purposes only and is not actual corporate sales or consumer data.

About Dunkin Brands

BI Program at Dunkin Brands
• First launched at DBI in 2007
• 1350 BI users today with role based access to 504 dashboard pages
• Mature governance process
• Domestic POS sales analysis to increase comparable store sales and profitability of DD and BR in U.S.
• Store development dashboards to identify opportunities to continue DD U.S. contiguous store expansion
• International reported sales analysis to drive accelerated international growth across both brands

BI/DW Architecture at Dunkin Brands
[Architecture diagram. Labels shown: Other DBI Data; Hyperion Users; Exadata; Exalytics; Enterprise Data Warehouse; Oracle EBS; Hyperion; R; Radiant Sales Data; Oracle BI; DBI Corporate Users; Intl. POS; Franchisees (above store); Social Media; Loyalty / CRM; Steton; SMG; PAR; RPS; Bluecube; PAR Terminals; RPS Archive]

Advanced Analytics Platform
• Products Considered
  • Oracle Advanced Analytics / Oracle R Enterprise (ORE)
  • Open Source R
  • IBM SPSS
• Chose Oracle Advanced Analytics
  • Excellent fit with existing analytics infrastructure
  • All the benefits of open source R
  • Scalability of Oracle 11g on engineered systems

R: Widely Popular
R is a statistics language similar to Base SAS or SPSS statistics.
Strengths
• Powerful & extensible
• Graphical & extensive statistics
• Free, open source
Challenges
• Memory constrained
• Single threaded
• Outer loop: slows down process
• Not industrial strength

Oracle Advanced Analytics: Oracle R Enterprise Component Architecture
[Diagram]

Oracle Advanced Analytics: Oracle R Enterprise Compute Engines
[Diagram]

Advanced Analytics Methodology
[Cycle diagram]
1. Identify Business Objective
2. Understand Data
3. Prepare Data
4. Develop Model
5. Test Model
6. Deploy Model
7. Monitor Performance & Re-calibrate

ORE Advanced Analytics Framework
[Diagram]

Market Basket Analysis
Identify Business Objective
• Understand role of category and purchase behavior
• Identify category marketing opportunities
• Get richer insight into behavioral changes from promotions
Understand Data
Prepare Data
• Apply data validation rules
• Transform POS data into MB input format
• Output to star schema suitable for OBIEE consumption
Develop Model
• Pairwise association model similar to Apriori, custom SQL implementation
Test Model
Deploy Model
Monitor Performance & Re-calibrate

Market Basket Business Questions
Choose a Category (Sub Category level), then answer the following questions for that item in a particular region last week:
• What % of all transactions include [Product]?
• What related items are sold most frequently with [Product]?
• What is the average ticket $ amount when [Product] is present?
• On average, how many [Product] are sold in each transaction?
• What beverages are consumers buying most with [Product]?
• In what % of [Product] transactions is [Product] the only product purchased?
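The six business questions above reduce to simple counts over transactions that contain the product of interest. The following is a minimal Python sketch of those counts, not the custom SQL implementation the deck describes. The sample transactions, the `BEVERAGES` category set, and the function name `basket_answers` are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample data: transaction id -> list of (item, price) lines.
TRANSACTIONS = {
    1: [("coffee", 2.0), ("donut", 1.0)],
    2: [("coffee", 2.0)],
    3: [("donut", 1.0), ("donut", 1.0), ("bagel", 1.5)],
    4: [("coffee", 2.0), ("donut", 1.0)],
    5: [("donut", 1.0)],
}

# Hypothetical category lookup used for the "beverages" question.
BEVERAGES = {"coffee", "tea"}


def basket_answers(transactions, product):
    """Answer the six business questions for one product of interest."""
    # Keep only the transactions that contain the product at least once.
    with_product = {
        tid: lines
        for tid, lines in transactions.items()
        if any(item == product for item, _ in lines)
    }
    n_with = len(with_product)
    if n_with == 0:
        return None

    # Pairwise co-occurrence: count each related item once per transaction.
    related = Counter()
    for lines in with_product.values():
        related.update({item for item, _ in lines if item != product})

    total_ticket = sum(
        price for lines in with_product.values() for _, price in lines
    )
    total_units = sum(
        1 for lines in with_product.values() for item, _ in lines if item == product
    )
    only_product = sum(
        1
        for lines in with_product.values()
        if all(item == product for item, _ in lines)
    )

    return {
        "pct_of_transactions": 100.0 * n_with / len(transactions),
        "related_items": related.most_common(),
        "avg_ticket": total_ticket / n_with,
        "avg_units_per_transaction": total_units / n_with,
        "top_beverages": [
            (item, count)
            for item, count in related.most_common()
            if item in BEVERAGES
        ],
        "pct_only_product": 100.0 * only_product / n_with,
    }


answers = basket_answers(TRANSACTIONS, "donut")
for question, value in answers.items():
    print(question, value)
```

Counting a related item once per transaction, rather than once per line, keeps the result a transaction-level measure, which matches how the questions are phrased.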
Data Analysis & Design Considerations
• 8M daily transactions, ~25M transaction detail lines
• 20 TB data warehouse size, sales data about 10 TB
• Hierarchies: 5 level Product, 2x4 level Org, 4 level regional
• ~1000 SKUs at Item Group/Size level
• Exponential growth in combinations with each hierarchy
• 2 years of pre-computed Market Baskets and associated sales measures for reporting
• Nightly compute within ETL window; data with 1 day latency
• Measures are non-additive along the Product Hierarchy

Design: Approaches Considered
1. Use Oracle Data Mining / Oracle R Enterprise Association Rules
2. Use Frequent Itemset table function in Oracle 11g to compute item-sets
3. Custom SQL development
Approach Chosen
• Oracle Advanced Analytics for exploration / ad hoc
• Custom SQL for repeatable basket computation
• OBIEE for reporting

High-level Design
Transaction Data → Data Model / Preprocessing → Rule Development → Measure Calculation → UI / Reports

4 Key Reports
• % of transactions containing related items
• Transaction Detail: Product of Interest
• Related Product Pairings
• Single Item Transactions: % of transactions when products are purchased alone

Related Item
What beverages are sold most often with PM Flats?

POI Transaction Detail
Transaction Detail: Product of Interest

Related Purchases
Related Product Pairings

Related Transactions
Non-additive measures: 5 + 3 + 3 transactions at the lower level do not add up to 11 transactions at the level above in this case, because some medium and small coffees might be sold in the same transaction.

Single Item Transactions
Click to drill down for more detail.
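The non-additivity of transaction counts along the product hierarchy can be shown with a small Python sketch. The baskets below are hypothetical, constructed so that the item-level counts are 5, 3 and 3 as on the slide. The item names, the `PARENT` mapping, and the amount of overlap between medium and small coffees are assumptions made for illustration only.

```python
# Hypothetical product hierarchy: item (child level) -> category (parent level).
PARENT = {
    "coffee_large": "coffee",
    "coffee_medium": "coffee",
    "coffee_small": "coffee",
}

# Hypothetical sample data: transaction id -> set of items in the basket.
BASKETS = {
    1: {"coffee_large"},
    2: {"coffee_large"},
    3: {"coffee_large"},
    4: {"coffee_large"},
    5: {"coffee_large"},
    6: {"coffee_medium"},
    7: {"coffee_medium", "coffee_small"},
    8: {"coffee_medium", "coffee_small"},
    9: {"coffee_small"},
}


def transactions_by_item(baskets):
    """Number of transactions containing each item."""
    counts = {}
    for items in baskets.values():
        for item in items:
            counts[item] = counts.get(item, 0) + 1
    return counts


def transactions_by_parent(baskets, parent):
    """Number of transactions containing at least one item of each parent."""
    counts = {}
    for items in baskets.values():
        # A set ensures each parent is counted once per transaction.
        for category in {parent[item] for item in items}:
            counts[category] = counts.get(category, 0) + 1
    return counts


item_counts = transactions_by_item(BASKETS)
parent_counts = transactions_by_parent(BASKETS, PARENT)

print(sum(item_counts.values()))  # 11: sum of the item-level counts
print(parent_counts["coffee"])    # 9: distinct transactions containing coffee
```

Because the parent-level figure cannot be derived by summing child-level figures, a transaction count has to be computed separately at each hierarchy level, which is consistent with the deck's choice to pre-compute baskets and measures.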

Current Areas of Interest
• Customer Profiling
• Clustering / Segmentation
• Customer Churn Prediction
• Targeted Promotions

Customer Profiling
Identify Business Objective
Understand Data
• Compute behavioral variables
• Create Customer record
• Data Exploration in R
Prepare Data
Develop Model
Test Model
Deploy Model
Monitor Performance & Re-calibrate

Customer Profiling: Attributes
List of customer attributes used as-is or derived from their transactional history.

Descriptive
1. Customer ID
2. City
3. State
4. DMA
5. Age
6. Profession

Spend / Check
1. Min Check
2. Max Check
3. Total Spend
4. Average Weekly Spend
5. Total points earned
6. % Points redeemed
7. Total No. of coupons redeemed
8. Total discount amount (Coupons)
9. Avg weekly coupon redeemed

Transaction / Frequency
1. Start Date
2. Last transaction date
3. Days since last transaction
4. Total transactions / Visits
5. Average weekly visits
6. % discounted visits
7. Top Day part
8. Daypart: % Visits
9. Preferred Store
10. Multi Store flag
11. Average DD Card Recharge Amount
12. Average DD Card Recharge Frequency
13. Days since last recharge
14. Current card balance
15. Transaction Activity in weeks

Store Features
1. POS: drive thru or not
2. Combo or not
3. Wifi

Historical Purchase
1. Total Spend / Category
2. % spend on each Category
3. % spend Sub category
4. Average number of items per transaction
5. Preferred item combo

Customer Segmentation / Clustering
Identify Business Objective
• To understand your customers
• Targeted Marketing
• Design Promotions
Understand Data
• Compute behavioral variables
• Create Customer record
• Data Exploration in R
Prepare Data
• Identify variables for clustering
• Normalize data for clustering
Develop Model
• K-Means Clustering used to cluster customers and find individual cluster characteristics
Test Model
• Model displays cluster means (cluster properties)
• Number of customers in a cluster
Deploy Model
• Deployed for targeted marketing and monitoring customer behavior
Monitor Performance & Re-calibrate
• Re-run the model periodically to update the new clusters
• Indicates any shift in the customer behavior

Customer Segmentation / Clustering
Analyze Cluster Means to Derive Cluster Properties
Customer Data Profiles → Clustering Algorithm → Clusters
• Regulars
  • Avg weekly visits are 5
  • 78.2% visits in morning
  • Mostly coffee drinkers, but food buyers 25% of the time
• Coffee Regulars
  • Avg weekly visits are 5.45
  • Avg coffee transactions 80.29%
• High Spenders, Frequent Visitors
  • Avg weekly spend ($35.12)
  • Avg weekly visits (7.44)
  • Coffee and food in basket (avg items per transaction 2.4)

Customer Churn Analysis
Identify Business Objective
• Define Churn & Active Customer
• Identify Churn Customer patterns
• Is the churn pattern localized or national?
Understand Data
• Compute behavioral variables
• Create Customer record
• Data Exploration in R
Prepare Data
• Create training data set
• Should have equal distribution of churn and usual customers
Develop Model
• Model to derive churn risk score
  • SVM
  • Logistic regression
  • Naïve Bayes Classifier
Test Model
• Test the model on test data set, for which outcome is known
• Select threshold for model selection
• Confusion Matrix for the best model

Class    Active   Churn
Active   71.93%   28.07%
Churn    15.37%   84.63%

Deploy Model
• Model will calculate the churn score for existing customers
• Flag customers with high risk, low risk based on churn score
Monitor Performance & Re-calibrate
• Monitor the response and re-calibrate by updating training data or model parameters
• Calculate the metrics for model evaluation

Possible Future Initiatives
• Periodic Churn Rate Modeling: measure churn over time
• Customer Segments based on buying pattern: what they buy, when they buy?
• Identify customers who are more likely to respond to offers
• Personalized promotions for retention
• Customer Lifetime Value
• Customer Sentiment Analysis
• Enrich customer profiles with modeling scores

Q&A
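The confusion matrix on the Customer Churn Analysis slide (71.93% / 28.07% and 15.37% / 84.63%) has rows that each sum to 100%, so it appears to be row-normalised. The Python sketch below derives a few evaluation metrics from it and shows a simple score-based risk flag. Three things here are assumptions, not statements from the deck: that rows are actual classes and columns are predicted classes, that the two classes are equally represented (the slide asks for this in the training set only), and the 0.5 threshold in the hypothetical `risk_flag` helper.

```python
# Row-normalised confusion matrix from the slide, in percent.
# ASSUMPTION: outer keys are actual classes, inner keys are predicted classes;
# the slide does not label the axes.
CONFUSION = {
    "Active": {"Active": 71.93, "Churn": 28.07},
    "Churn": {"Active": 15.37, "Churn": 84.63},
}

# ASSUMPTION: equal class shares, mirroring the balanced training set.
EQUAL_PRIORS = {"Active": 0.5, "Churn": 0.5}


def recall(matrix, cls):
    """Share of actual `cls` customers that the model labels as `cls`."""
    row = matrix[cls]
    return row[cls] / sum(row.values())


def balanced_accuracy(matrix):
    """Mean of the per-class recalls."""
    return sum(recall(matrix, cls) for cls in matrix) / len(matrix)


def precision(matrix, cls, priors):
    """Share of customers labelled `cls` that really are `cls`.

    A row-normalised matrix carries no class sizes, so the class
    shares have to be supplied through `priors`.
    """
    predicted_as_cls = sum(
        priors[actual] * matrix[actual][cls] / sum(matrix[actual].values())
        for actual in matrix
    )
    true_positive = priors[cls] * recall(matrix, cls)
    return true_positive / predicted_as_cls


def risk_flag(score, threshold=0.5):
    """Hypothetical flagging rule; the deck does not state its threshold."""
    return "high risk" if score >= threshold else "low risk"


print(round(100 * recall(CONFUSION, "Churn"), 2))        # 84.63
print(round(100 * balanced_accuracy(CONFUSION), 2))      # 78.28
print(round(100 * precision(CONFUSION, "Churn", EQUAL_PRIORS), 2))
print(risk_flag(0.82), "/", risk_flag(0.10))
```

If the real customer base has far fewer churners than active customers, the precision computed with equal priors would overstate what is seen in deployment, so the `priors` argument should reflect the population being scored.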