Making Fast Databases Faster - H
Making Fast Databases FASTER
@andy_pavlo
Yale University
Columbia University
April 2012

Fast + Cheap

Legacy Systems
[Figure: CPU cycles (0% to 100%) for TPC-C NewOrder. Categories: Real Work, Buffer Pool, Latching, Locking, Logging, B-Tree Keys. Values shown: 12.3%, 29.6%, 10.2%, 18.7%, 21.1%, 8.1%.]
OLTP Through the Looking Glass, and What We Found There
SIGMOD 2008

OLTP Transactions: Fast, Repetitive, Small

Main Memory • Parallel • Shared-Nothing
Transaction Processing
H-Store: A High-Performance, Distributed Main Memory Transaction Processing System
VLDB vol. 1, issue 2, 2008

[Figure: a Client Application sends a Stored Procedure request (Procedure Name, Input Parameters) to the Database Cluster for Execution.]

TPC-C NewOrder
[Figure: throughput (txn/s, 0 to 150,000) versus number of partitions (4 to 64). Series: No Distributed Txns, 20% Distributed Txns.]

Optimization #1: Partition database to reduce the number of distributed txns.
Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems
SIGMOD 2012

CUSTOMER
c_id   c_w_id   c_last    …
1001   5        RZA       -
1002   3        GZA       -
1003   12       Raekwon   -
1004   5        Deck      -
1005   6        Killah    -
1006   7        ODB       -

ORDERS
o_id    o_c_id   o_w_id   …
78703   1004     5        -
78704   1002     3        -
78705   1006     7        -
78706   1005     6        -
78707   1005     6        -
78708   1003     12       -

[Figure: four partitions, each holding CUSTOMER and ORDERS.]

ITEM
i_id     i_name   i_price   …
603514   XXX      23.99     -
267923   XXX      19.99     -
475386   XXX      14.99     -
578945   XXX      9.98      -
476348   XXX      103.49    -
784285   XXX      69.99     -

[Figure: each partition holding CUSTOMER, ORDERS, and ITEM.]

[Figure: the Client Application issues NewOrder(5, “Method Man”, 1234) against partitions that each hold CUSTOMER, ORDERS, and ITEM.]

[Figure: the DDL schema and the workload feed a DTxn Estimator, a Skew Estimator, and a Large-Neighborhood Search Algorithm, which produce a design placing CUSTOMER, ORDERS, and ITEM on each partition.]
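To make the idea behind Optimization #1 concrete, here is a minimal sketch of how a router might decide whether a NewOrder-style request is single-partitioned. It assumes CUSTOMER and ORDERS are partitioned on their warehouse-id columns with simple modulo hashing; that placement rule, the function names, and the parameter names `w_id` and `i_w_ids` used this way are illustrative assumptions, not H-Store's actual routing code.

```python
def partition_for(w_id: int, num_partitions: int) -> int:
    """Map a warehouse id to a partition (modulo hashing is an assumption)."""
    return w_id % num_partitions


def partitions_touched(w_id: int, i_w_ids: list, num_partitions: int) -> set:
    """Return every partition a NewOrder-style transaction would access.

    w_id is the home warehouse; i_w_ids are the warehouses supplying
    each ordered item (hypothetical reading of the parameters).
    """
    touched = {partition_for(w_id, num_partitions)}
    touched.update(partition_for(w, num_partitions) for w in i_w_ids)
    return touched


def is_single_partitioned(w_id: int, i_w_ids: list, num_partitions: int) -> bool:
    """A transaction is single-partitioned if it touches exactly one partition."""
    return len(partitions_touched(w_id, i_w_ids, num_partitions)) == 1


# All items come from the home warehouse: one partition, no coordination.
local = is_single_partitioned(5, [5, 5], num_partitions=4)
# One item comes from warehouse 6: the transaction becomes distributed.
remote = is_single_partitioned(5, [5, 6], num_partitions=4)
```

A partitioning design is better, in this sense, when more of the workload looks like the first call than the second.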
Large-Neighborhood Search
[Figure: inputs are the DDL schema and the workload. Stages shown: Initial Design, Relaxation, Local Search, Restart.]

Throughput (txn/s): Horticulture vs. State-of-the-Art
[Figure: three panels plotting throughput against 4, 8, 16, 32, and 64 partitions. Annotations: TATP +88%, TPC-C +16%, TPC-C Skewed +183%.]

% Single-Partitioned Transactions
Search Times
[Figures: results for TATP, SEATS, TPC-C, TPC-C Skewed, AuctionMark, TPC-E.]

[Figure: Client Application and Database Cluster, with the Undo Log.]

Optimization #2: Predict what txns will do before they execute.
On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems
VLDB, vol. 5, issue 2, October 2011

» Partitions Touched?
» Undo Log?
» Done with Partitions?
[Figure: Client Application and Database Cluster.]

Current State:
Input Parameters: w_id=0 i_w_ids=[0,1] i_ids=[1001,1002]
GetWarehouse: SELECT * FROM WAREHOUSE WHERE W_ID = ?

Estimated Execution Path
Input Parameters: w_id=0 i_w_ids=[0,1] i_ids=[1001,1002]

Transaction Estimate:
Confidence Coefficient: 0.96
Best Partition: 0
Partitions Accessed: {0}
Use Undo Logging: Yes

Throughput (txn/s): Houdini vs. Assume Single-Partitioned
[Figure: three panels plotting throughput against 4, 8, 16, 32, and 64 partitions. Annotations: TATP +57%, TPC-C +126%, AuctionMark +117%.]

Prediction Overhead
[Figure: results for TATP, TPC-C, AuctionMark.]

Conclusion: Achieving fast performance is more than just using only RAM.

Future Work: Reduce distributed txn overhead through creative scheduling.

h-store

Help is Available
+1-212-939-7064
Graduate Student Abuse Hotline
Available 24/7
Collect Calls Accepted
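As a rough illustration of the Transaction Estimate slide under Optimization #2, the sketch below treats an estimated execution path as a sequence of steps, each with a transition probability and a partition, and takes the confidence coefficient to be the product of those probabilities. The slides do not spell out the model, so this structure, the `threshold` parameter, the step name `GetDistrict`, and the probabilities are all assumptions made for the example.

```python
from math import prod


def path_confidence(step_probabilities) -> float:
    """Multiply the per-step probabilities along one estimated path."""
    return prod(step_probabilities)


def estimate(path, threshold: float = 0.9) -> dict:
    """Summarize a path of (query_name, probability, partition) steps.

    The threshold for trusting an estimate is hypothetical.
    """
    confidence = path_confidence(p for _, p, _ in path)
    partitions = {partition for _, _, partition in path}
    return {
        "confidence": confidence,
        "partitions_accessed": partitions,
        "single_partitioned": len(partitions) == 1,
        "trust_estimate": confidence >= threshold,
    }


# Made-up probabilities; GetWarehouse is the query named on the slide,
# GetDistrict is a hypothetical second step.
example = estimate([("GetWarehouse", 1.0, 0), ("GetDistrict", 0.96, 0)])
```

An estimate below the threshold would fall back to a conservative plan, which is the trade-off the Prediction Overhead slide is concerned with.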