ITPG-CTG CYQ1 QBR
Transcription
Scale-out Object Store for PB/hr Backups and Long Term Archive
April 24, 2014
Gideon Senderov
Director, Advanced Storage Products
NEC Corporation of America

Long-Term Data in the Data Center
[Chart: Consumption of Enterprise Disk Capacity by Type (EB), 2011-2016, with CAGR per type. Series: Content Depots & Cloud, Unstructured data, Replicated data, Structured data. CAGR labels on the chart: 46.0%, 83%, 47.5%, 28.8%, 22.5%. Source: IDC, November 2012]
Page 2 © NEC Corporation of America 2014

PBBA Market Forecast (CY2012 - CY2017)
Market is growing fast. (Source: IDC)

Archive Market Trend
▐ Rapid growth expected in the disk-based archive market due to wide usage of active archive, growing at a CAGR of 35% and reaching $5.7B WW in 2016
▐ File- and Object-Based Storage (FOBS) market exceeding $23B in 2013, projected to grow at a CAGR of 25% and reach $38B WW in 2017 (IDC, August 2013)
[Chart: WW Disk based Archive sales forecast, sales revenue ($M), 2010-2016, CAGR 35%. Source: IDC, June 2013]

Storage Infrastructure Spending

Scale-out Object Storage
▐ Object Storage
  A storage architecture that manages data as objects, as opposed to other storage architectures like file systems, which manage data as a file hierarchy, and block storage, which manages data as blocks within sectors and tracks.
  Object storage systems allow relatively inexpensive, scalable and self-healing retention of massive amounts of unstructured data.
Source: Wikipedia, April 2014

Legacy Solutions – Scalability Limitations
▐ Inadequate scalability of capacity & performance
  Cannot scale performance to keep up with data growth
  Multiple products with different architectures
  More siloed capacity to manage
▐ Limited deduplication scope
  Limited scalability proliferates duplicate data across appliances
  Lower deduplication ratio for large environments

Local vs. Global Deduplication
▐ Single deduplication repository across entire solution
▐ Data deduplication across ALL data from ALL nodes
  Cross-node deduplication for greater efficiency
  Cross-application deduplication leveraging application-aware deduplication

Legacy Solutions – Resiliency Limitations
[Diagram: deduplicated backups for Day 1 (Full), Day 2 (Incremental), Day 3 (Incremental) ... Day 7 (Full), all referencing a shared pool of unique blocks 1-8]
Q: What data can be restored if block #1 is lost?
A: NONE!
Traditional RAID is not sufficient for deduplicated data

Scale-out Scalable Grid Storage Architecture
[Diagram labels: Accelerator Nodes, Hybrid Nodes, Storage Nodes]
Community of Smart Nodes
• Industry-standard servers
• Multiple types allowed
• Heterogeneous & open
Unrestricted Scalability
• Fully distributed system
• PERFORMANCE SCALABILITY
• CAPACITY SCALABILITY
Intelligent Management SW
• Self-aware & self-organizing
• Data management services
• Virtualizes hardware platform

Adaptive Grid Storage For Long-Term Data
Online Upgrade/Expansion with Multi-generation Nodes
[Diagram: Non-disruptive Remove, Non-disruptive Add/Upgrade; V2 Grid + V3 + V4 + Vx = 1 System]
▐ Enable in-place technology refresh with no data migration
▐ Ever greener storage with faster, denser components
▐ Enable continuous data availability
▐ Reduce CapEx and OpEx with deduplication
▐ Non-disruptive scaling with dynamic auto-provisioning of storage
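
The point made on the "Legacy Solutions – Resiliency Limitations" slide can be illustrated with a small sketch. The block layout below is hypothetical (the slide's exact layout is not reproduced here), and `backups` and `restorable` are illustrative names, not part of any product. The sketch only shows why a single lost block in a deduplicated pool can affect every backup that references it.

```python
# Hypothetical recipes: each backup is a list of block IDs drawn from one
# shared deduplicated pool of unique blocks 1-8. This layout is an assumption
# made for illustration; it is not the layout drawn on the slide.
backups = {
    "day1_full": [1, 2, 3, 4],
    "day2_incr": [1, 5],
    "day3_incr": [1, 6],
    "day7_full": [1, 2, 7, 8],
}


def restorable(backups, lost_blocks):
    """Return the names of backups that reference none of the lost blocks."""
    lost = set(lost_blocks)
    return [name for name, blocks in backups.items() if lost.isdisjoint(blocks)]


# Block 1 is shared by every backup, so losing it leaves nothing restorable.
print(restorable(backups, lost_blocks=[1]))  # []

# Block 8 is referenced only by the Day 7 full, so the other three survive.
print(restorable(backups, lost_blocks=[8]))
```

Without deduplication each backup would hold its own copy of block 1, so a single loss would damage one backup at most. This is why the slide argues that deduplicated data needs stronger protection than traditional RAID.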

Hands-Free Management
▐ Simple, fast deployment
  < 45 minutes to backup & archive
▐ Self-discovering capacity
  No storage provisioning
  No tape emulation tasks
▐ Self-tuning and Resource Management
  Optimized performance & capacity
▐ Self-healing
  Automatic recovery across resources
▐ Web-browser GUI
  Monitoring and Planning

Advanced Erasure-Coded Data Resiliency
▐ “User dialable” disk/node protection
  Default protection against 3 concurrent failures
  Dynamically allocated
  Intermix of multiple resiliency levels (1-6) for different applications
▐ Greater protection with lower overhead
  Default setting: 25% capacity overhead
  1.5x greater protection than traditional RAID 6 with lower overhead and faster recovery
  No idle spare drives
▐ Faster self-healing with less performance degradation
  Only data is reconstructed rather than entire drive
  Data is reconstructed across multiple spindles

Lab Benchmark Test

Scale-out vs. Scale-up Data Deduplication
Scale-up [Diagram: Controller with Shelves]
▐ Single or Multi-Controller
▐ Fixed maximum throughput
▐ Scalability limited by controller physical capabilities
Scale-out [Diagram: Accelerator Nodes (AN) with Storage Nodes (SN)]
▐ Modular scale-out front-end
▐ Scalable maximum throughput
▐ Scalability independent of physical capabilities with linear performance

Scale-out Inline Global Data Deduplication
[Diagram: Accelerator Nodes (AN) in front of Storage Nodes (SN)]
▐ Distributed Two-Tier Architecture
  Independent linear scalability of performance & capacity
▐ Global Deduplication
  Data deduplication across ALL data from ALL nodes
▐ Distributed Hash Table
  Data routed to responsible Storage Node
  Deduplication & hash table processing scales linearly with Storage Nodes
  Prevents silos of deduped data

Linear Performance Scalability

Scale-out Deduplication Performance

Application Metadata Impact
[Diagram labels: Clients’ Files, Agent Operation, File Aggregation (tar), Blocking, Backup Server, Storage Media, Metadata Filtering]
▐ Application metadata makes user data appear different
▐ Inserted metadata reduces deduplication efficiency

Application-Aware Deduplication
Capacity Optimization through Enhanced Deduplication
[Chart: Data (Bytes) vs. Time (Weeks). Series: Original Data, Generic Dedupe, Application-Aware Dedupe]
▐ Application-aware deduplication leverages format awareness to filter metadata inserted by the application and deduplicate the data payload separately
▐ Application-Awareness can increase Reduction Ratio by 130% or more

Disaster Recovery and Cloud
[Diagrams: WAN-optimized Replication (Site A, Site B); Geo-Distributed Grid (Site A, Site B, Site C)]

WAN-Optimized Replication
[Diagram: Accelerator Nodes, Storage Nodes; many-to-one replication, many-to-many replication]
▐ Asynchronous grid-to-grid WAN-optimized replication for DR
▐ Deduplication across all replicated HYDRAstor grids
  Minimizes network bandwidth requirements
  Minimizes DR site capacity requirements
▐ Policy-based data selection – File System Granularity
▐ In-flight encryption

Customer Configuration Example
▐ Replication across primary and secondary datacenter grid systems
▐ Replication from remote sites within the U.S., and across the Atlantic and the Pacific

HYDRAstor Gen-4 Family
[Chart axes: Performance vs. Capacity]
Grid Models
  HS8-4165-7920: 802TB/hr, 1,040TB/hr (OST); 7,920TB Raw; 103PB Effective
  HS8-4006-720: 29.2TB/hr, 37.8TB/hr (OST); 720TB Raw; 9.4PB Effective
  HS8-4002-192: 9.7TB/hr, 12.6TB/hr (OST); 192TB Raw; 2.5PB Effective
  HS8-4002-96: 9.7TB/hr, 12.6TB/hr (OST); 96TB Raw; 1.2PB Effective
Single Node Model
  HS8-4001-96: 4.9TB/hr, 5.6TB/hr (OST); 96TB Raw; 1.2PB Effective
HS3-410 and HS8-4001 Node Building Blocks (Hybrid Node, Storage Node)
  4.9TB/hr, 6.3TB/hr (OST); 12-48TB Raw; 156-624TB Effective
  3.2TB/hr, 3.6TB/hr (OST); 8-24TB Raw; 104-312TB Effective

Common Code Base and Functionality
▐ Common code/features and modular scalability across models
▐ Intermix of multi-generation nodes in the same grids
▐ All Software Features supported for entire product line
  DRD™ protection – Advanced erasure-coded data resiliency
  DataRedux™ – Inline application-aware global data deduplication and compression
  Cloning/Snapshot – Instant fully deduplicated file system or file R/W copy
  Dynamic Data Shredding – Data shredding for deleted classified data
  AN Failover – Front-end failover for High Availability
  RepliGrid™ (Option) – WAN-optimized replication with in-flight encryption
  HYDRAlock™ (Option) – WORM file system functionality
  Encryption at Rest (Option) – Encryption to protect data at rest
  HYDRAstor OpenStorage Suite (Options)
  • Express I/O – Lightweight data transport for high throughput
  • Dynamic I/O – Adaptive I/O load balancing across nodes
  • Optimized Copy – WAN-optimized copy services
  • Optimized Synthetics – Storage-synthesized full backups

Scale-out Storage for Long-Term Data
1 PB/hr
100 PB

Potential Enhancement Directions
[Diagram labels: Performance, Functionality]
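
As a rough illustration of the "Distributed Hash Table" bullet on the Scale-out Inline Global Data Deduplication slide, the sketch below routes each chunk to a storage node chosen from its content hash, so identical chunks meet on the same node no matter where they were written from. This is a minimal sketch under stated assumptions: the class names, the SHA-256 fingerprint, and the simple modulo placement are illustrative choices, not HYDRAstor's actual implementation.

```python
import hashlib


class StorageNode:
    """Hypothetical storage node holding its own slice of the hash table."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}  # fingerprint -> chunk bytes

    def put(self, fingerprint, chunk):
        """Store the chunk if unseen; return True if it was a duplicate."""
        if fingerprint in self.chunks:
            return True
        self.chunks[fingerprint] = chunk
        return False


class Grid:
    """Hypothetical grid that routes chunks by content fingerprint."""

    def __init__(self, node_names):
        self.nodes = [StorageNode(name) for name in node_names]

    def route(self, fingerprint):
        # Assumption: plain modulo placement, chosen only for brevity.
        index = int(fingerprint[:16], 16) % len(self.nodes)
        return self.nodes[index]

    def write(self, chunk):
        """Write a chunk; return (responsible node name, was_duplicate)."""
        fingerprint = hashlib.sha256(chunk).hexdigest()
        node = self.route(fingerprint)
        return node.name, node.put(fingerprint, chunk)


grid = Grid(["SN1", "SN2", "SN3", "SN4"])
first = grid.write(b"same payload")   # e.g. written through one front-end
second = grid.write(b"same payload")  # e.g. written through another

# Both writes land on the same node, and the second is seen as a duplicate.
print(first[0] == second[0], first[1], second[1])  # True False True
```

Because the responsible node is derived from the content alone, each node only checks and stores its own share of fingerprints, which is one way to read the slide's claim that hash table processing scales with the number of Storage Nodes. Modulo placement remaps most chunks when the node count changes, so a system that adds nodes online would need a placement scheme that limits such movement, for example consistent hashing.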