Data Domain OpenStorage Primer
White Paper

Abstract

Data Domain’s support for Symantec NetBackup OpenStorage enables the use of disk as disk, eliminating the need to emulate tape drives, tape cartridges, and robots. The Data Domain OpenStorage solution advances the ability to use disk as disk, stores more data on disk with inline deduplication, and simplifies the creation of backup copies with optimized duplication. This technical primer introduces OpenStorage and presents example use cases that address a variety of data protection challenges.

Table of Contents

1 OpenStorage
  1.1 NetBackup Optimized Duplication
  1.2 Reporting
  1.3 Data Recovery
  1.4 Shared Storage Units
  1.5 Media Server Load Balancing
  1.6 Tape Consolidation
2 Sample Deployments
  2.1 Remote Office / Branch Office Solution
  2.2 Disaster Recovery Site Solution
  2.3 HA – Clustered NetBackup Solution
3 OpenStorage Technology
  3.1 OpenStorage Component View
4 Integration with NetBackup 6.5
  4.1 Storage Management
5 Licensing
6 Additional Documentation
  NetBackup hardware compatibility list
  General NetBackup administrative information
  Information about the OpenStorage Disk option
  Information about using the NetBackup Vault option to create duplicate backup copies
  NetBackup high availability information
  Data Domain OpenStorage information

1 OpenStorage

Symantec NetBackup OpenStorage is an initiative designed to leverage intelligent disk storage solutions without the need for virtual tape emulation software. The initiative includes an API that enables NetBackup to take advantage of disk storage solutions with advanced capabilities such as data deduplication and replication.

Data Domain OpenStorage software provides API-based integration between Data Domain deduplication storage and NetBackup. The API gives NetBackup visibility into the properties and capabilities of the Data Domain storage system, control of the backup images stored in the system, and network-efficient replication to remote Data Domain systems.

Supported with NetBackup 6.5 and higher, OpenStorage-enabled Data Domain systems and the Symantec NetBackup OpenStorage Option provide key enhancements for disk-based data protection strategies:

• NetBackup optimized duplication: backup image duplication based on Data Domain deduplication and network-efficient replication that is controlled, monitored, and cataloged by NetBackup.
• Integrated NetBackup reporting of Data Domain replication job status.
• Recovery of replicated backup images in their entirety or at a granular level via the NetBackup user interface.
• Sharing of Data Domain OpenStorage disk storage units among heterogeneous NetBackup media servers.
• NetBackup media server load balancing, eliminating the need to manually divide client backups across NetBackup media servers that use Data Domain OpenStorage disk storage units.
• Tape consolidation: backup images from remote locations and branch offices can be replicated to a centralized location where they can be duplicated to tape under the control of NetBackup.

1.1 NetBackup Optimized Duplication

Optimized duplication is a NetBackup term that describes the ability of an OpenStorage disk appliance to copy the data on one appliance to another appliance of the same type. Optimized duplication leverages Data Domain’s deduplication and network-efficient WAN vaulting technologies for making duplicate copies of backup images. Under the direct control of NetBackup, duplication of backup images occurs such that the data path does not include NetBackup media servers. Instead, data transfers directly from one Data Domain system to another.

Figure 1: Optimized duplication data path (Data Domain system to Data Domain system over the WAN)

NetBackup Storage Lifecycle Policies or the NetBackup Vault option can be used to control the sequenced creation of optimized duplicate backup images. As with all duplicate backups created by NetBackup, a unique retention period can be selected for each backup copy.

Optimized duplication is superior to legacy duplication methods for two main reasons. Because the data path bypasses NetBackup media servers, the media servers do not need to be sized for the increased CPU, I/O, and backplane bandwidth utilization typically associated with backup image duplication. In addition, Data Domain network-efficient WAN vaulting transfers deduplicated data, which reduces bandwidth utilization by up to 99%, resulting in faster backup image duplication with reduced network bandwidth needs.

1.2 Reporting

Optimized duplication jobs appear in the NetBackup activity monitor and are tracked like any other NetBackup job. Job details can be viewed to confirm that optimized duplication is performing as expected.
Figure 2: NetBackup activity monitor - duplication job type

Figure 3: Job details - optimized duplication

1.3 Data Recovery

OpenStorage optimized duplication jobs are cataloged like any other NetBackup duplication job. Regardless of whether optimized duplication was invoked by a Storage Lifecycle Policy, a Vault option job, or ad hoc duplication via the GUI or command line interface, NetBackup updates its catalog to reflect the creation of the duplicate copy. Since the NetBackup catalog is aware of the copy, recovering from an optimized duplicate copy is no different from recovering from any other duplicate copy.

NetBackup, by default, always recovers from the primary backup copy. Fulfilling a restore request from a specific copy number requires that the desired copy be the primary copy. A particular backup copy can be set as the primary copy via the NetBackup GUI or CLI.

Figure 4: Setting an optimized duplicate to primary in the NetBackup catalog

Figure 5: Recovery of replicated backups via the NetBackup GUI

Recovering data using the NetBackup “Backup, Archive, and Restore” GUI makes full or granular restores straightforward. Data recovery with OpenStorage optimized duplication is faster and more intuitive than with replication solutions that are not integrated with NetBackup. Backup image duplication performed externally to NetBackup results in a copy that is not known to the NetBackup catalog. These backup copies cannot be used to fulfill restore requests until NetBackup catalog entries have been created for them, potentially increasing the time to recovery.

1.4 Shared Storage Units

The Data Domain OpenStorage disk storage unit type can be shared between supported heterogeneous NetBackup media servers. By default, any available media server is allowed to use the storage unit.
Usage of the storage unit can be restricted to specific NetBackup media servers if desired.

Figure 6: NetBackup OpenStorage disk storage unit dialog window

Sharing disk backup media among a collection of NetBackup media servers eliminates the need to configure and manage multiple smaller disk storage units and enables the practical use of large disk pools. Large shared disk pools increase storage utilization. Small disk pools that cannot be shared may increase the upfront provisioning of storage without achieving high utilization rates.

1.5 Media Server Load Balancing

OpenStorage disk storage units are one of the storage unit types that enable NetBackup media server load balancing. The algorithm used by NetBackup to select the best media server for a job seeks to avoid sending jobs to busy media servers. When an OpenStorage disk storage unit is selected for use with a backup policy or Storage Lifecycle Policy, and the storage unit is shared by multiple NetBackup media servers, NetBackup automatically selects the best candidate media server for the job based on criteria including media server rank and active job count. This configuration provides the added advantage of bypassing an offline NetBackup media server when processing a backup or restore request.

“Basic Disk” storage units do not enable NetBackup media server load balancing. Within a storage unit group, the load balance feature is disabled when a “Basic Disk” storage unit is selected for inclusion. The alternative “round robin” storage unit selection setting falls short of true media server load balancing in that it fails to consider media server rank or active job count.

1.6 Tape Consolidation

Centralized tape cutting, where tape creation occurs at a single location within the data protection environment, is more easily facilitated with the Data Domain OpenStorage solution. Backups occurring throughout the enterprise are easily replicated to a central location where they can then be duplicated to tape under the control of NetBackup.

Figure 7: Centralized tape operations

A centralized or consolidated approach to managing the creation of backup tapes is beneficial for a number of reasons.

• Instead of relying on tape media to be handled properly at multiple remote locations, tape operations can be appropriately managed using a reliable and predictable process.
• Consolidated tape operations promote high utilization of tape hardware resources, potentially reducing costs when compared to maintaining under-utilized tape hardware in a distributed multi-site environment.
• Tape media being sent off-site may need to be secured with encryption software or hardware, adding to the cost of the solution.

2 Sample Deployments

The Symantec NetBackup OpenStorage option and OpenStorage-enabled Data Domain deduplication storage systems eliminate many of the challenges associated with the creation and management of duplicate backup images, transporting backup copies to an alternate site, and the centralized creation of tape-based copies for longer-term retention.

2.1 Remote Office / Branch Office Solution

Remote and branch offices present well-known challenges for the data protection staff:

• Limited IT staffing at remote offices precludes the deployment of extensive local data protection solutions.
• Purchasing and maintaining tape libraries for a collection of remote locations is expensive.
• Removing tape cartridges and shipping them off-site, as well as recycling expired cartridges back into tape libraries, requires an IT staff presence at remote locations.
• Tape media being sent off-site may need to be secured with encryption software or hardware, adding to the cost of the solution.
• Delays in the creation or transportation of backup images on tape media may increase the time to recover data.

The recommended solution eliminates the use of tape at remote locations:

Figure 8: Hub and spoke remote office solution

Data protection operations are improved with the Data Domain remote office / branch office solution for a number of tangible reasons. The cost and challenges of managing tape operations at remote locations are eliminated. Optimized duplication creates duplicate backup copies at the primary data center faster than a solution based on tape duplication and physical transportation. Backups are duplicated from remote locations to the primary data center under the control of NetBackup. Data recovery can be fulfilled using the copy of the backup image held at the remote location or, in the event of a disaster, from the duplicate copy residing at the primary data center.

2.2 Disaster Recovery Site Solution

A number of NetBackup disaster recovery strategies have been architected to accommodate a wide range of scenarios. The underlying requirement of each solution is that backup images are available for use at the disaster recovery facility. The optimized duplication of backups written to Data Domain OpenStorage disk storage units under the control of NetBackup represents the ideal method of making backup image copies available for restoration at a disaster recovery facility.
Figure 9: Optimized duplication for disaster recovery (primary site and disaster recovery site, each with a NetBackup server and disk storage)

The Symantec paper titled “Implementing Highly Available Data Protection with Veritas NetBackup” (http://eval.symantec.com/mktginfo/enterprise/white_papers/b-whitepaper_implementing_highly_available_dr_with_veritas_netbackup_01_08_13599373.pdf) explores many strategies for disaster recovery.

Having the most recent backup copies available in the event of a disaster improves the recovery point objective. Having a copy of the most recent backup at the disaster recovery site without the need to physically transport it improves the recovery time objective. The Symantec OpenStorage solution combined with appropriately deployed Data Domain storage systems represents the optimal foundation for a large variety of NetBackup disaster recovery solutions.

2.3 HA – Clustered NetBackup Solution

Highly available NetBackup deployments may cluster the NetBackup master server and use catalog replication technologies such that ongoing data protection tasks can be fulfilled in the event of an interruption to service. One challenge that highly available solutions present is having multiple backup copies available to recover from in the event of a NetBackup master server failover. Should a failover occur from the primary site to a secondary site, having duplicate backup copies available increases the ability to fulfill a restore request. Outdated tape duplication and transportation methods often fail to meet the aggressive recovery point and recovery time objectives that dictated the need for a highly available environment in the first place.
Figure 10: NetBackup Global Cluster Option solution (primary and secondary sites linked by catalog replication and optimized duplication)

Backups written to OpenStorage Data Domain disk storage units are duplicated under the control of NetBackup Storage Lifecycle Policies or NetBackup Vault option policies to destination Data Domain OpenStorage disk storage units. Optimized duplication technology creates duplicate backup copies at the secondary data center faster than legacy solutions.

3 OpenStorage Technology

OpenStorage is a Symantec initiative that targets the integration of NetBackup and intelligent disk appliances such as Data Domain deduplication storage systems. Several terms describe the various components of the technology:

• The term OpenStorage also refers to the API that enables communication between NetBackup and intelligent disk appliances.
• The acronym OST is equivalent to OpenStorage.
• OpenStorage is also a disk storage unit type within the context of NetBackup.
• The term plug-in refers to vendor-specific code that acts as an interface between the NetBackup OpenStorage API and that vendor’s intelligent disk storage system.

3.1 OpenStorage Component View

The OpenStorage API is one of multiple components that, when combined, constitute the integrated NetBackup - Data Domain OpenStorage solution. In addition to NetBackup and the OpenStorage API, there is also the Data Domain software plug-in that resides on NetBackup media server platforms. The plug-in communicates with NetBackup using the OpenStorage API, and also interfaces with Data Domain storage systems.
A block diagram can be used to illustrate the relationship of the components:

Figure 11: Integrated solution components (Symantec NetBackup version 6.5, OpenStorage API, and Data Domain OST Plug-in on the media server; Data Domain system)

Notes:
a) The OpenStorage API is installed by default when NetBackup is installed.
b) The Data Domain system is enabled as an OpenStorage server.
c) The Data Domain OST Plug-in needs to be installed on all NetBackup media servers that use Data Domain systems as OpenStorage disk storage units.
d) The Data Domain system is configured within NetBackup as a storage server.
e) See section 5 for license requirements.

3.1.1 OpenStorage API

At a granular level, the OpenStorage API includes a comprehensive suite of commands that give NetBackup the ability to communicate with Data Domain storage systems. This enables NetBackup to take advantage of intelligent functionality, such as network-efficient WAN vaulting, offered by Data Domain storage. Properties of the Data Domain system can be queried so that NetBackup has knowledge of device requirements and capabilities, including credentials required for access, the ability to deduplicate data, and the ability to replicate deduplicated data between storage systems. The OpenStorage API provides NetBackup with the ability to control the creation, duplication, and deletion of backup images, while delegating the tasks to the storage system.

3.1.2 Data Domain OST Plug-in

The Data Domain OST plug-in is installed on supported NetBackup media server platforms. NetBackup communicates with the plug-in using the OpenStorage API. The plug-in communicates with one or more Data Domain storage systems.

Much as NetBackup provides out-of-the-box plug-in modules for “Basic Disk”, “AdvancedDisk”, and “SharedDisk”, the vendor-specific plug-in for OpenStorage serves as an interface linking NetBackup to supported intelligent disk storage systems.

Figure 12: Software Plug-ins (NetBackup 6.5 processes and the OpenStorage API above the Basic Disk, AdvancedDisk, SharedDisk, and OpenStorage plug-ins)

3.1.3 Data Domain OpenStorage Server

The Data Domain system is configured within NetBackup as a storage server. From the perspective of NetBackup, each disk backup device is controlled by a storage server entity. The role of the storage server includes mounting storage, writing data to disk storage, and reading data from it. The storage server also mediates access to storage and backup images. In the case of OpenStorage disk, the storage server entity is the Data Domain system. Other disk media types supported by NetBackup, such as “AdvancedDisk” and “SharedDisk”, use a NetBackup media server as the storage server.

4 Integration with NetBackup 6.5

In addition to the OpenStorage API, logical NetBackup entities, and Data Domain OpenStorage plug-in, NetBackup 6.5 has been enhanced to include comprehensive CLI and GUI facilities that accommodate configuring and utilizing intelligent OpenStorage systems.

4.1 Storage Management

OpenStorage intelligent disk storage systems are configured and presented to NetBackup for simple yet powerful storage management. Physical server and storage resources are configured as logical components for use by NetBackup.

4.1.1 LSU – Logical Storage Unit

Data Domain storage systems are configured with an LSU as a top-level subdirectory of the “/backup/ost” path. An LSU on the Data Domain storage system corresponds to a disk pool on one or more NetBackup media servers. The LSU entity is not configured via the OpenStorage API; it is configured on the Data Domain command line interface with the “ost lsu create” command.
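The relationship between LSUs and disk pools can be sketched in a few lines of Python. This is an illustrative model only: the helper `lsu_path`, the `disk_pools` mapping, and the sample LSU and disk pool names are assumptions made for the example and are not part of the Data Domain CLI or the OpenStorage API. The only details taken from the text are the “/backup/ost” parent path and the correspondence between a disk pool and an LSU.

```python
from pathlib import PurePosixPath

# Parent directory for LSUs on the Data Domain system, as stated in the text.
OST_ROOT = PurePosixPath("/backup/ost")


def lsu_path(lsu_name: str) -> PurePosixPath:
    """Return the directory that backs an LSU (hypothetical helper).

    An LSU is a top-level subdirectory of /backup/ost, so the name must be
    a single path component.
    """
    if not lsu_name or "/" in lsu_name or lsu_name in (".", ".."):
        raise ValueError(f"not a valid top-level LSU name: {lsu_name!r}")
    return OST_ROOT / lsu_name


# Hypothetical mapping: each NetBackup disk pool corresponds to one LSU.
disk_pools = {"pool_a": "lsu_a", "pool_b": "lsu_b"}

for pool, lsu in disk_pools.items():
    print(f"disk pool {pool} -> {lsu_path(lsu)}")
```

The check for a single path component reflects the statement that an LSU is a top-level subdirectory; nested names are rejected in this sketch.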
Figure 13: Data Domain CLI - List of configured LSUs

4.1.2 Storage Server Registration

Registering the Data Domain storage system with NetBackup as a storage server makes the device known to NetBackup. Storage server registration is a prerequisite to creating a disk pool that can be used by NetBackup.

Figure 14: NetBackup CLI - List of configured storage servers

4.1.3 Storage Server Credentials

The NetBackup “tpconfig” utility is used to add and save credentials that make it possible for media servers to log on to the Data Domain storage server.

Figure 15: NetBackup CLI - List of OpenStorage credentials

4.1.4 Disk Pools

Introduced with NetBackup version 6.5, disk pools are used to present disk to NetBackup for use as backup media. In the case of Data Domain OpenStorage, a disk pool corresponds to an LSU.

Figure 16: Change disk pool

4.1.5 Storage Units

Storage units are a logical abstraction of physical storage devices used as backup media. Data Domain OpenStorage disk storage units are configured with a variety of available criteria:

• “On demand only”: enabled by default but not required.
• “Storage unit type”: Disk is the required type.
• “Disk type”: OpenStorage (DataDomain) is the required disk type.
• “Disk Pool”: the selected disk pool name corresponds to an LSU that was previously configured on the Data Domain Restorer.
• “Media Server”: allows the selection of specific media servers that are permitted to use the storage unit, or the selection of any available media server to use the storage unit.
• The maximum number of concurrent jobs and the fragment size can also be specified.

Figure 17: OpenStorage storage unit

4.1.6 Storage Unit Groups

NetBackup provides the ability to logically group storage units, as well as the ability to specify storage unit selection criteria within a group. The available selection algorithms are:

• Prioritized: NetBackup chooses the first storage unit in the list that is not busy, down, or out of available media.
• Failover: NetBackup chooses the first storage unit in the list that is not down, out of media, or full.
• Round Robin: NetBackup chooses the least recently selected storage unit.
• Load Balanced: NetBackup seeks to avoid sending jobs to busy media servers.

Figure 18: Storage Unit Group

Data Domain recommends the use of the “failover” algorithm when grouping Data Domain OpenStorage disk storage units. This method seeks to use the same Data Domain storage system for each backup job that uses the storage unit group. Sending the same backups to the same Data Domain storage system takes full advantage of the deduplication capabilities of Data Domain storage.

4.1.7 Storage Lifecycle Policies

Multiple backup copies created on different storage devices and retained for different periods of time are easily managed with one or more Storage Lifecycle Policies. Optimized duplication of backups written to Data Domain OpenStorage disk storage units is performed with an appropriately configured Storage Lifecycle Policy. No additional configuration is required on Data Domain storage systems to set up optimized duplication. NetBackup automatically initiates optimized duplication after the backup completes.

Figure 19: Storage Lifecycle Policy example

The example Storage Lifecycle Policy shown prescribes an initial backup to the OpenStorage Data Domain disk storage unit named “dd120a”. The retention period has been set to a fixed value of 2 weeks. The second storage destination is “dd120b”, where the duplicate copy will be retained for a fixed period of 6 months.
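The effect of the example policy can be illustrated with a small calculation. The sketch below is not NetBackup code: the `Destination` class, the `expirations` function, the approximation of 6 months as 183 days, and the assumption that each copy's retention is counted from the backup time are all made for illustration, and NetBackup's own retention arithmetic may differ. Only the storage unit names and retention periods come from the example above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Destination:
    """One storage destination in a lifecycle policy (hypothetical model)."""
    storage_unit: str
    retention: timedelta


# Values from the example policy; 6 months is approximated as 183 days.
POLICY = [
    Destination("dd120a", timedelta(weeks=2)),   # initial backup
    Destination("dd120b", timedelta(days=183)),  # optimized duplicate
]


def expirations(backup_time: datetime) -> dict[str, datetime]:
    """Return the expiration time of each copy.

    Assumption: retention of every copy is counted from the backup time.
    """
    return {d.storage_unit: backup_time + d.retention for d in POLICY}


if __name__ == "__main__":
    for unit, expires in expirations(datetime(2008, 7, 1)).items():
        print(f"{unit}: copy expires {expires:%Y-%m-%d}")
```

Under these assumptions the short-lived copy on “dd120a” expires first, while the duplicate on “dd120b” remains available for long-term recovery.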
5 Licensing

• NetBackup requires the OpenStorage Disk Option license.
• Data Domain storage systems require an OpenStorage license.
• Optimized duplication requires a Data Domain replication license.

6 Additional Documentation

NetBackup Hardware Compatibility List
Veritas NetBackup™ Enterprise Server and Server 6.5 Hardware Compatibility List
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/284599.pdf

General NetBackup Administrative Information
Veritas NetBackup™ Administrator’s Guide, Volume I for Windows Release 6.5
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/290203.pdf
Veritas NetBackup Administrator’s Guide, Volume I for UNIX and Linux Release 6.5
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/290201.pdf

Information About the OpenStorage Disk Option
Veritas NetBackup Shared Storage Guide UNIX, Windows, Linux Release 6.5
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/290238.pdf

Information About Using the NetBackup Vault Option to Create Duplicate Backup Copies
Veritas NetBackup Vault™ Administrator’s Guide UNIX, Windows, and Linux Release 6.5
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/290233.pdf

NetBackup High Availability Information
Implementing Highly Available Data Protection with Veritas NetBackup (copy and paste this link into a browser)
http://eval.symantec.com/mktginfo/enterprise/white_papers/b-whitepaper_implementing_highly_available_dr_with_veritas_netbackup_01_08_13599373.pdf

Data Domain OpenStorage Information
Data Domain OpenStorage Software Datasheet
http://www.datadomain.com/pdf/DataDomain-OST-Datasheet.pdf
Data Domain OpenStorage (OST) User Guide (requires support site login credentials)
https://support.datadomain.com

Data Domain
2421 Mission College Blvd.
Santa Clara, CA 95054
866-WE-DDUPE; 408-980-4800
24 international offices: datadomain.com/company/contacts.html
www.datadomain.com

Copyright © 2008 Data Domain, Inc. All rights reserved.

Data Domain, Inc. believes information in this publication is accurate as of its publication date. This publication could include technical inaccuracies or typographical errors. The information is subject to change without notice. Changes are periodically added to the information herein; these changes will be incorporated in new editions of the publication. Data Domain, Inc. may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time. Reproduction of this publication without prior written permission is forbidden. The information in this publication is provided “as is”. Data Domain, Inc. makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Data Domain and Global Compression are trademarks of Data Domain, Inc. All other brands, products, service names, trademarks, or registered service marks are used to identify the products or services of their respective owners.

WP-OST-0708