Data Domain OpenStorage Primer

White Paper
Abstract
Data Domain’s support for Symantec NetBackup OpenStorage enables the use of disk as
disk, eliminating the need to emulate tape drives, tape cartridges, and robots. The Data
Domain OpenStorage solution advances the ability to use disk as disk, stores more data on
disk with inline deduplication, and simplifies the creation of backup copies with optimized
duplication. This technical primer introduces OpenStorage and presents example use cases
that address a variety of data protection challenges.
Table of Contents
1 OpenStorage
1.1 NetBackup Optimized Duplication
1.2 Reporting
1.3 Data Recovery
1.4 Shared Storage Units
1.5 Media Server Load Balancing
1.6 Tape Consolidation
2 Sample Deployments
2.1 Remote Office / Branch Office Solution
2.2 Disaster Recovery Site Solution
2.3 HA – Clustered NetBackup Solution
3 OpenStorage Technology
3.1 OpenStorage Component View
4 Integration with NetBackup 6.5
4.1 Storage Management
5 Licensing
6 Additional Documentation
NetBackup hardware compatibility list
General NetBackup administrative information
Information about the OpenStorage Disk option
Information about using the NetBackup Vault option to create duplicate backup copies
NetBackup high availability information
Data Domain OpenStorage information
1 OpenStorage
Symantec NetBackup OpenStorage is an initiative designed to leverage intelligent disk storage solutions without the need for virtual tape
emulation software. The initiative includes an API that enables NetBackup to take advantage of disk storage solutions with advanced
capabilities such as data deduplication and replication.
Data Domain OpenStorage software provides API-based integration between Data Domain deduplication storage and NetBackup. The API
gives NetBackup visibility into the properties and capabilities of the Data Domain storage system, control of the backup images stored in
the system, and network-efficient replication to remote Data Domain systems.
Supported with NetBackup 6.5 and higher, OpenStorage-enabled Data Domain systems and the Symantec NetBackup OpenStorage
Option provide key enhancements for disk-based data protection strategies:
• NetBackup optimized duplication - Backup image duplication based on Data Domain deduplication and network-efficient replication
that is controlled, monitored, and cataloged by NetBackup.
• Integrated NetBackup reporting of Data Domain replication job status.
• Recovery of replicated backup images in their entirety or at a granular level via the NetBackup user interface.
• Sharing of Data Domain OpenStorage disk storage units among heterogeneous NetBackup media servers.
• NetBackup media server load balancing, eliminating the need to manually divide client backups across NetBackup media servers
utilizing Data Domain OpenStorage disk storage units.
• Tape consolidation – Backup images from remote locations and branch offices can be replicated to a centralized location where they
can be duplicated to tape under the control of NetBackup.
1.1 NetBackup Optimized Duplication
Optimized duplication is a NetBackup term that describes the
ability of an OpenStorage disk appliance to copy the data on one
appliance to another appliance of the same type.
Optimized duplication leverages Data Domain’s deduplication and
network-efficient WAN vaulting technologies for making duplicate
copies of backup images. Under the direct control of NetBackup,
duplication of backup images occurs such that the data path does
not include NetBackup media servers. Instead, data transfers from
one Data Domain system to another directly.
Figure 1: Optimized duplication data path (diagram labels: Data Domain System, WAN, Data Domain System)
NetBackup Storage Lifecycle Policies or the NetBackup Vault
option can be used to control the sequenced creation of optimized
duplicate backup images. Like all duplicate backups created by
NetBackup, unique retention periods are easily selected for all
backup copies.
Optimized duplication is superior to legacy duplication methods for
a variety of reasons. Because the data path bypasses NetBackup
media servers, the media server doesn’t need to be sized for the
increased CPU, I/O, and backplane bandwidth utilization typically
associated with backup image duplication. Data Domain
network-efficient WAN vaulting transfers deduplicated data, which
can reduce bandwidth utilization by up to 99%, resulting in faster
backup image duplication with reduced network bandwidth needs.
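As a rough illustration of the bandwidth claim above, the following sketch estimates how long a duplicate copy takes to cross a WAN link with and without deduplication. The image size, link speed, and function name are hypothetical and chosen only for the arithmetic. The 0.99 reduction models the "up to 99%" best case; actual reduction depends on the data.

```python
def transfer_hours(image_gb: float, link_mbps: float, reduction: float = 0.0) -> float:
    """Hours needed to send a backup image over a WAN link.

    reduction is the fraction of bytes that deduplication removes
    before transfer (0.99 models the "up to 99%" best case).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    bits_to_send = image_gb * 8e9 * (1.0 - reduction)  # decimal gigabytes to bits
    seconds = bits_to_send / (link_mbps * 1e6)
    return seconds / 3600.0


# Hypothetical example: a 500 GB image over a 45 Mbps link.
full = transfer_hours(500, 45)
deduplicated = transfer_hours(500, 45, reduction=0.99)
print(f"full copy: {full:.1f} h, deduplicated: {deduplicated:.2f} h")
```

Under these assumed inputs the deduplicated transfer takes one hundredth of the time of the full copy, which is the sense in which reduced bandwidth use translates into faster duplication.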
1.2 Reporting
Optimized duplication jobs appear in the NetBackup activity monitor and are tracked like any other NetBackup job. Job details can be viewed
to confirm that optimized duplication is performing as expected.
Figure 2: NetBackup activity monitor – duplication job type
Figure 3: Job details – optimized duplication
1.3 Data Recovery
OpenStorage optimized duplication jobs get cataloged like any other
NetBackup duplication job. Regardless of whether optimized duplication
was invoked by a Storage Lifecycle Policy, a Vault option job, or ad-hoc
duplication via the GUI or command line interface, NetBackup updates its
catalog to reflect the creation of the duplicate copy. Since the NetBackup
catalog is aware of the copy, recovering from an optimized duplicate copy is
no different than recovering from any other duplicate copy.
NetBackup, by default, always recovers from the primary backup copy.
Fulfilling a restore request from a specific copy number requires that the
desired copy be the primary copy. Setting a particular backup copy number
to primary is accomplished simply via the NetBackup GUI or CLI.
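The relationship between copy numbers and the primary copy can be sketched with a small model. This is a simplified illustration and not NetBackup's catalog format: the class, field names, backup ID, and storage names below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BackupImage:
    """Simplified catalog entry: copy number -> storage location."""
    backup_id: str
    copies: dict[int, str] = field(default_factory=dict)
    primary: int = 1

    def set_primary(self, copy_number: int) -> None:
        # Only a copy the catalog already knows about can become primary.
        if copy_number not in self.copies:
            raise KeyError(f"copy {copy_number} is not cataloged")
        self.primary = copy_number

    def restore_source(self) -> str:
        # Restores are served from whichever copy is currently primary.
        return self.copies[self.primary]


image = BackupImage("client1_1215000000", {1: "dd_site_a", 2: "dd_site_b"})
before = image.restore_source()   # copy 1 at the original site
image.set_primary(2)              # promote the optimized duplicate
after = image.restore_source()    # copy 2 at the replica site
```

The model also shows why a copy made outside NetBackup cannot be restored from directly: it has no entry in `copies`, so it cannot be promoted to primary.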
Figure 4: Setting an optimized duplicate to primary in the NetBackup catalog
Figure 5: Recovery of replicated backups via the NetBackup GUI
Recovering data using the NetBackup “Backup, Archive, and Restore”
GUI makes full or granular restores simple and easy. Data recovery with
OpenStorage optimized duplication is faster and more intuitive when
compared to replication solutions that are not integrated with NetBackup.
Backup image duplication performed externally to NetBackup results in a
copy that is not known to the NetBackup catalog. These backup copies
cannot be used to fulfill restore requests until NetBackup catalog entries
have been created for them, potentially increasing the time to recovery.
1.4 Shared Storage Units
The Data Domain OpenStorage disk storage unit type can be shared between supported heterogeneous NetBackup
media servers. The ability to allow any available media server to use the storage unit is selected by default. Usage of
the storage unit can be restricted to specific NetBackup media servers if desired.
Figure 6: NetBackup OpenStorage disk storage unit dialog window
Sharing disk backup media among a collection of NetBackup media servers eliminates the need to configure and
manage multiple smaller disk storage units and enables the practical use of large disk pools.
Large shared disk pools facilitate a favorable increase in storage utilization. Small disk pools that cannot be shared
may increase the upfront provisioning of storage without the ability to achieve high utilization rates.
1.5 Media Server Load Balancing
OpenStorage disk storage units are one of the storage unit types that enable NetBackup media server load balancing. The algorithm used by NetBackup to select the best media server for a job seeks to avoid sending jobs to busy
media servers.
When an OpenStorage disk storage unit is selected for use with a backup policy or Storage Lifecycle Policy, and the
storage unit is shared by multiple NetBackup media servers, NetBackup will automatically select the best candidate
media server for the job based on criteria including media server rank and active job count. This configuration
provides the added advantage of bypassing an offline NetBackup media server when processing a backup or restore
request.
“Basic Disk” storage units do not enable NetBackup media server load balancing. Within a storage unit group, the
load balance feature is disabled when a “Basic Disk” storage unit is selected for inclusion. The alternative “round
robin” storage unit selection setting falls short of true media server load balancing in that it fails to consider media
server rank or active job count.
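A minimal sketch of this kind of selection follows. It is not NetBackup's actual algorithm, which this primer does not specify. It assumes that a lower active job count is preferred first and that a higher rank breaks ties; the class, field names, and server names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MediaServer:
    name: str
    rank: int          # hypothetical score, higher means more capable
    active_jobs: int
    online: bool = True


def pick_load_balanced(servers):
    """Pick an online server, preferring fewer active jobs, then higher rank."""
    candidates = [s for s in servers if s.online]  # offline servers are bypassed
    if not candidates:
        raise RuntimeError("no media server is online")
    return min(candidates, key=lambda s: (s.active_jobs, -s.rank))


servers = [
    MediaServer("media-a", rank=5, active_jobs=7),
    MediaServer("media-b", rank=3, active_jobs=2),
    MediaServer("media-c", rank=9, active_jobs=0, online=False),
]
chosen = pick_load_balanced(servers)
print(chosen.name)
```

Here `media-c` is skipped because it is offline, and `media-b` is chosen over `media-a` because it is less busy. A round robin selector would ignore both criteria.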
1.6 Tape Consolidation
Centralized tape cutting, where tape creation occurs at a single
location within the data protection environment, is more easily
facilitated with the Data Domain OpenStorage solution. Backups
occurring throughout the enterprise are easily replicated to a
central location where they can then be duplicated to tape under
the control of NetBackup.
Figure 7: Centralized tape operations (diagram labels: Site A, Site B, Primary Data Center, NetBackup Media Server, WAN, Tape Library)
A centralized or consolidated approach to managing the creation
of backup tapes is beneficial for a number of reasons.
• Instead of relying on tape media to be handled properly at
multiple remote locations, tape operations can be appropriately
managed using a reliable and predictable process.
• Consolidated tape operations promote high utilization of tape
hardware resources, potentially reducing costs when compared
to maintaining under-utilized tape hardware in a distributed
multi-site environment.
• Tape media being sent off-site may need to be secured with
encryption software or hardware, adding to the cost of the
solution.
2 Sample Deployments
The Symantec NetBackup OpenStorage option and OpenStorage-enabled
Data Domain deduplication storage systems eliminate
many of the challenges associated with the creation and management of duplicate backup images, transporting backup copies
to an alternate site, and the centralized creation of tape-based
copies for longer-term retention.
2.1 Remote Office / Branch Office Solution
Remote and branch offices present well-known challenges for the
data protection staff:
• Limited IT staff at remote offices prohibits the deployment of
extensive local data protection solutions.
• Purchasing and maintaining tape libraries for a collection of
remote locations is expensive.
• Removing tape cartridges and shipping them off-site as well as
recycling expired cartridges back into tape libraries requires an IT
staff presence at remote locations.
• Tape media being sent off-site may need to be secured with
encryption software or hardware, adding to the cost of the
solution.
• Delays in the creation or transportation of backup images on tape
media may impact data recovery with an increased time to recover.
The recommended solution eliminates the use of tape at remote
locations:
Figure 8: Hub and spoke remote office solution (diagram labels: Remote Site, Primary Data Center)
Data protection operations are improved with the Data Domain
remote office / branch office solution for a number of tangible
reasons. The cost and challenges of managing tape operations at
remote locations are eliminated. Optimized duplication facilitates
faster creation of duplicate backup copies stored at the primary data
center when compared to the outdated methodology employed
with a tape duplication and physical transportation solution.
Backups are duplicated from remote locations to the primary
data center under the control of NetBackup. Data recovery can
be fulfilled using a remote copy of the backup image, or in the
event of a disaster, recovered from a duplicate copy residing at the
primary data center.
2.2 Disaster Recovery Site Solution
A number of NetBackup disaster recovery strategies have been
architected to accommodate a wide range of scenarios. The
underlying requirement of each solution is that backup images are
available for use at the disaster recovery facility.
The optimized duplication of backups written to Data Domain
OpenStorage disk storage units under the control of NetBackup
represents the ideal method of making backup image copies
available for restoration at a disaster recovery facility.
Figure 9: Optimized duplication for disaster recovery (diagram labels: Primary Site, Disaster Recovery Site, NetBackup Server, Disk Storage, Optimized Duplication)
The Symantec paper titled “Implementing Highly Available Data
Protection with Veritas NetBackup”
(http://eval.symantec.com/mktginfo/enterprise/white_papers/b-whitepaper_implementing_highly_available_dr_with_veritas_netbackup_01_08_13599373.pdf)
explores many strategies for disaster recovery. Having the most
recent backup copies available in the event of a disaster improves
the recovery point objective. Having a copy of the most recent
backup at the disaster recovery site without the need to physically
transport it improves the recovery time objective. The Symantec
OpenStorage solution combined with appropriately deployed Data
Domain storage systems represents the optimal foundation for a
large variety of NetBackup disaster recovery solutions.
2.3 HA – Clustered NetBackup Solution
Highly available NetBackup deployments may cluster the NetBackup
master server and use catalog replication technologies such that
ongoing data protection tasks can be fulfilled in the event of an
interruption to service.
One challenge that highly available solutions present is having
multiple backup copies available to recover from in the event of a
NetBackup master server failover. Should a failover occur from the
primary site to a secondary site, having duplicate backup copies
available increases the ability to fulfill a restore request. Outdated
tape duplication and transportation methods often fail to meet the
aggressive recovery point and recovery time objectives that dictated
the need for a highly available environment in the first place.
Figure 10: NetBackup Global Cluster Option solution (diagram labels: Primary Site, Secondary Site, Catalog Replication, Global Cluster Option, Optimized Duplication)
Backups written to OpenStorage Data Domain disk storage units
are duplicated under the control of NetBackup Storage Lifecycle
Policies or NetBackup Vault option policies to destination Data
Domain OpenStorage disk storage units. Optimized duplication
technology facilitates faster creation of duplicate backup copies
stored at the secondary data center when compared to outdated
legacy solutions.
3 OpenStorage Technology
OpenStorage is a Symantec initiative that targets the integration
of NetBackup and intelligent disk appliances such as Data Domain
deduplication storage systems. Several terms describe the
various components of the technology:
• The term OpenStorage also refers to the API that enables communication between NetBackup and intelligent disk appliances.
• The acronym OST is shorthand for OpenStorage.
• OpenStorage is also a disk storage unit type within the context
of NetBackup.
• The term plug-in refers to vendor-specific code that acts as an
interface between the NetBackup OpenStorage API and that
vendor’s intelligent disk storage system.
3.1 OpenStorage Component View
The OpenStorage API is one of multiple components that,
when combined, constitute the integrated NetBackup - Data
Domain OpenStorage solution. In addition to NetBackup and the
OpenStorage API, there is also the Data Domain software plug-in
that resides on NetBackup media server platforms. The plug-in
communicates with NetBackup using the OpenStorage API, and
also interfaces with Data Domain storage systems.
A block diagram can be used to illustrate the relationship of the
components:
Figure 11: Integrated solution components (diagram labels: Media Server, Symantec NetBackup version 6.5, OpenStorage API, Data Domain OST Plug-in, Data Domain System)
Notes:
a) The OpenStorage API is installed by default when NetBackup
is installed.
b) The Data Domain system is enabled as an OpenStorage server.
c) The Data Domain OST Plug-in needs to be installed on all
NetBackup media servers that use Data Domain systems as
OpenStorage disk storage units.
d) The Data Domain system is configured within NetBackup as
a storage server.
e) See section 5 for license requirements.
3.1.1 OpenStorage API
At a granular level, the OpenStorage API includes a comprehensive
suite of commands that give NetBackup the ability to communicate
with Data Domain storage systems. This enables NetBackup to take
advantage of intelligent functionality, such as network-efficient
WAN vaulting, offered by Data Domain storage. Properties of
the Data Domain system can be queried so that NetBackup has
knowledge of device requirements and capabilities including
credentials required for access, the ability to deduplicate data, and
the ability to replicate deduplicated data between storage systems.
The OpenStorage API provides NetBackup with the ability to control
the creation, duplication, and deletion of backup images, while
delegating the tasks to the storage system.
3.1.2 Data Domain OST Plug-in
The Data Domain OST plug-in is installed on supported
NetBackup media server platforms. NetBackup communicates with
the plug-in using the OpenStorage API. The plug-in communicates
with one or more Data Domain storage systems.
Similar to the way that NetBackup provides out-of-the-box plug-in
modules for “Basic Disk”, “AdvancedDisk”, and “SharedDisk”, the
plug-in for OpenStorage is vendor specific and serves as an interface
linking NetBackup to supported intelligent disk storage systems.
Figure 12: Software Plug-ins (diagram labels: NetBackup 6.5 Processes, OpenStorage API, Basic Disk Plug-in, AdvancedDisk Plug-in, SharedDisk Plug-in, OpenStorage Plug-in, Storage System)
3.1.3 Data Domain OpenStorage Server
The Data Domain system gets configured within NetBackup as
a storage server. Different disk backup devices are effectively
controlled by a storage server entity from the perspective of
NetBackup. The role of the storage server includes mounting
storage, writing data to, and reading data from disk storage. The
storage server also mediates access to storage and backup images.
In the case of OpenStorage disk, the storage server entity is
the Data Domain system. Other disk media types supported by
NetBackup such as “AdvancedDisk” and “SharedDisk” use a
NetBackup media server as the storage server.
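The division of roles among the backup application, the vendor plug-in, and the storage server can be sketched as follows. This is purely illustrative and is not the OpenStorage API, whose actual calls are not described in this primer: every class, method, and name below is hypothetical, and the storage system is replaced by an in-memory stand-in.

```python
from abc import ABC, abstractmethod


class StorageServerPlugin(ABC):
    """Hypothetical vendor plug-in interface (not the real OpenStorage API)."""

    @abstractmethod
    def capabilities(self) -> set[str]: ...

    @abstractmethod
    def duplicate(self, image_id: str, target: "StorageServerPlugin") -> None: ...


class InMemoryDedupPlugin(StorageServerPlugin):
    """Toy stand-in for a deduplicating storage system."""

    def __init__(self, name: str):
        self.name = name
        self.images: dict[str, bytes] = {}

    def capabilities(self) -> set[str]:
        return {"deduplication", "replication"}

    def write(self, image_id: str, data: bytes) -> None:
        self.images[image_id] = data

    def duplicate(self, image_id, target):
        # The copy moves system to system; the caller never handles the data.
        target.images[image_id] = self.images[image_id]


def optimized_duplicate(catalog, image_id, source, target):
    """Backup application side: query, delegate the copy, then record it."""
    if "replication" not in source.capabilities():
        raise RuntimeError("source cannot replicate")
    source.duplicate(image_id, target)
    catalog.setdefault(image_id, []).append(target.name)


site_a, site_b = InMemoryDedupPlugin("dd-a"), InMemoryDedupPlugin("dd-b")
site_a.write("img1", b"payload")
catalog = {}
optimized_duplicate(catalog, "img1", site_a, site_b)
```

The point of the sketch is the shape of the interaction: the application queries capabilities, requests the duplication, and updates its own catalog, while the data movement itself is delegated to the storage systems.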
4 Integration with NetBackup 6.5
In addition to the OpenStorage API, logical NetBackup entities,
and Data Domain OpenStorage plug-in, NetBackup 6.5 has been
enhanced to include comprehensive CLI and GUI facilities that
accommodate configuring and utilizing intelligent OpenStorage
systems.
4.1 Storage Management
OpenStorage intelligent disk storage systems are configured and
presented to NetBackup for simple yet powerful storage management. Physical server and storage resources are configured as
logical components for use by NetBackup.
4.1.1 LSU – Logical Storage Unit
Data Domain storage systems are configured with an LSU as a top
level subdirectory of the “/backup/ost” path. An LSU on the Data
Domain storage system will correspond to a disk pool on one or
more NetBackup media servers. The LSU entity is not configured via
the OpenStorage API but is instead configured on the Data Domain
command line interface with the “ost lsu create” command.
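The mapping from an LSU name to its directory, and from NetBackup disk pools to LSUs, can be sketched as below. Only the “/backup/ost” path comes from the text; the helper function, the validation rule, and the pool and LSU names are hypothetical.

```python
from pathlib import PurePosixPath

OST_ROOT = PurePosixPath("/backup/ost")  # path named in the text


def lsu_path(lsu_name: str) -> PurePosixPath:
    """Return the directory backing an LSU, rejecting nested names."""
    # An LSU is a top level subdirectory, so the name cannot contain a slash.
    if not lsu_name or "/" in lsu_name or lsu_name in (".", ".."):
        raise ValueError(f"invalid LSU name: {lsu_name!r}")
    return OST_ROOT / lsu_name


# Hypothetical mapping of NetBackup disk pool names to LSU names.
disk_pools = {"dd120a_pool": "lsu1", "dd120b_pool": "lsu2"}
paths = {pool: lsu_path(lsu) for pool, lsu in disk_pools.items()}
```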
Figure 13: Data Domain CLI - List of configured LSUs
4.1.2 Storage Server Registration
Registering the Data Domain storage system with NetBackup as
a storage server makes the device known to NetBackup. Storage
server registration is a prerequisite to creating a disk pool that can
be used by NetBackup.
Figure 14: NetBackup CLI - List of configured storage servers
4.1.3 Storage Server Credentials
The NetBackup “tpconfig” utility is used to add and save credentials that make it possible for media servers to log on to the Data
Domain storage server.
Figure 15: NetBackup CLI - List of OpenStorage credentials
4.1.4 Disk Pools
Introduced with NetBackup version 6.5, disk pools are used to
present disk to NetBackup for use as backup media. In the case of
Data Domain OpenStorage, a disk pool corresponds to an LSU.
Figure 16: Change disk pool
4.1.5 Storage Units
Storage units are a logical abstraction of physical storage devices
used as backup media. Data Domain OpenStorage disk storage
units are configured with a variety of available criteria:
• “On demand only” – Enabled by default but not required.
• “Storage unit type” – Disk is the required type.
• “Disk type” – OpenStorage (DataDomain) is the required disk
type.
• “Disk Pool” – The selected disk pool name corresponds to
an LSU on the Data Domain Restorer that was previously
configured.
• “Media Server” – Allows the selection of specific media servers
that are permitted to use the storage unit, or the selection of
any available media server to use the storage unit.
• The maximum number of concurrent jobs and fragment size can
also be specified.
Figure 17: OpenStorage storage unit
4.1.6 Storage Unit Groups
NetBackup provides the ability to logically group storage units, as well
as the ability to specify storage unit selection criteria within a group.
The available selection algorithms are:
• Prioritized; where NetBackup chooses the first storage unit in
the list that is not busy, down, or out of available media.
• Failover; where NetBackup chooses the first storage unit in the
list that is not down, out of media, or full.
• Round Robin; where NetBackup chooses the least recently
selected storage unit.
• Load Balanced; where NetBackup seeks to avoid sending jobs to
busy media servers.
Figure 18: Storage Unit Group
Data Domain recommends the use of the “failover” algorithm
when grouping Data Domain OpenStorage disk storage units. This
method seeks to utilize the same Data Domain storage system for
each backup job that uses the storage unit group. This technique
takes full advantage of the deduplication capabilities of Data
Domain storage by sending the same backups to the same Data
Domain storage system.
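The difference between the prioritized and failover algorithms, and the reason failover keeps backups on the same system, can be sketched as follows. This is a simplified model and not NetBackup code. It collapses “out of media” and “full” into a single `full` flag, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StorageUnit:
    name: str
    busy: bool = False
    down: bool = False
    full: bool = False   # simplification: also stands for "out of media"


def pick_prioritized(units):
    """First unit in list order that is not busy, down, or full."""
    return next((u for u in units if not (u.busy or u.down or u.full)), None)


def pick_failover(units):
    """First unit in list order that is not down or full; busy is tolerated."""
    return next((u for u in units if not (u.down or u.full)), None)


group = [StorageUnit("dd120a", busy=True), StorageUnit("dd120b")]
prioritized_choice = pick_prioritized(group)   # skips the busy unit
failover_choice = pick_failover(group)         # stays on the first unit
```

With the first unit merely busy, the prioritized selector moves the job to `dd120b`, while the failover selector stays on `dd120a`. In this model, failover leaves the first system only when it is down or full, so repeated backups keep landing where their earlier data already resides.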
4.1.7 Storage Lifecycle Policies
Multiple backup copies created on different storage devices being
retained for different periods of time are easily managed with one
or more Storage Lifecycle Policies.
Optimized duplication of backups written to Data Domain
OpenStorage disk storage units is easily performed with the
appropriately configured Storage Lifecycle Policy. There is no additional configuration required on Data Domain storage systems for
setting up optimized duplication. NetBackup automatically initiates
optimized duplication after the backup completes.
Figure 19: Storage Lifecycle Policy example
The example Storage Lifecycle Policy shown prescribes an initial
backup to the OpenStorage Data Domain disk storage unit named
“dd120a”. The retention period has been set to a fixed value of 2
weeks. The second storage destination is “dd120b”, which will be
retained for a fixed value of 6 months.
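The example policy can be modeled as an ordered list of destinations, each with its own retention. The sketch below is illustrative only and does not reflect how NetBackup stores lifecycle policies. The class and function names and the backup date are hypothetical, and six months is approximated as 26 weeks.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Destination:
    storage_unit: str
    operation: str        # "backup" or "duplication"
    retention: timedelta


# Mirrors the example policy; six months is approximated as 26 weeks.
lifecycle = [
    Destination("dd120a", "backup", timedelta(weeks=2)),
    Destination("dd120b", "duplication", timedelta(weeks=26)),
]


def expirations(backup_date: date, policy):
    """Map each destination to the date its copy expires."""
    return {d.storage_unit: backup_date + d.retention for d in policy}


expiry = expirations(date(2008, 7, 1), lifecycle)
for unit, when in expiry.items():
    print(unit, when.isoformat())
```

Each copy expires independently, so the short-retention copy on the first system can age out while the duplicate on the second system remains available.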
5 Licensing
• NetBackup requires the OpenStorage Disk Option license
• Data Domain storage systems require an OpenStorage license
• Optimized duplication requires a Data Domain replication license
6 Additional Documentation
NetBackup Hardware Compatibility List
Veritas NetBackup™ Enterprise Server and Server 6.5 Hardware Compatibility List
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/284599.pdf
General NetBackup Administrative Information
Veritas NetBackup™ Administrator’s Guide, Volume I for Windows Release 6.5
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/290203.pdf
Veritas NetBackup Administrator’s Guide, Volume I for UNIX and Linux Release 6.5
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/290201.pdf
Information About the OpenStorage Disk Option
Veritas NetBackup Shared Storage Guide UNIX, Windows, Linux Release 6.5
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/290238.pdf
Information About Using the NetBackup Vault Option to Create Duplicate Backup Copies:
Veritas NetBackup Vault™ Administrator’s Guide UNIX, Windows, and Linux Release 6.5
ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/290233.pdf
NetBackup High Availability Information
Implementing Highly Available Data Protection with Veritas NetBackup
(copy and paste this link into a browser)
http://eval.symantec.com/mktginfo/enterprise/white_papers/b-whitepaper_implementing_highly_available_dr_with_veritas_netbackup_01_08_13599373.pdf
Data Domain OpenStorage Information
Data Domain OpenStorage Software Datasheet
http://www.datadomain.com/pdf/DataDomain-OST-Datasheet.pdf
Data Domain OpenStorage (OST) User Guide (requires support site login credentials)
https://support.datadomain.com
Data Domain
2421 Mission College Blvd.
Santa Clara, CA 95054
866-WE-DDUPE; 408-980-4800
24 international offices: datadomain.com/company/contacts.html
Copyright © 2008 Data Domain, Inc. All rights reserved.
Data Domain, Inc. believes information in this publication is accurate as of its publication date. This publication could include technical inaccuracies or typographical errors. The information is subject to change without notice. Changes are periodically added
to the information herein; these changes will be incorporated in new editions of the publication. Data Domain, Inc. may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time. Reproduction of
this publication without prior written permission is forbidden.
The information in this publication is provided “as is”. Data Domain, Inc. makes no representations or warranties of any kind, with respect to the information in this publication, and specifically disclaims implied warranties of
merchantability or fitness for a particular purpose.
Data Domain and Global Compression are trademarks of Data Domain, Inc. All other brands, products, service names, trademarks, or registered service marks are used to identify the products or services of their respective owners.
WP-OST-0708
DEDUPLICATION STORAGE
www.datadomain.com