Backup

Backup
Efficient Backup with Data DeDuplication
Which Strategy is Right for You?
Arturo Benavides
BURA Practice Manager
EMC Latin America
© Copyright 2009 EMC Corporation. All rights reserved.
Why So Much Interest in Data Deduplication?
Major trends influencing the transformation of backup environments
 Data Growth
– Daily, weekly, and monthly full backups kept for months or years
– Typically represents a factor of 4 to 30 times production capacity
 New requirements to keep more data for longer periods
 Server Virtualization
– Dynamic environment causes increased complexity
– Server consolidation creates high server utilization leaving little bandwidth for backup
– VM sprawl creates protection challenges
[Chart: Digital Information Created and Replicated Worldwide, in exabytes, 2008 to 2012: 5-fold growth in 4 years. Source: IDC Digital Universe white paper, sponsored by EMC, May 2009]
[Chart: CPU utilization. Old Paradigm, Physical Environment: low overall server utilization and plenty of bandwidth for backup (Servers A, B, C; 20 percent resource utilization). New Paradigm, Virtual Environment: high overall server utilization and little bandwidth for backup (Virtual Servers A, B, C on shared ESX Server hardware; 80 percent resource utilization)]
Where is Deduplication Applied Today?
Backup (Session Focus)
 To address major inefficiencies and costs due to redundant data
 Offered at both source/target depending upon use case
 Delivered as integrated backup solution or as hardware target for incumbent backup software
Archive
 To provide single instancing for long-term retention
 Reduces long-term storage costs
 Ability to guarantee single instance of data for compliance
Primary storage
 To provide increased primary storage efficiency; store more data, retain data longer
 Non-disruptive; maintain performance with significantly reduced capacity requirements
 Reduced storage acquisition cost; longer intervals between storage capacity upgrades
Deduplication across use cases provides additive savings and readily co-exists
EMC’s Definition of Deduplication
“The process of detecting and identifying the unique data segments within a given set of information, enabling the elimination of redundancy when stored or moved.”
[Figure: Data Sets 1, 2, and 3 pass through deduplication. Before: total segments = 39. After: unique segments = 6]
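The before and after counts in the figure imply a reduction ratio. A minimal sketch of that arithmetic follows; the function name `dedupe_ratio` is illustrative and not an EMC term.

```python
def dedupe_ratio(total_segments: int, unique_segments: int) -> float:
    """Return how many logical segments each stored segment represents."""
    if unique_segments <= 0:
        raise ValueError("unique_segments must be positive")
    return total_segments / unique_segments


# Figures from the slide: 39 segments before, 6 unique segments after.
ratio = dedupe_ratio(39, 6)
print(f"{ratio:.1f}x reduction")  # 6.5x reduction
```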
How Data Deduplication Works
 First instance (March 2009): segments A, B, C, D. Only unique data segments are backed up.
 Duplicate instance (March 2009): segments A, B, C, D. Data already backed up, so only a unique ID pointer is stored (20 bytes).
 Modified instance (April 2009): segments E, B, C, D. New data segment identified and backed up.
Unique data stored on disk (A, B, C, D, E), available for immediate recovery
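The three cases on this slide can be sketched as a toy single-instance store. This is an illustration only: the slide gives a 20-byte ID but does not name the hash, so SHA-1 (which produces a 20-byte digest) is an assumption, and `SegmentStore` is a hypothetical class, not a product API.

```python
import hashlib


class SegmentStore:
    """Toy single-instance store keyed by a 20-byte SHA-1 digest (assumed hash)."""

    def __init__(self) -> None:
        self.segments: dict[bytes, bytes] = {}

    def backup(self, segments: list[bytes]) -> list[bytes]:
        """Store unseen segments; return the 20-byte IDs that describe this backup."""
        pointers = []
        for data in segments:
            seg_id = hashlib.sha1(data).digest()  # always 20 bytes
            if seg_id not in self.segments:
                self.segments[seg_id] = data      # new, unique segment is stored
            pointers.append(seg_id)               # a duplicate costs only a pointer
        return pointers

    def restore(self, pointers: list[bytes]) -> list[bytes]:
        """Rebuild a backup from its list of segment IDs."""
        return [self.segments[p] for p in pointers]


store = SegmentStore()
first = store.backup([b"A", b"B", b"C", b"D"])      # first instance
duplicate = store.backup([b"A", b"B", b"C", b"D"])  # duplicate instance
modified = store.backup([b"E", b"B", b"C", b"D"])   # modified instance
print(len(store.segments))  # 5 unique segments: A, B, C, D, E
```

Each backup remains fully restorable from its pointer list, which is why the slide describes the data as available for immediate recovery.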
Factors Impacting Dedupe Ratios for Backup
Small variations can have big impact
Type of data: More user created content* = higher deduplication ratio
Data change rate: Less change = higher deduplication ratio
Retention policy: Longer retention policy = higher deduplication ratio
Full to incremental backup ratio: More full backups = higher deduplication ratio
*Encrypted and compressed data not ideal dedupe candidates
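A rough way to see why these factors move the ratio is a toy model in which every full backup after the first adds only the changed fraction of the data. The model and the name `estimated_ratio` are assumptions for illustration; real ratios also depend on data type, compression, and incrementals.

```python
def estimated_ratio(full_backups: int, daily_change_rate: float) -> float:
    """Toy model: each full after the first adds only the changed fraction.

    Logical data  = full_backups * 1 unit
    Physical data = 1 unit + (full_backups - 1) * daily_change_rate units
    Ignores local compression and incremental backups; illustrative only.
    """
    if full_backups < 1 or not 0.0 <= daily_change_rate <= 1.0:
        raise ValueError("need full_backups >= 1 and 0 <= change rate <= 1")
    logical = float(full_backups)
    physical = 1.0 + (full_backups - 1) * daily_change_rate
    return logical / physical


# Longer retention (more retained fulls) raises the ratio.
short = estimated_ratio(full_backups=7, daily_change_rate=0.02)
long_ = estimated_ratio(full_backups=30, daily_change_rate=0.02)
# A higher change rate lowers the ratio.
busy = estimated_ratio(full_backups=30, daily_change_rate=0.10)
print(short < long_, busy < long_)  # True True
```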
Deduplication Basic Concepts
 File-Level Dedupe / Single-Instancing
– File-level comparison & elimination of duplicate files (replaced with stubs/pointers to
common data content)
– Leveraged along with compression for archive and primary storage optimization
 Block-level / Sub-File Deduplication
– Eliminates redundant ‘groups of blocks’ within files -- maximum dedupe granularity
– For B2D / VTL - deploy at either Source or Target, as Immediate or Scheduled process
EMC is the ONLY vendor offering choice of how, when
& where to implement dedupe
One Size Does Not Fit All Environments
SOURCE: Avamar (deduplication at source)
 Client software agents identify repeated sub-file data segments at the source
 Backup application sends only new, unique segments across the network to storage device
 Benefits
– Improved backup windows
– Reduced virtual infrastructure stress
– Reduced backup client-server bandwidth
TARGET: Data Domain (deduplication at target)
 Backup application sends native data to a target storage device
 Data is deduplicated as it reaches the target
 Benefits
– Plug and Play with existing Backup Software
– High throughput for large datasets & copy to tape
– Protocol independence: VTL, NAS, NBU OST
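The difference between the two approaches is where the duplicate check happens and therefore how much data crosses the network. The sketch below contrasts them under stated assumptions: `TargetStore`, `source_side_backup`, and `target_side_backup` are hypothetical names, SHA-1 is an assumed segment ID, and neither function reflects the actual Avamar or Data Domain protocols.

```python
import hashlib


class TargetStore:
    """Stands in for the backup storage device (hypothetical interface)."""

    def __init__(self) -> None:
        self.segments: dict[bytes, bytes] = {}

    def missing(self, ids: list[bytes]) -> set[bytes]:
        """Return the IDs the store does not hold yet."""
        return {i for i in ids if i not in self.segments}

    def put(self, seg_id: bytes, data: bytes) -> None:
        self.segments[seg_id] = data


def source_side_backup(segments: list[bytes], target: TargetStore) -> int:
    """Deduplicate at the client; return payload bytes sent over the network."""
    ids = [hashlib.sha1(s).digest() for s in segments]
    needed = target.missing(ids)       # only segment IDs are exchanged here
    sent = 0
    for seg_id, data in zip(ids, segments):
        if seg_id in needed:
            target.put(seg_id, data)
            needed.discard(seg_id)     # send each new segment only once
            sent += len(data)
    return sent


def target_side_backup(segments: list[bytes], target: TargetStore) -> int:
    """Send native data; the target deduplicates as data arrives."""
    sent = 0
    for data in segments:
        sent += len(data)              # every byte crosses the network
        target.put(hashlib.sha1(data).digest(), data)
    return sent


monday = [b"a" * 100, b"b" * 100, b"c" * 100]
tuesday = [b"a" * 100, b"b" * 100, b"d" * 100]  # one segment changed

src, tgt = TargetStore(), TargetStore()
source_side_backup(monday, src)
target_side_backup(monday, tgt)
print(source_side_backup(tuesday, src))  # 100
print(target_side_backup(tuesday, tgt))  # 300
```

Both stores end up holding the same four unique segments; only the network payload differs, which matches the bandwidth benefit listed for the source-based approach.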
EMC Data Domain and Avamar Deduplication
Retain more.
Retain backups longer, using less disk
10-30x data reduction eliminates the use of tape
for operational recovery
Replicate smarter.
Efficiently move data offsite – faster
Only move deduplicated data for 99%
bandwidth efficiency and cost-effective DR
Recover reliably.
Leverage end-to-end data verification
Continuous fault-detection and healing ensure
data recoverability to meet SLAs
EMC Data Domain
Inline deduplication storage systems
 Supports backup and archive software
– Backup Software: NetWorker, Symantec, CommVault, IBM TSM, …
– Application utilities: Oracle RMAN, SQL Server, …
– F5 ARX file virtualization
– Archive: SourceOne, Symantec Enterprise Vault, Mimosa, …
– Data Domain Retention Lock software option
 Supports any protocol
– SAN: VTL software option
– NAS: NFS, CIFS
– Custom: NetBackup OpenStorage software option
 Scalable for Local and Distributed Recovery
– Up to 5.4 TB/hour
– Up to 71 TB addressable capacity per system
– Data Domain Replicator software option
 Advanced dedupe architecture for high speed & resilience
– Stream Informed Segment Layout (SISL) scaling architecture
– Data Invulnerability Architecture
EMC Avamar
Deduplication backup software
 Integrated software & hardware solution with global source-based deduplication
– Deduplicates across sites and servers globally
– Effective full backup every time
– Single step recovery
– Backup process reduces data sent over the network and stored
– Variable-length subfile segments for optimal deduplication
 Integrated high availability and reliability
– RAIN for high availability and fault tolerance
– Avamar server and data recoverability verified daily
– Replication between servers
 Flexible deployment options
– Avamar software
– Avamar Data Store
– Avamar Virtual Edition for VMware environments
EMC NetWorker
Manage from a single pane of glass
 Integrated deduplication
 Optimize dedupe for the greatest benefit
 Single pane of glass
[Diagram: file systems and applications protected through NetWorker, with Avamar and Data Domain as deduplication options]
EMC Data Protection Advisor
Single view across EMC backup to disk with deduplication solutions
 Complete view into the total backup environment: EMC solutions and beyond
 Centralizes monitoring, reporting and analysis
 Manage SLAs, Capacity Planning, Optimization
Unified Data Protection Management across Avamar, Data Domain, Disk Library, and NetWorker
Lower effort and cost
Reduce Risk
Manage Complexity
Summary - EMC Broadest De-duplication Portfolio
Use Case                        EMC Offering
Primary Storage                 DataDomain & Celerra De-duplication
Remote office backup            Avamar: Global source de-duplication
VMware backup                   Avamar: Global source de-duplication
Bandwidth constrained backup    Avamar: Global source de-duplication
LAN backup to disk              DataDomain
iSeries & Mainframe             DataDomain
High speed SAN backup           DL4000 series + DataDomain
Consolidated management         NetWorker with Avamar and/or Disk Library
File and e-mail archive         DiskXtender & SourceOne: Single instance
Active archive                  EMC Centera: Single instance storage
Which deduplication solution is right for you?
Let Us Help You Determine the Right Solution
 Depends on:
– Current backup challenges and environment
– Application and data type
– Service level requirements
 Backup, e-mail, and file system assessments, TCO tools
 EMC Assessment for Deduplication Service
 EMC Operational Assurance for Avamar
EMC Education Services
 Develop and validate your Deduplication and Backup Recovery expertise
1. Information Storage Technology ‘Open’ Curriculum
2. EMC Technology-Specific Learning Paths
3. EMC Proven Professional Certification Program
 Enhance your Deduplication and Backup Recovery capabilities
Featured Learning Paths
– Avamar (deduplication) Administration
– Backup and Recovery – NetWorker
– EMC Disk Library
– EMC RecoverPoint
– Replication Manager
Take the next step
Visit http://education.EMC.com/BackupRecovery
Why EMC for Backup-to-Disk with Deduplication
 Market Leadership
– Avamar: #1 in source-based deduplication
– Data Domain: #1 in target-based deduplication
– Disk Library: #1 in VTL
 EMC offers the broadest set of integrated
backup-to-disk solutions with deduplication
– Avamar, Data Domain
– Disk Library, NetWorker
 EMC is the only vendor providing solutions to
ALL customer needs
– From “refresh to re-design”
– Tailored to the size, need, and budget of the
customer
 Inline deduplication is a competitive
differentiator
– A fundamental technology appearing across the
portfolio in both hardware and software
Q&A
THANK YOU !!!
Get More Out of Storage with Data Domain DeDuplication Storage Systems
Carlos Patiño
DataDomain Sales Manager
EMC Latin America
Data Domain: Leadership and Innovation
Deduplication Storage Systems
> 9,500 systems installed
> 3,400 customers
> 1,600 petabytes under Data Domain protection worldwide
A History of Industry Firsts
Milestones shown on the 2003 to 2009 timeline:
First Dedupe NAS
First Dedupe Gateway
First Dedupe Volume Replication
Largest Dedupe Array
First Dedupe Directory Replication
First Dedupe VTL
Fastest Backup Controller
First Dedupe Cascaded Replication
First Dedupe Nearline Storage
Backup Data Reduction/Deduplication: F1000
Storage Networking Heat Index Rank: 1
[Chart: Roll-up Implementation Time Frame: In Use Now 27%, In Pilot / Evaluation 8%, In Near-term Plan (through Q2 '09) 15%, In Long-term Plan (Q3 '09 – Q1 '10) 25%, Not in Plan 26%]
[Chart: vendors by implementation stage, on a 0% to 50% axis: EMC, NetApp, IBM, Symantec, FalconStor, CommVault, Quantum, Oracle, Microsoft, COPAN, Asigra, Syncsort, HP, Dell]
[Chart: 2008 Spending Levels and 2009 vs. 2008 Projections of Users With the Technology in Use or in Consideration; spending bands run from No Spending and Under $50K to Over $10M; projections are Less Money, About the Same, More Money]
Source: TheInfoPro Storage Study, Spring 2009, Wave 12: F1000 Sample (5/5/09). Left & Top Right Charts: n=147, Lower Right Chart: n=38, Lower Bottom Right Chart: n=21
EMC Data Domain
Dedupe everything without
changing anything
Simplify backup, archiving and DR with
easy integration across workloads,
infrastructures, and backup software
Retain.
Replicate.
Recover.
Data Domain
Deduplication Storage Systems
Data DeDuplication: Under the Hood (Daily Fulls)
Store more backups in a smaller footprint.
Backup               Logical   Estimated reduction   Physical
First Friday full    1 TB      2-4x                  250 GB
Monday full          1 TB      40-70x                25 GB
Tuesday full         1 TB      40-70x                25 GB
Wednesday full       1 TB      40-70x                25 GB
Thursday full        1 TB      40-70x                25 GB
Friday full          1 TB      40-70x                25 GB
TOTAL                6 TB      16x                   375 GB
[Figure: segments of successive Friday full backups; only segments not already stored are written to disk]
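The totals on this slide follow directly from the per-backup figures. The short check below recomputes them; it assumes decimal units (1 TB = 1,000 GB), which the slide does not state.

```python
GB_PER_TB = 1000  # decimal units assumed; the slide does not say which it uses

# (backup, logical TB, physical GB) as listed on the slide
backups = [
    ("First Friday full", 1, 250),
    ("Monday full", 1, 25),
    ("Tuesday full", 1, 25),
    ("Wednesday full", 1, 25),
    ("Thursday full", 1, 25),
    ("Friday full", 1, 25),
]

logical_gb = sum(tb for _, tb, _ in backups) * GB_PER_TB
physical_gb = sum(gb for _, _, gb in backups)
print(logical_gb, physical_gb, logical_gb / physical_gb)  # 6000 375 16.0
```

The first full carries most of the physical cost; each later full adds only 25 GB, which is why the cumulative ratio keeps climbing as more fulls are retained.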
Dedupe Everything Without Changing Anything
TODAY
Multiple Platforms for Backup, Archiving, DR: …increased management
Non-Dedupe Disk: …no data reduction, inefficient replication or clone to tape
Tape Backup: …long recoveries from offsite, shipment risk, media failures
WITH DATA DOMAIN
Disk-Based Backup, Archiving, Networked DR
Resource Less. Manage Less.: Easy integration with existing backup/archive apps and infrastructure simplifies and centralizes management
Fastest Time to DR-Readiness for Backup: Replication flexibility to ensure data available for recovery faster than any other method
Lower Cost per Gigabyte for Backup: Efficient disk access minimizes the number of disks needed to deliver high throughput
[Diagram: in both cases, file systems and apps, VMware, and NAS, SAN, DAS feed backup and archive apps over the network; with Data Domain, data replicates over the WAN to a DR site]
Industry’s Most Scalable Inline Dedupe Systems
New: DD630, DD610, and DD140 Systems
Product lines shown: DD140, DD600 Appliance Series (DD610, DD630), DD880, DDX Array Series (up to 16 controllers)
1TB Disk
Internal or External Storage
OST, VTL, Replicator, & Retention Lock software options

Model        Speed        Logical Cap.     Raw Cap.        Usable Cap.
DD140        450 GB/hr    17-43 TB         1.5 TB          .86 TB
DD610        675 GB/hr    75-195 TB        Up to 6 TB      Up to 3.98 TB
DD630        1.1 TB/hr    165-420 TB       Up to 12 TB     Up to 8.4 TB
DD565        1.1 TB/hr    320-810 TB       Up to 23.5 TB   Up to 16.2 TB
DD660        2 TB/hr      520 TB-1.31 PB   Up to 36 TB     Up to 26.1 TB
DD880        5.4 TB/hr    1.4-3.5 PB       Up to 96 TB     Up to 71 TB
DDX Array    86.4 TB/hr   22.6-56.7 PB     Up to 1.5 PB    Up to 1.13 PB
Data Domain Basics
Easy integration with existing environment
Backup and Archive
Applications
Data Domain DD880 Appliance
CIFS, NFS,
NDMP, OpenStorage
Ethernet
Replication
VTL over FC
10 Gb and 1 Gb Ethernet
4 Gb Fibre Channel
RAID-6
5.4 to 71 TB usable capacity with shelves
N+1 fans and redundant, hot plug power supplies
Methodology: Inline vs Post-Process
Post Process: Dedupe After Storing (Store, then Dedupe)
3x disk accesses to shared store
Process contention worse if more of them
- Copy to tape: Too slow to stream tape
- Recovery: SLA predictability
- Replication: Poor time-to-DR
- Dedupe itself if interleaved with backup or restore
More admin to fight these issues
Inline: Dedupe Before Storing (Dedupe, then Store)
Other activities unimpeded
• Predictable
• Simpler
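One way to read the "3x disk accesses" claim is that post-process deduplication writes the raw backup, reads it back to find duplicates, and then writes the unique segments, while inline deduplication writes only the unique segments. The sketch below counts bytes of disk I/O under that reading; the three-pass model and both function names are assumptions, not a description of any specific product.

```python
import hashlib


def inline_io(segments: list[bytes], known: set[bytes]) -> int:
    """Bytes of disk I/O when duplicates are filtered before storing."""
    io = 0
    for data in segments:
        seg_id = hashlib.sha1(data).digest()
        if seg_id not in known:
            known.add(seg_id)
            io += len(data)  # single write of unique data
    return io


def post_process_io(segments: list[bytes], known: set[bytes]) -> int:
    """Bytes of disk I/O when raw data lands first and is deduplicated later."""
    raw = sum(len(s) for s in segments)
    io = raw                           # 1) write the raw backup to staging disk
    io += raw                          # 2) read it back to find duplicates
    io += inline_io(segments, known)   # 3) write the unique segments
    return io


# Four distinct 100-byte segments, so nothing deduplicates on this first backup.
backup = [bytes([i]) * 100 for i in range(4)]
print(inline_io(backup, set()))        # 400
print(post_process_io(backup, set()))  # 1200
```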
Network-Efficient DR Replication
True DR; lowers WAN costs; improves SLAs
95-99% Cross-site Bandwidth Reduction
Flexible Replication
• Many-to-one
• Bi-directional
• System-to-system
[Diagram: remote source sites (DD140 and DD565 systems holding backup and archive data) each send only 1-5% of their data across the WAN to a DDX with DD880s at the destination data center hub]
Supports hundreds of remote sites
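To make the bandwidth figures concrete, the sketch below estimates how long a backup takes to replicate when only the deduplicated remainder crosses the WAN. The 1,000 GB backup size and 10 Mbit/s link are made-up example values, and `wan_transfer_hours` is a hypothetical helper that ignores protocol overhead.

```python
def wan_transfer_hours(logical_gb: float, reduction: float, link_mbps: float) -> float:
    """Hours to replicate a backup when only (1 - reduction) of it crosses the WAN."""
    if not 0.0 <= reduction < 1.0 or link_mbps <= 0:
        raise ValueError("need 0 <= reduction < 1 and a positive link speed")
    sent_gb = logical_gb * (1.0 - reduction)
    seconds = sent_gb * 8_000 / link_mbps  # 1 GB = 8,000 megabits (decimal)
    return seconds / 3600


# Example values only: a 1,000 GB backup over a 10 Mbit/s link.
for reduction in (0.0, 0.95, 0.99):
    hours = wan_transfer_hours(1000, reduction, 10)
    print(f"{reduction:.0%} reduction: {hours:.1f} hours")
```

With these example inputs the transfer drops from roughly nine days to a few hours, which is the effect behind the time-to-DR claims on this slide.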
Multi-site Protection using Cascaded Replication
[Diagram: local backup directories at Data Center Site #1 and Data Center Site #2 are written over the LAN and replicated over the WAN for consolidation at Protection Site #3; locations shown are New York, Boston, and London]
Flexibility – More Than Just VTL!
[Diagram: backup, archive, and database workloads (AS400, SQL, LaserVault, VMware .vmdk, Tandem, ETI, Backbox, and mainframe over ESCON/FICON) reach Data Domain appliances over SAN and LAN using NFS, CIFS, and OST, and replicate over the WAN to replica Data Domain appliances]
Data Domain appliances: 3U, RAID-6, NVRAM, N+1 fan, 1 - 6 ports
Your Current Environment…
and Your Future Environment with Data Domain
[Diagram, current environment: Windows, Unix, VMware, and other database and file servers at the primary site send backup traffic over the data network to a backup server (master/media server) and tape library; tapes are ejected, vaulted, and imported at the remote site, where DR servers restore from a second tape library]
[Diagram, future environment: Data Domain appliances at the primary and remote sites replicate over the WAN with 95% less bandwidth; tape libraries remain for business or regulatory offsite tapes, if needed]
WAN Efficient Replication
Replication Eliminates many headaches
 Avoids manual tape operations
 Avoids lost tapes
 Gets data offsite faster/True DR
 Consolidates Tape Archive Operations
Control Bandwidth Usage
 Manage replication windows
 Trigger bandwidth usage
Replication options: Many to 1, Bi-Directional, System Mirroring, Selective Vaulting
Source: Remote Sites. Destination: Data Center Hub.
Supports hundreds of remote sites
95-99% Cross-site Bandwidth Reduction
Enterprise Manager
Centralized management with consolidated dashboards
Simple to Integrate
Installs seamlessly into existing storage infrastructure
Supports most backup and archive software
No need to rewrite existing backup policies and scripts
Works with enterprise applications commonly found in data centers
Enables Intelligent Green IT Strategies
Intelligent: Business Requirements
– Cost
 Lower TCO, higher ROI
– Performance
 Less Management: Simple, Mature, and Flexible
 Scalable: Supports multiple tiers of storage, all data types
 Throughput: Dedupes faster than others write raw data
– Security
 Data Integrity: Highest, storage of last resort
 Replication: Pass DR audits
Green: Space, Power, Cooling
– Less Disk
 Inline dedupes before hitting disk
 Uses CPU instead of spindles for performance
– Less Tape
 Reduce/eliminate tape, drives, transport, vaulting
University – School of Business
Consolidated 1 ½ racks to only 3U
Customer Challenges
 Too much capacity needed to meet retention requirements
 Problems with tape management and concerned about security
 Needed to migrate to affordable replication solution for backup data
Data Domain Solution
 No changes to infrastructure required
 Reduced footprint, cooling, management
 Eliminated tape transport services
 Gained fast DR synch
 Retain data for 6 weeks
 Recaptured admin time of 2 days per week
[Diagram: Business, Provost, and Arts & Sciences systems (NAS, SAN, DAS) back up in the data center and replicate to a DR facility]
Real World Compression Chart (NetBackup 6.5)
Amount of Data within Backup Application (~245 TB)
Cumulative Physical Data Written to Data Domain Device (~6.7 TB)
Cum. Compression Ratio (~36x): represents higher percentage of structured data in environment
Real World Compression Chart (VMWare)
Amount of Data within Backup Application (23 TB)
Cumulative Physical Data Written to Data Domain Device (488 GB)
Cum. Compression Ratio (~46x): 50/50 mix of structured and unstructured data
Real World Compression Chart (RMAN)
Amount of Data within Backup Application (~180 TB)
Cumulative Physical Data Written to Data Domain Device (~4.5 TB)
Cum. Compression Ratio (~40x): represents higher percentage of structured data in environment
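The cumulative ratios on the three charts can be sanity-checked by dividing the data held in the backup application by the physical data written. The inputs below are the approximate values read from the slides, converted to GB with decimal units (an assumption), so the recomputed ratios land near the quoted ~36x, ~46x, and ~40x rather than exactly on them.

```python
# (logical GB in the backup application, physical GB written), from the slides
charts = {
    "NetBackup 6.5": (245_000, 6_700),  # ~245 TB and ~6.7 TB
    "VMware": (23_000, 488),            # 23 TB and 488 GB
    "RMAN": (180_000, 4_500),           # ~180 TB and ~4.5 TB
}

ratios = {name: logical / physical for name, (logical, physical) in charts.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```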
Directory Replication
Simultaneous, multi-purpose DDR usage
– Local and remote office backup, recovery, de-stage to tape and DR
Simultaneous, multiprotocol replication
– NFS, CIFS, NDMP and Virtual Tape Library
[Diagram: source and destination systems each present directories under /backup (Dir A, Dir C, Dir N, Dir S, Dir T, Dir X, Dir Y) over the LAN via NFS, CIFS, and NDMP, and virtual tape pools (Tape Pool 1, Tape Pool 2) over the SAN via HBA; directories and tape pools replicate from source to destination over LAN or WAN on port 2051]
Why EMC Data Domain
Dedupe everything without changing anything
 Resource Less. Manage Less.
 Fastest Time to DR-readiness
 Lower cost per gigabyte
Simple, mature appliance design for inline,
CPU-centric deduplication
Any fabric, any software, backup &
archiving applications
Q&A
THANK YOU !!!
EMC Avamar: Game-Changing, Next-Generation Backup and Recovery with Data Deduplication
Ernesto Rodriguez
Avamar Specialist
EMC Latin America
Agenda
 Today’s data protection reality
 The usual answers
 Avamar overview
– How it works
– Flexible deployment
 Avamar solves customer
challenges
 Summary
Today’s Data Protection Reality:
Pressure to meet the changing
needs of the business
 Data growth is immense
 Companies want more data protected
 Time required for backup is expanding
 Backup hampered by tape’s
unreliability and poor performance
 Virtual machines pose unique backup
challenges
 Increasing volume of data at remote
offices
The Usual Answers
 Backup to tape, or stage to disk, and
then archive to tape
 Manually move tapes offsite for
disaster recovery
Traditional Backup
 Inefficient and slow, ~200 percent of
primary data moved across the
network weekly
 Requires tedious, multi-step recovery
process
 Risks the loss or theft of tapes in
transit
 Data is difficult to recover and leverage
Avamar Overview
Revolutionizes backup with global, source data deduplication
 Reduces the size of backup data at the source, enabling fast, daily full backups across existing physical and virtual infrastructure
– Up to 10x faster daily full backups
– Reduces daily network bandwidth impact by up to 500x
 Deduplicates across sites and servers globally for maximum efficiency
– Reduces total disk backup storage by up to 50x
 Cost-effectively store full backups on disk for extended period of time, reducing/eliminating reliance on tape
 Ideal for protecting VMware environments, remote offices, and LAN / NAS servers in the data center that suffer from slow/congested networks and infrastructure
Deployment options
 Avamar Software: Agent-only or deployed on qualified, industry standard servers
 Avamar Data Store: Fully integrated software/hardware solution
 Avamar Virtual Edition for VMware: Avamar server deployed as a virtual appliance
Avamar’s Proven Innovation and Leadership
 First to market with source/global data deduplication (2002)
 Awarded 5 technology patents from U.S. Patent and Trademark Office
 Numerous industry accolades and awards
 More than 2,200 customer deployments worldwide...and growing
Avamar: How it Works
Global, source data deduplication
 Break data into atoms (sub-file, variable-length segments of data)
 Send and store each atom only once
 EMC Avamar backup repository
…up to 500 times daily data reduction
At the source: Deduplication before data is transported across the network
At the target: Assures coordinated deduplication across sites, servers, and over time
Granular: Small, variable-length sub-file segments guarantee most effective deduplication
Avamar: How it Works
Simple example of global, source data deduplication
 First Instance (Data Center): Only unique data segments are backed up
 Duplicate Instance (Remote Site 1): Data already backed up, so only unique IDs stored (20 byte pointers)
 Modified Instance (Remote Site 2): New data segment identified and backed up
Deduplication Server (stored backup data): segments A, B, C, D, E
Target- and Source-based Data Deduplication
There are strong use cases for both technologies…but only source-based
deduplication reduces daily network bandwidth requirements and
decreases client resource utilization during backups.
Deduplication at Source (EMC Avamar)
 Moves ~ 2 percent of primary data weekly
 Up to 50 times reduction in backup storage
 Up to 500 times less daily network impact
 Up to 10 times faster daily full backups
 Fast, daily full backups, single-step recovery
 Next-generation backup and recovery
Deduplication at Target (EMC Disk Library)
 Moves ~ 200 percent of primary data weekly
 Up to 50 times reduction in backup storage
 Backups are typically restored from full and incremental images
 Dedupe device viewed as file system and/or virtual tape library target for traditional backup software
EMC Avamar: Real-World Results
Avamar daily full backups vs. traditional daily full backups
Data Type                                                             Primary Data Backed Up   Data Moved Daily   Daily Deduplication Ratio
Windows file systems                                                  3,573 GB                 6.1 GB             586:1
Mix of Windows, Linux, and UNIX file systems                          5,097 GB                 11.7 GB            436:1
Engineering files on NAS (NDMP backups)                               3,265 GB                 24.2 GB            135:1
Mix of 20 percent databases, 80 percent file systems (Windows, UNIX)  9,583 GB                 80.0 GB            120:1
Mix of Linux file systems and databases                               7,831 GB                 104.2 GB           75:1
Source: EMC
While results will vary by data type and mix, Avamar can dramatically improve backup performance and efficiency
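The daily deduplication ratio in this table is simply primary data backed up divided by data moved daily. The check below recomputes each row from the slide's own numbers; the shortened row labels are paraphrases.

```python
# (data type, primary GB backed up, GB moved daily, ratio quoted on the slide)
rows = [
    ("Windows file systems", 3573, 6.1, 586),
    ("Windows, Linux, and UNIX file systems", 5097, 11.7, 436),
    ("Engineering files on NAS (NDMP)", 3265, 24.2, 135),
    ("20% databases, 80% file systems", 9583, 80.0, 120),
    ("Linux file systems and databases", 7831, 104.2, 75),
]

for name, primary_gb, moved_gb, quoted in rows:
    computed = round(primary_gb / moved_gb)
    print(f"{name}: {computed}:1 (slide: {quoted}:1)")
```

All five recomputed ratios round to the values quoted on the slide.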
Avamar’s Innovative Architecture
 Redundant Array of Independent Nodes (RAIN) architecture
– Each server node with internal disk storage and CPU
– Provides high availability and fault tolerance across nodes
 Grid architecture for online scalability and performance
 Daily integrity checks for Avamar server and data recoverability
 RAID protection from disk failures
 Available for Avamar software and Avamar Data Store
[Diagram: Avamar Server with parity across storage nodes, verified checkpoint, and utility and spare node; U.S. Patent No. 6,826,711]
Avamar’s Centralized Management
Policy-based multi-site management
over existing network bandwidth
 Centralized management of multiple
Avamar systems
 Intuitive, web-based interface
 At-a-glance dashboards
 Capacity reporting and alerting
 Facilitates “set and forget” deployment
in smaller enterprises
 Improves efficiencies and streamlines
management in remote/branch offices
Avamar Supported Operating Systems
and Applications
Client Operating Systems Supported
 Microsoft Windows Server 2003 Standard and Enterprise
 Microsoft Windows 2000 Server and Advanced Server
 Microsoft Windows Server 2008
 Microsoft Windows XP, XP Professional, Vista
 Red Hat Linux 9.0
 Red Hat Enterprise Linux (RHEL) 3.0, 4.0, 5.0
 Solaris 8, 9, 10
 SUSE Linux Enterprise Server 8.2, 9, 10
 IBM AIX 5.2, 5.3, 6.1
 HP-UX 11.0, 11iV1, 11iV2, 11iV3
 Mac OS X 10.4x, 10.5x
 NetWare 6.5
 FreeBSD 6.2
 Novell Storage Services (NSS) OES 2
 SCO UNIX
Application Modules
 Microsoft Office SharePoint Server 2007
 Microsoft Exchange 2000, 2003, 2007
 Microsoft SQL Server 7.0, 2000, 2005
 Oracle 9i, 10g, 10gR2
 IBM DB2 8.2.x, 9.5
 NDMP (EMC Celerra) DART 5.5, 5.6
 NDMP (NetApp) Data ONTAP 6.5, 7.0.4, 7.0.5, 7.0.6, 7.1x, 7.2
 IBM Lotus Domino
VMware Infrastructure
 VMware ESX Server versions 3.0.x, 3.5, 3i
= ESX client OS-certified using RHEL 4 client (support within virtual machines depends on VMware support of client operating system)
New Avamar software expands client operating system, application support
Avamar's Unique Value
 Industry-leading deduplication and reliability
– Patented, variable-length, sub-file deduplication
– Scalable, fault-tolerant architecture and integrated
replication features
 Up to 10 times faster daily full backups
and rapid, single-step recovery
 VMware infrastructure backup
– Deduplication within and across virtual machines
minimizes impact on physical resources
 Fast, efficient LAN server and NAS backups
 Fast, efficient remote office backup
– Global deduplication at the source—where the data
resides
– Minimal consumption of network bandwidth for
backup
– True centralized management
Deduplication at the source, and globally—maximum efficiency and
data reduction; best for VMware and remote/distributed data
Avamar’s Flexible Deployment
[Diagram: Avamar software agents on primary systems at a small remote site (agent only), a large remote site (with Avamar Data Store), and the data center (with Avamar Data Store) send encrypted backup data over the WAN; a remote recovery site holds another Avamar Data Store, and a tape vault is also shown]
Avamar Data Store
Fully integrated software/hardware product
 Complete EMC backup and recovery product
– Avamar backup and recovery software with integrated source/global data
deduplication
– EMC-certified hardware—fully configured and delivered
 Built-in high availability
– Avamar RAIN technology
– Spare node, RAID
– Redundant power distribution
 Simplifies
– Purchase (single vendor, certified hardware)
– Deployment (minimizes onsite setup)
– Service (single vendor support)
New Avamar Data Store Gen 2 doubles
deduplicated backup capacity
Avamar Data Store
Flexible deployment options
 Avamar Data Store
– Multi-node configuration starts at 4 TB and scales to support up to 32 TB
licensable deduplicated capacity
– Equivalent of up to 1.1 PB of cumulative traditional disk or tape backup
storage*
– Backup media requirement reduced 20–40 times
– High availability and reliability with RAIN architecture, RAID, daily integrity
checks, and redundant power
 Avamar Data Store Single Node
– Supports 1 TB and 2 TB licensable deduplicated storage capacity
configurations
– Equivalent of up to 70 TB of cumulative traditional disk or tape backup
storage*
– Designed for easy deployment at remote offices
– Offers fast, local recovery without dependence on a WAN connection
* Equivalent traditional backup capacity assumptions: 100 percent Microsoft Office file data, weekly full and daily
incremental backups, no compression, 10 percent daily change rate, 90-day retention
Avamar’s Optimized Backup for VMware
Infrastructure
Avamar deduplicates at the optimal location and level of granularity
 Optimal location: Reduces data at the source
– Duplicate data never traverses congested shared resources
– Data backed up each week is reduced from ~200 percent to ~2 percent of primary data
– Significantly reduces contention for shared resources
– Deduplicates within and across virtual machines and/or VMDK (virtual machine disk format) files
 Optimal granularity: Sub-file, variable-length segments
– VMDK is one large file; any changes cause an incremental/full traditional backup
– Fixed-length segment deduplication fails due to frame offset
– Sub-file, variable-length segment deduplication finds changes anywhere in VMDK
– Dramatically reduces the amount of data moved during daily backups
– Significantly reduces daily backup time
 Ideal for the protection of VMware environments
– Enables fast, secure backups over existing virtual infrastructure
– Permits greater server consolidation and maximum value from VMware
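The frame-offset problem mentioned above can be illustrated with a toy comparison. This is a minimal sketch, not Avamar's actual algorithm: the CRC32-over-a-sliding-window boundary rule, the 16-byte window, the 64-byte average chunk size, and all function names are assumptions chosen for brevity.

```python
import hashlib
import zlib


def fixed_chunks(data, size=64):
    """Cut at fixed byte offsets, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data, window=16, mask=0x3F):
    """Cut wherever a checksum of the previous `window` bytes matches a pattern.

    Boundaries depend only on nearby content, so they move with the data.
    """
    chunks, start = [], 0
    for end in range(window, len(data) + 1):
        if zlib.crc32(data[end - window:end]) & mask == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


def new_chunks(old, new, chunker):
    """Count chunks of `new` whose fingerprint does not occur in `old`."""
    known = {hashlib.sha256(c).digest() for c in chunker(old)}
    return sum(1 for c in chunker(new) if hashlib.sha256(c).digest() not in known)


# Deterministic pseudo-random stand-in for a 4 KiB disk image.
original = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(128))
# Insert a single byte near the start, shifting everything after it.
modified = original[:100] + b"X" + original[100:]

print("fixed-length chunks to re-send:   ", new_chunks(original, modified, fixed_chunks))
print("variable-length chunks to re-send:", new_chunks(original, modified, variable_chunks))
```

With fixed offsets, nearly every chunk after the inserted byte looks new because its contents have shifted. With content-defined boundaries, only the chunks around the insertion change and the rest still match.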
Avamar Products for VMware Infrastructure
Flexible, fast, efficient, and reliable backup and recovery
[Diagram: Avamar Client Backup Products (Guest, VMware Consolidated Backup, Service Console) and Avamar Server Backup Products (Avamar Data Store, Avamar Virtual Edition for VMware). Components shown: Avamar software agents in virtual machines on ESX Server hardware, a VCB proxy server with Avamar agent acting as centralized data mover, SAN storage, and the Avamar server.]
Avamar’s Optimized Backup for VMware
Infrastructure
Avamar efficiently protects virtual machines
Up to…
 95 percent reduction in data moved
 90 percent reduction in backup times
 50 percent reduction in disk impact
 95 percent reduction in NIC usage
 80 percent reduction in CPU usage
 50 percent reduction in memory usage
All backups are stored as “virtual full backups,” ready for immediate restore
Maintains effective consolidation ratios without overtaxing CPU utilization
[Diagram: Traditional backup moves ~200% weekly; Avamar moves ~2% weekly.]
Avamar Products for VMware Infrastructure
Remote Offices with VMware
Avamar agent moves data to Avamar Virtual Edition for VMware; data is then replicated to the corporate data center
Remote Offices without VMware
Avamar agent moves data to a physical Avamar Single Node server; data is then replicated to the corporate data center, or an Avamar agent can back up directly to the data center over the WAN
[Diagram: Remote offices connect over an encrypted WAN to a VMware data center with guest-level backup and a VMware data center with VMware Consolidated Backup (VCB proxy server, LAN/SAN).]
Remote Office/Branch Office Backup
Commonwealth of Virginia’s Department of Motor Vehicles
Before Avamar
 73 remote offices, backup to local, direct-attached tape drives
 No local IT staff
 Daily backup required five hours per site; six hours to restore an entire server
With Avamar
 Four hours to back up all 73 offices via existing WAN (56k–T1) to central Avamar server
 45 minutes to restore entire servers; files restored in seconds
 Centralized management and control
“Avamar enabled us to reduce administrative support requirements by 80 percent, reduce backup windows by 90 percent, and recover lost files and servers in minutes rather than hours.”
- Mike DePhillip, Virginia DMV
* Source: IDC, “Gaining Control of Remote Office Data Protection,” March 2007
Use Case: Limited Bandwidth LAN and NDMP Backup
Dedupe at source for faster NDMP and LAN backups with fewer network resources used
[Diagram: NAS device backed up over the LAN.]
Challenges
 Full NDMP backups are slow and consume significant network, filer, and storage resources
 Limited-bandwidth LAN backups face backup window challenges
Why Dedupe Backup Software
 NDMP Backup: Eliminates time-consuming full backups by moving only new data
 Limited Bandwidth LAN: Speeds backup by deduplicating data at source before moving across LAN
 Single-step restore from full backup image
 Reduces/eliminates associated tape costs
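The idea of moving only new data while still restoring from a full image can be sketched as follows. This is a toy in-memory model under stated assumptions, not Avamar's protocol: the class and function names are hypothetical, and fixed 4 KiB segments stand in for variable-length segmentation to keep the example short.

```python
import hashlib


class DedupServer:
    """Hypothetical in-memory stand-in for a deduplicating backup server."""

    def __init__(self):
        self.segments = {}  # fingerprint -> segment bytes, stored once
        self.backups = {}   # backup name -> ordered list of fingerprints

    def missing(self, fingerprints):
        """Tell the client which fingerprints the server has never seen."""
        return {f for f in fingerprints if f not in self.segments}

    def store(self, name, recipe, new_segments):
        self.segments.update(new_segments)
        self.backups[name] = recipe  # every backup is a complete recipe

    def restore(self, name):
        """Single-step restore: reassemble the full image from its recipe."""
        return b"".join(self.segments[f] for f in self.backups[name])


def backup(server, name, data, size=4096):
    """Fingerprint at the source, then send only segments the server lacks."""
    segments = [data[i:i + size] for i in range(0, len(data), size)]
    recipe = [hashlib.sha256(s).digest() for s in segments]
    wanted = server.missing(recipe)
    payload = {f: s for f, s in zip(recipe, segments) if f in wanted}
    server.store(name, recipe, payload)
    return sum(len(s) for s in payload.values())  # segment bytes sent


# Deterministic 40 KiB test image made of ten distinct 4 KiB segments.
monday = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(1280))
# Overwrite the third segment in place to simulate one day's change.
tuesday = monday[:8192] + b"\x00" * 4096 + monday[12288:]

server = DedupServer()
sent_monday = backup(server, "monday", monday)
sent_tuesday = backup(server, "tuesday", tuesday)
print("bytes sent Monday: ", sent_monday)
print("bytes sent Tuesday:", sent_tuesday)
```

The second backup sends only the one changed segment, yet both backups can be restored in full because each is stored as a complete list of fingerprints.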
Avamar Transforms Backup and Recovery
 Global data deduplication defuses the
explosion of backup data
– Up to 500:1 daily reduction in network
impact
– Radically reduce time and storage required
for backup
 Alters the fundamental economics of
disk versus tape
– Accelerate shift to disk as primary medium
for backup
 Alternative to archaic IT processes
(shipping tapes for disaster recovery)
– Automated, encrypted remote copy over
existing WANs
 Key element of EMC’s next-generation
backup portfolio
Avamar Solves Customer Challenges
VMware Infrastructure Backup
Deliver even greater efficiencies to consolidated environments by accelerating
deployment, simplifying management, unburdening underlying hardware resources,
and eliminating the need for a dedicated backup server and storage infrastructure.
Remote Office/Branch Office Backup
Extend data center best practices to remote offices in an affordable manner, by
minimizing time, effort, and the consumption of network bandwidth necessary to
protect the growing amount of data stored at remote offices.
Resource-constrained Backup (LAN/NAS servers)
Slow servers, limited bandwidth, growing backup windows
Stretch data storage infrastructure by speeding backup to disk, allowing more
information to be kept on disk, improving reliability and recoverability, and
controlling tape-related costs in the face of growing data volumes.
Q&A
THANK YOU !!!