VAPP2979
Advanced SQL Server on vSphere Techniques and Best Practices
Scott Salyer, VMware, Inc.
Jeff Szastak, VMware, Inc.
Disclaimer
• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.
Agenda / Table of Contents
1. Introductions
2. Storage
   • Data Volume and Protocol Considerations
   • Designing for Performance
3. Networking
   • Jumbo Frames
   • Guest Tuning
4. Memory
   • Understanding Memory Management
   • Best Practices in SQL Server Guests
Agenda / Table of Contents
5. CPU
   • Best Practices
   • Sizing Considerations
6. Consolidating Multiple Workloads
   • Consolidation Options
   • Mixing Workload Types
7. SQL Server Availability
   • vSphere Features
   • Supported SQL Server Clustering Configurations
Storage: Disk Volume and Protocol Considerations
VMFS or RDM?
• Generally similar performance (http://www.vmware.com/files/pdf/performance_char_vmfs_rdm.pdf)
• vSphere 5.5 supports up to 62TB VMDK files
  – Disk size is no longer a limitation of VMFS
• VMFS
  – Better storage consolidation – multiple virtual disks/virtual machines per VMFS LUN (one virtual machine per LUN is still possible)
  – Consolidating virtual machines per LUN makes it less likely to reach the vSphere limit of 256 LUNs
  – Manage performance – keep the combined IOPS of all virtual machines on a LUN below the IOPS rating of the LUN
• RDM
  – Enforces a 1:1 mapping between virtual machine and LUN
  – More likely to hit the vSphere limit of 256 LUNs
  – Not impacted by the IOPS of other virtual machines
• When to use raw device mapping (RDM)
– Required for shared-disk failover clustering
– Required by storage vendor for SAN management tools such as backup and snapshots
• Otherwise use VMFS
VMDK Lazy Zeroing *
• Default VMDK allocation policy lazy zeroes 1MB VMFS blocks on first write
• SQL Server operations could be affected by lazy zeroing:
  – Write operations
  – Read operations that use tempdb extensively
  – Bulk load/index maintenance
• Write penalty on an untouched VMDK
• For best performance, format VMDK as eagerzeroedthick *
• * Zero offload capability in VAAI improves zeroing in supported arrays
[Chart: Effect of Zeroing on Storage Performance – throughput (MBps) for "Zeroing" vs. "Post-zeroing" writes across 1, 2, 4, 8, and 16 hosts]
Eagerzeroed Thick in the GUI
• When using VMFS for SQL Server data, create VMDK files as eagerzeroed thick or uncheck the Windows “Quick Format” option
[Screenshots: disk provisioning options in the vSphere 5 and vSphere 4 clients]
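If you script provisioning, the same result is reachable from VMware PowerCLI. A minimal sketch, assuming PowerCLI is installed; the vCenter address, VM name, datastore, and disk size are all placeholders:

```powershell
# Sketch: add an eagerzeroedthick VMDK to a SQL Server VM with PowerCLI.
# 'sql-vm01', 'SAN-DS01', and 200 GB are illustrative values.
Connect-VIServer -Server 'vcenter.example.com'
New-HardDisk -VM (Get-VM -Name 'sql-vm01') `
             -CapacityGB 200 `
             -Datastore (Get-Datastore -Name 'SAN-DS01') `
             -StorageFormat EagerZeroedThick
```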
Block Alignment
• Configure storage presented to vSphere hosts using vCenter to ensure VMFS block alignment
• Even though Windows is supposed to align partitions automatically as of Windows Server 2008, Microsoft recommends double-checking:
  – http://msdn.microsoft.com/en-us/library/dd758814.aspx
  – http://blogs.msdn.com/b/jimmymay/archive/2014/03/14/disk-partition-alignment-for-windows-server-2012-sql-server-2012-and-sql-server-2014.aspx (Jimmy May, MSDN Blogs)
• Whatever the operating system, confirm that new partitions are properly aligned – the partition offset divided by the stripe unit size should be an integer
• Unaligned partitions result in additional I/O; aligned partitions reduce it
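A quick way to double-check alignment inside the Windows guest – a sketch using the built-in Storage module (Windows Server 2012 and later); the 1 MB stripe unit size is illustrative, so substitute your array's actual value:

```powershell
# List each partition's starting offset and whether it divides evenly by the
# stripe unit size; a remainder of zero means the partition is aligned.
$stripeUnit = 1MB   # placeholder – use the storage array's stripe unit size
Get-Partition |
    Select-Object DiskNumber, PartitionNumber, Offset,
        @{Name = 'Aligned'; Expression = { ($_.Offset % $stripeUnit) -eq 0 }}
```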
PVSCSI, Anyone?
• The latest and most advanced vSphere SCSI controller driver
• Larger queue depth per device (256, actual 254) and per adapter (1024)
  – Default values are 64 and 254
• Less CPU overhead
• Requires VMware Tools
  – Drivers are not native to Windows
  – Cannot be used for the OS partition without some workaround
• Increase queue depth in the Windows guest OS by increasing the request ring to 32 (see the sketch after this list)
  – HKLM\SYSTEM\CCS\services\pvscsi\Parameters\Device\DriverParameter
    "RequestRingPages=32,MaxQueueDepth=254"
  – ESXi 5.0 U3 and above only
• Not currently supported for ANY type of Windows clustering configuration
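A sketch of the registry change above, scripted in PowerShell; run it elevated inside the guest, and reboot for the driver to pick it up:

```powershell
# Raise the PVSCSI request ring and queue depth (ESXi 5.0 U3 and above).
# "CCS" on the slide is shorthand for CurrentControlSet.
$key = 'HKLM:\SYSTEM\CurrentControlSet\services\pvscsi\Parameters\Device'
New-Item -Path $key -Force | Out-Null
New-ItemProperty -Path $key -Name 'DriverParameter' `
    -Value 'RequestRingPages=32,MaxQueueDepth=254' `
    -PropertyType String -Force | Out-Null
```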
…and what about NFS and in-guest iSCSI?
• NFS
  – Supported for SQL Server (must meet data write ordering requirements and guarantee write-through)
  – Not supported by VMware for Windows clustering
• In-guest iSCSI
  – Supported for standalone and clustered configurations (no VMware-mandated considerations)
  – Facilitates easy storage zoning and access masking
  – Useful for minimizing the number of LUNs zoned to an ESXi host
  – Offloads storage processing from the ESXi host
  – Should use a dedicated network and NIC
Storage: Designing for Performance
Design for Storage Performance (not just Capacity)
• The fundamental relationship between consumption and supply has not changed
  – Spindle count and RAID configuration still rule
  – But host demand is an aggregate of virtual machines
• Factors that affect storage performance include storage protocols, storage configuration, and Virtual Machine File System (VMFS) configuration
Understand Your Workload!!!
Database Workload Types
• OLTP
  – Large number of small queries
  – Sustained CPU utilization during working hours
  – Sensitive to peak contention (slowdowns affect SLAs)
  – Generally write intensive
  – May generate many chatty network round trips
• Batch / ETL
  – Typically runs during off-peak hours; low CPU utilization during normal working hours
  – Can withstand peak contention, but sustained activity is key
• DSS
  – Small number of large queries
  – CPU, memory, and disk I/O intensive
  – Peaks during month end, quarter end, and year end
  – Can benefit from inter-query parallelism with a large number of threads
SQL Server I/O Characteristics
• Understanding the I/O characteristics of common SQL Server operations and scenarios can help determine how to configure storage
• Some of the more common scenarios are below; monitor I/O to determine the specifics of each

Operation                        | Random/Sequential | Read/Write | Size Range
OLTP – Log                       | Sequential        | Write      | Up to 64K
OLTP – Data                      | Random            | Read/Write | 8K
Bulk Insert                      | Sequential        | Write      | Any multiple of 8K up to 256K
Read Ahead – DSS and Index Scans | Sequential        | Read       | Any multiple of 8K up to 512K
Backup                           | Sequential        | Read       | 1MB
Storage – Test Before Deployment
• Simulate SQL Server disk I/O patterns using a generic tool, such as the native SQLIOSim or Iometer
• Test to make sure requirements, such as throughput and latency, have been met
• Example SQL Server I/O patterns to test (a sketch follows the table):

R/W%  | Type       | Block    | Threads/Queue | Simulates
80/20 | Random     | 8K       | # cores/files | Typical OLTP data files
0/100 | Sequential | 60K      | 1/32          | Transaction log
100/0 | Sequential | 512K     | 1/16          | Table scans
0/100 | Sequential | 256K     | 1/16          | Bulk load
100/0 | Random     | 32K      | # cores/1     | SSAS workload
100/0 | Sequential | 1MB      | 1/32          | Backup
0/100 | Random     | 64K-256K | # cores/files | Checkpoints
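As one concrete example, the first row (typical OLTP data files) can be driven with Microsoft's DiskSpd, a newer alternative to the tools named on the slide; the tool path, test file location, and durations below are assumptions:

```powershell
# Approximate the OLTP data-file pattern: 80/20 read/write, random, 8K blocks,
# one thread per core. -c creates the test file, -d is duration in seconds,
# -o is outstanding I/Os per thread, -L captures latency statistics.
$cores = (Get-CimInstance Win32_ComputerSystem).NumberOfLogicalProcessors
& 'C:\Tools\diskspd.exe' -c20G -b8K -r -w20 -t$cores -o8 -d120 -L 'E:\iotest.dat'
```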
LUN Size
• In the example, VMware ESXi™ host B can generate twice as much I/O as host A
• Improved aggregate throughput of multiple LUNs
• Implications for the array
  – A greater number of smaller LUNs increases burst intensity
  – Many HBA/LUN pairs could be used simultaneously
[Diagram: VMs a–d on VMFS; ESXi A with a single LUN queue (slots 1–32) vs. ESXi B with two LUN queues (slots 1–32 each)]
Performance Best Practices
• Datastores
  – Create dedicated datastores to service BCA database workloads
  – Use Storage vMotion / Storage DRS to balance workloads across datastores
• Load balance your workloads across as many disk spindles as possible
• Optimize IP-based storage (iSCSI and NFS)
  – Enable jumbo frames
  – Use a dedicated VLAN for the ESXi host's vmknic and the iSCSI/NFS server to minimize network interference from other packet sources
  – Exclude iSCSI NICs from Windows Failover Cluster use
• Choose storage that supports VMware vStorage APIs for Array Integration (VAAI)
• Deploy vSphere Flash Read Cache (vFRC)
  – A volatile write-through cache
  – Caches read requests of virtual machine I/O
  – Enabled on a per-VMDK basis with a configurable cache block size (4KB – 1MB)
Performance Best Practices (continued)
• Follow the storage vendor's best practices when laying out the database (including storage multipathing)
• Use multiple vSCSI adapters to evenly distribute target devices and increase parallel access for databases with demanding workloads
• Format database VMDK files as eagerzeroedthick* for demanding database workloads
  – * Required ONLY if the storage array is not VAAI-compliant. See VMware KB #1021976 (http://kb.vmware.com/kb/1021976)
  – See “Benefits of EMC VNX for Block Integration with VMware VAAI” (http://www.us.emc.com/collateral/hardware/white-papers/h8293-vaai-vnx-wp.pdf)
• Network protocol processing for software-initiated iSCSI / NFS operations takes place on the host system, requiring CPU resources
• Follow the same guidelines as physical
  – Separate LUNs with different I/O characteristics, i.e. data, redo/log, temporary, rollback
Performance Best Practices (continued)
• Be conservative – for mission-critical production, dedicate LUNs for the disks above
• Ensure storage adapter cards are installed in slots with enough bandwidth to support their expected throughput
• Ensure appropriate read/write controller cache is enabled
• Pick the right multipathing policy based on the vendor's storage array design
• Storage multipathing – set up a minimum of four paths from an ESXi host to a storage array (requires at least two HBA ports)
• Configure maximum queue depth if needed for Fibre Channel HBA cards. See:
  – http://kb.vmware.com/kb/1267
SQL Server Guest Storage Best Practices
• Follow SQL Server storage best practices – http://technet.microsoft.com/en-us/library/cc966534.aspx
• Ensure correct sector alignment in Windows
  – An incorrect setting can result in up to a 50% performance hit
  – Don't use the “Quick Format” option for database/log volumes
• Pre-allocate data files to avoid autogrow during peak times (see the sketch after this list)
  – If using auto-growth, use MB increments, not percentages
• Use multiple data files for data and tempdb – start with one file per CPU core
  – Multiple tempdb files can co-exist on the same volume, but this is not encouraged
• Database file placement priority – fastest to slowest drive:
  – Transaction log files > tempdb data files > data files
• Place data and log files on separate LUNs
• Perform routine maintenance with index rebuild/reorg and DBCC CHECKDB
• Number of data files should be <= number of processor cores
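A minimal sketch of the pre-allocation advice, assuming the SqlServer PowerShell module is available; the instance name, sizes, and file paths are placeholders, so check sys.master_files for your actual file names first:

```powershell
# Pre-size tempdb and use fixed-MB (not percentage) growth increments.
Invoke-Sqlcmd -ServerInstance 'sql-vm01' -Query @"
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 8192MB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdb2.ndf', SIZE = 8192MB, FILEGROWTH = 512MB);
"@
```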
Storage – Putting It ALL Together
• Work with the storage engineer early in the lifecycle
• Optimize VMFS and avoid lazy zeroing by using eagerzeroedthick disks
• Ensure that blocks are aligned at both the ESXi and Windows levels
• Understand the path to the drives, such as storage protocol and multipathing
• Size for performance, not just capacity (apps often drive performance requirements)
• Understand the I/O requirements of different workloads
  – Separate LUNs for files with different I/O characteristics, such as transactional data versus log versus backup
  – Separate VMDKs for SQL binary, data, log, and tempdb files
• Use small LUNs for better manageability and performance
• Optimize the IP network for iSCSI and NFS
Network
Jumbo Frames
• Use jumbo frames – confirm there is no MTU mismatch anywhere in the path
• To configure, see “iSCSI and Jumbo Frames configuration on ESX 3.x and ESX 4.x” (http://kb.vmware.com/kb/1007654)
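A sketch of setting and verifying the MTU end to end, using PowerCLI on the host side and a test ping in the guest; the host, vmkernel port, and addresses are illustrative:

```powershell
# Set MTU 9000 on the virtual switch and the storage vmkernel port...
Get-VirtualSwitch -VMHost 'esxi01.example.com' -Name 'vSwitch1' |
    Set-VirtualSwitch -Mtu 9000 -Confirm:$false
Get-VMHostNetworkAdapter -VMHost 'esxi01.example.com' -Name 'vmk1' |
    Set-VMHostNetworkAdapter -Mtu 9000 -Confirm:$false

# ...then verify from the Windows guest: an 8972-byte payload plus 28 bytes of
# ICMP/IP headers equals 9000, sent with the don't-fragment bit set.
ping.exe -f -l 8972 192.168.10.50
```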
SQL Server: Network
• Default network packet size is 4,096 bytes
  – If jumbo frames are available for the entire stack, set packet size to 8,192 (see the sketch below)
• Maximize data throughput for network applications
  – Limits file system cache use by the OS
  – Set under NIC > File & Printer Sharing for Microsoft Networks
  – Use “Minimize Memory” or “Balance”
http://blogs.msdn.com/b/johnhicks/archive/2008/03/03/sql-server-checklist.aspx
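A sketch of the packet-size change via sp_configure, assuming the SqlServer PowerShell module and a placeholder instance name; only apply this after jumbo frames have been verified end to end:

```powershell
# 'network packet size' is an advanced option, so expose those first.
Invoke-Sqlcmd -ServerInstance 'sql-vm01' -Query @"
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'network packet size', 8192; RECONFIGURE;
"@
```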
AlwaysOn Availability Group Cluster Settings
• Depending on YOUR network, tuning may be necessary – work with the network team and Microsoft to determine appropriate settings

Cluster Heartbeat Parameter | Default Value
CrossSubnetDelay            | 1000 ms
CrossSubnetThreshold        | 5 heartbeats
SameSubnetDelay             | 1000 ms
SameSubnetThreshold         | 5 heartbeats

View:   cluster /cluster:<clustername> /prop
Modify: cluster /cluster:<clustername> /prop <prop_name> = <value>
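The same settings are reachable from the FailoverClusters PowerShell module; a sketch with illustrative values only – agree on the actual numbers with your network team and Microsoft:

```powershell
Import-Module FailoverClusters
$cluster = Get-Cluster -Name 'sqlaag-clu'        # placeholder cluster name

# View the current heartbeat settings.
$cluster | Format-List *SubnetDelay, *SubnetThreshold

# Example relaxed values for a routed/stretched network.
$cluster.CrossSubnetDelay    = 2000   # ms between cross-subnet heartbeats
$cluster.SameSubnetThreshold = 10     # missed heartbeats before failover
```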
Network Best Practices
• Allocate separate NICs for vMotion, FT logging traffic, and ESXi console access management
  – Alternatively, use VLAN trunking to separate production user, management, VM network, and iSCSI storage traffic
• vSphere 5.0 supports the use of more than one NIC for vMotion, allowing more simultaneous vMotions; added specifically for memory-intensive applications like databases
• Use NIC load-based teaming (route based on physical NIC load) for availability, load balancing, and improved vMotion speeds
• Have a minimum of 4 NICs per host to ensure network performance and redundancy
• Recommend the use of NICs that support:
  – Checksum offload, TCP segmentation offload (TSO)
  – Jumbo frames (JF), large receive offload (LRO)
  – Ability to handle high-memory DMA (i.e. 64-bit DMA addresses)
  – Ability to handle multiple scatter-gather elements per Tx frame
  – Offload of encapsulated packets (VXLAN)
Network Best Practices (continued)
• For “chatty” VMs on the same host, connect them to the same vSwitch to avoid physical NIC traffic
• Separate SQL workloads with chatty network traffic (e.g. AlwaysOn “are you there?” heartbeats) from those with chunky access, on different physical NICs
• Use distributed virtual switches for cross-ESXi network convenience
• Be mindful of converged networks; storage load can affect the network and vice versa, as they use the same physical hardware
• Use the VMXNET3 paravirtualized adapter driver to increase performance
  – Reduces overhead versus vlance or E1000 emulation
  – Must have VMware Tools installed to enable VMXNET3
• Tune guest OS network buffers and maximum ports
• Ensure there are no bottlenecks in the network between the source and destination
• Look out for packet loss / network latency if a network issue is detected
Memory: Understanding Memory Management
Large Pages
• Use ESXi large pages (2MB)
  – Improves performance by significantly reducing TLB misses (for applications with large active memory working sets)
  – Large pages are not shared unless there is memory pressure (KB 1021095 and 1021896)
  – Slightly reduces the per-virtual-machine memory space overhead
• For systems with hardware-assisted virtualization
  – Recommend using guest-level large memory pages
  – ESXi will use large pages to back the guest OS memory pages even if the guest OS does not use large memory pages (the full benefit comes when the guest OS uses them as well as ESXi does)
“Large Pages Do Not Normally Swap”
In cases where host memory is overcommitted, ESX may have to swap out pages. Since ESX will not swap out large pages, during host swapping a large page will be broken into small pages. ESX tries to share those small pages using the pre-generated hashes before they are swapped out. The motivation is that the overhead of breaking a shared page is much smaller than the overhead of swapping in a page if the page is accessed again in the future.
http://kb.vmware.com/kb/1021095
Swapping is Bad!
• Swapping happens when:
  – The host is trying to service more memory than it physically has, AND
  – ESXi memory optimization features (TPS and ballooning) are insufficient to provide relief
• Swapping occurs in two places:
  – Guest VM swapping
  – ESXi host swapping
• Swapping can slow down I/O performance of disks for other VMs
• Two ways to keep swapping from affecting your workload:
  – At the VM: set memory reservation = allocated memory (avoids ballooning/swapping)
    • Use the active memory counter with caution, and always confirm usage by checking memory counters in Perfmon
  – At the host: do not overcommit memory until vCenter reports that steady-state usage is less than the amount of RAM on the server; be sure to wait at least one business cycle (vCOPs)
ESXi Memory Features that Help Avoid Swapping
• Transparent Page Sharing
  – Optimizes use of memory on the host by “sharing” memory pages that are identical between VMs
  – More effective with similar VMs (OS, application, configuration)
  – Very low overhead
• Ballooning
  – Allows the ESXi host to “borrow” memory from one VM to satisfy requests from other VMs on that host
  – The host exerts artificial memory pressure on the VM via the “balloon driver” and returns the memory to the pool usable by other VMs
  – Ballooning is the host's last option before being forced to swap
  – Ballooning is only effective if VMs have “idle” memory
• DON'T TURN THESE OFF
Memory Reservations
• Allow you to guarantee a certain share of the physical memory for an individual VM
• The VM is only allowed to power on if the CPU and memory reservation is available (strict admission)
• The amount of memory can be guaranteed even under heavy loads
• In many cases, the configured size and reservation size could be the same
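A sketch of setting a full reservation with PowerCLI (the VM name is a placeholder); this guarantees the VM's entire configured memory, ruling out ballooning and host swapping for that VM:

```powershell
# Reserve all configured memory for a critical SQL Server VM.
$vm = Get-VM -Name 'sql-vm01'
$vm | Get-VMResourceConfiguration |
    Set-VMResourceConfiguration -MemReservationMB $vm.MemoryMB
```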
Reservations and vswp
• Setting a full reservation creates a 0.00 K vswp file
Memory ALLOCATED TO a VM Is Determined by…
• DRS shares/limits
• Total memory of the host
• Reservations
• Memory load of the host
Non-Uniform Memory Access (NUMA)
• Designed to avoid the performance hit when several processors attempt to address the same memory, by providing separate memory for each NUMA node
• Speeds up processing
• NUMA node sizes are specific to each processor model
Virtual NUMA in vSphere 5
• Extends NUMA awareness to the guest OS
• Enabled through the multicore UI
  – On by default for multicore VMs with 8+ vCPUs
  – Existing VMs are not affected by upgrade
  – For smaller VMs, enable by setting numa.vcpu.min=4 (see the sketch after this list)
• CPU hot-add disables vNUMA
• For wide virtual machines, confirm the feature is on for best performance
• SQL Server
  – Automatically detects NUMA architecture
  – SQL Server process and memory allocation are optimized for NUMA
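A sketch of the advanced setting for a smaller VM, using PowerCLI; the VM name is a placeholder, and the setting should be applied while the VM is powered off:

```powershell
# Lower the vNUMA threshold so a 4-vCPU VM also gets a virtual NUMA topology.
New-AdvancedSetting -Entity (Get-VM -Name 'sql-vm01') `
    -Name 'numa.vcpu.min' -Value 4 -Confirm:$false
```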
NUMA Best Practices
• http://www.vmware.com/files/pdf/techpaper/VMware-vSphere-CPU-Sched-Perf.pdf
• Avoid remote NUMA access
  – Size the # of vCPUs to be <= the # of cores on a NUMA node (processor socket)
• Hyperthreading
  – Initial conservative sizing: set vCPUs equal to the # of physical cores
  – HT benefit is around 20-25%, less for CPU-intensive batch jobs (based on OLTP workload tests)
  – Increase vCPUs to get the HT benefit, but consider the “numa.vcpu.preferHT” option on an individual-case basis
• # of virtual sockets and # of cores per virtual socket
  – Recommendation: keep the default of 1 core per socket
• Align VMs with physical NUMA boundaries
• Use ESXTOP to monitor NUMA performance at the vSphere level
• If vMotioning, move between hosts with the same NUMA architecture to avoid a performance hit (until reboot)
Memory: Best Practices in SQL Server Guests
Large Pages in SQL Server Configuration Manager
• Use Large Pages in the guest – start SQL Server with trace flag -T834
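The trace flag can be set in SQL Server Configuration Manager as the slide shows, or scripted against the startup-parameters registry key. A sketch, where the instance key (MSSQL11.MSSQLSERVER) and the next free argument slot (SQLArg3) are assumptions you must verify first:

```powershell
# Add -T834 (large page allocations) as a SQL Server startup parameter.
$params = 'HKLM:\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQLServer\Parameters'
Get-ItemProperty -Path $params     # inspect existing SQLArg0..n values first
New-ItemProperty -Path $params -Name 'SQLArg3' -Value '-T834' -PropertyType String
# Restart the SQL Server service for the flag to take effect.
```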
Lock Pages in Memory User Right
• May keep SQL Server more responsive when paging occurs
• ON by default in 32/64-bit Standard Edition and higher if the right has been granted
• The SQL Server service account (sqlservr.exe) must have the “Lock pages in memory” right
http://msdn.microsoft.com/en-us/library/ms178067.aspx
Running Multiple Instances on the Same VM
• Option 1: Use max server memory (see the sketch after the table)
  – Create a max setting for each instance:
    Max Memory = VMMem – ThreadStack – OS Mem – VM Overhead
    ThreadStack = NumOfSQLThreads × ThreadStackSize
    ThreadStackSize = 1 MB on x86 | 2 MB on x64
    (http://msdn.microsoft.com/en-us/library/ms178067.aspx)
  – Give each instance memory proportional to its expected workload / database size
  – Do not exceed the total RAM allocated to the VM
• Option 2: Use min server memory
  – Create a min setting for each instance
  – Give each instance memory proportional to its expected workload / database size
  – The sum should be 1-2 GB less than the RAM allocated to the VM
• Settings can be modified without having to restart the instances!

• Max server memory
  – Pro: when a new process or instance starts, memory is available immediately to fulfill the request
  – Con: if some instances are not running, the running instances cannot access the available RAM
• Min server memory
  – Pro: running instances can leverage memory previously used by instances that are no longer running
  – Con: when a new process or instance starts, running instances need to release memory
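A sketch of Option 1 for two instances on a 32 GB VM, assuming the SqlServer PowerShell module; the instance names and the 16 GB / 8 GB split are placeholders derived from the formula above:

```powershell
foreach ($cfg in @(@{ Instance = 'sql-vm01\INST1'; MaxMB = 16384 },
                   @{ Instance = 'sql-vm01\INST2'; MaxMB = 8192 })) {
    Invoke-Sqlcmd -ServerInstance $cfg.Instance -Query @"
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', $($cfg.MaxMB); RECONFIGURE;
"@
}
```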
How Many VMs Can I Put on a Host?
• As many as whose active memory will fit in physical RAM, while leaving some room for memory spikes:
  Active memory (%ACTV) of VMs + memory overhead – page sharing across VMs (de-duping)
  (De-duping = Transparent Page Sharing)
Memory – Putting It ALL Together
• Use ESXi large pages
• Avoid host-level swapping
  – Utilize ESXi memory management features like TPS and ballooning (don't disable them!)
  – Avoid overcommitment of memory at the host level (HostMem >= Sum of VMMem – overhead)
  – If overcommitment is unavoidable, use reservations to protect important VMs
• To avoid NUMA remote memory access, size VM memory equal to or less than the memory per NUMA node if possible
  – Utilize ESXi virtual NUMA features (especially for wide VMs)
• Use large pages in the guest – start SQL Server with trace flag -T834
• Enable the Lock Pages in Memory right for the SQL Server service account
• Use max server memory and min server memory when running multiple instances of SQL Server in the same VM
• Disable unnecessary processes within Windows
CPU
CPU Sizing Considerations
• Understand the existing workload, average and peak
• Properly manage pCPU allocation
  – For Tier 1 workloads, avoid pCPU overcommitment
  – For lower-tiered database workloads:
    • Reasonable overcommitment can increase aggregate throughput and maximize license savings – the consolidation ratio varies depending on workloads
    • Leverage vMotion and DRS for resource load balancing
  – Monitor to optimize (see the sketch after this list)
    • Host level – %RDY, %MLMTD, and %CSTP
    • Virtual machine level – processor queue length
• Keep the NUMA node size in mind
  – For smaller virtual machines, try to stay inside a NUMA node if possible
  – For wide virtual machines (vSphere 5.x):
    • Align vCPUs to physical NUMA boundaries
    • Enable vNUMA on the vSphere host to allow SQL Server NUMA optimization
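A sketch of pulling %RDY from vCenter with PowerCLI (the VM name is a placeholder); the realtime cpu.ready.summation counter reports milliseconds of ready time per 20-second sample, so dividing by 200 yields a percentage:

```powershell
# Last ~4 minutes of CPU ready time for a SQL Server VM.
Get-Stat -Entity (Get-VM -Name 'sql-vm01') -Stat 'cpu.ready.summation' `
         -Realtime -MaxSamples 12 |
    Select-Object Timestamp,
        @{Name = 'PctReady'; Expression = { [math]::Round($_.Value / 200, 2) }}
```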
Processor – Putting It All Together
• Leverage hardware-assisted virtualization (enabled by default)
• Consider average and peak utilization
• Be aware of hyper-threading: a hyper-thread does not provide the full power of a physical core
• Consider future growth of the system; sufficient headroom should be reserved
• In high-performance environments, consider adding hosts when average host CPU utilization exceeds 65%
• Consider increasing CPU resources if guest VM CPU utilization is above 65% on average
• Ensure power saving features are “OFF”
• Use vCOPs for consumption and capacity monitoring
Consolidating Multiple Workloads
Consolidation Options
• Scale-up approach: multiple databases or SQL instances per virtual machine
  – Fewer virtual machines
  – Poor workload management
  – Potential reduction in SQL licensing cost
• Scale-out approach: single database per VM
  – Potential increase in management overhead
  – Better isolation/performance
  – Easier security and change management
  – DRS more effective with smaller VMs
  – Faster migration (vMotion)
OLTP vs. Batch Workloads
• OLTP workload (avg. 15%) – what this says:
  – Average 15% utilization
  – Moderate sustained activity (around 28% during working hours, 8am-6pm)
  – Minimal activity during non-working hours
  – Peak utilization of 58%
• Batch workload (avg. 15%) – what this says:
  – Average 15% utilization
  – Very quiet during the working day (less than 8% utilization)
  – Heavy activity during 1am-4am, with avg. 73% and peak 95%
OLTP vs. Batch Workloads
• What this means (OLTP/batch combined workload):
  – Better server utilization
  – Improved consolidation ratios
  – Less equipment to patch, service, etc.
  – Saves money / less licensing
Running with Mixed SQL Server Workloads
• Consider workload characteristics, and manage pCPU overcommitment as a function of typical utilization
  – OLTP workloads can be stacked up to a sustained utilization level
  – OLTP workloads that see high usage during the day and batch workloads that run during off-peak hours mix well together
  – Batch/ETL workloads with different peak periods mix well together
• Consider operational history, such as month-end and quarter-end
  – Additional virtual machines can be added to handle peak periods at month-end, quarter-end, and year-end, if scale-out is a possibility
  – CPU and memory hot-add can be used to handle workload peaks
  – Reduce virtual machine density, or add more hosts to the cluster
• Use DRS as your insurance policy, but don't rely on it for resource planning
SQL Server Availability
Business-Level Approach
• What are you trying to protect?
– i.e. What does the business care about protecting?
• What are your RTO/RPO requirements?
• What is your Service Level Agreement (SLA)?
• How will you test and verify your solution?
vSphere 5 Availability Features
• vSphere vMotion
  – Can reduce virtual machine planned downtime
  – Relocate SQL Server VMs without end-user interruption
  – Perform host maintenance any time of the day
• vSphere DRS
  – Monitors the state of virtual machine resource usage
  – Can automatically and intelligently place virtual machines
  – Can create a dynamically balanced SQL deployment
• VMware vSphere High Availability (HA)
  – Does not require Microsoft Cluster Server
  – Uses VMware host clusters
  – Automatically restarts a failed SQL virtual machine in minutes
  – Heartbeat detects hung virtual machines
  – Application HA can provide availability at the SQL Server service level!
VMware Support for Microsoft Clustering on vSphere
• Non-shared disk configurations (Network Load Balance, Exchange CCR, Exchange DAG, SQL AlwaysOn Availability Group) are supported on vSphere just like on physical:
  – VMware HA: yes* | vMotion/DRS: yes | Storage vMotion: yes
  – Node limits: same as the OS/application
  – Storage protocols: no shared-disk restrictions
• Shared disk configurations (MSCS with shared disk, Exchange Single Copy Cluster, SQL clustering, SQL AlwaysOn Failover Cluster Instance) are supported on vSphere with additional considerations for storage protocols and disk configurations:
  – VMware HA: yes* | vMotion/DRS: no | Storage vMotion: no
  – MSCS node limits: 2 (5 on vSphere 5.1 only)
  – Storage protocols: FC, FCoE, and in-guest iSCSI are supported with version-specific caveats – see the KB article below for the full protocol matrix
  – Shared disks: RDM or VMFS**
* Use affinity/anti-affinity rules when using vSphere HA
** RDMs required in “Cluster-across-Boxes” (CAB) configurations; VMFS required in “Cluster-in-a-Box” (CIB) configurations
VMware Knowledge Base article: http://kb.vmware.com/kb/1037959
Shared Disk Clustering (Failover Clustering and AlwaysOn FCI)
• Provides application high-availability through a shared-disk architecture
• One copy of the data, rely on storage technology to provide data redundancy
• Automatic failover for any application or user
• Suffers from restrictions in storage and VMware configuration
vSphere HA with Shared Disk Clustering
• Supports up to a five-node cluster in vSphere 5.1 and above
• Failover cluster nodes can be physical or virtual or any combination of the two
• Host-attach (FC), FCoE, or in-guest (iSCSI)
• Supports RDM only
• vSphere HA + failover clustering
  – Seamless integration; virtual machines rejoin the clustering session after vSphere HA recovery
  – Can shorten the time that the database is in an unprotected state
  – Use DRS affinity/anti-affinity rules to avoid running cluster virtual machines on the same host
• Failover clustering is supported with vSphere HA as of vSphere 4.1 (http://kb.vmware.com/kb/1037959)
Non-Shared Disk Clustering (AlwaysOn Availability Groups)
• Database-level replication over IP; no shared storage requirement
• Same advantages as failover clustering (service availability, patching, etc.)
• Readable secondary
• Automatic or manual failover through WSFC policies
vSphere HA with AlwaysOn Availability Groups
• Seamless integration
• Protects against hardware/software failure
• Supports multiple secondaries and readable secondaries
• Provides local and remote availability
• Full feature compatibility with availability groups
• VMware HA shortens the time that the database is in an unprotected state
• DRS anti-affinity rules avoid running virtual machines on the same host
EMC study – SQL Server AlwaysOn running vSphere 5 and EMC FAST VP:
http://www.emc.com/collateral/hardware/white-papers/h10507-mission-critical-sql-server-2012.pdf
WSFC – Cluster Validation Wizard
• Use this to validate support for your configuration
  – Required by Microsoft Support as a condition of support for YOUR configuration
• Run this before installing the AAG (AlwaysOn Availability Group), and every time you make changes
  – Save the resulting HTML reports for reference
• If running non-symmetrical storage, hotfixes may be required
  – http://msdn.microsoft.com/en-us/library/ff878487(SQL.110).aspx#SystemReqsForAOAG
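A sketch of scripting the validation run with the FailoverClusters PowerShell module; the node names and report path are placeholders:

```powershell
# Run cluster validation before building the availability group and after
# every change; keep the HTML report it writes for reference.
Import-Module FailoverClusters
Test-Cluster -Node 'sqlaag-n1', 'sqlaag-n2' -ReportName 'C:\Reports\aag-validation'
```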
Patching Non-Clustered Databases
• Benefits
  – No need to deploy an MS cluster simply for patching/upgrading the OS and database
  – Ability to test in a controlled manner (multiple times if needed)
  – Minimal impact to the production site until OS patching is completed and tested
  – Patching of the secondary VM can occur during regular business hours
• Requires you to lay out VMDKs correctly to support this scenario
Resources
• Visit us on the web to learn more about specific apps
  – http://www.vmware.com/solutions/business-critical-apps/
  – Specific page for each major app
  – Includes best practices and design/sizing information
• Visit our Business Critical Applications blog
  – http://blogs.vmware.com/apps/
• New RDBMS books from VMware Press (vmwarepress.com)
  – http://www.pearsonitcertification.com/store/virtualizing-oracle-databases-on-vsphere-9780133570182
  – http://www.pearsonitcertification.com/store/virtualizing-sql-server-with-vmware-doing-it-right-9780321927750
Questions?
Thank You
Fill out a survey – every completed survey is entered into a drawing for a $25 VMware company store gift certificate
VAPP2979
Advanced SQL Server on vSphere Techniques and Best Practices
Scott Salyer, VMware, Inc.
Jeff Szastak, VMware, Inc.