Recipient Name - Rhode Island Housing

REQUEST FOR PROPOSALS
INTRODUCTION
Through this Request for Proposals (“RFP”), Rhode Island Housing seeks proposals from
qualified firms with expertise to perform remediation and IT Disaster Recovery. Our
production site consists of a FAS 2040, three ESX hosts, one VC and about 50 VMs; our
Disaster Recovery site consists of a FAS 2040, three ESX hosts, one VC and over 50 VMs. The strategy is to
ensure data from our Storage Area Network (SAN) and applications are replicated, stored
and quickly accessible should Rhode Island Housing experience a disaster. Our primary data
center is located at 44 Washington Street, Providence RI, 02903. The Disaster Recovery
site will be located at 1 Federal Street, Springfield, Massachusetts. The two sites are connected
by a 100MB fiber connection.
INSTRUCTIONS
The proposal should be submitted to [email protected] no later than 5:00
PM May 4, 2012. Proposals should be presented on electronic business letterhead.
Respondents are advised that all submissions (including those not selected for engagement)
may be made available to the public on request upon completion of the process and award
of a contract(s). Accordingly, any information included in the proposal that the respondent
believes to be proprietary or confidential should be clearly identified as such.
SCOPE OF WORK
Please see Attachment A.
ITEMS TO BE INCLUDED WITH YOUR PROPOSAL
A. General Firm Information
1. Provide a brief description of your firm, including but not limited to the
following:
a. Name of the principal(s) of the firm.
b. Name, telephone number and email address of a representative of the
firm authorized to discuss your proposal.
c. Address of all offices of the firm.
d. Number of employees of the firm.
B. Experience and Resources:
1. Describe your firm and its capabilities. In particular, support your capacity to
perform the Scope of Work.
2. Indicate which principals and associates from your firm would be involved in
providing services to Rhode Island Housing. Provide appropriate background
information for each such person and identify his or her responsibilities.
3. Provide a detailed list of references including a contact name and telephone
number for organizations or businesses for whom you have performed similar
work.
4. Identify any conflict of interest that may arise as a result of business activities or
ventures by your firm and associates of your firm, employees, or subcontractors
as a result of any individual’s status as a member of the board of directors of any
organization likely to interact with Rhode Island Housing.
5. Identify any material litigation, administrative proceedings or investigations in
which your firm is currently involved. Identify any material litigation,
administrative proceedings or investigations, to which your firm or any of its
principals, partners, associates, subcontractors or support staff was a party, that
has been settled within the past two (2) years.
6. Describe how your firm will handle actual and/or potential conflicts of interest.
C. Fee Structure: Fixed fee based project.
The cost of services is one of the factors that will be considered in awarding this
contract. The information requested in this section is required to support the
reasonableness of your fees.
1. Please provide a cost proposal for the consulting work described in Attachment A.
2. Please provide any other fee information applicable to the engagement that has
not been previously covered that you wish to bring to the attention of Rhode
Island Housing.
D. Miscellaneous
1. Rhode Island Housing encourages the participation of persons of color, women,
persons with disabilities and members of other federally and State-protected
classes. Describe your firm’s affirmative action program and activities. Include
the number and percentage of members of federally and State-protected classes
who are either principals or senior managers in your firm, the number and
percentage of members of federally and State-protected classes in your firm who
will work on Rhode Island Housing’s engagement and, if applicable, a copy of
your Minority- or Women-Owned Business Enterprise state certification.
2. Discuss any topics not covered in this Request for Proposals that you would
like to bring to Rhode Island Housing’s attention.
E. Certifications
1. Rhode Island Housing insists upon full compliance with Chapter 27 of Title 17
of the Rhode Island General Laws, Reporting of Political Contributions by State
Vendors. This law requires State Vendors entering into contracts to provide
services to an agency such as Rhode Island Housing, for the aggregate sum of
$5,000 or more, to file an affidavit with the State Board of Elections concerning
reportable political contributions. The affidavit must state whether the State
Vendor (and any related parties as defined in the law) has, within 24 months
preceding the date of the contract, contributed an aggregate amount in excess of
$250 within a calendar year to any general officer, any candidate for general
office, or any political party.
2. Does any Rhode Island “Major State Decision-maker,” as defined below, or the
spouse or dependent child of such person, hold (i) a ten percent or greater equity
interest, or (ii) a Five Thousand Dollar or greater cash interest in this business?
For purposes of this question, “Major State Decision-maker” means:
(i) All general officers; and all executive or administrative head or heads of
any state executive agency enumerated in § 42-6-1 as well as the executive
or administrative head or heads of state quasi-public corporations, whether
appointed or serving as an employee. The phrase “executive or
administrative head or heads” shall include anyone serving in the positions
of director, executive director, deputy director, assistant director, executive
counsel or chief of staff;
(ii) All members of the general assembly and the executive or
administrative head or heads of a state legislative agency, whether
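appointed or serving as an employee. The phrase “executive or
administrative head or heads” shall include anyone serving in the positions
of director, executive director, deputy director, assistant director, executive
counsel or chief of staff;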
(iii) All members of the state judiciary and all state magistrates and the
executive or administrative head or heads of a state judicial agency,
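whether appointed or serving as an employee. The phrase “executive or
administrative head or heads” shall include anyone serving in the positions
of director, executive director, deputy director, assistant director, executive
counsel, chief of staff or state court administrator.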
If your answer is “Yes,” please identify the Major State Decision-maker, specify
the nature of their ownership interest, and provide a copy of the annual financial
disclosure required to be filed with the Rhode Island Ethics Commission
pursuant to R.I.G.L. §§36-14-16, 17 and 18.
3. Please include a letter from your president, chairman or CEO certifying that (i)
no member of your firm has made inquiries or contacts with respect to this
Request for Proposals other than in an email or written communication to
[email protected] to seek clarification of the Scope of Work
set forth in this proposal, from the date of this RFP through the date of your
proposal, (ii) no member of your firm will make any such inquiry or contact until
after June 1, 2012, (iii) all information in your proposal is true and correct to the
best of her/his knowledge, (iv) no member of your firm gave anything of
monetary value or promise of future employment to a Rhode Island Housing
employee or Commissioner, or a relative of the same, based on any
understanding that such person’s action or judgment will be influenced and (v)
your firm is in full compliance with Chapter 27 of Title 17 of the Rhode Island
General Laws, Reporting of Political Contributions by State Vendors.
EVALUATION AND SELECTION
A selection committee consisting of Rhode Island Housing employees (the “Committee”)
will review all proposals and make a determination based on the following factors:
• Professional capacity to undertake the scope of work
• Proposed fee structure (must be fixed fee)
• Ability to perform within time and budget constraints
• Evaluation of potential work plans
• Previous work experience and performance with Rhode Island Housing and/or
similar organizations
• Recommendations by references
• Firm minority status and affirmative action program or activities
• The timeline for completion of remediation and disaster recovery implementation
• Other pertinent information submitted
• Having the skill set and experience to perform all aspects of remediation and
configuration for Disaster Recovery
• Holding the skill set and the expertise to complete production failover to the Disaster
Recovery site and back to the Production site within an acceptable time frame (15
minutes or less), and to provide High Availability for selected applications
• Ability to deliver the final solution by June 15, 2012
• Meeting and fulfilling the requirements outlined in the check list (Attachment B)
Rhode Island Housing may invite one or more finalists to make presentations.
In its sole discretion, Rhode Island Housing may negotiate with one or more firms who
have submitted qualifications to submit more detailed proposals on specific projects as they
arise.
By this Request for Proposals, Rhode Island Housing has not committed itself to undertake
the work set forth. Rhode Island Housing reserves the right to reject any and all proposals,
to rebid the original or amended scope of services and to enter into negotiations with one
or more respondents. Rhode Island Housing reserves the right to make those decisions
after receipt of responses. Rhode Island Housing’s decision on these matters is final.
For additional information contact:
Abdel El idrissi
401-457-1121
[email protected].
Together with its partners, Rhode Island Housing works to ensure that all people who live
and work in Rhode Island can afford a healthy, attractive home that meets their needs.
Rhode Island Housing uses all of its resources to provide low-interest loans, grants,
education and assistance to help Rhode Islanders find, rent, buy, build and keep a good
home. Created by the General Assembly in 1973, Rhode Island Housing is a privately
funded public purpose corporation.
Attachment A
Scope of Work
General:
Rhode Island Housing is seeking proposals from vendors to perform a comprehensive
Remediation and IT Disaster Recovery implementation. The selected vendor must be
qualified in our primary technologies: NetApp SAN, VMware, HP servers, Cisco switches
and Juniper firewalls.
The work to be completed includes two environments: a production environment in
Providence, RI and our recovery site located in Springfield, MA.
Each technical environment consists of a FAS 2040, vSphere 4 and all related dependencies
and subset components. This fixed-price bid is to include professional services and associated
expenses (travel accommodations, after-hours and weekend work for both Providence and
the co-location in Springfield, MA).
During the Remediation and IT Disaster Recovery Phase the selected vendor will address
remediation of both environments, software upgrades, connectivity, firewall configuration,
backups, replication, internet bandwidth and all aspects of disaster recovery, including
failover and testing (from production to the disaster recovery site and back to production).
Technical Requirements:
Rhode Island Housing requires the senior engineers carrying out the implementation to be
certified and have extensive working knowledge with proven track records for these types
of implementations:
1. NetApp (all modules)
2. VMware (all modules)
3. Cisco/Juniper (switches, routers and firewalls)
4. Disaster and Recovery
5. Linux & Windows 2003/2008/SQL 2005 (working knowledge)
6. Novell/OES/GroupWise (working knowledge)
7. Symantec Backup
Proposer must apply best practices at all times and must spell out the step-by-step methodology
by which the remediation, disaster recovery connectivity testing and full disaster recovery
will be conducted.
Systems Remediation and Disaster Recovery Implementation:
Proposer is expected to perform a comprehensive remediation and disaster recovery,
addressing each of the items identified as needing remediation in the Disaster Recovery
Assessment - Phase 1, in accordance with and as supplemented by the recommendations
contained in the VMware Health Check - Production Report and VMware Health Check -
Disaster Recovery Report, as appropriate to the environment. The Rhode Island Housing
Phase I Design diagram sets forth our illustration of our anticipated connectivity
configuration. These documents are provided as Exhibits 1-4 to this RFP and are referred to
as the Documents.
In carrying out the work specified in Documents 1-4, the proposer must ensure that
each of the following criteria is addressed and satisfied:
• The proposed design must have built-in redundancy so that no single point of failure
will result in an outage or downtime
• Zoning speed should be 8 GB
• Consultant should ensure that effective Remote Management features exist in the
solution so issues can be addressed remotely, eliminating frequent visits to the
co-location
• The proposed design should ensure that all VMs, without exception, are
replicated to disaster recovery and have full disaster recovery functionality
• Consultant must achieve real-time replication or close to it (less than 1/2 hour) and,
in addition, provide high availability for selected systems and applications
• Disaster recovery site testing must include all aspects of Rhode Island Housing’s
normal operation: accessing applications, file servers, email, databases, printing and
VPN connections
• The proposed design must address VPN connection for users and site to site
• As part of the disaster recovery design, users should have the ability to perform normal
functions by either connecting directly from Providence to Springfield using their
desktops or through the use of laptops with VPN
• Consultant will need to use wire management and label all data center equipment,
including wires
• After remediation, a Peer Assessment (Health Check assessment for VMware,
NetApp, etc.) has to take place prior to any disaster recovery test. The assessment has
to be done by an engineer not involved in the remediation project. The result must
meet the recommendations in Exhibits 1-4; only then can the consultant start
disaster recovery configuration and testing
• Setup and configuration of data collection tools to determine the rate of change for
volumes being replicated to the Disaster Recovery site
• The selected vendor will install the most current software releases, and all hardware
must be tested and operational. Backup and restore processes must be verified and
tested, and must meet requirements (see Exhibits 1-4 for details)
• Backup solution must address a full backup solution: disk to disk to tape of 10 TB
with a backup time of 8 hours or less
• Reconfigure Symantec and work on resolving speed issues (no drives will be
purchased)
• Consultant will be responsible for the security, the moving, and the installation and
configuration of all disaster recovery equipment at the Springfield co-location
• Proposer will have to conduct a detailed design review with Rhode Island Housing
staff to validate that the design decisions meet the requirements
• Proposer must describe, in detail, their ability to provide a solution for each of the
line items described in Exhibit 1. A non-response will be equivalent to “no solution
available” for the specification. It is the vendor’s responsibility to correctly correlate
their response to each item
• It is the proposer’s responsibility to remediate all existing issues or complications
arising from the proposed design
• Prior to starting work on the project, consultant must perform an inventory, verify
equipment and software quantities, and confirm that everything is available to start
the project. Provide a list of any additional equipment or software required to
complete this project
• Consultant is responsible for making connectivity configurations for disaster
recovery. Oshean, our disaster recovery co-location partner, and SecureWorks,
our firewall provider, are there to fulfill and facilitate firewall changes
If proposer believes that any of the above requirements are in conflict with the
Documents they must identify the conflict in the response to this RFP and set
specifically
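The backup window and replication targets listed above imply minimum sustained transfer rates. The following is a minimal Python sketch of that arithmetic, not a sizing tool. It assumes decimal units (1 TB = 10^12 bytes) and assumes the "100MB" inter-site link means 100 megabits per second; the function names and the 70% link efficiency figure are illustrative and do not come from this RFP.

```python
# Sizing arithmetic for the backup window and the replication target.
# Assumptions (not from the RFP): decimal units, the inter-site link runs at
# 100 megabits per second, and all function names are illustrative only.

def required_backup_rate_mb_s(data_tb: float, window_hours: float) -> float:
    """Sustained rate in MB/s needed to move data_tb within window_hours."""
    total_bytes = data_tb * 1e12
    window_seconds = window_hours * 3600
    return total_bytes / window_seconds / 1e6


def max_change_per_window_gb(link_mbit_s: float, window_minutes: float,
                             efficiency: float = 1.0) -> float:
    """Largest amount of changed data (GB) the link can ship in one window."""
    bytes_per_second = link_mbit_s * 1e6 / 8 * efficiency
    return bytes_per_second * window_minutes * 60 / 1e9


if __name__ == "__main__":
    # 10 TB in 8 hours, as stated in the backup requirement.
    print(f"Backup rate needed: {required_backup_rate_mb_s(10, 8):.1f} MB/s")
    # 30-minute replication target over the assumed 100 Mbit/s link.
    print(f"Max change per 30 min: {max_change_per_window_gb(100, 30):.1f} GB")
    # Same link at an assumed 70% usable efficiency.
    print(f"At 70% efficiency: {max_change_per_window_gb(100, 30, 0.7):.2f} GB")
```

If the rate of change measured by the data collection tools exceeds the per-window figure, the 30-minute target would not be reachable on that link without reducing the data sent or increasing the bandwidth.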
Summary of key Deliverables
• Completion of all remediation work set forth in the Documents, in accordance with
the requirements set forth above
• A successful full disaster recovery test that meets Rhode Island Housing’s
requirements set forth in this RFP and the Documents
• Delivery of a detailed test report that shows successful implementation of
remediation work
• Delivery of full documentation that outlines step by step how to declare, execute
and recover back to Production
• Delivery of an as-built document that encompasses the physical and logical layout of
Production and Disaster Recovery and includes Visio diagrams
• Delivery of training and a knowledge transfer document
Inquiries and Communication
All inquiries and other communications with respect to this RFP are to be directed ONLY
to the following email address: [email protected]
Attachment B
Check List
This form needs to be returned with your proposal to confirm your understanding of the
assignment and agreement to provide the required services.
___ 1. This is a Remediation and IT Disaster Recovery Phase; the outcome is a full
disaster recovery within the parameters outlined in Attachment A and Exhibits 1-4.
___ 2. The proposed fee must be fixed for the duration of the project and without
exclusions.
___ 3. Project shall commence immediately after the execution of an agreement, and the
completion date must be no later than June 15th, 2012.
___ 4. This type of project will need to have at least 2 to 3 engineers on the premises,
engaged and committed to the project.
___ 5. The full solution implementation must be completed within 4 to 5 weeks from the
signing of the agreement.
___ 6. Proposer must have at least five (5) years’ experience installing systems and a list of
locally completed projects (3 minimum) equivalent in size and system type to this
project. Consultant must provide contact names, telephone numbers and dates of
completion.
___ 7. If Proposer cannot supply a list based on the above criteria, provide a list of five
regional system installations that are of the same manufactured type and model.
(Include contact names, telephone numbers, dates of completion and size of
systems.)
___ 8. Proposer shall provide a list of all qualified engineers and project managers to be
assigned to this project, including relevant training programs completed by each,
and years of related experience in area of expertise.
___ 9. Provide a detailed connectivity for disaster recovery and testing methodology.
___ 10. Provide a detailed project plan with a timetable of completion for each component
addressed in Exhibit 1 and the bullet points in Attachment A, and apply the best practices set
forth in Exhibits 3-4.
___ 11. Commitment to successfully address every item in Attachments A and B, and
implement all the recommendations contained in the health check assessment
Documents (Exhibits 1-3).
___ 12. Proposer has to show the physical and logical layout of their disaster recovery design.
___ 13. Backup solution must be spelled out in great detail in your response to this RFP.
___ 14. Commitment to visiting the co-location prior to starting the project; the goal of
the visit is to become familiar with the site as well as to check equipment, fiber line, phone
line and rack.
Exhibit 1
Disaster Recovery
Assessment
Phase 1 – A Consultative Document on
Remediation and Design
Rhode Island Housing – Phase 1 - Disaster Recovery Assessment
Executive Summary
Overview
Phase 1 – Deliverables
Assessment Summary
Components Analyzed in this report – Production and DR Environments
Detailed Report – Health Check Audit List
Conduct an end-to-end overview of the whole environment
Review overall utilization and performance metrics of virtual environment
Analyze auto-support messages and Syslogs
Verification of multi-pathing on all hosts
Review of cabling
Explore licenses in hand versus what is needed to accomplish disaster recovery
Assess SAN switches & Fiber Channel zoning
Routing, connectivity, bandwidth, firewalls, site to site replication
Aggregate configuration laid out for best performance
Security risks and exposures
Switch traffic separation (i.e. console 2, vMotion 2, Network 4... )
Time servers for production and disaster recovery
DNS & DHCP redundancy for both environments
Spindle count per aggregate
Space usage and hot spares
NetApp ALUA
NetApp BMC
VMware DRS
High Availability
VMware DPM
VLAN/Tagging
VMware SRM
VMware Thin Provisioning
NetApp Aggregates and volumes
VMotion
Data Protection
Storage Networking
NetApp Deduplication
NetApp SnapShot
NetApp FlexClone
NetApp Snap Vault
Monitoring & Management
NetApp SnapManager
NetApp Operation Manager
NetApp Protection Manager
NetApp Provisioning Manager
Backup and Recovery
Verify all software licensing for NetApp/VMware/Backup/Disaster Recovery
Explore NetApp/VMware/HP software and firmware upgradability
VMware Ports configurations and binding
Error logs for both environments
Address configuration for graceful shutdown of the environment during power loss (Symmetra LX, 16KVA tower)
Opportunities to optimize configurations and improve performance
Roadmap for software, configuration enhancement and upgrades
Coordination regarding Disaster Recovery
Coordination with Oshean
Coordination with SecureWorks
Procurement Needed
Logical and Physical Layout Visio Diagrams
Executive Summary
Overview
Rhode Island Housing (RIH) has engaged Vendor Company for Phase 1 of the Disaster
Recovery Site Implementation Project. Phase 1, at its highest level, is to provide an
assessment and analysis of RIH’s current production and future disaster recovery
environments (currently staged at the main datacenter in Providence) in order to facilitate the
Disaster Recovery Site Implementation Project, known as Phase 2. Phase 2, at its highest
level, includes moving the Disaster Recovery Environment to Oshean’s datacenter and
configuring it for proper disaster recovery functionality.
The purpose of Phase 1 is twofold. First is to identify any current issues with production
and disaster recovery environments that would prevent proper functionality of disaster
recovery. These issues will be defined as necessary remediation steps for Phase 2. Second
is to provide a design and any other necessary information for the implementation portion
of Phase 2, which is configuring and moving the Disaster Recovery Environment to
Oshean’s datacenter.
Phase 1 – Deliverables
The deliverables in this assessment report include the following items as detailed in the
RIH document, Attachment A: Scope of Work, provided in this report as Appendix A.
• Assessment Summary
• Detailed Report – Health Check Audit List
• Detailed Report – VMware Health Check
• Coordination regarding Disaster Recovery
• As Built Document – Logical and Physical Layout Visio Diagrams
• Report addressing opportunities to optimize configurations and improve
performance
• Report providing roadmap for software, configuration enhancement and upgrades
Assessment Summary
The Assessment of RIH’s production and disaster recovery environments was performed
by analyzing those components of both environments in order to determine current health
and any requirements needed to facilitate the transition to Phase 2. This summary will
identify the components (hardware and software) that were analyzed and the analysis
method used, which yielded the data for all deliverables in this project. In the reports
below you will find the specific audit list items (as determined by Attachment A), the
components included in the analysis, and the recommendation for Phase 2.
Components Analyzed in this report – Production and DR Environments
• NetApp FAS2040 w/ dual controllers and DS4243 Disk Shelf
o Analysis Tool – Vendor Company Engineer Manual Analysis and NetApp
Operations Manager
• VMware – vSphere 4.1 – Vendor Company Engineer Manual Analysis and VMware
Health Check Analyzer
• Juniper SNG w/ HA – Dell SecureWorks Engineer Analysis and Vendor Company
Engineer Manual Analysis
• Cisco 3750s & 9124mds – Vendor Company Engineer Manual Analysis
• Cabling (Fiber and Ethernet) – Vendor Company Engineer Manual Analysis
• Bandwidth – Vendor Company Engineer, Oshean Engineer and Dell SecureWorks
Engineer Manual Analysis
• DNS / DHCP – Vendor Company Engineer Manual Analysis
• Backups – Symantec, HP StorageWorks Library, NetApp SnapManager – Vendor
Company Engineer Manual Analysis
Based on all the items analyzed in the Components section above, there are many items
that require remediation. Recommended remediation will be detailed in the 2 sections
below (Detailed Report). In addition, any items of note that are important for the delivery
of Phase 2 will be detailed in the 2 sections below as Design Notes.
Detailed Report – Health Check Audit List
The list below was pulled from Attachment A (see Appendix) as developed by RIH and was
used as a guide by the Vendor Company Engineer to provide thorough analysis. This
section defines each specific audit item and then details both the necessary
remediation steps and design notes, which are required as deliverables in Phase 2.
Conduct an end-to-end overview of the whole environment
Necessary Remediation –
Remediation is defined in individual items of audit list below.
Design Notes –
See design notes of each individual audit list item below in addition to reports in this
document around Phase 2 coordination, optimization opportunity, upgrade roadmap and
Visio diagrams.
Review overall utilization and performance metrics of virtual environment
Necessary Remediation –
VMware – Upgrade ESX to vSphere 5 in both Production and DR and upgrade each virtual
machine hardware version to 8.0.
Design Notes –
VMware – qty three (3) ESX hosts (per environment) with minimal load on CPU and
memory
HP ProLiant DL380 G7 – qty three (3) (per environment) with 196GB RAM each
Analyze auto-support messages and Syslogs
NetApp – Syslog messages indicated that system time was wrong across all controllers on
both filers (DR and Production). Recommendation is to configure internet time server.
Design Notes –
Verify time servers are syncing with NetApp (all controllers) and providing proper time.
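The verification step above can be sketched as a comparison of each system's reported time against a trusted reference. This is a minimal illustration only: the system names, timestamps and the five-second tolerance are assumptions rather than audit values, and collecting the readings from the controllers is left out.

```python
from datetime import datetime, timedelta


def systems_out_of_sync(reference, readings, tolerance=timedelta(seconds=5)):
    """Return names of systems whose clock differs from the reference by more than the tolerance."""
    return sorted(
        name
        for name, reported in readings.items()
        if abs(reported - reference) > tolerance
    )


# Hypothetical readings, all taken at the same instant as the reference time.
reference = datetime(2012, 3, 1, 12, 0, 0)
readings = {
    "prod-controller-a": datetime(2012, 3, 1, 12, 0, 2),
    "prod-controller-b": datetime(2012, 3, 1, 11, 52, 40),
    "dr-controller-a": datetime(2012, 3, 1, 12, 6, 15),
}
print(systems_out_of_sync(reference, readings))
```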
Verification of multi-pathing on all hosts
VMware – Multi-pathing is properly configured for all datastores, however three (3) Raw
Device Mappings exist in Production environment. Recommendation is to migrate RDM
data to VMFS datastores and ensure multi-pathing is correct.
Design Notes –
Multi-pathing exists for all VMFS datastores – mapped over Fiber Channel paths (four (4)
paths each). Verify number of paths is correct for those RDMs which were migrated to new
VMFS datastores in remediation.
Review of cabling
Fiber – Recommendation is to chase all fiber connections from Production ESX hosts and
verify that each HBA interface on HP server is patched into disparate fiber switch and
configured properly in fiber switch to accomplish no single point of failure.
Ethernet - Recommendation is to chase all Production ESX Ethernet adapters (HP servers)
and verify they are patched into disparate switching. Also recommended is to ensure PCI
and Onboard NICS are leveraged in redundancy across VMware vswitches. Each vswitch
should have both a PCI and Onboard NIC assigned to it.
Design Notes –
In order to accomplish Ethernet redundancy at both Production and DR facilities, a qty of
one (1) Cisco 3750 24-port switch must be procured.
Explore licenses in hand versus what is needed to accomplish disaster recovery
NetApp – Install, configure and test FlexClone license for both controllers on DR and
Production SAN.
Juniper – Need to coordinate setup, configure and test of IPS and IDS with Dell
SecureWorks.
Design Notes –
NetApp – Procure qty four (4) (two (2) controllers X 2 SANs) FlexClone licenses.
Juniper – procure licensing for IDS and IPS from Dell SecureWorks.
Assess SAN switches & Fiber Channel zoning
None noted.
Ensure running at 8gb for all fiber
Design Notes –
Fiber Zoning – is appropriate – details below
a. Prod A – 1 VSAN with 15 Zones
b. Prod B – 1 VSAN with 70 Zones
c. DR A – 1 VSAN with 15 Zones
d. DR B – 1 VSAN with 30 Zones
Routing, connectivity, bandwidth, firewalls, site to site replication
VMware – SRM is licensed but is not configured. Recommendation is to configure SRM and
ensure replication works.
NetApp – FlexClone needs to be licensed.
NetApp SnapMirror – No replication schedule set. Need to set a replication schedule on all
controllers.
Prod Controller A - replication not set for volumes datastore05, 07, 08.
Prod Controller B - replication not set for volumes datastore17, 18, 19.
DR Controller A - destination not set for datastore01, 02, 03, 04, 05, 06.
DR Controller B - destination not set for datastore11, 12, 13, 14, 15, 16.
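The SnapMirror gaps listed above can be expanded into a per-volume remediation checklist. The sketch below only transcribes the volume numbers from the findings; the data structure and function name are illustrative assumptions and it does not talk to the filers.

```python
# Volume numbers per controller, transcribed from the findings above:
# missing replication schedule (Prod) or missing destination (DR).
MISSING_REPLICATION = {
    "Prod Controller A": [5, 7, 8],
    "Prod Controller B": [17, 18, 19],
    "DR Controller A": [1, 2, 3, 4, 5, 6],
    "DR Controller B": [11, 12, 13, 14, 15, 16],
}


def remediation_checklist(missing):
    """Expand the shorthand volume numbers into (controller, volume) pairs."""
    return [
        (controller, f"datastore{number:02d}")
        for controller, numbers in missing.items()
        for number in numbers
    ]


checklist = remediation_checklist(MISSING_REPLICATION)
for controller, volume in checklist:
    print(f"[ ] {controller}: {volume}")
```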
Design Notes –
VMware – Configure SRM.
NetApp – Configure FlexClone.
Bandwidth is sufficient for requirements – 100mb fiber provided by Oshean.
Firewalls – Production site has Juniper SNG (managed by Dell Secureworks) with High
Availability. Routing is in place for DR site subnets. DR site has a Cisco ASA provided by Dell
Secureworks. Vendor will need to create and send logical firewall configuration for DR site
(form provided by Dell) to SecureWorks.
Aggregate configuration laid out for best performance
NetApp Plug-in for ESX – MBR Tools not installed. Recommendation is to install, configure
and test NetApp MBR Tools.
Design Notes –
This ESX console-based tool tests and aligns guest file systems on a VMDK for VMFS and
NFS datastores. Aligning the file system block boundaries to the underlying NetApp storage
system LUN ensures the best storage performance. The data is migrated from a backup of
the original -flat.vmdk file to a new, properly aligned -flat.vmdk file.
Security risks and exposures
NetApp – CIFS and NFS are running on both SANs but are not used. Recommendation is to
turn off both services.
NetApp - Create users for delegated tasks (administrative) and configure auditing of
account access.
VMware – Discrepancies exist between security profiles on service consoles of all ESX
Hosts. Recommendation is to set security profiles to VMware default.
VMware - Create users for delegated tasks (administrative) and configure auditing of
account access.
Cisco – Vlans are configured on Cisco Core 3750 but ACLs are not being leveraged in any
capacity. Recommendation is to configure Access Groups only allowing necessary traffic
between production Vlans and management Vlans (those subnets used for service console,
vmkernel, NetApp Ethernet Management) and apply to the management Vlans.
Cisco - Vlan tagging is not being used in any capacity. Recommendation is to use tagging at
a minimum for the NetApp Management Vlan and the VMware Service Console Vlan.
Juniper – (previously stated in licensing section above) IPS and IDS are not licensed. Need
to procure these through SecureWorks and ensure they configure for proper intrusion
prevention and deep packet inspection.
Design Notes –
It will be required for the purposes of Phase 2 that vendor can supply an experienced Cisco
Engineer for the remediation of all Cisco items (ACL design and configuration).
Switch traffic separation (i.e. console 2, vMotion 2, Network 4... )
VMware – Prod and DR - vmotion, vmkernel and production port groups all share same
physical NICS and Virtual Switches are not separated by traffic type. Recommendation is to
segment vswitches by traffic type (vmkernel for vmotion, management, vm production).
Design Notes –
VMware – Prod and DR - each vswitch should have at least two (2) physical NICs attached.
Physical NICs uplinked to virtual switching should be on disparate PCI buses (i.e. 1 onboard
NIC and 1 pcie NIC to each vswitch).
General - Also note the requirement, in the Cisco items of the “Security risks and exposures”
section, to create separate vlans for management networks.
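The uplink rule above (at least two physical NICs per vswitch, spanning an onboard and a pcie adapter) can be checked against an inventory as sketched below. The vswitch and vmnic names and the inventory format are hypothetical; gathering the inventory from the ESX hosts is not shown.

```python
def vswitch_findings(vswitches):
    """Return problems for vswitches lacking redundant, mixed-bus uplinks.

    `vswitches` maps a vswitch name to a list of (nic_name, kind) tuples,
    where kind is "onboard" or "pcie".
    """
    findings = []
    for name, uplinks in sorted(vswitches.items()):
        kinds = {kind for _, kind in uplinks}
        if len(uplinks) < 2:
            findings.append(f"{name}: fewer than two physical NICs")
        elif not {"onboard", "pcie"} <= kinds:
            findings.append(f"{name}: uplinks do not span onboard and pcie NICs")
    return findings


# Hypothetical layout used only to exercise the check.
layout = {
    "vSwitch0": [("vmnic0", "onboard"), ("vmnic4", "pcie")],
    "vSwitch1": [("vmnic1", "onboard"), ("vmnic2", "onboard")],
    "vSwitch2": [("vmnic5", "pcie")],
}
for finding in vswitch_findings(layout):
    print(finding)
```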
Time servers for production and disaster recovery
NetApp – DR and Prod - current time server is set to vCenter. Recommendation is to add
internet time server.
VMware – Prod and DR – ESX hosts currently get their time from vCenter and DRVCenter
respectively. Recommendation is to add internet time server.
Design Notes –
NetApp and ESX should gain time from internet time servers. Virtual machines should be
configured to either get time from ESX host or from network time server (domain
controller).
DNS & DHCP redundancy for both environments
VMware – DR Hosts not configured with DNS servers. Configure DNS on these hosts.
VMware – DR – No DNS server currently for Disaster Recovery environment.
Recommendation is to build a VM with DNS configured to provide DNS to DR environment.
DHCP Prod – Currently one (1) DHCP server serves the scope for production systems.
Recommendation is to deploy a second DHCP server and configure for split scope.
Design Notes –
Production DNS servers are 192.168.2.230 and .231 and are AD Integrated.
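The split-scope recommendation for DHCP can be illustrated by dividing one address range between two servers. This is a sketch under stated assumptions: the report does not give the scope or the split ratio, so the 80/20 split and the 192.0.2.x documentation addresses below are placeholders.

```python
from ipaddress import IPv4Address


def split_scope(first, last, primary_percent=80):
    """Split an inclusive address range into (primary, secondary) sub-ranges."""
    start, end = int(IPv4Address(first)), int(IPv4Address(last))
    total = end - start + 1
    if total < 2 or not 0 < primary_percent < 100:
        raise ValueError("need at least two addresses and a percentage between 1 and 99")
    # Clamp so each server always receives at least one address.
    primary_count = min(max(total * primary_percent // 100, 1), total - 1)
    cut = start + primary_count
    primary = (str(IPv4Address(start)), str(IPv4Address(cut - 1)))
    secondary = (str(IPv4Address(cut)), str(IPv4Address(end)))
    return primary, secondary


# Placeholder scope of 100 addresses; not taken from the audit.
primary, secondary = split_scope("192.0.2.10", "192.0.2.109")
print("first DHCP server: ", primary)
print("second DHCP server:", secondary)
```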
Spindle count per aggregate
None noted.
Design Notes –
Note the remediation necessary in the “Aggregate configuration laid out for best
performance” section.
Space usage and hot spares
NetApp – two (2) spare disks are assigned to each aggregate (Production and DR).
Recommendation is to add one (1) of the spare disks to the aggregate, as RAID is configured
for DP and two (2) spare disks are unnecessary.
Design Notes –
Aggregate usage should not exceed 90%. Volume usage should not exceed 85%.
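The two thresholds above lend themselves to a simple check. The sketch below flags any aggregate above 90% or volume above 85%; the object names and sizes are made up for illustration and would normally come from the filer or Operations Manager.

```python
LIMIT_PERCENT = {"aggregate": 90, "volume": 85}  # thresholds from the design note


def usage_percent(used_gb, total_gb):
    """Percentage of capacity in use."""
    return 100 * used_gb / total_gb


def over_limit(items):
    """items: iterable of (kind, name, used_gb, total_gb); returns (name, percent) over its limit."""
    return [
        (name, round(usage_percent(used, total), 1))
        for kind, name, used, total in items
        if usage_percent(used, total) > LIMIT_PERCENT[kind]
    ]


# Hypothetical figures; a volume at exactly 85% does not exceed the limit.
sample = [
    ("aggregate", "aggr0", 9200, 10000),
    ("volume", "vol_a", 850, 1000),
    ("volume", "vol_b", 900, 1000),
]
print(over_limit(sample))
```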
NetApp Alua
NetApp – Alua is not currently enabled on any fiber initiators (igroup). Recommendation is
to enable, configure and test Alua.
VMware – Once Alua is enabled on the NetApp initiators, configure the ESX hosts to use
the Alua paths.
Design Notes –
None
NetApp BMC
None noted; it is properly configured
Design Notes –
None
VMware DRS
VMware – DRS set to maximum aggressiveness. Recommendation is to set to default.
Design Notes –
None
High Availability
None noted.
Design Notes –
VMware – DR and Production are properly configured.
NetApp – DR and Production - each SAN has two (2) controllers – is properly configured for
HA (active / active).
Juniper SNG – Dell SecureWorks confirmed that HA is properly configured for the Juniper
pair.
Cisco ASA DR – Need to procure an additional Cisco ASA from Secureworks and coordinate
with them for configuration. Vendor will ensure and test that proper HA is setup,
configured and tested by Secureworks.
VMware DPM
None noted.
Design Notes –
DPM is not being used and configuration is not recommended.
Vlan / Tagging
Note – also see “Security risks and exposures” section.
Cisco 3750 Core – Currently Vlans exist for all subnets on this device. Tagging is not being
used in any capacity. Recommendation is to implement tagging for Vlans used by VMware
and NetApp management.
Design Notes –
Note – Phase 2 vendor should supply an experienced Cisco engineer.
Tagged Vlans should include VMware service console (DR and Production), VmKernel (DR
and production), and NetApp Management (Ethernet interfaces configured DR and
production).
VMware SRM
Note – also see “Routing, connectivity, bandwidth, firewalls, site to site replication” section.
VMware – SRM is currently licensed but not configured. Recommendation is to configure
VMware SRM and test functionality.
Design Notes –
SRM should be configured according to best practice as set forth by collaboration
performed by VMware and NetApp in this document:
http://media.netapp.com/documents/tr-3671.pdf
VMware Thin Provisioning
See “Monitoring and Management” section.
Design Notes –
Thin Provisioning is being leveraged extensively, which is fine. Only qty eighteen (18)
vmdk disks are thick provisioned. Recommendation is to maintain at least 20% free space
on VmDatastores and monitor datastore utilization to ensure storage does not become
overcommitted.
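The free-space and overcommitment guidance above comes down to two ratios per datastore. The following sketch computes both; the capacity figures are invented for illustration, and treating "overcommitted" as provisioned space exceeding capacity is an assumption about how the term is meant here.

```python
def datastore_report(capacity_gb, used_gb, provisioned_gb, min_free_percent=20):
    """Summarise free space and thin-provisioning overcommitment for one datastore."""
    free_percent = 100 * (capacity_gb - used_gb) / capacity_gb
    return {
        "free_percent": free_percent,
        "overcommit_ratio": provisioned_gb / capacity_gb,
        "below_free_target": free_percent < min_free_percent,
        # Assumption: overcommitted means thin disks could grow past capacity.
        "overcommitted": provisioned_gb > capacity_gb,
    }


# Hypothetical datastore: 1000 GB capacity, 850 GB written, 1500 GB provisioned.
report = datastore_report(1000, 850, 1500)
print(report)
```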
NetApp Aggregates and volumes
On Production SAN there are seven (7) disks that are not assigned to an aggregate.
Recommendation is to assign them.
On Production SAN, disk 0d.01.4 is not currently owned by any controller.
Recommendation is to ensure proper firmware version of disk and assign disk to a
controller and subsequently add this disk to aggregate.
On DR SAN there are eighteen (18) disks not assigned to an aggregate. Recommendation is
to assign them.
On Prod SAN Controller A, volumes restore and vol2 are not being used. Recommendation
is to delete.
Design Notes –
NetApp – Both SANs – 28 disks assigned to single aggregate but no performance issues
noted.
Ensure all disks, per remediation, are assigned to a controller and added to an aggregate.
VMotion
See “Switch Traffic Separation” section.
Design Notes –
vMotion – Works correctly but recommendation is to have vmkernel for vMotion on its
own vSwitch.
Data Protection
See “Backup and Recovery” section.
Design Notes –
NetApp, Prod and DR, data is contained within Raid DP (similar to Raid 6) aggregates. Each
aggregate has a designated hot spare disk available. This configuration allows for up to two
(2) simultaneous disk failures at any given time inside an aggregate.
Storage Networking
Production – Both Controllers A & B - e0p is 198.15.1.x/24 – unknown subnet.
Recommendation is to determine reason and disable. e0b is 10.44.55.x/24 – unknown
subnet. Recommendation is to determine reason and disable.
NetApp interfaces e0a – Production is 10.44.10.x/24; DR is 10.55.10.x/24. This is on the
same subnet as production lan for servers. Recommendation is to configure e0a and an
additional NetApp interface with IPs in a management vlan.
See Also “Review of Cabling” and “Switch Traffic Separation” sections.
Design Notes –
NetApp Storage Paths, Prod and DR, are configured using Fiber Channel.
NetApp Deduplication
Production - Deduplication window is too short. Recommendation is to have at least a two
(2) hour window per volume and only run during non-business hours daily.
DR - Deduplication not currently configured for any volume. Use same configuration
method as Production, except set the daily schedule for at least four (4) hours after
Production Deduplication schedule begins.
Design Notes –
Deduplication can be enhanced and gain back more storage by grouping Virtual Machines
with same OS on the same NetApp volume.
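The scheduling rules above (a window of at least two hours outside business hours, with DR starting at least four hours after Production begins) can be sanity-checked as follows. This is only a sketch: the 22:00 Production start and the 07:00 to 19:00 business day are assumptions, not values from the report.

```python
from datetime import datetime, timedelta

BUSINESS_START_MIN = 7 * 60   # assumption: business day starts at 07:00
BUSINESS_END_MIN = 19 * 60    # assumption: business day ends at 19:00


def dr_start(prod_start, offset_hours=4):
    """Earliest DR deduplication start, offset from the Production start (HH:MM)."""
    shifted = datetime.strptime(prod_start, "%H:%M") + timedelta(hours=offset_hours)
    return shifted.strftime("%H:%M")


def outside_business_hours(start, window_hours=2):
    """True if no minute of the window falls inside the assumed business day."""
    t = datetime.strptime(start, "%H:%M")
    begin = t.hour * 60 + t.minute
    return all(
        not BUSINESS_START_MIN <= (begin + offset) % 1440 < BUSINESS_END_MIN
        for offset in range(window_hours * 60)
    )


prod = "22:00"  # hypothetical Production start time
print(prod, outside_business_hours(prod))
print(dr_start(prod), outside_business_hours(dr_start(prod)))
```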
NetApp SnapShot
Snapshotting is currently configured incorrectly as the retention period is 365 dailies.
Recommendation is to setup, configure and test snapshotting as follows:
a. Daily – qty five (5)
b. Hourly – 7am – 7pm – qty twelve (12)
c. Monthly – qty twelve (12)
d. Yearly – qty one (1)
Design Notes –
None
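As a quick arithmetic check on the recommendation above, the retention counts can be written down as data and totalled. The sketch below only sums the counts stated in this section; it is not a Data ONTAP configuration.

```python
# Snapshot copies retained, as stated in this section.
CURRENT_RETENTION = {"daily": 365}
RECOMMENDED_RETENTION = {"hourly": 12, "daily": 5, "monthly": 12, "yearly": 1}


def total_snapshots(retention):
    """Total number of snapshot copies kept under a retention policy."""
    return sum(retention.values())


current = total_snapshots(CURRENT_RETENTION)
recommended = total_snapshots(RECOMMENDED_RETENTION)
print(f"current: {current}, recommended: {recommended}, fewer copies kept: {current - recommended}")
```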
NetApp FlexClone
FlexClone is not currently licensed. Need to procure, set up, configure and test FlexClone.
Design Notes –
Procure qty four (4) FlexClone licenses (2 SANS X 2 Controllers).
FlexClone is necessary for testing NetApp failover to DR.
NetApp Snap Vault
None noted.
Design Notes –
NetApp Snap Vault is not being used. It is a backup and recovery product but this will add
an additional layer of complexity to the backup and DR plan that is unnecessary. See
“Backup and Recovery” section regarding design.
Monitoring & Management
NetApp – NetApp tools are currently dispersed amongst several servers. Recommendation
is to consolidate NetApp Management Tools onto a single server for ease of use by IT
Personnel. Recommendation is to also configure NetApp Operations Manager for proper
monitoring and alerting of Prod and DR filers.
VMware – Only default alerts are currently configured, which is fine, but notifications are
not configured. Recommendation is to configure notifications for important alerts when
thresholds are exceeded for storage, networking, resource utilization, and critical errors.
Networking – No snmp monitoring is currently being leveraged for the Cisco switching.
Recommendation is to procure a central monitoring system software solution for
notifications on snmp traps and server-specific thresholds (i.e. Ipswitch What’s Up Gold or
Solarwinds).
Design Notes –
NetApp – Operations Manager is the primary tool used here to accomplish comprehensive
monitoring and management.
VMware – Monitoring and notifications are built-in, but need to be configured.
NetApp SnapManager
SnapManager for Virtual Infrastructure single file restore does not appear operative. Need
to be able to recover vms from this utility or perform file-level restoration.
Recommendation is to troubleshoot the reason why this is not working and ensure
file-level recovery capability.
Design Notes –
This tool should be installed, configured and tested on the same server with other NetApp
utilities.
NetApp Operations Manager
See also “Monitoring and Management” section above.
SnapMirror not set in Operations Manager; needs to be setup, configured and tested.
Design Notes –
This tool provides comprehensive NetApp monitoring and management and should be
installed, configured and tested on the same server with other NetApp utilities.
NetApp Protection Manager
This tool provides comprehensive NetApp data protection monitoring and management
and should be installed, configured and tested on the same server with other NetApp
utilities.
Design Notes –
This tool should be used as part of RIH’s use of SnapManager.
NetApp Provisioning Manager
This tool provides comprehensive NetApp storage provisioning and configuration
management and should be installed, configured and tested on the same server with other
NetApp utilities.
Design Notes –
This tool will provide a single place for RIH to provision new storage and also configure
things like deduplication and thin provisioning.
Backup and Recovery
Symantec Backup Servers = BK38, RIH40, RIH173, RIHBackup253
1. HP StorageWorks Library with only single full height LTO-5 drive and current
backups-to-tape take 36 hours to complete. Recommendation is to procure up to
three (3) half height tape drives.
2. Recommendation is to adopt a disk-to-disk-to-tape solution. Will need to procure
inexpensive SATA storage for this (at least 15TB useable).
3. VCenter - VCenter Databases (Prod and DR) should be backed up leveraging a SQL
maintenance schedule, generating a bkf file to be backed up by BackupExec.
Maintenance schedule requires SQL Standard or Enterprise which is not currently
owned.
Design Notes –
Recommended Design for Backups - Production
a. Apps, Files, SQL, Groupwise = Symantec BackupExec (disk-to-disk-to-tape)
b. VMs, NetApp Volumes = NetApp SnapManager to SAN
c. NetApp Volumes = BackupExec NDMP Option (already owned and
configured). Change configuration to run a month-end to tape only with
retention of qty twelve (12) to have snapshots of NetApp Volumes on tape.
d. Backup Exec Note - RIH currently owns version 2012 SP3 (installed)
e. Backup-to-Disk Notes – Storage Device for backups must have at least 15TB
useable space and have qty 2 8gb Fibre Channel interfaces. In addition the
current backup server only has 1 port for Fibre Channel on its HBA. An 8gb
dual port Fibre Channel HBA must be purchased, installed and configured in
the current HP Proliant backup server.
f. Vendor needs to show that connectivity and performance of backup design
actually works as designed and vendor must be committed to the performance
as designed and troubleshoot if necessary. The expectation is that interfaces
perform at 8gb/s and backups run at optimal speed.
Procurement Needs
a. qty three (3) – LTO5 half height drives for HP StorageWorks Library
b. qty one (1) – NAS / SAN – Inexpensive Storage device (at least 15TB usable
space) for backup-to-disk solution with 2 port 8gb Fibre Channel interface.
c. Qty one (1) – PCI Express Fibre Channel 8gb Dual Port HBA for HP Proliant
Backup server
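To put the backup design above in perspective, a rough transfer-time estimate can be made from nominal interface ratings. The figures are assumptions, not measurements: roughly 800 MB/s for an 8gb Fibre Channel link, 140 MB/s native for one LTO-5 drive, decimal terabytes, and a data set equal to the full 15TB of the backup-to-disk device. Real throughput depends on the source data, compression and the backup software.

```python
def transfer_hours(size_tb, rate_mb_per_s):
    """Hours needed to move size_tb (decimal TB) at a sustained rate in MB/s."""
    seconds = size_tb * 1_000_000 / rate_mb_per_s
    return seconds / 3600


FC_8GB_MB_S = 800   # assumed nominal rate of one 8gb Fibre Channel port
LTO5_MB_S = 140     # assumed native rate of one LTO-5 drive

disk_stage = transfer_hours(15, FC_8GB_MB_S)
one_drive = transfer_hours(15, LTO5_MB_S)
three_drives = transfer_hours(15, 3 * LTO5_MB_S)
print(f"15TB to disk over one FC port: {disk_stage:.1f} h")
print(f"15TB to tape, one drive:       {one_drive:.1f} h")
print(f"15TB to tape, three drives:    {three_drives:.1f} h")
```

Under these assumptions, three drives streaming in parallel cut the tape stage to a third of the single-drive time, which is the reasoning behind procuring the additional drives.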
Verify all software licensing for NetApp/VMware/Backup/Disaster Recovery
VMware
1. All licensing for DR is currently owned (ESX Enterprise and SRM).
NetApp
1. qty four (4) FlexClone is required for DR (specifically for testing failover).
2. All other required licensing is owned.
3. SnapManager products – Recommendation is to ensure support is current and
access to new version is always available.
Symantec
1. Recommendation is to maintain Symantec Support Agreement for availability of
newer versions.
Explore NetApp/VMware/HP software and firmware upgradability
1. General - Firmware will be available for all hardware (HP Proliant Servers, NetApp,
Cisco Switches, Juniper) – Recommendation is to upgrade all relevant firmware
before disaster recovery site goes live.
2. NetApp – Recommendation is to upgrade Prod and DR to DataOntap 8.x (most
stable).
3. HP DL380 G7 (qty six (6) Servers) – 8 X CPU (two (2) sockets) – 196GB RAM
(maxed)– has been confirmed as vSphere 5.0 Certified Hardware.
4. ESX – Current version is vSphere 4.1 Build 348481. Recommendation is to upgrade
Prod and DR to vSphere 5.
5. vCenter – Currently running version 4. Recommendation is to upgrade DR and Prod
to vCenter 5.
VMware Ports configurations and binding
Multiple ports on network adapters on ESX hosts are not uplinked. Recommendation is to
use ports available in collaboration with vSwitch and port segmentation.
See “Vlan / Tagging” section.
Design Notes –
See Visio Diagrams – Logical and Physical Designs
Error logs for both environments
All remediation for errors have been noted in other sections.
Design Notes –
See “Monitoring & Management” section.
Address configuration for graceful shutdown of the environment during power
loss (Symmetra LX, 16KVA tower)
VMware - Virtual machines not configured for automatic startup / shutdown.
Recommendation is to uninstall any APC powerchute software from virtual machines and
install, configure and test APC agent on all ESX hosts (Prod and DR). Virtual machines all
need to be configured for startup/shutdown in ESX configuration.
NetApp – NetApp (Prod and DR) are not configured for APC Network Shutdown. Need to
setup, configure and test UPS shutdown on NetApp.
Design Notes –
VMware – APC Network Shutdown
1. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007036
NetApp – Configure snmp trap on NetApp for APC Symmetra and add the FAS filers as
enabled trap receivers from the APC Network Management Card. Use the “enable ups”
command in DataOnTap to configure the UPS and shutdown event.
Opportunities to optimize configurations and improve performance
RIH has engaged Vendor Company for assessment of both the production and disaster
recovery environments in order to identify any opportunities to optimize current
performance. Almost all optimization notes have already been covered in the previous
section: Health Check Audit List. In addition Vendor Company has provided RIH with a
VmWare Health Assessment of both DR and Prod virtual datacenters. The recommendation
to improve performance and optimize configuration is to remediate all items identified in
these sections. The VmWare Consultant, Vendor Engineer, has also identified the following
items in regards to performance optimization:
1. In the Vmware Disaster Recovery Environment resource pools are not currently
being employed. Recommendation is to mimic resource pool configuration of
the Production environment for DR. This will ensure resources such as CPU and
Memory are allocated to the proper virtual machines.
2. In the VmWare Disaster Recovery and Production Environments VM Swap space is
set to use local ESX storage (HP DAS). The recommendation is to configure virtual
machines to swap inside their respective vmfs datastores. This will ensure swap
usage is not crossing disparate storage controllers and increase performance.
3. In the VmWare Disaster Recovery and Production Environments multiple Network
adapters on the ESX hosts are not being utilized. Recommendation is to uplink all
available network adapters and assign them to virtual switches. This will ensure
redundancy in vswitch uplinks and also increase performance as more physical
adapters can be used for vswitch bandwidth.
4. In the VmWare Disaster Recovery and Production Environments the VCenter server
database should be upgraded to SQL Server Standard 2008R2. In addition the
database should be moved off the OS partition. With SQL Server Standard there is
enhanced support for SQL maintenance.
Roadmap for software, configuration enhancement and upgrades
RIH has engaged Vendor Company for assessment of both the production and disaster
recovery environments in order to identify a roadmap for software upgrades and
configuration enhancement. Upgradeability of hardware and software had already been
identified in the previous section: Health Check Audit List. This section will provide those
software upgrades recommended for Phase 2 as an overview.
1. NetApp – Currently DR and PROD filers are running DataOnTap version 7.3.4. It is
recommended to upgrade DataOnTap to v.8.1. The new version of DataOnTap
contains enhancements for both snapshotting and flexclone. In addition the new
version has fewer restrictions on aggregate size and number of disks.
2. Cisco – RIH employs Cisco switching throughout its environment and they are all
under a current Cisco Support Agreement. It is recommended to perform IOS
upgrades with version consistency across all 3750s and 9124mds – about 35
switches total (DR and Prod).
3. HP Proliant DL 380 G7 – There are 3 of these servers each in both DR and Prod. The
servers are the ESX hosts for both environments. It will be important, as part of the
upgrade to ESXi5, to perform firmware upgrades on all hardware components with
particular attention to the BIOS version and HBA firmware.
4. ESX – RIH currently runs ESX 4.1. It is recommended that RIH be upgraded to ESXi 5
before the Disaster Recovery environment is moved to the Oshean datacenter. One
key benefit for RIH in ESXi version 5 will be Storage DRS. It will be key to configure
Storage DRS as part of the upgrade. At a high level the new version provides the
following:
a. Improved Reliability and Security — with fewer lines of code and
independence from general purpose OS, ESXi drastically reduces the risk of
bugs or security vulnerabilities and makes it easier to secure your hypervisor
layer.
b. Streamlined Deployment and Configuration — ESXi has far fewer
configuration items than ESX, greatly simplifying deployment and
configuration and making it easier to maintain consistency.
c. Higher Management Efficiency — The API-based, partner integration model
of ESXi eliminates the need to install and manage third party management
agents. You can automate routine tasks by leveraging remote command line
scripting environments such as vCLI or PowerCLI.
d. Simplified Hypervisor Patching and Updating — Due to its smaller size and
fewer components, ESXi requires far fewer patches than ESX, shortening
service windows and reducing security vulnerabilities.
Coordination regarding Disaster Recovery
RIH has engaged Vendor Company for the information required for the coordination of
disaster recovery Phase 2. Per RIH Attachment A: Scope of Work the following needs for
coordination have been identified:

Commitment to discuss Customer's disaster recovery needs with OSHEAN

Customer's disaster recovery needs with SecureWorks

Strategy for disaster recovery implementation
This section will detail the information gathered by the Vendor Company engineer in this
engagement as it pertains to the items above.
Coordination with Oshean
The Vendor Company engineer had a formal conversation with Oshean in order to
determine their SLA in terms of the Phase 2 engagement. Oshean is providing RIH with a
secure datacenter in Springfield Massachusetts that will serve as RIH’s space for the new
disaster recovery environment. Oshean is primarily responsible for the following items per
their SLA:

Adequate space and rack in Springfield datacenter for RIH equipment

Adequate power and backup (UPS and generator) for RIH equipment

A 100mb Cox-leased fiber point-to-point line (dual path except for local loop in
Providence) from the RIH Providence RI datacenter to the DR datacenter in
Springfield Massachusetts

Proactive monitoring of the health and uptime of the Cox-leased line

Availability to the datacenter for physical access by RIH IT personnel

Coordination with RIH for physical access to the datacenter by RIH vendors
Coordination with SecureWorks
The Vendor Company engineer had a formal discussion with Dell Secureworks as it relates
to the coordination and SLA for Phase 2. Secureworks is the vendor for RIH who provides
hardware and management for the firewalls in Providence and Springfield. Currently
Secureworks maintains QTY 2 Juniper SNG firewalls for RIH in Providence for their
production datacenter. Since the firewalls are managed no access is given to RIH or
vendors for login or configuration. It will therefore be important for the Phase 2 vendor to
coordinate closely with RIH and Secureworks to ensure proper configuration of the new
firewall to be installed in Springfield. Vendor Company spoke with the project manager and
engineer at Secureworks who will be responsible for delivery of the new firewall and
determined the following:

Secureworks will provide QTY 1 Cisco ASA 5510 – 2nd ASA may be procured by RIH

Secureworks will perform configuration of ASA

RIH Vendor must provide Secureworks with the logical configuration for the Cisco
ASA

Logical Configuration is provided to Secureworks by filling out a configuration form

Secureworks, after configuration has been applied, will ship the ASA to the RIH
Providence Datacenter and vendor will be responsible for delivery and install in
Springfield.

RIH vendor must physically install Cisco ASA into rack provided by Oshean

Secureworks project management team for Phase 2 is only responsible for delivery
of configuration and hardware up to testing of site-to-site connectivity

Once connectivity and proper configuration have been determined then the
Secureworks project management team becomes disengaged

Any issues beyond site-to-site connectivity and configuration must be
communicated to SecureWorks leveraging the standard support channels

SecureWorks support is facilitated by RIH IT personnel

SecureWorks provides ongoing monitoring of the hardware that it manages for RIH
Regarding the coordination with Oshean for the delivery of Phase 2, Oshean has
communicated that RIH must facilitate vendor access to the Springfield datacenter by
notifying Oshean of dates and times. Access by vendors to the datacenter can be
accomplished via escort by RIH personnel with keycard access or by obtaining keycard
access for the vendor.
It is also important to note that the leased line for connectivity to the datacenter is
provided by Oshean. Any issues with connectivity over this line must go through Oshean’s
NOC. Finally, Oshean does not own and is not responsible for any equipment procured by
RIH (all DR servers, hardware or software). The vendor used in Phase 2 must be
responsible for the DR equipment during the Phase 2 project in regards to configuration
and functionality.
Procurement Needed

QTY 1 Cisco 3750 24-port switch

QTY 4 NetApp Flexclone licenses

Licensing for IDS and IPS for Juniper Production Firewall – (will be procured
directly through SecureWorks by RIH) – no need for quote on this

QTY 1 Cisco ASA – in addition to QTY 1 already procured for HA at Disaster
Recovery Facility (will be procured directly through SecureWorks by RIH) – no
need for quote on this

QTY 1 Central Monitoring Solution – must be able to monitor VmWare, NetApp,
Cisco and Windows Servers (i.e. What’s Up Gold, Solarwinds)

QTY 3 – LTO5 Half-Height Tape Drives for existing HP StorageWorks Library

QTY 1 – SAN / NAS storage solution for disk-to-disk backups – SATA storage with at
least 15TB useable space. Must have dual-port 8gb Fiber interface for connectivity
to HP Proliant Backup server.

QTY 1 – PCI Express 8Gb Fibre Channel Dual-Port HBA for HP Proliant Backup server
Logical and Physical Layout Visio Diagrams
Exhibit 2
VMware Health Check Report
for
Rhode Island Housing
VMware and Rhode Island Housing Confidential
Health Check Report
© 2010 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual
property laws. This product is covered by one or more patents listed at
http://www.vmware.com/download/patents.html.
VMware, VMware vSphere, VMware vCenter, the VMware “boxes” logo and design, Virtual SMP and VMotion are
registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. All other marks
and names mentioned herein may be trademarks of their respective companies.
VMware, Inc
3401 Hillview Ave
Palo Alto, CA 94304
www.vmware.com
Contents
1. Executive Summary
1.1 Report Overview
1.2 Assessment Highlights
1.3 Next Steps
2. Recommended Action Items
3. Health Check Assessment
3.1 Availability/Management
3.2 Performance
4. Appendix A: Audited Inventory
4.1 Host Configurations
4.2 Networking Configurations
4.3 Storage
4.4 Virtual Datacenter
5. Appendix B: Health Check Assessment Checklist
6. Appendix C: References
1. Executive Summary
1.1 Report Overview
This report summarizes activities and findings from a VMware Health Check that was conducted for
Rhode Island Housing. This report contains:
• Recommended changes to configuration and/or usage per VMware best practices that may improve availability/management or performance of VMware components
• Inventory of components analyzed
• Checklist of assessment activities performed
1.2 Assessment Highlights
Analysis Period
• March 2012
Datacenters
• Rhode Island Housing – Providence Production Site
Contributing Participants
• Rhode Island Housing Abdel El idrissi
• Vendor Engineer, VMware Consultant
Summary of Activities
• Performed Standard Health Check Assessment Checklist (see Appendix)
• Gathered system information collected from VMware HealthAnalyzer
• Interviewed participants to discuss priority issues and concerns
• Conducted knowledge transfer to
  o clarify understanding of VMware component requirements and behavior
  o clarify changes to configuration and usage per VMware best practices
• Reviewed documents supplied by Rhode Island Housing
1.3 Next Steps
Rhode Island Housing should review this report and consider the recommended action items. A follow-up
consult and/or Health Check is also advised. If required, VMware, through its Professional Services
Organization or via one of its many partner organizations, is able to assist Rhode Island Housing in
implementing the recommended actions as detailed within this report.
2. Recommended Action Items
Priority | Component | Recommended Action Item
1 | Network | Change portgroup security default settings ForgedTransmits and MACAddressChanges to Reject unless app requires it
1 | Virtual Datacenter | Set up a redundant service console portgroup to use a separate vmnic/uplink, and an alternate isolation response gateway address, for more reliability in HA isolation detection: (1) set up a redundant service console portgroup to use a separate vmnic/uplink on a separate subnet; (2) specify "isolation address" for the redundant service console (das.isolationaddress2); (3) increase the failure detection time (das.failuredetectiontime) setting to 20000 milliseconds or greater
1 | Virtual Datacenter | Avoid making resource pools and VMs siblings in a hierarchy in order to avoid unexpected performance
1 | Virtual Datacenter | Disconnect vSphere Clients from the vCenter Server when they are no longer needed
1 | Virtual Datacenter | Use vCenter Server roles, groups, and permissions in order to provide appropriate access and authorization to virtual infrastructure. Avoid using Windows built-in groups (Administrators)
1 | Virtual Datacenter | Avoid changing default firewall rules and ports unless necessary
1 | Virtual Machines | Check to make sure that VMware Tools are installed, running, and not out of date for running VMs
1 | Virtual Machines | Limit the use of snapshots, and keep them only for short-term use
1 | Virtual Machines | Check to make sure that VMs meet the requirements for VMotion
1 | Virtual Machines | Allocate only as much virtual hardware as required for each VM. Disable any unused or unnecessary virtual hardware devices.
2 | Host | Avoid changes to advanced parameter settings unless necessary
2 | Network | Check to make sure there is redundancy in networking paths and components to avoid single points of failure (e.g. at least 2 paths to each network)
2 | Storage | Allocate space for templates and media/ISOs on shared datastores that are separate from the datastores used for VMs
2 | Storage | Check to make sure there is redundancy in storage paths and components to avoid single points of failure
2 | Virtual Machines | Use the latest version of vmxnet that is supported by the guest OS
2 | Virtual Machines | Select the correct guest OS type in the VM configuration to match the guest OS
2 | Virtual Machines | Use reservations/limits selectively on VMs that need them; don't set reservations too high or limits too low
2 | Virtual Machines | Consider using virtual hardware v7 to take advantage of additional capabilities (like VMXNET3, PVSCSI)
2 | Virtual Machines | Disable copy/paste between guest OS and remote console
3 | Network | Configure NICs and physical switch speed and duplex settings consistently. Set to autonegotiation for 1Gb NICs
3 | Network | Distribute vmnics for a portgroup across different PCI busses for greater redundancy
3 | Virtual Machines | Use the correct virtual SCSI hardware (e.g. BusLogic Parallel, LSILogic SAS/Parallel, VMware Paravirtual)
3. Health Check Assessment
3.1 Availability/Management
Observation 1
Advanced settings for the following 3 ESX host(s) have been changed from defaults:
• 10.44.10.10
• 10.44.10.14
• 10.44.10.12
Priority: 2
Component: Host
Recommendation: Avoid changes to advanced parameter settings unless necessary
Justification:
ESX/ESXi hosts have a default configuration that normally does not need to be
changed. Certain advanced parameter settings can be selectively changed if
they are recommended VMware best practices and documented, or under
direction by VMware support in order to address specific issues. Otherwise the
advanced parameter settings should not be changed, as that may have an
adverse and unpredictable impact on management, availability or performance
of the virtual infrastructure.
If an advanced parameter setting needs to be changed, make sure that the
changes are consistently applied to all applicable hosts within the environment
or cluster. Also, maintain proper change management procedures and
document configuration changes.
An example of a valid advanced parameter change is when a redundant service
console portgroup is set up and an alternate isolation response gateway
address needs to be configured for more reliability in HA isolation detection (by
modifying the das.isolationaddress2 and das.failuredetectiontime
parameters).
An example of an advanced parameter setting that should not be changed is
sched.mem.pshare.enable: setting it to false disables transparent page sharing.
Transparent page sharing provides numerous advantages for memory resource
management and should remain enabled.
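The advice to apply advanced parameter changes consistently across hosts can be checked mechanically. The following is a minimal sketch, assuming the settings of each host have already been exported into plain dictionaries; the host names, setting names, and values are hypothetical placeholders, not data from this Health Check.

```python
def find_inconsistent_settings(host_settings):
    """Return settings whose values are not identical on every host.

    host_settings maps a host name to a dict of {setting name: value}.
    A setting that is missing on a host is reported with the value None
    for that host.
    """
    all_names = set()
    for settings in host_settings.values():
        all_names.update(settings)

    inconsistent = {}
    for name in sorted(all_names):
        per_host = {
            host: settings.get(name) for host, settings in host_settings.items()
        }
        # More than one distinct value means the hosts have drifted apart.
        if len(set(per_host.values())) > 1:
            inconsistent[name] = per_host
    return inconsistent


# Hypothetical inventory: names and values are placeholders only.
inventory = {
    "host-a": {"Example.SettingA": 64, "Example.SettingB": 1},
    "host-b": {"Example.SettingA": 64, "Example.SettingB": 0},
    "host-c": {"Example.SettingA": 64},
}

drift = find_inconsistent_settings(inventory)
for name, per_host in drift.items():
    print(name, per_host)
```

Any setting reported here would then be reviewed against the change management records before being aligned across the cluster.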
Observation 2
The portgroup/vSwitches on the following 3 host(s) have fewer than 2 uplink paths:
• 10.44.10.10
• 10.44.10.14
• 10.44.10.12
Priority: 2
Component: Network
Recommendation: Check to make sure there is redundancy in networking paths and components to avoid single points of failure (e.g. at least 2 paths to each network)
Justification:
In order to ensure that there is no service disruption, it is important to ensure
that the networking configuration is fault resilient to accommodate networking
path and component failures.
It is recommended that all portgroups and distributed virtual portgroups are
configured with at least two uplink paths using different vmnics. Use NIC
teaming with at least two active NICs or in the case of service
console/management portgroup one in active and at least one in standby. Set
failover policy with the appropriate active and standby NICs for failover.
Connect each physical adapter to different physical switches for an additional
level of redundancy.
Upstream physical network components should also have the necessary
redundancy in order to accommodate physical component failures.
Observation 3
The vmnics on the following 3 host(s) are not distributed across different PCI busses:
• 10.44.10.10
• 10.44.10.14
• 10.44.10.12
Priority: 3
Component: Network
Recommendation: Distribute vmnics for a portgroup across different PCI busses for greater redundancy
Justification:
Distributing vmnics for a portgroup across different PCI busses provides greater
redundancy from failures related to a particular PCI bus. It is also important to
team vmnics from different PCI busses in order to improve fault resiliency from
component failures.
Observation 4
The portgroup security settings ForgedTransmits or MACAddressChanges are not set to Reject on the following 3 host(s):
Host: 10.44.10.10
• VMotion (Setting: MAC address changes)
• VM Network (Setting: MAC address changes)
• Service Console 2 (Setting: MAC address changes)
• Service Console (Setting: MAC address changes)
• No Network (Setting: MAC address changes)
• 10.44.10.x (Setting: MAC address changes)
• VMotion (Setting: Forged Transmit)
• VM Network (Setting: Forged Transmit)
• Service Console 2 (Setting: Forged Transmit)
• Service Console (Setting: Forged Transmit)
• No Network (Setting: Forged Transmit)
• 10.44.10.x (Setting: Forged Transmit)
Priority: 1
Component: Network
Recommendation: Change portgroup security default settings ForgedTransmits and MACAddressChanges to Reject unless app requires it
Justification:
It is recommended that both of these options are set to Reject for improved
security.
In order to protect against MAC address impersonations and prevent ESX/ESXi
from honoring requests to change the effective MAC address to anything other
than the initial MAC address, change the settings to Reject.
With the ForgedTransmits setting set to Reject, ESX/ESXi compares the
source MAC address being transmitted by the guest OS with the effective MAC
address for its adapter to see if they match. If the addresses do not match,
ESX/ESXi drops the packet. Packets with impersonated addresses are therefore
dropped before they are delivered, and the guest OS might assume that the
packets have been dropped.
For VMs that require overriding this setting (intrusion detection or MSCS VM),
create a special port group for these (and only these) VMs with the modified
settings.
References:
"Configuring the ESX/ESXi Host" section in VMware Infrastructure 3 Security
Hardening Guide http://www.vmware.com/vmtn/resources/726
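A finding list like the one in this observation can be reproduced from exported portgroup data. This is a minimal sketch, assuming each portgroup has been exported as a dictionary whose security policy maps a setting name to True (Accept) or False (Reject); the data layout and the sample portgroups are hypothetical and do not come from VMware HealthAnalyzer.

```python
CHECKED_SETTINGS = ("MACAddressChanges", "ForgedTransmits")


def non_reject_settings(portgroups):
    """List (portgroup name, setting) pairs whose policy is not Reject.

    Each portgroup is a dict with a "name" and a "security" dict that maps
    a setting name to True (Accept) or False (Reject).
    """
    findings = []
    for pg in portgroups:
        for setting in CHECKED_SETTINGS:
            # A missing key is treated as Accept so that it is flagged for review.
            if pg["security"].get(setting, True):
                findings.append((pg["name"], setting))
    return findings


# Hypothetical export for one host.
sample = [
    {"name": "Example PG 1",
     "security": {"MACAddressChanges": True, "ForgedTransmits": True}},
    {"name": "Example PG 2",
     "security": {"MACAddressChanges": False, "ForgedTransmits": False}},
    {"name": "Example PG 3",
     "security": {"MACAddressChanges": False}},
]

for name, setting in non_reject_settings(sample):
    print(f"{name}: {setting} is not set to Reject")
```

Portgroups that legitimately need Accept (for example the dedicated portgroup for intrusion detection or MSCS VMs mentioned above) would be excluded from the input or reviewed individually.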
Observation 5
The following 3 ESX host(s) have datastore(s) with no redundant storage path configured between the datastore and ESX host:
Host: 10.44.10.10
• ESX_LUN_1 - naa.600c0ff000d54b87958e314801000000
• ESX_LUN_0 - naa.600c0ff000d54024c28e314801000000
Priority: 2
Component: Storage
Recommendation: Check to make sure there is redundancy in storage paths and components to avoid single points of failure
Justification:
Configuring multiple paths to storage improves availability and in some cases
allows for load balancing. For FC storage, redundant fabrics are highly
recommended.
References:
SAN System Design and Deploy
http://www.vmware.com/resources/techresources/772
Observation 6
The following 3 ESX host(s) do not have redundant service console port groups on distinct vSwitches:
• 10.44.10.10
• 10.44.10.14
• 10.44.10.12
The default HA failure detection time has not been changed for the following 1 Cluster(s):
• Production
No alternate isolation addresses have been specified for the following 1 Cluster(s):
• Production
Priority: 1
Component: Virtual Datacenter
Recommendation: Set up a redundant service console portgroup to use a separate vmnic/uplink, and an alternate isolation response gateway address for more reliability in HA isolation detection:
1. Set up a redundant service console portgroup to use a separate vmnic/uplink on a separate subnet
2. Specify "isolation address" for the redundant service console (das.isolationaddress2)
3. Increase the failure detection time (das.failuredetectiontime) setting to 20000 milliseconds or greater
Justification:
Although NIC teaming is used to account for NIC failures, overall redundancy
for HA heartbeats and isolation response detection can be made more reliable
by setting up a redundant service console on a separate subnet.
Each service console network should have one isolation address it can reach.
When you set up service console redundancy, you must specify an additional
isolation response address for the secondary service console network. VMware
also recommends increasing the failure detection time setting to 20000 ms or
greater.
References:
"Best Practices for VMware HA Clusters" section in vSphere Availability Guide
http://www.vmware.com/support/pubs
"VMware HA Best Practices" section in VMware High Availability: Concepts,
Implementation, and Best Practices
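The two HA advanced options named in this recommendation lend themselves to a simple check. The sketch below assumes the cluster's advanced options are available as a plain dictionary of option name to value; that export format and the sample values are assumptions (192.0.2.1 is a documentation-only address), and only the option names and the 20000 ms threshold come from the report.

```python
MIN_FAILURE_DETECTION_MS = 20000  # threshold quoted in the recommendation


def check_ha_options(options):
    """Return a list of problems found in a cluster's HA advanced options.

    options is a dict of {option name: value}; values may be strings or
    integers, as they might appear in an exported configuration.
    """
    problems = []
    if not options.get("das.isolationaddress2"):
        problems.append("das.isolationaddress2 is not set")

    raw = options.get("das.failuredetectiontime")
    if raw is None:
        problems.append("das.failuredetectiontime is not set")
    elif int(raw) < MIN_FAILURE_DETECTION_MS:
        problems.append(
            f"das.failuredetectiontime is {int(raw)} ms, "
            f"below {MIN_FAILURE_DETECTION_MS} ms"
        )
    return problems


# Hypothetical option sets before and after remediation.
before = {}
after = {"das.isolationaddress2": "192.0.2.1", "das.failuredetectiontime": "20000"}

print(check_ha_options(before))
print(check_ha_options(after))
```

The check only validates that the options are present and large enough; whether the isolation address is actually reachable from the secondary service console network still has to be tested on the hosts.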
Observation 7
The following 1 Windows default user/group is being used for vCenter user roles/permissions:
• Administrators (for entity Datacenters)
Priority: 1
Component: Virtual Datacenter
Recommendation: Use vCenter Server roles, groups, and permissions in order to provide appropriate access and authorization to virtual infrastructure. Avoid using Windows built-in groups (Administrators)
Justification:
By default, any user or group who is a member of the local Administrators
group of the Windows Server running vCenter Server will have full
administrative control of vCenter Server (and the virtual infrastructure). This can
allow other system administrators that are not virtual infrastructure
administrators access to the virtual infrastructure.
Use the appropriate vCenter Server roles and assign them to the appropriate
vCenter Administrators AD group to ensure access is limited to virtual
infrastructure administrators.
Before removing users or groups from vCenter Server, make sure that you
create and test access to vCenter Server for the new users and groups.
Reference:
"Managing Users, Groups, Roles, and Permissions" section in vSphere Basic
System Administration http://www.vmware.com/support/pubs
Observation 8
The firewall settings for the following ESX host(s) have been modified from the default:
• 10.44.10.10
• 10.44.10.14
• 10.44.10.12
Refer to findings data for more specific details about the settings that have been modified
Priority: 1
Component: Virtual Datacenter
Recommendation: Avoid changing default firewall rules and ports unless necessary
Justification:
The default firewall rules are configured in order to ensure adequate security
while allowing communication with the appropriate virtual infrastructure
components.
Unless required to enable communication for virtual infrastructure services,
avoid changing the firewall rules as that can introduce additional security
issues. It is best to leave the default security firewall settings, which block all
incoming and outgoing traffic that is not associated with an enabled service.
If you do enable a service and open ports, make sure to document the
changes, including the purpose for opening each port and consistently make
the changes on all the appropriate ESX/ESXi hosts.
Avoid changing the default ports unless necessary. These ports are required for
vCenter Server, ESX hosts and other components to communicate with the
virtual infrastructure.
References:
"Configure the Firewall for Maximum Security" section in VMware Infrastructure
3 Security Hardening http://www.vmware.com/resources/techresources/727
TCP and UDP Ports for vCenter Server, ESX hosts, and other network
components management access KB http://kb.vmware.com/kb/012382
Observation 9
The following 1 VM(s) have SCSI controllers which differ from the default SCSI controller for their guest OS:
• GPO (version 7) on host 10.44.10.10 is using VirtualIDEController
Priority: 3
Component: Virtual Machines
Recommendation: Use the correct virtual SCSI hardware (e.g. BusLogic Parallel, LSILogic SAS/Parallel, VMware Paravirtual)
Justification:
Selecting the incorrect virtual SCSI hardware can prevent the VM from properly
booting or impact the performance of the VM. Check the Guest Operating
System Installation Guide for the correct virtual SCSI hardware that is
supported.
vCenter Server automatically selects the default SCSI adapter that is supported
for the guest OS of the VM. In general, older guest OSs might require
BusLogic. LSILogic SAS is available for VMs with virtual hardware v7. LSI Logic
is best for workloads that drive less than 2000 IOPS and 8 outstanding I/Os.
The VMware Paravirtual (PVSCSI) adapter can be used for environments where
hardware and applications drive a high amount of I/O throughput. PVSCSI is best
for workloads that drive more than 2000 IOPS and 8 outstanding I/Os. This
adapter is not suited for DAS environments and has some other limitations,
such as:
1. Supported on only a few guest OSs (e.g. Windows Server 2003/2008, RHEL5)
2. Hot add/remove requires a bus rescan from within the guest
3. Cannot be used to boot a Linux guest or Windows guest (prior to ESX4 U1); can be used as a data disk
4. VMware FT and MSCS clusters are not supported
References:
Configuring Disks to use VMware Paravirtual SCSI KB
http://kb.vmware.com/kb/1010398
Guest Operating System Installation Guide
Do I Choose PVSCSI or LSI Logic virtual adapter on ESX 4.0 for non-IO
intensive workloads? http://kb.vmware.com/kb/1017652
Observation 10
The following 4 VM(s) do not meet some of the VMotion requirements (either floppy/cd-rom found, VM in internal network, network or datastore not visible to all ESX in cluster):
• RIH16 COPY MM
• RIH44
• RIH39
• oestester
Priority: 1
Component: Virtual Machines
Recommendation: Check to make sure that VMs meet the requirements for VMotion
Justification:
In order to facilitate VMotion of VMs between hosts, the following requirements
must be met:
1. The source and destination hosts must use shared storage and the disks of all VMs must be available on both source and target hosts
2. VM should not be connected to internal networks
3. The portgroup names must be the same on the source and destination hosts (easier with vDS, the vNetwork Distributed Switch)
4. VMotion requires a 1Gb network
5. CPU compatibility: source and destination hosts must have compatible CPUs (relaxed for EVC, Enhanced VMotion Compatibility)
6. No devices attached that prevent VMotion (CD-ROM, floppy, serial/parallel devices)
References:
VMware VMotion and CPU Compatibility
www.vmware.com/files/pdf/vmotion_info_guide.pdf
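Several of the requirements listed above can be expressed as a per-VM check. The following is a minimal sketch covering only the three conditions named in the observation (connected removable devices, internal network, datastore visibility); the dictionary layout, field names, and the example VM are hypothetical and are not taken from the audited inventory.

```python
def vmotion_blockers(vm, cluster_hosts):
    """Return human-readable reasons why a VM may fail to VMotion.

    vm is a dict with optional keys:
      connected_devices    - iterable of connected device names
      on_internal_network  - True if attached to a vSwitch without uplinks
      datastore_visible_to - hosts that can see all of the VM's datastores
    """
    blockers = []
    devices = sorted(vm.get("connected_devices", []))
    if devices:
        blockers.append("connected devices: " + ", ".join(devices))
    if vm.get("on_internal_network"):
        blockers.append("connected to an internal network")
    missing = sorted(set(cluster_hosts) - set(vm.get("datastore_visible_to", [])))
    if missing:
        blockers.append("datastore not visible to: " + ", ".join(missing))
    return blockers


# Hypothetical cluster and VM.
hosts = ["host-a", "host-b", "host-c"]
example_vm = {
    "connected_devices": ["floppy", "cdrom"],
    "on_internal_network": False,
    "datastore_visible_to": ["host-a", "host-b"],
}

for reason in vmotion_blockers(example_vm, hosts):
    print(reason)
```

CPU compatibility and VMotion network speed are deliberately left out of this sketch, since they depend on host-level data rather than on the VM configuration.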
Observation 11
Copy/paste is not recommended and is enabled on the following 56 VM(s):
• RIH19
• SUSE_BASE
• RIH26
• RIH233
• RIH55
Priority: 2
Component: Virtual Machines
Recommendation: Disable copy/paste between guest OS and remote console
Justification:
When VMware Tools runs in a virtual machine, by default you can copy and
paste between the guest operating system and the computer where the remote
console is running. As soon as the console window gains focus, non-privileged
users and processes running in the virtual machine can access the clipboard of
the virtual machine console. If a user copies sensitive information to the
clipboard before using the console, the user perhaps unknowingly exposes
sensitive data to the virtual machine. It is recommended that you disable copy
and paste for the guest operating system by creating the parameters shown in
the following table.
Note that this does not disable copy and paste for the users when they access
the virtual machine through other means like Terminal Services. This disables
copy and paste only from the virtual machine console, e.g. when using the
console from the Virtual Infrastructure Client.
Configuration Settings to Disable Copy and Paste
Name | Value
isolation.tools.copy.disable | true
isolation.tools.paste.disable | true
isolation.tools.setGUIOptions.enable | false
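With 56 VMs affected, it may help to compute which of the three parameters each VM still lacks. The sketch below assumes each VM's extra configuration has been exported as a dictionary of strings; that export step and the sample VM are hypothetical, while the parameter names and values are those in the table above. The helper that renders `name = "value"` lines follows the usual .vmx key/value layout.

```python
COPY_PASTE_SETTINGS = {
    "isolation.tools.copy.disable": "true",
    "isolation.tools.paste.disable": "true",
    "isolation.tools.setGUIOptions.enable": "false",
}


def settings_to_apply(vm_extra_config):
    """Return the hardening parameters that are missing or have another value."""
    return {
        name: value
        for name, value in COPY_PASTE_SETTINGS.items()
        if str(vm_extra_config.get(name, "")).lower() != value
    }


def as_vmx_lines(settings):
    """Render parameters as key/value lines in .vmx style."""
    return [f'{name} = "{value}"' for name, value in settings.items()]


# Hypothetical VM that already has one of the three parameters set.
example_config = {"isolation.tools.copy.disable": "TRUE"}

for line in as_vmx_lines(settings_to_apply(example_config)):
    print(line)
```

The comparison is case-insensitive on the value so that "TRUE" and "true" are treated alike; this is a choice made for the sketch, not documented product behavior.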
References:
"Virtual Machines" section in VMware Infrastructure 3 Security Hardening
3.2 Performance
Observation 1
The following 3 ESX host(s) have Gbps network adapter(s) that are not set to AutoNegotiate:
• 10.44.10.10
• 10.44.10.14
• 10.44.10.12
Priority: 3
Component: Network
Recommendation: Configure NICs and physical switch speed and duplex settings consistently. Set to autonegotiation for 1Gb NICs
Justification:
Incorrect network speed and duplex settings can impact performance. The
network adapter (vmnic) and physical switch settings need to be checked and
set correctly. If your physical switch is configured for a specific speed and
duplex setting, you must force the network driver to use the same speed and
duplex setting. For Gigabit links, network settings should be set to
autonegotiate and not forced.
Setting network adapter speed and duplex settings can be done from the
vSphere Client although a reboot is required for changes to take effect.
Reference:
Performance Best Practices for VMware vSphere 4.0
Observation 2
The following 3 datastore(s) have both VMs and Templates:
• DatastoreB4
• DatastoreB5
• DatastoreA3
Priority: 2
Component: Storage
Recommendation: Allocate space for templates and media/ISOs on shared datastores that are separate from the datastores used for VMs
Justification:
To improve performance, separate VM files from other files such as templates
and ISO files that have higher I/O characteristics. A best practice is to dedicate
separate shared datastores/LUNs for VM templates and for ISO/FLP files,
separate from the VMs themselves.
ISO and FLP media files can be placed either locally in the /vmimages directory
on each host or in a shared datastore. To avoid storing unnecessary copies,
place media files on shared storage.
Observation 3
The following 1 Resource Pool(s) have both virtual machines and child resource pools as siblings in hierarchy:
• Resources (Owner: Production)
Priority: 1
Component: Virtual Datacenter
Recommendation: Avoid making resource pools and VMs siblings in a hierarchy in order to avoid unexpected performance
Justification:
Resource pools help improve manageability and grouping and partitioning of
CPU and memory resources. We recommend, however, that resource pools
and VMs not be made siblings in a hierarchy. Instead, each level should contain
resource pools, or only VMs. This is because by default resource pools are
assigned share values that might not compare appropriately with those
assigned to VMs, potentially resulting in unexpected performance.
References:
Performance Best Practices for VMware vSphere 4
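A small worked example may make the share mismatch concrete. This is a sketch only: the share values, the pool size, and the assumption that VMs inside the pool hold equal shares are illustrative choices, not figures from the audited cluster. Under CPU contention, siblings at the same level split capacity in proportion to their shares.

```python
def entitlement(siblings):
    """Split contended capacity among siblings in proportion to their shares."""
    total = sum(siblings.values())
    return {name: shares / total for name, shares in siblings.items()}


# Illustrative values: one resource pool holding 10 VMs sits next to a
# single standalone VM, and both siblings happen to have 4000 shares.
level = {"pool-with-10-VMs": 4000, "standalone-VM": 4000}
split = entitlement(level)

# Assume the 10 VMs inside the pool hold equal shares among themselves.
per_pool_vm = split["pool-with-10-VMs"] / 10

print(f"standalone VM: {split['standalone-VM']:.0%} of contended CPU")
print(f"each VM in the pool: {per_pool_vm:.0%} of contended CPU")
```

In this scenario the standalone VM is entitled to ten times as much contended CPU as any VM in the pool, which is the kind of unexpected result the recommendation is meant to avoid.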
Observation 4
The following 2 user session(s) have been idle for a long time:
• idle for 4 hours
• idle for 3 hours
Priority: 1
Component: Virtual Datacenter
Recommendation: Disconnect vSphere Clients from the vCenter Server when they are no longer needed
Justification:
vCenter Server must keep all client sessions current with inventory changes.
Doing this for connected but unused sessions attached to the vCenter Server
can affect the vCenter Server system's CPU usage and user interface speed.
Disconnect vSphere Client sessions from the vCenter Server when they are no
longer needed in order to improve the performance of vCenter Server.
Reference:
Observation 5
The following 20 VMs (version 7) are not using VMXNET3 even though their configuration and guest OS support it:
• RIH26
• RIH55
• RIH-SFTP
• RIHSCAN
• RIHDOC
Only 5 results are displayed above. See Findings data for more observations of this type.
• RIH19
• RIH233
• DEV
• RIH33
• UAT
Priority: 2
Component: Virtual Machines
Recommendation: Use the latest version of vmxnet that is supported by the guest OS
Justification:
For best performance, use the vmxnet3 paravirtualized network adapter for
operating systems in which it is supported. Note that this requires that the
virtual machine use virtual hardware version 7 and that VMware Tools be
installed in the guest OS.
If vmxnet3 is not supported by the guest OS, use Enhanced vmxnet (vmxnet2).
Both vmxnet3 and Enhanced vmxnet support jumbo frames.
If Enhanced vmxnet is not supported in the guest OS, then use the Flexible
device type, which automatically converts each vlance network device to
vmxnet device if VMware Tools is installed.
Refer to the KB in the references and the product documentation for supported
guest OS for the particular adapter.
References:
Choosing a network adapter for your virtual machine KB
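The adapter preference described above is a fallback chain, which can be sketched as a small function. The input format (a set of adapter names the guest OS supports) is an assumption made for illustration; real guest OS support has to be looked up in the KB article and product documentation referenced above.

```python
# Preference order as described in the justification.
PREFERENCE = ("vmxnet3", "Enhanced vmxnet", "Flexible")


def choose_adapter(guest_supported, hardware_version, tools_installed):
    """Pick the most preferred adapter that the VM can actually use.

    guest_supported is a set of adapter names supported by the guest OS.
    vmxnet3 additionally needs virtual hardware version 7 and VMware Tools.
    Returns None when none of the three adapters applies.
    """
    for adapter in PREFERENCE:
        if adapter not in guest_supported:
            continue
        if adapter == "vmxnet3" and (hardware_version < 7 or not tools_installed):
            continue
        return adapter
    return None


# Hypothetical guests.
all_three = {"vmxnet3", "Enhanced vmxnet", "Flexible"}
print(choose_adapter(all_three, hardware_version=7, tools_installed=True))
print(choose_adapter(all_three, hardware_version=4, tools_installed=True))
print(choose_adapter({"Flexible"}, hardware_version=4, tools_installed=False))
```

Only the vmxnet3 prerequisites stated in the text are modeled; any further prerequisites of the other adapters are outside this sketch.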
Observation 6
The following 2 virtual machines have a different OS installed than the one configured:
VM Name: RIH26
• Configured OS: windows7_64Guest; Installed OS: windows7Server64Guest
VM Name: VMware HealthAnalyzer
• Configured OS: rhel5Guest; Installed OS: other26xLinuxGuest
Priority: 2
Component: Virtual Machines
Recommendation: Select the correct guest OS type in the VM configuration to match the guest OS
Justification:
Selecting the guest OS type determines the
1. optimal monitor mode to use
2. default optimal devices for the guest OS (such as SCSI controller and network adapter)
3. appropriate VMware Tools to be installed in the guest OS
Thus, it is important to make sure that the guest OS type matches the OS
installed in the VM to improve the performance and manageability of the VM.
Note that changing the guest OS type can only be performed when the VM is
powered off.
Observation 7
The following 26 virtual machines have resource limits/reservations specified:
• RIH19
• RIH55
• RIH-SFTP
• RIH1
• DEV
Priority: 2
Component: Virtual Machines
Recommendation: Use reservations/limits selectively on VMs that need them; don't set reservations too high or limits too low
Justification:
Use reservations selectively on VMs that need them. Specify the minimum
acceptable amount of CPU or memory. Don't set reservations too high, since that
can limit the number of virtual machines you can power on in a resource pool,
cluster, or host. Setting reservations can also impact the slot size calculation for
HA clusters, which can impact the admission control policy of an HA cluster (for
the admission control policy based on number of host failures).
Setting limits too low can impact the amount of CPU or memory resources
available to the VMs which can impact the overall performance.
Setting reservations/limits on VMs increases the management overhead of the virtual
infrastructure, so it is important to selectively set these only on VMs that need them.
References:
General Resource Management Best Practices section in Performance Best
Practices for VMware vSphere 4.0
Observation 8
The following 1 VM(s) have VMware Tools not installed, not up to date, or not running:
• SUSE_BASE (tools status: guestToolsNotInstalled)
Priority: 1
Component: Virtual Machines
Recommendation: Check to make sure that VMware Tools are installed, running, and not out of date for running VMs
Justification:
Install VMware Tools in all guests that have supported VMware Tools available.
VMware Tools optimize the guests to make them run better inside virtual
machines by providing
1. optimized virtual NIC and storage drivers
2. efficient memory management using the balloon driver
3. driver to assist with file system quiescing to facilitate backups
4. improved keyboard, video, and mouse operation
5. graceful shutdown of VMs
6. perfmon integration of virtual machine performance data (for vSphere)
To ensure compatibility and optimal performance, upgrade the VMware Tools
for older virtual machines to the highest versions supported by their ESX/ESXi
hosts.
Observation 9
The following 5 VM(s) are using virtual hardware older than v7:
• RIH39 is using virtual hardware version vmx-04 on host 10.44.10.10 (host version: 4.1.0)
• WS3.rihmfc.com is using virtual hardware version vmx-04 on host 10.44.10.10 (host version: 4.1.0)
Priority: 2
Component: Virtual Machines
Recommendation: Consider using virtual hardware v7 to take advantage of additional capabilities (like VMXNET3, PVSCSI)
Justification:
Virtual hardware version 7 provides numerous additional capabilities such as:
1. vmxnet3 (IPv6 checksum, TSO)
2. PVSCSI
3. Additional capabilities like VMware Fault Tolerance (FT)
Although virtual hardware version 7 can provide additional performance
benefits, it is important to note that virtual machines with virtual hardware v7
cannot be run on ESX/ESXi versions earlier than 4.0. This can limit your
choices for VMotion, DRS, and DPM. Also, virtual machines that are converted
to virtual hardware v7 cannot be reverted back to the earlier version unless you
have taken a backup or created a snapshot of the virtual machine prior to
converting to v7.
References:
Observation 10
The following 5 virtual machine(s) have snapshot(s):
• RIH-SFTP
• RIH4
• RIH7
• RIH6
• RIH42
Priority: 1
Component: Virtual Machines
Recommendation: Limit the use of snapshots, and keep them only for short-term use
Justification:
Snapshots provide a means to allow point-in-time state captures allowing VMs
to have their state reverted to a snapshot for testing and recovery purposes.
Having multiple snapshots results in more disk usage and although SCSI
contention has been significantly improved in VMFS3 and vSphere 4, it is
recommended to limit use of snapshots and use snapshots for short term use.
Snapshots can also prevent certain operations like Storage VMotion.
Observation 11
Connected virtual hardware devices are found on the following 3 VM(s):
• RIH44
• RIH39
• oestester
Priority: 1
Component: Virtual Machines
Recommendation: Allocate only as much virtual hardware as required for each VM. Disable any unused or unnecessary virtual hardware devices.
Justification:
Provisioning a virtual machine with more resources than it requires can, in some
cases, reduce the performance of that virtual machine as well as other virtual
machines sharing the same host. For example, configuring more vCPUs than
required for an application that is single threaded can reduce overall
performance. Also, configuring more memory than required can impact the
other virtual machines on the same host.
In addition to disabling unnecessary virtual devices within the virtual machine,
ensure that no device is connected to a virtual machine if it does not need to be
there. For example, serial and parallel ports are rarely used for virtual machines
in a datacenter environment, and CD/DVD drives are usually connected only
temporarily during software installation.
Disabling any unused or unnecessary virtual hardware devices improves
performance (can reduce device polling), improves security, and reduces
chances of these devices preventing VMotion.
The performance impact of optical drive polling can be reduced by configuring
the VMs to use ISO images instead of physical drives, and can be avoided
entirely by disabling optical drives in the virtual machines when the devices
are not needed.
References:
ESX and Virtual Machines section in Performance Best Practices for VMware
vSphere 4.0 http://www.vmware.com/resources/techresources/10041
4. Appendix A: Audited Inventory
4.1 Host Configurations
Host Configuration 1
Platform Specifications:
• System: HP ProLiant DL380 G7
• CPU: 2 sockets, 8 total cores, Intel(R) Xeon(R) CPU E5640 @ 2.67GHz
• RAM: 192 GB
• HBAs: 1 dual-channel ICH10 4 port SATA IDE Controller, 2 single-channel ISP2532-based 8Gb Fibre Channel to PCI Express HBA, 1 single-channel Smart Array P410i
• NICs: 2 dual-port NC364T PCI Express Quad Port Gigabit Server Adapter, 2 dual-port NC382i Integrated Quad Port PCI Express Gigabit Server Adapter
ESX/ESXi Hosts:
• 10.44.10.10
• 10.44.10.12
• 10.44.10.14
4.2 Networking Configurations
Networking Configuration 1
Virtual Datacenter Name: RIH
Cluster Name: Production
ESX/ESXi Hosts: 10.44.10.10, 10.44.10.12, 10.44.10.14
Switch Name | Total Ports | Available Ports | Port Group | Active NICs/Uplinks | Standby NICs/Uplinks
vSwitch0 | 128 | 97 | 10.44.10.x | vmnic1, vmnic2, vmnic3, vmnic5, vmnic7, vmnic0, vmnic4, vmnic6 |
vSwitch0 | 128 | 97 | VM Network | vmnic1, vmnic2, vmnic3, vmnic5, vmnic7, vmnic0, vmnic4, vmnic6 |
vSwitch0 | 128 | 97 | Service Console 2 | vmnic1, vmnic2, vmnic3, vmnic5, vmnic7, vmnic0, vmnic4, vmnic6 |
vSwitch0 | 128 | 97 | Service Console | vmnic1, vmnic2, vmnic3, vmnic5, vmnic7, vmnic0, vmnic4, vmnic6 |
vSwitch0 | 128 | 97 | VMotion | vmnic1, vmnic2, vmnic3, vmnic5, vmnic7, vmnic0, vmnic4, vmnic6 |
vSwitch1 | 128 | 127 | No Network | |
vSwitch0 | 128 | 102 | VM Network | vmnic0, vmnic1, vmnic2, vmnic3, vmnic5, vmnic7, vmnic4, vmnic6 |
vSwitch0 | 128 | 102 | 10.44.10.x | vmnic0, vmnic1, vmnic2, vmnic3, vmnic5, vmnic7, vmnic4, vmnic6 |
vSwitch0 | 128 | 102 | Service Console 2 | vmnic0, vmnic1, vmnic2, vmnic3, vmnic5, vmnic7, vmnic4, vmnic6 |
vSwitch0 | 128 | 102 | Service Console | vmnic0, vmnic1, vmnic2, vmnic3, vmnic5, vmnic7, vmnic4, vmnic6 |
vSwitch0 | 128 | 102 | VMotion | vmnic0, vmnic1, vmnic2, vmnic3, vmnic5, vmnic7, vmnic4, vmnic6 |
vSwitch1 | 128 | 127 | No Network | |
vSwitch0 | 128 | 103 | 10.44.10.x | vmnic0, vmnic1, vmnic2, vmnic3, vmnic5, vmnic6, vmnic7, vmnic4 |
vSwitch0 | 128 | 103 | VM Network | vmnic0, vmnic1, vmnic2, vmnic3, vmnic5, vmnic6, vmnic7, vmnic4 |
vSwitch0 | 128 | 103 | Service Console 2 | vmnic0, vmnic1, vmnic2, vmnic3, vmnic5, vmnic6, vmnic7, vmnic4 |
vSwitch0 | 128 | 103 | Service Console | vmnic0, vmnic1, vmnic2, vmnic3, vmnic5, vmnic6, vmnic7, vmnic4 |
vSwitch0 | 128 | 103 | VMotion | vmnic0, vmnic1, vmnic2, vmnic3, vmnic5, vmnic6, vmnic7, vmnic4 |
vSwitch1 | 128 | 127 | No Network | |
4.3
Storage
Storage Specifications:

Array: Storage Vendor, Model
Datastore Name | Type | Size (GB) | Free Space (GB) | Comments
DatastoreA1 | VMFS | 500 | 324 |
DatastoreA2 | VMFS | 975 | 345 |
DatastoreA3 | VMFS | 975 | 534 |
DatastoreA4 | VMFS | 975 | 821 |
DatastoreA5 | VMFS | 500 | 445 |
DatastoreA6 | VMFS | 500 | 359 |
DatastoreA8 | VMFS | 950 | 480 |
DatastoreB1 | VMFS | 500 | 499 |
DatastoreB2 | VMFS | 975 | 635 |
DatastoreB3 | VMFS | 975 | 656 |
DatastoreB4 | VMFS | 975 | 688 |
DatastoreB5 | VMFS | 500 | 372 |
DatastoreB6 | VMFS | 500 | 462 |
ESX_LUN_0 | VMFS | 1956 | 860 |
ESX_LUN_1 | VMFS | 1956 | 827 |
LocalStorage 10 | VMFS | 278 | 169 |
LocalStorage 12 | VMFS | 278 | 169 |
LocalStorage 14 | VMFS | 278 | 93 |
4.4
Virtual Datacenter
Datacenter 1:
• Virtual datacenter name: RIH
• Physical datacenter: RIH Providence Production
Cluster | Enabled Features | Hosts Checked | No. of VMs
Production | HA, DRS | 3 | 53
5. Appendix B: Health Check Assessment Checklist
Component | Check (per Best Practice)
Host | Verify equipment was burned in with memory test for at least 72 hours
Host | Verify all host hardware is on the VMware Hardware Compatibility List (HCL)
Host | Verify all host hardware meets minimum supported configuration
Host | Check CPU compatibility for vMotion and FT
Host | Check ESX/ESXi host physical CPU utilization to make sure that it is not saturated or running in a sustained high utilization
Host | Verify all hosts in the cluster are compatible versions of ESX/ESXi
Host | Check ESX/ESXi host active Swap In/Out rate to make sure that it is not consistently greater than 0
Host | Check to make sure that there is sufficient service console memory (max is 800MB)
Host | Verify that ESX service console root file system is not getting full
Host | Check if any 3rd party agents are running in the ESX service console
Host | Verify that NTP is used for time synchronization
Network | Verify that networking is configured consistently across all hosts in a cluster
Network | Check to make sure there is redundancy in networking paths and components to avoid single points of failure (e.g. at least 2 paths to each network)
Network | If HA is being used, check that physical switches that support PortFast (or equivalent) have PortFast enabled
Network | Check that NICs for the same uplink have the same speeds and duplex settings
Network | Check that Management/Service Console, Vmkernel, and VM traffic is separated (physical or logical using VLANs)
Network | Verify that portgroup security settings for ForgedTransmits and MACAddressChanges are set to Reject
Network | Check the virtual switch portgroup failover policy for appropriate active and standby NICs for failover
Network | Verify that VMotion and FT traffic is on at least a 1 Gb network
Network | Check that IP storage traffic is physically separate to prevent sharing network bandwidth
Storage | Verify that VMs are on a shared datastore
Storage | Check that datastores are masked/zoned to the appropriate hosts in a cluster
Storage | Check that datastores are consistently accessible from all hosts in a cluster
Storage | Check that the appropriate storage policy is used for the storage array (MRU, Fixed, RR)
Storage | Check to make sure there is redundancy in storage paths and components to avoid single point of failure (e.g. at least 2 paths to each datastore)
Storage | Check that datastores are not getting full
Virtual Datacenter | Check that all datacenter objects use a consistent naming convention
Virtual Datacenter | Verify that hosts within a cluster maintain a compatible and homogeneous configuration (CPU/mem) to support the required functionality for DRS, DPM, HA, and VMotion
Virtual Datacenter | Check that FT primaries are distributed on multiple hosts since FT logging is asymmetric
Virtual Datacenter | Verify that hosts for FT are FT compatible
Virtual Datacenter | Check that reservations/limits are used selectively on VMs that need them and are not set to extreme values
Virtual Datacenter | Check that vCenter Server is not running other applications and vCenter add-ons (for large environments and heavily loaded vCenter systems) and is sized appropriately
Virtual Datacenter | Check that the DB log setting is Normal unless there is a specific reason to set it to High
Virtual Datacenter | Check that the vCenter statistics level is set to an appropriate level (1 or 2 recommended)
Virtual Datacenter | Check that appropriate vCenter roles, groups, and permissions are being used
VM | Check any VMs with CPU READY over 2000 ms
VM | Check any VMs with sustained high CPU utilization
VM | Check any VMs with incorrect OS type in the VM configuration compared to the guest OS
VM | Check any VMs with multiple vCPUs to make sure the applications are not single threaded
VM | Check the active Swap In/Out rate of VMs to make sure it is not consistently greater than 0
VM | Check that NTP, windows time service, or another timekeeping utility suitable for the OS is used (and not VMware Tools)
VM | Check that VMware Tools are installed, running, and not out of date for running VMs
VM | Check VMs that are configured and enabled with unnecessary virtual hardware devices (floppy, serial, parallel, CDROM) and any devices that prevent VMotion
VM | Check VMs that are not yet on virtual hardware v7
VM | Check VM configuration (memory reservation) for VMs running JVM to consider setting reservation to the size of OS + Java heap
6. Appendix C: References
Item | URL
Documentation
VMTN Technology information | http://www.vmware.com/vcommunity/technology
VMTN Knowledge Base | http://kb.vmware.com
Discussion forums | http://www.vmware.com/community
User groups | http://www.vmware.com/vcommunity/usergroups.html
Online support | http://www.vmware.com/support
Telephone support | http://www.vmware.com/support/phone_support.html
Education Services | http://mylearn.vmware.com/mgrreg/index.cfm
Certification | http://mylearn.vmware.com/portals/certification/
Technical Papers | http://www.vmware.com/vmtn/resources
Network throughput between virtual machines |
Detailed explanation of VMotion considerations |
Time keeping in virtual machines | http://www.vmware.com/vmtn/resources/238
VMFS partitions |
VI3 802.1Q VLAN Solutions | http://www.vmware.com/pdf/esx3_vlan_wp.pdf
VMware Virtual Networking Concepts |
Using EMC Celerra IP Storage (VI3 |
VMware vCenter Update Manager documentation | http://www.vmware.com/support/pubs/vum_pubs.html
Best Practices
Performance Best Practices for VMware vSphere 4.0 |
Recommendations for aligning VMFS partitions |
Performance Troubleshooting for VMware vSphere | http://communities.vmware.com/docs/DOC-10352
Large Page Performance |
VMware vSphere PowerCLI | http://www.vmware.com/support/developer/windowstoolkit/
VI3 security hardening |
VMware HA: Concepts and Best Practices |
Java in Virtual Machine on ESX | http://www.vmware.com/files/pdf/Java_in_Virtual_Machines_on_ESX-FINALJan-15-2009.pdf
CPU scheduler in ESX 4.0 |
Dynamic Storage Provisioning (Thin Provisioning) |
Understanding memory resource management on ESX |
Exhibit 3
VMware Health Check Report
for
Rhode Island Housing
VMware and Rhode Island Housing Confidential
Health Check Report
© 2010 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual
property laws. This product is covered by one or more patents listed at
http://www.vmware.com/download/patents.html.
VMware, VMware vSphere, VMware vCenter, the VMware “boxes” logo and design, Virtual SMP and VMotion are
registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. All other marks
and names mentioned herein may be trademarks of their respective companies.
VMware, Inc
3401 Hillview Ave
Palo Alto, CA 94304
www.vmware.com
Contents
1. Executive Summary
1.1 Report Overview
1.2 Assessment Highlights
1.3 Next Steps
2. Recommended Action Items
3. Health Check Assessment
3.1 Availability/Management
3.2 Performance
4. Appendix A: Audited Inventory
4.1 Host Configurations
4.2 Networking Configurations
4.3 Storage
4.4 Virtual Datacenter
5. Appendix B: Health Check Assessment Checklist
6. Appendix C: References
1. Executive Summary
1.1
Report Overview
This report summarizes activities and findings from a VMware Health Check that was conducted for
Rhode Island Housing. This report contains:
• Recommended changes to configuration and/or usage per VMware best practices that may improve availability/management or performance of VMware components
• Inventory of components analyzed
• Checklist of assessment activities performed
1.2
Assessment Highlights
Analysis Period
• March 2012
Datacenters
• Rhode Island Housing – Staged DR Datacenter located in Providence
Contributing Participants
• Rhode Island Housing: Abdel El idrissi
• Vendor Engineer, VMware Consultant
Summary of Activities
• Performed Standard Health Check Assessment Checklist (see Appendix)
• Gathered system information collected from VMware HealthAnalyzer
• Interviewed participants to discuss priority issues and concerns
• Conducted knowledge transfer to
  o clarify understanding of VMware component requirements and behavior
  o clarify changes to configuration and usage per VMware best practices
• Reviewed documents supplied by Rhode Island Housing
1.3
Next Steps
Rhode Island Housing should review this report and consider the recommended action items. A follow-up
consult and/or Health Check is also advised. If required, VMware, through its Professional Services
Organization or via one of its many partner organizations, is able to assist Rhode Island Housing in
implementing the recommended actions as detailed within this report.
2. Recommended Action Items
Priority | Component | Recommended Action
1 | Network | Configure networking consistently across all hosts in a cluster
1 | Network | Minimize differences in number of active NICs across hosts within a cluster
1 | Network | Avoid mixing NICs with different speeds and duplex settings on the same uplink for a portgroup/dvportgroup
1 | Network | Configure management/service console, Vmkernel, and VM networks such that there is separation of traffic (physical or logical using VLANs)
1 | Network |
1 | Virtual Datacenter | Set up a redundant service console portgroup to use a separate vmnic/uplink, and an alternate isolation response gateway address for more reliability in HA isolation detection. Set up a redundant service console portgroup to use a separate vmnic/uplink on a separate subnet. Specify "isolation address" for the redundant service console (das.isolationaddress2). Increase the failure detection time (das.failuredetectiontime) setting to 20000 milliseconds or greater.
1 | Virtual Datacenter | Avoid making resource pools and VMs siblings in a hierarchy in order to avoid unexpected performance
1 | Virtual Datacenter | appropriate access and authorization to virtual infrastructure. Avoid using Windows built-in groups (Administrators)
1 | Virtual Datacenter |
1 | Virtual Machines | Check to make sure that VMware Tools are installed, running, and not out of date for running VMs
1 | Virtual Machines | Allocate only as much virtual hardware as required for each VM. Disable any unused or unnecessary virtual hardware devices.
2 | Host |
2 | Network | Check to make sure there is redundancy in networking paths and components to avoid single points of failure (e.g. at least 2 paths to each network)
2 | Network | Make sure that VMotion traffic is on at least a 1Gb network
2 | Storage | Size datastores appropriately
2 | Storage | Minimize differences in number of storage paths
2 | Virtual Machines |
2 | Virtual Machines | Select the correct guest OS type in the VM configuration to match the guest OS
2 | Virtual Machines | Consider using virtual hardware v7 to take advantage of additional capabilities (like VMXNET3, PVSCSI)
2 | Virtual Machines |
2 | Virtual Machines |
3 | Network | Configure NICs and physical switch speed and duplex settings consistently. Set to autonegotiation for 1Gb NICs
3 | Network | Distribute vmnics for a portgroup across different PCI busses for greater redundancy
3. Health Check Assessment
3.1
Availability/Management
Item
Comments
Observation 1
Advanced Settings for the following 3 ESX host(s) have been changed from
defaults:
• 10.55.10.14
• 10.55.10.10
• 10.55.10.12
Priority
2
Component
Host
Recommendation
Justification
ESX/ESXi hosts have a default configuration that normally does not need to be
changed. Certain advanced parameter settings can be selectively changed if
they are recommended VMware best practices and documented, or under
direction by VMware support in order to address specific issues. Otherwise the
advanced parameter settings should not be changed, as that may have an
adverse and unpredictable impact on management, availability or performance
of the virtual infrastructure.
If an advanced parameter setting needs to be changed, make sure that the
changes are consistently applied to all applicable hosts within the environment
or cluster. Also, maintain proper change management procedures and
document configuration changes.
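The consistency requirement above lends itself to a mechanical check. The following is a minimal sketch, assuming the advanced settings of each host have already been exported into plain Python dictionaries. The function name, host labels, setting names, and values are all hypothetical placeholders and are not output from VMware HealthAnalyzer.

```python
def find_inconsistent_settings(settings_by_host):
    """Return {setting: {host: value}} for every setting whose value differs.

    A host that does not define a setting is reported with the value None.
    """
    all_names = set()
    for settings in settings_by_host.values():
        all_names.update(settings)

    inconsistent = {}
    for name in sorted(all_names):
        values = {host: settings.get(name)
                  for host, settings in settings_by_host.items()}
        if len(set(values.values())) > 1:
            inconsistent[name] = values
    return inconsistent


# Hypothetical export: host labels, setting names and values are placeholders.
exported = {
    "host-a": {"Example.SettingA": 1, "Example.SettingB": 30},
    "host-b": {"Example.SettingA": 1, "Example.SettingB": 30},
    "host-c": {"Example.SettingA": 1},
}

for name, values in find_inconsistent_settings(exported).items():
    print(name, values)
```

Any setting reported by such a check would then be either aligned across the cluster or recorded as a deliberate, documented exception.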
An example of a valid advanced parameter change is when a redundant service
console portgroup is set up and an alternate isolation response gateway
address needs to be configured for more reliable HA isolation detection (by
modifying the das.isolationaddress2 and das.failuredetectiontime
parameters).
An example of an advanced parameter that should not be changed is
sched.mem.pshare.enable. Setting it to false disables transparent page
sharing, which provides numerous advantages for memory resource management
and should remain enabled.
Item
Comments
Observation 2
The following 1 cluster(s) do not have networking configured consistently
across ESX hosts:
• DR
Priority
1
Component
Network
Recommendation
Configure networking consistently across all hosts in a cluster
Justification
Minimize differences in the network configuration across all hosts in a cluster.
Consistent networking configuration across all hosts in a cluster eases
administration and troubleshooting. Also, since services like VMotion require
portgroups to be named consistently in order to work, it is important
to have a consistent configuration so that DRS and VMotion capabilities are not
disrupted.
Also, use a consistent naming convention for virtual switches, portgroups, and
uplink groups.
Product/Version: vSphere 4
VMware vSphere 4 introduces VMware vNetwork Distributed Switches (vDS)
and Cisco Nexus 1000V distributed switches which reduce administration time
and ensure consistency across the virtual datacenter. Changes to the
distributed virtual portgroup are consistently and automatically applied to all
hosts that are connected to the distributed switch. Check the licensing
requirements in order to determine if distributed switches can be used in the
environment.
Consider using distributed switches if possible.
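The portgroup naming requirement in this observation can be illustrated with a short comparison. This is a sketch only: it assumes the portgroup names per host are already available as lists, and the function name and host labels are hypothetical. The comparison is deliberately case-sensitive, since names that differ only in capitalization are still different names.

```python
def portgroup_differences(portgroups_by_host):
    """Map each host to the portgroup names it lacks relative to the cluster-wide union."""
    union = set().union(*portgroups_by_host.values())
    differences = {}
    for host, names in portgroups_by_host.items():
        missing = union - set(names)
        if missing:
            differences[host] = sorted(missing)
    return differences


# Hypothetical inventory: two hosts whose VMotion portgroup differs only in case.
inventory = {
    "host-a": ["Service Console", "VM Network", "VMotion"],
    "host-b": ["Service Console", "VM Network", "vMotion"],
}

print(portgroup_differences(inventory))
```

An empty result would indicate that every host exposes the same set of portgroup names.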
Item
Comments
Observation 3
The portgroup/vSwitches on the following 3 host(s) have less than 2 uplink
paths:
• 10.55.10.14
• 10.55.10.10
• 10.55.10.12
Priority
2
Component
Network
Recommendation
Check to make sure there is redundancy in networking paths and components
to avoid single points of failure (e.g. at least 2 paths to each network)
Justification
In order to ensure that there is no service disruption, it is important to ensure
that the networking configuration is fault resilient to accommodate networking
path and component failures.
It is recommended that all portgroups and distributed virtual portgroups are
configured with at least two uplink paths using different vmnics. Use NIC
teaming with at least two active NICs or in the case of service
console/management portgroup one in active and at least one in standby. Set
failover policy with the appropriate active and standby NICs for failover.
Connect each physical adapter to different physical switches for an additional
level of redundancy.
Upstream physical network components should also have the necessary
redundancy in order to accommodate physical component failures.
Item
Comments
Observation 4
The vmnics on the following 3 host(s) are not distributed across different PCI
busses:
• 10.55.10.14
• 10.55.10.10
• 10.55.10.12
Priority
3
Component
Network
Recommendation
Distribute vmnics for a portgroup across different PCI busses for greater
redundancy
Justification
Distributing vmnics for a portgroup across different PCI busses provides greater
redundancy from failures related to a particular PCI bus. It is also important to
team vmnics from different PCI busses in order to improve fault resiliency from
component failures.
Item
Comments
Observation 5
Management, VMkernel (storage, VMotion, FT), and VM network traffic are
not separated on the following 3 host(s):
• 10.55.10.14
• 10.55.10.10
• 10.55.10.12
Priority
1
Component
Network
Recommendation
Configure management/service console, Vmkernel, and VM networks such that
there is separation of traffic (physical or logical using VLANs)
Justification
Separate the following traffic:
1. management/service console
2. vmkernel for IP storage
3. vmkernel for VMotion
4. vmkernel for FT
5. Virtual Machine network traffic
Traffic separation improves performance, prevents bottlenecks, and increases
security.
Use physical separation or logical separation using VLANs as appropriate. Note
that physical switch ports should be configured as trunk ports for VLANs.
Item
Comments
Observation 6
The portgroup security settings ForgedTransmit or MacAddressChanges are
not set to Reject on the following 3 host(s):
Host: 10.55.10.14
Priority
1
Component
Network
Recommendation
Justification
It is recommended that both of these options are set to Reject for improved
security.
Setting MACAddressChanges to Reject protects against MAC address
impersonation: ESX/ESXi does not honor requests to change the effective MAC
address to anything other than the initial MAC address.
With ForgedTransmits set to Reject, ESX/ESXi compares the source MAC address
being transmitted by the guest OS with the effective MAC address for its
adapter to see if they match. If the addresses do not match, ESX/ESXi drops
the packet. Packets with impersonated addresses are therefore dropped before
they are delivered, and the guest OS might assume that the packets have been
dropped.
For VMs that require overriding these settings (intrusion detection or MSCS VM),
create a special port group for these (and only these) VMs with the modified
settings.
References:
"Configuring the ESX/ESXi Host" section in VMware Infrastructure 3 Security
Hardening Guide http://www.vmware.com/vmtn/resources/726
Item
Comments
Observation 7
VMotion traffic for the following 3 host(s) is on less than a 1Gb network:
• 10.55.10.14
• 10.55.10.10
• 10.55.10.12
Priority
2
Component
Network
Recommendation
Make sure that VMotion traffic is on at least a 1Gb network
Justification
VMware requires that VMotion traffic be on at least a 1Gb network. Since this
traffic is unencrypted, it is also recommended that VMotion traffic be kept
separate from other traffic.
The VMotion network can be on a separate, isolated, non-routed network segment.
References:
Item
Comments
Observation 8
The following 3 ESX host(s) do not have redundant service console port groups
on distinct vSwitches:
• 10.55.10.14
• 10.55.10.10
• 10.55.10.12
The default HA failure detection time has not been changed for the following 1
Cluster(s):
• DR
No alternate isolation addresses have been specified for the following 1
Cluster(s):
• DR
Priority
1
Component
Virtual Datacenter
Recommendation
Set up a redundant service console portgroup to use a separate vmnic/uplink,
and an alternate isolation response gateway address for more reliability in HA
isolation detection:
• Set up a redundant service console portgroup to use a separate vmnic/uplink on a separate subnet
• Specify "isolation address" for the redundant service console (das.isolationaddress2)
• Increase the failure detection time (das.failuredetectiontime) setting to 20000 milliseconds or greater
Justification
Although NIC teaming is used to account for NIC failures, overall redundancy
for HA heartbeats and isolation response detection can be made more reliable
by setting up a redundant service console on a separate subnet.
Each service console network should have one isolation address it can reach.
When you set up service console redundancy, you must specify an additional
isolation response address for the secondary service console network. VMware
also recommends increasing the failure detection time setting to 20000 ms or
greater.
References:
"Best Practices for VMware HA Clusters" section in vSphere Availability Guide
"VMware HA Best Practices" section in VMware High Availability: Concepts,
Implementation, and Best Practices
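The two HA advanced options named in this observation can be assembled and sanity-checked before they are entered in the cluster settings. The sketch below is illustrative only: the function name is hypothetical, the address 192.0.2.1 is a documentation placeholder rather than a Rhode Island Housing address, and the code does not talk to vCenter.

```python
import ipaddress

MIN_FAILURE_DETECTION_MS = 20000  # minimum value recommended in this report


def build_ha_advanced_options(secondary_isolation_address,
                              failure_detection_ms=MIN_FAILURE_DETECTION_MS):
    """Return the HA advanced options for a redundant service console network."""
    # Raises ValueError if the isolation address is not a valid IP address.
    ipaddress.ip_address(secondary_isolation_address)
    if failure_detection_ms < MIN_FAILURE_DETECTION_MS:
        raise ValueError("failure detection time should be 20000 ms or greater")
    return {
        "das.isolationaddress2": secondary_isolation_address,
        "das.failuredetectiontime": str(failure_detection_ms),
    }


# Placeholder address from the documentation range; substitute a gateway
# reachable from the secondary service console network.
print(build_ha_advanced_options("192.0.2.1"))
```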
Item
Comments
Observation 9
The following 1 Windows default user/group is being used for vCenter
user roles/permissions:
• Administrators (for entity Datacenters)
Priority
1
Component
Virtual Datacenter
Recommendation
appropriate access and authorization to virtual infrastructure. Avoid using
Windows built-in groups (Administrators)
Justification
By default, any user or group who is a member of the local Administrators
group of the Windows Server running vCenter Server will have full
administrative control of vCenter Server (and the virtual infrastructure). This can
allow other system administrators that are not virtual infrastructure
administrators access to the virtual infrastructure.
Use the appropriate vCenter Server roles and assign them to the appropriate
vCenter Administrators AD group to ensure access is limited to virtual
infrastructure administrators.
Before removing users or groups from vCenter Server, make sure that you
create and test access to vCenter Server for the new users and groups.
Reference:
"Managing Users, Groups, Roles, and Permissions" section in vSphere Basic
System Administration http://www.vmware.com/support/pubs
Item
Comments
Observation 10
The firewall settings for the following ESX host(s) have been modified from the
default:
• 10.55.10.14
• 10.55.10.10
• 10.55.10.12
Refer to findings data for more specific details about the settings that have
been modified.
Priority
1
Component
Virtual Datacenter
Recommendation
Justification
The default firewall rules are configured in order to ensure adequate security
while allowing communication with the appropriate virtual infrastructure
components.
Unless required to enable communication for virtual infrastructure services,
avoid changing the firewall rules as that can introduce additional security
issues. It is best to leave the default security firewall settings, which block all
incoming and outgoing traffic that is not associated with an enabled service.
If you do enable a service and open ports, make sure to document the
changes, including the purpose for opening each port and consistently make
the changes on all the appropriate ESX/ESXi hosts.
Avoid changing the default ports unless necessary. These ports are required for
vCenter Server, ESX hosts and other components to communicate with the
virtual infrastructure.
References:
"Configure the Firewall for Maximum Security" section in VMware Infrastructure
3 Security Hardening http://www.vmware.com/resources/techresources/727
TCP and UDP Ports for vCenter Server, ESX hosts, and other network
components management access KB http://kb.vmware.com/kb/012382
Item
Comments
Observation 11
The following 5 VM(s) do not meet some of the VMotion requirements (either
floppy/cd-rom found, VM in internal network, network or datastore not visible to
all ESX in cluster):
• XP
• IT-Test-VM
• RIH39
• RIH33
• RIH177-software
VMotion traffic for the following 3 host(s) is on less than a 1Gb network:
• 10.55.10.14
• 10.55.10.10
• 10.55.10.12
Priority
2
Component
Virtual Machines
Recommendation
Justification
In order to facilitate VMotion of VMs between hosts, the following requirements
must be met:
1. The source and destination hosts must use shared storage and the disks of all VMs must be available on both source and target hosts
2. VM should not be connected to internal networks
3. The portgroup names must be the same on the source and destination hosts (easier with vDS, vNetwork Distributed Switch)
4. VMotion requires a 1Gb network
5. CPU compatibility: source and destination hosts must have compatible CPUs (relaxed for EVC, Enhanced VMotion Compatibility)
6. No devices attached that prevent VMotion (CDROM, floppy, serial/parallel devices)
References:
www.vmware.com/files/pdf/vmotion_info_guide.pdf
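The six requirements listed in this observation can be expressed as a simple pre-flight check. This is a sketch under stated assumptions: the `VmFacts` record, its field names, and the device labels are hypothetical and would have to be populated from an inventory export. It does not query vCenter and is not how VMware validates a migration.

```python
from dataclasses import dataclass, field

# Device labels are illustrative; they mirror the device types named above.
BLOCKING_DEVICES = {"cdrom", "floppy", "serial", "parallel"}


@dataclass
class VmFacts:
    name: str
    on_shared_storage: bool
    on_internal_network: bool
    portgroups: set
    connected_devices: set = field(default_factory=set)


def vmotion_blockers(vm, destination_portgroups, vmotion_link_mbps, cpus_compatible):
    """Return the unmet VMotion requirements for one VM, in list order."""
    blockers = []
    if not vm.on_shared_storage:
        blockers.append("disks are not on shared storage")
    if vm.on_internal_network:
        blockers.append("connected to an internal network")
    missing = vm.portgroups - set(destination_portgroups)
    if missing:
        blockers.append("portgroups missing on destination: " + ", ".join(sorted(missing)))
    if vmotion_link_mbps < 1000:
        blockers.append("VMotion network is slower than 1Gb")
    if not cpus_compatible:
        blockers.append("source and destination CPUs are not compatible")
    attached = vm.connected_devices & BLOCKING_DEVICES
    if attached:
        blockers.append("connected devices: " + ", ".join(sorted(attached)))
    return blockers


# Hypothetical VM with a connected CD-ROM on a 100 Mbps VMotion link.
example = VmFacts("example-vm", True, False, {"VM Network"}, {"cdrom"})
print(vmotion_blockers(example, {"VM Network"}, 100, True))
```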
Item
Comments
Observation 12
Copy/paste is not recommended and is enabled on the following 4 VM(s):
• XP
• RIH232
• IT-Test-VM
• HP Management
Priority
2
Component
Virtual Machines
Recommendation
Justification
When VMware Tools runs in a virtual machine, by default you can copy and
paste between the guest operating system and the computer where the remote
console is running. As soon as the console window gains focus, non-privileged
users and processes running in the virtual machine can access the clipboard of
the virtual machine console. If a user copies sensitive information to the
clipboard before using the console, the user perhaps unknowingly exposes
sensitive data to the virtual machine. It is recommended that you disable copy
and paste for the guest operating system by creating the parameters shown in
the following table.
Note that this does not disable copy and paste for the users when they access
the virtual machine through other means like Terminal Services. This disables
copy and paste only from the virtual machine console, e.g. when using the
console from the Virtual Infrastructure client.
Configuration Settings to Disable Copy and Paste
Name | Value
isolation.tools.copy.disable | true
isolation.tools.paste.disable | true
isolation.tools.setGUIOptions.enable | false
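As an illustration, the three settings above can be rendered as configuration lines. The sketch assumes the parameters are stored as `name = "value"` entries in the virtual machine's configuration (.vmx) file; the helper name is hypothetical, and the same values can instead be entered as advanced configuration parameters in the client.

```python
# Names and values are taken from the settings listed above.
COPY_PASTE_SETTINGS = {
    "isolation.tools.copy.disable": "true",
    "isolation.tools.paste.disable": "true",
    "isolation.tools.setGUIOptions.enable": "false",
}


def to_vmx_lines(settings):
    """Render settings as name = "value" lines (assumed .vmx entry format)."""
    return [f'{name} = "{value}"' for name, value in settings.items()]


for line in to_vmx_lines(COPY_PASTE_SETTINGS):
    print(line)
```

Changes of this kind are normally made while the virtual machine is powered off, and should be documented like any other configuration change.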
References:
3.2
Performance
Item
Comments
Observation 1
The following 3 ESX host(s) have Gbps network adapter(s) that are not set to
AutoNegotiate:
• 10.55.10.14
• 10.55.10.10
• 10.55.10.12
Priority
3
Component
Network
Recommendation
Configure NICs and physical switch speed and duplex settings consistently. Set
to autonegotiation for 1Gb NICs
Justification
Incorrect network speed and duplex settings can impact performance. The
network adapter (vmnic) and physical switch settings need to be checked and
set correctly. If your physical switch is configured for a specific speed and
duplex setting, you must force the network driver to use the same speed and
duplex setting. For Gigabit links, network settings should be set to
autonegotiate and not forced.
Setting network adapter speed and duplex settings can be done from the
vSphere Client, although a reboot is required for changes to take effect.
Reference:
Item
Comments
Observation 2
The following 1 cluster(s) do not have portgroups configured consistently (either
name or active NIC total speeds) across ESX hosts:
• DR
Priority
1
Component
Network
Recommendation
Minimize differences in number of active NICs across hosts within a cluster
Justification
Having a variance in the number of active NICs across hosts within a cluster
can lead to inconsistent network performance as VMs are migrated to other
hosts within a cluster.
Hosts that have fewer NIC ports than others might have network bottlenecks,
but this might not be obvious if you assume that all hosts have the same
number of active NIC ports available.
Item
Comments
Observation 3
The following 3 host(s) have mixed NIC speed and duplex settings on a
portgroup/dvportgroup uplink:
Host: 10.55.10.14
• vSwitch0
• vSwitch0
• vSwitch0
Priority
1
Component
Network
Recommendation
Avoid mixing NICs with different speeds and duplex settings on the same uplink
for a portgroup/dvportgroup
Justification
Having a portgroup/dvportgroup mapped to multiple vmnics at different speeds
is not a best practice because, depending on the traffic load balancing
algorithm, the network speed of the traffic can be arbitrarily and randomly
determined and the result can be undesirable. For example, suppose there are
several VMs all connected to a single vSwitch with two outbound adapters, one
at 100Mbps and one at a gigabit. Some VMs would be luckier than others
depending on how their traffic is routed. A best practice is to ensure the speed
is more predictable and deliberately chosen.
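The mixed-speed condition described in this justification is straightforward to detect once uplink details are available. The following sketch assumes a hypothetical inventory structure (portgroup name mapped to a list of uplink records with `speed_mbps` and `duplex` fields); the sample values are invented and do not reflect the audited hosts.

```python
def mixed_speed_uplinks(uplinks_by_portgroup):
    """Return {portgroup: distinct (speed_mbps, duplex) pairs} where uplinks differ."""
    mixed = {}
    for portgroup, uplinks in uplinks_by_portgroup.items():
        distinct = sorted({(u["speed_mbps"], u["duplex"]) for u in uplinks})
        if len(distinct) > 1:
            mixed[portgroup] = distinct
    return mixed


# Hypothetical sample mirroring the example above: one 100Mbps and one gigabit uplink.
sample = {
    "VM Network": [
        {"nic": "vmnic0", "speed_mbps": 1000, "duplex": "full"},
        {"nic": "vmnic1", "speed_mbps": 100, "duplex": "full"},
    ],
    "VMotion": [
        {"nic": "vmnic2", "speed_mbps": 1000, "duplex": "full"},
        {"nic": "vmnic3", "speed_mbps": 1000, "duplex": "full"},
    ],
}

print(mixed_speed_uplinks(sample))
```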
Item
Comments
Observation 4
The following 1 datastore(s) have too many VMs in them:
• PlaceholderVMs
Priority
2
Component
Storage
Recommendation
Size datastores appropriately
Justification
Use consistent LUN sizes, and create one datastore per LUN. Consider the
time it takes to restore a LUN in the event of a disk failure when choosing a
LUN size. There are restrictions on the maximum LUN size in vSphere so refer
to the Configuration Maximums document.
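The restore-time consideration above can be made concrete with simple arithmetic. The sketch below assumes a steady restore throughput, which is a simplification. The 100 MB/s figure and the LUN sizes are hypothetical and are not measurements from this environment.

```python
def restore_hours(lun_size_gb, restore_mb_per_s):
    """Estimate hours needed to restore a full LUN at a steady throughput."""
    if restore_mb_per_s <= 0:
        raise ValueError("restore throughput must be positive")
    seconds = (lun_size_gb * 1024) / restore_mb_per_s
    return seconds / 3600


# Hypothetical throughput of 100 MB/s; LUN sizes are illustrative only.
for size_gb in (250, 500, 1000):
    print(size_gb, "GB ->", round(restore_hours(size_gb, 100), 2), "hours")
```

Under this assumption restore time grows linearly with LUN size, so doubling the LUN size doubles the outage window for a full restore.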
References:
Configuration Maximums http://www.vmware.com/support/pubs
Item
Comments
Observation 5
The following 1 Resource Pool(s) have both virtual machines and child
resource pools as siblings in the hierarchy:

Resources (Owner: DR )
Priority
1
Component
Virtual Datacenter
Recommendation
Avoid making resource pools and VMs siblings in a hierarchy in order to avoid
unexpected performance
Justification
Resource pools help improve manageability and grouping and partitioning of
CPU and memory resources. We recommend, however, that resource pools
and VMs not be made siblings in a hierarchy. Instead, each level should contain
only resource pools or only VMs. This is because by default resource pools are
assigned share values that might not compare appropriately with those
assigned to VMs, potentially resulting in unexpected performance.
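The effect of sibling share values can be illustrated with a short calculation. The share values used here (4000 for the resource pool and 1000 for the standalone VM) and the count of 10 VMs inside the pool are assumptions chosen for illustration and should be checked against the product documentation. The sketch also assumes full contention and equal shares for the VMs inside the pool.

```python
def share_fractions(shares_by_name):
    """Split a contended resource among siblings in proportion to shares."""
    total = sum(shares_by_name.values())
    return {name: shares / total for name, shares in shares_by_name.items()}


# Assumed share values for two siblings at the same level of the hierarchy.
siblings = {"resource_pool": 4000, "standalone_vm": 1000}
fractions = share_fractions(siblings)

# Assumed: 10 VMs with equal shares inside the resource pool.
vms_in_pool = 10
per_pool_vm = fractions["resource_pool"] / vms_in_pool

print(fractions)
print(per_pool_vm)
```

With these assumed values, the single sibling VM is entitled to a larger fraction than any one VM inside the pool, which is the kind of unexpected result the recommendation is intended to avoid.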
References:
Item
Comments
Observation 6

Update Manager DR 10.55.10.31

XP
Priority
2
Component
Virtual Machines
Recommendation
Justification
For best performance, use the vmxnet3 paravirtualized network adapter for
operating systems in which it is supported. Note that this requires that the
virtual machine use virtual hardware version 7 and that VMware Tools be
installed in the guest OS.
If vmxnet3 is not supported by the guest OS, use Enhanced vmxnet (vmxnet2).
Both vmxnet3 and Enhanced vmxnet support jumbo frames.
If Enhanced vmxnet is not supported in the guest OS, then use the Flexible
device type, which automatically converts each vlance network device to
vmxnet device if VMware Tools is installed.
Refer to the KB in the references and the product documentation for supported
guest OS for the particular adapter.
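The selection order described above can be sketched as a small decision function. The function name and parameters are hypothetical. Guest OS support for each adapter must still be confirmed from the KB and product documentation.

```python
def choose_adapter(supports_vmxnet3, supports_enhanced_vmxnet,
                   hw_version=7, tools_installed=True):
    """Pick a virtual NIC type following the order given in the justification."""
    # vmxnet3 needs guest support, virtual hardware version 7, and VMware Tools.
    if supports_vmxnet3 and hw_version >= 7 and tools_installed:
        return "vmxnet3"
    # Otherwise fall back to Enhanced vmxnet (vmxnet2) where supported.
    if supports_enhanced_vmxnet:
        return "Enhanced vmxnet"
    # Last resort: the Flexible device type.
    return "Flexible"


print(choose_adapter(True, True))
print(choose_adapter(True, True, hw_version=4))
print(choose_adapter(False, False))
```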
References:
Choosing a network adapter for your virtual machine KB
Item
Comments
Observation 7
The following 2 virtual machines have a different OS installed than the one
configured:
VM Name: VMware HealthAnalyzer
Configured OS: rhel5Guest
Installed OS: other26xLinuxGuest
VM Name: Update Manager DR 10.55.10.31
Configured OS: windows7_64Guest
Installed OS: windows7Server64Guest
Priority
2
Component
Virtual Machines
Recommendation
Select the correct guest OS type in the VM configuration to match the guest OS
Justification
Selecting the guest OS type determines the
1. optimal monitor mode to use
2. default optimal devices for the guest OS (such as SCSI controller and
network adapter)
3. appropriate VMware Tools to be installed in the guest OS
Thus, it is important to make sure that the guest OS type matches the OS
installed in the VM to improve the performance and manageability of the VM.
Note that changing the guest OS type can only be performed when the VM is
powered off.
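A check of this kind reduces to comparing two identifiers per VM. The sketch below uses the two VMs reported in this observation plus one hypothetical matching VM for contrast. The record layout is an assumption made for illustration.

```python
def os_mismatches(vms):
    """Return names of VMs whose configured OS differs from the installed OS."""
    return [vm["name"] for vm in vms
            if vm["configured_os"] != vm["installed_os"]]


vms = [
    {"name": "VMware HealthAnalyzer",
     "configured_os": "rhel5Guest", "installed_os": "other26xLinuxGuest"},
    {"name": "Update Manager DR 10.55.10.31",
     "configured_os": "windows7_64Guest",
     "installed_os": "windows7Server64Guest"},
    # Hypothetical VM with a matching configuration, added for contrast.
    {"name": "example-vm",
     "configured_os": "rhel5Guest", "installed_os": "rhel5Guest"},
]
print(os_mismatches(vms))
```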
Item
Comments
Observation 8
The following 5 VM(s) have VMware Tools not installed, not up to date, or not
running:
RIH232 (tools status: guestToolsNotInstalled)
IT-Test-VM (tools status: guestToolsNotInstalled)
HP Management (tools status: guestToolsNotInstalled)
test VM (tools status: guestToolsNotInstalled)
Update Manager 10.44.10.31 (tools status: guestToolsNotInstalled)
Priority
1
Component
Virtual Machines
Recommendation
Check to make sure that VMware Tools are installed, running, and not out of
date for running VMs
Justification
Install VMware Tools in all guests that have supported VMware Tools available.
VMware Tools optimize the guests to make them run better inside virtual
machines by providing
1. optimized virtual NIC and storage drivers
2. efficient memory management using the balloon driver
3. driver to assist with file system quiescing to facilitate backups
4. improved keyboard, video, and mouse operation
5. graceful shutdown of VMs
6. perfmon integration of virtual machine performance data (for vSphere)
To ensure compatibility and optimal performance, upgrade the VMware Tools
for older virtual machines to the highest versions supported by their ESX/ESXi
hosts.
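As a rough sketch, VMs needing attention could be filtered by their reported tools status. The status string guestToolsNotInstalled is taken from this observation. The value guestToolsCurrent used for a healthy VM, and the VM named example-vm, are assumptions added for illustration.

```python
# Assumed status value for a VM whose VMware Tools are installed and current.
HEALTHY_STATUS = "guestToolsCurrent"


def vms_needing_tools_attention(status_by_vm):
    """Return VM names whose tools status is anything other than current."""
    return sorted(name for name, status in status_by_vm.items()
                  if status != HEALTHY_STATUS)


status_by_vm = {
    "RIH232": "guestToolsNotInstalled",
    "IT-Test-VM": "guestToolsNotInstalled",
    # Hypothetical VM with current tools, added for contrast.
    "example-vm": "guestToolsCurrent",
}
print(vms_needing_tools_attention(status_by_vm))
```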
Item
Comments
Observation 9
The following 1 VM(s) are using virtual hardware older than v7:

XP is using virtual hardware version vmx-04 on host 10.55.10.14 (host
version: 4.1.0)
Priority
2
Component
Virtual Machines
Recommendation
Consider using virtual hardware v7 to take advantage of additional capabilities
(like VMXNET3, PVSCSI)
Justification
Virtual hardware version 7 provides numerous additional capabilities such as:
1. vmxnet3 (IPv6 checksum, TSO)
2. PVSCSI
3. Additional capabilities like VMware Fault Tolerance (FT)
Although virtual hardware version 7 can provide additional performance
benefits, it is important to note that virtual machines with virtual hardware v7
cannot be run on ESX/ESXi versions earlier than 4.0. This can limit your
choices for VMotion, DRS, and DPM. Also, virtual machines that are converted
to virtual hardware v7 cannot be reverted back to the earlier version unless you
have taken a backup or created a snapshot of the virtual machine prior to
converting to v7.
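The version constraint above can be expressed as a small check. The sketch parses a label in the form shown in this observation (vmx-04) and flags a VM as an upgrade candidate only when its host is at version 4.0 or later. The function names are hypothetical, and the check does not replace taking a backup or snapshot before converting.

```python
def hardware_version(label):
    """Parse a label such as 'vmx-04' into the integer 4."""
    return int(label.split("-", 1)[1])


def is_v7_candidate(label, host_version):
    """True when the VM is below v7 and its host is ESX/ESXi 4.0 or later."""
    host_major = int(host_version.split(".")[0])
    return hardware_version(label) < 7 and host_major >= 4


# Values for the VM named XP come from the observation above.
print(is_v7_candidate("vmx-04", "4.1.0"))
```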
References:
Item
Comments
Observation 10
Connected virtual hardware devices are found on the following 5 VM(s):
XP
IT-Test-VM
RIH39
RIH33
RIH177-software
Priority
1
Component
Virtual Machines
Recommendation
Allocate only as much virtual hardware as required for each VM. Disable any
unused or unnecessary virtual hardware devices.
Justification
Provisioning a virtual machine with more resources than it requires can, in some
cases, reduce the performance of that virtual machine as well as other virtual
machines sharing the same host. For example, configuring more vCPUs than
required for an application that is single threaded can reduce overall
performance. Also, configuring more memory than required can impact the
other virtual machines on the same host.
In addition to disabling unnecessary virtual devices within the virtual machine,
ensure that no device is connected to a virtual machine if it does not need to be
there. For example, serial and parallel ports are rarely used for virtual machines
in a datacenter environment, and CD/DVD drives are usually connected only
temporarily during software installation.
Disabling any unused or unnecessary virtual hardware devices improves
performance (can reduce device polling), improves security, and reduces
chances of these devices preventing VMotion.
Virtual machine performance can also be improved by configuring the VMs to
use ISO images instead of physical drives. The overhead of optical drives can
be avoided entirely by disabling them in the virtual machines when the devices
are not needed.
References:
ESX and Virtual Machines section in Performance Best Practices for VMware
vSphere 4.0 http://www.vmware.com/resources/techresources/10041
4. Appendix A: Audited Inventory
4.1 Host Configurations
Host Configuration 1
Platform Specifications:
System: HP ProLiant DL380 G7
CPU: 2 sockets, 8 total cores, Intel(R) Xeon(R) CPU E5640 @ 2.67GHz
RAM: 192 GB
HBAs: 1 dual-channel ICH10 4 port SATA IDE Controller, 2 single-channel ISP2532-based 8Gb
Fibre Channel to PCI Express HBA, 1 single-channel Smart Array P410i
NICs: 2 dual-port NC364T PCI Express Quad Port Gigabit Server Adapter, 2 dual-port NC382i
Integrated Quad Port PCI Express Gigabit Server Adapter
ESX/ESXi Hosts:
10.55.10.10
10.55.10.12
10.55.10.14
4.2 Networking Configurations
Networking Configuration 1
Virtual Datacenter Name: RIH
Cluster Name: DR
ESX/ESXi Hosts: 10.55.10.10, 10.55.10.12, 10.55.10.14
Switch Name | Total Ports | Available Ports | Port Group | Active NICs/Uplinks | Standby NICs/Uplinks
vSwitch0 | 128 | 117 | 10.44.10.x | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 117 | Service Console 2 | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 117 | Service Console | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 117 | VMotion | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 117 | VM Network | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch1 | 128 | 127 | No Network | |
vSwitch0 | 128 | 116 | 10.44.10.x | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 116 | Service Console 2 | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 116 | Service Console | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 116 | VMotion | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 116 | VM Network | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch1 | 128 | 127 | No Network | |
vSwitch0 | 128 | 116 | 10.44.10.x | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 116 | Service Console 2 | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 116 | Service Console | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 116 | VMotion | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch0 | 128 | 116 | VM Network | vmnic0, vmnic1, vmnic2, vmnic3, vmnic4, vmnic5, vmnic7 |
vSwitch1 | 128 | 127 | No Network | |
4.3 Storage
Storage Specifications:
Array: Storage Vendor, Model
Datastore Name | Type | Size (GB) | Free Space (GB) | Comments
DRStorage | VMFS | 500 | 385 |
LocalStorage 10 | VMFS | 278 | 269 |
LocalStorage 12 | VMFS | 278 | 269 |
LocalStorage 14 | VMFS | 278 | 269 |
PlaceholderVMs | VMFS | 24 | 23 |
4.4 Virtual Datacenter
Datacenter 1:
Virtual datacenter name: RIH
Physical datacenter: RIH DR Staged Environment in Providence
Cluster: DR
Enabled Features: HA, DRS
Hosts Checked:
No. of VMs: 40
5. Appendix B: Health Check Assessment Checklist
Component
Host: Verify equipment was burned in with memory test for at least 72 hours
Host: Verify all host hardware is on the VMware Hardware Compatibility List (HCL)
Host: Verify all host hardware meets minimum supported configuration
Host: Check CPU compatibility for vMotion and FT
Host: Check ESX/ESXi host physical CPU utilization to make sure that it is not saturated or running in a sustained high utilization
Host: Verify all hosts in the cluster are compatible versions of ESX/ESXi
Host: Check ESX/ESXi host active Swap In/Out rate to make sure that it is not consistently greater than 0
Host: Check to make sure that there is sufficient service console memory (max is 800MB)
Host: Verify that ESX service console root file system is not getting full
Host: Check if any 3rd party agents are running in the ESX service console
Host: Verify that NTP is used for time synchronization
Network: Verify that networking is configured consistently across all hosts in a cluster
Network: Check to make sure there is redundancy in networking paths and components to avoid single points of failure (e.g. at least 2 paths to each network)
Network: If HA is being used, check that physical switches that support PortFast (or equivalent) have PortFast enabled
Network: Check that NICs for the same uplink have the same speed and duplex settings
Network: Check that Management/Service Console, Vmkernel, and VM traffic is separated (physical or logical using VLANs)
Network: Verify that portgroup security settings for ForgedTransmits and MACAddressChanges are set to Reject
Network: Check the virtual switch portgroup failover policy for appropriate active and standby NICs for failover
Network: Verify that VMotion and FT traffic is on at least a 1 Gb network
Network: Check that IP storage traffic is physically separate to prevent sharing network bandwidth
Storage: Verify that VMs are on a shared datastore
Storage: Check that datastores are masked/zoned to the appropriate hosts in a cluster
Storage: Check that datastores are consistently accessible from all hosts in a cluster
Storage: Check that the appropriate storage policy is used for the storage array (MRU, Fixed, RR)
Storage: Check to make sure there is redundancy in storage paths and components to avoid single point of failure (e.g. at least 2 paths to each datastore)
Storage: Check that datastores are not getting full
Virtual Datacenter: Check that all datacenter objects use a consistent naming convention
Virtual Datacenter: Verify that hosts within a cluster maintain a compatible and homogeneous (CPU/mem) configuration to support the required functionality for DRS, DPM, HA, and VMotion
Virtual Datacenter: Check that FT primaries are distributed on multiple hosts since FT logging is asymmetric
Virtual Datacenter: Verify that hosts for FT are FT compatible
Virtual Datacenter: Check that reservations/limits are used selectively on VMs that need them and are not set to extreme values
Virtual Datacenter: Check that vCenter Server is not running other applications and vCenter add-ons (for large environments and heavily loaded vCenter systems) and is sized appropriately
Virtual Datacenter: Check that the DB log setting is Normal unless there is a specific reason to set it to High
Virtual Datacenter: Check that the vCenter statistics level is set to an appropriate level (1 or 2 recommended)
Virtual Datacenter: Check that appropriate vCenter roles, groups, and permissions are being used
VM: Check any VMs with CPU READY over 2000 ms
VM: Check any VMs with sustained high CPU utilization
VM: Check any VMs with incorrect OS type in the VM configuration compared to the guest OS
VM: Check any VMs with multiple vCPUs to make sure the applications are not single threaded
VM: Check the active Swap In/Out rate of VMs to make sure it is not consistently greater than 0
VM: Check that NTP, Windows Time service, or another timekeeping utility suitable for the OS is used (and not VMware Tools)
VM: Check that VMware Tools are installed, running, and not out of date for running VMs
VM: Check VMs that are configured and enabled with unnecessary virtual hardware devices (floppy, serial, parallel, CDROM) and any devices that prevent VMotion
VM: Check VMs that are not yet on virtual hardware v7
VM: Check VM configuration (memory reservation) for VMs running JVM to consider setting reservation to the size of OS + Java heap
6. Appendix C: References
Item | URL
Documentation
VMTN Technology information | http://www.vmware.com/vcommunity/technology
VMTN Knowledge Base | http://kb.vmware.com
Discussion forums | http://www.vmware.com/community
User groups | http://www.vmware.com/vcommunity/usergroups.html
Online support | http://www.vmware.com/support
Telephone support | http://www.vmware.com/support/phone_support.html
Education Services | http://mylearn.vmware.com/mgrreg/index.cfm
Certification | http://mylearn.vmware.com/portals/certification/
Technical Papers | http://www.vmware.com/vmtn/resources
Network throughput between virtual machines
Detailed explanation of VMotion considerations
Time keeping in virtual machines
VMFS partitions
VI3 802.1Q VLAN Solutions | http://www.vmware.com/pdf/esx3_vlan_wp.pdf
VMware Virtual Networking Concepts
Using EMC Celerra IP Storage (VI3
documentation
Best Practices
Performance Best Practices for VMware vSphere 4.0
Recommendations for aligning VMFS partitions
Performance Troubleshooting for VMware vSphere
http://www.vmware.com/support/pubs/vum_pubs.html
http://communities.vmware.com/docs/DOC-10352
Large Page Performance
VMware vSphere PowerCLI | http://www.vmware.com/support/developer/windowstoolkit/
VI3 security hardening
VMware HA: Concepts and Best Practices
Java in Virtual Machine on ESX | http://www.vmware.com/files/pdf/Java_in_Virtual_Machines_on_ESX-FINALJan-15-2009.pdf
CPU scheduler in ESX 4.0
Dynamic Storage Provisioning (Thin Provisioning)
Understanding memory resource management on ESX
Exhibit 4
Rhode Island Housing – DataCenter Disaster Recovery Inventory
[Diagram: rack inventory for the Production site (Providence, RI) and the Disaster Recovery site (Springfield, Mass). Recoverable labels follow.]
Production, Providence, RI:
QTY 3 – HP Proliant DL 380 G7 – 8 X CPU – 2 sockets – 196GB RAM
QTY 2 – Cisco 3570 – 24-port Ethernet Switching
Juniper SSG in HA, SecureWorks Managed
QTY 2 – Cisco 9124MDS Fiber Storage Switching
NetApp FAS 2040 w/ Dual Controllers – QTY 12 600GB SAS in head
NetApp DS4243 24-disk Shelf – QTY 24 – 600GB SAS
Disaster Recovery, Springfield, Mass:
QTY 3 – HP Proliant DL 380 G7 – 8 X CPU – 2 sockets – 196GB RAM
QTY 2 – Cisco 3570 – 24-port Ethernet Switching
Cisco ASA 5510, SecureWorks Managed
QTY 2 – Cisco 9124MDS Fiber Storage Switching
NetApp FAS 2040 w/ Dual Controllers – QTY 12 1TB SATA in head
NetApp DS4243 24-disk Shelf – QTY 24 – 1TB SATA
* 2nd for HA is optionally procured by RIH
Device face labels shown in the diagram: HP ProLiant DL380 G7, Catalyst 3750 SERIES, MDS 9124 MULTILAYER FABRIC SWITCH (DS-C9124-K9), SSG320M, CISCO ASA 5510 series, FAS2040, DS4243
Rhode Island Housing – DataCenter Disaster Recovery
Physical Design
[Diagram: physical cabling at the Disaster Recovery site, Springfield, Mass. Recoverable labels follow.]
Cisco ASA 5510, SecureWorks Managed
Oshean 100mb P-to-P, Providence (RIH Providence Production DC)
QTY 4 – FC Per ESX Host – FC HBA PCI
QTY 4 Ethernet Per ESX Host – PCI Onboard
QTY 4 Ethernet Per ESX Host PCIE Ethernet
Fiber SAN (DS-C9124-K9)
QTY 2 – FC Per Controller
QTY 2 Ethernet Per Controller – Management Interfaces
Note – diagram is for demonstration but vendor must have their own detailed design to work from
Cisco 3570 and 9124mds diagrammed here – it is important that cabling is disparately patched for redundancy
Rhode Island Housing – Disaster Recovery Logical Design
[Diagram: logical design with VMWare, Networking, and Storage columns. Recoverable labels follow.]
Vswitch1 – LAN, 10.55.10.0/24 (HP - Onboard, HP - PCIE)
Vswitch2 – VmKernel, New Vlan (HP - Onboard, HP - PCIE)
Vswitch1 – Manage, New Vlan (HP - Onboard, HP - PCIE)
HP – FC HBA
FC Alua igroup – qty 4 paths
NetApp - SATA
Vmfs - datastore1 (vmdk1, vmdk2, vmdk3)
Rhode Island Housing – Disaster Recovery Logical Design
Site-to-Site Replication
[Diagram: site-to-site replication between Production and Disaster Recovery. Recoverable labels follow.]
Vmware:
Production: VCenter, SRM, Juniper (SSG320M), (3) HP Proliant – ESXi 5.0
Disaster Recovery: DRVCenter, SRM, Cisco ASA, (3) HP Proliant – ESXi 5.0
Link: Oshean 100mb P-to-P Fiber Dual Path
Netapp:
Production: FAS2040, QTY 12 600GB SAS in head, DS4243 with QTY 24 – 600GB SAS, Juniper (SSG320M)
Disaster Recovery: FAS2040, QTY 12 1TB SATA in head, DS4243 with QTY 24 – 1TB SATA, Cisco ASA
Link: Oshean 100mb P-to-P Fiber Dual Path
SnapMirror License Replication
FlexClone License – DR Test
