Corporate Overview - Data Center World
DR07 - Maintaining Alignment between Data Center Investment Decisions and Disaster Recovery Planning
Mike Andrea
Director, The Strategic Directions Group Pty Ltd
Data Center Institute, Board of Directors
AFCOM Data Center World, Nashville, USA, October 2012

• Strategic ICT management services
• Data center design and development
• Telecommunications and networking
• Project office services
ICT Master Planners and Strategists
www.strategicdirections.com.au

AFCOM Data Center World – Nashville 2012
• With significant focus on data center energy consumption, PUE, and rising energy costs, many companies are reconsidering their approach to their data center investments.
• This leads to organisations investigating various options including upgrades/refurbishment, colocation, and cloud services, among others.
• This presentation will discuss the importance of maintaining strategic business priority alignment with the critical nature and function of the data center in disaster recovery planning.

Agenda
• Introduction
• What are Strategic Business Priorities?
• How does this impact Data Centers?
• What is creating change?
  – Natural disasters review
• Why align DR Planning?
• Strategies to Consider
• Summary & Questions

Introduction
• Do you
  – know what your organisation's core business is?
  – understand how the organisation stays in business?
  – know what the minimum operational status or capability is for the organisation … before it's gone?
• Are you aware of
  – The strategic priorities that might force core business change?
  – How the business makes and spends money and what is important to the business in a disaster?
    • Production / manufacturing
    • People, health and safety
    • Public and emergency services

Introduction
• Do you
  – Know how to recover from small issues to total loss disasters?
  – Know who is in charge should disaster strike?
  – Know where disaster / emergency management will be performed?
  – Know which staff might be impacted by different scale disasters, and your ability to continue to manage, operate and recover from a disaster?

What are Strategic Business Priorities?
[Diagram labels: Objectives, Core Business]
Core Business is industry based:
• Legal
• Financial
• Manufacturing
• Construction
• Government
• Health / Education
• Primary Production
• Resources / Mining
• ICT
We:
• Deliver legal services
• Are a bank / insurance company
• Build widgets
• Build buildings and infrastructure
• Deliver public services
• Deliver health services
• Deliver education services
• Grow things
• Get resources out of the ground and produce commodities
• Deliver IT, communications and data center services

What are Strategic Business Priorities?
[Diagram labels: Objectives, Core Business, Strategic Business Priorities, Priorities]
Strategic Business Priorities:
• Core Business delivery (volume, services, products)
• Hours of Operation
  – No unplanned down time
• Meet production deadlines
• Meet customer expectations
• Comply with regulatory, legal and contractual obligations
• Make a profit / improve productivity
• Deliver a growth target of x%
• Achieve a customer satisfaction rating > y%
• Achieve a customer retention target of z%
• Enter new markets
• Achieve a market ranking of #1 provider / product / service

What are Strategic Business Priorities?
[Diagram labels: Business Continuity, Operational Mode, Business as Usual, Disaster Recovery]
Business Continuity
• Ongoing, continuous delivery of the core business outcome
• Two operational modes:
  – Business as Usual (normal workday functionality, maintenance works, including typical change activities)
  – Disaster Recovery (delivery of core business functions during a disaster, following a disaster, and recovering from a disaster back to Business as Usual)
• Is your business:
  – required to continue delivering core business outcomes during a disaster, or
  – able to cease core business functions until the disaster has passed or been resolved?

How does this impact Data Centers?
• Enable and/or support the Corporate Business Continuity Plan
  – The business must be capable of delivering required outcomes during a disaster
• Data Center Strategy
  – If the Primary Data Center fails, can the alternative (secondary) facility support Core Business outcomes … for how long?
  – PUE vs Uptime… which is more important… or can they both be achieved through careful facility design or selection?
• Location
  – Facilities supporting Emergency Services should be located in areas with a high likelihood of continued operation and access in the event of significant natural disasters
• Tier Rating
  – Expecting long-term Tier IV uptime from a Tier I data center is not realistic
  – Fire management solution (detection and suppression)
  – Concurrent maintainability and fault tolerance?

How does this impact Data Centers?
• Operational Plan
  – Staff training, skills retention, roster, location and knowledge levels
  – Efficiency, DCIM, performance targets
• Lifecycle Management
  – Upgrades, replacements and changing energy profiles
• Investment Plan
  – When should expansion and major maintenance works be scheduled?
  – What is the ROI and expected life of the facility?
• Maintenance Plan
  – What are the maintenance windows for lengthy maintenance activities?
  – Will maintenance impact, impede or stop Core Business delivery?
  – Do maintenance windows create time zone / time of day issues for distributed national or global business services?

How does this impact Data Centers?
• Security Model
  – Logical and physical security must align to the category of system and data being supported
  – 24x7 security staff
  – CCTV Surveillance, digital recording and playback, and archived footage
• Telecommunications interconnectivity
  – Multi-carrier enabled?
  – Redundant links between primary and secondary data centers
• Change Management planning, communication, documentation and rollback strategies

What is Creating Change?
[Diagram labels: Business, Technical, Reasons, Options, Plan and Action, Risks]
Reasons - Business:
• Commercial / Financial
• Core Business
  – Acquisition
  – Divesting capability
  – Expansion / contraction
• Legal / Regulatory / Jurisdiction
• Property
• Funding
• Privacy
• Security
• Market change
• Customer demand
• Competitive pressure
• Shareholder Interest
• Recent Disaster
• Insurance Policies
Reasons - Technical:
• Technology
• Lifecycle (facility, IT, infrastructure)
• Density
• Consolidation, Virtualisation, Cloud
• Efficiency – PUE
• Data Retention Policies
• Security
• Capacity
• Space / Floor Loading
• Performance
• Skillsets
• Telecommunications cost, capability and capacity
• Support and maintenance
• Age of the data center
• Energy costs (and carbon taxation)

Change: Risks
Risks:
• Mean Time To Recover
• Ability to meet Strategic Priorities
• Environmental / Nature / Climate Change
  – flood, fire, earthquake, snow storm, drought
• Impact to Core Business
• Energy Costs
• Skillsets and Knowledge
• Down time / Tier Rating
• Landlord (owner of the building)
• Loss of Reputation
• Bottom Line Costs
• Utility supply (impacted by drought)
• Site / Location
• Cyber Security
• Neighbours
• Fire and Total Site Loss

Risks: Natural Disasters
• Generally speaking:
  – Flood
  – Fire
  – Earthquake
  – Volcanic Eruption
  – Cyclone / Hurricane / Tornado
  – Tsunami
  – Tidal Surge
• But, also includes:
  – Drought
  – Heat Wave
  – Dust Storm and Snow Storm

Risks: Natural Disasters
• Flood / Tsunami / Tidal Surge:
  – Water damage, silt, mud, salt, corrosion
  – Water flow / force of water and debris
  – Undermining foundations
  – Electrical safety issues
• Fire / Volcanic Activity:
  – Direct fire damage
  – Burning embers and hot ash (source of new fires)
  – Smoke & ash entry into the DC (free air economiser & VESDA risk)
  – Water from fire suppression & forced entry by fire fighters
• Earthquake / land subsidence
  – Direct building damage
  – Services damage to cables, water & diesel tanks, lifts, etc

Natural Disasters: Direct
• Cyclone / Hurricane / Typhoon / Tornado:
  – Wind and projectiles
  – Water damage, silt, mud, salt, corrosion, debris
  – Power, water and sewerage system impacts
• Drought
  – Risk to site water supplies and power generation (power station)
  – Generally longer summers (non economiser cycles)
• Heat Wave
  – Excessively hot days extending the design parameters of the cooling systems
• Dust Storm
  – Direct dust ingress into the data center and possibly engines
  – Clogs filters and dampers
  – Fresh air intakes clogged and/or turned off

Natural Disasters: Indirect
• Flood issues:
  – Debris in flooded rivers poses a significant threat to bridges
    • Many bridges damaged (unsafe) and/or washed away
    • Impact on transport corridors significant (road and rail)
    • Some communities (supplies and staff) isolated
  – Heavy / large debris in rivers can damage major cross-city bridges (impact on diesel supplies, transport and staff accessing office and data center sites)
  – The scale of the January 2011 flooding across Australia shows the degree of geographic impact can extend over 2000 km (> 1200 miles)

Natural Disasters: Indirect
• Flood issues:
  – Coal mines (the size of Sydney Harbour) flooded
    • no exports, and potential to impact power generation
  – Interstate telecommunications links cut by flash flooding
  – Basements in buildings flooded – damaged electrical transformers and switchboards
    • Mud and silt left in electrical equipment
    • Some buildings' electrical systems so badly damaged that the buildings could not be occupied for months after the event
    • Lift wells flooded (needed to be drained)
    • Buildings not permitted to be occupied until electrical re-certification
  – Office BCP/DR Sites
    • All were full and are contracted on a first in / first served basis
    • In some cases the DR office couldn't fit all affected staff
    • Staff told to work from home had no power at home
    • Remote access connection points (dial-in and Internet VPN) lost power

Natural Disasters: Indirect
• Flood issues:
  – Some communities can remain isolated (require food and fuel drops) for weeks after the event
  – Risk to electrical systems can mean
    • High-rise buildings are evacuated
    • Power distribution turned off for whole city blocks at transformer level
    • Critical Emergency Management data centers came within 8 in of being shut down at the main switchboard
      – Didn't matter that they had diesel generators
      – Emergency Services (not the power company) might have the final say on whether a building is shut down
  – Sewerage systems directly affected and/or turned off
  – Storm water drains flowed in the opposite direction from rivers

Natural Disasters: Indirect
• Flood issues:
  – ICT DR and Business Continuity Plans failed
    • In some cases systems deemed 'non critical' (and located in a single site) were discovered to be mission critical (business impact was significant)
    • BCP had been written but not tested… due to risks…
    • Some businesses had two data centers – but both were in the same city and both were affected by floods (turned off)
    • Some businesses believe BCP is the same as DR
• Cyclone / Tornado / Storm issues
  – Hospitals evacuated for safety reasons
  – Some data centers are on the same grid as hospitals, believing they are safe from power cuts
  – Insurance cost increases and/or coverage

Natural Disasters: Indirect
• Indirect issues:
  – Commercial
    • Government funds redirected to flood repair and economic recovery
    • Some commercial entities had IT budgets reduced to focus on core business recovery
    • Quite a number of IT and data center projects postponed due to reallocation of funds
  – Business drivers have changed to risk awareness and BCP/DR
  – Looting and other illegal activities
  – For those businesses not directly affected, the level of apathy about how safe they are has risen
    • It won't happen to us… we were fine in 2011

Natural Disasters: Risk Occurrence
[Figure: Billion Dollar Weather / Climate Disasters 1980 - 2011. Source: National Climatic Data Center, NOAA, USA (2012)]

Natural Disasters: Risk Occurrence
• Since 1980, 114 billion-dollar weather and climate disasters in the U.S.
• Three more disasters approaching $1B
• Total losses since 1980 of billion-dollar disasters exceed $800 billion.
Source: National Climatic Data Center, NOAA, USA (25 April 2012)

Natural Disasters: Risk Occurrence
[Figure: 14 x Billion-Dollar Weather and Climate Disasters in 2011]

Natural Disasters: USA 2012
2012 USA Drought – Direct Impact to Mains Grid Power Generation
http://www.ibtimes.com/partnernet/finance/drought-could-cause-massive-blackouts-across-the-country_9738.htm
In July U.S. nuclear-power production hit its lowest seasonal levels in nine years as drought and heat forced nuclear power plants from Ohio to Vermont to slow output. Nuclear Regulatory Commission spokesman David McIntyre explained, “Heat is the main issue, because if the river is getting warmer the water going into the plant is warmer and makes it harder to cool. If the water gets too warm, you have to dial back production,” McIntyre said.
“That’s for reactor safety, and also to regulate the temperature of discharge water, which affects aquatic life.”
Nuclear is the thirstiest power source. According to the National Energy Technology Laboratory (NETL) in Morgantown, West Virginia, the average NPP that generates 12.2 million megawatt hours of electricity requires far more water to cool its turbines than other power plants. NPPs need 2725 liters of water per megawatt hour for cooling. Coal or natural gas plants need, on average, only 1890 and 719 liters respectively to produce the same amount of energy.

Align DR Planning
[Diagram labels: Core Business, Objectives, Strategic Business Priorities, Priorities, Business Continuity, Operational Mode, Business as Usual, Disaster Recovery, Reasons, Risks, Change, Options, Plan and Action, Test and Document]

Align DR Planning
Strategic Business Priorities
• Transform the Business: Innovation & New Markets
• Grow the Business: Profits and Market Share
• Business as Usual: Keep the Lights On
Consider the following example:
• Banking and Finance Sector
• The DR Plan for financial services utilising Bank Tellers
  versus
• The DR Plan for financial services delivering Internet Banking
• The DR Plan for Automatic Teller Machine (ATM) services
What is the difference for the:
• ICT and Telecommunications Strategy
• Data Center Strategy
[Diagram labels: Normal Operations; Disaster Recovery; Disaster Operations; Business Continuity; Support; Interface; Business Services / Apps; Data Center; Access; Telco / Internet]

Align DR Planning
Alignment decisions include
• Data Center Strategy and Topology (primary-primary, primary-secondary, primary-DR, other…)
• Multi-carrier and Internet connectivity solution
• Time of Day impact (week days, 24x7, time zones)
  – Support capabilities and availability (staff and suppliers)
• Customer reach and service availability expectations
• Customer notifications and communication
• Maintenance windows and Tier ratings
• End-to-End Service Availability and Architecture
  – Single corded equipment in a Tier III data center?
• Own, lease, colocation, cloud, outsource, other

Align DR Planning
• Change directly impacts the Data Center, its operations and assets
• Continually re-align the BCP and DR Plan because you don't know when disaster will strike
• Know where and how to access and apply the BCP and DR Plan during a disaster
• Ensure only the latest version of the Plan is distributed and used (archive / remove previous editions)

Align DR Planning
• Maintain separate offsite records to enable quick retrieval of
  – Insurance Policies
  – Asset Lists (by site)
  – Critical contact names and numbers (e.g. fuel suppliers)
• Schedule a re-testing of the BCP and DR Plan following significant changes
• Test emergency fuel supplies and travel times via routes that are unlikely to be impacted in a natural disaster or major traffic incident

Align DR Planning
• Don't forget “Recovery”
  – Key suppliers and spares… where are you placed on the supplier's priority list should a significant event impact the city?
  – Can you rebuild the ICT architecture and systems from design documentation that references previous equipment models and platforms that are no longer available?
  – Where will your staff/team be located to manage a rebuild should a Total Loss occur… can they actually get there?
  – Would Strategic Priorities change the solution should a Total Loss occur?
    • Could your insurance company dictate a direction or new set of priorities?
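
Worked example: end-to-end service availability
The alignment decision about end-to-end service availability (single corded equipment in a Tier III data center) can be illustrated with a little arithmetic. The sketch below assumes components in series multiply their availabilities, and that a redundant pair fails only when both members fail independently. All figures are illustrative assumptions rather than values from this presentation: 99.982% is a commonly quoted Tier III figure, and the device and carrier numbers are hypothetical.

```python
HOURS_PER_YEAR = 8760


def series_availability(components):
    """Availability of a chain in which every component must be up."""
    total = 1.0
    for availability in components.values():
        total *= availability
    return total


def redundant_pair(availability):
    """Availability of two independent, identical components in parallel."""
    return 1.0 - (1.0 - availability) ** 2


def annual_downtime_hours(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical service chain; every figure here is an assumption.
    single_corded = {
        "facility (Tier III, commonly quoted)": 0.99982,
        "single corded server power path": 0.999,
        "carrier link": 0.9995,
    }
    # Same chain with the power path and the carrier link duplicated.
    dual_path = {
        "facility (Tier III, commonly quoted)": 0.99982,
        "dual corded server power path": redundant_pair(0.999),
        "two carrier links": redundant_pair(0.9995),
    }
    for name, chain in (("single corded", single_corded), ("dual path", dual_path)):
        a = series_availability(chain)
        print(f"{name}: {a:.5%} available, "
              f"about {annual_downtime_hours(a):.1f} hours down per year")
```

The point of the comparison is that the weakest series element, not the facility's Tier rating, tends to set the service availability the business actually experiences.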
Align DR Planning
Examples of misalignment
• Locating your primary data center and secondary data center in the same city, adjacent to the same river… in noted flood plains (to reduce travel time between the sites for support staff)
• Locating your primary data center in the same colocation facility that your Cloud Service provider is using to deliver your redundant, secondary or backup solution
• Keeping your Insurance Policy records, Asset Lists, BCP and DR Plans in your primary data center with no alternative, web based access to the information
• Delaying / stopping data center maintenance works because downtime will impact core business delivery in a 24/7 business
• Insurance companies placing their primary data center and inbound call center in a flood plain
• Emergency Services routing all communications links to the Disaster Response center through a primary data center located in a flood plain
• Loading data center capacity and space to 100%

Strategies to Consider
• Review and understand the Strategic Business Priorities
  – They change… at least annually
• Engage with the Risk and Change Managers
• Discuss and review recovery options from small issues to total site loss events
  – Would the site be rebuilt, or would management use the opportunity to outsource, use colo or move to the cloud?
  – Don't forget people issues
  – Know what to do and when
• Formally review and get business buy-in to the maximum allowable downtime during and following a disaster

Strategies to Consider
• Ensure the data center facilities support business continuity plans and enable disaster recovery efforts for the business
• Make the data center manager a member of the Disaster Management and Response Team
• Reviews of the data center might result in consideration of:
  – Upgrade / refurbishment
  – New facility
  – Outsource / colocation
  – Cloud solution

Strategies to Consider
• Remember: Location, Location, Location
  – Flood plains are just that
  – Flooding from one event can cover thousands of miles
  – Cyclones and Tornadoes don't follow the same path
  – Volcanic ash changes with the wind
  – Know the impact on your business insurance policies
• Review separation of
  – Data Centers… take into account 'area of impact'
  – Carrier Exchanges and Telco Links and Paths
  – Key offices, call centers and remote access points
• Know how to supply / support your site(s) with
  – Clean water, diesel, staff, spares, and partner / resources
• Review and align Insurance Policies with Strategic Priorities and DR Plans

Strategies to Consider
• Formulate a Disaster Management Plan that targets Natural Disasters
  – Plan for 'what if' scenarios
  – Know where and what systems might need to be moved, turned off, relocated, secured
  – Mimic a disaster and test the plan
• BCP is not the same as DR
  – Business Continuity during 'Business as Usual' as against Disaster Recovery
  – Test your BCP Plan for each Operational Mode
• Test your DR Strategy… regularly
• Simulate different impact events
• Simulate an event during a maintenance activity

Summary
• Strategic Business Priorities
• Impact on the Data Center
• Risks and Natural Disasters
• Business Continuity is constant
• Change is ongoing
• Align Data Center Investment with DR Planning
• Strategies to Consider
[Diagram labels: Core Business, Objectives, Strategic Business Priorities, Priorities, Business Continuity, Operational Mode, Business as Usual, Disaster Recovery, Reasons, Risks, Change, Options, Plan and Action, Test and Document]

Questions?

Thank you
Mike Andrea
Director, The Strategic Directions Group and Data Center Institute USA, Board of Directors
Email: [email protected]
Mobile: +61 410 551 080
Web: www.strategicdirections.com.au
www.afcom.com/dci_about.html
Copyright © 2012 The Strategic Directions Group Pty Ltd. All rights reserved.