Matt Rey, MBCP, MBCI Resiliency Best Practices InfraGard Meeting – May 2013
Resiliency Best Practices
InfraGard Meeting – May 2013
Matt Rey, MBCP, MBCI
[email protected]

Goals
• Review eBay Inc. Resiliency's approach to BCM, mixed in with some best practices
• Discussion

A Family of Brands Enabling Commerce

eBay Inc. Resiliency Mission
Mission Statement
The mission of the eBay Enterprise Resiliency Team is to provide a holistic management process by which businesses plan for continuing critical operations in order to:
• Ensure the availability of eBay.com and adjacent business unit sites (e.g., PayPal.com, mobile.de)
• Identify potential threats or likelihood of impact to operations and plan / respond accordingly
• Enable an effective response that safeguards the interests of our key stakeholders, reputation and value-creating activities
• Enable executives to manage operations under adverse conditions via appropriate resilience strategies, recovery objectives, and operational risk management considerations
• Foster a culture of resiliency and preparedness throughout the enterprise

BCM Program Alignment
eBay's Shared Behaviors
• Be the Customer
• Simplify & Clarify
• Debate, Decide & Deliver
• Be Open, Honest & Direct
• Do the Right Thing
Industry Standards & Best Practices
• Disaster Recovery Institute Int'l (DRII) – Professional Practices
• Business Continuity Institute (BCI) – Good Practice Guidelines (GPG)
• Int'l Organization for Standardization (ISO) – ISO 22301 Business Continuity Management Systems
Identification of Regulatory Guidelines & Requirements
• Domestic regulatory standards and requirements (e.g. FFIEC, SEC)
• Regulatory requirements for businesses wherever there is a presence
  – Depends on the type of business activities
  – Can sometimes extend to other aspects/locations of the organization

BCM Program Organizational Structure
eBay Inc. Resiliency
Dedicated Team (~20)
- Program Governance, Oversight, and Maturity
- Business Engagement & Program Implementation Support
- Training & Awareness
- Toolset Management
- Management Reporting
Business Unit Roles
- Business Executive Sponsor(s)
- …
- Crisis Management Support Roles
- Business Continuity Coordinator
- Disaster Recovery Coordinator
- Business Continuity Planners
- Disaster Recovery Planners
- …
Shared & Dedicated Roles
- Program Implementation
- Plan Documentation
- Strategy Development
- Plan Testing
- Training & Awareness

Dedicated Team Structure
eBay Inc. Resiliency
- PMO, Training
- Safety & Security
- Crisis Management
- Business Continuity
- [IT] Disaster Recovery
- Program Tools
- Training & Awareness
- Testing

Business Role Structure
To support the varying size and complexity of the business units within scope, a structure specific to the BCM program has been developed to organize planning efforts and management reporting.
- Business Unit / Vertical Executive Sponsor
- eBay Product Development
- Business Unit / Vertical Coordinators
- Business Area / Brand Executive Sponsor
- Business Area / Brand Coordinators
- Department / Team Planners
- Mobile App Dev. - QA

Planning Lifecycle
The initial engagement of business units follows a general flow that begins with identifying and training program coordinator roles. Once the roles have been identified and training has been provided, a timeline for program implementation and planning activities can be developed and reviewed by the various stakeholders for approval.
Role Identification & Training
• Identify BC Planning roles
• Conduct training sessions
• Finalize planning timeline
Risk Assessment & Business Impact Analysis
• Identification of Potential Impacts
• Threat Ranking
• Business Unit Criticality Ranking
• Dependency (e.g. Technology) Criticality Ranking
Business Continuity Planning
Gap Analysis & Mitigation Planning
• Department/Team-Level Planning
• Recovery Resource Requirements
• Recovery Strategies and Procedures
Plan Testing
• Plan Testing Along Maturity Model
• Action Item Tracking & Follow-up
• BC Plan & Strategy Improvement

Business Recovery Strategies
Documenting a Business Continuity Plan (BCP) is great, but is the strategy viable?
Existing Capabilities
Recovery Strategy
Where's the Beef!
Conduct a gap analysis to compare existing capabilities with what is actually required to leverage the identified recovery strategies. This stage in the process usually identifies larger issues and projects that the business unit must undertake in order to enable its departments and functions to continue or recover critical activities. Examples include:
• Enabling team members with remote working capabilities
• Ensuring technology capabilities meet business needs (e.g. availability, data backup)
• Pre-identification of recovery locations for critical areas such as Customer Service and distribution centers
• Pre-coordination of process-diversion strategies for teams with a presence in multiple locations

Planning Lifecycle

Crisis Management
To support the varying size and complexity of the eBay locations around the world, a layered crisis management structure has been developed to organize response efforts based on the needs of the event.
EVENT ESCALATION
- Corporate & Executive Team: Corporate & Executive Management, Senior Corporate Functions (Security, Facilities, etc.)
- Regional Teams: Regional Business Unit Executive Management, Senior Regional Corporate Functions (Security, Facilities, etc.)
- Site-based Teams: Local Business Unit Management, Local Site Support (Security, Facilities, etc.)
Technology Disaster Recovery
Website Disaster Recovery Best Practices
• Geographically diverse redundancy planning based on website criticality and revenue
• Defined escalation and support structures
• Recovery plan documentation (both system- and data center-based)
• Testing along maturity path
Non-website Disaster Recovery Best Practices
• Standards and policies to meet business needs
• Accountability on technology owners and business units
• Alignment of system and application recovery capabilities with business unit requirements (BIA/BCP)
• System dependency analysis
• Recovery plan documentation
• Testing along maturity path
Participation
- Technology management
- Monitoring and support teams
- Subject Matter Experts (SMEs)
- Vendors (as determined by organizational dependencies)
- Alternate Work Force (AWF)

Testing
Policy & Standards
• Define a testing policy that meets the needs of the organization.
• Define the testing frequency taking into consideration plan types, plan criticality, and exercise maturity.
• Ensure [all] policies and standards are reviewed and approved by the appropriate corporate officers.
• Ensure [all] policies and standards are available to the organization via multiple channels.
Testing Goals & Best Practices
• All critical plans should be tested annually according to a testing maturity path.
• All key staff members should participate to become familiar with plans and strategies.
• Validation of plans and strategies.
• Appropriate documentation of issues during testing, and follow-up thereafter.
Participation
- Teams and key staff members identified in the targeted plans
- Support roles as required (technology, security, facilities, etc.)
- Local authorities

Training & Awareness
Primary Goals
• Engage staff at all levels of the organization in the BCM program.
• Foster a culture of resiliency throughout the organization.
• Train coordinators and other key roles throughout the program.
Example Approaches
• Training
  - In-person meetings
  - Campus-wide sessions
  - Online training sessions
  - eLearning modules
  - Industry training opportunities
• Awareness Campaigns
  - BCM Awareness Week
  - Ad-hoc campaigns to promote the program or specific components
  - Desk-drops, posters, flyers
  - Intranet site with training content
  - Validation exercises
• Testing
  - Testing is the best training!
  - Tabletops, semi-functional exercises, functional exercises, etc.

BCM Program Toolset
The right tools to enable the organization, to enable success.
Core Program Tools
• Notification Tool – Enabling communication with staff, with check-in capabilities.
• Planning Tool – A flexible planning tool to capture business plans and other documentation.
• Crisis Management Tool – Crisis/incident management software enabling team members and coordinators to collaborate during an incident (e.g. mapping, chat, document sharing, team status, logging). Sometimes combined with the Planning Tool.
• Data Feeds – Fueling the program toolset, data feeds are critical to supplying the foundational data used for planning and analysis within the program (e.g. facility information, people information, system details).
• Coordinator Email DLs – Enabling quick communication with business coordinators.
• Satellite Phones – Distributed to executives and key roles to enable communication.
• Travel Registry – Logging of business travel to assist in the identification of potentially impacted staff.
Extending the Footprint of the Organization
• Remote Access – Enabling staff to work from anywhere in the world with a working internet connection.
• Collaboration Tools – Audio and video conferencing to enable virtual collaboration during events.
• Instant Messaging – Instant messaging to enable communication.
• Unified Communications – Extending presence and availability to external devices and networks (e.g. telephony solutions, mobile apps).
Outside Programs Worth Consideration
• GETS/WPS – Distributed to executives and key roles to enable communication.
• CEAS – Enabling access to restricted areas during events for certain metropolitan areas (e.g. NY, MA).

Incident Response (High Level)
Situational Awareness (immediate)
• Data gathering on potentially impacted locations and staff using program tools.
• Continuous event monitoring for updates on details, arrests, lock-downs, etc.
• Create a log in the crisis management tool to begin crisis management and collaboration activities.
Establishing Communication with Staff
• Confirming the well-being of all staff based in the area as well as travelers in the area.
• Identifying where local (and travelling) staff are located during and after the event, and providing shelter-in-place as required.
• Offering assistance to anyone impacted by the event (travel needs, counseling, etc.).
Establishing Communication with Management & Stakeholders
• Communicating with local management to further identify any employee impact, as well as the immediate plan of action for office operations.
• Communicating with corporate management to provide status on the well-being of staff and status of office operations.
• Determining a schedule for further communications and meetings to coordinate response activities.
Situational Awareness (longer term)
• Communication with local management to understand any changes in impacts to staff and business operations.
• Ensuring critical functions have strategies to continue workload while the office is closed (as necessary).

Thank You!
Questions?