Reduce Costly Downtime Through DR Testing

Improve the accuracy of your DRP and your team’s ability to efficiently execute recovery procedures through regular DR testing.

  • Customers, regulators, and your executive team are demanding that you test your DRP, but resources are scarce.
  • Most DR tests are focused solely on the technology and not the DR management process – which is where most plans fail.
  • Over 60% of organizations that are testing do not document the results, so they fail to properly show evidence of testing and incorporate lessons learned into their DRP.

Our Advice

Critical Insight

  • Be proactive – plan an annual test cycle that enables you to identify and coordinate resources well in advance.
  • Don’t focus on one test. Plan a series of tests from walkthroughs to functional tests to validate both the DR process and technical capabilities.
  • If you treat DR testing as a pass/fail exercise, you aren’t meeting the end goal of improving your DRP. Focus on identifying gaps and risks before a real disaster hits.

Impact and Result

  • Create an effective DR test plan by following a structured process to discover current capabilities and defining test procedures for the entire range of testing methodologies. This includes:
    • Defining current readiness through a comprehensive action items list, proficiency assessment, and needs analysis.
    • Creating comprehensive test documentation that will support the test facilitator through both passive and active testing.
    • Implementing a thorough review program that will incorporate learning points from testing into everyday operations.

Reduce Costly Downtime Through DR Testing Research & Tools

1. Determine DR testing readiness and scope

Establish the current testing maturity of the organization.

2. Create a project charter to build a test plan

Construct a project charter that guarantees executive buy-in for the project.

4. Translate lessons learned into improving overall preparedness

Create a process for continuous improvement in DR capabilities and be ready for future disasters.


Member Testimonials

After each Info-Tech experience, we ask our members to quantify the real-time savings, monetary impact, and project improvements our research helped them achieve. See our top member experiences for this blueprint and what our clients have to say.

Client | Experience | Impact | $ Saved | Days Saved
YHA New Zealand | Guided Implementation | 10/10 | $10,000 | 10
Catholic Health System | Guided Implementation | 9/10 | $12,399 | 10
Catholic Health System | Guided Implementation | 10/10 | $30,999 | 20
Children's Hospital Colorado | Guided Implementation | 10/10 | $61,999 | 20
National Bonds Corporation | Guided Implementation | 9/10 | $25,000 | 55


Workshop: Reduce Costly Downtime Through DR Testing

Workshops offer an easy way to accelerate your project. If you are unable to do the project yourself, and a Guided Implementation isn't enough, we offer low-cost delivery of our project workshops. We take you through every phase of your project and ensure that you have a roadmap in place to complete your project successfully.

Module 1: Determine Your DR Testing Readiness and Scope

The Purpose

  • Identify current testing readiness based on current documentation and infrastructure, as well as by cross-referencing against a list of necessary action items.
  • Determine current DR proficiency and need for DR testing. 

Key Benefits Achieved

  • Define current testing preparedness and determine the scope of testing that your organization is currently capable of.
  • Determine a high-level testing strategy and outlook based on readiness, proficiency, and need.

Activities and Outputs

1.1 Review current testing practices

1.2 Assess current capabilities for all system tiers

1.3 Assess current proficiency and need
  • Output: Defined current testing capabilities and likelihood of success

1.4 Discuss testing strategy
  • Output: Analysis of best-fit testing strategy

Module 2: Create a Project Charter to Build a Test Plan

The Purpose

  • Identify roles and responsibilities for building the test plan.
  • Define project parameters and milestones.

Key Benefits Achieved

  • Create project clarity with a project charter that outlines the objectives, resource requirements, and target milestone dates.

Activities and Outputs

2.1 Complete roles and responsibilities documentation

2.2 Establish project parameters and milestones
  • Output: Completed project charter with management sign-off

Module 3: Create the DR Test Plan

The Purpose

  • Plan and document the entire testing cycle.
  • Create all the necessary documentation that is needed before testing can commence.
  • Identify resource requirements to execute the DR test.

Key Benefits Achieved

  • Identify which tests are included in the test cycle.
  • Define the roles and responsibilities of each test participant for the test cycle.
  • Create a repeatable process that can be leveraged on an ongoing basis for DR testing. 

Activities and Outputs

3.1 Complete the Test Plan Summary
  • Output: Complete test schedule and prioritized list of systems to include in testing

3.2 Construct the Passive Testing Handbook
  • Output: Established methodology for passive testing

3.3 Construct the Active Testing Handbook
  • Output: Established methodology for active testing

Module 4: Translate Lessons Learned Into Improving Overall Preparedness

The Purpose

  • Demonstrate growth in DR capabilities through DR testing to the management team.
  • Establish process for continual improvement of the DR process.
  • Incorporate DR testing mindset into operational decision making. 

Key Benefits Achieved

  • Formulation of a clear connection between improved DR capabilities and confidence in recoverability.
  • Consistently updated and validated DRP.
  • Competitive advantage when attracting customers who demand an effective DRP. 

Activities and Outputs

4.1 Creation of the DR Test Plan Results Summary Presentation
  • Output: Completed executive presentation deck

4.2 Review of current readiness
  • Output: Indication of capability growth

4.3 Review of all test plans
  • Output: Updated planning documentation

Follow Info-Tech’s DR Test Planning and Execution Workflow to create a comprehensive test plan

DR Test Planning and Execution Workflow – Phases and Tools

Phase 1: Determine Testing Readiness

1. DR Test Plan Storyboard (Review Planning Process)

2. Readiness Assessment

Phase 2: Create Project Charter

3. Project Charter

Phase 3: Create a Test Plan

4. Test Plan Summary

5. System Status Worksheet

6. Passive Testing Handbook

7. Active Testing Handbook

8. System Test Plans

Phase 4: Maintain Your Test Plan

9. Issue Log and Analysis Tool

10. Active Testing Participant Evaluation Survey

11. Summary of Test Results

Call your account manager to schedule a Guided Implementation

The image is a workflow chart, titled Test Plan Development and Execution High-Level Workflow.

This development workflow corresponds with the tools that are provided in this blueprint.

Validate your DR effectiveness through a DR test plan; know you can recover rather than thinking you can recover

This Research Is Designed For:

  • Senior IT Management responsible for executing disaster recovery testing.
  • Organizations seeking to formalize, optimize, or validate an existing DRP.
  • Organizations needing to validate and prove their DR capabilities to third parties.

This Research Will Help You:

  • Create a DR test plan that will validate your DR process from end-to-end.
  • Capture key learning points of each test and present DR capability improvement to management.
  • Mitigate potential testing issues and risks.

This Research Will Also Assist:

  • Executives seeking to understand the time and resource commitment required for disaster recovery testing.
  • Members of business continuity management and crisis management teams who need to incorporate testing elements into their own recovery processes.

This Research Will Help Them:

  • Understand the role of DR testing in improving overall DR capabilities.
  • Scope the time and resources required to develop a DR test plan.

Executive summary

Situation

  • Recent natural disasters such as Hurricane Sandy have increased executive awareness and internal pressure to validate the effectiveness of the DRP.
  • Similarly, industry and government-driven regulations and customers are demanding that organizations provide evidence of recoverability before the organization is given the right to do business.

Complication

  • Documentation both before and during testing is limited and often ad-hoc, which significantly reduces the effectiveness of DR tests.
  • Lack of engagement and buy-in from test participants results in testing dates being pushed back and often forgotten.
  • Organizations without a DR test plan are also far less able to recover from a disaster than organizations with a comprehensive DR test plan.

Resolution

  • Create an effective DR test plan by following a structured process to discover current capabilities and defining test procedures for the entire range of testing methodologies. This includes:
    • Defining current readiness through a comprehensive action items list, proficiency assessment, and needs analysis.
    • Creating comprehensive test documentation that will support the test facilitator through both passive and active testing.
    • Implementing a thorough review program that will incorporate learning points from testing into everyday operations.

Info-Tech Insight

  1. Using a DR test cycle will optimize DR test effectiveness, because using a progressive approach will allow value to transfer from one test to the next.
  2. The goal of testing is to uncover gaps and issues so that they are eliminated before a real disaster occurs. Focus on improving capabilities rather than worrying about whether you passed or failed.
  3. Budget size does not determine DR effectiveness; consistent testing and maintenance are the only way to truly prepare yourself against potential disasters.

Three ways to complete this project: Do-It-Yourself, Guided Implementations, or Onsite Workshop

Best-Practice Toolkit: Download and customize Info-Tech’s tools and templates to develop your project deliverables.

Use this do-it-yourself Best-Practice Toolkit to help you complete this project. The slides in this Blueprint will walk you step-by-step through every phase of your project with supporting tools and templates ready for you to use.

Guided Implementations: Speak to an Info-Tech subject matter expert for advice throughout the project.

Arrange to speak to an Info-Tech expert at key milestones to ensure maximum project value.

  • Watch for this icon at key opportunities to speak with an Info-Tech analyst for additional insight and advice.
  • Call 1-888-670-8889 or email GuidedImplementations@InfoTech.com.
Onsite Workshop: Accelerate your project with an onsite, expert Info-Tech facilitator to run a workshop for you.

To inquire about or request a workshop:

  • Call 1-888-670-8889, contact your account representative (www.infotech.com/account) or email Workshops@InfoTech.com for more information.
  • Your account representative and workshop coordinator will follow up to help determine the cost, timing, and other details of the workshop.

Understand the value of effective DR testing

Sections:

  • Introduction
  • Project Phases
  • Summary

What's in this Section:

  • DR testing impact on ability to minimize downtime
  • Blueprint and guided implementations overview

The cost of downtime increases exponentially if there are delays in recovery

DR testing can greatly reduce recovery times, which in turn minimizes the business impact. Leverage these statistics to establish economic benefit and build a strong business case.

The image shows a line graph with Loss on the Y-axis and Time on the X-axis. At the start of the graph, the line is low and is labelled Incident Occurs. At the end of the graph, the line is high and is labelled All Revenue Lost.

Delay in recovery causes exponential revenue loss

Potential Lost Revenue

The graph above illustrates a typical revenue loss curve during a system outage. The initial business impact is small; however, as the recovery time increases, the impact on revenue will increase exponentially until all revenue is lost. The goal of successful DR is to be able to recover during that initial time period where costs have yet to escalate. DR testing allows your organization to be more confident in your DRP by ensuring its relevancy and discovering DR issues before a real disaster. A robust testing strategy will reduce the possibility of a lengthy recovery process and thus mitigate unnecessary downtime costs. (Adapted from: Rothstein, Philip Jan. Disaster Recovery Testing Exercising Your Contingency Plan [2007 Edition])
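The shape of the curve described above can be sketched in a few lines of Python. This is a minimal illustration only: the exponential form follows the description, but the function name `revenue_loss` and the parameters `base`, `rate`, and `total_revenue` are hypothetical values chosen for the example, not figures from the cited source.

```python
import math


def revenue_loss(hours_down, base=1_000.0, rate=0.5, total_revenue=1_000_000.0):
    """Hypothetical loss model: grows exponentially with downtime, capped at total revenue."""
    uncapped = base * (math.exp(rate * hours_down) - 1.0)
    return min(uncapped, total_revenue)


# Compare a few recovery times under the assumed parameters.
for hours in (0, 4, 8, 24):
    print(f"{hours:>2} h down -> estimated loss ${revenue_loss(hours):,.0f}")
```

Under these assumed parameters, doubling the outage from 4 to 8 hours multiplies the estimated loss several times over rather than doubling it, which is why recovering in the early, flat part of the curve matters.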

Cost of Downtime

The cost of downtime differs drastically between organizations based on factors such as industry and organizational maturity. However, according to the Disaster Recovery Preparedness Benchmark Survey, almost 20% of organizations reported losses ranging from $50,000 to over $5 million when a critical application experienced downtime. DR testing allows your organization to discover downtime threats before they occur, so that recovery time can be shortened, thus mitigating the potential economic impact. (Disaster Recovery Preparedness Council, The State of Global Disaster Recovery Preparedness 2014)

No Cost 37%
$1K-$6K 18%
$6K-$10K 13%
$10K-$20K 8%
$20K-$50K 5%
$50K-$100K 10%

$100K-$500K 3%
$500K-$1M 3%
$1M-$5M 2%
$5M+ 2%
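As a quick check of the "almost 20%" figure, the bands listed above can be summed directly. The sketch below uses only the percentages shown; the variable names are illustrative.

```python
# Survey distribution as listed above (loss band -> % of respondents).
loss_distribution = [
    ("No Cost", 37), ("$1K-$6K", 18), ("$6K-$10K", 13), ("$10K-$20K", 8),
    ("$20K-$50K", 5), ("$50K-$100K", 10), ("$100K-$500K", 3),
    ("$500K-$1M", 3), ("$1M-$5M", 2), ("$5M+", 2),
]

# Bands covering losses of $50,000 and above.
HIGH_LOSS_BANDS = {"$50K-$100K", "$100K-$500K", "$500K-$1M", "$1M-$5M", "$5M+"}

share_50k_plus = sum(pct for band, pct in loss_distribution if band in HIGH_LOSS_BANDS)
total_pct = sum(pct for _, pct in loss_distribution)

print(f"Share reporting losses of $50K or more: {share_50k_plus}%")
print(f"Sum of all bands: {total_pct}%")
```

The bands from $50K upward add up to 20%. The full list sums to 101%, which is presumably a rounding effect in the published figures.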

DR testing reduces potential downtime by improving your ability to successfully execute your DRP

Creating a DRP is the first step. Testing then improves your likelihood of successful recovery from an actual disaster. Consider the following example scenario:

A disaster recovery plan has just been created and includes the following:

  • Specific recovery procedures for all systems.
  • Roles and responsibilities are assigned and personnel have all been informed and educated on the DRP.
  • An appropriate storage and backup strategy.

However, in a real disaster, problems will be encountered that do not have a prepared response. When that happens, system owners will make decisions based on assumptions and guesswork, as if there were no plan. DR testing identifies those gaps before there is an actual disaster, so your DRP can be updated to be more accurate, staff are better prepared, and the chance of critical mistakes is reduced.

Organizations that test their DRP are substantially more successful than those that do not

"Routine testing is vital to survive a disaster… that’s when muscle memory sets in. If you don’t test your DR plan it falls [in importance], and you never see how routine changes impact it." – Jennifer Goshorn, Chief Administrative Officer, Gunderson Dettmer LLP

The image is a bar graph with the Y-Axis labelled DRP Success, and the bars labelled Testing (44%) and No Testing (19%). Above the No Testing bar is an arrow pointing from it to the top of the Testing Bar. The arrow is labelled 132%.

(Info-Tech Research Group; N = 81)
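The 132% label on the chart is the relative improvement of the Testing success rate (44%) over the No Testing rate (19%). A minimal sketch of that arithmetic, with an illustrative function name:

```python
def relative_improvement(baseline_pct, improved_pct):
    """Percentage increase of improved_pct over baseline_pct."""
    return (improved_pct - baseline_pct) / baseline_pct * 100.0


# DRP success rates from the chart: 19% without testing, 44% with testing.
uplift = relative_improvement(19, 44)
print(f"Relative improvement: {uplift:.0f}%")
```

Note that this is a relative gain; the absolute difference between the two groups is 25 percentage points.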

Effective DR testing is reliant on proper testing methodology and organizational mindset; not on budget size

Budget constraints should not be why your organization neglects testing. Conducting resource-efficient tests such as tabletop exercises is still an effective way to improve DR preparedness.

The image is a pie chart, with the following sections: A-2%; B - 9%; C - 16%; D - 22%; F - 51%

A = Extremely prepared for all disaster scenarios

F = Unprepared for majority of disaster scenarios

The Disaster Recovery Preparedness Benchmark Survey indicated a DR preparedness score for each respondent. About three quarters of organizations were at risk of not being able to recover from a disaster. Among these at-risk organizations, a common trait was the lack of consistent DR testing and maintenance. (Disaster Recovery Preparedness Council, “The State of Global Disaster Recovery Preparedness” Annual Report 2014)

Best Practices From Prepared Organizations

Those who scored high on the survey exhibited these distinct traits:

  • Tested their DR plans very frequently: Organizations who consistently tested and revised their DRP were able to create a much more actionable and reliable DRP, which is a primary factor in DR preparedness.
  • Identified specific RTOs and RPOs: All prepared organizations had very accurate estimates of their RTOs and RPOs for each of their Tier 1 systems. The accuracy of these metrics was supported by a comprehensive testing plan that allowed the organizations to practice the DRP and make refinements.
  • Large DR budget did not indicate a better DRP: While testing can be resource intensive, simply having a larger budget did not indicate a more prepared organization. An efficient testing strategy that extracts value from several smaller tests is able to deliver just as much value as large expensive tests. A strong and comprehensive DR test plan is much more reliant on the testing process and the commitment from participants. A good DR test plan will significantly improve your DR preparedness and give you a much better chance at mitigating the impact of IT disasters.

Regular testing ensures your DRP stays current and reliable through the constant changes in your data center

A DR test plan defines the process and resources required to ensure regular reviews, testing, and plan updates to keep your DRP accurate and complete.

" If you are running your shop effectively and proactively by having a consistent process of DR testing, review, and updates, then you can improve your ability to recover your IT infrastructure and your business by mitigating the potential consequences of disruptive events when they occur." – Paul Kirvan, FBCI, CISA, Independent IT Consultant/Auditor, Paul Kirvan Associates

Goals of a DR Test Plan

Validate the effectiveness of the DRP

A comprehensive DR test plan enables your organization to ensure the accuracy, completeness, and relevance of your recovery procedures.

If you do not have a DRP, refer to Info-Tech’s Create a Right-Sized Disaster Recovery Plan blueprint. Without a DRP, your DR testing validates only technology and not process.

Ensure data center changes are reflected in your DRP

DR testing uncovers gaps that can only be found by simulating recovery.

In the same manner, regular DR testing (at least annually) ensures the DRP stays current as your data center undergoes changes year after year.

Improve resiliency

Truly resilient organizations have DR and service continuity considerations ingrained in their everyday project and maintenance planning.

Similarly, DR testing improves an organization’s resiliency through better preparedness, and helps reinforce the importance of a solid, comprehensive DRP to business continuity.

Case Study: SAI Global’s newfound focus on DR started with a refresh of its testing methodology

Situation

  • SAI Global is a risk management, standards compliance, and information company based in Sydney, Australia.
  • In 2011, SAI Global’s board members mandated an update to their existing DR capabilities.
  • Under the SAI Global umbrella were many smaller business units that all had different disaster recovery strategies. Most of these strategies were designed around an older physical environment, whereas the current SAI Global environment was almost entirely virtual. This outdated DR strategy made testing very difficult.

Action

  • When SAI Global first started their DR update, it was very IT focused, and decisions were made from a technology point of view only. However, after consulting with the business, they realized that the scope of the update needed to be much wider. In particular, SAI Global wanted to incorporate the idea of having a centralized DR process that can provide consistent reporting and consistently test the systems.
  • To achieve the above goal, SAI Global used a Site Recovery Manager (SRM) that allowed their business units to test all of their systems in parallel with each other.
  • Phase one of this update process took 18 months to complete.

Result

  • After the implementation of the SRM, SAI Global has been able to significantly improve their testing process. Currently, they are able to fully test their systems with minimal to no interruption to production. This was a huge win for many of their business units, as it eliminated one of the biggest hurdles to testing.
  • SAI Global’s publishing business had a mandate for achieving five nines, and by creating this constant and structured DR testing process, the publishing business was able to achieve this goal.
  • After phase one, SAI Global is looking to further improve their DR capabilities and perhaps even transition into areas such as disaster avoidance.

(Gardner, Dana. "Case Study: Strategic approach to disaster recovery and data lifecycle management pays off for Australia's SAI Global")
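For context on the "five nines" mandate mentioned above: the term conventionally means 99.999% availability. The sketch below converts a number of nines into an annual downtime budget. The case study does not state how SAI Global measured availability, so the 365-day year and the function name are assumptions for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(nines):
    """Annual downtime budget for an availability of `nines` nines (e.g. 5 -> 99.999%)."""
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    print(f"{n} nines -> about {allowed_downtime_minutes(n):.2f} minutes of downtime per year")
```

At five nines the budget is only a few minutes per year, which helps explain why testing with minimal to no interruption to production was such a significant result.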

Develop a DRP test plan – project overview

1. Determine your DR testing readiness and scope
2. Create a project charter to build a test plan
3. Create the DR test plan
4. Turn lessons learned into better preparedness
Best-Practice Toolkit

1.1 Identify current testing readiness and action items

1.2 Determine current DR proficiency and need

2.1 Identify roles and responsibilities for building the test plan

2.2 Define project parameters and milestones

3.1 Create a framework for the overall test plan

3.2 Create the passive testing facilitator’s handbook

3.3 Create the active testing facilitator’s handbook

4.1 Define a process for incorporating lessons learned from testing

4.2 Create a DRP review, testing, and maintenance schedule

Guided Implementations
  • Call 1: DR testing overview, and identify current capabilities
  • Call 2: Determine readiness for DR testing, and appropriate next steps
  • Call 1: Identify and assign roles and responsibilities for building the test plan
  • Call 2: Set expectations for objectives, resource requirements, and target milestone dates
  • Call 1: Identify resource requirements to execute the DR test
  • Call 2: Plan your passive testing exercises
  • Call 1: Define a process for incorporating lessons learned from testing
  • Call 2: Create a DRP review, testing, and maintenance schedule
Onsite Workshop

Module 1:

Determine the appropriate level of testing

Module 2:

Create a project charter for building the test plan

Module 3:

Create a test plan and supporting documentation

Module 4:

Translate lessons learned in testing into improving overall DR preparedness

Phase 1 Results:

  • Assessment of your current DR testing capability and appropriate scope

Phase 2 Results:

  • Project charter that outlines objectives, resource requirements, and target milestone dates

Phase 3 Results:

  • DR test plan and supporting documents

Phase 4 Results:

  • A process for ensuring test results are incorporated into the DR process

Workshop overview

Contact your account representative or email Workshops@InfoTech.com for more information

This workshop can be deployed as either a four- or five-day engagement, depending on the level of preparation completed by the client prior to the facilitator arriving onsite.

Pre-Workshop: Preparation

Workshop Preparation

  • Gather and evaluate current DRP documentation.
  • Gather SLAs of current vendors and other third parties that the organization is dependent on.
  • If applicable, review current DR environment status.

Day 1: Workshop Day

Morning Itinerary

  • Introduction to the necessity of DR testing.
  • Review of current testing practice.
  • Assess current capabilities for Tier 1.

Afternoon Itinerary

  • Assess current capabilities for Tier 2 and 3 systems.
  • Assess current proficiency and need.
  • Determine testing strategy based on action items.

Day 2: Workshop Day

Morning Itinerary

  • Introduction and creation of the DR testing project charter.
  • Create the Test Plan Summary.

Afternoon Itinerary

  • Introduction to passive testing.
  • Create facilitator questions.
  • Establish scenarios for the entire testing cycle.
  • Complete the remainder of the Passive Testing Facilitator’s Handbook.

Day 3: Workshop Day

Morning Itinerary

  • Introduction to active testing.
  • Document all necessary resource requirements.
  • Create the issue log.

Afternoon Itinerary

  • Create comprehensive test schedules for each test that is scheduled.
  • Generate the testing evaluation survey.

Day 4: Workshop Day

Morning Itinerary

  • Establish a comprehensive test review program for both active and passive testing.
  • Create hotwash materials and after-action reports.

Afternoon Itinerary

Plan next steps:

  • Leverage the Summary of Test Results presentation deck and identify the resources necessary to present to executives.

Blueprint tools and templates overview

The following tools and templates are included in this blueprint to help you build your DR test plan:

Develop a DR test plan

Sections:

Introduction

Project Phases

Summary

What's in this Section:

  • Phase 1: Determine your DR testing readiness and scope
  • Phase 2: Create a project charter to build a test plan
  • Phase 3: Create the DR test plan
  • Phase 4: Ensure your DRP is updated with lessons learned

Phase 1: Determine your DR testing readiness and scope

Phase 1:

Determine your DR testing readiness and scope

Phase 2:

Create a project charter to build a test plan

Phase 3:

Create the DR test plan

Phase 4:

Translate lessons learned into improving overall preparedness

Phase 1 outline: Determine your DR testing readiness and scope

Call 1-888-670-8889 or email GuidedImplementations@InfoTech.com for more information.

Complete these steps on your own, or call us to complete a guided implementation. A guided implementation is a series of 2-3 advisory calls that help you execute each phase of a project. They are included in most advisory memberships.

Guided Implementation 1: Determine your DR testing readiness and scope

Proposed Time to Completion: 2 weeks (1 call per week)

Phase 1.1: Identify current testing readiness and action items

Start with an analyst kick-off call:

  • Determine your current testing readiness based on prerequisites.
  • Correlate testing readiness with the completion status of the Action Item List.

Then complete these activities…

  • Complete and evaluate the Readiness Assessment tabs for Tier 1, 2, and 3 systems.
  • Review the list of action items that still need to be completed before testing can commence.

With these tools & templates:

  • DR Test Plan Development and Execution Workflow
  • DR Testing Readiness Assessment Tool

Phase 1.2: Determine current DR proficiency and need

Review findings with analyst:

  • Analyze your current testing proficiency and your likelihood of success for each system tier.
  • Evaluate your need for testing and the testing gap between need and proficiency.

Then complete these activities…

  • Define current testing capability to determine proficiency.
  • Document your need for testing based on industry, customer, and internal demand.
  • Based on current situation, determine your testing strategy.

With these tools & templates:

  • DR Testing Readiness Assessment Tool

Phase 1 Results & Insights:

  • Develop a clear understanding of your current DR testing capabilities, which provides insight into the type of testing strategy you should use.

Understand the impact and complexity of each DR testing methodology to create the best fit solution

1.1: Readiness Assessment

DR Plan Testing Complexity Spectrum

DR testing methodologies vary greatly in complexity, resource demand, and preparation time. Not all methodologies are practical or even possible for all organizations.

  1. Tabletop Testing (TTX)
    • Walking through DR scenarios, using the DRP. Tabletop is strictly a classroom exercise.
  2. Unit Testing as Systems Are Updated
    • Testing standby equipment, particularly as updates are made to the production environment.
  3. Simulation Testing
    • Starting up standby systems and validating basic functionality.
  4. Parallel Testing
    • Moving beyond simply starting up machines to also restore business data and verify that standby systems can be used to execute business processes/transactions.
  5. Full-Scale (Full Interruption) Testing
    • The primary site is shut down and a full failover to an alternative site is executed, with a restore of all relevant data from the organization. The DR site becomes the primary site for this test.

Active vs. Passive Testing

In this complexity spectrum, Tabletop Testing is categorized as “Passive Testing,” while all other forms of testing are categorized as “Active Testing.” Refer to the Appendix for a more detailed breakdown of each of the testing methods.
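The spectrum and its passive/active split can be represented as a small ordered data structure. The sketch below is one possible encoding, not part of the blueprint's tools; the names `DrTestMethod` and `category` are illustrative.

```python
from enum import IntEnum


class DrTestMethod(IntEnum):
    """DR testing methodologies, ordered from least to most complex."""
    TABLETOP = 1
    UNIT = 2
    SIMULATION = 3
    PARALLEL = 4
    FULL_SCALE = 5


def category(method):
    """Tabletop is the only passive method; everything else is active."""
    return "Passive" if method is DrTestMethod.TABLETOP else "Active"


for method in DrTestMethod:
    print(f"{method.value}. {method.name:<10} -> {category(method)} Testing")
```

Using an ordered enum keeps the complexity ranking explicit, so comparisons such as `DrTestMethod.PARALLEL > DrTestMethod.SIMULATION` behave as expected.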

Start the DR test development process by identifying your current testing maturity

1.1: Readiness Assessment

2 – Readiness Assessment Tool: Determine the appropriate level of testing (e.g. parallel testing) and high-level scope (e.g. Tier 1 systems) based on your current capabilities.

The image is a workflow chart, titled Test Plan Development and Execution High-Level Workflow. The text above corresponds to a box near the top of the workflow plan.

"There are different levels of testing and it is very progressive. I do not recommend my clients to do anything, unless they do it in a progressive fashion. Don’t try to do a live failover test with your users, right out of the box." – Steve Tower, Management Consultant, Steve Tower, Disaster Recovery Plans & Assessments

Prepare for the Readiness Assessment by ensuring you have the necessary prerequisites completed

1.1: Readiness Assessment

Evaluate your DRP status

Confirm that the documentation for your DRP is complete. This would include documenting all the roles and responsibilities, incident response plans for each system, and a systems tier list. For more information on how to complete this stage, see: Create a Right-Sized Disaster Recovery Plan.

Evaluate your DR environment status

Identify your current DR environment solution (in-house, co-lo, MSP, vendor, none, etc.). Also review and identify the DR environment terms and conditions to determine:

  • Type of testing permitted.
  • Requirements for scheduling tests.
  • Required sign-offs for testing.

Evaluate your vendor dependencies

Identify your critical vendors (e.g. hosting vendors, product support vendors, etc.).

  • Determine DRP support expectations with these vendors. If this is currently not documented, then work with the vendor to establish DR agreements that include support for testing. For more information, see: DRP Vendor Evaluation Questionnaire and Tool.

The following tools require that the above steps be defined and documented. If you have yet to complete the above steps, please contact Info-Tech for assistance.

Identify current capabilities by assessing if you meet the base requirements for passive and active testing

1.1: Readiness Assessment

[Activity] 2 - DR Testing Readiness Assessment Tool – Readiness Assessment: Prepared by Facilitator

  1. The Passive Testing section is an assessment of your DRP and is a prerequisite for all subsequent testing options.
  2. The Active Testing section is designed to assess the specific type of Active Testing (simulation, parallel, full-scale) that best suits the current capabilities of your organization.
  3. Within both sections there is a question that asks for the status of “action items.” Refer to the tab “Action Item List” to gain an understanding of all the necessary requirements before testing should occur.
  4. Once populated, the tool will assess which type of DR testing your organization meets the requirements for.
    • Note: This tool is a reflection of maximum testing capability; a comprehensive testing strategy involves a series of less complex tests that lead up to your maximum testing capability.
  5. Repeat this process for Tier 2 and Tier 3 systems.

The image shows a chart, titled DR Testing Readiness Summary for Tier 1 Systems.

Use the Action Item List to track additional requirements that must be met before testing

1.1: Readiness Assessment

  • DR testing is a highly complex activity that requires a large amount of supporting documentation and planning processes. Info-Tech has identified a list of all the action items that are necessary for comprehensive test planning. Use the action items list to keep track of test planning progression.
  • The Action Item List is broken into 2 primary sections:
    1. Test Plan Readiness Requirements: High level documents that break down how each DR test will occur.
    2. System-Level Readiness Requirements: A granular breakdown of system level documentation. E.g. How the ERP will be recovered during a disaster.
  • Document the status of each action item as “N/A,” “completed,” “in-progress,” or “requires action.”
  • Along with the status of each action item, record the person responsible for completing it and the estimated completion date.

[Activity] 2 - DR Testing Readiness Assessment Tool – Action Item List: Prepared by Facilitator

The image shows a multi-section chart, titled DR Testing Action Item List.

Determine your current DR testing proficiency

1.2: Testing Proficiency and Need

[Activity] 2 - DR Testing Readiness Assessment Tool – Testing Proficiency: Prepared by Facilitator

  1. Answer each question separately for each system tier, providing a response from 1 to 10 (1 = Low/Infrequent and 10 = High/Very Frequent).
  2. Use the “Weight” system to adjust how important each question is in relation to your organization. The current default is based on an average organization. This weighting system tailors the scores to your specific organization.
  3. Once populated, the tool will generate a DR Testing Proficiency Score. This score measures your testing program’s current maturity as well as the likelihood of success for testing. E.g. If the readiness assessment determined that you are ready for “Full-Scale” testing but your proficiency score is low, your organization may need significantly more preparation, such as several tabletop exercise (TTX) dry runs, to succeed in full-scale testing.
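
To make the effect of the weights concrete, here is a minimal sketch of a weighted score. The tool's actual formula is not documented here, so the weighted-average calculation, the scaling to a percentage, the function name, and the sample numbers are all assumptions for illustration only.

```python
from typing import Sequence


def weighted_score(responses: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted average of 1-10 responses, scaled to 0-100 (assumed formula)."""
    if len(responses) != len(weights):
        raise ValueError("each response needs exactly one weight")
    if any(not 1 <= r <= 10 for r in responses):
        raise ValueError("responses must be between 1 and 10")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_avg = sum(r * w for r, w in zip(responses, weights)) / total_weight
    return weighted_avg * 10  # a 1-10 average expressed on a 0-100 scale


# Hypothetical Tier 1 answers; the third question is weighted double.
tier1_score = weighted_score([8, 6, 9], [1.0, 1.0, 2.0])
```

Doubling the weight on a question pulls the overall score toward that question's answer, which is the tailoring effect described in step 2.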

The image shows a chart, with the title DR Testing Proficiency Assessment.

Determine your need for DR testing

1.2: Testing Proficiency and Need

[Activity] 2 - DR Testing Readiness Assessment Tool – Testing Proficiency: Prepared by Facilitator

  1. Answer each question separately for each system tier, providing a response from 1 to 10 (1 = Low/Infrequent and 10 = High/Very Frequent).
  2. Use the “Weight” system to adjust how important each question is in relation to your organization. The current default is based on an average organization.
    • E.g. Organizations in the financial industry will likely have a very high weight allocation to the regulatory requirements. Consequently, the score attributed to this question will have a much larger impact on the overall need for testing score. This will make the score more accurate and relevant to the specific needs of your organization.
  3. Once populated, the tool will generate a Need for DR Testing Score. If there is a high need for testing, then your organization should take steps to improve your DR testing proficiency so that you are able to meet the higher requirements. Track this metric each time you revise your testing process to determine potential testing demand changes.

The image shows a sample chart, with the text Need for DR Testing Assessment at the top left. The left-hand column of the chart lists questions, and the remaining columns correspond to Tier 1, 2, and 3. Scores have been entered based on whether the user answered each question from Strongly Disagree to Strongly Agree. At the bottom, the chart generates scores.

Assess your current DR testing gap

1.2: Testing Proficiency and Need

Assess your current DR testing gap in the Proficiency Assessment – DR Testing Readiness Assessment Tool

  1. The Current Testing Gap score will be automatically populated once the previous two steps are completed. This score represents the difference between Testing Proficiency and Need for Testing. Ideally, your Testing Gap should be more than 5% positive: that means you are not only capable of meeting current DR testing needs but also relatively prepared for the increased needs of tomorrow.
  2. Track this metric as you repeat the testing process to determine how well your organization is closing the testing gap.

The image shows a chart, with Current Testing Gap Score written at top left, and scores written across the top on the right. On the bottom left, How to Assess Gap Scores is written, and on the right there are explanations for each type of score.

Determine the appropriate testing strategy based on capability, likelihood of success, and need for testing

1.2: Testing Proficiency and Need

Testing capability + Likelihood of success + Need for testing = Optimal Testing Strategy

Example Analysis

  • My organization’s maximum testing capability is Parallel Testing for Tier 1 systems and Simulation Testing for Tier 2 and 3 systems.
  • My likelihood of success is around 70% for Tier 1 systems and 30% for Tier 2 & 3 systems.
  • There is a high need for testing in my organization and as a result there is a negative 6% Testing Gap between my current testing proficiency and need for testing for Tier 1 systems.

My Tier 1 test plan should incorporate a mixture of TTXs and simulations, culminating in a parallel test at the end of the year that incorporates the lessons learned from the earlier tests. Since the likelihood of success for Tier 2 and 3 is relatively low, I should focus first on mastering the recovery process using TTXs before transitioning into active testing. Both internal and external mandates are pushing for more complex testing; as such, investments are needed to improve current infrastructure so that full-scale testing can be conducted.

Info-Tech Insight

An optimal testing strategy is like building a pyramid: before conducting a parallel test or a full-scale test, it is best practice to first conduct several TTXs and simulation tests. Reduce the risks of complex testing by leveraging the lessons learned from less-complex tests.
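
The pyramid idea can be sketched as an ordered list of test types, where a plan works up to the maximum capability identified by the readiness assessment. The ordering follows the test types named in this blueprint; the function name and string labels are hypothetical.

```python
from typing import List

# Test types ordered from least to most complex.
TEST_TYPES: List[str] = ["tabletop", "simulation", "parallel", "full-scale"]


def progression_to(max_capability: str) -> List[str]:
    """Return the test types to work through, ending at the maximum capability."""
    if max_capability not in TEST_TYPES:
        raise ValueError(f"unknown test type: {max_capability}")
    return TEST_TYPES[: TEST_TYPES.index(max_capability) + 1]


# An organization whose maximum capability is parallel testing.
tier1_plan = progression_to("parallel")
```

In practice each lower level would be repeated several times before moving up; the sketch only captures the order.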

If you want additional support, have our analysts guide you through this phase as part of an Info-Tech workshop

Book a workshop with our Info-Tech analysts

  • To accelerate this project, engage your IT team in an Info-Tech workshop with an Info-Tech analyst team.
  • Info-Tech analysts will join you and your team onsite at your location or welcome you to Info-Tech’s historic Toronto office to participate in an innovative onsite workshop.
  • Contact your account manager (www.infotech.com/account), or email Workshops@InfoTech.com for more information.

The following are sample activities that will be conducted by Info-Tech analysts with your team:

1.1 Identify current testing readiness and action items

Identify all the documentation and resources required before testing can commence. Also assess testing readiness for Tier 1, 2, and 3 systems to gain a holistic view of current testing maturity and the gaps that need to be closed before testing.

1.2 Define testing strategy through proficiency and need analysis

Complete the proficiency and need for testing assessment for all system tiers. Discuss the score’s implication on the likelihood of success of testing and how that contributes to the testing strategy. Finalize a testing strategy and consider future improvements based on need analysis.

Phase 2: Create a DR test plan project charter

Phase 1: Determine your DR testing readiness and scope

Phase 2: Create a project charter to build a test plan

Phase 3: Create the DR test plan

Phase 4: Translate lessons learned into improving overall preparedness

Phase 2 outline: Create a project charter to build a test plan

Call 1-888-670-8889 or email GuidedImplementations@InfoTech.com for more information.

Complete these steps on your own, or call us to complete a guided implementation. A guided implementation is a series of 2-3 advisory calls that help you execute each phase of a project. They are included in most advisory memberships.

Guided Implementation 2: Create a project charter to build a test plan

Proposed Time to Completion (in weeks): 4 weeks (1 call every 2 weeks)

Phase 2.1 Identify roles and responsibilities

Start with an analyst kick-off call:

  • Review the benefits of the project charter (e.g. clarify expectations and resource requirements).
  • Identify staff who need to be included in building the test plan (identify roles and responsibilities).

Then complete these activities…

  • Complete the roles and responsibilities table in the project charter template (i.e. assign staff to roles), and modify descriptions as needed.
  • Work with the members of your DR test plan team to complete the rest of the project charter.

With these tools & templates:

  • DR Test Plan Project Charter Template

Phase 2.2: Define project parameters and milestones

Review findings with analyst:

  • Review the project charter draft, including assigned roles and responsibilities.
  • Determine appropriate project parameters, including milestones and target dates.

Then complete these activities…

  • Complete the project parameters (e.g. objectives) and the milestones table in the project charter template.
  • Obtain sign-off from senior management.

With these tools & templates:

  • DR Test Plan Project Charter Template

Phase 2 Results & Insights:

  • Clarify project expectations and resource requirements. Executive buy-in is critical to ensuring DR testing does not get pushed to the backburner.

Obtain executive support by creating a project charter for developing the test plan

2.1: Roles and responsibilities

At this stage of the project, the Readiness Assessment tool is completed, and you have a good understanding of the types of testing that your organization is capable of and needs to work toward. Next, create the project charter.

The image is a flowchart titled Test Plan Development and Execution High-Level Workflow. There is a section highlighted, the text of which is transcribed below.

3 – Project Charter: Set expectations for scope, resource requirements, and target dates for building the test plan.

"Ownership needs to be defined clearly from the outset. Ambiguity in terms of who is responsible for each aspect of the testing process and who owns which system for the tests will almost certainly be problematic later on." – Robert Nardella, IT Service Management, Certified z/OS Mainframe Professional

Use Info-Tech’s DR Test Plan Project Charter Template to clarify requirements and expectations

2.1: Roles and responsibilities

Project Charter Components

Use the project charter to define project parameters, roles, and objectives, and thereby clarify expectations with the executive team. The specific components are listed below and then described in more detail in the remainder of this phase:

  • Project Overview: Includes objectives, deliverables, and scope.
  • Governance and Management: Includes roles, responsibilities, and resource requirements.
  • Project Risks, Assumptions, and Constraints: Includes risks and mitigation strategies, as well as any assumptions and constraints.
  • Project Sign-off: Includes IT and executive sign-off.

Note: This phase directs you to name the roles and responsibilities first so they can assist in defining the project charter.

DR Test Plan Project Charter Template

The image is a screen capture from the DR Test Plan Project Charter Template document. The image shows the table of contents for this document.

Define roles and responsibilities for the DR test team

2.1: Roles and responsibilities

Identify who will be participating in developing the test plan, and clarify levels of responsibility using the COBIT “RACI” approach:

  • Responsible: Responsible for doing the activity (the work).
  • Accountable: Accountable to ensure the activity (the work) happens.
  • Consulted: Consulted prior to decision or action.
  • Informed: Informed of the decision or action.

Specifically, assign the following roles (the project charter template provides additional descriptions which you can modify as needed to suit your organization):

  • Executive Sponsor: Liaison with the executive team (the CIO would be a good candidate for this role).
  • Project Lead: Responsible for driving the project, determining the methodology to be followed, and assigning required resources.
  • DR Testing Facilitator: Functions as the project manager. This includes coordinating resources and reporting progress.
  • Subject Matter Experts (SMEs): Required to ensure they have a test plan for their respective systems.
  • Business Unit Managers: Assign business users to assist with developing acceptance test plans.
  • Executive Team (or named subset): Sign off on the Project Charter and the DR Test Plan when completed.

Note: This blueprint is directed primarily at the Project Lead who will work with the rest of the team.

The image shows a text-based table. This is an example of how one would define the roles and responsibilities using the RACI approach, with Project Roles listed in the left column, then a description of each role, and then in a righthand column, the RACI category in which the role fits.

Define project parameters

2.2: Project parameters and milestones

Complete the following sections in the project charter template and review with the executive sponsor to confirm project parameters.

  • Project Background and Drivers: Document the rationale for the project, which will reinforce support for the project. Drivers might include a failed audit or concern over the organization’s current ability to recover from a disaster.
  • Project Objectives: The project charter template includes objectives based on this blueprint – modify these as needed.
  • Project Deliverables: The project charter template lists the core deliverables for a DR test plan (generated by this blueprint).
  • Project Scope: Further clarify objectives by listing what is in scope and out of scope.

The image is a screen capture of the described sections of the project charter template, with sample information filled in.

Set achievable, realistic target dates for project milestones

2.2: Project parameters and milestones

The project milestones section in the project charter is prefilled based on the steps in this blueprint to provide a starting point for your project planning:

  • Customize the milestones to accommodate special requirements for your organization.
  • Set achievable, realistic target dates. Most organizations find they have several gaps in DR testing capability that need to be addressed.
  • Use the project milestones table to guide project management and scheduling.

The image is a screen capture of the Project Milestones section of the project charter template, with sample information filled in.

Further clarify project parameters by documenting risks, assumptions, and constraints

2.2: Project parameters and milestones

Set Expectations Up Front

For most organizations, the biggest risk is resource availability. More immediate tasks take priority and DR testing gets pushed to the back burner.

Complete the following sections in the project charter template to explicitly state these risks and resource requirements:

  • Risks, Assumptions, and Constraints
  • Reviews and Reporting
  • Resource Requirements

Mitigate Project Risks

As noted in the project charter template, an effective project risk mitigation strategy is to block a specific timeslot each week to allow time for collaboration as well as completing individual assignments.

The earlier sections of the project charter will also help set expectations and mitigate these risks. For example:

  • Securing executive sponsorship. The more senior the executive, the better. If steps are delayed due to lack of buy-in or conflicting projects, you need to be able to escalate these issues, and the higher you can go (if necessary), the better.
  • Assigning named resources. Roles such as the Project Lead, DR Test Facilitator, and system SMEs will do the bulk of the test plan development. Assigning these roles up front will help you clarify resource requirements.
  • Defining clear project objectives and milestone target dates. This sets clear expectations about the project direction and expected outcomes.

If you want additional support, have our analysts guide you through this phase as part of an Info-Tech workshop

The following are sample activities that will be conducted by Info-Tech analysts with your team:

2.1 Identify roles and responsibilities

Review the benefits of the project charter (e.g. clarify expectations and resource requirements). We will also identify the staff needed to build the test plans, including a discussion of their roles and responsibilities.

2.2 Define project parameters and milestones

Review the draft project charter, including assigned roles and responsibilities. From here, we will determine the appropriate project parameters, and include milestones with target dates. Once the project charter is finalized, we will look for relevant sign-off authority to give the approval to initiate the planning process.

Phase 3: Create the DR test plan

Phase 1: Determine your DR testing readiness and scope

Phase 2: Create a project charter to build a test plan

Phase 3: Create the DR test plan

Phase 4: Translate lessons learned into improving overall preparedness

Phase 3 outline: Create the DR test plan

Guided Implementation 3: Create the necessary documentation for DR testing

Proposed Time to Completion (in weeks): 6 weeks (1 call every 2 weeks)

Step 3.1: Test Plan Summary

Start with an analyst kick-off call:

  • Discuss test plan creation methodology
  • Determine the scope of systems to include in testing

Then complete these activities…

  • Document System Status
  • Determine Test Schedule

With these tools & templates:

  • DR Test Plan Summary Template
  • DR Test Plan System Status Worksheet
  • DR System Test Plan Template

Step 3.2: Passive Testing Handbook

Review findings with analyst:

  • Review passive testing methodology
  • Determine passive testing requirements and scope

Then complete these activities…

  • Complete the Passive Testing Facilitator's Handbook

With these tools & templates:

  • DR Test Plan Passive Testing Handbook

Step 3.3: Active Testing Handbook

Finalize phase deliverable:

  • Review active testing methodology
  • Determine active testing requirements and scope
  • Discuss active testing execution tools

Then complete these activities…

  • Complete the Active Testing Facilitator's Handbook

With these tools & templates:

  • DR Test Plan Active Testing Handbook
  • DR Test Issue Log and Analysis Tool
  • DR Active Test Evaluation Survey

Phase 3 Results & Insights:

  • Identified overall testing schedule based on system prioritization and completed all necessary planning documentation needed to execute both active and passive testing.

Optimize testing resources by mapping out the testing process through an all-inclusive DR Test Plan

3.1: Test Plan Summary

At this stage of the project, the Project Charter is completed and approved. You are now ready to create the necessary handbooks and exercises for Passive and Active Testing.

The image shows the development work flow chart, with the centre section highlighted. It is shown in a separate image below.

The image is the centre section of the development work flow chart. This section is titled Create a DR Test Plan to cover all phases of the DR test. It includes subsections titled Test Plan Summary, System Status Worksheet, Passive Testing Handbook, Active Testing Handbook, and System Test Plans.

Info-Tech Insight

Maximize the value of each test by planning ahead. For instance, schedule tabletop exercises to act as dry runs before active testing. Lessons learned greatly improve the success and effectiveness of future DR tests.

Identify the systems to include in your overall DR Test Plan

3.1: Test Plan Summary

[Activity] 4 – DR Test Plan Summary Template: Prepared by Test Facilitator and SMEs

If you have already conducted a baseline test of your overall DR environment, or if you are not provisioned to test all systems (e.g. your DR environment is intentionally equipped to serve only Tier 1 systems), work with management to determine the specific systems to include in your DR test.

Scope Test Selection Criteria

Determine the ideal systems to include in a test plan by following the criteria below:

  1. Level of criticality: Prioritize testing of Tier 1 systems before Tier 2, and Tier 2 before Tier 3.
  2. Magnitude/Frequency of change: Prioritize the testing of systems that have undergone significant changes or a high number of smaller changes, relative to systems that were unmodified. Leverage your change management process and tools to identify systems that have undergone significant changes.
  3. Time since last tested: Prioritize the testing of systems that have not been tested for an extended period of time.

Level of criticality is the primary deciding factor, magnitude/frequency of change is the secondary factor, and time since last tested is the tertiary factor.
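
The primary, secondary, and tertiary ordering above can be expressed as a single sort with a three-part key. This is only a sketch: the record fields, the use of a simple change count to stand in for magnitude/frequency of change, and the sample systems are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class SystemRecord:
    name: str
    tier: int                     # 1 = most critical
    changes_since_last_test: int  # assumed proxy for magnitude/frequency of change
    last_tested: date


def prioritize(systems: List[SystemRecord]) -> List[SystemRecord]:
    """Sort by tier, then most changes first, then longest time since last test."""
    return sorted(
        systems,
        key=lambda s: (s.tier, -s.changes_since_last_test, s.last_tested),
    )


# Hypothetical inventory.
inventory = [
    SystemRecord("Wiki", 2, 7, date(2023, 3, 1)),
    SystemRecord("ERP", 1, 2, date(2023, 1, 10)),
    SystemRecord("Intranet", 2, 7, date(2022, 3, 1)),
    SystemRecord("Email", 1, 7, date(2023, 6, 1)),
]
ordered_names = [s.name for s in prioritize(inventory)]
```

With this sample data, Email ranks ahead of ERP because both are Tier 1 and Email has changed more; Intranet ranks ahead of Wiki because their tier and change counts tie and Intranet was tested longer ago.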

Info-Tech Insight

Once you have successfully tested your entire DR environment (i.e. established a baseline), you can gain testing efficiency by then focusing future tests on a subset of your environment based on the criteria above (criticality, change, and time since last tested).

Document the identified systems for DR testing

3.1: Test Plan Summary

[Activity] 5 – DR Test Plan System Status Worksheet: Completed by Test Facilitator and SMEs

After the systems to be tested have been identified, system owners can leverage the DR Test Plan System Status Worksheet to document the following:

  • Application/System to be tested: List of all the systems that are to be included in the test.
  • System owners: Identify the system owner of each application/system that is to be tested.
  • System Dependencies: Document items such as SAN, Active Directory, DHCP, DNS, etc.
  • DR Test Readiness Status: Ensure that the specific system is ready for testing by compiling the DR procedures, test plans (Unit Test, System Test, User Acceptance Test), backups, DR test configuration, and backout plans.

Construct a System Test Plan for each individual system

3.1: Test Plan Summary

  • A key line item in the Readiness Status section of 5 – DR Test Plan System Status Worksheet is the creation of System Test Plans for each individual system that is going to be tested. This individual system test plan is used by the system owners during the test and it includes the following:
    • A reference to the DR instructions (e.g. to failover to the DR environment, restore from backups if necessary, and bring the standby system online).
    • A reference to DR testing constraints, such as network configuration requirements to isolate the DR/standby system from the production environment, if necessary.
    • Procedures to validate system functionality after executing the DR procedure (e.g. unit test, system test, and acceptance test instructions to validate that the standby system is functioning as expected after the failover).

Assign system owners the task of completing the System Test Plan; they can leverage Info-Tech’s 8 – DR System Test Plan Template.

The image is a screen capture of the Introduction section, explaining the types of testing that one might use in their own test objectives. The template is designed to be customized to individual needs.

Establish a test schedule to roadmap your entire DR test plan

3.1: Test Plan Summary

A successful DR test plan is built on foresight; by planning out how each test feeds into the subsequent test, the maximum value of each test can be realized.

[Activity] 4 – DR Test Plan Summary Template: Completed by Facilitator

  • Identify the type of testing based on the assessments made during the readiness assessment.
    • It is best practice to start with tabletop exercises which will act as dry runs for the more complex testing methods.
  • Describe the test at a high level, indicating the scenario that will occur and the scope that the test will capture.
  • Document the date and time of testing for management approval.

The image is a screen capture of the Test Schedule template, with sample information included.

"All organizations will go through a crawl, walk, run phase in terms of test maturity. It is extremely important in the early stages of development to concentrate the focus on actual recoverability and data protection, enhancing these capabilities over time into a fully matured program that can truly test the recovery, and not simply focussing on the testing process itself." – Joe Starzyk, Senior Business Development Executive, IBM Global Services

Info-Tech Insight

Establishing a test schedule for the year enables the DR team and the rest of the organization to work from the same page and avoid resource conflicts. While future events may change test dates, an established and pre-approved test schedule will help you ensure resources are available for each test.

Passive testing overview: tabletop exercises are effective for incident response planning and validation

3.2: Passive Testing Handbook

Tabletop planning is a paper-based exercise where the DRP team walks through disaster scenarios and maps out what should happen at each stage, effectively defining their incident response plan. After you have a DRP in place, use this exercise to walk through and validate your incident response plan.

Tabletop planning had the greatest impact on meeting recovery objectives (RTOs/RPOs) among survey respondents

The image is a horizontal bar graph with different types of testing listed on the Y-axis and Relative Importance listed on the X-axis, from 0 to 60. Tabletop planning rates at 57% relative importance, compared to Unit Testing (33%); Simulation Testing (3%); Parallel Testing (6%); and Full-scale Testing (2%)

Note: Relative importance indicates the contribution an individual testing methodology, conducted at least annually, had on predicting success meeting recovery objectives, when controlling for all other types of tests in a regression model. The relative-importance values have been standardized to sum to 100%.

Success was based on the following items:

  • Recovery time objectives (RTOs) are consistently met.
  • IT has confidence in the ongoing ability to meet RTOs.
  • Recovery point objectives (RPOs) are consistently met.
  • IT has confidence in the ongoing ability to meet RPOs.

Why is tabletop planning so effective?

  • It enables you to play out a wider range of scenarios than technology-based testing (e.g. full-scale, parallel, etc.), which is limited by cost and complexity.
  • It is non-intrusive, so it can be executed more frequently than other testing methodologies.
  • It provides a thorough test of your incident response plan since the exercise is, essentially, paper-based.

Conduct a tabletop planning exercise using best practices

3.2: Passive Testing Handbook

Use tabletop planning to test the current achievable recovery timeline, and identify gaps in your current disaster recovery capabilities.

For each high-level recovery step, do the following:

  1. On white index cards:
    1. Record the step.
    2. Indicate the task owner.
    3. Note the task start and end time (use the running recovery time as your clock, where 00:00 is when the incident occurred).
  2. On yellow index cards, document gaps in people, process, and technology requirements to complete the step.
  3. On red index cards, indicate risks (e.g. no backup person for a key staff member).

Tabletop planning is simple, but effective:

  • Discuss each step from start to finish.
  • Keep focused; stay on task and on time.
  • Revisit each step and record risks and mitigation strategies.
  • Revise the plan with key task owners.
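
Once the exercise is over, the cards can be transcribed so the results are kept on record. The sketch below is one possible way to do that, assuming running recovery times are written as HH:MM; the class, field names, and sample steps are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RecoveryStep:
    step: str    # white card: the recovery step
    owner: str   # white card: task owner
    start: str   # running recovery time, "HH:MM" (00:00 = incident occurred)
    end: str
    gaps: List[str] = field(default_factory=list)   # yellow cards
    risks: List[str] = field(default_factory=list)  # red cards


def to_minutes(clock: str) -> int:
    """Convert an 'HH:MM' running recovery time to minutes."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def achievable_recovery_minutes(steps: List[RecoveryStep]) -> int:
    """The latest end time across all steps, including simultaneous ones."""
    return max(to_minutes(s.end) for s in steps)


# Hypothetical results from a tabletop session.
steps = [
    RecoveryStep("Declare disaster", "IT Director", "00:00", "00:30"),
    RecoveryStep("Restore database", "DBA", "00:30", "04:00",
                 risks=["No backup person for the DBA"]),
    RecoveryStep("Redirect network", "Network Admin", "00:30", "01:15",
                 gaps=["DNS change procedure not documented"]),
    RecoveryStep("Validate applications", "App Owner", "04:00", "05:30"),
]
total_minutes = achievable_recovery_minutes(steps)
open_items = sum(len(s.gaps) + len(s.risks) for s in steps)
```

Taking the latest end time rather than summing durations accounts for steps that run in parallel.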

Info-Tech Insight

Record everything, but don’t get weighed down by tools. Relying on software or other technological tools can detract from the exercise. Use simple tools such as index cards and whiteboards.

Tabletop planning example

3.2: Passive Testing Handbook

Below is a picture of the results of an actual tabletop planning exercise.

The image shows a table, with a variety of differently coloured index cards arranged on it, each with writing on it.

Photo credit: Info-Tech

White index cards indicate high-level DR steps in a linear flow with branches to represent simultaneous steps.

Yellow index cards indicate gaps in people, process, and technology requirements to complete the step.

Red index cards indicate risks (e.g. no backup person for a key staff member).

Execute a successful TTX by drafting a Facilitator’s Handbook

3.2: Passive Testing Handbook

The primary contributor to ineffective tabletop exercises (TTXs) is a lack of engagement from participants. Facilitators can avoid disinterest by generating content-rich discussions and realistic scenarios.

[Activity] 6 – DR Test Plan Passive Testing Handbook: Completed by Facilitator

Leverage the sample questions provided in the Passive Testing Handbook to drive insightful discussions during your tabletop exercise. In addition, prepare more questions prior to the exercise to ensure that every minute of the exercise contributes to the overall testing objectives.

The image shows a section of the Passive Testing Handbook, titled Facilitator Questions, followed by a list of questions.

Initiate TTX scenario planning with common threat scenarios that focus on overall service continuity

3.2: Passive Testing Handbook

Unrealistic scenarios are a key contributor to futile TTXs; focus initial TTXs on more common threats to service continuity such as hardware & software failures, network outages, and power outages.

Causes of Unacceptable Downtime:

  • Software Failure - 24%
  • Isolated Hardware Failure - 21%

45% of service interruptions that went beyond maximum downtime guidelines set by the business were caused by software and hardware issues.

  • External Network Failure - 19%
  • Power Outage - 18%

37% of incidents were caused by network or power outages.

  • Building is Inaccessible (e.g. due to a local hazard) - 5%
  • Equipment Damage (e.g. due to fire, roof collapse, etc.) - 7%
  • Natural Disaster - 5%

Only 12% of incidents were caused by major events (i.e. significant physical damage or regional impact).

(Info-Tech Research Group; N=87)
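
The grouped totals above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary and group names are assumptions made for this example, and the grouping behind the 12% figure (equipment damage plus natural disaster, which sum to 12) is inferred from the percentages rather than stated in the survey.

```python
# Shares of unacceptable-downtime incidents, copied from the figures above (percent).
causes = {
    "Software Failure": 24,
    "Isolated Hardware Failure": 21,
    "External Network Failure": 19,
    "Power Outage": 18,
    "Building is Inaccessible": 5,
    "Equipment Damage": 7,
    "Natural Disaster": 5,
}

# Hypothetical grouping used only for this illustration.
groups = {
    "software_and_hardware": ["Software Failure", "Isolated Hardware Failure"],
    "network_and_power": ["External Network Failure", "Power Outage"],
    # Assumption: the 12% "major events" total covers these two causes.
    "major_events": ["Equipment Damage", "Natural Disaster"],
}

# Sum the individual shares within each group.
group_totals = {
    name: sum(causes[cause] for cause in members)
    for name, members in groups.items()
}

for name, total in group_totals.items():
    print(f"{name}: {total}%")
```

The two largest groups together account for 82% of incidents, which is why initial TTX scenarios are better spent on these common failures.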

Info-Tech Insight

Does this mean I don’t need to worry about natural disasters? No. It means DR test planning needs to focus on overall service continuity, not just major disasters. If you ignore the more common, but less dramatic, causes of service interruptions, you will suffer the proverbial “death by a thousand cuts.”

Maintain the realism of DR scenarios by planning for compound scenarios

3.2: Passive Testing Handbook

During a real disaster, incidents typically do not occur in an isolated sequential order. A realistic scenario should incorporate the possibility of multiple incidents occurring simultaneously (e.g. a gas leak requires building evacuation and power to be shut down).

[Activity] 6 – DR Test Plan Passive Testing Handbook: Completed by Facilitator

  • Document the scenario that the TTX team will be walking through in the Passive Testing Handbook. Below is an example:

Scenario

Instructions: Identify the scenario that the participants will be walking through. For a TTX 1 scenario, Info-Tech Research Group advocates a denial-of-access type of incident where your IT infrastructure is inaccessible but not physically damaged. Adjust the scenario as you see fit.

6:00 AM Monday morning: Local authorities confirmed that there has been a gas leak in close proximity to your building. The entire office building is compromised and all staff need to be evacuated. All power has been terminated by city officials. The gas leak is expected to take local authorities 2 weeks to remedy and the estimated time for return access to the primary building will be 3 weeks. Given the circumstances, the executives of XYZ Corporation have decided to fail over all IT functions to the DR site.

Info-Tech Insight:

While a compound disaster can increase the realism of the TTX, it is generally best practice to limit the number of incidents (e.g. hardware failure combined with network outage) within a scenario to 2 or 3. A scenario with too many incidents can cause the TTX to be too complex and difficult to complete.

Capture key learning points from the TTX in a Hotwash

3.2: Passive Testing Handbook

Eliminate the possibility of disengagement by documenting the strengths and weaknesses of the exercise, as well as areas of improvement for the DRP, immediately after the exercise.

[Activity] 6 – DR Test Plan Passive Testing Handbook: Completed by Facilitator

Hotwash: A discussion directly following the exercise that documents and analyzes the results and lessons learned.

Participant Evaluation Survey: A survey that is distributed to the participants following the hotwash to capture the discussion. It also allows for anonymous comments.

  • After the TTX scenario has been documented, continue through the Handbook template, document the hotwash questions, and adjust the participant evaluation survey if needed.
  • The hotwash discussion is designed to be held, and the participant evaluation survey handed out, directly following the TTX.

The image shows the Immediate Debrief/Hotwash section from the Passive Testing Handbook.

Ensure that lessons learned during a test contribute to improving overall DR planning and future tests

3.2: Passive Testing Handbook

Lessons learned during an exercise can only translate into operational improvements in a real disaster through repetition; a best practice is to conduct a post-mortem one month after the TTX.

DR Test Plan Passive Testing Handbook

Test Results and Lessons Learned: Following the TTX, the Facilitator will prepare materials for a post-mortem that will occur about one month later. This meeting is used to review the results from the exercise, as well as to assign and approve action items that incorporate lessons learned during the TTX and improve the disaster recovery process.

  • This specific section of the template does not need to be modified by the Facilitator during the planning phase. This document is prepared following the exercise.

The image shows a screen capture of the Test Results and Lessons Learned template section of the Passive Testing Handbook, including the General Findings, Specific Gaps Found, Specific Risks Found, and Additional Action Items.

"We had a mature process, but after each test we were still always able to learn something new." – Robert Nardella, IT Service Management, Certified z/OS Mainframe Professional

Repeat the planning process for subsequent TTXs and consider adjusting complexity or scope

3.2: Passive Testing Handbook

The key benefit of establishing a test plan before testing is that it allows you to see how each test feeds into the next; use this strategy to your advantage and improve DR capability with every test.

For the first TTX, the scenario below might have been the one that you walked through:

Scenario for TTX 1

6:00 AM Monday morning: Local authorities confirmed that there has been a gas leak in close proximity to your building. The entire office building is compromised and all staff need to be evacuated. All power has been terminated by city officials. The gas leak is expected to take local authorities 2 weeks to remedy and the estimated time for return access to the primary building will be 3 weeks. Given the circumstances, the executives of XYZ Corporation have decided to failover all IT functions to the DR environment.

For a second TTX, repeat the same planning process but adjust the scenario so that lessons learned in the first scenario can be applied to the second. This will allow the team to demonstrate that they are capable of solving a more complex situation and help reinforce the lessons learned.

Scenario for TTX 2

6:00 AM Monday morning: Building security personnel alert XYZ Corporation that due to heavy rain, the data center has been flooded. Power has been shut down and all systems in the data center are damaged.

Active Testing overview: it is important for your organization to conduct regular active tests

3.3: Active Testing Handbook

Active testing consists of simulation, parallel, and full-scale testing. In active tests, the DR procedures, test plans, and technology (hardware/software) are operationally executed to mimic a live scenario. Passive tests, which focus on “what would have happened,” are often used as a dry run before the active test.

Advantages of Active Tests

1. Uncover minute details

While TTXs are great for identifying high-level process issues, they are often unable to uncover issues at a granular level. Changes such as updated passwords or new phone numbers will not be reflected in a TTX and can only be identified during a live test where the procedures are actually executed.

2. Hands-on practice

Regular testing allows the system owner to familiarize themselves with the recovery process, which will contribute to a faster and more reliable recovery process.

3. Experience real pressure

A disaster is often a high-pressure situation and it is nearly impossible to predict how your staff will respond under such circumstances. The only way to ensure that your staff responds favorably is to familiarize them through real-life testing and to test frequently enough so that the response is instinctual and not reactive.

While TTXs are very efficient exercises, they cannot completely replace active testing. Your organization cannot be fully confident that business operations can be sustained and recovered until you have executed an active test.

Create an Active Testing Facilitator’s Handbook that keeps track of all the necessary resource requirements

3.3: Active Testing Handbook

Active Testing is significantly more involved than Passive Testing, and to ensure that an active test can run smoothly, all relevant resources need to be documented up front and signed off.

[Activity] 7 – DR Test Plan Active Testing Handbook: Completed by Facilitator

  • Document all of the relevant requirements for all the active tests that were planned in 4 – DR Test Plan Summary Template. Make sure to indicate which specific test will be needed for each identified resource.
  • Resources Include:
    • Staff Requirements
    • Documentation Requirements
    • Technology Requirements
    • DR Environment Requirements
    • Third-Party (Vendor) Requirements
    • Budget Requirements
    • Risks and Mitigation Strategies

Example:

The image shows a screen capture of the Resource Requirements section, with a table where columns indicate Name of Participant, Role, Responsibility, Contact Information, and Confirmation Status. The table is filled in with sample information.

Info-Tech Insight

Reduce vendor and travel costs (if applicable) by combining the simulation and parallel test into one exercise. Start with simulation testing (bring systems online and verify basic functionality), and continue with parallel testing (load transaction/application data, and conduct user acceptance testing by replicating business processes with that data in the DR environment).

Prepare a Test Issue Log that will allow your DR team to document the types of errors/issues that occur

3.3: Active Testing Handbook

[Activity] 9 – DR Test Issue Log and Analysis Tool: Completed by Facilitator

  • The aim of this tool is to help the DR Facilitator analyze the types of errors/issues that are occurring during the DR test.
  • In order to standardize reporting, the Facilitator needs to create the list of all error/issue categories that can potentially occur during the test. Once this list has been completed, the Facilitator will distribute a copy of the Issue Log and Analysis Tool to each system owner for them to use during the test.
  • The system owners who are executing the test will then use the dynamic drop-down menus to populate the Issue Log during the test. The analysis tab will automatically populate as the system owner completes the issue log. Note: The tool supports up to 25 unique errors.

The image shows the Define Standard Issue Types section of the DR Test Issue Log and Analysis Tool. The table, which includes Issue Code and Issue Summary, is filled with sample information.
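
The standardization described above can be pictured as a fixed list of issue codes that every log entry must reference. The Python sketch below is not the Info-Tech tool, which is a spreadsheet with drop-down menus; the issue codes, summaries, and the `log_issue` helper are hypothetical. Only the limit of 25 issue types comes from the text.

```python
MAX_ISSUE_TYPES = 25  # limit stated for the tool

# Hypothetical issue categories defined by the Facilitator before the test.
issue_types = {
    "DOC": "Recovery documentation missing or out of date",
    "CRED": "Password or credential no longer valid",
    "NET": "Network connectivity problem at the DR site",
}

# Enforce the stated limit on the number of unique issue types.
if len(issue_types) > MAX_ISSUE_TYPES:
    raise ValueError("too many issue types defined")


def log_issue(issue_log, system, code, note):
    """Append an entry, rejecting codes that are not in the standard list."""
    if code not in issue_types:
        raise ValueError(f"unknown issue code: {code}")
    issue_log.append({"system": system, "code": code, "note": note})


# A system owner records an issue during the test.
issue_log = []
log_issue(issue_log, "Email", "CRED", "Service account password had been rotated")
```

Rejecting unknown codes plays the same role as the drop-down menu: every system owner reports against the same categories, so the logs can be combined later.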

Aggregate the results from the Issue Log and develop a comprehensive analysis of the active test

3.3: Active Testing Handbook

[Activity] 9 – DR Test Issue Log and Analysis Tool: Completed by Facilitator

  • Once the Active Test has been completed, each system owner will send their copy of the Issue Log back to the Facilitator. The Facilitator will then copy and paste the inputs from each individual spreadsheet into one master copy.
  • The analysis tab will automatically populate, and the Facilitator can leverage these results in the post-mortem.

The image shows the Issue Log section, with a table filled with sample information. The image also shows the graphs that the Issue Log and Analysis Tool creates, as described above.
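
For readers who prefer to see the aggregation step spelled out, here is a minimal Python sketch of merging individual logs and counting issues. It only approximates what the spreadsheet's analysis tab does; the sample entries, issue codes, and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical logs returned by two system owners after the active test.
email_log = [
    {"system": "Email", "code": "CRED"},
    {"system": "Email", "code": "DOC"},
]
erp_log = [
    {"system": "ERP", "code": "CRED"},
]

# Merge the individual logs into one master log (the copy-and-paste step).
master_log = email_log + erp_log

# Count issues by type and by system to see where problems cluster.
issues_by_code = Counter(entry["code"] for entry in master_log)
issues_by_system = Counter(entry["system"] for entry in master_log)

print(issues_by_code.most_common())
print(issues_by_system.most_common())
```

Counts like these give the post-mortem a starting point: the most frequent issue type is usually the first gap to address.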

Create a comprehensive test schedule to plan out each step in the test execution process

3.3: Active Testing Handbook

[Activity] 7 – DR Test Plan Active Testing Handbook: Prepared by Facilitator

The image shows a table titled Test Schedule, with columns labelled Task, Personnel Assigned, Date, and Results/Comments. The table is filled with sample information.

  • The key difference between the test schedule for Active Tests compared to the agenda for the Passive Tests is that the former incorporates additional elements such as a dry run, kick-off meeting, and responsibility assignment during testing.
  • Dry Run: Confirm DR environment readiness (e.g. obtain and validate backups to be used for testing).
  • Kick-Off Meeting: The testing participants gather and review the test procedure prior to the test. Ensure that all participants have the required documentation, review objectives and process for recording test results, and confirm required resources/requirements are available.
  • Personnel Assigned: Make sure that each step in the test is being managed by a test participant as this will ensure clarity of roles and will avoid confusion during testing.
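
The ownership check in the last bullet is easy to automate if the schedule is kept in a structured form. The sketch below assumes a simple list of (task, personnel assigned) pairs; the sample rows and the `unassigned_tasks` helper are hypothetical and are not part of the Handbook template.

```python
# Hypothetical test schedule rows: (task, personnel assigned).
schedule = [
    ("Dry run: obtain and validate backups", "Backup administrator"),
    ("Kick-off meeting", "Facilitator"),
    ("Bring systems online at DR site", ""),  # deliberately left unassigned
]


def unassigned_tasks(rows):
    """Return the tasks that have no owner recorded."""
    return [task for task, owner in rows if not owner.strip()]


missing = unassigned_tasks(schedule)
for task in missing:
    print(f"No owner assigned: {task}")
```

Running a check like this before the kick-off meeting surfaces ownership gaps while there is still time to fill them.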

Generate a list of success metrics to track the results of the Active Test

3.3: Active Testing Handbook

[Activity] 10 – DR Active Test Evaluation Survey: Completed by Facilitator

  • Similar to Passive Testing, the results of each Active Test need to be tracked, reviewed, and incorporated into the existing disaster recovery process. However, since Active Testing typically involves more participants that are potentially in geographically dispersed areas, Info-Tech has created an Excel survey that the Facilitator can send to each of the Active Test Participants following the test.
  • The Facilitator is expected to create a list of success metrics that will best measure the test before distributing the survey. Info-Tech has provided several questions that we expect most organizations to be able to leverage. As well as creating success metrics, the Facilitator will also gather the responses from each system owner once they are complete and report the results in a post-mortem meeting similar to that of the TTX post-mortem.

The image is a screen capture of the Active Test Evaluation Survey, showing the Test Evaluation Questions, Score, Status, and Scoring Summary. There is also a box for additional comments/suggestions.

Understand what success means for your organization

3.3: Active Testing Handbook

A successful DR test is able to identify the gaps and risks in your existing DR capabilities so that these issues can be remedied or mitigated before a real disaster strikes.

Testing success can be broken into two types:

DR Capability Success

Metrics such as “Did you meet your desired RTO?” represent DR capability success. These metrics give you an indication of how well your organization is able to recover from a disaster and give validation to your DR capabilities. However, these validating metrics do not provide you with insights on how to improve. As such, if your test team fails to meet a capability metric, do not let that deter you; instead, use it as an indication that you need to dig deeper and find out why you failed that metric.

Test Execution Success

Metrics such as “Were you able to identify gaps in your DRP?” represent test execution success. These metrics give you an indication of the test process and whether or not the test added any value to your DR maturity. A good test will always seek to identify weaknesses and gaps, so that a team can fix them before a true disaster. Use these metrics to identify areas of the test that need to be modified so that your DR plan can continuously be improved and recalibrated.

Info-Tech Insight

Tests in which you “failed” because you were unable to recover your systems within a specific time frame are not bad tests. In fact, such a test is very successful, because the failure tells you what you need to improve so that it will not happen again in a real disaster.
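
One way to treat a missed RTO as a finding rather than a verdict is to record the size of the overrun for each system and carry it into the post-mortem. The sketch below assumes recovery times are measured in hours; the system names, values, and the `rto_gaps` helper are hypothetical.

```python
# Hypothetical results: desired RTO and measured recovery time, in hours.
results = {
    "Email": {"rto_hours": 4, "actual_hours": 3.5},
    "ERP": {"rto_hours": 8, "actual_hours": 11},
}


def rto_gaps(test_results):
    """Return the systems that missed their RTO and the overrun in hours."""
    return {
        system: r["actual_hours"] - r["rto_hours"]
        for system, r in test_results.items()
        if r["actual_hours"] > r["rto_hours"]
    }


gaps = rto_gaps(results)
for system, overrun in gaps.items():
    print(f"{system}: missed RTO by {overrun} hours; investigate the cause")
```

The output is a list of questions to dig into, not a pass/fail grade, which matches the distinction between capability metrics and test execution metrics.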

Implement a rigorous post-test review process to consistently enhance the effectiveness of your DR plan

3.3: Active Testing Handbook

The end is only the beginning. Leverage the review process to identify gaps, assign action items to close gaps, and plan future tests to ensure gaps are closed.

Each test, regardless of scope, provides an opportunity to update your DRP to the current operating procedures of the enterprise. The plan review is a critical aspect of the DR test cycle and must address technical, strategic, and tactical issues. Document these issues in the Test Results and Lessons Learned section of 7 – DR Test Plan Active Testing Handbook.

Technical

  • Know what is primary. Are the Tier 1 and Tier 2 applications properly categorized?
  • Ensure all technical needs are properly addressed and prioritized. Outside of system configuration, consider core IP assets and transactional, legal, and financial data as well. Often, email is most crucial to an enterprise.
  • Reassess RTO and RPO in light of the test results. See Create a Right-Sized Disaster Recovery Plan.

Strategic

  • Know what parts of the DR plan worked and what did not. Where did the DR plan lag in addressing the key needs of the enterprise?
  • Could recovery technology assist in fulfilling RTOs and RPOs? Mitigation technologies, such as a Site Recovery Manager, could greatly enhance the enterprise’s ability to recover quickly and effectively. Consider what options are financially viable. See Save Costs with a DRP Outsourcer.

Tactical

  • Know the strength of your DR plan documentation. Did your DR plan documentation fulfill the intended role? Assess the scope, role, and function of your DR plan and amend it to be operator-neutral. Aim to have documents that could allow anyone to fulfill the role.
  • Evaluate key assumptions. Recovery assumptions are inherent in the DR plan, and manifest in the procedures used. Follow the adage: “I trust, but I also verify.”

"Following a DR test, grade each document used in testing and update the plan. It’s the only way to improve." - Rob Reed, IT Manager, Christiana Care

Obtain sign-off for all relevant documents

3.3: Active Testing Handbook

Having management support is the baseline criterion for a successful DR test plan. Once all necessary documentation is complete, conduct a management review and gain management’s buy-in.

The image shows a form titled Test Plan Sign-off. It is not filled in with information.

"I cannot stress how important it is to assign ownership of responsibilities in a test; this is the only way to truly mitigate against issues in a test." – Robert Nardella, IT Service Management, Certified z/OS Mainframe Professional

Info-Tech Insight

Sign-off for 4 – DR Test Plan Summary Template is mandatory as it provides an overview of the entire testing process. Test Plan Sign-off for both testing Handbooks is optional.

Once management has signed off on the relevant plans, your organization is ready to execute the test plan.

If you want additional support, have our analysts guide you through this phase as part of an Info-Tech workshop

Book a workshop with our Info-Tech analysts

  • To accelerate this project, engage your IT team in an Info-Tech workshop with an Info-Tech analyst team.
  • Info-Tech analysts will join you and your team onsite at your location or welcome you to Info-Tech’s historic Toronto office to participate in an innovative onsite workshop.
  • Contact your account manager (www.infotech.com/account), or email Workshops@InfoTech.com for more information.

The following are sample activities that will be conducted by Info-Tech analysts with your team:

3.1 Formulate the Test Plan Summary

Conduct a test selection criteria discussion and create the entire DR test cycle including a mix of passive and active tests. From there, identify all the resource requirements necessary to be able to successfully execute each of the scheduled tests.

3.2 Create the Passive Testing Handbook

Discuss tabletop exercise best practices and generate facilitator questions to drive engagement. From there, document all the necessary resources for each specific tabletop exercise that is scheduled. Lastly, create the post-mortem material so that lessons learned can be fully integrated into the recovery process.

Phase 4: Translate lessons learned into improving overall preparedness

Phase 1: Determine your DR testing readiness and scope

Phase 2: Create a project charter to build a test plan

Phase 3: Create the DR test plan

Phase 4: Translate lessons learned into improving overall preparedness

Phase 4 outline: Translate lessons learned into improving overall preparedness

Call 1-888-670-8889 or email GuidedImplementations@InfoTech.com for more information.

Complete these steps on your own, or call us to complete a guided implementation. A guided implementation is a series of 2-3 advisory calls that help you execute each phase of a project. They are included in most advisory memberships.

Guided Implementation 1: Make sure that lessons learned during testing are fully utilized

Proposed Time to Completion (in weeks): 4 weeks (1 call every 2 weeks)

Phase 4.1: Incorporate lessons learned from testing

Start with an analyst kick-off call:

  • Review completed tests and identify key learning points
  • Script the executive presentation and identify presentation best practices

Then complete these activities…

  • Compile results from the Issue Log and Participation survey
  • Create the Summary of Results presentation deck

With these tools & templates:

  • DR Test Issue Log and Analysis Tool
  • DR Active Test Evaluation Survey
  • DR Test Plan Results Summary Presentation

Phase 4.2: Create a DRP review, testing, and maintenance schedule

Review findings with analyst:

  • Review executive response to testing results
  • Discuss future actions to further improve DR capabilities
  • Analyze current testing best practices

Then complete these activities…

  • Plan next year’s DR testing cycle by updating the readiness assessment and test plans

With these tools & templates:

  • Storyboard
  • DR Testing Readiness Assessment Tool
  • DR Test Plans

Phase 4 Results & Insights:

  • Update management on the results of the test and gain buy-in to incorporate an annual testing cycle to continuously update and maintain the DRP.

Reinforce lessons learned from each test by developing a thorough review process

4.1: Test Review and Summary

At this stage of the project, you have created all of the necessary testing documentation as well as executed a test. Next, review the lessons learned from the test with both staff and executives.

The image shows the development work flow chart, with a bottom section highlighted. It is transcribed below.

Evaluate and incorporate lessons learned, and update test plans accordingly using the following tools:

9 – DR Test Issue Log and Analysis Tool: Deploy to system owners to track issues found during testing. Collect and analyze issue trends to target areas for improvement and assess overall success.

10 – DR Active Test Evaluation Survey: Collect feedback on test procedures and readiness to drive improvements.

11 – Results Summary Presentation: Create a summary presentation to communicate test results to all stakeholders, including the executive team and test participants.

Summarize and present test results to the executive team – Step 1 Readiness

Management support is critical to the success of any DR strategy or initiative. When your testing cycle has completed, it is important to re-engage management and brief them on the results.

11 – DR Test Plan Results Summary Presentation: Completed by Facilitator

Step 1: Review your DR testing readiness with management. Indicate the types of tests that you are prepared for and what capabilities you need to develop before more complex tests can be done. Demonstrate your current testing proficiency or need for testing so that management can see your capabilities growth. If applicable, present the action items that were completed to finish the project.

The image shows a screenshot of a document titled Step 1 review: Readiness Assessment.

The image is a screenshot of a document titled Step 1 review continued: Action Items [optional]

The image is a screenshot of a document titled Step 1 review continued: Testing Proficiency

Summarize and present test results to the executive team – Step 2 Test Schedule

4.1: Test Review and Summary

11 – DR Test Plan Results Summary Presentation: Completed by Facilitator

Step 2: Review your DR Test Schedule with management. This is intended to showcase all of the work that the DR test team has done during this test cycle, and also acts as a high-level overview of the following slides.

The image is a screen capture of a document titled Step 2 review: Test Schedule.

Summarize and present test results to the executive team – Step 3 Passive Testing

4.1: Test Review and Summary

11 – DR Test Plan Results Summary Presentation: Completed by Facilitator

Step 3: Review the TTX test results with management. The primary purpose here is to focus on the learning points that came out of the discussion. Give management an overview of the scenario that was used in the TTX and then conclude with general findings such as the discussion from the hotwash and the action items.

The image is a screen capture of a document titled Step 3 review: Scenario and General Findings.

The image shows a screen capture of a document titled Step 3 review: Action Items

Summarize and present test results to the executive team – Step 4 Active Testing

4.1: Test Review and Summary

11 – DR Test Plan Results Summary Presentation: Completed by Facilitator

Step 4: Review the Active Test results with management. The primary purpose here is to focus on the growth in strength and reduction in weakness of your DR capabilities. Give management an overview of the Active Test by reviewing the issue log of the test, the test evaluation findings, and the action items that were completed to close the necessary gaps.

The image shows a screen capture of a document titled Step 4 review: Issue Log.

The image shows a screen capture of a document titled Step 4 review: Test Evaluation.

The image shows a screen capture of a document titled Step 4 review: Action Item Status.

Accelerate your DR Testing strategy ahead of the “norm”

4.2: Future considerations

DR testing has seen a significant rise in importance; however, this renewed focus has not yet translated into improved DR preparedness. Don’t settle for the norm; get ahead of the curve.

The image is a bar graph titled DR testing failing by every measure. On the x-axis, different approaches to DR test results are listed, with the percentages indicated on the Y-axis.

  • In a recent survey conducted by the Disaster Recovery Preparedness Council, only 39% of organizations that test their DRP document the results of their tests. This essentially means that 61% of organizations are wasting valuable time, energy, and capital, since an undocumented test will do nothing to improve DR preparedness.
  • Another alarming metric is that only 24% of organizations repeat the test if the organization did not pass. The goal of testing is to identify strengths/weaknesses and gaps in your DR capabilities, and then close those gaps so that they do not become vulnerabilities in an actual disaster. If your organization has failed a test, then that test needs to be repeated so that the gaps that caused the test to fail initially can be identified and closed. (Disaster Recovery Preparedness Council, “The State of Global Disaster Recovery Preparedness” Annual Report 2014)

Continue to use the cyclical testing approach and establish an annual DR testing habit

4.2: Future considerations

Disaster Recovery Testing Cycle

  • Update DRP & Test Plans
  • Tabletop Exercise
    • Revise
    • Retest
  • Update DRP & Test Plans
  • Simulation
  • Update DRP & Test Plans
  • Parallel
  • Update DRP & Test Plans
  • Full-scale

Adapted from: SANS Institute, “Disaster Recovery Plan Testing: Cycle the Plan, Plan the Cycle”

The Disaster Recovery Testing Cycle (shown above) reinforces the notion of using each test as a building block for the next test. Planning your test cycle at the start maximizes the value of each test.

  • Tools and templates provided in this blueprint will allow your organization to run through the DR testing cycle for the first time.
  • Make sure that this process is not a one-off activity. Go back and update all relevant documents and repeat the process on at least an annual basis.
  • Conduct a series of tabletop exercises before moving on to Active Testing.
  • Similarly, it is best practice to conduct simulation and parallel testing before engaging in a full-scale test.
  • Lastly, make sure that gaps identified in testing are documented and updated in the DRP before moving on to the next test.

Take the next step in DR preparedness by incorporating DR into new project considerations and maintenance activities

4.2: Future considerations

Truly prepared organizations do not treat DR as an event that only occurs when scheduled; they incorporate DR into the decision-making process and build DR preparedness from the ground up.

Only 2% of all organizations have this level of DR maturity. Leverage the learning points from the following two examples to incorporate these methodologies into your organization:

Example 1 - Implement new accounting solution ABC (new project example)

  • An organization that is mature in DR will assess DR requirements as part of the project scoping and requirements definition phase. They will identify metrics such as the uptime requirement. From there, they will review and update the BIA Tool to determine criticality and establish the desired RTO/RPO.
  • Based on the above, the organization will scope the project to meet the uptime requirement and desired RTO/RPO. This includes how it’s provisioned in the primary data center and the DR solution that is implemented (e.g. warm standby system at a DR site, or implementing a solution to recreate the accounting system in a cloud environment).
  • As part of implementing the accounting solution, it must also go through a release management process (e.g. unit testing, then system testing, then user acceptance testing before it’s released to the production environment).

The same release management process needs to be followed when the system is implemented in the DR environment (e.g. unit testing, then system testing, then user acceptance testing before it’s released to the production environment).

Note: To this point, the organization has only conducted system validation testing – they haven’t tested the failover procedure. The organization will flag this system for inclusion in the next DR test.

Take the next step in DR preparedness (continued)

4.2: Future considerations

Example 2 - Upgrading from Exchange 2003 to Exchange 2010 (maintenance and upgrades example)

  • The same mature organization will also assess DR requirements as part of their change management process. They will identify the impact of this Exchange upgrade on the existing environment and DR procedures.
  • Based on the change, the company will scope the maintenance project accordingly to maintain the same DR capability.
  • As part of implementing this upgrade, it must go through a change management process (this includes the above scoping as well as unit testing, then system testing, then user acceptance testing before it’s released to the production environment).

The same change management process needs to be followed when the upgrade is implemented in the DR environment.

Note: To this point, the organization has only conducted system validation testing – they haven’t tested the failover procedure. The organization will flag this system for inclusion in the next DR test.

Info-Tech Insight

The examples reflect a desired state where DR considerations are included in day-to-day decision making. This does not take a large budget to achieve, but rather process improvement. For example, include DR considerations (such as availability and recovery requirements) in the requirements to be evaluated up front during project planning; this is less effort and cheaper than retroactive DR planning.

Case Study: See how Blank Rome created a sustainable DR environment through an incremental improvement process

Current Situation

  • Larry Liss, the CTO of Blank Rome LLP, shared some insights from his DR testing strategies.
  • Currently, Blank Rome uses a co-location site that acts as a DR site. For critical apps such as email, the co-location site acts as a hot site.
  • Blank Rome utilizes a variety of testing methods to verify their DR plans. The chief network architect arranges annual component testing for critical systems. For example, recently the financial systems were tested for their ability to be shut down and brought back up at the co-location.
  • Aside from component tests, exercises such as tabletop exercises are also used. Customers of Blank Rome demanded that DR testing occur on a regular basis and so a hurricane-based tabletop exercise scenario was recently executed. The exercise involved senior managers as well as administrative staff, who spent several hours going over a very detailed exercise that incorporated notification, escalation, and resolution. At the end of the exercise a detailed post-mortem was conducted, and it included the gaps and action plan that came out of the exercise.

Future Decisions

  • Larry plans on implementing VMware Site Recovery Manager (SRM). This will greatly reduce the current RTO, and also make testing significantly easier. Once the SRM solution is in place, Larry plans on executing annual full-scale tests.

Reflections

  • Larry’s situation is representative of an organization that is in the advanced stages of DR testing. Blank Rome was able to arrive at this stage due to management support and external demand for DR testing.
  • An advanced DR testing strategy has documented processes and scheduled testing; however, the lack of full-scale testing is holding Blank Rome back from having a fully mature DR testing strategy.
  • Larry’s strategy of getting the SRM in place before executing a full-scale test is a practice that Info-Tech advocates. It is always prudent to build up the necessary capabilities before a full-scale test as it will greatly reduce the risk of unintentional interruptions.

(Gardner, Dana. "Case Study: Strategic approach to disaster recovery and data lifecycle management pays off for Australia's SAI Global")

If you want additional support, have our analysts guide you through this phase as part of an Info-Tech workshop

Book a workshop with our Info-Tech analysts

  • To accelerate this project, engage your IT team in an Info-Tech workshop with an Info-Tech analyst team.
  • Info-Tech analysts will join you and your team onsite at your location or welcome you to Info-Tech’s historic Toronto office to participate in an innovative onsite workshop.
  • Contact your account manager (www.infotech.com/account), or email Workshops@InfoTech.com for more information.

The following are sample activities that will be conducted by Info-Tech analysts with your team:

4.1 Incorporate lessons learned from testing

Leverage all the key learning points from testing and use the Summary of Test Results presentation template to communicate to upper management. Identify a best practice of connecting business needs to DR capabilities and improvement strategies.

4.2 Create a DRP review, testing, and maintenance schedule

Discuss best practices for DR test maintenance. See why organizations fail and how successful organizations incorporate testing into operational decision making. Define a plan for continuous improvement and annual test cycles.

Summary

  • What's in this Section:
    • Summary
    • Related research
    • References
    • Research Contributors
    • Appendix

Summary of accomplishment

Knowledge Gained

This blueprint outlined how to:

  • Create all the documentation necessary to conduct both passive and active testing.
  • Document the key learning points from testing and how to incorporate them into future tests and daily operations.
  • Present test findings to executives, and create a positive feedback cycle that connects test results to business improvement.

Processes Optimized

The following processes were optimized:

  • Readiness assessment process
  • DR testing document creation process
  • Participant and management buy-in process
  • Post-test review process

Deliverables Completed

As part of an overall crisis management plan, the following deliverables were completed:

  • Identified current DR capabilities and list of action items in the Readiness Assessment Tool.
  • DR testing project responsibilities assigned and approved in the Project Charter.
  • Summary of systems to be tested and testing schedule defined in the Test Plan Summary and System Status Worksheet.
  • Tabletop planning process documented in the Passive Testing Handbook.
  • Simulation, parallel, and full-scale testing process based on the Active Testing Handbook.
  • Communicated testing results and DR capability progression through the Summary of Test Results.

Project steps summary

Client Project: Reduce costly downtime through DR testing

1. Determine your DR testing readiness and scope

1.1 Identify your current testing readiness and action items

1.2 Identify your testing proficiency and need

2. Create a project charter to build a test plan

2.1 Identify roles and responsibilities for building the test plan

2.2 Define project parameters and milestones

3. Create the DR test plan

3.1 Create a framework for the overall test plan

3.2 Create the passive testing facilitator’s handbook

3.3 Create the active testing facilitator’s handbook

4. Ensure your DRP is updated with lessons learned

4.1 Review your test results and present a summary to management

4.2 Incorporate lessons learned into operational decision making

Info-Tech Insight

This project can be delivered in the following formats:

  • Onsite workshop by Info-Tech Research Group consulting analysts
  • Do-it-yourself with your team
  • Remote delivery (Info-Tech Guided Implementation)

Related Info-Tech research

Disaster Recovery and High Availability Research

Backup Strategy Research

Bibliography

Rothstein, Philip Jan. Disaster Recovery Testing Exercising Your Contingency Plan (2007 Edition). Brookfield, CT: Rothstein Associates, 2007. Print.

Disaster Recovery Preparedness Council. "The State of Global Disaster Recovery Preparedness." The State of Global Disaster Recovery Preparedness (n.d.): n. pag. July 2014. Web. 4 Feb. 2015.

Gardner, Dana. "Case Study: Strategic Approach to Disaster Recovery and Data Lifecycle Management Pays off for Australia's SAI Global." ZDNet. BriefingsDirect, 26 Apr. 2012. Web. 4 Feb. 2015.

Grance, Tim, Tamara Nolan, Kristin Burke, Rich Dudley, Gregory White, and Travis Good. "Guide to Test, Training, and Exercise Programs for IT Plans and Capabilities." (n.d.): n. pag. Technology Administration U.S. Department of Commerce. National Institute of Standards and Technology, Sept. 2006. Web. 4 Feb. 2015.

Dolewski, Richard. "Disaster Recovery Plans: Practice Makes Perfect." Data Center Knowledge. INDUSTRY PERSPECTIVES, 26 Apr. 2011. Web. 4 Feb. 2015.

Crump, George. "Disaster Recovery Plan Testing: Will Your Plan Work?" TechTarget. N.p., Oct. 2013. Web. 4 Feb. 2015.

Earls, Alan R. "Disaster Recovery Testing Best Practices: Test Thoroughly and Often." TechTarget. N.p., 6 Dec. 2010. Web. 4 Feb. 2015.

Krocker, Guy Witney. "Disaster Recovery Plan Testing: Cycle the Plan, Plan the Cycle." SANS Institute InfoSec Reading Room. STONESOFT, 2002. Web. 5 Feb. 2015.

Organizations and experts who contributed to this research

Interviews

  • Bernard A. Jones, Manager Business Continuity & Disaster Recovery – HS&E Business Continuity, Novartis Business Services
  • Robert Nardella, IT Service Management, Certified z/OS Mainframe Professional
  • Larry Liss, Chief Technology Officer, Blank Rome LLP
  • Paul Kirvan, FBCI, CISA, Independent IT Consultant/Auditor, Paul Kirvan Associates
  • Steve Tower, Management Consultant, Steve Tower, Disaster Recovery Plans & Assessments
  • Joe Starzyk, Senior Business Development Executive, IBM Global Services
  • Thomas Bronack, Enterprise Resiliency and Corporate Certification Consultant, DCAG
  • Paul S. Randal, CEO & Owner, SQLskills.com

Glossary: Tabletop Testing (Passive Testing) – Walk through disaster scenarios and test your incident response procedures

Tabletop exercises provide a useful opportunity for the DR Coordinator to quickly test various skills and identify professional development needs.

  • Description: DR team members and other applicable third-party participants meet to verbally walk through the DR documentation to validate the specific steps, procedures, and scope without simulating an actual disaster.
  • Purpose: Allows for a review of the DR plan to ensure it remains relevant, and to test nuanced proficiencies. An informal classroom setting allows for easy brainstorming to adapt or enhance the DR plan.
  • Applicability: Broadly applicable as they provide a practical, impactful, and fiscally conscious DR testing methodology. Used in conjunction with previously validated checklist plans, tabletop exercises allow for focused discussions and enhanced training modules.
  • Bottom Line: Widely employed and effective; they have the highest relative impact on overall DR success.

Glossary: Unit Testing as Systems Are Updated – Incorporate standby equipment in change management procedures

Unit testing ensures standby equipment stays current with the production environment and is operational when you need it.

  • Description: When the production environment is updated, put your standby equipment and/or DR environment through the same release process, including change management and QA procedures.
  • Purpose: Ensures standby equipment and/or your DR environment is operational and current.
  • Applicability: Applies to all standby systems.
  • Bottom Line: Info-Tech has found this to be the second most effective testing methodology, after tabletop testing.

Note: Unit testing is also part of the process for validating system functionality after a simulation, parallel, or full-scale test.
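
As a minimal sketch of what "staying current with production" can mean in practice, the check below compares a production inventory against a DR inventory and reports any drift. The inventory format (component name mapped to a version string), the sample data, and the function name `find_drift` are all assumptions made for illustration; they are not part of any Info-Tech tool.

```python
def find_drift(production, dr):
    """Return components whose DR version is missing or differs from production.

    Both arguments are hypothetical inventories: dicts mapping a component
    name to a version string.
    """
    drift = {}
    for component, prod_version in production.items():
        dr_version = dr.get(component)  # None when the component is absent from DR
        if dr_version != prod_version:
            drift[component] = (prod_version, dr_version)
    return drift


# Illustrative data only
production = {"exchange": "2010", "sql-server": "2012", "os-patch": "2015-01"}
dr = {"exchange": "2003", "sql-server": "2012"}

for name, (prod_v, dr_v) in sorted(find_drift(production, dr).items()):
    print(f"{name}: production={prod_v}, DR={dr_v or 'missing'}")
```

A check like this could run as one step of the release process, so that any drift is flagged before the change is considered complete.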

Glossary: Simulation Testing – Provides authentic disaster experience without the expense or impact of a full failover

Simulations validate system recovery procedures, but not necessarily the ability to execute business processes at the DR environment.

  • Description: A disaster is simulated in a way that does not interrupt normal operations. Recovery facilities and systems are brought online to make sure procedures are accurate.
  • Purpose: Intended to validate, in whole or in part, DR hardware, software, personnel, communications, procedures, supplies, and documentation. Effective simulations are well scoped, rigorous, and comprehensive.
  • Applicability: A largely applicable test method that allows for comprehensive procedural review and development of key skills. However, simulations are more costly than simple walkthroughs.
  • Bottom Line: A widely used testing method because it’s comprehensive without interrupting the business.

Glossary: Parallel Testing – Allows for validation of the restoration environment to ensure operations are consistent

Identify operational gaps in the restoration system by loading historical data and running it against the historical outputs.

  • Description: An extension of simulation testing that includes the processing of historical data to ensure that systems are not just working, but working as intended.
  • Purpose: Provides a macro-level confirmation of DR plan readiness by reconciling the operational outputs from each system to identify and investigate variances in operations.
  • Applicability: Similar to simulation testing, it allows for comprehensive procedural review and development of key skills. It also validates business processes can be executed at the DR environment.
  • Bottom Line: Provides an extra level of confidence in your DRP by validating data and the ability to execute business processes.
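
The reconciliation step described above can be sketched as a comparison between historical production outputs and the outputs produced by the DR environment for the same inputs. This is only an illustration: the record keys, the numeric outputs, and the `reconcile` function are hypothetical, and a real reconciliation would depend on the systems being tested.

```python
def reconcile(historical_outputs, dr_outputs, tolerance=0.0):
    """Compare DR results against historical production results.

    Both arguments are hypothetical dicts mapping a record key (for example,
    a batch or report ID) to a numeric output. Returns a list of
    (key, expected, actual) tuples, one per variance found.
    """
    variances = []
    for key, expected in historical_outputs.items():
        actual = dr_outputs.get(key)
        # A missing result or a difference beyond the tolerance is a variance
        if actual is None or abs(actual - expected) > tolerance:
            variances.append((key, expected, actual))
    # Outputs that exist only in the DR run are also worth investigating
    for key in sorted(dr_outputs.keys() - historical_outputs.keys()):
        variances.append((key, None, dr_outputs[key]))
    return variances


# Illustrative data only
historical = {"invoice-batch-01": 1250.00, "invoice-batch-02": 980.50, "payroll-run-07": 40210.75}
dr_run = {"invoice-batch-01": 1250.00, "invoice-batch-02": 975.50}

for key, expected, actual in reconcile(historical, dr_run):
    print(f"{key}: expected={expected}, actual={actual}")
```

Each reported variance is a starting point for investigation, not a pass/fail verdict.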

Glossary: Full-Scale Testing – Failover to your DR environment to test end-to-end process

For most organizations, full-scale testing is not possible or practical due to the risk of downtime and insufficient DR capabilities.

  • Description: Involves the full interruption of the production environment and failing over to your DR environment.
  • Purpose: Failing over the production environment to the DR environment provides the most extreme test of your DRP. The business executes normal activities using the technology provided by the DR environment.
  • Applicability: Full-scale exercises are resource and time intensive. Even if capable, smaller organizations may find it difficult to get the executive support for such a program.
  • Bottom Line: Full-scale testing has the greatest potential cost and risk. As such, it should only be attempted by enterprises that have a high need to ensure the ability to withstand disaster and accurately validate RTOs/RPOs.
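
To make "validate RTOs/RPOs" concrete, the sketch below compares timestamps recorded during a test against the targets. It assumes the usual definitions (RTO as the maximum acceptable time to restore service, RPO as the maximum acceptable window of data loss). The timestamps, targets, and the `evaluate_recovery` function are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta


def evaluate_recovery(outage_start, service_restored, last_good_backup, rto, rpo):
    """Compare measured recovery results from a test against RTO/RPO targets."""
    recovery_time = service_restored - outage_start    # how long the service was down
    data_loss_window = outage_start - last_good_backup  # data created after the last recoverable copy
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Illustrative test log: failover started at 09:00, service restored at 13:30
result = evaluate_recovery(
    outage_start=datetime(2015, 2, 4, 9, 0),
    service_restored=datetime(2015, 2, 4, 13, 30),
    last_good_backup=datetime(2015, 2, 4, 8, 15),
    rto=timedelta(hours=4),
    rpo=timedelta(hours=1),
)
print(f"Recovery took {result['recovery_time']}, RTO met: {result['rto_met']}")
print(f"Data loss window {result['data_loss_window']}, RPO met: {result['rpo_met']}")
```

In this example the RPO is met but the RTO is missed by 30 minutes, which is the kind of gap a test should surface and record as an action item.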

DIY Workshop Instructions: Reduce Costly Downtime Through DR Testing

Introduction

This section provides guidelines for how to use this blueprint to run your own internal workshop to implement DR testing best practices. Alternatively, contact Info-Tech to facilitate an onsite workshop. The DR testing methodology used for the workshop is the same as what is outlined in the rest of this blueprint.

Specifically, this section includes the following:

  • Workshop schedule
  • Post-workshop steps
  • Recommended workshop participants

Note: For the workshop, use the same tools and activities that were outlined in the blueprint.

Workshop Schedule (Project Phase 1)

Complete the instructions outlined for the workshop schedule. This includes identifying the crises that are relevant to your organization and then testing your existing crisis management plans against one of the prioritized risks.

A summary of the activities, goals, and deliverables is listed below:

| Project Phase | Activity | Goal | List of Deliverables |
|---|---|---|---|
| 1. Determine your DR testing readiness and scope | A. Determine current testing readiness | Highlight the most likely as well as most impactful crisis vulnerabilities that are introduced through business strategy. | Readiness Assessment Tool |
| | B. Identify list of action items | Review all the action items needed before commencing DR testing. Also document those already completed, and establish an estimated completion date for the remaining items. | Readiness Assessment Tool |
| | C. Formulate testing strategy | Understand the implications of your current testing readiness, proficiency, and need. | Readiness Assessment Tool |

Workshop Schedule (Project Phase 2)

| Project Phase | Activity | Goal | List of Deliverables |
|---|---|---|---|
| 2. Create a project charter to build a test plan | A. Identify roles and responsibilities in project charter template | Create role clarity through assignment of responsibilities, which will reduce disengagement in the future. | Project Charter Template |
| | B. Work with the other members of the DR test plan team to complete the charter | Document a comprehensive project charter that covers all aspects of the test plan creation process. | Project Charter Template |
| | C. Document project parameters and the milestones table | Create clear expectations and buy-in from management so that DR testing will remain front of mind. | Project Charter Template |

Workshop Schedule (Project Phase 3)

| Project Phase | Activity | Goal | List of Deliverables |
|---|---|---|---|
| 3. Develop the DR test plans | A. Create the Test Plan Summary | Determine all the systems that are to be included in the test plan, and generate the test schedule for the entire test cycle. | Test Plan Summary; System Status Worksheet; System Test Plans |
| | B. Construct the Passive Testing Handbook | Determine the scope of each tabletop exercise that was identified in the test schedule and document the requirements for each test. | Passive Testing Handbook |
| | C. Organize the Active Testing Handbook | Determine the scope of each active test that was identified in the test schedule. Generate the necessary requirements for each test and also create the active testing execution tools. | Active Testing Handbook; Issue Log and Analysis Tool; Active Test Evaluation Survey |

Workshop Schedule (Project Phase 4)

| Project Phase | Activity | Goal | List of Deliverables |
|---|---|---|---|
| 4. Translate lessons learned into improving overall preparedness | A. Construct the executive presentation deck | Connect the IT team to the business side by demonstrating the need for testing and the results/benefits from testing. | Summary of Test Results |
| | B. Re-evaluate the action items and adjust the statuses and expected completion dates | Document the reduction in necessary action items for testing, and adjust the expected dates accordingly. | Summary of Test Results |
| | C. Discuss future test cycles and how to incorporate lessons learned | Establish DR testing as an annual test cycle, and incorporate the DR mindset into operational decision making through integration of lessons learned. | DR Test Plans |

Recommended workshop participants

The DRP team will be the core participants for the full workshop. Include business participants for the following steps:

  • Phase 2: Obtain executive sign-off on the project charter.
  • Phase 3: Invite the relevant business users and get their feedback on which systems to include in testing. This also serves as a review of the business impact analysis.
  • Phase 3: If necessary, obtain executive sign-off on the Facilitator Handbooks.
  • Phase 4: Present workshop results to management, so that everyone is clear on the gaps to address before testing can commence and to indicate an action plan for closing those gaps.

Guided Implementation

For additional guidance on how to run your own workshop, or for assistance with any of the project steps outlined in this blueprint, please call 1-888-670-8889 or email GuidedImplementations@InfoTech.com to arrange to speak to an Info-Tech subject matter expert.

About Info-Tech

Info-Tech Research Group is the world’s fastest-growing information technology research and advisory company, proudly serving over 30,000 IT professionals.

We produce unbiased and highly relevant research to help CIOs and IT leaders make strategic, timely, and well-informed decisions. We partner closely with IT teams to provide everything they need, from actionable tools to analyst guidance, ensuring they deliver measurable results for their organizations.

What Is a Blueprint?

A blueprint is designed to be a roadmap, containing a methodology and the tools and templates you need to solve your IT problems.

Each blueprint can be accompanied by a Guided Implementation that provides you access to our world-class analysts to help you get through the project.

Need Extra Help?
Speak With An Analyst

Get the help you need in this 4-phase advisory process. You'll receive 9 touchpoints with our researchers, all included in your membership.

Guided Implementation #1 - Determine your DR testing readiness and scope
  • Call #1 - DR testing overview, and identify current capabilities
  • Call #2 - Determine readiness for DR testing, and appropriate next steps

Guided Implementation #2 - Create a project charter to build a test plan
  • Call #1 - Identify and assign roles and responsibilities for building the test plan
  • Call #2 - Set expectations for objectives, resource requirements, and target milestone dates

Guided Implementation #3 - Create the DR test plan
  • Call #1 - Identify resource requirements to execute the DR test
  • Call #2 - Plan your passive testing exercises
  • Call #3 - Plan your active testing exercises

Guided Implementation #4 - Turn lessons learned into better preparedness
  • Call #1 - Define a process for incorporating lessons learned from testing
  • Call #2 - Create a DRP review, testing, and maintenance schedule

Authors

Frank Trovato

David Xu
