- Customers, regulators, and your executive team are demanding that you test your DRP, but resources are scarce.
- Most DR tests are focused solely on the technology and not the DR management process – which is where most plans fail.
- Over 60% of organizations that are testing do not document the results, so they fail to properly show evidence of testing and incorporate lessons learned into their DRP.
Our Advice
Critical Insight
- Be proactive – plan an annual test cycle that enables you to identify and coordinate resources well in advance.
- Don’t focus on one test. Plan a series of tests from walkthroughs to functional tests to validate both the DR process and technical capabilities.
- If you treat DR testing as a pass/fail exercise, you aren’t meeting the end goal of improving your DRP. Focus on identifying gaps and risks before a real disaster hits.
Impact and Result
- Create an effective DR test plan by following a structured process to discover current capabilities and defining test procedures for the entire range of testing methodologies. This includes:
- Defining current readiness through a comprehensive action items list, proficiency assessment, and needs analysis.
- Creating comprehensive test documentation that will support the test facilitator through both passive and active testing.
- Implementing a thorough review program that will incorporate learning points from testing into everyday operations.
Member Testimonials
After each Info-Tech experience, we ask our members to quantify the real-time savings, monetary impact, and project improvements our research helped them achieve. See our top member experiences for this blueprint and what our clients have to say.
Overall Impact: 9.8/10
Average $ Saved: $28,849
Average Days Saved: 15
Client | Experience | Impact | $ Saved | Days Saved |
---|---|---|---|---|
YHA New Zealand | Guided Implementation | 10/10 | $10,000 | 10 |
Catholic Health System | Guided Implementation | 9/10 | $12,399 | 10 |
Catholic Health System | Guided Implementation | 10/10 | $30,999 | 20 |
Children's Hospital Colorado | Guided Implementation | 10/10 | $61,999 | 20 |
National Bonds Corporation | Guided Implementation | 9/10 | $25,000 | 55 |
Workshop: Reduce Costly Downtime Through DR Testing
Workshops offer an easy way to accelerate your project. If you are unable to do the project yourself, and a Guided Implementation isn't enough, we offer low-cost delivery of our project workshops. We take you through every phase of your project and ensure that you have a roadmap in place to complete your project successfully.
Module 1: Determine Your DR Testing Readiness and Scope
The Purpose
- Identify current testing readiness based on current documentation and infrastructure, as well as by cross-referencing against a list of necessary action items.
- Determine current DR proficiency and need for DR testing.
Key Benefits Achieved
- Define current testing preparedness and determine the scope of testing that your organization is currently capable of.
- Determine a high-level testing strategy and outlook based on readiness, proficiency, and need.
Activities
Outputs
Review current testing practices
Assess current capabilities for all system tiers
Assess current proficiency and need
- Defined current testing capabilities and likelihood of success
Discuss testing strategy
- Analysis of best-fit testing strategy
Module 2: Create a Project Charter to Build a Test Plan
The Purpose
- Identify roles and responsibilities for building the test plan.
- Define project parameters and milestones.
Key Benefits Achieved
- Create project clarity with a project charter that outlines the objectives, resource requirements, and target milestone dates.
Activities
Outputs
Complete roles and responsibilities documentation
Establish project parameters and milestones
- Completed project charter with management sign-off
Module 3: Create the DR Test Plan
The Purpose
- Plan and document the entire testing cycle.
- Create all the necessary documentation that is needed before testing can commence.
- Identify resource requirements to execute the DR test.
Key Benefits Achieved
- Identify which tests are included in the test cycle.
- Define the roles and responsibilities of each test participant for the test cycle.
- Create a repeatable process that can be leveraged on an ongoing basis for DR testing.
Activities
Outputs
Complete the Test Plan Summary
- Complete test schedule and prioritized list of systems to include in testing
Construct the Passive Testing Handbook
- Established methodology for passive testing
Construct the Active Testing Handbook
- Established methodology for active testing
Module 4: Translate Lessons Learned Into Improving Overall Preparedness
The Purpose
- Demonstrate growth in DR capabilities through DR testing to the management team.
- Establish process for continual improvement of the DR process.
- Incorporate DR testing mindset into operational decision making.
Key Benefits Achieved
- Formulation of a clear connection between improved DR capabilities and confidence in recoverability.
- Consistently updated and validated DRP.
- Competitive advantage when attracting customers who demand an effective DRP.
Activities
Outputs
Creation of the DR Test Plan Results Summary Presentation
- Completed executive presentation deck
Review of current readiness
- Indication of capability growth
Review of all test plans
- Updated planning documentation
Reduce Costly Downtime Through DR Testing
Improve the accuracy of your DRP and your team’s ability to efficiently execute recovery procedures through regular DR testing.
Follow Info-Tech’s DR Test Planning and Execution Workflow to create a comprehensive test plan
DR Test Planning and Execution Workflow – Phases and Tools
Phase 1: Determine Testing Readiness
1. DR Test Plan Storyboard (Review Planning Process)
2. Readiness Assessment
Phase 2: Create Project Charter
3. Project Charter
Phase 3: Create a Test Plan
4. Test Plan Summary
5. System Status Worksheet
6. Passive Testing Handbook
7. Active Testing Handbook
8. System Test Plans
Phase 4: Maintain Your Test Plan
9. Issue Log and Analysis Tool
10. Active Testing Participant Evaluation Survey
11. Summary of Test Results
Call your account manager to schedule a Guided Implementation
This development workflow corresponds with the tools that are provided in this blueprint.
Validate your DR effectiveness through a DR test plan; know you can recover rather than thinking you can recover
This Research Is Designed For:
- Senior IT Management responsible for executing disaster recovery testing.
- Organizations seeking to formalize, optimize, or validate an existing DRP.
- Organizations needing to validate and prove their DR capabilities to third parties.
This Research Will Help You:
- Create a DR test plan that will validate your DR process from end-to-end.
- Capture key learning points of each test and present DR capability improvement to management.
- Mitigate potential testing issues and risks.
This Research Will Also Assist:
- Executives seeking to understand the time and resource commitment required for disaster recovery testing.
- Members of business continuity management and crisis management teams who need to incorporate testing elements into their own recovery processes.
This Research Will Help Them:
- Understand the role of DR testing in improving overall DR capabilities.
- Scope the time and resources required to develop a DR test plan.
Executive summary
Situation
- Recent natural disasters such as Hurricane Sandy have increased executive awareness and internal pressure to validate the effectiveness of the DRP.
- Similarly, industry and government-driven regulations and customers are demanding that organizations provide evidence of recoverability before the organization is given the right to do business.
Complication
- Documentation both before and during testing is limited and often ad-hoc, which significantly reduces the effectiveness of DR tests.
- Lack of engagement and buy-in from test participants results in testing dates being pushed back and often forgotten.
- Organizations without a DR test plan are far less able to recover from a disaster than organizations with a comprehensive DR test plan.
Resolution
- Create an effective DR test plan by following a structured process to discover current capabilities and defining test procedures for the entire range of testing methodologies. This includes:
- Defining current readiness through a comprehensive action items list, proficiency assessment, and needs analysis.
- Creating comprehensive test documentation that will support the test facilitator through both passive and active testing.
- Implementing a thorough review program that will incorporate learning points from testing into everyday operations.
Info-Tech Insight
- Using a DR test cycle will optimize DR test effectiveness, because using a progressive approach will allow value to transfer from one test to the next.
- The goal of testing is to uncover gaps and issues so that they can be eliminated before a real disaster. Focus on improving capabilities rather than worrying about whether you passed or failed.
- Budget size does not determine DR effectiveness; consistent testing and maintenance are the only way to truly prepare yourself against potential disasters.
Three ways to complete this project: Do-It-Yourself, Guided Implementations, or Onsite Workshop
Best-Practice Toolkit | Download and customize Info-Tech’s tools and templates to develop your project deliverables. | Use this do-it-yourself Best-Practice Toolkit to help you complete this project. The slides in this Blueprint will walk you step-by-step through every phase of your project with supporting tools and templates ready for you to use. |
---|---|---|
Guided Implementations | Speak to an Info-Tech subject matter expert for advice throughout the project. | Arrange to speak to an Info-Tech expert at key milestones to ensure maximum project value. |
Onsite Workshop | Accelerate your project with an onsite, expert Info-Tech facilitator to run a workshop for you. | To inquire about or request a workshop: |
Understand the value of effective DR testing
Sections:
- Introduction
- Project Phases
- Summary
What's in this Section:
- DR testing impact on ability to minimize downtime
- Blueprint and guided implementations overview
The cost of downtime increases exponentially if there are delays in recovery
DR testing has the ability to greatly reduce recovery times, which in turn minimizes the business impact; leverage these statistics to establish economic benefit and build a strong business case.
Delay in recovery causes exponential revenue loss
Potential Lost Revenue
The graph above illustrates a typical revenue loss curve during a system outage. The initial business impact is small; however, as the recovery time increases, the impact on revenue will increase exponentially until all revenue is lost. The goal of successful DR is to be able to recover during that initial time period where costs have yet to escalate. DR testing allows your organization to be more confident in your DRP by ensuring its relevancy and discovering DR issues before a real disaster. A robust testing strategy will reduce the possibility of a lengthy recovery process and thus mitigate unnecessary downtime costs. (Adapted from: Rothstein, Philip Jan. Disaster Recovery Testing Exercising Your Contingency Plan [2007 Edition])
Cost of Downtime
The cost of downtime differs drastically between organizations based on factors such as industry and organizational maturity. However, according to the Disaster Recovery Preparedness Benchmark Survey, almost 20% of organizations reported losses ranging from $50,000 to over $5 million when a critical application experienced downtime. DR testing allows your organization to discover downtime threats before they occur, so that recovery time can be shortened, thus mitigating the potential economic impact. (Disaster Recovery Preparedness Council, The State of Global Disaster Recovery Preparedness 2014)
Cost of Downtime | Share of Respondents |
---|---|
No Cost | 37% |
$1K-$6K | 18% |
$6K-$10K | 13% |
$10K-$20K | 8% |
$20K-$50K | 5% |
$50K-$100K | 10% |
$100K-$500K | 3% |
$500K-$1M | 3% |
$1M-$5M | 2% |
$5M+ | 2% |
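The "almost 20%" figure can be checked directly against the distribution above. The following is a minimal Python sketch of that arithmetic; the bucket labels and percentages are copied from the table, while the variable names are illustrative only. Note that the published percentages sum to 101%, presumably because of rounding.

```python
# Share of respondents (in percent) per downtime-cost bucket,
# copied from the survey distribution above.
downtime_cost_share = {
    "No Cost": 37,
    "$1K-$6K": 18,
    "$6K-$10K": 13,
    "$10K-$20K": 8,
    "$20K-$50K": 5,
    "$50K-$100K": 10,
    "$100K-$500K": 3,
    "$500K-$1M": 3,
    "$1M-$5M": 2,
    "$5M+": 2,
}

# Buckets that represent losses of $50,000 or more.
high_cost_buckets = [
    "$50K-$100K",
    "$100K-$500K",
    "$500K-$1M",
    "$1M-$5M",
    "$5M+",
]

high_cost_share = sum(downtime_cost_share[bucket] for bucket in high_cost_buckets)
total_share = sum(downtime_cost_share.values())

print(f"Respondents reporting losses of $50K or more: {high_cost_share}%")
print(f"Sum of all buckets: {total_share}%")
```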
DR testing reduces potential downtime by improving your ability to successfully execute your DRP
Creating a DRP is the first step. Testing then improves your likelihood of successful recovery from an actual disaster. Consider the following example scenario:
A disaster recovery plan has just been created and includes the following:
- Specific recovery procedures for all systems.
- Assigned roles and responsibilities, with all personnel informed and educated on the DRP.
- An appropriate storage and backup strategy.
However, in a real disaster, problems will be encountered that do not have a prepared response. When that happens, system owners will make decisions based on assumptions and guesswork, as if there were no plan. DR testing identifies those gaps before there is an actual disaster, so your DRP can be updated to be more accurate, staff are better prepared, and the chance of critical mistakes is reduced.
Organizations that test their DRP are substantially more successful than those that do not
"Routine testing is vital to survive a disaster… that’s when muscle memory sets in. If you don’t test your DR plan it falls [in importance], and you never see how routine changes impact it." – Jennifer Goshorn, Chief Administrative Officer, Gunderson Dettmer LLP
(Info-Tech Research Group; N = 81)
Effective DR testing is reliant on proper testing methodology and organizational mindset; not on budget size
Budget constraints should not be why your organization neglects testing. Conducting resource-efficient tests such as tabletop exercises is still an effective way to improve DR preparedness.
A = Extremely prepared for all disaster scenarios
F = Unprepared for majority of disaster scenarios
The Disaster Recovery Preparedness Benchmark Survey indicated a DR preparedness score for each respondent. About three quarters of organizations were at risk of not being able to recover from a disaster. Among these organizations who were at risk, a common trait is the lack of consistent DR testing and maintenance. (Disaster Recovery Preparedness Council, “The State of Global Disaster Recovery Preparedness” Annual Report 2014)
Best Practices From Prepared Organizations
Those who scored high on the survey exhibited these distinct traits:
- Tested their DR plans very frequently: Organizations who consistently tested and revised their DRP were able to create a much more actionable and reliable DRP, which is a primary factor in DR preparedness.
- Identified specific RTOs and RPOs: All prepared organizations had very accurate estimates of their RTOs and RPOs for each of their Tier 1 systems. The accuracy of these metrics was supported by a comprehensive testing plan that allowed the organizations to practice the DRP and make refinements.
- Large DR budget did not indicate a better DRP: While testing can be resource intensive, simply having a larger budget did not indicate a more prepared organization. An efficient testing strategy that extracts value from several smaller tests is able to deliver just as much value as large expensive tests. A strong and comprehensive DR test plan is much more reliant on the testing process and the commitment from participants. A good DR test plan will significantly improve your DR preparedness and give you a much better chance at mitigating the impact of IT disasters.
Regular testing ensures your DRP stays current and reliable through the constant changes in your data center
A DR test plan defines the process and resources required to ensure regular reviews, testing, and plan updates to keep your DRP accurate and complete.
" If you are running your shop effectively and proactively by having a consistent process of DR testing, review, and updates, then you can improve your ability to recover your IT infrastructure and your business by mitigating the potential consequences of disruptive events when they occur." – Paul Kirvan, FBCI, CISA, Independent IT Consultant/Auditor, Paul Kirvan Associates
Goals of a DR Test Plan
Validate the effectiveness of the DRP
A comprehensive DR test plan enables your organization to ensure the accuracy, completeness, and relevance of your recovery procedures.
If you do not have a DRP, refer to Info-Tech’s Create a Right-Sized Disaster Recovery Plan blueprint. Without a DRP, your DR testing validates only technology and not process.
Ensure data center changes are reflected in your DRP
DR testing uncovers gaps that can only be found by simulating recovery.
In the same manner, regular DR testing (at least annually) ensures the DRP stays current as your data center undergoes changes year after year.
Improve resiliency
Truly resilient organizations have DR and service continuity considerations ingrained in their everyday project and maintenance planning.
Similarly, DR testing improves an organization’s resiliency through better preparedness, and helps reinforce the importance of a solid, comprehensive DRP to business continuity.
Case Study: SAI Global’s newfound focus on DR started with a refresh of its testing methodology
Situation
- SAI Global is a risk management, standards compliance, and information company based in Sydney, Australia.
- In 2011, SAI Global’s board members mandated an update to their existing DR capabilities.
- Under the SAI Global umbrella were many smaller business units, each with a different disaster recovery strategy. Most of these strategies were designed around an older physical environment, whereas SAI Global’s current environment was almost entirely virtual. This outdated DR strategy made testing very difficult.
Action
- When SAI Global first started their DR update, it was very IT focused, and decisions were made from a technology point of view only. However, after consulting with the business, they realized that the scope of the update needed to be much wider. In particular, SAI Global wanted to incorporate the idea of having a centralized DR process that can provide consistent reporting and consistently test the systems.
- To achieve the above goal, SAI Global used a Site Recovery Manager (SRM) that allowed their business units to test all of their systems in parallel with each other.
- Phase one of this update process took 18 months to complete.
Result
- After the implementation of the SRM, SAI Global has been able to significantly improve their testing process. Currently, they are able to fully test their systems with minimal to no interruption to production. This was a huge win for many of their business units, as it eliminated one of the biggest hurdles to testing.
- SAI Global’s publishing business had a mandate for achieving five nines, and by creating this constant and structured DR testing process, the publishing business was able to achieve this goal.
- After phase one, SAI Global is looking to further improve their DR capabilities and perhaps even transition into areas such as disaster avoidance.
(Gardner, Dana. “Case Study: Strategic approach to disaster recovery and data lifecycle management pays off for Australia's SAI Global”)
Develop a DRP test plan – project overview
 | 1. Determine your DR testing readiness and scope | 2. Create a project charter to build a test plan | 3. Create the DR test plan | 4. Turn lessons learned into better preparedness |
---|---|---|---|---|
Best-Practice Toolkit | 1.1 Identify current testing readiness and action items; 1.2 Determine current DR proficiency and need | 2.1 Identify roles and responsibilities for building the test plan; 2.2 Define project parameters and milestones | 3.1 Create a framework for the overall test plan; 3.2 Create the passive testing facilitator’s handbook; 3.3 Create the active testing facilitator’s handbook | 4.1 Define a process for incorporating lessons learned from testing; 4.2 Create a DRP review, testing, and maintenance schedule |
Guided Implementations | | | | |
Onsite Workshop | Module 1: Determine the appropriate level of testing | Module 2: Create a project charter for building the test plan | Module 3: Create a test plan and supporting documentation | Module 4: Translate lessons learned in testing into improving overall DR preparedness |
 | Phase 1 Results: | Phase 2 Results: | Phase 3 Results: | Phase 4 Results: |
Workshop overview
Contact your account representative or email Workshops@InfoTech.com for more information
This workshop can be deployed as either a four- or five-day engagement, depending on the level of preparation completed by the client before the facilitator arrives onsite.
Pre-Workshop | Day 1 | Day 2 | Day 3 | Day 4 |
---|---|---|---|---|
Preparation | Workshop Day | Workshop Day | Workshop Day | Workshop Day |
Workshop Preparation | Morning Itinerary; Afternoon Itinerary | Morning Itinerary; Afternoon Itinerary | Morning Itinerary; Afternoon Itinerary | Morning Itinerary; Afternoon Itinerary; Plan next steps: |
Blueprint tools and templates overview
The following tools and templates are included in this blueprint to help you build your DR test plan:
- DR Test Plan Development and Execution Workflow: Overview of all tools and templates as well as their order of execution for the entire DR testing process.
- DR Testing Readiness Assessment Tool: Determine the appropriate level of testing and scope.
- DR Test Plan Project Charter Template: Set expectations for scope, resource requirements, and target dates for building the DR test plan.
- DR Test Plan Summary Template: Provides an overview of all committed DR tests in a test cycle and acts as the primary source of information for executives.
- DR Test Plan System Status Worksheet: Tracks all of the systems and dependencies to be tested.
- DR Test Plan Passive Testing Handbook: Document all the necessary resource requirements, scope, and the review process for conducting a passive test.
- DR Test Plan Active Testing Handbook: Document all the necessary resource requirements, scope, and the review process for conducting an active test.
- DR System Test Plan Template: Test plan for each individual system that is included in the DR test.
- DR Test Issue Log and Analysis Tool: Analyze and document issues that occurred during testing.
- DR Active Test Evaluation Survey: Leverage this tool to track feedback from test participants.
- DR Test Plan Results Summary Presentation: Present annual test results and lessons learned to management.
Develop a DR test plan
Sections:
Introduction
Project Phases
Summary
What's in this Section:
- Phase 1: Determine your DR testing readiness and scope
- Phase 2: Create a project charter to build a test plan
- Phase 3: Create the DR test plan
- Phase 4: Ensure your DRP is updated with lessons learned
Phase 1: Determine your DR testing readiness and scope
Phase 1 outline: Determine your DR testing readiness and scope
Call 1-888-670-8889 or email GuidedImplementations@InfoTech.com for more information.
Complete these steps on your own, or call us to complete a guided implementation. A guided implementation is a series of 2-3 advisory calls that help you execute each phase of a project. They are included in most advisory memberships.
Guided Implementation 1: Determine your DR testing readiness and scope
Proposed Time to Completion (in weeks): 2 weeks (1 call every 1 week)
Phase 1.1: Identify current testing readiness and action items
Start with an analyst kick-off call:
- Determine your current testing readiness based on prerequisites.
- Correlate testing readiness with the completion status of the Action Item List.
Then complete these activities…
- Complete and evaluate the Readiness Assessment tabs for Tier 1, 2, and 3 systems.
- Review the list of action items that still need to be completed before testing can commence.
With these tools & templates:
- DR Test Plan Development and Execution Workflow
- DR Testing Readiness Assessment Tool
Phase 1.2: Determine current DR proficiency and need
Review findings with analyst:
- Analyze your current testing proficiency and your likelihood of success for each system tier.
- Evaluate your need for testing and the testing gap between need and proficiency.
Then complete these activities…
- Define current testing capability to determine proficiency.
- Document your need for testing based on industry, customer, and internal demand.
- Based on your current situation, determine your testing strategy.
With these tools & templates:
- DR Testing Readiness Assessment Tool
Phase 1 Results & Insights:
- Develop a clear understanding of your current DR testing capabilities, which provides insight into the type of testing strategy you should use.
Understand the impact and complexity of each DR testing methodology to create the best fit solution
1.1: Readiness Assessment
DR Plan Testing Complexity Spectrum
DR testing methodologies vary greatly in complexity, resource demand, and preparation time. Not all methodologies are practical or even possible for all organizations.
- Tabletop Testing (TTX): Walking through DR scenarios using the DRP. Tabletop is strictly a classroom exercise.
- Unit Testing as Systems Are Updated: Testing standby equipment, particularly as updates are made to the production environment.
- Simulation Testing: Starting up standby systems and validating basic functionality.
- Parallel Testing: Moving beyond simply starting up machines to also restore business data and verify that standby systems can be used to execute business processes/transactions.
- Full-Scale (Full Interruption) Testing: The primary site is shut down and a full failover to an alternative site is executed, with a restore of all relevant data from the organization. The DR site becomes the primary site for this test.
Active vs. Passive Testing
Within this complexity spectrum, Tabletop Testing is categorized as “Passive Testing,” while all other forms of testing are categorized as “Active Testing.” Refer to the Appendix for a more detailed breakdown of each of the testing methods.
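The spectrum and the passive/active split can be captured in a small data structure, which is handy if you track test types in a script or spreadsheet export. This is a minimal Python sketch; the enum name `DRTestMethod`, its member names, and the helper `is_passive` are illustrative assumptions rather than part of any Info-Tech tool.

```python
from enum import IntEnum


class DRTestMethod(IntEnum):
    """DR testing methodologies, ordered from least to most complex."""

    TABLETOP = 1    # walk through DR scenarios using the DRP
    UNIT = 2        # test standby equipment as systems are updated
    SIMULATION = 3  # start standby systems, validate basic functionality
    PARALLEL = 4    # restore business data, execute business transactions
    FULL_SCALE = 5  # shut down the primary site, full failover


def is_passive(method: DRTestMethod) -> bool:
    """Only tabletop testing is passive; every other method is active."""
    return method is DRTestMethod.TABLETOP


for method in DRTestMethod:
    kind = "Passive" if is_passive(method) else "Active"
    print(f"{method.name:<10} {kind}")
```

Using an ordered enum means complexity comparisons such as `DRTestMethod.PARALLEL > DRTestMethod.SIMULATION` work directly.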
Start the DR test development process by identifying your current testing maturity
2 – Readiness Assessment Tool: Determine the appropriate level of testing (e.g. parallel testing) and high-level scope (e.g. Tier 1 systems) based on your current capabilities.
"There are different levels of testing and it is very progressive. I do not recommend my clients to do anything, unless they do it in a progressive fashion. Don’t try to do a live failover test with your users, right out of the box." – Steve Tower, Management Consultant, Steve Tower, Disaster Recovery Plans & Assessments
Prepare for the Readiness Assessment by ensuring you have the necessary prerequisites completed
Evaluate your DRP status
Confirm that the documentation for your DRP is complete. This would include documenting all the roles and responsibilities, incident response plans for each system, and a systems tier list. For more information on how to complete this stage, see: Create a Right-Sized Disaster Recovery Plan.
Evaluate your DR environment status
Identify your current DR environment solution (in-house, co-lo, MSP, vendor, none, etc.). Also review and identify the DR environment terms and conditions to determine:
- Type of testing permitted.
- Requirements for scheduling tests.
- Required sign-offs for testing.
Evaluate your vendor dependencies
Identify your critical vendors (e.g. hosting vendors, product support vendors, etc.).
- Determine DRP support expectations with these vendors. If this is currently not documented, then work with the vendor to establish DR agreements that include support for testing. For more information, see: DRP Vendor Evaluation Questionnaire and Tool.
The following tools require that the above steps be defined and documented. If you have yet to complete the above steps, please contact Info-Tech for assistance.
Identify current capabilities by assessing if you meet the base requirements for passive and active testing
[Activity] 2 - DR Testing Readiness Assessment Tool – Readiness Assessment: Prepared by Facilitator
- The Passive Testing section is an assessment of your DRP and is a prerequisite for all subsequent testing options.
- The Active Testing section is designed to assess the specific type of Active Testing (simulation, parallel, full-scale) that best suits the current capabilities of your organization.
- Within both sections there is a question that asks for the status of “action items.” Refer to the tab “Action Item List” to gain an understanding of all the necessary requirements before testing should occur.
- Once populated, the tool will assess which type of DR testing your organization meets the requirements for.
- Note: This tool is a reflection of maximum testing capability; a comprehensive testing strategy involves a series of less complex tests that lead up to your maximum testing capability.
- Repeat this process for Tier 2 and Tier 3 systems.
Use the Action Item List to track additional requirements that must be met before testing
- DR testing is a highly complex activity that requires a large amount of supporting documentation and planning processes. Info-Tech has identified a list of all the action items that are necessary for comprehensive test planning. Use the action items list to keep track of test planning progression.
- The Action Item List is broken into two primary sections:
- Test Plan Readiness Requirements: High-level documents that break down how each DR test will occur.
- System-Level Readiness Requirements: A granular breakdown of system-level documentation, e.g., how the ERP will be recovered during a disaster.
- Document the status of each action item. Choose one of “N/A,” “completed,” “in-progress,” or “requires action.”
- Along with the status of each piece of documentation, indicate the person responsible for completing it by the pre-established estimated completion date.
[Activity] 2 - DR Testing Readiness Assessment Tool – Action Item List: Prepared by Facilitator
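If you want to mirror the Action Item List outside the spreadsheet, the fields described above (section, status, owner, estimated completion date) map naturally onto a small record type. The following Python sketch is an assumption about how such a tracker could look; the class names, the `outstanding` helper, and the sample entries are hypothetical and not taken from the tool itself.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Status(Enum):
    """The four status values named in the Action Item List."""

    NOT_APPLICABLE = "N/A"
    COMPLETED = "completed"
    IN_PROGRESS = "in-progress"
    REQUIRES_ACTION = "requires action"


@dataclass
class ActionItem:
    description: str
    section: str  # "Test Plan" or "System-Level"
    status: Status
    owner: str
    due: date     # estimated completion date


def outstanding(items):
    """Return items that still block testing, earliest due date first."""
    open_statuses = {Status.IN_PROGRESS, Status.REQUIRES_ACTION}
    blocking = [item for item in items if item.status in open_statuses]
    return sorted(blocking, key=lambda item: item.due)


# Hypothetical entries, for illustration only.
items = [
    ActionItem("Passive Testing Handbook", "Test Plan",
               Status.COMPLETED, "Facilitator", date(2025, 1, 15)),
    ActionItem("ERP recovery procedure", "System-Level",
               Status.REQUIRES_ACTION, "ERP owner", date(2025, 3, 1)),
    ActionItem("Active Testing Handbook", "Test Plan",
               Status.IN_PROGRESS, "Facilitator", date(2025, 2, 1)),
]

for item in outstanding(items):
    print(f"{item.due}  {item.status.value:<15}  {item.description} ({item.owner})")
```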
Determine your current DR testing proficiency
1.2: Testing Proficiency and Need
[Activity] 2 - DR Testing Readiness Assessment Tool – Testing Proficiency: Prepared by Facilitator
- For each system tier, answer each question with a response from 1 to 10 (1 = Low/Infrequent, 10 = High/Very Frequent).
- Use the “Weight” system to adjust how important each question is to your organization. The default weighting is based on an average organization; adjusting it tailors the scores to your specific organization.
- Once populated, the tool will generate a DR Testing Proficiency Score. This score measures your testing program’s current maturity as well as its likelihood of success in testing. For example, if the readiness assessment determined that you are ready for “Full-Scale” testing but your proficiency score is low, then significantly more effort, such as several TTX dry runs, could be needed for your organization to succeed in full-scale testing.
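The tool's exact scoring formula is not described here, so the following Python sketch only illustrates the general idea of a weighted score. It assumes the score is a weighted average of the 1-10 responses, normalized to a percentage of the maximum; the function name `weighted_score` and the sample responses and weights are hypothetical.

```python
def weighted_score(responses, weights):
    """Weighted average of 1-10 responses as a percentage of the maximum.

    Assumption: the real tool may combine answers differently; this is
    one plausible way to apply per-question weights.
    """
    if len(responses) != len(weights):
        raise ValueError("each response needs exactly one weight")
    if any(not 1 <= r <= 10 for r in responses):
        raise ValueError("responses must be between 1 and 10")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(r * w for r, w in zip(responses, weights))
    return 100 * weighted_sum / (10 * total_weight)


# Hypothetical Tier 1 answers and weights, for illustration only.
tier1_responses = [8, 6, 7, 5]
tier1_weights = [3, 1, 2, 2]

score = weighted_score(tier1_responses, tier1_weights)
print(f"Tier 1 proficiency score: {score:.1f}%")
```

Raising the weight of a question pulls the overall score toward that question's answer, which is how the weighting tailors the result to your organization.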
Determine your need for DR testing
[Activity] 2 - DR Testing Readiness Assessment Tool – Testing Proficiency: Prepared by Facilitator
- For each system tier, answer each question with a response from 1 to 10 (1 = Low/Infrequent, 10 = High/Very Frequent).
- Use the “Weight” system to adjust how important each question is in relation to your organization. The current default is based on an average organization.
- E.g. Organizations in the financial industry will likely have a very high weight allocation to the regulatory requirements. Consequently, the score attributed to this question will have a much larger impact on the overall need for testing score. This will make the score more accurate and relevant to the specific needs of your organization.
- Once populated, the tool will generate a Need for DR Testing Score. If there is a high need for testing, then your organization should take steps to improve your DR testing proficiency so that you are able to meet the higher requirements. Track this metric each time you revise your testing process to determine potential testing demand changes.
Assess your current DR testing gap
1.2: Testing Proficiency and Need
Assess your current DR testing gap in the Proficiency Assessment – DR Testing Readiness Assessment Tool
- The Current Testing Gap score will be automatically populated once the previous two steps are completed. This score represents the difference between Testing Proficiency and Need for Testing. Ideally you would want your Testing Gap to be more than 5% positive, as that would mean not only are you capable of meeting the current DR testing needs but you are also relatively prepared for the increased needs of tomorrow.
- Track this metric as you repeat the testing process to determine how well your organization is closing the testing gap.
Determine the appropriate testing strategy based on capability, likelihood of success, and need for testing
1.2: Testing Proficiency and Need
Testing capability + Likelihood of success + Need for testing = Optimal Testing Strategy
Example Analysis
- My organization’s maximum testing capability is Parallel Testing for Tier 1 systems and Simulation Testing for Tier 2 and 3 systems.
- My likelihood of success is around 70% for Tier 1 systems and 30% for Tier 2 & 3 systems.
- There is a high need for testing in my organization and as a result there is a negative 6% Testing Gap between my current testing proficiency and need for testing for Tier 1 systems.
My Tier 1 test plan should incorporate a mixture of TTXs and simulations and culminate in a parallel test at the end of the year that incorporates the lessons learned from the earlier Tier 1 tests. Since the likelihood of success for Tier 2 and 3 systems is relatively low, I should focus first on mastering the recovery process using TTXs before transitioning into active testing. Both internal and external mandates are pushing for more complex testing; as such, investments are needed to improve current infrastructure so that full-scale testing can be conducted.
Info-Tech Insight
An optimal testing strategy is like building a pyramid: before conducting a parallel test or a full-scale test, it is best practice to first conduct several TTXs and simulation tests. Reduce the risks of complex testing by leveraging the lessons learned from less-complex tests.
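The reasoning in the example analysis can be expressed as a small rule of thumb. The sketch below is illustrative only: the ladder ordering follows the test types named in this blueprint, while the 50% likelihood threshold and the function name are assumptions chosen to reproduce the example, not values from the tool.

```python
# Test types from least to most complex, as described in this blueprint.
TEST_LADDER = ["tabletop", "simulation", "parallel", "full-scale"]


def plan_sequence(max_capability, likelihood_of_success, threshold=0.5):
    """Return the test types to schedule, least complex first.

    Assumption: below the threshold, stay with tabletop exercises until
    the recovery process is mastered; otherwise climb the ladder up to
    the maximum capability identified in the readiness assessment.
    """
    if max_capability not in TEST_LADDER:
        raise ValueError(f"unknown test type: {max_capability}")
    if likelihood_of_success < threshold:
        return ["tabletop"]
    top = TEST_LADDER.index(max_capability)
    return TEST_LADDER[: top + 1]


print(plan_sequence("parallel", 0.70))    # Tier 1 in the example analysis
print(plan_sequence("simulation", 0.30))  # Tier 2 and 3 in the example analysis
```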
If you want additional support, have our analysts guide you through this phase as part of an Info-Tech workshop
Book a workshop with our Info-Tech analysts
- To accelerate this project, engage your IT team in an Info-Tech workshop with an Info-Tech analyst team.
- Info-Tech analysts will join you and your team onsite at your location or welcome you to Info-Tech’s historic Toronto office to participate in an innovative onsite workshop.
- Contact your account manager (www.infotech.com/account), or email Workshops@InfoTech.com for more information.
The following are sample activities that will be conducted by Info-Tech analysts with your team:
1.1 Identify current testing readiness and action items
Document all the documentation and resources required before testing can commence. In addition, assess the readiness for testing of Tier 1, 2, and 3 systems to gain a holistic view of current testing maturity and the gaps that need to be closed before testing.
1.2 Define testing strategy through proficiency and need analysis
Complete the proficiency and need for testing assessments for all system tiers. Discuss what the scores imply for the likelihood of testing success and how that contributes to the testing strategy. Finalize a testing strategy and consider future improvements based on the need analysis.
Phase 2: Create a DR test plan project charter
Phase 1: Determine your DR testing readiness and scope
Phase 2: Create a project charter to build a test plan
Phase 3: Create the DR test plan
Phase 4: Translate lessons learned into improving overall preparedness
Phase 2 outline: Create a project charter to build a test plan
Call 1-888-670-8889 or email GuidedImplementations@InfoTech.com for more information.
Complete these steps on your own, or call us to complete a guided implementation. A guided implementation is a series of 2-3 advisory calls that help you execute each phase of a project. They are included in most advisory memberships.
Guided Implementation 2: Create a project charter to build a test plan
Proposed Time to Completion (in weeks): 4 weeks (1 call every 2 weeks)
Phase 2.1: Identify roles and responsibilities
Start with an analyst kick-off call:
- Review the benefits of the project charter (e.g. clarify expectations and resource requirements).
- Identify staff who need to be included in building the test plan (identify roles and responsibilities).
Then complete these activities…
- Complete the roles and responsibilities table in the project charter template (i.e. assign staff to roles), and modify descriptions as needed.
- Work with the members of your DR test plan team to complete the rest of the project charter.
With these tools & templates:
- DR Test Plan Project Charter Template
Phase 2.2: Define project parameters and milestones
Review findings with analyst:
- Review the project charter draft, including assigned roles and responsibilities.
- Determine appropriate project parameters, including milestones and target dates.
Then complete these activities…
- Complete the project parameters (e.g. objectives) and the milestones table in the project charter template.
- Obtain sign-off from senior management.
With these tools & templates:
- DR Test Plan Project Charter Template
Phase 2 Results & Insights:
- Clarify project expectations and resource requirements. Executive buy-in is critical to ensuring DR testing does not get pushed to the back burner.
Obtain executive support by creating a project charter for developing the test plan
2.1: Roles and responsibilities
At this stage of the project, the Readiness Assessment tool is completed, and you have a good understanding of the types of testing that your organization is capable of and needs to work toward. Next, create the project charter.
3 – Project Charter: Set expectations for scope, resource requirements, and target dates for building the test plan.
"Ownership needs to be defined clearly from the outset. Ambiguity in terms of who is responsible for each aspect of the testing process and who owns which system for the tests will almost certainly be problematic later on." – Robert Nardella, IT Service Management, Certified z/OS Mainframe Professional
Use Info-Tech’s DR Test Plan Project Charter Template to clarify requirements and expectations
2.1: Roles and responsibilities
Project Charter Components
Use the project charter to define project parameters, roles, and objectives, and thereby clarify expectations with the executive team. The specific components are listed below and then described in more detail in the remainder of this phase:
- Project Overview: Includes objectives, deliverables, and scope.
- Governance and Management: Includes roles, responsibilities, and resource requirements.
- Project Risks, Assumptions, and Constraints: Includes risks and mitigation strategies, as well as any assumptions and constraints.
- Project Sign-off: Includes IT and executive sign-off.
Note: This phase directs you to name the roles and responsibilities first so they can assist in defining the project charter.
DR Test Plan Project Charter Template
Define roles and responsibilities for the DR test team
2.1: Roles and responsibilities
Identify who will be participating in developing the test plan, and clarify levels of responsibility using the COBIT “RACI” approach:
- Responsible: Responsible for doing the activity (the work).
- Accountable: Accountable to ensure the activity (the work) happens.
- Consulted: Consulted prior to decision or action.
- Informed: Informed of the decision or action.
Specifically, assign the following roles (the project charter template provides additional descriptions which you can modify as needed to suit your organization):
- Executive Sponsor: Liaison with the executive team (the CIO would be a good candidate for this role).
- Project Lead: Responsible for driving the project, determining the methodology to be followed, and assigning required resources.
- DR Testing Facilitator: Function as the project manager. This includes coordinating resources and reporting progress.
- Subject Matter Experts (SMEs): Required to ensure they have a test plan for their respective systems.
- Business Unit Managers: Assign business users to assist with developing acceptance test plans.
- Executive Team (or named subset): Sign off on the Project Charter and the DR Test Plan when completed.
Note: This blueprint is directed primarily at the Project Lead who will work with the rest of the team.
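A RACI assignment is easy to sanity-check once it is written down as data. The sketch below is a minimal illustration, not part of the project charter template: the activity names are hypothetical, and the rule that each activity has exactly one Accountable role and at least one Responsible role is a common RACI convention assumed here rather than something this blueprint states.

```python
def raci_problems(matrix):
    """Return a list of problems found in a RACI matrix.

    matrix maps each activity to a {role: letter} dict, where letter is
    one of "R", "A", "C", "I".
    """
    problems = []
    for activity, assignments in matrix.items():
        letters = list(assignments.values())
        accountable = letters.count("A")
        if accountable != 1:
            problems.append(
                f"{activity}: expected exactly one Accountable, found {accountable}"
            )
        if "R" not in letters:
            problems.append(f"{activity}: no one is Responsible")
    return problems


# Hypothetical activities, using the roles listed above.
charter = {
    "Draft project charter": {
        "Project Lead": "A",
        "DR Testing Facilitator": "R",
        "Executive Sponsor": "C",
    },
    "Write system test plans": {
        "Subject Matter Experts": "R",
        "Business Unit Managers": "C",
    },
}
for problem in raci_problems(charter):
    print(problem)
```

Here the second activity is flagged because nobody is accountable for it, which is the kind of ownership ambiguity the charter is meant to remove.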
Define project parameters
2.2: Project parameters and milestones
Complete the following sections in the project charter template and review with the executive sponsor to confirm project parameters.
- Project Background and Drivers: Document the rationale for the project, which will reinforce support for the project. Drivers might include a failed audit or concern over the organization’s current ability to recover from a disaster.
- Project Objectives: The project charter template includes objectives based on this blueprint – modify these as needed.
- Project Deliverables: The project charter template lists the core deliverables for a DR test plan (generated by this blueprint).
- Project Scope: Further clarify objectives by listing what is in scope and out of scope.
Set achievable, realistic target dates for project milestones
2.2: Project parameters and milestones
The project milestones section in the project charter is prefilled based on the steps in this blueprint to provide a starting point for your project planning:
- Customize the milestones to accommodate special requirements for your organization.
- Set achievable, realistic target dates. Most organizations find they have several gaps in DR testing capability that need to be addressed.
- Use the project milestones table to guide project management and scheduling.
Further clarify project parameters by documenting risks, assumptions, and constraints
2.2: Project parameters and milestones
Set Expectations Up Front
For most organizations, the biggest risk is resource availability. More immediate tasks take priority and DR testing gets pushed to the back burner.
Complete the following sections in the project charter template to explicitly state these risks and resource requirements:
- Risks, Assumptions, and Constraints
- Reviews and Reporting
- Resource Requirements
Mitigate Project Risks
As noted in the project charter template, an effective project risk mitigation strategy is to block a specific timeslot each week to allow time for collaboration as well as completing individual assignments.
The earlier sections of the project charter will also help set expectations and mitigate these risks. For example:
- Executive sponsorship. The more senior the executive, the better. If steps are delayed due to lack of buy-in or conflicting projects, you need to be able to escalate these issues, and the higher you can go (if necessary), the better.
- Assigning named resources. Roles such as the Project Lead, DR Testing Facilitator, and system SMEs will do the bulk of the test plan development. Assigning these roles up front will help you clarify resource requirements.
- Defining clear project objectives and milestone target dates. This sets clear expectations about the project direction and expected outcomes.
If you want additional support, have our analysts guide you through this phase as part of an Info-Tech workshop
Book a workshop with our Info-Tech analysts
- To accelerate this project, engage your IT team in an Info-Tech workshop with an Info-Tech analyst team.
- Info-Tech analysts will join you and your team onsite at your location or welcome you to Info-Tech’s historic Toronto office to participate in an innovative onsite workshop.
- Contact your account manager (www.infotech.com/account), or email Workshops@InfoTech.com for more information.
The following are sample activities that will be conducted by Info-Tech analysts with your team:
2.1 Identify roles and responsibilities
Review the benefits of the project charter (e.g. clarify expectations and resource requirements). We will also identify the staff needed to build the test plans, including a discussion of roles and responsibilities.
2.2 Define project parameters and milestones
Review the draft project charter, including assigned roles and responsibilities. From here, we will determine the appropriate project parameters and set milestones with target dates. Once the project charter is finalized, we will identify the relevant sign-off authority to approve the start of the planning process.
Phase 3: Create the DR test plan
Phase 1: Determine your DR testing readiness and scope
Phase 2: Create a project charter to build a test plan
Phase 3: Create the DR test plan
Phase 4: Translate lessons learned into improving overall preparedness
Phase 3 outline: Create the DR test plan
Call 1-888-670-8889 or email GuidedImplementations@InfoTech.com for more information.
Complete these steps on your own, or call us to complete a guided implementation. A guided implementation is a series of 2-3 advisory calls that help you execute each phase of a project. They are included in most advisory memberships.
Guided Implementation 3: Create the necessary documentation for DR testing
Proposed Time to Completion (in weeks): 6 weeks (1 call every 2 weeks)
Step 3.1: Test Plan Summary
Start with an analyst kick-off call:
- Discuss test plan creation methodology
- Determine the scope of systems to include in testing
Then complete these activities…
- Document System Status
- Determine Test Schedule
With these tools & templates:
- DR Test Plan Summary Template
- DR Test Plan System Status Worksheet
- DR System Test Plan Template
Step 3.2: Passive Testing Handbook
Review findings with analyst:
- Review passive testing methodology
- Determine passive testing requirements and scope
Then complete these activities…
- Complete the Passive Testing Facilitator's Handbook
With these tools & templates:
- DR Test Plan Passive Testing Handbook
Step 3.3: Active Testing Handbook
Finalize phase deliverable:
- Review active testing methodology
- Determine active testing requirements and scope
- Discuss active testing execution tools
Then complete these activities…
- Complete the Active Testing Facilitator's Handbook
With these tools & templates:
- DR Test Plan Active Testing Handbook
- DR Test Issue Log and Analysis Tool
- DR Active Test Evaluation Survey
Phase 3 Results & Insights:
- Identified overall testing schedule based on system prioritization and completed all necessary planning documentation needed to execute both active and passive testing.
Optimize testing resources by mapping out the testing process through an all-inclusive DR Test Plan
3.1: Test Plan Summary
At this stage of the project, the Project Charter is completed and approved. You are now ready to create the necessary handbooks and exercises for Passive and Active Testing.
Info-Tech Insight
Maximize the value of each test by planning ahead. For instance, schedule tabletop exercises to act as dry runs before active testing. Lessons learned greatly improve the success and effectiveness of future DR tests.
Identify the systems to include in your overall DR Test Plan
3.1: Test Plan Summary
[Activity] 4 – DR Test Plan Summary Template: Prepared by Test Facilitator and SMEs
If either of the following applies, work with management to determine the specific systems to include in your DR test:
- You have already conducted a baseline test of your overall DR environment.
- You are not provisioned to test all systems (e.g. your DR environment is intentionally equipped to serve only Tier 1 systems).
Scope Test Selection Criteria
Determine the ideal systems to include in a test plan by following the criteria below:
- Level of criticality: Prioritize testing of Tier 1 systems before Tier 2, and Tier 2 before Tier 3.
- Magnitude/Frequency of change: Prioritize the testing of systems that have undergone significant changes or a high number of smaller changes, relative to systems that were unmodified. Leverage your change management process and tools to identify systems that have undergone significant changes.
- Time since last tested: Prioritize the testing of systems that have not been tested for an extended period of time.
Level of criticality is the primary deciding factor, magnitude/frequency of change is the secondary factor, and time since last tested is the tertiary factor.
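The three-level ordering above maps directly onto a multi-key sort. The following sketch is illustrative: the record fields and sample systems are hypothetical, and using a simple change count as a stand-in for magnitude/frequency of change is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SystemRecord:
    name: str
    tier: int                     # 1 = most critical
    changes_since_last_test: int  # assumed proxy for magnitude/frequency of change
    last_tested: date


def prioritize(systems):
    """Sort by tier, then by most changes, then by oldest test date."""
    return sorted(
        systems,
        key=lambda s: (s.tier, -s.changes_since_last_test, s.last_tested),
    )


# Hypothetical inventory.
systems = [
    SystemRecord("Intranet wiki", 3, 12, date(2022, 1, 10)),
    SystemRecord("ERP", 1, 2, date(2023, 6, 1)),
    SystemRecord("Email", 1, 9, date(2023, 9, 15)),
    SystemRecord("File shares", 2, 0, date(2021, 11, 3)),
    SystemRecord("HR portal", 2, 0, date(2023, 2, 20)),
]
for system in prioritize(systems):
    print(system.tier, system.name)
```

Note that the heavily changed Tier 3 wiki still sorts last: change and time since last test only break ties within a tier, as the criteria specify.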
Info-Tech Insight
Once you have successfully tested your entire DR environment (i.e. established a baseline), you can gain testing efficiency by then focusing future tests on a subset of your environment based on the criteria above (criticality, change, and time since last tested).
Document the identified systems for DR testing
3.1: Test Plan Summary
[Activity] 5 – DR Test Plan System Status Worksheet: Completed by Test Facilitator and SMEs
After the systems to be tested have been identified, system owners can leverage the DR Test Plan System Status Worksheet to document the following:
- Application/System to be tested: List of all the systems that are to be included in the test.
- System owners: Identify the system owner of each application/system that is to be tested.
- System Dependencies: Document items such as SAN, Active Directory, DHCP, DNS, etc.
- DR Test Readiness Status: Ensure that the specific system is ready for testing by compiling the DR procedures, test plans (Unit Test, System Test, User Acceptance Test), backups, DR test configuration, and backout plans.
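The worksheet fields above can also be kept as structured data so that missing readiness items are easy to list. This is a minimal sketch under stated assumptions: the dictionary layout, the sample system, and the owner name are hypothetical, and the readiness item names are taken from the bullet above rather than from the worksheet itself.

```python
# Readiness items named in the worksheet description above.
READINESS_ITEMS = (
    "DR procedures",
    "unit test plan",
    "system test plan",
    "user acceptance test plan",
    "backups",
    "DR test configuration",
    "backout plan",
)


def missing_items(completed):
    """Return the readiness items not yet completed, in checklist order."""
    return [item for item in READINESS_ITEMS if item not in completed]


# Hypothetical worksheet row.
erp = {
    "system": "ERP",
    "owner": "J. Smith",
    "dependencies": ["SAN", "Active Directory", "DNS"],
    "completed": {"DR procedures", "backups", "unit test plan", "system test plan"},
}
outstanding = missing_items(erp["completed"])
status = "ready" if not outstanding else "not ready"
print(f"{erp['system']} ({erp['owner']}): {status}; missing: {outstanding}")
```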
Construct a System Test Plan for each individual system
3.1: Test Plan Summary
- A key line item in the Readiness Status section of 5 – DR Test Plan System Status Worksheet is the creation of System Test Plans for each individual system that is going to be tested. This individual system test plan is used by the system owners during the test and it includes the following:
- A reference to the DR instructions (e.g. to failover to the DR environment, restore from backups if necessary, and bring the standby system online).
- A reference to DR testing constraints, such as network configuration requirements to isolate the DR/standby system from the production environment, if necessary.
- Procedures to validate system functionality after executing the DR procedure (e.g. unit test, system test, and acceptance test instructions to validate that the standby system is functioning as expected after the failover).
Assign system owners the task of completing the System Test Plan; they can leverage Info-Tech’s 8 – DR System Test Plan Template.
Establish a test schedule to roadmap your entire DR test plan
3.1: Test Plan Summary
A successful DR test plan is built on foresight; by planning out how each test feeds into the subsequent test, the maximum value of each test can be realized.
[Activity] 4 – DR Test Plan Summary Template: Completed by Facilitator
- Identify the type of testing based on the assessments made during the readiness assessment.
- It is best practice to start with tabletop exercises which will act as dry runs for the more complex testing methods.
- Describe the test at a high level, indicating the scenario that will occur and the scope that the test will capture.
- Document the date and time of testing for management approval.
"All organizations will go through a crawl, walk, run phase in terms of test maturity. It is extremely important in the early stages of development to concentrate the focus on actual recoverability and data protection, enhancing these capabilities over time into a fully matured program that can truly test the recovery, and not simply focussing on the testing process itself." – Joe Starzyk, Senior Business Development Executive, IBM Global Services
Info-Tech Insight
Establishing a test schedule for the year enables the DR team and the rest of the organization to work from the same page and avoid resource conflicts. While future events may change test dates, an established and pre-approved test schedule will help you ensure resources are available for each test.
Passive testing overview: tabletop exercises are effective for incident response planning and validation
3.2: Passive Testing Handbook
Tabletop planning is a paper-based exercise where the DRP team walks through disaster scenarios and maps out what should happen at each stage, effectively defining their incident response plan. After you have a DRP in place, use this exercise to walk through and validate your incident response plan.
Tabletop planning had the greatest impact on meeting recovery objectives (RTOs/RPOs) among survey respondents
Note: Relative importance indicates the contribution an individual testing methodology, conducted at least annually, had on predicting success meeting recovery objectives, when controlling for all other types of tests in a regression model. The relative-importance values have been standardized to sum to 100%.
Success was based on the following items:
- Recovery time objectives (RTOs) are consistently met.
- IT has confidence in the ongoing ability to meet RTOs.
- Recovery point objectives (RPOs) are consistently met.
- IT has confidence in the ongoing ability to meet RPOs.
Why is tabletop planning so effective?
- It enables you to play out a wider range of scenarios than technology-based testing (e.g. full-scale, parallel), which is limited by cost and complexity.
- It is non-intrusive, so it can be executed more frequently than other testing methodologies.
- It provides a thorough test of your incident response plan since the exercise is, essentially, paper-based.
Conduct a tabletop planning exercise using best practices
3.2: Passive Testing Handbook
Use tabletop planning to test the current achievable recovery timeline, and identify gaps in your current disaster recovery capabilities.
For each high-level recovery step, do the following:
- On white index cards:
- Record the step.
- Indicate the task owner.
- Note the task start and end time (use the running recovery time as your clock, where 00:00 is when the incident occurred).
- On yellow index cards, document gaps in people, process, and technology requirements to complete the step.
- On red index cards, indicate risks (e.g. no backup person for a key staff member).
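Once the exercise is over, the white cards can be transcribed to check the achievable recovery timeline against a target. The sketch below is illustrative only: the steps, owners, times, and the 4-hour RTO are hypothetical. It takes the latest end time as the recovery time because simultaneous steps overlap on the running clock.

```python
def to_minutes(hhmm):
    """Convert running recovery time 'HH:MM' (00:00 = incident) to minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def recovery_time(cards):
    """Running recovery time, in minutes, at which the last step ends."""
    return max(to_minutes(card["end"]) for card in cards)


def as_hhmm(total_minutes):
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"


# Hypothetical white cards; two steps run simultaneously from 01:30.
cards = [
    {"step": "Assess incident and declare disaster", "owner": "IT Manager",
     "start": "00:00", "end": "01:30"},
    {"step": "Fail over network", "owner": "Network Admin",
     "start": "01:30", "end": "03:00"},
    {"step": "Restore Tier 1 databases", "owner": "DBA",
     "start": "01:30", "end": "05:15"},
    {"step": "Validate applications", "owner": "Application Owner",
     "start": "05:15", "end": "06:45"},
]

rto_minutes = 4 * 60  # hypothetical RTO
achieved = recovery_time(cards)
print("Achievable recovery time:", as_hhmm(achieved))
if achieved > rto_minutes:
    print("Exceeds RTO by", as_hhmm(achieved - rto_minutes))
```

This kind of transcription belongs after the session; during the exercise itself, index cards and whiteboards keep the discussion moving.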
Tabletop planning is simple, but effective:
- Discuss each step from start to finish.
- Keep focused; stay on task and on time.
- Revisit each step and record risks and mitigation strategies.
- Revise the plan with key task owners.
Info-Tech Insight
Record everything, but don’t get weighed down by tools. Relying on software or other technological tools can detract from the exercise. Use simple tools such as index cards and whiteboards.
Tabletop planning example
3.2: Passive Testing Handbook
Below is a picture of the results of an actual tabletop planning exercise.
Photo credit: Info-Tech
White index cards indicate high-level DR steps in a linear flow with branches to represent simultaneous steps.
Yellow index cards indicate gaps in people, process, and technology requirements to complete the step.
Red index cards indicate risks (e.g. no backup person for a key staff member).
Execute a successful TTX by drafting a Facilitator’s Handbook
3.2: Passive Testing Handbook
The primary contributor to ineffective tabletop exercises (TTXs) is a lack of engagement from participants. Facilitators can avoid disinterest by generating content-rich discussions and realistic scenarios.
[Activity] 6 – DR Test Plan Passive Testing Handbook: Completed by Facilitator
Leverage the sample questions provided in the Passive Testing Handbook to drive insightful discussions during your tabletop exercise. In addition, prepare more questions prior to the exercise to ensure that every minute of the exercise contributes to the overall testing objectives.
Initiate TTX scenario planning with common threat scenarios that focus on overall service continuity
3.2: Passive Testing Handbook
Unrealistic scenarios are a key contributor to futile TTXs; focus initial TTXs on more common threats to service continuity such as hardware & software failures, network outages, and power outages.
Causes of Unacceptable Downtime:
- Software Failure: 24%
- Isolated Hardware Failure: 21%
45% of service interruptions that went beyond maximum downtime guidelines set by the business were caused by software and hardware issues.
- External Network Failure: 19%
- Power Outage: 18%
37% of incidents were caused by network or power outages.
- Building is Inaccessible (e.g. due to a local hazard): 5%
- Equipment Damage (e.g. due to fire, roof collapse, etc.): 7%
- Natural Disaster: 5%
Only 12% of incidents were caused by major events (i.e. significant physical damage or regional impact).
(Info-Tech Research Group; N=87)
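The subtotals quoted above can be reproduced from the individual percentages. In the sketch below, the grouping labels are hypothetical, and treating the 12% "major events" figure as Equipment Damage plus Natural Disaster is an assumption based on the arithmetic (7% + 5%); the source chart does not state which categories it combines.

```python
causes = {
    "Software Failure": 24,
    "Isolated Hardware Failure": 21,
    "External Network Failure": 19,
    "Power Outage": 18,
    "Building is Inaccessible": 5,
    "Equipment Damage": 7,
    "Natural Disaster": 5,
}

# Assumed grouping; "Building is Inaccessible" is left ungrouped.
groups = {
    "Software and hardware failures": ["Software Failure", "Isolated Hardware Failure"],
    "Network or power outages": ["External Network Failure", "Power Outage"],
    "Major events": ["Equipment Damage", "Natural Disaster"],
}


def group_totals(causes, groups):
    """Sum the percentages of the causes in each group."""
    return {
        label: sum(causes[name] for name in members)
        for label, members in groups.items()
    }


totals = group_totals(causes, groups)
for label, pct in totals.items():
    print(f"{label}: {pct}%")
print("All causes:", sum(causes.values()), "%")
```

The individual figures sum to 99%, which is presumably a rounding effect in the published percentages.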
Info-Tech Insight
Does this mean I don’t need to worry about natural disasters? No. It means DR test planning needs to focus on overall service continuity, not just major disasters. If you ignore the more common but less dramatic causes of service interruptions, you will suffer the proverbial “death by a thousand cuts.”
Maintain the realism of DR scenarios by planning for compound scenarios
3.2: Passive Testing Handbook
During a real disaster, incidents typically do not occur in an isolated sequential order. A realistic scenario should incorporate the possibility of multiple incidents occurring simultaneously (e.g. a gas leak requires building evacuation and power to be shut down).
[Activity] 6 – DR Test Plan Passive Testing Handbook: Completed by Facilitator
- Document the scenario that the TTX team will be walking through in the Passive Testing Handbook. Below is an example:
Scenario
Instructions: Identify the scenario that the participants will be walking through. For the first TTX scenario, Info-Tech Research Group advocates a denial-of-access type of incident where your IT infrastructure is inaccessible but not physically damaged. Adjust the scenario as you see fit.
6:00 AM Monday morning: Local authorities confirmed that there has been a gas leak in close proximity to your building. The entire office building is compromised and all staff need to be evacuated. All power has been terminated by city officials. The gas leak is expected to take local authorities 2 weeks to remedy and the estimated time for return access to the primary building will be 3 weeks. Given the circumstances, the executives of XYZ Corporation have decided to failover all IT functions to the DR site.
Info-Tech Insight:
While a compound disaster can increase the realism of the TTX, it is generally best practice to limit the number of incidents (e.g. hardware failure combined with network outage) within a scenario to 2 or 3. A scenario with too many incidents can cause the TTX to be too complex and difficult to complete.
Capture key learning points from the TTX in a Hotwash
3.2: Passive Testing Handbook
Eliminate the possibility of disengagement by documenting the strengths and weaknesses of the exercise, as well as areas of improvement for the DRP, immediately after the exercise.
[Activity] 6 – DR Test Plan Passive Testing Handbook: Completed by Facilitator
Hotwash: A discussion directly following the exercise that documents and analyzes the results and lessons learned.
Participant Evaluation Survey: A survey that is distributed to the participants following the hotwash to capture the discussion. It also allows for anonymous comments.
- After the TTX scenario has been documented, continue through the Handbook template: document the hotwash questions and adjust the participant evaluation survey if needed.
- The hotwash discussion is designed to be held, and the participant evaluation survey handed out, directly following the TTX.
Ensure that lessons learned during a test contribute to improving overall DR planning and future tests
3.2: Passive Testing Handbook
Lessons learned during an exercise can only translate into operational improvements in a real disaster through repetition; a best practice is to conduct a post-mortem one month after the TTX.
DR Test Plan Passive Testing Handbook
Test Results and Lessons Learned: Following the TTX, the Facilitator will prepare materials for a post-mortem that will occur about one month later. This meeting is used to review the results from the exercise, as well as to assign and approve action items that incorporate lessons learned during the TTX and improve the disaster recovery process.
- This specific section of the template does not need to be modified by the Facilitator during the planning phase. This document is prepared following the exercise.
"We had a mature process, but after each test we were still always able to learn something new." – Robert Nardella, IT Service Management, Certified z/OS Mainframe Professional
Repeat the planning process for subsequent TTXs and consider adjusting complexity or scope
3.2: Passive Testing Handbook
The key benefit of establishing a test plan before testing is that it allows you to see how each test feeds into the next; use this strategy to your advantage and improve DR capability with every test.
For the first TTX, the scenario below might have been the one that you walked through:
Scenario for TTX 1
6:00 AM Monday morning: Local authorities confirmed that there has been a gas leak in close proximity to your building. The entire office building is compromised and all staff need to be evacuated. All power has been terminated by city officials. The gas leak is expected to take local authorities 2 weeks to remedy and the estimated time for return access to the primary building will be 3 weeks. Given the circumstances, the executives of XYZ Corporation have decided to failover all IT functions to the DR environment.
For a second TTX, repeat the same planning process but adjust the scenario so that lessons learned in the first scenario can be applied to the second. This will allow the team to demonstrate that they are capable of solving a more complex situation and help reinforce the lessons learned.
Scenario for TTX 2
6:00 AM Monday morning: Building security personnel alert XYZ Corporation that due to heavy rain, the data center has been flooded. Power has been shut down and all systems in the data center are damaged.
Active Testing overview: it is important for your organization to conduct regular active tests
3.3: Active Testing Handbook
Active testing consists of simulation, parallel, and full-scale testing. In active tests, the DR procedures, test plans, and technology (hardware/software) are operationally executed to mimic a live scenario. Passive tests, which focus on “what would have happened,” are often used as a dry run before the active test.
Advantages of Active Tests
1. Uncover minute details
While TTXs are great for identifying high-level process issues, they are often unable to uncover issues at a granular level. Changes such as updated passwords or new phone numbers will not be reflected in a TTX and can only be identified during a live test where the procedures are actually executed.
2. Hands-on practice
Regular testing allows the system owner to familiarize themselves with the recovery process, which will contribute to a faster and more reliable recovery process.
3. Experience real pressure
A disaster is often a high-pressure situation, and it is nearly impossible to predict how your staff will respond under such circumstances. The only way to ensure that your staff responds favorably is to familiarize them through real-life testing and to test frequently enough that the response becomes instinctual rather than reactive.
While TTXs are very efficient exercises, they cannot completely replace active testing. Your organization cannot be fully confident that business operations can be sustained and recovered until you have executed an active test.
Create an Active Testing Facilitator’s Handbook that keeps track of all the necessary resource requirements
Active Testing is significantly more involved than Passive Testing, and to ensure that an active test can run smoothly, all relevant resources need to be documented up front and signed off.
[Activity] 7 – DR Test Plan Active Testing Handbook: Completed by Facilitator
- Document all of the relevant requirements for all the active tests that were planned in 4 – DR Test Plan Summary Template. Make sure to indicate which specific test will be needed for each identified resource.
- Resources Include:
- Staff Requirements
- Documentation Requirements
- Technology Requirements
- DR Environment Requirements
- Third-Party (Vendor) Requirements
- Budget Requirements
- Risks and Mitigation Strategies
Info-Tech Insight
Reduce vendor and travel costs (if applicable) by combining the simulation and parallel test into one exercise. Start with simulation testing (bring systems online and verify basic functionality), and continue with parallel testing (load transaction/application data, and conduct user acceptance testing by replicating business processes with that data in the DR environment).
Prepare a Test Issue Log that will allow your DR team to document the types of errors/issues that occur
[Activity] 9 – DR Test Issue Log and Analysis Tool: Completed by Facilitator
- The aim of this tool is to help the DR Facilitator analyze the types of errors/issues that are occurring during the DR test.
- In order to standardize reporting, the Facilitator needs to create the list of all error/issue categories that can potentially occur during the test. Once this list has been completed, the Facilitator will distribute a copy of the Issue Log and Analysis Tool to each system owner for them to use during the test.
- The system owners who are executing the test will then use the dynamic drop-down menus to populate the Issue Log during the test. The analysis tab will automatically populate as the system owner completes the issue log. Note: The tool supports up to 25 unique errors.
Aggregate the results from the Issue Log and develop a comprehensive analysis of the active test
[Activity] 9 – DR Test Issue Log and Analysis Tool: Completed by Facilitator
- Once the Active Test has been completed, each system owner will send their copy of the Issue Log back to the Facilitator. The Facilitator will then copy and paste the inputs from each individual spreadsheet into one master copy.
- The analysis tab will automatically populate, and the Facilitator can leverage these results in the post-mortem.
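For teams that prefer to script the aggregation step rather than copy and paste by hand, the following is a minimal sketch of the same idea: merge each system owner's issue log into one master list, then count issues by category. The field names (`system`, `category`, `description`), the sample rows, and the helper names are illustrative assumptions, not the columns or behavior of the Info-Tech tool.

```python
from collections import Counter

# Hypothetical issue-log rows, one list per system owner.
# Field names and sample data are assumptions for illustration only.
owner_logs = {
    "email": [
        {"system": "Exchange", "category": "Outdated credentials",
         "description": "Service account password changed"},
        {"system": "Exchange", "category": "Missing documentation",
         "description": "No step for DNS cutover"},
    ],
    "finance": [
        {"system": "ERP", "category": "Outdated credentials",
         "description": "Vendor portal login expired"},
    ],
}

# The source notes that the tool supports up to 25 unique errors.
MAX_CATEGORIES = 25


def merge_logs(logs_by_owner):
    """Combine every owner's rows into one master list, tagging each row with its owner."""
    master = []
    for owner, rows in logs_by_owner.items():
        for row in rows:
            master.append({**row, "owner": owner})
    return master


def count_by_category(master):
    """Return (category, count) pairs, most frequent first."""
    counts = Counter(row["category"] for row in master)
    if len(counts) > MAX_CATEGORIES:
        raise ValueError("more than 25 unique issue categories")
    return counts.most_common()


master_log = merge_logs(owner_logs)
summary = count_by_category(master_log)
for category, n in summary:
    print(f"{category}: {n}")
```

The category counts give the Facilitator a starting point for the post-mortem: the most frequent categories are the first candidates for action items.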
Create a comprehensive test schedule to plan out each step in the test execution process
[Activity] 7 – DR Test Plan Active Testing Handbook: Prepared by Facilitator
- The key difference between the Active Test schedule and the Passive Test agenda is that the former incorporates additional elements such as a dry run, a kick-off meeting, and responsibility assignment during testing.
- Dry Run – Confirm DR Environment Readiness: (e.g. Obtain and validate backups to be used for testing.)
- Kick-Off Meeting: The testing participants gather and review the test procedure prior to the test. Ensure that all participants have the required documentation, review objectives and process for recording test results, and confirm required resources/requirements are available.
- Personnel Assigned: Make sure that each step in the test is being managed by a test participant as this will ensure clarity of roles and will avoid confusion during testing.
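The "Personnel Assigned" check can be automated if the schedule is kept in a structured form. The sketch below assumes a hypothetical `ScheduleStep` record with a step name and an owner, and simply lists the steps that nobody is managing yet; the step names and roles are made-up examples.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScheduleStep:
    """One line of the test schedule (hypothetical structure)."""
    name: str
    owner: Optional[str]  # test participant managing the step, if any


def unassigned_steps(schedule: List[ScheduleStep]) -> List[str]:
    """Return the names of steps with no owner (None or empty string)."""
    return [step.name for step in schedule if not step.owner]


# Illustrative schedule; names and roles are assumptions.
schedule = [
    ScheduleStep("Dry run: obtain and validate backups", "Backup administrator"),
    ScheduleStep("Kick-off meeting", "Facilitator"),
    ScheduleStep("Bring DR systems online", None),
]

missing = unassigned_steps(schedule)
if missing:
    print("Steps without an owner:", ", ".join(missing))
```

Running a check like this before the kick-off meeting gives the Facilitator a short list of gaps to close before sign-off.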
Generate a list of success metrics to track the results of the Active Test
[Activity] 10 – DR Active Test Evaluation Survey: Completed by Facilitator
- Similar to Passive Testing, the results of each Active Test need to be tracked, reviewed, and incorporated into the existing disaster recovery process. However, since Active Testing typically involves more participants that are potentially in geographically dispersed areas, Info-Tech has created an Excel survey that the Facilitator can send to each of the Active Test Participants following the test.
- The Facilitator is expected to create a list of success metrics that will best measure the test before distributing the survey. Info-Tech has provided several questions that we expect most organizations to be able to leverage. As well as creating success metrics, the Facilitator will also gather the responses from each system owner once they are complete and report the results in a post-mortem meeting similar to that of the TTX post-mortem.
Understand what success means for your organization
A successful DR test is able to identify the gaps and risks in your existing DR capabilities so that these issues can be remedied or mitigated before a real disaster strikes.
Testing success can be broken into two types:
DR Capability Success
Metrics such as “Did you meet your desired RTO?” represent DR capability success. These metrics give you an indication of how well your organization is able to recover from a disaster and give validation to your DR capabilities. However, these validating metrics do not provide you with insights on how to improve. As such, if your test team fails to meet a capability metric, do not let that deter you; instead, use it as an indication that you need to dig deeper and find out why you failed that metric.
Test Execution Success
Metrics such as “Were you able to identify gaps in your DRP?” represent test execution success. These metrics give you an indication of the test process and whether or not the test added any value to your DR maturity. A good test will always seek to identify weaknesses and gaps, so that a team can fix them before a true disaster. Use these metrics to identify areas of the test that need to be modified so that your DR plan can continuously be improved and recalibrated.
Info-Tech Insight
A test in which you “failed” because you were unable to recover your systems within a specific time frame is not a bad test. In fact, it is a very successful test, because the failure tells you what you need to improve so that it will not happen again in a real disaster.
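As an illustration of a DR capability metric, the sketch below compares the recovery time measured in a test against the desired RTO for each system and reports the overrun. The system names, RTO targets, and measured times are invented for the example; the point is that a missed RTO produces a finding to investigate, not a verdict.

```python
from datetime import timedelta


def evaluate_rto(results):
    """For each (system, desired RTO, measured recovery time), report whether
    the RTO was met and by how much it was exceeded (zero if met)."""
    findings = []
    for system, rto, actual in results:
        overrun = max(actual - rto, timedelta(0))
        findings.append({
            "system": system,
            "met_rto": actual <= rto,
            "overrun": overrun,
        })
    return findings


# Hypothetical test results: (system, desired RTO, measured recovery time).
test_results = [
    ("Email", timedelta(hours=4), timedelta(hours=3, minutes=30)),
    ("ERP", timedelta(hours=8), timedelta(hours=11)),
]

findings = evaluate_rto(test_results)
for f in findings:
    status = "met" if f["met_rto"] else f"missed by {f['overrun']}"
    print(f"{f['system']}: RTO {status}")
```

Each missed target should then be traced back to entries in the issue log to find out why the recovery ran long.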
Implement a rigorous post-test review process to consistently enhance the effectiveness of your DR plan
The end is only the beginning. Leverage the review process to identify gaps, assign action items to close gaps, and plan future tests to ensure gaps are closed.
Each test, regardless of scope, provides an opportunity to update your DRP to the current operating procedures of the enterprise. The plan review is a critical aspect of the DR test cycle and must address technical, strategic, and tactical issues. Document these issues in the Test Results and Lessons Learned section of 7 – DR Test Plan Active Testing Handbook.
Technical
- Know what is primary. Are the Tier 1 and Tier 2 applications properly categorized?
- Ensure all technical needs are properly addressed and prioritized. Outside of system configuration, consider core IP assets and transactional, legal, and financial data as well. Often, email is most crucial to an enterprise.
- Reassess RTO and RPO in light of the test results. See Create a Right-Sized Disaster Recovery Plan.
Strategic
- Know what parts of the DR plan worked and what did not. Where did the DR plan lag in addressing the key needs of the enterprise?
- Could recovery technology assist in fulfilling RTOs and RPOs? Mitigation technologies, such as a Site Recovery Manager, could greatly enhance the enterprise’s ability to recover quickly and effectively. Consider what options are financially viable. See Save Costs with a DRP Outsourcer.
Tactical
- Know the strength of your DR plan documentation. Did your DR plan documentation fulfill the intended role? Assess the scope, role, and function of your DR plan and amend it to be operator-neutral. Aim to have documents that could allow anyone to fulfill the role.
- Evaluate key assumptions. Recovery assumptions are inherent in the DR plan, and manifest in the procedures used. Follow the adage: “I trust, but I also verify.”
"Following a DR test, grade each document used in testing and update the plan. It’s the only way to improve." - Rob Reed, IT Manager, Christiana Care
Obtain sign-off for all relevant documents
Management support is a baseline criterion for a successful DR test plan. Once all necessary documentation is complete, conduct a management review and gain management’s buy-in.
"I cannot stress how important it is to assign ownership of responsibilities in a test; this is the only way to truly mitigate against issues in a test." – Robert Nardella, IT Service Management, Certified z/OS Mainframe Professional
Info-Tech Insight
Sign-off for 4 – DR Test Plan Summary Template is mandatory as it provides an overview of the entire testing process. Test Plan Sign-off for both testing Handbooks is optional.
Once management has signed off on the relevant plans, your organization is ready to execute the test plan.
If you want additional support, have our analysts guide you through this phase as part of an Info-Tech workshop
Book a workshop with our Info-Tech analysts
- To accelerate this project, engage your IT team in an Info-Tech workshop with an Info-Tech analyst team.
- Info-Tech analysts will join you and your team onsite at your location or welcome you to Info-Tech’s historic Toronto office to participate in an innovative onsite workshop.
- Contact your account manager (www.infotech.com/account), or email Workshops@InfoTech.com for more information.
The following are sample activities that will be conducted by Info-Tech analysts with your team:
3.1 Formulate the Test Plan Summary
Conduct a test selection criteria discussion and create the entire DR test cycle including a mix of passive and active tests. From there, identify all the resource requirements necessary to be able to successfully execute each of the scheduled tests.
3.2 Create the Passive Testing Handbook
Discuss tabletop exercise best practices and generate facilitator questions to drive engagement. From there, document all the necessary resources for each specific tabletop exercise that is scheduled. Lastly, create the post-mortem material so that lessons learned can be fully integrated into the recovery process.
Phase 4: Translate lessons learned into improving overall preparedness
Phase 4 outline: Translate lessons learned into improving overall preparedness
Call 1-888-670-8889 or email GuidedImplementations@InfoTech.com for more information.
Complete these steps on your own, or call us to complete a guided implementation. A guided implementation is a series of 2-3 advisory calls that help you execute each phase of a project. They are included in most advisory memberships.
Guided Implementation 1: Make sure that lessons learned during testing are fully utilized
Proposed Time to Completion (in weeks): 4 weeks (1 call every 2 weeks)
Phase 4.1: Incorporate lessons learned from testing
Start with an analyst kick-off call:
- Review completed tests and identify key learning points
- Script the executive presentation and identify presentation best practices
Then complete these activities…
- Compile results from the Issue Log and the Active Test Evaluation Survey
- Create the Summary of Results presentation deck
With these tools & templates:
- DR Test Issue Log and Analysis Tool
- DR Active Test Evaluation Survey
- DR Test Plan Results Summary Presentation
Phase 4.2: Create a DRP review, testing, and maintenance schedule
Review findings with analyst:
- Review executive response to testing results
- Discuss future actions to further improve DR capabilities
- Analyze current testing best practices
Then complete these activities…
- Plan next year’s DR testing cycle by updating the readiness assessment and test plans
With these tools & templates:
- Storyboard
- DR Testing Readiness Assessment Tool
- DR Test Plans
Phase 4 Results & Insights:
- Update management on the results of the test and gain buy-in to incorporate an annual testing cycle to continuously update and maintain the DRP.
Reinforce lessons learned from each test by developing a thorough review process
4.1: Test Review and Summary
At this stage of the project, you have created all of the necessary testing documentation as well as executed a test. Next, review the lessons learned from the test with both staff and executives.
Evaluate and incorporate lessons learned, and update test plans accordingly using the following tools:
9 – DR Test Issue Log and Analysis Tool: Deploy to system owners to track issues found during testing. Collect and analyze issue trends to target areas for improvement and assess overall success.
10 – DR Active Test Evaluation Survey: Collect feedback on test procedures and readiness to drive improvements.
11 – Results Summary Presentation: Create a summary presentation to communicate test results to all stakeholders, including the executive team and test participants.
Summarize and present test results to the executive team – Step 1 Readiness
Management support is critical to the success of any DR strategy or initiative. When your testing cycle has completed, it is important to re-engage management and brief them on the results.
11 – DR Test Plan Results Summary Presentation: Completed by Facilitator
Step 1: Review your DR testing readiness with management. Indicate the types of tests that you are prepared for and what capabilities you need to develop before more complex tests can be done. Demonstrate your current testing proficiency or need for testing so that management can see your capabilities growth. If applicable, present the action items that were completed to finish the project.
![The image is a screenshot of a document titled Step 1 review continued: Action Items [optional]](https://dj5l3kginpy6f.cloudfront.net/blueprints/Reduce-Costly-Downtime-Through-DR-Testing/Action-Items.png)

Summarize and present test results to the executive team – Step 2 Test Schedule
11 – DR Test Plan Results Summary Presentation: Completed by Facilitator
Step 2: Review your DR Test Schedule with management. This is intended to showcase all of the work that the DR test team has done during this test cycle, and it also acts as a high-level overview of the following slides.
Summarize and present test results to the executive team – Step 3 Passive Testing
11 – DR Test Plan Results Summary Presentation: Completed by Facilitator
Step 3a: Review the TTX test results with management. The primary purpose here is to focus on the learning points that came out of the discussion. Give management an overview of the scenario that was used in the TTX and then conclude with general findings such as the discussion from the hotwash and the action items.
Summarize and present test results to the executive team – Step 3b Active Testing
11 – DR Test Plan Results Summary Presentation: Completed by Facilitator
Step 3b: Review the Active Test results with management. The primary purpose here is to focus on the growth in strength and reduction in weakness of your DR capabilities. Give management an overview of the Active Test by reviewing the issue log of the test, the test evaluation findings, and the action items that were completed to close the necessary gaps.
Accelerate your DR Testing strategy ahead of the “norm”
4.2: Future considerations
DR testing has seen a significant rise in importance; however, this renewed focus has not yet translated into improved DR preparedness. Don’t settle for the norm; get ahead of the curve.
- In a recent survey conducted by the Disaster Recovery Preparedness Council, only 39% of organizations that test the DRP document the results of their tests. This essentially means that 61% of organizations are wasting valuable time, energy, and capital, since an undocumented test will do nothing to improve DR preparedness.
- Another alarming metric is that only 24% of organizations repeat the test if the organization did not pass. The goal of testing is to identify strengths/weaknesses and gaps in your DR capabilities, and then close those gaps so that they do not become vulnerabilities in an actual disaster. If your organization has failed a test, then that test needs to be repeated so that the gaps that caused the test to fail initially can be identified and closed. (Disaster Recovery Preparedness Council, “The State of Global Disaster Recovery Preparedness” Annual Report 2014)
Continue to use the cyclical testing approach and establish an annual DR testing habit
Disaster Recovery Testing Cycle
- Update DRP & Test Plans
- Tabletop Exercise
- Revise
- Retest
- Update DRP & Test Plans
- Simulation
- Update DRP & Test Plans
- Parallel
- Update DRP & Test Plans
- Full-scale
Adapted from: SANS Institute, “Disaster Recovery Plan Testing: Cycle the Plan, Plan the Cycle”
The Disaster Recovery Testing Cycle (shown above) reinforces the notion of using each test as a building block for the next test. By planning your test cycle at the start, this approach will maximize the value of each test.
- Tools and templates provided in this blueprint will allow your organization to run through the DR testing cycle for the first time.
- Make sure that this process is not a one-off activity. Go back and update all relevant documents and repeat the process on at least an annual basis.
- Conduct a series of tabletop exercises before moving on to Active Testing.
- Similarly, it is best practice to conduct simulation and parallel testing before engaging in a full-scale test.
- Lastly, make sure that gaps identified in testing are documented and updated in the DRP before moving on to the next test.
Take the next step in DR preparedness by incorporating DR into new project considerations and maintenance activities
Truly prepared organizations do not treat DR as an event that only occurs when scheduled. They incorporate DR into the decision-making process and build DR preparedness from the ground up.
Only 2% of all organizations have this level of DR maturity. Leverage the learning points from the following two examples to incorporate these methodologies into your organization:
Example 1 - Implement new accounting solution ABC (new project example)
- An organization that is mature in DR will assess DR requirements as part of the project scoping and requirements definition phase. They will identify metrics such as the uptime requirement. From there, they will review and update the BIA Tool to determine criticality and establish the desired RTO/RPO.
- Based on the above, the organization will scope the project to meet the uptime requirement and desired RTO/RPO. This includes how it’s provisioned in the primary data center and the DR solution that is implemented (e.g. warm standby system at a DR site, or implementing a solution to recreate the accounting system in a cloud environment).
- As part of implementing the accounting solution, it must also go through a release management process (e.g. unit testing, then system testing, then user acceptance testing before it’s released to the production environment).
The same release management process needs to be followed when the system is implemented in the DR environment.
Note: To this point, the organization has only conducted system validation testing – they haven’t tested the failover procedure. The organization will flag this system for inclusion in the next DR test.
Take the next step in DR preparedness (continued)
Example 2 - Upgrading from Exchange 2003 to Exchange 2010 (maintenance and upgrades example)
- The same mature organization will also assess DR requirements as part of their change management process. They will identify the impact of this Exchange upgrade on the existing environment and DR procedures.
- Based on the change, the company will scope the maintenance project accordingly to maintain the same DR capability.
- As part of implementing this upgrade, it must go through a change management process (i.e. the above scoping, followed by unit testing, then system testing, then user acceptance testing before it’s released to the production environment).
The same change management process needs to be followed when the upgrade is implemented in the DR environment.
Note: To this point, the organization has only conducted system validation testing – they haven’t tested the failover procedure. The organization will flag this system for inclusion in the next DR test.
Info-Tech Insight
The examples reflect a desired state where DR considerations are included in day-to-day decision making. This does not take a large budget to achieve, but rather process improvement. For example, include DR considerations (such as availability and recovery requirements) in the requirements to be evaluated up front during project planning; this is less effort and cheaper than retroactive DR planning.
Case Study: See how Blank Rome created a sustainable DR environment through an incremental improvement process
Current Situation
- Larry Liss, the CTO of Blank Rome LLP, shared some insights from his DR testing strategies.
- Currently, Blank Rome uses a co-location site that acts as a DR site. For critical apps such as email, the co-location site acts as a hot site.
- Blank Rome utilizes a variety of testing methods to verify their DR plans. The chief network architect arranges annual component testing for critical systems. For example, recently the financial systems were tested for their ability to be shut down and brought back up at the co-location.
- Aside from component tests, exercises such as tabletop exercises are also used. Customers of Blank Rome demanded that DR testing occur on a regular basis and so a hurricane-based tabletop exercise scenario was recently executed. The exercise involved senior managers as well as administrative staff, who spent several hours going over a very detailed exercise that incorporated notification, escalation, and resolution. At the end of the exercise a detailed post-mortem was conducted, and it included the gaps and action plan that came out of the exercise.
Future Decisions
- Larry plans on implementing VMware Site Recovery Manager (SRM). This will greatly reduce the current RTO and also make testing significantly easier. Once the SRM solution is in place, Larry plans on executing annual full-scale tests.
Reflections
- Larry’s situation is representative of an organization that is in the advanced stages of DR testing. Blank Rome was able to arrive at this stage due to management support and external demand for DR testing.
- An advanced DR testing strategy has documented processes and scheduled testing; however, the lack of full-scale testing is holding Blank Rome back from having a fully mature DR testing strategy.
- Larry’s strategy of getting the SRM in place before executing a full-scale test is a practice that Info-Tech advocates. It is always prudent to build up the necessary capabilities before a full-scale test as it will greatly reduce the risk of unintentional interruptions.
(Gardner, Dana. “Case Study: Strategic approach to disaster recovery and data lifecycle management pays off for Australia's SAI Global”)
The following are sample activities that will be conducted by Info-Tech analysts with your team:
4.1 Incorporate lessons learned from testing
Leverage all the key learning points from testing and use the Summary of Test Results presentation template to communicate them to upper management. Identify best practices for connecting business needs to DR capabilities and improvement strategies.
4.2 Create a DRP review, testing, and maintenance schedule
Discuss best practices for DR test maintenance. See why organizations fail and how successful organizations are able to incorporate testing into operational decision making. Define a plan for continuous improvement and annual test cycles.
Summary
- What's in this Section:
- Summary
- Related research
- References
- Research Contributors
- Appendix
Summary of accomplishment
Knowledge Gained
This blueprint outlined how to:
- Create all the documentation necessary to conduct both passive and active testing.
- Document the key learning points from testing and how to incorporate them into future tests and daily operations.
- Present test findings to executives, and create a positive feedback cycle that connects test results to business improvement.
Processes Optimized
The following processes were optimized:
- Readiness assessment process
- DR testing document creation process
- Participant and management buy-in process
- Post-test review process
Deliverables Completed
As part of an overall crisis management plan, the following deliverables were completed:
- Identified current DR capabilities and list of action items in the Readiness Assessment Tool.
- DR testing project responsibilities assigned and approved in the Project Charter.
- Summary of systems to be tested and testing schedule defined in the Test Plan Summary and System Status Worksheet.
- Tabletop planning process documented in the Passive Testing Handbook.
- Simulation, parallel, and full-scale testing process based on the Active Testing Handbook.
- Communicated testing results and DR capability progression through the Summary of Test Results.
Project steps summary
Client Project: Reduce costly downtime through DR testing
1. Determine your DR testing readiness and scope
1.1 Identify your current testing readiness and action items
1.2 Identify your testing proficiency and need
2. Create a project charter to build a test plan
2.1 Identify roles and responsibilities for building the test plan
2.2 Define project parameters and milestones
3. Create the DR test plan
3.1 Create a framework for the overall test plan
3.2 Create the passive testing facilitator’s handbook
3.3 Create the active testing facilitator’s handbook
4. Ensure your DRP is updated with lessons learned
4.1 Review your test results and present a summary to management
4.2 Incorporate lessons learned into operational decision making
Info-Tech Insight
This project has the ability to fit the following formats:
- Onsite workshop by Info-Tech Research Group consulting analysts
- Do-it-yourself with your team
- Remote delivery (Info-Tech Guided Implementation)
Related Info-Tech research
Disaster Recovery and High Availability Research
- Create a Right-Sized Disaster Recovery Plan
- Develop a Business Continuity Plan
- Bridge the Gap between Service Management & Disaster Recovery
- Disaster Recovery Primary Site Restoration: Home Sweet Home
- Maximize Availability for Mission Critical Systems
- Improve IT-Business Alignment with an Infrastructure Roadmap
- Create Visual SOP Documents that Drive Process Optimization, Not Just Peace of Mind
Backup Strategy Research
Bibliography
Rothstein, Philip Jan. Disaster Recovery Testing Exercising Your Contingency Plan (2007 Edition). Brookfield, CT: Rothstein Associates, 2007. Print.
Disaster Recovery Preparedness Council. "The State of Global Disaster Recovery Preparedness." The State of Global Disaster Recovery Preparedness (n.d.): n. pag. July 2014. Web. 4 Feb. 2015.
Gardner, Dana. "Case Study: Strategic Approach to Disaster Recovery and Data Lifecycle Management Pays off for Australia's SAI Global." ZDNet. BriefingsDirect, 26 Apr. 2012. Web. 04 Feb. 2015.
Grance, Tim, Tamara Nolan, Kristin Burke, Rich Dudley, Gregory White, and Travis Good. "Guide to Test, Training, and Exercise Programs for IT Plans and Capabilities." (n.d.): n. pag. Technology Administration U.S. Department of Commerce. National Institute of Standards and Technology, Sept. 2006. Web. 4 Feb. 2015.
Dolewski, Richard. "Disaster Recovery Plans: Practice Makes Perfect." Data Center Knowledge. INDUSTRY PERSPECTIVES, 26 Apr. 2011. Web. 04 Feb. 2015.
Crump, George. "Disaster Recovery Plan Testing: Will Your Plan Work?" TechTarget. N.p., Oct. 2013. Web. 4 Feb. 2015.
Earls, Alan R. "Disaster Recovery Testing Best Practices: Test Thoroughly and Often." TechTarget. N.p., 6 Dec. 2010. Web. 4 Feb. 2015.
Krocker, Guy Witney. "Disaster Recovery Plan Testing: Cycle the Plan, Plan the Cycle." SANS Institute InfoSec Reading Room. STONESOFT, 2002. Web. 5 Feb. 2015.
Organizations and experts who contributed to this research
Interviews
- Bernard A. Jones, Manager Business Continuity & Disaster Recovery – HS&E Business Continuity, Novartis Business Services
- Robert Nardella, IT Service Management, Certified z/OS Mainframe Professional
- Larry Liss, Chief Technology Officer, Blank Rome LLP
- Paul Kirvan, FBCI, CISA, Independent IT Consultant/Auditor, Paul Kirvan Associates
- Steve Tower, Management Consultant, Steve Tower, Disaster Recovery Plans & Assessments
- Joe Starzyk, Senior Business Development Executive, IBM Global Services
- Thomas Bronack, Enterprise Resiliency and Corporate Certification Consultant, DCAG
- Paul S. Randal, CEO & Owner, SQLskills.com
Glossary: Tabletop Testing (Passive Testing) – Walk through disaster scenarios and test your incident response procedures
Tabletop exercises provide a useful opportunity for the DR Coordinator to quickly test various skills and identify professional development needs.
- Description: DR team members and other applicable third-party participants meet to verbally walk through the DR documentation to validate the specific steps, procedures, and scope without simulating an actual disaster.
- Purpose: Allows for a review of the DR plan to ensure it remains relevant, and to test nuanced proficiencies. An informal classroom setting allows for easy brainstorming to adapt or enhance the DR plan.
- Applicability: Broadly applicable as they provide a practical, impactful, and fiscally conscious DR testing methodology. Used in conjunction with previously validated checklist plans, tabletop exercises allow for focused discussions and enhanced training modules.
- Bottom Line: Widely employed and effective; they have the highest relative impact on overall DR success.
Glossary: Unit Testing as Systems Are Updated – Incorporate standby equipment in change management procedures
Unit testing ensures standby equipment stays current with the production environment and is operational when you need it.
- Description: When the production environment is updated, put your standby equipment and/or DR environment through the same release process, including change management and QA procedures.
- Purpose: Ensures standby equipment and/or your DR environment is operational and current.
- Applicability: Applies to all standby systems.
- Bottom Line: Info-Tech has found this to be the second most effective testing methodology, after tabletop testing.
Note: Unit testing is also part of the process for validating system functionality after a simulation, parallel, or full-scale test.
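As a rough illustration of keeping standby equipment current, the sketch below compares a production inventory against a standby inventory and reports any component whose version differs or is missing. The component names, version strings, and the `find_drift` helper are hypothetical; in practice the inventories would come from your change management or configuration management records.

```python
def find_drift(production, standby):
    """Return components whose standby version differs from production.

    Both arguments are dicts mapping a component name to a version string.
    The result maps each drifted component to (production, standby) versions;
    the standby version is None when the component is missing on standby.
    """
    drift = {}
    for component, prod_version in production.items():
        standby_version = standby.get(component)  # None if not deployed to standby
        if standby_version != prod_version:
            drift[component] = (prod_version, standby_version)
    return drift


# Hypothetical inventories, for illustration only.
production = {"db-server": "15.4", "app-server": "2.8.1", "web-proxy": "1.24"}
standby = {"db-server": "15.4", "app-server": "2.7.9"}

drift = find_drift(production, standby)
for component, (prod, dr) in sorted(drift.items()):
    print(f"{component}: production={prod}, standby={dr or 'missing'}")
```

A check like this could run as a step in the release process, so that a production change is not considered complete until the standby environment reports no drift.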
Glossary: Simulation Testing – Provides authentic disaster experience without the expense or impact of a full failover
Simulations validate system recovery procedures, but not necessarily the ability to execute business processes in the DR environment.
- Description: A disaster is simulated in a way that does not interrupt normal operations. Recovery facilities and systems are brought online to confirm that procedures are accurate.
- Purpose: Intended to validate, in whole or in part, DR hardware, software, personnel, communications, procedures, supplies, and documentation. Effective simulations are well scoped, rigorous, and comprehensive.
- Applicability: A widely applicable test method that allows for comprehensive procedural review and development of key skills. However, simulations are more costly than simple walkthroughs.
- Bottom Line: A widely used testing method because it’s comprehensive without interrupting the business.
Glossary: Parallel Testing – Allows for validation of the restoration environment to ensure operations are consistent
Identify operational gaps in the restoration system by loading historical data and running it against the historical outputs.
- Description: An extension of simulation testing that includes the processing of historical data to ensure that systems are not just working, but working as intended.
- Purpose: Provides a macro-level confirmation of DR plan readiness by reconciling the operational outputs from each system to identify and investigate variances in operations.
- Applicability: Like simulation testing, parallel testing allows for comprehensive procedural review and development of key skills. It also validates that business processes can be executed in the DR environment.
- Bottom Line: Provides an extra level of confidence in your DRP by validating data and the ability to execute business processes.
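The reconciliation step described above can be sketched as follows: historical inputs are replayed in the DR environment, and the resulting outputs are compared against the outputs production originally produced. This is a minimal sketch under stated assumptions; the batch names, the numeric totals, and the `reconcile` helper are hypothetical, and real outputs may need richer comparison than a single number per batch.

```python
def reconcile(historical_outputs, dr_outputs, tolerance=0.0):
    """Compare DR outputs against historical production outputs.

    Both arguments map a batch identifier to a numeric result.
    Returns a list of (batch, expected, actual) tuples for every batch
    that is missing from the DR run or differs by more than `tolerance`.
    """
    variances = []
    for batch, expected in historical_outputs.items():
        actual = dr_outputs.get(batch)
        if actual is None:
            variances.append((batch, expected, None))  # batch never ran in DR
        elif abs(actual - expected) > tolerance:
            variances.append((batch, expected, actual))
    return variances


# Hypothetical figures, for illustration only.
historical_outputs = {"invoices-jan": 10500.0, "invoices-feb": 8200.0, "invoices-mar": 9100.0}
dr_outputs = {"invoices-jan": 10500.0, "invoices-feb": 8150.0}

variances = reconcile(historical_outputs, dr_outputs)
for batch, expected, actual in variances:
    print(f"{batch}: expected={expected}, actual={actual}")
```

Each reported variance is a candidate for investigation and for an entry in the test's issue log.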
Glossary: Full Scale Testing – Failover to your DR environment to test end-to-end process
For most organizations, full scale testing is not possible or practical due to the risk of downtime and insufficient DR capabilities.
- Description: Involves the full interruption of the production environment and failing over to your DR environment.
- Purpose: Failing over the production environment to the DR environment provides the most extreme test of your DRP. The business executes normal activities using the technology provided by the DR environment.
- Applicability: Full scale exercises are resource and time intensive. Even if capable, smaller organizations may find it difficult to get the executive support for such a program.
- Bottom Line: Full scale testing has the greatest potential cost and risk. As such, it should only be attempted by enterprises that have a high need to ensure the ability to withstand disaster and accurately validate RTOs/RPOs.
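To make the idea of validating RTOs/RPOs concrete, the sketch below compares timestamps recorded during a test against the targets. It assumes the common definitions: recovery time is measured from the start of the outage to service restoration, and the data loss window is measured from the last recoverable data point to the start of the outage. The timestamps, targets, and the `evaluate_recovery` helper are hypothetical.

```python
from datetime import datetime, timedelta


def evaluate_recovery(outage_start, service_restored, last_recoverable_point, rto, rpo):
    """Compare measured recovery time and data loss window with RTO/RPO targets."""
    recovery_time = service_restored - outage_start
    data_loss_window = outage_start - last_recoverable_point
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Hypothetical test timings and targets, for illustration only.
result = evaluate_recovery(
    outage_start=datetime(2024, 1, 1, 9, 0),
    service_restored=datetime(2024, 1, 1, 12, 30),
    last_recoverable_point=datetime(2024, 1, 1, 8, 45),
    rto=timedelta(hours=4),
    rpo=timedelta(minutes=10),
)
print(result["rto_met"], result["rpo_met"])
```

In this example the 3.5-hour recovery falls within the 4-hour RTO, while the 15-minute data loss window exceeds the 10-minute RPO, which is the kind of gap a test should surface rather than score as a simple pass or fail.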
DIY Workshop Instructions: Reduce Costly Downtime Through DR Testing
Introduction
This section provides guidelines for how to use this blueprint to run your own internal workshop to implement DR testing best practices. Alternatively, contact Info-Tech to facilitate an onsite workshop. The DR testing methodology used for the workshop is the same as what is outlined in the rest of this blueprint.
Specifically, this section includes the following:
- Workshop schedule
- Post-workshop steps
- Recommended workshop participants
Note: For the workshop, use the same tools and activities that were outlined in the blueprint.
Workshop Schedule (Project Phase 1)
Complete the instructions outlined for the workshop schedule. This includes identifying the crises that are relevant to your organization and then testing your existing crisis management plans against one of the prioritized risks.
A summary of the activities, goals, and deliverables is listed below:
| Project Phase | Activity | Goal | List of Deliverables |
|---|---|---|---|
| 1. Determine your DR testing readiness and scope | A. Determine current testing readiness | Highlight the most likely as well as the most impactful crisis vulnerabilities that are introduced through business strategy. | Readiness Assessment Tool |
| | B. Identify list of action items | Review all action items needed before commencing DR testing. Document those already completed, and establish an estimated completion date for the remaining items. | Readiness Assessment Tool |
| | C. Formulate testing strategy | Understand the implications of your current testing readiness, proficiency, and need. | Readiness Assessment Tool |
Workshop Schedule (Project Phase 2)
| Project Phase | Activity | Goal | List of Deliverables |
|---|---|---|---|
| 2. Create a project charter to build a test plan | A. Identify roles and responsibilities in project charter template | Create role clarity through assignment of responsibilities, which will reduce disengagement in the future. | Project Charter Template |
| | B. Work with the other members of the DR test plan team to complete the charter | Document a comprehensive project charter that covers all aspects of the test plan creation process. | Project Charter Template |
| | C. Document project parameters and the milestones table | Create clear expectations and buy-in from management so that DR testing will remain front of mind. | Project Charter Template |
Workshop Schedule (Project Phase 3)
| Project Phase | Activity | Goal | List of Deliverables |
|---|---|---|---|
| 3. Develop the DR test plans | A. Create the Test Plan Summary | Determine all the systems that are to be included in the test plan, and generate the test schedule for the entire test cycle. | Test Plan Summary; System Status Worksheet; System Test Plans |
| | B. Construct the Passive Testing Handbook | Determine the scope of each tabletop exercise that was identified in the test schedule and document the requirements for each test. | Passive Testing Handbook |
| | C. Organize the Active Testing Handbook | Determine the scope of each active test that was identified in the test schedule. Generate the necessary requirements for each test and also create the active testing execution tools. | Active Testing Handbook; Issue Log and Analysis Tool; Active Test Evaluation Survey |
Workshop Schedule (Project Phase 4)
| Project Phase | Activity | Goal | List of Deliverables |
|---|---|---|---|
| 4. Translate lessons learned into improving overall preparedness | A. Construct the executive presentation deck | Connect the IT team to the business side by demonstrating the need for testing and the results and benefits of testing. | Summary of Test Results |
| | B. Re-evaluate the action items and adjust the statuses and expected completion dates | Document the reduction in necessary action items for testing, and adjust the expected dates accordingly. | Summary of Test Results |
| | C. Discuss future test cycles and how to incorporate lessons learned | Establish DR testing as an annual test cycle, and incorporate the DR mindset into operational decision making through integration of lessons learned. | DR Test Plans |
Recommended workshop participants
The DRP team will be the core participants for the full workshop. Include business participants for the following steps:
- Phase 2: Obtain executive sign-off on the project charter.
- Phase 3: Invite the relevant business users and get their feedback on which systems to include in testing. This also serves as a review of the business impact analysis.
- Phase 3: If necessary, obtain executive sign-off on the Facilitator Handbooks.
- Phase 4: Present workshop results to management, so that everyone is clear on the gaps to address before testing can commence and to indicate an action plan for closing those gaps.
Guided Implementation
For additional guidance on how to run your own workshop, or for assistance with any of the project steps outlined in this blueprint, please call 1-888-670-8889 or email GuidedImplementations@InfoTech.com to arrange to speak to an Info-Tech subject matter expert.