Disaster Recovery Is a Cloud Gateway Drug
Cloud-based disaster recovery (DR) is a sort of gateway drug for public cloud. In addition to providing an on-demand failover target in an emergency, a cloud DR project provides a foothold for exploring further use of cloud, including permanent migration. But it is important to view cloud DR as a potential part of migration, not the whole story.
Among Info-Tech client firms developing a cloud strategy, two near-term projects have come up repeatedly in the past two years: Office 365 migration and leveraging the cloud for offsite backup and disaster recovery. These two are the “right now” projects in cloud strategy; moving legacy core applications to the cloud is the “maybe tomorrow” project.
Discussion of disaster recovery on a public cloud like Azure or AWS is reminiscent of where server virtualization was ten years ago. Back then, virtualization was just beginning to take off, but there was hesitancy, if not outright resistance, to virtualization because of concern over whether a virtual machine (VM) could be trusted with a critical workload.
Today, as ten years ago, many organizations are somewhat hesitant about hosting production workloads on a public cloud. And just as ten years ago, DR in the cloud offers the benefit of building an offsite recovery capability, without building or renting a data center, as well as an opportunity to test the capabilities of the public cloud to host those important systems.
But there are important differences between what is going on with cloud migration now and migration to virtual infrastructure then.
When DR Was the Virtualization Gateway Drug
Early virtualization projects were focused on secondary servers such as those for test and development. Virtualization of core production servers was not nearly as aggressive. For many organizations, what got virtualization moving was not a stellar business case about capital expense reduction and provisioning agility but rather business continuity/disaster recovery planning and backup.
Backup had grown beyond files to full system imaging. A new term, bare metal restore (BMR), was coined for the full restore of a running server (operating system, applications, data) to different server hardware. Success with BMR prompted a question: if the imaged system can be restored to different hardware (on premises or in another data center), could it not be restored to a virtual machine hosted on a server running a hypervisor like VMware?
Restore to VM had a number of advantages for system availability and recovery:
- You didn’t need to purchase redundant hardware to ensure warm failover availability. You only needed enough available capacity on the secondary hardware.
- Instead of acquiring, configuring, testing, and deploying new hardware for restore, a new properly configured VM could be provisioned in minutes.
- In the DR use case, performance did not need to be the best, just good enough for the system to be available during the emergency.
For all of these reasons virtualization for system restore was an easier sell than virtualization for critical production servers. Those servers could be brought up quickly in an emergency on VMs, and they didn’t have to be perfect, just good enough for the duration of the event.
It was testing that really sold virtualization to the business. Regardless of your backup methodology and architecture, a backup is only as good as the restore. In testing restores, organizations found that the virtual infrastructure was very resilient, performed well enough, and could be instantiated very quickly.
In one case I recall from this period, an IT director at a large professional services firm made the case for virtualization to senior management and got a lukewarm response. To them it seemed risky even if the benefits looked promising. However, virtualization was deployed for backup restores in DR planning.
A year later the same IT director was presenting results of an annual DR test. A senior exec stopped him:
Exec: Wait a minute. There is something wrong here.
IT Director: What is that?
Exec: It says here you were able to fully restore system function in three hours.
IT Director: Yes.
Exec: But two years ago your report said it took three days.
IT Director: Yes, but we are restoring to virtual machines now. They are very fast to spin up.
Exec: And how did they work after that?
IT Director: Full function, and if there is a problem, restart takes five minutes.
Exec: Well, if it’s so functional and resilient, why aren’t we just running the systems on virtual machines full time?
Once the tech had proven itself thus, migration to virtual machines took off. Organizations could buy new virtual machine host servers. They could treat those servers at first as restore targets. Then when the old hardware reached end of life they could do a permanent failover, and the secondary would become primary. For new workloads a policy of virtualization first was followed.
From Virtualized DR to Cloud DR
Virtualization helped simplify offsite disaster recovery. Migrating systems to the secondary site was a matter of copying data, not moving hardware. A virtual machine is, essentially, a data file. For fast failover, the data from running systems could be replicated continuously and the spinning up of VMs at the second site could be automated.
The success of site-to-site replication and virtual site recovery services prompted yet another question: If we can host the virtualized systems and data at a second site, why couldn’t we host them in the cloud? After all, cloud Infrastructure as a Service (IaaS) is based on virtual machines running on the cloud provider’s infrastructure.
“Oh yes!” responded the cloud providers, quick to exploit a new market for Disaster Recovery as a Service (DRaaS).
And just as restore-to-VM services of old handled the configuring of backed-up bare metal servers to run on VMs, cloud-based DR services (such as Azure Site Recovery) also manage the transition of bare metal server images and VMs to run on their IaaS platform. For example, in the case of Azure, backed-up VMware virtual machine files are configured to run as instances on Microsoft’s non-VMware cloud.
Many of the usual suspects in backup – such as Veeam and Commvault – have partnered with the public cloud providers. The backup software handles onsite backup and offsite replication to the cloud. In an outage, the cloud-based site recovery services can take the data and instantiate your critical app servers.
As with the old restore-to-VM services, cloud DR provides an opportunity to become more comfortable with, and knowledgeable about, public cloud IaaS. Recovery services need to be tested. Testing can answer three important questions about migrating a given system to cloud IaaS:
- Will the system run in the cloud?
- Will performance, availability, security, and compliance meet requirements?
- How much will it cost?
Of course the cloud provider or a third-party consultant can answer those questions, but those answers will be estimates based on architecture and rate cards. Nothing is as good as a live test. If it all works, there is also the potential of effecting a permanent failover to migrate to the cloud.
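As a rough illustration of how the three test questions might be recorded after a recovery test, here is a minimal Python sketch. The record fields, the recovery-time target, the budget figure, and the sample values are all hypothetical assumptions for illustration; they are not drawn from any provider's tooling or from Info-Tech's research.

```python
from dataclasses import dataclass


@dataclass
class DrTestResult:
    """One system's results from a cloud recovery test (all fields hypothetical)."""
    system: str
    ran_in_cloud: bool            # Question 1: will the system run in the cloud?
    recovery_hours: float         # measured time to restore full function
    target_recovery_hours: float  # assumed recovery-time target set by the business
    meets_security_and_compliance: bool
    observed_monthly_cost: float  # Question 3: cost observed during the test period
    monthly_budget: float         # assumed budget ceiling


def is_migration_candidate(result: DrTestResult) -> bool:
    """Return True only if the test answered all three questions favorably."""
    runs = result.ran_in_cloud
    meets_requirements = (
        result.recovery_hours <= result.target_recovery_hours
        and result.meets_security_and_compliance
    )
    affordable = result.observed_monthly_cost <= result.monthly_budget
    return runs and meets_requirements and affordable


# Example with made-up numbers: restored in 3 hours against an 8-hour target.
erp = DrTestResult("ERP", True, 3.0, 8.0, True, 1800.0, 2500.0)
print(erp.system, "candidate for permanent failover:", is_migration_candidate(erp))
```

A scorecard like this only captures what the live test showed. Whether a passing system should actually be migrated remains a strategy decision.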
How the Current Situation Is Not Like the Past
Before we run off and architect cloud backup and cloud recovery services as the gateway to permanent cloud migration, it is important to recognize how cloud DR is not like those old restore-to-VM projects.
In restore-to-VM the virtual server infrastructure was the end game. The goal in virtualization was to move all server workloads to virtual infrastructure. Today that goal has largely been achieved, with many organizations reporting 90% or more of their servers virtualized. In hyperconverged infrastructure, storage and network switches are also fully virtualized.
Cloud IaaS is a form of virtual infrastructure. But is the end game of your cloud strategy to mirror your on-premises infrastructure in a cloud? DR can help you prove the viability of this “lift and shift” cloud migration, but is that the best-case scenario? What about refactoring applications and data for Platform as a Service (including cloud-native application development)? What about migrating data to Software as a Service applications?
In Info-Tech’s cloud strategy research, a “cloud first” policy is not the same as the old “virtualize first” policy. Where the latter asked “Can this app or service run on a virtual machine?” the former asks “Can this app or service be hosted on Infrastructure as a Service, Platform as a Service, or Software as a Service? Which is the best fit?”
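The difference between the two questions can be sketched in a few lines of Python. This is an illustrative assumption, not Info-Tech's methodology: the workload attributes and the rules mapping them to service models are hypothetical, and the function only lists candidate models without ranking them, since choosing the best fit is a judgment call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    """Hypothetical attributes used to screen an app or service."""
    name: str
    saas_equivalent_exists: bool  # a SaaS product could replace it
    can_be_refactored: bool       # code could be reworked for a PaaS
    runs_on_x86_vm: bool          # could be lifted and shifted to IaaS


def virtualize_first(workload: Workload) -> bool:
    """The old question: can this run on a virtual machine?"""
    return workload.runs_on_x86_vm


def cloud_first_candidates(workload: Workload) -> List[str]:
    """The newer question: which cloud service models are possible fits?"""
    candidates = []
    if workload.saas_equivalent_exists:
        candidates.append("SaaS")
    if workload.can_be_refactored:
        candidates.append("PaaS")
    if workload.runs_on_x86_vm:
        candidates.append("IaaS")
    # No cloud model fits: look at non-cloud or hybrid options instead.
    return candidates or ["non-cloud or hybrid"]


email = Workload("email", saas_equivalent_exists=True,
                 can_be_refactored=False, runs_on_x86_vm=True)
print(email.name, "->", cloud_first_candidates(email))
```

The first function returns a yes or no, while the second returns a list of options that still has to be weighed.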
- Make sure cloud is a fit for your DR planning.
Organizations pursuing “virtualization first” did find that some applications were not candidates for virtualization. Similarly, an organization whose DR scenario includes systems that are not easily migrated to a public cloud (such as non-x86 systems) will need to look elsewhere or pursue a more hybrid cloud/non-cloud approach. For more on that, see Info-Tech’s Select the Optimal Disaster Recovery Deployment Model.
- Leverage cloud DR as a start for cloud migration, not the whole story.
Your cloud strategy needs to consider the broad canvas of cloud-based services. Cloud DR provides a gateway for broader infrastructure “lift and shift” to cloud IaaS, but this may be only the first phase of a longer-term roadmap that ends in a multi-service hybrid cloud.
- Use cloud recovery testing to get a real-world understanding of capabilities and costs.
If you are pursuing a cloud DR strategy, leverage your recovery testing to get the best evidence about the viability of permanently hosting your systems in the cloud. Any DR plan needs to be tested anyway, so treat each test as a trial run for migration.
Disaster recovery on a public cloud service like Azure or AWS is a gateway drug to making further investments in cloud. That’s because cloud DR provides a relatively low-risk method of putting a toe in the water and gauging whether cloud is a viable option for further investment. Just make sure that you do not make it your final destination.