Provide the highest level of Business Continuity with Availability Zones and Availability Sets in Microsoft Azure


Unfortunately, a lot of people still think that moving to the Cloud will solve all their Security and Business Continuity problems. So, let’s make things clear: it doesn’t! When running workloads in Azure (or with any other Cloud vendor), you still need to cover those things yourself. Azure is a very reliable platform and didn’t have much downtime throughout last year, but that’s no reason to rely completely on those past results, right?

“He who fails to plan is planning to fail” – Benjamin Franklin

Last week, Microsoft released Availability Zones in GA, which means the feature is now ready and supported for production usage. This was the trigger to write this blog and share my insights on Business Continuity within Microsoft Azure. We all want to reach that 99.99% uptime SLA, right? Availability Zones add an extra layer of redundancy, giving individual workloads better resiliency and high availability. They protect your applications and data from datacenter failures: if you spread a workload across zones and the datacenter in one zone goes down, the instances in another datacenter in the same region keep serving the workload.

What about Disaster Recovery? The Azure Cloud is the largest, most feature-rich Cloud platform, and it gives you plenty of (easy) possibilities to provide it, but you still need to configure them yourself (manually or automated). Azure Site Recovery (ASR) is the Disaster Recovery as-a-Service solution that can provide Business Continuity as well as an easy migration path to run your infrastructure in Azure. It provides a failover replication mechanism that is easy to configure, with RTOs (Recovery Time Objectives) as low as 15 minutes. With ASR, you can replicate to datacenters in other regions around the globe as an extra safety layer, in case of a big disaster, such as an earthquake or nuclear attack, that affects the complete region. For some businesses this goes way too far, and they’re already satisfied with a separate stand-by environment. Azure can provide that option as well! More on this topic later in this article.

See below an example of how Availability Zones work when using an Azure Load Balancer and a Public IP as the external switch.

 

In this article, I’ll go through the different concepts of Business Continuity when you use Microsoft Azure as your Cloud platform, which I certainly recommend when you run Windows Server products in your on-premises environment.

And as always – Enjoy reading.

Did you know these facts about Azure?

  • Microsoft currently has 50 Azure regions across the globe, and Azure is available in 140 countries?
  • Microsoft’s cloud business is growing almost twice as fast as Amazon’s, with Google far behind?
  • You could use Availability Zones for other solutions, such as Citrix XenApp and XenDesktop, to make EUC Workspaces highly available within a region?
  • Availability Zones protect your applications and data from datacenter failures?
    • Availability Zones are currently supported in the following regions:
      • Central US
      • France Central
      • East US 2
      • West Europe
      • Southeast Asia
  • Within a region, Availability Zones increase fault tolerance with physically separated locations, each with independent power, network, and cooling?
    • The current GA Azure services that support Availability Zones are:
      • Linux Virtual Machines
      • Windows Virtual Machines
      • Virtual Machine Scale Sets
      • Managed Disks
      • Load Balancer
      • Public IP address
      • Zone-redundant storage
      • SQL Database
  • Azure is growing at a rate of 120,000 new customers per month?
  • More than 5 million organizations are using Azure Active Directory?
  • Availability Sets can be used to separate Azure VMs into separate fault and update domains within a datacenter to reduce the risk of an outage during planned and unplanned maintenance?
  • Microsoft Azure guarantees 99.9% up-time SLA for a single instance virtual machine?
  • Azure Managed Disks offer:
    • Scalable virtual machine deployments
    • High durability and availability
    • Point-in-time backup
  • Auto-scaling servers works far better than just increasing the resources of a single server?
  • You can use http://www.azurespeed.com/ to find the Azure datacenter region with the lowest latency from your location? It is really helpful for finding the fastest connection to the nearest DC for your business!
  • Azure for Students gives university students $100 per year in free Azure credit, with no credit card required? Interested? Check this link.

Note: Searching for a good migration tool for migrating on-premises workloads to Microsoft Azure? Please check one of my previous blogs around Azure Migrate Service.

Cheat sheet on availability

The sheet below shows the downtime that each availability percentage allows. Microsoft offers a 99.99% service level agreement when virtual machines run in two or more Availability Zones in the same region.

Microsoft also recommends placing two or more VMs in an Availability Set to provide high availability and qualify for the 99.95% service level agreement.
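The relation between an SLA percentage and the downtime it still allows is simple arithmetic. As a minimal sketch, the function below (Get-AllowedDowntime is a hypothetical helper name, not part of any Azure module) converts the three SLA levels mentioned in this article into minutes and hours per year:

```powershell
# Hypothetical helper (not an Azure cmdlet): converts an SLA percentage
# into the downtime it still allows over a given period.
function Get-AllowedDowntime {
    param(
        [Parameter(Mandatory)] [double] $SlaPercent,
        [double] $PeriodDays = 365
    )
    $periodMinutes   = $PeriodDays * 24 * 60
    $downtimeMinutes = $periodMinutes * (100 - $SlaPercent) / 100
    [pscustomobject]@{
        SlaPercent      = $SlaPercent
        DowntimeMinutes = [math]::Round($downtimeMinutes, 1)
        DowntimeHours   = [math]::Round($downtimeMinutes / 60, 2)
    }
}

# The three SLA levels mentioned in this article
99.9, 99.95, 99.99 | ForEach-Object { Get-AllowedDowntime -SlaPercent $_ }
```

Over a full year this works out to roughly 8.76 hours for 99.9%, 4.38 hours for 99.95%, and about 52.6 minutes for 99.99%.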

Availability Zones: Why Should You Use Them?

As mentioned at the beginning of this article, Availability Zones are a new way to reach a higher level of availability and protect your applications and data from datacenter failures. Enabled Azure regions, such as West Europe and East US 2, are divided into at least three separate Availability Zones. Each zone is made up of one or more datacenters equipped with independent power, cooling, and networking.

To ensure resiliency, there’s a minimum of three separate zones in all enabled regions. The physical separation of Availability Zones within a region protects applications and data from datacenter failures. Zone-redundant services replicate your applications and data across Availability Zones to protect from single-points-of-failure. With Availability Zones, Azure offers the highest 99.99% VM uptime SLA possible.

Microsoft only charges for the bandwidth used to replicate the Virtual Machines, data and components between the different zones in the region, which ends up at around 0.009 cents per GB.

Regions that currently support Availability Zones are:

  • Central US
  • France Central
  • East US 2
  • West Europe
  • Southeast Asia

The steps to configure Availability Zones are shown further below. It’s very easy!

Paired regions

Azure regions are paired, and the paired region serves as the secondary site for cross-region recovery. For instance, North Europe and West Europe form a pair: when you place your Virtual Machines in West Europe, North Europe is used as the secondary site.

See below the complete list of all the paired regions within Microsoft Azure.

Note: You’ll get the following error when using a region that isn’t supported for Availability Zones.

You can configure Availability Zones during the creation process of a Virtual Machine in Azure.

Open the Marketplace

Enter the required information

Pick a machine size SKU.

Note: Not all machine sizes are supported for Availability Zones. Support for more sizes will follow soon.

Choose your Availability Zone in the Settings menu

 

Note: When you decide to activate Availability Zones for a SaaS web app, or any other service that needs to be reachable from the outside, make sure to use an Azure Load Balancer plus a Public IP in the same Availability Zones setup!

The public DNS record needs to point to the Public IP in Azure, so the connection stays available if one zone fails!
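The portal steps above can also be approximated in PowerShell. The sketch below is an illustration under assumptions, not a complete deployment: it assumes AzureRM 5.x or later (where the -Zone and -Sku parameters were introduced), an authenticated session, and a region that supports Availability Zones. The resource names and the VM size are hypothetical placeholders.

```powershell
# Sketch only: assumes AzureRM 5.x or later, an authenticated session
# (Login-AzureRmAccount) and a region that supports Availability Zones.
# All names and the VM size below are hypothetical placeholders.
$rg       = "Azure-resource-group"
$location = "westeurope"

# Standard SKU public IP, the SKU that works together with Availability Zones
$pip = New-AzureRmPublicIpAddress -ResourceGroupName $rg -Name "web-pip" `
    -Location $location -Sku Standard -AllocationMethod Static

# One VM configuration per zone. NIC, OS disk, image and credentials
# still have to be added before calling New-AzureRmVM.
$vmConfigs = foreach ($zone in 1, 2, 3) {
    New-AzureRmVMConfig -VMName "web-vm$zone" -VMSize "Standard_DS1_v2" -Zone $zone
}
```

The public IP would then be attached to a Standard SKU load balancer that has the three VMs in its backend pool, so traffic keeps flowing when one zone is unavailable.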

Availability Set

When you plan to connect your environment to Azure (which I certainly hope and recommend), make sure to place your servers in the same Availability Set, with at least two fault domains and two update domains activated. With this Azure service, you ensure the continuity and availability of the connection to your environment, just in case there are problems within the Azure datacenter location. For instance, if two servers are not placed in the same Availability Set, there is a chance that they end up in the same server rack in the Azure datacenter. With fault domains, you ensure that this can’t happen.

Fault domain 1 means rack 1, and fault domain 2 means rack 2. This is useful in case of a power outage in one rack, for example; see the picture below.

Update domains are somewhat similar, but relate to patches and updates. When Microsoft releases updates that fix vulnerabilities or exploits, such as those abused by ransomware, leaving them unpatched is a high risk for the platform, so there is a chance that Microsoft forces these updates. When you place your machines in different update domains, you ensure that the updates aren’t applied to both machines at the same time, and again ensure the continuity of your connection!
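To make the idea of fault and update domains more tangible, here is a small simulation. It is not an Azure API call: Get-DomainPlacement is a hypothetical helper, and the round-robin assignment is a simplifying assumption about how VMs are spread over the domains of an Availability Set.

```powershell
# Hypothetical simulation, not an Azure API call: spreads VMs round-robin
# over fault domains (racks) and update domains (maintenance groups).
function Get-DomainPlacement {
    param(
        [string[]] $VmNames,
        [int] $FaultDomainCount = 2,
        [int] $UpdateDomainCount = 2
    )
    for ($i = 0; $i -lt $VmNames.Count; $i++) {
        [pscustomobject]@{
            VM           = $VmNames[$i]
            FaultDomain  = $i % $FaultDomainCount
            UpdateDomain = $i % $UpdateDomainCount
        }
    }
}

Get-DomainPlacement -VmNames "vm1", "vm2", "vm3"
```

With two domains of each kind, vm1 and vm2 never share a rack or a maintenance window, while vm3 lands next to vm1 again. When creating a real Availability Set, the domain counts can be set with the -PlatformFaultDomainCount and -PlatformUpdateDomainCount parameters of New-AzureRmAvailabilitySet.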

Azure notifies you about updates for your Virtual Machines through the following service message. You’ll also get a message in the notification panel in the upper-right corner of the Azure (ARM) portal.

How to move existing machines to an Availability Set

Creating an Availability Set is pretty easy. You only need to create and/or assign the Availability Set when setting up a new Virtual Machine.

Note: See below the step of the Azure Marketplace VM deployment process where you need to choose the Availability Set!

An Availability Set needs to be configured when the Virtual Machine is created. There is no simple way to change it afterwards: to change the Availability Set of an existing Virtual Machine, you need to delete the VM and recreate it.

The only method to do this while keeping the existing disks and network interfaces is through PowerShell. Follow the steps below to accomplish this.

# Set variables
$rg = "Azure-resource-group"
$vmName = "Name of VM"
$newAvailSetName = "Availability set Name"
$outFile = "C:\tmp\log.txt"

# Get VM details
$OriginalVM = Get-AzureRmVM -ResourceGroupName $rg -Name $vmName

# Output VM details to a file, so they are still available after the VM is removed
"VM Name: " | Out-File -FilePath $outFile
$OriginalVM.Name | Out-File -FilePath $outFile -Append
"Extensions: " | Out-File -FilePath $outFile -Append
$OriginalVM.Extensions | Out-File -FilePath $outFile -Append
"VMSize: " | Out-File -FilePath $outFile -Append
$OriginalVM.HardwareProfile.VmSize | Out-File -FilePath $outFile -Append
"NIC: " | Out-File -FilePath $outFile -Append
$OriginalVM.NetworkProfile.NetworkInterfaces[0].Id | Out-File -FilePath $outFile -Append
"OSType: " | Out-File -FilePath $outFile -Append
$OriginalVM.StorageProfile.OsDisk.OsType | Out-File -FilePath $outFile -Append
"OS Disk: " | Out-File -FilePath $outFile -Append
$OriginalVM.StorageProfile.OsDisk.Vhd.Uri | Out-File -FilePath $outFile -Append

if ($OriginalVM.StorageProfile.DataDisks) {
    "Data Disk(s): " | Out-File -FilePath $outFile -Append
    $OriginalVM.StorageProfile.DataDisks | Out-File -FilePath $outFile -Append
}

# Remove the original VM (the disks and NICs are kept)
Remove-AzureRmVM -ResourceGroupName $rg -Name $vmName

# Create the new availability set if it does not exist
$availSet = Get-AzureRmAvailabilitySet -ResourceGroupName $rg -Name $newAvailSetName -ErrorAction Ignore
if (-Not $availSet) {
    $availSet = New-AzureRmAvailabilitySet -ResourceGroupName $rg -Name $newAvailSetName -Location $OriginalVM.Location
}

# Create the basic configuration for the replacement VM
# (uses the VHD URI, so this applies to unmanaged disks; -Windows assumes a Windows OS disk)
$newVM = New-AzureRmVMConfig -VMName $OriginalVM.Name -VMSize $OriginalVM.HardwareProfile.VmSize -AvailabilitySetId $availSet.Id
Set-AzureRmVMOSDisk -VM $newVM -VhdUri $OriginalVM.StorageProfile.OsDisk.Vhd.Uri -Name $OriginalVM.Name -CreateOption Attach -Windows

# Add data disks
foreach ($disk in $OriginalVM.StorageProfile.DataDisks) {
    Add-AzureRmVMDataDisk -VM $newVM -Name $disk.Name -VhdUri $disk.Vhd.Uri -Caching $disk.Caching -Lun $disk.Lun -CreateOption Attach -DiskSizeInGB $disk.DiskSizeGB
}

# Add NIC(s)
foreach ($nic in $OriginalVM.NetworkProfile.NetworkInterfaces) {
    Add-AzureRmVMNetworkInterface -VM $newVM -Id $nic.Id
}

# Create the VM
New-AzureRmVM -ResourceGroupName $rg -Location $OriginalVM.Location -VM $newVM -DisableBginfoExtension

How can Azure Site Recovery help?

You can increase availability further by combining Availability Zones with ASR. As mentioned in the section above, Availability Zones aren’t a Disaster Recovery solution. They are a new method to increase the availability and continuity of your Virtual Machines in Azure by spreading them over 3 datacenters within the same region. Using Availability Zones combined with Azure Site Recovery takes your Disaster Recovery and Business Continuity strategy to a higher level!

For instance, if a datacenter goes down, your servers keep running in one of the 2 other zones in the region. If the complete region is down, you still have ASR, which will start the recovery procedure in another region, for example one in the United States.

Azure Site Recovery can solve your on-premises Disaster Recovery problem easily, with no requirement for hardware in a separate datacenter. Another great advantage of ASR is that you can use it as a Lift-and-Shift migration tool to seamlessly migrate on-premises Virtual Machines to Microsoft Azure Infrastructure-as-a-Service.

I have many customers that are not ready for a complete move to the Cloud. ASR is a great introduction to Microsoft Azure. It doesn’t commit you to anything; there is only a very small fee of 25 dollars per replicated machine.

Note: Below is an overview of all the required steps to configure Azure Site Recovery for an on-premises VMware workload. If you use Hyper-V with SCVMM, the management/process server is replaced by the SCVMM server.

 

Azure ASR Deployment Planner (for VMware and Hyper-V)

With the Azure ASR Deployment Planner, you can collect data and discover how your environment aligns with Azure Site Recovery, or which steps need to be performed to get it supported.

The following options are available in the tool. It gives you a very good and simplified view of the current status of your on-premises environment.

The tool can be downloaded for free through this link. For all the in-depth configuration steps for this tool, please visit the original Microsoft Docs page.

  • Compatibility: Is the virtual machine configuration and guest OS compatible with Azure and ASR?
  • Bandwidth: How much bandwidth is required for replication, and how does this bandwidth impact your desired recovery point objective (RPO, i.e. how much data is lost after failover)?
  • Azure infrastructure requirements: The quantity and tier of storage accounts that are required to maintain the required performance of failed over virtual machines. You are also given guidance on the required number of cores (subscription limits) and virtual machine series/sizes (optionally configured after the initial synchronization is complete).
  • On-premises infrastructure requirements: Hyper-V replication is based on Hyper-V Replica, which consumes storage for Hyper-V Replica Log (HRL) files. VMware customers will get guidance on the configuration/process server requirements.
  • Initial Synchronization: The more machines you add to replication at once, the harder your bandwidth is hit. The tool will make recommendations on batch size versus available bandwidth.
  • Cost analysis: An estimate of your Azure costs will be produced.
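To get a feeling for the bandwidth and initial synchronization figures before running the tool, a back-of-the-envelope calculation helps. The sketch below is not the Deployment Planner’s actual algorithm: Get-ReplicationEstimate is a hypothetical helper, the input values are made up, and it ignores compression, peaks in data churn, and protocol overhead.

```powershell
# Hypothetical helper: rough replication math, not the
# Deployment Planner's actual algorithm.
function Get-ReplicationEstimate {
    param(
        [double] $DailyChurnGB,    # data changed per day across the replicated VMs
        [double] $InitialDataGB,   # total disk data for the first full copy
        [double] $BandwidthMbps    # bandwidth available for replication
    )
    # Average bandwidth needed to keep up with churn: GB per day -> megabits per second
    $requiredMbps = $DailyChurnGB * 1024 * 8 / 86400
    # Duration of the initial synchronization at the available bandwidth
    $initialSyncHours = $InitialDataGB * 1024 * 8 / $BandwidthMbps / 3600
    [pscustomobject]@{
        RequiredMbps     = [math]::Round($requiredMbps, 2)
        InitialSyncHours = [math]::Round($initialSyncHours, 1)
        KeepsUp          = $BandwidthMbps -ge $requiredMbps
    }
}

Get-ReplicationEstimate -DailyChurnGB 50 -InitialDataGB 1000 -BandwidthMbps 100
```

In this example, 50 GB of daily churn needs about 4.74 Mbps on average, and the initial copy of 1000 GB over a 100 Mbps link takes roughly 22.8 hours. If the available bandwidth is lower than the required average, replication falls behind and the achievable RPO gets worse.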

See below a screenshot of the outcome of the Deployment Planner.

Azure Site Recovery instant VM integration (preview)

This feature is really awesome, and can obviously only be used when running instances in Microsoft Azure. This new feature, which is directly available from the VM settings menu, instantly adds the Virtual Machine to Azure Site Recovery for replication to a different region.

The following regions are currently in preview mode.

Note: You can add servers instantly to an ASR vault. When you have no vault available, this service will create one for you, as you can see in the replication settings in the image below. The default retention will be 24 hours.

Perform the Migration with Azure Migrate and Azure Site Recovery 

I wrote a complete and comprehensive walkthrough article on the initial configuration of Azure Migrate and Azure Site Recovery not so long ago. Please check out this blog post: https://www.christiaanbrinkhoff.com/2017/12/01/how-to-lift-and-shift-on-premise-vmware-workloads-to-the-microsoft-azure-cloud-with-the-new-azure-migrate-service/

I’ve created the following How To video to experience how easy it is to migrate on-premises workloads to Microsoft Azure when using Azure Site Recovery.

If you have any questions, or need help, please leave them in the comment section.

Cheers,

Christiaan Brinkhoff