Nutanix Full Site Disaster Recovery

In this post I will walk through a full site failover using the intelligence built into the Nutanix Prism software. The failover is initiated from the DR location to simulate a full site failure, and the steps show how to bring up your VMs and Volume Groups at the remote site.

Note: This post makes the following assumptions.

  1. The Remote site has been configured with proper Network Mappings and vStore/Container Mappings (for details on setting this up go HERE)
  2. The Protection Domain to be recovered is configured to point to the Remote site (see Configuring Nutanix Protection Domain to perform these steps)
  3. The Protection Domain to be recovered has completed the initial seeding of the data, meaning it has completed at least one successful replication of all required VMs.

Failing Over…

It's 4:59 PM on Friday. You have just finished packing up your laptop and zipped up your jacket when your manager walks in looking pale as a ghost. “Production just went down, we lost power to the datacenter on both feeds and it won’t be up for 5 hours. What do we do!?” he exclaims. Previously this would have sent your jaw to your shoes with terror, but not today.

Using the steps below, you quickly bring production back up in your DR facility and bring the color back to your manager's face.

1. First, log in to Prism on the DR Nutanix cluster. Go to Home –> Data Protection, then select the Table view.

2. Next, select the Protection Domain you need to recover. (You will notice a grey circle next to the Protection Domain name, indicating that it is not active on this side of the replication relationship.)

3. Go to the Local Snapshots tab and hover over the Details link on the latest snapshot to validate you see all of your VMs.

4. Next, with the Protection Domain still selected, click Activate.

5. You will be prompted, “Are you sure you want to activate Protection Domain <name>?”. Click Yes.

6. You will see a message indicating the VMs are being activated on this side. In a few seconds (depending on the number of VMs) you will see them all show up under Home –> VM.

To validate, go to Events to see all of the VM registration successes.

7. From the VM Table view you will see all of the new VMs we verified in step #3.

8. Now bring them up in the order required. For example, bring up your database servers first then application servers once the DBs are online.

9. DONE!
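
If you keep a list of the VMs you expect in the Protection Domain, the check from step 3 can also be done in a script before you click Activate. The sketch below is only an illustration: `missing_vms` is a hypothetical helper name, and it assumes you have already copied the VM names shown in the snapshot details into a list. It does not talk to Prism at all.

```python
def missing_vms(expected, snapshot_vms):
    """Return the expected VM names that are absent from the snapshot.

    Both arguments are plain lists of VM names (an assumption for this
    sketch). Names are compared case-insensitively, ignoring stray spaces.
    """
    present = {name.strip().lower() for name in snapshot_vms}
    return sorted(
        name for name in expected if name.strip().lower() not in present
    )


if __name__ == "__main__":
    # Hypothetical VM names, for illustration only.
    expected = ["db01", "db02", "app01"]
    in_snapshot = ["DB01", "app01"]
    print("Missing from snapshot:", missing_vms(expected, in_snapshot))
```

An empty result means every VM you expected is present in the snapshot you are about to activate.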

Depending on the number of VMs, this could take anywhere from a few minutes to an hour. If you have a large number of VMs, you could script a disaster recovery procedure that brings up the VMs in the correct order with validations built in. It is possible that DR runbook modeling will be built into Prism in the future, similar to the Nutanix Xi DRaaS cloud service due in mid-2018, which would allow this complete process to be automated.
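
As a rough idea of what such a script might look like, here is a minimal sketch of the ordering logic only. Everything in it is an assumption for illustration: `bring_up_in_order` is a hypothetical function, and `power_on` and `is_ready` are callables you would have to supply yourself (for example, wrappers around whatever tooling you use to power on a VM and to check that its service is answering). No Nutanix API is called here.

```python
import time


def bring_up_in_order(tiers, power_on, is_ready,
                      timeout_s=600.0, poll_s=5.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Power on VMs tier by tier, waiting for each tier to become ready.

    tiers    -- list of lists of VM names, e.g. [["db01"], ["app01", "app02"]]
    power_on -- hypothetical callable(vm_name) that starts the VM
    is_ready -- hypothetical callable(vm_name) -> bool, your validation check
    Raises TimeoutError if a tier is not ready within timeout_s seconds.
    Returns the VM names in the order they were powered on.
    """
    started = []
    for tier in tiers:
        # Start every VM in the current tier.
        for vm in tier:
            power_on(vm)
            started.append(vm)

        # Wait for the whole tier to pass validation before moving on.
        deadline = clock() + timeout_s
        pending = set(tier)
        while pending:
            pending = {vm for vm in pending if not is_ready(vm)}
            if not pending:
                break
            if clock() >= deadline:
                raise TimeoutError(
                    "VMs not ready in time: " + ", ".join(sorted(pending))
                )
            sleep(poll_s)
    return started


if __name__ == "__main__":
    # Stand-in callables so the sketch runs without any real infrastructure.
    order = bring_up_in_order(
        [["db01", "db02"], ["app01"]],
        power_on=lambda vm: print("powering on", vm),
        is_ready=lambda vm: True,
    )
    print("Started:", order)
```

Databases go in the first tier and application servers in the next, matching step 8. The application tier is not touched until every database VM passes the `is_ready` check or the timeout expires.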

As always, please comment below with any questions/concerns!
