In this blog post I will show a new feature in the Prism Pro product within the Prism Central management plane called X-Play, or “Cross-play”. This enhancement to the Prism Pro feature set is a major step towards the autonomous, or “self-driving”, datacenter that was discussed in the .Next 2019 keynotes in Anaheim, CA (check out the keynote HERE).
At a high level, X-Play is an IFTTT (“If This, Then That”) style codeless automation framework that lets the user build automated actions around specific alerts, or “triggers”, in a Nutanix environment.
To make sense of the walkthrough below, here are a few key terms.
Playbook – A playbook is essentially a “script” that steps through each ‘Action’ (defined below); together, the actions perform the end-to-end automated set of tasks.
Action – An action represents a function or task inside of a ‘Playbook’. For example, “Take Snapshot of <vm_name>” is an action.
Play – A play is simply a single run of a ‘Playbook’.
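To make the relationship between these three terms concrete, here is a minimal Python sketch. It is purely illustrative: the class names, the `execute` function and the context dictionary are my own hypothetical stand-ins, not Nutanix classes or APIs. It only shows that a Playbook is an ordered list of Actions, and that a Play is the record of one run.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical model for illustration only; none of this is a Nutanix API.

@dataclass
class Action:
    name: str
    run: Callable[[Dict], None]  # a task that reads/updates a shared context


@dataclass
class Playbook:
    name: str
    trigger: str            # the alert that starts the playbook
    actions: List[Action]   # executed in order


@dataclass
class Play:
    playbook: str
    results: List[str] = field(default_factory=list)


def execute(playbook: Playbook, context: Dict) -> Play:
    """Run every action in order and record the run as a Play."""
    play = Play(playbook=playbook.name)
    for action in playbook.actions:
        action.run(context)
        play.results.append(f"{action.name}: done")
    return play


def take_snapshot(ctx: Dict) -> None:
    ctx["snapshots"] = ctx.get("snapshots", 0) + 1


def add_vcpu(ctx: Dict) -> None:
    ctx["vcpus"] += 1


playbook = Playbook(
    name="demo playbook",
    trigger="VM CPU usage alert",
    actions=[Action("Take Snapshot", take_snapshot), Action("VM Add CPU", add_vcpu)],
)
context = {"vm_name": "test-vm", "vcpus": 2}
play = execute(playbook, context)
print(play.results)
```

Running the same playbook a second time would produce a second, separate Play, which is the distinction the terms above are drawing.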
In the walkthrough below I will show an example of a VM that becomes CPU constrained and the X-Play playbook used to increase the vCPUs on this machine.
1) The first step is to enable an alert policy. I could do this on a category of VMs or on ALL VMs, but for this example I will use a single test VM. If I navigate to the VM, I can hit the ‘Metrics’ dropdown, click ‘CPU Usage’, and select ‘Alert Settings’ at the top right of the screen.
2) From here I define the alert. Note that I can set a Static threshold or an Anomaly-based threshold. For adding vCPU, a static threshold makes more sense, since an anomaly does not always call for additional vCPU. I set the alert to “Critical” at greater than 90% CPU utilization, enabled the policy (bottom left), and set the alert to trigger after 5 minutes (I could set this to a longer time, but for a test 5 minutes made the most sense).
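As a rough illustration of what a static threshold with a trigger delay means, here is a small Python sketch. It rests on assumptions of mine: one CPU sample per minute, and an alert that fires only after usage has stayed above the threshold for 5 consecutive samples. The function name `first_alert_minute` is hypothetical, and Prism's actual evaluation logic may well differ.

```python
from typing import Iterable, Optional


def first_alert_minute(
    samples: Iterable[float],
    threshold: float = 90.0,    # static threshold, in percent CPU
    sustain_minutes: int = 5,   # assumed: one sample per minute
) -> Optional[int]:
    """Return the 1-based minute at which the alert would fire, or None."""
    consecutive = 0
    for minute, cpu in enumerate(samples, start=1):
        # Count consecutive samples above the threshold; reset on any dip.
        consecutive = consecutive + 1 if cpu > threshold else 0
        if consecutive >= sustain_minutes:
            return minute
    return None


brief_spike = [40, 95, 97, 50, 45, 42, 41]       # short burst, then back to normal
sustained = [40, 95, 100, 100, 100, 100, 100]    # pegged from minute 2 onwards

print(first_alert_minute(brief_spike))
print(first_alert_minute(sustained))
```

The delay is what keeps a short burst from kicking off a playbook that may power-cycle the VM.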
3) Now that I have an alert, aka “something to trigger a play”, I can set up the Playbook. If I navigate to Operations –> Playbooks I can see the list of predefined Playbooks. All I did for this test was clone the “Increase a constrained VMs vCPU” playbook.
4) Inside my Playbook we can see the steps X-Play will take once the alert is triggered. The “VM Add CPU” action shows that each time this alert fires, X-Play will add vCPU in increments of ‘1’ until the VM has ‘6’ vCPUs. Note that if I know my machine can “hot-add” CPU, I could remove the Power off and Power on actions.
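The logic of this playbook can be sketched in a few lines of Python. This is a simulation under my own assumptions, not the playbook's real implementation: the `VM` class, the `add_vcpu_play` function and the exact order of the steps are hypothetical. The increment of 1, the maximum of 6 and the optional power cycle come from the settings described above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VM:
    name: str
    vcpus: int
    powered_on: bool = True


def add_vcpu_play(vm: VM, increment: int = 1, max_vcpus: int = 6,
                  hot_add: bool = False) -> List[str]:
    """Simulate one run of the playbook and return the steps taken."""
    steps: List[str] = []
    if vm.vcpus >= max_vcpus:
        steps.append("skip: VM already at maximum vCPUs")
        return steps
    if not hot_add:
        # Without hot-add the VM is powered off before the change.
        vm.powered_on = False
        steps.append("power off")
    vm.vcpus = min(vm.vcpus + increment, max_vcpus)  # never exceed the cap
    steps.append(f"set vCPUs to {vm.vcpus}")
    if not hot_add:
        vm.powered_on = True
        steps.append("power on")
    return steps


vm = VM(name="test-vm", vcpus=2)  # starting vCPU count assumed for the example
print(add_vcpu_play(vm))
print(add_vcpu_play(vm, hot_add=True))
```

Each alert adds only one increment, so a VM that stays constrained would step up 3, 4, 5, 6 across repeated plays and then stop at the cap.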
5) I have an Alert policy and I have a Playbook. Now let’s hammer this VM! For this I used a free utility called HeavyLoad. Please note that once I hit the PLAY button my VM completely froze up, as the HeavyLoad tool attempts to create 3D models and pegs the CPU at 100%. There are test options I did not play with that might have avoided my frozen VM scenario.
To show that I’m not making this up, here is a screenshot of the VM’s vCPU field prior to hitting the PLAY button on HeavyLoad.
Now let’s crush the VM!
6) With my VM hitting 100% CPU, I waited 5 minutes until an alert popped up, which triggered the ‘Play’.
I can see from the main dashboard that 1 Play ran.
If I go into ‘Plays’ I can see that my Play ran but Completed with Errors.
Drilling into the play I can see the list of Actions and their results. All actions succeeded except the Email action, which failed because SMTP was not set up.
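The “Completed with Errors” status can be pictured with the sketch below. It assumes that a failed action is recorded rather than aborting the rest of the play, which matches what I saw here but is not something I verified for every action type. The function names and the error message are hypothetical.

```python
from typing import Callable, List, Tuple


def run_play(actions: List[Tuple[str, Callable[[], None]]]):
    """Run each action, record its outcome, and derive an overall status."""
    results = []
    for name, func in actions:
        try:
            func()
            results.append((name, "Succeeded"))
        except Exception as exc:  # record the failure and keep going
            results.append((name, f"Failed: {exc}"))
    all_ok = all(outcome == "Succeeded" for _, outcome in results)
    status = "Completed" if all_ok else "Completed with Errors"
    return status, results


def noop() -> None:
    pass  # stands in for an action that works


def send_email() -> None:
    # Stands in for the Email action when no SMTP server is configured.
    raise RuntimeError("SMTP server is not configured")


status, results = run_play([
    ("Power Off VM", noop),
    ("VM Add CPU", noop),
    ("Power On VM", noop),
    ("Email", send_email),
])
print(status)
for name, outcome in results:
    print(f"{name}: {outcome}")
```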
Now if I go back to the VM and look at the vCPU field I can see that vCPUs now equal 3!
To summarize, we just saw how Nutanix’s X-Play feature (part of the licensed Prism Pro feature set) allowed me to build an automated activity that added vCPU to a VM once my alert policy was triggered. Playbooks can be much more advanced: I could push details to ServiceNow so that an operator can approve a certain task, or make an API call to CALM to perform a scaling action on an application. You are limited only by your imagination! Once your playbooks and alerts are defined, sit back and enjoy the time you now have back in your day to focus on other skills that help your organization move into this brave new cloud world. 🙂
Thanks!