In this post I will walk through what enables the Nutanix architecture to scale performance linearly as the virtualization cluster scales out, compared to scaling out a virtualization cluster in a legacy 3-tier environment.

First, let's walk through what scaling a 3-tier environment looks like. In the interest of everyone's valuable time I won't walk through the many steps involved in performing this scaling, and will focus on the high-level details.
Example: let's say we have a SAN array capable of 100,000 8k IOPS with 4 hosts attached. Each host then has a performance budget of 25,000 8k IOPS before the SAN array starts to bottleneck and latencies become less than stellar.

Now I'm at the point where I've consumed all of the memory in these hosts and need to add one more (let's forget the many steps involved for a second). I add that 5th host, and now each host in the virtualization cluster has a performance budget of only 20,000 8k IOPS.
This means that as I scale the virtualization layer, I reduce the effective performance capacity available to each host. So how do I add performance capacity? Well, now I need a whole new SAN array, which means a new silo of storage, a new management endpoint, and more SAN cabling and zoning to the new and existing hosts…aka more headaches.
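The shrinking per-host budget is just division: a fixed array ceiling shared across more hosts. A quick sketch using the 100,000 IOPS figure from the example (the function name is mine, purely illustrative):

```python
def per_host_iops(array_iops: int, num_hosts: int) -> float:
    """In a 3-tier design the array's total IOPS is a fixed ceiling
    shared by every attached host, so each added host shrinks
    everyone's budget."""
    return array_iops / num_hosts

ARRAY_IOPS = 100_000  # the example SAN's 8k IOPS ceiling

for hosts in (4, 5, 8):
    print(f"{hosts} hosts -> {per_host_iops(ARRAY_IOPS, hosts):,.0f} IOPS each")
# 4 hosts -> 25,000 IOPS each
# 5 hosts -> 20,000 IOPS each
# 8 hosts -> 12,500 IOPS each
```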
Now, what happens when we scale a Nutanix environment that has run out of compute resources, like the scenario above? A typical Nutanix node can do 100k IOPS by itself, but for argument's sake let's say my 4 nodes combined can do 100k IOPS (25k each).
Because the Nutanix hyperconverged architecture runs a virtual storage controller (CVM) on each host, we essentially add a new storage controller every time we add a node. More importantly, Nutanix is built on a webscale design, which means these CVMs work in a fully distributed manner, with data placement intelligently determined using machine learning and advanced algorithms to reduce bottlenecks within a single host. This is very important when we think about failure scenarios, but more on that in a future post.

The most important piece that allows this architecture to maintain consistent performance as the cluster scales is the unique and patented implementation of Data Locality. Data Locality lets Nutanix reduce network traffic by serving all reads from the locally attached disks. So as we add another node, that node will read locally and distribute its writes throughout the entire cluster, NOT to an HA pair or another disk group within another node. (Again, think about failures when thinking about data placement: if all writes are mirrored to a fixed partner node or disk group, what happens when one of those pairs fails?)
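To make the placement idea concrete, here is a toy sketch (the `place_write` function and node names are my own invention, not Nutanix code): one replica of each write stays local for Data Locality, and the remaining replica is spread across the rest of the cluster rather than pinned to a fixed HA partner.

```python
import random

def place_write(local_node: str, cluster: list[str], rf: int = 2) -> list[str]:
    """Toy replica placement: one copy stays on the writing node,
    the other copies land on varying peers across the cluster --
    so no single 'HA pair' absorbs all of this node's write traffic,
    and no single peer failure takes out every second copy."""
    peers = [n for n in cluster if n != local_node]
    return [local_node] + random.sample(peers, rf - 1)

cluster = ["node1", "node2", "node3", "node4", "node5"]

# Over many writes from node1, every peer shows up as a replica target:
placements = [place_write("node1", cluster) for _ in range(1000)]
partners = {p[1] for p in placements}
print(sorted(partners))
```

The real system weighs placement far more intelligently than a random pick; the point of the sketch is only the contrast with mirroring every write to one fixed partner.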
So in the example given, as we expand the cluster to 5 nodes we now have the storage performance capacity to handle 125,000 8k IOPS. (Read HERE for further details on expanding a Nutanix cluster.)
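Under the example's simplified assumption that each node brings 25,000 IOPS of controller capacity, the aggregate grows linearly with node count instead of being divided by it:

```python
NODE_IOPS = 25_000  # per-node figure from the example above

def cluster_iops(num_nodes: int, node_iops: int = NODE_IOPS) -> int:
    """Each added node brings its own CVM (storage controller),
    so capacity adds up rather than splitting a fixed ceiling."""
    return num_nodes * node_iops

for nodes in (4, 5, 8):
    print(f"{nodes} nodes -> {cluster_iops(nodes):,} IOPS total")
# 4 nodes -> 100,000 IOPS total
# 5 nodes -> 125,000 IOPS total
# 8 nodes -> 200,000 IOPS total
```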
Now the question might be, "Well, what happens if I don't need compute, just storage?" Then you add a storage-only node, which gives you a large amount of storage capacity and also…you guessed it…ANOTHER storage controller. These nodes have very limited CPU/memory, just enough to run the CVM, plus a large amount of storage. So Nutanix does not limit you to scaling with a single node type; it lets you scale in many different ways, and uses intelligence built into the software to maintain performance as the cluster grows.
For any questions please comment below. 🙂