If you’re researching how to detect server sprawl or storage sprawl, you probably already know they can sap network performance and balloon your hardware and licensing budget. See which strategies for reversing sprawl we’ve seen work well for IT managers, and which haven’t.
Remember the bad old days when setting up a new server took a few weeks to get approvals, requisition and install hardware, load the software and boot it up? Today, if you start spinning up a virtual server and eating a Twix bar simultaneously, you can finish both at the same time.
Some amount of server sprawl and/or storage sprawl may be inevitable, given how easy it is now to set up and provision a new guest. But you can gain control of it. We’ve seen the most success with a combination of administrative tactics and a unified approach to network traffic monitoring.
First, a few words about what usually doesn’t work very well for controlling VM sprawl:
A spreadsheet: If you’re tracking VMs manually on a spreadsheet, we hope you’re pretty disciplined about entering updates.
If your tracking is literally manual (Post-it notes on the wall or index cards tacked to a cork board), we applaud your can-do, frontier spirit. But it’s time to fire up the Flux Capacitor and come back to the future.
A wrangler: “Hey, who set this up?” “Did you guys configure this?” “Who built this?” “Are you still using this?”
Asking these types of questions repeatedly, trying to track down who owns which VM, isn’t an effective use of a valuable IT pro’s time. Even when it works, it doesn’t address sprawl’s root causes.
A cobbled-together tracking/reporting system: You can choose from a large variety of tools for tracking the VM inventory, configuration, and ongoing performance in your private cloud. The trouble comes when pieces are added over time that don’t work well together.
Many companies have quickly ramped up virtualization in recent years. It’s easy to ignore the threat of sprawl until it creates bottlenecks, outages or cost overruns. Then, different silo managers may throw multiple tools–with multiple learning curves–at the problem.
This is when you discover that VM sprawl and chaos have a codependent relationship. Without a unified, coherent tracking and reporting process, you end up playing whack-a-mole with over-loaded hosts, rogue guests, and over- or under-utilized storage.
Preventing and detecting VM server and storage sprawl takes two main components: administrative and technological.
The Administrative Component: VM Accountability
Virtualization, by removing so much friction from the process of provisioning new servers, makes IT development and deployment more nimble. You don’t want to lose that with excessive management restrictions. You should, however, insist on accountability.
Configure your system to require admins to document each server when they launch it, so it can be traced to them later. Then schedule regular VM reviews, and send your staff reports showing which VMs they’ve created. Filter each admin’s list so it only includes potentially abandoned servers.
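Here is a minimal sketch of that kind of filtered report. Everything in it is an assumption for illustration: the VMRecord fields, the abandoned_by_owner helper, and the 30-day idle cutoff are hypothetical, and a real inventory would come from your own management tool rather than a hard-coded list.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class VMRecord:
    # Hypothetical inventory fields, not taken from any particular tool.
    name: str
    owner: str           # admin recorded when the VM was launched
    last_activity: date  # last day any CPU or network activity was seen


def abandoned_by_owner(vms, today, idle_days=30):
    """Group VMs idle for at least `idle_days` by the admin who launched them."""
    cutoff = today - timedelta(days=idle_days)
    report = defaultdict(list)
    for vm in vms:
        if vm.last_activity <= cutoff:
            report[vm.owner].append(vm.name)
    # Sort names so each admin's list is stable from one report to the next.
    return {owner: sorted(names) for owner, names in report.items()}


inventory = [
    VMRecord("build-07", "alice", date(2024, 1, 3)),
    VMRecord("web-01", "alice", date(2024, 3, 28)),
    VMRecord("test-db", "bob", date(2023, 11, 15)),
]
report = abandoned_by_owner(inventory, today=date(2024, 4, 1))
for owner, names in report.items():
    print(f"{owner}: {', '.join(names)}")
```

Active servers such as web-01 never show up in the report, so each admin only sees the short list they need to act on.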
Also consider using a chargeback engine. Again, you don’t want to create unnecessary roadblocks, but it simply may be necessary to remind your staff that back-end data stores are consuming expensive resources.
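To make the idea concrete, a chargeback calculation can be as simple as the sketch below. The rates, team names, and the monthly_chargeback function are all hypothetical; a real chargeback engine would pull allocations from your inventory and use rates set by your finance team.

```python
from collections import defaultdict

# Hypothetical monthly rates, kept in cents to avoid floating-point rounding.
CENTS_PER_VCPU = 1500
CENTS_PER_GB_STORAGE = 10


def monthly_chargeback(vms):
    """Sum the monthly cost of each team's VMs.

    `vms` is a list of (team, vcpus, provisioned_gb) tuples.
    """
    totals = defaultdict(int)
    for team, vcpus, provisioned_gb in vms:
        totals[team] += (
            vcpus * CENTS_PER_VCPU + provisioned_gb * CENTS_PER_GB_STORAGE
        )
    return dict(totals)


allocations = [
    ("dev", 4, 200),
    ("dev", 2, 500),
    ("qa", 8, 1000),
]
charges = monthly_chargeback(allocations)
for team, cents in sorted(charges.items()):
    print(f"{team}: ${cents / 100:.2f}")
```

Charging on provisioned rather than used storage is a deliberate choice in this sketch: it puts a price on the space a forgotten VM is holding, even if nothing is writing to it.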
The Technology Component: A Comprehensive Management Tool
Proprietary hypervisor tools are great for setting up and inventorying virtual machines, but they’re often inefficient for identifying sprawl. For example, manually searching event logs, machine by machine, can be an enormous time suck.
You may require multiple add-on tools to augment your hypervisor, and even then, abandoned VMs can remain hidden or misdiagnosed.
The best solution is a network management tracking system that can show you at a glance which VMs show signs of abandonment, such as no longer communicating with the outside world or consuming no CPU resources.
The same system should also monitor back-end data stores, so you can quickly cross-check inactive VMs with their corresponding storage provision.
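The cross-check described above might look something like this sketch. The metric names, the thresholds, and the find_idle_vms function are assumptions made for illustration; a monitoring tool would supply the real averages and storage figures.

```python
def find_idle_vms(metrics, provisioned_gb, max_cpu_pct=1.0, max_net_kbps=1.0):
    """Return (vm_name, provisioned_gb) pairs for VMs whose average CPU and
    network use are both at or below the thresholds, largest storage first."""
    idle = [
        (name, provisioned_gb.get(name, 0))
        for name, (cpu_pct, net_kbps) in metrics.items()
        if cpu_pct <= max_cpu_pct and net_kbps <= max_net_kbps
    ]
    return sorted(idle, key=lambda item: item[1], reverse=True)


# Hypothetical 30-day averages: name -> (CPU %, network KB/s)
metrics = {
    "web-01": (35.0, 820.0),
    "test-db": (0.2, 0.0),
    "build-07": (0.5, 0.3),
    "cache-02": (0.4, 150.0),  # quiet CPU, but still talking on the network
}
# Hypothetical provisioned storage per VM, in GB
storage = {"web-01": 100, "test-db": 500, "build-07": 80, "cache-02": 40}

candidates = find_idle_vms(metrics, storage)
for name, gb in candidates:
    print(f"{name}: {gb} GB provisioned, no meaningful CPU or network activity")
```

Requiring both signals keeps a VM like cache-02 off the list, and sorting by provisioned storage puts the biggest potential reclaim at the top.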
If these monitoring features also share a console screen with your real-time tracking of the host load balancing, you’ve got a decent shot at identifying at-risk hosts and quickly drilling down to look for potentially inactive resources you can reassign.
A single overarching network management monitoring tool on top of a heterogeneous virtualized environment has another advantage: It’s easier to produce properly filtered reports. When you’re working with a hodgepodge of different tools, you have to learn a bunch of different reporting UIs.
Break the Binge and Purge Cycle
Do you scramble to reduce server sprawl or storage sprawl only when a problem arises, or when a switch requisition comes back “Not Approved”? Controlling VM sprawl should be a continuous process.
Ongoing VM usage maintenance helps you make better decisions about deploying new resources and extract the best value from those you already have.