Partial blindness is a trade-off some network admins make for the advantages of virtualized systems. It’s often hard to see whether you’re pushing your virtual resources too much, or too little. Also, the roots of slowdowns and outages can hide in hypervisors. Is it time to get your virtual vision checked?
The advantages of virtualization are undeniable: redundancy, flexibility, and the potential for substantial cost savings. Even so, we talk with plenty of network admins and directors who struggle with performance issues within their virtual environments.
One reason they struggle is that troubleshooting is a different beast than it was when they had nothing but rack after rack of static servers.
The virtualization software’s built-in management tools often make it difficult to spot anomalies that may be slowing down key applications. Instead of drilling down tier after tier to find the source, the temptation is simply to throw more CPUs and more storage at a problem.
Better safe than sorry, right?
The trouble with that approach is that it can lead to an ongoing see-saw: you over-provision your resources and, when that finally comes to light, under-provision them in response. Then you’ve got performance bottlenecks, or worse. Either way, you end up siphoning off the returns on your virtualization investment.
Based on what we see when working and talking with IT pros, here are a few tips to help you monitor your virtualized environment, and find the cause and solution for service issues faster:
1. Use the right tool for setting up alerts.
Hypervisor monitoring tools built into virtualization solutions are generally very good at helping you set up your virtual environment and do the initial provisioning. For ongoing monitoring, however, their focus is often too narrow to give you the big picture you need to react to service issues quickly.
For example, some of these virtual machine monitoring tools require time-intensive manual alert setups. A busy admin staff might get to all these setups…oh, let’s say…never. Without pre-set alerts, a hypervisor might trigger a warning you’ll see only if you happen to be at the hypervisor’s admin console at the time.
A better method is to fully integrate your virtual machine manager with a comprehensive network monitoring solution. A holistic monitoring system should give you advantages such as:
- A more automated, coherent alert setup procedure. It should be easier to automatically configure new guests with pre-set alerts.
- Ready access to history and reports for each cluster and each host, without drilling down through six, seven, eight or more layers.
- The ability to see at a glance whether a hardware connectivity issue, such as a defective router or switch, might be causing a bottleneck. This way, you don’t waste time chasing non-existent host machine load imbalances, rogue guests, etc., on a separate monitoring system.
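To make the idea of pre-set alerts for new guests concrete, here is a minimal sketch in Python. It is not any particular product's API: the `AlertRule` class, the `ALERT_TEMPLATES` table, the role names, the metric names, and the threshold values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    metric: str       # hypothetical metric name, e.g. "cpu_percent"
    threshold: float  # alert when the sampled value exceeds this


# Hypothetical alert templates keyed by guest role (illustrative values only).
ALERT_TEMPLATES = {
    "default": [AlertRule("cpu_percent", 90.0), AlertRule("disk_latency_ms", 50.0)],
    "database": [AlertRule("cpu_percent", 80.0), AlertRule("disk_latency_ms", 20.0)],
}


def alerts_for_guest(role: str) -> list[AlertRule]:
    """Return the pre-set rules for a new guest, falling back to the default set."""
    return list(ALERT_TEMPLATES.get(role, ALERT_TEMPLATES["default"]))


def triggered(rules: list[AlertRule], sample: dict[str, float]) -> list[str]:
    """Return the names of metrics in the sample that exceed their thresholds."""
    return [r.metric for r in rules if sample.get(r.metric, 0.0) > r.threshold]
```

With something like this in place, a newly created guest picks up a sensible set of alerts from its role, so nobody has to remember to configure them by hand.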
2. Watch for anomalies.
Anomaly detection is too often overlooked, and it can make a major difference in how quickly you can spot a problem, assess its history, and either take corrective action or move on.
Rather than setting a static threshold for a particular drive’s I/O request volume and/or latency, set the alert to compare current data with the “normal” levels of traffic for a certain time of day, or day of the week.
When you don’t have to spend time finding out whether the traffic is unusual for a given period, you’ve got a better chance of isolating the issue before it affects users.
For example, as soon as you’re aware of an I/O anomaly, you can go into deeper diagnostics for your storage array, looking for utilization spikes by a particular guest, degraded RAID conditions, failed hard drives, or other potential issues.
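The difference between a static threshold and a comparison against "normal" levels can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any given monitoring product works: the function names, the (weekday, hour) bucketing, and the three-standard-deviation tolerance are all choices made for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_baseline(history):
    """Summarize past samples per (weekday, hour) slot.

    history: iterable of (datetime, value) pairs, e.g. I/O requests per second.
    Returns {(weekday, hour): (mean, standard deviation)} for slots that
    have at least two samples.
    """
    buckets = defaultdict(list)
    for ts, value in history:
        buckets[(ts.weekday(), ts.hour)].append(value)
    return {slot: (mean(vals), stdev(vals))
            for slot, vals in buckets.items() if len(vals) >= 2}


def is_anomalous(baseline, ts, value, tolerance=3.0):
    """True if value is more than `tolerance` standard deviations away from
    what is normal for this weekday and hour. Slots with no history never alert."""
    slot = (ts.weekday(), ts.hour)
    if slot not in baseline:
        return False
    avg, spread = baseline[slot]
    return abs(value - avg) > tolerance * max(spread, 1e-9)
```

The same reading can be routine on a Monday morning and suspicious on a Sunday night, which is exactly what a single fixed threshold cannot express.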
3. Don’t create new blind spots.
Virtualized environments make it easy for IT users to provision new guests. Some companies have the equivalent of a self-serve portal that allows individuals to spin up new servers at will. If these fly under the radar of your monitoring system, you’re creating new blind spots.
This could be an especially costly blind spot if the new server supports a mission-critical virtualized application. You’ll probably have no warning of a service issue until you hear about it from users.
Be sure everyone involved in creating guests understands how to connect new virtual machines to your network monitoring software.
Better yet, choose a monitoring system that automatically detects and monitors new servers. If your people have to complete a checklist to get every new server synchronized with the monitoring system, some servers will almost certainly fall through the cracks.
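At its core, automatic detection is a reconciliation between what the hypervisor says exists and what the monitoring system is watching. The Python sketch below shows that idea only. The function names are hypothetical, and the `register` callback stands in for whatever mechanism your monitoring system actually provides for adding a server.

```python
def find_unmonitored(hypervisor_guests, monitored_guests):
    """Return the guests known to the hypervisor but missing from monitoring, sorted by name."""
    return sorted(set(hypervisor_guests) - set(monitored_guests))


def reconcile(hypervisor_guests, monitored_guests, register):
    """Register every unmonitored guest and return the names that were added.

    register: callable taking a guest name (a placeholder for the real
    monitoring system's enrollment step, which is assumed here).
    """
    added = find_unmonitored(hypervisor_guests, monitored_guests)
    for name in added:
        register(name)
    return added
```

Run on a schedule, a check like this closes the gap left by self-serve provisioning without relying on anyone to complete a checklist.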
Overall, virtualization helps us reduce the time to value for our products and services by making hardware, networks, and data centers more efficient and safe. It can also make service issues more complex and mysterious, unless we learn how to keep our eyes open.