Effective monitoring means that IT staff get information about misbehavior and exceptions quickly and accurately enough to start diagnosis and response before users begin an escalating cascade of reports and complaints. Without effective monitoring, IT lives in “reactive mode”: it doesn’t start dealing with problems until users call in to report (or complain about) them. In this mode of operation, IT spends so much time fixing problems that little is left for planning for growth, evaluating new technologies, deploying new solutions, and all the other things that permit information technology to help companies and organizations be more productive.
Think about the dynamics involved. If the help desk or support team waits until users start complaining before acting, problems may well have been present for some time before reports start rolling in. Also, what users perceive as problems are usually symptoms rather than meaningful indicators of root causes or fundamental issues. By the time users e-mail a report to the help desk, they have probably tried repeatedly to make something work on their own, then gone to coworkers and colleagues for help and discussion. It can take 45 minutes or longer for a perceived issue to become a reported issue, and only then does a trouble ticket start the response machinery going.
Alas, when IT works in reactive mode, it is also building up technology debt. Time spent firefighting is time taken away from investigating new technologies, planning upgrades and replacements, and deploying more productive and valuable solutions and better security. This keeps IT behind the curve, expending time and effort on fighting fires while making little or no forward progress in keeping up with the relentless march of technology.
What’s the alternative? A more proactive approach to IT management, based on the understanding that effective monitoring permits preemptive response. If a company Web site’s typical response time jumps from 1 second for a page download to 5 seconds, monitoring can issue alerts to IT staff as soon as that happens. They can begin investigating and start working on diagnoses and fixes or workarounds, perhaps before users even notice that things are slowing. The same goes for resources like Internet bandwidth and disk space: by setting alert thresholds that trigger as capacity comes close to being exhausted, instead of waiting for a bottleneck to appear, IT can start making arrangements to add resources, or to curb unwanted or low-priority consumption, before users notice that systems have run out.
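As a rough illustration of threshold-based alerting, here is a minimal Python sketch. It is not tied to any particular monitoring product. The metric names, the `Threshold` class, and the specific limits (3 seconds, 85%, 90%) are assumptions made up for this example; only the jump from 1 second to 5 seconds comes from the scenario above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Threshold:
    """Hypothetical alert rule: fire when a metric reaches `limit`."""
    metric: str
    limit: float
    unit: str


def evaluate(rules: List[Threshold], readings: Dict[str, float]) -> List[str]:
    """Return one alert message per rule whose latest reading reached its limit."""
    alerts = []
    for rule in rules:
        value = readings.get(rule.metric)
        if value is not None and value >= rule.limit:
            alerts.append(
                f"ALERT {rule.metric}: {value:g}{rule.unit} "
                f">= {rule.limit:g}{rule.unit}"
            )
    return alerts


# Illustrative limits only; real values depend on your own baselines.
RULES = [
    Threshold("page_response_time", 3.0, "s"),  # typical page load is ~1 s
    Threshold("disk_used", 85.0, "%"),          # warn before the disk fills
    Threshold("bandwidth_used", 90.0, "%"),     # warn before the link saturates
]

# Sample readings: response time has jumped from 1 s to 5 s.
readings = {"page_response_time": 5.0, "disk_used": 62.0, "bandwidth_used": 91.5}
alerts = evaluate(RULES, readings)

for message in alerts:
    print(message)
```

With these sample readings, the response-time and bandwidth rules fire while the disk rule stays quiet. In practice the limits would be tuned to sit far enough below the point of failure that staff have time to react.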
One proactive approach to IT management is implementing an Application Performance Management (APM) tool. APM tools, like Stackify Retrace, empower development teams to find performance issues in their code before end users are impacted. Retrace’s automated alerts notify users when a metric goes beyond a specified threshold, allowing for proactive troubleshooting. For example, suppose your API Gateway limits you to 10,000 requests per second out of the box. Setting an alert in Retrace for when traffic reaches 70% to 80% of that limit lets you proactively optimize your application for increased requests.
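To make the arithmetic concrete: 70% to 80% of 10,000 requests per second is 7,000 to 8,000 requests per second. The Python sketch below shows that logic in isolation. It is not Retrace's API (in Retrace the alert would be configured in the tool itself), and the function name and the "warning"/"critical" labels are hypothetical.

```python
RATE_LIMIT_RPS = 10_000    # requests-per-second limit cited in the text
WARN_FRACTION = 0.70       # start paying attention at 70% of the limit
CRITICAL_FRACTION = 0.80   # act before traffic passes 80% of the limit


def classify_request_rate(current_rps: float,
                          limit_rps: float = RATE_LIMIT_RPS) -> str:
    """Classify the current request rate relative to the limit."""
    utilization = current_rps / limit_rps
    if utilization >= CRITICAL_FRACTION:
        return "critical"
    if utilization >= WARN_FRACTION:
        return "warning"
    return "ok"


# Example request rates, chosen only for illustration.
for rps in (6_500, 7_500, 8_500):
    print(f"{rps} rps -> {classify_request_rate(rps)}")
```

Using two levels rather than one gives the team an early heads-up at 70% and a stronger signal at 80%, while there is still headroom before requests start being throttled.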
When monitoring is effective and efficient, IT can put its most valuable personnel and resources to work on “fire prevention” (avoiding outages or service interruptions before they occur), investigating new tools and technologies, planning for upgrades and deployments, and so on. Even when the worst case occurs and something does break or fail, effective monitoring helps reduce the time between that event and its repair because it provides detailed, focused information about causes and effects, ideally before user complaint calls start rolling in.
Though effective monitoring may seem like a big and potentially expensive investment of time, effort, or money, it doesn’t have to be if you pick the right vendor, one that offloads the typical headaches. Establishing effective monitoring is one of the most important things you can do to protect your organization’s digital infrastructure, and it will pay immediate dividends, particularly in enhanced user satisfaction and increased productivity.