So, alerting vs reporting: Are you using your alerting system for something that is better suited to periodic reporting?
You say Tomayto, I say Tomahto … let’s call the whole thing off.
On second thought, let’s not. Better yet, let’s see if we can understand the differences and move on from there. I’ve been an SE in the systems and network management field for a long time. Over the last 13 years I’ve encountered more situations than I can count where we start an implementation and the customer proceeds to use the terms “alerting” and “reporting” interchangeably. However, unlike potatoes, tomatoes, neithers, and eithers, and with all due respect to the Brothers Gershwin, there is a significant difference between reporting and alerting, and there is a correct use for each.
Loyal fans of this blog (I’ve never taken a formal poll, but I’m sure you’re out there … somewhere) will recall that I’ve explained the acronym “FCAPS”. It stands for “Fault, Configuration, Accounting, Performance, and Security”. FCAPS is a nice way to sum up what most network and systems management applications do in one respect or another. When we examine the distinction between reporting and alerting, it might be helpful to think of alerting as something that’s paired with the “F”, “C”, and “S” sides of an application, whereas the “A” and “P” fall into the reporting wheelhouse.
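As a rough illustration of that pairing, here is a minimal Python sketch. The names (`Channel`, `FCAPS_CHANNEL`, `default_channel`) are hypothetical and invented for this post; they are not taken from any NMS product, and the mapping is only the rule of thumb above, not a hard standard.

```python
from enum import Enum


class Channel(Enum):
    ALERT = "alert"    # needs someone's immediate attention
    REPORT = "report"  # reviewed periodically for trends


# Hypothetical default mapping that follows the rule of thumb in the text:
# F, C, and S lean toward alerting; A and P lean toward reporting.
FCAPS_CHANNEL = {
    "fault": Channel.ALERT,
    "configuration": Channel.ALERT,
    "accounting": Channel.REPORT,
    "performance": Channel.REPORT,
    "security": Channel.ALERT,
}


def default_channel(category: str) -> Channel:
    """Return the default channel for an FCAPS category name."""
    return FCAPS_CHANNEL[category.strip().lower()]
```

Treat this as a starting default only. A real environment will have exceptions in both directions.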
Grease the Squeaky Wheel
Assuming my assertion is correct and alerting does more closely align with faults, configuration, and security, then what exactly is it and when should it be used? The simple answer: when you, as the person responsible for a given IT resource, must take immediate action if there is a problem with said resource. I regularly train administrators about network management best practices in general and Netreo’s products in particular. I preach this simple rule of thumb: In your environment, if you get an email or text alert about a problem and you are not prepared to DROP EVERYTHING to fix that problem RIGHT THIS INSTANT, then it very likely isn’t something that needs to be alerting you. Am I using hyperbole here? Yes. However, it hopefully hammers home the point of what alerting really is. If you’re getting alerts for things that aren’t particularly serious issues, then they probably weren’t alert-worthy to begin with. For the perils of over-alerting, take a look at this blog post. And, of course, since you’ve now been converted to a lifelong fan of these monthly ramblings, stay tuned. At some point in the near future I’ll expand on this idea and talk about establishing an alerting policy for your IT infrastructure.
Alerting vs reporting: Don’t Lose Sight of the Forest for the Trees
At this point I can hear the peanut gallery start to voice their objections. “Your rule of thumb makes no sense. I need my NMS to report to me not only when things are broken, but when they’re close to being broken as well. Otherwise I can’t be proactive. I’m no better off than before. I’m still fighting fires and running from one problem to the next.” My response to this criticism is “Of course you are. You’re using your alerting system for something that is better suited to periodic reporting”. That’s the whole point of a structured reporting strategy. Reports give you an idea of long-term trends and emerging problems. Contrary to popular belief, IT reporting isn’t just aimed at the Pointy-Haired Bosses among us for business-centric reasons. Reporting, if designed and implemented correctly, highlights issues that do need to be addressed, but aren’t things worthy of clogging up our inboxes and NOC dashboards. A lack of reporting combined with over-alerting ultimately masks the truly critical, drop-everything problems.
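To make the split concrete, here is a minimal Python sketch of routing readings to either an immediate alert or a periodic report. Everything in it is an assumption for illustration: the `Event` fields, the `route` and `weekly_report` functions, and the threshold values are hypothetical and do not reflect any particular product’s API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Event:
    resource: str       # e.g. a device or server name
    metric: str         # e.g. "cpu" or "disk"
    value: float        # current reading, in percent
    warn_at: float      # "close to being broken" level
    critical_at: float  # "drop everything" level


def route(event: Event) -> str:
    """Decide where a reading belongs: 'alert', 'report', or 'ignore'."""
    if event.value >= event.critical_at:
        return "alert"   # someone must act right now
    if event.value >= event.warn_at:
        return "report"  # worth tracking, not worth a page
    return "ignore"


def weekly_report(events: list[Event]) -> list[str]:
    """Summarize warning-level readings per resource and metric."""
    counts = Counter(
        (e.resource, e.metric) for e in events if route(e) == "report"
    )
    return [
        f"{resource} {metric}: {n} warning-level readings"
        for (resource, metric), n in counts.most_common()
    ]
```

With this arrangement, a disk at 97% against a critical level of 95% pages someone, while a CPU that keeps hovering at 85% against a warning level of 80% shows up as a count in the weekly summary instead of landing in an inbox every few minutes.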
Shall We Dance?
Remember, if you’re ever tempted to “call the whole thing off”: when all is said and done, alerting and reporting, not unlike Fred Astaire and Ginger Rogers, dance nicely together. Both are essential parts of any NMS. That holds whether the desired end state is to avoid a Network Operations Center dashboard that looks like an electronic version of the Catalonian flag (bonus points if you know the flag design without a visit to Google Images) or to get more proactive in solving technical issues with your customers and clients. In alerting vs reporting, each construct has its proper place. The idea is to know how and where to deploy each.