
How do you monitor your network? There are a myriad of technologies and tools out there, each providing different benefits and challenges. Today we are going to focus on one specific area, Simple Network Management Protocol (SNMP) Traps. That’s right, we are going narrow here, not just focusing on SNMP but on one specific portion of the protocol: namely the ability of devices that support SNMP to send alert information to collectors.
SNMP History
Before we get into the details, let’s take a very brief look at the history and basics of SNMP:
- The original RFCs for SNMP (RFC 1065 through RFC 1067, with RFC 1067 defining the protocol itself) appeared in 1988 and became known as SNMPv1
- SNMPv1 rapidly became the standard for network monitoring
- Since then, SNMPv1 has been supplanted first by SNMPv2 (formalized in 1993 but never widely used; SNMPv2c, formalized in 1996, became the replacement for v1 instead) and finally by SNMPv3 (formalized in 2004, though v2c is still widely used today)
- SNMP consists of a variety of different protocol data units (PDUs, which are basically different ways data can be sent and received between devices). SNMPv1 consisted of five PDUs, SNMPv2 introduced two more PDUs and SNMPv3 introduced one more
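As a quick reference, the PDU counts above can be laid out as a small lookup table. The PDU names below are the commonly cited ones and are not spelled out in this post, so treat the grouping as a sketch rather than a definitive reading of the RFCs. The names `PDUS_INTRODUCED` and `pdu_count_through` are purely illustrative, and note that SNMPv2 also reworked the trap format (as SNMPv2-Trap), which this tally does not count as a new PDU.

```python
# PDU types grouped by the SNMP version commonly credited with introducing them.
# Illustrative tally only; dict order (oldest to newest) matters for the count.
PDUS_INTRODUCED = {
    "SNMPv1": ["GetRequest", "GetNextRequest", "GetResponse", "SetRequest", "Trap"],
    "SNMPv2": ["GetBulkRequest", "InformRequest"],
    "SNMPv3": ["Report"],
}


def pdu_count_through(version: str) -> int:
    """Return the cumulative number of PDU types up to and including `version`."""
    total = 0
    for name, pdus in PDUS_INTRODUCED.items():
        total += len(pdus)
        if name == version:
            return total
    raise ValueError(f"unknown SNMP version: {version}")


if __name__ == "__main__":
    for v in PDUS_INTRODUCED:
        print(v, pdu_count_through(v))
```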
SNMP Traps
Now that we have covered a bit about SNMP, let’s get back to the topic of SNMP Traps. The Trap is one of the original PDUs included in the v1 release of SNMP. The purpose behind traps was to allow devices to send notifications when specific criteria were met. For example, when a system’s CPU exceeds a certain threshold, a trap could be configured to be sent.
The same was true of other monitored metrics for other devices (e.g. memory, disk, network, etc.). By allowing systems to notify a management station when thresholds were being exceeded, the theory went, administrators could find out about problems before things went totally sideways (this sounds vaguely familiar…).
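To make the idea concrete, here is a minimal sketch of the decision a device makes before sending a trap. It is plain Python, not a real SNMP implementation: `TrapNotification` and `evaluate_metric` are hypothetical names, the 80% threshold is an assumed value, and a real agent would encode the result as a Trap PDU and send it to the collector (conventionally over UDP port 162) instead of returning an object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrapNotification:
    """Hypothetical stand-in for the data a trap would carry."""
    device: str
    metric: str
    value: float
    threshold: float


def evaluate_metric(device: str, metric: str, value: float,
                    threshold: float) -> Optional[TrapNotification]:
    """Return a notification only when the metric exceeds its threshold."""
    if value > threshold:
        return TrapNotification(device, metric, value, threshold)
    # Below the threshold the device stays silent: nothing is sent at all.
    return None


if __name__ == "__main__":
    print(evaluate_metric("router1", "cpu_percent", 93.0, 80.0))
    print(evaluate_metric("router1", "cpu_percent", 42.0, 80.0))
```

The second call returns nothing, which is the whole point of the design: the management station hears from the device only when the criterion is met.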
Advantages of SNMP Traps (Historically…)
SNMP Traps served an important purpose, and at a time when there were limited other options for monitoring, traps were extremely important. Some of the benefits of traps included:
- Bandwidth utilization – In a time when 5Mbps (yes, megabits) was a fast backbone, limiting the amount of bandwidth used for management was critical
- Limiting cost – Disk space, memory and CPU were far more expensive, meaning the collection of data was much more expensive. Collecting data only when an issue occurred was seen as a way to save money
- Productivity – Networks were nowhere near as reliable in the 1990s as they are today (and as someone who spent a lot of time trying to get 10Base2 terminated correctly, I can verify this). Network engineers spent a lot more time troubleshooting physical network connectivity problems, so they didn’t have the same amount of time to deal with non-critical performance issues
Disadvantages of SNMP Traps
While SNMP Traps remain the most practical solution for things like power supply failures, fan failures and other physical resource failures, traps also have numerous disadvantages, including:
- Timing – SNMP Traps are sent after a specific metric has been violated. This means that by the time you are alerted, it is probably already too late
- False positives – To avoid the issue of timing, you can adjust thresholds so alerts are sent earlier. Unfortunately, this results in a significant number of false positives, wasting resources on troubleshooting what may not be an actual problem
- Static thresholds – SNMP Traps are entirely dependent on a specific threshold for a specific metric being violated (e.g. if CPU exceeds 80% utilization). While there is value in knowing if a value has risen above or fallen below a specific threshold, the lack of additional context means that potentially harmless, one-time events (or perfectly legitimate processes) will generate an alert
- Limited visibility – Because SNMP Traps are only sent when an issue occurs, you have no way to get ongoing performance information about your environment. This makes it much more difficult (or impossible) to take advantage of modern analytics (such as AIOps) to move from a reactive to a proactive monitoring model
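The static-threshold problem is easy to see with a few numbers. The sketch below uses made-up CPU samples and an assumed 80% threshold, and the function names are illustrative only. It contrasts a trap-style check, which fires on every sample over the line, with a check that needs the surrounding samples for context and only reports breaches sustained for several consecutive readings. The second check is only possible when you have the continuous data that traps alone do not provide.

```python
from typing import List, Tuple


def static_breaches(samples: List[float], threshold: float) -> List[int]:
    """Indices of every sample above the threshold (what a trap would fire on)."""
    return [i for i, value in enumerate(samples) if value > threshold]


def sustained_breaches(samples: List[float], threshold: float,
                       min_consecutive: int) -> List[Tuple[int, int]]:
    """(start, end) index pairs of runs above the threshold lasting at least
    `min_consecutive` samples. Requires the full series, not just alerts."""
    runs = []
    start = None
    for i, value in enumerate(samples):
        if value > threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_consecutive:
                runs.append((start, i - 1))
            start = None
    # Close out a run that is still open at the end of the series.
    if start is not None and len(samples) - start >= min_consecutive:
        runs.append((start, len(samples) - 1))
    return runs


if __name__ == "__main__":
    cpu = [35, 40, 95, 38, 42, 85, 88, 91, 90, 41]  # invented sample data
    print(static_breaches(cpu, 80))        # [2, 5, 6, 7, 8]
    print(sustained_breaches(cpu, 80, 3))  # [(5, 8)]
```

The one-off spike at index 2 generates an alert under the static check but is filtered out once the neighbouring samples are taken into account.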
Moving Beyond SNMP Traps
So what is the takeaway here? Well, put succinctly, SNMP Traps had their 15 minutes of fame, but with modern technology you’re likely missing out on the myriad benefits that come with new-age monitoring capabilities. Just getting an alert when a static threshold is violated means you are unable to do any of the following:
- Capacity planning – Without continuous data on performance, you are unable to get an idea of how systems are performing overall. Is a system running at 89% capacity 99% of the time and only exceeding the threshold 1% of the time, or is the system running at 5% capacity 99% of the time? These are very different scenarios and need to be treated very differently
- Advanced analytics – Just looking at data when a trap is sent limits the ability of technologies such as AIOps to look at the overall picture of a system and determine – in advance – when problems are going to occur
- Root cause analysis – Knowing there is a problem is very different from figuring out why the problem is occurring and resolving it. Without sufficient data, it is near impossible to identify the cause of a problem without significant additional investigation. Even then, identification may be impossible, since the information may no longer be available
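The capacity planning scenario above can be worked through with simple arithmetic. In this sketch the 90% threshold, the 95% spike value and the 100-sample window are all assumptions chosen for illustration, and the names are hypothetical. Both systems breach the threshold exactly once, so a trap-only view cannot tell them apart, while the polled averages are very different.

```python
from statistics import mean
from typing import Dict, List

THRESHOLD = 90.0  # assumed alert threshold, in percent

# 100 invented utilization samples per system: 99 "normal" readings plus one spike.
busy_system = [89.0] * 99 + [95.0]
idle_system = [5.0] * 99 + [95.0]


def trap_count(samples: List[float], threshold: float = THRESHOLD) -> int:
    """Number of samples that would have triggered a trap."""
    return sum(1 for value in samples if value > threshold)


def summarize(samples: List[float]) -> Dict[str, float]:
    """What a trap-only view sees (traps) next to what polling reveals (mean)."""
    return {"traps": trap_count(samples), "mean": round(mean(samples), 2)}


if __name__ == "__main__":
    print("busy:", summarize(busy_system))  # {'traps': 1, 'mean': 89.06}
    print("idle:", summarize(idle_system))  # {'traps': 1, 'mean': 5.9}
```

One trap each, yet one system is a single bad minute away from saturation and the other is nearly idle.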
If your network monitoring solution is heavily dependent on SNMP Traps, it is time to take a look at what you are monitoring, how you are monitoring it, and whether you are getting real value from your monitoring solution. Netreo can help address your monitoring gaps with the latest capabilities, and we can support your need for SNMP Traps where necessary. Request a demo of Netreo today.
About the Author
Josh Chessman is VP of Products at Netreo and brings more than 30 years of experience in IT. He was most recently Senior Director, Analyst, speaker and author in the Information Technology Leaders division within Gartner. His unique blend of strategic market knowledge and hands-on enterprise network management experience provides valuable insights to Netreo, our customers and the IT community.