In a previous blog post, we dove into the wayback machine and looked at Simple Network Management Protocol (SNMP) Traps, a technology that allows devices (including network devices) to send alerts when specific thresholds have been reached. In this post, we are going to be a bit more forward-looking and discuss some technologies that will, in theory, replace SNMP.
It is important to keep in mind that the demise of SNMP has been predicted for years (actually decades). A quick Google search for “demise of snmp” shows links from as far back as 2006 on the first page, along with numerous links discussing why SNMP is either dead or dying.
This leads to some very interesting questions:
- Is SNMP really at the end of its life?
- What are some alternatives to SNMP?
- Is this something I have to worry about right now?
- What is Netreo doing around this stuff?
Let’s look at each of these questions in turn.
Is SNMP Really at the End of its Life?
The short answer to this question is no. People have been predicting the demise of SNMP for years and it has never come to fruition. SNMP has several things going for it that make it incredibly hard to replace:
- SNMP is a formal standard that can be, and has been, implemented by almost every organization making devices that need to have data collected from them. Anyone can find the SNMP specifications and implement them on their devices. Because SNMP has been around for so long, it is the de facto standard for collecting data.
- SNMP is supported by virtually every device. It doesn’t matter if the device is brand new or twenty years old, it supports SNMP. It doesn’t matter if the device is a router, firewall, switch or wireless access point, it supports SNMP. The fact that virtually every network device supports SNMP makes it very convenient. It also takes away much of the headache of ensuring proper support from tools.
- SNMP just works. Once you get SNMP up and running, it does what it is supposed to do and it does it well. Sure, there are configuration quirks, and it is not very secure. If you are running on an internal, secure network, the level of security provided by SNMPv3 is likely fine. If you are running SNMP on a publicly accessible network, you really should stop doing that!
While there are numerous alternatives to SNMP, the reality is that SNMP is here to stay for the foreseeable future. Even if organizations start moving to some alternative technology, they are still going to have legacy devices in their environment that require SNMP (or lack support for a new technology). That (perhaps more than anything else) has helped keep SNMP around for so long.
What are Some Alternatives to SNMP?
Over the past decades, numerous alternatives to SNMP have been proposed, introduced, developed and deployed. For better or worse (see the previous section), almost none of them have made significant headway in replacing SNMP.
Before we discuss the alternatives, it is important to understand that there are at least three distinct types of devices that need to be monitored. These devices are servers (that can include systems such as Linux, Microsoft Windows, MacOS, BSD Unix and more), network devices (such as routers, switches, firewalls, etc.), and Internet of Things (IoT) devices (such as IP cameras, motion detectors, vehicle sensors, etc.). Across these three segments (and this discussion could easily be broken out into different categories) SNMP is widely used for everything except Microsoft products. These different categories come into play because right now there is no one alternative that looks like it will be the replacement for all categories.
The first alternative to discuss, and arguably the most successful one, is Windows Management Instrumentation (WMI) (and its various children) from Microsoft. SNMP was included with early versions of Windows, but due to concerns around reliability, Microsoft deprecated SNMP in Windows 2000 and stopped installing it by default after that (though it is still available for manual installation).
Instead of using SNMP, Microsoft decided to take an alternative approach. This approach is based on standard technologies; however, it is designed entirely and exclusively to work with Microsoft Windows, which significantly limits its reach. One could make a decent argument that WMI (and its successors such as Windows Management Infrastructure (MI)) is successful mainly because Microsoft Windows is successful. There is nothing inherently bad about that, but it does mean that at least one other standard is needed, since WMI and MI are Microsoft-centric and have very limited reach beyond Microsoft Windows environments.
Beyond Microsoft Windows environments there are numerous other agent-based solutions that are looking to replace SNMP on the server-side of data generation. These include technologies like OpenTelemetry, Prometheus, Telegraf, Collectd and more. Each of these leverages agents to aggregate data, which is then forwarded to collectors. There are pluses and minuses to this sort of an approach. Pluses include:
- Agents can allow much more detailed information to be gathered.
- Agents can be customized to be very specific to the environment they are running in.
- Because the technology these agents are built on is much newer than SNMP, they operate more efficiently, effectively and securely.
Minuses include:
- Agents have to be customized and built individually for each environment (and often software version) they are to monitor.
- Agents must be installed separately from the base operating system. For example, Microsoft WMI is included by default in Microsoft Windows, whereas an agent will need to be installed separately.
- Because there are no recognized standards, each vendor does things differently. Once you start deploying your agent of choice, you’re locked into a specific technology.
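To make the "agents aggregate data, which is then forwarded to collectors" idea concrete, here is a minimal sketch in Python. It is not how OpenTelemetry, Prometheus, Telegraf or Collectd are implemented; the `MiniAgent` class, the metric name and the JSON payload layout are all hypothetical, and the "collector" is just an in-memory list standing in for a network endpoint.

```python
import json
import statistics
import time
from typing import Callable, Dict, List


class MiniAgent:
    """Hypothetical agent: buffers local samples, then forwards one summary."""

    def __init__(self, host: str, forward: Callable[[str], None]) -> None:
        self.host = host
        self.forward = forward  # stand-in for a network send to a collector
        self.samples: Dict[str, List[float]] = {}

    def record(self, metric: str, value: float) -> None:
        """Buffer one raw sample locally; nothing is sent yet."""
        self.samples.setdefault(metric, []).append(value)

    def flush(self) -> None:
        """Aggregate buffered samples and forward them as one JSON payload."""
        if not self.samples:
            return
        summary = {
            name: {
                "count": len(values),
                "min": min(values),
                "max": max(values),
                "mean": statistics.fmean(values),
            }
            for name, values in self.samples.items()
        }
        payload = json.dumps({"host": self.host, "ts": time.time(), "metrics": summary})
        self.forward(payload)
        self.samples.clear()


# Usage: the "collector" here is only a list that receives payloads.
received: List[str] = []
agent = MiniAgent("web-01", received.append)
for cpu in (10.0, 20.0, 30.0):
    agent.record("cpu_percent", cpu)
agent.flush()
```

Three raw samples leave the host as a single summarized message. That is where much of the efficiency of an agent comes from, and also why each agent has to understand the environment it runs in.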
On the other side of things, we have network devices and IoT devices. Each of these still needs to be monitored (with or without SNMP). Unfortunately, the agent-based solutions just discussed are not well suited to these types of devices. The agents often require more resources than are available, operating systems that are not available, or other technologies that would simply increase the cost of the devices. While many devices do run on Linux, their limited memory and CPU constrain what these agents can do, and each agent would need to be custom built for each individual device type and version.
One technology that continues to be pushed as an alternative to SNMP is streaming telemetry. Streaming telemetry takes a different approach from SNMP. Whereas SNMP polls each device (i.e., sends a request to each device and waits for a response with the requested data), streaming telemetry has the device send data continuously, and collectors effectively subscribe to the data they want or need. As with everything, there are pluses and minuses to this approach. Some pluses include:
- Data is sent automatically without the need to issue a request.
- Multiple collectors can receive the data simultaneously (or near-simultaneously) without having to send multiple data requests (this can also help reduce bandwidth and load on the devices).
- Newer data types can be included, and the technology can be much more extensible than SNMP.
- Streaming telemetry, because it pushes data, can be closer to real-time (since there is no need to wait for the next poll, which may be several minutes away).
Some minuses include:
- Very few devices currently support streaming telemetry, limiting its appeal.
- There is no one streaming telemetry standard that everyone conforms to, complicating the process of collecting the data and potentially leading to significant vendor lock-in.
- Many data collection and analysis tools do not currently support streaming telemetry (though there is definitely a chicken and egg thing here).
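The difference between polling and subscribing can be sketched with a small in-process model. This is only an illustration under simplifying assumptions: the `Device` class and its methods are hypothetical, there is no network involved, and it does not represent any particular streaming telemetry protocol or vendor implementation.

```python
from typing import Callable, List


class Device:
    """Hypothetical device exposing one counter two ways: poll and push."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value = 0
        self.requests_served = 0  # number of poll requests answered
        self.subscribers: List[Callable[[str, int], None]] = []

    def poll(self) -> int:
        """Polling model: a collector asks, the device answers once."""
        self.requests_served += 1
        return self.value

    def subscribe(self, callback: Callable[[str, int], None]) -> None:
        """Streaming model: a collector registers interest a single time."""
        self.subscribers.append(callback)

    def update(self, value: int) -> None:
        """A new reading is pushed to every subscriber without any request."""
        self.value = value
        for notify in self.subscribers:
            notify(self.name, value)


device = Device("switch-1")
collector_a: List[int] = []
collector_b: List[int] = []
device.subscribe(lambda name, value: collector_a.append(value))
device.subscribe(lambda name, value: collector_b.append(value))

for reading in (5, 7, 9):
    device.update(reading)

# A poller only sees whatever value is current at the moment it asks.
polled = device.poll()
```

In this sketch both subscribed collectors see every reading without sending a single request, while the poller sees only the latest value and misses the intermediate ones.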
Of course, this is discussing servers (whatever operating system they are running) and network devices. We have not even touched on IoT yet. While there is no reason IoT devices couldn’t run some sort of streaming telemetry, there are (of course) numerous challenges:
- There is no single standard for streaming telemetry. If a vendor chooses the “wrong” standard, they may have a dead end product. Even if they choose the “right” standard, anyone choosing different standards (even temporarily) will be unable to monitor their technology. To paraphrase, no one ever got fired for buying SNMP.
- IoT vendors tend to put the absolute minimum required amount of hardware in their devices. This allows them to keep costs low, which then allows for less expensive devices (plus the potential for more profit). If vendors need to start upping the amount of memory or the power of the CPU, that will start to impact the cost of the device. And worse, since there is no streaming telemetry standard (but SNMP is still ubiquitous), a competitor could easily steal business by offering a lower-cost option with SNMP instead of a new streaming telemetry choice.
- Like many other technologies, IoT devices tend to have decently long lifespans. I have a few Internet-connected thermostats in my house. They are around five years old, and I have no intention of replacing them for as long as possible. While a home example may not map perfectly onto a business environment, the reality is that even businesses are (rightfully) unwilling to replace devices if they don’t have to. If a thermostat has a usable, legitimate lifespan of a decade (which is probably too short in reality), why replace the device in five years, especially if the reason is solely to address a change in the underlying monitoring technology?
Is this something I have to worry about right now?
As with many things, there is no easy answer to this question. SNMP’s days as the primary (and arguably, sole) means of acquiring network performance metrics seem to be nearing their end. It seems likely that alternatives will continue to gain traction over the next several years. However, it seems unlikely that anything will fully supplant SNMP for at least the next decade.
There are at least two parts to the problem here:
- SNMP is ubiquitous. Replacing every device that uses SNMP in a short period of time (say a decade) is just unrealistic. The cost would be in the hundreds of billions of dollars and is entirely unnecessary. There’s no reason to replace devices with years of life left in them, simply because someone said SNMP is dead. One could argue that for low-cost devices it makes sense to change the device. But for a switch, router or firewall, the cost would be in the tens or hundreds of thousands of dollars. Do you really want to be replacing multiple devices every few years as monitoring technologies change?
- The technology and standards behind the various SNMP replacements are still being developed. Much like the war between Blu-ray and HD DVD, there are numerous options out there with no clear leader. And, unlike the Blu-ray vs. HD DVD battles, there is no one vendor who is able to force the issue. For example, if Cisco were to announce they were no longer supporting SNMP and only supporting their version of streaming telemetry, they might force headway but at the cost of a significant portion of their business. Not many organizations are going to be willing to upend their entire business and have to replace monitoring software just to accommodate a new network device.
Don’t Worry. But Ask Questions
So, while there are some vendors out there who are pushing very heavily on the concept of streaming telemetry, the reality is that the technology is not quite ready for prime time yet. However, because this is an up-and-coming technology that is likely to become a major part of network infrastructure monitoring in the future, here are a few questions to ask:
- What is my network equipment (or other equipment) vendor’s strategy around streaming telemetry? Do they have anything today and if so, what? Are they taking a wait-and-see approach? What are their five-year plans?
- Where am I in my lifecycle replacement process for relevant equipment? Are all my network devices going to be replaced over the next two years (and thus should I more seriously investigate streaming telemetry)? Did I just go through a replacement cycle and have five to ten years before my next cycle?
- Is my monitoring tool vendor looking into streaming telemetry and trying to evaluate the options to ensure they are ready to help customers as streaming telemetry becomes more prevalent? What technologies are my monitoring vendors looking into and why?
What is Netreo doing around this stuff?
Netreo is focused on providing the best infrastructure monitoring solution on the planet. We provide robust support for the most commonly used and desired methods of acquiring operational and performance data available today. This includes (but is not limited to) methods such as SNMP, WMI, WinRM, APIs, SSH, PowerShell and our own agent. We strive to use the best available method for each device type. We also look to support more than one method of data collection for a given device. For example, you can pull data from a basic Linux server using SNMP, SSH or our agent.
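As a general illustration of what supporting more than one collection method for a device can look like, here is a minimal sketch of an ordered fallback. It is not a description of how Netreo implements this. The function names, the preference order and the returned metric values are all hypothetical, and the collectors are stand-ins that perform no network access.

```python
from typing import Callable, Dict, List, Optional, Tuple

Collector = Callable[[str], Dict[str, float]]


def collect_with_fallback(
    host: str, methods: List[Tuple[str, Collector]]
) -> Tuple[Optional[str], Dict[str, float]]:
    """Try each (name, collector) in preference order; return the first success."""
    for name, collector in methods:
        try:
            return name, collector(host)
        except ConnectionError:
            continue  # this method is unavailable on the host; try the next one
    return None, {}


# Hypothetical stand-ins for real collectors.
def via_agent(host: str) -> Dict[str, float]:
    raise ConnectionError("agent not installed")


def via_ssh(host: str) -> Dict[str, float]:
    return {"load1": 0.42}


def via_snmp(host: str) -> Dict[str, float]:
    return {"load1": 0.40}


used, metrics = collect_with_fallback(
    "linux-01", [("agent", via_agent), ("ssh", via_ssh), ("snmp", via_snmp)]
)
```

Here the agent is unavailable, so the data comes from the SSH stand-in and the SNMP collector is never called.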
As part of providing the best solution possible, we keep a careful watch on new technologies as they arise. We combine this with careful listening to our customers and prospects to identify areas requiring further investigation on our part. During our investigations, we always ask ourselves the following questions:
- Is this telemetry standard ready for use in a majority of environments?
- Does this type of telemetry make sense for the use cases we are discussing?
- What are the benefits of implementing this technology?
- Will this benefit a majority of our customers?
- Is there another standard still in development that may supplant this standard?
- What technology is this standard replacing and why?
As we look at these (and other) questions in relation to new technology, we make decisions about what standards we should implement that will best allow us to serve our existing and future customers.
Currently, we are investigating technologies such as OpenTelemetry and Kafka to see where they fit within the market and industry. We always map how a new technology would impact our current use cases. As existing and new technologies develop, we look to identify those that best meet the needs of our clients and our business.