What the Heck is an AIOp?

AIOps is one of the current buzzwords (buzz-initialisms?) that are hot in the monitoring space. Everyone seems to be talking about it: how you have to have it, how much better it will make everything if only you had it, etc. But how much of that is real and how much is wishful thinking? Let’s take a look and see if we can separate the buzz from the words.

AIOps was first used by Gartner in 2016 and was defined as: “… an industry category for machine learning analytics technology that enhances IT operations analytics. AIOps is the acronym[sic] of ‘Algorithmic IT Operations.’” As with most things in the “best of intentions” category, everyone looked at AIOps and assumed AI meant artificial intelligence. 

A Market is Born … sort of

Thus, a market was born, with some vendors mistakenly listing the original definition as Artificial Intelligence and not Algorithmic. While this was probably to be expected, it is unfortunate since it has, in many ways, adversely impacted the potential that AIOps could bring to the IT industry. Several challenges exist with the current state of AIOps including:

  • There is no such thing currently as a true, working AI. This is one of those cases where we know what AI is, what it should look like, how it should act, but no one has successfully built a true AI (but if you want some fun check out the movie Free Guy – definitely worth a view and AI plays into the theme).
  • There is no standard on what AIOps is or means. Every vendor takes a different approach to how they define AIOps, what they are doing with it and how it can help you. For example:
    • Some vendors are trying to provide “complete” AIOps solutions that analyze the data collected in your environment, identify which problems are real and which are not, perform root cause analysis (RCA), identify a solution and finally, implement that solution.
    • Other vendors are more narrowly focused, leveraging AIOps on just one or two of those areas (for example RCA or anomaly detection).
    • Still other vendors use AIOps for internal, self-tuning functionality (that’s what we here at Netreo do – more on that later).

Apples to Oranges

Every vendor is doing something different with AIOps and defining it differently. Consequently, having a conversation with multiple vendors or, worse, comparing their AIOps solutions ranges from really difficult to virtually impossible. How can you talk to three different vendors about their AIOps options if each of those options is entirely different? For example, looking at three different vendors:

  • #1 provides a tool that collects significant amounts of data from your environment (across network, infrastructure and application via SNMP, agents, APIs, etc.) and then correlates that data looking for anomalies.
  • #2 does no data collection but analyzes other vendors’ collected data looking for anomalies and then attempts to perform RCA on the identified anomalies.
  • #3 only uses their own collected data and attempts to look for anomalies and also automate solutions to those anomalies.

These three examples all exist in some form or another, and all are different, sometimes in major ways. Comparing vendor #1, who provides some data collection but then only looks for anomalies in that data, to vendor #2, who does no data collection but does more analysis, can be done but is very complex. If you choose vendor #1, you get the data collection piece but less of the analysis piece. If you choose vendor #2, you get more of the analysis piece but have to look elsewhere for data collection. And rest assured, the data collection vendors you turn to will probably tell you they do some AIOps as well.

A Short Aside

Before we continue, I want to clarify something. I truly believe that AIOps is the future of IT monitoring. I don’t see any way that, down the road, everyone (or almost everyone) isn’t leveraging AIOps in some form or another to help them move from being reactive (waiting for someone to report an issue, then manually identifying the root cause and developing and implementing a solution) to being proactive (all that reactive stuff done automatically). So far, we have no viable complete solutions. Some of the pieces are in place, and organizations should leverage those as best they can, while managing expectations appropriately.

Now back to our regularly scheduled program.

At Netreo, we decided that the best way to leverage AIOps is not to try to solve all our customers’ problems. Sure, being able to do RCA, incident remediation, automation, etc. would be great, and our customers would love us. But the reality is that tools trying to boil the ocean rarely succeed.

Instead, we looked at the different places machine learning (ML) could help and determined that the best place for AIOps today is tuning our tool so that it operates at peak performance and efficiency. Is AIOps: Autopilot as sexy as what some other vendors are doing? Nope. Does it work consistently and reliably and solve a real-world problem really, really well? Yup. Think about it: how many FTEs do you have tuning your monitoring solution? A half? A full? Multiple? What if you could get that down to one person spending a few hours a week? That is a real savings in both time and money.

An Ounce of Planning

The good news is that I have some thoughts on what organizations should do:

Make sure you fully understand what you are looking for with AIOps. Do not just start Googling the term AIOps and talking to the vendors that show up on the first page. Develop a plan that covers:

  • What your organization thinks AIOps is.
  • Where you think AIOps can help you (e.g. application performance, incident remediation, network operations, etc.).
  • What you hope to get out of an AIOps solution (e.g. easier tuning of monitoring tools by leveraging ML, or improved operations by reducing downtime via more rapid identification of application issues).
  • How you will deploy AIOps (e.g. as part of an existing monitoring solution, via an independent tool, etc.).
  • What your expectations are for the tool in the first three, six, 12 and 24 months.
  • How you will address issues as they arise (and they will).

Vendor Selection

Once you have a plan for AIOps in place, you can move to the next step of identifying vendors. Make sure you stay focused on your objectives. Vendors will likely try to sell you everything they can, and that may or may not fit your needs and plans.

When choosing a vendor, make sure they are on board with your expectations and are willing to commit to your timeline. Interestingly, this push usually comes from the vendor side: they want clearly defined objectives so they can close the deal. Here you are doing the same thing from the customer perspective, by clearly defining what the tool will do.

Manage your expectations. If you go into the process assuming you will find a tool that does everything and does it amazingly, you are pretty much guaranteed to be disappointed. If you go into the process with realistic expectations, you are much more likely to succeed.

Finally, if your needs align with leveraging ML for tuning your monitoring solution to deliver peak performance and efficiency, schedule a Netreo demo today!
