As many of us continue to work remotely and the strength of our IT processes is thoroughly tested, organizational leaders are on the lookout for any advantage that can give them greater control over possible business disruptions. Analytics can draw on key data sources across major systems of record to achieve this advantage, but it should not rely on structured data as the sole source of its insights.
Unstructured data is an oft-overlooked source of business value, one that can contain many rich insights. Many IT leaders struggle with making use of structured data that can be incomplete, lacking the fullness of view needed to identify incident commonalities or incident root causes. Arbitrary entries selected from drop-down menus in IT service platforms and other key systems of record can inject uncertainty into data-derived insights. Furthermore, “BLANK” or “NOT FOUND” entries can make things murky when IT leaders are attempting to elicit critical observations.
Unstructured data can supply the crucial link in the broken chains of structured data trails. Emails, open text field entries on IT tickets, descriptions on incident reports, and other sources of unstructured data can all fill in the blanks and complete the insight picture. However, it is incredibly difficult to parse unstructured data manually in order to derive such insights. The volume of work is high, and the signal-to-noise ratio is often low.
An NLP engine can automate the process of extracting insights from unstructured IT data. Machine learning and artificial intelligence based capabilities allow an NLP engine to recognize stop words, pass over filler words, identify cognates, and then apply analysis, such as by generating a word cloud or revealing similar topic clusters.
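The preprocessing described above can be sketched in a few lines. This is only a minimal illustration, assuming a tiny hand-written stop-word list and made-up ticket text; a production NLP engine would use far richer language resources. The resulting term counts are the kind of input a word cloud is built from.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption); real engines use much larger ones.
STOP_WORDS = {
    "the", "a", "an", "to", "is", "my", "and", "i", "it",
    "on", "in", "of", "for", "after", "cannot", "please",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def term_frequencies(descriptions):
    """Count the tokens that are not stop words across all descriptions."""
    counts = Counter()
    for text in descriptions:
        counts.update(t for t in tokenize(text) if t not in STOP_WORDS)
    return counts

# Hypothetical ticket descriptions.
tickets = [
    "Please reset my password, I cannot log in to the portal",
    "Password expired after the weekend",
    "The VPN is slow and the password prompt keeps looping",
]

freqs = term_frequencies(tickets)
print(freqs.most_common(3))
```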
The following are five major ways an NLP engine can provide essential value to businesses, allowing IT analytics to generate a clearer picture than ever of issues and opportunities that deserve their attention.
The Key: Topic Clustering of Similar Incident Types
Traditional methods of categorizing incidents may result in glaring blind spots without the aid of AI/ML techniques. Some types of incidents may be undercounted, and relationships between similar incidents may go unnoticed. Without awareness of these observational shortcomings, the true scale of patterns of incidents may never be understood.
An NLP engine can use topic clustering to identify correlated incidents. It does so by extracting critical text items that reveal an incident’s relationship with others. If one incident is miscategorized, or if similar incidents are categorized in several different ways, NLP can still pick up on the commonalities.
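To make the idea concrete, here is a deliberately simple sketch that groups incident descriptions by word overlap (Jaccard similarity) while ignoring their assigned categories. The greedy strategy, the threshold, the stop-word list, and the sample descriptions are all assumptions chosen for illustration; real engines typically rely on more sophisticated topic models.

```python
import re

# Illustrative stop-word list (an assumption).
STOP_WORDS = {"the", "a", "to", "my", "is", "i", "on", "in", "for", "and", "please", "cannot"}

def tokens(text):
    """Return the set of alphabetic, non-stop-word tokens in the text."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS}

def jaccard(a, b):
    """Share of tokens two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def greedy_clusters(descriptions, threshold=0.25):
    """Assign each description to the most similar existing cluster,
    or start a new cluster when nothing is similar enough."""
    clusters = []  # each entry: {"terms": set of tokens, "members": list of indices}
    for idx, text in enumerate(descriptions):
        toks = tokens(text)
        best = max(clusters, key=lambda c: jaccard(toks, c["terms"]), default=None)
        if best is not None and jaccard(toks, best["terms"]) >= threshold:
            best["members"].append(idx)
            best["terms"] |= toks
        else:
            clusters.append({"terms": set(toks), "members": [idx]})
    return [c["members"] for c in clusters]

# Hypothetical descriptions that could easily sit under different categories.
descriptions = [
    "Password reset needed for CRM",
    "Please reset my password on payroll",
    "Printer offline on floor 3",
    "Floor 3 printer is offline again",
]

groups = greedy_clusters(descriptions)
print(groups)
```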
As an example, a Numerify customer realized that they were having a large-scale password reset issue across a number of applications. Viewed through traditional structured data methods, the isolated incidents did not appear to affect any one application at a significant volume. However, creating a topic cluster defined by words like “password”, “login expiry” and “reset” from text descriptions revealed that the problem volume was actually quite high when viewed in aggregate across all applications.
Not only was NLP able to bring this issue to IT’s attention, but it also detected that an early resolution attempt, which gave users access to a single sign-in utility, had failed to gain traction. IT leaders realized that they should promote the single sign-in utility to raise awareness of it and reduce the volume of password resets, expired login credentials, and similar issues.
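The effect described in this example can be reproduced with a toy data set. The sketch below assumes a cluster defined by the three terms quoted above and uses invented incident records; it is not the customer's actual data or tooling.

```python
from collections import Counter

# Cluster definition taken from the terms quoted in the text.
CLUSTER_TERMS = ("password", "login expiry", "reset")

def in_cluster(description, terms=CLUSTER_TERMS):
    """True if the free-text description mentions any cluster term."""
    text = description.lower()
    return any(term in text for term in terms)

# Hypothetical incidents; "app" stands in for the structured application field.
incidents = [
    {"app": "CRM", "description": "User needs password reset"},
    {"app": "Payroll", "description": "Login expiry warning, cannot sign in"},
    {"app": "Email", "description": "Forgot password again"},
    {"app": "CRM", "description": "Report export is slow"},
    {"app": "Wiki", "description": "Please reset my credentials"},
]

matched = [i for i in incidents if in_cluster(i["description"])]
per_app = Counter(i["app"] for i in matched)  # the per-application view
total = len(matched)                          # the aggregate view

print(per_app, total)
```

Each application shows only a single matching incident, yet the cluster accounts for four of the five incidents overall, which is the pattern a per-application report would hide.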
Use Case 1: Identification of Common Root Causes
Root cause analysis can frequently involve complex deductive diagnosis. Yet, other times, the root cause can be staring IT in the face in the form of comments and descriptive text held within unstructured data related to an incident. Relying on structured data fields alone can cause IT to miss these insights.
With topic cluster driven root cause analysis, IT can extract data that reveals cause-and-effect, which underlying systems triggered an issue, or incident clusters that reveal problems stemming from specific assets. These insights can often drive a permanent corrective fix.
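One simple way to surface problems stemming from specific assets is to count how often each asset is named in the free text of a cluster's incidents. The sketch below assumes a known list of asset names (for instance, exported from a configuration database) and uses invented work notes; the asset names are hypothetical.

```python
from collections import Counter

# Hypothetical asset names; assumed to come from an asset inventory.
KNOWN_ASSETS = {"db-prod-01", "vpn-gw-2", "mail-relay"}

def asset_mentions(notes, assets=KNOWN_ASSETS):
    """Count the number of notes that mention each known asset."""
    counts = Counter()
    for note in notes:
        text = note.lower()
        for asset in assets:
            if asset in text:
                counts[asset] += 1
    return counts

# Hypothetical work notes from one incident cluster.
cluster_notes = [
    "Timeouts started after db-prod-01 ran out of disk",
    "App slow; DBA says DB-PROD-01 disk is full",
    "Restarted vpn-gw-2, unrelated to the slowness",
    "Queries hanging, db-prod-01 again",
]

counts = asset_mentions(cluster_notes)
top_asset, hits = counts.most_common(1)[0]
print(top_asset, hits)
```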
Identifying root cause across clusters of incidents not only aids in diagnosis, but it can also allow for the rollout of a standardized problem fix. In some instances, an assignment group can create a knowledge base article and a step-by-step resolution process, rather than individual teams addressing each problem on a ticket-by-ticket basis.
This level of efficiency is especially important in enterprises where resources are spread out across large geographical areas, which can obscure common root causes and make it harder for the organization to address them in one sweeping remedial action.
Use Case 2: Early Detection and Remediation of Rising Incident Threats
The biggest problems of next week might lurk in this week’s unstructured data. If IT organizations are merely categorizing incidents by current problem volume, they may be caught unawares by emerging problems down the road. Waiting to address these issues increases the risk of business disruption while also raising the costs of remediation.
Combining NLP and analytics can allow IT leaders to perform a trend analysis on incident data. Comparing cluster growth rates with benchmark levels can reveal which problems are growing quickly in impact. Detecting these problems early can empower an IT organization to address them before they escalate, possibly preventing further incidents down the road.
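A trend check of this kind might look like the following sketch. It assumes weekly incident counts per topic cluster are already available and uses an arbitrary 25% week-over-week benchmark; both the data and the threshold are illustrative assumptions.

```python
def growth_rate(previous, current):
    """Relative week-over-week change; None when there is no baseline."""
    if previous == 0:
        return None
    return (current - previous) / previous

def rising_clusters(weekly_counts, benchmark=0.25):
    """Return clusters whose latest week-over-week growth exceeds the benchmark."""
    flagged = {}
    for name, counts in weekly_counts.items():
        if len(counts) < 2:
            continue  # not enough history to compute a trend
        rate = growth_rate(counts[-2], counts[-1])
        if rate is not None and rate > benchmark:
            flagged[name] = rate
    return flagged

# Hypothetical weekly incident counts per cluster, oldest week first.
weekly_counts = {
    "password reset": [120, 118, 121],
    "vpn latency": [8, 12, 20],
    "printer offline": [30, 28, 27],
}

flagged = rising_clusters(weekly_counts)
print(flagged)
```

Note that the largest cluster by volume ("password reset") is not flagged, while the small but fast-growing "vpn latency" cluster is. Ranking by current volume alone would miss it.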
One example is identifying performance issues in a system that is becoming stressed under increasing user activities. Cloud-based tools with low levels of adoption may hide performance shortcomings at first, only to reveal that they can become completely overwhelmed under larger server loads. Since many SaaS and remote work tools are now seeing higher levels of use than most developers could have anticipated, identifying trends in load stress and degrading performance now can prevent a full-blown service disruption later.
Use Case 3: Revealing Which Incidents Are Related to Changes
Machine learning can work together with NLP to identify fluctuations in data that indicate a strongly correlated relationship. Without analytics, some IT teams can only guess whether certain changes triggered later incidents. ML and NLP can instead develop models that reveal whether the two are related within a stated range of probability. For example, some tickets may mention that certain problems occurred after a change only in their descriptive text, rather than in any of the closed fields.
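As a rough illustration of pulling change evidence out of descriptive text, the sketch below uses two regular expressions: one for change record IDs and one for phrases such as "since the upgrade". The `CHG` ID format and the phrase list are assumptions made for this example, and pattern matching of this kind is only a simple stand-in for the ML models described above.

```python
import re

# Hypothetical change-record ID format, e.g. "CHG0031337".
CHANGE_ID = re.compile(r"\bCHG\d+\b", re.IGNORECASE)

# Phrases suggesting that a problem followed a change (illustrative list).
CHANGE_PHRASE = re.compile(
    r"\b(?:after|since|following)\s+(?:the\s+|last\s+)?"
    r"(?:change|upgrade|patch|deployment|release)\b",
    re.IGNORECASE,
)

def change_evidence(description):
    """Extract change IDs and flag text that points to a preceding change."""
    ids = [m.upper() for m in CHANGE_ID.findall(description)]
    mentions = bool(ids) or bool(CHANGE_PHRASE.search(description))
    return {"change_ids": ids, "mentions_change": mentions}

print(change_evidence("Login page broken since the upgrade"))
print(change_evidence("Errors started after CHG0031337 was applied"))
print(change_evidence("Printer out of toner"))
```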
Uncovering which incidents were related to changes provides highly valuable information to both development and operations teams. Operations can address performance issues stemming from changes prior to rollout, and development teams can learn how their coding decisions affect performance in the production environment.
Machine learning can also produce a risk correlation model capable of making predictions about whether or not certain changes have a high risk of causing incidents. The change advisory board can then expedite decision-making in response to the level of threat a given change presents.
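The idea of scoring change risk from history can be sketched with a simple frequency table. This is not a trained machine learning model, only a stand-in that estimates, per change category, how often past changes were followed by incidents. The categories, the history, and the smoothing choice are all assumptions for illustration.

```python
from collections import defaultdict

def fit_risk_table(history):
    """Estimate incident probability per change category.

    history: iterable of (category, caused_incident) pairs.
    Laplace smoothing keeps small samples away from 0 and 1.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for category, caused_incident in history:
        totals[category] += 1
        failures[category] += int(caused_incident)
    return {c: (failures[c] + 1) / (totals[c] + 2) for c in totals}

def risk_score(table, category, default=0.5):
    """Look up the estimated risk; unseen categories get a neutral default."""
    return table.get(category, default)

# Hypothetical change history: (category, was it followed by an incident?)
history = [
    ("database", True), ("database", True), ("database", False),
    ("network", False), ("network", False), ("network", False),
]

table = fit_risk_table(history)
print(risk_score(table, "database"), risk_score(table, "network"))
```

A change advisory board could use scores like these to decide which changes deserve closer review, though a real model would draw on many more features than the change category.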
Use Case 4: Discovering “Shift-left” Opportunities
Without the right enablement, some IT response teams may reflexively escalate incidents to higher-level assignment groups. Many of these high-level group assignments come from incidents that are automatically flagged as high-risk for one reason or another. When the L3 or L4 team receives an incident that could have been remediated by a lower support tier, they often include remediation instructions in the incident and bump it back down to the appropriate group.
An incident topic cluster can reveal groups of such incidents that have been unnecessarily escalated to high-level assignment groups in the past. In response, knowledge base articles can be created to avoid unnecessary escalation. By shifting problems that only appear to be high level “left” to a lower-level assignment group, IT organizations can more efficiently allocate resources and reduce the costs of addressing incidents.
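One way to find such candidates is to look for incidents whose assignment history reached a high tier and was then handed back down, and to count them per topic cluster. The sketch below assumes each incident carries an ordered list of support tiers and a topic label; the field names and sample data are hypothetical.

```python
from collections import Counter

TIER_RANK = {"L1": 1, "L2": 2, "L3": 3, "L4": 4}

def bounced_back(assignment_path, high_tier=3):
    """True if the incident reached a high tier and was later
    reassigned to a lower one."""
    ranks = [TIER_RANK[t] for t in assignment_path]
    for i, rank in enumerate(ranks):
        if rank >= high_tier and any(later < rank for later in ranks[i + 1:]):
            return True
    return False

# Hypothetical incidents with their topic cluster and assignment history.
incidents = [
    {"topic": "vpn certificate", "path": ["L1", "L3", "L1"]},
    {"topic": "vpn certificate", "path": ["L1", "L4", "L2"]},
    {"topic": "database failover", "path": ["L1", "L2", "L3"]},
    {"topic": "password reset", "path": ["L1"]},
]

candidates = Counter(i["topic"] for i in incidents if bounced_back(i["path"]))
print(candidates.most_common())
```

Topics that appear repeatedly in this count are natural candidates for a knowledge base article aimed at the lower support tier.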
Unstructured IT Text Represents an Untapped Resource
Without the tools needed to analyze all IT correspondence, a lot of important information can get missed. The unstructured data IT overlooks may be packed with valuable insights that can incite action. At the very least, NLP analysis can supplement structured data to generate clarity and reduce natural skewing that gets introduced through incomplete information.
Overall, leveraging unstructured data contributes to a clearer picture of IT operational health, one that would be incomplete without it. In this way, NLP, ML, and analytics can drive new levels of efficiency, productivity, and comprehensive monitoring into the future. Achieving these goals in our current moment of global crisis, when businesses and governments depend on the platforms they use to remain functional, allows organizations to maintain the value they provide, even when their capabilities are tested to their limits.
Learn more about the power NLP has to provide value and sustain business services in our recent webinar: “Smarter Incident Management with NLP-Driven Topic Clustering”