While businesses across the globe wrestle with unprecedented volatility, stakeholders expect IT leaders to sustain business operations. Many professionals are now working remotely, and the burden on IT has never been higher. At the same time, businesses that own popular platforms have more of a responsibility than ever to maintain stable operations for their customers. The pressure is high, and the complexity of work is higher considering most organizations have a highly distributed workforce.
Analytics, powered by artificial intelligence, can enable IT teams to manage this rocky transition while offering improved stability. IT leaders can not only monitor key performance areas and identify opportunities for operational improvements, but they can also predict potential future outcomes.
Remote Service Desks Bombarded by New Types of Tickets, But Key Metrics Can Reveal Opportunities for Optimization and Automation
In its initial stages, the shift to a highly distributed workforce was bound to cause an increase in IT service requests. However, many IT service leaders did not anticipate how much the channels used to fulfill these requests would change. One Numerify customer recently observed a 75% increase in tickets arriving via either a self-service tool or chat-based channel when comparing March/April 2020 to January/February.
Many of these new tickets involve easily resolved issues related to common business applications, access permissions, or minor security incidents. But without an operational process to identify emerging trends, manually resolving these issues will cause both the backlog and mean time to resolve (MTTR) to grow.
IT leaders may wish to remediate the situation through self-service tools, automation, and other solutions. These can greatly expedite resolution while freeing up resources across all assignment groups. However, providing reliable self-service requires investment in new tools or development. And with IT service catalogs so large, prioritizing opportunities for automation can be a tough judgement call.
AI-powered analytics allows service organizations to readily identify issues that have the highest impact on productivity and value creation. Utilizing a metric we at Numerify call the Service Delivery Friction Index (SDFI), for example, you can compare the volume of specific service requests in proportion to their MTTR. This metric rapidly reveals which IT service requests have the highest volumes, the longest wait times to resolve, or both. Once problem areas are highlighted, IT leaders can respond with process improvements, automation, new technology, or the creation of a new metric to monitor the situation more closely.
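As a rough illustration of the idea, the sketch below ranks request categories by a simple friction score. The actual SDFI formula is not given here, so the scoring rule (ticket volume multiplied by mean hours to resolve), the function name, and the sample tickets are all hypothetical.

```python
from collections import defaultdict

def friction_index(tickets):
    """Rank categories by an ASSUMED friction score: volume * mean hours to resolve.

    `tickets` is an iterable of (category, hours_to_resolve) pairs.
    This is an illustrative stand-in, not the real SDFI calculation.
    """
    hours_by_category = defaultdict(list)
    for category, hours_to_resolve in tickets:
        hours_by_category[category].append(hours_to_resolve)

    scores = []
    for category, hours in hours_by_category.items():
        volume = len(hours)
        mttr = sum(hours) / volume
        scores.append((category, volume, mttr, volume * mttr))

    # Highest-friction categories first.
    return sorted(scores, key=lambda row: row[3], reverse=True)

# Hypothetical sample data: (category, hours to resolve)
sample_tickets = [
    ("vpn_token", 4.0), ("vpn_token", 6.0), ("vpn_token", 5.0),
    ("password_reset", 0.5), ("password_reset", 0.5),
    ("db_permission", 12.0),
]
ranking = friction_index(sample_tickets)
for category, volume, mttr, score in ranking:
    print(f"{category}: volume={volume}, MTTR={mttr:.1f}h, friction={score:.1f}")
```

With this scoring rule, a high-volume category with moderate resolution times can outrank a rare but slow one, which matches the goal of finding requests with high volumes, long waits, or both.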
Natural Language Processing (NLP) can further enhance these capabilities by rapidly observing commonalities in unstructured data fields that human eyes could never hope to sift through manually. For example, if a lot of tickets filed under various categories all relate to permissions for low risk database updates, then an assignment group can craft a tool to automate approvals for the most-impacted teams.
Another revealing metric can be the ROI of a problem fix, as calculated using an algorithm that accounts for time, labor costs, downtime, and other workload cost estimation factors. With an ROI-based metric available on an explorable dashboard, IT leaders can quickly identify the most expensive categories of problems and prioritize from there.
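One way such an ROI figure might be computed is sketched below. The formula (avoided labor and downtime cost over a fixed horizon, net of the one-time fix cost), the field names, and every number in the example are assumptions made for illustration only.

```python
from dataclasses import dataclass

@dataclass
class ProblemCategory:
    name: str
    incidents_per_month: int
    labor_hours_per_incident: float
    downtime_hours_per_incident: float
    fix_cost: float  # assumed one-time cost of a permanent fix

def fix_roi(problem, labor_rate, downtime_cost_per_hour, horizon_months=12):
    """Return (avoided cost - fix cost) / fix cost over the horizon.

    Assumes the fix eliminates every future incident in the category,
    which is a simplification.
    """
    cost_per_incident = (
        problem.labor_hours_per_incident * labor_rate
        + problem.downtime_hours_per_incident * downtime_cost_per_hour
    )
    avoided = cost_per_incident * problem.incidents_per_month * horizon_months
    return (avoided - problem.fix_cost) / problem.fix_cost

# Hypothetical categories and rates
problems = [
    ProblemCategory("vpn_outage", 10, 2.0, 0.5, 20000.0),
    ProblemCategory("printer_driver", 4, 1.0, 0.0, 6000.0),
]
roi_by_name = {
    p.name: fix_roi(p, labor_rate=50.0, downtime_cost_per_hour=1000.0)
    for p in problems
}
for name, roi in sorted(roi_by_name.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: ROI = {roi:.2f}")
```

A negative value means the fix would cost more than the incidents it prevents within the chosen horizon, so sorting by this figure gives a first-pass priority list.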
Evolving Operating Models Require Teams to Adapt to New Types of Incidents
Many IT service teams now struggle with surges in problems that they had never dealt with at scale. While tickets related to, say, office hardware performance are way down, issues with VPN token requests have skyrocketed. In some organizations, other less predictable and less easily managed problems have also grown in size and velocity.
To address emerging issues related to either IT service or change, leadership needs visibility. With AI-powered analytics, they can monitor organizational performance and focus on teams that are most impacted. Using data drawn from across organizational silos, IT leaders can obtain visibility into how teams are performing, hold them accountable, and make sure they have all the resources they need in case metrics trend in the wrong direction.
Dashboard-based scorecards allow IT stakeholders to keep tabs on everything vital to productivity. A scorecard can measure the aggregate health of a configuration item or IT service area using metrics like MTTR, first-call resolution, system downtime, and so on. Interactive dashboards permit further exploration through drill-down, trends, and root cause discovery.
For example, an SLA compliance scorecard that lights up red can be investigated further to see which metrics, teams, or individuals are causing the most issues. IT groups can then take action by eliminating the root cause, enhancing their knowledge base, or pushing users towards self-service tools.
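The drill-down described above could look something like the following minimal sketch, which breaks SLA compliance out per team and flags any team below a target. The 90% target, the tuple layout, and the sample tickets are hypothetical.

```python
from collections import defaultdict

def sla_scorecard(tickets, target=0.9):
    """Compute per-team SLA compliance and a red/green status.

    `tickets` is an iterable of (team, hours_to_resolve, sla_hours).
    The target threshold is an assumed value, not a recommendation.
    """
    met = defaultdict(int)
    total = defaultdict(int)
    for team, hours_to_resolve, sla_hours in tickets:
        total[team] += 1
        if hours_to_resolve <= sla_hours:
            met[team] += 1

    card = {}
    for team in total:
        compliance = met[team] / total[team]
        card[team] = (compliance, "red" if compliance < target else "green")
    return card

# Hypothetical sample data
sample = [
    ("network", 3.0, 4.0),
    ("network", 9.0, 4.0),   # breached its SLA
    ("desktop", 1.0, 8.0),
    ("desktop", 2.0, 8.0),
]
card = sla_scorecard(sample)
for team, (compliance, status) in card.items():
    print(f"{team}: {compliance:.0%} ({status})")
```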
NLP can be used for deeper and more insightful visibility into emerging incident trends. Scraping unstructured data and implementing a topic cluster algorithm allows IT groups to catch trending incidents and unconsidered commonalities. This capability allows IT to manage problems intelligently, helping teams find related incidents with common root causes.
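To make the clustering idea concrete, here is a deliberately simple sketch that groups ticket descriptions by word overlap. It uses a greedy, single-pass grouping on Jaccard similarity as a stand-in for a real topic clustering algorithm; the stopword list, the 0.25 threshold, and the sample tickets are all assumptions.

```python
import re

# Small, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "to", "my", "is", "for", "on", "i", "cannot", "not", "and", "in"}

def tokens(text):
    """Lowercase the text and return its set of non-stopword terms."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Share of terms two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(descriptions, threshold=0.25):
    """Greedily assign each description to the most similar existing cluster,
    or start a new cluster when nothing is similar enough."""
    clusters = []  # each cluster: {"terms": set of words, "members": list of texts}
    for text in descriptions:
        words = tokens(text)
        best = max(clusters, key=lambda c: jaccard(words, c["terms"]), default=None)
        if best is not None and jaccard(words, best["terms"]) >= threshold:
            best["members"].append(text)
            best["terms"] |= words
        else:
            clusters.append({"terms": set(words), "members": [text]})
    return clusters

# Hypothetical ticket descriptions filed under different categories
sample_descriptions = [
    "VPN token expired, cannot connect",
    "Need new VPN token to connect from home",
    "Request write permission for database update",
    "Database update permission denied",
    "VPN connect fails, token rejected",
]
clusters = cluster(sample_descriptions)
for c in clusters:
    print(len(c["members"]), "tickets:", c["members"])
```

A production system would more likely use TF-IDF or embeddings with a proper clustering method, but even this crude version surfaces the two underlying themes in the sample.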
An ML model can also observe trends in incidents or service problems to reveal which factors have the highest predictive potential. When the conditions for an incident begin to emerge, a predictive engine can then proactively prescribe next-best actions based on historical data from prior resolutions.
Better Change Risk Management Can Help Leadership Avoid Unnecessary Change Freezes
When business services were unexpectedly disrupted in early 2020, many organizations responded with blanket change freezes to their platforms. The intention, in many cases, was to minimize variables in an already-chaotic climate.
We would argue that such actions can lead to less stability and a higher risk of unplanned downtime compared to letting changes continue as normal. While no one wants unpleasant surprises, freezing changes means that the queue of undeployed changes only grows over time. Releasing these changes all at once, as tends to happen, carries a higher risk of business disruption. Organizations willing to adopt a more sophisticated change risk strategy can avoid this scenario and at the same time address issues that tend to lead to more unstable changes.
In most organizations, change risk management is a hard problem to solve. Siloed data limits the visibility of risky changes before they find their way into production. Factors upstream of operations, like development, as well as downstream, like customer experience (CX), can impact change risk assessment.
What IT operations leaders need is a platform that can identify high-risk changes while offering insights into how to manage lower-risk ones. The use of a machine learning model in combination with analytics provides this capability. By analyzing change risk factors from the entirety of the organizational ecosystem, ML-backed analytics can uncover patterns in which categories or types of changes carry a higher probability of causing problems, or even full-blown incidents. For example, changes made by a certain group or to a certain configuration item (CI) could be riskier than others.
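As a simplified illustration of spotting riskier groups or CIs, the sketch below computes a plain historical failure rate per factor value. This is an empirical rate, not a machine learning model, and the record fields (`group`, `ci`, `caused_incident`) and sample history are hypothetical.

```python
from collections import Counter

def failure_rates(history, factor):
    """Return the share of past changes that caused an incident,
    grouped by the given factor (for example "group" or "ci")."""
    total = Counter()
    failed = Counter()
    for change in history:
        key = change[factor]
        total[key] += 1
        if change["caused_incident"]:
            failed[key] += 1
    return {key: failed[key] / total[key] for key in total}

# Hypothetical change history
history = [
    {"group": "network", "ci": "vpn-gateway", "caused_incident": True},
    {"group": "network", "ci": "vpn-gateway", "caused_incident": False},
    {"group": "network", "ci": "dns", "caused_incident": False},
    {"group": "database", "ci": "orders-db", "caused_incident": False},
    {"group": "database", "ci": "orders-db", "caused_incident": False},
    {"group": "network", "ci": "vpn-gateway", "caused_incident": True},
]
rates_by_group = failure_rates(history, "group")
rates_by_ci = failure_rates(history, "ci")
print(rates_by_group)
print(rates_by_ci)
```

A real model would combine many such factors and account for small sample sizes, but per-factor rates like these are a reasonable first look at where change failures concentrate.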
Not only can a change risk prediction engine score changes, but it can also reveal the underlying factors that caused the change to be flagged in the first place. In response to these insights, IT leaders can then address the underlying cause — whether it’s based in people, technology, or a bit of both.
Even more critically, a change risk scoring system can allow change teams to expedite their response to risky changes. Low-level risks can be addressed through a proactive automated fix or an acceptance of the risks the changes pose. Mid-level risks can be addressed with thorough rollback procedures in place and a relevant assignment group on-call, for example. High-level risks can be appropriately frozen and delayed until the risk level is brought down.
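The tiered response described above can be sketched as a small policy function. The 0-to-1 score range, the cutoff values of 0.3 and 0.7, and the wording of each action are assumptions; real thresholds would be tuned to an organization's own risk tolerance.

```python
def triage(risk_score, low_cutoff=0.3, high_cutoff=0.7):
    """Map a change risk score in [0, 1] to a (tier, action) pair.

    The cutoffs are illustrative placeholders, not recommended values.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= high_cutoff:
        return "high", "freeze and delay until the risk level is brought down"
    if risk_score >= low_cutoff:
        return "mid", "deploy with rollback procedures and an assignment group on-call"
    return "low", "deploy with an automated fix or accept the risk"

# Hypothetical scored changes
for change_id, score in [("CHG-1", 0.1), ("CHG-2", 0.5), ("CHG-3", 0.9)]:
    tier, action = triage(score)
    print(f"{change_id}: {tier} -> {action}")
```

Only the high tier is held back, so the rest of the change queue keeps moving instead of piling up behind a blanket freeze.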
These actions allow IT operations teams not only to address upcoming risks confidently but also to remove systemic causes of change failure. They can refine their processes, retrain problematic development or change deployment teams, or implement new technology to lower risks overall.
Managing a Distributed Workforce Means Seeing More and Proactively Addressing Threats to Productivity
Without a strategic approach to IT service and change management, IT organizations will be forced to play “whack-a-mole” with emerging problems and incidents as work-from-home policies continue. But with a business analytics platform, backed by AI and ML, they can proactively identify opportunities to relieve primary sources of pain while learning more about the factors that cause unwanted disruptions in the first place.
One of the biggest sources of anxiety for business stakeholders in the current climate is that we are largely unable to predict what tomorrow may bring. Yet, with a view on data and the factors that impact your organization the most, you can at least understand the likely consequences of today’s actions. This visibility allows you to make proactive decisions and anticipate risks in order to better serve your organization and its business objectives, giving you the confidence you need to transition during a tumultuous time in human history.
Learn more about this vital and emerging topic that impacts us all by listening to our recent, highly informative webinar: “How to Adapt Your IT Service & Change Management for a Distributed Workforce.”