Using IT Analytics to Make Business Services More Resilient
High-velocity IT environments share a central goal: enabling businesses to deliver more value, faster. Time to market, time to customer, time to change, and speed in general are all crucial.
Many transformative business technologies and practices share this objective. Cloud platforms, DevOps, and the resurgence of Lean practices are all by-products of this environment. Development, operations, and IT service delivery have all moved from a silo-oriented culture to a collaborative one, gaining velocity in the process. Now that these technologies have become widely adopted, many business leaders are no longer asking "how do we change how we do business?" but rather "how do we take it to the next level?"
In the medium term, the ongoing pandemic will not slow this transition. If anything, digital transformation and digital enablement have become even more of an imperative. Almost overnight, business leaders woke up to a world that had completely changed. In the face of these rapid changes, they're forced to consider, as Troy DuMoulin of Pink Elephant puts it, "new digital engagement platforms to create a new digital business model where much of the workforce is distributed across the four winds."
One thing IT leaders have realized in the midst of this transition is that the pursuit of speed does not entail throwing caution to the wind. Quite the opposite. Without robust, resilient operations, brand image can suffer. This highlights the growing need for IT value chains that can not only deliver to market quickly, but also ensure business continuity by restoring business services swiftly.
Considering Resilient Operations as a Critical Component of High-Velocity IT Services
Revised ITIL 4 models highlight the importance of five key areas that can serve agile, high-velocity business value delivery:
- Valuable Investments — Strategically innovative and effective application of IT
- Fast Development — Quick realization and delivery of IT services and IT-related products
- Co-Created Value — Effective interaction between service providers and service consumers
- Assured Conformance — Adherence to governance, risk, and compliance (GRC) requirements
- Resilient Operations — Highly resilient IT services and IT-related products
All five are undeniably important. Making valuable investments means leveraging platforms on a more strategic basis, enabling organizations to do more with less — more remote, more digitally based, and more distributed platforms are all common goals.
Fast development encompasses not just coding but incorporation of value early on in the process. This inevitably means co-created value, which is determined through requirements generation, customer engagement dialogue, and defining the problem in concrete terms before developing a solution for it.
Assured conformance is, naturally, paramount. To paraphrase DuMoulin, "no one will thank you if you've made it to market and beat your competition, but your customers end up in orange jumpsuits." Incorporating regulatory and compliance aspects is an important component of delivering value.
Resilient operations tend to be overlooked within this paradigm. An organization can achieve all four of the prior goals and attain high velocity, but if it can't supply robust, resilient operations — with high availability, high reliability, and a high degree of performance — then even the best product will not realize the value of which it is capable.
From this perspective, IT operations should provide a stable environment, and they can do so with the help of IT business analytics. Having a system of intelligence allows them to truly know what is going on in their production environment, to see patterns through data, and to use those patterns to predict instabilities. This concept of resilient operations has become as much a part of a high-velocity organization as the other four high-velocity IT objectives, even if its necessity is not as immediately apparent.
Giving Context to Data Through Analytics to Make ITSM More Responsive, Predictive, Resilient, and Agile
A system of intelligence yields actionable insights by aggregating data from across all systems of record, examining that data for patterns, and using those patterns to offer context to human supervisors so they can know the appropriate action in response. This reflects one of ITIL's classic paradigms: DIKW, or Data-to-Information-to-Knowledge-to-Wisdom.
Using event management as an example, examining every single event log for useful information can be a Herculean task for a human. Events are neither inherently good nor bad. They may reflect a mundane action, such as a server root login or the successful completion of a backup. At the same time, they may also represent the first step in a chain of events that leads to a new problem — or an incident that causes a service disruption.
To make sense of this data, it must be given structure and context. Machine learning models can examine events to decipher patterns, reliably categorizing and classifying the events that matter. By applying descriptive qualities to data, a system of analytics can then add contextual layers or place events in descriptive buckets. Certain events can then indicate that a metric has reached a level that serves as a warning of possible problems. Some events might indicate serious exceptions to established benchmarks, suggesting the possibility of a major incident. Other events can be determined to be merely informational, getting logged but not prompting any further action.
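The bucketing described above can be sketched in a few lines of Python. This is a minimal, rule-based illustration only — the metric names, benchmark thresholds, and category labels are hypothetical, standing in for what a real system of intelligence would learn from historical event data.

```python
# Illustrative sketch: placing raw events into descriptive severity buckets.
# Metric names, thresholds, and bucket labels are hypothetical examples.
from dataclasses import dataclass

@dataclass
class Event:
    source: str
    metric: str
    value: float

# Hypothetical benchmarks per metric: (warning level, exception level)
BENCHMARKS = {
    "cpu_utilization": (80.0, 95.0),
    "error_rate": (0.01, 0.05),
}

def categorize(event: Event) -> str:
    """Bucket an event by comparing its value against established benchmarks."""
    levels = BENCHMARKS.get(event.metric)
    if levels is None:
        return "informational"  # no benchmark: log it, take no further action
    warn, exc = levels
    if event.value >= exc:
        return "exception"      # serious deviation; possible major incident
    if event.value >= warn:
        return "warning"        # early signal of a possible problem
    return "informational"

events = [
    Event("web-01", "cpu_utilization", 55.0),
    Event("web-02", "cpu_utilization", 91.0),
    Event("api-01", "error_rate", 0.07),
]
print([categorize(e) for e in events])  # → ['informational', 'warning', 'exception']
```

In practice the thresholds would not be hand-coded; the point of a system of intelligence is that it derives them, and the categories themselves, from patterns in past data.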
Human beings and automated APM systems can apply the same categorization, in theory, but the sheer volume and velocity of data make this an unrealistic responsibility to delegate to either. Instead, a system of intelligence can determine the optimal model for categorizing events based on past data, including cause-and-effect outcomes that indicate how certain events may qualify as the root cause of a subsequent incident.
Machine learning models can apply yet another layer by automating the discovery of which events have the most significance for, say, predicting a major incident before it occurs. They may also be able to score certain process elements, such as an individual requested change, to indicate how much risk that change carries of causing a service disruption.
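Change risk scoring of this kind can be illustrated with a simple weighted-factor sketch. The factor names and weights below are hypothetical; in a real deployment they would come from a model trained on historical change records and their incident outcomes.

```python
# Illustrative sketch: scoring a requested change for disruption risk.
# Factor names and weights are hypothetical, standing in for a model
# learned from historical change-and-incident data.

WEIGHTS = {
    "touches_production_db": 0.40,
    "no_rollback_plan": 0.30,
    "off_hours_window": -0.10,   # a quiet change window lowers risk
    "recent_failed_changes": 0.25,
}

def risk_score(change: dict) -> float:
    """Weighted sum of the risk factors present on a change, clamped to [0, 1]."""
    score = sum(
        weight for factor, weight in WEIGHTS.items() if change.get(factor)
    )
    return max(0.0, min(1.0, score))

change = {
    "touches_production_db": True,
    "no_rollback_plan": True,
    "off_hours_window": True,
    "recent_failed_changes": False,
}
print(f"risk: {risk_score(change):.2f}")  # → risk: 0.60
```

A score like this gives change approvers a quick, data-backed signal of which requested changes deserve closer scrutiny before they reach production.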
Analytics can, thereby, translate noise into signal, helping IT organizations to proactively anticipate and address risks to business continuity, building resilience in the process.
Enabling IT to Make More Use of Digital Information to Maintain Velocity in a Reliable Way
The machines we use for business create millions of records each day. Machines are also capable of sifting through these records to highlight meaningful insights, issue warnings, or generate informative report cards on the health of platforms, systems, and individual configuration items.
IT leaders must leverage technology in order to give context to the growing flow of data, turning it into something that can enable them to take action when needed. This process brings them from knowledge — merely gathering or monitoring data — to wisdom: using that data to become a proactive problem manager.
The main goal is to ensure that the data we gather is both appropriate and useful. By working alongside analytics, IT leaders gain an ally in their effort to deliver operations that not only move fast, but remain resilient. By acting on the information delivered by a system of intelligence, they can identify when they need to be proactive — and not just shoot from the hip.
This information is paraphrased and summarized from a recent webinar delivered in partnership with Pink Elephant, a world leader in ITSM training, consulting, and informational presentations. To learn more, view the entire webinar here: "Driving resiliency & high velocity in your IT Operations through AI-powered Analytics"