Executive Assessment
Chaotic-1
Jun 28, 2026, 11:55 AM
Process Health
Severe Rework, Process Chaos, and High SLA Breaches Cripple Incident Resolution
The incident management process is in a critical state, defined by extreme instability and inefficiency. A 60.3% rework rate means most incidents require redundant effort. This is compounded by a chaotic workflow with 222 different resolution paths, indicating a complete lack of standardization. Consequently, the process is failing to meet its service commitments, with a 63% SLA breach rate. While individual agents work efficiently on tasks (80% flow efficiency), the broken process structure negates these efforts, leading to unpredictable outcomes and poor service quality.
3.8 out of 10
Process Health Score
Weak
✓ Confidence: High
The score reflects a critical combination of severe rework (60.3%), extreme process fragmentation (222 variants), and a very high SLA breach rate (63%). While flow efficiency is high, the underlying process is unstable, unpredictable, and consistently fails to meet service commitments. The data quality is good, which provides high confidence in these negative findings.
Headline Signals
Rework Rate
Process: Critical
60.3%
More than half of all incidents require repeated effort, indicating underlying issues with initial diagnosis, information gathering, or resolution steps, significantly increasing effort and delaying closure.
SLA Breach Rate
sla_signals: Critical
63%
A majority of incidents with SLAs are failing to meet service commitments, exposing the business to risk and indicating the current process cannot deliver on its time-based promises.
Process Variants
Process: Critical
222
An extremely high number of process paths for a standard workflow like Incident Management shows a lack of a standard process, making it difficult to manage, improve, or automate.
Top Rework Loop: Assigned -> Active
Transitions: Critical
35.6% of items
Over one-third of incidents bounce back from an 'Assigned' state to 'Active', a clear signal of widespread incorrect routing and triage failure at the start of the process.
Flow Efficiency
Time Intel: Good
80.2%
When work is actively being handled, it progresses efficiently. This is a strength to build on, but it is completely undermined by the high rework and fragmentation that disrupt the flow.
Assignment Group Fragmentation
Attributes: Warning
46 Groups
Work is spread across a large number of assignment groups, with no single group handling more than 2.7% of the volume. This can lead to inconsistent handling and delays in finding the right resolver.
Time Profile
The process has a very high flow efficiency of 80.2%: of the roughly 23-hour average cycle time, 18.3 hours are spent in active work and only 4.5 hours in wait and queue states. However, this efficiency is misleading. Because of the high rework rate, much of that touch time is repeated and wasteful, inflating the total effort required for resolution.
Average Cycle Time
22.8 hours
Average Touch Time
18.3 hours
Average Wait Time
4.5 hours
Flow Efficiency
80.2%
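The relationship between these figures can be checked with a short calculation. This is a minimal sketch that assumes flow efficiency is defined as touch time divided by touch time plus wait time; the report does not state its formula, and the function name `flow_efficiency` is hypothetical.

```python
def flow_efficiency(touch_hours: float, wait_hours: float) -> float:
    """Share of elapsed time spent in active work (assumed definition)."""
    elapsed = touch_hours + wait_hours
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return touch_hours / elapsed


# Averages taken from the Time Profile above.
efficiency = flow_efficiency(touch_hours=18.3, wait_hours=4.5)
print(f"flow efficiency: {efficiency:.1%}")
```

Under that assumption the reported touch and wait averages give a result of roughly 80%, in line with the headline figure; any small difference would come from rounding in the published averages.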
Major Discoveries
Finding 1
Process is Defined by Rework, Not a Standard Flow
💡 rework
📊 Evidence
60.3% of all incidents involve rework. The most common rework loops are 'Assigned -> Active' (affecting 35.6% of items) and 'Work in Progress -> Pending User' (affecting 43% of items).
🔎 Insight
Rework is the standard mode of operation. Incidents are either mis-assigned and bounce back, or they are paused frequently to gather more information, suggesting poor initial data capture and triage.
💼 Business Impact
Drives up resolution time, increases manual effort, reduces predictability, and frustrates both end-users and support teams.
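One way to make the rework measurement concrete is sketched below. It assumes a rework loop is any transition that re-enters a state the incident has already visited; the report does not define rework this way explicitly. The function name `rework_transitions` and the four example traces are illustrative, not the underlying dataset.

```python
def rework_transitions(path: str) -> list[tuple[str, str]]:
    """Return transitions that re-enter a state the incident already visited."""
    states = [s.strip() for s in path.split(">")]
    seen: set[str] = set()
    loops = []
    # Pair each state with its predecessor (None for the first state).
    for prev, curr in zip([None] + states, states):
        if curr in seen and prev != curr:
            loops.append((prev, curr))
        seen.add(curr)
    return loops


# Illustrative traces only; real volumes come from the event log.
traces = [
    "New > Active > Assigned > Work in Progress > Closed",
    "New > Active > Assigned > Active > Assigned > Closed",
    "New > Active > Assigned > Active > Closed",
    "New > Active > Assigned > Work in Progress > Pending User > Closed",
]

# Rework rate: share of incidents with at least one loop.
rework_rate = sum(1 for t in traces if rework_transitions(t)) / len(traces)
print(f"rework rate on toy data: {rework_rate:.0%}")
```

On these toy traces the function flags the 'Assigned -> Active' bounce, which is the loop the evidence above singles out. A production definition might also count repeated activities within a state, so the rate depends on the definition chosen.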
Finding 2
Extreme Fragmentation Prevents Effective Management
💡 standardisation
📊 Evidence
There are 222 unique paths (variants) to resolve an incident. The most common path is followed by only 14.25% of incidents.
🔎 Insight
There is no standard operating procedure. Teams handle incidents ad-hoc, leading to unpredictable outcomes and making the process impossible to train on, manage, or automate.
💼 Business Impact
Causes inconsistent service quality, inflates operational costs, and prevents effective root cause analysis or targeted improvement.
Finding 3
Systemic SLA Failure Indicates a Broken Service Promise
💡 predictability
📊 Evidence
Based on a sample of 100 incidents, 63% breach their SLAs. This high failure rate is observed across all priority levels.
🔎 Insight
The process is fundamentally unable to meet its service level targets. The combination of rework, fragmentation, and routing delays makes timely resolution unattainable.
💼 Business Impact
Erodes trust with business stakeholders, exposes the organization to service credit risks, and undermines the purpose of prioritization.
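Because the breach rate rests on a sample of 100 incidents, it can help to see how much sampling uncertainty that implies. The sketch below computes a 95% Wilson score interval for 63 breaches out of 100. It assumes the sample is a simple random sample of incidents with SLAs, which the report does not state; the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(breaches: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%)."""
    if n <= 0 or not 0 <= breaches <= n:
        raise ValueError("need 0 <= breaches <= n and n > 0")
    p = breaches / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


low, high = wilson_interval(breaches=63, n=100)
print(f"breach rate 63%, interval {low:.1%} to {high:.1%}")
```

Under these assumptions the lower bound stays above 50%, so the conclusion that a majority of sampled incidents breach their SLAs holds even allowing for sampling error.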
Finding 4
Ineffective Triage and Assignment Creates Early Delays
💡 structural_design
📊 Evidence
Problematic variants show repeated looping between 'Active' and 'Assigned' states. The 'Assigned -> Active' rework transition alone affects over a third of all incidents.
🔎 Insight
The initial assignment is frequently incorrect or requires further clarification, causing incidents to bounce back before work can begin. This points to a critical failure at the front of the process.
💼 Business Impact
Delays the start of meaningful resolution work, inflates touch time, and is a primary driver of SLA breaches.
Finding 5
Fragmented Group Ownership Slows Resolution
💡 workload_segmentation
📊 Evidence
Incident workload is distributed across 46 assignment groups, and the top 10 groups combined handle only 25% of the volume.
🔎 Insight
There is no clear specialization or ownership for incident types. This suggests work is frequently misrouted or that teams lack specific expertise, contributing to reassignment churn and delays.
💼 Business Impact
Slows down time-to-resolution as incidents bounce between teams to find the correct owner. It also complicates performance management and skill development.
Path Insights
The top 12 variants cover only 68.2% of incidents. The remaining 31.8% are spread across 210 other variants, demonstrating extreme process fragmentation and a lack of predictable execution.
New > Active > Assigned > Work in Progress > Closed
Dominant Path
This is the most common 'happy path,' representing an ideal, linear flow. However, it only accounts for 14.25% of incidents, highlighting the severe lack of standardization in the process.
Covers 14.25% of incidents
No rework loops
Should be the target model for standardization efforts
New > Active > Assigned > Work in Progress > Pending User > Closed
Dominant Path
This common path involves pausing work to await user information. Its high frequency (10.6% of incidents) suggests an opportunity to improve initial information gathering to avoid this delay.
Covers 10.6% of incidents
Involves a wait state for user info
A key target for automation (e.g., user follow-up)
New > Active > Assigned > Active > Assigned > Closed
Problem Path
This path, affecting 7.8% of incidents, demonstrates significant assignment churn. The incident bounces between 'Active' and 'Assigned' states twice, a clear indicator of failed routing and triage.
Affects 7.8% of incidents
Contains two wasteful rework loops
Highlights critical issues in triage and assignment
Directly contributes to SLA breaches
New > Active > Assigned > Active > Closed
Problem Path
Affecting 7.6% of incidents, this variant involves a single but impactful bounce between 'Active' and 'Assigned', reinforcing the pattern of systemic assignment failure.
Affects 7.6% of incidents
Involves one common rework loop
Shows instability at the assignment stage
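The variant and coverage figures in this section can be derived from per-incident paths along the following lines. This is a minimal sketch, assuming each incident's path is already available as a string of states; the function name `variant_coverage` and the toy counts are illustrative and do not reproduce the report's data.

```python
from collections import Counter


def variant_coverage(paths: list[str], top_k: int) -> float:
    """Fraction of incidents that follow one of the top_k most common paths."""
    counts = Counter(paths)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no paths supplied")
    return sum(count for _, count in counts.most_common(top_k)) / total


# Toy data: four variants across ten incidents.
paths = (
    ["New > Active > Assigned > Work in Progress > Closed"] * 4
    + ["New > Active > Assigned > Work in Progress > Pending User > Closed"] * 3
    + ["New > Active > Assigned > Active > Assigned > Closed"] * 2
    + ["New > Active > Assigned > Active > Closed"] * 1
)

variant_count = len(set(paths))
top2 = variant_coverage(paths, top_k=2)
print(f"{variant_count} variants, top 2 cover {top2:.0%} of incidents")
```

Applied to the full event log, the same calculation with `top_k=12` would correspond to the coverage figure quoted at the start of this section.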
Leadership Priorities
🔐
Standardize the Core Incident Process
Foundational
The current process is unmanageable, with 222 variants and a 60% rework rate. This chaos makes improvement impossible and drives up operational costs.
Expected Benefit
Drastically reduce process variations, lower the rework rate, and create a predictable, measurable baseline for performance management and automation.
Likely Owner
Head of Service Management / Incident Process Owner
AI: Use process mining insights to define the optimal 'happy path' and diagnose the root causes of major deviations for elimination.
Automation: Implement ServiceNow Playbooks or Flow Designer to guide agents through the newly standardized process steps, ensuring consistency.
Risk if delayed: Continued service failures, high operational costs, and an inability to scale support operations.
📋
Fix the Front Door: Overhaul Triage and Routing
Strategic
High-volume rework between 'Active' and 'Assigned' states is a primary driver of the 63% SLA breach rate. Incorrect initial assignment is wasting significant time.
Expected Benefit
Improve first-assignment accuracy, reduce resolution time, and significantly lower the SLA breach rate.
Likely Owner
Service Desk Leadership / Platform Owner
AI: Implement AI-powered routing to predict the correct assignment group based on incident data (e.g., summary, category, CI).
Automation: Automate the assignment of well-structured incidents from channels like the Service Portal or Virtual Agent directly to specialist teams.
Risk if delayed: Persistent SLA failures and wasted effort from skilled resolver teams.
Automate User Follow-Up to Reduce Wait Time
Quick Win
The transition to 'Pending User' is part of a rework loop affecting 43% of all incidents, introducing significant delays while waiting for information.
Expected Benefit
Reduce the manual effort of chasing users for information and shorten the time incidents spend in a waiting state.
Likely Owner
Service Management / Automation CoE
AI: Use AI to analyze incident text and flag missing required information before the incident is assigned to a human.
Automation: Implement automated reminders and escalations for incidents in the 'Pending User' state. Use Virtual Agent to proactively gather required information at submission.
Risk if delayed: Continued cycle time inflation and poor user experience due to avoidable delays.
Executive Decision Support
Key Risks if Delayed
Erosion of Business Trust
The systemic failure to meet SLAs (63% breach rate) undermines the credibility of the IT support function. Business stakeholders will lose confidence in IT's ability to provide reliable and timely service.
Urgency: High
Inability to Scale or Improve
With 222 process variants and no standard workflow, it is impossible to implement meaningful improvements, automation, or performance management. The process will remain inefficient and costly as volume grows.
Urgency: High
High Operational Cost and Agent Burnout
The 60% rework rate creates a significant amount of unnecessary work, driving up operational costs and leading to frustration and burnout among support staff who are constantly re-addressing the same issues.
Urgency: Medium
Readiness & Constraints
AI Readiness
Medium
Automation Readiness
Low
Data Readiness
High
Data readiness is High; core fields like priority and assignment group are consistently populated, providing a good foundation for AI. However, Automation readiness is Low because the process itself is too chaotic. AI can be applied to tactical problems like routing, but broad automation requires significant process standardization first.
Consultant Note
This assessment highlights critical instability in the Incident Management process. The consultant should focus subsequent analysis on the root causes of the 60% rework rate and the 63% SLA breach rate. Key areas for investigation are the triage/assignment process, the reasons for the 'Work in Progress -> Pending User' loop, and the drivers behind the 222 process variants.
Evidence Base
metrics, transitions, variants, field usage, sample items, activity model, time intelligence, task_sla metrics
✓ Process Metrics
✓ Transitions
✓ Variants
✓ Field Usage
✓ Status Types
✓ Time Intelligence
✓ Sample Items