Excuse me... I had an Incident! Why old school ITSM Incident Management Fails! - Matthew Hooper VigilantGuy Digital Transformentalist - Create High Performing IT

A few years ago at a Pink Elephant conference, a juggler joked about cutting himself with the knives he was tossing around, saying: “Now is that an Incident or a Problem?”  Like the nerdy ITSM loser that I am, I busted out laughing.

And while it’s been a running joke for a lot of years, I feel like we have finally transcended the difference between an Incident and a Problem.  The term and definition of “Incident Management” is a funny thing.  In some industries, such as Oil & Gas, IT staff are not allowed to use the term Incident.  There, an Incident is usually associated with a mishap that could have human, equipment, regulatory or reputational impact and consequences.

As an ITSM professional and consultant, words are very important to me.  I know some, like my good friends Charlie Betz and Stephen Mann, like to disagree and say it doesn’t matter what you call things, it’s all just work.

However, in my world people have not lived and breathed ITSM to the point where “WHY” we do what we do is blatantly obvious.  Terms, definitions and actions need to be clearly spelled out for most of the teams I’ve had.  Especially in support roles, their limited view of business and IT interaction demands that things be defined appropriately.  Ambiguity confuses and frustrates them.

What’s more disturbing to me than the term or label used for an issue is the definition applied to it.  ITSM professionals are using completely the wrong definition of the term.  This in turn is leading support staff to act and react in the wrong way.

Here is the term most IT people use for an issue that is reported to IT: “Incident Management”

If I were to ask you to pull out your process documentation on Incident Management, I bet it looks something like this:

Incident Management Definition:

An Incident is an unplanned interruption to a technology service or reduction in quality of a technology service.  Failure of a configuration item or product that has not yet impacted service is also an incident.

Incident Management Objectives:

The Goal of Incident Management is to restore service to normal operation as quickly as possible.

I’m quite fine with the label of “Incident Management”.  I also make a clear distinction between Incidents & Requests and Incidents & Problems. (See: Password Reset: Is it an Incident or a Request?)

However, these definitions and objectives, which come from ITIL and other older frameworks, are no longer applicable.

A responsible and expected business objective for how IT should handle Incident Management is nothing like, “Restore Service to Normal Operation as quickly as possible”.

3 Reasons WHY old school ITSM Incident Management fails to meet modern business needs

1) The normal operation of the service sucked to begin with.  Either it failed, which is why we had the incident, or needed a change, which caused the incident.  In any event, “normal” was not the desired state to begin with.

2) 80% of the Incidents logged have no tieback to a business service.  At best, they tie back to an asset or a component.  The reality is we only really tie an incident to the Business Service or IT System that has failed when there is a major outage.  For most incidents we are clueless on how to build remedial sustainability, utilizing confusing and overwhelming Category / Type / Item drop down lists.

3) The 3rd and most important reason is the IT service was only a means to an end to begin with.  From the business perspective: I’m trying to accomplish something here, and restoration of your incompetent IT services is your priority not mine.  I just want to get “X” done.  (“X” being a business outcome)

Here is how modern Incident Management should be defined:

Incident Management Definition:

An incident is a disruption to an action that reduces or impairs the organization’s ability to achieve its desired outcome.  Incidents can be caused by internal factors such as improperly designed or configured technology, lack of training or information, or external factors such as supplier failure, cyber attacks, or natural or physical disasters.  Incident causes can be identified by 6 types: Fault, Capacity, Availability, Performance, Change, Security.

Incident Management Objectives:

The Goal of Incident Management is to protect business value through the elimination of risk and impact to the business, and to assist affected users in achieving their business outcome as safely, sustainably and effectively as possible.
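
For the tool-minded, here is a minimal sketch of what a ticket record might look like under this definition. It is only an illustration: the class and field names (Cause, Incident, business_outcome, internal) are my own assumptions, not fields from any particular ITSM product. The point is that the record carries the business outcome and one of the 6 cause types, rather than a Category / Type / Item tree.

```python
from dataclasses import dataclass
from enum import Enum


class Cause(Enum):
    """The 6 incident cause types named in the definition above."""
    FAULT = "Fault"
    CAPACITY = "Capacity"
    AVAILABILITY = "Availability"
    PERFORMANCE = "Performance"
    CHANGE = "Change"
    SECURITY = "Security"


@dataclass
class Incident:
    # All field names are hypothetical, chosen for illustration only.
    summary: str
    business_outcome: str  # the "X" the user is actually trying to get done
    cause: Cause
    internal: bool         # True for internal factors, False for external ones


lost_phone = Incident(
    summary="Corporate issued mobile device lost",
    business_outcome="Employee keeps working without exposing corporate data",
    cause=Cause.SECURITY,
    internal=False,
)
```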

I could cite numerous examples of how these definitions would dramatically affect your processes, training and approach to incident management.  However, since this blog is already too long, I’ll pick 2: loss of a mobile device and reset of a password.  These are 2 very common issues where IT support is requested.

1) Loss of corporate issued mobile device.

Incident Management – Old Definition

Action:  Order and provision new phone

Result:  Service restored to normal operation

IT Measured Result: Success

Business Impact: Lost device is at risk of being hacked and private corporate data stolen.

Value of IT to the business: Low, commodity, supplier

Incident Management – Modern ITSM Definition

Action: Phone remotely locked, data wiped. Data service provider notified. Security team notified. Replacement phone issued. Asset management team notified.

Result:  Data protection plans are invoked. End user gets mobile services restored. (note the order)

IT Measured Result: Success

Business Impact: Risk of data loss is mitigated.

Value of IT to the business: High, integrated, partner
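
The order of those actions is the whole point, so here is a rough sketch of the modern lost-device response as an ordered runbook, with a check that protection comes before restoration. The names (Step, LOST_DEVICE_RUNBOOK, protection_comes_first) are hypothetical, and which steps count as "protection" is my own reading of the list above, not a standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    protects_business: bool  # True = data protection, False = service restoration


# Steps in the order given above. The True/False labels are an assumption.
LOST_DEVICE_RUNBOOK: List[Step] = [
    Step("Remotely lock phone", True),
    Step("Wipe data", True),
    Step("Notify data service provider", True),
    Step("Notify security team", True),
    Step("Issue replacement phone", False),
    Step("Notify asset management team", False),
]


def protection_comes_first(runbook: List[Step]) -> bool:
    """Return True if no protection step appears after a restoration step."""
    seen_restoration = False
    for step in runbook:
        if not step.protects_business:
            seen_restoration = True
        elif seen_restoration:
            return False
    return True
```

Under the old definition the runbook would effectively be the replacement phone step alone, and a check like this would have nothing to say about the data sitting on the lost device.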

2) Password reset

Incident Management – Old Definition

Action:  User verification is performed, password reset issued.

Result:  Service restored to normal operation

IT Measured Result: Success

Business Impact: User regains access to systems

Value of IT to the business: Low, commodity, supplier

Incident Management – Modern ITSM Definition

Action: User verification is performed, logs analyzed for attempts and locations.

Result: This depends on logs of attempted access.

If general password lockout – this is not an Incident; it is a request and is redirected as such

If login attempt from unknown state or multiple locations – logged as Security Incident, user account maintained in lockdown

IT Measured Result: Success

Business Impact: Further impact to business is prevented through cyber resilience

Value of IT to the business: High, trusted, partner
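
The triage decision above can be sketched in a few lines. This is a simplified illustration that assumes user verification has already happened and that access logs can be reduced to a list of attempts with a location. LoginAttempt, known_location and triage_password_reset are hypothetical names, and treating "more than one distinct location" as suspicious is my own assumption about where to draw the line.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoginAttempt:
    location: str
    known_location: bool  # False if the attempt came from an unrecognized place


def triage_password_reset(attempts: List[LoginAttempt]) -> str:
    """Classify a password reset after the user has been verified.

    Returns "security_incident" if any attempt came from an unknown
    location or attempts came from multiple locations; otherwise "request".
    """
    locations = {a.location for a in attempts}
    suspicious = any(not a.known_location for a in attempts) or len(locations) > 1
    if suspicious:
        return "security_incident"  # account stays in lockdown
    return "request"  # general lockout: redirect to request handling
```

For example, a handful of failed attempts all from the user’s usual office would come back as "request", while attempts from two different places would be logged as a Security Incident with the account kept in lockdown.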

Can you see how IT not only sees its role change and behaves differently, but also how it can enable greater business value by thinking about outcomes and protection?  As always, I’m open to differing points of view; comments are on in my blog.

Modern ITSM methods have the opportunity to add real value to business growth and sustainability.  However, we need to rethink our role and purpose.  For more thoughts on this check out my webinar Accelerating Digital Transformation through Agile Service Management.

In my next blog, I’ll discuss the major differences between traditional ITSM Change Management and Modern ITSM Change Management.
Written by

Digital Transformentalist Twitter: @VigilantGuy http://twitter.com/vigilantguy Linkedin: http://www.linkedin.com/in/matthewbhooper Web: http://www.vigilantguy.com Matt Hooper is an industry advocate for Service Management strategies and best practices around Enterprise Service Management. For over 20 years Matt has instituted methodologies for business intelligence and optimization. Leveraging technology to drive business outcomes, he has built an industry reputation for his highly effective approach to creating value through Service Management. Matt is active on Social Media known as VigilantGuy, and co-hosts the weekly podcast: Hacking Business Technology. HackBizTech.com

  • Brenda K Holder says:

    I like the idea. You are essentially rolling in RCA and continuous improvement strategies for common incident types during resolution, which eliminates the heavy backend processes of Problem Management for the things we already know. It strengthens day-to-day operations. It is not only focused on the “hurry up and get the individual user service restored” model but rather a model that strengthens the corporation as a whole. Focused on doing the right thing up front at the time of the need instead of the fastest with the real need getting addressed possibly after the ship sailed. More thorough accountability. Interesting concept. Maybe do not eliminate ITIL. Perhaps it is time to revamp ITIL.

    • Vigilant Guy says:

      Thanks Brenda for your comments. You are spot on. Believe me, I’ve invested a lot of reputation and time into ITIL, so I don’t want it gone, I just want it to keep up.

  • Fully agree although I do believe that many, and more and more, organisations are in ‘modern’ mode already.

    I want to comment on one specific sentence; ‘Utilizing confusing and overwhelming Category / Type / Item drop down lists’.
    You are so absolutely right here, business is waiting for us to help them continue their work and we are wasting time filling in useless fields on a form like Categories, sub categories, sub-sub categories etc. If I ever ask why we collect all that data on a ticket it is because of reporting. If I then ask when was the last time anyone took any actionable decision based on that report/information it always gets quiet…. It is such a waste of time to ask people to fill in dropdown fields like; Hardware, Software, etc. Besides the fact that this information should already be known from the CI you linked to the request…
    And did anyone ever calculate how much time is wasted by people having to think about what to put in such a field every time they register a request? Let’s stop bothering our staff with filling in useless fields and let’s not make our customers wait while we are busy collecting useless data! Like Einstein said, ‘Everything should be as simple as possible (but not simpler)’.

    • David Crutchley says:

      I’m a Problem Manager and I use those categories for my reports all the time. Or I would if they were in the least bit useful to me.
      Getting rid of them isn’t the answer but in my experience the lists are made by the wrong people. The Service Desk who complete them and Problem Management who use them should be the people who build the category lists.
