SMS-DOC-086-8 Problem Management Process

ISO20000 Toolkit: Version 10 ©CertiKit

Implementation guidance
The header page and this section, up to and including Disclaimer, must be removed from the final version of the document. For more details on replacing the logo, yellow highlighted text, and certain generic terms, see the Completion Instructions document.

Purpose of this document
This document describes how the service provider will manage problems related to the services.

Areas of the standard addressed
The following areas of the ISO/IEC 20000:2018 standard are addressed by this document:

• 8. Operation of the service management system
  o 8.6 Resolution and fulfilment
    ▪ 8.6.3 Problem management

General guidance
The standard is fairly specific about what should be covered in this procedure and so we would recommend that the headings at least be maintained in this document. Many organizations produce lower level procedures or work instructions that set out how to log and manage problems in the specific service desk system in use.

Review frequency
We would recommend that this document is reviewed annually.

Document fields
This document may contain fields which need to be updated with your own information, including a field for Organization Name that is linked to the custom document property “Organization Name”. To update this field (and any others that may exist in this document):

1. Update the custom document property “Organization Name” by clicking File > Info > Properties > Advanced Properties > Custom > Organization Name.
2. Press Ctrl A on the keyboard to select all text in the document (or use Select, Select All via the Editing header on the Home tab).
3. Press F9 on the keyboard to update all fields.
4. When prompted, choose the option to just update TOC page numbers.

If you wish to permanently convert the fields in this document to text, for instance, so that they are no longer updateable, you will need to click into each occurrence of the field and press Ctrl Shift F9.

If you would like to make all fields in the document visible, go to File > Options > Advanced > Show document content > Field shading and set this to “Always”. This can be useful to check you have updated all fields correctly.

Further detail on the above procedure can be found in the toolkit Completion Instructions. The Completion Instructions also contain guidance on working with the toolkit documents with an Apple Mac, and in Google Docs/Sheets.

Copyright notice
Except for any specifically identified third-party works included, this document has been authored by CertiKit, and is ©CertiKit except as stated below. CertiKit is a company registered in England and Wales with company number 6432088.

Licence terms
This document is licensed on and subject to the standard licence terms of CertiKit, available on request, or by download from our website. All other rights are reserved. Unless you have purchased this product you only have an evaluation licence. If this product was purchased, a full licence is granted to the person identified as the licensee in the relevant purchase order. The standard licence terms include special terms relating to any third-party copyright included in this document.

Disclaimer
Please Note: Your use of and reliance on this document template is at your sole risk. Document templates are intended to be used as a starting point only from which you will create your own document and to which you will apply all reasonable quality checks before use.

Therefore, please note that it is your responsibility to ensure that the content of any document you create that is based on our templates is correct and appropriate for your needs and complies with relevant laws in your country. You should take all reasonable and proper legal and other professional advice before using this document. CertiKit makes no claims, promises, or guarantees about the accuracy, completeness or adequacy of our document templates; assumes no duty of care to any person with respect to its document templates or their contents; and expressly excludes and disclaims liability for any cost, expense, loss or damage suffered or incurred in reliance on our document templates, or in expectation of our document templates meeting your needs, including (without limitation) as a result of misstatements, errors and omissions in their contents.


Problem Management Process

DOCUMENT REF: SMS-DOC-086-8
VERSION: 1
DATED: [Insert date]
DOCUMENT AUTHOR: [Insert name]
DOCUMENT OWNER: [Insert name/role]


Revision history
VERSION | DATE | REVISION AUTHOR | SUMMARY OF CHANGES

Distribution
NAME | TITLE

Approval
NAME | POSITION | SIGNATURE | DATE

Contents
1 Introduction
  1.1 Purpose
  1.2 Objectives
  1.3 Scope
2 Problem management process
  2.1 Process diagram
  2.2 Process triggers
  2.3 Process inputs
  2.4 Process narrative
    2.4.1 Problem detection
    2.4.2 Problem logging
    2.4.3 Categorisation
    2.4.4 Allocation of priority
    2.4.5 Investigation and diagnosis
    2.4.6 Workarounds
    2.4.7 Change requests
    2.4.8 Resolution and closure
    2.4.9 Major problem review
  2.5 Process outputs
  2.6 Process roles and responsibilities
    2.6.1 Problem manager
    2.6.2 Service desk analyst
    2.6.3 Technician
  2.7 RACI chart
3 Problem management tools
  3.1 Service desk system
  3.2 Problem analysis and investigation tools
  3.3 Email and collaboration tools
  3.4 Configuration management system
4 Communication and training
  4.1 Communication with users
  4.2 Communication with customers
  4.3 Communication with IT teams
  4.4 Communication with suppliers
  4.5 Process performance
  4.6 Communication related to changes
  4.7 Training for problem management
5 Interfaces and dependencies
6 Proactive problem identification and reporting
7 Conclusion

Figures
Figure 1: Problem management process

Tables
Table 1: Problem priority assessment
Table 2: Problem priority definitions
Table 3: RACI matrix
Table 4: Process interfaces and dependencies
Table 5: Process KPIs


1 Introduction

1.1 Purpose
In order to reduce the number and frequency of incidents and improve the level of service to users, it is essential that the causes of incidents are investigated and, through managed actions, permanently removed. Problem management has the potential to not only make services better but to reduce the support overhead of providing them, so minimising cost and maximising warranty. It is important therefore that it is carried out according to a clear, well designed process.

This document defines how the process of problem management is implemented within [Organization Name].

According to ISO/IEC 20000 “the purpose of the problem management process is to minimise service disruption”. This will have benefits for our organization in terms of fewer service outages, so improving service availability and increasing customer and user satisfaction. Effective problem management also means that fewer incidents will be logged at the service desk thus freeing up resources within [Organization Name] and increasing user productivity within the organization as a whole.

A Problem is defined by the ISO/IEC 20000 standard as a “cause of one or more actual or potential incidents”. In many cases the process of problem management will come into play when a series of incidents is logged which suggests a single unknown root cause affecting multiple users. In these circumstances an analysis of these incidents must be carried out in order to establish what the underlying cause is and what needs to be done to fix it. The resolution may then require one or more change requests to be raised. This process may also be done using historical data which highlights trends that in turn may suggest an underlying problem.

1.2 Objectives
The objectives of the problem management process are to:

• Proactively prevent incidents from occurring by identifying and fixing their root cause
• Minimise the impact of incidents that cannot be prevented by providing information about their causes and workarounds
• Define the way in which problems will be identified, logged, investigated, resolved and reported on so that consistency is achieved within the IT organization
• Ensure that the management and investigation of problems takes due account of business priorities and helps to maximise business productivity
• Foster an effective and efficient approach to the handling of problems that presents a positive image to the business and maintains user satisfaction
• Ensure that information about problems and their progress is communicated to the relevant parties in a timely and accurate manner at all times

1.3 Scope
The scope of this process is defined according to the following parameters:

• Organizational
  o [List organizations and parts of those organizations covered]
• Geographical
  o [List locations from which problems will be identified and managed]
• Services
  o [Define the services covered by the process]
• Technical
  o [If necessary, cover the technology that may give rise to problems managed via this process]

This process covers all problems identified by [Organization Name] in support of the customers and users of services defined in the service catalogue. The following areas are specifically excluded from this process: [Describe any areas that need to be clearly stated as outside the scope]

2 Problem management process

2.1 Process diagram

Figure 1: Problem management process


2.2 Process triggers
The problem management process is initiated from one or more of the following triggers:

• As a reaction to one or more incidents with similar symptoms occurring for which the cause is not currently known. This may be recognised by:
  o The Service Desk
  o Second-line support
  o Third-line support
  o Suppliers and contractors
  o Customers
  o Users
  o Other source or stakeholder
• From information provided by the service transition stage regarding problems that have not been resolved prior to live running e.g. bugs in software or issues with configuration items
• As a result of a proactive analysis of previous incidents or message logs carried out with the intention of identifying common factors and trends worth investigating

2.3 Process inputs
The process of problem management requires a number of inputs in order to be able to function effectively. These may not always be available but will ideally be:

• Details of incidents related to the problem, including
  o Number of incidents
  o Dates and times of incidents
  o Categorisations
  o Impacts
  o Symptoms
  o Actions carried out so far with results
• Configuration Management System (CMS) records for relevant CIs
• Technical and business input to investigation and diagnosis sessions
• Details of completion of requested changes from the change management process
• Feedback from incident management, users and other parties regarding whether the problem resolution has been successful
• Information from internal development teams and external suppliers regarding software and hardware problems that are known about but not yet fixed in the version in use within the organization


2.4 Process narrative
The following sections set out what happens at each stage of the process as depicted in the diagram above.

2.4.1 Problem detection
Problems may be identified from any source, including:

• Service Desk Analysts
• Suppliers
• Monitoring tools
• Customers
• Users
• Analysis of incident records

It is important that, once identified, problems are recorded so that effort can be allocated to resolving them.

2.4.2 Problem logging
Upon a potential problem being recognised, a problem record will be created within the service desk system and populated with the references of the related incidents and the details of the symptoms of the problem, including:

• Users and user groups affected
• Any relevant information about the timing of the problem
• Possible causes identified so far
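
The content of a problem record as described above could be modelled along the following lines. This is a minimal Python sketch for illustration only: the class name ProblemRecord, its field names, the link_incident helper and the reference formats (PRB-0001, INC-1001) are all hypothetical and are not taken from any particular service desk system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProblemRecord:
    """Hypothetical problem record holding the details listed in section 2.4.2."""

    problem_id: str                      # assumed reference format, e.g. "PRB-0001"
    symptoms: str                        # description of the symptoms of the problem
    related_incident_refs: list[str] = field(default_factory=list)
    affected_users: list[str] = field(default_factory=list)   # users and user groups
    timing_notes: str = ""               # relevant information about timing
    possible_causes: list[str] = field(default_factory=list)  # causes identified so far
    logged_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def link_incident(self, incident_ref: str) -> None:
        """Attach an incident reference once, ignoring duplicates."""
        if incident_ref not in self.related_incident_refs:
            self.related_incident_refs.append(incident_ref)


# Example usage with made-up references
record = ProblemRecord(problem_id="PRB-0001", symptoms="Intermittent login failures")
record.link_incident("INC-1001")
record.link_incident("INC-1001")  # duplicate is ignored
record.link_incident("INC-1007")
```

Keeping the incident references on the problem record, rather than copying incident details into it, is one way to preserve the link between the two record types.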

2.4.3 Categorisation
The problem will be classified using the same scheme as for incidents (or a simplified subset of it) and a member of staff will then be allocated to the analysis of the problem.

2.4.4 Allocation of priority
The priority of a problem will determine the order in which it is addressed by problem managers and subsequent teams involved in its investigation. This will be based on a combination of two factors:

• Impact: A measure of the effect of a problem on business processes
• Urgency: A measure of how quickly the business needs the problem to be fixed

The priority should also take into account the benefits that will be achieved if the problem is resolved (not all problems will be resolvable). These benefits may take a number of forms but the main questions to be asked will be:

• How much will business disruption be reduced? (e.g. no. of work-hours p.a.)
• What effect will this have on our customers?
• How many incidents will we prevent in a year?
• How much time will be saved in the IT team?
• What direct costs will we avoid?
• What effect will solving this problem have on staff morale?

These questions will allow a benefit profile to be created for the problem which will indicate how much effort it makes sense to put in to get it solved. Both impact and urgency will be assessed on a scale of high, medium and low. The priority of a problem will then be calculated based on the rating of its urgency and impact as follows:

IMPACT/URGENCY | HIGH | MEDIUM | LOW
High | 1 | 2 | 3
Medium | 2 | 3 | 4
Low | 3 | 4 | 5

Table 1: Problem priority assessment

The priority of a problem will be calculated automatically by the service desk system based on the above rules. The definitions of each priority level are as follows:

PRIORITY | TITLE | DESCRIPTION
1 | Critical | Significant delay or disruption to the business until the problem is fixed
2 | High | Significant delay or disruption to parts of the business until the problem is fixed
3 | Medium | Localised delay or disruption affecting one or more users
4 | Low | Localised inconvenience affecting a single user
5 | Planning | Very minor inconvenience or non-urgent problem

Table 2: Problem priority definitions


There may be circumstances where a problem affecting a single user has a significant business impact, particularly if the user is a member of the senior management team or a high-value financial transaction is involved. The priority should therefore be set in consultation with the user.
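
The priority rules in Table 1 and the titles in Table 2 can be expressed as a simple lookup. The following is a minimal Python sketch under stated assumptions: the names PRIORITY_MATRIX, PRIORITY_TITLES and problem_priority are hypothetical and do not represent the behaviour of any particular service desk system, and the rows of Table 1 are read as impact and the columns as urgency.

```python
# Keys are (impact, urgency); values are the priority levels from Table 1.
PRIORITY_MATRIX = {
    ("high", "high"): 1, ("high", "medium"): 2, ("high", "low"): 3,
    ("medium", "high"): 2, ("medium", "medium"): 3, ("medium", "low"): 4,
    ("low", "high"): 3, ("low", "medium"): 4, ("low", "low"): 5,
}

# Priority titles from Table 2.
PRIORITY_TITLES = {1: "Critical", 2: "High", 3: "Medium", 4: "Low", 5: "Planning"}


def problem_priority(impact: str, urgency: str) -> tuple[int, str]:
    """Return (priority level, title) for the given impact and urgency ratings."""
    key = (impact.strip().lower(), urgency.strip().lower())
    if key not in PRIORITY_MATRIX:
        raise ValueError(f"impact and urgency must be high, medium or low, got {key}")
    level = PRIORITY_MATRIX[key]
    return level, PRIORITY_TITLES[level]


# Example usage
level, title = problem_priority("High", "Low")
print(level, title)
```

A calculated value of this kind is a starting point only; as noted above, the priority may still need to be adjusted in consultation with the user.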

2.4.5 Investigation and diagnosis
Once a problem has been logged, all activities performed with respect to that problem should be recorded as actions e.g. adding notes, referring to supplier. Where appropriate, one or more of the following root cause analysis techniques will be used:

• Chronological Analysis
• Pain Value Analysis
• Kepner and Tregoe
• Brainstorming
• Ishikawa Diagrams
• Pareto Analysis

Where the root cause of the problem cannot be identified initially, the problem may be assigned to an appropriate second-line resolver group. This may be internal or external to [Organization Name].

If the current resolver group cannot resolve the problem, the technician may opt to escalate it further e.g. to an external supplier. In this case the problem remains with the internal resolver group and it is the technician’s responsibility to ensure that the problem is updated on a regular basis based on feedback from the external supplier.

Once investigations have been completed and the root cause of the problem is diagnosed (or before this point if useful), a known error record will be created.

2.4.6 Workarounds
Any workarounds found which reduce or eliminate the symptoms of the problem temporarily should be recorded in the problem record and made available to the Service Desk. A knowledgebase will be maintained within the service desk system into which known errors, i.e. diagnosed problems which are yet to be resolved, will be placed.

2.4.7 Change requests
Where a change to the live environment is required in order to fix the problem, a change request must be raised in accordance with change management procedures.

2.4.8 Resolution and closure
Once the problem has been diagnosed and resolved, it may be closed. In some circumstances it may be decided to close the problem without it being resolved e.g. if the cost of resolving it is prohibitive or the service involved is about to be replaced or retired. In this case the reasons should be documented in the problem record.

2.4.9 Major problem review
In the case of major problems which have had a significant impact upon service to users, a problem review will be carried out to identify lessons learned. The report produced will be made available to interested parties and any recommendations will be input to the service improvement plan.

2.5 Process outputs
The outputs of the problem management process will be the following:

• Closed problems
• Complete and accurate problem records
• Feedback from customers and users regarding levels of satisfaction
• Communication and feedback to other service management processes such as availability management, capacity management and change management
• Reports to management regarding problem volumes, impacts, resolution success rates and process effectiveness

2.6 Process roles and responsibilities
The following roles have the stated responsibilities within the problem management process.

2.6.1 Problem manager
• Owner of the problem management process
• Responsible for identifying improvements to the process and ensuring it is adequately resourced
• Provides information regarding the success rates of the process
• Runs the process on a day-to-day basis
• Ensures that major problem reviews are carried out

2.6.2 Service desk analyst
• Assists with the detection of possible problems
• May log a problem within the service desk system
• Uses known error information when dealing with incidents

2.6.3 Technician
• Participates in the technical investigation of the problem, under the guidance of the Problem Manager
• May be involved in the implementation of changes to resolve the problem


2.7 RACI chart
The table below clarifies the responsibilities at each step using the RACI system, i.e.:

• R: Responsible
• A: Accountable
• C: Consulted
• I: Informed

STEP | SERVICE DESK ANALYST | PROBLEM MANAGER | TECHNICIAN | CHANGE MANAGER
Problem Detection | R | A/R | - | -
Problem Logging | R | A/R | - | -
Categorisation | - | A/R | - | -
Allocation of Priority | I | A/R | C | -
Investigation and Diagnosis | C | A | R | -
Workarounds | I | A | R | -
Change Requests | - | A | R | I
Resolution and Closure | I | A | R | I
Major Problem Review | C | A/R | C | -

Table 3: RACI matrix


3 Problem management tools
There are a number of key software tools that underpin an effective problem management process. These are subject to change as requirements and technology are updated and so specific systems are not described here. However, the main types of tools that play a significant part in the process within [Organization Name] are as follows.

3.1 Service desk system
The service desk system provides the workflow engine and database to implement the core activities within problem management. These include:

• Problem logging
• Routing and assignment of problems to teams and individuals
• Recording of actions against problems
• Updating of problem status from open through to closed
• Assessment of impact and urgency and auto-calculation of priority
• Email (and other forms of) communication with users from within problem records
• Problem categorisation to multiple levels
• Reporting
• Knowledgebase of past incidents with search capability
• Known error database

The service desk system is integrated with the systems that support various other processes, including incident, change and configuration management.

3.2 Problem analysis and investigation tools
There are various root cause analysis techniques that may be used during the different stages of the investigation of a problem. Some of these, such as Pareto Analysis and Ishikawa Diagrams, may be supported by tools implemented using spreadsheets and mapping software.
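
As an illustration of how a Pareto Analysis might be supported by a simple tool, the following Python sketch ranks incident categories by volume and selects those that together account for a chosen share of all incidents. It is a sketch under stated assumptions rather than a prescribed method: the function name pareto_categories, the 80% threshold and the sample category names are all hypothetical.

```python
from collections import Counter


def pareto_categories(incident_categories, threshold=0.8):
    """Return the most frequent categories that together reach the threshold share.

    Each result is a tuple of (category, incident count, cumulative share).
    """
    counts = Counter(incident_categories)
    total = sum(counts.values())
    selected = []
    cumulative = 0
    # Walk the categories from most to least frequent.
    for category, count in counts.most_common():
        if cumulative / total >= threshold:
            break  # the categories already selected cover the threshold
        cumulative += count
        selected.append((category, count, cumulative / total))
    return selected


# Example usage with made-up incident categories
sample = ["network"] * 6 + ["email"] * 2 + ["printing"] + ["login"]
for category, count, share in pareto_categories(sample):
    print(f"{category}: {count} incidents, cumulative share {share:.0%}")
```

The categories returned are candidates for further investigation only; whether any of them indicates an underlying problem still needs to be assessed by the problem manager.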

3.3 Email and collaboration tools
The email system, and other collaboration tools where available, are key to communication between the problem management team and other involved groups such as users and suppliers.

3.4 Configuration management system
The CMS provides real-time information about the hardware and software within the IT environment and allows problem management to view any changes that have been implemented on key components that are under consideration with regard to a problem. It allows the installed software and its versions to be viewed without the need to access the user’s computer remotely as well as helping problem management understand the relationships between service components.


4 Communication and training
There are various forms of communication that must take place for the problem management process to be effective. These are described below.

4.1 Communication with users
It is likely that many of the incidents that give rise to the identification of a problem are reported by users. If such incidents cannot be closed via the use of a workaround then it will be appropriate to keep these users informed about the progress of the investigation of the problem. In the event that such incidents can be closed but recur on a regular basis then users will still want to be kept informed about when the underlying problem will be fixed and the frequent incidents can be expected to cease.

Emails and other communications that are exchanged with the user should be incorporated into the problem record so that a full audit trail of all communication is kept and is available to whoever is working on the problem.

It may be appropriate to invite selected users to sessions organized to investigate problems via the various techniques available. Users who have first-hand knowledge of the symptoms and circumstances of a problem can provide valuable insight into its causes and may speed up its resolution.

4.2 Communication with customers
Even where there is no formal SLA associated with the resolution of problems, customers should be kept informed about the progress of high priority problems affecting their business area, including what is being done to resolve them and the resources dedicated to their investigation.

4.3 Communication with IT teams
Problem management needs the support of technical specialists to identify and resolve sometimes complex problems for the benefit of the business and often the IT team itself. The problem manager will foster close relationships with key teams within the IT organization so that the benefits of effective problem management are understood and demonstrated. IT specialists will be involved in investigative sessions and are likely to be key contributors to the use of techniques such as chronological analysis and fault isolation.

4.4 Communication with suppliers
Often the input of suppliers will be critical to diagnose, test and resolve difficult problems. Their knowledge of the products and services they supply will usually exceed that available in-house and sometimes access to the developers of products may be needed to determine a resolution. The internal supplier manager for the third party involved should be kept informed of the ongoing communication between problem management and supplier staff and may be useful in securing additional resource to speed up investigations.

4.5 Process performance
It is important that the performance of the problem management process is monitored and reported upon on a regular basis in order to assess whether the process is operating as expected. The content of performance reports is set out in section 6 of this document, but it is vital that the reports are not only produced but are also communicated to the appropriate audience. This will include the customers of the IT service and the management of IT concerning resource utilisation and allocation. Depending on the health of the process it may be appropriate to hold regular meetings with customers and IT management to discuss the performance and agree any actions to improve it.

4.6 Communication related to changes
The problem management process manager must have visibility of the change management schedule and ideally will be briefed on any changes with the potential to affect ongoing problems. This may be a regular meeting or carried out on an ad-hoc basis according to the frequency of occurrence of such changes. Problem management will also communicate with change management as part of the logging of changes to resolve problems and the review of these after the event.

4.7 Training for problem management
In addition to a well-defined process and appropriate software tools it is essential that the people aspects of problem management are adequately addressed. The process requires that training be provided to all participants in order that it runs as smoothly as possible. The main areas in which training will be required for problem management are as follows.

• The problem management process itself, including the activities, roles and responsibilities involved
• Problem management software tools such as the service desk system and configuration management system
• Specific problem investigation techniques such as Kepner-Tregoe, 5-Whys and Affinity Mapping
• Soft skills such as customer service, dealing with difficult conversations and avoiding technical jargon
• The basics of the technology and how it is implemented within [Organization Name]
• The business, its structure, locations, priorities and people

In addition, training should be provided to the user population regarding how to identify and report a problem, including:

• The difference between an incident, a service request, a problem and a change request and how they are handled
• How to report a problem via the various means available
• What may be expected of them as part of problem investigation

This training may be provided via short workshops and supplemented by on-demand resources such as videos and user guides.


5 Interfaces and dependencies
The problem management process has a number of interfaces and dependencies with other processes within service management and the business. These are outlined here and are described in further detail in the relevant procedural documentation.

| PROCESS | INPUTS TO PROBLEM MANAGEMENT FROM THE NAMED PROCESS | OUTPUTS FROM PROBLEM MANAGEMENT TO THE NAMED PROCESS |
|---|---|---|
| Budgeting and Accounting for IT Services | Cost information to help assess the relative priority of problems; Costings of hardware and software components to be used to resolve problems | Cost of proposed problem resolutions for input to budget cycle |
| Service Level Management | Service Level Agreements to determine impact of problems | Problem status information for inclusion in service level reports |
| Availability Management | Areas in which availability needs to be improved | Resolved problems to improve availability |
| Capacity Management | Performance information as part of investigation of capacity issues | Resolved performance problems |
| IT Service Continuity Management | Details of service continuity plans for options appraisal | Invocations of service continuity plans in the event of major problems |
| Configuration Management | Configuration Management System (CMS) records | Linking of problems to CIs |
| Change Management | Information about changes that may have affected existing problems or created new ones | Changes raised to resolve problems |
| Release and Deployment Management | Known errors with new releases; Release schedules for changes to fix problems | Information regarding priority of problems for which fixes are included in planned releases |
| Incident Management | Problems raised as a result of one or more incidents; Ongoing information about incidents related to problems | Resolution of problems leading to closure of open incidents; Workarounds; Known errors |

Table 4: Process interfaces and dependencies


6 Proactive problem identification and reporting
On a quarterly basis, to coincide with the production of the IT service report, an analysis of logged incidents will be performed to identify areas in which possible problems exist. Based on this analysis, problem records will be raised and any required actions (e.g. user training) will be agreed to address them. The reports produced for analysis will include (but are not limited to):

• Number of incidents by category (e.g. problems with MS Word)
• Number of incidents by user
• Number of incidents by business team
• Number of incidents by priority
• Number of incidents by configuration item
• Daily distribution of incidents across the quarter
• Hourly distribution of incidents across the quarter
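As an illustration only, the quarterly incident analysis could be sketched along the following lines. This is a minimal sketch that assumes incident records can be exported from the service desk system as simple records; the field names (category, user, priority, logged), the sample data and the threshold are hypothetical and are not part of this process or of any particular tool.

```python
from collections import Counter
from datetime import datetime

# Hypothetical incident export; field names and values are assumptions
# for illustration, not a real service desk schema.
incidents = [
    {"id": "INC001", "category": "MS Word", "user": "asmith", "priority": 3,
     "logged": datetime(2024, 1, 8, 9, 15)},
    {"id": "INC002", "category": "MS Word", "user": "bjones", "priority": 3,
     "logged": datetime(2024, 1, 8, 9, 40)},
    {"id": "INC003", "category": "Email", "user": "asmith", "priority": 2,
     "logged": datetime(2024, 1, 9, 14, 5)},
    {"id": "INC004", "category": "MS Word", "user": "cpatel", "priority": 4,
     "logged": datetime(2024, 2, 12, 9, 55)},
    {"id": "INC005", "category": "Printing", "user": "bjones", "priority": 4,
     "logged": datetime(2024, 3, 4, 16, 30)},
]


def count_by(records, field):
    """Count incidents grouped by one field (category, user, priority, ...)."""
    return Counter(record[field] for record in records)


def daily_distribution(records):
    """Count incidents per calendar day across the reporting period."""
    return Counter(record["logged"].date() for record in records)


def hourly_distribution(records):
    """Count incidents per hour of the day across the reporting period."""
    return Counter(record["logged"].hour for record in records)


def candidate_problems(records, field, threshold):
    """Return the values of `field` with at least `threshold` incidents.

    The threshold is an arbitrary illustrative cut-off; each result is only
    a prompt for a person to consider raising a problem record.
    """
    counts = count_by(records, field)
    return sorted(value for value, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    print(count_by(incidents, "category").most_common())
    print(sorted(hourly_distribution(incidents).items()))
    print(candidate_problems(incidents, "category", threshold=3))
```

The same count_by helper covers the by-user, by-team, by-priority and by-configuration-item reports by changing the field name, provided the export contains those fields.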

Figures will also be reported in historical context to identify trends. The following KPIs will also be used on a regular basis to evidence the successful operation of the problem management process:

| KPI REF | KEY PERFORMANCE INDICATOR |
|---|---|
| KPI1 | Number of problems resolved per month |
| KPI2 | Number of incidents prevented per month |
| KPI3 | Number of workarounds identified |
| KPI4 | Mean time to resolve problems by problem type |
| KPI5 | Number of major problems raised |
| KPI6 | Number of hours service lost due to identified problems |
| KPI7 | User satisfaction scores from user surveys |
| KPI8 | Customer satisfaction scores from customer surveys |
| KPI9 | Number of complaints about the problem management process |
| KPI10 | Number of open problems |
| KPI11 | Staff to problem ratio |
| KPI12 | Average cost per resolved problem |

Table 5: Process KPIs
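Some of these KPIs can be derived directly from problem records. The following is a minimal sketch of how KPI1, KPI4 and KPI10 might be calculated, assuming each problem record carries a type, an opened timestamp and a resolved timestamp (empty while the problem is open). The record layout and sample data are hypothetical, and measuring resolution time in calendar days is an assumption rather than a requirement of this process.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical problem records; the field names are assumptions for
# illustration. "resolved" is None while the problem is still open.
problems = [
    {"id": "PRB001", "type": "Software",
     "opened": datetime(2024, 1, 2), "resolved": datetime(2024, 1, 12)},
    {"id": "PRB002", "type": "Software",
     "opened": datetime(2024, 1, 10), "resolved": datetime(2024, 1, 30)},
    {"id": "PRB003", "type": "Hardware",
     "opened": datetime(2024, 1, 15), "resolved": datetime(2024, 2, 4)},
    {"id": "PRB004", "type": "Network",
     "opened": datetime(2024, 2, 1), "resolved": None},
]


def resolved_per_month(records):
    """KPI1: number of problems resolved, keyed by 'YYYY-MM'."""
    counts = defaultdict(int)
    for record in records:
        if record["resolved"] is not None:
            counts[record["resolved"].strftime("%Y-%m")] += 1
    return dict(counts)


def mean_days_to_resolve_by_type(records):
    """KPI4: mean elapsed calendar days from opened to resolved, per type."""
    durations = defaultdict(list)
    for record in records:
        if record["resolved"] is not None:
            elapsed = record["resolved"] - record["opened"]
            durations[record["type"]].append(elapsed.total_seconds() / 86400)
    return {ptype: mean(days) for ptype, days in durations.items()}


def open_problem_count(records):
    """KPI10: number of problems without a resolution timestamp."""
    return sum(1 for record in records if record["resolved"] is None)


if __name__ == "__main__":
    print(resolved_per_month(problems))
    print(mean_days_to_resolve_by_type(problems))
    print(open_problem_count(problems))
```

KPIs that depend on survey results, cost data or staffing figures (for example KPI7, KPI8, KPI11 and KPI12) would need inputs from outside the problem records and are not covered by this sketch.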


7 Conclusion
Problem management is a key component of the [Organization Name] service management quality system. The combination of disciplined reactive problem management and analytical proactive problem management should deliver significant benefits, improving both the quality of the service provided and users’ ability to make best use of the available IT services.
