
6.3 Basic concepts

The availability of relevant and easily applied advice in the early stages of an Incident is central to an organisation's ability to resolve Incidents effectively; very few Incidents that are received at the Service Desk are new or mysterious to the support staff. Similarly, specialists within second-line or third-line support staff will have already resolved many difficult and 'original' Incidents and Problems. The best use of the resources expended on these resolutions is to document them in such a way that frontline staff can apply them.

The Problem Management process is intended to reduce both the number and the severity of Incidents and Problems affecting the business. Therefore, part of Problem Management's responsibility is to ensure that information from previous resolutions is documented in such a way that it is readily available to first-line and other second-line staff. This is not simply a matter of producing documentation. What is required includes:

It is common to make use of 'expert system' software to facilitate the Problem Management process. However, it is important that such software actually contains expert knowledge and is kept up to date with feedback from the staff who use it.

Problems and Known Errors can be identified by:

A Problem is a condition often identified as a result of multiple Incidents that exhibit common symptoms. Problems can also be identified from a single significant Incident, indicative of a single error, for which the cause is unknown, but for which the impact is significant.
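As a rough illustration of the first case, the sketch below groups Incident records by a shared symptom and flags any symptom that recurs as a candidate Problem. The record layout (an identifier paired with a symptom string), the function name `candidate_problems`, and the threshold of three Incidents are assumptions made for this example only; real support tools would match symptoms far less literally.

```python
from collections import defaultdict


def candidate_problems(incidents, min_incidents=3):
    """Group Incidents by symptom and return the symptoms that recur.

    `incidents` is an iterable of (incident_id, symptom) pairs.
    `min_incidents` is a hypothetical threshold chosen for illustration.
    """
    by_symptom = defaultdict(list)
    for incident_id, symptom in incidents:
        by_symptom[symptom].append(incident_id)
    # Keep only symptoms shared by enough Incidents to suggest a common cause.
    return {
        symptom: ids
        for symptom, ids in by_symptom.items()
        if len(ids) >= min_incidents
    }


# Hypothetical Incident log used only to exercise the function.
incidents = [
    ("INC-1", "login timeout"),
    ("INC-2", "printer offline"),
    ("INC-3", "login timeout"),
    ("INC-4", "login timeout"),
]
flagged = candidate_problems(incidents)
```

A single significant Incident would not be caught by a count-based rule like this one; it would need to be raised as a Problem on the basis of its impact.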

A Known Error is a condition identified by successful diagnosis of the root cause of a Problem, and the subsequent development of a Work-around.

Structural analysis of the IT infrastructure, reports generated from support software, and User-group meetings can also result in the identification of Problems and Known Errors. This is proactive Problem Management.

Problem control focuses on transforming Problems into Known Errors. Error control focuses on resolving Known Errors structurally through the Change Management process.
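The relationship between the two activities can be sketched as a small state model: Problem control moves a record from Problem to Known Error, and error control moves it from Known Error to resolved once a Change has been implemented. The class, field and method names below (`ProblemRecord`, `diagnose`, `resolve`, `change_ref`) are hypothetical and stand in for whatever a real support tool provides.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PROBLEM = "problem"          # cause not yet known
    KNOWN_ERROR = "known error"  # root cause diagnosed, Work-around developed
    RESOLVED = "resolved"        # eliminated by an implemented Change


@dataclass
class ProblemRecord:
    summary: str
    status: Status = Status.PROBLEM
    root_cause: Optional[str] = None
    workaround: Optional[str] = None
    change_ref: Optional[str] = None  # hypothetical Change identifier

    def diagnose(self, root_cause: str, workaround: str) -> None:
        """Problem control: turn a Problem into a Known Error."""
        if self.status is not Status.PROBLEM:
            raise ValueError("only an open Problem can be diagnosed")
        self.root_cause = root_cause
        self.workaround = workaround
        self.status = Status.KNOWN_ERROR

    def resolve(self, change_ref: str) -> None:
        """Error control: close a Known Error through a Change."""
        if self.status is not Status.KNOWN_ERROR:
            raise ValueError("only a Known Error can be resolved by a Change")
        self.change_ref = change_ref
        self.status = Status.RESOLVED


# Hypothetical walk-through of one record.
record = ProblemRecord("Intermittent login timeouts")
record.diagnose("Connection pool too small", "Restart the login service")
record.resolve("CHG-42")
```

Rejecting out-of-order transitions reflects the separation described above: a record cannot be closed through a Change until its root cause has been diagnosed.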

6.3.1 What is the difference between Incident Management and Problem Management?

Problem Management differs from Incident Management in that its main goal is to detect the underlying causes of Incidents and then to resolve and prevent them. In many situations this goal can conflict directly with the goals of Incident Management, where the aim is to restore service to the Customer as quickly as possible, often through a Work-around rather than a permanent resolution. A permanent resolution might involve, for example, searching for structural improvements in the IT infrastructure in order to prevent as many future Incidents as possible. For Problem Management, therefore, the speed with which a resolution is found is only of secondary (albeit still significant) importance. Investigating the underlying Problem can take some time and can thus delay the restoration of service, causing downtime but preventing recurrence.

6.3.2 Problem control

The Problem control process is concerned with handling Problems in an efficient and effective way. The aim of Problem control is to identify the root cause, such as the CIs that are at fault, and to provide the Service Desk with information and advice on Work-arounds when available.

The Problem control process is very similar to the Incident control process and highly dependent on its quality. Incident control focuses on resolving Incidents and on providing Work-arounds and temporary fixes for specific Incidents. If a Problem is identified for an Incident or a group of Incidents, available Work-arounds and temporary fixes are recorded in the Problem record by the Problem control process. Problem control also advises on the best Work-around available for the Problem.

Because Problem control is concerned with preventing the recurrence of Incidents, the process should be subject to an approach that is carefully managed and planned. The degree of management and planning required is greater than that needed for Incident control, where the objective is restoration of normal service as quickly as possible. Priority should be given to the resolution of Problems that can cause serious business disruption.

Activities recognised in Problem control are:

6.3.3 Error control

Error control covers the processes involved in progressing Known Errors until they are eliminated by the successful implementation of a Change under the control of the Change Management process. The objective of error control is to be aware of errors, to monitor them and to eliminate them when feasible and cost-justifiable.

Error control bridges the development (including applications development, enhancement and maintenance) and live environments. Software errors introduced during the development phase can affect live operations; therefore, Known Errors identified in the development or maintenance environment should be handed over to the live environment.

Activities recognised in error control are:

In practice, each of these processes of Problem Management requires careful management and control. Different operational objectives apply during each of these control processes.

6.3.4 Proactive Problem Management

Proactive Problem Management covers the activities aimed at identifying and resolving Problems before Incidents occur. These activities are:

By redirecting its efforts from reacting to large numbers of Incidents to preventing Incidents, an organisation provides a better service to its Customers and makes more effective use of the available resources within the IT support organisation.

6.3.5 Completion of major Problem reviews

Feedback from these reviews is a major contributor to the continual process of improvement.
