OnPage incident management

Enterprise incident management

At its core, enterprise incident management is the practice of handling incidents in a consistent way across a team or an entire business. IT often faces incidents that have the potential to disrupt or derail the team or the company. To handle these critical incidents successfully, teams need a deliberate process in place for resolving issues quickly.

Enterprise incident management brings processes such as ticketing, alerting, escalation, reporting and documentation to the company as a whole, so that incidents are handled through a sustainable and repeatable process.

Purpose of Enterprise Incident Management

The main goal of the enterprise incident management process is to restore normal service operations to the enterprise as quickly as possible. By doing so, the company minimizes the adverse impact of outages on the business and ensures that the optimal level of service quality is maintained. This optimal level is the level of service operation defined within the service level agreement (SLA) limits.

What is the enterprise incident management process?

1. Create a service level agreement (SLA)

The company creates an SLA with its customer that defines incident priorities, escalation paths and response times.

2. An incident is identified and logged

When an incident occurs, it is identified and logged in a ticketing system so a record is kept. Ideally, the ticket will be updated along the way as the team works to resolve the issue.

3. Templates are used to categorize the issue.

The ticket is categorized according to type. For example, the ticket might be defined as a server issue or a networking issue.

4. The issue is prioritized based on severity and impact on the business.

High-priority issues are handled before other issues because of their significant financial impact on the business. Low-priority issues have minimal financial impact and are therefore typically resolved after high- and medium-priority issues.

5. The issue is escalated if more technical expertise is required

If the team that receives the alert cannot resolve the issue on its own, or needs assistance from other groups in the organization, the issue is escalated to bring in further expertise.

6. Investigation and diagnosis of the issue

By using messaging between team members and runbooks, IT professionals are able to rapidly investigate and diagnose the issue that is impeding the proper functioning of the company’s technology.

7. Resolution and recovery

Once the issue has been diagnosed, it can be resolved and service can return to its expected level of performance.

8. Incident closure

The incident is closed. This typically happens through the ticketing system.

9. Customer survey or internal post mortem.

By bringing in a step for reflection on the process, teams are able to review the processes and steps they took to resolve the issue and see what can be done better next time.

Each of these steps is important in creating a clear incident management process. Skipping steps in an attempt to resolve the issue more quickly can easily overwhelm IT teams and put SLAs at risk.
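Steps 1, 2, 4 and 5 above can be sketched in a few lines of Python. This is only an illustrative sketch, not OnPage's implementation or any ticketing system's API: the `Incident` class, the `SLA_RESPONSE` targets, the cost thresholds in `prioritize`, and the idea of escalating automatically once the SLA response window has passed are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import IntEnum
from typing import List, Optional


class Priority(IntEnum):
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Step 1: hypothetical SLA response targets per priority (values are made up).
SLA_RESPONSE = {
    Priority.HIGH: timedelta(minutes=15),
    Priority.MEDIUM: timedelta(hours=1),
    Priority.LOW: timedelta(hours=8),
}


@dataclass
class Incident:
    title: str
    category: str                # step 3, e.g. "server" or "network"
    priority: Priority           # step 4
    opened_at: datetime          # step 2: when the incident was logged
    escalation_level: int = 0    # step 5
    closed_at: Optional[datetime] = None
    log: List[str] = field(default_factory=list)


def prioritize(hourly_cost: float) -> Priority:
    """Step 4: map estimated financial impact to a priority (assumed thresholds)."""
    if hourly_cost >= 10_000:
        return Priority.HIGH
    if hourly_cost >= 1_000:
        return Priority.MEDIUM
    return Priority.LOW


def work_queue(incidents: List[Incident]) -> List[Incident]:
    """Open incidents, highest priority first, oldest first within a priority."""
    open_items = [i for i in incidents if i.closed_at is None]
    return sorted(open_items, key=lambda i: (i.priority, i.opened_at))


def escalate_if_overdue(incident: Incident, now: datetime) -> bool:
    """Step 5 (assumed trigger): escalate an open incident past its SLA window."""
    overdue = now - incident.opened_at > SLA_RESPONSE[incident.priority]
    if incident.closed_at is None and overdue:
        incident.escalation_level += 1
        incident.log.append(
            f"{now.isoformat()} escalated to level {incident.escalation_level}"
        )
        return True
    return False
```

In practice the escalation trigger would more often be a human decision that more expertise is needed, as step 5 describes; the timer here simply shows how SLA response times can feed into that decision.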

OnPage is the perfect tool for Enterprise Incident Management

OnPage can be implemented enterprise-wide. Consolidate all enterprise alerts onto one incident management system hosted in secure, SSAE-16 compliant hosting facilities across the USA. Handle enterprise-wide communication through the built-in team messaging.

  • Fragmented teams are no longer a problem! The intuitive built-in messaging allows the full ticket details to be forwarded. Get full event visibility!
  • Add notes, a conference bridge number, attachments and predefined message templates to the event alert.
  • OnPage “Alert-Until-Read” ensures that critical alerts are never missed.
  • Follow the audit-trail to ensure a notification was read and replied to.
  • The fault-proof scheduler defaults to “always full”: if a person is removed from an on-call shift by mistake with no replacement, the entire team is alerted so that the alert is still delivered.
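The “always full” behaviour in the last bullet amounts to a simple fallback rule. The sketch below illustrates that rule only; it is not OnPage's scheduler, and the function name `recipients_for`, the shift names and the data layout are hypothetical.

```python
from typing import Dict, List


def recipients_for(shift: str, schedule: Dict[str, List[str]], team: List[str]) -> List[str]:
    """Return who to alert for a shift, falling back to the whole team if it is empty."""
    # Ignore anyone listed on the shift who is no longer on the team.
    on_call = [person for person in schedule.get(shift, []) if person in team]
    # "Always full": an empty or missing shift never leaves an alert undelivered.
    return on_call if on_call else list(team)
```

For example, with a team of three and a shift whose only member was removed by mistake, `recipients_for` returns all three team members rather than an empty list.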

OnPage provides powerful integrations with mission-critical systems through the industry’s easiest integration framework.

 

Related Links

OnPage has written several relevant white papers that can help you understand the complexities of an effective IT on-call policy.

WHITE PAPER [IT]: How To Survive Being On-Call

WHITE PAPER [IT]: Mastering On-Call Scheduling

WHITE PAPER [IT]: 5 Ways To Conquer Alert Fatigue

Other terms you might want to explore

On-call management

On-call scheduling

Escalation management

DevOps management

Manage IT on-call
