An AIOPS assessment provides a framework for identifying automation candidates by working with individual operations teams to understand where they spend the most time. By collecting process-specific data and operational data (tickets, events, and logs), you can apply ML models to evaluate potential improvement targets. For example, you might investigate how event clustering would affect your current incident volumes. The deliverables might look like the following:
When outlining the strategy, the project team should identify a set of key guiding principles that establish a strong AIOPS foundation and aligned automation goals, for example:
Along with the business-level guiding principles, the responsible team should collaborate to develop a set of design principles for each functional area into which AIOPS will be integrated. The following are some examples of design principles:
GENERAL
FAULT/EVENT MANAGEMENT
PERFORMANCE MANAGEMENT
CONFIGURATION MANAGEMENT
INCIDENT MANAGEMENT
CHANGE MANAGEMENT
PROBLEM MANAGEMENT
OPERATIONAL KNOWLEDGE MANAGEMENT
RUNBOOK AUTOMATION