This curriculum is equivalent in scope to a multi-workshop program. It covers the integration of cloud computing into core ITSM processes such as incident management, change control, and service continuity, focusing on the operational workflows of organisations that manage hybrid and multi-cloud environments.
Module 1: Strategic Alignment of Cloud Services with ITSM Frameworks
- Decide which ITIL processes (e.g., Incident, Change, Configuration Management) require direct integration with cloud provider APIs for real-time data synchronization.
- Assess whether existing service catalogs need restructuring to reflect cloud-native service offerings such as auto-scaling groups or serverless functions.
- Map cloud service ownership across ITIL-defined roles (e.g., Service Owner, Process Owner) to clarify accountability in hybrid environments.
- Define thresholds for service level agreements (SLAs) with cloud providers that align with internal ITSM performance metrics and reporting cycles.
- Establish criteria for evaluating cloud service adoption against business service continuity requirements within the Continual Service Improvement (CSI) model.
- Integrate cloud cost visibility into service portfolio management to ensure financial governance aligns with service value reporting.
Module 2: Cloud Integration with Service Catalog and CMDB
- Implement automated discovery mechanisms to populate Configuration Items (CIs) for ephemeral cloud resources such as containers and spot instances.
- Design CMDB reconciliation workflows that handle dynamic cloud resource lifecycles without creating stale or duplicate CIs.
- Select attribute sets for cloud-based CIs that support impact analysis across interdependent services in multi-cloud architectures.
- Configure API-based integrations between cloud configuration management tools (e.g., AWS Config, Azure Resource Graph) and the CMDB.
- Enforce tagging standards at provisioning time to ensure consistent metadata for cost allocation, ownership, and compliance tracking.
- Develop audit procedures to validate CMDB accuracy for cloud resources during compliance or internal control reviews.
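The reconciliation and tagging items in this module can be illustrated with a minimal sketch. It assumes discovery results and CMDB records are already loaded into memory; the `Resource` type, the `REQUIRED_TAGS` set, and the function names are hypothetical and not taken from any particular CMDB product or cloud SDK.

```python
from dataclasses import dataclass, field

# Assumed tagging standard; a real one comes from the organisation's policy.
REQUIRED_TAGS = {"owner", "cost-center", "service"}


@dataclass
class Resource:
    resource_id: str          # provider-assigned ID, used as the reconciliation key
    resource_type: str
    tags: dict = field(default_factory=dict)


def missing_tags(resource):
    """Return the required tag keys that the resource does not carry."""
    return REQUIRED_TAGS - set(resource.tags)


def reconcile(discovered, cmdb):
    """Compare discovered resources with CMDB records (dict of id -> Resource).

    Returns the IDs to create, update, or retire. Keying on the provider ID
    collapses duplicate discovery results, and anything in the CMDB that was
    not discovered is marked for retirement rather than left stale.
    """
    seen = {r.resource_id: r for r in discovered}
    plan = {"create": [], "update": [], "retire": []}
    for rid, res in seen.items():
        if rid not in cmdb:
            plan["create"].append(rid)
        elif cmdb[rid] != res:
            plan["update"].append(rid)
    plan["retire"] = [rid for rid in cmdb if rid not in seen]
    for ids in plan.values():
        ids.sort()
    return plan
```

In practice, retirement would usually be delayed by a grace period so that a single missed discovery run does not remove a live CI; that policy is left out here for brevity.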
Module 3: Incident and Problem Management in Cloud Environments
- Configure event routing from cloud monitoring tools (e.g., CloudWatch, Azure Monitor) to the ITSM incident management system using normalized event formats.
- Define escalation paths for incidents involving shared responsibility where root cause spans internal applications and cloud infrastructure.
- Implement automated incident classification based on resource tags, service criticality, and cloud region to streamline triage.
- Adapt problem management workflows to address recurring issues in auto-scaling or configuration drift scenarios.
- Integrate postmortem findings from cloud outages into the knowledge base with actionable remediation steps and ownership assignments.
- Establish thresholds for auto-closure of transient cloud events that do not require manual intervention.
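As an illustration of tag-based classification and auto-closure, the sketch below works on an event that has already been normalized from a monitoring tool. The `Event` fields, the priority mapping, the default assignment group, and the 120-second transient threshold are all assumptions chosen for the example, not values prescribed by any ITSM tool.

```python
from dataclasses import dataclass

# Assumed mapping from a "criticality" tag to a base priority (1 = highest).
CRITICALITY_PRIORITY = {"high": 1, "medium": 2, "low": 3}
# Assumed threshold: shorter non-critical events are treated as transient.
TRANSIENT_MAX_SECONDS = 120


@dataclass
class Event:
    source: str             # e.g. "cloudwatch" or "azure-monitor"
    resource_id: str
    region: str
    severity: str           # "info", "warning", or "critical"
    tags: dict
    duration_seconds: int


def classify(event):
    """Derive incident priority and routing from tags, severity, and region."""
    criticality = event.tags.get("criticality", "low")
    priority = CRITICALITY_PRIORITY.get(criticality, 3)
    if event.severity != "critical":
        priority = min(priority + 1, 4)  # demote non-critical events one level
    return {
        "priority": f"P{priority}",
        "assignment_group": event.tags.get("owner", "cloud-ops"),
        "region": event.region,
    }


def should_auto_close(event):
    """True for short-lived, non-critical events that need no manual action."""
    return (
        event.severity != "critical"
        and event.duration_seconds <= TRANSIENT_MAX_SECONDS
    )
```

Keeping the mapping tables as plain data makes it easier to review them with service owners than if the rules were buried in conditional logic.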
Module 4: Change Enablement and Cloud Configuration Control
- Define which cloud configuration changes (e.g., security group modifications, IAM policy updates) require formal Change Advisory Board (CAB) review.
- Implement policy-as-code controls (e.g., using AWS Config Rules or Azure Policy) to enforce pre-approved change templates.
- Integrate Infrastructure as Code (IaC) pipelines with the change management system to log deployments as standard changes.
- Design rollback procedures for failed cloud configuration changes that align with ITSM change success criteria.
- Configure automated notifications to stakeholders when emergency changes are executed in cloud environments.
- Track change failure rates for cloud-native services to identify systemic issues in deployment practices.
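One possible way to express the change classification and failure-rate tracking described above is sketched here. The record fields (`resource_type`, `template_id`, `emergency`, `service`, `outcome`) and the set of CAB-reviewed resource types are hypothetical; the set simply reuses the two examples named in this module.

```python
from collections import defaultdict

# Assumed policy: these resource types always go to CAB review.
CAB_REQUIRED_TYPES = {"security_group", "iam_policy"}


def classify_change(change):
    """Return "emergency", "normal" (CAB review), or "standard" (pre-approved)."""
    if change.get("emergency"):
        return "emergency"
    if change["resource_type"] in CAB_REQUIRED_TYPES:
        return "normal"
    # A pipeline deployment that references a pre-approved template is logged
    # as a standard change; anything else falls back to normal review.
    return "standard" if change.get("template_id") else "normal"


def change_failure_rate(records):
    """Compute failed changes divided by total changes for each service."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for record in records:
        totals[record["service"]] += 1
        if record["outcome"] != "success":
            failures[record["service"]] += 1
    return {service: failures[service] / totals[service] for service in totals}
```

An IaC pipeline could call `classify_change` before applying a plan and attach the result to the change record it opens; how that record is created depends on the change management system in use.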
Module 5: Service Continuity and Disaster Recovery in the Cloud
- Validate recovery time objectives (RTOs) for cloud-hosted services using controlled failover tests across availability zones.
- Document data sovereignty constraints in disaster recovery plans when replicating data across cloud regions or providers.
- Integrate cloud backup operations (e.g., snapshot schedules, cross-region replication) into service continuity runbooks.
- Assess the impact of cloud provider maintenance windows on service availability and coordinate with change schedules.
- Define criteria for declaring a disaster in cloud environments, including loss of region-level access or data corruption.
- Ensure incident response teams have access to cloud account recovery credentials and break-glass accounts during outages.
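Validating an RTO from a controlled failover test comes down to comparing the measured recovery time with the objective. The sketch below assumes each test is recorded with `failed_at` and `restored_at` timestamps; those field names and the function name are illustrative only.

```python
from datetime import datetime, timedelta


def evaluate_failover(test, rto):
    """Compare the measured recovery time of a failover test with its RTO.

    `test` is a dict with "service", "failed_at", and "restored_at" keys
    (assumed field names); `rto` is a timedelta.
    """
    recovery_time = test["restored_at"] - test["failed_at"]
    return {
        "service": test["service"],
        "recovery_time": recovery_time,
        "met_rto": recovery_time <= rto,
        "margin": rto - recovery_time,  # negative when the RTO was missed
    }


result = evaluate_failover(
    {
        "service": "orders-api",
        "failed_at": datetime(2024, 3, 1, 10, 0),
        "restored_at": datetime(2024, 3, 1, 10, 12),
    },
    rto=timedelta(minutes=15),
)
```

Recording the margin alongside the pass/fail result helps show whether a service is recovering comfortably within its objective or only just meeting it.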
Module 6: Security and Compliance Governance in Cloud-Integrated ITSM
- Map cloud identity and access management (IAM) roles to ITSM-defined user roles to enforce least privilege access.
- Integrate cloud security findings (e.g., from AWS Security Hub) into the ITSM problem and incident workflows.
- Define retention policies for cloud logs that satisfy both regulatory requirements and internal audit needs.
- Implement automated compliance checks for new cloud resources against internal security baselines.
- Coordinate vulnerability management between cloud security tools and the ITSM known error database.
- Document shared responsibility boundaries in service design documents to clarify security ownership with cloud providers.
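An automated baseline check for new resources might look like the following sketch. The three rules and the resource attributes (`encrypted`, `public`, `tags`) are assumptions made for illustration; a real baseline would come from the organisation's security standards and would typically be enforced through the provider's own policy tooling.

```python
# Assumed internal security baseline: rule name -> predicate on a resource dict.
BASELINE = {
    "encryption_enabled": lambda r: r.get("encrypted") is True,
    "no_public_access": lambda r: not r.get("public", False),
    "has_owner_tag": lambda r: "owner" in r.get("tags", {}),
}


def check_resource(resource):
    """Return the sorted names of baseline rules that the resource violates."""
    return sorted(name for name, rule in BASELINE.items() if not rule(resource))


def to_finding(resource):
    """Build a payload that an ITSM integration could turn into a record.

    Returns None when the resource is compliant. The payload shape is
    hypothetical and would need to match the target system's schema.
    """
    violations = check_resource(resource)
    if not violations:
        return None
    return {"resource_id": resource["id"], "violations": violations}
```

Returning `None` for compliant resources keeps the integration from opening empty records.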
Module 7: Performance and Capacity Management for Cloud Services
- Configure predictive scaling policies based on historical usage trends collected through ITSM performance reports.
- Establish baselines for cloud service utilization to identify underused resources for rightsizing or decommissioning.
- Integrate cloud cost anomaly detection with capacity planning cycles to prevent budget overruns.
- Define performance thresholds that trigger automatic alerts in the ITSM system when cloud service degradation occurs.
- Conduct regular service capacity reviews that include cloud resource elasticity and burst capacity scenarios.
- Model the impact of service growth on cloud spend using forecasting tools tied to service portfolio projections.
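The utilization baseline item can be sketched as a simple comparison of mean and peak usage against thresholds. The 20% mean and 40% peak cut-offs below are placeholder assumptions, and the input is assumed to be CPU utilization percentages already exported from the monitoring tool.

```python
from statistics import mean


def utilization_baseline(samples):
    """Summarize a series of utilization percentages as mean and peak."""
    return {"mean": mean(samples), "peak": max(samples)}


def rightsizing_candidates(usage, mean_threshold=20.0, peak_threshold=40.0):
    """Return resource IDs whose mean and peak usage both fall below thresholds.

    `usage` maps a resource ID to a list of utilization samples. Checking the
    peak as well as the mean avoids flagging resources that are mostly idle
    but still need their capacity for bursts.
    """
    candidates = []
    for resource_id, samples in usage.items():
        baseline = utilization_baseline(samples)
        if baseline["mean"] < mean_threshold and baseline["peak"] < peak_threshold:
            candidates.append(resource_id)
    return sorted(candidates)
```

The resulting list is best treated as input to a capacity review, not as an automatic decommissioning trigger.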
Module 8: Financial Management and Cloud Cost Optimization
- Implement chargeback or showback models that allocate cloud costs to business units based on service usage data.
- Integrate cloud billing data into the service portfolio to reflect true cost-to-serve for each cloud-hosted service.
- Establish approval workflows for high-cost cloud requests such as GPU instances or large-volume data transfers.
- Track reserved instance utilization and renewal dates within the asset management system to avoid paying for unused commitments or missing renewal deadlines.
- Conduct quarterly cost optimization reviews using tagging accuracy and usage reports from cloud providers.
- Link cloud cost alerts to incident or problem records when unexpected spending indicates configuration or operational issues.
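A showback allocation based on tags can be sketched as a grouping of billing line items. The line-item shape (`cost` as a decimal string, `tags` as a dict) and the `cost-center` tag key are assumptions; real billing exports differ by provider and would need to be mapped into this form first.

```python
from collections import defaultdict
from decimal import Decimal


def showback(line_items, tag="cost-center"):
    """Sum costs per business unit, using the given tag as the allocation key.

    Items without the tag are grouped under "unallocated" so that gaps in
    tagging stay visible instead of being silently dropped.
    """
    totals = defaultdict(Decimal)
    for item in line_items:
        unit = item.get("tags", {}).get(tag, "unallocated")
        totals[unit] += Decimal(item["cost"])
    return dict(totals)


def tagging_accuracy(totals):
    """Return the share of total spend that was allocated to a tagged unit."""
    total = sum(totals.values(), Decimal(0))
    if total == 0:
        return Decimal(1)
    return Decimal(1) - totals.get("unallocated", Decimal(0)) / total
```

`Decimal` is used instead of `float` so that summed currency amounts do not pick up binary rounding errors. The tagging accuracy figure is one input a quarterly cost review could track over time.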