Description

This curriculum spans the breadth of IT service management governance and operational execution, comparable in scope to a multi-workshop advisory engagement focused on maturing enterprise service lifecycle processes across strategy, design, delivery, and performance management.

Module 1: Service Strategy and Portfolio Management

Decide which services to retire, sustain, or invest in based on utilization metrics, cost-to-serve, and business unit demand forecasts.
Implement a standardized business case template for new service requests that includes TCO, risk exposure, and alignment with enterprise architecture principles.
Balance investment between run-the-business and change-the-business initiatives within the service portfolio under constrained budget cycles.
Establish governance thresholds for service approval, including mandatory engagement from security, compliance, and infrastructure teams.
Integrate portfolio reviews with enterprise financial planning cycles so that IT spending decisions align with fiscal-year budgets.
Manage shadow IT by defining escalation paths for unauthorized services, including enforcement mechanisms and remediation workflows.

Module 2: Service Design and Architecture Alignment

Enforce design compliance by requiring architecture review board (ARB) sign-off before any service moves to the build phase.
Map service dependencies to underlying infrastructure components using CMDB data, identifying single points of failure and redundancy gaps.
Define SLA and OLA structures during the design phase, ensuring measurable KPIs are technically enforceable through monitoring tools.
Integrate non-functional requirements (e.g., scalability, disaster recovery) into service blueprints with input from operations and security teams.
Standardize service templates for common offerings (e.g., virtual servers, SaaS onboarding) to reduce design rework and accelerate delivery.
Negotiate design trade-offs between agility (e.g., cloud-native patterns) and enterprise standards (e.g., network segmentation policies).
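
The dependency-mapping objective above can be illustrated with a minimal sketch. It assumes a hypothetical CMDB extract in which each service lists the roles it depends on and the configuration items (CIs) able to fill each role; the data shape and the function name are illustrative and not tied to any particular CMDB product. A role backed by exactly one CI is reported as a single point of failure.

```python
from typing import Dict, List

# Hypothetical CMDB extract (assumed shape): service -> role -> CIs filling it.
ServiceMap = Dict[str, Dict[str, List[str]]]


def single_points_of_failure(service_map: ServiceMap) -> Dict[str, List[str]]:
    """Return, per service, the roles that are backed by exactly one CI."""
    findings: Dict[str, List[str]] = {}
    for service, roles in service_map.items():
        # Deduplicate CI names so a CI listed twice is not counted as redundancy.
        spofs = sorted(role for role, cis in roles.items() if len(set(cis)) == 1)
        if spofs:
            findings[service] = spofs
    return findings


# Illustrative data only.
EXAMPLE_MAP: ServiceMap = {
    "payroll": {
        "database": ["db-01"],
        "app-server": ["app-01", "app-02"],
        "load-balancer": ["lb-01"],
    },
    "intranet": {
        "web-server": ["web-01", "web-02"],
    },
}

if __name__ == "__main__":
    print(single_points_of_failure(EXAMPLE_MAP))
```

A fuller analysis would also follow shared infrastructure (for example, two application servers on the same host), which this sketch does not attempt.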

Module 3: Change Enablement and Risk Control

Classify changes using a dynamic model that adjusts risk scoring based on asset criticality, change type, and historical failure rates.
Implement peer-review requirements for standard changes to prevent automation from bypassing human judgment on high-impact systems.
Define rollback procedures during change planning, including data state restoration and configuration drift detection methods.
Enforce change freeze windows during critical business periods, with exception processes requiring executive and technical approvals.
Integrate change data with monitoring systems to correlate incidents with recent deployments using time-based event analysis.
Optimize CAB meeting frequency by tiering changes: only high-risk changes require full board review, while others use delegated authority.
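
As a rough illustration of risk-based change classification and tiered review, the sketch below combines asset criticality, change type, and historical failure rate into one score and maps it to a review path. The weights, thresholds, tier labels, and the `ChangeRequest` fields are all assumptions made for the example; a real model would be calibrated against the organization's own change history.

```python
from dataclasses import dataclass

# Illustrative weights (assumptions, not values from the curriculum).
TYPE_WEIGHT = {"standard": 1, "normal": 2, "emergency": 3}


@dataclass
class ChangeRequest:
    asset_criticality: int  # 1 (low) to 5 (business critical)
    change_type: str        # key into TYPE_WEIGHT
    past_changes: int       # similar changes attempted previously
    past_failures: int      # how many of those failed


def risk_score(change: ChangeRequest) -> float:
    """Score rises with criticality, change type, and observed failure rate."""
    if change.past_changes == 0:
        failure_rate = 0.5  # assumed default when there is no history
    else:
        failure_rate = change.past_failures / change.past_changes
    weight = TYPE_WEIGHT[change.change_type]
    return change.asset_criticality * weight * (1 + failure_rate)


def review_tier(score: float) -> str:
    """Map a score to a review path; thresholds are hypothetical."""
    if score >= 12:
        return "full CAB review"
    if score >= 5:
        return "delegated approval"
    return "pre-approved"


if __name__ == "__main__":
    change = ChangeRequest(asset_criticality=5, change_type="normal",
                           past_changes=20, past_failures=5)
    print(risk_score(change), review_tier(risk_score(change)))
```

The score is dynamic in the sense that it shifts as the failure history for a change type accumulates, so the same change can move between tiers over time.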

Module 4: Incident and Major Event Management

Define major incident criteria using business impact, not just technical severity, to trigger escalation protocols.
Assign incident commanders with clear authority to redirect resources during active outages, documented in runbooks.
Implement war room coordination across time zones using shared dashboards and real-time communication tools with audit trails.
Standardize post-incident timelines to ensure root cause analysis is initiated within four hours of resolution.
Enforce incident categorization consistency using a controlled taxonomy linked to knowledge base articles and known errors.
Balance transparency and risk during public incidents by defining pre-approved messaging templates reviewed by legal and PR teams.

Module 5: Problem Management and Knowledge Integration

Prioritize problem records based on recurrence frequency, business impact, and cost of workaround.
Link known error database (KEDB) entries directly to incident records to reduce mean time to resolve through proactive matching.
Assign problem ownership to technical domains, requiring regular review meetings with service owners and engineering leads.
Integrate problem data with change management to identify patterns of failure associated with specific deployment types.
Measure problem resolution effectiveness using escape rate, the percentage of incidents that recur after a fix is implemented.
Enforce knowledge article creation as part of problem closure, with mandatory peer review before publication.
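
The escape-rate metric above can be computed in more than one way. The sketch below takes one possible reading, stated as an assumption: a fixed problem counts as "escaped" if any incident linked to it was opened after its fix was implemented, and the rate is the share of fixed problems that escaped. The function name and data shapes are hypothetical.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple


def escape_rate(
    fix_times: Dict[str, datetime],
    incidents: Iterable[Tuple[str, datetime]],
) -> float:
    """Percentage of fixed problems with at least one incident after the fix.

    fix_times: problem ID -> time the fix was implemented.
    incidents: (linked problem ID, time the incident was opened).
    """
    if not fix_times:
        return 0.0
    escaped = {
        problem_id
        for problem_id, opened_at in incidents
        if problem_id in fix_times and opened_at > fix_times[problem_id]
    }
    return 100.0 * len(escaped) / len(fix_times)


if __name__ == "__main__":
    fixes = {
        "P1": datetime(2024, 3, 1),
        "P2": datetime(2024, 3, 5),
    }
    linked = [
        ("P1", datetime(2024, 2, 20)),  # before the fix, not a recurrence
        ("P1", datetime(2024, 3, 3)),   # after the fix, counts as an escape
        ("P2", datetime(2024, 3, 1)),   # before the fix
    ]
    print(escape_rate(fixes, linked))
```

Incidents linked to problems that have no recorded fix are ignored here, since they cannot recur "after a fix" under this definition.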

Module 6: Service Level Management and Performance Reporting

Negotiate SLA terms with business units using historical performance data to set realistic targets and avoid overcommitment.
Automate SLA breach alerts with escalation paths that trigger service review meetings when thresholds are consistently missed.
Break down end-to-end service performance by component (e.g., network, application, database) to assign accountability.
Report service performance in business terms (e.g., transaction success rate, user productivity loss) rather than system uptime.
Adjust SLA review cycles based on service criticality: mission-critical services are reviewed quarterly, others annually.
Handle SLA exceptions for emergency changes by defining compensating controls and post-implementation validation requirements.
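
To make "consistently missed" concrete for the breach-alert objective above, here is a minimal sketch that counts the most recent run of reporting periods below target and picks an escalation step. The three-period trigger and the action labels are assumptions for illustration; the curriculum does not prescribe specific values.

```python
from typing import Sequence


def consecutive_misses(attainment: Sequence[float], target: float) -> int:
    """Count the trailing run of periods in which attainment fell below target."""
    run = 0
    for value in reversed(attainment):
        if value >= target:
            break
        run += 1
    return run


def escalation_action(
    attainment: Sequence[float],
    target: float,
    review_after: int = 3,  # assumed trigger: three missed periods in a row
) -> str:
    """Return the escalation step for the latest run of missed periods."""
    misses = consecutive_misses(attainment, target)
    if misses == 0:
        return "none"
    if misses < review_after:
        return "alert service owner"
    return "schedule service review"


if __name__ == "__main__":
    # Monthly availability percentages, oldest first (illustrative data).
    history = [99.95, 99.70, 99.80, 99.60]
    print(escalation_action(history, target=99.9))
```

Counting only the trailing run means a single recovered period resets the escalation, which may or may not match a given organization's policy.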

Module 7: Knowledge and Configuration Management Integration

Define CI ownership at the team level, requiring approval workflows for updates to critical configuration items.
Automate CI discovery while implementing manual override controls to prevent inaccurate or redundant entries.
Link knowledge articles to specific CIs to enable technicians to access relevant documentation during incident resolution.
Enforce CMDB audit schedules based on CI criticality, with quarterly reviews for Tier-1 systems and annual reviews for Tier-3.
Integrate CMDB data with service mapping tools to visualize service impact during infrastructure changes.
Resolve CMDB data conflicts by establishing a reconciliation process between discovery tools and manual entries, using change records as the source of truth.
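
The reconciliation rule in the last objective can be sketched as follows, assuming each source provides a flat attribute-to-value record for one CI. Where discovery and manual entries agree, the value is kept; where they disagree, the approved change record decides; where no change record covers the attribute, it is flagged for human review. The function name and record shape are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]  # attribute name -> value for a single CI


def reconcile(
    discovered: Record,
    manual: Record,
    change_record: Record,
) -> Tuple[Record, List[str]]:
    """Merge two views of a CI, using the change record to settle conflicts."""
    resolved: Record = {}
    needs_review: List[str] = []
    for attr in sorted(set(discovered) | set(manual)):
        found, entered = discovered.get(attr), manual.get(attr)
        if found == entered:
            resolved[attr] = found
        elif attr in change_record:
            # Change record is treated as the source of truth.
            resolved[attr] = change_record[attr]
        else:
            # Conflict (or value present in only one source) with no change
            # record to settle it: leave it for a human to decide.
            needs_review.append(attr)
    return resolved, needs_review


if __name__ == "__main__":
    discovered = {"os": "RHEL 9", "ram_gb": "64", "owner": "team-a"}
    manual = {"os": "RHEL 8", "ram_gb": "64", "owner": "team-b"}
    approved_change = {"os": "RHEL 9"}
    print(reconcile(discovered, manual, approved_change))
```

Flagging unresolved attributes rather than guessing keeps the CMDB from silently preferring either discovery or manual data when no change record exists.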

Module 8: Continual Service Improvement and Metrics Governance

Select CSI initiatives based on gap analysis between current performance and business objectives, not just low-hanging fruit.
Define baseline metrics before implementing improvements to measure actual impact, not perceived success.
Assign improvement owners with cross-functional authority to implement changes beyond IT service boundaries.
Use balanced scorecards to track improvements across dimensions: cost, quality, speed, and compliance.
Integrate customer feedback loops through structured surveys and service review meetings, not just operational data.
Retire outdated metrics that no longer align with business goals, avoiding metric overload in reporting dashboards.