This curriculum covers the technical, organizational, and governance challenges of infrastructure assessment. It is scoped as a multi-workshop diagnostic program and reflects the iterative coordination that enterprise service portfolio reviews require across IT operations, security, finance, and business units.
Module 1: Defining Scope and Stakeholder Alignment
- Selecting which business units and service lines to include in the assessment based on operational criticality and budget ownership.
- Negotiating access to infrastructure documentation with siloed operations teams resistant to external review.
- Determining whether cloud, on-premises, or hybrid environments are in scope based on current service dependencies.
- Mapping decision rights across IT, security, and business stakeholders to establish escalation paths for findings.
- Deciding whether to include third-party managed services in the assessment and defining data-sharing agreements.
- Establishing thresholds for what constitutes a “critical” infrastructure component to prioritize evaluation efforts.
Module 2: Inventory and Dependency Mapping
- Integrating data from CMDBs, cloud consoles, and network scans when sources are inconsistent or outdated.
- Resolving conflicts between documented architectures and actual runtime dependencies observed via traffic analysis.
- Identifying shadow IT systems running in production without formal approval or tracking.
- Classifying dependencies by strength (e.g., hard dependency vs. optional integration) for risk modeling.
- Handling incomplete or missing metadata for legacy systems with undocumented interfaces.
- Deciding whether to include disaster recovery and backup infrastructure in the dependency map.
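As an illustration of the source-reconciliation and shadow-IT topics in this module, the following is a minimal sketch that compares hostnames recorded in a CMDB export with hostnames observed in a network scan. The function name, the sample hostnames, and the assumption that hostnames are a usable join key are all hypothetical; real inventories usually need matching on several attributes (IP address, serial number, cloud resource ID).

```python
def reconcile_inventory(cmdb_hosts, scanned_hosts):
    """Compare two hostname collections, ignoring case and surrounding whitespace."""
    cmdb = {h.strip().lower() for h in cmdb_hosts}
    seen = {h.strip().lower() for h in scanned_hosts}
    return {
        "confirmed": sorted(cmdb & seen),      # documented and observed
        "undocumented": sorted(seen - cmdb),   # observed only: shadow IT candidates
        "stale": sorted(cmdb - seen),          # documented only: possibly retired
    }


# Hypothetical sample data for illustration only.
result = reconcile_inventory(
    cmdb_hosts=["app01", "db01", "legacy-ftp"],
    scanned_hosts=["APP01", "db01 ", "metrics-tmp"],
)
for category, hosts in result.items():
    print(category, hosts)
```

Entries in the "undocumented" and "stale" groups are leads for follow-up with the owning teams, not conclusions: a host can be missing from a scan simply because it was powered off or outside the scanned range.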
Module 3: Performance and Capacity Benchmarking
- Selecting performance metrics (e.g., latency, throughput, error rates) relevant to service-level expectations.
- Establishing baselines using historical monitoring data when instrumentation was inconsistent across systems.
- Identifying resource contention points during peak load periods across shared infrastructure tiers.
- Assessing whether current capacity planning aligns with business growth projections or seasonal demand.
- Deciding whether to include auto-scaling behaviors in performance analysis for cloud-hosted services.
- Identifying monitoring coverage gaps that could lead to false performance conclusions.
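The baselining topic in this module can be sketched as a percentile calculation over historical latency samples. This is one possible approach, assuming the nearest-rank percentile definition; the function names, the 20% tolerance, and the sample values are hypothetical.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not samples:
        raise ValueError("at least one sample is required")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct% of n
    return ordered[rank - 1]


def exceeds_baseline(observed, baseline, tolerance=1.2):
    """True when an observed value is more than `tolerance` times the baseline."""
    return observed > baseline * tolerance


# Hypothetical latency samples in milliseconds.
latencies_ms = [120, 95, 110, 480, 105, 98, 102, 130, 115, 101]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(p50, p95)
```

With only ten samples the 95th percentile is simply the largest value, so a single outlier defines it. This is the kind of distortion that sparse or inconsistent instrumentation can introduce into a baseline.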
Module 4: Resilience and Availability Analysis
- Evaluating redundancy at each layer (network, compute, storage) against documented high-availability requirements.
- Assessing failover procedures for critical systems based on documented runbooks versus actual test results.
- Identifying single points of failure in configurations that contradict architectural design principles.
- Determining whether recovery time objectives (RTO) and recovery point objectives (RPO) are technically enforceable.
- Reviewing backup integrity and retention policies for systems with compliance obligations.
- Documenting untested disaster recovery plans and their implications for service continuity.
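To make the RTO and RPO enforceability question concrete, here is a minimal sketch under two simplifying assumptions: worst-case data loss is roughly one full backup interval, and the most recent restore test is representative of actual recovery time. The `RecoveryProfile` class, its fields, and the sample systems are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class RecoveryProfile:
    name: str
    rpo: timedelta
    rto: timedelta
    backup_interval: timedelta
    last_tested_restore: Optional[timedelta]  # None if no restore test is on record


def recovery_findings(profile):
    """Return a list of findings; an empty list means no gap was detected."""
    findings = []
    # Assumption: data written since the last backup is lost on failure.
    if profile.backup_interval > profile.rpo:
        findings.append("RPO not achievable: backup interval exceeds RPO")
    if profile.last_tested_restore is None:
        findings.append("RTO unverified: no restore test on record")
    elif profile.last_tested_restore > profile.rto:
        findings.append("RTO not met in last restore test")
    return findings


# Hypothetical systems for illustration only.
orders_db = RecoveryProfile(
    "orders-db", rpo=timedelta(hours=1), rto=timedelta(hours=4),
    backup_interval=timedelta(hours=24), last_tested_restore=None,
)
wiki = RecoveryProfile(
    "wiki", rpo=timedelta(hours=24), rto=timedelta(hours=8),
    backup_interval=timedelta(hours=12), last_tested_restore=timedelta(hours=2),
)
for system in (orders_db, wiki):
    print(system.name, recovery_findings(system))
```

Systems that use continuous replication rather than periodic backups would need a different loss model, such as measured replication lag.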
Module 5: Security and Compliance Integration
- Mapping infrastructure components to regulatory obligations (e.g., PCI, HIPAA) based on data flow.
- Identifying systems with privileged access patterns that lack monitoring or justification.
- Assessing patch management cadence against known vulnerability windows for critical software.
- Reviewing network segmentation effectiveness in limiting lateral movement during breach scenarios.
- Determining whether encryption is consistently applied for data at rest and in transit.
- Validating identity and access management configurations for service accounts with excessive permissions.
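The patch cadence topic in this module can be illustrated with a small check that compares the age of each pending patch against a maximum window per severity. The policy values in `MAX_DAYS_BY_SEVERITY`, the function name, and the sample data are hypothetical; an actual assessment would take the windows from the organization's own vulnerability management policy.

```python
from datetime import date

# Hypothetical policy: maximum days a patch may remain unapplied, by severity.
MAX_DAYS_BY_SEVERITY = {"critical": 14, "high": 30, "medium": 90}


def overdue_patches(pending, today):
    """Return (host, severity, age_in_days) for patches older than the policy window.

    `pending` is an iterable of (host, severity, release_date) tuples.
    """
    overdue = []
    for host, severity, released in pending:
        age_days = (today - released).days
        if age_days > MAX_DAYS_BY_SEVERITY[severity]:
            overdue.append((host, severity, age_days))
    return overdue


# Hypothetical pending patches for illustration only.
pending = [
    ("web01", "critical", date(2024, 2, 1)),
    ("db01", "high", date(2024, 2, 10)),
    ("app02", "medium", date(2023, 11, 1)),
]
report = overdue_patches(pending, today=date(2024, 3, 1))
for row in report:
    print(row)
```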
Module 6: Cost and Resource Utilization Assessment
- Allocating infrastructure costs to services using direct assignment versus proportional models.
- Identifying underutilized resources (e.g., idle VMs, overprovisioned databases) for rightsizing.
- Comparing actual cloud spend against reserved instance or commitment-based pricing strategies.
- Assessing cost implications of technical debt, such as maintaining outdated hardware or software.
- Deciding whether to include internal support labor in total cost of ownership calculations.
- Reconciling finance data with technical usage metrics when billing dimensions differ across platforms.
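The difference between direct assignment and proportional allocation can be shown with a short worked example. This is a sketch that assumes shared platform cost is split by a single usage measure (for example CPU hours); the function name, service names, and figures are hypothetical.

```python
def allocate_costs(direct_costs, shared_cost, usage):
    """Combine directly assigned costs with a usage-proportional share of shared cost.

    direct_costs: {service: cost billed directly to that service}
    shared_cost:  total cost of shared infrastructure to distribute
    usage:        {service: usage units on the shared infrastructure}
    """
    total_usage = sum(usage.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive to allocate shared cost")
    services = sorted(set(direct_costs) | set(usage))
    return {
        service: direct_costs.get(service, 0)
        + shared_cost * usage.get(service, 0) / total_usage
        for service in services
    }


# Hypothetical monthly figures for illustration only.
allocation = allocate_costs(
    direct_costs={"billing": 1000, "search": 500},
    shared_cost=3000,
    usage={"billing": 20, "search": 60, "reports": 20},
)
for service, cost in allocation.items():
    print(service, cost)
```

Here "reports" has no directly assigned cost but still carries 20% of the shared platform cost, which is the kind of result that tends to surface during reconciliation with finance data.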
Module 7: Service Portfolio Integration and Prioritization
- Aligning infrastructure health indicators with service portfolio criticality rankings.
- Flagging services dependent on end-of-life infrastructure for modernization or retirement.
- Integrating infrastructure risk scores into service lifecycle decision gates (e.g., renewal, decommission).
- Coordinating infrastructure upgrade timelines with service roadmap milestones.
- Identifying shared platform dependencies that impact multiple services for consolidated investment.
- Establishing thresholds for infrastructure risk tolerance that trigger service-level reviews.
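One way to picture a risk tolerance threshold that triggers service-level reviews is a simple score that multiplies service criticality by infrastructure risk, with end-of-life infrastructure flagged regardless of score. The 1 to 5 scales, the multiplication rule, the threshold of 12, and all names below are hypothetical choices made for illustration, not a recommended scoring model.

```python
from dataclasses import dataclass


@dataclass
class ServiceRisk:
    name: str
    criticality: int    # 1 (low) to 5 (business critical), hypothetical scale
    infra_risk: int     # 1 (healthy) to 5 (severe), hypothetical scale
    on_eol_infra: bool  # depends on end-of-life infrastructure


REVIEW_THRESHOLD = 12  # hypothetical tolerance; set by the governance forum


def services_needing_review(services, threshold=REVIEW_THRESHOLD):
    """Return (name, score) pairs that breach the threshold or sit on EOL infrastructure."""
    flagged = []
    for service in services:
        score = service.criticality * service.infra_risk
        if service.on_eol_infra or score >= threshold:
            flagged.append((service.name, score))
    # Highest scores first so the review agenda starts with the largest exposure.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical portfolio for illustration only.
portfolio = [
    ServiceRisk("payments", criticality=5, infra_risk=3, on_eol_infra=False),
    ServiceRisk("intranet", criticality=2, infra_risk=2, on_eol_infra=False),
    ServiceRisk("archive", criticality=1, infra_risk=2, on_eol_infra=True),
]
review_queue = services_needing_review(portfolio)
print(review_queue)
```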
Module 8: Reporting and Continuous Monitoring Frameworks
- Designing executive summaries that translate technical findings into business risk statements.
- Selecting KPIs for ongoing infrastructure health monitoring aligned with service SLAs.
- Implementing automated data pipelines to keep inventory and performance data current.
- Defining refresh cycles for reassessment based on service volatility and change frequency.
- Integrating findings into existing IT governance forums (e.g., change advisory boards, service reviews).
- Assigning ownership for remediation actions and tracking progress when the assessment team has no direct authority over the remediating teams.