This curriculum reflects the scope typically addressed across a full consulting engagement or multi-phase internal transformation initiative.
Module 1: Strategic Cloud Adoption and Business Alignment
- Evaluate total cost of ownership (TCO) trade-offs between on-premises, hybrid, and cloud-native architectures under variable workload projections.
- Map business capabilities to AWS service categories (compute, storage, networking, security) to prioritize migration candidates based on ROI and risk.
- Assess organizational readiness for cloud adoption across people, process, and technology dimensions, identifying critical capability gaps.
- Define cloud governance boundaries by aligning AWS account structures with business units, regulatory domains, and cost centers.
- Develop cloud funding models (showback vs. chargeback) and integrate them with existing financial systems and budget cycles.
- Identify key performance indicators (KPIs) for cloud transformation success, including time-to-market, operational efficiency, and compliance adherence.
- Establish escalation paths and decision rights for cloud-related investments, ensuring executive sponsorship and cross-functional alignment.
- Analyze failure modes in cloud migration programs, including scope creep, skill shortages, and misaligned incentives across IT and business units.
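As an illustration of the TCO comparison in this module, the sketch below contrasts a fixed on-premises cost with a usage-driven cloud cost over a multi-year workload projection. All figures, class names, and the single blended hourly rate are hypothetical simplifications; a real model would also cover migration cost, data transfer, licensing, and discounting.

```python
from dataclasses import dataclass


@dataclass
class OnPremOption:
    capex: float        # up-front purchase, sized for peak load (hypothetical)
    annual_opex: float  # power, space, support contracts, staff (hypothetical)


@dataclass
class CloudOption:
    hourly_rate: float  # assumed blended price per instance-hour


def on_prem_tco(option: OnPremOption, years: int) -> float:
    """Fixed cost: capacity is paid for whether or not it is used."""
    return option.capex + option.annual_opex * years


def cloud_tco(option: CloudOption, instance_hours_per_year: list[float]) -> float:
    """Variable cost: spend follows the projected usage in each year."""
    return sum(option.hourly_rate * hours for hours in instance_hours_per_year)


# Hypothetical inputs for a three-year horizon.
on_prem = OnPremOption(capex=500_000, annual_opex=120_000)
cloud = CloudOption(hourly_rate=0.40)
projection = [400_000, 600_000, 900_000]  # instance-hours per year

on_prem_total = on_prem_tco(on_prem, years=len(projection))
cloud_total = cloud_tco(cloud, projection)
print(f"on-prem: {on_prem_total:,.0f}  cloud: {cloud_total:,.0f}")
```

Re-running the comparison with optimistic and pessimistic projections shows how sensitive the break-even point is to workload growth.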
Module 2: Secure and Governed AWS Account Architecture
- Design multi-account strategies using AWS Organizations to enforce isolation for production, development, and sensitive workloads.
- Implement Service Control Policies (SCPs) to restrict region availability, service usage, and resource creation at scale across accounts.
- Configure AWS Control Tower for standardized landing zones, balancing automation with customization for enterprise requirements.
- Integrate identity providers via SAML or OIDC to enable centralized access management and support least-privilege principles.
- Define tagging standards and enforce them through automated guardrails to support cost allocation, security classification, and resource discovery.
- Implement cross-account access patterns using IAM roles, evaluating trust boundaries and audit implications.
- Establish centralized logging and security monitoring accounts with appropriate resource sharing and data retention policies.
- Assess trade-offs between single-sign-on integration depth and operational complexity across hybrid identity environments.
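The region restriction mentioned under Service Control Policies is commonly expressed as a single deny statement keyed on the `aws:RequestedRegion` condition. The sketch below only builds the policy document as a Python dictionary; the function name, the region list, and the set of exempted global-service actions are illustrative assumptions, and the exemption list is not exhaustive.

```python
import json


def build_region_scp(allowed_regions, exempt_actions):
    """Return an SCP document that denies requests outside allowed_regions.

    Actions in exempt_actions (typically global services) are excluded
    from the deny through NotAction.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


# Hypothetical inputs: two approved regions and a partial exemption list.
policy = build_region_scp(
    allowed_regions={"eu-west-1", "eu-central-1"},
    exempt_actions={"iam:*", "organizations:*", "route53:*",
                    "cloudfront:*", "sts:*", "support:*"},
)
print(json.dumps(policy, indent=2))
```

Generating the document from code keeps the approved region list in one reviewed place; attaching it to an organizational unit is a separate step that is not shown here.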
Module 3: Identity and Access Management at Scale
- Design IAM policies that balance granularity and maintainability, avoiding over-permissioned roles in dynamic environments.
- Implement role-based access control (RBAC) and attribute-based access control (ABAC) models for large-scale AWS deployments.
- Configure identity federation with enterprise directories, managing session duration, MFA enforcement, and Just-In-Time provisioning.
- Rotate and manage long-term access keys, evaluating risks of embedded credentials in legacy applications.
- Monitor and audit IAM activity using AWS CloudTrail, identifying anomalous behavior and privilege escalation attempts.
- Define break-glass access procedures for emergency administrative access, including time-bound permissions and notification workflows.
- Enforce MFA across all privileged roles and evaluate usability impacts on developer productivity and operational response times.
- Integrate third-party identity governance tools for access certification, role mining, and compliance reporting.
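One way to make "avoiding over-permissioned roles" concrete is a small check that flags Allow statements combining wildcard actions with a wildcard resource. The sketch below is a simplified lint over a policy document held as a dictionary; the function names and the sample policy are hypothetical, and the check ignores conditions, `NotAction`, and partial wildcards such as `s3:Get*`.

```python
def _as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def broad_statements(policy):
    """Return identifiers of Allow statements with wildcard action and resource."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement", []))):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action and wildcard_resource:
            findings.append(stmt.get("Sid", f"statement[{index}]"))
    return findings


# Hypothetical policy with one scoped and two overly broad statements.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "ReadReports", "Effect": "Allow",
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-reports/*"},
        {"Sid": "AdminEverything", "Effect": "Allow",
         "Action": "*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["ec2:*"], "Resource": ["*"]},
    ],
}

flagged = broad_statements(sample_policy)
print(flagged)
```

A check like this fits naturally into a review pipeline, with managed analysis tools covering the cases it deliberately skips.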
Module 4: Network Design and Connectivity in Hybrid Environments
- Design VPC architectures with public, private, and isolated subnets, aligning with application security and data residency requirements.
- Implement AWS Transit Gateway for centralized routing across multiple VPCs and on-premises networks, evaluating scalability limits.
- Configure site-to-site VPN and AWS Direct Connect connections, comparing cost, performance, and availability trade-offs.
- Manage DNS resolution across hybrid environments using Route 53 and Resolver endpoints, ensuring consistency and low latency.
- Enforce network segmentation using security groups, NACLs, and AWS Network Firewall, avoiding overly permissive rules.
- Plan IP address allocation across regions and accounts to prevent overlap and support future growth.
- Implement private access to AWS services using VPC endpoints, reducing exposure to public internet and improving performance.
- Diagnose and resolve network performance bottlenecks using VPC Flow Logs, CloudWatch, and AWS Trusted Advisor.
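The IP address planning item in this module can be prototyped with Python's standard `ipaddress` module. The sketch below carves equally sized VPC blocks out of a supernet while skipping ranges already in use, for example by on-premises networks. The function name and all address ranges are hypothetical.

```python
import ipaddress


def allocate_cidrs(supernet, new_prefix, count, reserved=()):
    """Return `count` non-overlapping blocks of size /new_prefix from supernet,
    skipping any block that overlaps a reserved range."""
    pool = ipaddress.ip_network(supernet)
    taken = [ipaddress.ip_network(r) for r in reserved]
    allocated = []
    for candidate in pool.subnets(new_prefix=new_prefix):
        if any(candidate.overlaps(block) for block in taken):
            continue
        allocated.append(candidate)
        if len(allocated) == count:
            return allocated
    raise ValueError(
        f"{supernet} cannot supply {count} free /{new_prefix} blocks"
    )


# Hypothetical plan: three /16 VPC ranges, with 10.1.0.0/16 used on-premises.
vpc_cidrs = allocate_cidrs(
    "10.0.0.0/14", new_prefix=16, count=3, reserved=["10.1.0.0/16"]
)
print([str(c) for c in vpc_cidrs])
# → ['10.0.0.0/16', '10.2.0.0/16', '10.3.0.0/16']
```

Raising an error when the pool is exhausted surfaces growth limits at planning time rather than during a later account rollout.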
Module 5: Resilient and Scalable Compute Strategies
- Compare EC2 instance types, purchasing options (On-Demand, Reserved, Spot), and Auto Scaling group configurations for cost-performance balance.
- Design fault-tolerant applications across Availability Zones, evaluating RTO and RPO targets for critical systems.
- Implement containerized workloads using Amazon ECS or EKS, assessing operational overhead versus flexibility.
- Manage stateful workloads on EC2, including backup strategies, AMI lifecycle, and disaster recovery planning.
- Evaluate serverless compute (Lambda) for event-driven use cases, considering cold start latency and execution duration limits.
- Configure lifecycle hooks and health checks in Auto Scaling groups to ensure safe deployments and instance replacements.
- Monitor compute utilization and rightsize instances using AWS Compute Optimizer, balancing performance and cost.
- Design blue/green or canary deployments using Elastic Load Balancing and AWS CodeDeploy for zero-downtime releases.
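The comparison of purchasing options in this module reduces to a small optimization: commit to a baseline at a discounted rate and pay the on-demand rate for demand above it. The sketch below searches for the cheapest baseline over an hourly demand profile. The rates, the demand profile, and the function names are hypothetical, and Spot capacity is left out for simplicity.

```python
def fleet_cost(hourly_demand, reserved_count, reserved_rate, on_demand_rate):
    """Cost of serving the demand profile with a fixed committed baseline.

    Committed instances are billed every hour; demand above the baseline
    is billed at the on-demand rate only in the hours it occurs.
    """
    committed = reserved_count * reserved_rate * len(hourly_demand)
    overflow_hours = sum(max(0, d - reserved_count) for d in hourly_demand)
    return committed + overflow_hours * on_demand_rate


def best_reserved_count(hourly_demand, reserved_rate, on_demand_rate):
    """Try every baseline from zero to peak demand and keep the cheapest."""
    candidates = range(0, max(hourly_demand) + 1)
    return min(
        candidates,
        key=lambda n: fleet_cost(hourly_demand, n, reserved_rate, on_demand_rate),
    )


# Hypothetical profile: steady load of 2 instances with a short peak of 8.
demand = [2, 2, 2, 2, 8, 8]
baseline = best_reserved_count(demand, reserved_rate=0.06, on_demand_rate=0.10)
print(baseline, round(fleet_cost(demand, baseline, 0.06, 0.10), 2))
# → 2 1.92
```

In this profile the peak is too brief to justify committing beyond the steady load, which is the general pattern: commit to what runs most of the time and leave bursts to flexible capacity.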
Module 6: Data Management and Storage Optimization
- Select storage classes (S3 Standard, S3 Standard-IA, S3 Glacier) based on access patterns, durability requirements, and cost constraints.
- Implement S3 bucket policies, encryption (SSE-S3, SSE-KMS), and access points to enforce data protection and sharing controls.
- Design cross-region replication and versioning strategies for compliance, disaster recovery, and data sovereignty.
- Optimize RDS deployments using read replicas, Multi-AZ configurations, and parameter group tuning for performance and availability.
- Manage database licensing costs for commercial engines (SQL Server, Oracle) in AWS, evaluating Bring-Your-Own-License (BYOL) vs. license-included models.
- Implement lifecycle policies for S3 objects and EBS snapshots to automate tiering and deletion based on retention rules.
- Use AWS DataSync and Storage Gateway for hybrid data transfer, evaluating bandwidth, latency, and consistency requirements.
- Assess trade-offs between managed databases (RDS, DynamoDB) and self-managed instances on EC2 for control and operational burden.
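For the lifecycle policy item in this module, the sketch below builds an S3 lifecycle rule as a dictionary and validates the ordering of its steps before it is applied anywhere. The helper name, rule ID, prefix, and day counts are hypothetical; the dictionary layout follows what is understood to be the shape of an S3 lifecycle configuration, and should be checked against the current API reference before use.

```python
def lifecycle_rule(rule_id, prefix, transitions, expire_after_days):
    """Build one lifecycle rule from (days, storage_class) transition pairs.

    Rejects rules whose transitions are out of order or whose expiration
    would fire before the last transition.
    """
    days = [d for d, _ in transitions]
    if days != sorted(days) or len(set(days)) != len(days):
        raise ValueError("transition days must be strictly increasing")
    if days and expire_after_days <= days[-1]:
        raise ValueError("expiration must come after the last transition")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": d, "StorageClass": sc} for d, sc in transitions
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical retention rule: tier logs down twice, delete after a year.
config = {
    "Rules": [
        lifecycle_rule(
            "archive-logs",
            "logs/",
            [(30, "STANDARD_IA"), (90, "GLACIER")],
            expire_after_days=365,
        )
    ]
}
print(config["Rules"][0]["Transitions"])
```

Validating the rule locally catches retention mistakes during review, before any objects are moved or deleted.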
Module 7: Cost Management and Financial Governance
- Interpret AWS Cost and Usage Reports (CUR) to identify cost drivers by service, account, and resource tag.
- Set up budget alerts with AWS Budgets and anomaly detection with AWS Cost Anomaly Detection, using AWS Cost Explorer to investigate flagged spend and limit overspending risk.
- Negotiate and manage Reserved Instance and Savings Plans commitments based on historical usage and forecasted demand.
- Implement chargeback models using cost allocation tags, ensuring accuracy and stakeholder acceptance.
- Optimize data transfer costs between regions, availability zones, and services using architectural adjustments.
- Evaluate egress fees and their impact on multi-cloud or hybrid exit strategies, incorporating them into vendor lock-in assessments.
- Conduct regular cost reviews with business units to align spending with value delivery and strategic priorities.
- Identify and remediate cost overruns caused by untagged resources, orphaned volumes, or idle instances.
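The last item in this module, finding untagged resources, orphaned volumes, and idle instances, can be sketched as a pass over a resource inventory. The record layout, the required tag keys, and the 5% CPU threshold below are all assumptions made for illustration; in practice the inventory would come from cost reports or resource APIs.

```python
REQUIRED_TAGS = {"cost-center", "owner"}  # assumed tagging standard


def find_waste(resources, idle_cpu_threshold=5.0):
    """Group resource IDs by the cost problem they exhibit."""
    findings = {"untagged": [], "orphaned_volumes": [], "idle_instances": []}
    for r in resources:
        # Missing any required tag key counts as untagged.
        if not REQUIRED_TAGS.issubset(r.get("tags", {})):
            findings["untagged"].append(r["id"])
        # A volume with no attachment is still billed for its storage.
        if r["type"] == "volume" and r.get("attached_to") is None:
            findings["orphaned_volumes"].append(r["id"])
        # Instances without a CPU figure are treated as busy, not idle.
        if (r["type"] == "instance"
                and r.get("avg_cpu_percent", 100.0) < idle_cpu_threshold):
            findings["idle_instances"].append(r["id"])
    return findings


# Hypothetical inventory records.
inventory = [
    {"id": "i-001", "type": "instance", "avg_cpu_percent": 1.2,
     "tags": {"cost-center": "cc-42", "owner": "data-team"}},
    {"id": "i-002", "type": "instance", "avg_cpu_percent": 63.0,
     "tags": {"owner": "web-team"}},
    {"id": "vol-001", "type": "volume", "attached_to": None,
     "tags": {"cost-center": "cc-42", "owner": "data-team"}},
    {"id": "vol-002", "type": "volume", "attached_to": "i-002", "tags": {}},
]

report = find_waste(inventory)
print(report)
```

Keeping the three categories separate matters for remediation: untagged resources need an owner, while orphaned volumes and idle instances need a delete-or-keep decision.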
Module 8: Security, Compliance, and Audit Readiness
- Implement AWS Config rules to enforce compliance with internal policies and regulatory standards (e.g., HIPAA, GDPR, PCI-DSS).
- Use AWS Security Hub to aggregate findings from GuardDuty, Inspector, and third-party tools for centralized risk visibility.
- Conduct evidence collection workflows for audits using automated scripts and AWS Artifact reports.
- Design encryption strategies for data at rest and in transit, managing customer managed keys in AWS KMS with rotation and access policies.
- Respond to security incidents using CloudTrail logs, VPC Flow Logs, and GuardDuty findings to trace attacker behavior.
- Implement least-privilege access for third-party vendors and contractors using time-bound, scoped IAM roles.
- Evaluate shared responsibility model implications for SaaS, PaaS, and IaaS services in audit planning.
- Integrate AWS with SIEM and SOAR platforms for real-time threat detection and response orchestration.
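The time-bound, scoped vendor access described in this module can be expressed in a role trust policy that pairs an external ID with an expiry on the `aws:CurrentTime` condition key. The sketch below only generates the document; the function name, account ID, and external ID are placeholders. One caveat to keep in mind: the expiry blocks new role assumptions, while sessions issued shortly before it remain valid until their own duration ends.

```python
from datetime import datetime, timedelta, timezone


def vendor_trust_policy(vendor_account_id, external_id, valid_for, now=None):
    """Build a trust policy that lets a vendor account assume the role
    only with the agreed external ID and only until the expiry time."""
    now = now or datetime.now(timezone.utc)
    expires = (now + valid_for).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{vendor_account_id}:root"
                },
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"sts:ExternalId": external_id},
                    "DateLessThan": {"aws:CurrentTime": expires},
                },
            }
        ],
    }


# Placeholder account ID and external ID; fixed start time for repeatability.
trust = vendor_trust_policy(
    vendor_account_id="111122223333",
    external_id="example-engagement-id",
    valid_for=timedelta(days=30),
    now=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
print(trust["Statement"][0]["Condition"]["DateLessThan"])
# → {'aws:CurrentTime': '2025-01-31T00:00:00Z'}
```

The permissions policy attached to the role still has to be scoped separately; the trust policy only controls who may assume it and until when.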
Module 9: Operational Excellence and Incident Management
- Define operational runbooks for common incidents (e.g., service degradation, failed deployments) using AWS Systems Manager.
- Configure CloudWatch alarms and dashboards to monitor application health, setting appropriate thresholds and notification channels.
- Implement structured logging with CloudWatch Logs Insights to enable rapid troubleshooting and pattern analysis.
- Use AWS Systems Manager Parameter Store or Secrets Manager for secure configuration and credential management.
- Automate routine operational tasks (patching, backups, restarts) using Systems Manager Automation and Maintenance Windows.
- Conduct blameless post-mortems for production incidents, capturing action items and updating controls to prevent recurrence.
- Design monitoring coverage for serverless and containerized workloads, addressing visibility gaps in ephemeral environments.
- Integrate AWS operations with existing ITSM platforms (e.g., ServiceNow) for incident ticketing and change management.
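Structured logging, as listed in this module, usually means emitting one JSON object per log line so that fields can be filtered and aggregated in CloudWatch Logs Insights. The sketch below shows one minimal way to do that with Python's standard `logging` module. The `context` attribute, logger name, and field names are conventions chosen for this example, and the in-memory stream stands in for whatever handler ships logs in a real deployment.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge per-call fields passed through extra={"context": {...}}.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)


# Write to an in-memory stream so the example is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info(
    "payment failed",
    extra={"context": {"order_id": "o-123", "latency_ms": 840}},
)

line = stream.getvalue().strip()
print(line)
```

With fields such as `order_id` and `latency_ms` present on every line, queries can filter and compute statistics on them directly instead of parsing free text.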
Module 10: Continuous Improvement and Cloud Maturity Assessment
- Conduct cloud maturity assessments using AWS Well-Architected Framework, prioritizing improvements across pillars.
- Establish feedback loops between development, operations, and business teams to refine cloud usage and investment.
- Measure technical debt in cloud infrastructure, including outdated AMIs, unpatched systems, and deprecated services.
- Implement infrastructure as code (IaC) using AWS CloudFormation or Terraform, ensuring version control and reproducibility.
- Manage IaC pipeline security, preventing privilege escalation and unauthorized changes in deployment workflows.
- Evaluate new AWS services and features for potential adoption, balancing innovation with stability and support burden.
- Develop cloud center of excellence (CCoE) operating models, defining roles, responsibilities, and decision-making processes.
- Track cloud competency development across teams using skill matrices and hands-on validation exercises.