Description

This curriculum matches the technical and operational rigor of a multi-workshop cloud migration program. It covers the cross-functional planning, platform engineering, and runtime governance activities typical of enterprise advisory engagements for cloud-native transformation.

Module 1: Strategic Assessment and Readiness for Cloud-Native Migration

Evaluate existing monolithic applications to determine rehost, refactor, or rebuild decisions based on business criticality and technical debt.
Map legacy system dependencies using automated discovery tools to identify integration risks during migration.
Define cloud readiness criteria including compliance posture, data residency constraints, and operational maturity.
Conduct workload profiling to assess performance, scalability, and availability requirements pre-migration.
Establish cross-functional migration teams with clear ownership across development, operations, security, and business units.
Develop a phased migration roadmap prioritizing applications by risk, value, and interdependencies.
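As one possible illustration of the roadmap objective above, the sketch below groups applications into migration waves so that each application moves only after the applications it depends on, and orders each wave by a value-minus-risk score. The application names, scores, and scoring formula are hypothetical and are not part of the curriculum; the ordering uses Python's standard graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical inventory: app -> (business value 1-10, migration risk 1-10)
SCORES = {
    "billing": (9, 7),
    "auth": (8, 3),
    "reports": (4, 2),
    "catalog": (6, 4),
}

# Hypothetical dependencies: app -> set of apps it depends on
DEPENDS_ON = {
    "billing": {"auth", "catalog"},
    "reports": {"billing"},
    "catalog": {"auth"},
    "auth": set(),
}


def plan_waves(scores, depends_on):
    """Group apps into waves; every app depends only on earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = list(sorter.get_ready())
        # Within a wave, put high-value, low-risk apps first (assumed heuristic).
        ready.sort(key=lambda app: scores[app][0] - scores[app][1], reverse=True)
        waves.append(ready)
        sorter.done(*ready)
    return waves


WAVES = plan_waves(SCORES, DEPENDS_ON)
for number, wave in enumerate(WAVES, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

A real roadmap would weigh more factors (compliance deadlines, team capacity), but the dependency-first ordering is the part that is easy to get wrong by hand.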

Module 2: Cloud Platform Selection and Architecture Design

Compare managed Kubernetes services (EKS, AKS, GKE) based on control plane SLAs, networking models, and integration with existing tooling.
Design multi-account or multi-tenant cloud landing zones with isolated environments for dev, staging, and production.
Implement identity federation between on-premises directories and cloud IAM using SAML or OIDC.
Select storage backends (block, object, file) based on application I/O patterns and durability requirements.
Architect hybrid connectivity using AWS Direct Connect, Azure ExpressRoute, or Google Cloud VPN, with failover and bandwidth planning.
Define network segmentation using VPC peering, transit gateways, or service mesh sidecar patterns.
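The landing zone and segmentation objectives above depend on non-overlapping address ranges per environment. The following is a minimal sketch, assuming a single hypothetical parent range (10.0.0.0/16) split into equal blocks for dev, staging, and production; the function name and parameters are illustrative, and only the standard ipaddress module is used.

```python
import ipaddress


def allocate_environment_cidrs(parent_cidr, environments, new_prefix):
    """Assign each environment its own non-overlapping block of the parent range."""
    parent = ipaddress.ip_network(parent_cidr)
    subnets = parent.subnets(new_prefix=new_prefix)  # generator of equal-size blocks
    allocation = {}
    for env in environments:
        try:
            allocation[env] = next(subnets)
        except StopIteration:
            raise ValueError(
                f"{parent_cidr} has no /{new_prefix} block left for {env!r}"
            ) from None
    return allocation


# Hypothetical plan: three environments, one /18 each, one /18 left in reserve.
PLAN = allocate_environment_cidrs("10.0.0.0/16", ["dev", "staging", "prod"], 18)
for env, network in PLAN.items():
    print(f"{env}: {network} ({network.num_addresses} addresses)")
```

Keeping the ranges disjoint up front matters because VPC peering and transit gateway routing generally cannot resolve overlapping CIDR blocks later.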

Module 3: Containerization and Microservices Transformation

Break monolithic applications into bounded-context services using domain-driven design and transactional boundary analysis.
Containerize legacy applications with minimal code changes while managing stateful components and local filesystem dependencies.
Define Docker image build standards including base OS selection, CVE scanning, and artifact signing.
Implement sidecar patterns for logging, monitoring, and configuration to decouple cross-cutting concerns.
Manage service discovery and inter-service communication using DNS, service mesh, or API gateways.
Refactor synchronous HTTP calls to asynchronous event-driven patterns using message queues or event buses.
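To make the last objective above concrete, here is a small sketch of replacing a blocking call with a published event that a separate worker consumes. It uses an in-process asyncio.Queue purely as a stand-in for a message queue or event bus; the event shape, the order IDs, and the function names are hypothetical.

```python
import asyncio


async def publish_order(queue, order_id):
    """Producer: emit an event and return without waiting for the consumer."""
    await queue.put({"type": "order_placed", "order_id": order_id})


async def inventory_worker(queue, reserved):
    """Consumer: react to events independently of the producer."""
    while True:
        event = await queue.get()
        try:
            if event["type"] == "order_placed":
                reserved.append(event["order_id"])
        finally:
            queue.task_done()  # mark the event as handled


async def main():
    queue = asyncio.Queue()  # stand-in for a real broker
    reserved = []
    worker = asyncio.create_task(inventory_worker(queue, reserved))
    for order_id in (101, 102, 103):
        await publish_order(queue, order_id)
    await queue.join()  # wait until every event has been processed
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return reserved


RESERVED = asyncio.run(main())
print(RESERVED)
```

With a real broker the producer and consumer would run in separate services, and delivery guarantees, ordering, and idempotent handling would need explicit design.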

Module 4: CI/CD Pipeline Implementation for Cloud-Native Systems

Design immutable deployment pipelines with versioned artifacts and environment promotion gates.
Integrate static code analysis, SAST, and dependency scanning into pull request workflows.
Configure canary deployments with automated rollback based on health checks and metrics thresholds.
Manage infrastructure as code using GitOps workflows with tools like Argo CD or Flux.
Secure pipeline secrets using external vaults (HashiCorp Vault, AWS Secrets Manager) instead of environment variables.
Enforce pipeline compliance through policy-as-code tools like OPA or Sentinel for IaC validation.
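The canary objective above comes down to a comparison between canary and baseline metrics. The sketch below shows one way such a gate could be expressed; the thresholds (1% error rate, 20% latency regression) and the metric names are assumptions chosen for illustration, not values prescribed by the curriculum or by any particular deployment tool.

```python
from dataclasses import dataclass


@dataclass
class CanaryMetrics:
    error_rate: float      # fraction of failed requests, 0.0 to 1.0
    p99_latency_ms: float  # 99th percentile latency in milliseconds


def canary_decision(baseline, canary, max_error_rate=0.01, max_latency_ratio=1.2):
    """Return 'rollback' if the canary breaches a threshold, else 'promote'."""
    if canary.error_rate > max_error_rate:
        return "rollback"
    if canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio:
        return "rollback"
    return "promote"


baseline = CanaryMetrics(error_rate=0.002, p99_latency_ms=180.0)
healthy = CanaryMetrics(error_rate=0.003, p99_latency_ms=190.0)
slow = CanaryMetrics(error_rate=0.003, p99_latency_ms=260.0)

print(canary_decision(baseline, healthy))
print(canary_decision(baseline, slow))
```

In a pipeline, the metrics would come from the monitoring system and the decision would drive the deployment controller rather than a print statement.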

Module 5: Observability and Runtime Governance

Standardize telemetry collection across logs, metrics, and traces using OpenTelemetry instrumentation.
Design log aggregation pipelines with filtering, sampling, and retention policies to control cost and volume.
Define SLOs and error budgets for critical services to guide incident response and release pacing.
Configure distributed tracing to diagnose latency across service boundaries and third-party dependencies.
Implement synthetic monitoring to validate end-to-end user journeys across cloud regions.
Enforce tagging and resource naming policies to enable chargeback, cost allocation, and security audits.
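For the SLO and error budget objective above, the arithmetic can be sketched as follows. A 99.9% availability target over 1,000,000 requests allows roughly 1,000 failures; the fraction of that allowance still unspent is the remaining budget. The 20% release-freeze threshold is a hypothetical policy used only for illustration.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Return the unspent fraction of the error budget (negative if overspent)."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failures = (1 - slo_target) * total_requests
    if allowed_failures == 0:
        return 1.0  # no traffic yet, so nothing has been spent
    return 1 - failed_requests / allowed_failures


def release_allowed(remaining, freeze_below=0.2):
    """Assumed policy: pause feature releases when little budget is left."""
    return remaining >= freeze_below


remaining = error_budget_remaining(0.999, 1_000_000, 250)
print(f"budget remaining: {remaining:.0%}, release allowed: {release_allowed(remaining)}")
```

The same calculation works for time-based SLOs by substituting minutes of downtime for failed requests.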

Module 6: Security and Compliance in Cloud-Native Environments

Apply least-privilege IAM roles to workloads using pod identity or instance profiles instead of shared credentials.
Scan container images in registries for vulnerabilities and enforce admission controls via Kubernetes policies.
Encrypt data in transit using mTLS across service mesh or API gateway layers.
Implement network policies to restrict pod-to-pod communication based on zero-trust principles.
Conduct regular configuration audits of Kubernetes clusters using tools like kube-bench or Polaris.
Integrate cloud security posture management (CSPM) tools to detect misconfigurations in real time.
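The image scanning and admission control objective above can be reduced to a simple decision rule. The sketch below is plain Python, not a Kubernetes admission webhook or policy-engine rule; the registry name, the severity labels, and the function signature are assumptions for illustration, and in a cluster this logic would typically live in a policy engine or admission controller.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def admit_image(image, findings, allowed_registries, blocked_from="HIGH"):
    """Return (allowed, reason) for an image reference and its scan findings.

    findings is a list of severity labels reported by a scanner (assumed format).
    """
    registry = image.split("/", 1)[0]
    if registry not in allowed_registries:
        return False, f"registry {registry!r} is not allowed"
    threshold = SEVERITY_ORDER.index(blocked_from)
    for severity in findings:
        if SEVERITY_ORDER.index(severity) >= threshold:
            return False, f"image has a {severity} vulnerability"
    return True, "ok"


ALLOWED = {"registry.example.com"}  # hypothetical trusted registry
print(admit_image("registry.example.com/team/app:1.0", ["LOW"], ALLOWED))
print(admit_image("registry.example.com/team/app:1.1", ["LOW", "CRITICAL"], ALLOWED))
print(admit_image("docker.io/library/nginx:latest", [], ALLOWED))
```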

Module 7: Resilience, Scaling, and Cost Optimization

Design failure domains across availability zones and implement pod disruption budgets to preserve availability during node drains and rolling cluster upgrades.
Configure horizontal pod autoscalers on custom or external metrics such as application queue depth, and vertical pod autoscalers to adjust CPU and memory requests.
Implement circuit breakers and retry logic with exponential backoff in service clients.
Use spot instances or preemptible VMs for stateless workloads with fallback strategies for termination events.
Right-size container resource requests and limits using historical utilization data from monitoring tools.
Optimize storage costs by tiering data across storage classes and automating lifecycle policies.
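The circuit breaker and retry objective above can be sketched as follows. This is a simplified, single-threaded illustration: the retry helper uses exponential backoff with full jitter, and the breaker opens after a configurable number of consecutive failures, then allows a trial call once a timeout has passed. All names, defaults, and thresholds are assumptions; production clients would normally rely on a maintained resilience library or on service mesh features.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=0.1, max_delay=5.0,
                       sleep=time.sleep):
    """Retry call() on any exception, doubling the delay cap each attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # full jitter spreads out retries


class CircuitBreaker:
    """Fail fast after repeated failures; retry once the timeout elapses."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: call rejected")
            # Timeout elapsed: fall through and allow one trial call.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result
```

A typical composition wraps the breaker around the remote call and the retry helper around the breaker, so that retries stop quickly once the circuit opens.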

Module 8: Operationalization and Team Enablement

Define runbooks for common incidents including pod evictions, node failures, and API throttling.
Establish on-call rotations with escalation paths and integrate alerts into incident management systems.
Conduct blameless postmortems to document root causes and track remediation actions.
Standardize developer onboarding with self-service environments via internal developer portals.
Train operations teams on Kubernetes debugging tools (kubectl, stern, k9s) and log querying syntax.
Implement feedback loops from production telemetry into development backlog prioritization.