This curriculum covers the technical and governance dimensions of cloud infrastructure setup. Its scope and sequence are comparable to a multi-workshop advisory engagement for establishing a production-ready cloud environment across global enterprise operations.
Module 1: Cloud Provider Selection and Service Model Alignment
- Evaluate regional availability of managed services to determine compliance with data sovereignty requirements across operating jurisdictions.
- Compare SLA terms for compute and storage across AWS, Azure, and GCP to align with application uptime and recovery time objectives (RTO).
- Assess egress pricing models when designing data-intensive workloads to avoid unexpected operational costs during peak transfer periods.
- Select between IaaS, PaaS, and container-based services based on application refactoring scope and team DevOps maturity.
- Negotiate enterprise agreements by projecting three-year usage patterns for reserved instances and committed use discounts.
- Validate identity federation capabilities with existing SSO providers to ensure seamless integration with corporate identity stores.
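
The usage projection behind a reserved-instance or committed-use negotiation can be sketched as a break-even calculation. This is a minimal sketch, assuming a single instance, a flat hourly rate for the whole term, and a commitment billed for every hour whether or not the instance runs. The function name and the rates in the usage note are hypothetical, not provider pricing.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def commitment_comparison(on_demand_hourly, committed_hourly, utilization, years=3):
    """Compare on-demand and committed cost for one instance over the term.

    utilization is the expected fraction of hours (0.0 to 1.0) the instance runs.
    Assumption: on-demand is billed only for hours used, while the commitment
    is billed for every hour of the term.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")

    total_hours = HOURS_PER_YEAR * years
    on_demand_cost = on_demand_hourly * total_hours * utilization
    committed_cost = committed_hourly * total_hours

    return {
        "on_demand_cost": round(on_demand_cost, 2),
        "committed_cost": round(committed_cost, 2),
        # Positive savings means the commitment is cheaper.
        "savings": round(on_demand_cost - committed_cost, 2),
        # Utilization above this fraction favors the commitment.
        "break_even_utilization": round(committed_hourly / on_demand_hourly, 4),
    }
```

With hypothetical rates of 0.10 on demand and 0.06 committed, the break-even utilization is 0.6: a workload expected to run half the time is cheaper on demand, while one running 90% of the time favors the commitment.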
Module 2: Landing Zone Design and Multi-Account Architecture
- Implement AWS Organizations or Azure Management Groups to enforce separation of environments (dev, test, prod) with distinct billing and access controls.
- Define guardrail policies using SCPs or Azure Policy to restrict region usage and prevent unauthorized service deployment.
- Architect a centralized logging and security account with automated ingestion of CloudTrail, VPC Flow Logs, and Azure Activity Logs.
- Configure DNS resolution across VPCs and on-premises networks using private hosted zones and transit gateways.
- Establish cross-account IAM roles with least privilege for CI/CD pipelines operating across development and production accounts.
- Design tagging standards enforced at resource creation to support cost allocation and automated governance workflows.
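
The region guardrail in this module can be illustrated by generating a service control policy document. This is a minimal sketch, assuming the AWS pattern of a `Deny` statement conditioned on the `aws:RequestedRegion` key. The function name is hypothetical, and the list of exempted global-service actions is illustrative only and would need review against the services actually in use.

```python
import json


def build_region_guardrail(approved_regions, exempt_actions=()):
    """Return an SCP document that denies requests outside the approved regions.

    exempt_actions lists actions for global services that must keep working
    regardless of region (illustrative; the caller supplies the list).
    """
    if not approved_regions:
        raise ValueError("at least one approved region is required")

    statement = {
        "Sid": "DenyOutsideApprovedRegions",
        "Effect": "Deny",
        "Resource": "*",
        "Condition": {
            "StringNotEquals": {"aws:RequestedRegion": sorted(approved_regions)}
        },
    }
    if exempt_actions:
        # Deny everything except the exempted actions.
        statement["NotAction"] = sorted(exempt_actions)
    else:
        statement["Action"] = "*"

    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    policy = build_region_guardrail(
        ["eu-west-1", "eu-central-1"],
        exempt_actions=["iam:*", "sts:*", "organizations:*"],
    )
    print(json.dumps(policy, indent=2))
```

Generating the policy from code rather than editing JSON by hand keeps the approved-region list in one reviewable place.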
Module 3: Network Architecture and Connectivity
- Size and deploy AWS Direct Connect or Azure ExpressRoute circuits based on application throughput benchmarks and failover requirements.
- Implement VPC peering or Transit Gateway routing tables to control traffic flow between application tiers and shared services.
- Size NAT gateways for burst workloads so that peak concurrent connections do not exhaust source ports or IP addresses.
- Design hybrid DNS resolution to resolve on-premises and cloud-hosted services within a unified namespace.
- Enforce network segmentation using security groups and NSGs that reflect zero-trust principles and application dependency mapping.
- Plan IP address space allocation across regions and VPCs to prevent overlap during future mergers or migrations.
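
The overlap check behind IP address planning can be sketched with the standard-library `ipaddress` module. This is a minimal sketch; the allocation names and CIDR blocks in the usage note are hypothetical.

```python
import ipaddress
from itertools import combinations


def find_overlaps(allocations):
    """Return pairs of allocation names whose CIDR blocks overlap.

    allocations maps a name (for example a VPC or region label) to a CIDR string.
    """
    networks = {
        name: ipaddress.ip_network(cidr) for name, cidr in allocations.items()
    }
    return [
        (first, second)
        for first, second in combinations(sorted(networks), 2)
        if networks[first].overlaps(networks[second])
    ]
```

For example, a plan containing `10.0.0.0/16` for one VPC and `10.0.128.0/17` for an acquired company's network would report that pair, since the second block sits inside the first.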
Module 4: Identity and Access Management Governance
- Implement identity federation using SAML 2.0 or OIDC to map corporate directory roles to cloud IAM groups.
- Define and rotate long-term access keys for service accounts using automated credential rotation pipelines.
- Enforce MFA for privileged roles and configure conditional access policies based on sign-in risk and location.
- Conduct quarterly access certification reviews using automated reports on inactive IAM users and overprivileged roles.
- Design role chaining policies with session duration limits to reduce lateral movement risk during administrative tasks.
- Integrate IAM with SIEM tools to trigger alerts on anomalous activities such as root account usage or console login from new geographies.
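
The access certification and key rotation checks in this module can be sketched as a filter over a credential report. This is a minimal sketch, assuming the report has already been exported into records with `user`, `key_created`, and `last_used` fields. That record shape and the 90-day thresholds are assumptions, not a provider's report format.

```python
from datetime import datetime, timedelta, timezone


def flag_stale_credentials(records, now, max_key_age_days=90, max_idle_days=90):
    """Return (user, reasons) tuples for credentials that need review.

    Each record is a dict with:
      user        - account name
      key_created - timezone-aware datetime when the access key was issued
      last_used   - timezone-aware datetime of last activity, or None if never used
    """
    max_key_age = timedelta(days=max_key_age_days)
    max_idle = timedelta(days=max_idle_days)
    findings = []

    for record in records:
        reasons = []
        if now - record["key_created"] > max_key_age:
            reasons.append("access key overdue for rotation")
        last_used = record.get("last_used")
        if last_used is None or now - last_used > max_idle:
            reasons.append("no recent activity")
        if reasons:
            findings.append((record["user"], reasons))

    return findings
```

Passing `now` explicitly, for instance `datetime.now(timezone.utc)`, keeps the check reproducible when the same report is re-evaluated during a quarterly review.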
Module 5: Data Migration and Storage Strategy
- Select between offline (Snowball, Azure Data Box) and online data transfer based on dataset size and network bandwidth constraints.
- Classify data sensitivity to determine encryption requirements and key management approach (KMS, customer-managed keys).
- Migrate legacy file shares to cloud-native storage using lift-and-shift tools while preserving NTFS permissions.
- Implement lifecycle policies to transition infrequently accessed data from standard to archive storage tiers.
- Design multi-region replication for critical data using native services like S3 Cross-Region Replication or Azure Geo-Redundant Storage.
- Validate data integrity post-migration using checksum comparisons and automated reconciliation scripts.
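
The post-migration integrity check can be sketched as a checksum manifest built on each side and then reconciled. This is a minimal sketch, assuming both the source and the migrated copy are reachable as local or mounted directory trees. The function names are hypothetical, and SHA-256 is one reasonable choice of digest rather than a requirement.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root onto its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def reconcile(source_manifest, target_manifest):
    """Report files that are missing, unexpected, or different on the target."""
    source_keys = set(source_manifest)
    target_keys = set(target_manifest)
    return {
        "missing": sorted(source_keys - target_keys),
        "unexpected": sorted(target_keys - source_keys),
        "mismatched": sorted(
            key
            for key in source_keys & target_keys
            if source_manifest[key] != target_manifest[key]
        ),
    }
```

An empty result in all three lists indicates that every source file arrived with identical content. Any non-empty list is a candidate for re-transfer before the source is retired.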
Module 6: Security and Compliance Enforcement
- Deploy automated configuration auditing using AWS Config or Microsoft Defender for Cloud (formerly Azure Security Center) to detect non-compliant resources.
- Integrate vulnerability scanning into CI/CD pipelines to block deployment of container images with critical CVEs.
- Implement host-based firewall rules on EC2 and Azure VMs to supplement network-level controls.
- Configure WAF rules to protect public-facing applications from OWASP Top 10 threats during and after migration.
- Establish encryption-by-default policies for all EBS volumes, Azure Managed Disks, and object storage buckets.
- Conduct penetration testing post-migration and remediate findings before decommissioning legacy environments.
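
The CI/CD gate that blocks images with critical CVEs can be sketched as a small decision function over scanner findings. This is a minimal sketch, assuming the scanner output has already been parsed into dicts with `id` and `severity` keys. That shape, the function names, and the placeholder identifiers in the usage note are hypothetical and do not correspond to any specific scanner's report format.

```python
SEVERITY_ORDER = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def blocking_findings(findings, threshold="CRITICAL", allowlist=()):
    """Return findings at or above the threshold that are not allowlisted.

    Unknown severity labels are treated as CRITICAL so the gate fails closed.
    """
    limit = SEVERITY_ORDER[threshold]
    accepted = set(allowlist)
    return [
        finding
        for finding in findings
        if SEVERITY_ORDER.get(finding["severity"].upper(), SEVERITY_ORDER["CRITICAL"]) >= limit
        and finding["id"] not in accepted
    ]


def gate_exit_code(findings, threshold="CRITICAL", allowlist=()):
    """Return 1 (fail the pipeline stage) if any blocking finding remains, else 0."""
    return 1 if blocking_findings(findings, threshold, allowlist) else 0
```

A pipeline step could pass the result to `sys.exit`, so that a scan containing a finding such as `{"id": "EXAMPLE-0001", "severity": "critical"}` stops the deployment unless that identifier has been explicitly accepted in the allowlist.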
Module 7: Automation and Infrastructure as Code (IaC)
- Standardize Terraform or Bicep module interfaces to ensure consistent deployment of networking and compute resources.
- Implement state file management using remote backends with locking and versioning to prevent concurrent-write conflicts and allow recovery of earlier state.
- Enforce IaC code reviews using pre-commit hooks and static analysis tools like Checkov or tfsec.
- Design reusable module inputs to support multiple environments while maintaining isolation and naming consistency.
- Automate drift detection by scheduling periodic plan executions and alerting on unapproved changes.
- Integrate IaC into CI/CD pipelines with approval gates for production deployments using pull request workflows.
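
Scheduled drift detection can be sketched around the exit code of `terraform plan -detailed-exitcode`, which returns 0 when there are no changes, 1 on error, and 2 when the plan contains changes. This is a minimal sketch, assuming Terraform is installed and the working directory is already initialized. The function names and status labels are hypothetical, and the alerting step is left to the caller.

```python
import subprocess


def classify_plan_exit_code(code):
    """Translate a `terraform plan -detailed-exitcode` exit code into a status."""
    if code == 0:
        return "in_sync"
    if code == 2:
        return "drift_detected"
    return "plan_failed"


def check_drift(working_dir):
    """Run a plan in working_dir and return (status, plan_output).

    Assumes `terraform init` has already been run in working_dir.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=working_dir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit_code(result.returncode), result.stdout
```

A scheduler could call `check_drift` for each environment directory and raise an alert whenever the status is anything other than `in_sync`. Treating `plan_failed` as alert-worthy avoids silently skipping environments whose credentials or backend have broken.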
Module 8: Monitoring, Cost Management, and Operational Readiness
- Deploy centralized monitoring dashboards using CloudWatch, Azure Monitor, or third-party tools to track resource utilization and error rates.
- Set up predictive cost alerts based on historical spend trends to flag potential budget overruns.
- Configure automated scaling policies using CPU, memory, and custom metrics to balance performance and cost.
- Document runbooks for common failure scenarios such as database failover, DNS misconfigurations, and IAM lockouts.
- Conduct cutover dry runs to validate DNS TTL adjustments, database replication lag, and application failback procedures.
- Establish operational handover processes including shift rotations, escalation paths, and incident response integration.
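
The predictive cost alert in this module can be sketched as a run-rate projection of month-to-date spend. This is a minimal sketch, assuming one spend figure per day from the first of the month through the evaluation date. A simple daily average stands in here for a real trend model, and the function names, status labels, and 90% warning ratio are hypothetical.

```python
import calendar
from datetime import date


def project_month_end_spend(daily_spend, as_of):
    """Project total spend for the month from the average daily spend so far.

    daily_spend holds one value per day, from day 1 through as_of.day.
    """
    if len(daily_spend) != as_of.day:
        raise ValueError("expected one spend value per elapsed day of the month")
    days_in_month = calendar.monthrange(as_of.year, as_of.month)[1]
    average_daily = sum(daily_spend) / len(daily_spend)
    return average_daily * days_in_month


def budget_alert(daily_spend, as_of, budget, warn_ratio=0.9):
    """Classify the projection as on_track, at_risk, or over_budget."""
    projected = project_month_end_spend(daily_spend, as_of)
    if projected > budget:
        return "over_budget"
    if projected > budget * warn_ratio:
        return "at_risk"
    return "on_track"
```

Ten days of spend at 100 per day in a 30-day month projects to 3000, which would be flagged as over budget against 2500 and as at risk against 3200.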