Description

This curriculum matches the technical and operational rigor of a multi-workshop serverless adoption program. It addresses the architectural trade-offs, security controls, and operational patterns that arise when redesigning enterprise workloads for event-driven, scale-to-zero environments.

Module 1: Defining Serverless Scope and Service Boundaries

Selecting between Function-as-a-Service (e.g., AWS Lambda, Azure Functions) and Backend-as-a-Service (e.g., Firebase, Auth0) based on control requirements and integration complexity.
Deciding on function granularity: balancing single-responsibility functions against invocation overhead and monitoring sprawl.
Establishing ownership models for serverless components across distributed teams to prevent operational ambiguity.
Defining which workloads are appropriate for serverless (event-driven, sporadic) versus those better suited for containers or VMs (long-running, predictable).
Mapping legacy monolith capabilities to serverless functions while identifying data and state dependencies that impede decomposition.
Setting thresholds for cold start tolerance based on user experience requirements and geographic distribution needs.
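
To make the cold start tolerance item above concrete, the following is a minimal Python sketch. It assumes latency samples (in milliseconds) have already been collected for one endpoint, and the `LatencyBudget` type, its field names, and the threshold values are all hypothetical choices for illustration rather than part of any platform API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencyBudget:
    """Hypothetical user-experience budget for one endpoint or region."""
    max_latency_ms: float      # slowest response users should experience
    max_violation_rate: float  # tolerated fraction of requests above that limit


def within_cold_start_tolerance(samples_ms: Sequence[float],
                                budget: LatencyBudget) -> bool:
    """Return True if the share of slow requests stays within the budget."""
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    violations = sum(1 for s in samples_ms if s > budget.max_latency_ms)
    return violations / len(samples_ms) <= budget.max_violation_rate
```

A team might run a check like this per region, since cold start behavior and user expectations can differ by geography.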

Module 2: Infrastructure as Code for Serverless Deployments

Choosing between framework-based tooling (e.g., Serverless Framework, AWS SAM) and general-purpose IaC (e.g., Terraform, Pulumi) for managing function configurations.
Designing versioned deployment pipelines that support atomic updates of function code and associated IAM roles.
Managing environment-specific configurations (dev, staging, prod) without hardcoding or exposing secrets in source control.
Implementing rollback strategies for failed deployments when serverless platforms lack native version rollback triggers.
Enforcing tagging policies across functions to support cost allocation, compliance, and resource discovery.
Automating dependency validation (e.g., correct runtime versions, layer compatibility) before deployment to prevent runtime failures.
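
The tagging and dependency validation items above could be approached with a pre-deployment check along these lines. This is only a sketch: the configuration shape (a dict with `runtime` and `tags` keys), the required tag names, and the runtime allowlist are assumptions for illustration and do not correspond to any specific framework's schema.

```python
from typing import Any

# Assumed organizational policy; real tag names and runtimes will differ.
REQUIRED_TAGS = {"owner", "cost-center", "environment"}
ALLOWED_RUNTIMES = {"python3.12", "nodejs20.x"}


def validate_function_config(config: dict[str, Any]) -> list[str]:
    """Return a list of policy violations; an empty list means the config passes."""
    errors = []
    tags = config.get("tags") or {}
    missing = sorted(REQUIRED_TAGS - set(tags))
    if missing:
        errors.append("missing tags: " + ", ".join(missing))
    runtime = config.get("runtime")
    if runtime not in ALLOWED_RUNTIMES:
        errors.append(f"runtime not allowed: {runtime!r}")
    return errors
```

Running such a check in the pipeline before the deploy step lets a violation fail the build rather than surface at runtime.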

Module 3: Identity, Permissions, and Least Privilege Enforcement

Constructing IAM roles with minimal permissions for each function, avoiding wildcard policies even during development.
Managing cross-account function invocations securely using role assumption and resource-based policies.
Integrating short-lived credentials via OIDC or federated identity for functions accessing external SaaS APIs.
Rotating and auditing access keys used by functions that interact with legacy systems lacking IAM integration.
Implementing permission boundaries to constrain developer-deployed roles within organizational guardrails.
Monitoring for privilege escalation attempts through CloudTrail or equivalent audit logs when functions modify policies.
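
As one way to act on the "avoiding wildcard policies" item above, a pipeline could lint policy documents before deployment. The sketch below assumes the standard IAM JSON policy layout (`Statement`, `Effect`, `Action`, `Resource`); the function name `find_wildcards` and the wording of the findings are hypothetical, and the check deliberately ignores `NotAction`, `NotResource`, and conditions.

```python
def _as_list(value):
    """IAM allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcards(policy: dict) -> list[str]:
    """Report Allow statements that grant every action or every resource."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action")):
            # Flags "*" and service-wide grants such as "s3:*".
            if action == "*" or action.endswith(":*"):
                findings.append(f"statement {index}: wildcard action {action}")
        for resource in _as_list(stmt.get("Resource")):
            if resource == "*":
                findings.append(f"statement {index}: wildcard resource")
    return findings
```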

Module 4: Observability and Distributed Tracing

Correlating logs across fragmented function invocations using trace IDs propagated through event sources and APIs.
Configuring structured logging formats to ensure compatibility with centralized log aggregation systems (e.g., ELK, Splunk).
Instrumenting custom metrics for business-critical operations not captured by platform-native monitoring.
Setting up distributed tracing across serverless and non-serverless components using OpenTelemetry or vendor SDKs.
Filtering and sampling high-volume logs to control cost without losing diagnostic fidelity for error conditions.
Diagnosing performance bottlenecks in chained function calls by analyzing inter-function latency and payload size.
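
The trace ID correlation and structured logging items above can be illustrated with a small standard-library sketch. The header name `x-trace-id`, the event shape (a dict with a `headers` mapping), and both function names are assumptions for illustration; a production setup would more likely rely on OpenTelemetry or a vendor SDK for propagation.

```python
import json
import uuid

TRACE_HEADER = "x-trace-id"  # hypothetical header name


def extract_trace_id(event: dict) -> str:
    """Reuse the caller's trace ID if present, otherwise start a new trace."""
    headers = event.get("headers") or {}
    for name, value in headers.items():
        if name.lower() == TRACE_HEADER and value:
            return value
    return uuid.uuid4().hex


def log_line(trace_id: str, level: str, message: str, **fields) -> str:
    """Build one JSON log line that a log aggregator can index by trace_id."""
    record = {"trace_id": trace_id, "level": level, "message": message, **fields}
    return json.dumps(record, sort_keys=True)
```

Each function in a chain would call `extract_trace_id` on its incoming event and pass the same ID along in outgoing requests or messages, so that log lines from separate invocations can be joined later.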

Module 5: Event-Driven Design and Integration Patterns

Selecting event sources (e.g., S3, SQS, EventBridge) based on delivery guarantees, throughput, and retry semantics.
Designing idempotent functions to handle duplicate events from message queues or retry mechanisms.
Implementing dead-letter queues (DLQs) for failed event processing with alerting and reprocessing workflows.
Decoupling producers and consumers using event buses while managing schema evolution and backward compatibility.
Throttling function concurrency to prevent downstream system overload during traffic spikes.
Orchestrating complex workflows using managed state machines (e.g., AWS Step Functions) instead of chaining synchronous function calls.
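
The idempotency item above can be sketched as a wrapper that skips events it has already handled. This example assumes every event carries a unique `id` field, and it uses an in-memory set purely as a stand-in for a durable store; the class name `IdempotentProcessor` is hypothetical. In a real deployment the "seen" check would need an atomic conditional write to shared storage, because function instances do not share memory.

```python
from typing import Callable


class IdempotentProcessor:
    """Run a handler at most once per event ID (within this process)."""

    def __init__(self, handler: Callable[[dict], None]):
        self._handler = handler
        self._seen: set[str] = set()  # in-memory stand-in for a durable store

    def process(self, event: dict) -> bool:
        """Return True if the event was handled, False if it was a duplicate."""
        key = event["id"]  # assumes each event carries a unique ID
        if key in self._seen:
            return False
        self._handler(event)
        # Recorded only after success, so a failed attempt can be retried.
        self._seen.add(key)
        return True
```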

Module 6: Security, Compliance, and Data Protection

Encrypting function environment variables at rest and in transit using KMS or equivalent key management services.
Scanning function packages for vulnerabilities and embedded secrets during CI/CD pipeline execution.
Enforcing data residency requirements by restricting function deployment regions and data egress points.
Implementing input validation and sanitization to prevent injection attacks via event payloads.
Auditing function configuration changes using configuration drift detection tools and alerting on unauthorized modifications.
Meeting compliance requirements (e.g., SOC 2, HIPAA) by documenting serverless control implementations and evidence collection processes.
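
For the input validation item above, a function can reject malformed payloads before any business logic runs. The sketch below is illustrative only: the event schema (`order_id`, `quantity`), the allowed character set, and the numeric limits are invented for the example, and the payload is assumed to be already JSON-decoded.

```python
import re

# Hypothetical allowlist: letters, digits, and hyphens, at most 36 characters.
ORDER_ID_PATTERN = re.compile(r"[A-Za-z0-9-]{1,36}")
ALLOWED_FIELDS = {"order_id", "quantity"}


def validate_order_event(payload) -> list[str]:
    """Return validation errors; an empty list means the payload is accepted."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    errors = []
    order_id = payload.get("order_id")
    if not isinstance(order_id, str) or not ORDER_ID_PATTERN.fullmatch(order_id):
        errors.append("order_id must be 1-36 letters, digits, or hyphens")
    quantity = payload.get("quantity")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int) or not 1 <= quantity <= 1000:
        errors.append("quantity must be an integer between 1 and 1000")
    unexpected = sorted(set(payload) - ALLOWED_FIELDS)
    if unexpected:
        errors.append("unexpected fields: " + ", ".join(unexpected))
    return errors
```

Allowlisting the accepted shape, rather than trying to strip known-bad input, tends to be the more robust choice when payloads are later passed to queries or shell commands.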

Module 7: Performance Optimization and Cost Management

Tuning function memory and timeout settings based on profiling data to balance performance and cost.
Using provisioned concurrency to mitigate cold starts in latency-sensitive applications, weighing cost implications.
Monitoring invocation patterns to identify and eliminate idle or underutilized functions.
Optimizing package size by removing unused dependencies and leveraging layers for shared code.
Forecasting and budgeting for variable serverless costs based on usage trends and scaling behavior.
Implementing circuit breakers and bulkheads in function-to-function communication to prevent cascading failures under load.
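
The memory tuning item above rests on simple arithmetic: compute cost scales with allocated memory multiplied by execution time, so a larger memory setting can be cheaper if it shortens the run enough. The sketch below assumes a GB-second billing model with the price passed in as a parameter; the profiling numbers and the price used in any call are placeholders, and per-request fees and billing granularity are ignored.

```python
def cost_per_invocation(memory_mb: int, duration_ms: float,
                        price_per_gb_second: float) -> float:
    """Compute cost of one invocation under a GB-second billing model."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * price_per_gb_second


def cheapest_memory_setting(profile: dict[int, float],
                            price_per_gb_second: float) -> int:
    """Pick the memory size (MB) with the lowest cost per invocation.

    `profile` maps memory size in MB to measured average duration in ms.
    """
    return min(
        profile,
        key=lambda mb: cost_per_invocation(mb, profile[mb], price_per_gb_second),
    )
```

With a profile such as `{128: 800.0, 512: 180.0, 1024: 100.0}`, the 512 MB setting consumes the fewest GB-seconds, which shows why the smallest allocation is not automatically the cheapest.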

Module 8: Disaster Recovery and Operational Resilience

Designing multi-region failover strategies for critical serverless APIs using DNS routing and replicated event sources.
Backing up function code, configuration, and environment variables to version-controlled repositories or artifact stores.
Testing recovery procedures by simulating region outages and measuring RTO/RPO for serverless workloads.
Managing dependencies on managed services (e.g., API Gateway, DynamoDB) that may not support cross-region replication by default.
Documenting manual intervention steps for incidents involving platform-level outages beyond organizational control.
Establishing incident response playbooks specific to serverless failures, including log access, tracing, and rollback procedures.
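
To support the recovery testing item above, the timestamps recorded during a simulated region outage can be turned into achieved recovery time and data loss figures and compared against targets. This is a minimal sketch: the `DrillResult` type, its field names, and the idea of using the last replicated write as the recovery point are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DrillResult:
    """Timestamps captured during one failover drill (hypothetical shape)."""
    last_replicated_write: datetime  # newest data present in the standby region
    outage_start: datetime           # moment the primary region was taken down
    service_restored: datetime       # moment the standby served traffic correctly

    @property
    def recovery_time(self) -> timedelta:
        """Achieved recovery time, to compare against the RTO."""
        return self.service_restored - self.outage_start

    @property
    def data_loss_window(self) -> timedelta:
        """Achieved recovery point, to compare against the RPO."""
        return self.outage_start - self.last_replicated_write


def meets_targets(result: DrillResult, rto: timedelta, rpo: timedelta) -> bool:
    """Return True only if both recovery objectives were met in the drill."""
    return result.recovery_time <= rto and result.data_loss_window <= rpo
```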