This curriculum covers the technical and procedural rigor of a multi-workshop benchmarking initiative, comparable to establishing an internal performance engineering program within a cloud-native organisation.
Module 1: Defining Performance Metrics for Technical Systems
- Selecting latency percentiles (p50, p95, p99) based on user experience requirements and system SLAs
- Deciding between throughput measurements in requests per second versus completed transactions per minute
- Implementing custom instrumentation to capture meaningful metrics not exposed by default in cloud platforms
- Aligning metric definitions across development, operations, and business teams to avoid misinterpretation
- Handling metric drift when underlying infrastructure changes (e.g., containerization, serverless migration)
- Establishing thresholds for metric degradation that trigger investigation without causing alert fatigue
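The percentile selection and degradation-threshold topics above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the 10% tolerance and the metric names are assumptions chosen for the example.

```python
import statistics

def latency_percentiles(samples_ms):
    """Compute p50/p95/p99 from a list of latency samples in milliseconds."""
    # statistics.quantiles with n=100 yields the 1st..99th percentile cut points.
    cuts = statistics.quantiles(sorted(samples_ms), n=100)
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}

def degraded(current, baseline, tolerance=0.10):
    """Flag a percentile only when it exceeds its baseline by more than
    `tolerance` (here an assumed 10%), so small run-to-run fluctuations
    do not trigger alerts and cause alert fatigue."""
    return [k for k in baseline if current[k] > baseline[k] * (1 + tolerance)]
```

In practice the tolerance per percentile would be tuned against historical variance rather than fixed globally, since tail percentiles (p99) are naturally noisier than p50.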
Module 2: Selecting and Configuring Benchmarking Tools
- Choosing between open-source tools (e.g., JMeter, k6) and commercial load-testing platforms based on test complexity and compliance needs
- Configuring realistic client-side behavior in synthetic load generators, including think times and session persistence
- Integrating benchmarking tools with CI/CD pipelines without introducing pipeline bottlenecks
- Managing tool licensing and scalability constraints when simulating large-scale distributed loads
- Validating tool accuracy by cross-referencing results with production telemetry under controlled conditions
- Securing test environments to prevent benchmarking tools from exposing credentials or triggering security alerts
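The think-time and session-persistence behavior described above can be modeled with a toy closed-loop generator. The `send` callable and its request shape are hypothetical stand-ins for whatever transport the chosen tool provides.

```python
import random
import time

def run_session(send, n_requests=5, think_time_s=(0.5, 2.0), rng=None):
    """Drive one synthetic user session against `send(request, session)`.

    `send` is a caller-supplied callable (a placeholder here) that issues one
    request and may mutate `session` to persist cookies or tokens across calls.
    Returns per-request latencies in seconds.
    """
    rng = rng or random.Random()
    session = {}  # persisted across requests, like a browser cookie jar
    latencies = []
    for i in range(n_requests):
        start = time.perf_counter()
        send({"seq": i}, session)
        latencies.append(time.perf_counter() - start)
        # Think time models human pacing; without it a closed-loop generator
        # hammers the target in a tight loop and overstates realistic load.
        time.sleep(rng.uniform(*think_time_s))
    return latencies
```

Real tools such as JMeter and k6 expose the same two knobs (pacing and session state) declaratively; the sketch only shows why both matter for realism.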
Module 3: Designing Representative Workloads
- Extracting and anonymizing production traffic patterns for replay in staging environments
- Adjusting payload sizes and request distributions to reflect seasonal or regional usage spikes
- Modeling mixed workloads (read-heavy, write-heavy, batch) to test system balance under real-world conditions
- Deciding when to use synthetic versus production-derived workloads based on data sensitivity and availability
- Simulating user concurrency accurately when user sessions involve multiple backend dependencies
- Accounting for caching effects by warming systems before workload execution and measuring cold-start impact
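The mixed-workload and cache-warming points above can be sketched as follows. The operation names and the 10% warm-up fraction are illustrative assumptions; real mixes are usually derived from anonymized production traces.

```python
import random

def build_workload(n_ops, mix, rng=None):
    """Draw an operation sequence matching a target mix, e.g. a read-heavy
    profile {"read": 0.8, "write": 0.15, "batch": 0.05}."""
    rng = rng or random.Random(42)  # fixed seed for reproducible runs
    ops, weights = zip(*mix.items())
    return rng.choices(ops, weights=weights, k=n_ops)

def split_warmup(ops, warmup_fraction=0.1):
    """Reserve the first fraction of the run to warm caches; only the
    remainder is measured, so cold-start impact can be reported separately
    instead of silently inflating the steady-state numbers."""
    cut = int(len(ops) * warmup_fraction)
    return ops[:cut], ops[cut:]
```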
Module 4: Infrastructure Isolation and Test Environment Management
- Allocating dedicated test environments with production-equivalent topology to avoid noisy neighbor interference
- Replicating production network conditions (latency, bandwidth, packet loss) using traffic shaping tools
- Managing database state resets between test runs while preserving referential integrity and data volume
- Using infrastructure-as-code to ensure consistent environment provisioning across benchmarking cycles
- Handling dependencies on third-party APIs by implementing contract-compliant mocks or sandbox endpoints
- Deciding whether to scale test infrastructure horizontally or vertically based on cost and fidelity trade-offs
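The contract-compliant mock idea above can be illustrated with a stand-in for a hypothetical third-party payments endpoint. The field names, status codes, and failure injection are all assumptions for the sketch; the point is that the mock enforces the same request contract as the real service, so integration bugs still surface under test.

```python
import random

def make_mock_payments_api(fail_rate=0.0, rng=None):
    """Return a callable mimicking a hypothetical third-party charge API."""
    rng = rng or random.Random(0)

    def charge(request):
        # Contract checks: the real service would reject these with HTTP 400,
        # so the mock must too, or tests pass against behavior that ships broken.
        for field in ("amount_cents", "currency", "idempotency_key"):
            if field not in request:
                return {"status": 400, "error": f"missing {field}"}
        if not isinstance(request["amount_cents"], int) or request["amount_cents"] <= 0:
            return {"status": 400, "error": "invalid amount_cents"}
        # Injected failures let retry and fallback paths be benchmarked too.
        if rng.random() < fail_rate:
            return {"status": 503, "error": "simulated outage"}
        return {"status": 200, "charge_id": "mock-" + request["idempotency_key"]}

    return charge
```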
Module 5: Execution and Data Collection Protocols
- Scheduling benchmark runs during maintenance windows to avoid impacting production or shared services
- Collecting correlated metrics across layers (application, database, network) with synchronized timestamps
- Implementing automated data tagging to distinguish between baseline, experimental, and regression test results
- Validating data completeness before analysis by checking for gaps in metric ingestion pipelines
- Controlling garbage collection and background job scheduling to minimize performance noise during tests
- Using distributed tracing to isolate performance bottlenecks across microservices during high-load scenarios
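The completeness check above (scanning for gaps in metric ingestion) reduces to comparing consecutive sample timestamps against the expected scrape interval. The 1.5x tolerance is an assumption for the sketch.

```python
def find_gaps(timestamps, expected_interval_s, tolerance=1.5):
    """Scan sorted ingestion timestamps (epoch seconds) and report spans where
    consecutive samples are further apart than `tolerance` times the expected
    scrape interval, a sign the pipeline dropped data mid-run. Analysis should
    be blocked until such gaps are explained or the run is repeated."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > expected_interval_s * tolerance:
            gaps.append((prev, cur))
    return gaps
```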
Module 6: Statistical Analysis and Result Interpretation
- Determining the number of test iterations required to achieve statistically significant results
- Applying outlier detection techniques to identify and exclude anomalous runs due to external factors
- Using confidence intervals to assess whether performance differences between versions are meaningful
- Normalizing results across test runs when hardware or configuration differences cannot be fully eliminated
- Distinguishing between correlation and causation when performance regressions coincide with code changes
- Documenting assumptions and limitations in benchmarking methodology to support auditability
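The outlier-exclusion and confidence-interval topics above can be sketched with standard techniques, here Tukey's fences and a normal-approximation interval. This assumes enough iterations for the normal approximation to hold; for small run counts a t-distribution is more appropriate.

```python
import statistics

def drop_outliers(samples, k=1.5):
    """Tukey's fences: exclude runs outside [Q1 - k*IQR, Q3 + k*IQR], which
    catches anomalous iterations caused by external interference."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [s for s in samples if lo <= s <= hi]

def mean_diff_ci(a, b, z=1.96):
    """Approximate 95% CI for mean(b) - mean(a) of two sets of run results.
    If the interval excludes zero, the difference between the two versions
    is unlikely to be run-to-run noise."""
    diff = statistics.mean(b) - statistics.mean(a)
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return diff - z * se, diff + z * se
```

Note that a CI excluding zero shows the difference is real, not that the coinciding code change caused it; ruling out environment drift still requires the controls from Modules 4 and 5.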
Module 7: Governance and Change Control in Benchmarking
- Establishing approval workflows for production-impacting benchmarking activities involving shared resources
- Defining retention policies for benchmark data to comply with data governance and storage cost constraints
- Requiring peer review of benchmarking methodology before major release validation cycles
- Managing access controls to prevent unauthorized execution or modification of benchmark configurations
- Updating benchmark baselines after architectural changes to maintain relevance and comparability
- Integrating benchmarking outcomes into post-mortem processes when performance incidents occur
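The retention-policy and baseline topics above can be combined in one small sketch. The record fields, 90-day window, and baseline exemption are illustrative assumptions, not a prescribed policy.

```python
from datetime import datetime, timedelta

def select_for_deletion(runs, now, keep_days=90, keep_baselines=True):
    """Apply a simple retention policy to benchmark run records: raw results
    older than `keep_days` become deletion candidates, but runs tagged as
    baselines are retained indefinitely so future results stay comparable."""
    cutoff = now - timedelta(days=keep_days)
    return [
        r["id"]
        for r in runs
        if r["finished_at"] < cutoff and not (keep_baselines and r.get("is_baseline"))
    ]
```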
Module 8: Integrating Benchmarking into Technical Decision-Making
- Using benchmark data to justify infrastructure upgrades or capacity planning decisions to stakeholders
- Comparing performance trade-offs between architectural patterns (e.g., monolith vs. microservices) using controlled tests
- Evaluating third-party service providers based on reproducible benchmark results under defined conditions
- Setting performance gates in release pipelines based on regression thresholds derived from historical data
- Adjusting auto-scaling policies using benchmark-derived response-time-versus-load curves
- Documenting performance characteristics in system design records to inform future technical debt assessments
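The release-pipeline performance gate above can be sketched as a pass/fail check against a baseline derived from recent history. The 5% regression budget and the use of the median as baseline are assumptions; in practice the threshold would be derived from the variance of historical runs (Module 6).

```python
import statistics

def performance_gate(candidate_p95_ms, history_p95_ms, max_regression=0.05):
    """Block a release when the candidate's p95 latency exceeds the
    historical baseline (median of recent runs) by more than `max_regression`.
    Returns (passed, baseline, limit) so the pipeline can log why it gated."""
    baseline = statistics.median(history_p95_ms)
    limit = baseline * (1 + max_regression)
    return candidate_p95_ms <= limit, baseline, limit
```

Using the median of several historical runs rather than the single previous run makes the gate robust to one noisy baseline measurement, which is the most common source of flaky pipeline failures.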