Description

This curriculum has the technical and operational depth of a multi-workshop program. It addresses the service level challenges that arise in large-scale event deployments, from pre-event planning and vendor coordination to on-site incident response and post-event compliance audits.

Module 1: Defining Event Service Level Objectives (SLOs)

Selecting appropriate SLOs for ticketing system availability during peak registration windows, balancing user expectations with infrastructure capacity.
Establishing measurable thresholds for mobile app response time during live event check-in, based on historical load testing data.
Deciding between uptime percentage and transaction success rate as the primary SLO for payment processing during high-volume sales periods.
Negotiating SLOs with third-party vendors for badge printing services, including acceptable turnaround time and error rate tolerances.
Determining recovery time objectives (RTO) for streaming platforms after broadcast interruptions during virtual keynotes.
Setting SLOs for Wi-Fi network performance in high-density attendee areas, factoring in device-per-person estimates and bandwidth allocation.
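
The choice between uptime percentage and transaction success rate for payment processing can be illustrated with a small calculation. The sketch below is only an illustration: the WindowStats type, the per-minute sampling, and all the numbers are assumptions made for this example, not data from a real event.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Hypothetical one-minute sample for a payment service."""
    up: bool          # did the health check pass during this minute?
    attempted: int    # payment transactions attempted
    succeeded: int    # payment transactions that completed


def uptime_sli(samples):
    """Fraction of sampled minutes in which the service was reachable."""
    return sum(1 for s in samples if s.up) / len(samples)


def success_rate_sli(samples):
    """Fraction of attempted transactions that succeeded."""
    attempted = sum(s.attempted for s in samples)
    succeeded = sum(s.succeeded for s in samples)
    return succeeded / attempted if attempted else 1.0


# Illustrative data: nine quiet minutes, then one minute of a sales spike
# in which the service stays "up" but drops 200 of 1,100 payments.
samples = [WindowStats(True, 100, 100) for _ in range(9)]
samples.append(WindowStats(True, 1100, 900))

print(f"uptime SLI:       {uptime_sli(samples):.4f}")
print(f"success-rate SLI: {success_rate_sli(samples):.4f}")
```

In this made-up scenario the uptime figure shows no problem while one in ten payments failed, which is the kind of gap that tends to favor a transaction-based SLO for high-volume sales periods.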

Module 2: Event Infrastructure Monitoring and Observability

Deploying distributed tracing across microservices handling registration, scheduling, and session tracking to isolate latency bottlenecks.
Configuring synthetic transaction monitoring for critical user journeys, such as group registration and agenda syncing.
Integrating monitoring tools with on-site networking equipment to detect rogue access points or bandwidth saturation during multi-day events.
Implementing log aggregation from mobile event apps, backend APIs, and kiosk systems to create a unified observability dashboard.
Setting dynamic alert thresholds for API error rates during phased event rollouts to avoid alert fatigue during expected traffic spikes.
Mapping monitoring coverage across hybrid environments, including cloud-hosted services and temporary on-premises infrastructure.
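
One possible way to set dynamic alert thresholds for API error rates is to compare each new reading against a rolling baseline instead of a fixed number. The following is a minimal sketch under stated assumptions: the DynamicThreshold class, its parameters, and the "mean plus k standard deviations" rule are choices made for this example, and a real monitoring tool will have its own mechanism.

```python
from collections import deque
from statistics import mean, pstdev


class DynamicThreshold:
    """Hypothetical rolling-baseline alert rule for an API error rate."""

    def __init__(self, baseline=(), window=30, k=3.0, floor=0.01):
        # Seed the baseline, for example from load test results.
        self.history = deque(baseline, maxlen=window)
        self.k = k          # how many standard deviations count as abnormal
        self.floor = floor  # never alert below this error rate

    def threshold(self):
        """Current alert threshold: max(floor, mean + k * stddev)."""
        if len(self.history) < 2:
            return self.floor
        return max(self.floor, mean(self.history) + self.k * pstdev(self.history))

    def observe(self, error_rate):
        """Return True if this reading should alert.

        Only non-alerting readings are added to the baseline, so an
        ongoing incident does not raise its own threshold.
        """
        alert = error_rate > self.threshold()
        if not alert:
            self.history.append(error_rate)
        return alert


# Assumed baseline from a phased rollout where 2-3% errors are expected.
rule = DynamicThreshold(baseline=[0.02, 0.03] * 5, window=10)
for rate in (0.035, 0.03, 0.20):
    print(f"error rate {rate:.3f} -> alert: {rule.observe(rate)}")
```

With this baseline, a reading of 3.5% stays quiet even though it would trip a static 1% threshold, while a jump to 20% still alerts.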

Module 3: Incident Response and On-Site Coordination

Establishing escalation paths between venue IT staff, cloud operations teams, and third-party AV providers during service degradation.
Pre-defining communication protocols for incident status updates to event organizers when SLO breaches occur.
Conducting tabletop exercises with cross-functional teams to simulate failure scenarios, such as registration database outages.
Deploying portable failover networks at critical access points when primary Wi-Fi fails during keynote sessions.
Assigning dedicated incident commanders for different service domains (e.g., networking, registration, streaming) during large-scale events.
Documenting post-incident timelines to identify gaps in detection, response, and resolution during post-mortem analysis.
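
The gaps in detection, response, and resolution mentioned above can be computed directly from a documented incident timeline. This is a small sketch; the four timestamps and the function name incident_gaps are assumptions for illustration only.

```python
from datetime import datetime


def incident_gaps(started, detected, acknowledged, resolved):
    """Split an incident timeline into detection, response, and resolution gaps."""
    return {
        "time_to_detect": detected - started,
        "time_to_acknowledge": acknowledged - detected,
        "time_to_resolve": resolved - acknowledged,
    }


# Hypothetical registration database outage during a morning check-in rush.
gaps = incident_gaps(
    started=datetime(2025, 6, 3, 9, 0),
    detected=datetime(2025, 6, 3, 9, 7),
    acknowledged=datetime(2025, 6, 3, 9, 10),
    resolved=datetime(2025, 6, 3, 9, 40),
)
for name, delta in gaps.items():
    print(f"{name}: {delta.total_seconds() / 60:.0f} min")
```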

Module 4: Vendor and Third-Party Service Integration

Enforcing SLA compliance through contractual clauses with AV integrators, including penalties for broadcast latency exceeding 5 seconds.
Validating API rate limits and retry mechanisms when integrating third-party gamification platforms into the event app.
Requiring monitoring access from vendors providing RFID tracking systems to ensure end-to-end visibility of attendee movement data.
Assessing the impact of vendor-specific downtime windows on overall event SLOs, particularly during concurrent sessions.
Implementing circuit breakers in integrations with external registration partners to prevent cascading failures.
Conducting pre-event security and performance audits of vendor-hosted services that handle attendee PII.
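
A circuit breaker for an external registration partner might look roughly like the sketch below. It is a simplified illustration, not a production implementation: the class names, the default thresholds, and the injectable clock are assumptions introduced for this example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of piling requests onto a struggling partner.
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each partner request, for example `breaker.call(fetch_registrations, event_id)` where `fetch_registrations` is a hypothetical client function, and catch `CircuitOpenError` to serve cached data or a friendly fallback instead.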

Module 5: Capacity Planning and Load Testing

Simulating concurrent user loads on the session reservation system based on peak attendance projections for popular breakout sessions.
Adjusting auto-scaling policies for cloud-hosted services based on load test results derived from the previous year’s event data.
Staging load tests during off-peak business hours to avoid impacting production systems used for ongoing event planning.
Validating database connection pool sizing under stress conditions to prevent exhaustion during flash registration periods.
Coordinating with venue facilities to ensure power and cooling capacity align with temporary server and networking deployments.
Testing failover of content delivery network (CDN) configurations to ensure streaming continuity during regional outages.
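
Before stress-testing database connection pool sizing, a first estimate can come from Little's law (average concurrency equals arrival rate times average time in the system). The sketch below assumes a hypothetical helper named required_connections and illustrative traffic figures; the headroom factor is a judgment call, and the result is a starting point to validate under load, not a guarantee.

```python
import math


def required_connections(requests_per_second, mean_hold_ms, headroom=1.5):
    """Estimate pool size as arrival rate x mean connection hold time x headroom.

    requests_per_second: peak rate of requests that need a DB connection
    mean_hold_ms: average time a request holds a connection, in milliseconds
    headroom: safety multiplier for bursts (an assumption, tune from load tests)
    """
    concurrent = requests_per_second * mean_hold_ms / 1000  # Little's law
    return max(1, math.ceil(concurrent * headroom))


# Illustrative flash-registration peak: 400 requests/s, 50 ms per query.
print(required_connections(400, 50))
```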

Module 6: Event-Specific Change and Configuration Management

Implementing a freeze on non-critical configuration changes 72 hours prior to event kickoff to reduce risk of unintended disruptions.
Using feature flags to enable or disable real-time polling and Q&A functions during sessions based on system performance.
Version-controlling all infrastructure-as-code templates used to deploy temporary event environments to ensure repeatability.
Validating DNS and routing changes for event-specific domains before redirecting live traffic from registration portals.
Documenting rollback procedures for mobile app updates deployed during multi-day events when new features introduce instability.
Coordinating configuration updates across time zones when managing global virtual event platforms with regional data centers.
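
Toggling real-time polling and Q&A based on system performance can be sketched as a feature flag driven by a latency signal. The example below is an assumption-laden illustration: the LoadSheddingFlag class, the p95 latency input, and the thresholds are hypothetical, and the two separate thresholds are there so the flag does not flap when latency hovers near a single cutoff.

```python
class LoadSheddingFlag:
    """Hypothetical feature flag that turns off under load and back on after recovery."""

    def __init__(self, name, disable_above_ms, enable_below_ms):
        if enable_below_ms >= disable_above_ms:
            raise ValueError("enable_below_ms must be lower than disable_above_ms")
        self.name = name
        self.disable_above_ms = disable_above_ms
        self.enable_below_ms = enable_below_ms
        self.enabled = True

    def update(self, p95_latency_ms):
        """Re-evaluate the flag from the latest p95 latency and return its state."""
        if self.enabled and p95_latency_ms > self.disable_above_ms:
            self.enabled = False
        elif not self.enabled and p95_latency_ms < self.enable_below_ms:
            self.enabled = True
        return self.enabled


polling = LoadSheddingFlag("live_polling", disable_above_ms=800, enable_below_ms=500)
for latency in (300, 900, 700, 400):
    print(f"p95={latency} ms -> {polling.name} enabled: {polling.update(latency)}")
```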

Module 7: Post-Event Analysis and Continuous Improvement

Correlating SLO performance data with attendee feedback to identify service gaps that did not trigger technical alerts.
Archiving monitoring data and incident logs for compliance review and future forensic analysis.
Calculating burn rates for error budgets during the event to assess operational risk exposure and team responsiveness.
Updating runbooks based on observed failure patterns, such as recurring delays in badge re-printing processes.
Revising SLOs for the next event cycle based on actual performance trends and changing business requirements.
Conducting blameless retrospectives with technical and event operations teams to refine cross-functional workflows.
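
Error budget burn rate is commonly defined as the observed error rate divided by the error rate the SLO allows. A minimal sketch follows, assuming a 99.9% check-in success SLO and made-up event counts; the function name and figures are illustrative only.

```python
def burn_rate(bad_events, total_events, slo_target):
    """Observed error rate divided by the error budget (1 - SLO target).

    A value of 1.0 means the budget is being spent exactly at the
    sustainable pace; higher values mean it will run out early.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if total_events == 0:
        return 0.0
    error_rate = bad_events / total_events
    error_budget = 1.0 - slo_target
    return error_rate / error_budget


# Illustrative hour during check-in: 50 failures out of 10,000 requests.
rate = burn_rate(bad_events=50, total_events=10_000, slo_target=0.999)
print(f"burn rate: {rate:.1f}x")
```

In this example the budget is being consumed at roughly five times the sustainable pace, so a budget sized for the whole event would be exhausted in about a fifth of its duration if nothing changed.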

Module 8: Regulatory Compliance and Data Residency in Global Events

Mapping data flows across event systems to ensure GDPR compliance when collecting consent during session sign-ups in EU venues.
Configuring data residency settings in cloud platforms to store attendee information within jurisdictional boundaries.
Implementing audit logging for access to sensitive attendee data, such as dietary restrictions or accessibility requirements.
Validating encryption standards for data in transit between on-site kiosks and central databases during hybrid events.
Assessing vendor compliance with local privacy laws when using regional registration partners in APAC or LATAM markets.
Designing data retention and deletion workflows aligned with legal requirements post-event, including backups and archives.
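
A post-event deletion workflow that also covers backups and archives could start from something like the sketch below. Everything in it is hypothetical: the retention periods are placeholders rather than legal guidance, and the record categories and store names are invented for the example. Actual retention periods need to come from legal review for each jurisdiction.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Placeholder retention periods in days after the event ends (assumed values).
RETENTION_DAYS = {"registration": 365, "dietary": 30, "accessibility": 30}


@dataclass
class AttendeeRecord:
    record_id: str
    category: str
    event_end: date
    stores: tuple  # every location holding a copy, including backups and archives


def deletion_tasks(records, today):
    """Return (record_id, store) pairs whose retention period has expired."""
    tasks = []
    for rec in records:
        expires = rec.event_end + timedelta(days=RETENTION_DAYS[rec.category])
        if expires <= today:
            # One task per copy, so backups and archives are not forgotten.
            tasks.extend((rec.record_id, store) for store in rec.stores)
    return tasks


records = [
    AttendeeRecord("a1", "dietary", date(2025, 5, 1), ("primary_db", "nightly_backup")),
    AttendeeRecord("a2", "registration", date(2025, 5, 1), ("primary_db", "archive")),
]
for task in deletion_tasks(records, today=date(2025, 7, 1)):
    print(task)
```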