This curriculum is structured as a multi-workshop process improvement initiative, guiding practitioners through the statistical workflows of end-to-end Six Sigma projects: from defining business-aligned metrics to sustaining gains through controlled monitoring systems.
Module 1: Defining Statistical Objectives Aligned with Business Goals
- Selecting critical-to-quality (CTQ) metrics that directly reflect customer requirements and operational constraints
- Mapping statistical analysis goals to specific business outcomes such as cost reduction, cycle time improvement, or defect rate targets
- Establishing baseline performance metrics using historical data while accounting for data gaps or measurement system inconsistencies
- Deciding whether to prioritize short-term problem-solving or long-term process capability based on organizational maturity
- Engaging stakeholders to validate statistical objectives and secure alignment on success criteria
- Documenting assumptions and constraints related to data availability, timing, and resource allocation for analysis scope
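Establishing a baseline from historical defect data often reduces to computing defects per million opportunities (DPMO) and the conventional sigma level. A minimal sketch, using hypothetical counts and the customary 1.5-sigma long-term shift:

```python
from scipy.stats import norm

def baseline_metrics(defects, units, opportunities_per_unit):
    """Compute DPMO and the conventional (1.5-sigma-shifted) sigma level."""
    dpmo = defects / (units * opportunities_per_unit) * 1_000_000
    # Convention: long-term sigma level adds the 1.5-sigma shift
    sigma_level = norm.ppf(1 - dpmo / 1_000_000) + 1.5
    return dpmo, sigma_level

# Hypothetical historical baseline: 387 defects over 12,500 units,
# with 4 defect opportunities per unit
dpmo, sigma = baseline_metrics(387, 12_500, 4)
```

The defect counts and the opportunity definition here are illustrative; in practice the opportunity count per unit is itself a stakeholder decision that should be documented alongside the baseline.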
Module 2: Data Collection Strategy and Measurement System Validation
- Designing a sampling plan that balances statistical power with operational disruption during data gathering
- Conducting Gage R&R studies to quantify repeatability and reproducibility in measurement processes
- Identifying and mitigating sources of bias in manual data entry or automated data logging systems
- Classifying data as continuous or discrete and selecting appropriate collection protocols accordingly
- Implementing data validation rules at the point of entry to reduce rework during analysis phases
- Establishing data ownership and access protocols to ensure consistency and security across departments
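The ANOVA approach to a crossed Gage R&R study can be computed directly from the balanced two-way sums of squares. A sketch on simulated data (part, operator, and repeatability variances are assumed values chosen for illustration), estimating variance components and the %GRR ratio:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated crossed study: p parts, o operators, n repeat trials each
p, o, n = 10, 3, 3
part_true = rng.normal(100, 5, size=p)          # part-to-part variation
op_bias   = rng.normal(0, 0.5, size=o)          # reproducibility
data = (part_true[:, None, None] + op_bias[None, :, None]
        + rng.normal(0, 0.8, size=(p, o, n)))   # repeatability

grand  = data.mean()
part_m = data.mean(axis=(1, 2))
op_m   = data.mean(axis=(0, 2))
cell_m = data.mean(axis=2)

# Two-way ANOVA sums of squares (balanced design)
ss_part = o * n * ((part_m - grand) ** 2).sum()
ss_op   = p * n * ((op_m - grand) ** 2).sum()
ss_int  = n * ((cell_m - part_m[:, None] - op_m[None, :] + grand) ** 2).sum()
ss_err  = ((data - cell_m[:, :, None]) ** 2).sum()

ms_part = ss_part / (p - 1)
ms_op   = ss_op / (o - 1)
ms_int  = ss_int / ((p - 1) * (o - 1))
ms_err  = ss_err / (p * o * (n - 1))

# Variance components from expected mean squares
# (negative estimates truncated at zero, per common practice)
var_rpt  = ms_err                               # repeatability
var_int  = max((ms_int - ms_err) / n, 0.0)
var_op   = max((ms_op - ms_int) / (p * n), 0.0) # reproducibility
var_part = max((ms_part - ms_int) / (o * n), 0.0)

var_grr = var_rpt + var_int + var_op
pct_grr = 100 * np.sqrt(var_grr / (var_grr + var_part))
```

A common rule of thumb judges %GRR under 10% acceptable and over 30% unacceptable, though the threshold should reflect the measurement's criticality.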
Module 3: Exploratory Data Analysis and Assumption Verification
- Using control charts to distinguish between common cause and special cause variation before applying inferential statistics
- Assessing data normality using graphical methods (e.g., Q-Q plots) and statistical tests (e.g., Anderson-Darling) with awareness of their limitations in large samples
- Handling outliers by investigating root causes rather than automatically removing them from the dataset
- Applying data transformation techniques (e.g., Box-Cox) only when justified by both statistical need and process understanding
- Segmenting data by shift, machine, or operator to uncover hidden patterns before formal hypothesis testing
- Documenting data anomalies and decisions made during exploration for audit and replication purposes
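Normality can be checked with both a formal test and a graphical method, as the bullets above suggest. A minimal sketch on simulated cycle-time data, using SciPy's Anderson-Darling test and Q-Q plot coordinates (the data and its distribution here are assumptions for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(50.0, 2.0, size=200)   # hypothetical cycle times (sec)

# Anderson-Darling: reject normality if statistic exceeds the
# critical value at the chosen significance level
result = stats.anderson(sample, dist='norm')
ad_stat = result.statistic
crit_5pct = result.critical_values[list(result.significance_level).index(5.0)]
normal_ok = ad_stat < crit_5pct

# Q-Q plot coordinates: plot osm (theoretical quantiles) against
# osr (ordered sample) and inspect for departures from a straight line
(osm, osr), (slope, intercept, r) = stats.probplot(sample, dist='norm')
```

Remember the large-sample caveat from the bullet above: with thousands of observations, Anderson-Darling will flag trivially small departures from normality, so the Q-Q plot's visual judgment matters more as sample size grows.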
Module 4: Hypothesis Testing for Process Comparisons
- Selecting the appropriate test (e.g., t-test, ANOVA, chi-square) based on data type, distribution, and number of groups
- Calculating required sample size using power analysis to avoid Type II errors while minimizing data collection burden
- Interpreting p-values in the context of practical significance, not just statistical significance
- Managing multiple comparison issues when conducting several hypothesis tests simultaneously
- Communicating test results using effect sizes and confidence intervals rather than binary reject/fail-to-reject conclusions
- Archiving raw data, test parameters, and output for future validation or regulatory review
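The power-analysis and reporting bullets above can be sketched end to end: size the study, run a Welch two-sample t-test, and report an effect size with a confidence interval rather than a bare p-value. The sample-size step uses the normal approximation to the power equation; the data are simulated under assumed means and spreads:

```python
import numpy as np
from scipy import stats

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided two-sample t-test
    (normal approximation; exact t-based answers run slightly higher)."""
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    return int(np.ceil(2 * ((z_a + z_b) / effect_size) ** 2))

n = n_per_group(0.5)   # medium effect by Cohen's convention

rng = np.random.default_rng(1)
before = rng.normal(12.0, 2.0, n)   # hypothetical cycle times (min)
after  = rng.normal(11.0, 2.0, n)   # after a process change

t, p = stats.ttest_ind(before, after, equal_var=False)  # Welch's t-test

# Report effect size and interval, not just reject/fail-to-reject
pooled_sd = np.sqrt((before.var(ddof=1) + after.var(ddof=1)) / 2)
cohens_d = (before.mean() - after.mean()) / pooled_sd
diff = before.mean() - after.mean()
se = np.sqrt(before.var(ddof=1) / n + after.var(ddof=1) / n)
ci = (diff - 1.96 * se, diff + 1.96 * se)
```

Reporting `diff` with `ci` lets stakeholders judge practical significance (is a one-minute reduction worth the change?) independently of the p-value.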
Module 5: Regression and Correlation for Root Cause Analysis
- Identifying multicollinearity among predictor variables before building regression models
- Distinguishing between correlation and causation when interpreting model coefficients
- Validating model assumptions (linearity, homoscedasticity, independence of residuals) using diagnostic plots
- Selecting variables for inclusion using stepwise methods only when supported by domain knowledge
- Using residual analysis to detect unmodeled process behavior or omitted variables
- Deploying regression models in real-time dashboards with safeguards against extrapolation beyond training data ranges
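The multicollinearity check can be made concrete with variance inflation factors, computed from the R-squared of each predictor regressed on the others. A minimal sketch with simulated process variables (two deliberately correlated, one independent; all names and relationships are assumptions for illustration):

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R^2_j), regressing column j of X on the
    remaining columns (X has no intercept column)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1 / (1 - r2))
    return np.array(out)

rng = np.random.default_rng(7)
temp     = rng.normal(200, 10, 100)
pressure = 0.5 * temp + rng.normal(0, 2, 100)  # tracks temp closely
speed    = rng.normal(50, 5, 100)              # independent
vifs = vif(np.column_stack([temp, pressure, speed]))
```

A common heuristic treats VIF above 5 (or 10, depending on the source) as a sign that coefficient estimates for those predictors are unstable and should not be interpreted individually.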
Module 6: Design of Experiments (DOE) in Operational Environments
- Choosing between full factorial, fractional factorial, or response surface designs based on factor count and resource constraints
- Randomizing run order to minimize the impact of lurking variables in uncontrolled environments
- Blocking experimental runs by shift or batch when known sources of variation cannot be eliminated
- Securing operational buy-in to implement planned factor level changes without deviation
- Handling missing or corrupted data points in experimental results without invalidating the design
- Translating statistically significant effects into actionable process settings considering engineering tolerances
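Design generation, blocking, and randomization from the bullets above can be sketched for a small 2^3 full factorial. Blocking into two shifts is done the standard way, by confounding the block with the three-factor interaction (the effect least likely to matter); factor names are hypothetical:

```python
import itertools
import random

# 2^3 full factorial in coded units (-1 = low, +1 = high)
names = ["temp", "pressure", "speed"]
runs = [dict(zip(names, lv)) for lv in itertools.product((-1, 1), repeat=3)]

# Block by shift, confounding the block with the ABC interaction:
# the shift effect becomes inseparable from ABC, not from any main effect
for run in runs:
    abc = run["temp"] * run["pressure"] * run["speed"]
    run["shift"] = "A" if abc == 1 else "B"

# Randomize run order within each block to guard against lurking variables
block_a = [r for r in runs if r["shift"] == "A"]
block_b = [r for r in runs if r["shift"] == "B"]
random.seed(11)
random.shuffle(block_a)
random.shuffle(block_b)
run_sheet = block_a + block_b   # execute block A's shift, then block B's
```

Printing `run_sheet` as the operator's instruction sheet, with coded levels translated to engineering units, is one way to secure the "without deviation" commitment: operators follow a fixed randomized order rather than choosing convenient settings.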
Module 7: Process Capability and Performance Monitoring
- Determining whether to calculate short-term (Cp/Cpk) or long-term (Pp/Ppk) capability based on data collection duration and stability
- Interpreting capability indices in non-normal processes using transformation or non-parametric methods
- Updating capability baselines after process improvements while maintaining historical comparisons
- Integrating capability metrics into control plans with defined response protocols for out-of-specification trends
- Aligning specification limits with customer requirements rather than internal tolerances when calculating indices
- Using capability data to prioritize improvement projects across multiple processes
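The short-term versus long-term distinction comes down to which sigma estimate feeds the index: pooled within-subgroup variation for Cp/Cpk, overall variation for Pp/Ppk. A minimal sketch on simulated data (specs, subgroup size, and process parameters are assumed; the small-sample c4 bias correction on sigma is omitted for brevity):

```python
import numpy as np

def capability(data, lsl, usl, subgroup_size=5):
    """Short-term (Cp/Cpk) and long-term (Pp/Ppk) capability indices."""
    data = np.asarray(data, float)
    mean = data.mean()

    # Long-term sigma: overall standard deviation
    sigma_lt = data.std(ddof=1)

    # Short-term sigma: pooled within-subgroup standard deviation
    trimmed = data[: len(data) // subgroup_size * subgroup_size]
    groups = trimmed.reshape(-1, subgroup_size)
    sigma_st = np.sqrt(groups.var(ddof=1, axis=1).mean())

    def indices(sigma):
        cp = (usl - lsl) / (6 * sigma)
        cpk = min(usl - mean, mean - lsl) / (3 * sigma)
        return cp, cpk

    return {"Cp/Cpk": indices(sigma_st), "Pp/Ppk": indices(sigma_lt)}

rng = np.random.default_rng(3)
# Hypothetical stable process: mean 10.1, sigma 0.15, specs 9.5-10.5
x = rng.normal(10.1, 0.15, 125)
caps = capability(x, lsl=9.5, usl=10.5)
```

A Cpk well below Cp, as here, signals a centering problem rather than excess spread: the off-center mean, not the variation, is what limits capability.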
Module 8: Sustaining Improvements through Statistical Control Systems
- Selecting appropriate control chart types (e.g., I-MR, Xbar-R, p-chart) based on data characteristics and subgroup structure
- Setting rational subgroups to maximize detection of process shifts while minimizing within-group variation
- Defining escalation procedures for out-of-control signals that balance speed and accuracy of response
- Training process owners to interpret control charts without overreacting to common cause variation
- Automating data feeds to control charts while maintaining data integrity checks
- Conducting periodic audits of control systems to verify continued relevance and effectiveness
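For individual measurements without rational subgroups, the I-MR chart is the usual choice. A minimal sketch computing its limits from the moving range (using the standard constants d2 = 1.128 and D4 = 3.267 for ranges of size 2) and flagging points beyond the 3-sigma limits; the data stream and the injected shift are simulated for illustration:

```python
import numpy as np

def imr_limits(x):
    """Control limits for an individuals (I) and moving-range (MR) chart."""
    x = np.asarray(x, float)
    mr = np.abs(np.diff(x))
    mr_bar = mr.mean()
    sigma = mr_bar / 1.128          # d2 constant for moving ranges of 2
    center = x.mean()
    return {
        "I":  (center - 3 * sigma, center, center + 3 * sigma),
        "MR": (0.0, mr_bar, 3.267 * mr_bar),   # D4 constant for n = 2
    }

rng = np.random.default_rng(5)
x = rng.normal(100, 2, 60)          # hypothetical in-control stream
x[45] += 15                         # injected special-cause shift

limits = imr_limits(x)
lcl, _, ucl = limits["I"]
signals = np.where((x < lcl) | (x > ucl))[0]   # out-of-control indices
```

Only points beyond these limits (and other defined run rules) should trigger the escalation procedure; adjusting the process on points inside the limits is exactly the overreaction to common cause variation the bullets above warn against.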