This curriculum brings the rigor and coordination of a multi-workshop quality initiative to data practices, covering everything from initial scoping to long-term governance in a phased progression comparable to enterprise-wide process improvement programs.
Module 1: Defining Data Requirements in the Define Phase
- Selecting critical-to-quality (CTQ) metrics based on stakeholder input and project charters
- Mapping process outputs to measurable variables that align with business objectives
- Establishing operational definitions for each data point to ensure consistency across teams
- Identifying data owners and custodians during process scoping to secure access early
- Determining data granularity (e.g., per transaction, per shift) based on process cycle time
- Deciding between primary and secondary data sources considering accuracy and availability
- Documenting data collection constraints such as system access, privacy regulations, or legacy systems
- Developing a preliminary data collection plan with timelines and responsibilities
Module 2: Designing Data Collection Tools and Methods
- Choosing between check sheets, automated logging, and digital forms based on error rates and volume
- Designing paper and electronic forms with built-in validation rules to minimize entry errors
- Implementing stratified sampling strategies when population segments behave differently
- Setting sampling frequency based on process stability and measurement system capability
- Integrating timestamp and operator ID fields to support root cause analysis later
- Conducting pilot tests of data collection tools to identify usability gaps
- Standardizing units of measure across departments to prevent aggregation errors
- Documenting field-level instructions directly on forms to reduce interpretation variance
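Proportional stratified sampling, as recommended when population segments behave differently, can be sketched in a few lines. This is an illustrative stdlib-only sketch; the `records`, the `"shift"` stratum field, and the sample sizes are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, sample_size, seed=42):
    """Draw a proportional stratified sample.

    `records` is a list of dicts; `key` names the stratum field
    (e.g. "shift" or "machine"). Each stratum's allocation is
    proportional to its share of the population.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)

    total = len(records)
    sample = []
    for members in strata.values():
        # Allocate at least one observation per stratum so small
        # segments are never silently dropped.
        n = max(1, round(sample_size * len(members) / total))
        sample.extend(rng.sample(members, min(n, len(members))))
    return sample

# Hypothetical process records stratified by shift (60/40 split)
records = (
    [{"shift": "day", "cycle_time": 4.1}] * 60
    + [{"shift": "night", "cycle_time": 5.0}] * 40
)
picked = stratified_sample(records, key="shift", sample_size=20)
```

With a 60/40 split and a sample of 20, the allocation works out to 12 day and 8 night observations, preserving representativeness.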
Module 3: Ensuring Data Accuracy Through Measurement System Analysis
- Conducting Gage R&R studies for continuous data with at least 10 parts, 3 operators, and 3 trials
- Performing attribute agreement analysis for categorical data using kappa statistics
- Identifying sources of measurement variation: equipment, appraiser, environment, or procedure
- Deciding whether to recalibrate instruments or revise operational definitions based on MSA results
- Handling destructive testing scenarios by using split specimens or proxy measurements
- Establishing calibration schedules for measurement devices based on usage and drift history
- Training data collectors on consistent technique, especially for subjective assessments
- Documenting MSA outcomes and obtaining sign-off before proceeding to full data collection
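The variance decomposition behind a crossed Gage R&R study can be sketched as follows. This is a deliberately simplified decomposition (no part-operator interaction term), not the full AIAG ANOVA method, and the 3-part, 2-operator example below is smaller than the 10×3×3 study the module recommends; the data are hypothetical.

```python
import statistics as st

def gage_rr_percent(measurements):
    """Simplified variance-component sketch of a crossed Gage R&R.

    `measurements[(part, operator)]` holds that cell's repeat trials.
    Returns %GRR: gauge variation (repeatability + reproducibility) as
    a percentage of total observed variation.
    """
    # Repeatability (equipment variation): pooled within-cell variance
    repeat_var = st.mean(st.pvariance(cell) for cell in measurements.values())

    operators = sorted({o for _, o in measurements})
    parts = sorted({p for p, _ in measurements})

    # Reproducibility (appraiser variation): variance of operator averages
    op_means = [
        st.mean(x for (p, o), cell in measurements.items() if o == op for x in cell)
        for op in operators
    ]
    repro_var = st.pvariance(op_means)

    # Part-to-part variation: variance of part averages
    part_means = [
        st.mean(x for (p, o), cell in measurements.items() if p == part for x in cell)
        for part in parts
    ]
    part_var = st.pvariance(part_means)

    grr = repeat_var + repro_var
    return 100 * (grr / (grr + part_var)) ** 0.5

# Hypothetical study: 3 parts x 2 operators x 2 trials each
data = {
    (part, op): [value + offset, value + offset + 0.1]
    for part, value in [("P1", 10.0), ("P2", 20.0), ("P3", 30.0)]
    for op, offset in [("A", 0.0), ("B", 0.2)]
}
grr_pct = gage_rr_percent(data)
```

Here part-to-part variation dominates, so %GRR lands well under the common 10% acceptance threshold; a high %GRR would point back to recalibration or revised operational definitions before full data collection.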
Module 4: Executing Data Collection in the Measure Phase
- Deploying trained data collectors with documented protocols and escalation paths
- Monitoring real-time data submission rates to detect collection bottlenecks
- Implementing data validation checks at point of entry (e.g., range limits, mandatory fields)
- Handling missing data: deciding between imputation, exclusion, or re-collection
- Logging deviations from the collection plan and justifying adjustments in the project file
- Synchronizing data collection across multiple shifts or locations to ensure representativeness
- Using barcode scanners or IoT sensors when manual entry introduces unacceptable error
- Securing interim data backups and access controls, especially for sensitive process data
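Point-of-entry validation with range limits and mandatory fields, as listed above, can be expressed as a small rule check. A minimal sketch; the schema, field names, and limits are hypothetical.

```python
def validate_entry(record, schema):
    """Validate one data-entry record against simple field rules.

    `schema` maps field name -> dict with optional keys:
    "required" (bool) and "min"/"max" (numeric range limits).
    Returns a list of error strings; an empty list means the record passes.
    """
    errors = []
    for field, rules in schema.items():
        value = record.get(field)
        if value is None:
            if rules.get("required"):
                errors.append(f"{field}: missing mandatory field")
            continue
        if "min" in rules and value < rules["min"]:
            errors.append(f"{field}: {value} below limit {rules['min']}")
        if "max" in rules and value > rules["max"]:
            errors.append(f"{field}: {value} above limit {rules['max']}")
    return errors

# Hypothetical schema for a cycle-time measurement form
schema = {
    "cycle_time": {"required": True, "min": 0.0, "max": 120.0},
    "operator_id": {"required": True},
}
good = validate_entry({"cycle_time": 45.2, "operator_id": "OP7"}, schema)
bad = validate_entry({"cycle_time": -3.0}, schema)
```

Rejecting records at entry is far cheaper than cleaning them later; the same rules can drive form validation in electronic data collection tools.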
Module 5: Validating and Cleaning Data for Analysis
- Identifying outliers using statistical methods (e.g., IQR, Z-score) and verifying with process experts
- Resolving duplicate records caused by system integration or manual re-entry
- Standardizing categorical responses (e.g., “Yes,” “yes,” “Y”) into consistent coding
- Handling inconsistent timestamps due to time zones or system clock mismatches
- Reconciling data discrepancies between source systems and reported metrics
- Documenting all data transformations and cleaning rules for auditability
- Validating data completeness against expected sample size and time period
- Flagging data quality issues that may require revisiting the collection plan
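Two of the cleaning steps above, IQR-based outlier flagging and standardizing categorical responses, can be sketched with the standard library alone. The cycle-time values are hypothetical; flagged points still go to process experts for verification before exclusion.

```python
import statistics as st

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] for expert review."""
    q1, _, q3 = st.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

def standardize_yes_no(response):
    """Collapse variant spellings ("Yes", "yes", "Y") into Y/N codes."""
    mapping = {"yes": "Y", "y": "Y", "no": "N", "n": "N"}
    return mapping.get(str(response).strip().lower(), "UNKNOWN")

times = [4.1, 4.3, 4.0, 4.2, 4.4, 9.8]  # one suspicious cycle time
flagged = iqr_outliers(times)            # flags 9.8 for review
coded = standardize_yes_no(" YES ")      # -> "Y"
```

Mapping unmatched responses to an explicit "UNKNOWN" code, rather than guessing, keeps the cleaning rule auditable, in line with documenting all transformations.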
Module 6: Integrating Data into Process Baseline Calculations
- Selecting appropriate metrics: DPMO, sigma level, process yield, or cycle time
- Calculating short-term vs. long-term process capability using correct standard deviation formulas
- Adjusting for process shifts (e.g., 1.5 sigma) only when justified by historical behavior
- Mapping process steps to collected data to compute rolled throughput yield (RTY)
- Handling non-normal data using transformations or non-parametric methods
- Visualizing baseline performance with time series plots and control charts
- Identifying data segmentation opportunities (e.g., by shift, machine, location) for deeper insight
- Presenting baseline metrics with confidence intervals to reflect sampling uncertainty
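The baseline metrics named above (DPMO, sigma level, RTY) reduce to short formulas. A sketch using the stdlib `NormalDist`; the defect counts and step yields are hypothetical.

```python
from math import prod
from statistics import NormalDist

def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000

def sigma_level(dpmo_value, shift=1.5):
    """Convert DPMO to a sigma level.

    The conventional 1.5-sigma shift is applied by default, but as the
    module cautions, include it only when historical behavior justifies
    it (pass shift=0 otherwise).
    """
    long_term_yield = 1 - dpmo_value / 1_000_000
    return NormalDist().inv_cdf(long_term_yield) + shift

def rolled_throughput_yield(step_yields):
    """RTY: probability a unit passes every process step defect-free."""
    return prod(step_yields)

baseline = dpmo(defects=34, units=5_000, opportunities_per_unit=4)  # 1700.0
level = sigma_level(baseline)  # roughly 4.4 with the 1.5-sigma shift
rty = rolled_throughput_yield([0.98, 0.95, 0.99])
```

Note that RTY multiplies step yields, so even modest per-step losses compound: three steps at 95-99% yield leave only about 92% of units untouched by defects.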
Module 7: Maintaining Data Integrity During Analyze and Improve Phases
- Preserving original data sets while creating analysis-specific subsets
- Tracking changes to data or assumptions during root cause validation
- Collecting additional data to test hypotheses identified in the Analyze phase
- Using before/after paired sampling when evaluating solution impact
- Ensuring consistency in measurement methods pre- and post-improvement
- Documenting data sources and transformations used in statistical models (e.g., regression, ANOVA)
- Validating that new process data reflects sustained changes, not temporary fixes
- Archiving raw data, analysis scripts, and output for future replication
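Before/after paired sampling for evaluating solution impact boils down to analyzing the per-unit differences. A stdlib-only sketch that computes the paired t statistic; the cycle times are hypothetical, and the significance lookup (t critical value at df = n - 1, from a table or scipy.stats) is left out to keep the example self-contained.

```python
import math
import statistics as st

def paired_t_statistic(before, after):
    """t statistic for paired before/after samples.

    `before` and `after` must measure the same units in the same order,
    with a consistent measurement method pre- and post-improvement.
    """
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    return st.mean(diffs) / (st.stdev(diffs) / math.sqrt(n))

before = [5.2, 5.0, 5.4, 5.1, 5.3, 5.5]  # cycle times pre-improvement
after = [4.6, 4.5, 4.9, 4.4, 4.8, 4.9]   # same units, post-improvement
t_stat = paired_t_statistic(before, after)
```

Pairing removes unit-to-unit variation from the comparison, which is why it detects a solution's effect with far fewer samples than two independent groups would need.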
Module 8: Sustaining Data Collection in the Control Phase
- Transitioning project data collection to operational owners with documented handover
- Embedding key metrics into existing dashboards or performance reporting systems
- Establishing control charts with appropriate control limits and response protocols
- Defining frequency and responsibility for ongoing data review and escalation
- Embedding updated data collection steps in standard operating procedures (SOPs)
- Training process owners on interpreting control signals and taking corrective action
- Conducting periodic audits of data collection adherence and accuracy
- Planning for system changes (e.g., ERP upgrades) that may disrupt data continuity
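Control limits for an individuals (I-MR) chart, the workhorse of Control-phase monitoring, follow from the average moving range. A sketch using the standard 2.66 factor (3 / d2 with d2 = 1.128 for subgroups of 2); the measurement series is hypothetical.

```python
import statistics as st

def imr_limits(values):
    """Control limits for an individuals (I) chart via the moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = st.mean(moving_ranges)
    center = st.mean(values)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

def out_of_control(values):
    """Flag points beyond the control limits for escalation."""
    lcl, _, ucl = imr_limits(values)
    return [v for v in values if v < lcl or v > ucl]

readings = [10.1, 10.0, 10.2, 9.9, 10.1, 12.5]  # last point drifts
signals = out_of_control(readings)               # flags 12.5
```

In practice limits are set from a stable baseline period and then frozen, rather than recomputed over data that already contains the signal; the response protocol defines who investigates each flagged point.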
Module 9: Governing Data Practices Across the DMAIC Lifecycle
- Establishing data governance roles: steward, custodian, and process owner
- Defining data retention policies aligned with compliance and audit requirements
- Implementing access controls based on sensitivity and role-based permissions
- Conducting periodic data quality assessments across active Six Sigma projects
- Standardizing data dictionaries and metadata documentation enterprise-wide
- Resolving cross-functional data conflicts through governance committees
- Aligning data collection practices with enterprise data management frameworks
- Auditing project data for compliance with organizational data standards