This curriculum covers the technical and organisational complexities of CMDB capacity planning at a depth comparable to a multi-workshop infrastructure readiness program, addressing data governance, system integration, and performance engineering as encountered in large-scale IT operations.
Module 1: Defining CMDB Scope and Data Boundaries
- Select which configuration item (CI) types to include based on operational impact, such as servers, network devices, applications, and cloud services, while excluding low-impact items like user workstations.
- Determine ownership of CI data per domain (e.g., network team owns router CIs, application teams own application service CIs) to enforce accountability.
- Decide whether virtual machines and containers are modeled as individual CIs or grouped under host-level entries based on monitoring and incident management requirements.
- Establish criteria for excluding shadow IT or unmanaged cloud instances from the CMDB while documenting the risk exposure of such omissions.
- Define lifecycle states (e.g., planned, in production, decommissioned) and enforce state transition rules during provisioning and retirement workflows.
- Implement data retention policies for decommissioned CIs, specifying archival duration and access controls for audit and forensic use cases.
- Negotiate with security teams over whether vulnerability scanners should update CI attributes directly or feed findings via integration interfaces.
- Resolve conflicts between CMDB scope and overlapping asset management systems by defining authoritative sources for each data attribute.
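The lifecycle-state rule above can be enforced mechanically. The sketch below is a minimal, hypothetical transition validator; the state names come from the module, but the allowed-transition table is an assumption, not a vendor schema.

```python
# Illustrative lifecycle state machine for CI records. The transition
# table is an assumption; real deployments would derive it from policy.
ALLOWED_TRANSITIONS = {
    "planned": {"in_production"},
    "in_production": {"decommissioned"},
    "decommissioned": set(),  # terminal state; archival access only
}

def validate_transition(current: str, target: str) -> bool:
    """Return True if a CI may move from `current` to `target`."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

Provisioning and retirement workflows would call `validate_transition` before persisting a state change, rejecting anything outside the table.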
Module 2: Sizing Discovery Tools and Scanning Frequency
- Calculate discovery scan intervals for different CI classes (e.g., every 24 hours for servers, every 7 days for printers) based on change velocity and operational tolerance.
- Estimate network bandwidth consumption of discovery probes in distributed environments and schedule scans during off-peak hours to avoid congestion.
- Configure throttling parameters on discovery tools to limit CPU and memory usage on target systems during active scans.
- Allocate dedicated proxy servers in remote data centers to reduce cross-WAN discovery traffic and improve scan reliability.
- Select credential sets for discovery tools based on least-privilege access, balancing completeness of data collection with security policy compliance.
- Implement staggered scan schedules across geographies to prevent spikes in CMDB update processing and database load.
- Define thresholds for re-attempting failed discovery jobs and escalate after three consecutive failures to operations teams.
- Monitor discovery tool health and resource usage on a weekly basis to preempt performance degradation.
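The three-consecutive-failure escalation rule can be sketched as a small tracker. The class and method names here are assumptions for illustration, not a discovery tool's API.

```python
# Hypothetical retry/escalation tracker for discovery jobs. The
# three-failure threshold mirrors the escalation policy above.
FAILURE_THRESHOLD = 3

class DiscoveryJobTracker:
    def __init__(self):
        self.consecutive_failures = {}  # job_id -> current failure streak

    def record_result(self, job_id: str, success: bool) -> bool:
        """Record one scan outcome; return True when escalation is due."""
        if success:
            self.consecutive_failures[job_id] = 0  # streak resets on success
            return False
        count = self.consecutive_failures.get(job_id, 0) + 1
        self.consecutive_failures[job_id] = count
        return count >= FAILURE_THRESHOLD
```

A scheduler would re-attempt the job while `record_result` returns False and raise a ticket to the operations team once it returns True.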
Module 4: Database Infrastructure and Performance Tuning
- Size CMDB database storage based on projected CI count, attribute depth, relationship volume, and audit trail retention period over three years.
- Configure database indexing on frequently queried fields such as CI name, serial number, and last discovered timestamp to support incident and change workflows.
- Partition CMDB tables by CI type or business unit to improve query performance and simplify backup and restore operations.
- Implement read replicas for reporting and analytics queries to offload the primary transactional database.
- Set up query timeout thresholds and alerting for long-running operations that could indicate schema or indexing issues.
- Plan for regular database vacuuming and statistics updates in PostgreSQL or index rebuilding in SQL Server to maintain performance.
- Allocate memory and I/O resources for the CMDB database server based on peak concurrent user and integration loads.
- Test failover procedures for the CMDB database in clustered or high-availability configurations quarterly.
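The sizing exercise in the first bullet reduces to arithmetic over CI count, attribute depth, relationship volume, and audit retention. The per-row byte constants below are illustrative defaults, not measured figures; a real estimate would sample actual row sizes.

```python
def estimate_cmdb_storage_gb(ci_count, avg_attrs_per_ci, rels_per_ci,
                             audit_rows_per_ci_per_year, years=3,
                             bytes_per_attr=256, bytes_per_rel=128,
                             bytes_per_audit_row=512):
    """Back-of-envelope CMDB storage estimate in GB over the retention
    period. All byte constants are assumed averages, not measurements."""
    attr_bytes = ci_count * avg_attrs_per_ci * bytes_per_attr
    rel_bytes = ci_count * rels_per_ci * bytes_per_rel
    audit_bytes = (ci_count * audit_rows_per_ci_per_year
                   * years * bytes_per_audit_row)
    return (attr_bytes + rel_bytes + audit_bytes) / 1024**3
```

Note that the audit trail typically dominates: it grows with retention years while the CI and relationship tables grow only with inventory size.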
Module 5: Managing CI Relationships and Dependency Mapping
- Define relationship types (e.g., "runs on," "depends on," "connected to") with clear semantics and usage guidelines to prevent inconsistent modeling.
- Validate bidirectional relationships during CI updates to ensure referential integrity (e.g., if App A runs on Server B, then Server B must list App A as a hosted service).
- Implement automated cleanup of stale relationships when CIs are decommissioned or reclassified.
- Limit the depth of dependency traversal in impact analysis to four levels to prevent excessive processing time and UI timeouts.
- Integrate with APM tools to enrich application-to-infrastructure relationships and resolve indirect dependencies.
- Establish ownership rules for relationship creation—whether by discovery tools, change management, or manual entry—and enforce via workflow controls.
- Cache high-frequency dependency queries to support real-time impact analysis during incident response.
- Document exceptions where dependency data is estimated or inferred rather than verified, and flag them for review cycles.
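The four-level traversal cap can be sketched as a depth-limited breadth-first search over the dependency graph. The adjacency-dict representation is an assumption for illustration; a production CMDB would query relationships from its store.

```python
from collections import deque

def impact_set(graph, root, max_depth=4):
    """Breadth-first traversal of dependency edges, capped at max_depth
    levels per the traversal limit above. `graph` maps a CI to the CIs
    that depend on it; returns every CI reachable within the cap."""
    seen = {root}
    frontier = deque([(root, 0)])
    impacted = set()
    while frontier:
        ci, depth = frontier.popleft()
        if depth == max_depth:
            continue  # cap reached; do not expand further
        for dep in graph.get(ci, ()):
            if dep not in seen:
                seen.add(dep)
                impacted.add(dep)
                frontier.append((dep, depth + 1))
    return impacted
```

Capping depth keeps worst-case work proportional to the four-level neighbourhood rather than the whole graph, which is what prevents the UI timeouts the bullet warns about.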
Module 6: Change Management Integration and Audit Controls
- Enforce mandatory CMDB updates as part of the change approval process for standard, normal, and emergency changes.
- Configure automated CMDB update workflows triggered by successful change implementation, using integration with change management tools.
- Implement pre-change CI snapshotting to support rollback planning and post-implementation verification.
- Flag unauthorized configuration drift detected during change audits and route to compliance review queues.
- Define audit frequency for CMDB data accuracy (e.g., monthly spot checks for 5% of CIs) and assign responsibility to domain owners.
- Generate reconciliation reports comparing discovery data with change records to identify unlogged modifications.
- Configure audit trails to capture user identity, timestamp, field-level changes, and change ticket reference for all CI modifications.
- Integrate with IT security tools to correlate CMDB changes with privileged access logs for forensic investigations.
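Pre-change snapshotting and field-level audit capture both rest on the same primitive: diffing two attribute dictionaries. A minimal sketch, assuming CI snapshots are flat dicts of attribute values:

```python
def snapshot_diff(before: dict, after: dict) -> dict:
    """Field-level diff between pre-change and post-change CI snapshots.
    Returns {field: (old_value, new_value)} covering changed, added,
    and removed attributes (absent values appear as None)."""
    changes = {}
    for field in set(before) | set(after):
        old, new = before.get(field), after.get(field)
        if old != new:
            changes[field] = (old, new)
    return changes
```

An empty result confirms the implementation matched the approved change; a non-empty result on an unapproved window is exactly the configuration drift the audit bullets route to compliance review.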
Module 7: Capacity Modeling and Scalability Testing
- Project annual CI growth rate based on historical provisioning trends and upcoming digital transformation initiatives.
- Simulate bulk import operations for data migration scenarios to measure ingestion throughput and identify bottlenecks.
- Stress-test CMDB APIs under concurrent load from integrations (e.g., 50 external systems polling every 5 minutes) to validate response times.
- Model worst-case impact analysis scenarios (e.g., network outage affecting 500 servers) to size compute resources for real-time queries.
- Measure latency of CI search operations as data volume increases and adjust indexing or caching strategies accordingly.
- Plan for horizontal scaling of CMDB application servers based on concurrent user sessions and integration call volume.
- Conduct quarterly scalability reviews with infrastructure and application teams to align CMDB capacity with business growth.
- Document performance baselines and set thresholds for alerting on deviation from expected response times.
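The growth projection in the first bullet is typically modelled as compound growth. A one-line sketch, assuming a single blended annual rate (real plans might model CI classes separately and add step changes for known initiatives):

```python
def project_ci_count(current_count: int, annual_growth_rate: float,
                     years: int) -> int:
    """Compound-growth projection of CI volume, e.g. a 0.2 rate means
    20% growth per year. A deliberately simple illustrative model."""
    return round(current_count * (1 + annual_growth_rate) ** years)
```

Feeding the projected count back into the storage-sizing and load-testing exercises closes the loop between growth assumptions and infrastructure budgets.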
Module 8: Data Governance and Stewardship Frameworks
- Appoint data stewards per CI domain (e.g., network, server, application) with defined responsibilities for data quality and validation.
- Implement mandatory field policies for critical CIs, requiring attributes like owner, location, and business service before activation.
- Define data quality metrics (e.g., completeness, accuracy, timeliness) and report them monthly to governance boards.
- Establish escalation paths for resolving data conflicts between discovery tools, spreadsheets, and manual entries.
- Enforce data classification and encryption requirements for sensitive CI attributes such as IP addresses or serial numbers.
- Conduct quarterly data cleansing campaigns to remove duplicates, correct misclassifications, and update stale records.
- Integrate CMDB governance into existing data governance frameworks to align with enterprise policies and regulatory requirements.
- Require steward approval for bulk update operations exceeding 100 CIs to prevent accidental data corruption.
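The completeness metric from the mandatory-field policy can be computed directly. The required-field names below follow the bullet on owner, location, and business service; representing CIs as flat dicts is an assumption for illustration.

```python
# Required fields per the mandatory-field policy above (illustrative keys).
REQUIRED_FIELDS = ("owner", "location", "business_service")

def completeness(cis: list) -> float:
    """Fraction of CIs (0.0-1.0) with every required field populated;
    empty strings and None both count as missing."""
    if not cis:
        return 1.0  # vacuously complete
    complete = sum(1 for ci in cis
                   if all(ci.get(f) for f in REQUIRED_FIELDS))
    return complete / len(cis)
```

Tracking this figure per CI domain gives each data steward a single number to report to the governance board each month, alongside accuracy and timeliness.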
Module 9: Integration Architecture and API Management
- Design integration patterns (push vs. pull, real-time vs. batch) based on source system capabilities and CMDB update requirements.
- Implement API rate limiting and authentication (OAuth 2.0 or API keys) for external systems accessing CMDB data.
- Develop canonical data models to normalize attributes from disparate sources (e.g., cloud providers, monitoring tools) before ingestion.
- Use message queues (e.g., Kafka, RabbitMQ) to decouple high-volume integrations and prevent CMDB overload during spikes.
- Validate incoming integration data against CMDB schema rules and reject malformed payloads with detailed error logging.
- Monitor integration health and latency daily, with alerts for failures lasting over 15 minutes.
- Version CMDB APIs explicitly and maintain backward compatibility for at least one year during deprecation cycles.
- Document integration SLAs (e.g., data freshness, uptime) and align them with service level agreements for dependent processes.
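The validate-and-reject pattern for incoming integration data can be sketched as a schema check before ingestion. The field names and types below are assumptions for illustration, not a specific CMDB product's schema.

```python
# Minimal ingestion-side schema for integration payloads (assumed fields).
SCHEMA = {"name": str, "serial_number": str, "ci_class": str}

def validate_payload(payload: dict):
    """Check a payload against the schema; return (ok, errors) so the
    caller can reject malformed input with detailed error logging."""
    errors = []
    for field, ftype in SCHEMA.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], ftype):
            errors.append(
                f"wrong type for {field}: expected {ftype.__name__}")
    return (not errors, errors)
```

When this check sits behind a message queue, as the decoupling bullet suggests, rejected payloads can be parked on a dead-letter queue with their error list, so the source system can be corrected without blocking healthy traffic.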