COMPEL Certification Body of Knowledge — Module 3.7: Advanced Governance Architecture
Article 12 — Domain 20: AI Supply Chain and Third-Party Governance
The AI-BOM Imperative
The software industry learned a painful lesson from the Log4Shell vulnerability in December 2021. A critical vulnerability in a single open-source library — Apache Log4j — affected hundreds of thousands of applications across virtually every industry. Organizations scrambled to determine whether they were affected, but most could not answer the basic question: “Does our software use Log4j?” They could not answer because they did not have Software Bills of Materials (SBOMs) that documented their software dependencies.
The response was transformative. In the United States, Executive Order 14028 mandated SBOMs for software sold to the federal government. The National Telecommunications and Information Administration (NTIA) defined the SBOM minimum elements, and the Cybersecurity and Infrastructure Security Agency (CISA) later assumed stewardship of SBOM guidance. Industry adoption of the SBOM standards SPDX and CycloneDX accelerated dramatically.
AI supply chains face an analogous challenge with higher stakes. When a bias is discovered in a foundation model, which organizations are affected? When a training dataset is found to contain copyrighted material, which models trained on it need remediation? When a security vulnerability is found in an AI inference framework, which deployments are exposed? Without AI Bills of Materials, these questions are unanswerable at scale.
An AI Bill of Materials (AI-BOM) extends the SBOM concept to capture the unique components of AI systems: training data, model architecture, evaluation methodology, fine-tuning processes, safety mechanisms, and deployment configurations. It provides the structured, machine-readable documentation needed to manage AI supply chain risk at enterprise scale.
AI-BOM Structure and Components
A comprehensive AI-BOM contains seven component categories that together provide a complete description of an AI system’s composition, provenance, and characteristics.
Component Category 1: System Identity and Metadata
The foundational layer of the AI-BOM identifies the AI system and provides contextual metadata.
System identifier. A unique, persistent identifier for the AI system. This identifier should follow a standardized format (e.g., PURL for software packages) and should be resolvable to the AI-BOM document.
System name and version. The human-readable name and current version of the AI system. Version information is critical because AI systems change frequently, and the AI-BOM must be version-specific.
Provider identity. The organization that provides the AI system, including legal entity name, contact information, and responsible AI program contact.
Intended use. A description of the AI system’s intended use cases, including the target users, target deployment contexts, and intended input/output types.
Out-of-scope use. A description of use cases that the AI system is not designed for, is not appropriate for, or has been found to perform poorly in. This negative scope documentation is as important as the positive scope.
AI-BOM creation date. The date the AI-BOM was created or last updated. AI-BOMs must be dated because they describe a system that changes over time.
AI-BOM format and standard. The standard and version used to create the AI-BOM (e.g., CycloneDX ML-BOM 1.6, SPDX 3.0 AI profile).
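The metadata fields above can be sketched as a machine-readable record. The snippet below loosely follows the CycloneDX ML-BOM layout; the specific field values and the provider name are illustrative, not normative.

```python
import json

# Minimal AI-BOM identity/metadata sketch (illustrative values).
ai_bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.6",                              # AI-BOM format and standard
    "metadata": {
        "timestamp": "2024-06-01T00:00:00Z",           # AI-BOM creation date
        "supplier": {"name": "Example AI Provider"},   # provider identity (hypothetical)
    },
    "components": [
        {
            "type": "machine-learning-model",
            "bom-ref": "pkg:generic/example-model@2.1.0",  # PURL-style persistent identifier
            "name": "example-model",                        # system name
            "version": "2.1.0",                             # AI-BOMs are version-specific
            "description": "Intended use: document summarization. "
                           "Out of scope: medical or legal advice.",
        }
    ],
}

print(json.dumps(ai_bom, indent=2))
```

Keeping the identifier in PURL style means the same key can link the AI-BOM to software SBOMs that reference the model as a dependency.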
Component Category 2: Model Architecture
The model architecture section describes the computational structure of the AI system.
Model type. The general category of model: large language model, image classification model, regression model, recommender system, reinforcement learning agent, multi-modal model, etc.
Architecture description. The specific architecture: transformer (encoder, decoder, encoder-decoder), convolutional neural network, recurrent neural network, gradient boosted trees, ensemble, etc. For transformer models, specify variant (GPT, BERT, T5, etc.) and key architectural parameters (layers, attention heads, hidden dimensions).
Model size. Key size metrics: parameter count, model weight size (in GB), embedding dimensions, vocabulary size, context window length.
Foundation model dependency. If the model is built on or fine-tuned from a foundation model, identify the foundation model (provider, name, version). This dependency creates a supply chain link that the AI-BOM must capture.
Model framework. The software framework used to implement the model: PyTorch, TensorFlow, JAX, ONNX, etc. Include framework version, as framework vulnerabilities affect model security.
Quantization and optimization. If the model has been quantized, pruned, or otherwise optimized for deployment, describe the optimization applied and its impact on model performance.
Component Category 3: Training Data Provenance
The training data section documents the data used to create and refine the model.
Training datasets. For each dataset used in training:
- Dataset name and identifier
- Dataset provider and source URL
- Dataset size (samples, tokens, images, etc.)
- Temporal coverage (date range of data collection)
- Geographic coverage (regions represented in the data)
- Demographic coverage (populations represented in the data)
- Known limitations (underrepresented populations, temporal biases, geographic gaps)
- License terms and usage restrictions
- Data collection methodology
- Data quality processes applied (cleaning, filtering, deduplication, harmful content removal)
Fine-tuning datasets. If the model was fine-tuned from a foundation model, document the fine-tuning datasets separately, with the same detail as training datasets.
Reinforcement learning data. If the model uses reinforcement learning from human feedback (RLHF) or similar techniques, document the feedback data: number of raters, rater demographics, rating criteria, inter-rater agreement metrics, and known rater biases.
Synthetic data. If synthetic data was used for training or augmentation, document the synthetic data generation methodology, the model used to generate synthetic data, and the quality validation applied to synthetic data.
Data exclusions. Document any data that was explicitly excluded from training — content types filtered, domains blocked, time periods excluded — and the rationale for exclusion.
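A per-dataset provenance entry of the kind described above might be modeled as a simple record. The field names below are an illustrative subset of the list, not a standard schema.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetRecord:
    """One training-dataset entry in an AI-BOM (illustrative field names)."""
    name: str
    source_url: str
    size: str                                   # e.g. samples, tokens, images
    temporal_coverage: str                      # date range of data collection
    license: str
    known_limitations: list = field(default_factory=list)
    quality_processes: list = field(default_factory=list)

# Hypothetical dataset entry:
web_corpus = DatasetRecord(
    name="example-web-corpus",
    source_url="https://example.com/corpus",
    size="1.2B tokens",
    temporal_coverage="2019-01 to 2023-06",
    license="CC-BY-4.0",
    known_limitations=["underrepresents non-English content"],
    quality_processes=["deduplication", "harmful-content filtering"],
)
print(web_corpus.name, web_corpus.license)
```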
Component Category 4: Evaluation and Performance
The evaluation section documents how the model was tested and what performance it achieves.
Benchmark evaluations. For each benchmark used:
- Benchmark name and version
- Evaluation date
- Metrics measured (accuracy, precision, recall, F1, BLEU, ROUGE, perplexity, etc.)
- Results achieved
- Comparison to relevant baselines
Fairness evaluations. For each fairness evaluation:
- Protected characteristics tested
- Fairness metrics used (demographic parity, equalized odds, calibration, etc.)
- Results by demographic group
- Intersectional analysis results (if conducted)
- Thresholds applied and whether they were met
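As a concrete example of one fairness metric from the list, the demographic parity difference is the gap in positive-outcome rates between groups. The data below is synthetic.

```python
# Synthetic binary predictions and group labels.
predictions = [1, 0, 1, 1, 0, 1, 0, 0]
groups      = ["A", "A", "A", "A", "B", "B", "B", "B"]

def selection_rate(preds, grps, group):
    """Fraction of positive predictions for members of one group."""
    members = [p for p, g in zip(preds, grps) if g == group]
    return sum(members) / len(members)

rate_a = selection_rate(predictions, groups, "A")  # 3/4 = 0.75
rate_b = selection_rate(predictions, groups, "B")  # 1/4 = 0.25
parity_gap = abs(rate_a - rate_b)                  # 0.5
print(parity_gap)
```

An AI-BOM would record both the per-group rates and the threshold the gap was judged against.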
Robustness evaluations. For each robustness evaluation:
- Adversarial attack types tested (input perturbation, prompt injection, jailbreaking, etc.)
- Testing methodology (automated red teaming, human red teaming, formal verification)
- Results and identified vulnerabilities
- Mitigations applied
Safety evaluations. For each safety evaluation:
- Safety domains tested (toxicity, bias, misinformation, dangerous content, etc.)
- Testing methodology
- Results and identified failure modes
- Safety mitigations applied (output filters, content classifiers, refusal mechanisms)
Limitations. A candid description of known limitations: failure modes, accuracy degradation scenarios, bias patterns, hallucination tendencies, and other known weaknesses.
Component Category 5: Software Dependencies
The software dependency section captures the traditional SBOM information for the AI system’s software stack.
Direct dependencies. The software libraries and frameworks directly used by the AI system, with version numbers, license terms, and known vulnerability status.
Transitive dependencies. The dependencies of dependencies, recursively, to provide full dependency tree visibility.
Runtime environment. The operating system, runtime (Python, Node.js, etc.), container image, and infrastructure requirements.
Vulnerability status. Known vulnerabilities (CVEs) in any dependency, with severity scores and remediation status.
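With machine-readable dependency records, the Log4Shell-style question ("does our software use Log4j?") becomes a simple lookup. The sketch below queries a synthetic components list of the kind an SBOM section would contain.

```python
# Synthetic SBOM-style component records.
components = [
    {"name": "org.apache.logging.log4j:log4j-core", "version": "2.14.1"},
    {"name": "torch", "version": "2.1.0"},
    {"name": "numpy", "version": "1.26.0"},
]

def affected_by(components, package_name, bad_versions):
    """Return components matching a known-vulnerable package/version set."""
    return [c for c in components
            if c["name"] == package_name and c["version"] in bad_versions]

hits = affected_by(components,
                   "org.apache.logging.log4j:log4j-core",
                   {"2.14.0", "2.14.1"})
print(len(hits))  # 1 affected component found
```

In practice the vulnerable-version set would come from a CVE feed rather than a hard-coded literal.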
Component Category 6: Deployment Configuration
The deployment section documents how the AI system is configured for production use.
Inference parameters. Configuration parameters that affect model behavior: temperature, top-p, max tokens, stop sequences, system prompts, safety settings, content filter settings.
API specification. The API through which the model is accessed: endpoints, authentication, rate limits, input formats, output formats.
Scaling configuration. How the AI system scales: instance types, auto-scaling parameters, geographic distribution, redundancy configuration.
Access controls. Who can access the AI system, through what mechanisms, with what permissions.
Component Category 7: Lifecycle and Maintenance
The lifecycle section documents the operational maintenance of the AI system.
Update cadence. How frequently the model is updated: retraining schedule, fine-tuning schedule, safety update schedule.
Update notification. How customers are notified of updates: advance notification period, notification channels, change documentation provided.
Deprecation policy. How and when the model version will be deprecated: deprecation timeline, migration support, backward compatibility commitments.
Support contacts. Technical support, responsible AI contacts, security incident contacts, and escalation paths.
Alignment with NIST and EU AI Act Requirements
NIST AI RMF Alignment
The NIST AI Risk Management Framework provides the risk management context within which AI-BOMs operate. Key alignment points:
MAP function (MAP 4). MAP 4 addresses third-party AI components. MAP 4.1 calls for mapping the risks of AI system components, including third-party data and software; MAP 4.2 calls for documenting internal risk controls for those components. The AI-BOM directly supports both subcategories by providing structured documentation of third-party components and their characteristics.
MEASURE function. The AI-BOM’s evaluation section (Component Category 4) directly supports the MEASURE function’s requirement for documented evaluation methodology and results.
MANAGE function. The AI-BOM’s lifecycle section (Component Category 7) supports the MANAGE function’s requirements for ongoing monitoring, incident management, and change management.
GOVERN function. The AI-BOM itself is a governance artifact. Its existence, completeness, and currency are measures of governance maturity. The GOVERN function’s requirements for policies, processes, and accountability are supported by the organizational processes that create, maintain, and use AI-BOMs.
EU AI Act Alignment
The EU AI Act imposes specific documentation requirements on providers and deployers of AI systems. The AI-BOM supports compliance with several key articles:
Article 11: Technical documentation. Providers of high-risk AI systems must draw up technical documentation before the system is placed on the market. The required documentation includes system description, design specifications, development process, testing and validation, and ongoing monitoring. The AI-BOM provides a structured format for much of this required documentation.
Article 13: Transparency and provision of information. High-risk AI systems must be designed to enable users to interpret output and use it appropriately. The AI-BOM’s documentation of intended use, limitations, and evaluation results supports this transparency requirement.
Article 17: Quality management system. Providers must establish a quality management system that includes documentation of techniques, procedures, and systematic actions for design, development, and examination. The AI-BOM creation and maintenance process is a component of this quality management system.
Article 25: Responsibilities along the AI value chain. Providers must exercise due diligence regarding the components they incorporate, including third-party models and data. The AI-BOM’s documentation of foundation model dependencies, training data provenance, and software dependencies directly supports this due-diligence requirement.
Annex IV: Technical documentation requirements. Annex IV provides a detailed list of required technical documentation elements. The AI-BOM’s seven component categories map comprehensively to the Annex IV requirements.
ISO/IEC 42001:2023 Alignment
ISO/IEC 42001 (AI Management System) provides the management system standard for AI governance. The AI-BOM supports several key controls:
Control A.6.2.6: Documentation of AI system information. Requires organizations to document information about AI systems throughout their lifecycle. The AI-BOM is the primary artifact for satisfying this control.
Control A.10: Supplier relationships. Requires organizations to establish and maintain policies and procedures for AI products, services, and components obtained from external suppliers. The AI-BOM provides the structured documentation that makes supplier relationship governance operational.
Control A.7.4: Documentation. Requires organizations to create and maintain documentation necessary for the effective planning, operation, and control of AI system lifecycle processes. The AI-BOM contributes to this documentation requirement for any AI system that includes third-party components.
Implementation Methodology
Phase 1: Standard Selection and Customization (Months 1-3)
Select the AI-BOM standard that best fits the organization’s ecosystem and regulatory requirements.
CycloneDX ML-BOM. CycloneDX, originally developed by OWASP for software SBOMs, has extended its specification to include machine learning components. The CycloneDX ML-BOM specification (version 1.6 and later) includes fields for model details, training data, evaluation results, and deployment information. CycloneDX is particularly strong for organizations already using CycloneDX for SBOMs, as the AI-BOM integrates naturally with existing SBOM tooling and workflows.
SPDX 3.0 AI Profile. SPDX, maintained by the Linux Foundation, has added an AI/ML profile in version 3.0. The SPDX AI profile includes fields for model architecture, training data, safety evaluations, and known limitations. SPDX is particularly strong for organizations that need interoperability with open-source compliance tooling.
Custom schema. Some organizations develop custom AI-BOM schemas that combine elements from multiple standards with organization-specific fields. This approach provides maximum flexibility but reduces interoperability. Custom schemas should be used only when standard schemas are demonstrably inadequate.
After selecting the base standard, customize it to include any organization-specific fields required by the enterprise’s AI governance framework, regulatory obligations, or industry-specific requirements.
Phase 2: Pilot Implementation (Months 3-6)
Implement AI-BOM creation for a small number of pilot AI systems — ideally two to three systems spanning different types (one internally built, one procured from a transparent vendor, one procured from a less transparent vendor).
The pilot reveals several practical challenges:
Vendor cooperation. Not all vendors will provide the information needed for a complete AI-BOM. The pilot identifies which information is readily available, which requires negotiation, and which may be unobtainable from certain vendors. This informs the AI-BOM completeness standards — which fields are mandatory, which are conditional, and which are aspirational.
Tooling requirements. The pilot identifies what tooling is needed to create, store, validate, and manage AI-BOMs. Options range from simple document templates (for organizations with few AI systems) to dedicated AI governance platforms (for organizations with many AI systems).
Process integration. The pilot identifies how AI-BOM creation integrates with existing processes — procurement, vendor assessment, change management, and compliance reporting.
Effort estimation. The pilot provides realistic effort estimates for AI-BOM creation, which informs the rollout plan.
Phase 3: Progressive Rollout (Months 6-18)
Roll out AI-BOM requirements progressively, aligned with the vendor tiering model:
Strategic vendors first. Require AI-BOMs from strategic AI vendors (Tier 1). These vendors have the most impact and typically the most resources to provide comprehensive documentation. Negotiate AI-BOM provision into strategic vendor contracts.
Tactical vendors second. Extend AI-BOM requirements to tactical AI vendors (Tier 2). Accept streamlined AI-BOMs that cover the most critical component categories (Model Architecture, Training Data, Evaluation, and Lifecycle).
Commodity vendors last. For commodity AI vendors (Tier 3), accept minimal AI-BOMs or self-declared AI information sheets that cover core risk dimensions without the full depth of a comprehensive AI-BOM.
Internal AI systems. Require AI-BOMs for all internally developed AI systems. Internal AI-BOMs are typically more complete because the organization has full visibility into the development process.
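The tiered rollout above implies tier-specific completeness standards. The sketch below encodes one possible mapping from vendor tier to required AI-BOM sections; the mapping itself is illustrative, following the rollout description rather than any standard.

```python
# Hypothetical tier-to-required-sections mapping (illustrative).
REQUIRED_FIELDS = {
    1: {"identity", "architecture", "training_data", "evaluation",
        "dependencies", "deployment", "lifecycle"},      # strategic: full AI-BOM
    2: {"identity", "architecture", "training_data",
        "evaluation", "lifecycle"},                      # tactical: streamlined
    3: {"identity", "evaluation"},                       # commodity: minimal
}

def missing_fields(tier, provided):
    """Sections a vendor at this tier still owes the organization."""
    return REQUIRED_FIELDS[tier] - set(provided)

gaps = missing_fields(2, ["identity", "architecture", "evaluation"])
print(sorted(gaps))  # ['lifecycle', 'training_data']
```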
Phase 4: Lifecycle Management (Ongoing)
AI-BOMs are living documents that must be updated as AI systems change.
Version management. When a vendor updates its AI model, the AI-BOM must be updated to reflect the new version. Establish a process for vendors to notify the organization of model updates and provide updated AI-BOM documentation.
Change detection. Implement automated change detection that compares current AI behavior with AI-BOM documentation. If the AI system’s behavior diverges from its documented characteristics, the AI-BOM may be stale, and the vendor may have made undocumented changes.
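A minimal form of this change detection compares observed metrics against the values documented in the AI-BOM; a gap beyond tolerance suggests an undocumented update. Metric names and the threshold below are illustrative.

```python
# Metrics as documented in the AI-BOM vs. currently observed (synthetic).
documented = {"accuracy": 0.91, "refusal_rate": 0.030}
observed   = {"accuracy": 0.84, "refusal_rate": 0.032}

def drift_alerts(documented, observed, tolerance=0.05):
    """Return metrics whose observed value diverges beyond tolerance."""
    alerts = []
    for metric, baseline in documented.items():
        if abs(observed[metric] - baseline) > tolerance:
            alerts.append(metric)
    return alerts

print(drift_alerts(documented, observed))  # ['accuracy']
```

An alert would trigger vendor follow-up and, if confirmed, an AI-BOM revision.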
Periodic validation. Conduct periodic validation of AI-BOM accuracy by independently testing the AI system’s performance, fairness, and safety characteristics against the documented claims.
Archive management. Maintain an archive of historical AI-BOMs to support audit, incident investigation, and regulatory compliance. When an incident occurs, historical AI-BOMs enable tracing the system’s composition at the time of the incident.
Tooling and Automation
AI-BOM Generation Tools
Several categories of tools support AI-BOM creation:
Model documentation tools. Tools that generate model documentation from model artifacts. Hugging Face Model Cards, Google’s Model Card Toolkit, and Microsoft’s Datasheets for Datasets provide structured templates for documenting model and data characteristics. These tools focus on individual AI systems and produce documentation that can be incorporated into AI-BOMs.
SBOM tools extended to AI. Traditional SBOM tools that have been extended to capture AI-specific components. Syft (Anchore), Tern (VMware), and SPDX tools can capture software dependencies; some are being extended to capture ML-specific components like model frameworks, training libraries, and inference runtimes.
AI governance platforms. Dedicated AI governance platforms that include AI-BOM management as a component of broader AI lifecycle governance. These platforms typically provide AI inventory management, risk assessment workflows, monitoring dashboards, and compliance reporting in addition to AI-BOM management.
Custom tooling. Organizations with large AI portfolios often develop custom tooling that integrates AI-BOM generation into their CI/CD pipelines, model registries, and vendor management systems.
Automation Opportunities
Several aspects of AI-BOM management can be automated:
Dependency scanning. Software dependencies can be automatically scanned and documented using existing SBOM tooling. This covers Component Category 5 (Software Dependencies) with minimal manual effort.
Model metadata extraction. For internally developed models, model metadata (architecture, parameters, framework) can be automatically extracted from model registries and training pipelines.
Evaluation result integration. Model evaluation results can be automatically incorporated into AI-BOMs from evaluation platforms and testing frameworks.
Change detection. Automated monitoring can detect changes in AI system behavior that may indicate model updates requiring AI-BOM revision.
Validation. Automated validation can verify that AI-BOMs contain all required fields, that referenced datasets and models exist, and that evaluation results are within expected ranges.
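Automated validation of the kind described above can be sketched as a field-presence and range check. The required-field list and metric bounds are illustrative, not tied to a specific standard.

```python
# Hypothetical required top-level AI-BOM fields.
REQUIRED = ["system_id", "version", "provider", "intended_use", "created"]

def validate(bom: dict) -> list:
    """Return a list of validation errors for an AI-BOM record."""
    errors = [f"missing field: {f}" for f in REQUIRED if f not in bom]
    # Documented evaluation metrics should be proportions in [0, 1].
    for metric, value in bom.get("evaluation", {}).items():
        if not 0.0 <= value <= 1.0:
            errors.append(f"metric out of range: {metric}={value}")
    return errors

bom = {"system_id": "pkg:generic/example-model@2.1.0",
       "version": "2.1.0",
       "provider": "Example AI Provider",
       "created": "2024-06-01",
       "evaluation": {"accuracy": 1.2}}

print(validate(bom))
```

Checks like these run cheaply on every AI-BOM submission, catching gaps before a human reviewer sees the document.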
The AI-BOM as Governance Foundation
The AI-BOM is not an end in itself. It is the foundational artifact that enables systematic AI supply chain governance. With comprehensive AI-BOMs:
- Procurement decisions are informed by structured, comparable information about AI system composition and characteristics
- Risk assessments can be conducted against documented system properties rather than vendor marketing claims
- Incident response can trace the impact of a vulnerability or bias through the supply chain, from foundation model to enterprise deployment
- Regulatory compliance can be demonstrated through documented evidence of AI system composition, evaluation, and oversight
- Continuous monitoring can detect changes by comparing current behavior against documented characteristics
- Concentration risk can be identified by analyzing AI-BOM dependency trees across the portfolio
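The concentration-risk analysis in the last point can be sketched as a tally of foundation-model dependencies recorded across the portfolio's AI-BOMs. System and model names below are synthetic.

```python
from collections import Counter

# Synthetic portfolio: foundation-model dependency per AI-BOM.
portfolio = [
    {"system": "support-bot",    "foundation_model": "vendorA/base-llm"},
    {"system": "doc-summarizer", "foundation_model": "vendorA/base-llm"},
    {"system": "fraud-scorer",   "foundation_model": "internal/gbm"},
]

dependency_counts = Counter(b["foundation_model"] for b in portfolio)
most_common, count = dependency_counts.most_common(1)[0]
print(most_common, count)  # vendorA/base-llm 2
```

A single model carrying a large share of the portfolio flags a dependency whose failure, bias finding, or license change would have outsized impact.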
The maturity of an organization’s AI-BOM practice is a leading indicator of its AI supply chain governance maturity. Organizations that maintain comprehensive, current AI-BOMs are better positioned to manage AI supply chain risk, respond to incidents, meet regulatory obligations, and make informed decisions about their AI portfolio.
Previous in the Domain 20 series: Article 11 — AI Supply Chain Governance at Enterprise Scale (Module 3.7)
Next in the Domain 20 series: Article 13 — Strategic Third-Party AI Governance for Leaders (Module 4.6)