AITP M2.6-Art12 v1.0 Reviewed 2026-04-06 Open Access
M2.6 Industry Applications and Case Study Analysis
AITP · Practitioner

Building EU AI Act Evidence Portfolios


12 min read · Article 12 of 20

This article provides practitioner-level guidance on building evidence portfolios that are complete, defensible, and maintainable over time.

The Evidence Hierarchy

EU AI Act compliance evidence is organised in a natural hierarchy. Understanding this hierarchy helps practitioners ensure completeness and trace every compliance claim to its supporting evidence.

Level 1: Classification Evidence

The foundation of the portfolio. Classification evidence supports the determination of which risk category each AI system falls into and, therefore, which obligations apply.

Required evidence includes:

  • AI system inventory with system metadata (name, version, provider, deployer, intended purpose, affected population, deployment geography)
  • Classification rationale document per system, referencing specific Articles and Annex categories
  • Article 6(3) exception assessment documentation (where the exception is claimed)
  • GPAI model identification and systemic risk threshold analysis
  • Legal opinion on classification (for edge cases)

Quality criteria:

  • Each classification must reference the specific Article and sub-paragraph relied upon
  • Assumptions must be explicitly stated (e.g., “We assume the system’s use case does not constitute profiling within the meaning of Article 6(3)”)
  • The classification must be signed off by a person with appropriate authority
  • Classifications must be reviewed when the system’s purpose, scope, or context changes
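The inventory metadata and quality criteria above can be captured as a single record type. The following Python sketch is illustrative only: the class name, field names, and sign-off rule are assumptions derived from the bullets above, not a mandated schema.

```python
from dataclasses import dataclass, field

@dataclass
class ClassificationRecord:
    """Illustrative classification evidence record; fields follow the inventory
    metadata listed above (name, version, provider, deployer, etc.)."""
    system_name: str
    version: str
    provider: str
    deployer: str
    intended_purpose: str
    affected_population: str
    deployment_geography: str
    risk_category: str       # e.g. "high-risk", "limited-risk", "minimal-risk"
    legal_basis: str         # specific Article/Annex relied upon
    assumptions: list[str] = field(default_factory=list)  # explicit assumptions
    approved_by: str = ""    # person with sign-off authority

    def is_signed_off(self) -> bool:
        # A classification with no named approver or no cited legal basis
        # fails the quality criteria above.
        return bool(self.approved_by) and bool(self.legal_basis)
```

In practice these records would live in the AI system inventory and be re-validated whenever the system's purpose, scope, or context changes.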

Level 2: Requirements Compliance Evidence

For each applicable requirement, the portfolio must contain evidence demonstrating compliance. This is the bulk of the portfolio for high-risk systems.

Article 9 — Risk Management Evidence:

  • Risk management plan (scope, methodology, criteria, roles)
  • Risk register (identified risks, severity, likelihood, risk treatment decisions)
  • Risk treatment effectiveness testing reports
  • Residual risk documentation and deployer communication
  • Post-market monitoring integration documentation
  • Risk management review records (evidence of continuous iteration)

Article 10 — Data Governance Evidence:

  • Data governance policy for AI training, validation, and testing data
  • Data sourcing documentation (provenance, collection methods, consent basis)
  • Data quality assessment reports (completeness, representativeness, error rates)
  • Bias detection and mitigation analysis reports
  • Data gap analysis and remediation documentation
  • Statistical properties documentation for data sets
  • Personal data handling documentation (GDPR intersection where applicable)

Article 11 / Annex IV — Technical Documentation:

  • System description (purpose, developer, version, interfaces)
  • Development process documentation (design specifications, architecture, algorithms)
  • Training methodology documentation (objectives, hyperparameters, optimisation)
  • Validation and testing methodology and results
  • Performance metrics and their appropriateness justification
  • Change log documenting modifications through the lifecycle

Article 12 — Record-Keeping Evidence:

  • Logging system design specification
  • Sample log outputs demonstrating content requirements
  • Log retention policy and implementation evidence
  • Log integrity and tamper-protection mechanisms
  • Log analysis procedures for post-market monitoring

Article 13 — Transparency Evidence:

  • Instructions for use document (covering all Article 13(3) elements)
  • Provider contact information accessible to deployers
  • System capabilities and limitations description
  • Known circumstances that may lead to risks
  • Human oversight guidance for deployers

Article 14 — Human Oversight Evidence:

  • Human oversight design specification
  • Override and intervention mechanism testing results
  • Output interpretation aid documentation
  • Overseer training materials and completion records
  • Operational procedures for human oversight

Article 15 — Accuracy, Robustness, and Cybersecurity Evidence:

  • Accuracy metric declarations with validation methodology
  • Validation test results against declared accuracy levels
  • Robustness testing reports (errors, faults, distribution shifts)
  • Cybersecurity assessment reports (AI-specific and general)
  • Adversarial testing results (prompt injection, data poisoning, model extraction)
  • Fail-safe mechanism documentation and testing

Article 17 — Quality Management System Evidence:

  • QMS policy and procedures
  • Internal audit schedule and results
  • Management review records
  • Corrective and preventive action (CAPA) records
  • Personnel competence and training records

Level 3: Conformity Assessment Evidence

Evidence produced during the conformity assessment process itself.

For internal conformity assessment (Annex VI):

  • Internal assessment protocol and checklist
  • Assessor qualifications and independence declaration
  • Assessment report covering all applicable requirements
  • Non-conformity findings, corrective actions, and verification
  • Final assessment conclusion and recommendation

For notified body assessment (Annex VII):

  • Notified body application and engagement documentation
  • QMS audit report from the notified body
  • Technical documentation assessment report
  • QMS certificate issued by the notified body
  • Technical documentation certificate (where applicable)
  • Corrective action records in response to notified body findings

Level 4: Operational Compliance Evidence

Ongoing evidence produced during the system’s operational life.

  • Post-market monitoring plan and implementation evidence
  • Performance monitoring data and trend analysis
  • Incident logs, investigation reports, and corrective actions
  • Serious incident reports submitted to authorities (Article 73)
  • System modification records and impact assessments
  • Periodic compliance review reports
  • Updated risk assessments incorporating operational data

Level 5: Registration and Declaration Evidence

Formal regulatory documentation.

  • EU declaration of conformity (signed, per Annex V format)
  • EU database registration confirmation (Article 49)
  • CE marking evidence (photograph or documentation of marking)
  • Fundamental rights impact assessments (where deployer is a public body, Article 27)

Evidence Documentation Standards

Format and Structure

Evidence documents should follow consistent formatting standards to ensure they are professionally presented, easily navigable, and resistant to challenges about their completeness or authenticity.

Recommended document metadata for every evidence artefact:

  • Document ID: Unique identifier within the evidence portfolio
  • Title: Descriptive title of the evidence artefact
  • AI System Reference: Which AI system(s) this evidence relates to
  • Requirement Reference: Which Article/Annex requirement this evidence supports
  • Version: Document version number
  • Created Date: Date of initial creation
  • Last Updated: Date of most recent update
  • Author: Person who created the document
  • Reviewer: Person who reviewed for accuracy and completeness
  • Approver: Person with authority to approve the document
  • Status: Draft, Under Review, Approved, Superseded, Archived
  • Retention Period: How long the document must be retained
  • Classification: Confidentiality level
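A simple validator can enforce this metadata standard at filing time. The key names and status vocabulary in this Python sketch mirror the field list above but are suggestions, not a prescribed format.

```python
# Hypothetical metadata standard mirroring the field list above.
REQUIRED_METADATA_FIELDS = [
    "document_id", "title", "ai_system_reference", "requirement_reference",
    "version", "created_date", "last_updated", "author", "reviewer",
    "approver", "status", "retention_period", "classification",
]

VALID_STATUSES = {"Draft", "Under Review", "Approved", "Superseded", "Archived"}

def validate_metadata(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the block is complete."""
    problems = [f"missing field: {f}"
                for f in REQUIRED_METADATA_FIELDS if not meta.get(f)]
    if meta.get("status") and meta["status"] not in VALID_STATUSES:
        problems.append(f"unknown status: {meta['status']}")
    return problems
```

Running this check on every document before it enters the portfolio catches incomplete metadata early, rather than during a regulatory inspection.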

Version Control

All evidence must be version-controlled. The EU AI Act requires documentation to be “kept up to date” (Article 11(1)), and competent authorities may request previous versions to understand how the system has evolved. Version control must:

  • Preserve all previous versions of evidence documents
  • Record what changed between versions and why
  • Identify who made each change
  • Timestamp all changes accurately
  • Prevent unauthorised modification (audit trail integrity)
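One way to satisfy the tamper-protection and audit-trail criteria above is a hash-chained, append-only version log: each entry commits to the digest of the previous entry, so silently editing history breaks the chain from that point on. This Python sketch is one possible mechanism under assumed entry fields; in practice a document management system or Git typically provides equivalent guarantees.

```python
import hashlib
import json
from datetime import datetime, timezone

def append_version(log: list[dict], author: str, change_summary: str,
                   content: str) -> list[dict]:
    """Append an entry that records who changed what, when, and chains it
    to the previous entry's digest."""
    prev_hash = log[-1]["entry_hash"] if log else "0" * 64
    entry = {
        "version": len(log) + 1,
        "author": author,
        "change_summary": change_summary,
        "content_hash": hashlib.sha256(content.encode()).hexdigest(),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": prev_hash,
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    log.append(entry)
    return log

def verify_chain(log: list[dict]) -> bool:
    """Recompute every link; any edited entry invalidates the chain."""
    prev = "0" * 64
    for entry in log:
        if entry["prev_hash"] != prev:
            return False
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if digest != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True
```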

Retention

Article 18(1) requires providers to keep the technical documentation and related records at the disposal of national competent authorities for 10 years after the AI system has been placed on the market or put into service. Providers must retain automatically generated logs for a period appropriate to the intended purpose of the system, and in any case for at least 6 months (Article 19(1)); deployers must likewise retain the logs under their control for at least 6 months, unless Union or national law requires a longer period (Article 26(6)).

Plan your evidence repository accordingly. Cloud-based document management systems with long-term archival capabilities, audit trails, and access controls are recommended.
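The 10-year retention horizon is simple to compute but easy to get wrong at the edges. This is illustrative arithmetic only, not legal advice; the function name and the leap-day handling are assumptions.

```python
from datetime import date

def retention_until(placed_on_market: date, years: int = 10) -> date:
    """Date until which documentation must remain available, counting the
    stated number of years from placing on the market / putting into service."""
    try:
        return placed_on_market.replace(year=placed_on_market.year + years)
    except ValueError:
        # 29 February landing in a non-leap target year: fall back to 28 Feb.
        return placed_on_market.replace(year=placed_on_market.year + years,
                                        day=28)
```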

Evidence Collection Automation

Manual evidence collection is labour-intensive, error-prone, and difficult to maintain over time. Practitioners should automate evidence collection wherever possible.

Automated Evidence Sources

From the development pipeline:

  • Version control system metadata (commits, branches, code reviews)
  • CI/CD pipeline outputs (build reports, test results, deployment records)
  • Model training logs (hyperparameters, metrics, resource consumption)
  • Data pipeline metadata (data lineage, quality metrics, transformation records)

From production systems:

  • System operation logs (Article 12 compliance)
  • Performance monitoring dashboards and alerts
  • Error and exception logs
  • User feedback and complaint records
  • Incident detection and response system outputs

From governance processes:

  • Risk register updates and review records
  • Governance committee meeting minutes and decision logs
  • Training completion records from learning management systems
  • Audit findings and corrective action tracking
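Each of these sources emits records in its own shape; normalising them onto one envelope keeps the portfolio searchable by system and by Article. The envelope fields in this Python sketch are assumptions, not a standard format.

```python
from collections import defaultdict
from datetime import datetime, timezone

def to_evidence_record(source: str, system_id: str, article_ref: str,
                       payload: dict) -> dict:
    """Wrap a raw artefact from any automated source in a common envelope."""
    return {
        "source": source,            # e.g. "ci_pipeline", "model_registry"
        "system_id": system_id,      # which AI system the evidence belongs to
        "article_ref": article_ref,  # e.g. "Article 15" for robustness tests
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "payload": payload,          # the raw artefact metadata itself
    }

def index_by_article(records: list[dict]) -> dict:
    """Group normalised records by the requirement they support."""
    idx = defaultdict(list)
    for r in records:
        idx[r["article_ref"]].append(r)
    return idx
```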

Integration Architecture

The evidence portfolio should integrate with existing organisational systems rather than exist as an isolated repository:

  • Document Management System (SharePoint, Confluence, etc.): Store narrative evidence documents with version control and access management
  • Source Control (Git): Technical artefacts, model cards, data cards, configuration documentation
  • Model Registry (MLflow, Weights & Biases, etc.): Model metadata, training parameters, evaluation metrics
  • Data Catalogue (Amundsen, DataHub, etc.): Data lineage, quality metrics, governance metadata
  • ITSM/GRC Platform (ServiceNow, Archer, etc.): Risk registers, incident records, audit findings
  • Learning Management System: Training completion records and competency assessments

Quality Assurance for Evidence

Completeness Checks

Periodically verify that the evidence portfolio is complete by mapping every applicable requirement to its supporting evidence:

  • For each high-risk system, verify that evidence exists for Articles 9, 10, 11, 12, 13, 14, 15, and 17
  • For each evidence document, verify that it contains all required information elements
  • Identify any evidence artefacts that are expired, outdated, or pending update
  • Flag any requirements that are supported by a single piece of evidence (single point of failure)
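The first and last of these checks can be mechanised against a requirement-to-evidence matrix. The portfolio shape assumed here (system → article → list of document IDs) is an illustration, not a prescribed structure.

```python
HIGH_RISK_ARTICLES = ["Article 9", "Article 10", "Article 11", "Article 12",
                      "Article 13", "Article 14", "Article 15", "Article 17"]

def completeness_report(portfolio: dict) -> dict:
    """For each system, list articles with no evidence and articles backed
    by only a single document (a single point of failure)."""
    report = {}
    for system, evidence in portfolio.items():
        missing = [a for a in HIGH_RISK_ARTICLES if not evidence.get(a)]
        single = [a for a in HIGH_RISK_ARTICLES
                  if len(evidence.get(a, [])) == 1]
        report[system] = {"missing": missing,
                         "single_point_of_failure": single}
    return report
```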

Consistency Checks

Verify that evidence is internally consistent:

  • Do risk management documents align with the risks identified in the technical documentation?
  • Do accuracy metrics declared in instructions for use match the validation test results?
  • Do human oversight procedures match the oversight design specification?
  • Do data governance documents accurately describe the data actually used?
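The second check, declared versus validated accuracy, is the easiest to automate. This Python sketch assumes both sets of figures are available as metric-name → value mappings; the tolerance parameter is an illustrative assumption.

```python
def accuracy_mismatches(declared: dict, validated: dict,
                        tolerance: float = 0.0) -> list[str]:
    """Flag metrics declared in the instructions for use that were never
    validated, or whose validated value falls short of the declared one."""
    issues = []
    for metric, claimed in declared.items():
        measured = validated.get(metric)
        if measured is None:
            issues.append(f"{metric}: declared but never validated")
        elif measured + tolerance < claimed:
            issues.append(f"{metric}: declared {claimed}, "
                          f"validated {measured}")
    return issues
```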

Inconsistencies are a significant risk during regulatory inspection. A competent authority that finds contradictions between documents will question the reliability of the entire portfolio.

Defensibility Checks

Assess whether evidence would withstand regulatory challenge:

  • Is every compliance claim supported by objective evidence (not just assertions)?
  • Are testing methodologies appropriate and well-documented?
  • Are sample sizes and statistical methods defensible?
  • Are assessor qualifications and independence clearly documented?
  • Is the chain of custody for evidence artefacts unbroken?

Evidence for GPAI Model Obligations

GPAI model providers have specific evidence requirements under Articles 53-55:

Standard GPAI (Article 53):

  • Technical documentation per Annex XI (model architecture, training process, evaluation results)
  • Downstream provider information packages
  • Copyright compliance policy with opt-out processing audit trail
  • Published training data summary (per AI Office template)
  • Published energy consumption data

Systemic Risk GPAI (Article 55, in addition to above):

  • Standardised model evaluation results
  • Adversarial testing (red-teaming) programme documentation and results
  • Systemic risk assessment and mitigation reports
  • Cybersecurity assessment reports
  • Incident monitoring system documentation and incident logs
  • AI Office correspondence and cooperation records

Building the Portfolio: A Step-by-Step Approach

Phase 1: Portfolio Structure (Week 1)

Establish the portfolio structure before collecting evidence:

  1. Create the portfolio repository with a clear folder structure organised by AI system and requirement
  2. Define the document metadata standard and create templates
  3. Establish version control procedures
  4. Assign portfolio maintenance responsibilities
  5. Define the access control policy
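Step 1 can be scripted so every system gets an identical skeleton. The directory names in this sketch follow the five evidence levels described earlier, but they are illustrative; use whatever taxonomy your document management system supports.

```python
from pathlib import Path

# Illustrative layout: one folder per system, one per evidence level.
LEVELS = ["1-classification", "2-requirements", "3-conformity",
          "4-operational", "5-registration"]

def scaffold_portfolio(root: str, systems: list[str]) -> None:
    """Create the per-system, per-level folder skeleton under root."""
    for system in systems:
        for level in LEVELS:
            Path(root, system, level).mkdir(parents=True, exist_ok=True)
```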

Phase 2: Classification Evidence (Weeks 1-2)

Populate the classification layer:

  1. Complete and file the AI system inventory
  2. Document classification rationale for each system
  3. File any legal opinions or specialist assessments
  4. Create a classification summary dashboard for governance committee review

Phase 3: Requirements Evidence — Existing Documentation (Weeks 2-4)

Gather existing documentation that supports compliance:

  1. Collect existing risk management documentation
  2. Gather existing technical documentation, design documents, and architecture diagrams
  3. Compile existing test reports and validation results
  4. Collect existing data governance documentation
  5. Map gathered documents to specific Article requirements and identify gaps

Phase 4: Requirements Evidence — Gap Remediation (Weeks 4-10)

Produce new evidence to fill identified gaps:

  1. Prioritise gaps by regulatory deadline and risk
  2. Create remediation work packages for each gap
  3. Produce documentation to the defined standards and templates
  4. Review and approve new evidence documents
  5. File in the portfolio and update the completeness matrix

Phase 5: Conformity Evidence (Weeks 10-12)

Produce the conformity assessment layer:

  1. Conduct the conformity assessment (internal or notified body)
  2. File assessment reports and findings
  3. Document corrective actions and verification
  4. Prepare and sign the EU declaration of conformity

Phase 6: Operational Evidence (Ongoing)

Activate continuous evidence collection:

  1. Activate post-market monitoring and evidence collection
  2. Establish periodic portfolio review cycle (recommend quarterly)
  3. Implement automated evidence feeds where possible
  4. Monitor for evidence staleness and update accordingly
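Staleness monitoring (step 4) follows directly from the document metadata standard. This sketch assumes each document carries a `last_updated` ISO date and uses a 90-day window matching the quarterly review cycle suggested above; both are assumptions to adapt.

```python
from datetime import date, timedelta

def stale_documents(docs: list[dict], today: date,
                    max_age_days: int = 90) -> list[str]:
    """Return IDs of documents not updated within the review window."""
    cutoff = today - timedelta(days=max_age_days)
    return [d["document_id"] for d in docs
            if date.fromisoformat(d["last_updated"]) < cutoff]
```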

Common Pitfalls

The “compliance dump” anti-pattern: Throwing every document the organisation has ever produced about an AI system into a folder and calling it an evidence portfolio. Quantity without structure is worse than useless — it wastes regulatory reviewers’ time and signals a lack of systematic governance.

The “snapshot” anti-pattern: Producing evidence once and never updating it. The EU AI Act requires documentation to be kept up to date. A technical documentation package that describes a system as it existed 18 months ago does not demonstrate current compliance.

The “narrative-only” anti-pattern: Producing descriptive documents that say “we do risk management” without providing objective evidence that risk management activities were actually conducted. Every narrative claim must be backed by artefact evidence: test reports, audit logs, meeting minutes, system outputs.

The “development-only” anti-pattern: Producing comprehensive development documentation but no operational evidence. The EU AI Act requires ongoing compliance — post-market monitoring, incident reporting, continuous risk management. The portfolio must be a living collection, not a project deliverable.

Conclusion

An evidence portfolio is not a bureaucratic exercise — it is the tangible proof that your organisation’s AI governance is real, operational, and effective. For COMPEL practitioners, building and maintaining an evidence portfolio is a natural extension of the evidence-based governance approach that the framework teaches.

The practitioner who can build a complete, well-structured, defensible evidence portfolio is providing genuine value to their organisation: reducing regulatory risk, accelerating conformity assessment, and creating the documentary foundation for trustworthy AI operations. This skill — translating governance activities into demonstrable evidence — is one of the most practically valuable competencies a COMPEL practitioner can develop.