This article treats both stores as readiness objects. The practitioner learns to score them against the same framework used for source datasets in Article 3, extended to cover the serving-path characteristics unique to each store type.
Why these stores are governance artifacts
A feature store consolidates expensive feature-engineering work so multiple models can share it. A vector store consolidates expensive embedding-generation work so multiple retrieval pipelines can share it. Both are deliberately introduced as reuse layers. That reuse is valuable and dangerous.
The value is obvious — reuse saves compute, reduces duplicated engineering, and makes features or embeddings version-consistent across consumers. The danger is that the store becomes a single point of failure that sits inside multiple production AI systems simultaneously. A schema change in a feature store can break a dozen models. An embedding-model change in a vector store can degrade a dozen retrieval pipelines. A weak access policy on either store can leak sensitive data into unintended AI workloads.
A readiness practitioner therefore treats feature and vector stores with additional rigor. Every consumer of the store is a downstream dependency, and the store’s governance must match the blast radius of its consumer set.
Feature store readiness criteria
A feature store has two serving surfaces: offline (batch retrieval for training) and online (low-latency retrieval for inference). The readiness criteria must cover both.
Schema contract
Every feature is declared with a type, a value range, an owner, a description, a freshness SLA, and a version. The schema contract is the feature-store equivalent of a data contract (Article 3) adapted to feature semantics. The readiness practitioner audits that every feature in use has all six elements populated. Where the store contains features with missing elements, those features are not ready for production use.
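The six-element audit can be automated. Below is a minimal sketch, with hypothetical names throughout (the article does not prescribe an implementation): a feature declaration with optional contract elements, and a check that reports any element left unpopulated.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical feature declaration carrying the six contract elements
# named in the text: type, value range, owner, description, freshness
# SLA, and version.
@dataclass
class FeatureContract:
    name: str
    dtype: Optional[str] = None
    value_range: Optional[tuple] = None
    owner: Optional[str] = None
    description: Optional[str] = None
    freshness_sla_seconds: Optional[int] = None
    version: Optional[str] = None

def missing_elements(f: FeatureContract) -> list:
    """Return the contract elements that are not populated."""
    required = ["dtype", "value_range", "owner", "description",
                "freshness_sla_seconds", "version"]
    return [r for r in required if getattr(f, r) is None]

# A feature with gaps in its contract is not ready for production use.
balance = FeatureContract(name="account_balance", dtype="float",
                          owner="payments-data", version="v3")
assert missing_elements(balance) == ["value_range", "description",
                                     "freshness_sla_seconds"]
```

Running this check over every registered feature turns the readiness audit into a pass/fail report rather than a manual review.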
Consistency across offline and online
The feature value for a given entity and point in time must be the same whether retrieved from the offline batch surface or the online real-time surface. Offline/online skew is a recurring feature-store failure mode. The readiness practitioner checks that the feature store has a documented consistency guarantee (event-time semantics, point-in-time joins, correct lookback windows) and that consistency tests run regularly against a sample of features.
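A consistency test can be sketched as a sampled comparison of the two serving surfaces. The dictionaries below stand in for real offline and online lookups; the names are illustrative, not an API.

```python
import math

# Stand-ins for the two serving surfaces: each maps an
# (entity_id, feature_name) pair to a value as of the same event time.
offline_batch = {("user-1", "txn_count_7d"): 14, ("user-2", "txn_count_7d"): 3}
online_cache  = {("user-1", "txn_count_7d"): 14, ("user-2", "txn_count_7d"): 5}

def skew_report(sample_keys, tolerance=0.0):
    """Compare offline and online values for a sample of keys and
    return the mismatches (offline/online skew)."""
    mismatches = []
    for key in sample_keys:
        off, on = offline_batch[key], online_cache[key]
        if not math.isclose(off, on, abs_tol=tolerance):
            mismatches.append((key, off, on))
    return mismatches

# user-2 exhibits skew and should trigger the escalation path.
report = skew_report(offline_batch.keys())
assert report == [(("user-2", "txn_count_7d"), 3, 5)]
```

The readiness requirement is that a test of this shape runs on a schedule against a rotating sample, not once at onboarding.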
Freshness SLA
Each feature has a documented latency between the underlying source event and feature availability. An account-balance feature with a 30-second SLA is suitable for latency-sensitive inference; the same feature with a 24-hour SLA is not. Use cases that depend on tight-SLA features must reference the SLA explicitly, and the store must monitor the SLA continuously with alerting.
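The SLA check itself is simple arithmetic on timestamps. A minimal sketch, with hypothetical function and field names:

```python
from datetime import datetime, timedelta, timezone

def sla_breached(event_time: datetime, available_time: datetime,
                 sla: timedelta) -> bool:
    """True when the gap between the source event and feature
    availability exceeds the documented freshness SLA."""
    return (available_time - event_time) > sla

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
# A 30-second-SLA balance feature that landed 45 seconds late is in breach.
assert sla_breached(now, now + timedelta(seconds=45), timedelta(seconds=30))
# The same lag is fine under a 24-hour SLA, but that SLA no longer
# supports latency-sensitive inference.
assert not sla_breached(now, now + timedelta(seconds=45), timedelta(hours=24))
```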
Access policy
Features derived from restricted data carry the restriction. A feature derived from a PII column is itself PII even after transformation unless the transformation is irreversible and documented. Access policy on the store must enforce the restriction across training and inference paths. The readiness practitioner verifies that the policy exists in machine-enforceable form (IAM rules, not just documentation) and that the enforcement has been tested.
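What "tested enforcement" means can be shown with a toy policy table. This is a sketch of the test shape only; a real store would enforce this through its platform's IAM, and every name below is hypothetical.

```python
# Hypothetical machine-enforceable policy: feature -> roles allowed to
# read it. A feature derived from PII inherits the narrower scope of
# its source column.
POLICY = {
    "account_balance": {"fraud-model-sa", "risk-analyst"},
    "income_bracket_pii": {"risk-analyst"},  # PII-derived: narrower scope
}

def can_read(principal_role: str, feature: str) -> bool:
    """Deny by default: unknown features grant no access."""
    return principal_role in POLICY.get(feature, set())

# The readiness test: an unauthorized service account is denied on the
# PII-derived feature across both training and inference paths.
assert can_read("risk-analyst", "income_bracket_pii")
assert not can_read("fraud-model-sa", "income_bracket_pii")
```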
Lineage back to source
Every feature must trace to source data through a documented transformation. The lineage link combines with the Article 4 provenance discipline to answer “where did this value come from” for any feature in use. A feature whose lineage is opaque is a latent readiness failure.
Drift monitoring
Features drift. The distribution of an input feature can shift as upstream source systems change, as customer populations evolve, or as the world changes. Each feature in production use should carry a drift monitor with a documented threshold and an escalation path. Article 10 of this credential covers drift in depth; here the practitioner audits that the feature store owns its drift monitoring rather than pushing it to each downstream model.
Deprecation policy
Features are eventually retired. A clear deprecation policy specifies the notice period to consumers, the alternative features (if any), and the rollback path if the deprecation breaks a consumer. Features deprecated without notice are an incident class; the policy prevents it.
[DIAGRAM: HubSpokeDiagram — feature-store-readiness-spokes — central hub labeled “Feature Store” with seven spokes: schema contract, offline/online consistency, freshness SLA, access policy, lineage, drift monitoring, deprecation policy — each spoke annotated with the owner role and the check frequency]
Worked examples — feature stores across stacks
Feature-store implementations span open source, commercial, and in-house patterns. Named public implementations include:
- Feast (open source; Linux Foundation AI & Data project). Adopted at Airbnb, Gojek, Robinhood, and others; public engineering blog posts document the integration patterns.[^1][^2]
- Tecton (commercial). Documented integrations with Snowflake, Databricks, and Spark-based warehouses.
- Databricks Feature Store (platform-native). Integrated with Unity Catalog for governance.
- SageMaker Feature Store (AWS-native).
- In-house stores — Uber’s Michelangelo, Airbnb’s Zipline, and LinkedIn’s Frame are all published as engineering case studies.
The readiness practitioner does not score on vendor choice. The practitioner scores on whether the seven readiness criteria are met by whatever implementation the organization has chosen. A best-in-class commercial store with a weak access policy fails readiness; a homegrown store with strong practices across all seven criteria passes.
Vector store readiness criteria
Vector stores are a newer artifact class and carry criteria specific to retrieval-augmented generation (RAG). The criteria below extend the feature-store criteria with RAG-specific additions.
Chunking strategy
Source documents are split into chunks before embedding. The chunking strategy — size, overlap, boundary rules (paragraph, semantic, fixed-token), and metadata preserved per chunk — shapes retrieval quality and is itself a governed artifact. A change to chunking requires reindexing and is a breaking change for downstream consumers.
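A fixed-token strategy with overlap can be sketched as follows. Whitespace tokens stand in for a real tokenizer, and the metadata fields are illustrative; the point is that size, overlap, and per-chunk metadata are explicit, versionable parameters.

```python
def chunk_document(text: str, doc_id: str, size: int = 6, overlap: int = 2):
    """Fixed-token chunking with overlap, preserving per-chunk metadata.
    Whitespace splitting is a stand-in for a real tokenizer."""
    tokens = text.split()
    chunks, start = [], 0
    step = size - overlap
    while start < len(tokens):
        window = tokens[start:start + size]
        chunks.append({
            "doc_id": doc_id,
            "chunk_index": len(chunks),
            "token_span": (start, start + len(window)),
            "text": " ".join(window),
        })
        if start + size >= len(tokens):
            break
        start += step
    return chunks

chunks = chunk_document("one two three four five six seven eight", "kb-42")
assert [c["text"] for c in chunks] == [
    "one two three four five six",
    "five six seven eight",
]
```

Changing `size` or `overlap` changes every chunk boundary, which is why the text treats a chunking change as a breaking change requiring reindexing.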
Embedding model version
The embedding model used to produce vectors is a governance-critical parameter. A change to the embedding model requires reindexing (the old vectors are incompatible with new queries). The readiness practitioner audits that the embedding-model version is pinned per index, recorded in the vector store’s metadata, and versioned independently from the source documents.
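One way to make the pin operational is to fail closed at query time when the query's embedding model does not match the index's pinned version. A minimal sketch with hypothetical metadata:

```python
# Hypothetical per-index metadata: each index pins exactly one embedding
# model version, recorded independently of source-document versions.
index_metadata = {
    "kb-index-prod": {"embedding_model": "text-embed-v2", "dims": 768},
}

def query_allowed(index: str, query_embedding_model: str) -> bool:
    """Reject queries embedded with any model other than the index's
    pin: vectors from different embedding models are not comparable."""
    return index_metadata[index]["embedding_model"] == query_embedding_model

assert query_allowed("kb-index-prod", "text-embed-v2")
# A model upgrade without reindexing must fail closed, not degrade silently.
assert not query_allowed("kb-index-prod", "text-embed-v3")
```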
Embedding governance
The broader discipline — encompassing chunking strategy, embedding model version, refresh cadence, and index lifecycle — is embedding governance. It is a new sub-discipline inside data governance. The readiness practitioner confirms that embedding governance is owned (not orphaned between ML and platform teams) and that the ownership includes a response protocol for embedding-model deprecation.
Source-document lineage
Every chunk in a vector store should trace to the source document it was derived from, including the document’s version, the access scope, and the retention policy. A chunk whose source-document access is revoked must be removed from the index. The readiness practitioner audits the lineage and the revocation path.
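The revocation path the practitioner audits can be sketched directly from the lineage metadata. The records below are illustrative:

```python
# Hypothetical chunk records carrying source-document lineage.
index = [
    {"chunk_id": "c1", "source_doc": "doc-A", "doc_version": 3},
    {"chunk_id": "c2", "source_doc": "doc-A", "doc_version": 3},
    {"chunk_id": "c3", "source_doc": "doc-B", "doc_version": 1},
]

def revoke_document(index, doc_id):
    """The revocation path: when access to a source document is revoked,
    every chunk derived from it is removed from the index."""
    return [c for c in index if c["source_doc"] != doc_id]

index = revoke_document(index, "doc-A")
assert [c["chunk_id"] for c in index] == ["c3"]
```

Without the `source_doc` lineage field, this operation is impossible, which is why opaque lineage is a latent readiness failure here just as it is for features.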
Refresh cadence
Source documents change. The refresh cadence — how often the index is rebuilt or incrementally updated — must match the use case’s freshness requirement. A customer-facing RAG system answering billing questions needs same-day refresh when billing documents change; a research-literature assistant can tolerate weekly or monthly refresh.
Query-time access policy
Retrieval queries inherit the user’s access scope. A user querying the index should only retrieve chunks from documents they are authorized to see. Query-time access enforcement is the hardest access-policy problem in RAG and the one most commonly broken. The readiness practitioner specifically tests the enforcement: a user without access to a restricted document should not retrieve its chunks even under targeted queries.
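The enforcement test can be sketched as a scope filter over retrieved chunks. All names below are hypothetical; real stores typically enforce this with in-index metadata filters rather than a post-retrieval pass, but the test shape is the same.

```python
# Hypothetical access metadata: chunk -> source document, and
# user -> set of documents the user is authorized to see.
CHUNK_ACL = {"c1": "doc-public", "c2": "doc-restricted"}
USER_SCOPE = {"alice": {"doc-public", "doc-restricted"},
              "bob": {"doc-public"}}

def filter_results(user: str, retrieved_chunk_ids: list) -> list:
    """Restrict results to chunks whose source document is in the
    querying user's access scope, preserving rank order."""
    scope = USER_SCOPE.get(user, set())
    return [c for c in retrieved_chunk_ids if CHUNK_ACL[c] in scope]

# The readiness test: even a targeted query that ranks the restricted
# chunk first must not return it to an unauthorized user.
assert filter_results("alice", ["c2", "c1"]) == ["c2", "c1"]
assert filter_results("bob", ["c2", "c1"]) == ["c1"]
```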
Retrieval quality monitoring
Retrieval quality drifts. Source-document changes, embedding-model versions, and query-pattern shifts all affect retrieval quality. The readiness practitioner requires a retrieval-quality monitor (recall-at-k, mean reciprocal rank, or task-specific equivalents on a held-out query set) with an escalation path when quality degrades.
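The two standard monitors named above are short to implement. A sketch, assuming a held-out query set where each query carries its known relevant documents:

```python
def recall_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Fraction of the relevant documents found in the top-k results."""
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)

def mean_reciprocal_rank(queries: list) -> float:
    """Each query is (ranked_results, relevant_set); MRR averages the
    reciprocal rank of the first relevant result per query."""
    total = 0.0
    for ranked, relevant in queries:
        rr = 0.0
        for i, doc in enumerate(ranked, start=1):
            if doc in relevant:
                rr = 1.0 / i
                break
        total += rr
    return total / len(queries)

held_out = [
    (["d3", "d1", "d7"], {"d1"}),  # first relevant at rank 2 -> RR 0.5
    (["d5", "d2", "d9"], {"d5"}),  # first relevant at rank 1 -> RR 1.0
]
assert mean_reciprocal_rank(held_out) == 0.75
assert recall_at_k(["d3", "d1", "d7"], {"d1", "d7"}, k=2) == 0.5
```

Tracking these values over time against a fixed held-out set is what turns them from one-off evaluations into the drift monitor the criterion requires.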
Audit log
Every retrieval is logged with the query, the retrieved chunks, the user context, and the downstream generation. The audit log supports debugging, compliance, and incident investigation.
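A minimal log-record shape, serialized as an append-only JSON line; the field names are illustrative, but they cover the four elements the criterion names:

```python
import json
from datetime import datetime, timezone

def audit_record(query, retrieved_chunk_ids, user, generation_id):
    """Serialize one retrieval event: query, retrieved chunks, user
    context, and a pointer to the downstream generation."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "retrieved_chunks": retrieved_chunk_ids,
        "user_context": user,
        "generation_id": generation_id,
    })

line = audit_record("reset billing date?", ["c1", "c4"], "alice", "gen-881")
record = json.loads(line)
assert record["retrieved_chunks"] == ["c1", "c4"]
```

Linking the `generation_id` back to the retrieval is what makes incident investigation possible: a bad answer can be traced to the exact chunks it was grounded on.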
[DIAGRAM: StageGateFlow — rag-pipeline-readiness-gates — horizontal flow of a retrieval-augmented pipeline: document intake → chunking → embedding → indexing → retrieval → generation → audit — each stage annotated with the readiness gate (lineage registered, chunking strategy versioned, embedding model pinned, access policy enforced, retrieval quality monitored, audit log emitted)]
Vector store implementations
Like feature stores, vector stores span open-source and commercial options. Named implementations include:
- Pinecone (commercial managed service).
- Weaviate (open source with optional managed service).
- Qdrant (open source with optional managed service).
- Milvus (open source; LF AI & Data project).
- pgvector (PostgreSQL extension).
- Chroma (open source, local-first).
- Azure AI Search, Amazon OpenSearch, Google Vertex AI Matching Engine (cloud-platform-native).
Each implementation has tradeoffs on latency, scale, metadata filtering, and governance integration. The readiness practitioner scores against the criteria, not the brand. A team using pgvector with tight lineage, access control, and refresh discipline meets readiness; a team using a managed service without documented chunking strategy or embedding-model versioning does not.
The Samsung access-policy failure, reprised
Article 3 introduced the Samsung ChatGPT incident in the context of access-scope enforcement. The incident recurs here because external AI services often maintain their own embedding stores, and pasting proprietary content into them places that content into an embedding index outside the organization’s governance. The readiness practitioner should require that the organization maintain an internal vector-store option for sensitive content, with access policies that prevent routing to external services, and an acceptable-use policy that names the boundary explicitly. Technical enforcement matters more than policy prose.
The hybrid pattern — feature store plus vector store
Many mature AI platforms run a feature store and a vector store side by side, serving different workload classes. A customer-service platform may use features from the feature store to predict call routing and the vector store to retrieve knowledge-base content for the agent assistant. The two stores share governance concepts but serve different consumers.
The readiness practitioner should confirm that the two stores have:
- Coordinated access policies. A user or service entitled to feature-store access should not automatically have vector-store access; the access models are distinct and should be authorized independently.
- Shared lineage where applicable. Where a vector-store index is derived from features or documents that the feature store also covers, the shared lineage should be captured once and referenced from both.
- Independent drift monitoring. Feature drift and retrieval-quality drift are different phenomena and need different monitors; sharing a monitor across the two stores is a governance mistake.
- Independent incident response. A feature-store incident and a vector-store incident are typically different in severity and response path; the practitioner records both in the same incident taxonomy but maintains distinct response playbooks.
Cost and capacity planning as a governance input
Feature and vector stores are resource-intensive, and cost pressure can silently degrade governance. An index that is rebuilt less frequently than the contract requires because the reindex budget was cut is a governance failure masquerading as an ops decision. The readiness practitioner audits:
- Is the store’s operating cost budgeted and owned?
- Are the refresh cadence and the freshness SLA achievable within the budget?
- What happens to the governance posture if the budget is cut by 20%? By 50%?
An honest budget-pressure analysis is itself evidence that the store’s governance will survive organizational change.
Governance ownership — preventing orphan risk
Feature and vector stores sit between platform engineering, ML engineering, data engineering, and security. Without explicit ownership, they become orphan systems — everyone uses them, no one owns them. The readiness practitioner insists on named ownership:
- Platform team — provisions and operates the physical store.
- Data or ML engineering team — owns the schema or index content, the freshness SLA, and the retention policy.
- Security team — owns the access policy and query-time enforcement.
- Governance lead — owns the contract registration, the drift-monitoring policy, and the deprecation approval path.
The four owners meet regularly (typically monthly) to review incidents, plan changes, and approve contract amendments. A store without this forum is a readiness finding.
Cross-references
- COMPEL Governance Professional — Data architecture for enterprise AI (EATE-Level-3/M3.3-Art03-Data-Architecture-for-Enterprise-AI.md) — the expert treatment of enterprise data architecture in which feature and vector stores sit.
- COMPEL Core — Technology pillar domains: data and platforms (EATF-Level-1/M1.3-Art06-Technology-Pillar-Domains-Data-and-Platforms.md) — the 20-domain maturity model’s platform domain.
- AITM-DR Article 3 (./Article-03-Data-Governance-and-Data-Contracts.md) — the contract discipline extended here to feature and vector stores.
- AITM-DR Article 10 (./Article-10-Drift-Monitoring-Incident-Classification-and-Sustainment.md) — drift monitoring referenced in the feature-store and vector-store readiness criteria.
Summary
Feature stores and vector stores are data products, not infrastructure. Each carries readiness criteria that extend the data contract: schema, consistency, freshness SLA, access policy, lineage, drift monitoring, and deprecation policy for feature stores; chunking strategy, embedding model version, source-document lineage, refresh cadence, query-time access enforcement, retrieval-quality monitoring, and audit logging for vector stores. A readiness practitioner scores against the criteria, not the tool. Orphan ownership is the most common readiness failure mode, addressed by named four-party ownership covering platform, data or ML engineering, security, and governance.
Footnotes
[^1]: Feast documentation and engineering blog, Linux Foundation AI & Data Foundation. https://feast.dev/blog/feast-joins-the-lf-ai-data-foundation/
[^2]: Airbnb Engineering, “Airbnb’s ML Platform,” 2021. https://medium.com/airbnb-engineering/airbnbs-ml-platform-8bc51e21ae75