AI Model Provenance and Supply Chain

**AI · Provenance · Audit trails · Copyright attestation**

Overview

The problem

The modern AI pipeline touches dozens of parties and artifacts:

  • Training datasets from multiple sources (Common Crawl, licensed content, user-contributed, synthetic).
  • Base models developed and licensed (Llama, Mistral, GPT, Claude).
  • Fine-tuned variants building on the base.
  • Distilled / quantized / modified derivatives.
  • Inference endpoints serving the models.
  • Applications consuming the inferences.

Claims across this chain are constantly contested:

  • Copyright. “This model was trained on copyrighted content we didn’t license.” Class actions and regulatory investigations are piling up. The model developer’s internal logs aren’t a sufficient answer.
  • Attribution. “This fine-tune is ours, don’t claim it’s yours.” Derivative work disputes.
  • Safety. “This model was fine-tuned for a purpose different from what’s disclosed.” (A model fine-tuned on malicious data, distributed under a benign name.)
  • Licensing. “This model’s license forbids commercial use, but the downstream app is commercial.” The downstream application may not even know where the model came from.
  • Benchmarks / capabilities. “Model X achieves Y on benchmark Z.” Self-reported; hard to independently verify.

Today: internal CSVs, training-run tags, and README files. Not cryptographic. Not verifiable by anyone other than the model’s producer.

Why Quidnug fits

AI artifacts have natural identities and relationships:

  • Datasets are identifiable things.
  • Models are derived from datasets + parent models.
  • Fine-tunes are derived from a specific base model and training data.
  • Inferences come from specific model versions.

This is a directed graph of signed claims. Quidnug’s trust + title + event model fits directly.

| Problem | Quidnug primitive |
| --- | --- |
| “What’s this model trained on?” | Title of model + events linking to dataset quids |
| “Was this training authorized by data owner?” | Signed event by data owner on the model title |
| “What benchmarks has this model been tested on?” | Events from independent benchmarkers |
| “Who claims this model is safe?” | Trust edges from safety attesters |
| “Is this inference from the claimed model?” | Inference output bound to model quid via signature |
| “Has this model been fine-tuned since release?” | Model’s event stream lists all descendants |

High-level architecture

┌─────────────────────────────────────────────────┐
│          ai.provenance.models (domain)          │
└─────────────────────────────────────────────────┘
      ┌──────────────────┼──────────────────┐
      │                  │                  │
      ▼                  ▼                  ▼
Dataset quids    Base-model quids    Fine-tune quids
      │                  │                  │
      │                  │                  │
      │ TITLE:           │ TITLE:           │ TITLE:
      │ "is-training-    │ "is-derived-     │ "is-derived-
      │  data-for"       │  from-dataset"   │  from-model"
      │                  │                  │
      ▼                  ▼                  ▼
Event streams:     Event streams:     Event streams:
- licensed         - training-run     - fine-tune-start
- access-granted   - benchmarks       - benchmark
- scraped          - safety-review    - deployed
                                      - inference-count

Data model

Quids

  • Dataset: each training dataset has a quid. The dataset’s owner (data curator) signs its metadata.
  • Model: each model + version is a quid. Base models, fine-tunes, and quantized versions are distinct quids.
  • Model producer: the organization that trains the model; has its own guardian set for key recovery.
  • Benchmark org: MLCommons, HELM, etc.; publishes signed benchmark results.
  • Safety attester: an independent safety auditor.
  • Rights holder: a publisher, artist, or code author with licensing claims.

Domain

ai.provenance (top)
├── ai.provenance.datasets
├── ai.provenance.models
│ ├── ai.provenance.models.foundation
│ └── ai.provenance.models.fine-tunes
├── ai.provenance.licensing
└── ai.provenance.benchmarks
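
A consumer that scopes trust queries by domain needs a way to tell whether one domain sits under another. The sketch below assumes that dot-delimited names denote containment; whether Quidnug itself inherits trust from parent to child domains is not stated here, and `in_domain` and `ancestors` are hypothetical helpers, not part of any Quidnug API.

```python
def in_domain(domain: str, ancestor: str) -> bool:
    """Return True if `domain` equals `ancestor` or sits beneath it.

    Assumption: dot-delimited names denote containment, so
    "ai.provenance.models.foundation" is inside "ai.provenance.models".
    """
    return domain == ancestor or domain.startswith(ancestor + ".")


def ancestors(domain: str) -> list[str]:
    """List a domain and each of its parents, most specific first."""
    parts = domain.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]
```

The trailing dot in the prefix check matters: without it, a look-alike such as `ai.provenance.modelsX` would be treated as a child of `ai.provenance.models`.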

Dataset title

{
"type": "TITLE",
"assetId": "dataset-common-crawl-2024-cc-main-en",
"domain": "ai.provenance.datasets",
"titleType": "training-dataset",
"owners": [{"ownerId": "common-crawl-foundation", "percentage": 100.0}],
"attributes": {
"datasetHash": "<sha256 of canonicalized dataset>",
"sizeBytes": "3.1T",
"language": "en",
"contentType": "web-text",
"collectionDate": "2024-12",
"license": "CC0",
"licenseURL": "https://commoncrawl.org/terms-of-use/",
"exclusions": ["copyrighted-ebook-shadow-library",
"known-malicious-sites"]
},
"signatures": {"common-crawl-foundation": "<sig>"}
}
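
The `datasetHash` attribute is only useful if an independent party can recompute it from the same files. The following is a minimal sketch of one deterministic scheme; the canonical form used here (a sorted manifest of per-file SHA-256 digests) is an assumption for illustration, not a canonicalization rule defined by Quidnug or by any dataset curator.

```python
import hashlib
from pathlib import Path


def dataset_hash(root: str) -> str:
    """Hash a dataset directory deterministically.

    Assumed canonical form (hypothetical): one manifest line per file,
    "<sha256 of file bytes>  <posix relative path>", sorted by path,
    then SHA-256 over the whole manifest.
    """
    base = Path(root)
    # Sort by the POSIX-style relative path so the order is the same
    # on every platform.
    files = sorted(
        (p.relative_to(base).as_posix(), p)
        for p in base.rglob("*")
        if p.is_file()
    )
    lines = []
    for rel, path in files:
        # read_bytes() loads the whole file; stream in chunks for large shards.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {rel}")
    manifest = "\n".join(lines).encode("utf-8")
    return hashlib.sha256(manifest).hexdigest()
```

Two auditors holding byte-identical copies of the dataset get the same value; any changed, added, or removed file changes it.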

Model title

{
"type": "TITLE",
"assetId": "model-acme-foundation-7b-v2",
"domain": "ai.provenance.models.foundation",
"titleType": "ai-model",
"owners": [{"ownerId": "acme-ai", "percentage": 100.0}],
"attributes": {
"modelArchitecture": "decoder-transformer",
"parameters": 7000000000,
"modelHash": "<sha256 of model weights>",
"framework": "PyTorch",
"trainingDataRefs": [
"dataset-common-crawl-2024-cc-main-en",
"dataset-acme-proprietary-licensed-books"
],
"license": "Apache-2.0",
"releaseDate": "2026-04-01",
"trainingCompute": "1.2e23 FLOPs"
},
"signatures": {"acme-ai": "<sig>"}
}

Training run events

On the model’s stream:

1. training.started
payload: { trainingDataRefs: [...], config: <hash>,
startedAt: ... }
signer: acme-ai
2. training.completed
payload: { finalLossHash: ..., checkpointsHash: ...,
totalFLOPs: ..., endedAt: ... }
signer: acme-ai
3. safety.evaluated
signer: safety-org-anthropic-evals
payload: { evaluatorOrg: ..., evaluationHash: ...,
redTeamReportHash: ..., overallRating: "acceptable" }
4. benchmark.submitted
signer: mlcommons
payload: { benchmark: "MMLU", score: 0.78, runDate: ... }
5. license.claimed
signer: rights-holder-publisher-X
payload: { claim: "model trained on copyrighted books ...",
counterclaimID: null, evidenceHash: ... }

Derivative model (fine-tune)

{
"type": "TITLE",
"assetId": "model-widgetco-finetune-for-support",
"domain": "ai.provenance.models.fine-tunes",
"owners": [{"ownerId": "widget-corp", "percentage": 100.0}],
"attributes": {
"baseModelRef": "model-acme-foundation-7b-v2",
"fineTuneData": "dataset-widget-support-tickets-private",
"modelHash": "<sha256>",
"license": "inherited + proprietary additions",
"intendedUse": "customer support chatbot"
},
"signatures": {"widget-corp": "<sig>"}
}

On this title’s stream, derivation.authorized events from the base model’s owner:

derivation.authorized
signer: acme-ai (the base model owner)
payload: { derivativeModelID: "model-widgetco-finetune-for-support",
authorizedUses: ["commercial", "non-commercial"],
termsHash: "<sha256 of license>" }
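
A consumer can walk the `baseModelRef` links and confirm that each derivative carries an authorization signed by the owner of its base. The sketch below works on plain dictionaries shaped like the titles and events above; it assumes signatures were already verified when the events were fetched, that each event exposes its signer as `creator`, and `derivation_problems` is a hypothetical helper.

```python
def derivation_problems(titles: dict, events: dict, model_id: str) -> list[str]:
    """Walk baseModelRef links and report derivatives lacking authorization.

    titles: assetId -> title dict (with "owners" and "attributes").
    events: assetId -> list of event dicts on that title's stream.
    """
    problems = []
    current, seen = model_id, set()
    while current not in seen:  # the seen-set guards against reference cycles
        seen.add(current)
        title = titles.get(current)
        if title is None:
            problems.append(f"{current}: title not found")
            break
        base = title["attributes"].get("baseModelRef")
        if base is None:
            break  # reached a foundation model
        base_title = titles.get(base)
        base_owners = (
            {o["ownerId"] for o in base_title["owners"]} if base_title else set()
        )
        # The authorization only counts if an owner of the base signed it.
        authorized = any(
            ev["eventType"] == "derivation.authorized"
            and ev["creator"] in base_owners
            for ev in events.get(current, [])
        )
        if not authorized:
            problems.append(f"{current}: no derivation.authorized from owner of {base}")
        current = base
    return problems
```

An empty list means every link in the chain is backed by its base model's owner; anything else is a reason to pause before deployment.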

Inference attestation

When a model serves an inference, it can emit a signed inference.ran event:

eventType: "inference.ran"
subjectId: <model quid>
payload: {
inferenceID: "inf-abc-123",
requestHash: "<sha256 of prompt>",
responseHash: "<sha256 of response>",
timestamp: ...,
computeEnv: "acme-gpu-cluster-us-east"
}
signer: model producer

A downstream consumer can verify: “The inference I received was produced by model X, running at time T.” No one can forge an inference claim from a model without that model’s producer’s key.
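
On the consumer side, the content check reduces to recomputing the two hashes and comparing them with the event payload. This is a minimal sketch that assumes the hashes are hex-encoded SHA-256 over the raw prompt and response bytes, and that the event's signature has already been verified against the producer's key by other means.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_attestation(prompt: bytes, response: bytes, payload: dict) -> bool:
    """True if the prompt and response hash to the values in the event payload."""
    return (
        sha256_hex(prompt) == payload.get("requestHash")
        and sha256_hex(response) == payload.get("responseHash")
    )
```

A response altered in transit, even by one byte, fails the comparison.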

Consumer trust

A downstream application (e.g., an LLM-powered customer support product):

func (app *App) EvaluateModel(modelID string) ModelAssessment {
	events := app.quidnug.GetSubjectEvents(modelID, "TITLE")

	// Index contests by the claim they answer. Events are immutable, so a
	// contest arrives as a separate license.contested event.
	contested := make(map[string]bool)
	for _, ev := range events {
		if ev.EventType == "license.contested" {
			if id, ok := ev.Payload["contestsClaimID"].(string); ok {
				contested[id] = true
			}
		}
	}

	var safetyOK, hasUnresolvedLicenseClaims bool
	for _, ev := range events {
		switch ev.EventType {
		case "safety.evaluated":
			// Weigh the signer (ev.Creator) via relational trust, not the
			// self-declared evaluatorOrg in the payload.
			trust, err := app.quidnug.GetTrust(app.quid, ev.Creator,
				"ai.provenance.safety", nil)
			if err == nil && trust.TrustLevel >= 0.8 {
				safetyOK = true
			}
		case "license.claimed":
			if contested[ev.ID] {
				continue
			}
			// Uncontested copyright claim: risky if the claimant is credible.
			claimantTrust, err := app.quidnug.GetTrust(app.quid, ev.Creator,
				"ai.provenance.licensing", nil)
			if err == nil && claimantTrust.TrustLevel >= 0.5 {
				hasUnresolvedLicenseClaims = true
			}
		}
	}
	return ModelAssessment{
		SafetyVerified:          safetyOK,
		UnresolvedLicenseIssues: hasUnresolvedLicenseClaims,
		ReadyForProduction:      safetyOK && !hasUnresolvedLicenseClaims,
	}
}

Counter-attestations

Disputes happen. A rights holder files a license.claimed event. The model producer can respond with a license.contested event:

eventType: "license.contested"
payload: {
contestsClaimID: <earlier event ID>,
evidence: <hash>,
arguments: "Model was trained on publicly available
summaries, not full text. Summaries are
transformative under fair use..."
}

Both claim and contest live in the record. Consumers weigh their trust in both parties. Courts (if it gets there) have a full signed evidence chain.
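
Pairing each claim with its contests, and attaching the consumer's trust in each signer, could look like the sketch below. It assumes each event carries an `id` and a `creator` field, and that `trust_of` is a caller-supplied function returning the consumer's trust level for a quid; both are illustrative assumptions rather than documented Quidnug structures.

```python
def summarize_disputes(events: list, trust_of) -> list:
    """Pair every license.claimed event with the contests that answer it."""
    contests = {}
    for ev in events:
        if ev["eventType"] == "license.contested":
            contests.setdefault(ev["payload"]["contestsClaimID"], []).append(ev)

    summary = []
    for ev in events:
        if ev["eventType"] != "license.claimed":
            continue
        replies = contests.get(ev["id"], [])
        summary.append({
            "claimId": ev["id"],
            "claimant": ev["creator"],
            "claimantTrust": trust_of(ev["creator"]),
            # Each contest is listed with the contesting party's trust level.
            "contestedBy": [(r["creator"], trust_of(r["creator"])) for r in replies],
        })
    return summary
```

A claim with an empty `contestedBy` list is still open; a contested claim is not resolved, only answered, so the consumer compares the two trust levels and decides.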

Key Quidnug features

  • Title-of-title hierarchy: datasets, base models, and fine-tunes all have titles; event links connect them into a DAG.
  • Event streams per artifact: training runs, safety evals, benchmarks, license claims.
  • Domain hierarchy: scope trust by dataset provenance vs. model safety vs. licensing.
  • Relational trust: different consumers trust different safety orgs / benchmarkers.
  • Guardian sets: the model producer’s signing keys are recoverable (a lab’s HSM loss shouldn’t orphan all its published models).
  • Push gossip: new claims (especially safety and license) propagate immediately.

Value delivered

| Dimension | Before | With Quidnug |
| --- | --- | --- |
| Dataset provenance | README files | Signed title + hash; verifiable |
| Model-to-dataset linkage | Blog posts | Signed derivation relationship |
| Safety attestation | Internal labs / private audits | On-chain claims from independent attesters |
| License dispute evidence | Emails, depositions | Signed claim/counterclaim chain |
| Benchmark result verification | Self-reported | Benchmark org’s signed event |
| Fine-tune authorization | Contract + trust | derivation.authorized event |
| Inference authenticity | Rely on endpoint | Signed inference event |
| Consumer evaluation | Vendor’s marketing | Algorithmic: trust × attestations |

What’s in this folder

Runnable POC

Full end-to-end demo at examples/ai-model-provenance/:

  • model_provenance.py: a pure verifier covering the producer-trust gate, derivative base-model gate, dataset-license filter, safety strictness, and benchmark requirement.
  • model_provenance_test.py: 14 pytest cases.
  • demo.py: an eight-step end-to-end flow covering accept foundation, accept derivative, reject prohibited dataset, and warn on missing safety.
cd examples/ai-model-provenance
python demo.py

Implementation

Concrete API calls, pseudocode, signing shape.

Implementation: AI Model Provenance

1. Register a dataset

curl -X POST $NODE/api/identities -d '{
"quidId":"common-crawl-foundation",
"name":"Common Crawl Foundation",
"homeDomain":"ai.provenance.datasets",
"creator":"common-crawl-foundation","updateNonce":1
}'
# The dataset itself is a TITLE owned by the curator
curl -X POST $NODE/api/v1/titles -d '{
"assetId":"dataset-common-crawl-2024-cc-main-en",
"domain":"ai.provenance.datasets",
"titleType":"training-dataset",
"owners":[{"ownerId":"common-crawl-foundation","percentage":100.0}],
"attributes":{
"datasetHash":"<sha256>",
"sizeBytes":"3.1T",
"language":"en",
"license":"CC0",
"exclusions":["copyrighted-shadow-libraries"]
},
"signatures":{"common-crawl-foundation":"<sig>"}
}'

2. Register a base model

# Identity for the lab
curl -X POST $NODE/api/identities -d '{
"quidId":"acme-ai-labs",
"name":"Acme AI Labs",
"homeDomain":"ai.provenance.models.foundation",
"creator":"acme-ai-labs","updateNonce":1
}'
# Install a guardian set for the lab (HSM failures happen)
curl -X POST $NODE/api/v2/guardian/set-update -d '{ /* ... */ }'
# The model itself
curl -X POST $NODE/api/v1/titles -d '{
"assetId":"model-acme-foundation-7b-v2",
"domain":"ai.provenance.models.foundation",
"titleType":"ai-model",
"owners":[{"ownerId":"acme-ai-labs","percentage":100.0}],
"attributes":{
"modelArchitecture":"decoder-transformer",
"parameters":7000000000,
"modelHash":"<sha256 of weights>",
"license":"Apache-2.0",
"trainingDataRefs":[
"dataset-common-crawl-2024-cc-main-en",
"dataset-acme-proprietary-licensed-books"
],
"releaseDate":"2026-04-01"
},
"signatures":{"acme-ai-labs":"<sig>"}
}'

3. Training run events

# When training begins
curl -X POST $NODE/api/v1/events -d '{
"subjectId":"model-acme-foundation-7b-v2",
"subjectType":"TITLE",
"eventType":"training.started",
"payload":{
"trainingDataRefs":["dataset-common-crawl-2024-cc-main-en"],
"configHash":"<sha256 of training config>",
"startedAt":1713400000,
"expectedCompletion":1716000000,
"computeEnv":"acme-gpu-cluster-us-east"
},
"creator":"acme-ai-labs","signature":"<sig>"
}'
# When training completes
curl -X POST $NODE/api/v1/events -d '{
"subjectId":"model-acme-foundation-7b-v2",
"subjectType":"TITLE",
"eventType":"training.completed",
"payload":{
"finalLossHash":"<sha256>",
"checkpointsHash":"<sha256>",
"totalFLOPs":"1.2e23",
"endedAt":1716000000
},
"creator":"acme-ai-labs","signature":"<sig>"
}'

4. Safety evaluation

An independent safety org (e.g., anthropic-evals-team) runs tests and publishes:

curl -X POST $NODE/api/v1/events -d '{
"subjectId":"model-acme-foundation-7b-v2",
"subjectType":"TITLE",
"eventType":"safety.evaluated",
"payload":{
"evaluatorOrg":"anthropic-evals-team",
"evaluationHash":"<sha256 of full report>",
"redTeamReportHash":"<sha256>",
"overallRating":"acceptable",
"knownIssues":["occasional-hallucination-on-math-problems"],
"evaluationDate":1716100000
},
"creator":"anthropic-evals-team","signature":"<sig>"
}'

Anthropic Evals receives trust edges from whoever views them as authoritative. Consumers doing their own trust evaluation weigh the Anthropic Evals signature by their own trust in that evaluator.

5. Benchmark submissions

MLCommons, HELM, or other benchmark orgs run tests:

curl -X POST $NODE/api/v1/events -d '{
"subjectId":"model-acme-foundation-7b-v2",
"subjectType":"TITLE",
"eventType":"benchmark.submitted",
"payload":{
"benchmark":"MMLU",
"score":0.78,
"benchmarkVersion":"2024.04",
"runDate":1716200000,
"fullResultsHash":"<sha256>"
},
"creator":"mlcommons","signature":"<sig>"
}'

6. Derivative (fine-tune) authorization

Widget Corp fine-tunes Acme’s model:

# First register the fine-tune as a title
curl -X POST $NODE/api/v1/titles -d '{
"assetId":"model-widgetco-finetune-v1",
"domain":"ai.provenance.models.fine-tunes",
"titleType":"ai-model",
"owners":[{"ownerId":"widget-corp","percentage":100.0}],
"attributes":{
"baseModelRef":"model-acme-foundation-7b-v2",
"fineTuneDataRef":"dataset-widget-support-tickets",
"intendedUse":"customer support",
"license":"proprietary",
"modelHash":"<sha256>"
},
"signatures":{"widget-corp":"<sig>"}
}'
# Acme signs an authorization event on the fine-tune title
curl -X POST $NODE/api/v1/events -d '{
"subjectId":"model-widgetco-finetune-v1",
"subjectType":"TITLE",
"eventType":"derivation.authorized",
"payload":{
"baseModelRef":"model-acme-foundation-7b-v2",
"derivativeModelRef":"model-widgetco-finetune-v1",
"authorizedUses":["commercial-internal","non-commercial-research"],
"forbiddenUses":["generative-content-for-resale"],
"licenseTermsHash":"<sha256 of full license terms doc>"
},
"creator":"acme-ai-labs","signature":"<sig>"
}'

Without Acme’s signed authorization, the fine-tune’s event stream lacks a derivation.authorized event. Downstream consumers that rely on that authorization can detect its absence.

7. Inference attestation

When the production service runs an inference:

type InferenceAttestation struct {
	InferenceID  string
	ModelRef     string
	RequestHash  string
	ResponseHash string
	Timestamp    int64
	ComputeEnv   string
}

func (s *InferenceServer) AttestInference(req InferenceRequest, resp InferenceResponse) error {
	event := map[string]interface{}{
		"subjectId":   s.modelQuid,
		"subjectType": "TITLE",
		"eventType":   "inference.ran",
		"payload": map[string]interface{}{
			"inferenceID":  req.ID,
			"requestHash":  sha256sum(req),
			"responseHash": sha256sum(resp),
			"timestamp":    time.Now().Unix(),
			"computeEnv":   s.computeEnv,
		},
		"creator":   s.operatorQuid,
		"signature": s.sign(/* canonical bytes */),
	}
	return s.submitEvent(event)
}

Inference consumers can later verify that a response attributed to model X at time T really came from it. Useful for:

  • AI-generated content attribution
  • Regulatory compliance (“which model produced this recommendation?”)
  • Debugging: “Did the right model handle this request?”
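
The signature in the code above covers "canonical bytes", which both signer and verifier must derive identically. One common approach is sorted-key, whitespace-free JSON with the signature field removed; the sketch below shows that approach as an assumption, since the canonicalization Quidnug actually uses is not specified in this document.

```python
import json


def canonical_bytes(event: dict) -> bytes:
    """Serialize an event deterministically for signing or verification.

    Assumed scheme (not confirmed as Quidnug's): drop the "signature"
    field, sort keys at every level, and use compact separators.
    """
    unsigned = {k: v for k, v in event.items() if k != "signature"}
    text = json.dumps(
        unsigned, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")
```

Key order in the input no longer matters, so two implementations that build the same event in a different field order sign the same bytes. Floating-point values such as benchmark scores can still serialize differently across languages, which is one reason to prefer strings or integers in signed payloads.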

8. License claim and contest

A publisher detects content from their books in the model’s outputs:

curl -X POST $NODE/api/v1/events -d '{
"subjectId":"model-acme-foundation-7b-v2",
"subjectType":"TITLE",
"eventType":"license.claimed",
"payload":{
"claimType":"copyright-violation",
"claimantJurisdiction":"US",
"evidenceHash":"<sha256>",
"affectedWorks":["isbn-1234567890","isbn-1234567891"],
"demandedRemedy":"cease + statutory damages"
},
"creator":"publisher-x","signature":"<sig>"
}'

Acme contests:

curl -X POST $NODE/api/v1/events -d '{
"subjectId":"model-acme-foundation-7b-v2",
"subjectType":"TITLE",
"eventType":"license.contested",
"payload":{
"contestsClaimID":"<event ID of claim>",
"argumentsHash":"<sha256 of response brief>",
"evidenceHash":"<sha256 of training-data audit>"
},
"creator":"acme-ai-labs","signature":"<sig>"
}'

9. Consumer-side evaluation

func (c *Consumer) PreflightModel(modelID string) PreflightReport {
	events := c.GetEvents(modelID, "TITLE")
	report := PreflightReport{ModelID: modelID}
	for _, ev := range events {
		switch ev.EventType {
		case "safety.evaluated":
			// Trust is computed for the signer (ev.Creator). The payload's
			// evaluatorOrg is self-declared, so it must match the signer.
			evaluator, _ := ev.Payload["evaluatorOrg"].(string)
			rating, ok := ev.Payload["overallRating"].(string)
			if !ok || evaluator != ev.Creator {
				continue
			}
			trust := c.GetTrust(c.selfQuid, ev.Creator, "ai.provenance.safety")
			report.SafetyAttestations = append(report.SafetyAttestations,
				SafetyRecord{Evaluator: evaluator, Rating: rating, Trust: trust.TrustLevel})
		case "benchmark.submitted":
			// Comma-ok assertions avoid a panic on malformed payloads.
			bench, okBench := ev.Payload["benchmark"].(string)
			score, okScore := ev.Payload["score"].(float64)
			if !okBench || !okScore {
				continue
			}
			signerTrust := c.GetTrust(c.selfQuid, ev.Creator, "ai.provenance.benchmarks")
			report.Benchmarks = append(report.Benchmarks,
				BenchmarkResult{Benchmark: bench, Score: score, ReporterTrust: signerTrust.TrustLevel})
		case "license.claimed":
			// A claim stays open until a license.contested event references it.
			if !c.hasContestEvent(events, ev.ID) {
				report.OpenLicenseClaims = append(report.OpenLicenseClaims, ev)
			}
		}
	}
	return report
}

10. Model key rotation (producer lost HSM)

Acme’s signing HSM fails. Initiate guardian recovery:

curl -X POST $NODE/api/v2/guardian/recovery/init -d '{
"subjectQuid":"acme-ai-labs",
"fromEpoch":0,
"toEpoch":1,
"newPublicKey":"<hex>",
"minNextNonce":1,
"maxAcceptedOldNonce":0,
"anchorNonce":<next>,
"validFrom":<now>,
"guardianSigs":[ /* Acme's CEO, CTO, CISO */ ]
}'

Post-rotation, downstream consumers can still verify Acme’s historical events (those were signed with the old-epoch key, which remains recorded in the ledger). New events use the new-epoch key.

11. Testing

func TestModelProvenance_DerivationChainVerification(t *testing.T) {
	// Register dataset, base model, fine-tune, auth event
	// Verify: consumer traversing from fine-tune can reach
	// original dataset + all safety attestations
}

func TestModelProvenance_UnauthorizedFineTuneDetectable(t *testing.T) {
	// Fine-tune title created without derivation.authorized event
	// Consumer's preflight: flags missing authorization
}

func TestModelProvenance_LicenseClaimContest(t *testing.T) {
	// Publisher files claim; Acme contests
	// Consumer sees both; can decide
}

Where to go next

Threat model

Adversaries, assumed capabilities, mitigations.

Threat Model: AI Model Provenance

Assets

  1. Provenance integrity: the cryptographic record of dataset, training, and fine-tune relationships.
  2. Safety attestations: signed claims from evaluators.
  3. License-dispute evidence: the full chain of claims and counter-claims.
  4. Model producer reputations: a producer who has shipped many well-attested safe models builds trust.

Attackers

| Attacker | Capability | Goal |
| --- | --- | --- |
| Rogue model producer | Their own signing key | False safety claims, hide IP issues |
| Competitor | No access to producer’s keys | Smear via false license claims |
| Fake “evaluator” | Spins up a new quid claiming to be a safety org | Bogus safety endorsements |
| Data subject | Has their own content in training data | Force takedown via false claims |
| End user | Consumes inferences | Verify authenticity |

Threats and mitigations

T1. Producer falsely claims safety

Attack. Acme self-publishes a safety.evaluated event claiming that an independent evaluator endorsed the model, but signs it with its own key.

Mitigation.

  • Signer verification. The event is signed by whoever submitted it. If Acme submits, only Acme’s key matches. Consumers looking for “evaluator’s own attestation” check creator on the event, not just the content.
  • Relational trust in evaluator. A consumer’s own trust in the evaluator determines weight. Acme self-vouching counts as… Acme self-vouching, weighted only by consumer’s trust in Acme.

Residual risk. None structural. Consumer needs to understand who signs what.
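
The signer check can be made mechanical. The sketch below keeps only safety events whose signer is not an owner of the model and whose payload names that same signer as the evaluator. It assumes events expose the signer as `creator` and that signatures were verified upstream; `independent_safety_events` is a hypothetical helper.

```python
def independent_safety_events(title: dict, events: list) -> list:
    """Filter safety.evaluated events down to genuine third-party attestations."""
    owners = {o["ownerId"] for o in title["owners"]}
    kept = []
    for ev in events:
        if ev["eventType"] != "safety.evaluated":
            continue
        creator = ev["creator"]
        if creator in owners:
            continue  # the producer vouching for its own model
        if ev["payload"].get("evaluatorOrg") != creator:
            continue  # payload names an evaluator who did not sign
        kept.append(ev)
    return kept
```

What survives the filter still has to be weighed by the consumer's own trust in each signer.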

T2. Fake evaluator quid

Attack. Attacker creates a quid named “Anthropic-Evals-Official” and publishes endorsements of a malicious model.

Mitigation.

  • Trust edges must be issued to the quid by trusted parties: consumers don’t trust a quid just because of its name. They trust it because other entities they trust have declared trust in it.
  • Domain ownership (if configured): the ai.provenance.safety domain’s validators can prevent random quids from claiming to be safety evaluators.

Residual risk. Social engineering (naming tricks) can confuse uninformed consumers. Mitigated by tooling that shows “trust path” prominently in UI.
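
Tooling that surfaces a trust path needs to find one first. A breadth-first search over the consumer's known trust edges is enough for display purposes. The sketch below is not Quidnug's trust computation; the edge map shape, the `min_level` cutoff, and the quid names in the usage note are assumptions for illustration.

```python
from collections import deque


def trust_path(edges: dict, source: str, target: str, min_level: float = 0.5):
    """Return the shortest chain of trust edges from source to target, or None.

    edges: truster quid -> {trustee quid: trust level}.
    Only edges at or above min_level are followed.
    """
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt, level in edges.get(node, {}).items():
            if level >= min_level and nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None
```

With edges `{"app": {"industry-consortium": 0.9}, "industry-consortium": {"anthropic-evals-team": 0.8}}`, the search from `app` reaches `anthropic-evals-team` through the consortium and finds nothing for a look-alike quid such as `Anthropic-Evals-Official`, which is exactly what a UI should show.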

T3. Competitor smear via false license claim

Attack. Competitor publishes a license.claimed event claiming Acme violated their copyright (fabricated).

Mitigation.

  • Acme can contest with license.contested event.
  • Both claim and contest are visible; consumers weigh both by their trust in each party.
  • Frivolous claims from low-trust entities are de-prioritized.

Residual risk. Reputational. A false claim visible on-chain may chill adoption even if contested. Market dynamics.

T4. Model producer’s key compromise

Attack. Acme’s signing key is stolen. Attacker publishes fake derivation events or fake benchmark results.

Mitigation.

  • Guardian recovery rotates Acme’s key. Post-rotation, new signatures made with the old-epoch key are rejected.
  • Anchor nonces: even with the old key, the attacker can’t replay or reuse a signature.
  • Quick invalidation path: immediate epoch freeze via invalidation anchor.

Residual risk. Window between compromise and rotation. During window, attacker can publish events with old-epoch sig. Mitigated by monitoring (event-rate anomalies).
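
Event-rate monitoring can start very simply: bucket a signer's event timestamps into fixed windows and flag any window whose count far exceeds the average of earlier windows. The sketch below is a deliberately crude illustration; the one-hour window and the factor of 5 are arbitrary assumptions, and windows with no events are skipped rather than counted as zero.

```python
from collections import Counter


def anomalous_windows(timestamps: list, window: int = 3600, factor: float = 5.0) -> list:
    """Return start times of windows whose event count spikes above baseline.

    timestamps: Unix seconds (integers) of events signed by one quid.
    A window is flagged when its count exceeds `factor` times the mean
    count of all earlier non-empty windows.
    """
    counts = Counter(ts // window for ts in timestamps)
    flagged = []
    history = []
    for bucket in sorted(counts):
        n = counts[bucket]
        if history:
            baseline = sum(history) / len(history)
            if n > factor * baseline:
                flagged.append(bucket * window)
        history.append(n)
    return flagged
```

A flagged window is a prompt for a human to look, and possibly to trigger the invalidation path, not proof of compromise.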

T5. Dataset hash forgery

Attack. Acme registers a dataset title with a fake datasetHash. Claims to have trained on a clean dataset when in fact they trained on something else.

Mitigation.

  • Dataset hash is deterministic. Independent auditors can replicate the hash given the dataset. A fake hash can be caught in audit.
  • Safety evaluator’s report (if thorough) audits the claimed training data. A safety.evaluated event from a high-trust evaluator includes this.

Residual risk. If the dataset is proprietary and no one can independently re-hash, the claim is on Acme’s word alone. Mitigated by audit norms.

T6. Fine-tune without authorization

Attack. Someone fine-tunes Acme’s model without authorization, then registers the fine-tune as their own.

Mitigation.

  • Missing derivation.authorized event is detectable. Consumers’ pre-flight checks flag it.
  • Licensing enforcement is ultimately legal; Quidnug provides the evidence trail.

T7. Inference forgery

Attack. Someone serves inferences from model X and claims they’re from model Y.

Mitigation.

  • Inference events signed by the model producer. Fake inferences would need the producer’s signing key.
  • Response hash binds the inference content to the attestation; altering the response breaks the hash match.

Residual risk. If the producer legitimately serves inferences that middlemen then repackage, the middlemen can misattribute them. Consumers need to verify inference event signatures rather than trust middlemen.

T8. Privacy: what can be inferred from on-chain data?

Concern. The event stream reveals:

  • Which models exist
  • When they were trained
  • Who evaluated them
  • Which datasets they used
  • Inference counts

Mitigation / reality.

  • Metadata is public by design; that’s the point of provenance.
  • Sensitive training data stays OFF-chain; only hashes are published.
  • For inference privacy, emit batch events (“10,000 inferences this hour”) rather than per-inference if volume is sensitive.
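
Batching could be done by grouping inferences into fixed windows and publishing one event per window. In the sketch below, the `inference.batch` event type and its payload fields are hypothetical (the document only defines `inference.ran`); the digest over the sorted response hashes lets the producer later show which inferences a batch covered by revealing the list.

```python
import hashlib
from collections import defaultdict


def batch_inference_events(model_id: str, inferences, window: int = 3600) -> list:
    """Collapse per-inference records into one unsigned event per time window.

    inferences: iterable of (unix_timestamp, response_hash_hex) pairs.
    """
    buckets = defaultdict(list)
    for ts, response_hash in inferences:
        buckets[ts - ts % window].append(response_hash)

    events = []
    for start in sorted(buckets):
        hashes = sorted(buckets[start])
        digest = hashlib.sha256("\n".join(hashes).encode("ascii")).hexdigest()
        events.append({
            "subjectId": model_id,
            "subjectType": "TITLE",
            "eventType": "inference.batch",  # hypothetical event type
            "payload": {
                "windowStart": start,
                "windowSeconds": window,
                "inferenceCount": len(hashes),
                "batchDigest": digest,
            },
        })
    return events
```

The per-window count is still public; widening the window trades verification granularity for less volume leakage.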

T9. Replay

Attack. Attacker replays old events.

Mitigation. Anchor nonce + dedup. Same as every other use case.
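
The general shape of nonce-plus-dedup replay protection is sketched below. It assumes a strictly increasing nonce per signer and a unique event ID; Quidnug's actual anchor-nonce semantics are not described in this document, so treat `ReplayGuard` as an illustration of the idea only.

```python
class ReplayGuard:
    """Reject events that were already seen or that carry a stale nonce."""

    def __init__(self):
        self._last_nonce = {}   # signer quid -> highest accepted nonce
        self._seen_ids = set()  # event IDs already accepted

    def accept(self, signer: str, nonce: int, event_id: str) -> bool:
        if event_id in self._seen_ids:
            return False  # exact duplicate
        if nonce <= self._last_nonce.get(signer, 0):
            return False  # old or reused nonce
        self._seen_ids.add(event_id)
        self._last_nonce[signer] = nonce
        return True
```

A replayed event fails on its ID, and an old signature repackaged under a new ID fails on its nonce.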

T10. Fork-block abuse

Attack. Consortium fork-block changes provenance rules in a way that hides producer accountability.

Mitigation. 2/3 quorum + notice period. See ../institutional-custody/threat-model.md.

Not defended against

  1. Physical-world copyright. Whether a model’s output “substantially resembles” copyrighted training data is a legal question, not a cryptographic one.

  2. Model weight theft. If an attacker steals Acme’s model weights, Quidnug can’t prevent it. But if they try to publish the weights under a new quid, there is no derivation.authorized event from Acme, so their claim is trivially traceable.

  3. Fine-grained safety. “Safe for adults but not kids” is attribute-level granularity beyond a single event. Multiple signed evaluations with distinct contexts handle this; the protocol supports it but needs consumer-side aggregation logic.

  4. Regulatory-mandated provenance. If the EU AI Act requires specific fields we don’t yet have, add them. Extensible by design.
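
The consumer-side aggregation mentioned under fine-grained safety could be as small as grouping trusted evaluations by context. The sketch below assumes a `context` field in the `safety.evaluated` payload, which is hypothetical and not part of the payloads shown earlier, plus a caller-supplied `trust_of` function.

```python
def ratings_by_context(events: list, trust_of, min_trust: float = 0.8) -> dict:
    """Group overallRating values from trusted evaluators by payload context."""
    grouped = {}
    for ev in events:
        if ev["eventType"] != "safety.evaluated":
            continue
        if trust_of(ev["creator"]) < min_trust:
            continue  # ignore evaluators this consumer does not trust enough
        context = ev["payload"].get("context", "general")  # hypothetical field
        grouped.setdefault(context, []).append(ev["payload"]["overallRating"])
    return grouped


def acceptable_for(events: list, trust_of, context: str, min_trust: float = 0.8) -> bool:
    """True only if at least one trusted rating exists and all are "acceptable"."""
    ratings = ratings_by_context(events, trust_of, min_trust).get(context, [])
    return bool(ratings) and all(r == "acceptable" for r in ratings)
```

Requiring at least one trusted rating per context means a model with no evaluation for minors is treated as unverified for that audience rather than as safe by default.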
