Different data model. Different buyer.

We’re not trying to be a better Datadog or a faster Vanta. None of these vendors ships the agent execution graph as a first-class, tamper-evident audit object today. Here is exactly where the lines fall.

Auditor question | Dev observability (Datadog · Langfuse · LangSmith · Arize) | GRC platforms (Vanta · Drata · OneTrust) | AI governance (Credo AI · Holistic AI · ValidMind) | Runfile
Reconstruct the decision end-to-end | ◐ Spans, 14-day | ✗ Not modelled | ◐ Policy layer | Execution graph
Prove the log hasn’t been touched | ✗ Mutable | ✗ Mutable | ✗ Mutable | Hash-chained · signed · anchored
Map an event to a specific control | ✗ No mapping | ◐ Checklist-grade | ◐ Policy-grade | Event → predicate
Retain for 6 months to 10 years | ✗ 14-day default | ◐ Document-grade | ◐ Document-grade | Tiered, S3 Object Lock
Verify offline, by a third party | | | | verify.sh + public anchor
Produce a single auditor-facing artefact | ✗ Dashboards | ◐ Evidence rooms | ◐ Policy reports | Signed PDF + JSONL + verifier
Data residency for EU / UK / US ◐ Patchy Day one, single-tenant
Bills on what you ship, not on volume | ✗ Per-span / per-GB | ✓ Seat / framework | ✓ Framework | Per execution + retention tier
Who buys it | Platform Eng | CISO · Compliance | Head of AI Governance | CCO + Internal Audit
§ Datadog · Langfuse · LangSmith · Arize

Built to debug. Not to defend.

Best-in-class for engineering: traces, evals, prompt iteration, latency hunts. The data model is the OpenTelemetry trace: (trace_id, span_id, input, output, latency). Default retention is 14–15 days. The logs are mutable, billed by span volume, and not mapped to any control. Keep them; Runfile reads OTel and rides alongside.
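To make "mutable" versus "hash-chained" concrete, here is a minimal sketch of hash chaining over trace-shaped records. It illustrates the general technique only, not Runfile's actual record format: the field names `prev_hash` and `hash`, the `GENESIS` value and both function names are assumptions, and signing and public anchoring are left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def _digest(body):
    # Canonical JSON (sorted keys) so the same record always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def chain_records(records):
    """Return copies of the records, each linked to the hash of the one before it."""
    chained, prev = [], GENESIS
    for record in records:
        body = dict(record, prev_hash=prev)
        entry = dict(body, hash=_digest(body))
        chained.append(entry)
        prev = entry["hash"]
    return chained


def verify_chain(chained):
    """Return the index of the first record that fails verification, or None."""
    prev = GENESIS
    for i, entry in enumerate(chained):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev or entry["hash"] != _digest(body):
            return i
        prev = entry["hash"]
    return None


# Hypothetical spans using the tuple shape named above.
spans = [
    {"trace_id": "t1", "span_id": "s1", "input": "q", "output": "a", "latency": 120},
    {"trace_id": "t1", "span_id": "s2", "input": "a", "output": "b", "latency": 80},
]
log = chain_records(spans)
```

Editing any field of an earlier record changes its hash, so `verify_chain` reports the index where the chain first breaks. A chain alone only shows internal consistency; detecting a wholesale rewrite needs the signature and external anchor that this sketch omits.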

§ Vanta · Drata · OneTrust

Built for checklists. Not for runs.

Vanta hit $300M ARR in April 2026; Drata is at $100M; OneTrust at half a billion. They are excellent at SOC 2 evidence rooms and the policy/inventory layer. The data model is (control_id, evidence_artifact, owner, status). They do not model the agent run. Runfile feeds them, not the other way around.
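One way to picture "Runfile feeds them" is a run event being tested against a control predicate and emitted as a row in the (control_id, evidence_artifact, owner, status) shape described above. This is a hedged sketch, not either product's API: the `Control` class, the `evidence_rows` function, the event fields and the control ID are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Control:
    control_id: str
    owner: str
    event_type: str                    # which kind of event this control applies to
    predicate: Callable[[dict], bool]  # True when the event satisfies the control


def evidence_rows(controls, events):
    """Map each applicable event to a (control_id, evidence_artifact, owner, status) row."""
    rows = []
    for event in events:
        for control in controls:
            if event["type"] != control.event_type:
                continue
            status = "pass" if control.predicate(event) else "fail"
            rows.append((control.control_id, event["event_id"], control.owner, status))
    return rows


# Hypothetical control: every tool call must carry a human approver.
controls = [
    Control(
        control_id="HUMAN-APPROVAL-01",
        owner="internal-audit",
        event_type="tool_call",
        predicate=lambda e: e.get("approved_by") is not None,
    )
]
events = [
    {"event_id": "e1", "type": "tool_call", "approved_by": "j.doe"},
    {"event_id": "e2", "type": "tool_call", "approved_by": None},
    {"event_id": "e3", "type": "retrieval"},
]
rows = evidence_rows(controls, events)
```

Here the evidence artefact is simply the event ID; the point is that each row traces back to a specific event in a specific run rather than to an uploaded document.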

§ Credo AI · Holistic AI · ValidMind

Built for policy. Not for proof.

Credo AI’s “Agent Registry,” Holistic’s bias and red-team toolkits, ValidMind’s bank-MRM evidence are real and useful. They generate audit artefacts at the policy layer. The agent execution graph — the actual sequence of prompts, tools, retrievals, refusals and approvals — is not their object. It is ours.
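As a rough illustration of what treating the execution graph as the object could mean, the sketch below models steps as nodes with causal parents and reconstructs everything a given step depended on. The `Step` type, the `lineage` function and the sample run are assumptions made for illustration, not Runfile's schema.

```python
from dataclasses import dataclass, field

# Step kinds taken from the sequence named above.
KINDS = {"prompt", "tool", "retrieval", "refusal", "approval"}


@dataclass
class Step:
    step_id: str
    kind: str
    parents: list = field(default_factory=list)  # ids of steps this one depends on

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown step kind: {self.kind}")


def lineage(steps, target_id):
    """Return the ids of every step the target depends on, parents first, target last."""
    by_id = {s.step_id: s for s in steps}
    ordered, seen = [], set()

    def visit(step_id):
        if step_id in seen:
            return
        seen.add(step_id)
        for parent in by_id[step_id].parents:
            visit(parent)
        ordered.append(step_id)  # appended only after all of its parents

    visit(target_id)
    return ordered


# Hypothetical run: a prompt fans out to a retrieval and an approved tool call,
# which both feed a second prompt; a refusal branches off separately.
run = [
    Step("p1", "prompt"),
    Step("r1", "retrieval", ["p1"]),
    Step("t1", "tool", ["p1"]),
    Step("a1", "approval", ["t1"]),
    Step("p2", "prompt", ["r1", "a1"]),
    Step("x1", "refusal", ["p1"]),
]
path = lineage(run, "p2")
```

The traversal returns only the steps that causally fed the target, so the unrelated refusal branch is excluded from the reconstruction of `p2`.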

— 01
You’re a consumer app with no regulated workload and no contractual evidence obligations.
Buy LangSmith or Langfuse.
— 02
You need real-time guardrails — block a bad output before it leaves the model.
Buy Lakera or Galileo. Runfile records; it does not gate.
— 03
You need a SOC 2 evidence room and policy library, no agents in scope yet.
Vanta or Drata is the right shape today.
§ Next step

Bring the comparison to your auditor.

We’ll do a 30-minute walk-through with whoever signs off your model risk programme. If they have an obvious objection, we’d like to hear it.