§ Datadog · Langfuse · LangSmith · Arize
Built to debug. Not to defend.
Best-in-class for engineering: traces, evals, prompt iteration, latency hunts. The data model is the OpenTelemetry trace: (trace_id, span_id, input, output, latency). Default retention is 14–15 days. The logs are mutable, billed by span volume, and not mapped to any compliance control. Keep them; Runfile reads OTel and rides alongside.
§ Vanta · Drata · OneTrust
Built for checklists. Not for runs.
Vanta hit $300M ARR in April 2026; Drata is at $100M; OneTrust is at $500M. They are excellent at SOC 2 evidence rooms and the policy/inventory layer. The data model is (control_id, evidence_artifact, owner, status). They do not model the agent run. Runfile feeds them, not the other way around.
§ Credo AI · Holistic AI · ValidMind
Built for policy. Not for proof.
Credo AI’s “Agent Registry,” Holistic AI’s bias and red-team toolkits, and ValidMind’s bank-MRM evidence are real and useful. They generate audit artefacts at the policy layer. The agent execution graph (the actual sequence of prompts, tools, retrievals, refusals and approvals) is not their object. It is ours.
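To make "execution graph" concrete, here is a minimal sketch of what modelling a run as a graph could look like. It is an illustration only, not Runfile's actual schema: `RunStep`, `AgentRun`, and every field name are hypothetical, and the five step kinds are taken straight from the sentence above.

```python
from dataclasses import dataclass, field
from typing import Literal

# Hypothetical step kinds, taken from the list in the text above.
StepKind = Literal["prompt", "tool", "retrieval", "refusal", "approval"]


@dataclass(frozen=True)
class RunStep:
    """One node in the execution graph (illustrative shape only)."""
    step_id: str
    kind: StepKind
    parents: tuple[str, ...] = ()  # step_ids this step depends on


@dataclass
class AgentRun:
    """A single agent run: an ordered, append-only list of linked steps."""
    run_id: str
    steps: list[RunStep] = field(default_factory=list)

    def add(self, step: RunStep) -> None:
        # Reject edges to steps that were never recorded, so the graph
        # cannot reference history that does not exist.
        known = {s.step_id for s in self.steps}
        missing = [p for p in step.parents if p not in known]
        if missing:
            raise ValueError(f"unknown parent steps: {missing}")
        self.steps.append(step)

    def kinds_in_order(self) -> list[str]:
        # The recorded sequence of step kinds, in insertion order.
        return [s.kind for s in self.steps]


# Example: a run where a retrieval feeds a tool call that needs approval.
run = AgentRun(run_id="run-001")
run.add(RunStep("s1", "prompt"))
run.add(RunStep("s2", "retrieval", parents=("s1",)))
run.add(RunStep("s3", "tool", parents=("s2",)))
run.add(RunStep("s4", "approval", parents=("s3",)))
```

Compare this with the flat tuples in the sections above: a span carries no notion of refusal or approval, and a control record carries no notion of sequence. The graph is the only shape that holds both.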