
XeroML Documentation

XeroML is an open-source LLM engineering platform. Collaboratively debug, analyze, and iterate on your LLM applications with natively integrated observability, prompt management, and evaluation.

What is XeroML?

XeroML gives teams the tools to collaboratively develop and monitor LLM applications. Whether you’re debugging a production issue, iterating on prompts, or running systematic evaluations, XeroML brings together the full lifecycle in one place.

Observability

Trace every LLM call, tool use, and retrieval step. Understand what’s happening inside your application with structured, searchable logs.

Get started with tracing →

Prompt Management

Version, deploy, and iterate on prompts without code changes. Separate prompt iteration from engineering deployments.

Manage your prompts →

Evaluation

Score outputs with LLM-as-a-judge, human annotation, or custom logic. Build datasets and run experiments systematically.

Set up evaluations →

API & Data Platform

Access all your data programmatically. Export traces, query metrics, and integrate with your existing data stack.

Explore the API →
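To illustrate what programmatic access might look like, here is a minimal sketch that builds a paginated trace-query URL. The base URL, endpoint path, and parameter names (`page`, `limit`, `name`) are assumptions for illustration, not the documented XeroML API.

```python
from urllib.parse import urlencode

# Hypothetical base URL and endpoint; the real API surface may differ.
BASE_URL = "https://cloud.example.com/api/public"

def traces_url(page=1, limit=50, name=None):
    """Build a query URL for exporting traces, with optional name filter."""
    params = {"page": page, "limit": limit}
    if name:
        params["name"] = name
    return f"{BASE_URL}/traces?{urlencode(params)}"

url = traces_url(page=2, name="chat-completion")
print(url)
```

In practice you would send this request with an authenticated HTTP client and page through results until the response is empty.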

Core Capabilities

Observability

AI applications are non-deterministic — an identical request that succeeded yesterday may fail today. XeroML’s application tracing captures structured logs of every request: the exact prompt sent, the model’s response, token usage, latency, and any tools or retrieval steps in between. This data is captured automatically with minimal performance impact, as all SDK calls are asynchronous.

Key tracing features:

  • Track all LLM and non-LLM operations (retrieval, embeddings, API calls)
  • Multi-turn conversation monitoring via Sessions
  • Visualize agent graphs with nested observations
  • 50+ native integrations: OpenAI, LangChain, LlamaIndex, Vercel AI SDK, and more
  • OpenTelemetry-based architecture to reduce vendor lock-in
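To make the structure of such a trace concrete, here is a minimal stdlib sketch of the data a trace record might hold — one trace containing observations for LLM and non-LLM operations, each with input, output, token usage, and latency. The `Trace` and `Observation` names and the toy token count are illustrative, not the XeroML SDK.

```python
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class Observation:
    """One recorded step: an LLM call, retrieval, embedding, or API call."""
    name: str
    input: str
    output: str = ""
    tokens: int = 0
    latency_ms: float = 0.0

@dataclass
class Trace:
    """One request's worth of observations, grouped under a trace id."""
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    observations: list = field(default_factory=list)

    def observe(self, name, input, fn):
        """Run fn(input), recording its output and latency as an observation."""
        start = time.perf_counter()
        output = fn(input)
        self.observations.append(Observation(
            name=name,
            input=input,
            output=output,
            tokens=len(output.split()),  # toy token count for illustration
            latency_ms=(time.perf_counter() - start) * 1000,
        ))
        return output

trace = Trace()
answer = trace.observe(
    "llm-call",
    "What is XeroML?",
    lambda q: "An LLM engineering platform.",  # stand-in for a model call
)
```

A real SDK would ship these records asynchronously in the background, which is how the "minimal performance impact" claim above is typically achieved.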

Prompt Management

Rather than hardcoding prompts in your application, XeroML provides a central store where prompts are versioned, labeled, and deployed independently of code. Non-technical teammates can iterate on prompts via the UI while engineers control deployments through labels — no code review required for a text change.

  • Instant production deployment via labels (no code redeploy)
  • Client-side SDK caching — as fast as reading from memory
  • Link prompts to traces to analyze per-version performance
  • Interactive Playground for testing prompts before deployment
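The client-side caching point above can be sketched in a few lines: fetch a prompt by name and label once, then serve it from memory until a TTL expires. `PromptClient` and its method names are hypothetical; `fetch` stands in for a network call to the prompt store.

```python
import time

class PromptClient:
    """Illustrative prompt client with an in-memory TTL cache."""

    def __init__(self, fetch_fn, ttl_seconds=60.0):
        self._fetch = fetch_fn
        self._ttl = ttl_seconds
        self._cache = {}  # (name, label) -> (prompt, fetched_at)

    def get_prompt(self, name, label="production"):
        key = (name, label)
        cached = self._cache.get(key)
        if cached and time.monotonic() - cached[1] < self._ttl:
            return cached[0]  # cache hit: no network round trip
        prompt = self._fetch(name, label)
        self._cache[key] = (prompt, time.monotonic())
        return prompt

calls = []
def fetch(name, label):
    calls.append((name, label))  # count simulated network fetches
    return f"You are a helpful assistant. ({name}@{label})"

client = PromptClient(fetch)
first = client.get_prompt("support-bot")
second = client.get_prompt("support-bot")  # served from cache
```

Because repeated reads hit the cache, moving prompts out of code need not add a network round trip to every request.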

Evaluation

Quality assurance for LLM applications requires more than unit tests. XeroML supports multiple evaluation approaches that can be run in production and offline:

  • LLM-as-a-Judge — scalable automated evaluation for nuanced qualities
  • Human Annotation — structured annotation queues for ground truth
  • Custom Scores — numeric, boolean, or categorical scores via API/SDK
  • Datasets & Experiments — systematic testing before every deployment
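As a sketch of the custom-scores point, the snippet below validates a score record of each supported type (numeric, boolean, categorical) before it would be submitted via an API or SDK. The field names, category set, and `make_score` helper are assumptions for illustration.

```python
ALLOWED_CATEGORIES = {"good", "bad", "neutral"}  # example category set

def make_score(trace_id, name, value, data_type):
    """Validate a score record; raise if the value doesn't match its type."""
    if data_type == "NUMERIC":
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("numeric score requires a number")
    elif data_type == "BOOLEAN":
        if not isinstance(value, bool):
            raise TypeError("boolean score requires True/False")
    elif data_type == "CATEGORICAL":
        if value not in ALLOWED_CATEGORIES:
            raise ValueError(f"unknown category: {value!r}")
    else:
        raise ValueError(f"unknown data type: {data_type!r}")
    return {"traceId": trace_id, "name": name,
            "value": value, "dataType": data_type}

score = make_score("trace-123", "helpfulness", 0.9, "NUMERIC")
```

Validating scores client-side keeps experiment data clean before it ever reaches the evaluation store.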

Quickstart

Choose where you want to start:

  1. Trace your first LLM call → Observability: Get Started
  2. Manage prompts outside code → Prompt Management: Get Started
  3. Score your application outputs → Evaluation: Core Concepts

Open Source

XeroML is fully open source. The core platform, SDKs, and integrations are all MIT-licensed. You can self-host XeroML on your own infrastructure or use XeroML Cloud.