Observability
Trace every LLM call, tool use, and retrieval step. Understand what’s happening inside your application with structured, searchable logs.
XeroML gives teams the tools to collaboratively develop and monitor LLM applications. Whether you’re debugging a production issue, iterating on prompts, or running systematic evaluations, XeroML brings together the full lifecycle in one place.
Prompt Management
Version, deploy, and iterate on prompts without code changes. Separate prompt iteration from engineering deployments.
Evaluation
Score outputs with LLM-as-a-judge, human annotation, or custom logic. Build datasets and run experiments systematically.
API & Data Platform
Access all your data programmatically. Export traces, query metrics, and integrate with your existing data stack.
AI applications are non-deterministic; a request that succeeded yesterday may fail today. XeroML's application tracing captures structured logs of every request: the exact prompt sent, the model's response, token usage, latency, and any tool or retrieval steps in between. This data is captured automatically with minimal performance impact, since all SDK calls are asynchronous.
Key tracing features:
- Full input/output capture: the exact prompt sent and the model's response
- Token usage and latency recorded for every call
- Nested steps for tool use and retrieval within a single trace
- Asynchronous SDK calls, so tracing adds minimal overhead
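To make the shape of such a trace concrete, here is a minimal sketch of structured span capture in plain Python. The names `new_trace` and `span` are illustrative assumptions, not XeroML's actual SDK; the point is that each step of a request ends up as one searchable record with its inputs, outputs, and timing.

```python
import time
import uuid
from contextlib import contextmanager

# In-memory store standing in for a tracing backend.
TRACES: list[dict] = []

def new_trace(name: str) -> dict:
    """Start a trace for one application request."""
    trace = {"id": str(uuid.uuid4()), "name": name, "spans": []}
    TRACES.append(trace)
    return trace

@contextmanager
def span(trace: dict, name: str, **attrs):
    """Record one step (LLM call, tool use, retrieval) with timing."""
    record = {"name": name, **attrs, "start": time.time()}
    trace["spans"].append(record)
    try:
        yield record
    finally:
        record["latency_ms"] = round((time.time() - record["start"]) * 1000, 2)

# Usage: wrap each step so prompt, response, token usage,
# and latency land in one structured trace.
trace = new_trace("answer-question")
with span(trace, "retrieval", query="refund policy") as s:
    s["documents"] = ["policy.md"]
with span(trace, "llm-call", prompt="Summarize the refund policy.") as s:
    s["response"] = "Refunds are issued within 14 days."
    s["usage"] = {"input_tokens": 12, "output_tokens": 9}
```

A real SDK would emit these spans to a backend asynchronously; the structure of the record is what matters here.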
Rather than hardcoding prompts in your application, XeroML provides a central store where prompts are versioned, labeled, and deployed independently of code. Non-technical teammates can iterate on prompts via the UI while engineers control deployments through labels — no code review required for a text change.
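The version-plus-label model described above can be sketched in a few lines. This is not XeroML's API, just an illustration of the idea under two assumptions: versions are immutable once created, and a label such as "production" is a movable pointer to one version.

```python
class PromptStore:
    """Versioned prompts with deployment labels (illustrative sketch)."""

    def __init__(self):
        self._versions: dict[str, list[str]] = {}
        self._labels: dict[tuple[str, str], int] = {}

    def create_version(self, name: str, text: str) -> int:
        """Append an immutable version; returns its 1-based number."""
        versions = self._versions.setdefault(name, [])
        versions.append(text)
        return len(versions)

    def set_label(self, name: str, label: str, version: int) -> None:
        """Point a label (e.g. 'production') at a specific version."""
        self._labels[(name, label)] = version

    def get(self, name: str, label: str = "production") -> str:
        """Fetch whatever version the label currently points at."""
        version = self._labels[(name, label)]
        return self._versions[name][version - 1]

store = PromptStore()
v1 = store.create_version("greeting", "Hello {user}!")
v2 = store.create_version("greeting", "Hi {user}, how can I help?")
store.set_label("greeting", "production", v1)  # live traffic stays on v1
store.set_label("greeting", "staging", v2)     # try v2 without a code deploy
```

Because the application resolves prompts through labels at runtime, moving "production" from v1 to v2 changes the prompt without any code change or redeploy.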
Quality assurance for LLM applications requires more than unit tests. XeroML supports multiple evaluation approaches, which can run both offline and in production:
- LLM-as-a-judge: use a model to score outputs automatically
- Human annotation: collect scores from reviewers
- Custom logic: score outputs with your own code
Scores from any of these sources feed into datasets and systematic experiments.
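The three scoring approaches can be sketched side by side. Everything here is a stand-in: `call_judge_model` would be a real LLM call in practice, and the dataset and annotations are fabricated for illustration only.

```python
from typing import Callable

def length_score(output: str) -> float:
    """Custom logic: penalize answers over 200 characters."""
    return 1.0 if len(output) <= 200 else 0.0

def human_score(annotations: dict[str, float], output_id: str) -> float:
    """Human annotation: look up a reviewer-assigned score."""
    return annotations[output_id]

def judge_score(output: str, call_judge_model: Callable[[str], str]) -> float:
    """LLM-as-a-judge: ask a model to grade the output 0-10."""
    verdict = call_judge_model(f"Rate this answer from 0 to 10:\n{output}")
    return float(verdict) / 10.0

# Run a tiny experiment over a one-item dataset with a fake judge.
dataset = [{"id": "ex1", "output": "Refunds take 14 days."}]
annotations = {"ex1": 0.9}           # pretend a reviewer scored this
fake_judge = lambda prompt: "8"      # stands in for a real model call
results = [
    {
        "id": item["id"],
        "custom": length_score(item["output"]),
        "human": human_score(annotations, item["id"]),
        "judge": judge_score(item["output"], fake_judge),
    }
    for item in dataset
]
```

Keeping all three score types in one result record is what makes systematic comparison across experiment runs possible.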
Choose how you want to run it:
XeroML is fully open source. The core platform, SDKs, and integrations are all MIT-licensed. You can self-host XeroML on your own infrastructure or use XeroML Cloud.