Observability & Application Tracing

AI applications are non-deterministic by nature. The same input can produce different outputs depending on the model, the prompt version, retrieved context, or external tool state. This makes traditional debugging approaches — inspecting logs, adding print statements — insufficient for production LLM systems.

Well-implemented observability gives you the tools to understand what’s happening inside your application and why.

What is Application Tracing?

Application tracing captures a structured log of every request that flows through your system. For LLM applications this means recording:

  • The exact prompt sent to the model (including system instructions and context)
  • The model’s response and any tool calls made
  • Token usage and associated cost
  • Latency for every step in the pipeline
  • Retrieval steps, embeddings, and any non-LLM operations
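A single trace record covering the fields above might be represented roughly like this. The field names and values are illustrative only, not XeroML's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class LLMTraceRecord:
    # Exact prompt sent to the model, including system instructions and context
    prompt: str
    # Model response text and any tool calls it requested
    response: str
    tool_calls: list = field(default_factory=list)
    # Token usage and the cost derived from it
    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: float = 0.0
    # Latency of this step in milliseconds
    latency_ms: float = 0.0

record = LLMTraceRecord(
    prompt="System: be concise.\nUser: What is tracing?",
    response="Tracing records each step of a request.",
    input_tokens=18,
    output_tokens=9,
    cost_usd=0.00027,
    latency_ms=412.0,
)
```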

XeroML captures this data automatically during development and production with no manual instrumentation required for supported frameworks. The result is a detailed, searchable log of every request your application handles.

Getting Started

The fastest way to start is to instrument your existing application. XeroML supports 50+ frameworks and libraries natively.

Get Started with Tracing

Once you have traces flowing, explore the core concepts to understand how XeroML structures your data:

Data Model: Traces, Observations, Sessions
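As a rough mental model of that hierarchy (illustrative types, not the SDK's real ones): a session groups related traces, and each trace contains an ordered list of observations such as retrieval steps and model calls.

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    name: str            # e.g. "retrieval", "llm-call"
    input: str
    output: str
    latency_ms: float

@dataclass
class Trace:
    trace_id: str
    observations: list[Observation] = field(default_factory=list)

@dataclass
class Session:
    session_id: str
    traces: list[Trace] = field(default_factory=list)

# One conversation turn: a retrieval step followed by a model call
session = Session("chat-42")
trace = Trace("req-001")
trace.observations.append(Observation("retrieval", "user query", "3 docs", 35.0))
trace.observations.append(Observation("llm-call", "final prompt", "answer", 820.0))
session.traces.append(trace)
```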

What You Can Do with Traces

After instrumentation, traces unlock the following workflows:

Debugging production issues
When something goes wrong in production, open the trace for that request. You’ll see the exact prompt, context, and response that caused the problem — not a stack trace pointing at library internals.

Performance analysis
Identify slow steps in your pipeline. Token usage summaries and per-step latency let you spot bottlenecks in retrieval, model calls, or post-processing.
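Once per-step latencies are recorded, this kind of bottleneck hunt is simple. A plain-Python sketch over hypothetical step timings:

```python
# Hypothetical per-step latencies (ms) pulled from a single trace
step_latencies = {
    "retrieval": 120.0,
    "llm-call": 1850.0,
    "post-processing": 45.0,
}

def slowest_step(latencies):
    """Return the (name, ms) pair of the slowest pipeline step."""
    return max(latencies.items(), key=lambda kv: kv[1])

name, ms = slowest_step(step_latencies)
print(f"bottleneck: {name} ({ms:.0f} ms)")  # bottleneck: llm-call (1850 ms)
```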

Evaluation
Traces are the input to XeroML’s evaluation system. Run LLM-as-a-judge evaluators on live traces, create datasets from interesting examples, and track quality over time.
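The evaluator loop described here can be sketched as follows, with a stubbed judge standing in for a real model call (all names are illustrative, not the evaluation API):

```python
def judge(prompt, response):
    """Stub LLM-as-a-judge: a real evaluator would call a model here
    and parse its verdict into a numeric score."""
    return 1.0 if response else 0.0

# Hypothetical traces pulled from the observability backend
traces = [
    {"prompt": "What is tracing?", "response": "It records each step."},
    {"prompt": "Define span.", "response": ""},
]

scores = [judge(t["prompt"], t["response"]) for t in traces]
avg_quality = sum(scores) / len(scores)
```

Scoring traces after the fact like this means evaluation never adds latency to the live request path.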

Prompt iteration
When prompts are managed in XeroML, every trace links back to the prompt version that generated it. Compare quality across versions directly.

Core Features

  • Sessions: Group multiple traces from a single conversation or workflow
  • Environments: Separate production, staging, and development data
  • Tags: Categorize traces for filtering and reporting
  • Users: Track per-user token usage, costs, and feedback
  • Metadata: Attach arbitrary key-value data to any trace or observation
  • Releases & Versioning: Tag traces with application version for regression tracking
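Conceptually, each of these features maps onto key-value attributes attached to a trace. The helper below is a sketch of that mapping only; the attribute names are assumptions, not XeroML's real schema:

```python
def trace_attributes(session_id, environment, tags, user_id, release, **metadata):
    """Build an illustrative attribute dict for one trace."""
    attrs = {
        "session.id": session_id,    # Sessions: group related traces
        "environment": environment,  # Environments: prod / staging / dev
        "tags": tags,                # Tags: filtering and reporting
        "user.id": user_id,          # Users: per-user usage, cost, feedback
        "release": release,          # Releases: regression tracking
    }
    attrs.update(metadata)           # Metadata: arbitrary key-value data
    return attrs

attrs = trace_attributes(
    session_id="chat-42",
    environment="production",
    tags=["rag", "beta"],
    user_id="user-7",
    release="v1.4.2",
    experiment="new-prompt",         # arbitrary metadata key
)
```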

FAQ

What’s the difference between observability and tracing?

Observability is the broader capability — the ability to understand a system’s internal state from its external outputs. It encompasses metrics, logs, and tracing. Application tracing is one specific tool within observability: it records the complete flow of a request, including every operation, its timing, inputs, and outputs.

How does XeroML differ from general APM tools?

General APM tools (Datadog, New Relic, etc.) are designed for traditional request/response services. They don’t natively understand token counts, model parameters, prompt templates, or evaluation scores. XeroML is built specifically for LLM applications and understands the semantics of your AI workloads out of the box.

What’s the performance impact?

XeroML SDKs send tracing data asynchronously in the background. Traces are batched locally before sending, so there is no added latency to your application’s response time. In short-lived processes (serverless functions, scripts), call flush() before your process exits to ensure all data is sent.
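The batch-then-flush pattern described above looks roughly like this. This is a generic sketch of the mechanism, not the actual SDK internals:

```python
import queue
import threading

class BatchingExporter:
    """Buffers trace events and sends them from a background thread."""

    def __init__(self):
        self._queue = queue.Queue()
        self.sent = []  # stand-in for the network sink
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, event):
        # Called on the request path: enqueue only, no network latency added
        self._queue.put(event)

    def _run(self):
        while True:
            event = self._queue.get()
            self.sent.append(event)  # a real SDK would batch HTTP exports here
            self._queue.task_done()

    def flush(self):
        # Call before a short-lived process exits so no events are dropped
        self._queue.join()

exporter = BatchingExporter()
exporter.record({"trace_id": "req-001", "latency_ms": 412})
exporter.flush()  # blocks until the background thread has drained the queue
```

Because the worker is a daemon thread, a serverless runtime that freezes or kills the process immediately after the response would otherwise drop any unsent events; that is exactly what the explicit `flush()` guards against.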

Does XeroML support OpenTelemetry?

Yes. XeroML’s SDKs are built on OpenTelemetry. This means you can use any OTEL-compatible instrumentation library alongside XeroML, and you can send traces to multiple destinations simultaneously.