
Data Model

XeroML organizes tracing data into three hierarchical concepts: Observations, Traces, and Sessions. Understanding this structure helps you instrument your application correctly and query your data effectively.

Observations

An observation is the fundamental unit of XeroML’s data model. It represents a single step or operation within a trace — an LLM call, a retrieval step, a tool use, an embedding operation, or any custom span you define.

Observations come in three types:

| Type | When to use |
| --- | --- |
| generation | Any LLM call. XeroML automatically captures model, prompt, completion, token usage, and cost. |
| span | A logical grouping of sub-steps, like a RAG pipeline or a multi-step agent loop. |
| event | A point-in-time occurrence with no duration, such as logging a decision, a cache hit, or a state change. |

Observations can be nested: a span can contain generations, events, and other spans. The nesting reflects the actual call hierarchy of your application.
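
To make the nesting concrete, here is a minimal sketch of an observation tree in plain Python. The `Observation` class, its `add` method, and the `walk` helper are hypothetical names used only for illustration; they are not part of the XeroML SDK.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Observation:
    """Hypothetical stand-in for an observation; not a XeroML SDK class."""
    name: str
    type: str  # "generation", "span", or "event"
    children: List["Observation"] = field(default_factory=list)

    def add(self, child: "Observation") -> "Observation":
        """Nest a child under this observation and return the child."""
        self.children.append(child)
        return child


def walk(obs: Observation, depth: int = 0) -> Iterator[Tuple[int, Observation]]:
    """Yield (depth, observation) pairs, parents before their children."""
    yield depth, obs
    for child in obs.children:
        yield from walk(child, depth + 1)


# A RAG pipeline span containing a nested span, an event, and a generation
pipeline = Observation("rag-pipeline", "span")
retrieval = pipeline.add(Observation("retrieval", "span"))
retrieval.add(Observation("cache-hit", "event"))
pipeline.add(Observation("answer", "generation"))

for depth, obs in walk(pipeline):
    print("  " * depth + f"{obs.type}: {obs.name}")
```

The indentation of the printed lines mirrors the call hierarchy: the event sits two levels deep because it occurred inside the retrieval span, which itself ran inside the pipeline span.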

Traces

A trace is a container for all observations produced by a single request or interaction. When a user sends a message to your chatbot, the entire handling of that message — from retrieval through generation to post-processing — is one trace.

Traces have a flat set of attributes that propagate to all contained observations:

  • name — human-readable identifier
  • userId — the end user making the request
  • sessionId — the session this trace belongs to (optional)
  • environment — production, staging, development, etc.
  • tags — string labels for filtering
  • metadata — arbitrary key-value pairs
  • version — application version or release tag
  • input / output — the trace-level input and final output

In XeroML V4, context attributes set at the trace level automatically propagate down to all observations. This enables faster queries via single-table operations rather than expensive joins.
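
The benefit of propagation can be illustrated with a small sketch that copies trace-level context onto each observation row, so a filter needs only the observation rows. The row layout, the `id` field, the choice of `CONTEXT_KEYS`, and the `denormalize` helper are assumptions made for this example; XeroML's actual storage schema is not described here.

```python
from typing import Any, Dict, List

# Which trace attributes are treated as "context" is an assumption of this sketch
CONTEXT_KEYS = ("userId", "sessionId", "environment", "tags", "version")


def denormalize(trace: Dict[str, Any], observations: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Copy trace-level context onto every observation row."""
    context = {key: trace[key] for key in CONTEXT_KEYS if key in trace}
    return [{"traceId": trace["id"], **context, **obs} for obs in observations]


trace = {
    "id": "trace-1",  # hypothetical identifier field
    "name": "chat-turn",
    "userId": "user-42",
    "environment": "production",
    "tags": ["chat"],
}
observations = [
    {"name": "retrieval", "type": "span"},
    {"name": "answer", "type": "generation"},
]
rows = denormalize(trace, observations)

# Filter on a trace-level attribute without looking the trace up again
production_generations = [
    row["name"]
    for row in rows
    if row["environment"] == "production" and row["type"] == "generation"
]
print(production_generations)
```

Because every row already carries `environment` and `userId`, the filter touches one collection only, which is the single-table access pattern described above.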

Sessions

A session is an optional grouping of related traces. Sessions are used for multi-turn conversations where a single logical interaction spans multiple request/response cycles.

For example, a chatbot conversation with 10 back-and-forth exchanges would produce 10 traces, all sharing the same sessionId. XeroML’s Session Replay view stitches these together into a single timeline.

Sessions follow a one-to-many relationship: one session contains many traces. The sessionId is any US-ASCII string under 200 characters — typically a UUID generated at the start of a conversation.
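
As an illustration of the one-to-many relationship, the following sketch validates a sessionId against the stated constraint and groups traces into per-session timelines. The helper names, the `timestamp` field, and the decision to reject empty strings are assumptions for this example, not XeroML behavior.

```python
import uuid
from collections import defaultdict
from typing import Any, Dict, List

MAX_SESSION_ID_LENGTH = 200  # the text above requires fewer than 200 characters


def is_valid_session_id(session_id: str) -> bool:
    """US-ASCII and under 200 characters; rejecting "" is an assumption."""
    return 0 < len(session_id) < MAX_SESSION_ID_LENGTH and session_id.isascii()


def group_by_session(traces: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group traces by sessionId, ordered by time; traces without one are skipped."""
    sessions: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for trace in traces:
        session_id = trace.get("sessionId")
        if session_id is not None:
            sessions[session_id].append(trace)
    for grouped in sessions.values():
        grouped.sort(key=lambda t: t["timestamp"])
    return dict(sessions)


conversation_id = str(uuid.uuid4())  # generated once at the start of a conversation
traces = [
    {"name": "turn-2", "sessionId": conversation_id, "timestamp": 2},
    {"name": "turn-1", "sessionId": conversation_id, "timestamp": 1},
    {"name": "batch-job", "timestamp": 3},  # sessionId is optional
]
timeline = group_by_session(traces)[conversation_id]
print([t["name"] for t in timeline])
```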

Data Enrichment Attributes

You can attach additional context to any trace or observation:

| Attribute | Purpose |
| --- | --- |
| environment | Separate production, staging, and development data |
| tags | Categorize by feature, endpoint, or experiment |
| userId | Track per-user usage and quality |
| metadata | Arbitrary structured data (JSON) |
| version | Application version for regression tracking |
| release | Release identifier for deployment correlation |

Technical Architecture

OpenTelemetry Foundation

XeroML’s SDKs are built on OpenTelemetry. The mapping is straightforward:

| OpenTelemetry | XeroML |
| --- | --- |
| Trace | Trace |
| Span | Observation (span, generation, or event) |
| Span attributes | Observation attributes |

This means any OTEL-compatible instrumentation library works with XeroML, and you can route traces to multiple backends simultaneously.

Background Processing

XeroML SDKs batch trace data locally before sending it asynchronously. This means:

  1. Negligible added latency: data is sent in the background, so it does not block your application's responses
  2. Flush required in short-lived processes: serverless functions and scripts must call flush() before terminating to prevent data loss

```python
# Always call flush() in scripts or serverless functions
from xeroml import get_client

xeroml = get_client()
# ... your instrumented code ...
xeroml.flush()
```
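
To show why flushing matters, here is a minimal standard-library sketch of a batching background sender. `BatchingExporter` and its parameters are hypothetical and are not the XeroML SDK's internals; the sketch only demonstrates the general pattern of queueing items on the caller's thread and sending them from a daemon worker.

```python
import queue
import threading
from typing import Callable, List


class BatchingExporter:
    """Hypothetical background sender; illustrates the pattern, not the SDK."""

    def __init__(self, send: Callable[[List[dict]], None],
                 batch_size: int = 10, interval: float = 0.5) -> None:
        self._send = send
        self._batch_size = batch_size
        self._interval = interval
        self._queue: queue.Queue = queue.Queue()
        # A daemon thread does not keep the process alive, so anything still
        # queued at exit is lost unless flush() is called first.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, item: dict) -> None:
        """Return immediately; sending happens on the worker thread."""
        self._queue.put(item)

    def _run(self) -> None:
        while True:
            batch = [self._queue.get()]  # block until at least one item arrives
            while len(batch) < self._batch_size:
                try:
                    batch.append(self._queue.get(timeout=self._interval))
                except queue.Empty:
                    break  # no more items arrived in time; send what we have
            self._send(batch)
            for _ in batch:
                self._queue.task_done()

    def flush(self) -> None:
        """Block until every enqueued item has been handed to send()."""
        self._queue.join()


sent: List[dict] = []
exporter = BatchingExporter(send=sent.extend, batch_size=3, interval=0.05)
for i in range(7):
    exporter.enqueue({"id": i})
exporter.flush()
print(len(sent))
```

Without the `flush()` call, a script that exits right after the loop could terminate while items are still waiting in the queue, which is the data loss scenario described above.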

Attribute Propagation

When you use propagate_attributes() (Python) or propagateAttributes() (TypeScript), the specified attributes are automatically applied to all observations created within that context — regardless of nesting depth or which library made the call.

```python
from xeroml import observe, propagate_attributes


@observe()
def handle_request(user_id: str, session_id: str):
    with propagate_attributes(user_id=user_id, session_id=session_id):
        # All nested observations automatically inherit these attributes
        result = run_pipeline()  # run_pipeline() stands in for your own instrumented code
        return result
```
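
One way such context-scoped propagation can work is with Python's `contextvars` module. The sketch below is an assumption about the general mechanism, not XeroML's implementation; `propagate_sketch` and `create_observation` are hypothetical names chosen to avoid confusion with the real SDK functions.

```python
import contextvars
from contextlib import contextmanager
from typing import Any, Dict, Iterator

# Holds the attributes active in the current execution context
_active_attributes = contextvars.ContextVar("active_attributes", default={})


@contextmanager
def propagate_sketch(**attributes: Any) -> Iterator[None]:
    """Make attributes visible to everything called inside the with block."""
    merged = {**_active_attributes.get(), **attributes}
    token = _active_attributes.set(merged)
    try:
        yield
    finally:
        _active_attributes.reset(token)  # restore the previous attributes


def create_observation(name: str) -> Dict[str, Any]:
    """Hypothetical observation factory that reads the active attributes."""
    return {"name": name, **_active_attributes.get()}


def retrieve() -> Dict[str, Any]:
    # Never receives user_id or session_id as arguments
    return create_observation("retrieval")


def run_pipeline() -> Dict[str, Any]:
    return retrieve()


with propagate_sketch(user_id="user-42", session_id="session-1"):
    inside = run_pipeline()
outside = create_observation("unscoped")

print(inside)
print(outside)
```

The nested `retrieve()` call picks up the attributes without any of them being passed explicitly, and the observation created after the `with` block no longer carries them.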