# Python SDK

The XeroML Python SDK (`xeroml`) is the primary way to instrument Python applications. It provides:
- Tracing — create traces and observations, propagate context
- Prompt management — fetch, compile, and cache prompts
- Evaluation — score traces, run dataset experiments
- Data access — query traces, metrics, datasets
The SDK is built on OpenTelemetry, so it’s compatible with any OTEL-based instrumentation library.
## Installation

```bash
pip install xeroml
```

## Quick Setup
1. Set environment variables:

   ```bash
   XEROML_SECRET_KEY="sk-xm-..."
   XEROML_PUBLIC_KEY="pk-xm-..."
   XEROML_BASE_URL="https://cloud.xeroml.com"
   ```

2. Initialize the client:

   ```python
   from xeroml import get_client

   xeroml = get_client()
   ```

3. Instrument your code:

   ```python
   from xeroml import observe

   @observe()
   def my_function(input: str) -> str:
       return call_llm(input)
   ```

4. Flush before exit (scripts and serverless):

   ```python
   xeroml.flush()
   ```
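Short-lived scripts can silently drop buffered events if they exit before the explicit flush runs. One way to guard against that is to register the flush at interpreter shutdown. A minimal sketch of the pattern, using a stub in place of the real client (`StubClient` is illustrative, not part of the SDK):

```python
import atexit

# Illustrative stand-in for the SDK client; in real code you would call
# xeroml.get_client(). flush() here just records that it ran, where the
# real flush() drains the event buffer before the process exits.
class StubClient:
    def __init__(self):
        self.flushed = False

    def flush(self):
        self.flushed = True

client = StubClient()

# Registering flush at exit means a script or serverless invocation
# that forgets the explicit call still exports its buffered events.
atexit.register(client.flush)
```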
## SDK Packages

| Package | Purpose |
|---|---|
| `xeroml` | Core SDK — tracing, prompt management, evaluation, data access |
| `xeroml[openai]` | OpenAI integration (included in core) |
| `xeroml[langchain]` | LangChain `CallbackHandler` |
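The bracketed entries in the table are standard pip extras, so an integration is pulled in at install time, for example:

```shell
# Install the core SDK plus the LangChain integration
pip install "xeroml[langchain]"

# Multiple extras can be combined in one install
pip install "xeroml[openai,langchain]"
```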
## Instrumentation

### Using the @observe Decorator

The simplest way to instrument functions:
```python
from xeroml import observe

@observe()
def generate_response(user_message: str) -> str:
    # Function inputs and outputs are captured automatically
    return call_llm(user_message)
```
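To make the capture behavior concrete, here is a plain-Python sketch of what an `@observe`-style decorator does: wrap the function, record its inputs and output, and pass the return value through unchanged. This illustrates the pattern only, not the SDK's actual implementation (`observe_sketch` and `captured` are hypothetical names):

```python
import functools

# Records produced by the sketch decorator, standing in for exported
# observations.
captured = []

def observe_sketch(name=None, capture_input=True):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            record = {"name": name or fn.__name__}
            if capture_input:
                record["input"] = {"args": args, "kwargs": kwargs}
            # Call through to the wrapped function and record its output.
            result = fn(*args, **kwargs)
            record["output"] = result
            captured.append(record)
            return result
        return wrapper
    return decorator

@observe_sketch(name="greet")
def greet(who: str) -> str:
    return f"hello {who}"

greet("world")
```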
```python
@observe(name="rag-pipeline", type="span", capture_input=False)
def run_rag(query: str) -> str:
    context = retrieve_context(query)
    return generate_with_context(query, context)
```

### Using Context Managers
For explicit control over observation lifecycle:
```python
from xeroml import get_client

xeroml = get_client()

with xeroml.start_as_current_observation(
    name="pipeline",
    type="span",
    input={"query": user_query},
) as obs:
    result = run_pipeline(user_query)
    obs.update(output=result)
```

### Propagating Context
Attach user IDs, session IDs, and other attributes to all observations in a scope:
```python
from xeroml import propagate_attributes

with propagate_attributes(
    user_id="user-123",
    session_id="session-abc",
    tags=["feature:chat"],
    environment="production",
):
    result = handle_request(user_message)
```

## Prompt Management
```python
# Fetch and compile a prompt
prompt = xeroml.get_prompt("my-prompt", label="production")
compiled = prompt.compile(user_name="Alice", topic="finance")
```
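Conceptually, `compile()` substitutes the supplied keyword arguments into the prompt template. A stand-in illustrating that substitution, assuming `{{variable}}`-style placeholders (the SDK's real template syntax may differ):

```python
import re

# Hypothetical stand-in for prompt compilation: replace {{name}}-style
# placeholders with the supplied variables, leaving unknown placeholders
# untouched.
def compile_prompt(template: str, **variables) -> str:
    def replace(match):
        name = match.group(1)
        return str(variables.get(name, match.group(0)))
    return re.sub(r"\{\{(\w+)\}\}", replace, template)

compiled = compile_prompt(
    "Hello {{user_name}}, let's talk about {{topic}}.",
    user_name="Alice",
    topic="finance",
)
```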
```python
# Use with OpenAI
from xeroml.openai import openai

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": compiled}],
)
```

## Evaluation
```python
# Score a trace
xeroml.score(
    trace_id="trace_abc123",
    name="accuracy",
    value=0.92,
    data_type="NUMERIC",
)
# Run a dataset experiment
dataset = xeroml.get_dataset("my-dataset")
experiment = xeroml.create_experiment(name="v2-test", dataset_name="my-dataset")

for item in dataset.items:
    with experiment.run_item(item) as run:
        output = my_task(item.input)
        run.set_output(output)
        run.score("accuracy", evaluate(output, item.expected_output))
```
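The experiment loop above calls an `evaluate()` helper that the snippet leaves undefined; it is whatever scoring function fits your task, not part of the SDK. A minimal stand-in that scores exact matches:

```python
# Hypothetical evaluator for the experiment loop: 1.0 when the task output
# exactly matches the expected output (after trimming whitespace), else 0.0.
# Real evaluators are usually task-specific (similarity, LLM-as-judge, etc.).
def evaluate(output: str, expected_output: str) -> float:
    return 1.0 if output.strip() == expected_output.strip() else 0.0
```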