LlamaIndex

XeroML integrates with LlamaIndex via OpenTelemetry. LlamaIndex ships with built-in OTEL instrumentation support; point its exporter at XeroML's OTLP endpoint.

Installation

pip install xeroml llama-index llama-index-instrumentation-opentelemetry

Setup

import base64
import os

from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from llama_index.instrumentation.opentelemetry import OpenTelemetryInstrumentor

# Configure an OTLP exporter pointing to XeroML, authenticating with
# Basic auth built from the project's public/secret key pair
credentials = base64.b64encode(
    f"{os.environ['XEROML_PUBLIC_KEY']}:{os.environ['XEROML_SECRET_KEY']}".encode()
).decode()

exporter = OTLPSpanExporter(
    endpoint=f"{os.environ['XEROML_BASE_URL']}/api/public/otel/v1/traces",
    headers={"Authorization": f"Basic {credentials}"},
)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))

# Attach the provider to LlamaIndex's instrumentation
OpenTelemetryInstrumentor().instrument(tracer_provider=provider)
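
If you initialize tracing in more than one entry point, the exporter configuration above can be factored into a small helper. This sketch uses only the environment variables shown above; the helper name is our own, not part of the XeroML SDK:

```python
import base64
import os


def xeroml_otlp_config() -> dict:
    """Build endpoint/header kwargs for OTLPSpanExporter from env vars.

    Illustrative helper; the function name is ours, not a XeroML API.
    """
    credentials = base64.b64encode(
        f"{os.environ['XEROML_PUBLIC_KEY']}:{os.environ['XEROML_SECRET_KEY']}".encode()
    ).decode()
    return {
        "endpoint": f"{os.environ['XEROML_BASE_URL']}/api/public/otel/v1/traces",
        "headers": {"Authorization": f"Basic {credentials}"},
    }
```

The result can then be passed straight to the exporter: `OTLPSpanExporter(**xeroml_otlp_config())`.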

Usage

After setup, all LlamaIndex operations are automatically traced:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
# Load and index documents
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
# Query — automatically traced with retrieval and generation steps
query_engine = index.as_query_engine()
response = query_engine.query("What is XeroML?")
print(response)

XeroML captures the full RAG pipeline: query embedding, vector retrieval, context assembly, and LLM generation.

Adding Context

Use propagate_attributes() to add user/session context:

from xeroml import propagate_attributes

with propagate_attributes(user_id="user-123", session_id="session-abc"):
    response = query_engine.query("What is XeroML?")