Custom Scores
Custom scores let you attach evaluation results to traces from any source — your own deterministic checks, user feedback, A/B test outcomes, or external evaluation systems. They’re the most flexible evaluation method in XeroML.
Use Cases
- User feedback — store thumbs up/down or star ratings collected in your UI
- Deterministic checks — regex match, JSON schema validation, exact string comparison
- Business metrics — order completed, support ticket resolved, conversion rate
- External evaluators — scores from third-party safety filters or domain-specific models
- Custom LLM judges — run your own evaluation logic and push results to XeroML
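As a rough sketch of the deterministic checks listed above, the helpers below compute pass/fail results using only the Python standard library. The function names and score names are illustrative and not part of XeroML. Returning a Python `bool` for a `BOOLEAN` score is an assumption about how that type is encoded, so verify it before sending these values.

```python
import json
import re


def json_validity_score(output: str) -> dict:
    """Pass/fail check: does the model output parse as JSON?"""
    try:
        json.loads(output)
        valid = True
    except json.JSONDecodeError:
        valid = False
    # Assumption: BOOLEAN scores are sent as a Python bool.
    return {"name": "valid-json", "value": valid, "data_type": "BOOLEAN"}


def regex_match_score(output: str, pattern: str) -> dict:
    """Pass/fail check: does the pattern occur anywhere in the output?"""
    matched = re.search(pattern, output) is not None
    return {"name": "regex-match", "value": matched, "data_type": "BOOLEAN"}


def exact_match_score(output: str, expected: str) -> dict:
    """Pass/fail check: exact comparison, ignoring surrounding whitespace."""
    same = output.strip() == expected.strip()
    return {"name": "exact-match", "value": same, "data_type": "BOOLEAN"}
```

Each helper returns a plain dictionary whose keys match the keyword arguments of the SDK call shown in the next section, so a result could be combined with a trace ID when the score is recorded.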
Adding Scores via SDK
Score a specific trace by ID:

    from xeroml import get_client

    xeroml = get_client()

    xeroml.score(
        trace_id="trace_abc123",
        name="user-feedback",
        value=1,  # thumbs up
        comment="User clicked helpful",
        data_type="NUMERIC",
    )

Score with categorical values:

    xeroml.score(
        trace_id="trace_abc123",
        name="safety-check",
        value="pass",
        data_type="CATEGORICAL",
    )

Score a specific observation (span or generation):

    xeroml.score(
        trace_id="trace_abc123",
        observation_id="obs_xyz789",
        name="retrieval-relevance",
        value=0.92,
        data_type="NUMERIC",
    )

The equivalent call in JavaScript/TypeScript:

    import { XeroMLClient } from "@xeroml/client";

    const xeroml = new XeroMLClient();

    await xeroml.score({
      traceId: "trace_abc123",
      name: "user-feedback",
      value: 1,
      comment: "User clicked helpful",
      dataType: "NUMERIC",
    });

Or over the REST API with curl:

    curl -X POST https://cloud.xeroml.com/api/public/v2/scores \
      -u "pk-xm-...:sk-xm-..." \
      -H "Content-Type: application/json" \
      -d '{
        "traceId": "trace_abc123",
        "name": "user-feedback",
        "value": 1,
        "comment": "User clicked helpful",
        "dataType": "NUMERIC"
      }'

Passing the Trace ID to Your Frontend
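To score traces from user interactions, the frontend needs the trace ID. Generate it deterministically from identifiers you already have, such as the session and message IDs, so it can be re-derived from the same seed without having to pass it through your response. The example below still returns it alongside the response for convenience: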

    from xeroml import create_trace_id, propagate_attributes

    # handle_chat is an illustrative wrapper; handle_request is your own application logic
    def handle_chat(session_id, message_id, user_message):
        # Generate a deterministic trace ID from a session + request identifier
        trace_id = create_trace_id(seed=f"session-{session_id}-msg-{message_id}")

        with propagate_attributes(trace_id=trace_id):
            response = handle_request(user_message)

        # Return trace_id to the frontend alongside the response
        return {"response": response, "trace_id": trace_id}

Then on the frontend, when the user submits feedback, send the trace_id back and call the XeroML API (or your own backend, which then calls XeroML) to store the score.
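For the backend route, one possible shape is sketched below: a helper that turns a thumbs up/down event into a request against the scores endpoint from the curl example above. The function name, the mapping of thumbs down to `0`, and the `public_key`/`secret_key` parameter names are assumptions made for illustration. The URL, JSON field names, and basic-auth scheme are taken from that curl example.

```python
import base64
import json
import urllib.request

# Endpoint as shown in the curl example.
SCORES_URL = "https://cloud.xeroml.com/api/public/v2/scores"


def build_feedback_request(trace_id, thumbs_up, public_key, secret_key, comment=None):
    """Build (but do not send) a POST request that stores user feedback as a score."""
    payload = {
        "traceId": trace_id,
        "name": "user-feedback",
        # Assumption: thumbs up maps to 1 (as in the example above), thumbs down to 0.
        "value": 1 if thumbs_up else 0,
        "dataType": "NUMERIC",
    }
    if comment:
        payload["comment"] = comment

    # curl's -u flag sends HTTP basic auth: base64("public:secret").
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return urllib.request.Request(
        SCORES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Your feedback handler would pass the returned request to `urllib.request.urlopen` to send it. Building the request on the backend keeps the secret key out of browser code.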
Score Data Types
| Type | Values | Use for |
|---|---|---|
| NUMERIC | Any float | Ratings, confidence scores, latency penalties |
| BOOLEAN | true / false | Pass/fail checks, binary feedback |
| CATEGORICAL | Any string | Named categories, multi-class labels |
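To catch mismatches before a score is sent, a small client-side check along the lines of the table above might look like the sketch below. The helper names are hypothetical, and the mapping from Python types to data types is an assumption based on the table and the SDK examples. The server may accept other encodings.

```python
def infer_data_type(value):
    """Map a Python value to the score data type it most plausibly belongs to."""
    # bool must be tested first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, (int, float)):
        return "NUMERIC"
    if isinstance(value, str):
        return "CATEGORICAL"
    raise TypeError(f"unsupported score value type: {type(value).__name__}")


def check_score_value(value, data_type):
    """Return the value unchanged, or raise if it does not fit the declared type."""
    inferred = infer_data_type(value)
    if inferred != data_type:
        raise ValueError(f"value {value!r} looks like {inferred}, not {data_type}")
    return value
```

Calling `check_score_value(0.92, "NUMERIC")` returns the value unchanged, while `check_score_value("pass", "NUMERIC")` raises a `ValueError`.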
Viewing Custom Scores
Custom scores appear in the trace detail view alongside automated scores. In the Metrics dashboard, you can chart any score name over time, compare across prompt versions, and filter traces by score value.