Observability (OpenLit)
OpenTelemetry-based LLM tracing, metrics, and logs via OpenLit.
CUGA can emit OpenTelemetry traces, metrics, and logs for every LLM call using OpenLit, a drop-in OTel instrumentation library for popular LLM SDKs.
Install
OpenLit ships as an optional extra:
pip install "cuga[observability]"
# or with uv:
uv sync --group observability
Configure
[observability]
openlit = true
Point OpenLit at your OTLP collector via environment:
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
Common service-identifying env vars (OTEL_SERVICE_NAME, OTEL_RESOURCE_ATTRIBUTES) work as usual; CUGA does not override them.
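As a rough sketch of how the service-identifying variables might be set alongside the endpoint: the service name and attribute values below are hypothetical placeholders, not values CUGA requires. OTEL_RESOURCE_ATTRIBUTES takes a comma-separated list of key=value pairs, per the OpenTelemetry SDK environment variable conventions.

```shell
# Collector endpoint, as in the example above (OTLP over HTTP on port 4318).
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"

# Hypothetical service name; pick one that identifies your deployment.
export OTEL_SERVICE_NAME="cuga-dev"

# Hypothetical resource attributes, written as comma-separated key=value pairs.
export OTEL_RESOURCE_ATTRIBUTES="service.namespace=agents,service.version=0.1.0"
```

Set these in the same shell session that launches CUGA so the process inherits them.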
Local testing stack
The cuga-agent repo ships a docker-compose stack under deployment/docker-compose/openlit/ containing an OTel Collector, Tempo (traces), Prometheus (metrics), and Grafana (UI). Start it, point CUGA at the collector, run a task, and you'll see per-call traces with model, token counts, latency, and cost, plus prompt content if you have enabled content capture.
What gets captured
For each LLM invocation OpenLit records:
- Trace span — start/end, duration, parent-child relationships across planner / shortlister / coder / reflection nodes.
- Attributes — model name, provider, temperature, prompt/response content (off by default — configure per OpenLit's docs), token usage (input/output/total), and any raised exception.
- Metrics — request counts, token counts, and latency histograms exported via OTLP.
Combining with Langfuse
Setting langfuse_tracing = true under [advanced_features] is independent of OpenLit, so the two can be enabled in parallel. This is useful when you want both an OTel-native pipeline and a Langfuse dashboard.
OpenLit's instrumentation is opt-in per LLM SDK. CUGA enables instrumentation for the providers it ships with (OpenAI, LiteLLM, WatsonX). If you wire in a custom provider, follow OpenLit's instrumentation docs to enable it explicitly.
