Core Concepts

AgentLens organizes observability data into four core types: Trace, Span, Decision Point, and Event. Together they show what your agents did, in what order, and why.

Trace

A Trace is the top-level container for a single agent execution. It groups all the work that happens from the moment your agent starts until it finishes. Every span, decision point, and event belongs to exactly one trace.

Properties

Field       Type            Description
id          string          Unique identifier (UUID v4)
name        string          Human-readable label for the trace
status      enum            RUNNING, COMPLETED, or ERROR
tags        string[]        Freeform labels for filtering
sessionId   string?         Groups traces from the same session
startedAt   ISO datetime    When the trace began
endedAt     ISO datetime?   When the trace finished (null if still running)
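
To make the field list concrete, here is a minimal sketch of a trace record in Python. It is not the AgentLens SDK: the Trace and TraceStatus classes and the finish() helper are hypothetical names chosen for illustration, and only the field names, types, and status values come from the table above.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class TraceStatus(Enum):
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    ERROR = "ERROR"


def _now_iso() -> str:
    # ISO 8601 timestamp in UTC
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Trace:
    """Hypothetical in-memory model mirroring the documented trace fields."""
    name: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))  # UUID v4
    status: TraceStatus = TraceStatus.RUNNING
    tags: List[str] = field(default_factory=list)
    sessionId: Optional[str] = None
    startedAt: str = field(default_factory=_now_iso)
    endedAt: Optional[str] = None  # stays None while the trace is running

    def finish(self, status: TraceStatus = TraceStatus.COMPLETED) -> None:
        # Assumed helper: close the trace and stamp the end time.
        self.status = status
        self.endedAt = _now_iso()


trace = Trace(name="research-agent", tags=["demo"])
trace.finish()
```

A trace that has not been finished keeps status RUNNING and endedAt unset, which matches the nullable endedAt field in the table.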

Span

A Span represents a unit of work within a trace. Spans form a tree: each span can have a parent, creating a hierarchy that shows how work is nested. For example, an AGENT span may contain several LLM_CALL and TOOL_CALL child spans.

Span Types

LLM_CALL

A call to a language model (OpenAI, Anthropic, etc.)

TOOL_CALL

An invocation of an external tool or function

MEMORY_OP

A read or write to a vector store or memory system

CHAIN

A sequential pipeline of operations

AGENT

A top-level agent or sub-agent execution

CUSTOM

Any user-defined operation type

Nesting example

Trace: "research-agent"
  Span: "agent" (AGENT)
    Span: "plan" (LLM_CALL)
    Span: "web-search" (TOOL_CALL)
    Span: "summarize" (LLM_CALL)

Key properties

  • input / output — JSON payloads capturing what went in and came out
  • tokenCount — Total tokens consumed (for LLM_CALL spans)
  • costUsd — Dollar cost of this span
  • durationMs — Wall-clock time in milliseconds
  • parentSpanId — Reference to the parent span (null for root spans)
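
The parentSpanId links are enough to rebuild the span tree from a flat list. The sketch below shows one way to do that and to roll token counts up a subtree. The Span class, the helper functions, the span ids, and the token numbers are all made up for illustration. It also assumes a parent's tokenCount covers only its own work, which the text above does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    id: str
    name: str
    type: str
    parentSpanId: Optional[str] = None  # None for root spans
    tokenCount: int = 0


def children_by_parent(spans: List[Span]) -> Dict[Optional[str], List[Span]]:
    # Index spans by their parent id so each lookup is O(1).
    index: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        index.setdefault(span.parentSpanId, []).append(span)
    return index


def render(spans: List[Span]) -> List[str]:
    # Depth-first walk that indents each span under its parent.
    index = children_by_parent(spans)
    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span in index.get(parent_id, []):
            lines.append("  " * depth + f'Span: "{span.name}" ({span.type})')
            walk(span.id, depth + 1)

    walk(None, 0)
    return lines


def total_tokens(spans: List[Span], root_id: str) -> int:
    # Sum tokenCount over a span and all of its descendants.
    index = children_by_parent(spans)
    by_id = {span.id: span for span in spans}

    def subtotal(span_id: str) -> int:
        own = by_id[span_id].tokenCount
        return own + sum(subtotal(child.id) for child in index.get(span_id, []))

    return subtotal(root_id)


# Illustrative data matching the nesting example; token counts are invented.
spans = [
    Span("s1", "agent", "AGENT"),
    Span("s2", "plan", "LLM_CALL", parentSpanId="s1", tokenCount=420),
    Span("s3", "web-search", "TOOL_CALL", parentSpanId="s1"),
    Span("s4", "summarize", "LLM_CALL", parentSpanId="s1", tokenCount=910),
]

print("\n".join(render(spans)))
print(total_tokens(spans, "s1"))  # 1330
```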

Decision Point

A Decision Point records where your agent chose between alternatives. This is what separates AgentLens from generic tracing tools: you see the reasoning, what was chosen, and what was considered but rejected.

Decision Point Types

TOOL_SELECTION

Agent chose which tool to call

ROUTING

Agent routed to a specific sub-agent or branch

RETRY

Agent decided to retry a failed operation

ESCALATION

Agent escalated to a human or higher-level agent

MEMORY_RETRIEVAL

Agent chose what context to retrieve

PLANNING

Agent formulated a multi-step plan

Structure

decision_point.json
{
  "type": "TOOL_SELECTION",
  "reasoning": "User asked about weather, need real-time data",
  "chosen": { "tool": "weather_api", "confidence": 0.95 },
  "alternatives": [
    { "tool": "web_search", "confidence": 0.72 },
    { "tool": "knowledge_base", "confidence": 0.31 }
  ],
  "contextSnapshot": { "user_intent": "weather_query" }
}
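
Because the rejected alternatives are recorded alongside the chosen option, you can check how close a decision was. The following sketch parses the example above and computes the gap between the chosen confidence and the best rejected one. The confidence_margin function is a hypothetical helper, and it assumes every option carries a numeric confidence as in the example.

```python
import json

raw = """
{
  "type": "TOOL_SELECTION",
  "reasoning": "User asked about weather, need real-time data",
  "chosen": { "tool": "weather_api", "confidence": 0.95 },
  "alternatives": [
    { "tool": "web_search", "confidence": 0.72 },
    { "tool": "knowledge_base", "confidence": 0.31 }
  ],
  "contextSnapshot": { "user_intent": "weather_query" }
}
"""

decision = json.loads(raw)


def confidence_margin(decision: dict) -> float:
    # Gap between the chosen option and the strongest rejected alternative.
    # With no alternatives, the margin is the chosen confidence itself.
    best_rejected = max(
        (alt["confidence"] for alt in decision["alternatives"]),
        default=0.0,
    )
    return decision["chosen"]["confidence"] - best_rejected


# Rejected tools, strongest first.
rejected = [
    alt["tool"]
    for alt in sorted(
        decision["alternatives"], key=lambda a: a["confidence"], reverse=True
    )
]

margin = confidence_margin(decision)  # about 0.23 for the example above
```

A small margin suggests the agent was close to picking a different tool, which can be a useful filter when reviewing traces.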

Event

An Event is a discrete occurrence during a trace that does not represent a unit of work but is worth recording. Events capture errors, retries, fallbacks, and other notable moments.

Event Types

ERROR

An exception or failure occurred

RETRY

An operation was retried

FALLBACK

A fallback path was triggered

CONTEXT_OVERFLOW

Context window limit was exceeded

USER_FEEDBACK

User provided feedback on an output

CUSTOM

Any user-defined event type

Example

event.json
{
  "type": "CONTEXT_OVERFLOW",
  "name": "token-limit-exceeded",
  "metadata": {
    "limit": 128000,
    "actual": 131072,
    "truncated_chars": 4200
  },
  "timestamp": "2026-01-15T10:30:00.000Z"
}
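
As a small worked example, the metadata above is enough to compute how far the context window was exceeded. The sketch below is illustrative only: overflow_tokens is a hypothetical helper, and the limit and actual keys are taken from this one example rather than from a fixed schema.

```python
import json
from datetime import datetime
from typing import Optional

raw = """
{
  "type": "CONTEXT_OVERFLOW",
  "name": "token-limit-exceeded",
  "metadata": {
    "limit": 128000,
    "actual": 131072,
    "truncated_chars": 4200
  },
  "timestamp": "2026-01-15T10:30:00.000Z"
}
"""

event = json.loads(raw)


def overflow_tokens(event: dict) -> Optional[int]:
    # Only CONTEXT_OVERFLOW events are assumed to carry limit/actual.
    if event["type"] != "CONTEXT_OVERFLOW":
        return None
    meta = event["metadata"]
    return meta["actual"] - meta["limit"]


# Replace the trailing "Z" so older Python versions can parse the timestamp.
occurred_at = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))

print(overflow_tokens(event))  # 3072
```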

How they fit together

Trace: "customer-support-agent"
  |
  +-- Span: "classify-intent" (LLM_CALL)
  |     Decision: ROUTING -> chose "refund-flow" over "faq-flow"
  |
  +-- Span: "refund-flow" (AGENT)
  |     +-- Span: "lookup-order" (TOOL_CALL)
  |     +-- Span: "process-refund" (TOOL_CALL)
  |           Event: ERROR -> "payment-gateway-timeout"
  |           Event: RETRY -> "retrying with backup gateway"
  |     +-- Span: "process-refund-retry" (TOOL_CALL)
  |
  +-- Span: "compose-response" (LLM_CALL)
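
The diagram above can also be read as flat records joined by id. The sketch below assumes that each event and decision point carries a spanId pointing at the span it occurred in. That field name, the span ids, and the annotate helper are assumptions made for illustration and are not confirmed by the schemas on this page.

```python
from collections import defaultdict
from typing import Dict, List

spans = [
    {"id": "s1", "name": "classify-intent", "type": "LLM_CALL"},
    {"id": "s2", "name": "refund-flow", "type": "AGENT"},
    {"id": "s3", "name": "lookup-order", "type": "TOOL_CALL"},
    {"id": "s4", "name": "process-refund", "type": "TOOL_CALL"},
    {"id": "s5", "name": "process-refund-retry", "type": "TOOL_CALL"},
    {"id": "s6", "name": "compose-response", "type": "LLM_CALL"},
]

# spanId is a hypothetical link field used only in this sketch.
events = [
    {"spanId": "s4", "type": "ERROR", "name": "payment-gateway-timeout"},
    {"spanId": "s4", "type": "RETRY", "name": "retrying with backup gateway"},
]
decisions = [
    {"spanId": "s1", "type": "ROUTING", "chosen": "refund-flow"},
]


def annotate(spans, events, decisions) -> Dict[str, Dict[str, List[str]]]:
    # Group event and decision types under the span they belong to.
    events_by_span = defaultdict(list)
    for event in events:
        events_by_span[event["spanId"]].append(event["type"])
    decisions_by_span = defaultdict(list)
    for decision in decisions:
        decisions_by_span[decision["spanId"]].append(decision["type"])
    return {
        span["name"]: {
            "events": list(events_by_span.get(span["id"], [])),
            "decisions": list(decisions_by_span.get(span["id"], [])),
        }
        for span in spans
    }


summary = annotate(spans, events, decisions)
spans_with_errors = [
    name for name, info in summary.items() if "ERROR" in info["events"]
]
print(spans_with_errors)  # ['process-refund']
```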