A primer on tracing for LLM applications

@lotte_verheyden
May 15, 2026

TL;DR

This guide explains how tracing serves as the foundation for the AI engineering loop, detailing the structure of observations, hierarchical traces, and the difference between traces and sessions.

This is one piece of a series we’re publishing as part of the Langfuse Academy, where we walk through the full AI engineering lifecycle. If you’re new to the series, The AI Engineering Loop is the best place to start.

A short recap of the AI Engineering Loop

The AI Engineering Loop is how teams continuously improve AI systems. It connects what’s happening in production (tracing, monitoring) to structured iteration during development (datasets, experiments, evaluation). Each shipped improvement produces new data, and teams loop through this process continuously.

You can read more on this here.

How tracing fits into the loop

Traditional software is largely deterministic: executions follow a predefined path. For LLM applications that's not the case. Agent executions can be messy; we are dealing with emergent behaviour, rich and unexpected inputs and outputs, and variable execution order. You need something else to follow your agent's behavior: traces.

Tracing is central to the entire improvement loop. Every other step (reviewing, building datasets, running experiments, evaluating) operates on traces.

If you're already familiar with traditional observability concepts, some of what follows may feel repetitive. Feel free to skim or skip ahead.

The anatomy of a trace

A trace can be as complex or as simple as your application requires, but all traces share the same basic structure. It's composed of a set of observations that map out the path your agent took.

An observation is a single step in the process. It has an input, an output, start/end time, and metadata about what happened during that step.
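To make this concrete, here is a minimal sketch of an observation as a plain Python dataclass. The field names are illustrative, not any specific SDK's schema:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class Observation:
    """One step in a trace: input, output, timing, and metadata."""
    name: str
    input: Optional[dict] = None
    output: Optional[dict] = None
    start_time: Optional[datetime] = None
    end_time: Optional[datetime] = None
    metadata: dict = field(default_factory=dict)

    @property
    def latency_ms(self) -> Optional[float]:
        # Derived from the start/end timestamps when both are present.
        if self.start_time and self.end_time:
            return (self.end_time - self.start_time).total_seconds() * 1000
        return None

obs = Observation(
    name="summarize",
    input={"prompt": "Summarize this article"},
    output={"text": "A short summary..."},
    start_time=datetime(2026, 5, 15, 12, 0, 0),
    end_time=datetime(2026, 5, 15, 12, 0, 2),
    metadata={"model": "example-model"},
)
```

Note that latency falls out of the start/end timestamps rather than being stored separately, which keeps the record consistent.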

Hierarchy

A trace has a hierarchical tree structure. Nested inside are observations that can contain other observations, forming a parent-child structure that mirrors the actual execution of your AI application.

You can see what happened in what order, and which steps were part of which larger step.
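A trace tree can be sketched as nested dictionaries, with a walk that visits parents before their children; the observation names here are made up:

```python
# A trace as a tree of observations; children mirror nested execution.
trace = {
    "name": "handle_request",
    "children": [
        {"name": "retrieve_context", "children": []},
        {"name": "llm_call", "children": [
            {"name": "tool: web_search", "children": []},
        ]},
    ],
}

def walk(node, depth=0):
    """Yield (depth, name) in execution order, parents before children."""
    yield depth, node["name"]
    for child in node["children"]:
        yield from walk(child, depth + 1)

for depth, name in walk(trace):
    print("  " * depth + name)
```

Printing with indentation proportional to depth reproduces the familiar tree view: which steps happened, in what order, and inside which larger step.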

Observation data

Input and output. Every observation can have an input and an output. Most of the time it will have both; in some specific cases it might only have one of the two. It's important for interpretability that you set an input and/or output that makes sense for the type of action happening in that observation.

Observation types. To make it easy to differentiate between operations, observations come in different types. Each type captures a different kind of agent interaction, such as an LLM generation or a generic unit of work.

Observation types make it easier to read traces and to filter. In a trace with 20 observations, being able to quickly spot the LLM calls saves time.
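Langfuse, for instance, distinguishes spans (generic units of work), generations (LLM calls), and events (point-in-time occurrences). A sketch of filtering by type, with a made-up observation list:

```python
from enum import Enum

class ObservationType(Enum):
    SPAN = "span"              # generic unit of work
    GENERATION = "generation"  # an LLM call
    EVENT = "event"            # a point-in-time occurrence

# A flat list of (type, name) pairs standing in for a real trace.
observations = [
    (ObservationType.SPAN, "retrieve_context"),
    (ObservationType.GENERATION, "draft_answer"),
    (ObservationType.EVENT, "cache_hit"),
    (ObservationType.GENERATION, "refine_answer"),
]

# Spot the LLM calls without reading every observation.
llm_calls = [name for t, name in observations if t is ObservationType.GENERATION]
```

The same filter scales from four observations to four hundred, which is where typed observations earn their keep.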

Cost, latency, token usage

Beyond input and output, there are a few attributes on observations that are table stakes in any LLM application: cost, latency, and token usage. These are recorded per observation and aggregated at the trace level.
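The roll-up from observation to trace is a straightforward aggregation. A sketch with made-up token counts and prices:

```python
# Per-observation usage rolled up to the trace level.
# All numbers are illustration values, not real model prices.
observations = [
    {"name": "plan",   "input_tokens": 350,  "output_tokens": 60,  "cost_usd": 0.0011, "latency_ms": 820},
    {"name": "answer", "input_tokens": 1200, "output_tokens": 450, "cost_usd": 0.0049, "latency_ms": 2100},
]

trace_totals = {
    "input_tokens":  sum(o["input_tokens"] for o in observations),
    "output_tokens": sum(o["output_tokens"] for o in observations),
    "cost_usd":      round(sum(o["cost_usd"] for o in observations), 4),
    "latency_ms":    sum(o["latency_ms"] for o in observations),
}
```

One caveat: summing latencies only gives trace-level latency for sequential steps; if observations run in parallel, the trace's wall-clock latency comes from its own start/end timestamps instead.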

Traces vs sessions

Most of the time, a single trace does not capture an agent's entire lifecycle. Traces can be grouped into sessions. But where do you draw the line between a trace and a session?

A general rule of thumb is: one trace corresponds to one invocation of your system, typically one API call or one agent execution. A session then groups multiple traces together, for example all the turns in a multi-turn conversation.
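In practice, grouping usually works by tagging each trace with a shared session identifier. A sketch with made-up IDs and fields:

```python
from collections import defaultdict

# Each trace is one invocation; a shared session_id groups the
# turns of one conversation. IDs and inputs are illustrative.
traces = [
    {"trace_id": "t1", "session_id": "s-abc", "input": "What is tracing?"},
    {"trace_id": "t2", "session_id": "s-abc", "input": "And what is a session?"},
    {"trace_id": "t3", "session_id": "s-xyz", "input": "Summarize my notes"},
]

sessions = defaultdict(list)
for t in traces:
    sessions[t["session_id"]].append(t["trace_id"])
```

Here the two turns of one conversation land in session `s-abc`, while an unrelated invocation forms its own session.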

Where to start

If you're just getting started, focus on instrumenting one real workflow end to end before trying to cover every possible path.

  1. Set up tracing for one important request path in your application.
  2. Make sure each observation captures useful input, output, and metadata for the step it represents.
  3. Review a handful of real traces manually to confirm that the structure is easy to follow and useful for debugging.
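The steps above can be sketched with a toy decorator that records each step into an in-memory trace; `traced`, `TRACE`, and both functions are hypothetical, and a real SDK would handle nesting and persistence for you:

```python
import functools
import time

TRACE = []  # collected observations for the current request (toy in-memory store)

def traced(fn):
    """Record input, output, and timing for one step (hypothetical helper)."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        TRACE.append({
            "name": fn.__name__,
            "input": {"args": args, "kwargs": kwargs},
            "output": result,
            "latency_ms": (time.perf_counter() - start) * 1000,
        })
        return result
    return wrapper

@traced
def retrieve(query):
    # Stand-in for a retrieval step.
    return ["doc-1", "doc-2"]

@traced
def answer(query):
    docs = retrieve(query)
    return f"Answer based on {len(docs)} documents"

answer("what is tracing?")
```

Even this toy version supports step 3: you can print `TRACE` after a request and check whether the recorded inputs and outputs would actually help you debug.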

What comes next

Once you have traces flowing, you can move on to the next step: monitoring. Monitoring is what connects traces to the loop of improving and iterating on your agent.
