User-Bot Latency Observer

The UserBotLatencyObserver measures the time between when a user stops speaking and when the bot starts responding. It emits events for custom handling and, when tracing is enabled, records the measurement on OpenTelemetry spans. It also tracks first-bot-speech latency and, when metrics are enabled, provides a detailed per-service latency breakdown.

Features

Tracks user speech start/stop timing using VAD frames
Measures bot response latency from the actual moment the user stopped speaking (adjusted for the VAD stop_secs delay)
Measures first bot speech latency (client connection to first speech)
Provides detailed latency breakdown with per-service TTFB, text aggregation, user turn duration, and function call metrics
Emits on_latency_measured events for custom processing
Emits on_latency_breakdown events with detailed per-service metrics
Emits on_first_bot_speech_latency events for greeting latency measurement
Automatically records latency as OpenTelemetry span attributes when tracing is enabled
Automatically resets between conversation turns

Usage

Basic Latency Monitoring

Add latency monitoring to your pipeline and handle the event:

from pipecat.observers.user_bot_latency_observer import UserBotLatencyObserver
from pipecat.pipeline.task import PipelineParams, PipelineTask

latency_observer = UserBotLatencyObserver()

@latency_observer.event_handler("on_latency_measured")
async def on_latency_measured(observer, latency):
    # latency is the user-to-bot latency in seconds (float)
    print(f"User-to-bot latency: {latency:.3f}s")

# `pipeline` is your existing Pipeline instance
task = PipelineTask(
    pipeline,
    params=PipelineParams(observers=[latency_observer]),
)
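
A single measurement is rarely informative on its own, so it can help to aggregate samples across turns. The following is a minimal sketch, not part of Pipecat: `LatencyStats` is a hypothetical helper that stores the floats passed to the handler and reports count, mean, maximum, and a nearest-rank 95th percentile.

```python
import math
import statistics
from typing import Dict, List


class LatencyStats:
    """Hypothetical helper: collects latency samples in seconds."""

    def __init__(self) -> None:
        self._samples: List[float] = []

    def add(self, latency: float) -> None:
        self._samples.append(latency)

    def summary(self) -> Dict[str, float]:
        if not self._samples:
            return {"count": 0}
        ordered = sorted(self._samples)
        # Nearest-rank p95: the smallest sample that has at least 95% of
        # all samples at or below it.
        rank = max(1, math.ceil(0.95 * len(ordered)))
        return {
            "count": len(ordered),
            "mean": statistics.fmean(ordered),
            "max": ordered[-1],
            "p95": ordered[rank - 1],
        }


stats = LatencyStats()


# In a real pipeline this coroutine would be registered with
# @latency_observer.event_handler("on_latency_measured").
async def on_latency_measured(observer, latency):
    stats.add(latency)
```

Calling `stats.summary()` when the session ends gives one compact record per conversation instead of one log line per turn.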

Detailed Latency Breakdown

Enable metrics to collect per-service latency breakdown:

from pipecat.observers.user_bot_latency_observer import UserBotLatencyObserver
from pipecat.pipeline.task import PipelineParams, PipelineTask

latency_observer = UserBotLatencyObserver()

@latency_observer.event_handler("on_latency_breakdown")
async def on_latency_breakdown(observer, breakdown):
    # Build the event list once and reuse it for the count and the loop
    events = breakdown.chronological_events()
    print(f"Latency breakdown ({len(events)} events):")
    for event in events:
        print(f"  {event}")

# `pipeline` is your existing Pipeline instance
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        observers=[latency_observer],
        enable_metrics=True,  # Required for breakdown metrics
    ),
)

OpenTelemetry Integration

When tracing is enabled, latency measurements are automatically recorded as turn.user_bot_latency_seconds attributes on OpenTelemetry turn spans. No additional configuration is needed.

How It Works

The observer tracks conversation flow through these key events:

Client connects (ClientConnectedFrame) → Records timestamp for first-bot-speech measurement
User starts speaking (VADUserStartedSpeakingFrame) → Resets latency tracking
User stops speaking (VADUserStoppedSpeakingFrame) → Records timestamp, accounting for VAD stop_secs delay
Bot starts speaking (BotStartedSpeakingFrame) → Calculates latency and emits on_latency_measured and on_latency_breakdown events

When enable_metrics=True in PipelineParams, the observer also collects per-service metrics (TTFB, text aggregation, function call latency) from MetricsFrame instances and includes them in the latency breakdown.
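
The turn logic described above can be sketched as a small state machine. This is an illustration only, not Pipecat's implementation: `TurnLatencyTracker` and its method names are hypothetical, timestamps are plain floats in seconds, and the sketch assumes that "accounting for VAD stop_secs" means subtracting `stop_secs` from the time the stop frame was observed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnLatencyTracker:
    """Illustrative stand-in for the observer's turn logic (not Pipecat code)."""

    user_stopped_at: Optional[float] = None

    def on_user_started(self) -> None:
        # A new user turn invalidates any pending measurement.
        self.user_stopped_at = None

    def on_user_stopped(self, frame_time: float, stop_secs: float) -> None:
        # Assumption: the VAD reports the stop only after `stop_secs` of
        # silence, so the user actually stopped that much earlier.
        self.user_stopped_at = frame_time - stop_secs

    def on_bot_started(self, frame_time: float) -> Optional[float]:
        # Without a recorded user stop there is nothing to measure.
        if self.user_stopped_at is None:
            return None
        latency = frame_time - self.user_stopped_at
        self.user_stopped_at = None  # reset for the next turn
        return latency


tracker = TurnLatencyTracker()
tracker.on_user_started()
tracker.on_user_stopped(frame_time=10.5, stop_secs=0.5)
latency = tracker.on_bot_started(frame_time=11.25)
```

Under this assumption the reported latency includes the VAD silence window, which is why it is larger than the gap between the two frames themselves.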

Event Handlers

on_latency_measured

Called each time a user-to-bot latency measurement is captured.

from loguru import logger  # Pipecat logs through loguru

@latency_observer.event_handler("on_latency_measured")
async def on_latency_measured(observer, latency):
    # latency is a float representing seconds
    logger.info(f"Response latency: {latency:.3f}s")

on_latency_breakdown

Called alongside on_latency_measured with detailed per-service metrics collected during the user→bot cycle. The breakdown includes TTFB from each service, text aggregation latency, user turn duration, and function call timings.

@latency_observer.event_handler("on_latency_breakdown")
async def on_latency_breakdown(observer, breakdown):
    # breakdown is a LatencyBreakdown object
    logger.info("Latency breakdown:")
    for event in breakdown.chronological_events():
        logger.info(f"  {event}")

LatencyBreakdown fields:

| Field | Type | Description |
| --- | --- | --- |
| `ttfb` | `List[TTFBBreakdownMetrics]` | Time-to-first-byte metrics from each service |
| `text_aggregation` | `Optional[TextAggregationBreakdownMetrics]` | First text aggregation measurement (sentence aggregation latency) |
| `user_turn_start_time` | `Optional[float]` | Unix timestamp when the user turn started (adjusted for VAD stop_secs) |
| `user_turn_secs` | `Optional[float]` | User turn duration in seconds, including VAD silence detection, STT finalization, and turn analyzer wait |
| `function_calls` | `List[FunctionCallMetrics]` | Latency for each function call executed during the cycle |

The breakdown.chronological_events() method returns a human-readable list of all metrics sorted by start time, useful for logging and debugging.
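
For dashboards or structured logs, the breakdown can also be reduced to a few numbers. The sketch below relies only on the field names listed in the table above; `summarize_breakdown` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in for the real LatencyBreakdown so the example runs without a pipeline.

```python
from types import SimpleNamespace
from typing import Any, Dict


def summarize_breakdown(breakdown: Any) -> Dict[str, Any]:
    """Reduce a LatencyBreakdown-like object to a few loggable values."""
    return {
        "ttfb_services": len(breakdown.ttfb),
        "function_calls": len(breakdown.function_calls),
        "has_text_aggregation": breakdown.text_aggregation is not None,
        "user_turn_secs": breakdown.user_turn_secs,
    }


# Stand-in with the documented field names; in a pipeline the real object
# arrives as the second argument of the on_latency_breakdown handler.
fake_breakdown = SimpleNamespace(
    ttfb=["stt", "llm", "tts"],  # placeholders for TTFBBreakdownMetrics
    text_aggregation=None,
    user_turn_start_time=None,
    user_turn_secs=1.2,
    function_calls=[],
)
summary = summarize_breakdown(fake_breakdown)
```

Inside a handler, the same function would be called as `summarize_breakdown(breakdown)`.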

on_first_bot_speech_latency

Called once when the bot first speaks after client connection. Measures the time from ClientConnectedFrame to the first BotStartedSpeakingFrame. This is particularly useful for measuring greeting latency.

@latency_observer.event_handler("on_first_bot_speech_latency")
async def on_first_bot_speech_latency(observer, latency):
    logger.info(f"First bot speech latency: {latency:.3f}s")

The on_latency_breakdown event is also emitted for the first bot speech, allowing you to see the detailed breakdown of what contributed to the greeting latency.

Configuration

Constructor Parameters

max_frames (int, default: 100)

Maximum number of frame IDs to keep in history for duplicate detection. Prevents unbounded memory growth in long conversations.
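
To illustrate what a bounded history for duplicate detection looks like, here is a minimal sketch. It is not the observer's actual code: `SeenFrames` is a hypothetical class that pairs a deque (for eviction order) with a set (for fast lookup), which is one common way to cap memory while still recognizing recent duplicates.

```python
from collections import deque
from typing import Deque, Set


class SeenFrames:
    """Remembers at most `max_frames` frame IDs, evicting the oldest first."""

    def __init__(self, max_frames: int = 100) -> None:
        self._max_frames = max_frames
        self._order: Deque[int] = deque()
        self._ids: Set[int] = set()

    def first_time(self, frame_id: int) -> bool:
        """Return True the first time an ID is seen, False for duplicates."""
        if frame_id in self._ids:
            return False
        self._ids.add(frame_id)
        self._order.append(frame_id)
        if len(self._order) > self._max_frames:
            # Forget the oldest ID so memory use stays bounded.
            self._ids.discard(self._order.popleft())
        return True
```

The trade-off in this design is that an ID evicted from the history is treated as new if it appears again, so the limit should comfortably exceed the number of frames that can be in flight at once.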

Limitations

Requires the frames listed under How It Works to arrive in the expected order; measurements may be inaccurate if they do not
Per-service metrics are only collected when enable_metrics=True in PipelineParams
