# Per-Record Tracing
Volley can trace individual records as they flow through a pipeline, creating OpenTelemetry spans at each processing stage. This gives you a complete picture of what happened to a record: which operators processed it, how long each step took, what the input and output looked like, and whether any errors occurred.
## Quick Start

```rust
use volley_core::prelude::*;
use volley_core::observability::{TracingConfig, SamplingStrategy};

let results = StreamExecutionEnvironment::new()
    .from_source(my_source)
    .with_tracing(TracingConfig {
        sampling: SamplingStrategy::Ratio(0.01),
        ..Default::default()
    })
    .map(|record| {
        // Your transformation
        Ok(record)
    })
    .to_sink(my_sink)
    .execute("traced-pipeline")
    .await?;
```
Open Jaeger or Grafana Tempo to see traces. Each sampled record appears as a trace with child spans for source, operators, and sink.
## How It Works

- Sampling decision happens at the source. Based on your `SamplingStrategy`, the runtime decides whether to trace each record.
- Root span is created for sampled records (`volley.record`). The span's context is attached to the `StreamRecord.trace` field.
- Child spans are created automatically at each pipeline stage (operators, sink). No code changes are needed in your operators.
- Payload previews (optional) capture a JSON snapshot of the first row at each operator's input and output, size-capped to `max_payload_bytes`.
- Cleanup happens at pipeline exit. Any root spans for records that were filtered or lost to errors are ended.
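The trace context rides on the record itself. As a minimal sketch of what that might look like (the type shapes below are assumptions for illustration, not Volley's actual definitions):

```rust
/// Illustrative only: a trace/span identifier pair.
#[derive(Clone, Copy, PartialEq, Debug)]
struct SpanContext {
    trace_id: u128,
    span_id: u64,
}

/// Hypothetical shape of a record carrying its trace context.
struct StreamRecord<T> {
    payload: T,
    /// `Some` only for records the sampler selected at the source;
    /// downstream stages derive child spans from this context.
    trace: Option<SpanContext>,
}
```

Operators never touch this field directly; the runtime reads and updates it at each stage.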
## Sampling

At production throughput (thousands of records per second), tracing every record would overwhelm your tracing backend. Use `Ratio` sampling:

| Throughput | Suggested ratio | Traces/sec |
|---|---|---|
| 1K rec/s | 0.10 (10%) | ~100 |
| 10K rec/s | 0.01 (1%) | ~100 |
| 100K rec/s | 0.001 (0.1%) | ~100 |

Use `Always` only during development or debugging.
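The per-record decision itself is cheap. A minimal sketch of the three strategies, assuming the runtime supplies a uniform random draw `r` in `[0, 1)` (the function name and signature are illustrative, not Volley's API):

```rust
enum SamplingStrategy {
    Never,
    Always,
    Ratio(f64),
}

/// Decide whether to trace one record. `r` stands in for the runtime's
/// uniform random draw in [0, 1); real code would call an RNG here.
fn should_sample(strategy: &SamplingStrategy, r: f64) -> bool {
    match strategy {
        SamplingStrategy::Never => false,
        SamplingStrategy::Always => true,
        SamplingStrategy::Ratio(p) => r < *p,
    }
}
```

With `Ratio(0.01)`, about 1 in 100 draws falls below the threshold, which is where the ~100 traces/sec figures in the table come from.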
## Window and Aggregate Traces

Window and aggregate operators accumulate input records into new output records. The output gets its own trace, linked back to the input traces via OpenTelemetry span links. This preserves lineage without creating misleading parent-child relationships.

The `volley.window.input_trace_count` attribute tells you how many sampled records contributed to each aggregation result.
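As a sketch of the linkage (type and field names here are illustrative assumptions, not Volley internals): the output span carries a list of links to the sampled inputs' contexts instead of a parent, and the attribute above is just that list's length.

```rust
/// Illustrative only: a span link points at another trace's context.
struct SpanLink {
    trace_id: u128,
    span_id: u64,
}

/// What a window's output span might carry instead of a parent span.
struct WindowOutputSpan {
    links: Vec<SpanLink>,
}

impl WindowOutputSpan {
    /// Corresponds to the `volley.window.input_trace_count` attribute.
    fn input_trace_count(&self) -> usize {
        self.links.len()
    }
}
```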
## Kafka Trace Propagation

When using Kafka connectors, trace context propagates automatically via W3C `traceparent` headers:

- Kafka source extracts `traceparent` from message headers
- Kafka sink injects `traceparent` into produced messages

A traced request in an upstream service continues its trace through Volley and into downstream consumers. The runtime respects upstream sampling decisions (parent-based sampling).
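The header format itself comes from the W3C Trace Context specification: `version-traceid-parentid-flags`, where the low bit of the flags byte carries the upstream sampling decision. A minimal parser sketch (this helper is illustrative, not part of Volley's API):

```rust
/// Parse a version-00 W3C `traceparent` value like
/// "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
/// into (trace_id, parent_span_id, sampled).
fn parse_traceparent(value: &str) -> Option<(String, String, bool)> {
    let parts: Vec<&str> = value.split('-').collect();
    // Version "00": exactly four fields with fixed hex widths.
    if parts.len() != 4 || parts[0] != "00" {
        return None;
    }
    if parts[1].len() != 32 || parts[2].len() != 16 {
        return None;
    }
    let flags = u8::from_str_radix(parts[3], 16).ok()?;
    // Bit 0 of the flags byte is the "sampled" flag.
    Some((parts[1].to_string(), parts[2].to_string(), flags & 0x01 == 0x01))
}
```

Parent-based sampling means that when the extracted flag is set, Volley traces the record regardless of its own `SamplingStrategy`.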
## Configuration Reference

| Field | Type | Default | Description |
|---|---|---|---|
| `sampling` | `SamplingStrategy` | `Never` | Which records to trace |
| `max_payload_bytes` | `usize` | `1024` | Max JSON preview size per span event |
| `capture_payload` | `bool` | `true` | Whether to attach input/output previews |
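For intuition, `max_payload_bytes` caps each preview roughly like the sketch below (an assumption about the truncation behavior; the real implementation may differ, e.g. by eliding fields instead of cutting the string):

```rust
/// Cap a JSON preview at `max_bytes`, backing up to a UTF-8 character
/// boundary so the truncated slice is still valid text.
fn cap_preview(json: &str, max_bytes: usize) -> &str {
    if json.len() <= max_bytes {
        return json;
    }
    let mut end = max_bytes;
    while !json.is_char_boundary(end) {
        end -= 1;
    }
    &json[..end]
}
```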
## Viewing Traces

Volley emits traces via the OTLP gRPC exporter configured in `ObservabilityConfig`. Point it at your collector:

```rust
.with_observability(ObservabilityConfig::new()
    .with_node_id("node-1")
    .with_otlp_endpoint("http://otel-collector:4317"))
.with_tracing(TracingConfig {
    sampling: SamplingStrategy::Ratio(0.01),
    ..Default::default()
})
```
Compatible backends: Jaeger, Grafana Tempo, Datadog, Honeycomb, or any OTLP-compatible collector.
## Error Handling

When a record fails during processing (e.g., a transformation returns an error), the record's trace span is:

- Marked with `otel.status_code = ERROR`
- Annotated with the error message via `span.record_error()`
- Ended at the point of failure

Failed records are visible in your tracing backend by filtering on `otel.status_code = ERROR`. This makes tracing a useful debugging tool even when records are dropped or filtered.
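The three steps above can be sketched on a toy span type (illustrative names only; Volley's real spans are OpenTelemetry spans, not this struct):

```rust
#[derive(PartialEq, Debug)]
enum StatusCode {
    Unset,
    Error,
}

/// Toy stand-in for a record's trace span.
struct RecordSpan {
    status: StatusCode,
    events: Vec<String>,
    ended: bool,
}

impl RecordSpan {
    /// What happens when a transformation returns an error.
    fn fail(&mut self, message: &str) {
        self.status = StatusCode::Error; // otel.status_code = ERROR
        self.events.push(format!("exception: {message}")); // span.record_error()
        self.ended = true; // span ends at the point of failure
    }
}
```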
## Performance Impact

Tracing adds minimal overhead when sampling is configured appropriately:

| Sampling | Overhead | Notes |
|---|---|---|
| `Never` | ~0% | No spans created; trace context still propagated for Kafka |
| `Ratio(0.001)` | < 1% | Recommended for high-throughput production |
| `Ratio(0.01)` | ~1-2% | Good for staging or moderate-throughput production |
| `Always` | ~5-10% | Development and debugging only |

The overhead comes from span creation and OTLP export, not from sampling decisions, each of which is a single random-number comparison.
## Troubleshooting

### No traces appearing in the backend

- Verify `with_observability()` is configured with the correct `otlp_endpoint`
- Check that the OTLP collector is reachable from your application
- Confirm `sampling` is not set to `Never` (the default)
- Check collector logs for rejected spans (often caused by resource limits)
### Traces appear but are missing operators

- Ensure `with_tracing()` is called before operators are added to the pipeline
- Custom operators using `apply_operator()` create spans automatically; no manual instrumentation is needed
### Kafka traces not connecting to upstream services

- The upstream producer must inject W3C `traceparent` headers into Kafka message headers
- If the upstream doesn't send `traceparent`, Volley creates a new root trace (no parent linkage)