Writing Connectors
Connectors are how Volley reads from and writes to external systems. Volley provides built-in connectors for Kafka, cloud blob stores, Delta Lake, and Iceberg. You can also write your own.
Source and Sink Traits
#![allow(unused)]
fn main() {
// Source: produces StreamRecords
trait Source: Send {
async fn poll(&mut self) -> Option<StreamRecord>;
fn connector_id(&self) -> &str;
}
// Sink: consumes StreamRecords
trait Sink: Send {
async fn write(&mut self, record: StreamRecord) -> Result<()>;
async fn flush(&mut self) -> Result<()>;
fn on_checkpoint(&mut self, epoch: u64);
fn connector_id(&self) -> &str;
}
}
Sources produce StreamRecord values via poll(). Sinks consume them via write() and flush(). Both support the checkpoint lifecycle via on_checkpoint().
Cloud Blob Store Pattern
All cloud blob store connectors follow a shared architecture:
Cloud Queue/Topic Cloud Storage
(SQS / Azure Queue / (S3 / Azure Blob /
Pub/Sub) GCS)
| |
v v
NotificationSource BlobReader
| |
+--------+ +-----------+
| |
v v
BlobSource (volley-connector-blob-store)
|
v
Decoder (Parquet / JSON / CSV / Avro)
|
v
RecordBatch --> Pipeline
To add a new cloud backend, implement two traits:
NotificationSource— polls a queue for new object notificationsBlobReader— reads object data from cloud storage
The decoder layer (Parquet, JSON, CSV, Avro -> Arrow RecordBatch) is shared across all cloud backends.
Feature Flags
The volley-connectors umbrella crate uses feature flags:
| Feature | What it includes |
|---|---|
kafka | KafkaSource, KafkaSink |
blob-store | BlobSource, decoders |
blob-store-aws | S3Source, SqsNotificationSource |
blob-store-azure | AzureBlobSource, AzureQueueNotificationSource |
blob-store-gcs | GcsSource, PubSubNotificationSource |
file-formats | Parquet, JSON, CSV, Avro decoders |
table-formats | Delta Lake + Iceberg sinks |
Existing Connectors
- Kafka — source and sink with exactly-once semantics
- Delta Lake — sink with epoch-tagged commits
- Iceberg — sink via REST catalog
- AWS S3 — source via SQS notifications
- Blob Store — Azure Blob and GCS sources