S3 Output Node
The S3 Output node writes workflow output data to Amazon S3 via a configured data asset, with optional batching for high-volume streams.
The node uses a data asset to encapsulate the AWS connection details (region, bucket, key path, and file format), making it well suited to data archival, backup, or integration with other AWS services.
Quick Reference
S3 Folder
Select an existing S3 folder data asset or create a new one.
Enable Batching
Toggle to enable batching of records before writing to S3. Default: true.
Batch Size
Number of records to accumulate before writing. Default: 10,000. Only available when batching is enabled.
Flush Interval (ms)
Maximum time in milliseconds to wait before forcing a write. Default: 10,000. Only available when batching is enabled.
Configuration
| Field | Description | Required | Default |
|---|---|---|---|
| S3 Folder | Select an existing S3 folder data asset or create a new one. The data asset encapsulates the AWS connection, region, bucket, key path, and file format. | Yes | N/A |
| Enable Batching | Toggle to enable batching of records before writing. | No | true |
| Batch Size | Number of records to accumulate before writing (only when batching is enabled). Minimum: 1. | No | 10,000 |
| Flush Interval (ms) | Maximum time (ms) to wait before forcing a write (only when batching is enabled). | No | 10,000 |
Batching
When batching is disabled, each record is written to S3 individually. This is simple but can result in a high number of S3 API calls.
When batching is enabled, records accumulate in memory and are flushed to S3 when the first of the following conditions is met:
- The number of accumulated records reaches the Batch Size (default: 10,000)
- The Flush Interval timer expires (default: 10,000 ms)
This reduces the number of write operations and is recommended for high-volume streams.
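The size-or-time flush policy above can be sketched as follows. This is an illustrative sketch, not the node's actual implementation; the `BatchBuffer` class and `write_batch` callback are assumptions standing in for the real S3 write.

```python
import time


class BatchBuffer:
    """Sketch of a size-or-time flush policy: flush when the buffer
    reaches batch_size records, or when flush_interval_ms elapses."""

    def __init__(self, batch_size=10_000, flush_interval_ms=10_000,
                 write_batch=print):
        self.batch_size = batch_size
        self.flush_interval = flush_interval_ms / 1000.0
        self.write_batch = write_batch  # stand-in for the S3 write
        self.records = []
        self.last_flush = time.monotonic()

    def add(self, record):
        self.records.append(record)
        if len(self.records) >= self.batch_size:
            self.flush()

    def maybe_flush(self):
        # Called periodically (e.g. by a timer) to enforce the interval.
        if self.records and time.monotonic() - self.last_flush >= self.flush_interval:
            self.flush()

    def flush(self):
        if self.records:
            self.write_batch(self.records)
            self.records = []
        self.last_flush = time.monotonic()
```

Whichever condition fires first triggers the flush, so a slow stream still gets written within the flush interval while a fast stream is bounded by batch size.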
File Formats
The output file format is configured on the S3 data asset. Supported formats:
- CSV (`.csv`)
- JSON Object (`.json`) — does not support batching
- JSON Array (`.json`)
- JSON Lines (`.jsonl`)
- Parquet (`.parquet`) — requires batching enabled and an Avro schema configured on the data asset
Output Path Structure
The S3 key path depends on whether batching is enabled.
Batching mode:
`{key}/year=YYYY/month=MM/day=DD/{uuid}.{ext}`
Files are organized into date-partitioned folders using UTC timestamps. Each batch is written with a unique UUID filename.
Without batching:
`{key}/{epochMillis}.{ext}`
Each record is written individually with a millisecond-precision timestamp filename.
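The two key templates above can be sketched in Python. This is an assumed reconstruction for illustration only; the helper names `batched_key` and `unbatched_key` are not part of the product.

```python
import time
import uuid
from datetime import datetime, timezone


def batched_key(key: str, ext: str) -> str:
    """Date-partitioned key used when batching is enabled (UTC-based)."""
    now = datetime.now(timezone.utc)
    return (f"{key}/year={now:%Y}/month={now:%m}/day={now:%d}/"
            f"{uuid.uuid4()}.{ext}")


def unbatched_key(key: str, ext: str) -> str:
    """Per-record key with a millisecond-precision epoch timestamp."""
    return f"{key}/{int(time.time() * 1000)}.{ext}"
```

For example, `batched_key("exports/events", "jsonl")` yields a key like `exports/events/year=2024/month=06/day=01/<uuid>.jsonl`, while the unbatched form yields `exports/events/<epochMillis>.csv`.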
Usage Tips
- Ensure the data asset's AWS credentials have appropriate permissions to write to the target S3 bucket
- Enable batching for high-volume streams to reduce the number of S3 write operations
- Parquet format requires batching to be enabled and an Avro schema on the data asset
- JSON Object format is incompatible with batching
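For the permissions tip, a minimal IAM policy granting write access to the target key path might look like the following. The bucket name and key path are placeholders; your environment may also require additional actions (e.g. for multipart uploads).

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::your-bucket/your/key/path/*"
    }
  ]
}
```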