
GCS Source Node

Quick Reference

Use Credentials: Select a GCP credential used to access the bucket. Optional — if no credential is selected, the node uses the workflow runtime's Application Default Credentials.

Bucket Name: The Google Cloud Storage bucket the node reads from. ex: my-data-lake

Object Prefix: Optional prefix used to filter which objects in the bucket are read. Leave blank to read every object. ex: events/2026/

Encoding Type: Format the node uses to decode the content of each object. ex: JSON_OBJECT_LINE for newline-delimited JSON files

The gcssource node reads objects from a Google Cloud Storage bucket and emits each object's decoded content as workflow records.

Overview

The GCS Source connector pulls data stored in Google Cloud Storage into your workflow. It works by listing objects in the configured bucket (optionally filtered by a prefix), downloading each object, and decoding its content with the selected encoding type.

This source is designed for batch ingestion. When the workflow runs, the connector authenticates with GCP, iterates through the matching objects, and emits the records contained in each object.
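
For reference, the read path is roughly equivalent to the following standalone sketch using the google-cloud-storage Python client; the bucket name, prefix, and newline-delimited JSON decoding are illustrative assumptions, not the connector's actual internals.

```python
import json

from google.cloud import storage

# Authenticate with Application Default Credentials (or a selected credential).
client = storage.Client()

# List the matching objects, download each one, and decode it into records.
# Bucket name and prefix are placeholders taken from the configuration above.
for blob in client.list_blobs("my-data-lake", prefix="events/2026/"):
    content = blob.download_as_text()
    # JSON_OBJECT_LINE: one JSON record per non-empty line.
    for line in content.splitlines():
        if line.strip():
            record = json.loads(line)
            print(record)  # the connector emits this record into the workflow
```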

Prerequisites

Before configuring the source, ensure you have:

  • GCP credential with read access to the target bucket. The credential can use a service account JSON key, an OAuth access token, or Application Default Credentials.
  • Object access permission: the credential must have at least the Storage Object Viewer role on the bucket; the sketch after this list shows one way to verify this.
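
A quick way to confirm read access before the workflow runs is to ask GCS which of the required permissions the credential actually holds; a minimal sketch with the Python client (the bucket name is a placeholder):

```python
from google.cloud import storage

client = storage.Client()  # construct this with the same credential the workflow will use
bucket = client.bucket("my-data-lake")

# Storage Object Viewer grants storage.objects.get and storage.objects.list.
needed = ["storage.objects.get", "storage.objects.list"]
granted = bucket.test_iam_permissions(needed)

missing = sorted(set(needed) - set(granted))
if missing:
    print(f"Credential is missing permissions: {missing}")
else:
    print("Credential can list and read objects in the bucket.")
```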

Configuration

| Field | Description | Required | Placeholder |
| --- | --- | --- | --- |
| Use Credentials | Select or create a GCP credential. The credential stores either a service account JSON key, an OAuth access token, or Application Default Credentials. | No | gcs-credential |
| Bucket Name | Name of the Google Cloud Storage bucket the node reads from. | Yes | my-data-lake |
| Object Prefix | Prefix used to filter the objects listed in the bucket. Only objects whose name starts with this value are read. Leave blank to read every object. | No | events/2026/ |
| Encoding Type | Format used to decode each object's content into workflow records. | Yes | JSON_OBJECT_LINE |

Use Credentials

Select an existing GCP credential from the dropdown or create a new one. Supported authentication types:

  • Service Account JSON key — paste the full JSON keyfile contents.
  • Access Token — provide a short-lived OAuth 2.0 access token (a ya29.* value).
  • Application Default Credentials — used automatically when no credential is selected and the workflow runtime has ADC available.

The credential needs the Storage Object Viewer role (or equivalent read access) on the target bucket.
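
For context, these three authentication types map onto the standard Google Cloud Python client constructors roughly as follows; the key file path, token value, and project ID are placeholders.

```python
from google.cloud import storage
from google.oauth2.credentials import Credentials

# Service Account JSON key: point the client at the downloaded keyfile.
sa_client = storage.Client.from_service_account_json("service-account-key.json")

# Access Token: wrap a short-lived OAuth token (ya29.*) in a credentials object.
token_client = storage.Client(
    credentials=Credentials(token="ya29.example-token"),
    project="my-project",
)

# Application Default Credentials: no explicit credential; picked up from the environment.
adc_client = storage.Client()
```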

Bucket Name

The exact name of the GCS bucket. The bucket must exist before the workflow runs.

Object Prefix

Use the prefix to scope the read to a subset of the bucket — for example, a date-partitioned folder. The connector matches objects whose names start with the prefix exactly, so include a trailing / if you want to limit the read to a folder-like path.
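
Prefix matching is a plain starts-with comparison on the object name, which is why the trailing slash matters; the object names below are hypothetical and only illustrate the difference.

```python
object_names = [
    "events/2026/04/part-000.jsonl",
    "events/2026-backup.jsonl",
    "logs/2026/04/app.log",
]

# Without the trailing slash the backup file also matches.
print([n for n in object_names if n.startswith("events/2026")])
# ['events/2026/04/part-000.jsonl', 'events/2026-backup.jsonl']

# With the trailing slash only the folder-like path matches.
print([n for n in object_names if n.startswith("events/2026/")])
# ['events/2026/04/part-000.jsonl']
```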

Encoding Type

Choose the format that matches the objects you are reading. Common choices (a decoding sketch follows the list):

  • JSON_OBJECT_LINE — newline-delimited JSON, one record per line.
  • JSON_ARRAY — a JSON array where each element becomes a record.
  • JSON_OBJECT — a single JSON object per file.
  • CSV — comma-separated rows; the first row is treated as the header.
  • STRING_LINE / TEXT — plain text, one line per record.
  • XML / PARQUET — also supported.
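
How a downloaded object becomes workflow records differs per encoding; the sketch below is a rough approximation of the first few options, not the connector's actual decoder.

```python
import csv
import io
import json

def decode(content: str, encoding_type: str) -> list:
    """Turn one object's text content into a list of records (sketch only)."""
    if encoding_type == "JSON_OBJECT_LINE":
        # One JSON record per non-empty line.
        return [json.loads(line) for line in content.splitlines() if line.strip()]
    if encoding_type == "JSON_ARRAY":
        # Each element of the top-level array becomes a record.
        return json.loads(content)
    if encoding_type == "JSON_OBJECT":
        # The whole file is a single record.
        return [json.loads(content)]
    if encoding_type == "CSV":
        # The first row is the header; each following row becomes a dict record.
        return list(csv.DictReader(io.StringIO(content)))
    if encoding_type in ("STRING_LINE", "TEXT"):
        # Plain text, one line per record.
        return content.splitlines()
    raise ValueError(f"Encoding type not covered by this sketch: {encoding_type}")
```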

Examples

Example: Replay Archived Events

Use this node when historical events have been archived to GCS and you want to reprocess them through the workflow.

  • Select the credential that can read the archive bucket.
  • Enter the bucket name where the archive lives.
  • Set Object Prefix to the date range you want to replay (for example events/2026/04/).
  • Choose JSON_OBJECT_LINE if each archived file is a .jsonl export.

Example: Ingest a Single File

If you only want to process one file in the bucket, set Object Prefix to the full object key. The connector will list only that object and read it once.
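
A standalone equivalent with the Python client mirrors this by using the full object key as the prefix; the bucket name and object key below are placeholders.

```python
from google.cloud import storage

client = storage.Client()

# Using the full object key as the prefix lists exactly that object
# (plus any object whose name merely extends the key, which is usually none).
for blob in client.list_blobs("my-data-lake", prefix="events/2026/04/part-000.jsonl"):
    print(blob.download_as_text())
```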

Related Nodes

  • GCS Sink: Write workflow records back to a Google Cloud Storage bucket
  • S3 Sink: Write workflow records to AWS S3
  • Kafka Source: Consume streaming events from Kafka