
Databricks Sink Node

Quick Reference

  • Table (Asset): Select or create a table from Databricks.
  • Select Warehouse: Select the Databricks SQL Warehouse to execute the load command.

Overview

The Databricks Sink Node allows you to ingest processed data directly into Databricks Unity Catalog tables. Unlike direct file writers, this node leverages the Databricks SQL Compute engine to ensure ACID compliance and governance within the Databricks ecosystem.

How It Works

This node operates in a three-step "Stage and Load" process to maximize reliability and throughput:

  1. Stage: Data is converted to Parquet format locally.
  2. Upload: Parquet files are uploaded securely to a Databricks Volume (Unity Catalog managed storage).
  3. Load: A COPY INTO SQL command is executed on your designated SQL Warehouse. This command loads the data from the Volume into your target table transactionally (see the sketch below).
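
For illustration, here is a minimal Python sketch of the same Stage and Load sequence, assuming the pyarrow, databricks-sdk, and databricks-sql-connector packages. The table, Volume path, warehouse, and credentials are placeholders, not values the node derives for you.

```python
import io

import pyarrow as pa
import pyarrow.parquet as pq
from databricks import sql
from databricks.sdk import WorkspaceClient

# Placeholder destinations: substitute your own catalog, schema, volume, and warehouse.
TARGET_TABLE = "prod.finance.revenue_reports"
STAGING_DIR = "/Volumes/prod/finance/staging"
STAGING_FILE = f"{STAGING_DIR}/revenue_batch_0001.parquet"
HOST = "<workspace-host>.cloud.databricks.com"
HTTP_PATH = "/sql/1.0/warehouses/<warehouse-id>"
TOKEN = "<personal-access-token>"

# 1. Stage: convert the accumulated batch to Parquet in memory.
batch = pa.table({"region": ["EMEA", "APAC"], "revenue": [1200.0, 950.0]})
buf = io.BytesIO()
pq.write_table(batch, buf)
buf.seek(0)

# 2. Upload: push the Parquet bytes into a Unity Catalog Volume.
w = WorkspaceClient(host=f"https://{HOST}", token=TOKEN)
w.files.upload(STAGING_FILE, buf, overwrite=True)

# 3. Load: run COPY INTO on the SQL Warehouse; the load is transactional.
with sql.connect(server_hostname=HOST, http_path=HTTP_PATH, access_token=TOKEN) as conn:
    with conn.cursor() as cur:
        cur.execute(
            f"""
            COPY INTO {TARGET_TABLE}
            FROM '{STAGING_DIR}'
            FILEFORMAT = PARQUET
            """
        )

# Mirrors the "Cleanup After Copy" setting: remove the staged file after a successful load.
w.files.delete(STAGING_FILE)
```

The node performs this sequence for you; the sketch is only meant to make the moving parts concrete.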

Configuration

  • Select Table (Asset): Choose a target table from the dropdown (e.g., prod.finance.revenue_reports). You can select an existing table or define a new one directly in the UI. You can also create/import tables in the Data Assets section.
  • Select Warehouse: Select the Databricks SQL Warehouse to execute the load command.
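
For reference, creating the example table from the dropdown above by hand on a SQL Warehouse could look like the following sketch; the column definitions are entirely hypothetical and the connection details are placeholders.

```python
from databricks import sql

# Hypothetical schema for the example target table shown above.
DDL = """
CREATE TABLE IF NOT EXISTS prod.finance.revenue_reports (
    region    STRING,
    revenue   DOUBLE,
    loaded_at TIMESTAMP
)
"""

with sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    access_token="<personal-access-token>",
) as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
```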

Advanced Settings

These settings control the performance and behavior of the ingestion process but do not affect the destination topology.

  • Batch Size: Number of records to accumulate before triggering a "Stage and Load" operation. Default: 10,000.
  • Flush Interval: Maximum time (ms) to wait before forcing a write, ensuring low latency for low-volume streams. Default: 30,000 (30 s).
  • Cleanup After Copy: Automatically deletes the temporary Parquet files from the Databricks Volume after a successful load. Default: True.
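
To make the two triggers concrete: a flush happens when either the batch fills up or the interval elapses, whichever comes first. Below is a minimal, hypothetical sketch of that logic; buffer, on_record, and stage_and_load are illustrative names, not the node's internals.

```python
import time

BATCH_SIZE = 10_000         # records per "Stage and Load" operation
FLUSH_INTERVAL_MS = 30_000  # maximum wait before forcing a write

buffer: list = []
last_flush = time.monotonic()

def stage_and_load(records: list) -> None:
    """Placeholder for the Parquet -> Volume -> COPY INTO sequence described above."""
    ...

def should_flush() -> bool:
    """Flush when the batch is full or the interval has elapsed, whichever comes first."""
    elapsed_ms = (time.monotonic() - last_flush) * 1000
    return len(buffer) >= BATCH_SIZE or elapsed_ms >= FLUSH_INTERVAL_MS

def on_record(record) -> None:
    """Accumulate a record and trigger a flush when either condition is met."""
    global last_flush
    buffer.append(record)
    if should_flush():
        stage_and_load(buffer)
        buffer.clear()
        last_flush = time.monotonic()
```

A real implementation would also evaluate the interval on a timer so that a quiet stream still flushes; this sketch only checks when a record arrives.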

When to use the Databricks Sink

  • Choose Databricks Sink if you are building business-critical tables that analysts query immediately, and you want the safety and ease of Databricks Unity Catalog governance.
  • Choose Delta Lake Sink if you are building a raw data landing zone, are sensitive to compute costs, or need to write data to storage that isn't strictly coupled to a Databricks workspace.