Skip to main content

Fleak SQL Language Reference

Introduction

Fleak SQL is a SQL dialect that follows Postgres SQL standards while providing enhanced functionality for handling nested data structures, JSON operations, and log parsing. This reference guide covers the syntax, data types, operators, and functions available in Fleak SQL.

Basic Syntax

Fleak SQL supports standard SQL query syntax:

SELECT [DISTINCT [ON (expression_list)]] column1 [AS alias1], column2 [AS alias2], ...
FROM source_table [AS alias]
[JOIN another_table [ON condition]]
[WHERE conditions]
[GROUP BY column1, column2, ...]
[HAVING group_conditions]
[ORDER BY column1 [ASC|DESC], column2 [ASC|DESC], ...]
[LIMIT count]
[;]
note

Always use events as the source_table to represent the input data.

Query Components

SELECT Statement

The SELECT statement retrieves data from tables or expressions:

-- Basic selection
SELECT id, timestamp, status FROM events;

-- Select all columns
SELECT * FROM events;

-- Column aliasing
SELECT timestamp AS event_time, message AS event_message FROM events;

-- DISTINCT to remove duplicates
SELECT DISTINCT status FROM events;

-- DISTINCT ON to keep first row per group
SELECT DISTINCT ON (user_id) user_id, timestamp, message
FROM events
ORDER BY user_id, timestamp DESC;

FROM Clause

The FROM clause specifies the data source:

-- Simple table reference
FROM events

-- Table with alias
FROM events AS e

-- Subquery
FROM (SELECT * FROM raw_events WHERE type = 'error') AS filtered_events

WHERE Clause

The WHERE clause filters rows:

-- Basic condition
WHERE status = 'success'

-- Multiple conditions
WHERE status = 'error' AND severity > 3

-- JSON field conditions
WHERE data->'user'->>'role' = 'admin'

-- Pattern matching
WHERE message LIKE '%error%'

-- Range check
WHERE timestamp BETWEEN '2023-01-01' AND '2023-01-31'

-- NULL check
WHERE response_code IS NULL
WHERE response_code IS NOT NULL

JOIN Clause

Joins combine rows from multiple tables:

-- INNER JOIN
FROM table1 JOIN table2 ON table1.id = table2.table1_id

-- LEFT JOIN
FROM table1 LEFT JOIN table2 ON table1.id = table2.table1_id

-- RIGHT JOIN
FROM table1 RIGHT JOIN table2 ON table1.id = table2.table1_id

-- FULL JOIN
FROM table1 FULL JOIN table2 ON table1.id = table2.table1_id

-- CROSS JOIN
FROM table1 CROSS JOIN table2

GROUP BY and HAVING Clauses

GROUP BY groups rows with the same values:

-- Group by one column
GROUP BY status

-- Group by multiple columns
GROUP BY region, status

-- Filter groups with HAVING
GROUP BY status HAVING COUNT(*) > 10

ORDER BY Clause

ORDER BY sorts the result set:

-- Ascending order (default)
ORDER BY timestamp

-- Descending order
ORDER BY timestamp DESC

-- Multiple sort columns
ORDER BY status ASC, timestamp DESC

LIMIT Clause

LIMIT restricts the number of rows returned:

-- Limit to 10 rows
LIMIT 10

Data Types

Fleak SQL supports the following data types:

Type NameDescription
INT, INTEGERInteger value
TEXT, STRING, VARCHARString value
DATEDate value
TIMETime value
TIMESTAMPTimestamp value
JSONJSON object or array
NUMERICDecimal value
DOUBLE PRECISIONDouble precision floating point
BOOLEAN, BOOLBoolean value (true/false)

Type Casting

Values can be cast to different types using the :: operator:

-- Cast to string
SELECT field::text FROM events

-- Cast to integer
SELECT field::int FROM events
SELECT field::integer FROM events
SELECT field::smallint FROM events
SELECT field::bigint FROM events

-- Cast to numeric types
SELECT field::numeric FROM events

-- Cast to boolean
SELECT field::boolean FROM events
SELECT field::bool FROM events

-- Cast to JSON
SELECT field::json FROM events

Operators

Arithmetic Operators

OperatorDescriptionExample
+AdditionSELECT amount + tax FROM events
-SubtractionSELECT price - discount FROM events
*MultiplicationSELECT quantity * price FROM events
/DivisionSELECT total / count FROM events
%ModuloSELECT id % 10 FROM events
^ExponentSELECT 2 ^ 3 FROM events

Comparison Operators

OperatorDescriptionExample
=EqualWHERE status = 'success'
!=, <>Not equalWHERE status != 'error'
<Less thanWHERE count < 10
>Greater thanWHERE response_time > 100
<=Less than or equalWHERE count <= 20
>=Greater than or equalWHERE response_time >= 50

Logical Operators

OperatorDescriptionExample
ANDLogical ANDWHERE status = 'error' AND severity > 3
ORLogical ORWHERE status = 'success' OR status = 'pending'
NOTLogical NOTWHERE NOT status = 'error'

String Operators

OperatorDescriptionExample
``
LIKEPattern matching (case-sensitive)WHERE name LIKE 'A%'
ILIKEPattern matching (case-insensitive)WHERE name ILIKE 'a%'

JSON Operators

OperatorDescriptionExample
->Get JSON object field or array element as JSONdata->'user'->'name'
->>Get JSON object field or array element as textdata->>'email'

Working with JSON Data

Fleak SQL provides powerful operators for handling JSON data structures.

JSON Access Examples

-- Access object field as JSON
SELECT data->'user'->'name' FROM events;

-- Access object field as text
SELECT data->>'email' FROM events;

-- Access array element
SELECT items->0->'price' FROM events;

-- Access array element and convert to text
SELECT items->0->>'name' FROM events;

Special JSON Behavior

Unlike standard Postgres SQL, Fleak SQL provides more flexible type handling with JSON operators:

  • Automatic string-to-JSON casting when possible
  • -> is preferred for chained access to preserve JSON types
  • Use ->> only when you need the final result as a string

For example:

-- This works in Fleak SQL (auto-casting string to JSON)
SELECT modelOutputs->0->'result'->'AAPL' FROM events;

-- Avoid this pattern (may cause errors)
SELECT modelOutputs->>0->'result' FROM events;

Functions

JSON Functions

FunctionDescriptionExample
json_build_objectCreate a JSON objectjson_build_object('name', username, 'age', age)
json_array_lengthGet JSON array lengthjson_array_length(items)
json_array_elementsExpand JSON array into rowsSELECT * FROM events, json_array_elements(items) item
json_eachExpand JSON object into key/value rowsSELECT * FROM events, json_each(data) kv
json_each_textExpand JSON object into key/value rows as textSELECT * FROM events, json_each_text(data) kv
json_object_keysGet JSON object keys as rowsSELECT * FROM events, json_object_keys(data) k
json_strip_nullsRemove object fields with null valuesjson_strip_nulls(data)
json_aggAggregate values as JSON arrayjson_agg(value)

String Functions

FunctionDescriptionExample
concatConcatenate stringsconcat(first_name, ' ', last_name)
concat_wsConcatenate with separatorconcat_ws(', ', first_name, last_name)
upperConvert to uppercaseupper(name)
lowerConvert to lowercaselower(email)
lengthString lengthlength(message)
char_lengthCharacter countchar_length(message)
substringExtract substringsubstring(text, 1, 10)
regexp_replaceReplace using regexregexp_replace(text, 'pattern', 'replacement')
regexp_split_to_arraySplit string to arrayregexp_split_to_array(text, ',')
regexp_split_to_tableSplit string to rowsSELECT * FROM events, regexp_split_to_table(tags, ',')
regexp_matchesFind regex matchesregexp_matches(text, 'pattern')
split_partGet part after splittingsplit_part(text, ',', 2)
trimRemove whitespacetrim(text)
ltrimRemove leading whitespaceltrim(text)
rtrimRemove trailing whitespacertrim(text)
replaceReplace substringreplace(text, 'old', 'new')
leftGet leftmost charactersleft(text, 5)
rightGet rightmost charactersright(text, 5)
positionFind substring positionposition('needle' IN 'haystack')
formatFormat stringformat('Hello, %s!', name)
md5MD5 hashmd5(text)
btrimTrim specified charactersbtrim(text, 'xyz')
reverseReverse stringreverse(text)
to_hexConvert to hexto_hex(255)

Mathematical Functions

FunctionDescriptionExample
absAbsolute valueabs(value)
roundRound numberround(value, 2)
ceil, ceilingRound up to nearest integerceil(value)
floorRound down to nearest integerfloor(value)
randomRandom value between 0 and 1random()
sqrtSquare rootsqrt(16)
powerRaise to powerpower(2, 3)
modModulomod(10, 3)
truncTruncate decimaltrunc(value, 2)
signSign (-1, 0, 1)sign(-15)
gcdGreatest common divisorgcd(12, 8)
lcmLeast common multiplelcm(12, 8)
expExponentialexp(1)
logNatural logarithmlog(value)
lnNatural logarithmln(value)
log10Base-10 logarithmlog10(100)
log2Base-2 logarithmlog2(8)
piPI constantpi()
degreesRadians to degreesdegrees(pi())
radiansDegrees to radiansradians(180)
sinSinesin(radians(30))
cosCosinecos(radians(60))
tanTangenttan(radians(45))

Date and Time Functions

FunctionDescriptionExample
nowCurrent timestampnow()
date_truncTruncate timestampdate_trunc('hour', timestamp)
to_timestampConvert to timestampto_timestamp(epoch_value)
to_dateConvert to dateto_date(string, format)

Aggregate Functions

FunctionDescriptionExample
countCount rowsSELECT count(*) FROM events
sumSum valuesSELECT sum(amount) FROM events
maxMaximum valueSELECT max(timestamp) FROM events
minMinimum valueSELECT min(timestamp) FROM events
avgAverage valueSELECT avg(score) FROM events
string_aggConcatenate stringsSELECT string_agg(tag, ',') FROM events, json_array_elements(tags) tag

Important Note: Aggregation functions in Fleak SQL only work with the current input event list. For HTTP requests with multiple events, aggregation works across all events in the request. For Kafka sources, each input event list typically contains only one message.

Set-returning Functions

These functions return multiple rows from a single input:

FunctionDescriptionExample
json_array_elementsExpand JSON array into rowsSELECT * FROM events, json_array_elements(items) item
regexp_split_to_tableSplit string to rowsSELECT * FROM events, regexp_split_to_table(tags, ',')
json_eachExpand JSON object into key/value rowsSELECT * FROM events, json_each(data) kv

Other Functions

FunctionDescriptionExample
greatestLargest value from listgreatest(a, b, c)
leastSmallest value from listleast(a, b, c)

Log Parsing with grok

The grok function provides powerful log parsing capabilities:

SELECT grok('%{IPV4:client_ip} - %{WORD:method} %{URIPATHPARAM:request}', log_entry) AS extracted_fields
FROM events

This extracts structured data from unstructured log text:

"127.0.0.1 - GET /index.html" → { "client_ip": "127.0.0.1", "method": "GET", "request": "/index.html" }

CASE Expressions

CASE expressions provide conditional logic:

-- Simple CASE
SELECT
CASE status
WHEN 'success' THEN 'Completed'
WHEN 'error' THEN 'Failed'
ELSE 'Unknown'
END AS status_text
FROM events;

-- Searched CASE
SELECT
CASE
WHEN response_time < 100 THEN 'Fast'
WHEN response_time < 300 THEN 'Normal'
ELSE 'Slow'
END AS performance
FROM events;

Subqueries and Derived Tables

You can use subqueries to create derived tables:

-- Subquery in SELECT
SELECT id, (SELECT count(*) FROM json_array_elements(items)) AS item_count
FROM events;

-- Subquery in FROM
SELECT u.name, e.count
FROM users u
JOIN (SELECT user_id, count(*) AS count FROM events GROUP BY user_id) e
ON u.id = e.user_id;

-- Subquery in WHERE
SELECT * FROM events
WHERE status IN (SELECT status FROM event_categories WHERE severity > 3);

Best Practices

  1. Use type casting to ensure proper data type handling:

    SELECT timestamp::timestamp, amount::numeric FROM events
  2. Prefer -> for JSON access chains, using ->> only for the final extraction:

    -- Good
    SELECT data->'user'->'address'->>'city' FROM events

    -- Avoid (may cause errors in parsing nested structures)
    SELECT data->>'user'->>'address'->>'city' FROM events
  3. Use set-returning functions for processing arrays and structured data:

    SELECT item.value->>'id' AS item_id
    FROM events,
    json_array_elements(items) item
  4. Remember aggregation scope - aggregations work on the current input event list only

Limitations and Special Behaviors

  1. Case sensitivity: Table names and columns are case-insensitive.

  2. Aggregation scope: Aggregation functions only work within the current input event list.

  • For HTTP requests with multiple events, aggregation works across all events in the request
  • For Kafka sources, each input event list typically contains only one message, limiting aggregation across messages
  1. Implicit grouping: Simple queries with aggregate functions like SELECT count(*) FROM events imply an implicit grouping, equivalent to SELECT count(*) FROM events GROUP BY 1.

  2. JSON handling: The type system is more flexible than standard Postgres SQL and tries to cast types to container elements when using JSON operators.

  3. NULL handling: Operations on null values generally propagate null.