
BigQuery SQL translation reference

This document details the similarities and differences in SQL syntax between BigQuery and ClickHouse.

Data types

ClickHouse offers more granular precision than BigQuery for numerics. Where BigQuery has INT64, NUMERIC, BIGNUMERIC, and FLOAT64, ClickHouse provides multiple integer, decimal, and float widths so storage and memory can be tuned to the actual range of the data.

Tip

When several ClickHouse types map to a single BigQuery type, pick the smallest that fits and consider appropriate codecs for further compression.

| BigQuery | ClickHouse | Notes |
|---|---|---|
| ARRAY | Array(t) | |
| BIGNUMERIC | Decimal256(S) | |
| BOOL | Bool | |
| BYTES | String or FixedString(N) | |
| DATE | Date or Date32 | Date (2 bytes, 1970-2149) for typical analytical data; Date32 (4 bytes, 1900-2299) to match BigQuery's full range |
| DATETIME | DateTime or DateTime64(p) | Use DateTime64(p) for sub-second precision |
| FLOAT64 | Float64 | |
| GEOGRAPHY | Geo data types | |
| INT64 | UInt8-UInt256 / Int8-Int256 | Pick the smallest signed or unsigned variant that fits the value range |
| INTERVAL | No equivalent | Use the INTERVAL expression or date/time arithmetic functions |
| JSON | JSON or String | JSON is preferred; String with JSONExtract* accessors works as a fallback |
| NUMERIC | Decimal(P, S) | Sized variants Decimal32(S) / Decimal64(S) / Decimal128(S) are also available |
| RANGE | No equivalent | Store (start, end) columns or a Tuple(start, end) |
| STRING | String | Optionally wrap in LowCardinality(String) for columns with few distinct values (enums, status codes, country codes). String functions are byte-based; UTF8 variants exist where relevant |
| STRUCT | Tuple or Nested | Flattened sibling columns are often more performant in ClickHouse; use Tuple / Nested for named fields when the nested shape is load-bearing |
| TIME | No equivalent | Carry a DateTime64 and extract via formatDateTime |
| TIMESTAMP | DateTime64(6, 'UTC') | Use for microsecond-precision UTC parity |
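
For example, a BigQuery table with INT64 ids and STRING dimension columns might land in ClickHouse as follows (table and column names here are illustrative):

```sql
CREATE TABLE mydb.events
(
    id         UInt64,                         -- non-negative ids fit an unsigned type
    country    LowCardinality(String),         -- few distinct values: dictionary-encode
    score      UInt8,                          -- a 0-100 metric fits in one byte
    created_at DateTime CODEC(Delta, ZSTD)     -- delta-encode near-monotonic timestamps
)
ENGINE = MergeTree
ORDER BY (country, created_at);
```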

DDL statements

Schemas and databases

BigQuery → ClickHouse
CREATE SCHEMA mydataset
CREATE DATABASE mydb

BigQuery datasets map to ClickHouse databases.

CREATE SCHEMA IF NOT EXISTS mydataset
OPTIONS (location = 'US')
CREATE DATABASE IF NOT EXISTS mydb

Location is a service-level decision in ClickHouse Cloud.

ALTER SCHEMA mydataset SET OPTIONS (description = 'sales')
ALTER DATABASE mydb MODIFY COMMENT 'sales'
DROP SCHEMA mydataset
DROP DATABASE mydb

Add IF EXISTS for an idempotent drop in either engine.

DROP SCHEMA mydataset CASCADE
DROP DATABASE mydb

ClickHouse drops tables in the database unconditionally; there is no CASCADE keyword.

Tables

BigQuery → ClickHouse
CREATE TABLE mydataset.t (
  id   INT64,
  name STRING
)
CREATE TABLE mydb.t (
  id   Int64,
  name String
)
ENGINE = MergeTree
ORDER BY id

An engine and an ORDER BY are required for MergeTree-family tables; pick the column(s) that match the query access pattern. See Sparse primary indexes.

CREATE TABLE IF NOT EXISTS mydataset.t (id INT64)
CREATE TABLE IF NOT EXISTS mydb.t (id Int64)
ENGINE = MergeTree
ORDER BY id
CREATE OR REPLACE TABLE mydataset.t (id INT64)
CREATE OR REPLACE TABLE mydb.t (id Int64)
ENGINE = MergeTree
ORDER BY id
CREATE TABLE mydataset.t (id INT64)
PARTITION BY DATE(created_at)
CREATE TABLE mydb.t (
  id         Int64,
  created_at DateTime
)
ENGINE = MergeTree
PARTITION BY toDate(created_at)
ORDER BY id

BigQuery partitioning expects a single date/timestamp column; ClickHouse accepts an arbitrary expression. ClickHouse partitions are a storage-organisation feature — not a substitute for ORDER BY.
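
As a sketch of an arbitrary partition expression, monthly partitioning is a common choice (toYYYYMM here is one option, not a requirement):

```sql
CREATE TABLE mydb.t
(
    id         Int64,
    created_at DateTime
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(created_at)  -- one partition per calendar month
ORDER BY id;
```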

CREATE TABLE mydataset.t (
  id   INT64,
  name STRING
)
CLUSTER BY id, name
CREATE TABLE mydb.t (
  id   Int64,
  name String
)
ENGINE = MergeTree
ORDER BY (id, name)

BigQuery clustering colocates data; in ClickHouse the equivalent is the table's ORDER BY key, which controls on-disk ordering and the primary index.

CREATE TABLE mydataset.t LIKE mydataset.source
CREATE TABLE mydb.t AS mydb.source

Both copy the schema without copying data; ClickHouse additionally carries over the engine and ORDER BY of the source table.

CREATE TABLE mydataset.t AS
SELECT * FROM mydataset.source
CREATE TABLE mydb.t
ENGINE = MergeTree
ORDER BY id
AS SELECT * FROM mydb.source

Engine and ORDER BY are required for the new table.

CREATE TEMP TABLE t AS SELECT 1 AS x
CREATE TEMPORARY TABLE t (x Int64) ENGINE = Memory;
INSERT INTO t VALUES (1);

ClickHouse temporary tables are session-scoped and require an explicit engine (commonly Memory).

CREATE EXTERNAL TABLE mydataset.t
WITH CONNECTION ...
OPTIONS (
  format = 'PARQUET',
  uris   = ['gs://...']
)
CREATE TABLE mydb.t (...)
ENGINE = S3('https://.../*.parquet', 'Parquet')

ClickHouse exposes external storage through table engines such as S3, URL, and HDFS.

ALTER TABLE mydataset.t ADD COLUMN tag STRING
ALTER TABLE mydb.t ADD COLUMN tag String
ALTER TABLE mydataset.t DROP COLUMN tag
ALTER TABLE mydb.t DROP COLUMN tag
ALTER TABLE mydataset.t RENAME COLUMN old TO new
ALTER TABLE mydb.t RENAME COLUMN old TO new
ALTER TABLE mydataset.t
ALTER COLUMN amount SET DATA TYPE FLOAT64
ALTER TABLE mydb.t MODIFY COLUMN amount Float64
ALTER TABLE mydataset.t SET OPTIONS (description = '...')
ALTER TABLE mydb.t MODIFY COMMENT '...'
ALTER TABLE mydataset.t RENAME TO t2
RENAME TABLE mydb.t TO mydb.t2
DROP TABLE mydataset.t
DROP TABLE mydb.t
TRUNCATE TABLE mydataset.t
TRUNCATE TABLE mydb.t

Views and materialized views

BigQuery → ClickHouse
CREATE VIEW mydataset.v AS
SELECT id, name FROM mydataset.t
CREATE VIEW mydb.v AS
SELECT id, name FROM mydb.t

Standard logical views; no storage in either engine.

CREATE OR REPLACE VIEW mydataset.v AS SELECT ...
CREATE OR REPLACE VIEW mydb.v AS SELECT ...
DROP VIEW mydataset.v
DROP VIEW mydb.v
CREATE MATERIALIZED VIEW mydataset.mv AS
SELECT status, count(*) AS c
FROM mydataset.t
GROUP BY status
CREATE MATERIALIZED VIEW mydb.mv
ENGINE = AggregatingMergeTree
ORDER BY status
AS SELECT status, countState() AS c
FROM mydb.t
GROUP BY status

ClickHouse materialized views are incremental insert-time triggers that write into a target table. For an aggregate MV, use AggregatingMergeTree and the -State aggregate-function combinator. Query the MV with countMerge(c) to finalise.
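
Reading the MV above might look like this; countMerge collapses the stored partial states into final counts:

```sql
SELECT status, countMerge(c) AS c
FROM mydb.mv
GROUP BY status;
```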

CREATE MATERIALIZED VIEW mydataset.mv
OPTIONS (refresh_interval_minutes = 60)
CREATE MATERIALIZED VIEW mydb.mv
REFRESH EVERY 1 HOUR
ENGINE = MergeTree
ORDER BY id
AS SELECT ...

ClickHouse also supports refreshable materialized views for full-refresh semantics.

Indexes, functions, and procedures

BigQuery → ClickHouse
CREATE SEARCH INDEX idx ON mydataset.t (name)
ALTER TABLE mydb.t
ADD INDEX idx name
  TYPE text(tokenizer = 'default')
  GRANULARITY 1

ClickHouse full-text indexes accelerate LIKE / token-search predicates.

CREATE VECTOR INDEX idx ON mydataset.t (embedding)
OPTIONS (distance_type = 'COSINE')
ALTER TABLE mydb.t
ADD INDEX idx embedding
  TYPE vector_similarity('hnsw', 'cosineDistance')
  GRANULARITY 1

See Approximate-nearest-neighbour indexes.

CREATE FUNCTION mydataset.add_one(x INT64) AS (x + 1)
CREATE FUNCTION add_one AS (x) -> x + 1

ClickHouse UDFs are expression-only. For complex logic, use the SQL-defined-function or executable-UDF patterns.
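
A sketch of a multi-argument expression UDF (names are illustrative):

```sql
CREATE FUNCTION tax AS (amount, rate) -> amount * rate;

SELECT tax(100, 0.1);  -- 10
```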

CREATE FUNCTION mydataset.f(x INT64)
RETURNS INT64
LANGUAGE js AS 'return x+1;'

No equivalent

ClickHouse has no JavaScript or Python UDFs in standard SQL; use executable UDFs at the server level.

DROP FUNCTION mydataset.add_one
DROP FUNCTION add_one
CREATE PROCEDURE mydataset.p(x INT64) BEGIN ... END

No equivalent

ClickHouse has no stored procedures; orchestrate from a client (Python, Go, etc.) or compose into a SQL function.

DML

Insert operations

BigQuery → ClickHouse
INSERT INTO mydataset.t VALUES (1, 'a'), (2, 'b')
INSERT INTO mydb.t VALUES (1, 'a'), (2, 'b')
INSERT INTO mydataset.t (id, name) VALUES (1, 'a')
INSERT INTO mydb.t (id, name) VALUES (1, 'a')
INSERT INTO mydataset.t
SELECT id, name FROM mydataset.source
INSERT INTO mydb.t
SELECT id, name FROM mydb.source

Update operations

BigQuery → ClickHouse
UPDATE mydataset.t SET name = 'x' WHERE id = 1
ALTER TABLE mydb.t UPDATE name = 'x' WHERE id = 1

ClickHouse mutates asynchronously by default. For synchronous, in-place updates use lightweight updates: UPDATE mydb.t SET name = 'x' WHERE id = 1.
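
To make a classic mutation block until it has been applied, one option is the mutations_sync setting:

```sql
-- mutations_sync = 1 waits for the current server; 2 also waits for replicas
ALTER TABLE mydb.t UPDATE name = 'x' WHERE id = 1
SETTINGS mutations_sync = 1;
```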

Delete operations

BigQuery → ClickHouse
DELETE FROM mydataset.t WHERE id = 1
DELETE FROM mydb.t WHERE id = 1

ClickHouse's lightweight delete marks rows for removal; physical removal happens at the next merge. For bulk historical cleanup, prefer TTL.
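
A TTL-based retention rule might look like this, assuming a created_at column exists on the table:

```sql
-- Rows become eligible for removal 90 days after creation
ALTER TABLE mydb.t
MODIFY TTL created_at + INTERVAL 90 DAY;
```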

TRUNCATE TABLE mydataset.t
TRUNCATE TABLE mydb.t

Merge operations

BigQuery → ClickHouse
MERGE INTO mydataset.t USING mydataset.staging s
  ON t.id = s.id
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT ...

No equivalent

ClickHouse has no equivalent MERGE statement. The idiomatic pattern is a ReplacingMergeTree keyed on the natural key plus a version column; an INSERT of newer rows "wins" at merge time. Use FINAL for read-time deduplication.
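
A minimal sketch of that pattern, with an assumed version column:

```sql
CREATE TABLE mydb.t
(
    id      Int64,
    name    String,
    version UInt64
)
ENGINE = ReplacingMergeTree(version)
ORDER BY id;

-- Upsert = insert a row with a higher version; it wins at merge time
INSERT INTO mydb.t VALUES (1, 'new-name', 2);

-- FINAL deduplicates at read time, at some query cost
SELECT * FROM mydb.t FINAL WHERE id = 1;
```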

DCL

Grants

BigQuery → ClickHouse
GRANT SELECT ON TABLE mydataset.t
TO 'user:alice@example.com'
GRANT SELECT ON mydb.t TO alice

BigQuery binds to IAM principals (users, groups, service accounts). ClickHouse binds to SQL-level users and roles created with CREATE USER / CREATE ROLE.

GRANT ROLE foo ON TABLE mydataset.t
TO 'user:alice@example.com'
GRANT foo TO alice
REVOKE SELECT ON TABLE mydataset.t
FROM 'user:alice@example.com'
REVOKE SELECT ON mydb.t FROM alice

Roles

| BigQuery | ClickHouse |
|---|---|
| IAM-managed via console / gcloud | CREATE USER alice IDENTIFIED WITH plaintext_password BY 'pw' |
| IAM-managed via console / gcloud | CREATE ROLE analyst |
| IAM-managed via console / gcloud | DROP USER alice |

See Access control and roles.

Syntax

Query syntax

BigQuery → ClickHouse
SELECT id, name FROM mydataset.t
SELECT id, name FROM mydb.t
SELECT * FROM mydataset.t
SELECT * FROM mydb.t
SELECT * EXCEPT (password) FROM mydataset.t
SELECT * EXCEPT password FROM mydb.t

ClickHouse * EXCEPT accepts a bare list or parenthesised list.

SELECT * REPLACE (lower(name) AS name) FROM mydataset.t
SELECT * REPLACE (lower(name) AS name) FROM mydb.t
SELECT id FROM mydataset.t
WHERE created_at > '2024-01-01'
SELECT id FROM mydb.t
WHERE created_at > '2024-01-01'
SELECT status, count(*) FROM mydataset.t
GROUP BY status
SELECT status, count() FROM mydb.t
GROUP BY status
SELECT status, count(*) FROM mydataset.t
GROUP BY ROLLUP (status, country)
SELECT status, country, count() FROM mydb.t
GROUP BY ROLLUP (status, country)

ClickHouse also supports GROUP BY CUBE and GROUP BY GROUPING SETS.

SELECT status, count(*) AS c FROM mydataset.t
GROUP BY status
HAVING c > 10
SELECT status, count() AS c FROM mydb.t
GROUP BY status
HAVING c > 10
SELECT id FROM mydataset.t ORDER BY id
SELECT id FROM mydb.t ORDER BY id
SELECT id FROM mydataset.t
ORDER BY id DESC NULLS LAST
LIMIT 100
SELECT id FROM mydb.t
ORDER BY id DESC NULLS LAST
LIMIT 100
SELECT id FROM mydataset.t LIMIT 100 OFFSET 50
SELECT id FROM mydb.t LIMIT 100 OFFSET 50
SELECT DISTINCT status FROM mydataset.t
SELECT DISTINCT status FROM mydb.t
WITH recent AS (
  SELECT * FROM mydataset.t
  WHERE created_at > '2024-01-01'
)
SELECT id FROM recent
WITH recent AS (
  SELECT * FROM mydb.t
  WHERE created_at > '2024-01-01'
)
SELECT id FROM recent
SELECT a.id
FROM mydataset.t a
INNER JOIN mydataset.u b ON a.id = b.id
SELECT a.id
FROM mydb.t AS a
INNER JOIN mydb.u AS b ON a.id = b.id

ClickHouse generally requires AS for table aliases.

SELECT a.id
FROM mydataset.t a
LEFT JOIN mydataset.u b ON a.id = b.id
SELECT a.id
FROM mydb.t AS a
LEFT JOIN mydb.u AS b ON a.id = b.id

FULL and RIGHT joins behave the same in both engines; ClickHouse additionally offers ANY, SEMI, and ANTI strictness modifiers (e.g. LEFT ANY JOIN, LEFT SEMI JOIN, LEFT ANTI JOIN).

SELECT a.id
FROM mydataset.t a
CROSS JOIN mydataset.u b
SELECT a.id
FROM mydb.t AS a
CROSS JOIN mydb.u AS b
SELECT a.id
FROM mydataset.t a
JOIN mydataset.u b USING (id)
SELECT a.id
FROM mydb.t AS a
JOIN mydb.u AS b USING (id)
SELECT a, b FROM mydataset.t
UNION ALL
SELECT a, b FROM mydataset.u
SELECT a, b FROM mydb.t
UNION ALL
SELECT a, b FROM mydb.u

ClickHouse rejects bare UNION; use UNION ALL or UNION DISTINCT.

SELECT a FROM mydataset.t
INTERSECT DISTINCT
SELECT a FROM mydataset.u
SELECT a FROM mydb.t
INTERSECT
SELECT a FROM mydb.u
SELECT a FROM mydataset.t
EXCEPT DISTINCT
SELECT a FROM mydataset.u
SELECT a FROM mydb.t
EXCEPT
SELECT a FROM mydb.u
SELECT element
FROM UNNEST(['a','b','c']) AS element
SELECT arrayJoin(['a','b','c']) AS element

Use ARRAY JOIN or arrayJoin.
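
The clause form keeps the other columns alongside each expanded element; tags here is assumed to be an Array(String) column:

```sql
SELECT id, tag
FROM mydb.t
ARRAY JOIN tags AS tag;   -- one output row per array element
```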

SELECT id FROM mydataset.t
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY user_id
  ORDER BY created_at DESC
) = 1
SELECT id FROM (
  SELECT *,
    row_number() OVER (
      PARTITION BY user_id
      ORDER BY created_at DESC
    ) AS rn
  FROM mydb.t
)
WHERE rn = 1

ClickHouse has no QUALIFY; wrap the windowed query in a subquery.

SELECT * FROM mydataset.t TABLESAMPLE SYSTEM (10 PERCENT)
SELECT * FROM mydb.t SAMPLE 0.1

ClickHouse SAMPLE requires the table to declare a SAMPLE BY clause.
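
A sketch of a sampling-ready table; the SAMPLE BY expression must appear in the primary key:

```sql
CREATE TABLE mydb.t
(
    user_id UInt64,
    amount  Float64
)
ENGINE = MergeTree
ORDER BY intHash32(user_id)
SAMPLE BY intHash32(user_id);

SELECT avg(amount) FROM mydb.t SAMPLE 0.1;  -- read roughly 10% of rows
```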

SELECT * FROM mydataset.t
FOR SYSTEM_TIME AS OF TIMESTAMP '2024-03-15 00:00:00+00:00'

No equivalent

ClickHouse does not provide row-level time travel. Patterns include ReplacingMergeTree with a version column or per-day backup tables.

SELECT id, sum(amount) OVER w
FROM mydataset.t
WINDOW w AS (PARTITION BY user_id ORDER BY created_at)
SELECT id, sum(amount) OVER w
FROM mydb.t
WINDOW w AS (PARTITION BY user_id ORDER BY created_at)
SELECT * FROM mydataset.t
PIVOT (count(*) FOR status IN ('open' AS open, 'closed' AS closed))
SELECT
  countIf(status = 'open')   AS open,
  countIf(status = 'closed') AS closed
FROM mydb.t

ClickHouse has no PIVOT; use countIf / sumIf per output column.

SELECT col, v
FROM (SELECT 1 AS a, 2 AS b)
UNPIVOT (v FOR col IN (a, b))
SELECT col, v
FROM (
  SELECT [('a', a), ('b', b)] AS pairs FROM mydb.t
)
ARRAY JOIN pairs.1 AS col, pairs.2 AS v

ClickHouse has no UNPIVOT; emit (name, value) tuples and ARRAY JOIN.

Pipe syntax

BigQuery's pipe syntax chains transformations with the |> operator. ClickHouse has no equivalent; each pipe operator desugars to a clause in standard SQL. Subqueries or CTEs are the readable way to chain stages.

BigQuery → ClickHouse
FROM mydataset.t
|> SELECT id, name
SELECT id, name FROM mydb.t
FROM mydataset.t
|> WHERE amount > 100
SELECT * FROM mydb.t WHERE amount > 100
FROM mydataset.t
|> EXTEND amount * 0.1 AS tax
SELECT *, amount * 0.1 AS tax FROM mydb.t
FROM mydataset.t
|> SET amount = amount * 2
SELECT * REPLACE (amount * 2 AS amount) FROM mydb.t
FROM mydataset.t
|> DROP password
SELECT * EXCEPT password FROM mydb.t
FROM mydataset.t
|> RENAME amount AS price
SELECT * EXCEPT (amount), amount AS price FROM mydb.t

ClickHouse * REPLACE substitutes a column's expression but keeps its name, so a rename combines * EXCEPT with an explicit alias.

FROM mydataset.t
|> AGGREGATE sum(amount) AS total
   GROUP BY status
SELECT status, sum(amount) AS total
FROM mydb.t
GROUP BY status
FROM mydataset.t
|> ORDER BY created_at DESC
SELECT * FROM mydb.t ORDER BY created_at DESC
FROM mydataset.t
|> LIMIT 100
SELECT * FROM mydb.t LIMIT 100
FROM mydataset.t
|> JOIN mydataset.u USING (id)
SELECT * FROM mydb.t JOIN mydb.u USING (id)
FROM mydataset.t
|> WHERE amount > 100
|> AGGREGATE count(*) GROUP BY status
SELECT status, count()
FROM mydb.t
WHERE amount > 100
GROUP BY status

Pipes compose left-to-right; in ClickHouse, layer clauses or use a CTE for readability.
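
For instance, the two-stage pipe above can be written as a CTE in ClickHouse:

```sql
WITH filtered AS
(
    SELECT * FROM mydb.t WHERE amount > 100
)
SELECT status, count()
FROM filtered
GROUP BY status;
```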

Procedural language

ClickHouse SQL is not a procedural language. Variables, loops, statement-level IF / CASE, and stored procedures have no first-class equivalents; orchestrate multi-step logic from a client library (Python, Go, JavaScript, etc.) or use parameterized views for templated queries.
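
Parameterized views are one server-side option for templated queries; a sketch (view and parameter names are illustrative):

```sql
CREATE VIEW mydb.big_orders AS
SELECT id, amount
FROM mydb.t
WHERE amount > {min_amount:Float64};   -- {name:Type} declares a parameter

SELECT * FROM mydb.big_orders(min_amount = 100);
```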

| BigQuery | ClickHouse |
|---|---|
| DECLARE x INT64 DEFAULT 0 | Client-side variable, or SET param_x = 0 for query parameters |
| SET x = 5 | Client-side; SET param_x = 5 |
| BEGIN ... END | Multi-statement scripts are run by the client, not the server |
| IF cond THEN ... ELSE ... END IF | Expression form if(cond, a, b); for statement-level branching, branch in the client |
| CASE WHEN cond THEN ... ELSE ... END CASE | Expression form CASE WHEN cond THEN a ELSE b END; no statement form |
| WHILE cond DO ... END WHILE | No equivalent; use a client-driven loop |
| LOOP ... END LOOP | No equivalent |
| FOR row IN (SELECT ...) DO ... END FOR | Iterate over query results client-side |
| BREAK, CONTINUE, LEAVE, ITERATE | No equivalent |
| CALL mydataset.p(1) | No stored procedures |
| EXECUTE IMMEDIATE 'SELECT 1' | Client-side prepared statements |
| RAISE USING MESSAGE = 'bad' | SELECT throwIf(1, 'bad') |
| BEGIN TRANSACTION; ...; COMMIT TRANSACTION | Multi-statement transactions are experimental; see the transactions roadmap |
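
Query parameters cover the simple DECLARE/SET cases; the parameter is referenced as {name:Type} in the query text:

```sql
SET param_x = 5;

SELECT {x:Int64} + 1;  -- 6
```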

Operators

| BigQuery | ClickHouse | Notes |
|---|---|---|
| a + b, a - b, a * b, a / b | a + b, a - b, a * b, a / b | / is real division in both engines |
| DIV(a, b) | intDiv(a, b) | Integer division |
| MOD(a, b) | a % b or modulo(a, b) | |
| a = b, a != b, a <> b | a = b, a != b, a <> b | |
| a < b, a <= b, a > b, a >= b | same | |
| a AND b, a OR b, NOT a | same | |
| a IN (1, 2, 3) | a IN (1, 2, 3) | |
| a NOT IN (1, 2, 3) | a NOT IN (1, 2, 3) | |
| a BETWEEN x AND y | a BETWEEN x AND y | Inclusive of both bounds |
| a IS NULL, a IS NOT NULL | same | |
| a LIKE 'pre%' | a LIKE 'pre%' | ClickHouse also offers ILIKE for case-insensitive matching |
| CONCAT(a, b) or a \|\| b | concat(a, b) or a \|\| b | |

Conditional expressions

| BigQuery | ClickHouse |
|---|---|
| CASE WHEN c THEN a ELSE b END | CASE WHEN c THEN a ELSE b END |
| CASE x WHEN 1 THEN 'a' ELSE 'b' END | CASE x WHEN 1 THEN 'a' ELSE 'b' END |
| IF(cond, a, b) | if(cond, a, b) |
| IFNULL(a, b) | ifNull(a, b) or coalesce(a, b) |
| NULLIF(a, b) | nullIf(a, b) |
| COALESCE(a, b, c) | coalesce(a, b, c) |

Conversion

| BigQuery | ClickHouse | Notes |
|---|---|---|
| CAST(x AS STRING) | CAST(x AS String) or toString(x) | |
| CAST(x AS INT64) | CAST(x AS Int64) or toInt64(x) | |
| CAST(x AS FLOAT64) | CAST(x AS Float64) or toFloat64(x) | |
| CAST(x AS DATE) | CAST(x AS Date) or toDate(x) | |
| CAST(x AS TIMESTAMP) | CAST(x AS DateTime64(6, 'UTC')) or parseDateTime64BestEffort(x, 6, 'UTC') | BigQuery TIMESTAMP is microsecond-precision UTC |
| SAFE_CAST(x AS INT64) | toInt64OrNull(x) | Each numeric type has toTypeOrNull / toTypeOrZero / toTypeOrDefault variants |
| PARSE_NUMERIC('3.14') | toDecimal64('3.14', 2) | Specify the scale explicitly in ClickHouse |
| PARSE_BIGNUMERIC('3.14') | toDecimal256('3.14', 38) | |
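
The OrNull / OrZero / OrDefault variants differ only in their failure value:

```sql
SELECT
    toInt64OrNull('abc'),        -- NULL on parse failure
    toInt64OrZero('abc'),        -- 0 on parse failure
    toInt64OrDefault('abc', -1); -- caller-supplied fallback: -1
```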

Functions

Array functions

Compared to BigQuery's roughly eight array functions, ClickHouse has more than 80 built-in array functions. The idiomatic pattern is to aggregate row values into an array with groupArray, transform with higher-order lambda functions (arrayMap, arrayFilter, arrayZip), and optionally expand back to rows with arrayJoin. Because of this, many transformations BigQuery expresses by round-tripping through UNNEST collapse to a single function call in ClickHouse.
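
A sketch of the aggregate-transform-expand idiom (assuming an integer id column):

```sql
SELECT arrayJoin(even_ids) AS even_id
FROM
(
    -- Collapse the column into one array, then filter it with a lambda
    SELECT arrayFilter(x -> x % 2 = 0, groupArray(id)) AS even_ids
    FROM mydb.t
);
```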

| BigQuery | ClickHouse | Notes |
|---|---|---|
| ARRAY_CONCAT(a, b) | arrayConcat(a, b) | |
| ARRAY_LENGTH(tags) | length(tags) | |
| ARRAY_REVERSE(tags) | arrayReverse(tags) | |
| ARRAY_TO_STRING(tags, ',') | arrayStringConcat(tags, ',') | |
| GENERATE_ARRAY(1, 5) | range(1, 6) | ClickHouse range excludes the upper bound |

Aggregate functions

BigQuery exposes roughly 18 aggregate functions plus a handful of approximate aggregates. ClickHouse ships more than 150 aggregate functions and adds combinators — suffixes such as -If, -Array, -Map, -ForEach, -Merge, and -State — that compose with any aggregate to extend its behaviour across data shapes or to use it inside materialized views.
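
A few combinators in one query (column names are illustrative; amounts is assumed to be an Array(Float64) column):

```sql
SELECT
    count(),                      -- plain aggregate
    countIf(status = 'open'),     -- -If: aggregate only rows matching a condition
    sumArray(amounts)             -- -Array: aggregate over every array element
FROM mydb.t;
```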