Query Google Firestore with SQL: Introducing the FireDuckExt DuckDB Extension

Google Firestore is excellent for real-time applications — flexible schema, scalable, built for high-frequency reads and writes. But its query API has well-known limitations: no joins, limited aggregations, no window functions, and no easy way to run ad-hoc analytics across multiple collections. The usual workaround is exporting data to BigQuery or a data warehouse — adding latency, cost, and complexity. FireDuckExt takes a different approach: bring SQL to Firestore directly, by querying it from DuckDB.

What Is DuckDB?

DuckDB is an in-process analytical database — think SQLite, but optimized for analytical workloads rather than transactional ones. It runs inside your process, requires no separate server, and can query CSV files, Parquet files, Arrow data, and now — via FireDuckExt — Google Firestore collections, all with standard SQL.

Quick Start

sql

-- Load the extension
LOAD fire_duck_ext;

-- Configure credentials
CREATE SECRET my_firestore (
    TYPE firestore,
    PROJECT_ID 'my-gcp-project',
    SERVICE_ACCOUNT_JSON '/path/to/credentials.json'
);

-- Query a Firestore collection with full SQL
SELECT status, COUNT(*) AS total, AVG(order_value) AS avg_value
FROM firestore_scan('orders')
WHERE created_at > '2026-01-01'
GROUP BY status
ORDER BY total DESC;

Key Features

Full SQL: joins, aggregations, window functions, CTEs — anything DuckDB supports works on Firestore data
Filter pushdown: supported WHERE clauses (equality, range, IN, IS NOT NULL) are pushed to Firestore to reduce data transfer
ORDER BY / LIMIT pushdown: sorted and top-N queries fetch only what's needed from Firestore
CRUD operations: firestore_insert(), firestore_update(), firestore_delete() for write access
Array transforms: firestore_array_union(), firestore_array_remove() for Firestore array field semantics
Collection group queries: prefix with ~ to query across all subcollections with the same name
Vector embedding support: Firestore vector fields map to DuckDB ARRAY(DOUBLE, N) for distance computations
Multiple databases: target different Firestore databases per secret or per call

Filter and Sort Pushdown

Firestore's native query API supports equality, range, and IN filters — but only on a single field at a time without a composite index. FireDuckExt pushes supported WHERE clauses down to Firestore automatically, reducing documents fetched. DuckDB re-applies all filters after the scan for correctness, so unsupported filters (LIKE, IS NULL, OR, complex expressions) still work — they just scan all documents first. Use EXPLAIN to see exactly what gets pushed:

sql

EXPLAIN
SELECT * FROM firestore_scan('users')
WHERE status = 'active' AND age > 25
ORDER BY created_at DESC
LIMIT 50;

-- Output shows:
-- Firestore Pushed Filters: status EQUAL 'active', age GREATER_THAN 25
-- Firestore Pushed Order: created_at DESC
-- Firestore Pushed Limit: 50

Authentication Options

The extension supports service account JSON (file path or inline JSON text), API keys for development, and the GOOGLE_APPLICATION_CREDENTIALS environment variable for zero-config setup. Named parameters on each function call override the secret for per-call credential control — useful when querying across multiple GCP projects.

Installation

sql

-- Install via DuckDB Community Extensions
INSTALL fire_duck_ext FROM community;
LOAD fire_duck_ext;

FireDuckExt is open source (MIT license) and available on the DuckDB Community Extensions registry. Source at github.com/BorisBesky/fire_duck_ext.