Google Firestore is excellent for real-time applications — flexible schema, scalable, built for high-frequency reads and writes. But its query API has well-known limitations: no joins, limited aggregations, no window functions, and no easy way to run ad-hoc analytics across multiple collections. The usual workaround is exporting data to BigQuery or a data warehouse — adding latency, cost, and complexity. FireDuckExt takes a different approach: bring SQL to Firestore directly, by querying it from DuckDB.
What Is DuckDB?
DuckDB is an in-process analytical database — think SQLite, but optimized for analytical workloads rather than transactional ones. It runs inside your process, requires no separate server, and can query CSV files, Parquet files, Arrow data, and now — via FireDuckExt — Google Firestore collections, all with standard SQL.
Quick Start
-- Load the extension
LOAD fire_duck_ext;
-- Configure credentials
CREATE SECRET my_firestore (
TYPE firestore,
PROJECT_ID 'my-gcp-project',
SERVICE_ACCOUNT_JSON '/path/to/credentials.json'
);
-- Query a Firestore collection with full SQL
SELECT status, COUNT(*) AS total, AVG(order_value) AS avg_value
FROM firestore_scan('orders')
WHERE created_at > '2026-01-01'
GROUP BY status
ORDER BY total DESC;Key Features
- Full SQL: joins, aggregations, window functions, CTEs — anything DuckDB supports works on Firestore data
- Filter pushdown: supported WHERE clauses (equality, range, IN, IS NOT NULL) are pushed to Firestore to reduce data transfer
- ORDER BY / LIMIT pushdown: sorted and top-N queries fetch only what's needed from Firestore
- CRUD operations: firestore_insert(), firestore_update(), firestore_delete() for write access
- Array transforms: firestore_array_union(), firestore_array_remove() for Firestore array field semantics
- Collection group queries: prefix with ~ to query across all subcollections with the same name
- Vector embedding support: Firestore vector fields map to DuckDB ARRAY(DOUBLE, N) for distance computations
- Multiple databases: target different Firestore databases per secret or per call
Filter and Sort Pushdown
Firestore's native query API supports equality, range, and IN filters — but only on a single field at a time without a composite index. FireDuckExt pushes supported WHERE clauses down to Firestore automatically, reducing documents fetched. DuckDB re-applies all filters after the scan for correctness, so unsupported filters (LIKE, IS NULL, OR, complex expressions) still work — they just scan all documents first. Use EXPLAIN to see exactly what gets pushed:
EXPLAIN
SELECT * FROM firestore_scan('users')
WHERE status = 'active' AND age > 25
ORDER BY created_at DESC
LIMIT 50;
-- Output shows:
-- Firestore Pushed Filters: status EQUAL 'active', age GREATER_THAN 25
-- Firestore Pushed Order: created_at DESC
-- Firestore Pushed Limit: 50Authentication Options
The extension supports service account JSON (file path or inline JSON text), API keys for development, and the GOOGLE_APPLICATION_CREDENTIALS environment variable for zero-config setup. Named parameters on each function call override the secret for per-call credential control — useful when querying across multiple GCP projects.
Installation
-- Install via DuckDB Community Extensions
INSTALL fire_duck_ext FROM community;
LOAD fire_duck_ext;