
2025/06/06 - Simon Späti
DuckDB Ecosystem: June 2025
DuckDB news: DuckLake combines catalog and table format with ACID metadata in SQL. Radio extension adds WebSocket and Redis Pub/Sub. Top CSV benchmark results.
- 8 min read
Data pipelines today feel like an underground fight: you build them fast, but the real battle starts when you try to serve the results. Welcome to Flight Club.
The first rule of Flight Club? You do not talk to REST.
The second rule? You definitely do not talk to REST.
The third rule? If your pipeline goes limp, chokes on JSON, or taps out on throughput, the session is over.
DuckDB changed how we do local analytics — the lovechild of SQLite and a supercomputer, delivering screaming-fast OLAP without the servers, clusters, or life-ruining setup scripts.
But modern data teams don't just analyze. They integrate, connect, and serve. From BI dashboards to ML pipelines to that one stakeholder who still loves their pivot tables, the need to expose DuckDB cleanly over a network keeps surfacing.
Picture this: Your team has built a lightning-fast DuckDB analytics pipeline that crunches billions of records in seconds. But when it's time to serve those insights to your dashboards or ML models? You're forced to squeeze that beautiful columnar data through the rusty pipes of REST or JDBC. It's like putting a Ferrari engine in a horse-drawn carriage.
The problem? REST is duct tape. JDBC is legacy glue. Both are leaky, brittle, and built for another era.
That's where Apache Arrow Flight SQL comes in.
Not another framework to learn. Not a platform to buy into. A protocol — lean, typed, binary-native. Fire SQL queries and stream columnar data with zero-copy swagger.
It doesn't just work. It flies.
No more encoding rows into JSON just to decode them faster than you can say "technical debt." No more pretending analytics engines are web servers. Flight SQL treats data like it's 2025: fast, typed, and unapologetically direct.
Two open-source servers — Hatch and GizmoSQL — are already strapping rockets to DuckDB with Arrow Flight SQL. Different vibes, same mission: Give DuckDB wings. Let it serve, stream, and scale like the compute beast it is.
In this post, we'll break it down: Why Arrow + Flight SQL is stupidly fast (we're talking 20+ Gb/s per core), how Flight SQL powers real-time pipelines without breaking a sweat, what Hatch and GizmoSQL bring to the DuckDB party, and how local-first analytics just became a distributed superpower.
No REST. No bloat. Just protocol-native performance. Welcome to Flight Club.
Apache Arrow is the Usain Bolt of data formats—columnar, in-memory, and built for speed. It's designed to shuttle structured data across tools and languages without breaking a sweat.
Arrow isn't just a format. It's a shared memory model that says, "Why copy data when you can just point at it?"
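To make the "point at it, don't copy it" idea concrete, here is a minimal sketch using only the Python standard library. It is not Arrow itself: a typed `array` stands in for a columnar buffer and a `memoryview` stands in for a zero-copy reference to it. The column name `prices` is made up for illustration.

```python
from array import array

# One "column" of 64-bit integers stored contiguously in memory,
# loosely analogous to an Arrow buffer (illustrative stand-in only).
prices = array("q", range(1_000_000))

# A memoryview slice references the same memory; no elements are copied.
view = memoryview(prices)[100:200]

# Slicing the array itself allocates a new array and copies the elements.
copied = prices[100:200]

# A write to the original buffer is visible through the view but not the copy.
prices[150] = -1
print(view[50])    # -1
print(copied[50])  # 150
```

The view costs the same whether it covers a hundred values or a hundred million, which is the property Arrow leans on when handing data between tools.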
Arrow Flight is the network protocol that makes Arrow feel like it's teleporting. Forget JSON blobs or binary spaghetti—Flight streams Arrow batches over gRPC like a data wizard slinging spells.
It's gRPC for tables, with:
Here's a real-world example:
```text
# Traditional REST/JDBC way:
# 1. Query database (1-2s)
# 2. Serialize to JSON/rows (0.5-1s)
# 3. Transfer over network (0.2-0.5s)
# 4. Deserialize back to usable format (0.5-1s)
# Total: 2.2-4.5s

# Flight SQL way:
# 1. Query database (1-2s)
# 2. Stream Arrow batches directly (0.1-0.2s)
# Total: 1.1-2.2s
```
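The totals above are simple sums of the per-stage estimates. A small sketch that re-adds them and separates query time from everything else makes the point clearer: the query costs the same either way, and the savings come entirely from the stages around it. The dictionary keys are illustrative labels, and the numbers are the rough estimates from the comparison, not measurements.

```python
# Stage estimates in seconds as (low, high), copied from the comparison above.
rest_jdbc = {
    "query": (1.0, 2.0),
    "serialize": (0.5, 1.0),
    "transfer": (0.2, 0.5),
    "deserialize": (0.5, 1.0),
}
flight_sql = {
    "query": (1.0, 2.0),
    "stream_arrow_batches": (0.1, 0.2),
}


def total(stages):
    """Sum the low and high estimates across all stages."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return round(low, 1), round(high, 1)


def overhead(stages):
    """Sum everything except the query itself."""
    rest = {name: t for name, t in stages.items() if name != "query"}
    return total(rest)


rest_total = total(rest_jdbc)
flight_total = total(flight_sql)
rest_overhead = overhead(rest_jdbc)
flight_overhead = overhead(flight_sql)

print("REST/JDBC total:", rest_total, "overhead:", rest_overhead)
print("Flight SQL total:", flight_total, "overhead:", flight_overhead)
```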

No ORMs, JDBC or REST nonsense. Just fast, typed, structured streams that respect your time.
Flight SQL takes Arrow Flight and slaps SQL semantics on it. Send a query, get an Arrow table back. No middleman, no drama.
This isn't your grandma's database driver. It's SQL for pipelines, built for machines, not GUIs.
| Protocol | Median Round Trip | Payload Format | Peak Throughput |
|---|---|---|---|
| REST | 75 ms | JSON (yawn) | 1-2 Gb/s |
| JDBC | 52 ms | Binary (meh) | 5-10 Gb/s |
| Flight SQL | 18 ms | Arrow IPC (wow) | 20+ Gb/s |
Flight SQL doesn't just win; it laps the competition while sipping coffee.
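For a feel of what those numbers mean, here is a back-of-the-envelope sketch. It assumes "Gb/s" means gigabits per second, takes the upper end of each peak range (and 20 for "20+"), and uses a hypothetical 10 GB result set. Real transfers rarely hit peak throughput, so treat the output as a best case for each protocol.

```python
# Figures taken from the table above; "Gb/s" is read as gigabits per second (assumption).
protocols = {
    "REST": {"round_trip_ms": 75, "peak_gbps": 2},
    "JDBC": {"round_trip_ms": 52, "peak_gbps": 10},
    "Flight SQL": {"round_trip_ms": 18, "peak_gbps": 20},
}

PAYLOAD_GB = 10                 # hypothetical result size in gigabytes
PAYLOAD_GBIT = PAYLOAD_GB * 8   # gigabytes to gigabits


def transfer_seconds(peak_gbps):
    """Best-case seconds to move the payload at the given peak throughput."""
    return PAYLOAD_GBIT / peak_gbps


def latency_ratio(name):
    """How many times slower a protocol's median round trip is than Flight SQL's."""
    baseline = protocols["Flight SQL"]["round_trip_ms"]
    return protocols[name]["round_trip_ms"] / baseline


for name, p in protocols.items():
    print(
        f"{name:<10} {transfer_seconds(p['peak_gbps']):5.1f} s for {PAYLOAD_GB} GB, "
        f"round trip {latency_ratio(name):.1f}x Flight SQL"
    )
```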
Two open-source projects are bringing Flight SQL to DuckDB, and they're as different as a duck and a goose. Both get the job done.
Hatch is Go-based, Arrow-native, and built for people who think "composable" is a personality trait. It's experimental, open to the wild, and always looking for new recruits.
Run it locally, at the edge, or sneak it into a bigger system.

GizmoSQL is a full Arrow Flight SQL server with support for both DuckDB and SQLite as pluggable backends. Built in C++ and extended from Voltron Data's sqlflite, it's been battle-tested, hardened, and upgraded for real-world flexibility.
Whether you want to mount a local DB, run interactive pipelines, or integrate cleanly with BI tools, GizmoSQL is a solid, well-documented launchpad.
DuckDB deserves a clean, stable interface to the world.
Ready to lift off? Here's how to get started with GizmoSQL:
```bash
docker run -d \
  --name gizmosql \
  -p 31337:31337 \
  -e GIZMOSQL_USERNAME=gizmosql_username \
  -e GIZMOSQL_PASSWORD=gizmosql_password \
  gizmodata/gizmosql:latest
```
Give the server a few seconds to start.
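Rather than guessing how long "a few seconds" is, a script can poll the port until it accepts connections. The helper below is a minimal standard-library sketch; `wait_for_port` is a hypothetical name, not part of GizmoSQL or ADBC. An open TCP port only shows that something is listening, so a failed first query is still possible right after startup.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection; success means the port is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly, then retry.
            time.sleep(interval)
    return False
```

Calling `wait_for_port("localhost", 31337)` before opening the client connection gives the container up to 30 seconds to come up.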
Here's how you talk to it:
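```python
import os

from adbc_driver_flightsql import dbapi as gizmosql, DatabaseOptions

# grpc+tls assumes the server has TLS enabled; TLS_SKIP_VERIFY skips
# certificate verification, which suits a self-signed local certificate.
with gizmosql.connect(
    uri="grpc+tls://localhost:31337",
    db_kwargs={
        "username": os.getenv("GIZMOSQL_USERNAME", "gizmosql_username"),
        "password": os.getenv("GIZMOSQL_PASSWORD", "gizmosql_password"),
        DatabaseOptions.TLS_SKIP_VERIFY.value: "true",
    },
) as conn:
    with conn.cursor() as cur:
        # Parameterized query: the value is bound to the ? placeholder.
        cur.execute(
            "SELECT n_nationkey, n_name FROM nation WHERE n_nationkey = ?",
            parameters=[24],
        )
        # Results arrive as a pyarrow Table, with no row-by-row conversion.
        table = cur.fetch_arrow_table()
        print(table)
```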
That's it. No REST endpoints to design. No JDBC drivers to wrestle. Just SQL in, Arrow out, running at memory speed.
Want to serve this to a dashboard? Point Superset or Metabase at your GizmoSQL server. Need real-time ML features? Stream them through Flight SQL. The protocol handles the heavy lifting while you focus on the analytics.
Remember: This is your data. And it's ending one transformation at a time.
Once you unshackle DuckDB with Flight SQL, the possibilities explode like a data piñata:
Flight SQL makes these real, not just PowerPoint dreams. Here's what it means in practice:
Flight SQL is the start, not the finish line. It's the foundation for wilder ideas:
This isn't a platform pitch. It's a protocol revolution. Each innovation builds on Flight's core promise: moving data at the speed of memory, not the speed of serialization.
Flight SQL isn't here to replace everything. It's just the fastest, cleanest, most developer-friendly way to serve columnar data over the wire in 2025. If your team is evaluating architectures for low-latency analytics, removing the network bottleneck with Arrow Flight is half the battle.
DuckDB changed how we crunch data locally. Flight SQL lets it spread its wings and scale horizontally—not just in size, but in impact. It's about unlocking the full potential of your analytics:
No more REST duct tape. No more JDBC relics. Let's build data services that treat DuckDB like the rockstar it is.
Give DuckDB wings. Let it soar.
The last rule of Flight Club? Build fast. Serve smart. Never serialize again.

