Simon Späti

Technical Author & Data Engineer

Simon is a Data Engineer and Technical Author with 20+ years of experience in the data field. He authors the Data Engineering Blog (ssp.sh), curates the Data Engineering Vault (vault.ssp.sh), and is currently writing a book about Data Engineering Design Patterns (dedp.online). Simon keeps up with open-source data engineering technologies and enjoys sharing his knowledge with the community.

29 POSTS

This Month in the DuckDB Ecosystem: January 2026

2026/01/17 - Simon Späti

DuckDB news: Iceberg extension adds full DML (INSERT/UPDATE/DELETE). Process 1TB in 30 seconds. Query data via AI agents with MCP server. TypeScript macros for APIs.

This Month in the DuckDB Ecosystem: December 2025

2025/12/10 - Simon Späti

DuckDB news: v1.4 adds AES-256 encryption. DuckLake brings ACID-compliant lakehouse with time-travel queries. Gaggle extension queries Kaggle datasets directly via SQL.

Simplicity of a Database, but the Speed of a Cache: OLAP Caches for DuckDB

2025/12/03 - Simon Späti

Speed up slow dashboards without adding new infrastructure. Learn how DuckDB's caching extensions can drop query times from minutes to seconds.

Branch, Test, Deploy: A Git-Inspired Approach for Data

2025/11/24 - Simon Späti

This article explores how to bring Git-style workflows like branching, testing, and deploying to your data stack. Learn how concepts like zero-copy cloning and metadata pointers can finally give you isolated test environments.

DuckDB Ecosystem: November 2025

2025/11/12 - Simon Späti

DuckDB news: QuackStore caching cuts query time from 49s to 3s. Infera runs ONNX ML models in SQL. 127 community extensions analyzed. DuckLake architecture explained.

4 Senior Data Engineers Answer 10 Top Reddit Questions

2025/10/30 - Simon Späti

A panel of four senior data engineers answers the most upvoted and most commented data questions on Reddit.

DuckDB Ecosystem: October 2025

2025/10/07 - Simon Späti

DuckDB news: v1.4.0 LTS brings AES-256 encryption, MERGE statements, and Iceberg writes. 100x faster than Spark on local Parquet. Official Docker images released.

DuckDB Ecosystem: September 2025

2025/09/09 - Simon Späti

DuckDB news: Spatial joins 58x faster via R-tree indexing. pg_duckdb 1.0 adds OLAP analytics to PostgreSQL. One team cut Snowflake costs 79% using DuckDB caching.

Why Semantic Layers Matter — and How to Build One with DuckDB

2025/08/19 - Simon Späti

Learn what a semantic layer is, why it matters, and how to build a simple one with DuckDB and Ibis using just YAML and Python.

DuckDB Ecosystem: August 2025

2025/08/07 - Simon Späti

DuckDB news: 50.7% YoY developer growth. DuckLake v0.2 adds credential secrets. BigQuery extension hits 21.7k weekly downloads. Vector search enables RAG applications.

Summer Data Engineering Roadmap

2025/07/21 - Simon Späti

A comprehensive 3-week structured roadmap for learning data engineering fundamentals, from SQL and Git basics to advanced topics like streaming, data quality, and DevOps.

This Month in the DuckDB Ecosystem: July 2025

2025/07/08 - Simon Späti

DuckDB news: Tributary streams Kafka data to SQL queries. SQLRooms enables browser analytics via DuckDB-WASM. pg_duckdb benchmarks show 1,500x TPC-DS speedup.
