Database School
Try Hard Studios
29 episodes
2 weeks ago
Join database educator Aaron Francis as he gets schooled by database professionals.
Technology
Education
RSS
All content for Database School is the property of Try Hard Studios and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Episodes (20/29)
Database School
Building search for AI systems with Chroma CTO Hammad Bashir

Hammad Bashir, CTO of Chroma, joins the show to break down how modern vector search systems are actually built, from local, embedded databases to massively distributed, object-storage-backed architectures. We dig into Chroma’s shared local-to-cloud API, log-structured storage on object stores, hybrid search, and why retrieval-augmented generation (RAG) isn’t going anywhere.
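
To make the “same API locally and in the cloud” idea concrete, here is a minimal sketch using Chroma’s Python client; the collection name, documents, and server host are made up for illustration and are not from the episode:

```python
import chromadb

# Local, embedded mode: data lives in a directory on disk.
local = chromadb.PersistentClient(path="./chroma-data")

# The same collection API works against a Chroma server; only the client
# construction changes (host/port below are placeholders).
# remote = chromadb.HttpClient(host="localhost", port=8000)

docs = local.get_or_create_collection(name="episode_notes")
docs.add(
    ids=["ep-29", "ep-28"],
    documents=[
        "Vector search on object storage with Chroma",
        "Scaling DuckDB in the cloud with MotherDuck",
    ],
    metadatas=[{"guest": "Hammad Bashir"}, {"guest": "Jordan Tigani"}],
)

# Nearest-neighbor query combined with a metadata filter.
results = docs.query(
    query_texts=["how is vector search built?"],
    n_results=1,
    where={"guest": "Hammad Bashir"},
)
print(results["documents"])
```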

Follow Hammad:
Twitter/X:  https://twitter.com/HammadTime
LinkedIn: https://www.linkedin.com/in/hbashir
Chroma: https://trychroma.com

Follow Aaron:
Twitter/X:  https://twitter.com/aarondfrancis 
Database School: https://databaseschool.com
Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g  (Subscribe today)
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 – Introduction: From high-school ASICs to CTO of Chroma
01:04 – Hammad’s background and why vector search stuck
03:01 – Why Chroma has one API for local and distributed systems
05:37 – Local experimentation vs production AI workflows
08:03 – What “unprincipled data” means in machine learning
10:31 – From computer vision to retrieval for LLMs
13:00 – Exploratory data analysis and why looking at data still matters
16:38 – Promoting data from local to Chroma Cloud
19:26 – Why Chroma is built on object storage
20:27 – Write-ahead logs, batching, and durability
26:56 – Compaction, inverted indexes, and storage layout
29:26 – Strong consistency and reading from the log
34:12 – How queries are routed and executed
37:00 – Hybrid search: vectors, full-text, and metadata
41:03 – Chunking, embeddings, and retrieval boundaries
43:22 – Agentic search and letting models drive retrieval
45:01 – Is RAG dead? A grounded explanation
48:24 – Why context windows don’t replace search
56:20 – Context rot and why retrieval reduces confusion
01:00:19 – Faster models and the future of search stacks
01:02:25 – Who Chroma is for and when it’s a great fit
01:04:25 – Hiring, team culture, and where to follow Chroma

2 weeks ago
1 hour 6 minutes

Database School
Scaling DuckDB in the cloud with MotherDuck CEO Jordan Tigani

In this episode of Database School, Aaron Francis sits down with Jordan Tigani, co-founder and CEO of MotherDuck, to break down what DuckDB is, how MotherDuck hosts it in the cloud, and why analytics workloads are shifting toward embedded databases. They dig into Duck Lake, pricing models, scaling strategies, and what it really takes to build a modern cloud data warehouse.

Follow Jordan:
Twitter/X:  https://twitter.com/jrdntgn
LinkedIn: https://www.linkedin.com/in/jordantigani
MotherDuck: https://motherduck.com

Follow Aaron:
Twitter/X:  https://twitter.com/aarondfrancis 
Database School: https://databaseschool.com
Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g  (Subscribe today)
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 - Introduction
01:44 - What DuckDB is and why embedded analytics matter
04:03 - How MotherDuck hosts DuckDB in the cloud
05:18 - Is MotherDuck like the “Turso for DuckDB”?
07:38 - Isolated analytics per user and scaling to zero
08:51 - The academic origins of DuckDB
10:00 - From SingleStore to founding MotherDuck
12:28 - Getting fired… and funded 12 days later
16:39 - Jordan’s background: Kernel dev, BigQuery, and Product
18:36 - Partnering with DuckDB Labs and avoiding a fork
20:52 - Why MotherDuck targets startups and the long tail
24:22 - Pricing lessons: why $25 was too cheap
28:11 - Ducklings, instance sizing, and compute scaling
34:16 - How MotherDuck separates compute and storage
37:09 - Inside the AWS architecture and differential storage
43:12 - Hybrid execution: joining local and cloud data
45:14 - Analytics vs warehouses vs operational databases
47:41 - Data lakes, Iceberg, and what Duck Lake actually is
53:22 - When Duck Lake makes more sense than DuckDB alone
56:09 - Who switches to MotherDuck and why
58:02 - PG DuckDB and offloading analytics from Postgres
1:00:49 - Who should use MotherDuck and why
1:03:39 - Hiring plans and where to follow Jordan
1:05:01 - Wrap-up

3 weeks ago
1 hour 5 minutes

Database School
Just use Postgres with Denis Magda

In this episode, Aaron talks with Denis Magda, author of Just Use Postgres!, about the wide world of modern Postgres, from JSON and full-text search to generative AI, time-series storage, and even message queues. They explore when Postgres should be your go-to tool, when it shouldn’t, and why understanding its breadth helps developers build better systems.

Use the code DBSmagda to get 45% off Denis' new book Just Use Postgres!
Order Just Use Postgres!
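
One pattern the episode digs into (see the 34:01 chapter below) is using Postgres as a message queue, which is commonly built on SELECT ... FOR UPDATE SKIP LOCKED. A minimal sketch using psycopg; the jobs table and connection string are hypothetical, not from the episode or the book:

```python
import psycopg

# Hypothetical schema:
#   CREATE TABLE jobs (id bigserial PRIMARY KEY, payload jsonb, done_at timestamptz);

def claim_one_job(conn: psycopg.Connection):
    """Claim the oldest unfinished job, skipping rows other workers have locked."""
    with conn.transaction():
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT id, payload
                FROM jobs
                WHERE done_at IS NULL
                ORDER BY id
                LIMIT 1
                FOR UPDATE SKIP LOCKED
                """
            )
            row = cur.fetchone()
            if row is None:
                return None  # queue is empty or every row is claimed
            job_id, payload = row
            # ... process the payload here, inside the transaction ...
            cur.execute("UPDATE jobs SET done_at = now() WHERE id = %s", (job_id,))
            return job_id

if __name__ == "__main__":
    # Connection string is a placeholder.
    with psycopg.connect("dbname=app") as conn:
        print(claim_one_job(conn))
```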

Follow Denis:
Twitter/X:  https://twitter.com/denismagda
LinkedIn: https://www.linkedin.com/in/dmagda


Follow Aaron:
Twitter/X:  https://twitter.com/aarondfrancis 
Database School: https://databaseschool.com
Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g  (Subscribe today)
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 – Welcome
01:28 – Denis’ Background: Java, JVM, and Databases
03:20 – Bridging Application Development & Databases
04:05 – Moving Down the Stack: How Denis Entered Databases
07:28 – Apache Ignite, Distributed Systems & the Path to Postgres
08:02 – Writing Just Use Postgres!: The Origin Story
10:26 – Why a Modern Postgres Book Was Needed
11:01 – The Spark That Led to the Book Proposal
13:06 – Developers Still Don’t Know What Postgres Can Do
15:40 – Connecting With Manning & Refining the Book Vision
16:38 – What Just Use Postgres! Covers
17:40 – The Book’s Core Thesis: The Breadth of Postgres
19:50 – Favorite Use Cases & Learning While Writing
20:30 – When to Use Postgres for Non-Relational Workloads
23:08 – Full Text Search in Postgres Explained
29:31 – When Not to Use Postgres (Pragmatism Over Fanaticism)
34:01 – Using Postgres as a Message Queue
42:09 – When Message Queues Outgrow Postgres
48:10 – Postgres for Generative AI (PGVector)
55:34 – Denis’ 14-Month Writing Process
01:00:50 – Who the Book Is For
01:04:10 – Where to Follow Denis & Closing Thoughts

4 weeks ago
1 hour 7 minutes

Database School
Strictly typed SQL with Contra CTO, Gajus Kuizinas

In this episode, Gajus Kuizinas, co-founder and CTO of Contra, joins Aaron to talk about building the engineering world you want to live in, from strict runtime-validated SQL with Slonik to creating high-ownership engineering cultures. They dive into developer experience, runtime assertions, SafeQL, and even “Loom-driven development,” a powerful review process that lets teams move fast without breaking things.

Follow Gajus:
Twitter/X:  https://twitter.com/kuizinas
Slonik: https://github.com/gajus/slonik
Scaling article: https://gajus.medium.com/lessons-learned-scaling-postgresql-database-to-1-2bn-records-month-edc5449b3067

Follow Aaron:
Twitter/X:  https://twitter.com/aarondfrancis 
Database School: https://databaseschool.com
Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g  (Subscribe today)
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 – Introduction
01:03 – Meet Gajus and Contra
01:48 – What Contra does and how it’s different
05:34 – Why Slonik exists & early career origins
07:47 – The early Node.js era and frustrations with ORMs
09:50 – SQL vs abstractions and the case for raw SQL
10:35 – Template tags and the breakthrough idea
12:03 – Strictness, catching errors early & data shape guarantees
13:37 – Runtime type checking, Zod, and performance debates
16:02 – SafeQL and real-time schema linting
17:01 – Synthesizing Slonik’s philosophy
21:29 – Handling drift, static types vs reality
22:52 – Defining schemas per-query & why it matters
27:59 – Integrating runtime types with large test suites
31:00 – Scaling the team and performance tradeoffs
33:41 – Runtime validation cost vs developer productivity
35:21 – Real drift examples from payments & external APIs
38:21 – User roles, data shape differences & edge cases
39:51 – Integration test safety & catching issues pre-deploy
40:52 – Contra’s engineering culture
41:47 – Why traditional PR reviews don’t scale
43:22 – Introducing Loom-Driven Development
45:12 – How looms transformed the review process
52:38 – Using GetDX to measure engineering friction
53:07 – How the team uses AI (Claude, etc.)
56:26 – Closing thoughts on DX and engineering philosophy
58:05 – Contra needs Postgres experts
59:00 – Where to find Gajus

1 month ago
59 minutes

Database School
Building serverless vector search with Turbopuffer CEO, Simon Eskildsen

In this episode, Aaron Francis talks with Simon Eskildsen, co-founder and CEO of Turbopuffer, about building a high-performance search engine and database that runs entirely on object storage. They dive deep on Simon's time as an engineer at Shopify, database design trade-offs, and how Turbopuffer powers modern AI workloads like Cursor and Notion.

Follow Simon:
Twitter: https://twitter.com/Sirupsen
LinkedIn: https://ca.linkedin.com/in/sirupsen
Turbopuffer: https://turbopuffer.com

Follow Aaron:
Twitter/X:  https://twitter.com/aarondfrancis 
Database School: https://databaseschool.com
Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g  (Subscribe today)
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters
00:00 - Introduction
01:11 - Simon’s background and time at Shopify
03:01 - The Rails glory days and early developer experiences
04:55 - From PHP to Rails and joining Shopify
06:14 - The viral blog post that led to Shopify
09:03 - Discovering engineering talent through GitHub
10:06 - Scaling Shopify’s infrastructure to millions of requests per second
12:47 - Lessons from hypergrowth and burnout
14:46 - Life after Shopify and “angel engineering”
16:31 - The Readwise problem and discovering vector embeddings
18:22 - The high cost of vector databases and napkin math
19:14 - Building TurboPuffer on object storage
21:20 - Landing Cursor as the first big customer
23:00 - What TurboPuffer actually is
25:26 - Why object storage now works for databases
28:37 - How TurboPuffer stores and retrieves data
31:06 - What’s inside those S3 files
33:02 - Explaining vectors and embeddings
35:55 - How TurboPuffer v1 handled search
38:00 - Transitioning from search engine to database
44:09 - How Turbopuffer v2 and v3 improved performance
47:00 - Smart caching and architecture optimizations
49:04 - Trade-offs: high write latency and cold queries
51:03 - Cache warming and primitives
52:25 - Comparing object storage providers (AWS, GCP, Azure)
55:02 - Building a multi-cloud S3-compatible client
57:11 - Who TurboPuffer serves and the scale it runs at
59:31 - Connecting data to AI and the global vision
1:00:15 - Company size, scale, and hiring
1:01:36 - Roadmap and what’s next for TurboPuffer
1:03:10 - Why you should (or shouldn’t) use TurboPuffer
1:05:15 - Closing thoughts and where to find Simon

1 month ago
1 hour 6 minutes

Database School
Building an S3 Competitor with Tigris CEO Ovais Tariq

Aaron talks with Ovais Tariq, co-founder and CEO of Tigris Data and former Uber engineer who helped scale one of the world’s largest distributed systems. They discuss Uber’s hyperscale infrastructure, what it takes to build an S3-compatible object store from scratch, and how distributed storage is evolving for the AI era.

Follow Ovais:
Twitter: https://twitter.com/ovaistariq
LinkedIn: https://www.linkedin.com/in/ovaistariq
Tigris: https://www.tigrisdata.com

Follow Aaron:
Twitter/X:  https://twitter.com/aarondfrancis 
Database School: https://databaseschool.com
Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g  (Subscribe today)
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 - Introduction and overview of the episode
01:35 - Ovais’s background and introduction to Tigris
03:00 - Building distributed databases and infrastructure at Uber
06:00 - Uber’s in-house philosophy and massive data scale
09:00 - Hardware, power density, and talking to chip manufacturers
12:00 - Learning curve of scaling hardware and data centers
14:00 - The Halloween outage and lessons from Cassandra
16:00 - Building data centers across the world for Uber
17:00 - Founding Tigris and the vision for global storage
18:45 - How Tigris differs from AWS S3
20:00 - The architecture of Tigris: caching, metadata, and replication
32:00 - Why Tigris uses FoundationDB and its reliability
36:00 - Managing global and regional metadata
38:00 - How Tigris dynamically moves and caches data
41:30 - Building their own data centers and backbone
43:45 - Specialized storage for AI workloads
46:00 - Small file optimization and real-world use cases
49:00 - Snapshots, forking, and agentic AI workflows
51:00 - How AI transformed Tigris’s customer base
54:00 - Partnership with Fly.io and the distributed cloud ecosystem
57:00 - Growth, customers, and focus on media and AI companies
59:00 - What’s next for Tigris: distributed file system plans
1:01:00 - Technical challenges and building trust in durability
1:03:00 - Call to action: try Tigris and upcoming snapshot feature
1:05:00 - Advice for engineers leaving big companies to start something new
1:06:30 - Where to find Ovais online and closing remarks

1 month ago
1 hour 7 minutes

Database School
Rewriting SQLite from prison with Preston Thorpe

In this episode of Database School, Aaron talks with Preston Thorpe, a senior engineer at Turso who is currently incarcerated, about his incredible journey from prison to rewriting SQLite in Rust. They dive deep into concurrent writes, MVCC, and the challenges of building a new database from scratch while discussing redemption, resilience, and raw technical brilliance.

Follow Preston and Turso:
LinkedIn: https://www.linkedin.com/in/PThorpe92
Preston's Blog: https://pthorpe92.dev
GitHub: https://github.com/PThorpe92
Turso: https://turso.tech

Follow Aaron:
Twitter/X:  https://twitter.com/aarondfrancis 
Database School: https://databaseschool.com
Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g  (Subscribe today)
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 - Intro and Preston’s story
02:13 - How Preston learned programming in prison
06:06 - Making his parents proud and turning life around
09:01 - Getting his first job at Unlock Labs
10:47 - Discovering Turso and contributing to open source
12:53 - From contributor to senior engineer at Turso
22:27 - What Preston works on inside Turso
24:00 - Challenges of rewriting SQLite in Rust
26:00 - Why concurrent writes matter
27:57 - How Turso implements concurrent writes
35:02 - Maintaining SQLite compatibility
37:03 - MVCC explained simply
43:40 - How Turso handles MVCC and logging
46:03 - Open source contributions and performance work
46:23 - Implementing live materialized views
50:55 - The DBSP paper and incremental computation
52:55 - Sync and offline capabilities in Turso
56:45 - Change data capture and future possibilities
1:02:01 - Implementing foreign keys and fuzz testing
1:06:02 - Rebuilding SQLite’s virtual machine
1:08:10 - The quirks of SQLite’s codebase
1:10:47 - Preston’s upcoming release and what’s next
1:14:02 - Gratitude, reflection, and closing thoughts

2 months ago
1 hour 18 minutes

Database School
A million transactions per second: building TigerBeetle with Joran Greef

In this episode, Aaron talks with Joran Greef, CEO and creator of TigerBeetle, the world’s first financial transactions database. Joran takes us on a deep dive into how TigerBeetle brings double-entry accounting principles directly into the database layer to achieve extreme correctness, performance, and fault tolerance at scale.

Follow Joran and TigerBeetle:
Twitter/X: https://twitter.com/jorandirkgreef
Website: https://tigerbeetle.com
GitHub: https://github.com/tigerbeetle/tigerbeetle
Tiger Style: https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TIGER_STYLE.md
YouTube: https://www.youtube.com/@UC3TlyQ3h6lC_jSWust2leGg 

Follow Aaron:
Twitter/X:  https://twitter.com/aarondfrancis 
Database School: https://databaseschool.com
Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g  (Subscribe today)
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 - Introduction and crossover between accounting and databases
01:50 - Meet Joran Greef and the origins of TigerBeetle
02:55 - What makes TigerBeetle different from general purpose databases
04:38 - The founding story and the 5,000-year history of transactions
07:42 - How modern commerce became highly transactional
10:06 - Recognizing the limits of general purpose databases
13:18 - From distributed systems to payment infrastructure
17:01 - Discovering bottlenecks in traditional database performance
19:58 - Why traditional databases can’t scale for microtransactions
23:05 - Introducing double-entry accounting concepts
25:20 - How double-entry accounting mirrors database design
31:35 - Modeling ledgers and event sourcing in TigerBeetle
35:02 - Why TigerBeetle outperforms Postgres and MySQL
40:05 - Batching transactions for massive throughput
47:09 - Client-side batching and zero-copy efficiency
50:04 - Handling contention and concurrency internally
56:03 - Ensuring correctness and atomicity in transactions
57:17 - Designing for mission-critical systems and reliability
1:00:50 - Building safety through deterministic simulation testing
1:04:55 - Detecting and recovering from storage faults
1:10:00 - How TigerBeetle prevents data corruption
1:17:01 - Distributed replication and self-healing data
1:20:08 - Who’s using TigerBeetle and how it’s structured as a company
1:24:01 - How to learn more and get involved with TigerBeetle
1:26:15 - Closing thoughts and where to find Joran online

2 months ago
1 hour 28 minutes

Database School
PlanetScale Postgres with CEO Sam Lambert

Sam Lambert, my former boss at PlanetScale, talks to me about PlanetScale’s evolution from a MySQL company to one that now also offers Postgres. Sam shares why PlanetScale decided to move to Postgres, how MySQL and Postgres differ at a technical level, and how the change has impacted the company culture. Stay to the end for a special surprise!

PlanetScale Metal Episode: https://youtu.be/3r9PsVwGkg4
Join the waitlist to be notified of the MySQL for Developers release on Database School: 
https://databaseschool.com/mysql

Follow Sam: 
PlanetScale: https://planetscale.com
Twitter: https://twitter.com/isamlambert

Follow Aaron:
Twitter:  https://twitter.com/aarondfrancis 
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.
Database School: https://databaseschool.com
Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)

Chapters:
00:00 - Inaugural episode on this channel
01:46 - Introducing Sam Lambert and his background
03:04 - How PlanetScale built on MySQL and Vitess
06:10 - Explaining the layers of PlanetScale’s architecture
09:57 - Node lifecycles, failover, and operational discipline
12:02 - How Vitess makes sharding work
14:21 - PlanetScale’s edge network and resharding
19:02 - Why downtime is unacceptable at scale
20:04 - From Metal to Postgres: the decision process
23:06 - Why Postgres vibes matter for startups
27:04 - How PlanetScale adapted its stack for Postgres
34:38 - Entering the Postgres ecosystem and extensions
41:02 - Permissions, security, and reliability trade-offs
45:04 - Building Ni: a Vitess-style system for Postgres
53:33 - Why PlanetScale insists on control for reliability
1:02:05 - Competing in the broader Postgres landscape
1:08:33 - Why PlanetScale stays “just a database”
1:12:33 - What GA means for Postgres at PlanetScale
1:17:43 - Call to action for new Postgres users
1:18:49 - Surprise!
1:22:21 - Wrap-up and where to find Sam

3 months ago
1 hour 6 minutes

Database School
The database for all your AI needs

Marcel Kornacker, the creator of Apache Impala and co-creator of Apache Parquet, joins me to talk about his latest project: Pixeltable, a multimodal AI database that combines structured and unstructured data with rich, Python-native workflows.


From ingestion to vector search, transcription to snapshots, Pixeltable eliminates painful data plumbing for modern AI teams.



Follow Marcel

  • Pixeltable: https://pixeltable.com
  • Pixeltable GitHub: https://github.com/pixeltable/pixeltable
  • LinkedIn: https://www.linkedin.com/in/marcelkornacker



Follow Aaron

  • Twitter: https://twitter.com/aarondfrancis
  • LinkedIn: https://www.linkedin.com/in/aarondfrancis
  • Website: https://aaronfrancis.com – find articles, podcasts, courses, and more
  • Database School: https://databaseschool.com



Chapters

  • 0:00 – Introduction
  • 0:20 – Meet Marcel Kornacker
  • 1:19 – Early career and grad school in databases
  • 2:12 – Joining Google and building F1
  • 3:42 – How F1 used Spanner at Google
  • 4:01 – Starting Apache Impala at Cloudera
  • 6:02 – Why SQL still matters
  • 7:29 – What keeps Marcel fascinated with databases
  • 9:37 – The “SQL is dead” waves and shift to AI
  • 10:21 – Observing pain points in computer vision pipelines
  • 13:02 – Multimodal data challenges and the idea for Pixeltable
  • 16:10 – How Pixeltable handles transformations with computed columns
  • 26:29 – Example: processing video, audio, and transcripts in Pixeltable
  • 33:12 – DAG execution and parallelism explained
  • 37:00 – Transactional guarantees in Pixeltable
  • 39:00 – Iterators and chunking data for search
  • 42:26 – Using embeddings and semantic search
  • 47:05 – Updating data and incremental recomputation
  • 50:06 – Thoughts on RAG and hybrid search
  • 53:14 – Real-world use cases and dataset curation
  • 57:00 – Example: labeling food waste on cruise ships
  • 1:02:00 – Labeling workflows and syncing annotations
  • 1:02:41 – Pixeltable’s roadmap and cloud vision
  • 1:07:10 – How to get involved with Pixeltable
  • 1:09:03 – Closing and where to find Marcel
3 months ago
1 hour

Database School
Sharding Postgres without extensions with PgDog founder, Lev Kokotov

I chat with Lev Kokotov about building PgDog, an open-source sharding solution for Postgres that sits outside the database. Lev shares the journey from creating PgCat to launching PgDog through YC, the technical challenges of sharding, and why he believes scaling Postgres shouldn’t require extensions or rewrites.


Follow Lev:

  • Twitter: https://twitter.com/levpgdog
  • PgDog: https://pgdog.dev


Follow Aaron:

  • Twitter: https://twitter.com/aarondfrancis
  • LinkedIn: https://www.linkedin.com/in/aarondfrancis
  • Website: https://aaronfrancis.com — find articles, podcasts, courses, and more.
  • Database School: https://databaseschool.com



Chapters

  • 00:00 - Intro
  • 01:27 - Lev’s self-taught to computer science degree journey
  • 04:50 - Transition to Postgres discussion
  • 05:24 - History of PgCat
  • 07:06 - What PgCat does and key features
  • 08:59 - Why Lev built PgCat instead of extending PgBouncer
  • 10:06 - PgCat’s current status and usage
  • 12:20 - Moving from PgCat to PgDog
  • 13:09 - Applying to YC as a solo founder
  • 16:24 - YC pitch: the market gap for Postgres sharding
  • 18:52 - High-level overview of PgDog
  • 23:32 - Why PgDog is not an extension
  • 25:57 - When to build Postgres extensions vs standalone tools
  • 27:49 - PgDog architecture and query parsing
  • 30:39 - Handling cross-shard queries and current capabilities
  • 33:47 - How PgDog shards an existing large Postgres database
  • 36:37 - Parallel replication streams for faster sharding
  • 39:07 - Alternate resharding approaches
  • 42:52 - Where PgDog draws the orchestration line
  • 44:00 - Vision for PgDog Cloud vs bring-your-own-database
  • 46:47 - Company status: first hire, design partners, and production use
  • 50:45 - How deploys work for customers
  • 52:20 - Importance of building closely with design partners
  • 54:05 - Paid design partnerships and initial deployments
  • 56:23 - Benefit of sitting outside Postgres for compatibility
  • 58:32 - Near-term roadmap and long-term vision
  • 1:01:03 - Where to find Lev online
4 months ago
48 minutes

Database School
Rewriting SQLite from scratch (yes, really)

Want to learn more about SQLite?  
Check out my course on SQLite: https://highperformancesqlite.com/?ref=yt  

In this episode of Database School, I chat with Glauber Costa, CEO of Turso, about their audacious decision to rewrite SQLite from the ground up.  

We cover the technical motivations, open contribution philosophy, and how deterministic simulation testing is unlocking new levels of reliability.  

Get your free SQLite reference guide: https://highperformancesqlite.com/products/sqlite-reference-guide.  

Follow Glauber:  
Twitter: https://twitter.com/glcst  
Turso: https://tur.so/af  

Follow Aaron:  
Twitter: https://twitter.com/aarondfrancis  
LinkedIn: https://www.linkedin.com/in/aarondfrancis  
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.  

Database school: https://databaseschool.com  

Chapters:  
00:00 - Intro to guest Glauber Costa  
00:58 - Glauber's background and path to databases  
02:23 - Moving to Texas and life changes  
05:32 - The origin story of Turso  
07:55 - Why fork SQLite in the first place?  
10:28 - SQLite’s closed contribution model  
12:00 - Launching libSQL as an open contribution fork  
13:43 - Building Turso Cloud for serverless SQLite  
14:57 - Limitations of forking SQLite  
17:00 - Deciding to rewrite SQLite from scratch  
19:08 - Branding mistakes and naming decisions  
22:29 - Differentiating Turso (the database) from Turso Cloud  
24:00 - Technical barriers that led to the rewrite  
28:00 - Why libSQL plateaued for deeper improvements  
30:14 - Big business partner request leads to deeper rethink  
31:23 - The rewrite begins  
33:36 - Early community traction and GitHub stars  
35:00 - Hiring contributors from the community  
36:58 - Reigniting the original vision  
39:40 - Turso’s core business thesis  
42:00 - Fully pivoting the company around the rewrite  
45:16 - How GitHub contributors signal business alignment  
47:10 - SQLite’s rock-solid rep and test suite challenges  
49:00 - The magic of deterministic simulation testing  
53:00 - How the simulator injects and replays IO failures  
56:00 - The role of property-based testing  
58:54 - Offering cash for bugs that break data integrity  
1:01:05 - Deterministic testing vs traditional testing  
1:03:44 - What it took to release Turso Alpha  
1:05:50 - Encouraging contributors with real incentives  
1:07:50 - How to get involved and contribute  
1:20:00 - Upcoming roadmap: indexes, CDC, schema changes  
1:23:40 - Final thoughts and where to find Turso

4 months ago
1 hour 17 minutes

Database School
Vitess for Postgres, with the co-founder of PlanetScale

Sugu Sougoumarane, co-creator of Vitess and co-founder of PlanetScale, joins me to talk about his time scaling YouTube’s database infrastructure, building Vitess, and his latest project bringing sharding to Postgres with Multigres.

This was a fun conversation with technical deep-dives, lessons from building distributed systems, and why he’s joining Supabase to tackle this next big challenge.

Sugu’s Vitess videos: https://www.youtube.com/watch?v=6yOjF7qhmyY&list=PLA9CMdLbfL5zHg3oapO0HvtPfVx6_iJy6

The big announcement: https://supabase.com/blog/multigres-vitess-for-postgres

Database School: https://databaseschool.com

Follow Sugu:
Twitter: https://twitter.com/ssougou
LinkedIn: https://www.linkedin.com/in/sougou

Follow Aaron:
Twitter: https://twitter.com/aarondfrancis
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 - Intro
1:38 - The birth of Vitess at YouTube
3:19 - The spreadsheet that started it all
6:17 - Intelligent query parsing and connection pooling
9:46 - Preventing outages with query limits
13:42 - Growing Vitess beyond a connection pooler
16:01 - Choosing Go for Vitess
20:00 - The life of a query in Vitess
23:12 - How sharding worked at YouTube
26:03 - Hiding the keyspace ID from applications
33:02 - How Vitess evolved to hide complexity
36:05 - Founding PlanetScale & maintaining Vitess solo
39:22 - Sabbatical, rediscovering empathy, and volunteering
42:08 - The itch to bring Vitess to Postgres
44:50 - Why Multigres focuses on compatibility and usability
49:00 - The Postgres codebase vs. MySQL codebase
52:06 - Joining Supabase & building the Multigres team
54:20 - Starting Multigres from scratch with lessons from Vitess
57:02 - MVP goals for Multigres
1:01:02 - Integration with Supabase & database branching
1:05:21 - Sugu’s dream for Multigres
1:09:05 - Small teams, hiring, and open positions
1:11:07 - Community response to Multigres announcement
1:12:31 - Where to find Sugu

6 months ago
1 hour 7 minutes

Database School
PlanetScale Metal

In this episode, I chat with Richard Crowley from PlanetScale about their new offering: PlanetScale Metal. We dive deep into the performance and reliability trade-offs of EBS vs. locally attached NVMe storage, and how Metal delivers game-changing speed for MySQL workloads.


Links:

  • Database School: https://databaseschool.com
  • PlanetScale: https://planetscale.com
  • PlanetScale Metal: https://planetscale.com/blog/announcing-metal


Follow Richard:

  • Twitter: https://twitter.com/rcrowley
  • Website: https://rcrowley.org


Follow Aaron:

  • Twitter: https://twitter.com/aarondfrancis
  • LinkedIn: https://www.linkedin.com/in/aarondfrancis
  • Website: https://aaronfrancis.com — find articles, podcasts, courses, and more.


Chapters:
00:00 - Intro: What is PlanetScale Metal?
00:39 - Meet Richard Crowley
01:33 - What is Vitess and how does it work?
03:00 - Where PlanetScale fits into the picture
09:03 - Why EBS is the default and its trade-offs
13:03 - How PlanetScale handles durability without EBS
16:03 - The engineering work behind PlanetScale Metal
22:00 - Deep dive into backups, restores, and availability math
25:03 - How PlanetScale replaces instances safely
27:11 - Performance gains with Metal: Latency and IOPS explained
32:03 - Database workloads that truly benefit from Metal
39:10 - The myth of the infinite cloud
41:08 - How PlanetScale plans for capacity
43:02 - Multi-tenant vs. PlanetScale Managed
44:02 - Who should use Metal and when?
46:05 - Pricing trade-offs and when Metal becomes cheaper
48:27 - Scaling vertically vs. sharding
49:57 - What’s next for PlanetScale Metal?
53:32 - Where to learn more

6 months ago
50 minutes

Database School
From Prisma Founder to LiveStore: Building local-first apps with Johannes Schickling

Johannes Schickling, original founder of Prisma, joins me to talk about LiveStore, his ambitious local-first data layer designed to rethink how we build apps from the data layer up.


We dive deep into event sourcing, syncing with SQLite, and why this approach might power the next generation of reactive apps.


🔗 Links Mentioned

Want to learn more about SQLite? Check out my SQLite course: https://highperformancesqlite.com/?ref=yt

LiveStore
Website: https://livestore.dev
Repo: https://github.com/livestorejs
Discord: https://discord.gg/RbMcjUAPd7

Follow Johannes
Twitter: https://www.x.com/schickling
LinkedIn: https://www.linkedin.com/in/schickling
Website: https://www.schickling.dev
Podcast: https://www.localfirst.fm

Follow Aaron
Twitter: https://twitter.com/aarondfrancis
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com — find articles, podcasts, courses, and more

Database School
YouTube: https://www.youtube.com/playlist?list=PLI72dgeNJtzqElnNB6sQoAn2R-F3Vqm15
Audio only: https://databaseschool.transistor.fm

🕒 Chapters
00:00 - Intro to Johannes
01:00 - From Prisma to LiveStore
03:00 - Discovering local-first through Riffle
05:00 - What is local-first and who is it for?
07:00 - Why local-first is gaining popularity
10:00 - The inspiration from apps like Linear
13:00 - Gaps in local-first tooling in 2020
16:00 - Social apps vs. user-centric apps
18:00 - Distributed systems and why they’re hard
21:00 - The value of embracing local-first
24:00 - What LiveStore is and what it’s not
26:00 - Event sourcing as the core of LiveStore
30:00 - Benefits of event sourcing for apps
33:00 - Schema changes and time travel via events
37:00 - Materializers and how they work
43:00 - Syncing data across clients and devices
48:00 - Sync servers and cross-tab communication
54:00 - Architecture choices and dev tooling
59:00 - State of the project and future vision
1:06:00 - How to get involved

7 months ago
1 hour 31 minutes

Database School
How Durable Objects and D1 Work: A Deep Dive with Cloudflare’s Josh Howard

Josh Howard, Senior Engineering Manager at Cloudflare, joins me to explain how Durable Objects and D1 work under the hood—and why Cloudflare’s approach to stateful serverless infrastructure is so unique. We get into V8 isolates, replication models, routing strategies, and even upcoming support for containers.

Want to learn more about SQLite? Check out my SQLite course: https://highperformancesqlite.com/?ref=yt

Follow Josh:
Twitter: https://twitter.com/ajoshhoward
LinkedIn: https://www.linkedin.com/in/joshthoward

Follow Aaron:
Twitter: https://twitter.com/aarondfrancis
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 - Intro
00:37 - What is a Durable Object?
01:43 - Cloudflare’s serverless model and V8 isolates
03:58 - Why stateful serverless matters
05:14 - Durable Objects vs Workers
06:22 - How routing to Durable Objects works
08:01 - What makes them "durable"?
08:51 - Tradeoffs of colocating compute and state
10:58 - Stateless Durable Objects
12:49 - Waking up from sleep and restoring state
16:15 - Durable Object storage: KV and SQLite APIs
18:49 - Relationship between D1, Workers KV, and DOs
20:34 - Performance of local storage writes
21:50 - Storage replication and output gating
24:15 - Lifecycle of a request through a Durable Object
26:46 - Replication strategy and long-term durability
31:25 - Placement logic and sharding strategy
36:35 - Use cases: agents, multiplayer games, chat apps
40:33 - Scaling Durable Objects
41:14 - Globally unique ID generation
43:22 - Named Durable Objects and coordination
46:07 - D1 vs Workers KV vs Durable Objects
47:50 - Outerbase acquisition and DX improvements
49:49 - Querying durable object storage
51:20 - Developer Week highlights and new features
52:44 - Read replicas and sticky sessions
53:49 - Containers and the future of routing
56:47 - Deployment regions and infrastructure expansion
57:43 - Hiring and how to connect with Josh

7 months ago
1 hour 14 minutes

Database School
20 years of hacking Postgres with Heikki Linnakangas (cofounder of Neon)

In this episode of Database School, I talk with Heikki Linnakangas, co-founder of Neon and longtime PostgreSQL hacker, about 20+ years in the Postgres community, the architecture behind Neon, and the future of multi-threaded Postgres. From paternity leave patches to branching production databases, we cover a lot of ground in this deep-dive conversation.

Links:
Let's make postgres multi-threaded: https://www.postgresql.org/message-id/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
Hacker News discussion: https://news.ycombinator.com/item?id=36284487

Follow Heikki:
LinkedIn: https://www.linkedin.com/in/heikki-linnakangas-6b58bb203/
Website: https://neon.tech

Follow Aaron:
Twitter: https://twitter.com/aarondfrancis
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 - Introduction and Heikki's background
01:19 - How Heikki got into Postgres
03:17 - First major patch: two-phase commit
04:00 - Governance and decision-making in Postgres
07:00 - Committer consensus and decentralization
09:25 - Attracting new contributors
11:25 - Founding Neon with Nikita Shamgunov
13:01 - Why separation of compute and storage matters
15:00 - Write-ahead log and architectural insights
17:03 - Early days of building Neon
20:00 - Building the control plane and user-facing systems
21:28 - What "serverless Postgres" really means
23:39 - Reducing cold start time from 5s to 700ms
25:05 - Storage architecture and page servers
27:31 - Who uses sleepable databases
28:44 - Multi-tenancy and schema management
31:01 - Role in low-code/AI app generation
33:04 - Branching, time travel, and read replicas
36:56 - Real-time point-in-time query recovery
38:47 - Large customers and scaling in Neon
41:04 - Heikki’s favorite Neon feature: time travel
41:49 - Making Postgres multi-threaded
45:29 - Why it matters for connection scaling
50:50 - The next five years for Postgres and Neon
52:57 - Final thoughts and where to find Heikki

8 months ago
2 hours

Database School
Building a serverless database replica with Carl Sverre

Want to learn more SQLite? Check out my SQLite course: https://highperformancesqlite.com

In this episode, Carl Sverre and I discuss why syncing everything is a bad idea and how his new project, Graft, makes edge-native, partially replicated databases possible. We dig into SQLite, object storage, transactional guarantees, and why Graft might be the foundation for serverless database replicas.

SQLSync: https://sqlsync.dev
Stop syncing everything blog post: https://sqlsync.dev/posts/stop-syncing-everything
Graft: https://github.com/orbitinghail/graft

Follow Carl:
Twitter: https://twitter.com/carlsverre
LinkedIn: https://www.linkedin.com/in/carlsverre
Website: https://carlsverre.com/

Follow Aaron:
Twitter: https://twitter.com/aarondfrancis
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 - Intro and Carl’s controversial blog title
01:00 - Why “stop syncing everything” doesn't mean stop syncing
02:30 - The problem with full database syncs
03:20 - Quick recap of SQL Sync and multiplayer SQLite
04:45 - How SQL Sync works using physical replication
06:00 - The limitations that led to building Graft
09:00 - What is Graft? A high-level overview
16:30 - Syncing architecture: how Graft scales
18:00 - Graft's stateless design and Fly.io integration
20:00 - S3 compatibility and using Tigris as backend
22:00 - Latency tuning and express zone support
24:00 - Can Graft run locally or with Minio?
27:00 - Page store vs meta store in Graft
36:00 - Index-aware prefetching in SQLite
38:00 - Prefetching intelligence: Graft vs driver
40:00 - The benefits of Graft's architectural simplicity
48:00 - Three use cases: apps, web apps, and replicas
50:00 - Sync timing and perceived latency
59:00 - Replaying transactions vs logical conflict resolution
1:03:00 - What’s next for Graft and how to get involved
1:05:00 - Hacker News reception and blog post feedback
1:06:30 - Closing thoughts and where to find Carl

8 months ago
1 hour 28 minutes

Database School
Postgres on bare metal with the CEO of Prisma

Prisma started as a GraphQL backend and pivoted into one of the most widely used ORMs in the world. Now, they’ve launched Prisma Postgres, and CEO Søren Bramer Schmidt is here to break down the journey, the challenges, and the massive technical innovations behind it—including bare-metal servers, Firecracker microVMs, and unikernels. If you care about databases, performance, or scaling, this one’s for you.

Want to learn more Postgres? Check out my Postgres course: https://masteringpostgres.com.

Follow Søren:
Twitter: https://twitter.com/sorenbs
GitHub: https://github.com/prisma/prisma
Prisma Postgres: https://www.prisma.io/postgres

Follow Aaron:
Twitter:  https://twitter.com/aarondfrancis 
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 - Introduction
01:15 - The Origins of Prisma: From GraphQL to ORM
02:55 - Why Firebase & Parse Inspired Prisma
04:04 - The Pivot: From GraphQL to Prisma ORM
06:00 - Why They Abandoned Backend-as-a-Service
08:07 - The Open Source Business Model Debate
10:15 - The Challenges of Monetizing an ORM
12:42 - Building Prisma Accelerate & Pulse
14:55 - How Prisma Accelerate Optimizes Database Access
17:00 - Real-Time Database Updates with Prisma Pulse
20:03 - How Prisma Pulse Handles Change Data Capture (CDC)
23:15 - Users Wanted a Hosted Database (Even When Prisma Didn’t)
25:40 - Why Prisma Finally Launched Prisma Postgres
27:32 - Unikernels, Firecracker MicroVMs & Running Millions of Databases
31:10 - Bare Metal Servers vs. AWS: The Controversial Choice
34:40 - How Prisma Routes Queries for Low Latency
38:02 - Scaling, Cost Efficiency & Performance Benefits
42:10 - The Prisma Postgres Roadmap & Future Features
45:30 - Why Prisma is Competing with AWS & The Big Cloud Players
48:05 - Final Thoughts & Where to Learn More

Show more...
10 months ago
1 hour 24 minutes

Database School
Moving from Redis to SQLite with Mike Buckbee

Want to learn more SQLite? Check out my SQLite course: https://highperformancesqlite.com

In this episode, I sit down with Mike Buckbee to dive into the nitty-gritty of web application firewalls and his journey from using Redis to SQLite in Wafris. We talk about database architecture, operational challenges, and the fascinating ways SQLite improves performance and usability in cybersecurity tools.

Get production ready SQLite with Turso: https://tur.so/af

Follow Mike:
Twitter: https://twitter.com/mbuckbee
LinkedIn: https://www.linkedin.com/in/michaelbuckbee
Wafris website: https://wafris.org
Rearchitecting Redis to SQLite article: https://wafris.org/blog/rearchitecting-for-sqlite

Follow Aaron:
Twitter: https://twitter.com/aarondfrancis
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

Chapters:
00:00 - Introduction and Guest Overview
01:06 - What is Wafris?
02:43 - Naming and Origins of Wafris
04:00 - Mike's Cybersecurity Background
07:17 - Challenges with Web Application Firewalls
10:01 - Wafris Architecture Overview
16:15 - Why Switch to SQLite?
18:01 - Handling IP Address Ranges
24:00 - Wild Redis Data Structures Explained
28:51 - Transitioning to SQLite
32:02 - Operational Advantages of SQLite
37:04 - How Wafris Leverages Threat Lists
40:13 - Performance Gains with SQLite
46:51 - Splitting Reads and Writes in the New Architecture
52:29 - Closing Thoughts and Where to Learn More

1 year ago
1 hour 9 minutes
