An 11TB MySQL Log Warehouse, Before Columnar Warehouses Existed
A 90-day rolling HTTP log warehouse built on MySQL 5 with MERGE-table partitioning, MEMORY-engine ingestion staging, and IOPS-aware hardware design — solving the data-warehouse problem years before the tools that now define it.
Today you would point Snowflake at the problem and be done. In 2005, warehousing 11TB of HTTP logs meant using MySQL’s MERGE tables as a partitioning workaround (native partitioning did not arrive until MySQL 5.1), staging ingestion through in-memory tables, and speccing hardware around IOPS budgets calculated by hand.
The full write-up will cover:
- Partitioning before native partitioning: MERGE-table mechanics and their failure modes
- Ingestion staging with the MEMORY engine and durable storage on MyISAM
- IOPS-aware hardware design when storage was the binding constraint
- What this era teaches about the current generation of cloud warehouses — the problems are the same; the tools absorbed them
- Bonus thread: migrating millions of live webmail users from mbox to Maildir at the same employer
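As a small taste of the first item, here is a minimal sketch of how a rolling window over MERGE tables can be rotated. It assumes one MyISAM table per day, which is a guess at the granularity rather than a detail of the original system, and the names `http_log_all`, `http_log_template`, and the `http_log_YYYYMMDD` scheme are hypothetical. The script only builds the SQL statements; it does not connect to a database.

```python
from datetime import date, timedelta

RETENTION_DAYS = 90  # the warehouse kept a 90-day rolling window


def daily_table(day: date) -> str:
    """Hypothetical naming scheme: one underlying MyISAM table per day."""
    return f"http_log_{day:%Y%m%d}"


def window_tables(today: date, days: int = RETENTION_DAYS) -> list[str]:
    """Names of the daily tables inside the window, oldest first."""
    return [daily_table(today - timedelta(days=offset))
            for offset in range(days - 1, -1, -1)]


def rotate_statements(today: date, merge_table: str = "http_log_all") -> list[str]:
    """Build the SQL for one daily rotation (assumed workflow, not the original)."""
    expired = daily_table(today - timedelta(days=RETENTION_DAYS))
    union = ", ".join(window_tables(today))
    return [
        # 1. Create today's table with the same schema as a template table.
        f"CREATE TABLE IF NOT EXISTS {daily_table(today)} LIKE http_log_template",
        # 2. Redefine the MERGE table's UNION list to cover only the window.
        f"ALTER TABLE {merge_table} UNION=({union})",
        # 3. Drop the expired day only after it has left the UNION list.
        f"DROP TABLE IF EXISTS {expired}",
    ]


if __name__ == "__main__":
    for statement in rotate_statements(date(2005, 12, 31)):
        print(statement[:100])
```

The ordering matters in this sketch: the expired table leaves the UNION list before it is dropped, so the MERGE table never references a table that no longer exists.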
Full case study coming soon.