The micro data footprint
The personal data problem was never about storage or compute. It was always about organisation and access.
All meaningful textual and structured data a person generates in a day fits in roughly 50–200KB compressed. Messages, emails, notes, biometrics, transactions, calendar entries, meeting transcripts, decisions, location. Everything. Even at 200KB a day, a year comes to about 73MB, so 100MB per year is a safe upper bound. A decade fits in under a gigabyte. A lifetime in a few gigabytes.
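The arithmetic behind these figures can be checked in a few lines. This is a rough sketch, assuming decimal units (1KB = 1,000 bytes), 365 days a year, and an 80-year lifetime; the function name `footprint_bytes` is illustrative only.

```python
KB = 1_000
MB = 1_000 * KB
GB = 1_000 * MB


def footprint_bytes(kb_per_day: int, years: int) -> int:
    """Total bytes generated at a constant daily rate over a number of years."""
    return kb_per_day * KB * 365 * years


# Low and high ends of the 50-200KB/day range, over three horizons.
# The 80-year lifetime is an assumption made for this sketch.
for label, years in [("year", 1), ("decade", 10), ("lifetime (80 years)", 80)]:
    low = footprint_bytes(50, years)
    high = footprint_bytes(200, years)
    print(f"{label}: {low / MB:,.1f}MB to {high / MB:,.1f}MB")
```

At the high end this gives 73MB for a year, 730MB for a decade, and under 6GB for 80 years.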
At this scale, just capture everything. Don’t filter upfront. The storage cost is effectively zero. Query what you need later.
This means:
- A single SQLite file on a laptop can hold an entire life’s meaningful data.
- No cloud infrastructure needed for capacity. Only for sync and backup.
- No distributed systems, no scaling concerns. Ever.
- Local-first is not just viable — it’s the right architecture. Anything more is over-engineered.
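To make the "capture everything, query later" idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `events` table, its columns, and the helper names `open_store`, `capture`, and `query` are assumptions made for illustration, not a prescribed schema. Payloads are stored as JSON text so that nothing has to be filtered or modelled upfront.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the single-file store. Pass a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               id      INTEGER PRIMARY KEY,
               ts      TEXT NOT NULL,   -- ISO 8601 timestamp, sorts lexicographically
               source  TEXT NOT NULL,   -- e.g. 'calendar', 'transaction', 'note'
               payload TEXT NOT NULL    -- raw record as JSON, unfiltered
           )"""
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_events_ts ON events (ts)")
    return conn


def capture(conn: sqlite3.Connection, ts: str, source: str, payload: dict) -> None:
    """Append one record as-is; no filtering at write time."""
    conn.execute(
        "INSERT INTO events (ts, source, payload) VALUES (?, ?, ?)",
        (ts, source, json.dumps(payload)),
    )
    conn.commit()


def query(conn: sqlite3.Connection, source: str, since: str) -> list:
    """Return (timestamp, payload) pairs for one source from a given time onward."""
    rows = conn.execute(
        "SELECT ts, payload FROM events WHERE source = ? AND ts >= ? ORDER BY ts",
        (source, since),
    ).fetchall()
    return [(ts, json.loads(payload)) for ts, payload in rows]


conn = open_store()
capture(conn, "2024-05-01T09:00:00Z", "calendar", {"title": "Standup"})
capture(conn, "2024-05-01T12:30:00Z", "transaction", {"amount": -12.5, "merchant": "Cafe"})
capture(conn, "2024-05-02T08:15:00Z", "note", {"text": "Call the landlord"})

print(query(conn, "transaction", "2024-05-01"))
```

Swapping `":memory:"` for a path such as `life.db` (a hypothetical filename) keeps everything in one file on disk, which is all the capacity argument above requires.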
When people hear “unify all your personal data,” they think big data, infrastructure, cost. The reality is at most a couple of hundred KB a day. The hard part was never storing it. The hard part was getting it in and making it queryable.
Tools built for scale (cloud databases, distributed systems, enterprise knowledge management) are massively over-engineered for personal data. A Markdown file plus a SQLite database is the right level of technology. That’s a feature, not a limitation.
One caveat: this excludes video, photos, and audio. These are large, but they’re media, not structured data. Their metadata — timestamps, tags, descriptions — is tiny and belongs in the graph. The files themselves live on disk or in cloud storage.