the boring 80%: a tactical playbook for engineering foundations

most engineering 'foundation' advice is fluff. here's the tactical playbook i use to build production ai systems without over-engineering: what to instrument first, the contracts that prevent 90% of bugs, the 30-minute rule, and how to know when foundations are a trap.

April 7, 2026 / 7 min read

last week i shipped a feature in three hours that would have taken three weeks at my last job. not because i'm a 10x engineer. because the foundation was already there.

i've been thinking about why some codebases let you move fast and others punish every line of new code. there's a popular framing that you spend most of your time on the foundation before you can do the interesting work. that framing is mostly right but it's missing a tactical playbook. so here's what i actually do.

what "foundation" means in software, not in metaphor

when people talk about engineering foundations they usually mean vague things: "good architecture", "clean code", "proper testing". that's marketing copy. let me be specific.

the foundation is the answer to seven questions:

  1. observability: when something breaks at 2am, can i find the broken thing in under 90 seconds?
  2. data contracts: if i change a field, will i find every consumer of that field in under 5 minutes?
  3. error budgets: do i know which errors are acceptable and which mean i wake up?
  4. idempotency: can the same request happen twice without breaking the world?
  5. migrations: can i change the schema without taking the system down?
  6. secrets: can i rotate a credential without a deploy?
  7. feedback loops: how long from "i typed code" to "i know if it works"?

if you can answer all seven with concrete tools and runbooks, you have a foundation. if even one is hand-wavy, you don't.

the order matters

not all seven cost the same and not all seven matter equally on day one. here's the order i actually build them in for a new system.

first: feedback loops. before anything else, i want a hot-reload dev server, a unit test that runs in under a second, and a one-command local reset. everything else is downstream of how fast i can iterate. if every change takes 30 seconds to validate, i will ship slowly forever.

second: observability. structured logs with request ids, a single dashboard that shows error rate and latency per endpoint, and traces for any call that crosses a service boundary. i want to know what's happening before i write the second feature, not after the first incident.
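a minimal sketch of "structured logs with request ids", using only python's standard library. the logger name "app", the handler function, and the log messages are made up for illustration; a real service would set the request id in middleware rather than inside each handler.

```python
import contextvars
import io
import json
import logging
import uuid

# holds the id of the request being handled; "-" means "outside any request"
request_id_var = contextvars.ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """render each log record as one json object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        })


def build_logger(stream) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("app")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger: logging.Logger) -> str:
    """hypothetical handler: everything logged inside it shares one id."""
    request_id = uuid.uuid4().hex
    token = request_id_var.set(request_id)
    try:
        logger.info("charge started")
        logger.info("charge finished")
    finally:
        request_id_var.reset(token)
    return request_id


buffer = io.StringIO()
rid = handle_request(build_logger(buffer))
lines = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

every line carries the same request_id, so finding the broken thing at 2am becomes a single filter on that field.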

third: data contracts. i write the schema as code (sqlc, prisma, sqlx) so queries are typed against the database. every field that crosses a boundary becomes a typed struct, not a map[string]any. this kills 90% of "the api returned something weird" bugs before they ship.
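here's roughly what "a typed struct, not a map" looks like at a boundary, sketched in python with a dataclass. the UserCreated event and its two fields are hypothetical, and the validation is deliberately small; a schema library would do the same job with less hand-written code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserCreated:
    """hypothetical payload that crosses a service boundary."""

    user_id: int
    email: str

    @classmethod
    def from_payload(cls, payload: dict) -> "UserCreated":
        user_id = payload.get("user_id")
        email = payload.get("email")
        # bool is a subclass of int in python, so reject it explicitly
        if not isinstance(user_id, int) or isinstance(user_id, bool):
            raise ValueError("user_id must be an int")
        if not isinstance(email, str) or "@" not in email:
            raise ValueError("email must be a string containing '@'")
        return cls(user_id=user_id, email=email)


good = UserCreated.from_payload({"user_id": 7, "email": "ada@example.com"})

try:
    UserCreated.from_payload({"user_id": "7", "email": "ada@example.com"})
    rejected = False
except ValueError:
    rejected = True
```

the point is that the weird payload fails at the edge, with a named field in the error, instead of three function calls deeper.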

fourth: idempotency. every write has a request id. every job has a deduplication key. retries are safe by default. i never want to be afraid of clicking the button twice.
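a sketch of the "every write has a request id" idea. the IdempotentWrites class and the charge function are invented for this example, and the results live in an in-memory dict; a real system would keep them in a durable store with a unique constraint on the request id.

```python
from typing import Callable, Dict


class IdempotentWrites:
    """run each write at most once per request id and replay the stored result."""

    def __init__(self) -> None:
        self._results: Dict[str, object] = {}

    def run(self, request_id: str, write: Callable[[], object]) -> object:
        if request_id in self._results:
            # a retry: return what the first attempt produced, do nothing else
            return self._results[request_id]
        result = write()
        self._results[request_id] = result
        return result


balance = {"cents": 0}


def charge() -> int:
    balance["cents"] += 500
    return balance["cents"]


writes = IdempotentWrites()
first = writes.run("req-123", charge)
retry = writes.run("req-123", charge)  # same id, so the charge is not repeated
```

clicking the button twice sends the same request id twice, and the second click gets the first click's answer.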

fifth: migrations. i never ship a destructive change in the same deploy as the code that depends on it. add the column, backfill it, then drop the old one in a separate deploy a week later. this turns "schema change" from a scary event into a routine.
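the add, backfill, drop-later sequence can be sketched with sqlite from the standard library. the users table and the name to display_name rename are assumptions for illustration. the drop is kept as a string and not executed, since in practice it belongs to a later deploy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("Ada Lovelace",), ("Grace Hopper",)],
)

# deploy 1, step 1 (expand): add the new column as nullable so old code keeps working
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# deploy 1, step 2 (backfill): copy existing data into the new column
conn.execute("UPDATE users SET display_name = name WHERE display_name IS NULL")
conn.commit()

# deploy 2, about a week later (contract): only once nothing reads the old column
CONTRACT_STEP = "ALTER TABLE users DROP COLUMN name"

rows = conn.execute(
    "SELECT id, name, display_name FROM users ORDER BY id"
).fetchall()
```

between the two deploys both columns exist, so a rollback of the application code never meets a schema it doesn't understand.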

sixth: secrets. vault or aws secrets manager with rotation. no env vars with prod credentials in plain text. this one is boring until the day it isn't.
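one way to make "rotate a credential without a deploy" work on the application side is to read the secret at use time through a short cache instead of once at startup. this is a sketch under assumptions: RotatingSecret is a made-up class, the dict stands in for vault or aws secrets manager, and the 60-second ttl is arbitrary.

```python
import time
from typing import Callable, Optional


class RotatingSecret:
    """cache a secret for ttl_seconds, then ask the secret store again."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


# stand-in for a real secret store; the key name is hypothetical
store = {"db_password": "old"}
fake_now = [0.0]  # a controllable clock so the example needs no sleeping

secret = RotatingSecret(
    fetch=lambda: store["db_password"],
    ttl_seconds=60,
    clock=lambda: fake_now[0],
)

before = secret.get()
store["db_password"] = "new"   # the credential is rotated in the store
cached = secret.get()          # still inside the ttl, so the old value is served
fake_now[0] = 61.0
after = secret.get()           # ttl expired, so the new value is picked up
```

the old credential has to stay valid for at least one ttl after rotation, otherwise the cached value fails before it refreshes.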

seventh: error budgets. a list, written down, of which errors page me and which ones go to a queue i look at on monday. without this, every bug feels like a fire and i burn out in six months.
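the written-down list can be as small as a lookup table. the error names below are invented, and sending unknown errors to the pager is my assumption, chosen so that a new failure mode forces an explicit decision instead of hiding in a queue.

```python
from enum import Enum


class Route(Enum):
    PAGE = "page"
    MONDAY_QUEUE = "monday_queue"


# hypothetical error kinds; the real list comes from your own incidents
ERROR_POLICY = {
    "payment_failed": Route.PAGE,
    "db_unreachable": Route.PAGE,
    "thumbnail_render_failed": Route.MONDAY_QUEUE,
    "email_bounced": Route.MONDAY_QUEUE,
}


def route_error(kind: str) -> Route:
    """decide whether an error wakes someone up or waits for monday."""
    # anything not on the list pages, until someone classifies it
    return ERROR_POLICY.get(kind, Route.PAGE)
```

for example, route_error("email_bounced") lands in the monday queue, while an error kind nobody has classified yet pages.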

notice what's not on the list: code style, file organization, "clean code" patterns. those matter for craft. different problem.

the metric for "enough"

the question nobody answers when they talk about foundations: how do you know when it's done? when do you stop building scaffolding and start shipping features?

i use a test i stole from a senior engineer years ago: the 30-minute rule. if a new engineer (or me, two months from now) cannot make a meaningful change to the system within 30 minutes of cloning the repo, the foundation is not done. that 30 minutes includes:

  • clone the repo, install dependencies, run locally
  • find the file that handles the thing they need to change
  • make the change
  • see it work in the dev environment
  • write a test that proves it

if any step takes more than a few minutes, fix that step before adding more features. if the whole sequence takes under 30 minutes, you can ship.

this is way more concrete than "we have good architecture". it's a stopwatch you can run on any codebase, including your own.
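if you want to write the stopwatch down, it can be a few lines. the timings below are made up, and treating "a few minutes" as a 5-minute per-step budget is my assumption, not part of the rule.

```python
from typing import Dict, List


def evaluate(
    timings_minutes: Dict[str, float],
    total_budget: float = 30.0,
    step_budget: float = 5.0,
) -> dict:
    """compare measured step times against the 30-minute rule."""
    slow_steps: List[str] = [
        step for step, minutes in timings_minutes.items() if minutes > step_budget
    ]
    total = sum(timings_minutes.values())
    return {
        "total_minutes": total,
        "within_budget": total <= total_budget,
        "slow_steps": slow_steps,
    }


# hypothetical measurements from one run through the five steps
report = evaluate({
    "clone, install, run locally": 4,
    "find the file": 2,
    "make the change": 6,
    "see it work": 1,
    "write a test": 5,
})
```

here the run fits inside 30 minutes, but "make the change" is over the per-step budget, so that is the step to fix first.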

when foundations are a trap

here's the part nobody writes about. foundations can absolutely be a trap.

i've seen teams spend six months building an "event-driven microservice architecture" before they had ten users. i've seen teams write a custom orm because the existing ones "didn't fit our model". i've seen teams build observability dashboards for systems that didn't exist yet. i've been each of those teams at different points.

the tell is when foundation work isn't connected to a feature you're about to ship. if you're building observability because you keep getting paged for stuff you can't debug, build it. if you're building observability because it feels like the responsible thing to do, you're procrastinating with extra steps.

the rule: build the foundation in response to pain, not in anticipation of it.

a related rule: ship the ugly version first. write the function inline. hard-code the value. skip the abstraction. then when you need to use it twice, extract it. when you need to test it, isolate it. let the foundation grow from real demand, not from a mental model of what good engineering looks like.

this sounds contradictory to the seven-item list. it isn't. the seven items are non-negotiable because every system has them whether you choose them or not. you'll have logging either way. the question is whether it's structured. you'll have a schema either way. the question is whether changes are safe. these aren't optional features. they're physics.

the compounding asymmetry

here's the actual reason foundations matter, stripped of metaphor: they have a fixed cost and an unbounded payoff.

i can spin up a new ai feature in regent in a day, and it's not because i'm fast. it's because:

  • the database has per-user row-level security (rls) so i don't have to think about access control
  • every llm call goes through one router that handles retries, fallbacks, and circuit breaking
  • there's a typed event log so i can replay exactly what the user did when something breaks
  • the cron framework handles scheduling, so adding a new periodic job is one function

each of those took a week to build the first time. now they pay back every feature, forever. you spend a fixed cost once to remove a per-feature tax for the rest of the project's life. that's the asymmetry.
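to make the router bullet concrete, here is a toy version of the shape: retries per provider, fallback to the next provider, and a crude circuit breaker. it is not regent's router. the Router class, the provider functions, and the thresholds are all hypothetical, and it leaves out backoff between retries and any way for an open circuit to close again.

```python
from typing import Callable, Dict, List, Tuple

Provider = Callable[[str], str]


class Router:
    """try providers in priority order, skipping ones that keep failing."""

    def __init__(
        self,
        providers: List[Tuple[str, Provider]],
        max_attempts: int = 2,
        failure_threshold: int = 3,
    ) -> None:
        self._providers = providers
        self._max_attempts = max_attempts
        self._failure_threshold = failure_threshold
        # consecutive failures per provider; at the threshold the circuit is open
        self._failures: Dict[str, int] = {name: 0 for name, _ in providers}

    def complete(self, prompt: str) -> str:
        last_error = None
        for name, call in self._providers:
            for _ in range(self._max_attempts):
                if self._failures[name] >= self._failure_threshold:
                    break  # circuit open: stop calling this provider
                try:
                    result = call(prompt)
                except Exception as exc:
                    self._failures[name] += 1
                    last_error = exc
                else:
                    self._failures[name] = 0
                    return result
        raise RuntimeError("every provider failed or is circuit-open") from last_error


calls = {"primary": 0, "backup": 0}


def primary(prompt: str) -> str:
    calls["primary"] += 1
    raise TimeoutError("primary timed out")


def backup(prompt: str) -> str:
    calls["backup"] += 1
    return "backup:" + prompt


router = Router([("primary", primary), ("backup", backup)])
answers = [router.complete("hi") for _ in range(3)]
```

across the three calls the primary is attempted three times in total, then its circuit opens and the last call goes straight to the backup. feature code only ever calls complete, which is what removes the per-feature tax.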

the boring foundation work is what makes the interesting work possible. not the other way around.

what i actually do on day one

if i'm starting something fresh tomorrow, here's the first day:

  1. clone a template that has logging, hot reload, and a one-command dev setup already wired in
  2. set up the database with one migration and one typed query
  3. write one endpoint, one handler, one test
  4. deploy it to a real environment (not just localhost)
  5. add structured logging and one dashboard
  6. now start building the actual thing

that whole sequence takes 4 to 6 hours. it's almost the only "foundation" i do up front. the rest grows in response to features.

the part i'm still updating

i used to think foundations meant "getting the architecture right". after building regent, mailpilot, and a few other systems in production, i think architecture matters way less than the seven items above. you can have a "wrong" architecture and still ship if you can debug it. you can have a "right" architecture and still die if you can't tell why a request is slow.

ship debuggable systems. ship them with safe writes. ship them with one fast feedback loop. everything else is downstream.

the boring 80% isn't a phase you finish once. it's a discipline you maintain forever. and the reward isn't a standing ovation. the reward is that next week, when you have an idea, you can ship it on a tuesday afternoon instead of starting a six-week project.

that's the real magic. not a flash of brilliance. the quiet fact that you built a system where brilliance is cheap.
