PoC is a framework of perverse incentives
Was it ever possible to build a Proof of Concept that didn’t end up rushed into production as a nest of bugs, instability, and technical debt, regardless of engineering’s advance warning that it would be “just a PoC”? Can we be at ease, now that anyone with a laptop has a PoC machine gun?
Every PoC starts with good intentions. But the road to hell is paved with good intentions. PoCs are a bit like the movie “Gremlins”, where a kid sells a cute pet with three warnings (“no sunlight”, “no contact with water”, “no food after midnight”). The buyer ignores the warnings and, at the first oversight, the pet nibbles on leftover chicken wings and transforms into a horde of reptilian psychopaths. The kid’s grandfather, a grumpy elder, knew that usage warnings are useless and didn’t want to sell the creature in the first place. But nobody wanted to listen. The kid was too eager to sell. The man was too eager to buy.
The term PoC functions as a trigger-word, summoning a special type of deliverable that twists each stakeholder’s fair, rational incentives into a perverse framework that works against all of them.
For Engineers, it carries the implicit license to put on headphones and indulge in coding without the boring guardrails of the daily job: the Jira tickets, the tests, the documentation, the standard tools and frameworks, the attention to quality standards. It’s “just a PoC”, so it’ll be fun and it’ll be fine; we will only need a bit of extra time to polish, to productise. They become eager sellers of a dangerous product.
For anyone on the buy side, the PoC is equally attractive. It finds eager sponsors in anyone with influence over priorities. They see work that requires surprisingly small budgets, that motivates engineers, that perhaps carries fuzzy warnings about being “just a PoC”. But how risky can this be? It’s “just a PoC”.
When the demo arrives, PMs are excited that such a valuable feature got done faster and cheaper than usual. Sales is positive that it can really help close $customer’s contract renewal. Suddenly, the entire organization is pushing to roll it out by yesterday. Nobody wants to productise. They want to profit. Bring value to customers. How can that be wrong? Warnings are forgotten. Any delay, incomprehensible. Let’s be agile and iterate later.
Sometimes engineers manage to push back and rewrite; the delay frustrates PMs, Sales, and customers alike (“What is it with you engineers and technical debt? Why did you introduce it in the first place?”). Other times someone pulls $customer’s monthly bill and strong-arms the release of a duct-taped, half-baked feature, and engineers brace for the ensuing maintenance tire fire. Every stakeholder has fair and rational incentives, but the aggregate is a mess.
The whole point of PoCs was to confirm the viability of an idea. What’s viable about all this?
You can’t lose if you don’t play
The best way to avoid falling into the PoC trap is not doing PoCs in the first place. That doesn’t mean abandoning small investments to validate ideas. What we want to avoid is a particular mindset, a way of approaching and executing this type of project. The antiquarian in Gremlins wasn’t against all Christmas presents either, just the kind that will find a way to eat dinner leftovers and stab you.
Not doing PoCs means treating those seemingly special projects like any other. It means resisting the temptation to compromise on the standards and basic principles that apply to ordinary work. Those short-cuts are a naive optimization that, yes, accelerates the demo, but does so at the expense of creating a flawed deliverable.
For the sake of speed it seems fine to let tests fail, to hack around the code base breaking logical boundaries that exist for good reasons, to leave corner cases unimplemented behind TODOs. If quality controls make those short-cuts impossible on the main line, then let’s find a toxic workaround, like developing on feature branches or the git-flow nonsense.
By tolerating short-cuts, engineers trick themselves with a shiny-looking demo that conceals a cluster of delayed-fuse bombs: technical debt and late integration issues.
The same standards and principles applied to production are the necessary guardrails to keep emotional impulses and perverse incentives in check. Work and demo from trunk, respect the same quality checks, control exposure to users with feature flags or similar mechanisms, etc. Even if the demo takes a bit longer, the risk that over-excited PMs push for a premature release becomes smaller and easier to manage.
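To make the feature-flag idea concrete, here is a minimal sketch of gating an experimental code path on trunk. The flag store, flag name, and pricing logic are all invented for illustration; a real system would typically use a configuration service or a product like LaunchDarkly or Unleash instead of an in-process dict.

```python
# Hypothetical in-process flag store; real systems would back this
# with a config service so flags can change without a deploy.
FLAGS = {
    "new_pricing_engine": {"enabled": True, "allowed_users": {"internal-demo"}},
}

def is_enabled(flag_name: str, user: str) -> bool:
    """True only when the flag exists, is on, and the user is allowlisted."""
    flag = FLAGS.get(flag_name)
    return bool(flag and flag["enabled"] and user in flag["allowed_users"])

def price_quote(user: str, amount: float) -> float:
    if is_enabled("new_pricing_engine", user):
        # Experimental path: merged to trunk, but visible only to demo users.
        return amount * 0.9
    # Stable path for everyone else.
    return amount
```

The experimental code ships from trunk and passes the same checks as everything else, but exposure stays limited to the demo audience until the work is genuinely production-ready.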
The argument that respecting production-grade delivery and operational standards slows down experimentation has some merit. But it raises the question: if those standards matter, why is it acceptable to open back-doors under the PoC banner? The reasonable course of action is instead to adapt delivery processes and operational standards to enable the speed of experimentation that the business requires.
Scaling without over-engineering
Another objection to treating a PoC like real work is that production standards require taking scalability, maintainability and the whole lot into account. Won’t that turn a quick experiment into an expensive and over-engineered mess?
The answer is that we should not confuse considering those factors with actually designing and implementing for the maximum. In a well-known talk about scalability, Jeff Dean recommends to “design for ~10x growth, but plan to rewrite before ~100x [because] the right design at x may be very wrong at 10x or 100x”.
There is an implicit point here: scalability is about ranges more than fixed points. What matters is awareness of where the initial implementation stands, and a clear, credible story for how it will scale if and when needed.
The PoC mentality adds pressure to design for very concrete points at the lower end of the range (“the demo will be for just one user”, “it’s just a couple of engineers working on this system”, “it’s just a throwaway”), which leaves little runway before the system saturates and forces a rewrite.
The way to avoid that problem is not jumping straight from “throwaway demo” to “hyperscale”. It is to apply the same principles one would use in normal work, adjusted for scale. A design for a single-digit percentage of the full production load, with enough headroom to support 10x growth, is enough to keep the over-engineering risk under control.
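As a back-of-envelope illustration of that sizing logic (every number below is invented, not taken from any real system):

```python
# Hypothetical sizing exercise: design for a slice of production load,
# with ~10x headroom, and plan a rewrite before the next order of magnitude.
full_production_load = 10_000                 # req/s if fully rolled out (assumed)
initial_share = 0.05                          # experiment serves ~5% of that
initial_target = full_production_load * initial_share   # comfortable target
headroom = 10                                 # "design for ~10x growth"
design_ceiling = initial_target * headroom    # redesign before crossing this

print(f"target: {initial_target:.0f} req/s, ceiling: {design_ceiling:.0f} req/s")
```

With these assumed numbers, the experiment targets 500 req/s and survives up to 5,000 req/s, leaving plenty of runway before full production load forces the planned redesign.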
When it comes to maintainability, it is naive to anticipate years of future maintenance by a hypothetical team of dozens of engineers and bloat the experiment with every architectural paradigm in the book. Most often, it’s about the exact opposite: avoid marrying anything that embeds long-term assumptions (architectural patterns, frameworks, etc.). Deliver a basic, monolithic design with reasonably clear boundaries that allow for a clean replacement of functional units, or for slicing them out into separate subsystems, if and when required.
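A sketch of what such a seam might look like inside a monolith. All names here are illustrative; the point is only that components depend on small interfaces, so one of them can later be sliced out behind a network API without rewriting its callers.

```python
from typing import Protocol

class Inventory(Protocol):
    """The seam: callers see only this small interface."""
    def reserve(self, sku: str, qty: int) -> bool: ...

class InMemoryInventory:
    """Today's implementation, living inside the monolith."""
    def __init__(self) -> None:
        self._stock = {"widget": 5}

    def reserve(self, sku: str, qty: int) -> bool:
        if self._stock.get(sku, 0) >= qty:
            self._stock[sku] -= qty
            return True
        return False

class OrderService:
    # Depends on the interface, not the concrete class: swapping
    # InMemoryInventory for an HTTP client later is a local change.
    def __init__(self, inventory: Inventory) -> None:
        self._inventory = inventory

    def place_order(self, sku: str, qty: int) -> str:
        return "confirmed" if self._inventory.reserve(sku, qty) else "rejected"
```

No microservices, no message bus, no framework commitments: just a boundary cheap enough to keep now and honor later if the experiment grows into a product.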
Call this type of project whatever you want. What matters is refusing to delude ourselves into a mindset that treats short-cuts as free, warnings as formalities, and “we’ll productize later” as a plan. A reasonable combination of discipline and pragmatism allows you to be much more ambitious. To build experiments that can satisfy the short-term urges of PMs or sales, and are able to grow fast into real products without a rewrite, a fire drill, or an apology to the customer who trusted the demo.
Any thoughts? send me an email!