Platform Engineering: Bridging the Dev/Ops Gap

Felipe Hlibco

July 4, 2023

“You build it, you run it.” That was the DevOps promise. Engineers would own the full lifecycle of their services — from writing code to operating it in production. No more throwing things over the wall to an operations team. No more “works on my machine.” Full ownership, full accountability.

In theory, this was liberating. In practice? It buried engineers under a mountain of operational complexity they never asked for and weren’t trained to handle.

The Cognitive Overload Problem

I’ve managed engineering teams at companies ranging from three-person startups to Google. And at every scale above about ten engineers, I’ve watched the same thing happen: developers spend an increasing percentage of their time fighting infrastructure instead of building product.

Writing Kubernetes manifests. Debugging Terraform state drift. Configuring CI/CD pipelines. Managing secrets rotation. Setting up monitoring dashboards. Handling certificate renewals. These tasks are necessary — but they’re not why anyone became a software engineer.

The DevOps movement correctly identified that separating “people who write code” from “people who run code” creates organizational dysfunction. The handoff between dev and ops was a bottleneck; eliminating it improved velocity. But the solution — making every developer responsible for operations — created a new problem. Engineers now need deep expertise in Kubernetes, Terraform, Helm, cloud networking, observability tools, and half a dozen other infrastructure technologies on top of actually knowing how to write software.

At Google, this was managed through massive internal tooling investment. The infrastructure was abstracted to a level where most engineers never touched raw cloud primitives. You deployed your service through an internal platform, and the platform handled the operational complexity. It worked because Google had thousands of infrastructure engineers building and maintaining those abstractions.

Most companies aren’t Google. They adopted the “you build it, you run it” philosophy without the platform investment that makes it sustainable. The result? Engineers drowning in YAML.

Enter Platform Engineering

Platform engineering is, at its core, an acknowledgment that DevOps overcorrected. Yes, developers should own their services. No, they shouldn’t need to be infrastructure experts to deploy them.

The idea is straightforward: a dedicated platform team builds an Internal Developer Platform (IDP) that provides curated, opinionated paths for common workflows. Need to deploy a new service? The platform gives you a template with sensible defaults for scaling, monitoring, and security. Need a database? The platform provisions it with backup policies, access controls, and connection pooling already configured.

These “golden paths” aren’t constraints in the traditional sense. Developers can go off-path when they need to. But the default experience — the thing that happens when you don’t actively choose otherwise — is well-architected, secure, and operationally sound.
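To make the "defaults unless you choose otherwise" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `ServiceSpec` fields, the default values, and the `golden_path` helper are illustrative assumptions, not the API of any real platform.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ServiceSpec:
    """Hypothetical service template; field names and defaults are illustrative."""
    name: str
    min_replicas: int = 2         # assumed scaling default
    max_replicas: int = 10        # assumed scaling default
    metrics_enabled: bool = True  # monitoring is on unless a team opts out
    tls_required: bool = True     # security default


def golden_path(name: str, **overrides) -> ServiceSpec:
    """Return the default spec, changing only the fields a team explicitly overrides."""
    return replace(ServiceSpec(name=name), **overrides)


# On-path: the team supplies a name and inherits every default.
default_spec = golden_path("checkout")

# Off-path: the team overrides scaling but keeps the other defaults.
custom_spec = golden_path("batch-worker", min_replicas=1, max_replicas=50)
```

The design choice worth noticing is that going off-path is an explicit, visible act (a keyword argument), while doing nothing yields the well-architected configuration.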

Gartner named platform engineering a top strategic technology trend for 2023, predicting that 80% of large software engineering organizations will have established platform teams by 2026. That prediction feels about right to me. The organizations I’ve talked to in the past year almost universally describe some version of “our developers are spending too much time on infrastructure” as a top concern.

What Good Looks Like

I’ve seen platform engineering done well and done badly. The difference usually comes down to one thing: whether the platform team treats developers as customers or as users to be controlled.

Good platform teams build products. They interview their developers, understand their workflows, measure adoption, and iterate based on feedback. They provide self-service capabilities with guardrails rather than request-based workflows with gatekeepers. They build documentation that developers actually read, because it’s specific to their platform rather than generic Kubernetes docs.

Spotify’s Backstage is the most visible example of this approach. It’s an open-source developer portal that provides a unified interface for managing services, infrastructure, documentation, and CI/CD. Backstage doesn’t replace your infrastructure tools; it wraps them in a developer experience layer that makes the complexity navigable.

Others are building IDP tooling with similar philosophies: Humanitec and Port sell commercial products, and Kratix is an open-source framework from Syntasso. The tooling ecosystem is maturing quickly, which is a good sign. When there’s a market for developer platforms, it means enough organizations have the problem to sustain multiple solutions.

Bad platform teams build internal bureaucracies disguised as platforms. They create approval workflows where there should be self-service. They build rigid templates that don’t accommodate legitimate edge cases. They prioritize standardization over developer productivity — which usually means the platform is optimized for the platform team’s convenience, not the developer’s.

The difference isn’t subtle. At one company I consulted for, deploying a new service required filling out a 40-field Jira ticket and waiting three days for the platform team to provision resources. At another, it was a single CLI command that completed in under two minutes. Both called themselves “platform engineering teams.” Only one of them actually was.

The Organizational Design Question

Platform engineering isn’t just a technical practice; it’s an organizational design decision. Where does the platform team sit? Who do they report to? How is their success measured?

The most effective model I’ve seen puts the platform team in engineering (not IT operations) with a product management function. The platform team has a product manager who treats the IDP as an internal product — with roadmaps, sprint planning, and user research, all focused on developer experience.

This matters because the failure mode for platform teams is almost always the same: they become a bottleneck. If the platform team is the only group that can modify platform capabilities, every developer request becomes a ticket in the platform team’s backlog. Velocity drops. Frustration rises. Developers start building workarounds, and suddenly you have a shadow infrastructure that nobody maintains.

The antidote is treating the platform as a product with an API-first design. Other teams should be able to extend the platform, contribute modules, and customize their golden paths within defined boundaries. The platform team sets the standards and builds the core capabilities; other teams build on top.
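One way to picture "extend within defined boundaries" is a small registry that other teams can add to, while the platform team decides which kinds of extensions are allowed. The sketch below is an assumption-laden illustration: the `Platform` class, its method names, and the resource kinds are all invented for this example.

```python
from typing import Callable, Dict, Set

# A provisioner takes a service name and returns a description of what it created.
Provisioner = Callable[[str], dict]


class Platform:
    """Hypothetical extensible platform core (illustrative only)."""

    def __init__(self, allowed_kinds: Set[str]) -> None:
        # The platform team sets the boundaries.
        self._allowed_kinds = allowed_kinds
        self._provisioners: Dict[str, Provisioner] = {}

    def register(self, kind: str, provisioner: Provisioner) -> None:
        """Let any team contribute a module, as long as it is within bounds."""
        if kind not in self._allowed_kinds:
            raise ValueError(f"{kind!r} is outside the platform's defined boundaries")
        self._provisioners[kind] = provisioner

    def provision(self, kind: str, service: str) -> dict:
        """Self-service entry point: no ticket, just a call."""
        if kind not in self._provisioners:
            raise KeyError(f"no provisioner registered for {kind!r}")
        return self._provisioners[kind](service)


platform = Platform(allowed_kinds={"postgres", "queue"})

# A product team contributes its own queue module.
platform.register("queue", lambda service: {"kind": "queue", "name": f"{service}-events"})

result = platform.provision("queue", "checkout")
```

Because registration is open but bounded, the platform team stops being the only group that can add capabilities, which is the bottleneck described above.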

DevOps Is Not Dead

I want to be clear about something: platform engineering doesn’t replace DevOps. It evolves it.

The DevOps principles — automation, continuous delivery, shared ownership, feedback loops — are still correct. Platform engineering is a specific implementation pattern that makes those principles practical at scale. Without DevOps culture, a platform team becomes just another ops team with a fancier name.

The cultural shift that DevOps introduced (developers caring about operations, operations caring about development velocity) is the foundation that platform engineering builds on. You can’t skip straight to platform engineering without the cultural prerequisites. I’ve watched organizations try; they build a platform that developers ignore because the developers never internalized the “own your service” mentality. The platform becomes shelfware.

The progression makes sense when you think about it as maturation. Phase one: developers and operations are separate teams with a handoff process (traditional IT). Phase two: developers take ownership of operations, supported by tooling and culture change (DevOps). Phase three: a dedicated team abstracts operational complexity into a self-service platform, letting developers focus on product work while maintaining ownership (platform engineering).

Getting Started

If you’re thinking about building a platform team, here are a few practical recommendations based on what I’ve seen work.

Start with the highest-friction workflows. Survey your developers: what takes the longest? What do you dread? Where do you waste the most time? The answers cluster predictably around service provisioning, environment setup, CI/CD configuration, and observability. Pick the most painful one and build a golden path for it.

Staff the team with senior engineers who’ve experienced the pain firsthand. Platform engineering requires deep empathy for developer workflows. Engineers who’ve never been on-call at 3 AM debugging a Kubernetes networking issue will struggle to build good abstractions over that experience.

Measure adoption, not compliance. If your platform is genuinely useful, developers will choose it voluntarily. If you have to mandate its use, you’ve built the wrong thing. Track self-service adoption rates, time-to-deploy, and developer satisfaction surveys. These are your product metrics.
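As a rough illustration of what tracking these metrics could look like, here is a small Python sketch that computes a self-service adoption rate and a median time-to-deploy. The `Deployment` record, the function names, and the sample numbers are all made up for the example; real data would come from your own deploy logs.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical deploy record; fields are assumptions for this sketch."""
    team: str
    via_platform: bool        # True if the team used the golden path
    minutes_to_deploy: float


def adoption_rate(deployments: List[Deployment]) -> float:
    """Share of deployments that went through the platform voluntarily."""
    if not deployments:
        return 0.0
    on_path = sum(1 for d in deployments if d.via_platform)
    return on_path / len(deployments)


def median_minutes(deployments: List[Deployment], via_platform: bool) -> Optional[float]:
    """Median time-to-deploy for on-path or off-path deployments."""
    times = [d.minutes_to_deploy for d in deployments if d.via_platform == via_platform]
    return median(times) if times else None


# Invented sample data, not real measurements.
sample = [
    Deployment("checkout", True, 2.0),
    Deployment("search", True, 3.0),
    Deployment("billing", True, 4.0),
    Deployment("legacy-reports", False, 180.0),
]

rate = adoption_rate(sample)                   # 3 of 4 deployments used the platform
platform_median = median_minutes(sample, True)
off_path_median = median_minutes(sample, False)
```

Comparing the on-path and off-path medians side by side is one simple way to show developers (and leadership) whether the platform is actually saving time.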

And resist the urge to platform everything on day one. Build a narrow, excellent experience for one workflow before expanding. A platform that does one thing really well earns trust; a platform that does twenty things poorly earns resentment.

The dev/ops gap isn’t going away. But the answer isn’t asking every developer to become an infrastructure expert. It’s building abstractions that respect their time while maintaining the ownership culture that DevOps got right. Platform engineering is that answer — or at least the best version of it we have so far.