Scaling AppSec at Netflix with Cosmos Microservices

Felipe Hlibco

May 27, 2022

Netflix launches new microservices daily. Not weekly. Daily.

When your architecture is thousands of services deep and growing at that pace, you can’t secure it with manual reviews and quarterly audits. You just can’t. The only viable strategy is automation plus queryable data, and Netflix’s engineering team has written openly about how they’ve built exactly that.

Their Scaling AppSec blog post lays out an approach that I think any engineering leader running a microservices architecture should study — not to copy Netflix’s specific tooling, but to understand the organizational model behind it.

Cosmos: The Platform That Makes It Possible

Netflix’s Cosmos platform has been in production since 2019, and roughly 40 services were running on it by 2021. It combines microservices, asynchronous workflows, and serverless functions with built-in observability.

The important part for security? Cosmos gives the AppSec team a consistent surface to work with.

When every service is built on the same platform, security controls can be baked into the platform itself rather than bolted onto each service individually. Secure defaults, standard authentication patterns, consistent logging — these become platform features rather than per-team responsibilities.

This is the “Secure by Default” pillar in Netflix’s framework, and it’s the one I find most transferable to other organizations. If your platform handles auth, secret management, and input validation correctly out of the box, individual teams don’t need to reinvent those wheels (or get them wrong).
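To make the idea concrete, here is a minimal sketch of what secure defaults can look like at the configuration level. It is not Netflix's tooling: `ServiceConfig`, its field names, and `weakened_defaults` are hypothetical, chosen only to show that a team gets the safe settings without asking for them and that opting out is explicit and easy to detect.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical platform config: every security setting defaults to the safe value."""
    name: str
    require_auth: bool = True
    tls_only: bool = True
    validate_input: bool = True
    secrets_from_vault: bool = True


def weakened_defaults(config):
    """Return the names of security settings that were explicitly switched off."""
    return [
        f.name
        for f in fields(config)
        # Only fields whose default is True count as security settings here.
        if f.default is True and getattr(config, f.name) is False
    ]


# A team that specifies nothing gets the secure path.
standard = ServiceConfig(name="recommendations")

# A team that opts out has to say so, which makes the exception queryable.
legacy = ServiceConfig(name="legacy-importer", tls_only=False)
```

With this shape, a security team could list every service that has weakened a default by calling `weakened_defaults` across all configs, instead of reviewing each service by hand.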

Organization as Architecture

Netflix structures its AppSec function into two tracks: Partnerships and Engineering.

The Partnerships team provides strategic security guidance to product teams; they’re embedded advisors, not gatekeepers. The Engineering team builds the automation and tooling that scales security beyond what human reviewers can cover.

This split reflects Netflix’s broader “Context not Control” culture. They don’t have a security team that approves or blocks deployments. They have a security team that builds guardrails and provides context so that product teams can make good decisions autonomously.

The distinction matters: gatekeeping creates bottlenecks; context creates informed decision-makers.

Three Pillars Worth Stealing

The Netflix AppSec strategy rests on three investment areas:

Secure by Default means the platform handles the common security patterns so teams don’t have to. Authentication, authorization, encryption in transit — if your platform provides these correctly, you’ve eliminated entire categories of vulnerabilities before any application code is written.

Security Self-Service gives product teams the tools to answer their own security questions. Vulnerability dashboards, dependency scanners they can run on demand, security linting in CI. The goal is reducing the AppSec team’s role from “answerer of questions” to “builder of tools that answer questions.”

Vulnerability Scanning at Scale covers the software supply chain. Every dependency, every container image, every infrastructure component — scanned continuously, with results mapped to service ownership so the right team gets the alert.

At Netflix’s scale, this requires mapping relationships between infrastructure components to identify cascading risk from a single vulnerability. One vulnerable library can touch hundreds of services. You need to know which ones.
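As a rough illustration of that mapping, the sketch below walks a dependency graph backwards from a vulnerable component to everything that depends on it, directly or transitively, and then groups the affected services by owner. The inventory, the component names, and the team names are all made up for the example; a real system would pull this data from build metadata and an ownership registry rather than hard-coded dictionaries.

```python
from collections import defaultdict, deque

# Hypothetical inventory: each component maps to its direct dependencies.
DEPENDS_ON = {
    "playback-api": ["auth-lib", "json-parser"],
    "billing-service": ["auth-lib"],
    "auth-lib": ["crypto-lib"],
    "encoding-worker": ["image-codec"],
}

# Hypothetical ownership records for deployable services.
OWNERS = {
    "playback-api": "team-playback",
    "billing-service": "team-billing",
    "encoding-worker": "team-media",
}


def build_reverse_index(depends_on):
    """Invert the graph: dependency -> set of components that use it directly."""
    dependents = defaultdict(set)
    for component, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(component)
    return dependents


def affected_components(vulnerable, depends_on):
    """Breadth-first walk over dependents to find everything exposed to `vulnerable`."""
    dependents = build_reverse_index(depends_on)
    seen = set()
    queue = deque([vulnerable])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def alerts_by_owner(vulnerable, depends_on, owners):
    """Group affected components by owning team, skipping components with no owner record."""
    alerts = defaultdict(list)
    for component in sorted(affected_components(vulnerable, depends_on)):
        owner = owners.get(component)
        if owner is not None:
            alerts[owner].append(component)
    return dict(alerts)
```

Calling `alerts_by_owner("crypto-lib", DEPENDS_ON, OWNERS)` here would route one alert each to the playback and billing teams, even though neither service depends on `crypto-lib` directly. The transitive walk is the point: a direct-dependency lookup alone would miss both.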

The Takeaway for Non-Netflix Teams

You don’t need Netflix’s scale to apply Netflix’s thinking.

The core insight is simple: security that depends on human reviewers catching things doesn’t scale. Security that’s automated, queryable, and built into the platform does.

Start with your deployment pipeline. What security checks run automatically? What requires a human to remember to do it?
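One lightweight way to run that audit is to write the checks down as data and ask which ones still depend on a person. The sketch below assumes a hand-maintained list; the `PipelineCheck` type and the check names are illustrative, not taken from any particular CI system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineCheck:
    """Hypothetical record of one security check in a deployment pipeline."""
    name: str
    automated: bool


def manual_checks(checks):
    """Names of checks that still rely on a human remembering to run them."""
    return [check.name for check in checks if not check.automated]


def automation_ratio(checks):
    """Fraction of checks that run automatically; 0.0 when no checks are defined."""
    if not checks:
        return 0.0
    return sum(check.automated for check in checks) / len(checks)


# Illustrative inventory for a single pipeline.
CHECKS = [
    PipelineCheck("dependency scan", automated=True),
    PipelineCheck("container image scan", automated=True),
    PipelineCheck("secrets detection", automated=True),
    PipelineCheck("threat model review", automated=False),
]
```

The output of `manual_checks(CHECKS)` is the to-do list: each entry is either a candidate for automation or a step that needs an explicit owner.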

Every manual step is a step that will eventually be skipped. Make the secure path the default path, and most of your problems solve themselves.