Designing for Portability with Cloud-Native Abstractions
I’ve had the “should we go multi-cloud” conversation at every company I’ve worked at. The answer is always complicated. Pure multi-cloud is expensive and operationally brutal. Full single-cloud commitment is efficient right up until pricing changes, a region goes down, or an acquisition brings a different cloud into the picture.
The pragmatic middle ground — and the one I keep coming back to — is designing for portability without necessarily deploying to multiple clouds.
There’s a difference. Portability means your application can move. Multi-cloud means it does move. The first is an architectural decision; the second is an operational one. You want the first even if you never need the second.
Where Lock-In Actually Bites #
Not all cloud dependencies are equal. Compute portability is largely solved. If your application runs in containers, it runs on any cloud that supports Kubernetes. That’s all of them.
The hard part is stateful services. Databases, object storage, message queues, identity providers — these are where cloud-specific APIs creep in and lock-in accumulates.
Consider a Node.js application that uses DynamoDB for state, SQS for job queues, S3 for file storage, and Cognito for auth. Each of those is an AWS-specific API. Moving to GCP means rewriting the data layer for Firestore or Bigtable, swapping SQS for Pub/Sub, replacing S3 calls with Cloud Storage, and migrating auth to Firebase or Identity Platform.
That’s not a weekend project. That’s a quarter of engineering work, minimum.
The Adapter Pattern for Cloud Services #
The solution isn’t to avoid cloud services — that defeats the purpose of being on a cloud. The solution is to wrap them.
```typescript
// Define your interface
interface ObjectStore {
  put(key: string, data: Buffer, metadata?: Record<string, string>): Promise<void>;
  get(key: string): Promise<{ data: Buffer; metadata: Record<string, string> }>;
  delete(key: string): Promise<void>;
  list(prefix: string): AsyncIterable<string>;
}

// Implement per provider
class S3ObjectStore implements ObjectStore { /* ... */ }
class GCSObjectStore implements ObjectStore { /* ... */ }
class LocalFSObjectStore implements ObjectStore { /* ... for dev/test */ }
```

This isn’t a new pattern. It’s the dependency inversion principle applied to infrastructure. But I’m surprised how often teams skip it — especially early-stage teams that argue they’ll “deal with it later.”
Later never comes. Or rather, it comes when you’re under pressure to migrate and don’t have time to refactor the entire data access layer.
The overhead of writing an adapter is small. An ObjectStore interface with four methods takes an afternoon. The S3 implementation is thin; most of the logic is in the aws-sdk calls. A GCS implementation, if you ever need it, follows the same structure. And the local filesystem implementation gives you fast integration tests without cloud credentials.
Infrastructure as Code: The Unsung Portability Tool #
Terraform has become the default IaC tool for a reason. It provides a consistent provisioning language across AWS, Azure, and GCP. Your infrastructure definitions are declarative, version-controlled, and (to a degree) portable.
Pulumi takes this further by letting you define infrastructure in TypeScript, Python, or Go — real programming languages with loops, conditionals, and abstractions. For teams already deep in TypeScript, Pulumi’s model feels more natural than HCL.
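A sketch of what that looks like in practice (assumes the `@pulumi/aws` package and runs under the Pulumi engine, not as a standalone script; resource names are illustrative):

```typescript
import * as aws from "@pulumi/aws";

// Loops and conditionals that are awkward in HCL are just TypeScript here.
const environments = ["staging", "production"];

const buckets = environments.map(
  (env) =>
    new aws.s3.Bucket(`uploads-${env}`, {
      versioning: { enabled: env === "production" },
      tags: { environment: env },
    }),
);

export const bucketNames = buckets.map((b) => b.bucket);
```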
The portability benefit of IaC isn’t that you can lift-and-shift your Terraform configs between clouds. You can’t, really; the resource types are cloud-specific. The benefit is that your provisioning process is codified and repeatable. When you do need to stand up equivalent infrastructure on a different provider, you have a complete specification of what needs to exist rather than a collection of manual console configurations.
Kubernetes as the Portability Layer #
Kubernetes has graduated within CNCF, and its value proposition for portability is straightforward: if your workloads run on Kubernetes, they run on EKS, GKE, or AKS with minimal changes. The control plane differences are mostly operational (networking, load balancing, storage classes), not application-level.
Helm charts provide a packaging layer that further abstracts provider differences. A Helm chart that deploys your application to GKE should work on EKS with a different values.yaml. Should. In practice, you’ll encounter differences in ingress controllers, storage provisioners, and monitoring integrations that require per-cloud overrides. But the core application deployment stays consistent.
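Those per-cloud overrides typically live in separate values files layered over the base chart, e.g. `helm upgrade --install myapp ./chart -f values.yaml -f values-eks.yaml`. A hypothetical EKS override might look like this (every key below depends on your chart's own values schema):

```yaml
# values-eks.yaml — layered over the base values for the GKE deployment
ingress:
  className: alb                  # base chart might use "gce" on GKE
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
persistence:
  storageClass: gp3               # base chart might use "standard-rwo" on GKE
```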
The more interesting CNCF project for portability right now is Dapr, which hit 1.0 in February 2021. Dapr provides building blocks — state management, pub/sub, service invocation, secrets — as sidecar APIs. Your application talks to Dapr over HTTP or gRPC; Dapr talks to the underlying infrastructure (Redis, Kafka, DynamoDB, whatever).
```yaml
# Dapr component: swap implementations without code changes
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: statestore
spec:
  type: state.redis   # Change to state.dynamodb, state.cosmosdb, etc.
  metadata:
  - name: redisHost
    value: "redis:6379"
```

It’s the adapter pattern at the infrastructure level. Your code calls dapr.saveState() and doesn’t know or care whether the backend is Redis, DynamoDB, or Cosmos DB.
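From the application side, "saving state" is just a POST of `[{ key, value }]` pairs to the sidecar's local HTTP endpoint, `/v1.0/state/<component-name>`. The wrapper function below is my own sketch, not part of any Dapr SDK; 3500 is the sidecar's default HTTP port:

```typescript
const DAPR_HTTP_PORT = 3500; // Dapr sidecar default HTTP port

// Hypothetical helper: builds the request Dapr's state API expects.
function buildStateSave(store: string, key: string, value: unknown) {
  return {
    url: `http://localhost:${DAPR_HTTP_PORT}/v1.0/state/${store}`,
    method: "POST" as const,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify([{ key, value }]),
  };
}

// With a sidecar running you would then do, e.g.:
//   const req = buildStateSave("statestore", "user:42", { plan: "pro" });
//   await fetch(req.url, req); // backend could be Redis, DynamoDB, or Cosmos DB
```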
The Honest Tradeoffs #
I don’t want to oversell this. Every abstraction layer has costs:
Performance overhead. Going through Dapr’s sidecar adds a network hop. Wrapping DynamoDB in a generic ObjectStore means you can’t use DynamoDB-specific features like conditional writes or query expressions (or you add them to your interface, which erodes portability).
Lowest common denominator. A truly portable interface only supports features that exist across all providers. DynamoDB’s single-digit millisecond latency at scale, Spanner’s global consistency, Cosmos DB’s multi-model support — these are provider-specific strengths that a generic interface flattens away.
Maintenance cost. Every adapter needs to be tested, updated when provider SDKs change, and monitored for behavioral differences. Two implementations is manageable; five is a burden.
The question I ask is: what’s the probability that we need to move in the next three years, and what’s the cost of moving without these abstractions? If you’re a startup on AWS with no acquisition prospects, the calculus is different from an enterprise running across three clouds with regulatory requirements for geographic diversity.
My Current Approach #
At TaskRabbit, we’re primarily on AWS. We don’t abstract everything — our DynamoDB usage takes advantage of provider-specific features where it matters. But for services where portability is plausible (storage, messaging, secrets management), we maintain clean interfaces.
The rule I apply: if the cloud service is a commodity (object storage, queues, key-value state), wrap it. If it’s differentiated (a specific database engine, a specialized ML service), use it directly and accept the coupling.
Portability isn’t about being cloud-agnostic. It’s about making deliberate choices about where you’re willing to be locked in, and keeping the rest flexible.