Infrastructure
10 Mar 2026
Netdata: Next-Gen Server Monitoring for Agent Sprawl
If you run AI agents in production, your monitoring situation probably sucks. I don’t say that to be dramatic. Traditional APM tools grew up watching request-response web services: predictable traffic, well-defined endpoints, resources that scale the way textbooks say they should. Agent workloads break every one of those assumptions.
Agents spin up, do something weird for three seconds, and vanish. They consume GPU, CPU, and memory in bursts that look like anomalies to anything trained on web server baselines. They chain API calls in patterns that shift based on input. And when something breaks? Good luck tracing the failure through four nested agent invocations and a vector database query that returned garbage.
8 Oct 2025
Right-Sizing AI for the Edge: Power and Security Focus
There’s a default assumption in the AI industry that bigger wins. More parameters, larger context windows, heavier compute. For many tasks, that holds. Complex reasoning, multi-step planning, fine-grained code generation: those benefit from frontier-scale models.
But a huge chunk of real-world inference doesn’t need any of that.
Classifying a support ticket? Detecting anomalous sensor readings? Running intent recognition on a phone? Shipping 405 billion parameters to answer “is this a cat?” isn’t engineering. It’s waste.
8 May 2025
Node.js 24: The Krypton LTS Cycle Begins
Node.js 24 landed on May 6th. Two days in, I’ve already migrated one personal project and started testing at work. The release is dense — V8 13.6, npm 11, Undici 7, a simplified permission model — but the part that interests me most isn’t any single feature. It’s the pattern these features reveal.
Node is maturing. Not in the “it’s boring now” sense. In the “it takes infrastructure seriously” sense. And that’s exactly what it needs to do to stay relevant.
28 Sep 2021
The Shape of the Edge in Modern Data Centers
There’s a Gartner prediction floating around that 75% of enterprise data will be created and processed outside traditional data centers by 2025. Up from roughly 10% a few years ago.
I don’t know if 75% is exactly right. Gartner predictions have a habit of being directionally correct but numerically ambitious — like that friend who’s always “about to start a startup.” What I do believe is that the center of gravity for compute is shifting, and the architecture of that shift matters more than whatever marketing label gets slapped on it.
6 Apr 2021
Node.js 16: Timers Promises and Apple Silicon Support
Node.js 16 drops April 20th. I’ve been poking around the dev branch and the pre-release notes, and there’s enough here to warrant a proper look. Not a paradigm shift — more like a collection of practical fixes for real friction points.
Here’s what stands out.
Timers Promises Goes Stable
This is the one I’m most excited about. The Timers Promises API has been experimental since Node.js 15, and it graduates to stable in v16.
16 Mar 2021
US Carrier Pivot: The End of CCMI and Google's RCS
Back in October 2019, AT&T, Sprint, T-Mobile, and Verizon made a big joint announcement. They were forming the Cross Carrier Messaging Initiative — CCMI — to bring RCS messaging to every Android user in the US. A unified front. The carriers would handle it together.
It sounded great on paper. Four carriers, one initiative, interoperable RCS for everyone. The pitch was straightforward: replace SMS with something modern, and do it as a consortium so no single carrier has to go it alone.
31 Jul 2020
Moving Infrastructure Inference to Hardware Accelerators
Last quarter we moved a couple of our ML inference workloads off general-purpose CPUs and onto NVIDIA T4 GPUs. The performance gains were immediate and dramatic. The operational complexity that came with them was… also immediate.
At TaskRabbit, we use ML models for ranking and recommendation—matching Taskers to jobs, surfacing relevant categories, scoring urgency. These aren’t massive models by research standards, but they run on every request. Latency matters. Cost matters. And for a while, our CPU-based inference was both too slow and too expensive.
9 Jun 2020
Interoperability Milestones in the RCS Ecosystem
RCS has a credibility problem. For years, the pitch was “SMS but better” — rich media, read receipts, typing indicators, group chat. The features were real, but deployment was a mess. Carriers built RCS in silos, which meant your “upgraded” messages only worked if both people happened to be on the same carrier.
That’s finally changing. The first half of 2020 has produced genuine interoperability milestones, and honestly? They’re worth paying attention to.