Systems | Development | Analytics | API | Testing

Trace without traces

A customer emailed on a Tuesday: checkout hung for ten seconds. I opened our tracing tool, punched in the time window, and got nothing. The trace was sampled out. We keep 1% of traces, like most shops with real traffic do. The one request that actually mattered was in the 99% we threw away. I spent twenty minutes admiring our observability stack before admitting it couldn’t answer a first-grader’s question: what happened to this person? Here’s what I know now.

AI Agents Write Broken Code 49% of the Time #speedscale #AI #Coding #Tech #DevOps

AI agents write broken code nearly 50% of the time. By adding a traffic-based deterministic evaluation, Speedscale boosted unsupervised bug-fixing quality from 51% to 77% in just 5 minutes. This helped slash token costs and eliminate rework without human intervention. Learn more: speedscale.com.

The Three Pillars Were Built for Humans

It was 2am and I was paying for the privilege. Something was on fire in production, and I’d done the modern thing: I pointed an AI agent at it. It ingested the dashboards. It read the logs. It walked the traces. Then it handed me back a beautifully formatted paragraph that said, in effect, “latency is elevated on the checkout path.” I knew that. The page told me that.

Which Bugs AI Agents Fix Better With Traffic

In the first experiment, I wanted a baseline: if an AI coding agent gets the same production signal a human would get, can it fix bugs in a codebase it has never seen? Yes, but only when I gave it better context. With only an alert, the agent passed 51% of the runtime tests. When I added captured traffic, the actual request and response for the failing call, it climbed to 77%. This post is the second pass.
Sponsored Post

The Kubeshark Workflow That Doesn't Stop at the Dashboard

The Observability Gap shows up the moment you try to reproduce a production bug locally. Your traces tell you a request was slow. Your logs tell you which line printed. Neither tells you what was actually on the wire: the headers, the JSON body, the surprise field your client started sending last Tuesday. Until now, closing that gap meant SSHing to a node, attaching a debugger, or shipping a sidecar through change review.