Systems | Development | Analytics | API | Testing

Sponsored Post

The Kubeshark Workflow That Doesn't Stop at the Dashboard

The Observability Gap shows up the moment you try to reproduce a production bug locally. Your traces tell you a request was slow. Your logs tell you which line printed. Neither tells you what was actually on the wire: the headers, the JSON body, the surprise field your client started sending last Tuesday. Until now, closing that gap meant SSHing to a node, attaching a debugger, or shipping a sidecar through change review.

Beware of PII in Testing Data: The Security Iceberg and Where PII Actually Hides

If you run a platform tools or security team, you have likely heard this request from developers: “I just need a copy of the production database for staging so I can run realistic load and integration tests.” It is a completely reasonable request. Production traffic and data contain the actual request shapes, real-world value distributions, long-tail anomalies, and timing patterns that make tests useful.

Build WireMock mappings fast from real traffic

I’m a big fan of service mocking. I’ve been working in and around software for about 25 years, and one thing never changes: when you sit down to work on your code, you almost never have everything available. The database, the third-party API, the message queue, the service two teams over. Something’s missing. So you’ve got to stub it out or mock it out and keep moving.

Five things your logs will never tell you

A customer escalation hit my queue when I was on the customer smoke jumpers team at an observability vendor. My team was the group that parachutes into Fortune 500 accounts one bad week from churning and usually after a big customer outage. The customer had filed a billing dispute three weeks earlier and their on-call engineers were stuck. They had our full stack: logs, metrics, traces, end-to-end instrumentation, every product we sold and some we didn’t. They could see the request came in.

Capture once, test forever

We’ve gotten used to understanding our applications through signals, summaries, and traces. Tiny little bits of information about how the app really works. Not because that’s the best way to do it, but because it’s been too hard to get the real thing. The real information exists. It’s on the network. How people called your app and what your code did. What other systems it called, the database queries it made, and the result sets that came back.

Fixing 403 auth errors when you replay traffic

Trigger warning: this one is about Java, authentication, and Docker Compose files. If that is not your thing, I am sorry, but they are part of life and they are honestly not that hard to work with. Everything here is open source on our GitHub repo, so you can follow along. Recording an authenticated Java flow, replaying it, hitting the dreaded 403, and fixing it with a proxymock recommendation.

We won't train on your data is not a security architecture

Every enterprise contract I’ve signed in the last two years has the same clause. “Vendor will not use Customer Data to train machine learning models.” Sometimes it’s a paragraph. Sometimes it’s a whole section. The language varies but the intent is identical: don’t feed our production data into your AI. I get it. I sign the same clause as a vendor. But here’s what’s been bothering me: that clause is a promise, not an architecture.

Validate Spring Boot Upgrades with Traffic Replay

Spring Boot version upgrades—whether moving from 2.x to 3.x, 3.x to 4.x, or even minor bumps like 3.2.5 to 3.3.1—regularly introduce subtle, breaking changes that unit and integration tests miss. JSON serialization shifts, autoconfiguration reordering, and transitive dependency conflicts can silently alter your API contract.

The AI Code Explosion: Why Your Mocking Strategy is Breaking Down

The rise of AI-assisted coding has transformed how software is built. With tools generating entire features in seconds, the bottleneck is no longer writing code—it’s verifying it. Because AI can generate boilerplate and handle API integrations instantly, more service changes are being pushed into authentication logic, API calls, and configurations. Teams desperately need a way to verify these changes before merging, especially when the code touches external dependencies.

Logs told me something broke. Traffic showed me what.

Here’s a problem I run into constantly: something breaks in production, I can see the 500 errors in my logs, but I can’t reproduce it locally. The trace shows me the dependency graph but not the actual request that failed. This is especially painful in microservices. I was looking at a CNCF example the other day (a simple demo app, like 4 pods) and it already had so many cross-service dependencies that understanding what broke required looking at the whole system at once.