
Conversation tree branching in @ably/ai-transport

Picture a developer pair-programming with an AI assistant. The model returns a function that almost works. The developer asks it to try again. The second attempt is worse. They want the first one back. In a linear chat, that history is gone, or it's a third bubble in the thread that pollutes context for every future turn.

The model is fine. The session is broken.

Take any AI agent demo from the last six months. It works. Now ship it to real users on real networks, real devices, real attention spans. A meaningful share of those users will never finish their first conversation cleanly. Not because the model gave a bad answer. Because the connection dropped, the tab refreshed, the phone took over from the laptop, or the spinner kept spinning forever.

Why we built a dedicated SDK for realtime AI streaming

If you've built a conversational AI feature, you know the pattern. Client sends a message, backend calls a model, response streams back over HTTP. SSE mostly, or WebSockets if you need bidirectional. For a single user on a single device, it works well. The trouble is that the best AI products right now have moved well past that.

Why production AI needs a session layer, not just a stream

I spoke at AI Engineer Europe last week, and came away with a clearer picture of where the industry actually is right now. My talk was about why AI user experience breaks at the transport layer. But the bigger takeaway wasn't from my own session. It was from watching what the rest of the room was building, and what problems they were running into.

The Durable Sessions stack is forming

By Matt O'Riordan, CEO and Co-Founder

Across AI infrastructure right now, one word is doing a lot of work: durable. It is attached to execution. To agents. To workflows. To sessions. To streams. To transports. To memory. Every few weeks, another product ships with "durable" in the name. This is not branding noise. The underlying observation is the same in every case. AI systems are long-lived. They can fail at any layer. They need infrastructure that assumes failure rather than hopes against it.

Ably Python SDK v3: realtime for Python, built for AI

Python dominates AI development. It's where teams build their agents, orchestration layers, and the backend systems that turn LLM calls into products people actually use. Over the past year, those systems have matured rapidly. What used to live in notebooks and prototypes is now running in production, serving real users with real expectations around reliability and performance. That maturity brings infrastructure requirements. Tokens need to stream in order.

Multi-device AI session continuity: how cross-device conversation sync works

You start a research task on your laptop, the network drops during a meeting, and when you open your phone to continue, the conversation is gone. You re-prompt, get partially duplicated results, and lose 30 minutes of work. The delivery layer dropped it. That's one of the most consistent problems teams hit when building AI applications. It's particularly acute in customer support, where a session belongs to the conversation, not to any single device, connection, or participant.

Does your AI stack need a session layer? A maturity framework for teams building AI agents

Most teams building AI agents start with HTTP streaming. It's the right starting point. Every major agent framework defaults to it, it gets tokens on screen fast, and for a single-user prompt-response interaction it works well. The question is when it stops being enough, and how to recognise that before it turns into user experience problems, engineering waste, and technical debt that constrains what your product can do.