WebSocket reconnection in AI agents: transport recovery vs. session recovery
Your AI agent is mid-task, waiting on the result of a search tool call it made 30 seconds ago. The user is watching a spinner. Then a network blip drops the connection. The application reconnects in under a second, fast enough that most monitoring wouldn't flag it. But the tool call result that came back during the gap is gone, and so are the 200 tokens the agent generated before the silence began. The reconnect succeeded - but the session didn't.