How to Build a Multi-LLM AI Agent with Kong AI Gateway and LangGraph

In the last two parts of this series, we discussed How to Strengthen a ReAct AI Agent with Kong AI Gateway and How to Build a Single-LLM AI Agent with Kong AI Gateway and LangGraph. In this third and final part, we're going to evolve the AI Agent with multiple LLMs and Semantic Routing policies across them. In this blog post, we'll also explore new capabilities introduced in Kong AI Gateway 3.11 that support other GenAI infrastructures.

What is an AI Gateway?

Ever wondered what an AI Gateway is? Think of it as an airport for your AI traffic! We break down how an AI Gateway can act as a central access point for different AI models, provide security for your LLM prompts, route traffic to the best model for the job, and save on AI costs with features like response caching. Learn the basics of this essential tool that helps manage AI and LLM costs, security, and efficiency.

Kong AI Gateway: Prompt Compression

High token consumption from long prompts can degrade model performance and lead to expensive, inefficient LLM operations. This video demonstrates how to solve that problem using Kong's AI Gateway. AI Prompt Compressor Plugin: See how this plugin intelligently compresses incoming prompts before they hit the model. It summarizes context, removes redundant information, and trims excess tokens, all while preserving the original meaning. This could lead to significant cost savings and improved performance.

Custom API Logic with Server-Side Scripting

Server-side scripting allows developers to create APIs that respond dynamically to user input, security needs, and business logic. Unlike static APIs, server-side scripts interact with databases and external systems to deliver personalized, secure, and efficient responses. Key highlights: Enhanced API Security: Scripts run on servers, reducing risks like code injection and securing sensitive data. Dynamic Customization: Adjust responses based on user roles, input, or workflows. Improved Efficiency.
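As a rough illustration of the "Dynamic Customization" point, the sketch below trims a response according to the caller's role. The record, the role names, and the `build_response` helper are all hypothetical and not taken from any particular framework; a real API would take the role from an authenticated session, not a plain argument.

```python
# Hypothetical record and role names, for illustration only.
ACCOUNT = {"id": 42, "name": "Ada", "email": "ada@example.com", "balance": 1250.0}

# Fields each role is allowed to see (an assumed policy, not a standard).
VISIBLE_FIELDS = {
    "admin": {"id", "name", "email", "balance"},
    "support": {"id", "name", "email"},
    "guest": {"id", "name"},
}


def build_response(record, role):
    """Return only the fields of `record` that `role` may see.

    Unknown roles get an empty response rather than the full record,
    so a typo in a role name fails closed.
    """
    allowed = VISIBLE_FIELDS.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}


if __name__ == "__main__":
    print(build_response(ACCOUNT, "support"))
```

Because the filtering runs on the server, fields such as `balance` never leave it for roles that should not see them.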

Quality Assurance Testing: Everything You Need To Know

In a technology-dominated world, producing top-notch software isn’t merely a competitive edge – it’s a must. Whether you are creating a mobile application, a website, or a large-scale enterprise system, consumers expect your software to be secure, responsive, and flawless. That’s where Quality Assurance (QA) Testing comes in. It doesn’t just make your software work – it makes sure your software works well, does what users need it to do, and meets world-class standards of excellence.

Traffic Replay: Production Without Production Risk

The software and product life cycle is fraught with pitfalls and tradeoffs. While testing applications under production-like load is critical to ensuring the reliability, performance, and security of your data storage and software services, you need to do this testing without actually affecting the production data and systems. In essence, you have to pull off the impossible – be as close to production as you can without actually being production.

How to Filter Events in REST APIs

Filtering events in REST APIs lets you request only the data you need, improving efficiency, reducing server load, and speeding up responses. The process involves using query parameters and operators to define conditions for retrieving specific records, like filtering by date, category, or status. Here's the core idea: Query Parameters: Add key-value pairs to the URL (e.g., ?date=2022-03-01) to filter events by specific fields.
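The query-parameter idea can be sketched on the server side as follows. This is a minimal example under stated assumptions: the `EVENTS` data, the `/events` path, and the `filter_events` helper are hypothetical, and only exact-match filtering is shown (no comparison operators such as ranges).

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical sample data, for illustration only.
EVENTS = [
    {"id": 1, "date": "2022-03-01", "category": "billing", "status": "open"},
    {"id": 2, "date": "2022-03-01", "category": "auth", "status": "closed"},
    {"id": 3, "date": "2022-03-02", "category": "billing", "status": "open"},
]


def filter_events(url, events):
    """Return the events whose fields equal every query parameter in `url`."""
    params = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; keep the last one given.
    criteria = {key: values[-1] for key, values in params.items()}
    return [
        event
        for event in events
        # A parameter naming a field the event lacks never matches.
        if all(key in event and str(event[key]) == value
               for key, value in criteria.items())
    ]


if __name__ == "__main__":
    matches = filter_events("/events?date=2022-03-01&status=open", EVENTS)
    print([event["id"] for event in matches])  # prints [1]
```

Each additional key-value pair narrows the result, so `?date=2022-03-01` alone returns events 1 and 2, while adding `&status=open` leaves only event 1.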

Stop guessing! Speedscale's Notebook finds anything in your traffic.

Debugging complex microservices just got an upgrade. This video demonstrates Speedscale's innovative Notebook capability, allowing you to perform advanced substring searches and filter production traffic based on deeply nested JSON fields within request and response bodies. Unlike traditional observability tools that only record telemetry, Speedscale's always-on recorder captures full traffic payloads, empowering you to precisely pinpoint issues, identify specific user calls, or validate API versions. Streamline your troubleshooting, enhance your testing, and gain unprecedented visibility into your production environment.
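To make the idea of filtering on deeply nested JSON fields concrete, here is a generic sketch in plain Python. It does not use or represent Speedscale's Notebook API; the record layout, the dotted-path syntax, and the `get_nested` and `search_bodies` helpers are assumptions made for illustration.

```python
import json


def get_nested(obj, path):
    """Follow a dotted path such as 'meta.api.version'.

    Returns None if any step of the path is missing.
    """
    current = obj
    for key in path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return None
    return current


def search_bodies(records, path, needle):
    """Return records whose response body has a string at `path` containing `needle`."""
    matches = []
    for record in records:
        body = json.loads(record["response_body"])
        value = get_nested(body, path)
        if isinstance(value, str) and needle in value:
            matches.append(record)
    return matches


# Hypothetical captured traffic, for illustration only.
RECORDS = [
    {"id": "a", "response_body": '{"meta": {"api": {"version": "v2.1"}}}'},
    {"id": "b", "response_body": '{"meta": {"api": {"version": "v1.9"}}}'},
    {"id": "c", "response_body": '{"meta": {}}'},
]

if __name__ == "__main__":
    hits = search_bodies(RECORDS, "meta.api.version", "v2")
    print([record["id"] for record in hits])  # prints ['a']
```

A search like this only works when full payloads are captured, which is the distinction the teaser draws against tools that record telemetry alone.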