Enterprise AI Infrastructure Security Series - 6) Application Gateway

ClearML

Apr 2, 2026

In this video, we pivot from securing your development environment to protecting your production model serving with ClearML's AI Application Gateway. We walk through how to establish a secure front door for your models, manage access with token-based authentication, and enforce governance with stable routes and RBAC to secure your deployed API endpoints.

What we cover:

The AI Application Gateway — ClearML's secure front door for production services, handling routing, SSL, and authentication.
Token-based access: generating, managing, and expiring tokens, and why avoiding "forever" tokens limits your security exposure.
Static routes and RBAC — administrator-defined, persistent endpoints that decouple external URLs from specific model instances and control who can access each endpoint by group membership.
The five layers of gateway functionality: routing, SSL, authentication, RBAC, and full observability.
Why the gateway eliminates the need for manual networking, load balancers, SSL certificates, and writing Kubernetes YAML for authentication.
Ephemeral Routes vs. Static Routes — choosing the right route type for dev/testing versus production serving.
Load Balancing with session affinity (sticky sessions) for consistent, performant serving at scale, particularly for LLM inference.
Security Flow: the differences between securing endpoints for internal ClearML users versus external partners, and the critical need for token hygiene.
Demo — creating an authenticated Internal Static route, deploying a model using vLLM, and configuring token-based access.
Observability via the Model Endpoints dashboard — seeing real-time request counts, uptime, and which tokens are hitting your service.
How Identity, RBAC, Vaults, Service Accounts, and Compute Governance (the first five parts of this series) converge on the final production endpoint.
What's next: a sneak peek at audit trails and monitoring (Video 7).

Previous videos in this series:

Part 1 — Introduction to the Six Layers of Enterprise Security: https://youtu.be/RXz8FuzzwI4
Part 2 — Identity Provider Setup, Group Sync & Access Rules: https://youtu.be/wnVbOxWbzWM
Part 3 — Configuration Governance with Administrator Vaults: https://youtu.be/vse_015TaWM
Part 4 — Service Accounts & Automation Security: https://youtu.be/aPyVLSOp_4I
Part 5 — Compute Governance Layer — resource pools, resource profiles, and resource policies: https://youtu.be/Wa0ULIyychs

This is Part 6 of our series on enterprise AI infrastructure security. Everything we've covered so far protects your development environment; now we focus on production. Whether you're an IT director managing a public API attack surface, a platform engineer designing deployment workflows, or a team lead responsible for securing consumer-facing model access — this walkthrough covers the practical, hands-on configuration from start to finish.

🔗 Links & Resources
ClearML Docs — AI Application Gateway & Static Routes: https://clear.ml/docs/latest/docs/webapp/settings/webapp_settings_app_gw/#static-routes
ClearML Docs — Access Token Management: https://clear.ml/docs/latest/docs/webapp/settings/webapp_settings_app_gw/#access-tokens