Enterprise AI Infrastructure Security Series - 6) Application Gateway
In this video, we pivot from securing your development environment to protecting your production model serving with ClearML's AI Application Gateway. We walk through how to establish a secure front door for your models, manage access with token-based authentication, and enforce governance with stable routes and RBAC to secure your deployed API endpoints.
What we cover:
- The AI Application Gateway — ClearML's secure front door for production services, handling routing, SSL, and authentication.
- Token-based access — generating, managing, and expiring tokens, and why you should never create a "forever" token to limit security exposure.
- Static routes and RBAC — administrator-defined, persistent endpoints that decouple external URLs from specific model instances and control who can access each endpoint by group membership.
- The five layers of gateway functionality: routing, SSL, authentication, RBAC, and full observability.
- Why the gateway eliminates the need for manual networking, load balancers, SSL certificates, and writing Kubernetes YAML for authentication.
- Ephemeral Routes vs. Static Routes — choosing the right route type for dev/testing versus production serving.
- Load Balancing with session affinity (sticky sessions) for consistent, performant serving at scale, particularly for LLM inference.
- Security Flow: the differences between securing endpoints for internal ClearML users versus external partners, and the critical need for token hygiene.
- Demo — creating an authenticated Internal Static route, deploying a model using vLLM, and configuring token-based access.
- Observability via the Model Endpoints dashboard — seeing real-time request counts, uptime, and which tokens are hitting your service.
- How Identity, RBAC, Vaults, Service Accounts, and Compute Governance (the first five parts of this series) converge on the final production endpoint.
- What's next: a sneak peek into audit trails and monitoring (Video 7).
Previous videos in this series:
- Part 1 — Introduction to the Six Layers of Enterprise Security: https://youtu.be/RXz8FuzzwI4
- Part 2 — Identity Provider Setup, Group Sync & Access Rules: https://youtu.be/wnVbOxWbzWM
- Part 3 — Configuration Governance with Administrator Vaults: https://youtu.be/vse_015TaWM
- Part 4 — Service Accounts & Automation Security: https://youtu.be/aPyVLSOp_4I
- Part 5 — Compute Governance Layer — resource pools, resource profiles, and resource policies: https://youtu.be/Wa0ULIyychs
This is Part 6 of our series on enterprise AI infrastructure security. Everything we've covered so far protects your development environment; now we focus on production. Whether you're an IT director managing a public API attack surface, a platform engineer designing deployment workflows, or a team lead responsible for securing consumer-facing model access — this walkthrough covers the practical, hands-on configuration from start to finish.
🔗Links & Resources
ClearML Docs — AI Application Gateway & Static Routes: https://clear.ml/docs/latest/docs/webapp/settings/webapp_settings_app_gw/#static-routes
ClearML Docs — Access Token Management: https://clear.ml/docs/latest/docs/webapp/settings/webapp_settings_app_gw/#access-tokens