Systems | Development | Analytics | API | Testing

Technology

Testing generative AI systems and red teaming: An introductory guide

The topic of testing AI and ensuring its responsibility, safety, and security has never been more urgent. Controversy and incidents of AI misuse have increased 26-fold since 2021, highlighting growing concerns. As users quickly find out, AI tools are not infallible; they can make mistakes, display overconfidence, and lack critical questioning. The reality of the market is that AI is prone to error. This is exactly why testing AI is crucial. But how do we test AI?

Introduction to Gemini in BigQuery

Data practitioners spend much of their time on complex, fragmented and sometimes, repetitive tasks. This limits their ability to focus on strategic insights and maximize the value of their data. Gemini in BigQuery shifts this paradigm by providing AI capabilities that help streamline your workflows across the entire data lifecycle.

Streamline Your AI Integration: A Deep Dive into Kong AI Gateway

Join us to learn about the AI Gateway concept and explore the rapidly evolving landscape of large language models (LLMs) in modern applications. With the surge of AI providers and the lack of standardization, organizations face significant challenges in adopting and managing AI services effectively. Kong's AI Gateway, built on the proven Kong Gateway platform, addresses these challenges head-on, empowering developers and organizations to harness the power of AI quickly and securely.

Snowflake Launches the World's Best Practical Text-Embedding Model for Retrieval Use Cases

Today Snowflake is launching and open-sourcing with an Apache 2.0 license the Snowflake Arctic embed family of models. Based on the Massive Text Embedding Benchmark (MTEB) Retrieval Leaderboard, the largest Arctic embed model with only 334 million parameters is the only one to surpass average retrieval performance of 55.9, a feat only less practical to deploy models with over 1 billion parameters are able to achieve.

LLM Metrics: Key Metrics Explained

Organizations that monitor their LLMs will benefit from higher performing models at higher efficiency, while meeting ethical considerations like ensuring privacy and eliminating bias and toxicity. In this blog post, we bring the top LLM metrics we recommend measuring and when to use each one. In the end, we explain how to implement these metrics in your ML and gen AI pipelines.

Unleashing the Power of Digital Assurance in the Age of AI and Gen AI: Charting the Way Forward

In the pursuit of excellence, Quality Assurance (QA) has embarked on a profound journey of automation. Beginning with manual testing as its foundation, QA has progressed steadily through functional automation and smart automation, culminating in its embrace of Intelligent automation and Codeless automation. This evolution mirrors our transition from traditional waterfall models to agile methodologies.

Why RAG Has a Place in Your LLMOps

With the explosion of generative AI tools available for providing information, making recommendations, or creating images, LLMs have captured the public imagination. Although we cannot expect an LLM to have all the information we want, or sometimes even include inaccurate information, consumer enthusiasm for using generative AI tools continues to build.