
Machine Learning

Benchmarking llama.cpp on Arm Neoverse-based AWS Graviton instances with ClearML

By Erez Schnaider, Technical Product Marketing Manager, ClearML

In a previous blog post, we demonstrated how easy it is to leverage Arm Neoverse-based Graviton instances on AWS to run training workloads. In this post, we'll explore how ClearML simplifies the management and deployment of LLM inference using llama.cpp on Arm-based instances, delivering up to 4x the performance of comparable x86 alternatives on AWS.
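
As a rough illustration of the workload being benchmarked, the sketch below times token generation through the llama-cpp-python bindings. The model path, thread count, and prompt are placeholder assumptions, and the post itself orchestrates this through ClearML rather than a standalone script.

    import time
    from llama_cpp import Llama  # pip install llama-cpp-python

    # Placeholder GGUF model path; any quantized model works for a rough test.
    llm = Llama(
        model_path="models/llama-3-8b-instruct.Q4_K_M.gguf",
        n_ctx=2048,
        n_threads=16,  # e.g. one thread per Graviton vCPU
    )

    prompt = "Explain the Arm Neoverse architecture in one paragraph."

    start = time.perf_counter()
    out = llm(prompt, max_tokens=128)
    elapsed = time.perf_counter() - start

    # The completion response reports token usage, which gives a
    # back-of-the-envelope throughput number for comparing instance types.
    n_tokens = out["usage"]["completion_tokens"]
    print(f"{n_tokens} tokens in {elapsed:.2f}s "
          f"({n_tokens / elapsed:.1f} tokens/s)")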

6 Best Practices for Implementing Generative AI

Generative AI has rapidly transformed industries by enabling advanced automation, personalized experiences, and groundbreaking innovations. However, implementing these powerful tools requires a production-first approach that maximizes business value while mitigating risk. This guide outlines six best practices to ensure your generative AI initiatives are valuable, scalable, compliant, and future-proof.

From Machine Learning to AI: Simplifying the Path to Enterprise Intelligence

For years, Cloudera’s platform has helped the world’s most innovative organizations turn data into action. As the AI landscape evolves from experiments into strategic, enterprise-wide initiatives, it’s clear that our naming should reflect that shift. That’s why we’re moving from Cloudera Machine Learning to Cloudera AI.

Revolutionizing Enterprise AI: ClearML and AMD Collaborate to Drive Innovation at Scale

In a significant stride toward transforming AI infrastructure, ClearML has recently announced a collaboration with AMD. By integrating AMD's powerful hardware and open-source ROCm software with ClearML's silicon-agnostic, end-to-end platform, we're empowering IT teams and AI builders to innovate with ease across diverse infrastructures and integrate GPUs from multiple vendors.

2025 Gen AI Predictions: What Lies Ahead?

In 2024, organizations realized the revolutionary business potential of gen AI. They accelerated their gen AI operationalization efforts: exploring new use cases, researching LLMs and AI pipelines, and weighing the underlying ethical issues. With the seeds of the AI revolution now planted, the market is maturing accordingly.

Deploying Gen AI in Production with NVIDIA NIM and MLRun

In this demo, we showcase how to leverage MLRun, Iguazio, and NVIDIA NIM microservices to deploy and monitor a generative AI model at scale, focusing on reducing risk and ensuring seamless performance. It demonstrates advanced methods for model monitoring, logging, and continuous fine-tuning.
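
NIM microservices expose an OpenAI-compatible endpoint, so a minimal smoke test of a deployed model can look like the sketch below. The base URL and model name are assumptions for a locally hosted NIM container; in the demo, a call like this would sit behind an MLRun serving function rather than a bare script.

    from openai import OpenAI  # pip install openai

    # Assumed local NIM endpoint; NIM containers serve an OpenAI-compatible API.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

    resp = client.chat.completions.create(
        model="meta/llama3-8b-instruct",  # placeholder NIM model name
        messages=[{"role": "user",
                   "content": "Summarize what NVIDIA NIM does."}],
        max_tokens=128,
    )
    print(resp.choices[0].message.content)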

Choosing the Right-Sized LLM for Quality and Flexibility: Optimizing Your AI Toolkit

LLMs are the foundation of gen AI applications. To effectively operationalize and de-risk LLMs and ensure they deliver business value, organizations need to consider not just the model itself but also the supporting infrastructure, including GPUs and operational frameworks. By optimizing these for your use case, you can ensure the LLM you choose is the right fit for your needs.

Introducing Accelerator for Machine Learning (ML) Projects: Summarization with Gemini from Vertex AI

We’re thrilled to announce the release of a new Cloudera Accelerator for Machine Learning (ML) Projects (AMP): “Summarization with Gemini from Vertex AI”. An AMP is a pre-built, high-quality minimum viable product (MVP) for Artificial Intelligence (AI) use cases that can be deployed with a single click from Cloudera AI (CAI). AMPs are all about helping you quickly build performant AI applications. More on AMPs can be found here.
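
For a sense of the call the AMP wraps, a bare-bones Gemini summarization request through the Vertex AI SDK looks roughly like this. The project ID, region, and model version are placeholders; the AMP supplies the application scaffolding around this call.

    import vertexai  # pip install google-cloud-aiplatform
    from vertexai.generative_models import GenerativeModel

    # Placeholder GCP project and region.
    vertexai.init(project="my-gcp-project", location="us-central1")

    model = GenerativeModel("gemini-1.5-flash")  # placeholder model version
    document = "Paste the text to summarize here."

    response = model.generate_content(
        f"Summarize the following document in three sentences:\n\n{document}"
    )
    print(response.text)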

AI Agents Are All You Need

Sorry for the clickbait title, but everyone is talking about AI agents, and for good reason. With the proliferation of LLMs, everyone – from software engineers using LLMs as coding copilots to people using AI to plan vacations – is looking for new ways to use the technology beyond answering questions or searching knowledge bases.

Resource Allocation Policy Management - A Practical Overview

As organizations evolve – onboarding new team members, expanding use cases, and broadening the scope of model development – their compute infrastructure grows increasingly complex. What often begins as a single cloud account running on available credits can quickly expand into a hybrid mix of on-prem and cloud resources, each with different associated costs and tailored to different workloads.