Analytics

Why Short-Lived Connections Are Killing Your Performance! | Kafka Developer Mistakes

Constantly starting and stopping Apache Kafka producers and consumers? That’s a recipe for high resource usage and inefficiency. Short-lived connections are heavy on resources, and can slow down your whole cluster. Keep them running to boost performance, cut latency, and get the most out of your Kafka setup.

Cloudera announces 'Interoperability Ecosystem' with founding members AWS and Snowflake

Today enterprises can leverage the combination of Cloudera and Snowflake—two best-of-breed tools for ingestion, processing and consumption of data—for a single source of truth across all data, analytics, and AI workloads. But now AWS customers will gain more flexibility, data utility, and complexity, supporting the modern data architecture.

Predictions 2025: Strategies to Realize the Promise of AI

Snowflake leaders offer insight on AI, open source and cybersecurity development — and the fundamental leadership skills required — in the years ahead. As we come to the end of a calendar year, it’s natural to contemplate what the new year will hold for us. It’s an understatement to say that the future is very hard to predict, but it’s possible to both prepare for the likeliest outcomes and stay ready to adapt to the unexpected.

Fueling the Future of GenAI with NiFi: Cloudera DataFlow 2.9 Delivers Enhanced Efficiency and Adaptability

For more than a decade, Cloudera has been an ardent supporter and committee member of Apache NiFi, long recognizing its power and versatility for data ingestion, transformation, and delivery. Our customers rely on NiFi as well as the associated sub-projects (Apache MiNiFi and Registry) to connect to structured, unstructured, and multi-modal data from a variety of data sources – from edge devices to SaaS tools to server logs and change data capture streams.

Cloudera AI Inference Service Enables Easy Integration and Deployment of GenAI Into Your Production Environments

Welcome to the first installment of a series of posts discussing the recently announced Cloudera AI Inference service. Today, Artificial Intelligence (AI) and Machine Learning (ML) are more crucial than ever for organizations to turn data into a competitive advantage. To unlock the full potential of AI, however, businesses need to deploy models and AI applications at scale, in real-time, and with low latency and high throughput. This is where the Cloudera AI Inference service comes in.

EP 2: Beyond Just Data in Today's Market

Airports are an interconnected system where one unforeseen event can tip the scale into chaos. For a smaller airport in Canada, data has grown to be its North Star in an industry full of surprises. But in order for data to bring true value to operations–and ultimately customer experience–those data insights must be grounded in trust. Ryan Garnett, Senior Manager Business Solutions of Halifax International Airport Authority, joins The AI Forecast to share how the airport revamped its approach to data, creating a predictions engine that drives operational efficiency and improved customer experience.

Secure Data Sharing and Interoperability Powered by Iceberg REST Catalog

Many enterprises have heterogeneous data platforms and technology stacks across different business units or data domains. For decades, they have been struggling with scale, speed, and correctness required to derive timely, meaningful, and actionable insights from vast and diverse big data environments. Despite various architectural patterns and paradigms, they still end up with perpetual “data puddles” and silos in many non-interoperable data formats.

Why Relying on Default Settings Can Cost You! | Kafka Developer Mistakes

Default settings in Apache Kafka work when you’re getting started, but aren't suited for production. Sticking with defaults, like a seven-day retention policy, or a replication factor of one, can cause storage issues, or data loss in case of failure. Learn why optimizing retention periods, replication factors, and partitions, is crucial for better Kafka performance and reliability.

Luggage lost in a world of streaming data

Democratizing and sharing data inside and outside your organization, as a real-time data stream, has never been more in demand. Treating data as-a-product and adopting Data Mesh practices is leading the way. Here, we explain the concept through a real-life example of an airline building applications that process data across different domains.