Systems | Development | Analytics | API | Testing

Latest Posts

Serverless NiFi Flows with DataFlow Functions: The Next Step in the DataFlow Service Evolution

Cloudera DataFlow for the Public Cloud (CDF-PC) is a cloud-native service for Apache NiFi within the Cloudera Data Platform (CDP). CDF-PC enables organizations to take control of their data flows and eliminate ingestion silos by allowing developers to connect to any data source anywhere with any structure, process it, and deliver to any destination using a low-code authoring experience.

Announcing GA of DataFlow Functions

Today, we’re excited to announce that DataFlow Functions (DFF), a feature within Cloudera DataFlow for the Public Cloud, is now generally available for AWS, Microsoft Azure, and Google Cloud Platform. DFF provides an efficient, cost optimized, scalable way to run NiFi flows in a completely serverless fashion. This is the first complete no-code, no-ops development experience for functions, allowing users to save time and resources.

The Top Three Entangled Trends in Data Architectures: Data Mesh, Data Fabric, and Hybrid Architectures

Data teams have the impossible task of delivering everything (data and workloads) everywhere (on premise and in all clouds) all at once (with little to no latency). They are being bombarded with literature about seemingly independent new trends like data mesh and data fabric while dealing with the reality of having to work with hybrid architectures. Each of these trends claim to be complete models for their data architectures to solve the “everything everywhere all at once” problem.

Improve Underwriting Using Data and Analytics

Insurance carriers are always looking to improve operational efficiency. We’ve previously highlighted opportunities to improve digital claims processing with data and AI. In this post, I’ll explore opportunities to enhance risk assessment and underwriting, especially in personal lines and small and medium-sized enterprises.

SCIM (System for Cross-domain Identity Management)

The identity team at Cloudera has been working to add the System for Cross-domain Identity Management (SCIM) support to Cloudera Data Platform (CDP) and we’re happy to announce the general availability of SCIM on Azure Active Directory! In Part One we discussed: CDP SCIM Support for Active Directory, which discusses the core elements of CDP’s SCIM support for Azure AD.

A Flexible and Efficient Storage System for Diverse Workloads

Apache Ozone is a distributed, scalable, and high-performance object store, available with Cloudera Data Platform (CDP), that can scale to billions of objects of varying sizes. It was designed as a native object store to provide extreme scale, performance, and reliability to handle multiple analytics workloads using either S3 API or the traditional Hadoop API.

Demystifying Modern Data Platforms

July brings summer vacations, holiday gatherings, and for the first time in two years, the return of the Massachusetts Institute of Technology (MIT) Chief Data Officer symposium as an in-person event. The gathering in 2022 marked the sixteenth year for top data and analytics professionals to come to the MIT campus to explore current and future trends. A key area of focus for the symposium this year was the design and deployment of modern data platforms.

Chose Both: Data Fabric and Data Lakehouse

A key part of business is the drive for continual improvement, to always do better. “Better” can mean different things to different organizations. It could be about offering better products, better services, or the same product or service for a better price or any number of things. Fundamentally, to be “better” requires ongoing analysis of the current state and comparison to the previous or next one. It sounds straightforward: you just need data and the means to analyze it.