Simplify Metrics on Apache Druid With Rill Data and Cloudera

Co-author: Mike Godwin, Head of Marketing, Rill Data

Cloudera has partnered with Rill Data, an expert in metrics at any scale, as Cloudera’s preferred ISV partner to provide technical expertise and support services for Apache Druid customers. We want Cloudera customers that rely on Apache Druid to know that their clusters are secure and supported by the Cloudera partner ecosystem.

Does Financial Crime Increase During a Recession?

The dynamic and interconnected world of global ecommerce, cryptocurrencies, and alternative payments places increased pressure on anti-financial-crime measures to keep pace and transform alongside these initiatives. Consumers worldwide are projected to use mobile devices to make more than 30.7 billion ecommerce transactions by 2026, a five-fold increase over the 6.1 billion predicted for 2022.

Fraud Detection With Cloudera Stream Processing Part 2: Real-Time Streaming Analytics

In part 1 of this blog we discussed how Cloudera DataFlow for the Public Cloud (CDF-PC), the universal data distribution service powered by Apache NiFi, can make it easy to acquire data from wherever it originates and move it efficiently to make it available to other applications in a streaming fashion.

Differences between the C++ and Java MiNiFi agents

In this video we will go through all the differences between the C++ and Java MiNiFi agents. The video shows the differences observed on the Edge Flow Manager UI ranging from different information to the presence of buttons and dropdown elements determined by the agent type. Differences in feature set and functionality are also highlighted. The two implementations also have different footprints (memory and CPU) as well as a different set of available components. This video will help you determine the MiNiFi agent that best suits your use case.

The Chief Data Officer | Digital Transformation

Today, data isn't a cost center. It's a business driver. And Chief Data Officers are responsible for using data to create real results and transform their business. Meet Ray, the CDO at a high tech global electronics manufacturer. Ray relies on the Cloudera Data Platform to bring multiple data sources together. With it, Ray's company can connect supply chain, go-to-market, and product research data in one place while lowering the cost on their network.

Why Replicating HBase Data Using Replication Manager is the Best Choice

In this article we discuss the various methods to replicate HBase data and explore why Replication Manager is the best choice for the job with the help of a use case. Cloudera Replication Manager is a key Cloudera Data Platform (CDP) service, designed to copy and migrate data between environments and infrastructures across hybrid clouds.

Beyond Data Fabrics: Cloudera Modern Data Architectures

As Cloudera CMO David Moxey outlined in his blog, we live in a hybrid data world. Data is growing and continues to accelerate its growth. It is changing in makeup and appearing in ever more places. Driving insight and value from it all is as much of an opportunity as it is a challenge. As a result, it’s getting progressively more complex for businesses to access, use, and create value from it.

Supercharge Your Data Lakehouse with Apache Iceberg in Cloudera Data Platform

We are excited to announce the general availability of Apache Iceberg in Cloudera Data Platform (CDP). Iceberg is a 100% open table format, developed through the Apache Software Foundation, and helps users avoid vendor lock-in. Today’s general availability announcement covers Iceberg running within key data services in the Cloudera Data Platform (CDP)—including Cloudera Data Warehousing (CDW), Cloudera Data Engineering (CDE), and Cloudera Machine Learning (CML).

Technology Spotlight: Apache Iceberg

At Cloudera, we are committed to staying true to our open source roots, and working well within the communities is critical to that. Since 2021, we have supported the growing Iceberg community with hundreds of contributions across Impala, Hive, Spark, and Iceberg. We look forward to continuing the momentum as companies embrace the open lakehouse. General release now available in the Cloudera Data Platform.

Fraud Detection with Cloudera Stream Processing Part 1

In a previous blog of this series, Turning Streams Into Data Products, we talked about the increased need for reducing the latency between data generation/ingestion and producing analytical results and insights from this data. We discussed how Cloudera Stream Processing (CSP) with Apache Kafka and Apache Flink could be used to process this data in real time and at scale. In this blog we will show a real example of how that is done, looking at how we can use CSP to perform real-time fraud detection.