Monthly Archive

Supercharge Your Data Lakehouse with Apache Iceberg in Cloudera Data Platform

Jun 30, 2022 By Bill Zhang In Cloudera

We are excited to announce the general availability of Apache Iceberg in Cloudera Data Platform (CDP). Iceberg is a 100% open table format, developed through the Apache Software Foundation, that helps users avoid vendor lock-in. Today’s general availability announcement covers Iceberg running within key data services in CDP, including Cloudera Data Warehousing (CDW), Cloudera Data Engineering (CDE), and Cloudera Machine Learning (CML).

Technology Spotlight: Apache Iceberg

Jun 30, 2022 By Cloudera In Cloudera

At Cloudera, we are committed to staying true to our open source roots, and working well within the communities is critical to that. Since 2021, we have supported the growing Iceberg community with hundreds of contributions across Impala, Hive, Spark, and Iceberg. We look forward to continuing the momentum as companies embrace the open lakehouse. Iceberg is now generally available in the Cloudera Data Platform.

Fraud Detection with Cloudera Stream Processing Part 1

Jun 28, 2022 By André Araújo In Cloudera

In a previous blog of this series, Turning Streams Into Data Products, we talked about the increased need for reducing the latency between data generation/ingestion and producing analytical results and insights from this data. We discussed how Cloudera Stream Processing (CSP) with Apache Kafka and Apache Flink could be used to process this data in real time and at scale. In this blog we will show a real example of how that is done, looking at how we can use CSP to perform real-time fraud detection.

Introduction to Cloudera Edge Flow Manager

Jun 28, 2022 By Cloudera In Cloudera

This video is a 101 introduction to Edge Flow Manager (EFM), the Cloudera Edge Management (CEM) solution for managing and monitoring Apache MiNiFi agents at scale. The video goes through all the different views of the user interface to demonstrate and explain the features for designing flows, publishing flows to the agents, executing remote commands, monitoring the agents, and more.

Making the World a Better Place with Data

Jun 23, 2022 By Carolyn Duby In Cloudera

Much of the hype around big data and analytics focuses on business value and bottom-line impacts. Those are enormously important in the private and public sectors alike. But for government agencies, there is a greater mission: improving people’s lives. Data makes the most ambitious and even idealistic goals—like making the world a better place—possible.

Are You Ready for Cloud Regulations?

Jun 22, 2022 By Monique Hesseling In Cloudera

Across the globe, cloud concentration risk is coming under greater scrutiny. The UK HM Treasury department recently issued a policy paper “Critical Third Parties to the Finance Sector.” The paper is a proposal to enable oversight of third parties providing critical services to the UK financial system.

Build Hybrid Data Pipelines and Enable Universal Connectivity With CDF-PC Inbound Connections

Jun 17, 2022 By Michael Kohs In Cloudera

In the second blog of the Universal Data Distribution blog series, we explored how Cloudera DataFlow for the Public Cloud (CDF-PC) can help you implement use cases like data lakehouse and data warehouse ingest, cybersecurity, and log optimization, as well as IoT and streaming data collection. A key requirement for these use cases is the ability to not only actively pull data from source systems but to receive data that is being pushed from various sources to the central distribution service.

The Future of the Data Lakehouse - Open

Jun 17, 2022 By Ram Venkatesh In Cloudera

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. In recent years, the term “data lakehouse” was coined to describe this architectural pattern of tabular analytics over data in the data lake.

Turning Streams Into Data Products

Jun 16, 2022 By George Vetticaden In Cloudera

Every large enterprise organization is attempting to accelerate its digital transformation strategy to engage with its customers in a more personalized, relevant, and dynamic way. The ability to perform analytics on data as it is created and collected (a.k.a. real-time data streams) and generate immediate insights for faster decision making provides a competitive edge for organizations.

Cloudera Recognized as 2022 Gartner Peer Insights

Jun 13, 2022 By Paul Codding In Cloudera

We are excited to announce that Cloudera has been named a 2022 Gartner Peer Insights Customers’ Choice for Cloud Database Management Systems (DBMS). Peer Insights is a user review site, the technology professional’s “go-to” destination for information on customer experience. Gartner Peer Insights collects anonymous customer reviews on select product categories. To date, Gartner has collected over 450,000 reviews for 18,000 products in over 425 categories.

Cloudera's Applied ML Prototype Catalog Continues to Grow

Jun 10, 2022 By Jacob Bengtson In Cloudera

Here at Cloudera, we’re committed to helping make the lives of data practitioners as painless as possible. For data scientists, we continue to provide new Applied Machine Learning Prototypes (AMPs), which are open source and available on GitHub. These pre-built reference examples are complete end-to-end data science projects. In Cloudera Machine Learning (CML), you can deploy them with the single click of a button, bringing data scientists that much closer to providing value.

Hello, Spark! An intro to Apache Spark using PySpark in the Cloud

Jun 10, 2022 By Cloudera In Cloudera

If you’re new to the world of large-scale data analytics, this session is for you! We’ll cover the basics of what problems Apache Spark can solve, why and when to use Spark, and how Spark enables efficient use of time and computing hardware. We’ll also demonstrate how easy it is to run a PySpark job in the public cloud using the Data Science Workbench and Cloudera Data Engineering products.

Streaming Edge Data Collection and Global Data Distribution

Jun 9, 2022 By George Vetticaden In Cloudera

In the first blog of the Universal Data Distribution blog series, we discussed the emerging need within enterprise organizations to take control of their data flows. From origin through all points of consumption both on-prem and in the cloud, all data flows need to be controlled in a simple, secure, universal, scalable, and cost-effective way.

Data & The Culture Transformation

Jun 8, 2022 By Cloudera In Cloudera

TechCrunch and Cloudera invite you to a conversation about the data transformation underway that is changing how information is used and the very nature of business. The emerging data ecosystem will allow enterprises to work collaboratively with customers, partners and even competitors around the world to integrate disparate data sources for a more complete picture of their business’ present and future.

The Future Is Hybrid Data, Embrace It

Jun 7, 2022 By David Moxey In Cloudera

We live in a hybrid data world. In the past decade, the amount of structured data created, captured, copied, and consumed globally has grown from less than 1 ZB in 2011 to nearly 14 ZB in 2020. Impressive, but dwarfed by the amount of unstructured data, cloud data, and machine data – another 50 ZB.

The Power of Exploratory Data Analysis and Visualization for ML

Jun 3, 2022 By Peter Ableda In Cloudera

Data scientists and machine learning engineers in enterprise organizations need to fully understand their data in order to properly analyze it, build models, and power machine learning use cases across their business. The lack of tooling specifically designed for data discovery, exploration, and preliminary analysis makes this a significant challenge for these teams.

Moving Enterprise Data From Anywhere to Any System Made Easy

Jun 2, 2022 By George Vetticaden In Cloudera

Since 2015, the Cloudera DataFlow team has been helping the largest enterprise organizations in the world adopt Apache NiFi as their enterprise standard data movement tool. Over the last few years, we have had a front-row seat in our customers’ hybrid cloud journeys as they expand their data estates across the edge, on-premises, and multiple cloud providers.

Technical Demo - Universal Data Distribution With Cloudera DataFlow for Public Cloud

Jun 2, 2022 By Cloudera In Cloudera

This is a hands-on demo of Cloudera Data Platform’s Universal Data Distribution (UDD) service using CDF for the public cloud. It shows how to build ingest pipelines that move data from anywhere in the business to any other system, software, or workflow. In particular, we show how the UDD service enables automation of ingest and data delivery across multiple public cloud providers into other analytic systems.
