Palo Alto, CA, USA
Jun 30, 2022   |  By Bill Zhang
We are excited to announce the general availability of Apache Iceberg in Cloudera Data Platform (CDP). Iceberg is a 100% open table format, developed through the Apache Software Foundation, and helps users avoid vendor lock-in. Today’s general availability announcement covers Iceberg running within key data services in the Cloudera Data Platform (CDP)—including Cloudera Data Warehousing (CDW), Cloudera Data Engineering (CDE), and Cloudera Machine Learning (CML).
Jun 28, 2022   |  By André Araújo
In a previous blog of this series, Turning Streams Into Data Products, we talked about the increased need for reducing the latency between data generation/ingestion and producing analytical results and insights from this data. We discussed how Cloudera Stream Processing (CSP) with Apache Kafka and Apache Flink could be used to process this data in real time and at scale. In this blog we will show a real example of how that is done, looking at how we can use CSP to perform real-time fraud detection.
Jun 23, 2022   |  By Carolyn Duby
Much of the hype around big data and analytics focuses on business value and bottom-line impacts. Those are enormously important in the private and public sectors alike. But for government agencies, there is a greater mission: improving people’s lives. Data makes the most ambitious and even idealistic goals—like making the world a better place—possible.
Jun 22, 2022   |  By Monique Hesseling
Across the globe, cloud concentration risk is coming under greater scrutiny. The UK HM Treasury department recently issued a policy paper “Critical Third Parties to the Finance Sector.” The paper is a proposal to enable oversight of third parties providing critical services to the UK financial system.
Jun 17, 2022   |  By Michael Kohs
In the second blog of the Universal Data Distribution blog series, we explored how Cloudera DataFlow for the Public Cloud (CDF-PC) can help you implement use cases like data lakehouse and data warehouse ingest, cybersecurity, and log optimization, as well as IoT and streaming data collection. A key requirement for these use cases is the ability to not only actively pull data from source systems but to receive data that is being pushed from various sources to the central distribution service.
Jun 17, 2022   |  By Ram Venkatesh
Cloudera customers run some of the biggest data lakes on earth. These lakes power mission-critical, large-scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. In recent years, the term “data lakehouse” was coined to describe this architectural pattern of tabular analytics over data in the data lake.
Jun 16, 2022   |  By George Vetticaden
Every large enterprise organization is attempting to accelerate its digital transformation strategy to engage with customers in a more personalized, relevant, and dynamic way. The ability to perform analytics on data as it is created and collected (a.k.a. real-time data streams) and to generate immediate insights for faster decision making gives organizations a competitive edge.
Jun 13, 2022   |  By Paul Codding
We are excited to announce that Cloudera has been named a 2022 Gartner Peer Insights Customers’ Choice for Cloud Database Management Systems (DBMS). Peer Insights is a user review site, the technology professional’s “go-to” destination for information on customer experience. Gartner Peer Insights collects anonymous customer reviews on select product categories. To date, Gartner has collected over 450,000 reviews for 18,000 products in over 425 categories.
Jun 10, 2022   |  By Jacob Bengtson
Here at Cloudera, we’re committed to helping make the lives of data practitioners as painless as possible. For data scientists, we continue to provide new Applied Machine Learning Prototypes (AMPs), which are open source and available on GitHub. These pre-built reference examples are complete end-to-end data science projects. In Cloudera Machine Learning (CML), you can deploy them with the single click of a button, bringing data scientists that much closer to providing value.
Jun 9, 2022   |  By George Vetticaden
In the first blog of the Universal Data Distribution blog series, we discussed the emerging need within enterprise organizations to take control of their data flows. From origin through all points of consumption both on-prem and in the cloud, all data flows need to be controlled in a simple, secure, universal, scalable, and cost-effective way.
Jun 30, 2022   |  By Cloudera
At Cloudera, we are committed to staying true to our open source roots, and working well within the communities is critical to that. Since 2021, we have supported the growing Iceberg community with hundreds of contributions across Impala, Hive, Spark, and Iceberg. We look forward to continuing the momentum as companies embrace the open lakehouse. The general release is now available in the Cloudera Data Platform.
Jun 28, 2022   |  By Cloudera
This video is a 101 introduction to Edge Flow Manager (EFM), the Cloudera Edge Management (CEM) solution for managing and monitoring Apache MiNiFi agents at scale. The video walks through each view of the user interface to demonstrate and explain the features for designing flows, publishing flows to the agents, executing remote commands, monitoring the agents, and more.
Jun 10, 2022   |  By Cloudera
If you’re new to the world of large-scale data analytics, this session is for you! We’ll cover the basics of what problems Apache Spark can solve, why and when to use Spark, and how Spark enables efficient use of time and computing hardware. We’ll also demonstrate how easy it is to run a PySpark job in the public cloud using the Data Science Workbench and Cloudera Data Engineering products.
Jun 8, 2022   |  By Cloudera
TechCrunch and Cloudera invite you to a conversation about the data transformation underway that is changing how information is used and the very nature of business. The emerging data ecosystem will allow enterprises to work collaboratively with customers, partners and even competitors around the world to integrate disparate data sources for a more complete picture of their business’ present and future.
Jun 2, 2022   |  By Cloudera
Hands-on demo for Cloudera Data Platform’s Universal Data Distribution (UDD) Service using CDF for the public cloud. This demo shows how to build ingest pipelines that move data from anywhere in the business to any other system, software, or workflow. In this particular demo we will show how the UDD service enables automation of ingest and data delivery across multiple public cloud providers into other analytic systems.
May 26, 2022   |  By Cloudera
IT leaders need tactics to implement hybrid data strategies in their organizations. Chief Product Officer Sudhir Menon discusses the key to overcoming hybrid cloud migration roadblocks.
May 24, 2022   |  By Cloudera
Cloudera’s Long Term Support (LTS) model is here, and in this video, Chief Product Officer Sudhir Menon gives us the rundown. Cloudera LTS gives organizations confidence in moving to the next stable version of the Cloudera Data Platform while maintaining security and performance. With the new LTS release model, Cloudera is committed to supporting our customers large and small, both on-prem and in the cloud, in the most secure, performant, and risk-free way possible.
May 18, 2022   |  By Cloudera
Organizations with data strategies in play for more than a year found that those strategies were more successful and more effective. Cindy Maike, VP of Industry Solutions and Value Management, explains how kickstarting your data strategy now will help your business meet unforeseen challenges to the industry.
Apr 15, 2022   |  By Cloudera
Like many financial institutions, Commerzbank was challenged with staying flexible to meet customer needs, while also meeting regulatory compliance. In this Movers & Makers, Justyna Lebedyk, Product Owner in Big Data for Commerzbank, talks about how their digital transformation with the hybrid cloud and Cloudera allowed them to overcome this challenge.
Mar 24, 2022   |  By Cloudera
Iceberg is a high-performance table format for large-scale analytics. It makes data stored in the multiple file formats common in the Hadoop ecosystem easily accessible for the different use cases common in the lakehouse architecture. During this meetup, we’ll assume you’ve never heard of Apache Iceberg and explain the basics: what problems the Apache Iceberg project is addressing, how Iceberg works, what features Iceberg tables offer, and how you can put Iceberg to use in your own data projects that utilize Hive, Spark, or Impala.
Jun 28, 2018   |  By Cloudera
Enterprises require fast, cost-efficient solutions to the familiar challenges of engaging customers, reducing risk, and improving operational excellence to stay competitive. The cloud is playing a key role in accelerating time to benefit from new insights. Managed cloud services that automate provisioning, operation, and patching will be critical for enterprises to leverage the full promise of the cloud when it comes to time to value and agility.
Jun 26, 2018   |  By Cloudera
The adoption of cloud computing in the financial services sector has grown substantially in the past three years on a global basis. Diversification of risk is always a key concern for financial institutions and the seeming safety of having a single cloud provider is not being properly measured from a systemic risk and operational risk perspective.
Jun 12, 2018   |  By Cloudera
This white paper provides a reference architecture for running Enterprise Data Hub on Oracle Cloud Infrastructure. Topics include installation automation, automated configuration and tuning, and best practices for deployment and topology to support security and high availability.
May 17, 2018   |  By Cloudera
A cloud-based analytics platform needs to be easy, unified, and enterprise-grade to meet the demands of your business. This white paper covers how Cloudera's machine learning and analytics platform complements popular cloud services like Amazon Web Services (AWS) and Microsoft Azure, and enables customers to organize, process, analyze, and store data at large scale...anywhere.
May 15, 2018   |  By Cloudera
The Modern Platform for Machine Learning and Analytics Optimized for Cloud.
Mar 25, 2018   |  By Cloudera
In the wake of the global financial crisis, the world has become much more interconnected and immensely more complex. As a result, you can no longer simply look at the past as an indicator of future trends. The financial services industry needs real-time insights into numerous interacting variables to make informed decisions.

Cloudera delivers the modern platform for machine learning and analytics optimized for the cloud. Imagine having access to all your data in one platform. The opportunities are endless. We enable you to transform vast amounts of complex data into clear and actionable insights to enhance your business and exceed your expectations.

The right products for the job:

  • Enterprise Data Hub: Operate with confidence—thanks to comprehensive security and governance—while at the same time enabling unrivaled self-service performance at extreme scale. All in an enterprise-grade solution that lets you run anywhere, on-premises or in hybrid- and multi-cloud environments.
  • Data Science Workbench: Accelerate machine learning from research to production with a secure, self-service data science platform built for the enterprise.
  • Data Warehouse: A modern data warehouse that delivers an enterprise-grade, hybrid cloud solution designed for self-service analytics.
  • Data Science & Engineering: Cloudera Data Science provides better access to Apache Hadoop data with familiar and performant tools that address all aspects of modern predictive analytics.
  • Altus Cloud: The industry’s first machine learning and analytics cloud platform built with a shared data experience.

The world’s leading organizations choose Cloudera to grow their businesses, improve lives, and advance human achievement.