February 2021


Sample applications for Cloudera Operational Database

Cloudera Operational Database is an operational database-as-a-service that brings ease of use and flexibility to Apache HBase. Cloudera Operational Database enables developers to quickly build future-proof applications that are architected to handle data evolution. In the previous blog posts, we looked at application development concepts and how Cloudera Operational Database (COD) interacts with other CDP services.


Change The Way You Do ML With Applied ML Prototypes

Today’s enterprise data science teams have one of the most challenging, yet most important, roles to play in your business’s ML strategy. In our current landscape, businesses that have adopted a successful ML strategy are outperforming their competitors by over 9%. The implications of ML for the future of business are clear. However, only 4% of enterprise executives today report seeing success from their ML investment.


Five Trends for the Financial Services Industry to Track in 2021

With a new year ahead, it’s time for financial services to pause, take stock of the “new normal,” and plan a path forward. COVID-19 forced nearly every industry to adapt to a new reality, and the financial services industry was no exception. Consumer habits shifted drastically. Suddenly, many people started working from home. Employee and customer needs changed. Adaptability was a necessity.


Building loyalty with data and analytics

In 1969, my aunt graduated from university and joined IBM, the dominant player in the nascent tech industry at the time. She remained at “Big Blue” where she met and married my uncle, and rose up through the management ranks, until their joint semi-retirement exactly 30 years later. She recently told me, “the only way you could get fired in those days was to murder someone, embezzle or steal”.


The Multifaceted Value Proposition of the Cloudera Data Platform

The Cloudera Data Platform (CDP) represents a paradigm shift in modern data architecture by addressing all existing and future analytical needs. It builds on a foundation of technologies from CDH (Cloudera's Distribution including Apache Hadoop) and HDP (Hortonworks Data Platform) and delivers a holistic, integrated data platform from Edge to AI, helping clients to accelerate complex data pipelines and democratize data assets.


Express Cloudera POV on 2021 data trends in insurance

Almost a year into the pandemic, the accelerated digital transformation has begun to feel less abrupt and more sustained. 2021 looks likely to be defined by a new phase: Thriving on digital transformation, rather than just surviving through it. We’ve written about the changes forced on the traditionally risk-averse insurance industry by COVID-19.


Cloudera DataFlow's key milestones and wins in 2020

Needless to say, 2020 was an unforgettable year in a lot of ways and we were all happy to say goodbye to it. The pandemic has ushered in new ways of conducting business, remote work cultures, telehealth, grocery/food deliveries, etc. While certain industries were hard-hit by this change, most businesses were able to adapt, pivot, and take this adversity in their stride.


Using other CDP services with Cloudera Operational Database

In the previous blog post, we looked at some of the application development concepts for the Cloudera Operational Database (COD). In this blog post, we’ll see how you can use other CDP services with COD. COD is an operational database-as-a-service that brings ease of use and flexibility to Apache HBase. Cloudera Operational Database enables developers to quickly build future-proof applications that are architected to handle data evolution.


Fine-Grained Authorization with Apache Kudu and Apache Ranger

When Kudu was first introduced as a part of CDH in 2017, it didn’t support any kind of authorization, so only air-gapped and non-secure use cases were satisfied. Coarse-grained authorization was added along with authentication in CDH 5.11 (Kudu 1.3.0), which made it possible to restrict access only to Apache Impala, where Apache Sentry policies could be applied, enabling a lot more use cases.

Data Enrichment Using Cloudera Data Engineering

In this video, we'll walk through an example of how you can use Cloudera Data Engineering to pull in multiple datasets from a Hive data warehouse and enrich the data using Apache Spark. We'll then run this Spark job from within Cloudera Data Engineering so that we can follow the progress and see details about the job's execution.

Cloudera Operational Database application development concepts

Cloudera Operational Database is now available in three different form-factors in Cloudera Data Platform (CDP). If you are new to Cloudera Operational Database, see this blog post. And, check out the documentation here. In this blog post, we’ll look at both Apache HBase and Apache Phoenix concepts relevant to developing applications for Cloudera Operational Database.


A Cost-Effective Data Warehouse Solution in CDP Public Cloud - Part 1

Today’s customers have a growing need for faster end-to-end data ingestion to meet the expected speed of insights and overall business demand. This ‘need for speed’ drives a rethink on building a more modern data warehouse solution, one that balances speed with platform cost management, performance, and reliability.


Using COD and CML to build applications that predict stock data

No, not really. You probably won’t be rich unless you work really hard… As nice as it would be, you can’t really predict a stock price based solely on ML, but now I have your attention! Continuing from my previous blog post about how awesome and easy it is to develop web-based applications backed by Cloudera Operational Database (COD), I started a small project to integrate COD with another CDP cloud experience, Cloudera Machine Learning (CML).


Data - the Octane Accelerating Intelligent Connected Vehicles

The digital revolution is making a deep impact on the automotive industry, offering practically unlimited possibilities for more efficient, convenient, and safe driving and travel experiences in connected vehicles. This revolution is just beginning to accelerate – in fact, according to a recent Allied Market Research study, the global connected car market was valued at $63.03 billion in 2019, and is projected to reach $225.16 billion by 2027, registering a CAGR of 17.1% from 2020 to 2027.


Cloudera wins Risk Markets Technology Award for Data Management Product of the year

Financial services institutions need the ability to analyze and act on massive volumes of data from diverse sources in order to monitor, model, and manage risk across the enterprise. They need a comprehensive data and analytics platform to model risk exposures on-demand. Cloudera is that platform. I am pleased to announce that Cloudera was just named the Risk Data Repository and Data Management Product of the Year in the Risk Markets Technology Awards 2021.


Data, The Unsung Hero of the Covid-19 Solution

COVID-19 vaccines from various manufacturers are being approved by more countries, but that doesn’t mean that they will be available at your local pharmacy or mass vaccination centers anytime soon. Creating, scaling up, and manufacturing the vaccine is just the first step; now the world needs to coordinate an incredibly complex supply chain system to deliver more vaccines to more places than ever before.

CDP Public Cloud: SSH Key Deployment

This video covers how to deploy SSH keys in CDP Public Cloud. It touches on how to generate a new SSH key pair and steps through the process of deploying it for a workload user through the Cloudera Management Console Web UI, as well as using the CDP command-line tool. It discusses the security implications of using the Cloudbreak user for login on data hub hosts, and explains why workload user credentials should be used instead in most cases. It also demonstrates using the deployed SSH keys for login to data hub hosts.

How to configure clients to connect to Apache Kafka Clusters securely - Part 4: TLS Client Authentication

In the previous posts in this series, we have discussed Kerberos, LDAP and PAM authentication for Kafka. In this post we will look into how to configure a Kafka cluster and client to use TLS client authentication. The examples shown here will highlight the authentication-related properties in bold font to differentiate them from other required security properties, as in the example below. TLS is assumed to be enabled for the Apache Kafka cluster, as it should be for every secure cluster.
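
As a rough illustration of the kind of client configuration such a post deals with, the following Python sketch assembles the standard Kafka client properties for TLS client authentication and renders them in .properties form. It is not taken from the post itself. The property keys are standard Kafka client settings, but the helper names, file paths, and passwords are hypothetical placeholders. The sketch also assumes the brokers are configured to request client certificates (for example with ssl.client.auth=required).

```python
def tls_client_auth_properties(keystore_path, keystore_password,
                               truststore_path, truststore_password,
                               key_password=None):
    """Build Kafka client properties for TLS client authentication.

    Hypothetical helper: all paths and passwords are supplied by the caller.
    """
    props = {
        # General TLS settings: encrypt traffic and trust the brokers' CA.
        "security.protocol": "SSL",
        "ssl.truststore.location": truststore_path,
        "ssl.truststore.password": truststore_password,
        # Authentication-related settings: the keystore holds the client's
        # own certificate and private key, presented to the broker.
        "ssl.keystore.location": keystore_path,
        "ssl.keystore.password": keystore_password,
    }
    if key_password is not None:
        # Only needed when the private key has its own password.
        props["ssl.key.password"] = key_password
    return props


def to_properties_text(props):
    """Render a dict as key=value lines, as in a client.properties file."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Placeholder values for illustration only.
example = tls_client_auth_properties(
    keystore_path="/path/to/client-keystore.jks",
    keystore_password="changeit",
    truststore_path="/path/to/truststore.jks",
    truststore_password="changeit",
)
print(to_properties_text(example))
```

The output could be saved as a properties file and passed to a Kafka command-line client, or the dictionary could be adapted to whichever client library is in use.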