Latest Posts

Machine learning in production: Human error is inevitable, here's how to prepare.

Apr 20, 2020 By Ade Adewunmi In Cloudera

You did it. You have machine learning capabilities up and running in your organization. Success! What started as a few nascent experiments (and maybe a few failures) has become a set of carefully constructed models racing along in full production, with the ability to scale to hundreds or thousands of production models in sight. Assembling your expert team of data scientists and custodians seems like a distant memory. Now you’re looking ahead to the future: growth, innovation, revenue!

An Architecture for Secure COVID-19 Contact Tracing

Apr 17, 2020 By Tristan Stevens In Cloudera

This post describes an architecture, and associated controls for privacy, to build a data platform for a nationwide proactive contact tracing solution.

Operational Database Management

Apr 16, 2020 By Gokul Kamaraj In Cloudera

This blog post is part of a series on Cloudera’s Operational Database (OpDB) in CDP. Each post goes into more detail about new features and capabilities. Start from the beginning of the series with Operational Database in CDP. This blog post gives you an overview of the OpDB management tools and features in the Cloudera Data Platform. The tools discussed in this article will help you understand the various options available to manage the operations of your OpDB cluster.

Challenges of running a big data distro in the cloud

Apr 16, 2020 By Sushant Rao In Cloudera

There are many reasons to run a big data distribution, such as Cloudera Data Hub (CDH) and Hortonworks Data Platform (HDP), in the cloud with Infrastructure-as-a-Service (IaaS). The main reason is agility. When the business needs to onboard a new use case, a data admin can bring on additional virtual infrastructure to their clusters in the cloud in minutes or hours. With an on-prem cluster, it may take weeks or months to add the infrastructure capacity for the new use cases.

Evolving Insurance with Data and Analytics

Apr 15, 2020 By Sandra Horn In Cloudera

Insurance companies around the world are pushing ahead with innovative offerings that are fundamentally changing the insurance landscape. They are creating personalized offerings and products tailored to the specific needs of their customers. For example, they are implementing usage-based insurance (UBI) based on driving habits, miles driven, and driving history, and offering discounts on health insurance based on data from health trackers.

The U.S. Census Enters the Digital Age with Cloudera

Apr 14, 2020 By Shaun Bierweiler In Cloudera

2020 brings a new decade and, for the U.S. Census Bureau, a new challenge. As the leading provider of demographic and economic data for the federal government and the nation, its largest initiative is the U.S. Census, which is conducted every 10 years and counts every resident in the United States. For the first time in U.S. history, the census will be conducted primarily online instead of by mail.

Supercharge ML models with Distributed XGBoost on CML

Apr 10, 2020 By Harshal Patil In Cloudera

Since childhood, we’ve been taught about the power of coalitions: working together to achieve a shared objective. In nature, we see this repeated frequently – swarms of bees, ant colonies, prides of lions – well, you get the idea. It is no different when it comes to machine learning models. Research and practical experience show that groups, or ensembles, of models do much better than a single, silver-bullet model. Intuitively, this makes sense.

Benchmarking NiFi Performance and Scalability

Apr 9, 2020 By Mark Payne In Cloudera

Ever wonder how fast Apache NiFi is? Ever wonder how well NiFi scales? When a customer is looking to use NiFi in a production environment, these are usually among the first questions asked. They want to know how much hardware they will need, and whether or not NiFi can accommodate their data rates. This isn’t surprising. Today’s world consists of ever-increasing data volumes. Users need tools that make it easy to handle these data rates.

Operational Database Administration

Apr 9, 2020 By Gokul Kamaraj In Cloudera

This blog post is part of a series on Cloudera’s Operational Database (OpDB) in CDP. Each post goes into more detail about new features and capabilities. Start from the beginning of the series with Operational Database in CDP. This blog post gives you an overview of the operational database (OpDB) administration tools and features in the Cloudera Data Platform.

Why the need for event-driven analysis?

Apr 7, 2020 By Laura Chu In Cloudera

Data saturation is everywhere. We want to collect more data because we want better information from them. However, the rapid rise in our ability to collect data hasn’t been matched by our ability to get meaningful insights from the data.
