Systems | Development | Analytics | API | Testing

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 1: The Set-Up & Basics

Introduction Python is used extensively among Data Engineers and Data Scientists to solve all sorts of problems from ETL/ELT pipelines to building machine learning models. Apache HBase is an effective data storage system for many workflows but accessing this data specifically through Python can be a struggle. For data professionals that want to make use of data stored in HBase the recent upstream project “hbase-connectors” can be used with PySpark for basic operations.

Maximizing Supply Chain Agility through the "Last Mile" Commitment

In my last two blogs (Get to Know Your Retail Customer: Accelerating Customer Insight and Relevance, and Improving your Customer-Centric Merchandising with Location-based in-Store Merchandising) we looked at the benefits to retail in building personalized interactions by accessing both structured and unstructured data from website clicks, email and SMS opens, in-store point sale systems and past purchased behaviors.

An A-Z Data Adventure on Cloudera's Data Platform

In this blog we will take you through a persona-based data adventure, with short demos attached, to show you the A-Z data worker workflow expedited and made easier through self-service, seamless integration, and cloud-native technologies. You will learn all the parts of Cloudera’s Data Platform that together will accelerate your everyday Data Worker tasks.

How ASEAN Retailers Can Become insight driven with a Hybrid Cloud data strategy

There has been an e-commerce explosion this year as consumers seek safety and convenience from the comfort of their own homes using digital tools to purchase everything from durable hard goods to fashion accessories to daily living consumables like food perishables, cleaning products and even school supplies.

Enabling The Full ML Lifecycle For Scaling AI Use Cases

When it comes to machine learning (ML) in the enterprise, there are many misconceptions about what it actually takes to effectively employ machine learning models and scale AI use cases. When many businesses start their journey into ML and AI, it’s common to place a lot of energy and focus on the coding and data science algorithms themselves.

The role of data in COVID-19 vaccination record keeping

The role of data in COVID-19 vaccination record keeping Now that the Pfizer vaccine has been approved by the FDA for use in the US, and the Moderna vaccine likely isn’t far behind, we are now on the verge of being able to emerge from the social distancing world that began earlier in 2020. Recent news has talked about distributing a vaccination record card to everyone who gets a COVID-19 vaccine.

Cloudera Replication Plugin enables x-platform replication for Apache HBase

The Cloudera Data Platform (CDP) is the latest Big Data offering from Cloudera. It includes Apache HBase and Phoenix as part of the platform. These two components are provided in 3 form-factors: Cloudera’s Apache HBase customers typically run mission-critical applications that cannot afford any downtime. They need a way to migrate to a new deployment either without a production outage or, at a minimum, a tiny outage.

Bringing transaction support to Cloudera Operational Database

We’re excited to share that after adding ANSI SQL, secondary indices, star schema, and view capabilities to Cloudera’s Operational Database, we will be introducing distributed transaction support in the coming months. The ACID model of database design is one of the most important concepts in databases. ACID stands for atomicity, consistency, isolation, and durability. For a very long time, strict adherence to these four properties was required for a commercially successful database.