
March 2020


When adopting machine learning, people are as important as technology

There is a secret to adopting machine learning that has nothing to do with the actual technology. Machine learning has the potential to transform your business: to automate processes, uncover new insights, improve your products and services, and make customers happier. Integrating the capability into your organization requires operational transformation and lots (and lots) of experimentation. But you know this already.


Operational Database in CDP

Cloudera’s operational database (OpDB) in CDP is a real-time, always-available, scalable database that serves traditional structured data alongside new unstructured data within a unified, open-source operational and warehousing platform.


How to deploy ML models to production

Many enterprises, including many Cloudera customers, are currently experimenting with machine learning (ML) and creating models to tackle a wide range of challenges. While many models today are used for dashboards and internal BI purposes, a small and rapidly growing group of enterprise leaders has begun to realize the potential of ML for business automation, optimization, and product innovation.


Cloudera Data Platform (CDP) now available on Microsoft Azure Marketplace providing unified billing for joint customers

Cloudera Data Platform (CDP) is now available on Microsoft Azure Marketplace – so joint customers can easily deploy the world’s first enterprise data cloud on Microsoft Azure.


Benchmarking Time Series workloads on Apache Kudu using TSBS

Time Series as Fast Analytics on Fast Data

Since its open-source introduction in 2015, Apache Kudu has billed itself as storage for fast analytics on fast data. This general mission encompasses many different workloads, but one of the fastest-growing use cases is time-series analytics. Time series has several key requirements. At first glance, it sounds like these requirements would demand a special-purpose database system built specifically for time series.


Beyond Connectivity - Top 5 Ways Data and Analytics Drive Transformation in Telecom

The telecommunications industry is in the midst of a fundamental reinvention and transformation. Faced with a range of emerging pressures – including consolidation, a changing competitive landscape, and commoditization of traditional services – communication service providers (CSPs) are seeking new revenue streams and novel business approaches.


Distributed model training using Dask and Scikit-learn

The theoretical bases for machine learning have existed for decades, yet it wasn’t until the early 2000s that the last AI winter came to an end. Since then, interest in and use of machine learning have exploded, and its development has been largely democratized. Perhaps not so coincidentally, the same period saw the rise of Big Data, carrying with it increased distributed data storage and distributed computing capabilities made popular by the Hadoop ecosystem.


The Real Role of Robotics in Retail

Automation and robotics are rapidly changing the retail landscape, so much so that there are clear winners and losers. I’m not talking about the war between brick-and-mortar stores and digital marketplaces, but rather about the retail digital revolution, where the winners are delivering greater than 4.5% comparable store/channel sales growth compared to their peers that have not embraced automation and robotics.

From 0 to Query with Cloudera Data Warehouse in CDP

In this video I'll show you how to get started with Cloudera Data Warehouse in CDP public cloud. I'll walk you through activating an environment for use with the Data Warehouse experience, creating a Virtual Warehouse, and then loading in some data. After loading data in, I'll show you how to connect your Virtual Warehouse to Tableau.

Maximizing performance of Apache Kudu block cache with Intel Optane DCPMM

Intel Optane DC persistent memory (Optane DCPMM) has higher bandwidth and lower latency than SSD and HDD storage drives. These characteristics give a significant performance boost to big data storage platforms that can use it for caching. One such platform is Apache Kudu, which can use DCPMM for its internal block cache.


The Retail Renaissance - How data and analytics are reshaping retail

The retail landscape is in the midst of a dramatic, data-driven renaissance. New tools help to build new connections — between consumers and retailers, and across supply chains. Data analytics and machine learning further these connections to better understand and predict customer behavior and improve demand forecasting. In this emerging era of smart retail, organizations have access to a range of powerful new capabilities and tools.


5 Steps to Making Better Business Decisions with Machine Learning

Most of the day-to-day work of knowledge workers is spent helping the business make better decisions, such as choosing whether it’s worth expending the effort (or actual money) to achieve a desired business goal. The example I often use when talking about ML is churn prediction (and I’m starting to think I’m overusing it now). It costs money to retain a customer who is thinking of moving, but this is less than the cost of acquiring a new customer.