January 2020

Cloudera Data Warehouse - What You Should Know

Cloudera Data Warehouse is just one of the many experiences you can use on the Cloudera Data Platform (CDP). Cloudera Data Warehouse packages projects you may already know and use, such as Impala and Hive, into a service. This service runs on Kubernetes, which gives it the ability to pause, resume, and scale up or down quickly and automatically.

How Cloudera Enables R Users to Optimize Their Data Science and Machine Learning Workflows

This week, R users from around the world convene in San Francisco for rstudio::conf 2020. With a packed agenda of new package announcements and case studies highlighting successful applications of R across different industries, it’s evident that R and the ecosystem of tools around it make up a vital part of the data science and machine learning landscape.

Deep Learning for Anomaly Detection

We are excited to release Deep Learning for Anomaly Detection, the latest applied machine learning research report from Cloudera Fast Forward Labs. Anomalies, often referred to as outliers, are data points or patterns in data that do not conform to a notion of normal behavior. Anomaly detection, then, is the task of finding those patterns in data that do not adhere to expected norms. The capability to recognize or detect anomalous behavior can provide highly useful insights across industries.

Understanding Healthcare's New Industry Imperative: Data Chain of Custody

One of the first recorded medical devices was the stethoscope in 1816. Fast forward two centuries to 2019, when the world witnessed the creation of an award-winning multi-sensor, implantable cardiac device able to predict potential heart failure weeks in advance. The data and analytics streamed and analyzed from new connected devices are transforming healthcare as we know it. However, a real challenge in this environment is the sheer volume and scope of data that must be managed and protected.

Insurance in 2020 & Beyond - Learning from the past decade to plan for the next

Like many other people, I used time over the recent holidays to clean out and organize my digital files. In that process, I finally trashed the speaking notes for a panel I participated in at SMA’s (Strategy Meets Action) first summit in 2012 when I worked at a large global insurer. During that session, a gentleman in the audience asked me what I thought about “big data” and its implications for Insurance.

Real-time log aggregation with Flink Part 1

Many of us have experienced the feeling of hopelessly digging through log files on multiple servers to fix a critical production issue. We can probably all agree that this is far from ideal. Locating and searching log files is even more challenging when dealing with real-time processing applications where the debugging process itself can be extremely time-sensitive.

These Two Trends Will Put an End to Business as Usual in 2020

Where did the last decade go? It seems like it was just 2010 and I was writing about the future of business in 2020. Well, it is now here! I’ve spent much of my career in finance/accounting and management consulting, and the last decade-plus helping companies link their business and technology strategies with a focus on data and analytics. Where will we head in 2020 and this next decade?

Announcing support for Apache Flink with the GA of Cloudera Streaming Analytics

We cannot contain our excitement any longer! For the last few months, our Data-in-Motion engineering teams have been working hard to deliver a compelling and critical part of our Cloudera DataFlow (CDF) story. To enhance our Stream Processing and Analytics narrative within the overall Data-in-Motion platform, we give you support for Apache Flink with the general availability of Cloudera Streaming Analytics (CSA).

Updated Cloudera Manager Tour

Cloudera Manager's look has been updated with the arrival of the Cloudera Data Platform. Although CDP is largely configured and controlled through the Control Plane, there are still some options available to you in Cloudera Manager when working with an Environment or in a Data Hub cluster. This quick tour of the different views and menus should help you orient yourself in the new layout.

Placing the Emphasis on Data in the Federal Data Strategy

In mid-June of 2019, the White House Office of Management and Budget (OMB) released the Draft 2019-2020 Federal Data Strategy Action Plan. The plan outlines a series of steps and principles targeting effective governance, responsibilities and best practices for federal agencies’ use of citizen data. When put into place, these action items will allow government agencies to maximize data, improve security and better serve constituents.

How Scania is Driving Logistical Efficiency and Sustainability with Big Data

Organizations in the transportation and manufacturing industries are applying Industrial IoT concepts and technology to transform product development, supply chains, and manufacturing operations. Scania is driving logistical efficiency and sustainability with big data. Scania is a world-leading provider of transport solutions and is leading the shift towards sustainable transport systems. In 2018 it delivered 88,000 trucks, 8,500 buses as well as 12,800 industrial and marine engines to customers.

Introducing Apache Spark on Docker on top of Apache YARN with CDP DataCenter release

Bringing your own libraries to run a Spark job on a shared YARN cluster can be a huge pain. In the past, you had to install the dependencies independently on each host or use various Python package management tools. Nowadays, Docker provides a much simpler way of packaging and managing dependencies, so users can easily share a cluster without running into each other or waiting for central IT to install packages on every node.

Three Trends in Cloud Computing to Expect in 2020

A new year is upon us and that means it’s time to look ahead to what’s coming next. In cloud computing, organizations are going to be making adjustments in 2020 – to accommodate overstrained budgets, new regulations, and shifting technologies. It will be a year of identifying what’s not working and moving toward the right solutions. Let’s take a look at three trends that will impact cloud computing across all industries in the coming year.