Bridging the Gap Between Technology and Business | Part 1 | Snowflake Inc.

In this episode, Florian Douetteau, CEO of Dataiku, answers the question "What is Deep Learning?", explains how his company evolved from machine learning to deep learning methodology, and provides examples of how Dataiku uses it to create predictive models. Rise of the Data Cloud is brought to you by Snowflake.

How-to: Index Data from S3 via NiFi Using CDP Data Hubs

Data Discovery and Exploration (DDE) was recently released in tech preview in Cloudera Data Platform in public cloud. In this blog we will go through the process of indexing data from S3 into Solr in DDE with the help of NiFi in Data Flow. The scenario is the same as it was in the previous blog but the ingest pipeline differs. Spark as the ingest pipeline tool for Search (i.e.

COVID-19, the Data Deluge and Optimizing Splunk for Time and Cost

The new normal has changed the way we work and the way we conduct business. More and more employees are working from home, customers are shopping online, and everyone’s phone is still attached to their ears. Bottom line: everything we’re doing in business and in our personal lives is leaving a digital trail. In fact, now devices are getting in the game and creating more data than people, 277 times more, according to Cisco.

What Grocers and CPG Companies Need to Know About Post-Pandemic Shopping

The COVID-19 pandemic has changed nearly everything. It’s affected nearly all Americans, and as such, it’s impacted every organization they interact with, both B2C and B2B. One industry that has had its operations turned upside down is the grocery industry. Grocery stores and their consumer packaged goods (CPG) suppliers and partners had to improvise and adapt nearly overnight to accommodate the changing demands of shoppers.

Validating Jet Engine Predictive Models Using Cloudera Machine Learning

In this video, we’ll go over how to use Cloudera Machine Learning (CML) to validate a complex predictive model. Using a publicly available NASA dataset that simulates how jet engines degrade over time, we’ll use machine learning concepts in a cloud environment to go from simulation data to a cost benefit analysis in just a few steps. We’ll also show how we can run experiments to track specific metrics from many different scenarios that our predictive model could possibly be implemented in.

Redivis makes research data accessible, experiences collaborative with BigQuery

Understanding the data we collect is essential—it allows us to identify trends and uncover answers about our world. However, stories in our data frequently go untold. Large datasets are hard to share between research communities due to their size, security constraints, and complexity. Even if these datasets are accessible to users, the tools needed to query them often require deep technical knowledge.

Smile with new user-friendly SQL capabilities in BigQuery

October is the month of World Smile Day, which Harvey Ball, the inventor of the smiley face, declared to give people a reason to smile. This month, BigQuery users have plenty of new reasons to smile with the release of new user-friendly SQL capabilities, now generally available.

Using Cloudera Machine Learning to Build a Predictive Maintenance Model for Jet Engines

Running a large commercial airline requires the complex management of critical components, including fuel futures contracts, aircraft maintenance and customer expectations. Airlines in the U.S. alone average about 45,000 daily flights and transport over 10 million passengers a year (source: FAA). Airlines typically operate on very thin margins, and any schedule delay immediately angers or frustrates customers.

Apache Spark on Kubernetes: How Apache YuniKorn (Incubating) helps

Apache Spark unifies batch processing, real-time processing, stream analytics, machine learning, and interactive query in one platform. While Apache Spark provides a lot of capabilities to support diverse use cases, it comes with additional complexity and high maintenance costs for cluster administrators. Let’s look at some of the high-level requirements for the underlying resource orchestrator to empower Spark as one platform.