How do I move data from MySQL to BigQuery?

In a market where streaming analytics is growing in popularity, it’s critical to optimize data processing so you can reduce costs and ensure data quality and integrity. One approach is to work only with data that has changed instead of all available data. Change data capture (CDC) is the technique that makes this possible.
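To make the idea of working only with changed data concrete, here is a minimal sketch that compares two snapshots of a table and emits only the differences. It is an illustration under stated assumptions, not how the post itself moves data: the `diff_snapshots` function, the `Row` alias, and the sample rows are all hypothetical, and production CDC for MySQL typically reads changes from the binary log rather than diffing snapshots.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical row type: column name -> value.
Row = Dict[str, Any]
# A change event: (operation, primary key, row contents).
Change = Tuple[str, int, Row]


def diff_snapshots(previous: Dict[int, Row], current: Dict[int, Row]) -> List[Change]:
    """Return the change events that turn `previous` into `current`.

    Both snapshots are keyed by primary key. This is a simplified stand-in
    for CDC, used only to show the "changed rows only" idea.
    """
    changes: List[Change] = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key, row in previous.items():
        if key not in current:
            changes.append(("delete", key, row))
    return changes


# Made-up sample data for illustration.
previous = {
    1: {"name": "Ada", "plan": "free"},
    2: {"name": "Bob", "plan": "pro"},
    3: {"name": "Cy", "plan": "free"},
}
current = {
    1: {"name": "Ada", "plan": "pro"},   # changed
    2: {"name": "Bob", "plan": "pro"},   # unchanged, so not emitted
    4: {"name": "Dee", "plan": "free"},  # new
}

changes = diff_snapshots(previous, current)
for operation, key, row in changes:
    print(operation, key, row)
```

Only three of the rows produce events here; the unchanged row is skipped, which is the saving CDC aims for when most of a table stays the same between loads.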

Supercharge ML models with Distributed XGBoost on CML

Since childhood, we’ve been taught about the power of coalitions: working together to achieve a shared objective. In nature, we see this repeated frequently – swarms of bees, ant colonies, prides of lions – well, you get the idea. It is no different when it comes to machine learning models. Research and practical experience show that groups, or ensembles, of models often do much better than a single silver-bullet model. Intuitively, this makes sense.
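A small calculation shows why the intuition holds. The sketch below assumes an odd number of classifiers that are each correct with the same probability and make independent errors, an idealization that real models only approximate. The function name `majority_vote_accuracy` is hypothetical, and majority voting is used here only as the simplest ensemble, not as a description of how XGBoost combines its trees.

```python
from math import comb


def majority_vote_accuracy(n_models: int, p_correct: float) -> float:
    """Probability that a majority of n independent models is correct.

    Assumes n_models is odd (no ties) and that every model is correct with
    probability p_correct, independently of the others.
    """
    if n_models % 2 == 0:
        raise ValueError("use an odd number of models to avoid ties")
    majority = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct) ** (n_models - k)
        for k in range(majority, n_models + 1)
    )


single = majority_vote_accuracy(1, 0.7)    # one model on its own
ensemble = majority_vote_accuracy(3, 0.7)  # three models voting
print(f"single model: {single:.3f}, three-model vote: {ensemble:.3f}")
```

Under these assumptions, three models that are each right 70% of the time are right about 78.4% of the time when they vote. The gain shrinks as the models' errors become correlated, which is why ensembles benefit from diverse members.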

Operational Database Administration

This blog post is part of a series on Cloudera’s Operational Database (OpDB) in CDP. Each post goes into more detail about new features and capabilities. Start from the beginning of the series with Operational Database in CDP. This post gives you an overview of the OpDB administration tools and features in the Cloudera Data Platform.

Benchmarking NiFi Performance and Scalability

Ever wonder how fast Apache NiFi is? Ever wonder how well NiFi scales? When a customer is looking to use NiFi in a production environment, these are usually among the first questions asked. They want to know how much hardware they will need, and whether or not NiFi can accommodate their data rates. This isn’t surprising. Today’s world consists of ever-increasing data volumes. Users need tools that make it easy to handle these data rates.

Yellowfin 9.1 Release

With Yellowfin 9, we introduced an incredibly flexible, action-based dashboard builder and progressive data storytelling capabilities that advance the dashboard experience. We’ve received great feedback since then, and this month’s newly released 9.1 further enhances the experience of analysts, developers, and business users across Yellowfin’s action-based dashboards, data storytelling, and data discovery products.

Enabling Olympic-level performance and productivity for Delta Lake on Databricks

Recently, Databricks introduced Delta Lake, a new analytics platform that combines the best elements of data lakes and data warehouses in a paradigm it calls a “lakehouse.” Delta Lake expands the breadth and depth of use cases that Databricks customers can enjoy. Databricks offers a unified analytics platform with robust support for use cases ranging from simple reporting to complex data warehousing to real-time machine learning.

Supermarkets Optimizing Supply Chains with Unravel DataOps

Retailers are using big data to report on consumer demand, inventory availability, and supply chain performance in real time. Big data gives retail organizations a convenient way to quickly ingest petabytes of data and apply machine learning techniques to move consumer goods efficiently. A top supermarket retailer has recently used Unravel to monitor its vast trove of customer data so it can stock the right product for the right customer at the right time.