Pentaho 9.0 Teaser: Multicluster Enhancements

Many organizations want to run any workload from any location without the burden of rearchitecting or refactoring applications. Often, they’ll want to leverage their existing on-premises Hadoop investments and provide a seamless experience to data consumers when they migrate to the cloud to take advantage of the usability, scalability and elasticity of cloud-native solutions. Watch this video to learn more about the multicluster enhancements in Pentaho 9.0.

One billion files in Ozone

Apache Hadoop Ozone is a distributed key-value store that can manage both small and large files. Ozone was designed to address the scale limitations of HDFS with respect to small files. HDFS is designed to store large files: the recommended number of files is 300 million per Namenode, and HDFS doesn’t scale well beyond this limit.

Operational Database Availability

This blog post is part of a series on Cloudera’s Operational Database (OpDB) in CDP. Each post goes into more detail about new features and capabilities. Start from the beginning of the series with Operational Database in CDP. This blog post gives you an overview of the high availability configuration capabilities of Cloudera’s OpDB. OpDB is cluster-based software that comes configured for High Availability (HA) out of the box.

Augment EMR Workloads with CDP

The first thing that comes to mind when talking about synergy is how 2+2=5. Being the writer that he is, Mark Twain described it a lot more eloquently as “the bonus that is achieved when things work together harmoniously”. There is a multitude of product and business examples to illustrate the point, and I particularly like how car manufacturers can bring together relatively small engines to do big things.

Create custom functionalities in Keboola's Developer Portal

Every time you write another piece of code that picks up data from an FTP server, a small piece of you dies. As a developer in the data space, you know what we’re talking about. As much as 80% of your time can go to building and improving the environment and tools, maintenance tasks, and pieces of functionality. That's simply too much time taken away from tackling more important issues.