One billion files in Ozone

Apache Hadoop Ozone is a distributed key-value store that can manage both small and large files. Ozone was designed to address the scale limitations of HDFS with respect to small files. HDFS is designed to store large files; the recommended limit is 300 million files per Namenode, and HDFS doesn’t scale well beyond it.

Operational Database Availability

This blog post is part of a series on Cloudera’s Operational Database (OpDB) in CDP. Each post goes into more detail about new features and capabilities. Start from the beginning of the series with Operational Database in CDP. This blog post gives you an overview of the high availability configuration capabilities of Cloudera’s OpDB. Cloudera’s Operational Database (OpDB) is cluster-based software that comes configured for High Availability (HA) out of the box.

Augment EMR Workloads with CDP

The first thing that comes to mind when talking about synergy is how 2+2=5. Being the writer that he is, Mark Twain described it a lot more eloquently as “the bonus that is achieved when things work together harmoniously”. There is a multitude of product and business examples to illustrate the point and I particularly like how car manufacturers can bring together relatively small engines to do big things.

Create custom functionalities in Keboola's Developer Portal

Every time you write another piece of code that picks up data from an FTP server, a small piece of you dies. As a developer in the data space, you know what we’re talking about. As much as 80% of your time can go to building and improving the environment and tools, maintenance tasks, and pieces of functionality. That's simply too much time diverted from tackling more important issues.

Decision Making in Uncertain Times

Leaders know that making good, fast decisions is challenging under the best of circumstances. But the trickiest decisions are those we call “big bets”: unfamiliar, high-stakes decisions. In a crisis of uncertainty such as the COVID-19 pandemic, which arrived at overwhelming speed and enormous scale, organizations face a potentially paralyzing volume of these big-bet decisions.