


Once Upon a Time in the Land of Data

I recently had the privilege of attending the CDAO event in Boston hosted by Corinium. Tracks represented financial services, insurance, retail and consumer packaged goods, and healthcare. Overall, it struck me that while data science is not new, most firms are still defining the mission of the data office and data officer. It’s clear firms seek to leverage data and embrace its potential insights, but most are forging ahead in largely uncharted territory.


Ozone Write Pipeline V2 with Ratis Streaming

Cloudera has been working on Apache Ozone, an open-source project to develop a highly scalable, highly available, strongly consistent distributed object store. Ozone can scale to billions of objects and hundreds of petabytes of data. It enables cloud-native applications to store and process massive amounts of data in hybrid multi-cloud environments and on premises.

Modern Data Architectures | Data Mesh, Data Fabric, & Data Lakehouse

For years, companies have viewed data the wrong way. They see it as the byproduct of a business interaction, and this data often ends up collecting dust in centralized silos governed by data teams who lack the expertise to understand its true value. Cloudera is ushering in a new era of data architecture by allowing experts to organize and manage their own data at the source. Data mesh brings all your domains together so each team can benefit from the others' data.

When Private Cloud is the Right Fit for Public Sector Missions

It’s no secret that IT modernization is a top priority for the US federal government. A quick trip in the congressional time machine to revisit 2017’s Modernizing Government Technology Act surfaces some of the most salient points regarding agencies’ challenges: In the private sector, excluding highly regulated industries like financial services, the migration to the public cloud was the answer to most IT modernization woes, especially those around data, analytics, and storage.


Protect Your Assets and Your Reputation in the Cloud

A recent headline in Wired magazine read “Uber Hack’s Devastation Is Just Starting to Reveal Itself.” No corporation wants that headline, or the reputational damage and financial loss it may cause. In Uber's case, it was a relatively simple attack using an approach called multi-factor authentication (MFA) fatigue, in which an attacker takes advantage of authentication systems that require account owners to approve a login.


Using Apache Solr REST API in CDP Public Cloud

An Apache Solr cluster is available in CDP Public Cloud through the “Data exploration and analytics” Data Hub template. In this article we will investigate how to connect to the Solr REST API running in the Public Cloud, and highlight the performance impact of session cookie configurations when Apache Knox Gateway is used to proxy traffic to the Solr servers. The information in this blog post may be useful to engineers developing Apache Solr client applications.

Future of Data Meetup: Enrich Your Data Inline with Apache NiFi

In this meetup, we’ll look at the different options for enriching your data using Apache NiFi. When and why would we prefer using NiFi for enrichment over a potentially more holistic solution, like Flink or Spark? What are the limitations? And how can we get the best of both worlds, performing data enrichment with NiFi when it makes sense and using our CEP engine when that makes the most sense? Join John Kuchmek and Mark Payne to find out!