Generating and Viewing Lineage through Apache Ozone

As businesses look to scale out storage, they need a storage layer that is performant, reliable and scalable. With Apache Ozone on the Cloudera Data Platform (CDP), they can implement a scale-out model and build out their next-generation storage architecture without sacrificing security, governance and lineage. CDP integrates its existing Shared Data Experience (SDX) with Ozone for an easy transition, so you can begin using object storage on-prem.

S&P Global Provides Instant Access to Curated Data

Snowflake connected with David Coluccio from S&P Global Market Intelligence at the Snowflake Data Cloud Tour to hear how the company is using the Snowflake Data Cloud to curate massive amounts of data and provide seamless access for its clients. S&P Global’s foundation is rooted in providing essential insights to make more-informed decisions.

New Snowflake Features Released in June and July 2021

Building on the announcements made at this year’s Summit, Snowflake has released a number of new enhancements, especially in the areas of data programmability, global governance, and data sharing. Read on to learn more. For additional details and to see some of these new capabilities in action, be sure to check out the on-demand sessions from Summit.

The Foundations of a Modern Data-Driven Organisation: Gaining a Clear View of the Customer

Today’s organizations face rising customer expectations in a fragmented marketplace amidst stiff competition. This landscape is one that presents opportunities for a modern data-driven organization to thrive. At the nucleus of such an organization is the practice of accelerating time to insights, using data to make better business decisions at all levels and roles.

Spark Troubleshooting, Part 1 - Ten Challenges

“The most difficult thing is finding out why your job is failing, which parameters to change. Most of the time, it’s OOM errors…” Jagat Singh, Quora

Spark has become one of the most important tools for processing data – especially non-relational data – and deriving value from it. And Spark serves as a platform for the creation and delivery of analytics, AI, and machine learning applications, among others.

Choosing Your Upgrade or Migration Path to Cloudera Data Platform

In our previous blog, we talked about the four paths to Cloudera Data Platform. If you haven’t read that yet, we invite you to take a moment and run through the scenarios in that blog. The four strategies will be relevant throughout the rest of this discussion. Today, we’ll discuss an example of how you might make this decision for a cluster using a “round of elimination” process based on our decision workflow.

Four Frameworks for Optimizing Cloud Strategy and Deployment

“40% of all enterprise workloads will be deployed in CIPS [cloud infrastructure and platform services] by 2023, up from only 20% in 2020.” As the cloud permeates every aspect of business, decision-makers must make critical choices regarding infrastructure at every turn. Their answers will ultimately determine if every part of an organization is empowered to move forward in a cohesive way to reach business outcomes.

Strategies for Optimizing Your BigQuery Queries

Did you know that optimizing SQL queries can lower your costs? In this episode of BigQuery Spotlight, we discuss some strategies for optimizing your BigQuery queries. We’ll walk through what happens behind the scenes for more complex queries, and show you specific tactics you can use to optimize your SQL. Watch to learn some great techniques for making your queries more performant!