Operational Database Scalability

Cloudera’s Operational Database provides unparalleled scale and flexibility for applications, enabling enterprises to bring together and process data of all types and from more sources, while providing developers with the flexibility they need. In this blog, we’ll look into capabilities that make Operational Database the right choice for hyperscale.

Minimizing Cloud Concentration Risk for Financial Services Institutions, Regulators and Cloud Service Providers

Since the financial crisis of 2008, regulators have been consistently working to identify emerging risks that can potentially result in financial stability events. The growth in cloud adoption across the Financial Services Industry (FSI) and the associated increase in reliance on third-party infrastructure providers has gained the attention of regulators at global, regional, and national levels.

Connected Manufacturing Insights from the Edge with Cloudera DataFlow

Connected Manufacturing’s Pivot to an Enterprise Data Solution: Connected Manufacturing is at a turning point, catalyzed by a real, measurable shift in data types. Real-time and time-series data is growing 50% faster than latent or static data forms, and streaming analytics is projected to grow at a 28% CAGR, leaving legacy data platforms that specialize in static historical data solutions, functioning on-prem or in discrete clouds, inadequate in addressing today’s rea

Building an effective data approach in a hybrid cloud world

“In today’s world of disruption and transformation, there are a few key things that all organizations are trying to figure out: how to remain relevant to their customer base, how to deal with the pressure of disruption in their industry and, undoubtedly, how to look to technology to help deliver a better service.” (Paul Mackay) Today we are sitting down with Marc Beierschoder, Analytics & Cognitive Offering Lead at Deloitte Germany, and Paul Mackay, the EMEA Cloud Lead at Cloudera, to dis

CDP Private Cloud ends the battle between agility & control in the data center

As a BI Analyst, have you ever encountered a dashboard that wouldn’t refresh because other teams were using it? As a data scientist, have you ever had to wait 6 months before you could access the latest version of Spark? As an application architect, have you ever been asked to wait 12 weeks before you could get hardware to onboard a new application?

Apache Hadoop YARN in CDP Data Center 7.1: What's new and how to upgrade

This blogpost will cover how customers can migrate clusters and workloads to the new Cloudera Data Platform – Data Center 7.1 (CDP DC 7.1 onwards) plus highlights of this new release. CDP DC 7.1 is the on-premises version of Cloudera Data Platform.

Overview of the Operational Database performance in CDP

This article gives an overview of performance optimization techniques for Cloudera’s Operational Database (OpDB). Cloudera’s Operational Database can support high-speed transactions of up to 185K/second per table and a high of 440K/second per table. On average, the recorded transaction speed is about 100K-300K/second per node. The article shows how you can optimize your OpDB deployment in either Cloudera Data Platform (CDP) Public Cloud or Data Center.

Eliminate the pitfalls on your path to public cloud

As organizations look to get smarter and more agile in how they gain value and insight from their data, they are now able to take advantage of a fundamental shift in architecture. In the last decade, as an industry, we have gone from monolithic machines with direct-attached storage to VMs to cloud. The main attraction of cloud is its separation of compute and storage, a major architectural shift in the infrastructure layer that changes the way data can be stored and processed.

How to run queries periodically in Apache Hive

In the lifecycle of a data warehouse in production, there are a variety of tasks that need to be executed on a recurring basis. To name a few concrete examples, scheduled tasks can be related to data ingestion (inserting data from a stream into a transactional table every 10 minutes), query performance (refreshing a materialized view used for BI reporting every hour), or warehouse maintenance (executing replication from one cluster to another on a daily basis).