October 2022

Protect Your Assets and Your Reputation in the Cloud

Oct 28, 2022 By Brian Lachance In Cloudera

A recent headline in Wired magazine read “Uber Hack’s Devastation Is Just Starting to Reveal Itself.” No corporation wants that headline, or the reputational damage and financial loss it may cause. In Uber’s case, it was a relatively simple attack using an approach called multi-factor authentication (MFA) fatigue, in which an attacker takes advantage of authentication systems that require account owners to approve a login.

Using Apache Solr REST API in CDP Public Cloud

Oct 27, 2022 By Máté Szalay-Bekő In Cloudera

An Apache Solr cluster is available in CDP Public Cloud through the “Data exploration and analytics” data hub template. In this article, we investigate how to connect to the Solr REST API running in the Public Cloud and highlight the performance impact of session cookie configurations when Apache Knox Gateway is used to proxy traffic to the Solr servers. The information in this blog post can be useful for engineers developing Apache Solr client applications.

Future of Data Meetup: Enrich Your Data Inline with Apache NiFi

Oct 27, 2022 By Cloudera In Cloudera

In this meetup, we’ll look at the different options for enriching your data using Apache NiFi. When and why would we prefer using NiFi for enrichment over a potentially more holistic solution, like Flink or Spark? What are the limitations? And how can we get the best of both worlds, performing data enrichment with NiFi when it makes sense and using our CEP engine when that makes the most sense? Join John Kuchmek and Mark Payne to find out!

Technology Spotlight: Applied ML Prototypes

Oct 27, 2022 By Cloudera In Cloudera

Paul Codding introduces Cloudera's Applied ML Prototypes to accelerate machine learning applications in business.

Accelerating Projects in Machine Learning with Applied ML Prototypes

Oct 26, 2022 By Paul Codding In Cloudera

It’s no secret that advancements like AI and machine learning (ML) can have a major impact on business operations. In Cloudera’s recent report Limitless: The Positive Power of AI, we found that 87% of business decision makers are achieving success through existing ML programs. Among the top benefits of ML, 59% of decision makers cite time savings, 54% cite cost savings, and 42% believe ML enables employees to focus on innovation as opposed to manual tasks.

10 Keys to a Secure Cloud Data Lakehouse

Oct 25, 2022 By Brian Lachance In Cloudera

Enabling data and analytics in the cloud allows you to have infinite scale and unlimited possibilities to gain faster insights and make better decisions with data. The data lakehouse is gaining in popularity because it enables a single platform for all your enterprise data with the flexibility to run any analytic and machine learning (ML) use case. Cloud data lakehouses provide significant scaling, agility, and cost advantages compared to cloud data lakes and cloud data warehouses.

Reskilling Against the Risk of Automation

Oct 24, 2022 By Abhas Ricky In Cloudera

Demand for both entry-level and highly skilled tech talent is at an all-time high, and companies across industries and geographies are struggling to find qualified employees. And, with 1.1 billion jobs liable to be radically transformed by technology in the next decade, a “reskilling revolution” is reaching a critical mass.

dbt on Cloudera Data Platform

Oct 21, 2022 By Cloudera In Cloudera

In this demo, we show how an analyst who knows only SQL can work independently to create sophisticated data transformation pipelines without the need for any engineering. Our CDP deployment simplifies all aspects of the software development lifecycle of dbt models.

Cybersecurity: A Big Data Problem

Oct 20, 2022 By Rob Carey In Cloudera

Information technology has been at the heart of governments around the world, enabling them to deliver vital citizen services, such as healthcare, transportation, employment, and national security. All of these functions rest on technology and share a valuable commodity: data. Data is produced and consumed in ever-increasing amounts and therefore must be protected. After all, we believe everything that we see on our computer screens to be true, don’t we?

Public or On-Prem? Telco giants are optimizing the network with the Hybrid Cloud

Oct 19, 2022 By Anthony Behan In Cloudera

The telecommunications industry continues to develop hybrid data architectures to support data workload virtualization and cloud migration. However, while the promise of the cloud remains essential, not just for data workloads but also for network virtualization and B2B offerings, the sheer volume and scale of data in the industry require careful management of the “journey to the cloud.”

Using Kafka Connect Securely in the Cloudera Data Platform

Oct 19, 2022 By Laszlo Hunyady In Cloudera

In this post, I will demonstrate how Kafka Connect is integrated in the Cloudera Data Platform (CDP), allowing users to manage and monitor their connectors in Streams Messaging Manager, while also touching on security features such as role-based access control and sensitive information handling. If you are a developer moving data in or out of Kafka, an administrator, or a security expert, this post is for you. But before I get into the nitty-gritty, let’s start with the basics.

Cloudera Uses CDP to Reduce IT Cloud Spend by $12 Million

Oct 18, 2022 By Dániel Omaisz-Takács In Cloudera

Like all of our customers, Cloudera depends on the Cloudera Data Platform (CDP) to manage our day-to-day analytics and operational insights. Many aspects of our business live within this modern data architecture, providing all Clouderans the ability to ask, and answer, important questions for the business. Clouderans continuously push for improvements in the system, with the goal of driving up confidence in the data.

Universal Data Distribution with Cloudera DataFlow for the Public Cloud

Oct 13, 2022 By Cloudera In Cloudera

The speed at which you move data throughout your organization can be your next competitive advantage. Cloudera DataFlow greatly simplifies your data flow infrastructure, facilitating complex data collection and movement through a unified process that seamlessly transfers data throughout your organization, even as you scale. With Cloudera DataFlow for Public Cloud, you can collect and move any data (structured, unstructured, and semi-structured) from any source to any destination with any frequency (real-time streaming, batch, and micro-batch).

AI at Scale isn't Magic, it's Data - Hybrid Data

Oct 11, 2022 By David Moxey In Cloudera

A recent VentureBeat article, “4 AI trends: It’s all about scale in 2022 (so far),” highlighted the importance of scalability. I recommend you read the entire piece, but to me the key takeaway – AI at scale isn’t magic, it’s data – is reminiscent of the 1992 presidential election, when political consultant James Carville succinctly summarized the key to winning: “it’s the economy.”

What's new in CDP Private Cloud Base 7.1.8

Oct 11, 2022 By Cloudera In Cloudera

CDP Private Cloud Base 7.1.8 is here! This marks the next wave of Cloudera innovation on-premises for CDP. In this live stream, we’ll go through what’s in our latest release and highlight some of the exciting new features we’ve made available.

Cloudera's Open Data Lakehouse Supercharged with dbt Core(tm)

Oct 7, 2022 By Raghotham Murthy In Cloudera

dbt allows data teams to produce trusted data sets for reporting, ML modeling, and operational workflows using SQL, with a simple workflow that follows software engineering best practices like modularity, portability, and continuous integration/continuous delivery (CI/CD).

Does Cost Reduction Play a Role in Digital Transformation?

Oct 6, 2022 By Joe Rodriguez In Cloudera

Digital transformation. Everyone has their own ideas about what digital transformation means, so I decided to look up a few definitions.

Scaling Kafka Brokers in Cloudera Data Hub

Oct 4, 2022 By Tamas Barnabas Egyed In Cloudera

This blog post provides guidance to administrators who use, or are interested in using, Kafka nodes and need to manage cluster changes as they scale up or down to balance performance and cloud costs in production deployments. Kafka brokers contained within host groups enable administrators to add and remove nodes more easily. This creates flexibility to handle real-time data feed volumes as they fluctuate.

How to Distribute Machine Learning Workloads with Dask

Oct 3, 2022 By Jacob Bengtson In Cloudera

Tell us if this sounds familiar. You’ve found an awesome data set that you think will allow you to train a machine learning (ML) model that will accomplish the project goals; the only problem is the data is too big to fit in the compute environment that you’re using. In the day and age of “big data,” most might think this issue is trivial, but like anything in the world of data science things are hardly ever as straightforward as they seem.
