
September 2021

Migrate to CDP Private Cloud Base - A Step by Step Guide

Our recent blog discussed the four paths to get from legacy platforms to CDP Private Cloud Base. In this blog and accompanying video, we will deep dive into the mechanics of running an in-place upgrade from CDH5 or CDH6 to CDP Private Cloud Base. The overall upgrade follows a seven-step process illustrated below. In the video below we walk through a complete end to end upgrade of CDH to CDP Private Cloud Base.

Qlik Sense Enterprise SaaS - Impact Analysis

Brief overview and demo of the Impact Analysis capability available in Qlik Sense. Do you understand the origins and journey of your data sets? Do you trust where the data comes from? Are you responsible for migrations, understanding business changes and handling regulatory compliance? If so, watch this video to learn more about Qlik’s latest addition to Qlik Sense Enterprise SaaS - Impact Analysis.

Future of Data Meetup: CDP on Azure - Industrial Strength Data Engineering

Data Engineering is undergoing a huge evolution requiring faster and more reliable data pipelines. Apache Spark and Python are core foundational components of this new architecture, enabling data engineers to quickly develop these pipelines. They also introduce challenges when moving to production. Come join us to ask questions and learn. We will also have a raffle of Cloudera swag.

Serving the Public Through Data

Digital transformation has been talked about for many years, but the pandemic has accelerated the digital transformation journeys for many enterprises. Forced to adapt to changes in the business landscape and customer behavior, businesses have adopted more digital tools and technologies to drive innovation and increase resilience.

Closing the Gap Between the Digital Haves and Have-Nots

The digital race is on. To pull ahead of the pack, a company needs to know what to do with its data. Without a data-driven strategy, you’re bound to lose ground to competitors who apply their data to operational improvements, product development, go-to-market strategies, and the customer experience. It isn’t enough to collect, interpret, and act on the data. You have to do it fast.

Qlik App Automation - Brief Overview and Demo

Qlik Application Automation lets you visually assemble flows that work with market leading SaaS applications to invoke downstream processes that react to changes in your business. Consequently you spend less time programming back-office workflows and more time driving insights. Qlik Application Automation is part of our Active Intelligence vision which delivers in-the-moment awareness about every aspect of your business and helps you drive immediate actions.

Group vs Fine-Grained Access Control in Cloudera Data Platform Public Cloud

Cloudera Data Platform (CDP) provides a Shared Data Experience (SDX) for centralized data access control and audit in the Enterprise Data Cloud. The Ranger Authorization Service (RAZ) is a new service added to help provide fine-grained access control (FGAC) for cloud storage. We covered the value this new capability provides in a previous blog.

React and Respond in the Business Moment With Qlik Application Automation

Unless you’ve hidden under a rock for the past decade, you can’t have failed to notice that data in today’s enterprise is very much alive. It’s always moving, constantly changing, and we’re continually using it to create new business value. However, while data fluidity and visibility have blossomed, the opportunity to use that data to drive business actions seems to have withered in comparison.

Top 7 Talend Alternatives and Competitors

On the surface, Talend seems like the ultimate data integration platform. It's open-source, maintains multi-cloud integration, supports data governance frameworks like GDPR and CCPA, and handles both ETL and ELT, providing you with more flexibility for data management. Dig a little deeper, though, and you'll notice this platform has an outdated user interface and limited capabilities, and you'll probably need to upgrade to its enterprise version to execute data integration.

Build your data analytics skills with the latest no cost BigQuery trainings

BigQuery is a fully-managed enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and intelligent caching for business intelligence. To help you make the most of BigQuery, we’re offering the following no cost, on-demand training opportunities.

Telecom Network Analytics: Transformation, Innovation, Automation

One of the most substantial big data workloads over the past fifteen years has been in the domain of telecom network analytics. Where does it stand today? What are its current challenges and opportunities? In a sense, there have been three phases of network analytics: the first was an appliance based monitoring phase; the second was an open-source expansion phase; and the third – that we are in right now – is a hybrid-data-cloud and governance phase. Let’s examine how we got here.

Terabytes of Data but Still No Good Insights?

In our modern digital society, data is abundant, and storage is affordable. Businesses, governments and even individuals can (and do) collect every transaction, click, swipe, location, message and attribute in their datasets. With just a few clicks on my smart device, I can review data on every place I’ve been, how much I spent, every step I took, what the weather was like and who I was with. Businesses collect the same abundance of data.

Speed Up Your Data Flow for Business Results

A slow car has never won a Formula One race. The Olympics doesn’t reward slow times in swimming, track or any other clock-timed sport. Likewise, slow data speeds don’t win over customers or colleagues in the real-time business world. Microsoft’s own research once reported that a person visiting a website on a connected device is likely to wait no more than 10 seconds to see it before moving to a competitor’s site.

"So, How Do We Make This Work?" - Tracking Employee COVID Vaccination and Testing in As Little As 15 Minutes

With COVID-19’s ever-changing conditions – growing infection rates, shifting and new vaccine mandates, variant outbreaks and office closures and re-openings – HR has stepped up and taken on a significant role in helping organizations navigate every employee’s personal and work life needs. COVID-19 accelerated the evolution already underway in HR, with HR growing beyond being a policy and procedure hub into a strategic business partner.

How KlearNow went from sleepless nights to a booming data business with ThoughtSpot

Sometimes I walk through the grocery store and marvel at the way customers float through the aisles, blissfully unaware of the logistical nightmare it probably took to stock the shelves. They have no idea how many people, systems, and modes of transportation it takes to make everything magically appear on their grocery shelves. But I do. As the Senior Director of Software Engineering at KlearNow, I spend my days preserving the bliss of those grocery shoppers.

How to Make Your Data Ethical with Jack Berkowitz at ADP | Rise Of The Data Cloud

Ever wonder how companies like ADP handle all of the data that they are responsible for? In this episode of Rise Of The Data Cloud, Jack Berkowitz, Chief Data Officer at ADP, talks about the importance of keeping your product simple, data sharing, applying ethics to algorithms, and much more.

What's New in CDP Public Cloud? Hive and Impala Get a Facelift

Join us LIVE to discuss what’s new in CDP Public Cloud! Don’t miss the live Q&A as we learn about the new capabilities in Cloudera Data Warehouse. See how the Impala and Hive engines get a facelift. Also watch a demo of how you can run advanced analytics at scale using a few easy steps.

ThoughtSpot SpotApp for Snowflake Performance and Consumption Analytics

This SpotApp will have you up and running in minutes with search and AI-driven analytics from ThoughtSpot around your Snowflake Data Cloud performance and consumption. The SpotApp enables financial controllers to drill into credit consumption trends to proactively manage their cloud spend. And IT Ops teams will be able to dive into granular details about query performance to ensure that their data clouds are running at full speed.

What's Next in the Data Cloud with Benoit Dageville & Christian Kleinerman | Snowflake Summit 2021

From the start, Snowflake co-founders envisioned a new and unique way for companies across industries and around the globe to collaborate on data and analytics. Join Benoit Dageville, co-founder and President of Product, and Christian Kleinerman, Snowflake’s SVP of Product, as they share how the Data Cloud vision has become a reality and they unveil the latest Snowflake innovations in five key areas: connected industries, global governance, platform optimization, data programmability, and applications powered by Snowflake. You’ll see new capabilities in action and hear directly from customers and partners about what these new advancements mean for their businesses.

ThoughtSpot, ServiceNow, and Snowflake for IT Workload Management

As the developer of the leading data cloud, Snowflake generates a wealth of IT Service Management data with ServiceNow. But uncovering actionable, granular insights has been a challenge. Now, ThoughtSpot and Snowflake are empowering IT executives to answer all their questions about support ticket backlog and effort with a single pane of interactive insights in ThoughtSpot, powered by Snowflake.

ThoughtSpot, ServiceNow, and Snowflake for Business Application Management

As the developer of the leading data cloud, Snowflake relies on a number of business applications. But creating a holistic view of these applications has been a challenge, as the data is sourced from a variety of systems. By combining application data from multiple sources in the Snowflake data cloud, ThoughtSpot and Snowflake are empowering internal organizations to answer all their questions about enterprise application quality with a single pane of interactive insights in ThoughtSpot, powered by Snowflake.

ThoughtSpot, ServiceNow, and Snowflake for Operational Metrics

As the developer of the leading data cloud, Snowflake generates a wealth of operational helpdesk data with ServiceNow. ThoughtSpot and Snowflake are enabling helpdesk and operations executives to answer all their questions about operational metrics with a single pane of interactive insights in ThoughtSpot, powered by Snowflake.

Qlik Sense Insight Advisor Improvements

We cover some of the latest improvements in the Qlik Sense Insight Advisor. Insight Advisor is your intelligent assistant in Qlik Sense, providing AI-generated charts that are delivered in multiple forms using a variety of user experiences – these include field selection, keyword search and insight advisor chat. It auto-generates context-aware analyses learned from your data and search criteria and supports natural language interaction – it can also deliver more advanced analytics for users to explore.

How to Power Rapid Transformation in Financial Services with Snowflake | Snowflake Summit 2021

There has never been a greater need to rapidly transform and innovate to stay ahead of the competition. In this session, join Snowflake customers Western Union, Goldman Sachs, and FINOS as well as partner Deloitte to learn about how the Data Cloud is powering financial services firms.

Customer segmentation with Cosmo, Chief Destiny Officer

Do you ever feel like connecting with the right customer audience is just a matter of luck? We’ve met a CDO who leaves audience targeting up to chance. Cosmo, CDO is not a Chief Data Officer — he’s a Chief Destiny Officer. While we focus on data here at Talend, we’re trying to understand the 36% of business executives who say they don’t base the majority of their decisions on data.

Supercharge your Airflow Pipelines with the Cloudera Provider Package

Many customers looking at modernizing their pipeline orchestration have turned to Apache Airflow, a flexible and scalable workflow manager for data engineers. With hundreds of open source operators, Airflow makes it easy to deploy pipelines in the cloud and interact with a multitude of services on premises, in the cloud, and across cloud providers for a true hybrid architecture.

In the Quest for Success, Never Stop Being Curious With Data

My whole life I’ve been curious. You have to be, to become an entrepreneur. I’m curious about trends, about looking at data and finding patterns, which might show you where the next opportunity lies. And, as I discussed with Joe DosSantos in the latest episode of Data Brilliant, I’m a big believer in experimentation and learning by putting the data and analysis into practice.

Introducing Qlik Cloud Government - Analytics for U.S. Federal Sector

Qlik is announcing Qlik Cloud Government, our SaaS solution for the U.S. federal and public sector. It is a new platform designed specifically to meet the varied needs of our customers, including the U.S. public sector, offering modern analytics built for speed, security, and scale.

Snowflake's Data Cloud for Advertising in a Cookieless World | Snowflake Summit 2021

Effective marketing and advertising is essential to driving growth, but the landscape is rapidly changing, with escalating regulatory requirements and the deprecation of third-party cookies. To succeed, businesses need to develop new, secure methods for accessing and sharing audience and engagement data. In this session with Disney, NBCUniversal (NBCU), Capgemini, and Snowflake, learn how the unique and innovative capabilities of the Data Cloud are enabling seamless data sharing without data copies or movement.

Troubleshooting Databricks

The popularity of Databricks is rocketing skyward, and it is now the leading multi-cloud platform for Spark and analytics workloads, offering fully managed Spark clusters in the cloud. Databricks is fast, and organizations generally refactor their applications when moving them to Databricks. The result is strong performance. However, as usage of Databricks grows, so does the importance of reliability for Databricks jobs, especially big data jobs such as Spark workloads. But the information you need for troubleshooting is scattered across multiple, voluminous log files.

SQL Server SSRS, SSIS packages with Google Cloud BigQuery

After migrating a Data Warehouse to Google Cloud BigQuery, ETL and Business Intelligence developers are often tasked with upgrading and enhancing data pipelines, reports and dashboards. Data teams who are familiar with SQL Server Integration Services (SSIS) and SQL Server Reporting Services (SSRS) are able to continue to use these tools with BigQuery, allowing them to modernize ETL pipelines and BI platforms after an initial data migration is complete.

Living on the Edge: How to Accelerate Your Business with Real-time Analytics

Leveraging the Internet of Things (IoT) allows you to improve processes and take your business in new directions. But it requires you to live on the edge. That’s where you find the ability to empower IoT devices to respond to events in real time by capturing and analyzing the relevant data.

How To Get True ROI From Your Account-Based Marketing (ABM)

Account-based marketing, or ABM, is more often used as targeted demand generation—not one-to-one marketing. In a 2020 study of more than 300 organizations worldwide, Forrester found that “a significant number of respondents claimed they were using an ABM approach but weren’t doing what we would consider the basics of ABM, such as working with sales.”1 ABM isn’t just about assigning one siloed team the responsibility of targeting and revealing high-potential prospects.

Operating Apache Kafka with Cruise Control

There are two big gaps in the Apache Kafka project when we think of operating a cluster. The first is monitoring the cluster efficiently and the second is managing failures and changes in the cluster. There are no solutions for these inside the Kafka project, but there are many good third-party tools for both problems. Cruise Control is one of the earliest open source tools to provide a solution for the failure management problem and, more recently, for the monitoring problem as well.

Streaming Analytics with SQL Stream Builder

SQL Stream Builder, part of Cloudera Streaming Analytics, allows developers, analysts, and data scientists to write streaming applications using industry-standard SQL. It provides an interactive experience, so the development process is quick, easy, and productive while taking advantage of Apache Flink’s streaming power. It provides an advanced materialized view engine to interface with applications, tooling, and services via REST API.

Enabling Multi-User Fine-Grained Access Control for Cloud Storage in CDP

Shared Data Experience (SDX) on Cloudera Data Platform (CDP) enables centralized data access control and audit for workloads in the Enterprise Data Cloud. The public cloud (CDP-PC) editions default to using cloud storage (S3 for AWS, ADLS-gen2 for Azure). This introduces new challenges around managing data access across teams and individual users. To solve these challenges for S3 and ADLS-gen2, Cloudera has introduced a new service — the Ranger Authorization Service (RAZ).

Cloudera and NVIDIA Help IRS Fight Fraud, Safeguard Taxpayers

Across the federal government, agencies are struggling to identify, organize, analyze, and act on troves of data. It’s a problem that leaders are working actively to tackle, but they’re in a race against immeasurable volumes of data that are continuously being generated in stores known and unknown. At the Internal Revenue Service, decades’ worth of data exceeds even the most cutting-edge processing capabilities.

Spark Troubleshooting Solutions - DataOps, Spark UI or logs, Platform or APM Tools

Spark is known for being extremely difficult to debug. But this is not all Spark’s fault. Problems in running a Spark job can be the result of problems with the infrastructure Spark is running on, inappropriate configuration of Spark, Spark issues, the currently running Spark job, other Spark jobs running at the same time – or interactions among these layers.

Ad agencies choose BigQuery to drive campaign performance

Advertising agencies are faced with the challenge of providing the precision data that marketers require to make better decisions at a time when customers’ digital footprints are rapidly changing. They need to transform customer information and real-time data into actionable insights to inform clients what to execute to ensure the highest campaign performance.

Data And The Music Industry | Rise Of The Data Cloud

Ever wondered how data is changing the music industry? In this episode, Moin Haque, SVP of Architecture and Engineering, and Vlad Barkov, VP of Data Architecture & Engineering at Warner Music Group, discuss the transformation of the music industry during the pandemic, choosing the right business partners, making data independent, and much more.

Optimizing your BigQuery incremental data ingestion pipelines

When you build a data warehouse, an important question is how to ingest data from the source system into the data warehouse. If a table is small, you can fully reload it on a regular basis; if the table is large, a common technique is to perform incremental table updates. This post demonstrates how you can enhance incremental pipeline performance when you ingest data into BigQuery.
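To make the idea of an incremental table update concrete, here is a minimal sketch in plain Python. It is not taken from the post and does not use any BigQuery API: the `incremental_update` function, the `updated_at` column, and the sample rows are all hypothetical, and the warehouse table is stood in for by a dictionary keyed by primary key. In BigQuery itself, the same upsert is typically expressed as a SQL `MERGE` statement.

```python
from datetime import datetime


def incremental_update(target, source_rows, last_watermark):
    """Upsert only rows modified after last_watermark; return the new watermark."""
    new_watermark = last_watermark
    for row in source_rows:
        if row["updated_at"] > last_watermark:
            # Insert a new key or overwrite the existing row for this key.
            target[row["id"]] = row
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical warehouse table, keyed by primary key.
warehouse = {
    1: {"id": 1, "amount": 10, "updated_at": datetime(2021, 9, 1)},
    3: {"id": 3, "amount": 99, "updated_at": datetime(2021, 8, 30)},
}

# Hypothetical source extract: one changed row, one new row, one unchanged row.
source = [
    {"id": 1, "amount": 15, "updated_at": datetime(2021, 9, 3)},
    {"id": 2, "amount": 7, "updated_at": datetime(2021, 9, 4)},
    {"id": 3, "amount": 99, "updated_at": datetime(2021, 8, 30)},
]

# Only rows changed after the previous load's watermark are applied.
watermark = incremental_update(warehouse, source, datetime(2021, 9, 1))
```

The returned watermark would be stored and passed in on the next run, so each load touches only the rows that changed since the previous one instead of reloading the whole table.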

Supporting Transformation with an Integrated Data Platform. Three Common Questions Answered.

In recent years there has been increased interest in how to safely and efficiently extend enterprise data platforms and workloads into the cloud. CDOs are under increasing pressure to reduce costs by moving data and workloads to the cloud, similar to what has happened with business applications during the last decade. Our upcoming webinar is centered on how an integrated data platform supports the data strategy and goals of becoming a data-driven company.

Early-stage growth: Why shifting the founder mindset is critical to acquiring your first 10 customers

Growth. It’s the mountain every startup founder must learn to climb in order to run a successful business. And as with any great mountain, the journey to the top never feels more daunting than at the base. How your startup earns its first 10 customers will set the tone for the rest of the trek and determine how fast your team reaches the summit — if at all.

The role of a CDO with Cosmo, Chief Destiny Officer

Have you ever wished you had a crystal ball? We tracked down a CDO who actually uses one. See, Cosmo, CDO is not a Chief Data Officer — he’s a Chief Destiny Officer. We’re all about data at Talend, but sometimes it’s good to see things from another perspective. We sat down with Cosmo to ask him about his job, his background, and his methods.

Spectacular growth: Beaumotica accelerates expansion with data-driven insights from Talend

Beaumotica combines smart lighting, design, and top brands to create the perfect mood and atmosphere for any room. And with help from Talend, the company can now combine data, analytics, and automation to optimize business decisions and accelerate growth. Last year alone the company tripled its business and expanded into new territories across Europe. Based in The Netherlands, Beaumotica has been growing steadily since 2007.

With Stitch, Simba is losing no sleep over aggressive growth plans

“If we didn’t have Stitch, we would have to recruit and hire data engineers, buy space for hundreds of millions of rows that we’re sinking into the database, and on and on. For us, Stitch is essential.” –Tomasz Eitner, BI and Data Analyst, Simba Sleep

Simba Sleep has always been a data-driven company. Before the firm was even formally launched, the founders purchased research profiles from more than 10 million sleepers, including 180 million body profile data points.

Our reflections on the 2021 Gartner Magic Quadrant for Data Integration Tools

“The data integration tool market is seeing renewed momentum, driven by requirements for hybrid and multi-cloud data integration, augmented data management, and data fabric designs.” This is what Gartner assesses in its latest Magic Quadrant for Data Integration Tools* report. And that assessment makes perfect sense. Data is the lifeblood of an organization.

Optimizing Cloudera Data Engineering Autoscaling Performance

The shift to cloud has been accelerating, and with it, a push to modernize data pipelines that fuel key applications. That is why cloud native solutions that take advantage of capabilities such as disaggregated storage and compute, elasticity, and containerization are more important than ever. At Cloudera, we introduced Cloudera Data Engineering (CDE) as part of our Enterprise Data Cloud product, Cloudera Data Platform (CDP), to meet these challenges.

Migrating Data Pipelines from Enterprise Schedulers to Airflow

At Airflow Summit 2021, Unravel’s co-founder and CTO, Shivnath Babu, and Senior Software Engineer Hari Nyer delivered a talk titled Lessons Learned while Migrating Data Pipelines from Enterprise Schedulers to Airflow. This story, along with the slides and videos included in it, comes from the presentation.

How to load Salesforce data into BigQuery using a code-free approach powered by Cloud Data Fusion

Organizations are increasingly investing in modern cloud warehouses and data lake solutions to augment analytics environments and improve business decisions. The business value of such repositories increases as customer relationship data is loaded and additional insights are generated.

BigQuery Admin reference guide: Recap

Over the past few weeks, we have been publishing videos and blogs that walk through the fundamentals of architecting and administering your BigQuery data warehouse. Throughout this series, we have focused on teaching foundational concepts and applying best practices observed directly from customers. Below, you can find links to each week’s content: Query Processing: Ever wonder what happens when you click “run” on a new BigQuery query?

Dimagi implements Passerelle Data Rocket to accelerate state and local COVID-19 response

Frontline healthcare providers don’t always have access to the latest and greatest technology. But when they are trying to fight a global pandemic with pen-and-paper tracking systems, something has to change. Dimagi is a tech company on a mission: to deliver scalable digital solutions for organizations to amplify their frontline impact.