September 2021

How Xplenty Helps Employers Keep Track of Time

Up to 20% of all employees in the United States are regularly late for work, costing organizations like yours billions of dollars a year. The solution? Timesheets, which monitor employee hours, tardiness, performance, and absenteeism. These documents also collect valuable data for payroll, accounting, billing, and project management. So they're critical for human resource teams everywhere. But there's a problem or two.

Building Machine Learning Pipelines with Real-Time Feature Engineering

Real-time feature engineering is valuable for a variety of use cases, from service personalization to trade optimization to operational efficiency. It can also be helpful for risk mitigation through fraud prediction, by enabling data scientists and ML engineers to harness real-time data, perform complex calculations in real time and make fast decisions based on fresh data, for example to predict credit card fraud before it occurs.
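As a rough illustration of what "complex calculations in real time" can look like, here is a minimal sketch of one common real-time feature: a rolling transaction count and sum per card over a short time window. The class name, the feature names and the 60-second window are assumptions made for this example and do not come from the article.

```python
from __future__ import annotations

from collections import deque


class SlidingWindowFeatures:
    """Rolling count and sum of amounts over the last `window_seconds` (hypothetical helper)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, amount) pairs, oldest first
        self._total = 0.0

    def update(self, timestamp: float, amount: float) -> dict[str, float]:
        """Add one event and return the features as of `timestamp`."""
        self._events.append((timestamp, amount))
        self._total += amount
        # Evict events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_amount = self._events.popleft()
            self._total -= old_amount
        return {"txn_count": len(self._events), "txn_sum": self._total}


card_features = SlidingWindowFeatures(window_seconds=60)
card_features.update(0, 10.0)
card_features.update(30, 5.0)
latest = card_features.update(70, 1.0)  # the event at t=0 has expired by now
```

A fraud model could read `latest` at scoring time; a sudden jump in `txn_count` within the window is the kind of signal this style of feature is meant to surface.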

Migrate to CDP Private Cloud Base - A Step by Step Guide

Our recent blog discussed the four paths to get from legacy platforms to CDP Private Cloud Base. In this blog and accompanying video, we will deep dive into the mechanics of running an in-place upgrade from CDH5 or CDH6 to CDP Private Cloud Base. The overall upgrade follows a seven-step process illustrated below. In the video below we walk through a complete end to end upgrade of CDH to CDP Private Cloud Base.

Qlik Sense Enterprise SaaS - Impact Analysis

A brief overview and demo of the Impact Analysis capability available in Qlik Sense. Do you understand the origins and journey of your data sets? Do you trust where the data comes from? Are you responsible for migrations, understanding business changes and handling regulatory compliance? If so, watch this video to learn more about Qlik’s latest addition to Qlik Sense Enterprise SaaS - Impact Analysis.

How to Turn on Change Data Capture (CDC)

2.5 quintillion bytes of data are produced every day, and those numbers are continually increasing. With such astronomical volumes of data, businesses have to understand and interpret data faster than ever before. However, data transfers must occur for businesses with millions of data entry points to properly store and interpret their data.

7 Tips to Improve ETL Performance

Consider for a moment, if you will, plastic patio furniture. Plastic Fantastic is a global manufacturer with several factories, warehouses, and plenty of stores. One can only imagine the sheer amount of data resulting from sales, production, suppliers, and finances. Everything that happens, from purchase and onward, to these chairs, tables, and cupboards in all corners of the world is measured.

Serving the Public Through Data

Digital transformation has been talked about for many years, but the pandemic has accelerated the digital transformation journeys for many enterprises. Forced to adapt to changes in the business landscape and customer behavior, businesses have adopted more digital tools and technologies to drive innovation and increase resilience.

Closing the Gap Between the Digital Haves and Have-Nots

The digital race is on. To pull ahead of the pack, a company needs to know what to do with its data. Without a data-driven strategy, you’re bound to lose ground to competitors who apply their data to operational improvements, product development, go-to-market strategies, and the customer experience. It isn’t enough to collect, interpret, and act on the data. You have to do it fast.

Future of Data Meetup: CDP on Azure - Industrial Strength Data Engineering

Data Engineering is undergoing a huge evolution requiring faster and more reliable data pipelines. Apache Spark and Python are core foundational components of this new architecture enabling data engineers to quickly develop these pipelines. They also introduce challenges when moving to production. Come join us as we: Ask questions and learn. We will also have a raffle of Cloudera swag.

How to get started with ThoughtSpot for Sales

Who are my top sellers? What kinds of deals have the highest close rate? How have our sales opportunities changed over time? As a sales leader, these are just some of the questions you ask yourself every day to keep your team on track. And the answers are in your data. Every form fill, cold call, and MQL is another data point you can use to assess the health of your sales organization.

Why You Need a Feature Store

Feature stores have arrived in 2021 as an essential piece of technology for operationalizing AI. Despite the enthusiasm for feature stores in high-tech companies, they are still absent from most legacy ML platforms and can be relatively unknown in many enterprise companies. We discussed how feature stores are critical to the data-first approach of next-gen ML platforms in our previous blog, but they are important enough to get their own treatment in a full article.

Customer Data Platform (CDP) vs. Reverse ETL

Reverse ETL and customer data platforms (CDPs) are two big data trends that have been receiving a great deal of attention. While both CDPs and reverse ETL can help you make smarter data-driven decisions, there are also several crucial points of distinction. In this article, we’ll answer the question: what’s the difference between reverse ETL and a customer data platform?

Understanding Microsoft ETL with Azure Data Factory

Migrating analytics workloads to the public cloud has been one of the most significant big data trends in recent years—and it shows no sign of slowing down any time soon. According to a study by IT research company Forrester: Within three years, however, Forrester predicts that the fates will have reversed: Of course, before data can be processed in the public cloud, it has to get there in the first place via data migration.

Group vs Fine-Grained Access Control in Cloudera Data Platform Public Cloud

Cloudera Data Platform (CDP) provides a Shared Data Experience (SDX) for centralized data access control and audit in the Enterprise Data Cloud. The Ranger Authorization Service (RAZ) is a new service added to help provide fine-grained access control (FGAC) for cloud storage. We covered the value this new capability provides in a previous blog.

React and Respond in the Business Moment With Qlik Application Automation

Unless you’ve hidden under a rock for the past decade, you can’t have failed to notice that data in today’s enterprise is very much alive. It’s always moving, constantly changing, and we’re continually using it to create new business value. However, while data fluidity and visibility have blossomed, the opportunity to use that data to drive business actions seems to have withered in comparison.

Qlik App Automation - Brief Overview and Demo

Qlik Application Automation lets you visually assemble flows that work with market leading SaaS applications to invoke downstream processes that react to changes in your business. Consequently you spend less time programming back-office workflows and more time driving insights. Qlik Application Automation is part of our Active Intelligence vision which delivers in-the-moment awareness about every aspect of your business and helps you drive immediate actions.

Avoid another analyst fire drill with the modern data and analytics stack

In a recent webinar by TDWI, 45% of analysts reported that “every day seems to be a different fire drill.” No surprise to anyone in the industry. As much as analysts need to be focused on more strategic tasks, their skills are frequently deployed to answer basic questions. Greater self-service capabilities for end-users would no doubt alleviate these fire drills, but this is not yet a reality for the majority of companies.

Top 7 Talend Alternatives and Competitors

On the surface, Talend seems like the ultimate data integration platform. It's open-source, maintains multi-cloud integration, supports data governance frameworks like GDPR and CCPA, and handles both ETL and ELT, providing you with more flexibility for data management. Dig a little deeper, though, and you'll notice this platform has an outdated user interface and limited capabilities, and you'll probably need to upgrade to its enterprise version to execute data integration.

Converged Analytics In Financial Services

In financial services, data has always been viewed as a strategic asset. To manage this data, organizations have invested heavily over several years and across a number of technology generations in the underlying data infrastructure. This approach has left a large data technology legacy along with silos of data linked to specific infrastructure and applications.

Build your data analytics skills with the latest no cost BigQuery trainings

BigQuery is a fully-managed enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and intelligent caching for business intelligence. To help you make the most of BigQuery, we’re offering the following no cost, on-demand training opportunities.

Apache Hive vs. Apache HBase

Apache Hive and Apache HBase are incredible tools for Big Data. While there is some overlap in their functions, both Apache Hive and Apache HBase have unique qualities that make them better suited to specific tasks. Some key differences include: Ultimately, comparing Apache Hive to Apache HBase is like comparing apples to oranges or Google to Facebook. While the two entities are similar, they don't provide users with the same functionality.

Telecom Network Analytics: Transformation, Innovation, Automation

One of the most substantial big data workloads over the past fifteen years has been in the domain of telecom network analytics. Where does it stand today? What are its current challenges and opportunities? In a sense, there have been three phases of network analytics: the first was an appliance based monitoring phase; the second was an open-source expansion phase; and the third – that we are in right now – is a hybrid-data-cloud and governance phase. Let’s examine how we got here.

Terabytes of Data but Still No Good Insights?

In our modern digital society, data is abundant, and storage is affordable. Businesses, governments and even individuals can (and do) collect every transaction, click, swipe, location, message and attribute in their datasets. With just a few clicks on my smart device, I can review data on every place I’ve been, how much I spent, every step I took, what the weather was like and who I was with. Businesses collect the same abundance of data.

Martin Gardner - A Pattern for Salesforce Data Migration

A talk by Martin Gardner, Solution Principal, Slalom Consulting. Migrating data into and out of Salesforce orgs can be very difficult. In this talk I present a pattern for successfully planning and executing a data migration between two Salesforce orgs while avoiding some of the pitfalls and gotchas that catch out the unwary. I will discuss the four key features of a data migration project and how to apply these concepts to a migration between two Salesforce orgs.

How KlearNow went from sleepless nights to a booming data business with ThoughtSpot

Sometimes I walk through the grocery store and marvel at the way customers float through the aisles, blissfully unaware of the logistical nightmare it probably took to stock the shelves. They have no idea how many people, systems, and modes of transportation it takes to make everything magically appear on their grocery shelves. But I do. As the Senior Director of Software Engineering at KlearNow, I spend my days preserving the bliss of those grocery shoppers.

How to Easily Apply Analytics to Product Development Management

The development of a digital product has been redefined to involve only 4 phases, as TCGen and Product Plan propose: However, having an easier-to-follow process is not the only improvement that you can implement: cost and time efficiency can be taken a huge step further when you incorporate analytics insights. So, with this infographic, we propose some tools that can help you analyze data sets to enrich the phases of each development process.

What Are the Limitations of Dashboards?

For modern businesses faced with increasing volumes and complexity of data, it’s no longer efficient or feasible to rely on analyzing data in BI dashboards. Traditional dashboards are great at providing business leaders with insights into what’s happened in the past, but what if they need actionable information in real time? What if they want to use their data to estimate what may happen in the future? Companies are taking notice.

Unlocking Data Literacy Part 1: How to Set Up a Data Analytics Practice That Works for Your People

Does your team know how to use data analytics to their advantage? For the vast majority of companies, the answer may be “no.” According to Accenture, only 21% of people are confident in their data literacy skills, and just 32% of companies have realized tangible, measurable value from their data. While the definition of data literacy varies depending on who you ask, at its core, the term means equipping anyone in your organization to know how to use data in a business context.

Everything You Need to Know About API Integration

APIs are powerful for ETL (extract, transform, load) and data integration workflows. API integrations make it possible for the seamless exchange of information between websites, databases, and applications. The Xplenty API allows you and your enterprise to monitor Xplenty clusters and jobs. Through the Xplenty data processing package and Xplenty web application, you can call the Xplenty API to.

Speed Up Your Data Flow for Business Results

A slow car has never won a Formula One race. The Olympics doesn’t reward slow times in swimming, track or any other clock-timed sport. Likewise, slow data speeds don’t win over customers or colleagues in the real-time business world. Microsoft’s own research once reported that a person visiting a website on a connected device is likely to wait no more than 10 seconds to see it before moving to a competitor’s site.

Cloud Costs Through the Roof? Don't Worry - We've Got You Covered

Does it feel like your business is paying too much for cloud services? You are not alone. Cloud costs are expected to increase at a compound annual growth rate of 10.5% to 13.1% through 2025, according to the International Data Corporation (IDC). While getting a handle on those cloud costs may be tricky, you don’t have to worry — we’ve got you covered.

"So, How Do We Make This Work?" - Tracking Employee COVID Vaccination and Testing in As Little As 15 Minutes

With COVID-19’s ever-changing conditions – growing infection rates, shifting and new vaccine mandates, variant outbreaks and office closures and re-openings – HR has stepped up and taken on a significant role in helping organizations navigate every employee’s personal and work life needs. COVID-19 accelerated the evolution already underway in HR, with HR growing beyond being a policy and procedure hub into a strategic business partner.

Relational vs non-relational database: Which one should you use?

Ever since E. F. Codd introduced the first relational model for storing data at IBM in 1970, the industry has picked up the database technology and used it for its competitive advantage. The relational database management system - or RDBMS - was the default technology for storing and accessing data for a long time. It supported transactional data storage, the building of data products, and was the go-to model for data that was used in data-driven decisioning.

How to Make Your Data Ethical with Jack Berkowitz at ADP | Rise Of The Data Cloud

Ever wonder how companies like ADP handle all of the data that they are responsible for? In this episode of Rise Of The Data Cloud, Jack Berkowitz, Chief Data Officer at ADP, talks about the importance of keeping your product simple, data sharing, applying ethics to algorithms, and much more.

What's New in CDP Public Cloud? Hive and Impala Get a Facelift

Join us LIVE to discuss what’s new in CDP Public Cloud! Don’t miss the live Q&A as we learn about the new capabilities in Cloudera Data Warehouse. See how the Impala and Hive engines get a facelift. Also watch a demo of how you can run advanced analytics at scale using a few easy steps.

ThoughtSpot SpotApp for Snowflake Performance and Consumption Analytics

This SpotApp will have you up and running in minutes with search and AI-driven analytics from ThoughtSpot around your Snowflake Data Cloud performance and consumption. The SpotApp enables financial controllers to drill into credit consumption trends to proactively manage their cloud spend. And IT Ops teams will be able to dive into granular details about query performance to ensure that their data clouds are running at full speed.

How to get started with ThoughtSpot for ServiceNow Analytics

Since the start of the pandemic, business demands on your IT team have skyrocketed. You need granular, actionable insights to keep up with the speed and volume of digital transformation projects and IT incidents occurring across your organization. Canned reports from SaaS-based systems like ServiceNow aren’t fundamentally built for analytics.

State of the Reverse ETL

Data warehouses fixed one aspect of the data silo problem but introduced another. They function as a large, single source of truth for your organization, but getting insights from this data in a typical Extract, Transform, Load (ETL) data pipeline requires the use of Business Intelligence (BI) and analytics platforms. By the time your data team creates these reports and sends them to other business units, it’s too late for daily decision-making.

What's Next in the Data Cloud with Benoit Dageville & Christian Kleinerman | Snowflake Summit 2021

From the start, Snowflake co-founders envisioned a new and unique way for companies across industries and around the globe to collaborate on data and analytics. Join Benoit Dageville, co-founder and President of Product, and Christian Kleinerman, Snowflake’s SVP of Product, as they share how the Data Cloud vision has become a reality and they unveil the latest Snowflake innovations in five key areas: connected industries, global governance, platform optimization, data programmability, and applications powered by Snowflake. You’ll see new capabilities in action and hear directly from customers and partners about what these new advancements mean for their businesses.

ThoughtSpot, ServiceNow, and Snowflake for IT Workload Management

As the developer of the leading data cloud, Snowflake generates a wealth of IT Service Management data with ServiceNow. But uncovering actionable, granular insights has been a challenge. Now, ThoughtSpot and Snowflake are empowering IT executives to answer all their questions about support ticket backlog and effort with a single pane of interactive insights in ThoughtSpot, powered by Snowflake.

ThoughtSpot, ServiceNow, and Snowflake for Business Application Management

As the developer of the leading data cloud, Snowflake relies on a number of business applications. But creating a holistic view of these applications has been a challenge, as the data is sourced from a variety of systems. By combining application data from multiple sources in the Snowflake data cloud, ThoughtSpot and Snowflake are empowering internal organizations to answer all their questions about enterprise application quality with a single pane of interactive insights in ThoughtSpot, powered by Snowflake.

ThoughtSpot, ServiceNow, and Snowflake for Operational Metrics

ThoughtSpot for ServiceNow at Snowflake - As the developer of the leading data cloud, Snowflake generates a wealth of operational helpdesk data with ServiceNow. ThoughtSpot and Snowflake are enabling helpdesk and operations executives to answer all their questions about operational metrics with a single pane of interactive insights in ThoughtSpot, powered by Snowflake.

What is a Data Source?

“Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway.” That’s according to the author and consultant Geoffrey Moore. It’s an unsettling thought given that data and analytics are shifting from a secondary activity to a core business function. So what can companies do to gather and harness all of this information? The below guide discusses data sources and their role in informing decision-making.

Customer segmentation with Cosmo, Chief Destiny Officer

Do you ever feel like connecting with the right customer audience is just a matter of luck? We’ve met a CDO who leaves audience targeting up to chance. Cosmo, CDO is not a Chief Data Officer — he’s a Chief Destiny Officer. While we focus on data here at Talend, we’re trying to understand the 36% of business executives who say they don’t base the majority of their decisions on data.

Supercharge your Airflow Pipelines with the Cloudera Provider Package

Many customers looking at modernizing their pipeline orchestration have turned to Apache Airflow, a flexible and scalable workflow manager for data engineers. With 100s of open source operators, Airflow makes it easy to deploy pipelines in the cloud and interact with a multitude of services on premise, in the cloud, and across cloud providers for a true hybrid architecture.

Modernizing Your Cloud Platform for IT Agility and Efficiency

Businesses are increasingly embracing a cloud-first approach to increase market responsiveness and flexibility. The cloud-first approach refers to a cloud-like experience consisting of on-demand metered consumption of IT infrastructure, whether on the public cloud or inside private data centers. The rapidly evolving consensus among the tech leaders and vendors has led to an emergence of hybrid IT.

Qlik Sense Insight Advisor Improvements

We cover some of the latest improvements in the Qlik Sense Insight Advisor. Insight Advisor is your intelligent assistant in Qlik Sense, providing AI-generated charts that are delivered in multiple forms using a variety of user experiences – these include field selection, keyword search and insight advisor chat. It auto-generates context-aware analyses learned from your data and search criteria and supports natural language interaction – it can also deliver more advanced analytics for users to explore.

How to Power Rapid Transformation in Financial Services with Snowflake | Snowflake Summit 2021

There has never been a greater need to rapidly transform and innovate to stay ahead of the competition. In this session, join Snowflake customers Western Union, Goldman Sachs, and FINOS as well as partner Deloitte to learn about how the Data Cloud is powering financial services firms.

Pushing Data to Hubspot from Your Warehouse

While traditional ETL (Extract, Transform, Load) collects data within a centralized data warehouse, reverse ETL flips the source and destination of the standard ETL process. This allows information to be pushed out of data warehouses and into powerful third-party operational systems that can provide better analytics and reporting services. With the Xplenty platform and its ETL tools, all information is sent between warehouses and third-party operational systems in an efficient and secure manner.
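To make the direction of travel concrete, here is a minimal sketch of the "read from the warehouse, push outward" half of reverse ETL. An in-memory SQLite database stands in for the warehouse, and the table name, column names and payload shape are all hypothetical; this is not HubSpot's or Xplenty's actual API, and the HTTP call that would deliver each batch is deliberately left out.

```python
import json
import sqlite3


def build_sync_payloads(conn, batch_size=2):
    """Read modeled rows from the warehouse and group them into JSON batches."""
    rows = conn.execute(
        "SELECT email, lifetime_value FROM customer_metrics ORDER BY email"
    ).fetchall()
    records = [
        {"email": email, "properties": {"lifetime_value": value}}
        for email, value in rows
    ]
    # One JSON document per batch, ready to hand to an HTTP client.
    return [
        json.dumps({"records": records[i : i + batch_size]})
        for i in range(0, len(records), batch_size)
    ]


# SQLite is only a stand-in for a real warehouse here.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customer_metrics (email TEXT, lifetime_value REAL)")
warehouse.executemany(
    "INSERT INTO customer_metrics VALUES (?, ?)",
    [("a@example.com", 120.0), ("b@example.com", 45.5), ("c@example.com", 300.0)],
)
payloads = build_sync_payloads(warehouse)
```

Batching matters in practice because most operational tools rate-limit their write endpoints, so sending a few larger requests tends to be kinder than sending one request per row.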

How to Implement Change Data Capture (CDC)

If you're looking for a better way to organize your data and ensure it stays up-to-date, you need to start utilizing CDC processes today. Change data capture uses various techniques to detect changes made in source tables and databases in real-time. Read on to learn more about change data capture and how it can be implemented to better serve your business.
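One of the simpler techniques is query-based (timestamp) CDC: poll the source table for rows whose last-modified value is newer than the highest one already processed. The sketch below assumes a hypothetical `customers` table with an `updated_at` column and uses SQLite purely for illustration. Note that this approach polls rather than streams and cannot see deleted rows; log-based CDC is the usual answer to both limits.

```python
import sqlite3


def fetch_changes(conn, last_seen):
    """Return rows modified after `last_seen`, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    watermark = rows[-1][2] if rows else last_seen
    return rows, watermark


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)", [(1, "Ada", 100), (2, "Bob", 200)]
)

# First poll: everything is new, so both rows come back.
changes, watermark = fetch_changes(conn, 0)

# A source row changes; the next poll returns only that row.
conn.execute("UPDATE customers SET name = 'Ada L.', updated_at = 300 WHERE id = 1")
changes, watermark = fetch_changes(conn, watermark)
```

Persisting `watermark` between runs is what lets each poll pick up where the previous one stopped.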

Apache Kafka Deployments and Systems Reliability - Part 1

Apache Kafka has been deployed in the field in many ways. In our Kafka Summit 2021 presentation, we gave a brief overview of the different configurations that have been observed to date. In this blog series, we will discuss each of these deployments and the deployment choices made, along with how they impact reliability.

ThoughtSpot SpotApp for Outreach

This SpotApp template will have you up and running in minutes with search and AI-driven analytics from ThoughtSpot for Outreach, the leading sales engagement platform. Outreach is a critical piece of sales infrastructure for teams that need to close deals faster - particularly in the remote world. But its canned reports aggregate most data by sequence or team. With the ThoughtSpot SpotApp for Outreach, you can uncover the actionable, granular insights sales leaders need to improve team performance, and individual contributors need to benchmark their performance against peers to increase the number of meetings they book.

SpotApp for Google Analytics and Google Ads

This SpotApp template will have you up and running in minutes with search and AI-driven analytics from ThoughtSpot for Google Analytics and Google Ads, the leading web analytics and advertising tools. Google Analytics and Google Ads are ubiquitous in digital marketing. But their canned reports and tool-specific terminology make it difficult for non-experts to uncover the insights they need to track every marketing dollar with precision and personalize customer experiences at scale. The SpotApp for Google Analytics and Google Ads brings data from both of these tools into the Snowflake Data Cloud or other popular cloud data warehouses, so marketers can leverage the power and ease of search and AI to build remarkable marketing content and programs through deeper insights.

Implementing Automation and an MLOps Framework for Enterprise-scale ML

With the explosion of the machine learning tooling space, the barrier to entry has never been lower for companies looking to invest in AI initiatives. But enterprise AI in production is still immature. How are companies getting to production and scaling up with machine learning in 2021? Implementing data science at scale used to be an endeavor reserved for the tech giants with their armies of developers and deep pockets.

ETL Pipeline vs. Data Pipeline: What's the Difference?

ETL Pipeline and Data Pipeline are two concepts growing increasingly important as businesses keep adding applications to their tech stacks. More and more data is moving between systems, and this is where Data and ETL Pipelines play a crucial role. Take a comment on social media, for example. It might be picked up by your tool for social listening and registered in a sentiment analysis app.

Pushing Data to Zendesk From Your Warehouse

Data warehouses have made it easier for customers to store massive amounts of data. That information is useless, however, unless it is actionable. Making data actionable means employees can use the data for decision-making and improving the user experience. In this article, we’ll discuss reverse ETL, pushing data to Zendesk from your warehouse, and the value of this process.

5 Enterprise Resource Planning (ERP) Trends that Accelerate Digital Transformation

As businesses advance their digital transformation efforts, enterprise resource planning (ERP) systems are evolving — arguably as significantly as the shift from materials resource planning (MRP) to ERP. Just as businesses reimagined operations by leveraging advances in hardware and software in the 1980s, they’re now turning to next-generation ERP.

In the Quest for Success, Never Stop Being Curious With Data

My whole life I’ve been curious. You have to be, to become an entrepreneur. I’m curious about trends, about looking at data and finding patterns, which might show you where the next opportunity lies. And, as I discussed with Joe DosSantos in the latest episode of Data Brilliant, I’m a big believer in experimentation and learning by putting the data and analysis into practice.

Which Stitch Alternative Should You Choose? Top 7 Stitch Alternatives

Stitch is a popular cloud-based Extract, Load, Transform (ELT) tool. Stitch seamlessly moves data between databases, warehouses, data lakes, SaaS services, and other applications with no code required whatsoever, making it a valuable weapon for data integration. However, the platform has limited data transformation capabilities and, away from its free tier, charges users for the amount of data they use per month, which often works out to be more expensive than other pricing models in the ETL/ELT space.

Consumption-based Pricing: Ensuring Every Customer's Value and Success

Consumption-based, aka usage-based, pricing is hardly new. Anyone with an electricity, gas, or water bill knows that the amount you pay each month varies depending on your usage. More recently, disruptive companies have pushed other industries (transportation, hospitality, communications, and insurance) to transform by providing usage-based products and services via software applications. As consumers, we see this all around us, when we hail an Uber or choose a short-term rental on AirBnB.

Introducing Qlik Cloud Government - Analytics for U.S. Federal Sector

Qlik is announcing Qlik Cloud Government, our SaaS solution for the U.S. federal and public sector. It is a new platform designed specifically to meet the varied needs of our customers, including the U.S. public sector, and it offers modern analytics built for speed, security, and scale.

Snowflake's Data Cloud for Advertising in a Cookieless World | Snowflake Summit 2021

Effective marketing and advertising is essential to driving growth, but the landscape is rapidly changing, with escalating regulatory requirements and the deprecation of third-party cookies. To succeed, businesses need to develop new, secure methods for accessing and sharing audience and engagement data. In this session with Disney, NBCUniversal (NBCU), Capgemini, and Snowflake, learn how the unique and innovative capabilities of the Data Cloud are enabling seamless data sharing without data copies or movement. Specifically, learn.

Troubleshooting Databricks

The popularity of Databricks is rocketing skyward, and it is now the leading multi-cloud platform for Spark and analytics workloads, offering fully managed Spark clusters in the cloud. Databricks is fast and organizations generally refactor their applications when moving them to Databricks. The result is strong performance. However, as usage of Databricks grows, so does the importance of reliability for Databricks jobs - especially big data jobs such as Spark workloads. But information you need for troubleshooting is scattered across multiple, voluminous log files.

SQL Server SSRS, SSIS packages with Google Cloud BigQuery

After migrating a Data Warehouse to Google Cloud BigQuery, ETL and Business Intelligence developers are often tasked with upgrading and enhancing data pipelines, reports and dashboards. Data teams who are familiar with SQL Server Integration Services (SSIS) and SQL Server Reporting Services (SSRS) are able to continue to use these tools with BigQuery, allowing them to modernize ETL pipelines and BI platforms after an initial data migration is complete.

Payment gateway analytics for payment service providers

Payment gateway analytics tracks the payment processing journey and related event data across all payment gateways. When used efficiently, payment gateway analytics can benefit businesses by providing insights into their revenues, payment trends, and customer behavior. Payment gateway analytics provides much needed visibility into the payments environment to enable the fast detection of transaction performance issues, anomalies or trends.

Which Tables in a Data Warehouse Use Change Data Capture

In today’s 24/7 digital world, real-time data is a necessity to stay relevant for today’s businesses. Companies who wish to remain competitive must be able to quickly respond to customer demand and adjust to market changes. Supplying business leaders with real-time information for informed decision-making can be a challenge with information spread among disparate systems.

Are These the 6 Best Reverse ETL Vendors for 2021?

The amount of big data that enterprises churn out is simply staggering. All this information is worthless unless organizations unlock its true value for analytics. This is where ETL proves useful. Traditional ETL (extract, transform, and load) remains the most popular method for moving data from point A to point Z. It takes disparate data sets from multiple sources, transforms that data to the correct format, and loads it into a final destination like a data warehouse.

Living on the Edge: How to Accelerate Your Business with Real-time Analytics

Leveraging the Internet of Things (IoT) allows you to improve processes and take your business in new directions. But it requires you to live on the edge. That’s where you find the ability to empower IoT devices to respond to events in real time by capturing and analyzing the relevant data.

ETL vs ELT: 11 Critical differences

ETL and ELT refer to two patterns of data storage architecture within your data pipelines. Both ETL (extract, transform, load) and ELT (extract, load, transform) processes help you collect data, transform it into a usable form and save it to permanent storage, where it can be accessed by data scientists and analysts to extract insights from the data. What is the difference?
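The ordering difference between the two patterns can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "warehouse" is a plain dictionary, and `extract`, `transform`, `run_etl`, and `run_elt` are hypothetical names that do not come from any vendor's product.

```python
def extract():
    # Hypothetical raw rows as they might arrive from a source system.
    return [{"name": " Ada ", "amount": "10.5"}, {"name": "Bob", "amount": "3"}]


def transform(rows):
    # Clean up whitespace and convert amounts from strings to numbers.
    return [{"name": r["name"].strip(), "amount": float(r["amount"])} for r in rows]


def run_etl(warehouse):
    # ETL: transform before loading, so the warehouse only holds clean rows.
    warehouse["sales"] = transform(extract())


def run_elt(warehouse):
    # ELT: load the raw rows first, then transform inside the destination.
    warehouse["raw_sales"] = extract()
    warehouse["sales"] = transform(warehouse["raw_sales"])


etl_wh, elt_wh = {}, {}
run_etl(etl_wh)
run_elt(elt_wh)
```

Both runs end with the same cleaned `sales` table; only the ELT run also keeps the untransformed `raw_sales` copy in the destination.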

How To Setup an SSH Tunnel Connection

In this video, we will set up an SSH tunnel connection within Xplenty to access a server on a private network. When configuring SSH tunnels for connecting to databases, authentication is performed using the SSH server. This can be done via either user name and password authentication or by using key-based authentication. Key-based authentication gives the user the ability to log in without a password.

Building a Real-Time ML Pipeline with a Feature Store - MLOps Live #16

With the growing business demand for real-time use cases such as NLP, fraud prediction, predictive maintenance and real-time recommendations, ML teams are feeling immense pressure to solve the operational challenges of real-time feature engineering for machine learning, in a simple and reproducible way. This is where online feature stores come in. An online feature store accelerates the development and deployment of online AI applications by automating feature engineering and providing a single pane of glass to build, share and manage features across the organization.

5 Tips for Pushing Data from Your Warehouse to NetSuite

NetSuite is an enterprise resource planning (ERP) business management platform that is part of the Oracle enterprise software ecosystem. Although it's primarily intended for small and medium-sized businesses, organizations of all sizes and industries have successfully used NetSuite to balance their checkbooks, manage inventory, bill and invoice customers, and more.

How To Get True ROI From Your Account-Based Marketing (ABM)

Account-based marketing, or ABM, is more often used as targeted demand generation—not one-to-one marketing. In a 2020 study of more than 300 organizations worldwide, Forrester found that “a significant number of respondents claimed they were using an ABM approach but weren’t doing what we would consider the basics of ABM, such as working with sales.”1 ABM isn’t just about assigning one siloed team the responsibility of targeting and revealing high-potential prospects.

Operating Apache Kafka with Cruise Control

There are two big gaps in the Apache Kafka project when we think of operating a cluster. The first is monitoring the cluster efficiently and the second is managing failures and changes in the cluster. There are no solutions for these inside the Kafka project but there are many good 3rd party tools for both problems. Cruise Control is one of the earliest open source tools to provide a solution for the failure management problem but lately for the monitoring problem as well.

Client Reporting 101: Tips and Best Practices for Agencies and Freelancers

Communication is the key factor for a good relationship with your clients. Here are the best client reporting practices to help you showcase the results you ...

The Pros and Cons of Application Software Integration

Gartner predicts that by 2023, organizations that promote data sharing will outperform their peers on most business value metrics. According to Debra Logan, Gartner’s Research Vice President, “Data sharing is the way to optimize higher quality data and more robust data and analytics to solve business challenges and goals.” Given these numbers, it is clear that businesses will need to embrace application software integration as a core business strategy.

The 7 Critical Differences Between DynamoDB vs MongoDB:

MongoDB vs DynamoDB: How do you choose between them? Whether you are a two-man team bootstrapping a proof of concept or an established one battling with high throughput and heavy load, this post can serve as a guidepost in your decision process. Before going into the details, a brief history lesson on how these technologies emerged is pertinent; you must understand the optimal conditions for running these systems and how they operate in the wild before making an informed choice.

Use Cases for Reverse ETL

According to Gartner, leading organizations in every industry are wielding data and analytics as competitive weapons. Companies that leverage data as a competitive differentiator will stand the best chance of acting faster on opportunities and responding to threats in a competitive marketplace. The problem is that most companies aren’t aware of the value of their data. As a result, they aren’t leveraging the full potential of their data to make informed decisions.

Enabling Multi-User Fine-Grained Access Control for Cloud Storage in CDP

Shared Data Experience (SDX) on Cloudera Data Platform (CDP) enables centralized data access control and audit for workloads in the Enterprise Data Cloud. The public cloud (CDP-PC) editions default to using cloud storage (S3 for AWS, ADLS-gen2 for Azure). This introduces new challenges around managing data access across teams and individual users. To solve these challenges for S3 and ADLS-gen2, Cloudera has introduced a new service — the Ranger Authorization Service (RAZ).

Cloudera and NVIDIA Help IRS Fight Fraud, Safeguard Taxpayers

Across the federal government, agencies are struggling to identify, organize, analyze, and act on troves of data. It’s a problem that leaders are working actively to tackle, but they’re in a race against immeasurable volumes of data that is continuously being generated in perpetuity in stores known and unknown. At the Internal Revenue Service, decades’ worth of data exceeds even the most cutting-edge processing capabilities.

Streaming Analytics with SQL Stream Builder

SQL Stream Builder, part of Cloudera Streaming Analytics, allows developers, analysts, and data scientists to write streaming applications using industry-standard SQL. It provides an interactive experience, so the development process is quick, easy, and productive while taking advantage of Apache Flink’s streaming power. It provides an advanced materialized view engine to interface with applications, tooling, and services via REST API.

Ad agencies choose BigQuery to drive campaign performance

Advertising agencies are faced with the challenge of providing the precision data that marketers require to make better decisions at a time when customers’ digital footprints are rapidly changing. They need to transform customer information and real-time data into actionable insights to inform clients what to execute to ensure the highest campaign performance.

What Scenario Should You Use CDC for?

Sometime in 2019, Netflix cracked a conundrum that stumped them for years. The company had so much data about its content and subscribers, it had to sync multiple heterogeneous data stores like MySQL and Elasticsearch continuously, which brought seriously stressful challenges like dual writes and distributed transactions. So Netflix created its own CDC tool that processes captured log events in sequence and takes dumps for specific tables and primary keys of tables. Problem sorted. Case closed.

What is Data Mapping?

Imagine this: less than half of an organization’s structured data is used in decision-making. Think of the missed opportunities for customer acquisition and revenue by not taking advantage of that information. According to an IBM study, 87 percent of CEOs regard data as a strategic asset. So why then are companies not harnessing the power of this information?

Spark Troubleshooting Solutions - DataOps, Spark UI or logs, Platform or APM Tools

Spark is known for being extremely difficult to debug. But this is not all Spark’s fault. Problems in running a Spark job can be the result of problems with the infrastructure Spark is running on, inappropriate configuration of Spark, Spark issues, the currently running Spark job, other Spark jobs running at the same time – or interactions among these layers.

Optimizing your BigQuery incremental data ingestion pipelines

When you build a data warehouse, the important question is how to ingest data from the source system into the data warehouse. If the table is small, you can fully reload it on a regular basis. If the table is large, however, a common technique is to perform incremental table updates. This post demonstrates how you can enhance incremental pipeline performance when you ingest data into BigQuery.
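To make the idea of an incremental table update concrete, here is a minimal sketch in plain Python. It is not BigQuery code and not the method from the post: it assumes each source row carries an `id` primary key and an `updated_at` timestamp, and the function name `incremental_load` is hypothetical.

```python
from datetime import datetime


def incremental_load(source_rows, target, last_loaded_at):
    """Upsert only rows changed after the watermark; return the new watermark."""
    new_watermark = last_loaded_at
    for row in source_rows:
        if row["updated_at"] > last_loaded_at:
            target[row["id"]] = row  # insert or overwrite by primary key
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


source = [
    {"id": 1, "value": "a", "updated_at": datetime(2021, 9, 1)},
    {"id": 2, "value": "b", "updated_at": datetime(2021, 9, 2)},
]
target = {}

# First run: everything is newer than the initial watermark, so all rows load.
watermark = incremental_load(source, target, datetime.min)

# Second run: only the row modified since the last watermark is rewritten.
source[0] = {"id": 1, "value": "a2", "updated_at": datetime(2021, 9, 3)}
watermark = incremental_load(source, target, watermark)
```

The watermark keeps each run proportional to the number of changed rows rather than the size of the table, which is the reason for preferring incremental updates on large tables.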

Reverse ETL to NetSuite

Reverse ETL is a data integration technology that offers a wonderful way to enable solutions for making various stored data more actionable and usable. This process is especially helpful for enterprise business operation tools that help teams execute processes and meet goals more effectively. The idea is to use clean and accurate data to enhance various SaaS platforms and business management tools to enhance processes.

Supporting Transformation with an Integrated Data Platform. Three Common Questions Answered.

In recent years there has been increased interest in how to safely and efficiently extend enterprise data platforms and workloads into the cloud. CDOs are under increasing pressure to reduce costs by moving data and workloads to the cloud, similar to what has happened with business applications during the last decade. Our upcoming webinar is centered on how an integrated data platform supports the data strategy and goals of becoming a data-driven company.

Data And The Music Industry | Rise Of The Data Cloud

Ever wondered how data is changing the music industry? In this episode, Moin Haque, SVP of Architecture and Engineering, and Vlad Barkov, VP of Data Architecture & Engineering at Warner Music Group, discuss the transformation of the music industry during the pandemic, choosing the right business partners, making data independent, and much more.

Early-stage growth: Why shifting the founder mindset is critical to acquiring your first 10 customers

Growth. It’s the mountain every startup founder must learn to climb in order to run a successful business. And as with any great mountain, the journey to the top never feels more daunting than at the base. How your startup earns its first 10 customers will set the tone for the rest of the trek and determine how fast your team reaches the summit — if at all.

Salesforce Rest API

REST API provides a solution for communicating easily with various apps, platforms, and web services. This tool allows the user to communicate and execute actions universally with various languages and more. Companies implementing REST API stand to gain a better experience and create more flexibility when using ERPs, CRMs, and more like Salesforce. When users connect to Salesforce with a REST API, they get more flexibility and options that make the process more actionable and streamlined.

5 Tips for Pushing Data from Your Warehouse to Intercom

Intercom bills itself as a “conversational relationship platform,” helping businesses connect with their audiences primarily through live chat, as well as email and chatbots. The Intercom platform offers functionality including engagement and onboarding for new customers, support for existing customers, and marketing for potential customers on the fence.

The role of a CDO with Cosmo, Chief Destiny Officer

Have you ever wished you had a crystal ball? We tracked down a CDO who actually uses one. See, Cosmo, CDO is not a Chief Data Officer — he’s a Chief Destiny Officer. We’re all about data at Talend, but sometimes it’s good to see things from another perspective. We sat down with Cosmo to ask him about his job, his background, and his methods.

Everything as a Service: Unlock Business Outcomes

In the wake of COVID-19, we saw a significant shift toward as-a-service offerings, something we haven’t seen in years. From conversations with CIOs over the past 12 months, we know they are looking for the flexibility, efficiencies and cost savings they get from the as-a-service model. This is especially important to them as they evolve their business models in a hybrid IT direction and become consumers of IT.

Data Warehouse vs Database: What is the difference and which one should you choose?

The world of big data is getting bigger every day. As the volume of data increases exponentially, businesses of all sizes try to capture raw data, process it, and extract insights for competitive decision-making. The end-to-end operation of extracting value from data is called the ETL process, which stands for extract, transform, load. A crucial component of the ETL process is the data storage aspect. The two main competing architectures for storage solutions are databases and data warehouses. But how do they differ?

Change Data Capture: CDC for E-Commerce

Change data capture is one of the fundamental underpinnings of modern data management. Without knowing when their enterprise data has changed or refreshed with new information, businesses wouldn’t be able to access up-to-the-minute insights that help them stay competitive in a constantly shifting landscape. In change data capture (CDC), users are promptly notified (either in real-time or near real-time) when changes have been made to a source table or source database.
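As a rough illustration of what "being notified of changes" means, the sketch below derives change events by comparing two snapshots of a table. This is the simplest form of CDC (snapshot comparison), not the log-based capture that production tools typically use; the function name `capture_changes` and the sample rows are hypothetical.

```python
def capture_changes(before, after):
    """Compare two snapshots keyed by primary key and return change events."""
    events = []
    for key, row in after.items():
        if key not in before:
            events.append(("insert", key, row))
        elif before[key] != row:
            events.append(("update", key, row))
    for key, row in before.items():
        if key not in after:
            events.append(("delete", key, row))
    return events


# Two snapshots of a hypothetical product table, keyed by product id.
before = {1: {"sku": "A", "stock": 5}, 2: {"sku": "B", "stock": 0}}
after = {1: {"sku": "A", "stock": 4}, 3: {"sku": "C", "stock": 9}}

events = capture_changes(before, after)
```

Here `events` holds one update (product 1), one insert (product 3), and one delete (product 2), which a downstream system could apply without rereading the whole table.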

Spectacular growth: Beaumotica accelerates expansion with data-driven insights from Talend

Beaumotica combines smart lighting, design, and top brands to create the perfect mood and atmosphere for any room. And with help from Talend, the company can now combine data, analytics, and automation to optimize business decisions and accelerate growth. Last year alone the company tripled its business and expanded into new territories across Europe. Based in The Netherlands, Beaumotica has been growing steadily since 2007.

5 Tips for Pushing Data from Your Warehouse to SAP

Reverse ETL flips the targets and destinations of the standard ETL (extract, transform, load) process. Instead of collecting your data within a centralized data warehouse, reverse ETL transfers information out of this warehouse and into third-party operational systems for ease of access and better analytics and reporting. That’s all very well and good — but what does reverse ETL look like in practice with systems and software such as SAP?

6 Database Schema Designs and How to Use Them

In this guide, we'll discuss what a database schema is, six database schema designs, and how and why they are used. We know a lot of thought goes into database construction. Before creating any database, developers plan what it should include and how the different aspects work together. This planning ensures a database has the necessary design for its intended use. Coders then use the schema to implement the database’s design.

The Role of the Empowered Citizen Integrator in Democratizing Technology Across the Enterprise

For organizations to succeed today, they have to make data-driven decisions and get the most value out of the information. “To monetize data, companies must first transform it so that it can be reused and recombined to enable new value creation.” Most organizations have a ton of information that is spread across multiple departments. Making sense of the information is no easy task. Inaccurate analytics or delayed insights can put the business in a vulnerable position.

With Stitch, Simba is losing no sleep over aggressive growth plans

“If we didn’t have Stitch, we would have to recruit and hire data engineers, buy space for hundreds of millions of rows that we’re sinking into the database, and on and on. For us, Stitch is essential.” –Tomasz Eitner, BI and Data Analyst, Simba Sleep Simba Sleep has always been a data-driven company. Before the firm was even formally launched, the founders purchased research profiles from more than 10 million sleepers—including 180 million body profile data points.

How to connect Mongo DB to Heroku Postgres

Every computer application must have a method of storing, managing and using data. This requires an application and at least one database that can communicate with each other. Managing this connection can be difficult, especially with multiple databases. Fortunately, there are platforms that can manage databases and application connections more efficiently. Heroku offers a Postgres management system for creating, managing, and using databases.

Our reflections on the 2021 Gartner Magic Quadrant for Data Integration Tools

“The data integration tool market is seeing renewed momentum, driven by requirements for hybrid and multi-cloud data integration, augmented data management, and data fabric designs.” This is what Gartner assesses in its latest Magic Quadrant for Data Integration Tools* report. And that assessment makes perfect sense. Data is the lifeblood of an organization.

Optimizing Cloudera Data Engineering Autoscaling Performance

The shift to cloud has been accelerating, and with it, a push to modernize data pipelines that fuel key applications. That is why cloud native solutions that take advantage of capabilities such as disaggregated storage and compute, elasticity, and containerization are more important than ever. At Cloudera, we introduced Cloudera Data Engineering (CDE) as part of our Enterprise Data Cloud product, Cloudera Data Platform (CDP), to meet these challenges.

Migrating Data Pipelines from Enterprise Schedulers to Airflow

At Airflow Summit 2021, Unravel’s co-founder and CTO, Shivnath Babu and Hari Nyer, Senior Software Engineer, delivered a talk titled Lessons Learned while Migrating Data Pipelines from Enterprise Schedulers to Airflow. This story, along with the slides and videos included in it, comes from the presentation.

Automated Competition Scraping with Apify and Keboola

Whether you saw or missed our webinar, we thought it would be useful to provide a step-by-step guide on how to set up quick competition monitoring (or, any other web scraping and data processing automation) with Apify and Keboola. Thank you Apify and Revolt.bi for the collaboration! So what can you do with automated competition data processing? In this article, we’ll take an example of daily monitoring of the best-sellers list at Amazon.

How to load Salesforce data into BigQuery using a code-free approach powered by Cloud Data Fusion

Organizations are increasingly investing in modern cloud warehouses and data lake solutions to augment analytics environments and improve business decisions. The business value of such repositories increases as customer relationship data is loaded and additional insights are generated.

BigQuery Admin reference guide: Recap

Over the past few weeks, we have been publishing videos and blogs that walk through the fundamentals of architecting and administering your BigQuery data warehouse. Throughout this series, we have focused on teaching foundational concepts and applying best practices observed directly from customers. Below, you can find links to each week’s content: Query Processing: Ever wonder what happens when you click “run” on a new BigQuery query?

How to Operationalize your Data Warehouse with Reverse ETL

Organizations are losing out on data-driven decision-making opportunities when data stays in the data warehouse. While business intelligence solutions can surface insights from these data sets, those insights often reach team members too late to be used for daily business operations. Reverse ETL empowers organizations to increase the value of their data warehouses through operationalization. Learn how this can transform the way companies use data and insights.

Dimagi implements Passerelle Data Rocket to accelerate state and local COVID-19 response

Frontline healthcare providers don’t always have access to the latest and greatest technology. But when they are trying to fight a global pandemic with pen-and-paper tracking systems, something has to change. Dimagi is a tech company on a mission: to deliver scalable digital solutions for organizations to amplify their frontline impact.

Cost of ELK

Do you know how much your ELK stack costs? Managing and analyzing your data is a critical part of your business. However, the true cost of an ELK stack can be hard to calculate, and the truth is you may be spending a lot more than you think. Elasticsearch wasn't designed to work efficiently at the scale required by today's data volume, especially the growth of log data. As your data grows, your ELK stack becomes more expensive to scale and maintain, leaving you with the headache and the tab. Well, ChaosSearch has the answer.