
March 2021

How BigQuery helps scale and automate insights for baseball fans

When looking at data, business decision makers are often blocked by an intermediate question: “What should I take away from this data?” Beyond assembling the numbers and building the results, data analysts and data scientists play a critical role in answering it. Organizations big and small depend on data analysts and data scientists to help “translate from words to numbers, and then back to words,” as sports analytics pioneer Dean Oliver once said.

Spring forward with BigQuery’s user-friendly SQL

Spring is here. Clocks move forward. The Sakura (cherry blossom) festival in Japan marks the celebration of the new season. In India, the Holi festival of colors ushers in the new harvest season. It’s a time for renewal and new ways of doing things. This month, we are pleased to debut our newest set of SQL features in BigQuery to help our analysts and data engineers spring forward.

Leveraging ETL to Enable Your Domain-Driven Design

How much do you know about Domain-Driven Design (DDD)? It's a design approach to software development where the language and structure of the code match the business domain. The concept comes from a 2003 book by Eric Evans, and it influences software architects, information architects, data engineers, and computer science professionals who organize code and solve seriously difficult software problems. Domain-Driven Design has endured because it keeps business logic front and center in the code.
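As a minimal, hypothetical sketch of the idea (the class and field names are invented here, not taken from Evans' book), a DDD-style value object in Python keeps a business rule inside the domain type itself, named in the language of the business:

```python
from dataclasses import dataclass

# A hypothetical "value object" from an e-commerce domain: immutable,
# compared by value, and expressed in the business's own vocabulary.
@dataclass(frozen=True)
class Money:
    amount_cents: int
    currency: str

    def add(self, other: "Money") -> "Money":
        # Domain rule lives in the domain type: amounts in different
        # currencies cannot be summed.
        if self.currency != other.currency:
            raise ValueError("currency mismatch")
        return Money(self.amount_cents + other.amount_cents, self.currency)

price = Money(1999, "USD")
tax = Money(160, "USD")
total = price.add(tax)  # 2159 cents USD
```

Because the invariant sits on the entity rather than in scattered procedural code, the "language and structure of the code match the business domain", which is the heart of the DDD claim.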

Cloudera Data Platform extends its hybrid cloud vision by supporting Google Cloud

CDP Public Cloud is now available on Google Cloud. The addition of support for Google Cloud enables Cloudera to deliver on its promise to offer its enterprise data platform at a global scale. CDP Public Cloud is already available on Amazon Web Services and Microsoft Azure. With the addition of Google Cloud, we deliver on our vision of providing a hybrid and multi-cloud architecture to support our customers’ analytics needs regardless of deployment platform.

Choosing Open Wisely

Since Snowflake’s inception we’ve had the needs of our customers as our North Star, with a clear focus on security and governance of data, continuous advances on performance and reduction of latencies, and relentless capabilities innovation. As part of our product development, we constantly evaluate where open standards, open formats, and open source can help or hinder our progress towards those goals. In short, it is a matter of degree.

NEW Lenses: PostgreSQL & metadata to navigate your Kafka galaxy

When you’re one of many developers commanding streaming applications running in Apache Kafka, you want enough data observability to fly your own data product to the moon. But you also want to boldly go where no developer has gone before to discover new applications. At the same time, you don’t want to be exposed to sensitive data that summons you to your compliance team, crashing you back down to earth.

Data Lake Opportunities: Rethinking Data Analytics Optimization [VIDEO]

Data lakes have challenges. And until you solve those problems, efficient, cost-effective data analytics will remain out of reach. That’s why ChaosSearch is rethinking the way businesses manage and analyze their data. As Mike Leone, Senior Analyst for Data Platforms, Analytics and AI at ESG Global, and Thomas Hazel, ChaosSearch’s founder and CTO, explained in a recent webinar, ChaosSearch offers a data analytics optimization solution that makes data faster and cheaper to store and analyze.

How Xplenty Unlocked a Global Sales Brand's Post-Pandemic Potential

When COVID hit, multinationals went into a tailspin, scrambling for solutions to pandemic-related problems like suspended flights, social distancing, and stay-at-home orders. How could global brands function when operations are so interconnected? One global sales and marketing brand stayed calm in the crisis, innovating localized strategies that strengthened remote regional teams.

How to build modern data applications

With the steep rise of data, smart businesses have started capitalizing on this new oil to build a new type of product and service: data applications. Admittedly, the engineering and business development of data apps overlap with those of their cousins, the trusty desktop app and the well-known web app. But there is a core difference that sets data applications apart: they are first and foremost about the data they use to deliver value.

Selling amid the ecommerce explosion: 4 success factors to consider

The new normal brought digital commerce to the forefront, with customers preferring remote sales, online ordering and payments, and contactless purchases. The question is, are organizations equipped to cater to these constantly evolving buyer habits? Let’s see how the right solution and strategic application development can help fuel growth in the digital economy.

How to build products that shine

Have you ever looked at clouds with a friend and seen patterns, shapes, and animals that they just can’t make out? The images appear so obviously to you but no matter how hard you try to show them, your friend sees an entirely different shape, or perhaps nothing at all. Building product features can be a similar exercise in frustration. As designers and builders, sometimes we see our own “cloud shapes” in the product.

Data Lake Challenges: Or, Why Your Data Lake Isn't Working Out [VIDEO]

Since the data lake concept emerged more than a decade ago, data lakes have been pitched as the solution to many of the woes surrounding traditional data management solutions, like databases and data warehouses. Data lakes, we have been told, are more scalable, better able to accommodate widely varying types of data, cheaper to build and so on. Much of that is true, at least theoretically.

3 Best Logging Add-Ons for Heroku as of 2021

Heroku is a powerful PaaS that helps developers easily and quickly launch and scale modern applications. In addition to its base features, it also offers Heroku Elements Marketplace, which has add-ons for expanding the functionality of the base platform. While Heroku has built-in logging for your applications in this platform, you may find that the default interface leaves something to be desired.

Accelerated integration of Eventador with Cloudera - SQL Stream Builder

In October 2020, Cloudera made a strategic acquisition of a company called Eventador. This was primarily to augment our streaming capabilities within Cloudera DataFlow. Eventador was adept at simplifying the process of building streaming applications. Their flagship product, SQL Stream Builder, made access to real-time data streams easily possible with just SQL (Structured Query Language).

What is data storytelling? The value of context & narrative in BI

By 2025, Gartner predicts data stories will be the most widespread way of consuming analytics. It’s far from a fad - it’s a part of modern analytics you can’t afford to overlook. However, are you still unsure of what data storytelling is, or why there's a sudden buzz?

The Data Chief: ThoughtSpot's Ajeet Singh on design thinking and disruption

At this point, we can pretty confidently project that the cloud isn’t going anywhere. In the last few years, we’ve seen critical production workloads move to the cloud en masse—and that’s only going to continue. Now that the shift has occurred, other companies will need to get on board as well. The question is no longer if a company should adopt cloud but when and how. Ajeet Singh, ThoughtSpot’s Co-founder and Executive Chairman, is no stranger to embracing change.

Why Is Your Crypto App Not Measuring NPS (Properly)?

Crypto trading and exchange apps have surged in recent years as a direct consequence of the exponential growth in the number and market capitalization of cryptocurrencies. With ever-growing press coverage and heightened visibility, the crypto ecosystem gets more and more crowded every second. This constant influx of players has clearly been beneficial, as the assets traded continue to grow.

Retailers must leverage data to survive this pandemic

Jamie Kiser, COO and CCO at Talend, explains why retailers, striving to ensure they’re not missing out on future opportunities, must leverage one thing: data. By utilizing customer intelligence and better data management, retailers can collect supply chain data in real time and place better orders with suppliers. While major industries from tech to the public sector felt COVID’s pain points, few felt them as acutely as retail.

Democratizing Data Across a Dynamic Business With Qlik

The retail and manufacturing industries have faced many new challenges due to COVID-19. As a leading action sports apparel company, La Jolla Group has persevered during this time. Now more than ever we rely on a holistic view of our forecasting, sales and production data. We have democratized this data, empowering users from across the business to make well-informed business decisions quickly. That’s where Qlik has shown real superiority over other platforms like Power BI and Tableau.

Why CDOs need a people change management champion on their team

CDOs are charged with leading their organizations to become more data-driven. In 2021, this mission is more critical than ever, as the distance between data-driven organizations and laggards has widened. For many, this means accelerating digital transformation plans. Last year, ten-year plans got compressed into a single year. Companies raced to implement new technologies, everything from digital payments, storefronts, chatbots, and cloud databases to augmented analytics solutions.

Using BigQuery Administrator for real-time monitoring

When doing analytics at scale with BigQuery, understanding what is happening and being able to take action in real time is critical. To that end, we are happy to announce Resource Charts for BigQuery Administrator. Resource Charts provide a native, out-of-the-box experience for real-time monitoring and troubleshooting of your BigQuery environments.

ETLG: ETL for Data Governance and Better Security

Most enterprises are leveraging vast reserves of data to improve their business insights and decision-making. However, as companies manage larger stores of data and move more and more information from operational databases to data warehouses, it creates an ever-mounting threat of data breaches.

Will Data Privacy drive an Enterprise Data Strategy?

Data privacy is an increasingly complex and contentious topic. The appropriate use of data and transparency about its potential uses are at the center of debate among the largest Big Tech companies. The protection of and controls around data become increasingly complex when data is used in the context of banking and insurance activities. Personal and confidential information carries heightened sensitivity in light of financial, health, and insurance activities.

Qlik Sense SaaS Data Cataloging - Demo

Qlik Sense now includes an initial set of data catalog capabilities that will be the foundation for additional, related functionality in the future. Integrated Data Cataloging allows users to spend less time finding data and more time getting value out of it. And users can now assign tags and alternative business names to any dataset within the hub, as well as view sample data, making it easy for them to find and determine which data is best to use within a new or existing app.

How to get Observability for Apache Airflow

Observability of Apache Airflow presented by Ry Walker, Founder & CTO at Astronomer. Apache Airflow has become an important tool in the modern data stack. We will explore the current state of observability of Airflow, common pitfalls if you haven't planned for observability, and chart a course for where we can take it going forward.

DataOps Unleashed: Things You May Not Know About Apache Kafka but Should

Things You May Not Know About Apache Kafka but Should presented by Patrick Druley, Senior Solution Engineer at Confluent. In this session, you will learn about some of the common misconceptions, best practices, and little-known facts about Apache Kafka. Event streaming has changed the way businesses think about data movement and integration. Whether you are new to Kafka or have been creating topics and developing clients for years, there's something for everyone in this fun and informative session.

DataOps Unleashed: DataOps Automation and Orchestration With Fivetran, dbt, and the Modern Data Stack

DataOps Automation and Orchestration With Fivetran, dbt, and the Modern Data Stack presented by Nick Acosta, Developer Advocate at Fivetran. Many organizations struggle with creating repeatable and standardized processes for their data pipeline. Fivetran reduces pipeline complexity by fully managing the extraction and loading of data from a source to a destination and orchestrating transformations in the warehouse.

Google BigQuery is a Leader in The 2021 Forrester Wave: Cloud Data Warehouse

We are thrilled to announce that Google has been named a Leader in The Forrester Wave™: Cloud Data Warehouse, Q1 2021 report. For more than a decade, BigQuery, our petabyte-scale cloud data warehouse, has been in a class of its own. We're excited to share this recognition and we want to thank our strong community of customers and partners for voicing their opinion. We believe this report validates the alignment of our strategy with our customers’ analytics needs.

Filter more, pay less with the latest Cloudera Data Warehouse runtime!

One of the most effective ways to improve performance and minimize cost in database systems today is by avoiding unnecessary work, such as data reads from the storage layer (e.g., disks, remote storage), transfers over the network, or even data materialization during query execution. Since its early days, Apache Hive has improved distributed query execution by pushing down column filter predicates to storage handlers like HBase or columnar data format readers such as Apache ORC.
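A toy Python sketch of why pushdown helps (illustrative only; Hive and ORC implement this with column statistics and reader-level predicate evaluation, not Python dicts):

```python
# Toy model of filter pushdown: evaluating the predicate inside the
# "storage reader" avoids materializing rows the query will discard.
rows = [{"id": i, "country": "US" if i % 3 == 0 else "DE"} for i in range(9)]

def filter_after_read(data, pred):
    materialized = [dict(r) for r in data]      # copy every row first
    return [r for r in materialized if pred(r)], len(materialized)

def filter_pushed_down(data, pred):
    out = [dict(r) for r in data if pred(r)]    # copy only matching rows
    return out, len(out)

pred = lambda r: r["country"] == "US"
naive, copied_naive = filter_after_read(rows, pred)
pushed, copied_pushed = filter_pushed_down(rows, pred)
# Same query result either way, but the pushdown path materializes
# 3 rows instead of 9 -- the saved work is the whole point.
```

In a real engine the savings multiply: skipped reads mean less disk I/O, less network transfer, and fewer rows flowing through every downstream operator.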

Why COVID makes a CIO's tech strategy THE strategy

The global pandemic continues to impact our world in so many ways. But it has also served as a catalyst for technological change—a forcing function that is accelerating cloud migration. Every company wants to create better products or services, differentiate offerings, price strategically, grab more market share, retain customers, and grow in a sustainable manner.

Yellowfin does it again - Gartner 2021 Magic Quadrant

Once again, Yellowfin has been recognized in the Gartner Magic Quadrant. This is the eighth time we have been recognized, and for the second year running, we are in the Visionaries quadrant. Yellowfin is also the only Australian vendor included. Gartner has recognized Yellowfin for three things in particular: innovation, breadth of capability, and openness.

Exploring Data & Dashboard Creation on CDP Public Cloud

In this video, we'll walk through an example of how you can use Cloudera Data Warehouse to both easily run ad hoc queries against data and turn the results of those queries into beautiful, interactive data visualizations and dashboards that show off the results of your data exploration.

How to use Snowflake Guides & Labs | Behind The Data Cloud

Developers, in this episode, you’ll learn how to kick off quickly with Snowflake Guides as well as how to access a repository of open source projects in Snowflake Labs. We’ll also reveal Snowflake’s Awesome List which contains key resources, learning opportunities, and open source demos. We switch things up with Daniel Myers from Developer Relations taking a turn as our guest, with Snowflake Community Manager Elsa Mayer acting as host. If you enjoy this episode, make sure to subscribe and share this video with a colleague.

Integrate Amazon RDS With Other Data Sources

How do you integrate data from Amazon RDS (Relational Database Service) with data from other sources such as S3, Redshift, or even MongoDB? The answer is Xplenty. Our data integration on the cloud lets you join data from various sources and then process it to gain new insights. What about storing the results back to RDS? No problem; Xplenty does that as well.

Decentralizing Data Teams With Low Code

When a company is small, having a fully centralized data team may not be an issue. As you grow, however, problems can start to arise. You have one structure that’s supporting all of your business units, and they may not be able to dedicate sufficient time and resources to individual business units. This can lead to delays in surfacing important insights and decisions made on old or inaccurate data.

Prepare Your Data - The Self-Service Data Roadmap, Session 2 of 4

In this webinar, Unravel CDO and VP Engineering Sandeep Uttamchandani describes the second step for any large, data-driven project: the Prep phase. Having found the data you need in the Discover phase, it's time to get your data ready. You must structure, clean, enrich, and validate static data, and ensure that "live," updated or streamed data events are continually ready for processing.
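A minimal sketch of that Prep step in Python, using an invented "orders" feed (the schema and rules here are hypothetical, not from the webinar): structure the raw strings into typed fields, clean them, and validate, routing failures to a reject list for inspection.

```python
# Hypothetical raw feed: strings with whitespace, bad values, and gaps.
RAW = [
    {"order_id": "1001", "amount": " 25.50 ", "region": "emea"},
    {"order_id": "1002", "amount": "oops",    "region": "AMER"},
    {"order_id": "",     "amount": "10.00",   "region": "apac"},
]

def prep(records):
    clean, rejects = [], []
    for r in records:
        try:
            row = {
                "order_id": r["order_id"].strip(),
                "amount": float(r["amount"]),            # structure: cast
                "region": r["region"].strip().upper(),   # clean: normalize
            }
            if not row["order_id"]:
                raise ValueError("missing order_id")     # validate
            clean.append(row)
        except (ValueError, KeyError) as err:
            rejects.append((r, str(err)))
    return clean, rejects

clean, rejects = prep(RAW)  # 1 clean row, 2 rejected with reasons
```

Keeping rejects with their error messages, rather than silently dropping them, is what makes the same pipeline workable for continually arriving "live" data.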

Our Shared Responsibility Model

There’s a common misconception that as soon as a business signs up for a solution from a cloud service provider (CSP), the CSP will automatically ensure all their dealings in that cloud environment are safe and secure. As dedicated as cloud service providers are to cybersecurity, that’s simply not possible. Your cloud provider has no control over the customer data you share, the aptitude of your employees, or how you optimize your own on-premises security and firewalls.

CDP Endpoint Gateway provides Secure Access to CDP Public Cloud Services running in private networks

Cloudera Data Platform (CDP) Public Cloud allows users to deploy analytic workloads into their cloud accounts. These workloads cover the entire data lifecycle and are managed from a central multi-cloud Cloudera Control Plane. CDP provides the flexibility to deploy these resources into public or private subnets. Nearly unanimously, we’ve seen customers deploy their workloads to private subnets.

Powering Algorithmic Trading via Correlation Analysis

Finding relationships between disparate events and patterns can reveal a common thread, an underlying cause of occurrences that, on a surface level, may appear unrelated and unexplainable. The process of discovering the relationships among data metrics is known as correlation analysis. For data scientists and those tasked with monitoring data, correlation analysis is incredibly valuable when used for root cause analysis and reducing time to remediation.
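At its core, correlation analysis computes a coefficient such as Pearson's r between two metric series; a minimal, illustrative version in Python (the return series below are invented, and this is a sketch of the statistic, not a trading system):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Two hypothetical daily return series that tend to move together:
asset_a = [0.01, -0.02, 0.015, 0.03, -0.01]
asset_b = [0.012, -0.018, 0.02, 0.028, -0.008]
r = pearson_r(asset_a, asset_b)  # close to +1.0 for strongly linked series
```

A value near +1 or -1 flags a strong linear relationship worth investigating; note that correlation alone never establishes which metric is the cause, which is why it accelerates root cause analysis rather than replacing it.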

Analyzing Python package downloads in BigQuery

The Google Cloud Public Datasets program recently published the Python Package Index (PyPI) dataset into the marketplace. PyPI is the standard repository for Python packages. If you’ve written code in Python before, you’ve probably downloaded packages from PyPI using pip or pipenv. This dataset provides statistics for all package downloads, along with metadata for each distribution. You can learn more about the underlying data and table schemas here.

FRTB: Will 2023 Finally be the Year?

The Fundamental Review of the Trading Book (FRTB), introduced by the Basel Committee on Banking Supervision (BCBS), will transform how banks measure risk. FRTB is designed to address some fundamental weaknesses that did not get addressed in the post-2008 financial crisis regulatory reforms. In order to help make banks more resilient to drastic market changes, it will impose capital requirements that are more closely aligned with the market’s actual risk factors.

Why DataOps is Critical for Your Business

Data is often compared to oil – it powers today’s organizations just as oil powered the companies of the past. And just like oil, the data that companies collect needs to be refined, structured, and easily analyzable for it to provide real value in the form of actionable insights. Every organization today is in the process of harnessing the power of its data using advanced analytics, likely running on a modern data stack.

Qlik Launches Order-to-Cash Solution Accelerators for SAP: Modern Real-time Analytics to Optimize Your Working Capital

Today, more than ever, line-of-business users responsible for managing working capital need actionable insights in real-time. At the same time, IT/data teams want to accelerate projects, as well as modernize and integrate their data architectures and analytics, while managing risks and costs.

Xplenty's X-Console: A How-To Guide

One of Xplenty's most rewarding features is its ability to enact low-code and no-code digital transformation. Even with no experience in ETL or data integration, non-technical users can take advantage of Xplenty’s intuitive drag-and-drop interface to build robust, complex data pipelines to a data warehouse or data lake in the cloud.

The Rise of DataOps: Governance and Agility with TrueDataOps

The velocity of change is accelerating, and the rate of change businesses are experiencing is astounding. As many organizations have experienced during the pandemic, especially in their supply chains, the ability of the data environment to deliver faster is now mission-critical for all. We need better data, at a rate that’s much faster than before. Businesses need their data teams to become more responsive to changing data demands. That means they need agility.

9 reasons why Microservices Architecture is the superior development approach

Unless you've been living on Mars for the past few years, I'm sure you’ve heard the buzzword “microservices”, also known as microservices architecture. A distinctive development approach, this natural evolution in software engineering came about due to the ever-increasing complexities of enterprise applications. Traditional applications are usually monolithic in design, which makes them bulky and very difficult to adapt to the changing needs of the business.

Analytics best practice: 5 key dashboard design principles

Simply put, a lot of effort goes into creating dashboards that the intended audience doesn’t even look at. The main purpose of a dashboard is to communicate business data in a visual form that highlights what is important, arranges it for clarity, and leads readers through a sequence that tells the story best, so they can make better data-led decisions. Design, and an understanding of how humans make decisions, exist to serve this purpose.

ClouderaNow 21 - Automate Data Enrichment Pipelines

See this demo of Cloudera Data Engineering, which builds upon Apache Spark, allows us to load, transform, and enrich our datasets, and has built-in workload orchestration to automate these pipelines at scale. The demo will also illustrate how easy it is to go from streaming to enrichment and data pipeline automation, all in an end-to-end data platform.

Iguazio Receives an Honorable Mention in the 2021 Magic Quadrant for Data Science and Machine Learning Platforms

We’re proud to share that Iguazio has received an honorable mention in the Gartner Magic Quadrant for Data Science and Machine Learning Platforms, 2021. This is the second year in a row that Iguazio has received this recognition. The 2021 report assesses 20 vendors of platforms enabling data scientists and engineers to develop, deploy and manage AI/ML in the enterprise, across a wide array of criteria relating to their capabilities, performance and completeness of vision.

Top Three Requirements for Data Flows

Data flows are an integral part of every modern enterprise. No matter whether they move data from one operational system to another to power a business process or fuel central data warehouses with the latest data for near-real-time reporting, life without them would be full of manual, tedious and error-prone data modification and copying tasks.

The death of the dashboard: What it really means for analytics

Let’s get this out of the way: To understand the much discussed ’death of the dashboard' proclamation, the phrase needs to be viewed under a different lens beyond the literal. Firstly, it's not a new concept at all: Yellowfin has been saying it for years. The problem is the current confusing interpretation of what it means for business intelligence. In short, dashboards aren’t actually dying, nor is their usefulness for certain users spent.

Developing Data Literacy and Standardized Business Metrics at Tailored Brands

In this episode of CDO Battlescars, Sandeep Uttamchandani, Unravel Data’s CDO, speaks with Meenal Iyer, Sr. Director of Enterprise Analytics and Data at Tailored Brands. They discuss battlescars in two areas, data and metrics: Growing Data Literacy and Developing a Data-Driven Culture and Standardization of Business Metrics.

The Dashboard Is Dead, Long Live the Dashboard

There is a lot of talk these days about the dashboard being a thing of the past. After all, simply displaying KPIs and visualizations in a dashboard is something everyone can do, right? If monitoring KPIs is all you need to do, then we would agree: The dashboard is largely dead. We can deliver those singular data points to you anywhere, monitoring what you’re interested in, alerting you to changes and triggering action.

How a data analyst came to understand what Keboola has to offer to ease his frustrations

In their quest to find out if Keboola could be of wider benefit to a company that has so far been using the platform only as their ETL solution, Michal Hruska, a senior data consultant at Keboola, and Pavel Dolezal, Keboola’s CEO, met with Tim, the company’s data analyst. They sat down to talk about the beaten tracks of working with data and its challenges and the ways Keboola can help solve them.

The Snowflake Data Cloud for Healthcare & Life Sciences

Learn how the Snowflake Data Cloud helps Healthcare and Life Sciences organizations deliver improved care, products, services, and therapies. These organizations are under increasing pressure to leverage data to do so. The Snowflake Data Cloud can help them centralize, unite, and securely share sensitive health and life sciences data to deliver comprehensive, equitable, and individualized care and services.

Amazon RDS: The Best Relational Database Service?

Companies these days are handling more data than ever: an average of 163 terabytes (163,000 gigabytes), according to a survey by IDG. Efficiently storing, processing and analyzing this data is essential in order to glean valuable insights and make informed business decisions. Yet the question remains: What is the best way to store enterprise data? For many use cases, the most appealing choice is a relational database.

Multi-Cloud Data Analytics: What, Why, and How

What is multi-cloud data analytics and why are so many companies getting on board? Cloud computing itself is now a well-established best practice, but a multi-cloud strategy is nearly as common these days. While 94 percent of organizations are now using cloud computing, 84 percent are using a multi-cloud data strategy. Multi-cloud is an especially fruitful data strategy for companies pursuing data analytics.

Xplenty: The AWS Solution Architect's Secret Weapon

AWS Solution Architects are in red-hot demand, and the AWS Certification is the highest-paying certification in the United States. As such, you wear many hats as a Solution Architect for Amazon Web Services. You're a problem-solver, a creative genius, a multitasker, and a big-picture thinker. And you design AWS implementations better than anyone else you know. But there are some things about being an AWS Solution Architect that aren't so rosy, Amazon's ever-changing recommendations among them.

Considering Hybrid Cloud? Four Top Questions, Answered.

2021 is set to be the year of hybrid cloud. In fact, Forbes has even listed it as one of the top 10 digital transformation trends of the year, declaring it the “winning enterprise architecture.” A multiple cloud approach does provide greater choice and greater flexibility – two major benefits at a time when agility and adaptability have never been more important. But this approach comes with greater operational complexity.

A view from inside: How Keboola benefits from using Keboola Connection - The show must go on!

It’s been almost a year since I wrote about using Keboola Connection in Keboola. A lot of things have happened since then: my Bloodborne board game finally arrived, I'm a double uncle… oh, and I got engaged. I also celebrated another anniversary this month - seven years of working at Keboola! And I believe that we’ve made some great progress yet again. Last time, I gave somewhat of an intro to our internal reporting.

How to Offload ETL from Redshift to Xplenty

Amazon Redshift is great for real-time querying, but it's not so great for handling your ETL pipeline. Fortunately, Xplenty has a highly workable solution. Xplenty can be used to offload ETL from Redshift, saving resources and allowing each platform to do what it does best: Xplenty for batch processing and Redshift for real-time querying. Redshift is Amazon’s data warehouse-as-a-service, a scalable columnar DB based on PostgreSQL.

5 Customer Data Integration Best Practices

For the last few years, you have heard the terms "data integration" and "data management" dozens of times. Your business may already invest in these practices, but are you benefitting from this data gathering? Too often, companies hire specialists, collect data from many sources and analyze it for no clear purpose. And without a clear purpose, all your efforts are in vain. You can take in more customer information than all your competitors and still fail to make practical use of it.

Data governance beyond SDX: Adding third party assets to Apache Atlas

Governance and the sustainable handling of data is a critical success factor in virtually all organizations. While Cloudera Data Platform (CDP) already supports the entire data lifecycle from ‘Edge to AI’, we at Cloudera are fully aware that enterprises have more systems outside of CDP. It is crucial that CDP does not become the next silo in your IT landscape.

7 ways to improve your eCommerce customer data collection

In the last year, businesses have lost $756 billion because of poor eCommerce personalization. Customers have become accustomed to a tailored digital service. Unless you start gathering data about your customers and improving their online shopping experience, customers will walk their wallets to your competitor, who will offer them a better service. We will look at 7 use cases of how to better collect customer data in your eCommerce store and put it to good use. Don’t worry.

The Route to Automated Remediation

An abundance of information can be daunting for any company. If internal teams do not know where the data is, it might hamper their efficiency at the cost of data quality and cleanliness. From a cost-effectiveness viewpoint, organizations are likely to waste excessively by hanging on to redundant data or storing varied data in one location irrespective of their sensitivity level.

The Road to Zero Touch Goes Through Machine Learning

The telecom industry is in the midst of a massive shift to new service offerings enabled by 5G and edge computing technologies. With this digital transformation, networks and network services are becoming increasingly complex: RAN, Core and Transport are only a few of the network’s many layers and integrated components. Today’s telecom engineers are expected to handle, manage, optimize, monitor and troubleshoot multi-technology and multi-vendor networks.

Now Generally Available, Snowflake's Search Optimization Service Accelerates Queries Dramatically

Snowflake customers want to discover insights from their data faster than ever, which is challenging because data volumes are growing at a breakneck pace. Effective searches are critical to customer satisfaction. Today, we’re excited to announce the general availability of search optimization, which significantly improves the performance of selective queries on large tables.
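Search optimization maintains a persistent auxiliary structure that lets highly selective queries skip most of a table’s data instead of scanning it; in Snowflake itself it is enabled per table with `ALTER TABLE my_table ADD SEARCH OPTIMIZATION;`. As a toy analogy only (this is not Snowflake’s implementation), the difference between a full scan and a prebuilt lookup structure can be sketched in Python:

```python
# Toy analogy: a selective point-lookup query over a large "table".
rows = [{"id": i, "payload": f"row-{i}"} for i in range(100_000)]

def scan(rows, target):
    """Full scan: examine every row, which is what a selective query pays without an index."""
    return [r for r in rows if r["id"] == target]

# Build the auxiliary lookup structure once, up front (analogous to the
# search access path the service maintains in the background).
index = {r["id"]: r for r in rows}

def indexed_lookup(index, target):
    """Indexed access: jump straight to the matching row."""
    r = index.get(target)
    return [r] if r is not None else []

# Both return the same answer; only the work done differs.
assert scan(rows, 42) == indexed_lookup(index, 42)
```

The trade-off is the same in both cases: extra storage and background maintenance for the lookup structure, in exchange for point lookups that no longer pay for table size.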

Correlation Analysis Explained

When you detect that something is off in your business, how long does it take you to find the root cause? The longer it takes, the more it can cost you. Correlation analysis identifies relationships between KPIs, which business teams use to accelerate root cause analysis (RCA) and reduce mean time to remediation (MTTR). Doing it manually, however, can be tedious and limits your visibility.
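At its core, correlation analysis scores how strongly two KPI series move together, typically with the Pearson coefficient r in [-1, 1]. A minimal pure-Python sketch (the KPI names and values below are made up for illustration):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical daily KPIs: page load time (ms) vs. cart abandonment rate (%).
load_ms = [180, 220, 260, 300, 340]
abandon = [8.1, 9.0, 10.2, 11.1, 12.0]
r = pearson(load_ms, abandon)  # close to +1: the two KPIs rise together
```

A strong correlation like this does not prove causation, but it tells an RCA team which pair of metrics to investigate first, which is exactly the triage step that automated tools accelerate.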

Why operational reporting is still essential in modern BI solutions

Yellowfin frequently speaks with prospects that list operational reporting as a critical requirement when buying an analytics solution. Despite the usefulness of the latest features, standard reporting capabilities are still an essential part of their BI checklist. But why?

Going Beyond Observability for Spark Applications & Databricks Environments

Join Chris Santiago, Solutions Engineer Director at Unravel Data, as he takes you through Unravel’s approach to getting better, finer-grained visibility into Spark applications and how to tune and optimize them for resource efficiency. The session covers:

- An overview of out-of-the-box tools like Ganglia and their overall lack of visibility into Databricks jobs
- How Unravel helps you gain finer-grained visibility, observability, and monitoring for Spark data pipelines
- How Unravel can recommend better configurations and tuning of Spark applications

Inventory management with BigQuery and Cloud Run

Many people think of Cloud Run just as a way of hosting websites. Cloud Run is great at that, but there's so much more you can do with it. Here we'll explore how you can use Cloud Run and BigQuery together to create an inventory management system. I'm using a subset of the Iowa Liquor Control Board data set to create a smaller inventory file for my fictional store. In my inventory management scenario, we get a CSV file dropped into Cloud Storage to bulk-load new inventory.
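The bulk-load step boils down to parsing the dropped CSV and turning its rows into inventory records before they are inserted into BigQuery. A minimal sketch with Python's csv module; the column names and the in-memory dict are assumptions standing in for the real Cloud Storage file and the BigQuery insert:

```python
import csv
import io

# Stand-in for the CSV file dropped into Cloud Storage.
raw = """\
item_id,description,quantity
17,Bourbon 750ml,24
23,Rye 750ml,12
17,Bourbon 750ml,6
"""

inventory = {}
for row in csv.DictReader(io.StringIO(raw)):
    # Accumulate quantity per item, as a bulk load into an inventory table would.
    item = row["item_id"]
    inventory[item] = inventory.get(item, 0) + int(row["quantity"])

print(inventory)  # {'17': 30, '23': 12}
```

In the real pipeline, a Cloud Run service triggered by the upload would read the object with the Cloud Storage client library and stream the parsed rows into a BigQuery table instead of a local dict.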

Protecting Personal Data: GDPR, CCPA, and the Role of ETL

The growth of data has been exponential. By 2023, it's anticipated that approximately 463 exabytes (EB) of data will be created every day. To put this into perspective, one exabyte is equivalent to 1 billion gigabytes. In 2021, an estimated 320 billion emails will be sent daily, many of which contain personal information. Data collected around the globe contains the type of information that businesses leverage to make more informed decisions.
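The unit conversion behind those figures is worth making explicit; a one-line check using the article's own numbers (463 EB per day, with 1 EB = 1 billion GB as stated):

```python
GB_PER_EB = 10**9  # 1 exabyte = 1 billion gigabytes, as defined above

daily_eb = 463
daily_gb = daily_eb * GB_PER_EB
print(daily_gb)  # 463000000000 GB of data per day
```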

Using Xplenty with Parquet for Superior Data Lake Performance

Building a data lake in Amazon S3 using AWS Spectrum to query the data from a Redshift cluster is a common practice. However, when it comes to boosting performance, there are some tricks that are worth learning. One of those is using data in Parquet format, which Redshift considers a best practice. Here's how to use Parquet format with Xplenty for the best data lake performance.

Major Fortune 100 Brands Choose the Unravel Data Platform

It’s hard to believe that Unravel Data was founded eight years ago, though the first few years were dedicated to defining and building our initial product. Since then, the company has raised several rounds of funding, released several versions of our flagship DataOps Platform, and is now used by some of the world’s leading brands to improve the efficiency and reliability of their data pipelines.

Using SQL to democratize streaming data

Streaming analytics is crucial to modern business – it opens up new product opportunities and creates massive operational efficiencies. In many cases, it’s the difference between creating an outstanding customer experience versus a poor one – or losing the customer altogether. However, in the typical enterprise, only a small team has the core skills needed to gain access and create value from streams of data.
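Streaming SQL lets that small team's skills scale: an analyst can declare "events per minute" instead of writing stream-processing code. As a toy sketch of what a tumbling-window COUNT computes (the Flink-style `TUMBLE(event_time, INTERVAL '1' MINUTE)` grouping, with made-up epoch-second timestamps):

```python
from collections import Counter

def tumbling_count(timestamps, window_seconds):
    """Count events per fixed, non-overlapping time window."""
    counts = Counter()
    for ts in timestamps:
        # Each event belongs to exactly one window, keyed by its start time.
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)

# Made-up event timestamps (epoch seconds).
events = [100, 110, 155, 161, 170, 230]
print(tumbling_count(events, 60))  # {60: 2, 120: 3, 180: 1}
```

The value of expressing this in SQL rather than code is precisely the democratization the article describes: the windowing, state, and incremental updates are handled by the engine, not by each analyst.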

4 ways advanced analytics can help exceed customer expectations

In 2020, companies across countries and industries scrambled to stabilize operations, ensure the health and safety of employees, and find ways to continuously deliver. But in the wake of all this uncertainty, one thing remains crystal clear: customer engagement is more crucial than ever. After all, with uncertainty comes doubt. Customers may be on edge, and they need reassurance that organizations will continue to meet—and exceed—their expectations.

API Analytics Across the Developer Journey

Every API product manager wants as many developers as possible adopting and using their APIs. They want them to get to Hello World quickly and have a great developer experience (DX) along the way. Of course, the bigger goal is to be able to tie API success into the larger objectives of the company. For many, despite the best intentions, their metrics are too simplistic, narrow, and based on outdated models of engagement.

4 reasons why the BI industry has run out of ideas

While there have been some incremental improvements in the last few years, there has been nothing significant recently, and I think there are four clear reasons for that. Firstly, there has been a lot of consolidation in the industry. When that happens, behemoth vendors focus far more on selling than on building new products that could disrupt the industry.