April 2022

Interview With Director of Data Science, Michael Chang

For our latest expert interview on our blog, we’ve welcomed the Director of Data Science and Machine Learning at Included, Michael Chang. Michael helps measure and optimize workforce diversity and inclusion efforts through data. Prior to Included, Michael worked in various data capacities at Facebook, Teach for America, Interactive Corp, and eBay. Michael also enjoys teaching and is an adjunct instructor for data science at UCLA Extension and Harvard FAS.

Data Legends Podcast: Differentiate or Drown: Managing Modern-Day Data

What are the top three mega trends for data leaders this year (and beyond)? In this episode, we tackle cloud data platforms, the five subdisciplines of observability, and real-time machine learning. Listen to our conversation with Kevin Petrie, Vice President of Research at Eckerson Group. Hear the answers to these questions: Why is a cloud data platform a common destination with many routes? Which tools standardize the different classes of observability? How does the interrelationship between model observability and ML work?

The Secrets Behind Personalization: The Customer 360 view

In the era of hyper-personalization, your customers expect you to tailor your products and services to a market of one (aka, them). This is hard to do when the majority of big corporations are built for the masses, not for the hyper-niche markets. So how do you move from addressing the “male or female between 20 and 65 years” segment to having personalized conversations? With the Customer 360 view.

Three Ways Active Intelligence Can Support the CFO

Finance has been at the forefront of enterprise analytics for decades. Over the years, these analytics have evolved from reactive, descriptive analytics related to financial performance, treasury holdings, and inventory management to predictive and prescriptive analytics for risk, credit, and financial business modeling.

Store & Access Information at Scale: How Drawbacks Lead to Innovation

Ever since there was a need to both store and access information, there have been both physical and logical means to achieve it: everything from stone tablets to paper to a proliferation of technology in the digital age. As information became easier to create, databases were built to give it structure to simplify its access, accompanied by characteristics to improve performance and scale.

How To Scale Threat Detection & Response With Snowflake & Securonix

Securonix brings exciting new capabilities to the field of threat detection and response thanks to its integration with Snowflake. Together, the two companies provide a split-architecture solution that solves the problem of data silos and enables organizations to make better, more timely decisions about potential threats to their organization. It’s a next-gen SIEM solution already finding widespread use in such organizations as healthcare institutions, airlines, and telecommunication companies.

Data Legends Podcast: Musings on Data Lakes, Computer Science, AI & More

When it comes to building new products, there’s a fine line between which pieces of the puzzle should be owned by humans with deep domain knowledge, and which aspects can or should be automated through AI. How far can the boundary be pushed? We speak with Jeremy Foran, Chief Technology Officer at Purple Cow Internet, about his new role as CTO at a fast-growing internet service provider.

Data Legends Podcast Episode 2, Amr Awadallah

Historically, knowledge has been relatively siloed by language. But with advancements in AI, there are now more opportunities than ever to capture broader and deeper insights across the written and spoken word by breaking down language and distance barriers globally. Listen to our conversation with Amr Awadallah, Founder and CEO at ZIR AI and former technology exec at Cloudera, Google, and Yahoo.

Data Legends Podcast: Making Sense of Data Quality Amongst Current Seasonality & Uncertainty

When providing the data to support marketing, it's important to frame and validate its quality based on whether it meets "the six C’s." Is the data clean, complete, correct, comprehensive, chosen well, and calculable? In our conversation with Christopher Penn, Co-Founder and Chief Data Scientist at TrustInsights.ai, we discuss numerous questions many in the industry are asking today.

4 Years Ago, the GDPR Changed Everything. Now What?

The EU’s General Data Protection Regulation is approaching the fourth anniversary of its implementation in May 2018. Since its inception, it has been hailed as a groundbreaking framework for making users’ rights on the Internet a human right. Its impact in many other markets has been undeniable and it truly has affected how the world of the Internet works, even outside EU borders.

A Window Into the Future of Data in Motion and What It Means for Businesses

Modern businesses have vast amounts of data at their fingertips and are acutely aware of how enterprise data strategies positively impact business outcomes. Despite this, only a handful of organisations interact with all stages of the data life cycle process to truly distill information that distinguishes future-ready businesses from the rest.

Power to the people - creating trust in data with collaborative governance

Today’s enterprise IT organizations are experiencing a massive upheaval due to pressure from employee forces. It’s a familiar story. Just think of the turmoil caused by the dawning of the bring-your-own-device (BYOD) era, with employees demanding to use their beloved personal mobile phones for work.

Pentaho 9.3 Suite Drives Modern Business Intelligence Across the Hybrid Cloud

Hitachi Vantara’s latest improvements to Pentaho make it significantly easier for organizations to move data workloads from on premises to the cloud and back again. The new Pentaho 9.3 Long-Term Support (LTS), part of Hitachi’s Lumada portfolio, offers a cloud deployment option that we anticipate will be a critical accelerant of data-driven transformation.

The Ultimate Guide to E-commerce Tech Stack

When you start an e-commerce business, you know that you’ll need some top-level digital services to get your products and services online and in front of the right audience. Not every online store is as big as Amazon, but with effective tools, you can still make waves. How do you pick the right apps and SaaS or PaaS to make your online business as profitable as possible?

Why SLAs Are Critical to Ensuring Data Reliability

As far back as the 1920s, Service Level Agreements (SLAs) were used to guarantee a certain level of service between two parties. Back then, it was the on-time delivery of printed AR reports. Today, SLAs define service standards such as uptime and support responsiveness to ensure reliability. The benefit of having an SLA in place is that it establishes trust at the start of new customer relationships and sets expectations.

Monitor & analyze BigQuery performance using Information Schema

In the exponentially growing data warehousing space, it is very important to capture, process and analyze the metadata and metrics of the jobs/queries for the purposes of auditing, tracking, performance tuning, capacity planning, etc. Historically, on-premise (on-prem) legacy data warehouse solutions have mature methods of collecting and reporting performance insights via query log reports, workload repositories, etc. However, all of this comes with overhead costs in storage and CPU.

Overcome These 4 Common D365 F&SCM Challenges with Jet Reports

So, you’re working for a medium to large enterprise that uses Microsoft Dynamics 365 Finance & Supply Chain (D365 F&SCM) as its ERP system. You have multiple options for reporting and analysis available to you from Microsoft. But if your business is growing, you are probably looking to push beyond the out-of-the-box capabilities to develop your own custom analysis and meaningful data insights.

Winning with Analytics: The Transformation of Clinical Trials for Scientists

The end goal of clinical technology organizations in the US and abroad is to use modern technology to bring life-saving new treatments to fruition. Leaders in this sphere help generate the evidence and insights to help biotech, pharmaceutical, medical device, and diagnostic companies accelerate value, minimize risk, and optimize outcomes. Life sciences clients recognize that technology is the answer to inefficiencies and delays in delivering new treatments to the public.

Beyond Observability for the Modern Data Stack

The term “observability” means many things to many people. A lot of energy has been spent—particularly among vendors offering an observability solution—in trying to define what the term means in one context or another. But instead of getting bogged down in the “what” of observability, I think it’s more valuable to address the “why.” What are we trying to accomplish with observability? What is the end goal?

Great Financial Storytelling Begins with Great Tech

Finance professionals know that data matters, but stories convey truth in ways that mere numbers simply cannot. Those who work in finance may describe themselves as “numbers people.” They have a natural affinity for quantitative information, as well as a knack for drawing meaningful conclusions when presented with a collection of numerical figures. Even so, finance team members probably understand and retain information more readily when it’s presented in narrative form.

Data Observability Driven Development | The perfect analogy for beginners

When explaining what Data Observability Driven Development (DODD) is and why it should be a best practice in any data ecosystem, using food traceability as an analogy can be helpful. The purpose of food traceability is to be able to know exactly where food products or ingredients came from and what their state is at each moment in the supply chain. It is a standard practice in many countries, and it applies to almost every type of food product.

What Are Embedded Data Visualization Tools?

Data visualization is a core element and key capability of embedded analytics. Embedded data visualization tools create real-time visualizations (charts, graphs, etc.), can be embedded into any software application or integrated into a cloud or standalone app, and provide every user with quick and easy ways to visualize information. This article explores the benefits of using embedded data visualization tools in your applications, and discusses how they can improve upon the traditional analytics experience.

Support Multiple Data Modeling Approaches with Snowflake

Since I joined Snowflake, I have been asked multiple times what data warehouse modeling approach Snowflake best supports. Well, the cool thing is that Snowflake supports multiple data modeling approaches equally. Turns out we have a few customers who have existing data warehouses built using a particular approach known as the Data Vault modeling approach, and they have decided to move into Snowflake. So the conversation often goes like this.

How E-Commerce Influences B2B Analytics in Your Data-Driven Enterprise

E-commerce is the future of B2C retail, with online retailers expected to generate a total of $5.55 trillion in sales in 2022. The benefits of e-commerce are almost endless: This business model overcomes geographical limitations, allows merchants to sell new products and services 24 hours a day, and eliminates many of the costs associated with managing a brick-and-mortar store. But do you know how e-commerce influences B2B analytics?

3 Use Cases for Embedded BI in 2022 to Enhance Your SaaS Product

As data takes center stage for many organizations, data experts and business leaders are figuring out how to analyze it all. The total amount of data created, consumed, and copied globally reached 64.2 zettabytes in 2020, with a forecast CAGR of 19.2% from 2022 to 2025. In other words, we need better tools to leverage it! Organizations today leverage their data to provide better services, build better products, and create new revenue streams.

What's an E-Commerce Dashboard? & Why Does it Matter?

Choosing an e-commerce platform like Shopify is just the first step when building an online business. In order to make smarter business decisions and boost your online sales and marketing campaigns, you’ll need information about how your e-commerce site is performing. That’s exactly what an e-commerce dashboard is designed to accomplish.

Automatically extract TML definitions from tml/export

ThoughtSpot elements such as search, Liveboards, and data connections are all defined in a JSON-based metadata definition called ThoughtSpot Modeling Language, or TML. Recently, I blogged about how you can use Postman to access platform APIs to import/export TML as part of your devops processes; for example, to check in TML definitions and push to another environment via a continuous integration process. The TML export is pretty straightforward.

Webinar Recap: Functional strategies for migrating from Hadoop to AWS

In a recent webinar, Functional (& Funny) Strategies for Modern Data Architecture, we combined comedy and practical strategies for migrating from Hadoop to AWS. Unravel Co-Founder and CTO Shivnath Babu moderated a discussion with AWS Principal Architect, Global Specialty Practice, Dipankar Ghosal and WANdisco CTO Paul Scott-Murphy. Here are some of the key takeaways from the event.

Building vs. Buying Your Modern Data Stack: A Panel Discussion

One of the highlights of the DataOps Unleashed 2022 virtual conference was a roundtable panel discussion on building versus buying when it comes to your data stack. Build versus buy is a question for all layers of the enterprise infrastructure stack. But in the last five years — even in just the last year alone — it’s hard to think of a part of IT that has seen more dramatic change than that of the modern data stack.

Elevate Gives Retailers a Powerful New Tool for Managing Supply Chains

In today’s world, retail customers expect things fast. They want their products on time and they want their orders not to be canceled. And when things go wrong, they want answers. To deliver that experience, retailers need to be able to understand at a granular level how their customers’ orders are moving through their supply chains. In this episode of “Powered by Snowflake,” Daniel Myers chats with Elevate Co-founder and CTO James Sutton about his company’s recently introduced retail operations platform that provides the analytics retailers need to evaluate and manage supply chain performance.

Why Google Cloud BigQuery for SAP Enterprises?

What is BigQuery and how can it help you gain new insights from your SAP data? In this video, Kevin Nelson, a Developer Advocate at Google Cloud, will demonstrate how to integrate your SAP data with BigQuery to help drive new insights across your organization and give more power to your data analytics users.

AstraZeneca: Building a collaborative culture between data leaders and business teams

How are data projects put into production at AstraZeneca? Successful projects require discussion, business expertise, tech knowledge, and agility. AstraZeneca achieved this by building a collaborative culture between data leaders and business teams. “Data without trust is useless. Data Governance is critical to knowing that we can trust our data and ensuring that the data is well understood, well looked after, and only accessible to the right people.”

Top 3 Automation Best Practices for Viewpoint Users

Construction is a vast and complex industry. As such, the simpler and more streamlined your processes are, the better. Complicated, manual tasks hamper construction professionals’ ability to conduct business by reducing efficiency and productivity. Despite the obvious benefits of modern automated workflows, a survey conducted by McKinsey Global Institute found construction is the second-slowest adopter of digitization.

The Ultimate Shopify E-Commerce Tech Stack Guide

As a Shopify e-commerce business, you’ve made the right choice: you have access to the best e-commerce platform for creating and managing an online store. However, choosing an e-commerce solution like Shopify is just the first piece of the puzzle. To truly take advantage of your e-commerce software, you’ll need a solid Shopify e-commerce tech stack to back it up.

From the Ground Up: The Truth About Data Innovation

Data holds incredible untapped potential for Australian organisations across industries, regardless of individual business goals, and all organisations are at different points in their data transformation journey with some achieving success faster than others. To be successful, the use of data insights must become a central lifeforce throughout an organisation and not just reside within the confines of the IT team. More importantly, effective data strategies don’t stand still.

Making Sense of Data Quality Amongst Current Seasonality & Uncertainty

When providing the data to support marketing, it's important to frame and validate its quality based on whether it meets "the six C’s." In our conversation with Christopher Penn, Co-Founder and Chief Data Scientist at TrustInsights.ai, we discuss questions many in the industry are asking today.

The New Breed: How to Think About Robots

You’ve heard the saying “if you do what you love, you’ll never work a day in your life,” right? Well, I hate to say it, but that’s me. I never dreamed that I would wind up in a field that combined all of my interests, but somehow that happened. Through my research at the MIT Media Lab I get to apply my legal and social sciences background to human-robot interaction. Which yes, does mean that I mostly get to play with robots all day.

Heureka Group: Empowering over 5,000 e-commerce shops with data insights and generating 450k EUR per year by enriching data

Heureka Group is an online shopping advisor that prides itself on providing simple, fast, secure, and enjoyable e-commerce and price-comparison solutions across central and eastern Europe. In less than 15 years, the company has grown to more than 20 million monthly users, becoming one of Europe’s leading e-commerce platforms. Heureka Group continues to build on that success by launching and acquiring e-commerce clients across the region.

MeDirect Bank: Thinking ahead for a seamless transition to the cloud

MeDirect Bank is a bank and financial services company based in Malta that provides services ranging from deposit accounts to mutual funds to wealth management. The company has evolved from its regional roots to become the third largest bank in Malta, with customers all over the world. And it’s done so by evolving with its customers’ needs; providing accessible, transparent services wherever customers are — physically or digitally.

Webinar Recap: Optimizing and Migrating Hadoop to Azure Databricks

The benefits of moving your on-prem Spark Hadoop environment to Databricks are undeniable. A recent Forrester Total Economic Impact (TEI) study reveals that deploying Databricks can pay for itself in less than six months, with a 417% ROI from cost savings and increased revenue & productivity over three years. But without the right methodology and tools, such modernization/migration can be a daunting task.

Data Lake vs Data Warehouse: 7 Critical Differences

A lot of terms get thrown around in the big data space that every business should understand. Many of these terms are easily confused with each other. This is the case with data lakes vs data warehouses. What are some of the most important differences between them, and how can your business use them most effectively for data analytics and data management? Read on to learn the differences between data lakes and data warehouses.

How To Simplify The ETL Code Process with Low-Code Tools

The ETL (extract, transform, load) process is one of the most critical, and one of the most challenging, parts of enterprise data integration. But what if we told you there was a low-code ETL solution to your problems?

Top 3 Data and Analytics Trends to Prepare for in 2022

The past two years have seen significant disruption across sectors, markets and technology dynamics, forever changing the way businesses, workers, and customers use data. But while global conditions have created uncertainty, they have also driven more opportunities for organizations to optimize processes to respond faster to evolving customer demands, competitor shifts, and new risks, leveraging new, innovative data solutions.

New Snowflake Features Released in March 2022

In March, Snowflake continued to enhance its capabilities around data programmability and data pipeline development, with the Snowpark API and stored procedures for Java now in public preview, schema detection now generally available, and the Snowflake SQL API generally available. In addition, Snowflake’s user interface, Snowsight, is generally available. Not to mention an expanded selection of new partners to choose from in Snowflake Data Marketplace.

Get Your Retail Plan in Shape: A 7-Step Regimen for Year-Round Selling

Once upon a time, the retail calendar centered itself on the Christmas season. Now, the retail surge is year-round. Not just the wave of traditional seasonal holidays from Valentine’s Day to the 4th of July, but also newer sales holidays, such as Cyber Monday, or even holidays created by some gigantic companies themselves, like Amazon’s Prime Day. Now, instead of a steady pace leading up to a frenzied December, retailers are in sprint mode all the time.

Automate your Reports on Google Sheets with Hevo Activate

Business users usually ask you to share business reports in spreadsheets. They are highly familiar with Sheets and prefer their reports there. They assume delivering reports in XLS format is quick and easy. But we understand the time and effort required to export reports to spreadsheets. Each time, you have to run queries on your centralized data in the warehouse and then export the results in XLS format. You may also need to edit and update the sheet regularly.

Estée Lauder: Transforming the retail experience during a pandemic

Change is something every business leader has to deal with. But in the past two years, doing business has gotten weirder and harder. In this video, Christal Bemont, Talend's CEO, and David Malloy, Executive Director of the Estée Lauder Companies, discuss how he used healthy data to fuel a dramatic business transformation of the major cosmetics retailer.

5 Secrets to Understanding Value Shoppers

Value shoppers are the savvy buyers, the ones always on the lookout for the best deals and who know where to find every discount and coupon. If you're an e-commerce retailer, the importance of understanding your value shoppers can’t be overstated. Consumers searching for a great deal will absolutely look elsewhere if they’re not impressed by your online offerings.

Data In Motion: NASA and Aurica

Some 300 million years ago, Earth had one continent called Pangea. Over millions of years, that vast single land mass broke up and drifted in different directions, creating the seven continents that exist today. Since the planet changed so dramatically over millennia, it raises an obvious question: How will it change in the future? The same forces, plate tectonics and continental drift, that broke up Pangea hundreds of millions of years ago still exert themselves.

AI Winter is coming. Get ready with Data Observability.

“Without clean data, or clean enough data, your data science is worthless.” (Michael Stonebraker, adjunct professor, MIT) AI is one of the fastest-growing and most popular data-driven technologies in use. Nine in ten Fortune 1000 companies currently have ongoing investments in AI. So you may be wondering: how could there possibly be another AI winter?

Commerzbank | Unleashing Hidden Data Treasures for Customers

Like many financial institutions, Commerzbank was challenged with staying flexible to meet customer needs, while also meeting regulatory compliance. In this Movers & Makers, Justyna Lebedyk, Product Owner in Big Data for Commerzbank, talks about how their digital transformation with the hybrid cloud and Cloudera allowed them to overcome this challenge.

Modernizing the Analytics Data Pipeline

Enterprises run on a steady flow of best-fit data analytics. Robust processes ensure these assets are always accurate, relevant, and fit for purpose. Increasingly, organizations are implementing these processes within structured development and operationalization “pipelines.” Typically, analytics data pipelines include data engineering functions such as extract-transform-load (ETL) and data science processes such as machine-learning model development.

Build or Buy Embedded Analytics: What's the difference?

Companies nowadays are well aware of the importance of embedded analytics when it comes to being data-driven. Today, building your own analytics infrastructure into your software applications for your customers is not the only option anymore. There is a growing market of embedded analytics tools that offer purchasable solutions for data analysis.

Automate Your Yardi Real Estate Data Collection and Management

Between financial statements, signage, storage space, office space floors, and land, real estate financial professionals manage many of their businesses’ most critical moving parts. And real estate is growing: by 2026, the market is expected to reach $5,388.87 billion at a compound annual growth rate (CAGR) of 9.6%. When the time comes for month-end reporting, ERPs like Yardi manage and compartmentalize data with out-of-the-box reports.

Cloud vendor's MLOps or Open source?

If someone had told my 15-years-ago self that I’d become a DevOps engineer, I’d have scratched my head and asked them to repeat that. Back then, of course, applications were either maintained on a dedicated server or (sigh!) installed on end-user machines with little control or flexibility. Today, these paradigms are essentially obsolete; cloud computing is ubiquitous and successful.

BigQuery Omni innovations enhance customer experience to combine data with cross cloud analytics

IT leaders pick different clouds for many reasons, but the rest of the company shouldn’t be left to navigate the complexity of those decisions. For data analysts, that complexity is most immediately felt when navigating between data silos. Google Cloud has invested deeply in helping customers break down these barriers inherent in a disparate data stack. Back in October 2021, we launched BigQuery Omni to help data analysts access and query data across the barriers of multi cloud environments.

Automatic data risk management for BigQuery using DLP

Protecting sensitive data and preventing unintended data exposure is critical for businesses. However, many organizations lack the tools to stay on top of where sensitive data resides across their enterprise. It’s particularly concerning when sensitive data shows up in unexpected places – for example, in logs that services generate, when customers inadvertently send it in a customer support chat, or when managing unstructured analytical workloads.

Business Intelligence on the Cloud Data Platform: Approaches to Schemas

The cloud data platform combines data warehouse and data lake capabilities to support the exploding world of analytics. Like a data warehouse, the cloud data platform structures, transforms, and queries data. Like a data lake, it classifies multi-structured data objects in an elastic object store. The cloud data platform provides an ideal launchpad for modern business intelligence (BI) projects that need fast, flexible access to lots of varied data. As you might expect, this is a tall order to fill.

Why You Need a Fully Automated Data Pipeline

When you think about the core technologies that give companies a competitive edge, a fully automated data pipeline may not be the first thing that leaps to mind. But to unlock the full power of your data universe and turn it into business intelligence and real-time insights, you need to gain full control and visibility over your data at all its sources and destinations.

Unstructured Data Now Generally Available in Snowflake, Processing with Snowpark in Public Preview

We’re excited to announce the general availability of the unstructured data management functionality in Snowflake. We launched public preview of this functionality in September 2021, and since then we have seen adoption by customers across industries for a variety of use cases. These use cases include storing and securing call center recordings, securely sharing PDF documents in Snowflake Data Marketplace, storing medical images and extracting data from them, and many more.

MongoDB vs. PostgreSQL: Detailed Comparison of Database Structures

One of the most important parts of the function of any company is a secure database. With phishing attacks, malware, and other threats on the rise, it is essential that you make the right choice in order to keep your data safe and process it effectively. However, it can be extremely difficult to choose among the wide variety of database solutions on the market today. Two commonly used options are MongoDB and PostgreSQL. What do you need to know about MongoDB vs. PostgreSQL?

Analyzing Unstructured Data With Snowflake Explained In 90 Seconds

What if there was a way to easily manage, process, and analyze any data type in a single platform? Snowflake is here to help. Simplify your architecture with a single platform for all data types and workloads, unlocking new use cases for your data. With Snowpark, your data scientists and engineers can securely build scalable, optimized pipelines, and quickly and efficiently execute machine learning workflows while working in Python, Java, or Scala.

Building and Managing the Modern Datastore: The Data Lakehouse

The 'data lakehouse' is quickly becoming popular in the data analytics community. Data lakehouse architecture combines the benefits of a data warehouse and a data lake. It aims to merge the data warehouse’s data structure and management features along with the flexibility and relatively low cost of the data lake. Watch this panel discussion to learn how the data lakehouse can address the limitations of the data lake and data warehouse architecture to deliver significant value for organizations. Explore why the data lakehouse is an ideal option for enterprise data storage initiatives.

GDPR Prevails: Google Analytics Running into Trouble in the EU?

Almost 6 years ago, the European Union’s General Data Protection Regulation (better known by its acronym, GDPR) changed the world of personal data protection forever. The groundbreaking regulation has since been replicated, albeit with changes, in over a dozen other markets.

Streamline Reporting During (and After) Your Deltek Vision to Vantagepoint Transition

Deltek Vantagepoint is the newly branded, freshly reimagined next version of Deltek Vision built specifically for professional services organizations. Deltek has set a rough deadline for Vision customers to upgrade to Vantagepoint. While this date may be delayed based on rate of adoption, many Deltek users have either already upgraded or are starting to consider the impact of the change, and how to best approach it.

The Repurchase Rate: How to Calculate It and Why it Matters for E-commerce

As an e-commerce retailer, which business metrics matter the most to you? Net revenue? Business growth? Conversion rate? All these KPIs are important, but there could be one critical e-commerce metric you’re overlooking: The repurchase rate. All e-commerce business owners worry about how many units of their products or services they’re going to sell, and how much revenue they’ll make. It turns out that focusing on existing customers could be the key to increased long-term revenue.
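As a rough illustration of the metric, one common way to compute the repurchase rate is the share of customers who have placed more than one order in a period. This is an assumed definition, not necessarily the one the article uses, and the function name and sample orders below are hypothetical.

```python
from collections import Counter


def repurchase_rate(order_customer_ids):
    """Share of customers with more than one order (assumed definition)."""
    orders_per_customer = Counter(order_customer_ids)
    if not orders_per_customer:
        return 0.0
    repeat_customers = sum(1 for n in orders_per_customer.values() if n > 1)
    return repeat_customers / len(orders_per_customer)


# Hypothetical data: one customer ID per order placed in the period.
orders = ["ann", "bob", "ann", "cara", "bob", "dev"]
print(f"{repurchase_rate(orders):.0%}")  # 2 of 4 customers ordered again
```

Counting customers rather than orders keeps one very frequent buyer from inflating the rate.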

Integrate.io Achieves Google Cloud Ready - BigQuery Designation!

We are excited to announce that Integrate.io has achieved the Google Cloud Ready - BigQuery designation! Google's BigQuery service is a great way to store and analyze large amounts of data. To ensure customers have confidence in their integrations, Google engineers run validation tests on each integration before it is approved, making sure everything works as expected!

Build Robust and Efficient Analytics Engine with Hevo's Data Transformation

In today’s digital age, robust, fast data analytics is essential for your organization’s growth and success. The faster you deliver analytics-ready data to your analysts, the faster they can analyze it and derive insights. Even if you have adopted the ELT process, with EL data pipelines that load data quickly into the warehouse, your team may still face inefficient and delayed analysis.

The Modern Data Stack Ecosystem: Spring 2022 Edition

Welcome to the Spring 2022 Edition of the Modern Data Stack Ecosystem. In this article, we’ll provide an in-depth look at the Modern Data Stack (MDS) ecosystem, updated from our Fall 2021 edition. We also highly recommend our article, The Future of the Modern Data Stack, to anyone who is new to the MDS and wants to learn about its history.

New Pathways to New Insights

To this point, AI has been applied to augment analytics in a somewhat bifurcated fashion. On one hand, we have seen natural language support the business consumer that requires simple answers to known questions, helping them quickly take action. And, on the other, AI helps content authors and BI developers auto-suggest charts and automate data preparation, improving efficiency and reducing manual workloads. But, there’s a gap, and the value is huge.

How to Conduct a Thorough Landing Page Analysis | Data Snack #14 | Hubspot Marketing & Databox

Today we’re talking landing pages—whether you’re investing time and resources into content and SEO or rely heavily on paid search and paid social, all that website traffic won’t mean much if the landing pages you’re driving traffic to aren’t converting well.

Getting data to the front lines of your business - Scott Holden, CMO of ThoughtSpot

Getting your executive team to make more data-driven decisions is important. But according to Scott Holden, CMO of ThoughtSpot, it isn’t enough. At ThoughtSpot, Scott is focused on helping entire organizations become data-driven, from top to bottom. On this episode, he discusses how he’s doing this, how he thinks about decision-making in general, and much more. “Everything that I look at in my world comes through an analytical lens, and the power of that is that you can merge lots of different data sets together and get a more cohesive view.”

What Does Embedded BI Really Mean? OEM Reporting Tools Defined

More people are looking for more efficient BI products to integrate into their applications in 2022, and want to know exactly what embedded BI solutions mean for their users. This article will define embedded BI, explain its growth in popularity among software users, and explain why we suggest Yellowfin as your embedded BI solution for better analytics.

Do You Have What it Takes to Manage the Flood of Data?

In 2010, Eric Schmidt, then CEO of Google, made the startling claim that every two days we humans generate as much information as we did from the dawn of civilization through 2003, or about five exabytes of data. At the time, we had terabyte (TB) disk drives and could only imagine an exabyte, which is one million terabytes. The next increments after the terabyte are the petabyte, the exabyte, and then the zettabyte, which is 1,000 exabytes. By the end of 2010, the world had crossed the zettabyte threshold.
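The unit arithmetic above can be checked with a short sketch. It assumes decimal (SI) units, where each step is a factor of 1,000. The `convert` helper and the `UNITS` list are illustrative names, not part of the article.

```python
# Decimal (SI) storage units, each 1,000 times larger than the previous one.
UNITS = ["TB", "PB", "EB", "ZB"]


def convert(value, from_unit, to_unit):
    """Convert a quantity between the units listed in UNITS."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1000 ** steps


print(convert(1, "EB", "TB"))  # terabytes in one exabyte
print(convert(1, "ZB", "EB"))  # exabytes in one zettabyte
```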

Net Revenue Formula: What Startups Should Know

Starting a small business is an exciting journey, but there are so many metrics to consider, especially when evaluating your success or demonstrating that success to stakeholders. Business finance is about so much more than the number of sales you make. Within the startup world, it’s common to hear of e-commerce companies implying that they are enjoying financial growth from the get-go.

Assessing the Validity and Relevance of Data To Discover True, Actionable Information and Insights

In a previous article, we talked about the lost art of questioning and its importance when working with data and information to find actionable insights. In this article, we will expand on this topic and explain how questioning differs depending on which stage you are at in the process of transforming data and information into insights.

At Covanta, data health improves the business and the planet

At Talend, we tend to describe poorly organized, unhealthy data as “digital landfills.” But we don’t often talk about actual landfills. That’s right, the ones filled with trash. As anyone watching real estate prices will know, land is a finite resource. It’s crazy to think that we’re still dedicating land to storing our garbage, where it will sit releasing pollutants and greenhouse gases for decades to come.

How Mercado Libre Builds Upon a Continuous Intelligence Ecosystem with BigQuery and Looker

At Mercado Libre, we are obsessed with unlocking the power and potential of data. One of our key cultural principles is to have a Beta Mindset. This means that we operate in a “state of beta”, constantly asking new questions of our data, experimenting with technologies and iterating our business operations in service of creating the best experiences for our customers.

MLOps in BigQuery ML with Vertex AI Model Registry

Without a central place to manage models, those responsible for operationalizing ML models have no way of knowing the overall status of trained models and data. This lack of manageability can impact the review and release process of models into production, which often requires offline reviews with many stakeholders.

What's New in Amazon EMR Unveiled at DataOps Unleashed 2022

At the DataOps Unleashed 2022 virtual conference, AWS Principal Solutions Architect Angelo Carvalho presented How AWS & Unravel help customers modernize their Big Data workloads with Amazon EMR. The full session recording is available on demand, but here are some of the highlights.

Customer Profitability Analysis in E-Commerce

Five things to know about customer profitability analysis: Digital retailers often talk a lot about 'profit' without ever determining the factors that drive profitability in their businesses. One of the biggest contributors to profit in e-commerce is existing and new customers who purchase products and services from online stores. However, the connection between customers and profitability can be unclear unless you carry out the right kind of analysis.

Real-Time Streaming for Data Science

First, we collect data from an existing Kafka stream into an Iguazio time series table. Next, we visualize the stream with a Grafana dashboard; and finally, we access the data in a Jupyter notebook using Python code. We use a Nuclio serverless function to “listen” to a Kafka stream and then ingest its events into our time series table. Iguazio gets you started with a template for Kafka to time series.

5 benefits of modernizing your application's analytics with embedded analytics

As an ISV company selling a SaaS application, you have built analytics into your software because you know customers highly value insights into the data that's held within your application. Giving your customers business intelligence (BI) and analytics within your application offers them a window of insight into the data to help them optimize their business. You deliver more value which boosts end user adoption and means your client buys for longer.

GigaOm Names Iguazio a Leader and Outperformer for 2022

We’re proud to share that the Iguazio MLOps Platform has been named a leader and outperformer in the GigaOm Radar for Data Science Platforms: Pure-Play Specialist and Startup Vendors report. The GigaOm Radar reports take a forward-looking view of the market and are geared towards IT leaders tasked with evaluating solutions with an eye to the future. GigaOm analysts emphasize the value of innovation and differentiation over incumbent market position.

How Olfin Car increased its sales by 760%

Olfin Car is a leading seller of new and used cars in the Czech Republic with additional services in the field of financing, authorized car service, and insurance. They have sales of up to two billion CZK and sell over 2,500 cars yearly. By combining data analysis, reporting, and targeted marketing, Olfin Car was able to fundamentally improve the company's results both in online sales and in working with data. They ended up running all data processes in Keboola with the help of our partners Marketing BI.

Little Fluffy Hybrid Clouds

In this series of demystifying the tech trends, my colleagues and I will be looking at busting the buzzwords to help you keep on track. Concerned about puzzling parlance, analytics argot, techie terminology – or plain old jargon? This series breaks down words and concepts to give you the deepest insight and understanding into how to talk the talk in the world of tech, so you can engage in conversations with the confidence of being data literate.

What defines the modern data stack and why you should care

When I was working at Google back in the mid-2000s, we dealt with tens of billions of ad impressions a day, trained several machine learning models on years' worth of historic data, and used frequently updated models in ranking ads. The whole system was an amazing feat of engineering and there was no system out there that was even close to handling this much data. It took us years and hundreds of engineers to make this happen. Today, the same scale can be achieved in any enterprise.

Revolt BI: Implementing Keboola results in a 20x faster data simulation model, 3-5% revenue increase, or even identifying 815,000 EUR of savings

Revolt BI is a consultancy and data implementation agency that provides comprehensive business intelligence solutions for companies of all sizes by implementing best-of-breed solutions available on the market. For DataOps, they swear by Keboola to bring all the data neatly together and to automatically process it according to the needs of their clients, which have over 6.9 billion dollars in combined revenue.

FinTech Companies Thrive and Innovate with ChaosSearch

Welcome to the second installment of our ChaosSearch for FinTech blog series, where we explore how financial technology (FinTech) companies can solve analytics challenges and drive business outcomes with ChaosSearch. In Part One of this series, we brought you an in-depth look at how FinTech companies could accelerate application development and streamline operations in the cloud by adopting ChaosSearch for log analytics at scale.

Talend's acquisition of Gamma Soft offers exciting new capabilities for our customers

I am so pleased to announce that Talend has acquired Gamma Soft, a change data capture market innovator. This is a significant enhancement in the capabilities Talend can provide its customers and partners. Talend’s company vision is to take the work out of working with data, and we're thrilled to add Gamma Soft’s technology to our offerings to do this. Change data capture (CDC) technology is highly sought after by many companies.

Now in preview, BigQuery search features provide a simple way to pinpoint unique elements in data of any size

Today, we are excited to announce the public preview of search indexes and related SQL SEARCH functions in BigQuery. This is a new capability in BigQuery that allows you to use standard BigQuery SQL to easily find unique data elements buried in unstructured text and semi-structured JSON, without having to know the table schemas in advance. By making row lookups in BigQuery efficient, you now have a powerful columnar store and text search in a single data platform.

How to replicate SAP data in BigQuery

Are you interested in unlocking advanced analytics by replicating SAP data into BigQuery? In this video, Lucia Subatin, a Technical Lead in Solution Engineering, will demonstrate how to download and implement an ABAP enhancement built by Google Cloud to stream data directly into BigQuery. Watch, follow along, and ask questions in the comments below!

Data Chief Live: External data: Your secret weapon in a cookie-less world

How do you get to know your customer in a cookie-less world? Join Rosemary Hua, Global Head of Retail & CPG GTM at Snowflake and Forbes 30 Under 30; Erik Mitchell, founder and principal at Seek Data; Nik Lampropoulos, Global Director of Data, Insights & Analytics at Hogarth Worldwide; and Cindi Howson, ThoughtSpot CDSO, as they discuss this and related questions.

Talend acquires Gamma Soft

April 7, 2022, Talend, a global leader in data integration and management, announced today it has acquired Gamma Soft, a market innovator in change data capture (CDC). The addition of Gamma Soft’s highly complementary, enterprise-class change data capture technologies will help customers streamline their data modernization initiatives, including cloud migrations, and support advanced, real-time analytics use cases across hybrid and multi-cloud environments.

Google Cloud names ThoughtSpot a Google Cloud Ready - BigQuery company to help customers dominate the decade of data

We’re entering the defining decade of data. While every aspect of our lives has been changed by data in recent years, the next ten will see data rebuild the world around us. Every business, in every industry, needs a plan to adapt to this new world if they want to thrive. But how? That’s a question in the minds of data leaders, CEOs, and board members. The right approach is critical if companies want to dominate this new era. The wrong decision can spell disaster.

Space-Based AI Shows the Promise of Big Data

At a distance of a million miles from Earth, the James Webb Space Telescope is pushing the edge of data transfer capabilities. The observatory launched Dec. 25, 2021 on a mission to look at the early universe, at exoplanets, and at other objects of celestial interest. But first it must pass a rigorous, months-long commissioning period to make sure that the data will get back to Earth properly. Mission managers provided an update Feb.

Cost-Per-Order Formula for E-Commerce Explained

You have a customer. They want a product: you sell that product. It’s simple, but how do you know if you’re charging the right amount for that product? Knowing how to calculate the cost per order is essential in helping you set the correct prices for your products and services. As an e-commerce retailer, you’ll naturally have lower overheads than a brick-and-mortar store.

SQL Puzzle Optimization: The UDTF Approach For A Decay Function

How do you implement a decay function in SQL? You can use window functions, which scale better than joins, or better yet, you can try what Felipe Hoffa did: use tabular UDFs. In this video, Felipe shows you how you can use a tabular UDF to write custom code that can analyze a table row by row while preserving state. Felipe wrote a table UDF in JavaScript that uses a low amount of memory to keep track of the decaying values. He was able to run it in 36 seconds, instead of the 46 seconds that the SQL window-function solution took; and then he optimized the JavaScript even further and ran it in just 9 seconds.
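The row-by-row, state-preserving idea can be sketched outside SQL as a Python generator. This is not Felipe's JavaScript UDF, and the summary does not state the puzzle's exact decay rule, so the rule here is an assumption: the score first loses a fixed amount per row (never going below zero), then gains that row's points.

```python
def decayed_scores(points_per_day, decay=1.0):
    """Yield a running score, one value per input row.

    Assumed rule (not taken from the video): each day the score first
    loses `decay`, floored at zero, and then gains that day's points.
    """
    score = 0.0  # the only state carried from one row to the next
    for points in points_per_day:
        score = max(score - decay, 0.0) + points
        yield score


# Hypothetical input: points earned on five consecutive days.
print(list(decayed_scores([3, 0, 0, 5, 0])))
```

Like a tabular UDF, the generator keeps only a single running value in memory, regardless of how many rows it processes.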

Data Warehouse Automation: What, Why, and How?

Building a data warehouse is an expensive affair and it often takes months to build one from scratch. There is also a constant struggle to keep up with the large volumes of data that are constantly generated. On top of that, setting up a strong architectural foundation, working on repetitive and mundane data validation tasks, and ensuring data accuracy are further challenges. This puts tremendous stress on data teams and data warehouses. Data warehouse automation is intended to handle this growing complexity.

Solve a Problem, Change the World w/ Amr Awadallah

A universal human problem that we don’t often address is that historically, knowledge has been relatively siloed by language. But with advancements in AI, there are new opportunities to capture broader and deeper insights across the written and spoken word by breaking down global language and distance barriers. This was the topic of discussion on our most recent Data Legends podcast episode, featuring Amr Awadallah, founder and CEO at ZIR AI and former technology exec at Cloudera, Google, and Yahoo.

Cortex leverages ThoughtSpot Everywhere to innovate in B2B marketing intelligence

Gaining an accurate view of revenue intelligence for B2B marketers is challenging. With disconnected and dirty data residing in many systems, customers need a solution that collects, normalizes, and aggregates information into reports that answer the questions B2B marketers should have a handle on. And let’s face it, no matter how great a set of standard reports might be, every customer wants to see their data a little differently.

Data-Centric AI with Continual and Snowflake

Data infrastructure is rapidly growing and evolving along with infrastructure for AI/ML, with the latter growing largely independent from the former. An emerging generation of AI/ML tooling emphasizes data-centric versus model-centric approaches to the ML development lifecycle. These tools recognize that data is the foundation for AI and seek to open opportunities for all data professionals to participate by eliminating the unnecessary complexity of traditional model-centric solutions.

Hybrid Data Delivery "Cloud Sources" Walkthrough

We have expanded our Hybrid Data Delivery service to load analytics-ready data from a number of cloud-based data sources directly to Snowflake, without the need for Qlik Replicate. This initial update currently allows you to connect to data from over 20 cloud-based data sources such as Amazon Redshift, Google BigQuery, and Salesforce and land it directly in Snowflake as a target on a scheduled basis, so it can be used with your analytics applications, offering a single solution for on-prem and cloud data movement and replication.

AstraZeneca: Building a finance data hub

At AstraZeneca, supporting functions like Finance are intensely data-driven. Recently, the data and IT team completely overhauled their data architecture to better serve the needs of the Finance team by building a Finance data hub. In this video, key project stakeholders explain why and how they built the data hub for the Finance team (using Talend and AWS), and they detail how it's integrated with other data hubs at AstraZeneca.

Building Product Analytics At Petabyte Scale

Product analytics is the most critical and complex task for any product team. Thousands of data points have to be analyzed carefully while setting up the product analytics foundation, which enables product teams to use data to track, visualize, and analyze user engagement and behavior in order to improve and optimize the product experience. However, managing large data workloads can be very challenging, as not all data that is collected can be directly used for analytics.

Iguazio named in Forrester's Now Tech: AI/ML Platforms, Q1 2022

We are delighted to share that Iguazio has been named along with Microsoft, Databricks, Cloudera, Alteryx and others in Now Tech: AI/ML Platforms, Q1 2022, Forrester’s Overview of the Leading AI/ML Platform Providers, by Mike Gualtieri. This report by Forrester Research looks at AI/ML Platform providers, to help technology executives evaluate and select one based on functionality aligned with their needs.

Top 8 Machine Learning Resources for Data Scientists, Data Engineers and Everyone

Machine learning is a practice that is evolving and developing every day. Newfound technologies, inventions and methodologies are being introduced to the community on a daily basis. As ML professionals, we can enrich our knowledge and become better at what we do by constantly learning from each other. But with so many resources out there, it might be overwhelming to choose which ones to stay up-to-date on. So where is the best place to start?

A Real-Time Data Integration Fabric for Active Intelligence

Greek philosopher Heraclitus wasn’t talking about the challenge of today’s enterprise IT landscape, but the quote certainly fits. From the advent of the first digital computer in the 1940s to the emergence of the first public cloud in 2004, the rate of change has only accelerated. In fact, over 60% of corporate data resides in the cloud in 2022, up from 50% last year.

Why Can't we Advance Healthcare and Life Sciences this Fast all the time?

Vaccine development became the top priority for the life sciences industry, which delivered new vaccines at unprecedented speed while maneuvering large-scale production processes. Numerous factors helped accelerate the vaccine roll-out, including prior research, genome sequencing, jumping the FDA approval queue, and a plethora of testing volunteers. So now that we’ve experienced these advancements, how can the industry keep the momentum to speed up innovative solutions across healthcare?

Turning data into a life-saving asset

A global leader in pharmaceuticals found themselves faced with a unique spin on a common challenge: Their biopharmaceutical division, responsible for producing vaccines and generating over $1 billion in annual sales, was struggling to turn raw data into trusted insights. Data underlies everything the global pharmaceutical company does. Without data they can trust, however, they would be at risk of taking longer to get vaccines to market and incurring higher expenses along the way.

Stitch vs. Fivetran vs. Integrate.io: A Comprehensive Comparison

When it comes to providing the latest and greatest ETL and ELT tools, the platforms Stitch, Fivetran, and Integrate.io are all top contenders. That being said, each platform also has its own set of pros and cons. Ultimately, the best ETL/ELT platform for your company will largely depend upon the needs of your organization. So, which platform will reign supreme for your company in the Stitch vs Fivetran vs Integrate.io matchup?

Getting Started with Continual and Snowflake

This guide will show you how to easily add Continual as the AI layer to your modern data stack with Snowflake at the core. The intention is to provide an introduction to using Continual on Snowflake. After completing this tutorial, users are invited to try more advanced examples. We are going to demonstrate connecting Continual to Snowflake, building feature sets and models from data stored in Snowflake, and analyzing and maintaining the predictive model continuously over time.

The E-Commerce Purchase Funnel Explained

If you're an online retailer, you've probably heard of the e-commerce purchase funnel. Some sales experts call it the conversion funnel or simply the sales funnel. It's a way of explaining your potential customer's journey, from their initial impulse to research a product to finally clicking the "buy" button.

Best 15 ETL Tools in 2023

ETL stands for Extract, Transform, and Load. It is defined as a Data Integration service and allows companies to combine data from various sources into a single, consistent data store that is loaded into a Data Warehouse or any other target system. ETL serves as the foundation for Machine Learning and Data Analytics workstreams. Through multiple business rules, ETL organizes and cleanses data in a way that caters to the Business Intelligence needs, like monthly reporting.
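The three steps can be pictured with a minimal sketch. Everything in it is hypothetical: the two in-memory "sources", the `sales` table, and the cleansing rule are placeholders. An in-memory SQLite database stands in for the target data warehouse, and no particular ETL tool is implied.

```python
import sqlite3


def extract():
    """Extract: gather raw rows from two hypothetical sources."""
    crm_rows = [{"email": " Ann@Example.com ", "amount": "19.90"}]
    shop_rows = [{"email": "bob@example.com", "amount": "5"}]
    return crm_rows + shop_rows


def transform(rows):
    """Transform: normalize emails and cast amounts to numbers."""
    return [(r["email"].strip().lower(), float(r["amount"])) for r in rows]


def load(rows, conn):
    """Load: write the cleaned rows into a single, consistent table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (email TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
load(transform(extract()), conn)
print(conn.execute("SELECT email, amount FROM sales ORDER BY email").fetchall())
```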

Data Warehouse Automation: What, Why, and How?

Data Warehouse Automation helps IT teams deliver better and faster results by getting rid of repetitive design, development, deployment, and operational tasks within the data warehouse lifecycle. With automation, organizations can accelerate the data-to-analytics journey, work more effectively with large amounts of data, and reduce costs. Join this session with Darshan Wakchaure, Global Data & Analytics Competency Head, Tech Mahindra, as he shares his insights on the key benefits of Data Warehouse Optimization and how to achieve Data Warehouse Automation at scale.

Introducing Active Assist recommendations for BigQuery capacity planning

BigQuery already offers highly flexible pricing models, such as the on-demand and flat-rate pricing for running queries, to meet the diverse needs of our users. Today, we’re excited to make it even easier for you to optimize BigQuery usage with new BigQuery slot recommendations powered by Active Assist, a part of Google Cloud’s AIOps solution that uses data, intelligence, and machine learning to reduce cloud complexity and administrative toil.

The Best Guide to AUR-in Retail

AUR stands for average unit retail. It gives you the average selling price of a product in a given time period. An important e-commerce metric, AUR is typically calculated quarterly. But why is it important? And how can you use AUR-in retail to boost your e-commerce business? Let’s take a look at this important business metric and how Integrate.io's data integration solution can help you bring all your e-commerce data and statistics together.
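As a quick worked example, AUR is commonly computed as the revenue for a period divided by the number of units sold in that period. The function name and figures below are hypothetical, and the article may refine the definition further.

```python
def average_unit_retail(total_revenue, units_sold):
    """Average selling price per unit over a period."""
    if units_sold <= 0:
        raise ValueError("units_sold must be positive")
    return total_revenue / units_sold


# Hypothetical quarter: 12,500.00 in sales across 500 units.
print(average_unit_retail(12_500.00, 500))
```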

Apache Kafka to BigQuery: 2 Easy Methods

Organizations today have access to a wide stream of data. Data is generated from recommendation engines, page clicks, internet searches, product orders, and more. It is necessary to have an infrastructure that would enable you to stream your data as it gets generated and carry out analytics on the go. To aid this objective, incorporating a data pipeline for moving data from Apache Kafka to BigQuery is a step in the right direction.