
November 2020

How to Migrate Your Enterprise Data Warehouse to a Cloud Data Warehouse

Migrating a data warehouse from a legacy environment requires a massive upfront investment in resources and time. There is a lot to consider before and during migration. You may need to replan your data model, use a separate platform for task scheduling, and handle changes in the application’s database driver. Therefore, organizations must take a strategic approach to streamline the process. This article presents a step-by-step approach for migrating a data warehouse to the cloud.

Making Privacy an Essential Business Process

Canada is poised to become a world leader in privacy regulation, and with new regulation come record-breaking fines for those who can’t keep up. In November, Canada introduced the Digital Charter Implementation Act. If passed, companies could face fines of up to five percent of global revenue or $25 million CAD, whichever is greater, for violating Canadians’ privacy.

The Modern Data Eco System - How teams collaborate to unleash their data

With data becoming the main asset of a business, one of the biggest challenges is how to successfully leverage data to gain a business advantage. In the modern Data Eco System, people with different skill sets need to collaborate to achieve their data objectives. How does a modern analytics team with data scientists, business analysts and data engineers work together? How do technologies such as Machine Learning, Big Data and Cloud come together in a productive way?

Black Friday deal: $350 free Managed Kafka credits

The Thanksgiving holiday is upon us. For many of our customers, this is one of the most important periods of the year, with more than 189.6 million U.S. shoppers buying up bargains from Thanksgiving Day through Cyber Monday last year. For them and for us, it’s crucial that internal systems can handle high traffic volume without downtime or performance degradation.

Optimizing & Simplifying Business Analytics | Part 1 | Snowflake Inc.

Jumpstarting the digitalization of business, Babu Kuttala, Chief Data and Analytics Officer at ABB, details how he came into his role, his influence in different markets, & how he grows and simplifies ABB's use of internal and external data. Rise of the Data Cloud is brought to you by Snowflake.

Why Data Analytics Is Important for Business Success

Given the competitive value of analytics and rapid adoption rates across industries, you can’t afford a subpar analytics program. In the late 90s, Oakland Athletics general manager Billy Beane used data to discover undervalued talent and assemble a perennial playoff-caliber team, and he did so on a shoestring budget compared to Major League Baseball’s heavy hitters. Beane’s pioneering use of data analytics became the subject of the bestselling book Moneyball.

Why ELT Is the Future of Data Integration

Many analytics programs struggle to assimilate data from numerous and unpredictable sources, but automated ELT offers a solution. Why do so many businesses struggle to establish successful analytics programs? A lack of data is not the problem. Data volumes — from hundreds of cloud applications to millions of IoT endpoints — are exploding across organizations and industries.

Cloud-Based Data Analytics in Three Steps

Implementing a modern, cloud-based analytics stack doesn’t have to be hard — you can do it in three steps, actually. Implementing a modern data stack (MDS) — data integration tool, cloud data warehouse and business intelligence platform — is the best way to establish a successful analytics program as data sources and data volumes multiply.

How a Discovery Data Warehouse, the next evolution of augmented analytics, accelerates treatments and delivers medicines safely to patients in need

I met Matthew in New York City about a year ago. We sat in a private conference room and he told me the story of his pharma startup. A small group of researchers set out to solve the black-box enigma of certain kinds of vicious cancers. There are so many cancers, so their vision was to focus on especially heinous ones. Fast forward to their recent FDA approval of their “Hail Mary” procedure and treatment methodology for stage-four patients of a particular cancer.

Demo: Cloudera DataFlow on Data Hub

Cloudera DataFlow for Data Hub makes hybrid use cases possible by extending on-premises flow management, streams messaging, and stream processing and analytics capabilities to the public cloud. Watch an integrated demo of Cloudera DataFlow on Data Hub to understand how easy it is to ingest, process, and analyze your streaming data across multiple public cloud clusters.

Predictive Real-Time Operational ML Pipeline: Fighting First-Day Churn

Retaining customers is more important for survival than ever. For businesses that rely on very high user volume, like mobile apps, video streaming, social media, e-commerce and gaming, fighting churn is an existential challenge. Data scientists are leading the fight to convert and retain high LTV (lifetime value) users.

Introducing Lightweight, Customizable ML Runtimes in Cloudera Machine Learning

With the complexity of data growing across the enterprise and emerging approaches to machine learning and AI use cases, data scientists and machine learning engineers have needed more versatile and efficient ways of enabling data access, faster processing, and better, more customizable resource management across their machine learning projects.

What Is Data Analytics?

Learn the how and what of analytics and data integration. This is the first in a two-part abridged version of The Essential Guide to Data Integration. Read Part 2 here, and get the full book for free here! You can also watch the webinar. What is data analytics? How do you integrate data? Should you build or buy a data analytics solution? What are some business and technical considerations for choosing a data analytics tool, and how can you get started? Let’s start with the first two questions.

Seven Ways to Scale a Data-Driven Culture in Your Organization

Without an overarching company data culture, even the best technology tools won’t get you where you want to go, say the co-founders of Data Culture. Data isn’t just a tech solution. For Gabi Steele and Leah Weiss, founders of the consultancy Data Culture, it’s also a “people” solution. Even within companies that enthusiastically embrace a cloud-based modern data stack, a substantial gap often exists between the business and data sides of the organization.

SELECT ApacheKafka WITH StreamingSQL FROM RealTimeData

In another life, I taught the Book of Genesis to high school students, including The Tower of Babel excerpt. It struck me as ironic that God’s wrath strikes down the tower, confounds the universal language and scatters humans around the globe to teach King Nimrod a lesson in hubris; meanwhile, the boys in my class were texting their girlfriends across the country and playing video games with friends in Europe and Asia.

What's new in BigQuery ML: non-linear model types and model export

We launched BigQuery ML, an integrated part of Google Cloud’s BigQuery data warehouse, in 2018 as a SQL interface for training and using linear models. Many customers with a large amount of data in BigQuery started using BigQuery ML to remove the need for data ETL, since it brought ML directly to their stored data. Due to ease of explainability, linear models worked quite well for many of our customers.

8 key considerations for choosing an Embedded Analytics solution

Historically, analytics has not always been a priority feature for software vendors. Many applications are typically built with analytics bolted on later, as standalone tools. But the changing needs of today’s business users have accelerated the importance of providing built-in ways to monitor and explore their data while they use your software.

How to Estimate Any Website Traffic with These 3 Tools

While there are several different ways to measure the success of a site, one core metric is its traffic. Knowing how well your site performs compared to other sites is crucial in understanding how successful you have been in your efforts against competitors. This is possible if you know how much traffic your competitors are generating. Along with this, some competitor intelligence tools hand over the full list of keywords that bring traffic to a site.

Scalable Data Stack Helps Welcome Tech Empower Immigrants

Welcome Technologies builds more robust data pipelines with Fivetran to propel its work on improving the lives of immigrants through a data-first approach. Key Takeaway: With its data-first approach, Welcome Tech is developing machine learning and security models to better serve the immigrant community. After building and maintaining a Postgres connector, Welcome Tech brings on Fivetran to scale its data architecture.

Snowflake + Fivetran + dbt: Turn Your Marketing Data Silos into Marketing Insights

The 2020 Marketing LUMAscape showcases more than 8,000 tools marketers can use to generate leads, drive brand awareness, and measure all their marketing efforts. But with all those tools comes a lot of disparate and siloed data. How do you bring them all together in a consistent, reliable, and fast way to understand your ROI, determine attribution, and see which of your marketing efforts are working?

Turbocharge Your Application With Contextual Analytics Webinar - Yellowfin BI

Innovate your application and create highly valuable analytic experiences for your end-users with contextual analytics. Contextual analytics, as the next phase of embedded analytics, brings dashboards, automated analysis and analytics directly into your application’s core workflows, delivering data directly within the user interface and within the transaction flow. By seamlessly blending analytics and actions, you can improve your app’s core functionality, enable exciting new analytical experiences for your users, and increase the value of your application.

Data Exploration & Reporting with Cloudera Data Warehouse

In this video, we’ll go over how you can use Cloudera Public Cloud both to ingest data through Cloudera Data Engineering and to explore it through Hue and Impala within Cloudera Data Warehouse. You'll see how easy it is to run queries that give you insight into your data and how you can use a built-in data visualization tool to then create a dashboard to share your results.

How a modern data platform supports government fraud detection

November 15-21 marks International Fraud Awareness Week – but for many in government, that’s every week. From bogus benefits claims to fraudulent network activity, fraud in all its forms represents a significant threat to government at all levels. Some experts estimate the U.S. government loses nearly 150 billion dollars due to potential fraud each year, McKinsey & Company reports.

Fivetran's mission with General Manager, EMEA - Nathaniel Spohn

Fivetran’s mission is to make data as accessible and reliable as electricity. We're focused on providing automated access to data so data analysts and engineers can be empowered to actually analyze their data. For small companies and large enterprises, Fivetran replicates data from 180+ sources to enable business intelligence and data-driven decisions alongside our partner companies.

Fraud Prevention - 3 Data Strategies for Financial Services

Fraud awareness in the Financial Services industry is more important than ever. According to the September 2020 benchmarking report conducted by the Association of Certified Fraud Examiners (ACFE) in response to the coronavirus, 77% of survey respondents, representing a range of industries, have observed an increase in the overall level of fraud as of August, compared with 68% in May. The report reveals 68% of respondents have observed an increase in payment fraud schemes (vs.

Support for Calling External Functions via Azure API Management Now in Public Preview

In June, Snowflake announced the public preview of the external functions feature with support for calling external APIs via AWS API Gateway. With external functions, you can easily extend your data pipelines by calling out to external services, third-party libraries, or even your own custom logic, enabling exciting new use cases. For example, you can use external functions for external tokenization, geocoding, scoring data using pre-trained machine learning models, and much more.

Want Post-Pandemic Business Success? Embedded Analytics is the New Normal

With COVID-19, it has become clear that ideas and assumptions that worked in the past will no longer apply now and in the future. The region is expected to face its deepest economic slowdown since the 1970s due to COVID-19. Businesses might be wondering what to do next.

MDM in telcos: Why it's important and how to automate it through ML

Data volume in the telecommunications sector is growing at an incredible rate and organizations need to find solutions to various data challenges that may arise. Not only should you expect to encounter challenges in storing data, but also in streamlining the different processes and workflows needed to manage it efficiently. This includes sourcing data, ensuring its quality and uniformity, and providing access to relevant users, among other activities.

How to Own That New State-of-the-Art Model Repo!

Deep learning has evolved in the past five years from an academic research domain to being adopted, integrated and leveraged for new dimensions of productivity across multiple industries and use cases, such as medical imaging, surveillance, IoT, chatbots, robotics, and many more. From NLP to computer vision, deep learning has been breaking the barriers of SOTA algorithms and providing results that were otherwise impossible to achieve.

Cost Optimization on Microsoft Azure

Do you use big data and streaming services, such as Azure HDInsight, Databricks, and Kafka/EventHubs? Do you have on-premises big data that you want to move to Azure? Keeping costs down in Microsoft Azure is difficult, but vital. Join Chris Santiago of Unravel Data and explore how to reduce, manage, and allocate streaming data and big data costs in Azure.

Cost-Effective, High-Performance Move to Cloud

The move to cloud may be the biggest challenge, and opportunity, facing IT departments today. In this 45-minute webinar, Unravel Data product marketer Floyd Smith and Solutions Engineering Director Chris Santiago describe how to move workloads to the cloud quickly, cost-effectively, and with high performance for the newly cloud-based workloads. Tune in to find out the best way to de-risk your cloud migration projects with data-driven insights.

Anodot the business monitoring platform

Business metrics are notoriously hard to monitor because of their unique context and volatile nature. Anodot’s Business Monitoring platform uses machine learning to constantly analyze and correlate every business parameter, providing real-time alerts and forecasts in their context. This is machine learning packaged in a turn-key solution – no data science experience needed.

Fivetran Receives ISV Partners Innovation Award From Databricks

We’re honored to win this prestigious award, and we’re doubling down on the Lakehouse architecture with Databricks SQL analytics plans. Fivetran is the proud recipient of the Databricks ISV Partners Innovation Award as announced at this week’s Data & AI Summit Europe event. The award recognizes how Fivetran has collaborated with Databricks to empower data professionals to accelerate time to insights with Delta Lake and the Lakehouse architecture for the modern data stack.

Fraud Detection using Deep Learning

One of the many areas where machine learning has made a large difference for enterprise business is in the ability to make accurate predictions in the realm of fraud detection. Knowing that a transaction is fraudulent is a critical requirement for financial services companies, but knowing that a transaction that was flagged by a rules-based system as fraudulent is a valid transaction, can be equally important.

Introducing CDE: Purpose Built Tooling For Accelerating Data Pipelines Demo Highlight

Spark has become the de-facto processing framework for ETL and ELT workflows for good reason, but for many enterprises working with Spark has been challenging and resource-intensive. Leveraging Kubernetes to fully containerize workloads, DE provides a built-in administration layer that enables one-click provisioning of autoscaling resources with guardrails, as well as a comprehensive job management interface for streamlining pipeline delivery. DE enables a single pane of glass for managing all aspects of your data pipelines.

Selling Corona During the COVID-19 Pandemic | Snowflake Inc.

Ari Margalit, VP Architecture & Data Solutions of Anheuser Busch InBev, discusses how he transformed ABI into a technology organization by building a data ecosystem, BrewDat, that incorporates advancing analytics that aid in decision making & creating project portfolios. Rise of the Data Cloud is brought to you by Snowflake.

Kubeflow: Simplified, Extended and Operationalized

The success and growth of companies can be determined by the technologies they rely on in their tech stack. To deploy AI enabled applications to production, companies have discovered that they’ll need an army of developers, data engineers, DevOps practitioners and data scientists to manage Kubeflow — but do they really? Much of the complexity involved in delivering data intensive products to production comes from the workflow between different organizational and technology silos.

Combating Fraud in Insurance with Data

Well, it is International Fraud Awareness Week, focused on promoting fraud prevention and education. A fantastic initiative! Maybe I am naïve but I feel a bit sad that there is a need for “fraud week”. The insurance industry has a long and intimate relationship with fraud in many different ways. Insurance fraud can take place at a process or business function level, most notably in claims or underwriting.

The Developer's Guide to Contextual Analytics

As a specialized and mature form of embedded analytics, contextual analytics is a game-changer if you're a software vendor looking to further augment your customers’ user experience, without requiring developers to completely reengineer your offering. Contextual analytics blends the data your users need for decision-making right at the point of their daily work, directly inside the interface and transaction flow of your software.

Improve Your Business Intelligence With a Modern Data Stack

F5 Networks modernized its data stack, boosted time to insight, and placed actionable data in the hands of the right decision-makers. F5 Networks is a Seattle-based application services and application delivery networking company. Because its revenue depends on speed and accuracy, the company is always looking for ways to improve business insights and support data-driven decision-making.

The Top 8 Data Analysis Mistakes To Avoid

Data analysis is incredibly useful for all kinds of businesses and also has academic and hobbyist applications. Nonetheless, it’s still possible to fall into numerous traps when trying to accurately interpret your data. That’s why we’re giving you a list of the top 8 common data analysis mistakes to avoid at all costs. Our first expert, Jitin Narang, CMO at TechAHead, contributed the following five top data mistakes to avoid:

Fifteen years of making data useful

Happy anniversary to us! Fifteen years ago, Talend’s founders anticipated the business need to have data accessible to all users across an organization. I’ve been with Talend since the beginning, and I wanted to celebrate this milestone by sharing our product innovation and evolution through the years. Talend was created with the idea that we could offer something new to the market: open source ETL.

Innovating Safe & Sustainable Solutions | Part 2 | Snowflake Inc.

COVID-19 has forced Michelin Group CIO Yves Caseau to find solutions that increase health protections & reduce carbon emissions, which has shaped Michelin's new standards for safe transportation & for returning the planet to sustainable levels of GHG emissions. Rise of the Data Cloud is brought to you by Snowflake.

A Data Lakehouse without Data?

Imagine if you bought a beautiful lake house, invited all your friends to come and visit, and the lake was dry? Not much value and a little embarrassing, right? Now imagine you have that beautiful lake house and you have special water valves to control not just if there is water in the lake but also control the water quality, clarity, and what fish the lake is stocked with? Much more impressive, correct?

Extreme data center pressure? Burst to the cloud with CDP!

Here at Cloudera, we’ve seen many large organizations struggle to meet ever-changing and ever-growing business demands. We see it everywhere. Traditional on-premise architectures, which create a fixed, finite set of resources, force every business request for new insight to be a crazy resource balancing act, coupled with long wait times, or a straight-up no, it cannot be done.

Why design matters in BI

One of the things that I'm really passionate about is great design. Design is important in all aspects of our lives and it's really important for analytics as well. When you're the recipient of bad design, you know it immediately. Have you ever seen those emails that have been completely misaligned or sat through a PowerPoint where everything is in the wrong colors and fonts? How does it make you feel?

Building a Global Business Using Data | Part 1 | Snowflake Inc.

Michelin's goal of creating a global organization inspires Group CIO Yves Caseau to automate IT operations and build predictive maintenance systems, so the company can monitor manufacturing processes & customer interactions on the fly. Rise of the Data Cloud is brought to you by Snowflake.

Predicting Ad Performance in Real-Time: PadSquad & Iguazio at the Data Science Salon

In this talk, Daniel Meehan, CEO & Founder of PadSquad explains how to build a predictive AI application which can analyze events and impressions from online ads in real-time. He discusses how to run and analyze thousands of real-time and batch events per second for ad performance optimization.

The ELK Test for Software Management

Are you planning to move to data-driven software management to keep on top of your quality assurance status, your security findings and development velocity? Thinking about building it yourself using Elasticsearch, Logstash and Kibana or maybe just using your Splunk instance for this? Read more to find out why this may or may not be a good idea.

Spark's Cloud Migration Journey with Snowflake

Peter Langham, Domain Chapter Lead, Data Engineering and Data Science, will discuss why Spark decided to move their data and analytics capability to the cloud, the benefits achieved and how Snowflake Cloud Data Platform has enabled them to be more data driven, deliver faster and improve analytic output across the business.

Expediting SQL Workers means Expediting your Business

Two of the more painful things in your everyday life as an analyst or SQL worker are not getting easy access to data when you need it, or not having easy to use, useful tools available to you that don’t get in your way! As one of my dear customers, a data worker in Pharma, said to me: “I really don’t care about bells and whistles, I just want to get my task done.” This simple statement captures the essence of almost 10 years of SQL development with modern data warehousing.

Why We Built the Lumada DataOps Suite

Why is DataOps important? Without intelligent data operations (DataOps), there can be no digital innovation. Agile data environments improve business operations and enable new customer experiences and new business models. Our customers demonstrate every day the value of their data and how it is critical for digital transformation.

5 signs your telco CX is lacking-and how data science can help

Modern customers only expect the best. And with the pandemic leading to a lot of disruption, it’s become even more important for telcos to stay focused on continuously improving customer satisfaction and ensure a great experience is provided across various touchpoints. Take a step back and assess whether your organization is letting customers down. Here are 5 things to steer clear of.

Data Egress Cost Analysis

Understand the impact of data transfer and egress costs across Microsoft Azure, Amazon Web Services and Google Cloud Platform. One of the questions most frequently asked by cloud-savvy, price-aware customers goes something like this: OK, so we like that your tool makes it easy to integrate our cloud database and storage in our centralized data warehouse, but I know our budget will be scrutinized for total cost of ownership (TCO), including our data egress costs.

The Security Challenges of Data Warehousing in the Cloud

Many organizations struggle to meet growing and variable data warehouse demands. No matter how much they pad their annual IT budgets, there never seems to be enough capacity to cover unexpected business requests. This leads to resource restrictions for the various business units that use the platform. When business units are not well served by central IT, “shadow IT” emerges.

How Data Fabrics Power Industrial IoT

Unlike typical resources companies depend upon to thrive, the amount of data available to enterprises is not finite. With edge technology and smart devices, there is truly no limit to the quantity of useful data companies can and should be using to make better informed decisions. But many businesses unnecessarily limit the variety, quality, and extent of the data at their disposal by not having the right data architecture.

Using AI to Detect Stock Market Abuse | Part 1 | Snowflake Inc

NASDAQ has been transforming its system — integrating AI to identify stock market abuse, decrease latency, and modernize customer portfolios. Michael O'Rourke, NASDAQ's SVP of Machine Intelligence, details how he's led this transformation and the role Snowflake plays in the organization. Rise of the Data Cloud is brought to you by Snowflake.

Five Reasons to Consider a Modern Data Stack

Fivetran co-founder and CEO George Fraser shares the importance of the modern data stack and five developments he’s eagerly following. When Thomas Edison switched on his first working light bulb, even he could not have predicted that this new technology would eventually revolutionize every aspect of modern life.

Four Ways a Modern Data Stack Can Fuel Results

When record-breaking demand exploded with COVID-related purchases, e-commerce company Drizly thrived thanks to its modern data stack. Drizly is the world’s largest online alcohol marketplace. From the company’s website and app, users can order their favorite beers, wines and liquors from local retailers, and have them delivered in less than an hour.

How insurers can better deliver at "The Moment of Truth"

It’s all about the customer. Customers today expect services to be highly personalized. In a digital world tuned to understand your likes, dislikes, interests and preferences, we expect a similar level of customization in all aspects of our lives. Insurance is no different. Insurance is not something the average consumer thinks about every day, but when a life-changing event happens, insurance becomes extremely important. It is in this “Moment of Truth” that insurers excel or fail.

How Xandr, AT&T's Adtech Company, Prevents Revenue Loss with Autonomous Business Monitoring

Anodot CEO and Co-Founder David Drai joined Amazon Web Services and Xandr to discuss the shift to machine learning-based anomaly detection in business monitoring. Xandr Chief Technology Officer Ben John shared how their advertising marketplace is using the Anodot platform to cut detection from “up to a week to less than a day.” You can watch the webinar at the link above or read on for the highlights of that talk.

Ten Steps to Cloud Migration

In cloud migration, also known as “move to cloud,” you move existing data processing tasks to a cloud platform, such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform, to private clouds, and/or to hybrid cloud solutions. See our blog post, What is Cloud Migration, for an introduction. Figure 1: Steps in cloud migration.

Using Artificial Intelligence to Interact with the Stock Market | Snowflake Inc.

Michael O'Rourke, SVP of Machine Intelligence at NASDAQ, discusses how NASDAQ integrates artificial intelligence and machine learning models to identify trends, provide data solutions, & detect Stock Market abuse. Rise of the Data Cloud is brought to you by Snowflake.

Demystifying Cloud Data Egress Costs

Understand the impact of data transfer and egress costs across Azure, Amazon Web Services, and Google Cloud Platform in data integration. One of the most frequent questions asked by cloud-savvy, price-aware customers is something like: OK, so we like that your tool makes it easy to integrate our cloud database and storage in our centralized data warehouse, but I know our budget will be scrutinized for Total Cost of Ownership (TCO), including our data egress costs.

An Overview of Real Time Data Warehousing on Cloudera

Users today are asking ever more from their data warehouse. This is resulting in advancements of what is provided by the technology, and a resulting shift in the art of the possible. As an example of this, in this post we look at Real Time Data Warehousing (RTDW), which is a category of use cases customers are building on Cloudera and which is becoming more and more common amongst our customers.

Analytics Experience Explained

One of the really big trends that we're seeing in the analytics space is the move towards talking about the analytics experience. Analytics experience is about supporting or triggering decisions and transactions. This is a shift from what I would describe as the passive use of analytics, where people were expected to use dashboards and reports that didn't add a lot of value to their transactions or decision making. The difference sounds subtle, but it's really quite profound.

Snowflake for Marketing Analytics

Identify deeper insights with 360° customer views, create relevant messaging and offers, and produce much higher marketing ROI. Snowflake’s platform virtually eliminates data silos to create a single repository for a single copy of your data. As a result, marketing teams extract deep insights and deliver timely, relevant and consistent customer messaging and offers.

Snowflake Workloads Explained: Data Engineering

Snowflake streamlines data engineering, while delivering performance and reliability. Learn how with Snowflake, data engineers can spend little to no time managing infrastructure, avoiding such tasks as capacity planning and concurrency handling. Instead, they can focus on more value-add activities towards delivering your data.

Snowflake Workloads Explained: Data Applications

Snowflake’s platform powers applications with virtually unlimited performance, concurrency, and scale. Launch new features faster with simplified data pipelines and improved engineering efficiency. Delivered as a service, Snowflake handles the infrastructure complexity, so you can focus on innovating with the data applications you build.