Announcing the Fivetran dbt Package for Shopify
Use our latest dbt package to unlock your Shopify data and power customer analytics.
In our previous blog post we introduced Cloudera Data Visualization in Cloudera Data Warehouse (CDW) available in tech preview, in CDP Public Cloud. This blog will help you get started with Cloudera Data Visualization, so you can start building interesting and powerful applications on all types of data.
The Unravel 4.6.2.0 release, now generally available, builds on our previous 4.6 release with a new UI/UX, multi-cluster support, monitoring for ELK (Elasticsearch, Logstash, and Kibana), and a new installer that makes Unravel available in minutes.
As more and more companies are embedding AI projects into their systems, attracted by the promise of efficiencies and competitive advantages, data science teams are feeling the growing pains of a relatively immature practice without widespread established and repeatable norms.
Performance is one of the key criteria, if not the most important one, in choosing a Cloud Data Warehouse service. In today’s fast-changing world, enterprises have to make data-driven decisions quickly, and for that they rely heavily on their data warehouse service. In this blog post, we compare Cloudera Data Warehouse (CDW) on Cloudera Data Platform (CDP) using Apache Hive-LLAP to Microsoft HDInsight (also powered by Apache Hive-LLAP) on Azure using the TPC-DS 2.9 benchmark.
The CDO role has been rapidly evolving over the past few years – from governance leader to data science and AI expert to digital transformation guru. And, with this change, has come a shift from a back-office focus to a focus on achieving measurable results required by the C-suite.
When evaluating your approach to analytics and BI, a common comparison is Qlik vs. Tableau. Consistently, when we have that conversation, there are some sharp distinctions that come to the fore in terms of capabilities, vision and cost.
The telecommunications space plays a critical role in facilitating modern communication, especially during these uncertain times. With all the social distancing measures and quarantine restrictions, it has become essential to keep people and businesses connected to each other—be it with their families, colleagues, or customers. This surge in demand for connectivity has transformed the telco landscape in numerous ways.
Let's talk media buying in the data age.
Cloudera delivers an enterprise data cloud that enables companies to build end-to-end data pipelines for hybrid cloud, spanning edge devices to public or private cloud, with integrated security and governance underpinning it to protect customers’ data. Cloudera has found that customers have spent many years investing in their big data assets and want to continue to build on that investment by moving towards a more modern architecture that helps leverage the multiple form factors.
Scribe Media and Grocery TV share how they join data to determine job costing, cash forecasting, executive reporting, revenue sharing and more.
Some of the most powerful results come from combining complementary superpowers, and the “dynamic duo” of Apache Hive LLAP and Apache Impala, both included in Cloudera Data Warehouse, is further evidence of this. Both Impala and Hive can operate at an unprecedented and massive scale, with many petabytes of data. Both are 100% open source, so you can avoid vendor lock-in while you use your favorite BI tools, and benefit from community-driven innovation.
Companies use their data to accelerate business growth and overtake their competitors. To achieve this, they invest a lot in their ETL (extract-transform-load) operations, which take raw data and transform it into actionable information. It’s no wonder, then, that ETL testing is a crucial part of a well-functioning ETL process, since the ETL process generates mission-critical data.
What a pivotal year it’s been for the integration platform as a service (iPaaS) market! Today, improving customer centricity, driving new innovative applications and systems to market, or optimizing supply chains to meet new digital demands have become so critical to many organizations’ growth.
In this blog post, we are going to take a look at some of the OpDB related security features of a CDP Private Cloud Base deployment. We are going to talk about auditing, different security levels, security features of Data Catalog, and Client Considerations.
The Fivetran data warehousing benchmark compares price, performance and differentiated features for BigQuery, Presto, Redshift and Snowflake.
Fivetran introduces our native integration with dbt to run your transformation models hosted in Github directly from our application!
With a modern data stack, Ziff Davis centralizes Google Analytics 360 and ad data to gain insights into paid search performance.
The world has gone remote. For those who can, working from home has become the new normal thanks to Covid-19. The gradual shift underway over the past number of years has accelerated, and most organizations have adapted. This mass pivot has been enabled largely by technology, specifically the move to SaaS and cloud, which allow employees to work productively from almost anywhere.
Fivetran integrates MariaDB data into BigQuery, enabling Australia’s largest ecommerce pet store to understand and improve its customer journey.
Consider these comparisons before you try to build your own data pipeline
As organizations look to optimize the speed and cost of their cloud journey in today’s rapidly evolving economy, Cloudera is delighted to announce the availability of Cloudera Data Platform (CDP) Public Cloud in AWS Marketplace. Now customers can easily, confidently and cost-effectively discover, procure and deploy the world’s first Enterprise Data Cloud, powered by AWS, for faster time-to-insight from their advanced analytics and machine learning services.
At Snowflake, our number one company value is “put customers first.” We only succeed when our customers do. And how we help enable their success depends on how well we serve them as a technology provider. To understand if our efforts meet their needs, we conduct an annual Customer Experience Relational Survey. As we’ve done each year, we are pleased to share the findings of this year’s survey, conducted in May 2020 and produced in partnership with Walker.
Supply chain redesign has become an area of critical focus as businesses try to ensure supply chains aren’t overly reliant on any one constituent part.
I know a lot of organizations have really struggled in the current environment, but the last six months have actually been quite exciting for us at Yellowfin. We've achieved a lot and have built a fantastic strategy for the future. We have really focused in on our sales organization, hiring a new VP of Global Sales, Josh Read, and appointing new sales leadership in the regions as well.
Enterprises are making strides in marketing, but are these companies reaping the rewards of unified data analysis?
Evaluating a new, unknown technology is a complicated task. Although you can articulate the goals you’re trying to achieve, you’re probably faced with multiple solutions that approach the problem in different ways and highlight varying features. To cut through the clutter, you need to figure out what questions to ask in order to evaluate which technology has the optimal capabilities to get the job done in your unique setting.
In the first part of the blog series, we discussed how correlation analysis can be leveraged to reduce time to detection (TTD) and time to remediation (TTR) by guiding mitigation efforts early. Further, correlation analysis helps to reduce alert fatigue by filtering out irrelevant anomalies and grouping multiple anomalies stemming from a single incident into one alert. In this part, we throw light on the applicability of correlation analysis in the realm of eCommerce, specifically, promotions.
A North American telecom company struggled for years trying to react quickly enough to new categories and new levels of spam texts and calls. They also did not have a good way to know when and why they would need additional capacity on their own, or any other telecom company’s networks.
For enterprise organizations, managing and operationalizing increasingly complex data across the business has presented a significant challenge for staying competitive in analytic and data science driven markets.
Random forest is one of the most widely used machine learning algorithms in real production settings.
Automate the process of building and maintaining data pipelines to free up data engineers for more interesting, mission-critical projects.
Today, Snowflake began life as a publicly traded company on the New York Stock Exchange. What does it mean? It depends on who you are. For employees, this is of course a huge milestone, especially for the longest serving employees who hired on at the company in 2013 when the company first started staffing beyond its core founding team.
Neural Guard produces automated threat detection solutions powered by AI for the security screening market. With the expansion of global trends like urbanization, aviation, mass transportation, and global trade, the associated security and commercial challenges have become ever more crucial.
Refine customer success data modeling with more detailed ticket tracking
Over the last decade, data collection has become a commodity. Consequently, there has been a tremendous deluge of data in every area of industry. This trend is captured by recent research, which points to growing volume of raw data and growth of market segments fueled by that data growth.
A Forbes survey shows that data scientists spend 19% of their time collecting data sets and 60% of their time cleaning and organizing data. All told, data scientists spend around 80% of their time on preparing and managing data for analysis. One of the greatest obstacles that make it so difficult to bring data science initiatives to life is the lack of robust data management tools.
For some, this may look like a new category at this year’s Data Impact Awards. However, the Enterprise Data Cloud category marks the evolution of what was once the Data Anywhere category. The main reason for this change is that this title better represents the move that our customers are making; away from acknowledging the ability to have data ‘anywhere’.
Cloudera Data Platform 7.2.1 introduces fine-grained authorization for access to Azure Data Lake Storage using Apache Ranger policies. Cloudera and Microsoft have been working together closely on this integration, which greatly simplifies the security administration of access to ADLS-Gen2 cloud storage.
In 2020, however, continuing to rely just on dashboards for your BI needs isn't enough. Why? Data is growing exponentially - in both size and complexity - within every business today. Manually keeping track of performance and searching for insights has become difficult for many users, and it's fostered new expectations - to be able to do more with analytics - including making it faster and easier to keep on top of changes or opportunities.
For our Head of Product Design and Creative Director, Tony Prysten, design is always top of mind. In analytics platforms, good design plays an important role in how people understand and use data. Here Tony shares how Yellowfin has been created with designers and developers in mind.
There seems to be universal acceptance that effective use of data can help maximize bottom line value. However, many businesses still aren’t successfully leveraging data to its full extent due to people, processes and technology roadblocks. Thankfully, there is a unifying approach to unlocking the immense opportunity for enterprises to more effectively leverage data to create new products, services and business models.
As an industry that capitalizes on the transfer and exchange of data, the telecom sector has a wealth of data on their hands they can use to stay ahead of the competition — network performance, product usage, customer information, billing details, and more. This constant influx of data presents a lot of opportunities for telcos, but only if organizations adopt strategies that aim to make this data accessible and useful.
Spark is known for its powerful engine which enables distributed data processing. It provides unmatched functionality to handle petabytes of data across multiple servers and its capabilities and performance unseated other technologies in the Hadoop world. Although Spark provides great power, it also comes with a high maintenance cost. In recent years, innovations to simplify the Spark infrastructure have been formed, supporting these large data processing tasks.
Everybody needs more data and more analytics, with so many different and often conflicting needs. Data engineers need batch resources, while data scientists need to quickly onboard ephemeral users. Data architects deal with constantly evolving workloads and business analysts must balance the urgency and importance of a concurrent user population that continues to grow.
The FORTUNE 500 list by FORTUNE is one of those venerable institutions of the business world. Since 1955, FORTUNE has been portraying the shape of the U.S. economy through its annual top 500 companies.
The ongoing disruption to critical supply chains in both the manufacturing and retail space has seen businesses having to respond quickly, turning to data, analytics, and new technologies to better predict and manage ‘real-time’ business disruptions.
In this blog post, we are going to take a look at some of the OpDB related security features of a CDP Private Cloud Base deployment. We are going to talk about encryption, authentication and authorization.
The heat of summer and the smell of fresh-cut grass trigger many memories. I feel a sense of yearning from those memories, particularly as I know, during normal times, the college football season has begun. It’s been many years – too many to mention here – since I last played. The sense of anticipation persists, as it is this time of year the team would gather for camp.
Cloudera has been named a Leader in The Forrester Wave™: Notebook-Based Predictive Analytics and Machine Learning, Q3 2020. At Cloudera, we are committed to always staying at the forefront of data and analytics innovation — enabling enterprises to more optimally work with data to deliver analytic results across the business quickly and securely.
Digital transformation has been on the agenda for a long time, but the sudden need to respond to the unprecedented challenges of 2020 has meant the buzzword has become an executable reality for many enterprises. I recently came across a KPMG report that revealed that 80% of executives are increasing investments in emerging technologies now, to drive higher realized value in the future. Underlying digital transformation and investment decisions is a precious asset: data.
Recently, we took another look at our Chicago crime dataset to see what was different in crime as a result of the COVID-19 lockdown and what we found was fascinating. This is a dataset provided by the City of Chicago that tracks any type of reported crime. We often use it to demonstrate the power of Yellowfin. The first thing we saw was that a lot of crime went down. This chart shows crime rates in Chicago over the past two years.
The decision tree algorithm - used within an ensemble method like the random forest - is one of the most widely used machine learning algorithms in real production settings.
With Fivetran handling ELT, ASICS Digital engineers spend time on data science and machine learning projects to propel the business forward.
In our last two posts, we talked with Deloitte’s Marc Beierschoder and Martin Mannion respectively about the requirement organizations have to deploy their data and analytics, quickly, into a hybrid environment. On top of that, there is the fundamental aspect of consistent security and governance of your enterprise data cloud and need for multiple users with different requirements to access data flexibly.
This blog post will present a simple “hello world” kind of example on how to get data that is stored in S3 indexed and served by an Apache Solr service hosted in a Data Discovery and Exploration cluster in CDP. For the curious: DDE is a pre-templated Solr-optimized cluster deployment option in CDP, recently released in tech preview. We will only cover AWS and S3 environments in this blog.
With the third generation of BI upon us, analytics solutions are leveraging AI to generate insights, automate tasks, and support new types of interactions. Qlik is consistently recognized as a leader in augmented analytics, and with the September 2020 release, we’ve set the bar even higher.
Communication Service Providers (CSPs) are in the middle of a data-driven transformation. The current scale and pace of change in the Telecommunications sector is being driven by the rapid evolution of new technologies like the Internet of Things (IoT), 5G, advanced data analytics, and edge computing. This is opening up new revenue opportunities, use cases, and even the possibility for different types of business models within the sector, changing the way that CSPs operate.
The pandemic has created monumental shifts in daily life, making all of us re-evaluate almost every aspect of our work and home lives.
In the final installment in the series, Vijay Raja, Director of Industry & Solutions Marketing at Cloudera shares his views on how the telecom sector is changing and where it goes next. Hi Vijay, thank you so much for joining us again. To continue where we left off, how are ML and IoT influencing the Telecom sector, and how is Cloudera supporting this industry evolution?
Cluster analysis is a process used in artificial intelligence and data mining to discover the hidden structure in your data. There is no single cluster analysis algorithm. Instead, data practitioners choose the algorithm which best fits their needs for structure discovery. Here, we present a comprehensive overview of cluster analysis, which can be used as a guide for both beginners and advanced data scientists.
Companies from every industry vertical, including finance, retail, logistics, and others, all share a common horizontal analytics challenge: How do they best understand the market for their products? Solving this problem requires companies to conduct a detailed marketing, sales, and finance analysis to understand their place within the larger market. These analyses are designed to unlock insights in a company's data that can help businesses run more efficiently.
Google BigQuery was released to general availability in 2011 and has since been positioned as a unique analytics data warehousing service. Its serverless architecture allows it to operate at scale and speed to provide incredibly fast SQL analytics over large datasets. Since its inception, numerous features and improvements have been added to improve performance, security, and reliability, and to make it easier for users to discover insights.
Recently, Cloudera announced the release of Cloudera CDP Private Cloud, delivering the final component of our hybrid cloud strategy. There’s nothing comparable to it in the industry. CDP Private Cloud offers benefits of a public cloud architecture—autoscaling, isolation, agile provisioning, etc.—in an on-premise environment.
Comprehensive, comparative analysis of your data can be a highly time-consuming manual task for your users. But new-wave automation and machine learning (ML) tools make monitoring, alerting and gleaning new insights a lot faster. One such example is Yellowfin Signals and the latest addition to its extensive algorithm library, called step change.
Data science is proving to be a major competitive advantage for companies. While business intelligence (BI) helps companies with reporting and historical analysis, data science goes a step further and predicts the future. It can leverage much more data from many more sources, and using machine learning (ML) principles, it automatically identifies patterns and trends to model, predict, or forecast future outcomes.
From a tech enthusiast to a modern entrepreneur, everyone is using Search Engine Optimization or SEO to generate online traffic. Such is the importance of SEO that the entire industry today is worth $65 billion. People often find it confusing to implement strategies for search engine optimization, but there are a few tools that are making the process easy for digital marketers and SEO experts. Are you looking for the best SEO tools for your brand?
I’ll admit it. I am a gushing fan of this new product from Allegro AI called Allegro Trains. I’m not sure what to call it — what noun I should attach to this creature. “Framework” and “Platform” have become, to my ears, rather meaningless jargon designed to detach suit-wearing types from their money. “Harness” is close.
Idempotence is an important characteristic of many systems, including Fivetran. Learn what it means.
It is hard to believe, if you have had previous experience with setting up, sizing, and deploying a distributed search engine service, that this is possible. Imagine how many times IT has lost valuable time spending hours trying to understand Apache Solr application requirements and map them into how to best size and deploy the Solr service. That time is lost to Line of Business as well.
Live data-streaming offers businesses exciting new opportunities to transform the way they operate, leveraging real-time insights to drive better decision making and enhance operational efficiency. To find out more about how streaming data might impact the financial services sector I sat down for a chat with Dinesh Chandrasekhar, Head of Product Marketing in Cloudera’s Data-in-Motion Business Unit.
The way that people do business in Japan has radically changed as a result of COVID-19. Historically, the Japanese have been very much about face-to-face relationship selling, where you establish relationships and build trust, but now everything has to be done remotely or online. Like everyone else, the Japanese are really keen to keep doing business, so they've actually embraced the remote way of doing things, which has been quite interesting to watch and has had some unexpected benefits.
To transform your retail organization and be more customer-centric, you need to improve in areas where it counts: areas that impact your customers’ experience. The lifeblood of any retail organization, your customers expect a lot from the industry, especially with all the recent changes in shopping habits and the economic landscape in general. Customers are more discerning these days, and it will be up to retail organizations to cater to their needs and expectations.