What Is a Data Pipeline and Why Your Ecommerce Business Needs One

Whether you’re a one-person show reselling items on an online marketplace or a large Ecommerce enterprise with hundreds of employees, your business has one thing in common with every other: it generates data. The size of your business influences how much data you generate, but any amount of data is worthless if it isn’t readily accessible. Every business, especially an Ecommerce business, needs a data pipeline.

Kafka best practices: Monitoring and optimizing the performance of Kafka applications

Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. Administrators, developers, and data engineers who run Kafka clusters often struggle to understand what is happening inside their Kafka implementations.

MLOps World Toronto: MLOps Beyond Training: Simplifying and Automating the Operational Pipeline

Most data science teams start by building AI models and only think about operationalization later. But taking a production-first approach and automating components is the key to generating measurable ROI for the business. In this talk, Iguazio’s co-founder and CTO, Yaron Haviv, explains how to simplify and automate your production pipeline to bring data science to production faster and more efficiently. He presents real-life use cases while walking through each step in the process.

Demo - Exploiting a data fabric to drive data literacy and data democratisation

Join Talend experts to learn how to drive data literacy and adoption throughout your organisation with a seamless data fabric. Discover how to balance collaboration, ease of use and governance to deliver trusted data insights and outcomes at the speed of the business.

Performance considerations for loading data into BigQuery

It is not unusual for customers to load very large data sets into their enterprise data warehouse. Whether you are doing an initial data ingestion with hundreds of TB of data or incrementally loading from your systems of record, the performance of bulk inserts is key to getting insights from the data quickly. The most common architecture for batch data loads uses Google Cloud Storage (object storage) as the staging area for all bulk loads.

People Were Skeptical of Data Warehouses. Now History Is Repeating.

This is a guest post by computer scientist Bill Inmon, recognized as the "father of the data warehouse." Bill has written 65 books in nine languages and is currently building a technology called textual ETL. Many years ago, there were no data warehouses. Most Ecommerce retailers relied on legacy systems that couldn’t communicate with each other, which left their data unintegrated and siloed. Comparing data sets from these systems was almost impossible.

Differences between the C++ and Java MiNiFi agents

In this video, we go through the differences between the C++ and Java MiNiFi agents. The video shows the differences visible in the Edge Flow Manager UI, from the information displayed to the buttons and dropdown elements that appear depending on the agent type. Differences in feature set and functionality are also highlighted. The two implementations also have different footprints (memory and CPU) and different sets of available components. This video will help you determine which MiNiFi agent best suits your use case.