Systems | Development | Analytics | API | Testing

April 2023

The Top 15 Matillion Alternatives

Businesses and organizations must leverage the power of data to stay ahead of competitors in today's fast-paced market, and ingesting data from various sources is far easier with a specialized solution like Matillion. Its ease of use and hundreds of pre-built connectors have made Matillion popular among many companies, but its pricing and limited capabilities have convinced some of them to seek an alternative.

5 Ways to Use Log Analytics and Telemetry Data for Fraud Prevention

As fraud continues to grow in prevalence, SecOps teams are increasingly investing in fraud prevention capabilities to protect themselves and their customers. One approach that’s proved reliable is the use of log analytics and telemetry data for fraud prevention. By collecting and analyzing data from various sources, including server logs, network traffic, and user behavior, enterprise SecOps teams can identify patterns and anomalies in real time that may indicate fraudulent activity.
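As an illustrative sketch of the approach (the function name, sample data, and threshold below are invented for this example, not taken from any particular SecOps tool), a simple z-score test over per-hour event counts is often enough to surface this kind of anomaly:

```python
from statistics import mean, stdev

def flag_anomalies(counts, threshold=2.0):
    """Flag time buckets whose event count deviates from the mean
    by more than `threshold` standard deviations (z-score test)."""
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        return []
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]

# Hourly counts of failed logins for one account; the spike at
# index 5 is the kind of pattern that may indicate credential stuffing.
failed_logins = [2, 3, 1, 2, 4, 95, 3, 2]
print(flag_anomalies(failed_logins))  # the spike at index 5 is flagged
```

Real deployments would compute the baseline from a rolling historical window rather than the batch being scored, but the core test is the same.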

Deploying Machine Learning Models for Real-Time Predictions Checklist

Deploying trained models takes them from the lab to live environments and ensures they meet business requirements and drive value. Model deployment can bring great value to organizations, but it is not a simple process: it involves many phases, stakeholders, and technologies. In this article, we provide recommendations for data professionals who want to improve and streamline their model deployment process.

Is Your Data Speaking to You? Real-Time Anomaly Detection Helps You Listen Effectively

As we hurtle into a more connected and data-centric future, monitoring the health of our data pipelines and systems is becoming increasingly hard. These days we are managing more data and systems than ever before, and we are monitoring them at a greater scale.

Running Ray in Cloudera Machine Learning to Power Compute-Hungry LLMs

Lost in the talk about OpenAI is the tremendous amount of compute needed to train and fine-tune LLMs like GPT and generative AI applications like ChatGPT. Each iteration requires more compute, and the limitations imposed by Moore’s Law quickly move that task from single compute instances to distributed compute. To accomplish this, OpenAI has employed Ray to power the distributed compute platform used to train each release of the GPT models.

A Comprehensive Guide to Integrating Product Analytics With Other Data Sources and Systems

In today's data-driven world, product analytics is crucial in understanding user behavior, improving product features, and driving business growth. However, product analytics alone may not provide a complete picture of user interactions and business performance. Integrating product analytics with other data sources and systems is essential to gain deeper insights and make more informed decisions.

Yellowfin Guided NLQ vs Tableau Ask Data: What's the Difference?

When it comes to choosing a business intelligence (BI) solution vendor for your business, there are a variety of factors to consider. One important area of comparison is the natural language query (NLQ) features offered by different BI vendors. NLQ is increasingly becoming a key capability of the modern self-service analytics experience, as it allows users to ask complex questions of their data and receive insightful answers in the form of pre-generated, best-practice data visualizations.

Building Cloud Native Data Apps on Premises

Data is core to decision making today and organizations often turn to the cloud to build modern data apps for faster access to valuable insights. With cloud operating models, decision making can be accelerated, leading to competitive advantages and increased revenue. Can you achieve similar outcomes with your on-premises data platform? You absolutely can.

DynamoDB to Redshift: 4 Best Methods

When you use different kinds of databases, you often need to migrate data between them. A specific use case that often comes up is transferring data from your transactional database to your data warehouse, such as copying data from DynamoDB to Redshift. This article introduces you to AWS DynamoDB and Redshift. It also provides 4 methods (with detailed instructions) that you can use to migrate data from AWS DynamoDB to Redshift.
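One well-known route (not necessarily one of the article's four) is Redshift's native COPY support for reading directly from a DynamoDB table. A minimal sketch that builds the statement, with placeholder table names and a placeholder IAM role ARN:

```python
def dynamodb_copy_sql(redshift_table, dynamo_table, iam_role, readratio=50):
    """Build a Redshift COPY statement that reads directly from a
    DynamoDB table; READRATIO caps the share of the table's
    provisioned read capacity the load may consume."""
    return (
        f"COPY {redshift_table} FROM 'dynamodb://{dynamo_table}' "
        f"IAM_ROLE '{iam_role}' READRATIO {readratio};"
    )

print(dynamodb_copy_sql(
    "events", "Events", "arn:aws:iam::123456789012:role/RedshiftCopy"
))
```

The generated SQL would then be executed against the Redshift cluster with whatever client you already use; the role and table names above are purely illustrative.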

Why Talend

Activate your data to drive your business forward. Talend offers complete, flexible, trusted data management so you can get more value from your data, and has been recognized as a leader in its field by leading analyst firms and industry publications.

Ingest your data with Cloudera Streaming & DataFlow

Cloudera Data in Motion is designed to enable businesses to respond to critical events in real-time and streamline their data capture, processing, and distribution, while maintaining security and governance. It offers an open architecture for maximum flexibility and control over resources, addressing data in motion challenges.

How to start a data literacy program in 6 steps

In a world where 2.5 quintillion bytes of data are created every day, it’s not surprising that organizations want to harness the power of being data-driven. In our 2022 Data Health Barometer, 99% of companies surveyed recognized that data is crucial for success — but 97% said they face challenges in using data effectively. Perhaps in response to those challenges, 65% of companies reported that they'd started a data literacy program.

Product announcement: Keboola is launching no-code transformations!

In this exciting new development, Keboola is launching no-code data transformations for everybody on the platform. No-code transformations empower users without technical know-how to build robust, feature-rich applications, with no computer science degree required and no waiting for the IT department to develop the apps for them.

Building a Data-Centric Platform for Generative AI and LLMs at Snowflake

Generative AI and large language models (LLMs) are revolutionizing many aspects of both developer and non-coder productivity with automation of repetitive tasks and fast generation of insights from large amounts of data. Snowflake users are already taking advantage of LLMs to build really cool apps with integrations to web-hosted LLM APIs using external functions, and using Streamlit as an interactive front end for LLM-powered apps such as AI plagiarism detection, AI assistant, and MathGPT.

Using Dead Letter Queues with SQL Stream Builder

Cloudera SQL Stream Builder gives non-technical users the power of a unified stream processing engine so they can integrate, aggregate, query, and analyze both streaming and batch data sources in a single SQL interface. This allows business users to define events of interest that they need to continuously monitor and respond to quickly. A dead letter queue (DLQ) can be used if there are deserialization errors when events are consumed from a Kafka topic.
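The DLQ pattern itself is simple and independent of SQL Stream Builder. A minimal Python sketch (the function and in-memory queue here are illustrative stand-ins, not SSB's actual mechanism): try to deserialize each event, and divert payloads that fail to a dead letter queue instead of crashing the consumer:

```python
import json

def consume(raw_events, process, dead_letter_queue):
    """Deserialize each raw event; hand valid events to `process` and
    route undeserializable payloads to the DLQ instead of failing."""
    for raw in raw_events:
        try:
            event = json.loads(raw)
        except json.JSONDecodeError as err:
            # Keep the original payload plus the error for later inspection.
            dead_letter_queue.append({"payload": raw, "error": str(err)})
            continue
        process(event)

dlq, seen = [], []
consume(['{"id": 1}', "not-json", '{"id": 2}'], seen.append, dlq)
print(len(seen), len(dlq))  # two events processed, one parked in the DLQ
```

In a real pipeline the DLQ would be another Kafka topic, so parked events can be inspected, repaired, and replayed.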

Managing technical debt: How to go from 12 BI tools to 1

CIOs are fed up with having a plethora of BI and analytics tools with business units seemingly chasing the shiniest new solution. And although most industry surveys show data and analytics budgets continuing to grow despite a faltering economy, there is closer scrutiny and belt tightening to rid teams of overlapping capabilities. Here’s a look at how BI tool portfolios have become such a mess and how to streamline them.

Why You Should Move From Management Reporter to Jet Reports

Much like Apple people tend to be all Apple, all the time, Microsoft Dynamics ERP users tend to prefer Microsoft products for all their computing needs. It’s not hard to understand why. Using products from the same ecosystem prevents compatibility issues and saves time in learning multiple systems.

Top 10 Data Extraction Tools for 2023

A data extraction tool can help improve the accuracy of data by automating the extraction process and reducing the risk of human error. This can lead to more reliable and consistent data that can be used to make better business decisions. Moreover, data extraction tools can help you increase productivity and improve the quality of your data as they automate the process of retrieving data from multiple sources.

The Best Big Data Tools in 2023

Data engineers who work with huge amounts of data know that “big data” is not just an overhyped term. When the volumes of data get into petabytes, even the best data engineering tools start to break down. This is when you need devoted big data technologies that are fault-tolerant, scalable, and offer high performance even when amounts of data test the limits of your data platform. This article won’t be just another listicle. Instead, we’ll showcase the best big data tools by use case.

How To Improve User Engagement And Retention With Product Analytics?

In the article “From Data to Insights: An Introduction to Product Analytics”, we walk you through the basics of product analytics, providing you with a high-level approach to get started with it. This time, we’d like to dig in deeper to dissect every step of the product analytics process, assuming that the main goal is to improve user engagement and retention.

Creating a basic write back solution with Qlik Cloud

Using Qlik Cloud Analytics and Qlik Application Automation, you can create sophisticated solutions to solve many business problems. With Qlik's new properties in the action button object, you can now execute an Application Automation workflow while passing parameter/value pairs to the workflow. Check out this simple walk-through to see an example of writing back data to an MS Azure SQL database.

SaaS In 60 - New filter, scheduler interval, write back and more!

This week we’ve added a new customizable filter object, a new interval in the scheduler for alerts and reloads, support for Parquet data files, and the ability to execute an app automation workflow from a button object in Qlik Cloud Analytics with variable value passing, which can be used for a number of advanced automated workflow use cases, including write back.

Data Mesh vs. Data Fabric

In today’s data-driven world, businesses must deal with complex challenges related to managing, integrating, and properly using massive amounts of data housed in multiple locations. Organizations that unlock the right data architectural approach empower themselves with much better decision-making and strategic insights. Two popular approaches — data mesh and data fabric — have surfaced as prominent and innovative solutions for handling data at scale.

Discovering Data Monetization Opportunities in Financial Services

Data has become an essential driver for new monetization initiatives in the financial services industry. The vast amount of data collected from customers, transactions, and market movements, among other sources, offers tremendous potential for financial institutions to extract valuable insights that can inform business decisions, improve customer service, and create new revenue streams.

G2 Spring Reports 2023: Keboola leads the way once again

Keboola has once again taken first place in multiple categories, as announced by the world’s leading software peer review marketplace, G2. As a data stack as a service, Keboola has one core goal - to make data accessible across all departments in your business. We want to empower teams everywhere to use data and build data products effortlessly, regardless of their technical background. Keboola’s success in the G2 Spring Report is gratifying proof that our users continue to love us.

Use AI to train AI: prompt learning using OpenAI API and ClearML

Making a Question Answering (QA) bot that can cite your own documentation and help channels is now possible thanks to ChatGPT and LangChain, an open-source tool that cleverly uses ChatGPT but doesn’t require retraining it. But it’s a far cry from “out of the box.” One example is that you have to get the prompt just right. To get an LLM (large language model) to do exactly what you want, your instructions have to be very clear, so what if we automate that too?

How Spreadsheet Server Can Help Viewpoint Customers Optimize Their Reporting

If you work in a finance team within a construction business, it’s likely your main goals are to reduce risk, improve profitability, and maintain exceptional levels of compliance. To achieve success, you need direct access to accurate data from your ERP and the ability to quickly create drillable Excel reports for GL and other finance requirements.

Qlik's New Capacity Model - Easing Adoption Burdens While Putting Data's Value Front and Center

For years, data and analytics buyers have been managing a difficult set of tradeoffs. They know they need to invest in solutions to drive their businesses and data strategies forward. In fact, every third-party survey shows they plan to do so well into the future. However, customers have struggled to balance cost and value, and a clear answer isn’t always obvious or easy to find.

Industry Impact | The Hybrid Data Platform for Insurance

In the age of connected everything, insurers face new challenges and opportunities as they strive to deliver personalized insurance coverage while minimizing costs and preventing fraud. With the Cloudera Data Platform, insurers can unlock the power of real-time data and analytics to make insurance more precise, more personalized, and more profitable. By building a 360-view of each customer, streamlining claims and services, and unlocking usage-based insurance with IoT sensor data, insurers can manage risks and create opportunities to transform for today and stay ahead tomorrow.

Angles Professional: Operational Reporting from insightsoftware

Speed up operational report production with ready-to-go software, including pre-built content that meets 80% of finance’s needs. With direct, multi-source connectivity and drag-and-drop editing, you have everything in hand to self-serve interactive reports and dashboards to support day-to-day decision making.

Say Hello to ClearML's Machine Learning-Powered Sarcasm Detector: How to Train a Language Classifier using ClearML

by Victor Sonck, Developer Advocate, ClearML

Sarcasm can be difficult to detect in text, especially for machines. However, with the power of large language models, it’s possible to create a tool that can identify sarcastic comments with high accuracy. That’s exactly what the ClearML team did with their latest project: a sarcasm detector that combines various ClearML tools to showcase the capabilities of MLOps.

Ensuring Data Privacy in Product Analytics: Why Is It Important, and How To Do It?

In our previous Countly Digest, we delved into the top concern of product managers and analysts: tracking user behaviour. It is a crucial aspect of product analytics, as it enables companies to understand user behaviour and preferences, leading to better product development and increased revenue.

Top 11 Data Ingestion Tools for 2023

Data ingestion is an important component of any successful data initiative. It refers to the process of collecting data from multiple sources and loading it into another system. Businesses most commonly use a subtype of data ingestion called ETL (extract, transform, load), which allows the data to be transformed before it's loaded. This extra step provides many benefits. Most importantly, it allows organizations to automatically match and correlate data from a variety of different sources.
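In miniature, the transform-before-load step looks like this (a toy sketch with invented field names, not any specific ETL tool); normalizing fields during the transform stage is what makes records from different sources matchable and correlatable later:

```python
def extract(source):
    """Pull raw records from a source (here, an in-memory list)."""
    return list(source)

def transform(records):
    """Normalize field names and types so records from different
    sources can be matched and correlated downstream."""
    return [
        {"email": r["email"].strip().lower(), "amount": float(r["amount"])}
        for r in records
    ]

def load(records, target):
    """Append transformed records to the target store."""
    target.extend(records)

warehouse = []
crm_rows = [{"email": " Ada@Example.COM ", "amount": "19.99"}]
load(transform(extract(crm_rows)), warehouse)
print(warehouse)
```

Because the email was lowercased and trimmed before loading, the same customer arriving from a second source would now join cleanly on that key.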

Unlock Business Value with the New Snowflake Manufacturing Data Cloud

Manufacturers today are implementing a range of new technologies to increase operational efficiency and create visibility and flexibility across value chains. These include robotics, automation, data analytics, IoT, and artificial intelligence (AI) and machine learning (ML), according to Deloitte. Company leaders hope these innovations will help them create more productive and resilient supply chains, improve production quality and efficiency, and mitigate risks.

Tzag Elita and ClearML: Powering the Future of AI Workflows

The world of artificial intelligence and machine learning is constantly evolving, with new challenges and innovations emerging every day. Tzag Elita is a leading provider of HPC and AI solutions in Israel, and understands the importance of staying ahead of the curve and delivering cutting-edge solutions to their customers. That’s why ClearML is excited to announce our partnership with Tzag Elita.

Marketing and advertising data extraction made easy

Watch this video for a brief explanation of how to get all your ad campaign analytics data out of your ad platforms and see it all in one place without any coding, API extraction, or manually compiling data. Stitch partners with the most common ad platforms to help move your data from sources like Google Ads, LinkedIn Ads, Facebook Ads (including Instagram), TikTok for Business, Snapchat Ads, AdRoll, Microsoft Bing Ads, and more into any data warehouse or data lake.

The Snowflake Manufacturing Data Cloud

Introducing the Snowflake Manufacturing Data Cloud, a global network that connects manufacturers to the data, applications, and services needed to enable industrial use cases at scale. By bridging the divide between IT and OT systems, identifying upstream and downstream risks, and driving costs down while increasing production and quality, the Manufacturing Data Cloud offers a way to consolidate, analyze, and democratize data generated across an enterprise, from the factory floor to the ends of the supply chain.

AWS DMS: Challenges & Solutions Guide

AWS DMS (Amazon Web Services Database Migration Service) is a managed solution for migrating databases to AWS. It allows users to move data from various sources to cloud-based and on-premises data warehouses. However, users often encounter challenges when using AWS DMS for ongoing data replication and high-frequency change data capture (CDC) processes.

Boost Data Literacy to Overcome Skills Shortages

As the world emerges from the recent pandemic, organizations continue to struggle to find solid ground in an uncertain economic climate. Plagued by supply chain disruptions and price inflation, finance teams are at the forefront of organizational efforts to strategize and remain agile in changing circumstances.

Evaluating the risks associated with a data mesh approach

This blog looks at some of the risks associated with data mesh and why organizations need to look at more than just the concepts of distributed data management to ensure a successful data mesh. Companies need to evaluate how they will manage their data products, data governance, and data platforms, and how business domains will be managed across the data ecosystem.

Maximize the value potential of your data with data excellence

To reach the value potential of your data, visibility of challenges and an incremental improvement path is key. Many organizations are overlooking some of the foundational changes required in both technology and culture to enable maturity to data excellence. Join Darren Brunt to explore the journey, benefits and potential challenges around establishing data excellence.

From Data to Insights: An Introduction to Product Analytics

At Countly, we’ve been talking about product analytics for over a decade, moving from one in-depth article to the other without really taking the time to give our readers a straightforward handbook to understand this practice and harness its potential. As industry veterans, we'd like to share our two cents on product analytics.

Data Management with Cloudera CDP | Gartner Show Floor Showdown

At Gartner's Data and Analytics Summit in Orlando, Florida, Director of Product Management David Dichmann presented the Cloudera Data Platform (CDP) for Data Management. Using flood data provided by Gartner together with additional data assets, we demonstrate how Cloudera's Hybrid, Open, Portable, and Secure data platform could assist data practitioners in developing an early warning detection service for potential coastal flooding for the state of Florida.

A buyer's guide to choosing the most suitable data observability platform

A lot has been written about data observability by authors, analysts, and vendors over the past few years as it is becoming an increasingly important component of organizations' data architectures. This blog will examine why organizations need data observability and how they should approach the buying experience/cycle.

Top 5 Microsoft SQL ETL Tools for Data Integration

Data integration is the process of combining data from multiple sources into a single, unified destination. An ETL tool can help streamline this process, as it automatically extracts data from various sources, transforms it, and loads it into a target warehouse. By using an ETL tool, organizations reduce the cost and complexity of data integration, improve data accuracy, and ensure data security and privacy.

The 19 Best AI Use Cases in 2023

The recent advances in deep learning neural networks are pushing beyond what we thought AI technology could do. Heck, with DALL-E 2 winning art competitions and ChatGPT passing anything from New York’s Bar exam to Advanced Sommelier exams, modern AI technology is performing better than the average Joe. Which begs the question: How can AI solutions be used to improve business outcomes?

North Labs: Automating processes and saving 10+ hours per week

North Labs is an end-to-end data consultancy enabling business transformation in the areas of DevSecOps, Data Engineering, and Cloud. The team serves highly regulated industries with their mission-critical digital implementations, while simultaneously safeguarding data and ensuring compliance. We talked to Daniel Rothamel, Cloud Data Delivery Engineer at North Labs, to better understand how the joint solution of Snowflake, Keboola, and North Labs helps their customers to get more value from their data.

Transforming Manufacturing Data: The Power of Qlik and Databricks Together

Manufacturing is undergoing a massive transformation, driven by technological advancements that generate vast amounts of data. The industry is moving towards becoming smarter, more sustainable, and services-driven. The fragmented nature of manufacturing’s data architecture, however, has created barriers to realizing the full value of data, with many projects stalling at the Proof-of-Concept stage.

Lenses 5.1 - A 1st class ticket to be event-driven in AWS

Hello again. We strive to improve the productivity of developers building event-driven applications on the technology choices that best fit your organization. AWS continues to be a real powerhouse for our customers, not just for running the workloads, but in supporting them with native services: MSK Kafka, MSK Connect, and now increasingly Glue Schema Registry. This brings a strong alternative to Confluent and their Kafka infrastructure offerings.

Secret rotation for Kafka Connect connectors with AWS Secrets Manager

With version 5.1, Lenses now offers enterprise support for our popular open-source Secret Provider. In this blog, we’ll explain how secrets for Kafka Connect connectors can be safely protected using secret managers, and walk you through configuring the Lenses S3 Sink Connector with the Lenses Secret Provider plugin and AWS Secrets Manager.

Solving key challenges in the ML lifecycle with Unravel and Databricks Model Serving

Machine learning (ML) enables organizations to extract more value from their data than ever before. Companies that successfully deploy ML models into production are able to leverage that data value at a faster pace. But deploying ML models requires a number of key steps, each fraught with challenges.

Managing the Cost and Complexity of Hybrid Cloud Infrastructure

More enterprises are striving to become data-driven so they deliver greater value to their customers, shareholders, employees and society at large. But many are still struggling with data infrastructure. To reap the benefits, learn how companies are solving key challenges, from managing the rapid growth of data to ensuring security and compliance, in this discussion with Dan McConnell, SVP of product management, Enterprise Infrastructure.

Selecting the right data pipeline tools

Data integration is the process of combining data from different sources and formats to create a unified and consistent view of the data. This involves merging data from multiple databases, applications, and other sources into a single repository, and transforming and formatting data so that it can be easily accessed and analyzed. Data assets need quality controls to ensure they are valid and reliable, as many teams within an organization leverage the same data for different purposes.
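Such quality controls can be as simple as a set of named predicate rules applied to every record as it enters the pipeline. A small illustrative sketch (the rule names and fields are made up for this example):

```python
def validate(record, rules):
    """Return the names of all rules a record fails; an empty list
    means the record passed every quality control."""
    return [name for name, check in rules.items() if not check(record)]

rules = {
    "has_id": lambda r: r.get("id") is not None,
    "non_negative_qty": lambda r: r.get("qty", 0) >= 0,
}

good = {"id": 7, "qty": 3}
bad = {"qty": -1}
print(validate(good, rules), validate(bad, rules))
```

Records that fail any rule can be quarantined for review rather than silently loaded, which is what keeps the shared, multi-team data trustworthy.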

Struggling with Inefficient Processes? Process Mining Can Help

Every organization contends with numerous moving parts that drive business forward. But they can be inefficient and convoluted – in fact, Forrester research shows that 71% of organizations use 10 or more applications for a single business process. To make matters worse, only 16% of companies have complete visibility over their own processes. This is where process mining can help. How can you gain more clarity so you can improve efficiency within your organization?

The Glue Schema that binds Apache Kafka

As more applications are developed by different engineering teams on Kafka, the need for data governance grows. JSON is often used when streaming projects bootstrap, but this quickly becomes a problem as your applications iterate, changing data structures by adding new fields, removing old ones, and even changing data formats. It makes your applications brittle: chaos ensues as downstream consumers fall over due to missing data, and SREs curse you.

Jet Reports for Dynamics 365 F&SCM

Jet Reports provides easy to use financial reporting in Excel for D365F&SCM that can be refreshed near real-time, on demand, with the click of a button. Easy-to-configure, pre-built templates get users up and running fast without having to understand complex Dynamics data structures. Access, share, and organize reports on the web to have the accurate answers you need from anywhere.