
September 2023

An Introduction to Apache Kafka Consumer Group Strategy

Ever dealt with a misbehaving consumer group? Imbalanced broker load? This could be due to your consumer group and partitioning strategy! Once, on a dark and stormy night, I set myself up for this error. I was creating an application to demonstrate how you can use Apache Kafka® to decouple microservices. The function of my “microservices” was to create latte objects for a restaurant ordering service.
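The imbalance the post alludes to is easy to see in miniature. Below is a toy sketch (not Kafka's actual implementation) of range-style partition assignment, the logic behind Kafka's default assignor: each consumer in a group gets a contiguous block of partitions, and when partitions don't divide evenly, the first consumers carry extra load.

```python
# Illustrative sketch of range-style partition assignment: each consumer
# in a group receives a contiguous chunk of a topic's partitions.
def range_assign(consumers, num_partitions):
    consumers = sorted(consumers)
    n = len(consumers)
    per = num_partitions // n      # base partitions per consumer
    extra = num_partitions % n     # the first `extra` consumers get one more
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        count = per + (1 if i < extra else 0)
        assignment[consumer] = list(range(start, start + count))
        start += count
    return assignment

# With 6 partitions and 4 consumers, two consumers do double the work:
print(range_assign(["c1", "c2", "c3", "c4"], 6))
# → {'c1': [0, 1], 'c2': [2, 3], 'c3': [4], 'c4': [5]}
```

This is why partition counts and consumer counts are worth planning together: an uneven split like the one above is exactly the kind of imbalance the article describes.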

AI-Enhanced Features to Transform your Business - Do More with Qlik Episode 46

Insight Advisor – our intelligent AI-assistant supports a variety of advanced insight generation and automation experiences including search driven insights, conversational analytics, and analysis types – our unique framework for choosing a type of advanced analysis and generating visualizations, NLG, and even smart sheets. You’ll get a look at key driver analysis, to uncover the factors driving a selected metric. And you’ll get a deeper dive into how you can customize insight generation and natural language processing through our business logic layer.

Embedded Analytics and Business Intelligence for Your Applications with Logi Symphony

Create lightning-fast embedded analytics experiences that accelerate time-to-value without the need for additional engineering resources. Seamlessly integrate custom analytics to make data insights accessible and engaging for all users.

Data Streaming Cheat Sheet and Checklist | Data Streaming Systems

Thank you for watching this course. We have a few additional resources to help you dig deeper and be fully equipped to start your data-in-motion journey: a comprehensive cheat sheet with a checklist of what to verify before going to production, and a sneak preview of what we saved for the follow-up course.

AI Won't Replace Humans - But It Will Change How We Make Decisions

Have you heard the phrase “AI won't replace humans - but humans with AI will replace humans without AI”? I personally love this quote because it perfectly encapsulates the nature of the anticipated workforce shift from the rise of generative AI. As I wrote back in 2017, the power of AI is not about machines supplanting human abilities, but rather about a symbiotic relationship between humans and AI. I think the Star Trek analogy I used then is standing the test of time…

Qlik CEO Mike Capone Discusses Data-Driven Leadership & Empowering Asian Enterprises with Innovation

Join us in this insightful Ask Me Anything session as Qlik CEO, Mike Capone, delves into the world of data-driven leadership. Discover how Qlik is at the forefront of meeting the escalating demand of Asian Enterprises for innovation through the strategic use of data, technology, and analytics. Stay tuned for valuable insights and practical approaches to driving success in today's data-driven landscape.

Cloud Kafka Resiliency and Fault Tolerance | Data Streaming Systems

Learn how to manage cloud volatility when running applications on Confluent Cloud. Understand how to optimally configure Kafka clients for resilient cloud operations and explore error handling patterns in Kafka Streams. Leverage concepts like idempotent producers and consumers, and exactly-once processing semantics.
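The idea behind idempotent producers can be shown in a few lines. The following is a toy sketch, not Kafka's real protocol: the broker remembers the highest sequence number accepted per producer id and silently drops retried duplicates, so a network retry can never double-write a record.

```python
# Toy model of idempotent-producer deduplication (illustrative only):
# the broker tracks the last sequence number per producer and ignores
# any record whose sequence number it has already accepted.
class Broker:
    def __init__(self):
        self.last_seq = {}   # producer_id -> highest sequence accepted
        self.log = []

    def append(self, producer_id, seq, record):
        if seq <= self.last_seq.get(producer_id, -1):
            return "duplicate-ignored"   # a retry of an already-stored record
        self.last_seq[producer_id] = seq
        self.log.append(record)
        return "appended"

broker = Broker()
broker.append("p1", 0, "order-42")
broker.append("p1", 0, "order-42")   # network retry: dropped, not re-appended
broker.append("p1", 1, "order-43")
print(broker.log)                    # → ['order-42', 'order-43']
```

In real Kafka clients this behavior is enabled with the `enable.idempotence` producer setting; the sketch above only captures the dedup-by-sequence-number intuition.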

How Euromedia Increased Sales Growth by 12% With Data

Euromedia CZ is one of the biggest players in the Czech book market, running three businesses in one. The book market in which Euromedia operates is notoriously challenging, not only because of its niche nature, but also because of the wild economic swings of the Covid lockdowns. Post-Covid Europe and Russia's war in Ukraine have both had a huge impact on consumer demand.

MFT vs. SFTP: Which File Transfer Is Right for You?

Managed File Transfer (MFT) and Secure File Transfer Protocol (SFTP) are both go-to solutions for sending files between internet-connected systems. However, SFTP prioritizes securely transferring files over a network, while MFT is a more robust solution for securing, managing, and even customizing file transfers to fit your specific needs.

Strengthening Your Data Ecosystem with Unrivaled Security

As data ecosystems evolve, security becomes a paramount concern, especially within private cloud environments. Cloudera on Private Cloud with the Private Cloud Base (CDP PvC Base) stands as a beacon of innovation in data security, offering a holistic suite of features that work in concert to safeguard sensitive information.

Empowering Seamless Data Governance with a New User Experience in Snowsight

At Snowflake, we are dedicated to helping our customers effectively mobilize their data while upholding stringent standards for compliance and data governance. We understand the importance of quick and proactive identification of objects requiring governance, as well as the implementation of protective measures using tags and policies.

Shared Data Reporting: Deep Excel Functionality Meets Web-Based Dashboards

Finance teams are no strangers to pressure. But now more than ever, challenges from both outside and inside organizations are testing your resiliency. Inflation, economic uncertainty, and swiftly changing regulations significantly impact finance professionals. Every organization has roadblocks like budgetary restraints, data limitations, and clunky, manual processes.

Current '23 Keynote: Streaming into the Future - The Evolution & Impact of Data Streaming Platforms

Jay Kreps (Confluent Co-Founder and CEO), Shaun Clowes (Confluent CPO), and data streaming leaders from organizations like NASA, Warner Brothers, and Notion explore the past, present, and future of data streaming. They will address two key questions: how can organizations integrate data across their applications to deliver better experiences, and how can they embed data and analytics into every part of the business to drive better decision-making?

Top 6 Reasons to Modernize Legacy Messaging Infrastructure

Traditional messaging middleware like Message Queues (MQs), Enterprise Service Buses (ESBs), and Extract, Transform and Load (ETL) tools have been widely used for decades to handle message distribution and inter-service communication across distributed applications. However, they can no longer keep up with the needs of modern applications across hybrid and multi-cloud environments for asynchronicity, heterogeneous datasets, and high-volume throughput.

Practical Data Mesh: Building Decentralized Data Architectures with Event Streams

Why a data mesh? Predicated on delivering data as a first-class product, data mesh focuses on making it easy to publish and access important data across your organization. An event-driven data mesh combines the scale and performance of data in motion with product-focused rigor and self-service capabilities, putting data at the front and center of both operational and analytical use-cases.

Confluent unveils Apache Flink® on Confluent Cloud, making it easier to build real-time applications with stream processing on a unified platform

Confluent launches the industry's only serverless, cloud-native Flink service to simplify building high-quality, reusable data streams. Confluent expands Stream Governance capabilities with Data Portal, so teams can easily find all the real-time data streams in an organisation. New Confluent Cloud Enterprise offering lowers the cost of private networking and storage for Apache Kafka.

Top 8 Salesforce Middleware Integration Tools

Salesforce is among the leading CRM software platforms for collecting and leveraging user data to make smart sales, marketing, and customer support decisions. However, other software in your tech stack can benefit from such data. With the right Salesforce middleware, you can exchange data easily with your other critical tools.

IBM Technology Chooses Cloudera as its Preferred Partner for Addressing Real Time Data Movement Using Kafka

Organizations increasingly rely on streaming data sources not only to bring data into the enterprise but also to perform streaming analytics that accelerate the process of being able to get value from the data early in its lifecycle. As lakehouse architectures (including offerings from Cloudera and IBM) become the norm for data processing and building AI applications, a robust streaming service becomes a critical building block for modern data architectures.

How to Mask PII Before LLM Training

Generative AI has recently emerged as a groundbreaking technology, and businesses have been quick to respond. Recognizing its potential to drive innovation, deliver significant ROI, and add economic value, businesses are adopting it rapidly and widely. They are not wrong. A research report by QuantumBlack, AI by McKinsey, titled "The Economic Potential of Generative AI", estimates that generative AI could unlock up to $4.4 trillion in annual global productivity.

3 Ways to Replace Distrust of Your SAP Data With Confidence

Unlocking the full potential of your SAP solution requires the complete trust of your data. Without trust, users will second guess any insights, stalling business progress. Research has pinpointed three key pain points that companies encounter with their SAP data: a prevailing sense of data distrust, a lack of maintenance and data cleansing, and a shortage of skilled users. These pain points not only impede progress but also pose significant roadblocks to migrating to S/4HANA, the future of SAP.

Introducing Confluent Cloud for Apache Flink

In the first three parts of our Inside Flink blog series, we discussed the benefits of stream processing, explored why developers are choosing Apache Flink® for a variety of stream processing use cases, and took a deep dive into Flink's SQL API. In this post, we'll focus on how we’ve re-architected Flink as a cloud-native service on Confluent Cloud. However, before we get into the specifics, there is exciting news to share.

Deliver Intelligent, Secure, and Cost-Effective Data Pipelines

The Q3 Confluent Cloud Launch comes to you from Current 2023, where data streaming industry experts have come together to share insights into the future of data streaming and new areas of innovation. This year, we’re introducing Confluent Cloud’s fully managed service for Apache Flink®, improvements to Kora Engine, how AI and streaming work together, and much more.

What is Multi-Tenancy? Understanding Multi-tenant Analytics

Multi-tenancy is a concept that refers to the ability of a software application or system to serve multiple tenants, or customers, on a shared infrastructure. In simpler terms, it is the capability of a single instance of a software application to accommodate multiple users or organizations, each with their own datasets and customization options.
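The definition above can be made concrete with a minimal sketch, assuming nothing beyond the teaser itself: one application instance and one shared store serve several tenants, and every read is scoped by a tenant id so each customer sees only their own data.

```python
# Minimal multi-tenancy sketch: a single shared store serves many tenants,
# with every query filtered by tenant id so datasets stay isolated.
class AnalyticsStore:
    def __init__(self):
        self._rows = []   # shared infrastructure: one table for all tenants

    def insert(self, tenant_id, row):
        self._rows.append({"tenant_id": tenant_id, **row})

    def query(self, tenant_id):
        # Tenant scoping: tenant "acme" can never see tenant "globex" rows.
        return [r for r in self._rows if r["tenant_id"] == tenant_id]

store = AnalyticsStore()
store.insert("acme", {"metric": "revenue", "value": 100})
store.insert("globex", {"metric": "revenue", "value": 250})
print(store.query("acme"))   # only acme's single row comes back
```

Real multi-tenant systems enforce this isolation at the database or infrastructure layer rather than in application code, but the contract is the same: shared instance, per-tenant data.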

Empower data with BigQuery & Looker

If you’re working with large amounts of data and looking for guidance on how to build a data warehouse in Google Cloud using BigQuery, this new Jump Start Solution is for you! In this video, we’ll walk you through the Jump Start Solution that combines BigQuery as your data warehouse and Looker Studio as a dashboard and visualization tool.

How to Increase Data Processing: Combining SFTP and Heroku

Secure File Transfer Protocol (SFTP), at its core, is a protocol designed to provide secure file transfer capabilities. With an extensive application in web development and IT infrastructures, its primary use case revolves around the encrypted transfer of files between remote servers and local machines.

Driving Data Discovery and Reliability for Better Business Decision Making

Enterprises are drowning in data. Structured, semi-structured or unstructured data for the modern, data-driven enterprise is everything, everywhere, all at once. But that’s also a challenge for enterprises looking to transform their data into usable information for business success. The sheer volume of data is challenging the ability of enterprises to find trustworthy, reliable data to drive their business decisions. Traditional data catalogs offer only structured data discovery.

From Analytics to Outreach

Not all heroes in the tech world write code. Some wield the power of data analytics and SEO to create compelling stories and foster brand growth. This week, our Monday Member Spotlight features Jose, TestQuality’s Marketing Assistant with years of specialized experience in Google Analytics and SEO. Let's explore how he takes a data-driven approach to spread the word about TestQuality.

How to Use SFTP to Securely Transfer Files

Transferring files securely between machines is a fundamental part of the ETL (Extract, Transform, Load) process, which involves extracting data from one source, transforming it for analysis, and loading it into a data warehouse. The challenge? Ensuring these files are both sent and received without interception by malicious entities. For years, FTP (File Transfer Protocol) served as the go-to method to transfer files between a client and server on a network.

Lenses 5.3: Robust Kafka with single click topic backup/restore

Navigating the intricacies of Apache Kafka just got a lot more intuitive. With Lenses 5.3 we bring you peace of mind, regardless of where you are in your Kafka journey. Our newest release is all about smoothing out the bumps, and making sure you're equipped to handle Kafka's challenges with confidence. Here's a sprinkle of what's in store, ahead of our big 6.0 release later this year.

A single-click Kafka topic backup experience

We like to reduce the most mundane, complex and time-consuming work associated with managing a Kafka platform. One such task is backing up topic data. With a growing reliance on Kafka for various workloads, having a solid backup strategy is not just a nice-to-have, but a necessity. If you haven’t backed up your Kafka and you live in fear of disaster striking, worry no more.

Mission-critical data flows with the open-source Lenses Kafka Connector for Amazon S3

An effective data platform thrives on solid data integration, and for Kafka, S3 data flows are paramount. Data engineers often grapple with diverse data requests related to S3. Enter Lenses. By partnering with major enterprises, we've levelled up our S3 connector, making it the market's leading choice. We've also incorporated it into our Lenses 5.3 release, boosting Kafka topic backup/restore.

Boost Data Streaming Performance, Uptime, and Scalability | Data Streaming Systems

Operate the data streaming platform efficiently by focusing on prevention, monitoring, and mitigation for maximum uptime. Handle potential data loss risks like software bugs, operator errors, and misconfigurations proactively. Leverage GitOps for real-time alerts and remediation. Adjust capacity to meet demand and monitor costs with Confluent Cloud's pay-as-you-go model. Prepare for growth with documentation and minimal governance.

Use GitOps as an efficient CI/CD pipeline for Data Streaming | Data Streaming Systems

Early automation saves time and money. GitOps improves your CI/CD pipeline, enhancing operations and traceability. Learn to use GitOps for data streaming platforms and streaming applications with Apache Kafka and Confluent Cloud.

Robust Disaster Recovery with Kafka and Confluent Cloud | Data Streaming Systems

Explore the resilience of Kafka, understand the implications of datacenter disruptions, and mitigate data loss impacts. Learn to scale with Confluent Cloud, cluster and schema linking, and how to use an active/passive disaster recovery pattern for business continuity.

SFTP vs. FTP: Understanding the Difference

When transferring data, especially in the context of Extract, Transform, and Load (ETL), the choice of protocol matters. Both SFTP and FTP provide solutions, but their intrinsic differences could greatly influence the outcome in terms of security and functionality. Here's everything you need to know about SFTP vs. FTP for ETL.

Unlocking New Capabilities with ChatGPT in Logi Symphony

In today’s fast-paced market, data has become the lifeblood of decision-making. For application teams and users, having access to insightful and actionable data is not just a luxury; it’s a necessity. And now, the proliferation of AI in embedded analytics, like Logi Symphony’s new ChatGPT integration, is revolutionizing the way application teams and users interact with data, increasing efficiency and reducing the technical skills required to generate valuable insights.

Bridge the Gap Between Reporting and Data Visualization in Power BI

In a rapidly evolving business environment, timely insights from data and the ability to react quickly to change are critical. Business intelligence is a key tool, empowering companies to get the most out of their data by providing tools to analyze information, streamline operations, track performance, and inform decision-making. Power BI can generate easy-to-read visualizations that help stakeholders perform key analysis.

Challenges Using Apache Kafka | Data Streaming Systems

Streaming platforms need key capabilities for smooth operations: data ingestion, development experience, management, security, performance, and maintenance. Self-managed platforms like Apache Kafka can meet these needs, but can be costly and require intensive maintenance. On the other hand, Confluent Cloud offers fully-managed services with features like scalable performance, auto-balancing, tiered storage, and enhanced security and resiliency. It provides systematic updates and maintenance, freeing users from infrastructure concerns. Confluent Cloud streamlines creation of a global, well-governed data streaming platform.

How DISH Wireless Benefits From a Data Mesh Built With Confluent

Over the last few years, DISH Wireless has turned to AWS partners like Confluent to build an entirely new type of telecommunication infrastructure: a cloud-native network built to empower developers. Discover how data streaming allows DISH Wireless to deliver data products that turn network data into business value for customers, harness massive volumes of data to facilitate the future of app communications, and seamlessly connect apps and devices across hybrid cloud environments.

The What's, How's and Why's of SFTP

When it comes to the exposure of data, no other period in history has posed the magnitude of risks and regulations companies face today. Companies in any industry — but particularly those in healthcare, finance, and government — must keep cybersecurity top-of-mind to avoid data breaches of personally identifiable information (PII). Not only are data breaches a threat to company reputation, but compliance issues can also lead to hefty fines and, in extreme cases, imprisonment.

Salesforce Chatter API vs. Connect API: What's the Difference?

While Salesforce Chatter API and Connect API help integrate Salesforce with other systems, they serve different purposes. Salesforce Chatter API is for integrating Chatter’s communication and social collaboration features into your application. On the other hand, Salesforce Connect API integrates Salesforce with other data sources or systems so you can access and use real-time data from those systems within Salesforce.

PostgreSQL vs MySQL: The Critical Differences

MySQL and PostgreSQL offer many of the same features and capabilities—but there are critical differences between these two Relational Database Management Systems (RDBMS) that cannot be ignored. If you’re not familiar with these differences, here’s a quick and easy overview: In this guide, we provide a brief history and overview of each database system.

AI Like a Rockstar

Although it might seem a little early, I was just thinking: what will 2023 be remembered for? For many it will be the year that Beyonce and Taylor Swift took to stages around the world and pushed the boundaries of live music (I’m a confirmed Swiftie if you didn’t know). It is also the year of AI. When I speak with customers, they all talk about how they are steering towards AI adoption.

8 Things You Can Do With Data Apps in Keboola

Imagine this: data consumers in your organization no longer anxiously wait for essential data, self-serving data with ease and confidence. Data engineers, freed from dull routine tasks and ad-hoc requests, tackling strategic projects at speed. Meet Data Apps. This new Keboola feature empowers business teams with self-serve data while freeing data engineers to focus on high-impact work. How?

23 Best Free NLP Datasets for Machine Learning

NLP is a field of AI that enables machines to understand, interpret, and generate human language in a way that is both meaningful and contextually relevant. Recently, ChatGPT and similar applications have created a surge in consumer and business interest in NLP. Now, many organizations are trying to incorporate NLP into their offerings.

Snowflake Customer 360 For Organizations

In today’s highly competitive market, consumers are more likely to stick with brands and businesses that recognize their wants and needs. Achieving this level of personalization requires companies to have a 360-degree view of customers or Customer 360. Learn how the Snowflake Data Cloud helps companies activate data to improve customer experiences by powering Customer 360.

Think Your Company Doesn't Need a Chief Data Officer? Here Are 7 Reasons Why It Does

Perhaps your C-suite is already a bit crowded. The typical hierarchy will include a CEO, COO, CFO, CTO, CMO, CIO, and a few more. Adding another position may not be terribly appealing, but there is one C-suite role every company should consider—chief data and analytics officer (CDO or CDAO).

What Is a Data-Driven Organization?

Organizations that consistently use data for decision-making and improving operations outperform their competition. In fact, B2B companies utilizing data experience above-market growth and earnings increases of 15 to 25%. McKinsey forecasts that by 2025, almost all employees will use data to support their work naturally and regularly. Compare that to how organizations currently apply data-driven approaches—sporadically throughout the organization.

Do the Benefits of Cloud Outweigh the Costs?

Companies are now making a decisive shift from traditional on-premises Oracle software to Oracle’s cutting-edge cloud solutions. In fact, a recent Gartner report on cloud expenditure found that cross-industry cloud spend has risen from 8% as a percentage of total IT spend in 2018 to 16% in 2022. Worldwide spending on public cloud services is expected to grow by 21.7% in 2023.

Yellowfin vs Power BI: What's the Difference?

Adopting a new business intelligence (BI) solution requires a thorough understanding of its feature-set and functionality in order to ensure analytics is integrated into your business as seamlessly as possible and that the value of your new tool is realized. Previously, we have covered how Yellowfin can be used with Power BI as a complementary solution.

What Well-Designed Data Lake Architecture Looks Like

The importance of a well-structured data lake architecture cannot be overstated. As businesses work with an ever-increasing influx of data, the need for a robust, scalable, and efficient data storage solution becomes crucial. Let’s explore Data Lake Architecture Design—a concept revolutionizing how enterprises store, access, analyze, and compute their data.

Top 5 Best Practices for Building Event-Driven Architectures Using Confluent and AWS Lambda

Confluent and AWS Lambda can be used for building real-time, scalable, fault-tolerant event-driven architectures, ensuring that your application logic is executed reliably in response to specific business events. Confluent provides a streaming SaaS solution based on Apache Kafka® and built on Kora: The Cloud Native Apache Kafka Engine, allowing you to focus on building event-driven applications without operating the underlying infrastructure.

AWS, Qlik, and SAP Data: Turning the Lifeblood of Business into Value and Action

One of my favorite analogies is that data is the lifeblood of the business. Before you roll your eyes at me (I see it now), hear me out. At your annual physical, when you get your blood work done, think of how much information is uncovered about your overall health from a tiny vial of your blood. From those 10 CCs they extract come pages of information about your cell counts, glucose, cholesterol, and more.

Apache Kafka Message Compression

Apache Kafka® supports incredibly high throughput. It’s been known for feats like supporting 20 million orders per hour to get COVID tests out to US citizens during the pandemic. Kafka's approach to partitioning topics helps achieve this level of scalability. Topic partitions are the main "unit of parallelism" in Kafka. What’s a unit of parallelism? It’s like having multiple cashiers in the same store instead of one.
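Compression contributes to that throughput story. As an illustration only (Kafka itself compresses whole record batches broker-side using codecs such as gzip, snappy, lz4, or zstd), the stdlib sketch below shows why batching and compression pair so well: messages in a batch tend to share structure, so a batch of similar JSON events shrinks to a small fraction of its raw size.

```python
import gzip
import json

# Illustrative only: compress a batch of similar JSON events, the way a
# Kafka producer compresses a record batch before sending it to the broker.
batch = [{"order_id": i, "item": "latte", "status": "ready"} for i in range(1000)]
raw = json.dumps(batch).encode()
compressed = gzip.compress(raw)

print(f"{len(raw)} -> {len(compressed)} bytes")
# Repetitive payloads compress to a small fraction of their raw size.
assert len(compressed) < len(raw) * 0.2
```

Fewer bytes per batch means less network and disk I/O per partition, which compounds with the parallelism that partitioning already provides.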

Technology Spotlight: Open Data Lakehouse for Private Cloud

Cloudera emphasizes the importance of trusted data for reliable AI. We've introduced Open Data Lakehouse for private cloud, incorporating Apache Iceberg for enhanced data management and security. This empowers analysts and data scientists with direct access to all data, including real-time streaming. Iceberg's capabilities reduce silos, lower storage costs, and mitigate business risks. We also focus on scalability, introducing features like snapshots and user quotas in Apache Ozone. Cloudera prioritizes enterprise readiness with Zero Downtime Upgrades and broader hardware and software support.

Telecommunications Data Monetization Strategies in 5G and beyond with Cloudera and AWS

The world is awash with data, no more so than in the telecommunications (telco) industry. With some Cloudera customers ingesting multiple petabytes of data every single day— that’s multiple thousands of terabytes!—there is the potential to understand, in great detail, how people, businesses, cities and ecosystems function.

Fine-Tuning a Foundation Model for Multiple Tasks

In this video we discuss the reasons why fine-tuning is needed to create more contextually accurate LLMs, and the methods you can use to accomplish this. We also give a demo of our newest Applied ML Prototype (AMP), which demonstrates how to implement LLM fine-tuning jobs using the QLoRA and Accelerate implementations available in the PEFT open-source library from Hugging Face, along with an example application that swaps the fine-tuned adapters in real time for inference targeting different tasks. Learn more at cloudera.com.

Building Airtight Data Security Architecture in Growing Businesses

In the second installment of Mavericks of Data, we have an engaging discussion with Mahesh Krishnan, CTO of Fujitsu Australia and innovator, thought leader, author, speaker, and passionate technologist who has over 30 years of experience in the IT sector. Mahesh talks about his role within Fujitsu, the recent developments and key considerations in data security, setting up data security within growing businesses and the challenges revolving around data sensitivity.

Revolutionize Your Data Experience With Cloudera on Private Cloud

In the age of the AI revolution, where chatbots, generative AI, and large language models (LLMs) are taking the business world by storm, enterprises are fast realizing the need for strong data control and privacy to protect their confidential and commercially sensitive data, while still providing access to this data for context-specific AI insights.

Marketing Success in the Age of AI Requires a Modern Marketing Data Stack

Data is essential to marketing. It’s how we know our audience and measure campaign outcomes. It shows us where to adjust a campaign on the fly, for even better results. But working with data is increasingly complex, and having the right stack of technologies is invaluable.

Mode + ThoughtSpot recognized as Leaders in Snowflake's 2023 Modern Marketing Data Stack awards

We’re thrilled to announce that both ThoughtSpot and Mode (acquired by ThoughtSpot in July 2023) have been recognized as Leaders in Snowflake's recent Modern Marketing Data Stack report! Given the ever-evolving landscape of modern data analytics products, organizations are looking to ThoughtSpot and Mode when seeking innovative solutions—helping them harness the power of their marketing data.

Snowflake's Annual Modern Marketing Data Stack Report: Talend Being Named a "Leader in Integration and Modeling" is Just the Beginning

With marketing analytics now influencing more than half (53%) of marketing decisions, there’s finally some good information around using data in marketing. In fact, Gartner found that when analytics influences less than 50% of decisions, organizations find it challenging to prove the value of their marketing.

Snowflake CDP: The Future of Customer Data Management

In today's fast-paced digital landscape, harnessing the power of data has become paramount for businesses striving to deliver exceptional customer experiences and stay ahead in the competitive market. At the forefront of this data revolution stands Snowflake CDP, an innovative Customer Data Platform (CDP) that promises to redefine how businesses manage, integrate, and leverage their customer data.

Snowflake's Annual Modern Marketing Data Stack Report: Being Named a "Leader in Integration and Modeling" is Just the Beginning

With marketing analytics now influencing more than half (53%) of marketing decisions, there’s finally some good data around using data in marketing. In fact, Gartner found that when analytics influences less than 50% of decisions, organizations find it challenging to prove the value of their marketing.

Dataflow Programming with Apache Flink and Apache Kafka

Recently, I got my hands dirty working with Apache Flink®. The experience was a little overwhelming. I have spent years working with streaming technologies but Flink was new to me and the resources online were rarely what I needed. Thankfully, I had access to some of the best Flink experts in the business to provide me with first-class advice, but not everyone has access to an expert when they need one.

FinOps Camp Episode 3: Considerations for Mapping your FinOps Adventure

Program elements, use cases, and principles to manage cloud data costs. Join SanjMo Principal and Founder Sanjeev Mohan and Unravel VP of Solutions Engineering Chris Santiago as we share considerations for mapping your FinOps adventure. Creating a solid FinOps strategy is crucial to navigating the rapidly evolving world of cloud services.

MLOps Live #24: How to Build an Automated AI ChatBot

In this MLOps Live session, Gennaro, Head of Artificial Intelligence and Machine Learning at Sense, describes how he and his team built and perfected the Sense chatbot, what their ML pipeline looks like behind the scenes, and how they have overcome complex challenges such as building a natural language processing (NLP) serving pipeline with custom model ensembles, tracking question-to-question context, and enabling candidate matching.

The Advantages of Cloud SFTP

Data management is a critical aspect of any business, and secure, efficient data transfer mechanisms are an absolute must. This is where Secure File Transfer Protocol (SFTP) comes into play, offering a method to transfer files securely over networks. However, with the rise of cloud computing, a more accessible, scalable, and cost-effective solution has emerged: Cloud SFTP.

New Looker + ThoughtSpot Connector: Where semantic modeling meets natural language search

Semantic layers are a game changer, allowing organizations to define metrics and business logic in one centralized location. Because business users can trust that their data is built on a single source of truth, the semantic layer also empowers self-service analytics. Looker Modeler has become a leader among semantic layers, allowing users to seamlessly layer on top of their business data.

Power Holistic Customer Insights with Salesforce and Snowflake Data Sharing-Based Integration

Snowflake and Salesforce have built on our existing partnership to unify the full breadth of customer and business data and generate actionable insights for our customers. We are happy to announce the general availability of Bring Your Own Lake (BYOL) Data Sharing with the Snowflake Data Cloud from Salesforce Data Cloud. Organizations can now leverage Salesforce data directly in Snowflake via zero-ETL data sharing to accelerate decision-making and help streamline business processes.

Create trusted insights with Verified Liveboards

ThoughtSpot users can easily create content with data using our intuitive, AI-powered search experience. However, business users sometimes find themselves asking a critical question: which content should I trust and use for my specific business use case? For example, if there are ten “Sales Performance” Liveboards created by different authors, you may wonder which is the golden version—the Liveboard that is reviewed, approved, and consistently maintained.

Top 9 Salesforce Reporting Tools for Data-Driven Insights

Salesforce has gained in popularity as one of the most comprehensive CRM platforms. However, when it comes to reporting, Salesforce lacks some key features required to make the most of your data-driven insights. For example, Salesforce reporting doesn't provide customization and it requires tedious manual data entry. Salesforce reporting tools offer an easy fix.

Not All Natural Language Query (NLQ) Models Are Created Equal: Part 3 - Power BI Q&A

In part one of this series, we discussed the evolution of Yellowfin’s Guided NLQ solution and focused on aspects of Guided NLQ that stand apart from the competition. In part two, we then compared Guided NLQ to Sisense's equivalent NLQ solution, Sisense Simply Ask. In part three, we will look deeper at another competitor’s NLQ offering, Microsoft Power BI and its Q&A feature.

Your Guide to Flink SQL: An In-Depth Exploration

In the first two parts of our Inside Flink blog series, we explored the benefits of stream processing with Flink and common Flink use cases for which teams are choosing to leverage the popular framework to unlock the full potential of streaming. Specifically, we broke down the key reasons why developers are choosing Apache Flink® as their stream processing framework, as well as the ways in which they are putting it into practice.

SaaS in 60 - Expression Generator

Insight Advisor Expression Generation - As we expand the breadth and depth of our Insight Advisor AI-assistant capabilities, we now offer auto-generated expressions in the expression editor driven by natural language processing. Users seeking to create analytics expressions, including complex set analysis, can simply describe what they want to calculate and Insight Advisor will generate the expression to use. This powerful capability delivers on the promise of AI, making the complex simple and allowing more people to expand their data literacy.

Staige AI-Enhanced - Spark Your Own AI Innovations

Qlik Staige is a unified strategy to provide AI-enhanced solutions to enterprises while confidently embracing the power of Artificial Intelligence (AI). Qlik Staige helps customers innovate and move faster by making secure, governed AI and automation part of everything they can do with Qlik – from experimenting with and implementing generative AI models, to developing AI-powered predictions to improve future outcomes, to driving better insights into their business.

A Hybrid, Open Data Lakehouse That Can Handle it All

Open more possibilities with an open data lakehouse. As the industry’s first open data lakehouse, Cloudera Data Platform delivers scalable performance and efficiency to enable smarter business decisions—paving the safest, fastest path from data to AI. Cloudera, together with Intel and HPE, can power analytics and AI at scale across any CSP or private cloud infrastructure.

Overview of Cloud storage for your data platform

One of the most important questions in architecting a data platform is where to store and archive data. In this blog series, we'll cover the different storage strategies for Kafka and introduce you to Lenses' S3 Connector for backup/restore. But in this first blog, we introduce the different cloud storage options available. Later blogs will focus on specific solutions, explain in more depth how this maps to Kafka, and then show how Lenses manages your Kafka topic backups.

The Ultimate Guide to ELK Log Analysis

ELK has become one of the most popular log analytics solutions for software-driven businesses, with thousands of organizations relying on ELK for log analysis and management in 2021. In this ultimate guide to using ELK for log management and analytics, we’re providing insights and information that will help you know what to expect when deploying, configuring, and operating an ELK stack for your organization. Keep reading to discover answers to the following.

Financial Reporting Challenges for CFOs in 2023

Financial professionals encounter periods of high activity throughout the year. Whether you serve as a CFO, specialize in taxation, or contribute to the team responsible for closing financial records and generating year-end reports, any time can become crunch time. These intervals demand long hours at the office (or working evenings from your home office) as you diligently tackle the extensive list of tasks that require immediate attention.

Red Hat + Cloudera | A Hybrid Data Platform for Generative AI for FSI

Red Hat and Cloudera have joined forces to enable customers to take advantage of the cloud with full confidence, especially in the financial services industry, where data protection is critical. Red Hat Payment Industry Lead, Ramon Villarreal describes how collaborating with Cloudera provides leading financial services organizations with data resiliency, performance and expedited time to market as they leverage the cloud to move and manipulate massive amounts of data.

How Financial Services and Insurance Streamline AI Initiatives with a Hybrid Data Platform

With the emergence of new generative AI models, such as the large language models (LLMs) behind OpenAI's ChatGPT, Google's Bard, Meta's LLaMa, and Bloomberg's BloombergGPT, awareness, interest, and adoption of AI use cases across industries are at an all-time high. But in highly regulated industries where these technologies may be prohibited, the focus is less on off-the-shelf generative AI and more on the relationship between their data and how AI can transform their business.

Sprint2Cloud: A Quicker Way to Get Through Cloud Migrations

In today’s high-velocity digital arena, businesses are thrust into the whirlwind of global events, rapid technological advancements, and the incessant push for innovation. Yet, amidst the tempest of mergers, digital acceleration, and shifting tech paradigms, charting a confident path towards cloud migration can be daunting.

Securely Connect to LLMs and Other External Services from Snowpark

Snowpark is the set of libraries and runtimes that enables data engineers, data scientists and developers to build data engineering pipelines, ML workflows, and data applications in Python, Java, and Scala. Functions or procedures written by users in these languages are executed inside of Snowpark’s secure sandbox environment, which runs on the warehouse.

Leveraging Machine Learning in Product Analytics for Enhanced Insights and Actionability

Product analytics traditionally hinged on examining user interactions to extract actionable insights. The integration of machine learning (ML) has elevated this process, enriching our understanding and our ability to predict future trends. Let's unfold how ML integrates into product analytics and the transformative advantages it introduces.

How to Run Apache Kafka on Windows

Is Windows your favorite development environment? Do you want to run Apache Kafka® on Windows? Thanks to the Windows Subsystem for Linux 2 (WSL 2), now you can, and with fewer tears than in the past. Windows still isn’t the recommended platform for running Kafka with production workloads, but for trying out Kafka, it works just fine. Let’s take a look at how it’s done.
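The basic steps are straightforward once WSL 2 is set up. Here is a rough sketch of the commands involved inside a WSL 2 Ubuntu shell; the Kafka version and download mirror are illustrative, so check the Apache Kafka downloads page for the current release:

```shell
# Install a Java runtime (Kafka requires Java).
sudo apt update && sudo apt install -y openjdk-11-jre-headless

# Download and unpack Kafka (version shown is illustrative).
curl -O https://downloads.apache.org/kafka/3.5.1/kafka_2.13-3.5.1.tgz
tar -xzf kafka_2.13-3.5.1.tgz && cd kafka_2.13-3.5.1

# Start ZooKeeper, then the broker, each in its own terminal.
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
```

With the broker running, the bundled `kafka-topics.sh` and console producer/consumer scripts work as they would on Linux.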

Streaming Pipelines With Snowflake Explained In 2 Minutes

Streaming data has been historically complex and costly to work with. That's no longer the case with Snowflake's streaming capabilities. Together, Snowpipe Streaming and Dynamic Tables (in public preview) break the barrier between batch and streaming systems. Now you can build low-latency data pipelines with serverless row-set ingestion and declarative pipelines with SQL. You can easily adapt latency to your business requirements by changing a single parameter.

Expanding Possibilities: Cloudera's Teen Accelerator Program Completes Its Second Year

At Cloudera, we’re known for making innovative technological solutions that drive change and impact the world. Our mission is to make data and analytics easy and accessible to everyone. And that doesn’t end with our customer base. We also aim to provide equitable access to career opportunities within data and analytics to the workforce of tomorrow.

Choosing the Right ETL Tool for Google BigQuery Storage

Google BigQuery is a robust and scalable cloud-based data warehouse that allows storing and analyzing vast amounts of data. BigQuery is a natural choice if your data already exists on the Google Cloud Platform (GCP). But before you leverage the platform, you need to extract the source data, carry out transformations, and load the data into your data lake or warehouse. This is where the ETL process and the ETL tools play a significant role.
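The extract-transform-load flow mentioned above can be sketched in a few lines. This is purely illustrative, with hypothetical field names and an in-memory list standing in for a warehouse table; a real pipeline would load into BigQuery through its client library or a dedicated ETL tool:

```python
def extract(source_rows):
    """Extract: read raw records from a source system."""
    return list(source_rows)

def transform(rows):
    """Transform: normalize field names and types before loading."""
    return [
        {"customer_id": int(r["id"]), "email": r["email"].strip().lower()}
        for r in rows
    ]

def load(rows, table):
    """Load: stand-in for an insert into a warehouse table."""
    table.extend(rows)

warehouse_table = []
raw = [{"id": "42", "email": "  Ada@Example.com "}]
load(transform(extract(raw)), warehouse_table)
assert warehouse_table == [{"customer_id": 42, "email": "ada@example.com"}]
```

An ETL tool essentially packages these three stages, plus scheduling, error handling, and connectors for sources and destinations.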

New Fivetran connector streamlines data workflows for real-time insights

In a survey by the Harvard Business Review, 87% of respondents stated their organizations would be more successful if frontline workers were empowered to make important decisions in the moment. And 86% of respondents stated that they needed better technology to enable those in-the-moment decisions. Those coveted insights live at the end of a process lovingly known as the data pipeline.

Design and Deployment Considerations for Deploying Apache Kafka on AWS

Various factors can impede an organization's ability to leverage Confluent Cloud, ranging from data locality considerations to stringent internal prerequisites. For instance, specific mandates might dictate that data be confined within a customer's Virtual Private Cloud (VPC), or necessitate operation within an air-gapped VPC. However, a silver lining exists even in such circumstances, as viable alternatives remain available to address these specific scenarios.

Snowflake Schemas vs Star Schemas: 5 key differences

In the realm of data warehousing, star and snowflake schemas play crucial roles in organizing vast amounts of data efficiently. Both of these schemas offer unique advantages and cater to distinct requirements in the data processing landscape. Before diving into the details, let’s first provide a snapshot comparison to set the scene: Star schemas are more straightforward, while snowflake schemas are a more normalized version of star schemas.
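The normalization difference can be shown with a toy "product" dimension modeled both ways (table and field names here are hypothetical, chosen only to illustrate the contrast):

```python
# Star schema: one denormalized dimension table (category name repeated per row).
star_dim_product = [
    {"product_id": 1, "name": "Latte", "category_name": "Beverages"},
    {"product_id": 2, "name": "Muffin", "category_name": "Bakery"},
]

# Snowflake schema: the same dimension normalized into two related tables.
snow_dim_product = [
    {"product_id": 1, "name": "Latte", "category_id": 10},
    {"product_id": 2, "name": "Muffin", "category_id": 20},
]
snow_dim_category = {10: "Beverages", 20: "Bakery"}

def denormalize(products, categories):
    """Join the snowflake tables back into star-style rows."""
    return [
        {"product_id": p["product_id"], "name": p["name"],
         "category_name": categories[p["category_id"]]}
        for p in products
    ]

# Joining the normalized tables reproduces the star-schema rows.
assert denormalize(snow_dim_product, snow_dim_category) == star_dim_product
```

The trade-off is visible even at this scale: the star version answers queries without a join but repeats category names, while the snowflake version avoids redundancy at the cost of an extra join.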

Snowpark ML: The 'Easy Button' for Open Source LLM Deployment in Snowflake

Companies want to train and use large language models (LLMs) with their own proprietary data. Open source generative models such as Meta’s Llama 2 are pivotal in making that possible. The next hurdle is finding a platform to harness the power of LLMs. Snowflake lets you apply near-magical generative AI transformations to your data all in Python, with the protection of its out-of-the-box governance and security features.

The Hidden Costs of Embedded Analytics: A Pricing Comparison

Embedded analytics solutions have become increasingly popular in recent years, as more organizations across multiple sectors recognize the value of integrating advanced business intelligence (BI) and analytical capabilities into their existing software applications. Such solutions allow for deeper insights from data to make more informed decisions - this is well established.

Globe Group Slashes Infra Costs and Fuels Personalized Marketing With Confluent

Globe Group's batch-based processing systems and lack of access to self-service data were slowing them down, making it difficult to harness real-time data and create the targeted marketing campaigns they needed to reach their customers.

Apache Ozone Odyssey | Exploring the Future of Scalable Storage with Apache Ozone

This collaborative meetup was designed to bring together individuals interested in exploring the basics of Apache Ozone. Expert Ozone developer Nandakumar Vadivelu guides you through the basics of setting up and configuring Ozone, as well as highlighting its key features and benefits. The session begins with an overview of Apache Ozone's fundamentals, then dives into its architecture and core components. This session is perfect for those who are new to Ozone or want to explore its potential as a highly scalable and efficient storage solution.

The Benefits, Challenges and Risks of Predictive Analytics for Your Application

In this modern, turbulent market, predictive analytics has become a key feature for analytics software customers. Predictive analytics refers to the use of historical data, machine learning, and artificial intelligence to predict what will happen in the future. This ability to analyze and predict future scenarios sets certain applications apart from the pack, offering application teams a significant advantage in a competitive market.

Installing MiNiFi agents has never been so easy!

This video walks you through one of the new features coming with Edge Flow Manager 1.6.0: the one-line installer command. Did you ever think that installing a MiNiFi (C++ or Java) agent was complicated? Did you ever struggle with generating and configuring the certificates for mTLS communication between the agents and Edge Flow Manager?

Creating a data-driven culture with self service and data literacy

In this segment, Geraldine Wong, CDO of GXS Bank, explains how her bank's data strategy aims to promote inclusion through superior data insights and AI, but achieving this requires building a data-driven culture by providing employees the right tools, access, and knowledge about the data.

Real-time Fraud Detection - Use Case Implementation

When it comes to fraud detection in financial services, streaming data with Confluent enables you to build the right intelligence, as early as possible, for precise and predictive responses. Learn how Confluent's event-driven architecture and streaming pipelines deliver a continuous flow of data, aggregated from wherever it resides in your enterprise, to whichever application or team needs to see it. Enrich each interaction, each transaction, and each anomaly with real-time context so your fraud detection systems have the intelligence to get ahead.

Designing Event-Driven Systems

Many forces affect software today: larger datasets, geographical disparities, complex company structures, and the growing need to be fast and nimble in the face of change. Proven approaches such as service-oriented (SOA) and event-driven architectures (EDA) are joined by newer techniques such as microservices, reactive architectures, DevOps, and stream processing. Many of these patterns are successful by themselves, but as this practical ebook demonstrates, they provide a more holistic and compelling approach when applied together.