Snowflake's Arctic-TILT: A State-of-the-Art Document Intelligence LLM in a Single A10 GPU

The volume of unstructured data — such as PDFs, images, video and audio files — is surging across enterprises today. Yet documents, which represent a substantial portion of this data and hold significant value, continue to be processed through inefficient and manual methods.

Top Data Governance Tools for 2024

According to Gartner, 80% of companies worldwide are expected to have efficient data management systems in place by 2025. This projection highlights the growing recognition of data governance tools as essential enablers for maintaining and enhancing the quality and security of organizational data within these data management systems. In this blog, we will talk about some of the best data governance tools and software to consider in 2024.

Introducing Confluent Cloud Freight Clusters

We’re excited to introduce Freight clusters, a new type of Confluent Cloud cluster designed for high-throughput, relaxed-latency workloads that is up to 90% cheaper than self-managing open source Apache Kafka®. Freight clusters use the latest innovations in Kora, Confluent Cloud’s cloud-native engine, to deliver low-cost networking by trading off ultra-low-latency performance.

Improving LLM Accuracy & Performance - MLOps Live #28 with Databricks

Watch session #28 in our MLOps Live Webinar Series, featuring Databricks, where we discuss improving LLM accuracy and performance. Hear Margaret Amori (Databricks), Vijay Balasubramaniam (Databricks), and Yaron Haviv (Iguazio) share best practices and pragmatic advice on improving the accuracy and performance of LLMs while mitigating challenges such as risk and escalating costs. See real examples, including techniques for overcoming common challenges with tools such as Databricks Mosaic AI and Databricks' new open LLM, DBRX.

Snowflake Data Clean Rooms for Marketing

In less than 5 minutes, Ankur Abhishek, Senior Product Manager at Snowflake, demonstrates how Snowflake Data Clean Rooms can be used for audience overlap, audience lookalike, and attribution analysis. As Kamakshi Sivaramakrishnan, Senior Director of Product Management at Snowflake, explains, "This is the full marketing lifecycle brought in its entirety in a Snowflake clean room, run securely with multiple parties collaborating with each other. This is demystifying clean rooms."

Data Integrity vs. Data Quality: Here's How They Are Different

Data integrity refers to protecting data from anything that can harm or corrupt it, whereas data quality measures whether the data is useful for its intended purpose. Data quality is a subset of data integrity. Data can be accurate, consistent, and error-free, yet it becomes useful only once the supporting information for it is available. The two terms are sometimes used interchangeably in data management, but they have different implications and play distinct roles in enhancing data usability.

What is Data Preprocessing? Definition, Importance, and Steps

Did you know data scientists spend around 60% of their time preprocessing data? Data preprocessing plays a critical role in enhancing the reliability and accuracy of analytics. This blog will discuss why data preprocessing is essential for making data suitable for comprehensive analysis.