Systems | Development | Analytics | API | Testing

Google BigQuery

Best practices of migrating Hive ACID Tables to BigQuery

Are you looking to migrate a large amount of Hive ACID tables to BigQuery? ACID enabled Hive tables support transactions that accept updates and delete DML operations. In this blog, we will explore migrating Hive ACID tables to BigQuery. The approach explored in this blog works for both compacted (major / minor) and non-compacted Hive tables. Let’s first understand the term ACID and how it works in Hive. ACID stands for four traits of database transactions.

Zero-ETL approach to analytics on Bigtable data using BigQuery

Modern businesses are increasingly relying on real-time insights to stay ahead of their competition. Whether it's to expedite human decision-making or fully automate decisions, such insights require the ability to run hybrid transactional analytical workloads that often involve multiple data sources. BigQuery is Google Cloud’s serverless, multi-cloud data warehouse that simplifies analytics by bringing together data from multiple sources.

No pipelines needed. Stream data with Pub/Sub direct to BigQuery

Pub/Sub’s ingestion of data into BigQuery can be critical to making your latest business data immediately available for analysis. Until today, you had to create intermediate Dataflow jobs before your data could be ingested into BigQuery with the proper schema. While Dataflow pipelines (including ones built with Dataflow Templates) get the job done well, sometimes they can be more than what is needed for use cases that simply require raw data with no transformation to be exported to BigQuery.

Scalable Python on BigQuery using Dask and NVIDIA GPUs

BigQuery is Google Cloud’s fully managed serverless data platform that supports querying using ANSI SQL. BigQuery also has a data lake storage engine that unifies SQL queries with other open source processing frameworks such as Apache Spark, Tensorflow, and Dask. BigQuery storage provides an API layer for OSS engines to process data. This API enables mixing and matching programming in languages like Python with structured SQL in the same data platform.

Performance considerations for loading data into BigQuery

It is not unusual for customers to load very large data sets into their enterprise data warehouse. Whether you are doing an initial data ingestion with hundreds of TB of data or incrementally loading from your systems of record, performance of bulk inserts is key to quicker insights from the data. The most common architecture for batch data loads uses Google Cloud Storage(Object storage) as the staging area for all bulk loads.

Built with BigQuery: How Exabeam delivers a petabyte-scale cybersecurity solution

Exabeam, a leader in SIEM and XDR, provides security operations teams with end-to-end Threat Detection, Investigation, and Response (TDIR) by leveraging a combination of user and entity behavioral analytics (UEBA) and security orchestration, automation, and response (SOAR) to allow organizations to quickly resolve cybersecurity threats.

Now in preview, BigQuery BI Engine Preferred Tables

Earlier in the quarter we had announced that BigQuery BI Engine support for all BI and custom applications was generally available. Today we are excited to announce the preview launch of Preferred Tables support in BigQuery BI Engine! BI Engine is an in-memory analysis service that helps customers get low latency performance for their queries across all BI tools that connect to BigQuery.