Open Lakehouse Meetup (ft. Apache Iceberg): Building Scalable Data Platforms
Explore where the data lakehouse is heading in this deep dive into Apache Iceberg V3 and V4 from the Bengaluru community meetup. Learn how PyIceberg and DuckDB enable Python-native data processing that, per the talk, removes the need for a Spark cluster at 99% of common query sizes. See high-performance ingestion benchmarks from Oleg and the Google Dataproc Lightning Engine, which reach over 500k rows/sec using Apache Arrow and C++ vectorization. The session also covers metadata compaction, REST catalogs, and building vendor-agnostic data platforms, making it a practical reference for data engineers.