Impala

Repository

2011

Apache 2.0

Website

Project Description

Lightning-fast, distributed SQL queries for petabytes of data stored in Apache Hadoop clusters.

Impala is a modern, massively distributed, massively parallel C++ query engine that lets you analyze, transform, and combine data from a variety of data sources. Its main features:

Best-of-breed performance and scalability.
Support for data stored in HDFS, Apache HBase and Amazon S3.
Wide analytic SQL support, including window functions and subqueries.
On-the-fly code generation with LLVM, producing CPU-efficient code tailored to each individual query.
Support for the most commonly used Hadoop file formats, including Apache Parquet.
Apache-licensed, 100% open source.
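
As a rough illustration of the analytic SQL mentioned above, the sketch below combines a window function with a subquery to find the top sale in each region. It is only a sketch under stated assumptions: it runs against Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in engine rather than against Impala, and the `sales` table, its columns, and the sample rows are all hypothetical. The query itself uses standard SQL constructs of the kind the feature list refers to.

```python
import sqlite3

# Stand-in engine: in-memory SQLite, NOT Impala.
# The table name, column names, and sample data are hypothetical.
SAMPLE_ROWS = [
    ("east", "ann", 300),
    ("east", "bob", 500),
    ("west", "cy", 200),
    ("west", "dee", 150),
]

# A window function (RANK ... OVER) inside a subquery, filtered by the outer query.
TOP_SALE_QUERY = """
    SELECT region, rep, amount
    FROM (
        SELECT region, rep, amount,
               RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
        FROM sales
    ) AS ranked
    WHERE rnk = 1
    ORDER BY region
"""


def top_sale_per_region(rows):
    """Return the highest-amount row(s) for each region, ordered by region."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        return conn.execute(TOP_SALE_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in top_sale_per_region(SAMPLE_ROWS):
        print(row)
```

The inner query ranks rows within each region, and the outer query keeps only the top-ranked rows. Ties within a region would all be returned, since `RANK()` assigns equal ranks to equal amounts.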

Links

Download source code as a [.zip file] or a [.tar.gz file]
Documentation: [README]
