Data Engineering with Data Hub on CDP
Today I'm going to show you how to run Data Engineering workloads on Cloudera Data Hub. First, we'll deploy a Data Hub cluster with Zeppelin and Spark. Then I'll show you an example of a PySpark job that reads data from S3. After that, we'll run another PySpark job that reads data from Hive.