February 2020

4 Big Data Riddles: The Straggler, the Slacker, the Fatso, and the Heckler

This article discusses four bottlenecks in Big Data applications and introduces a number of tools, some of them new, for identifying and removing them. These bottlenecks can occur in any framework, but the article puts particular emphasis on Apache Spark and PySpark.

Unravel Introduces Workload Migration and Cost Analytics Solution for Azure Databricks, now available on Azure Marketplace

Fresh off a new funding round that includes strategic cloud partner Microsoft, Databricks continues to make huge strides in its mission to ease Spark complexity and simplify analytics through its Unified Analytics Platform. Databricks has also graduated from “visionary” to “leader” in the 2020 Gartner Magic Quadrant for Data Science and Machine Learning Platforms.

Data Structure Zoo

Solving a problem programmatically often involves grouping data items together so they can be conveniently operated on or copied as a single unit; the items are collected in a data structure. Many different data structures have been designed over the past decades. Some store individual items like phone numbers, while others store more complex objects like name/phone number pairs. Each has strengths and weaknesses and is more or less suitable for a specific use case.
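As a rough illustration of the distinction between individual items and name/phone number pairs, here is a minimal Python sketch. The names, the phone numbers, and the helper functions `has_number` and `lookup` are hypothetical and chosen only for this example; they are not taken from the article.

```python
# Hypothetical sample data, for illustration only.

# A list stores individual items: here, bare phone numbers.
phone_numbers = ["555-0100", "555-0101", "555-0102"]

# A dict stores pairs: here, each name is mapped to a phone number.
phone_book = {
    "Alice": "555-0100",
    "Bob": "555-0101",
    "Carol": "555-0102",
}


def has_number(numbers, number):
    """Check membership by scanning the list item by item."""
    return number in numbers


def lookup(book, name):
    """Return the number stored under name, or None if the name is absent."""
    return book.get(name)


print(has_number(phone_numbers, "555-0101"))
print(lookup(phone_book, "Bob"))
print(lookup(phone_book, "Dave"))
```

The list is enough when only the numbers themselves matter, but answering "what is Bob's number?" needs the pairing that the dict provides. That trade-off between simplicity and the kind of question a structure can answer is one example of a structure being more or less suitable for a use case.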