Some of the Top SQL-on-Hadoop Tools with Pros and Cons

The Hadoop ecosystem now serves as a comfortable home for Big Data, and Hadoop data stores have gained wide acceptance among programmers, developers, data scientists, and database management experts around the world. These ecosystems are convenient for data storage; however, Hadoop's built-in reporting system poses a few challenges for users to overcome.

Distributed model training using Dask and Scikit-learn

The theoretical bases for machine learning have existed for decades, yet it wasn’t until the early 2000s that the last AI winter came to an end. Since then, interest in and use of machine learning have exploded, and its development has been largely democratized. Perhaps not so coincidentally, the same period saw the rise of Big Data, carrying with it the distributed data storage and distributed computing capabilities made popular by the Hadoop ecosystem.

How Keboola benefits from using Keboola Connection

The shoemaker (often) goes barefoot. It is often the case that, while working hard to help their customers get better, people neglect their own processes, taking the same shortcuts they warn their clients against. It was like that at Keboola a few years back, until we agreed that this was no longer acceptable and created a job role (mine) to apply our teachings internally as well.

What is happening in augmented analytics?

Augmented analytics is when you take what was traditionally a very manual workflow and automate it. This gives you the ability to analyze data far more rapidly using machines and to package up changes for humans to interpret. Essentially you’re augmenting a human experience: rather than spending all your time looking for a needle in the haystack, you let the machine find the needle and hand it to you. By bringing the human and the machine together, you can create something very special and deliver that to an end user.