In 2010, Eric Schmidt, then CEO of Google, made the startling claim that every two days we humans generate as much information as we did from the dawn of civilization through 2003, or about five exabytes of data. At the time, we had terabyte (TB) disk drives and could only imagine an exabyte, which is one million terabytes. The next increments after the terabyte are the petabyte and the exabyte, followed by the zettabyte, which is 1,000 exabytes. By the end of 2010, the world had crossed the zettabyte threshold.
In a previous article, we talked about the lost art of questioning and its importance when working with data and information to find actionable insights. In this article, we will expand on this topic and explain how questioning differs depending on which stage you have reached in the process of transforming data and information into insights.
At Talend, we tend to describe poorly organized, unhealthy data as “digital landfills.” But we don’t often talk about actual landfills. That’s right, the ones filled with trash. As anyone watching real estate prices will know, land is a finite resource. It’s crazy to think that we’re still dedicating land to storing our garbage, where it will sit releasing pollutants and greenhouse gases for decades to come.
At Mercado Libre, we are obsessed with unlocking the power and potential of data. One of our key cultural principles is to have a Beta Mindset. This means that we operate in a “state of beta”, constantly asking new questions of our data, experimenting with technologies and iterating our business operations in service of creating the best experiences for our customers.
Without a central place to manage models, those responsible for operationalizing ML models have no way of knowing the overall status of trained models and data. This lack of manageability can impact the review and release process of models into production, which often requires offline reviews with many stakeholders.
At the DataOps Unleashed 2022 virtual conference, AWS Principal Solutions Architect Angelo Carvalho presented “How AWS & Unravel help customers modernize their Big Data workloads with Amazon EMR.” The full session recording is available on demand, but here are some of the highlights.