At Mercado Libre, we are obsessed with unlocking the power and potential of data. One of our key cultural principles is to have a Beta Mindset. This means that we operate in a “state of beta”, constantly asking new questions of our data, experimenting with technologies, and iterating on our business operations to create the best experiences for our customers.
Without a central place to manage models, those responsible for operationalizing ML models have no way of knowing the overall status of trained models and data. This lack of manageability can impact the review and release process of models into production, which often requires offline reviews with many stakeholders.
You’ve spent hours trying, without success, to figure out why users are unable to complete the payment step on your website. You and your fellow developers are searching through errors and logs from production, but everyone reports “it works on my machine.” Meanwhile, your company continues to lose money on every user who cannot complete a purchase.
At the DataOps Unleashed 2022 virtual conference, AWS Principal Solutions Architect Angelo Carvalho presented “How AWS & Unravel help customers modernize their Big Data workloads with Amazon EMR.” The full session recording is available on demand, but here are some of the highlights.
As the size of a software project grows, so does the complexity of integrating changes made by multiple developers and resolving conflicts and other issues as they arise. Quality control can also become progressively more difficult without proper management of the build pipeline. Automated builds are the standard solution to this problem across the industry. Understanding build automation in detail is a valuable skill for any developer, no matter the size of their team.
First, we collect data from an existing Kafka stream into an Iguazio time series table. Next, we visualize the stream with a Grafana dashboard. Finally, we access the data in a Jupyter notebook using Python code. To do the ingestion, we use a Nuclio serverless function that “listens” to the Kafka stream and writes each event into our time series table. Iguazio gets you started with a Kafka-to-time-series template.
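To make the ingestion step concrete, here is a minimal sketch of what such a function could look like. It uses the `handler(context, event)` entry point that Nuclio Python functions expose, but everything else is an assumption for illustration: the message fields (`timestamp`, `sensor`, `value`), the row layout, and the in-memory `TABLE` list, which stands in for the real time series table. This is not the Iguazio template itself.

```python
import json
from datetime import datetime, timezone
from types import SimpleNamespace

# Hypothetical in-memory stand-in for the time series table. A real function
# would write each row to the Iguazio table through its client library.
TABLE = []


def to_row(body):
    """Convert one Kafka message body (assumed to be JSON) into a row."""
    if isinstance(body, (bytes, bytearray)):
        body = body.decode("utf-8")
    record = json.loads(body) if isinstance(body, str) else body
    # Assumed fields: "timestamp" (epoch seconds), "sensor", "value".
    ts = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
    return {
        "time": ts,
        "labels": {"sensor": record["sensor"]},
        "value": float(record["value"]),
    }


def handler(context, event):
    """Nuclio-style entry point, invoked once per event from the trigger."""
    TABLE.append(to_row(event.body))
    return "ingested"


# Local usage: simulate a single Kafka event without a running platform.
fake_event = SimpleNamespace(
    body=b'{"timestamp": 0, "sensor": "s1", "value": "3.5"}'
)
handler(None, fake_event)
```

Keeping the parsing in `to_row` separate from the handler means the conversion logic can be exercised locally, and only the write call needs to change when the function is pointed at a real table.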