Machine learning is more accessible than ever, with datasets available online and Jupyter notebooks providing an easy way to explore data and train models. When building a model, we often forget that it will eventually be incorporated into an application that provides value to the user. We therefore wanted to demonstrate how the models we build can actually be used in an application.
GEODIS Distribution & Express, a subsidiary of GEODIS, is the leader in France for reliable last-mile delivery service (deliveries within 24 to 48 hours). In 2020 alone, its 115 agencies handled 100 million parcels and carried out 5,000 rounds per day in more than 35 countries across Europe. Nathalie Mandjee, Business Intelligence Manager at GEODIS Distribution & Express, discovered that this part of the company was growing into a profit center for the larger business.
Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. Administrators, developers, and data engineers who work with Kafka clusters often struggle to understand what is happening inside their Kafka implementations.
In part 1 of this blog, we discussed how Cloudera DataFlow for the Public Cloud (CDF-PC), the universal data distribution service powered by Apache NiFi, makes it easy to acquire data wherever it originates and move it efficiently so that it is available to other applications in a streaming fashion.
The sheer quantity and diversity of data sources make today’s landscape strikingly different, and that difference calls for a new set of tools.