Systems | Development | Analytics | API | Testing

How to Accelerate HuggingFace Throughput by 193%

Deploying models is becoming easier every day, especially thanks to excellent tutorials like Transformers-Deploy. It talks about how to convert and optimize a Huggingface model and deploy it on the Nvidia Triton inference engine. Nvidia Triton is an exceptionally fast and solid tool and should be very high on the list when searching for ways to deploy a model. Our developers know this, of course, so ClearML Serving uses Nvidia Triton on the backend if a model needs GPU acceleration.

How to Do Data Labeling, Versioning, and Management for ML

It has been months ago when Toloka and ClearML met together to create this joint project. Our goal was to showcase to other ML practitioners how to first gather data and then version and manage data before it is fed to an ML model. We believe that following those best practices will help others build better and more robust AI solutions. If you are curious, have a look at the project we have created together.

How to Distribute Machine Learning Workloads with Dask

Tell us if this sounds familiar. You’ve found an awesome data set that you think will allow you to train a machine learning (ML) model that will accomplish the project goals; the only problem is the data is too big to fit in the compute environment that you’re using. In the day and age of “big data,” most might think this issue is trivial, but like anything in the world of data science things are hardly ever as straightforward as they seem.

Power Your Lead Scoring with ML for Near Real-Time Predictions

Every organization wants to identify the right sales leads at the right time to optimize conversions. Lead scoring is a popular method for ranking prospects through an assessment of perceived value and sales-readiness. Scores are used to determine the order in which high-value leads are contacted, thus ensuring the best use of a salesperson’s time. Of course, lead scoring is only as good as the information supplied.

Building an Automated ML Pipeline with a Feature Store Using Iguazio & Snowflake

When operationalizing machine and deep learning, a production-first approach is essential for moving from research and development to scalable production pipelines in a much faster and more effective manner. Without the need to refactor code, add glue logic and spend significant efforts on data and ML engineering, more models will make it to production and with less issues like drift.

How To Deploy a HuggingFace Model (Seamlessly)

What if I want to serve a Huggingface model on ClearML? Where do I start? In general, machine learning engineers know by now that a good model serving engine is invaluable when serving models in production. These days, NVIDIA’s Triton inference engine is a popular option to do so, but it is lacking in some respects.