Enabling distributed NLP research at SIL

In my main position, as a data scientist at SIL International, I work on expanding language possibilities with AI. In practice, this includes applying recent advances in Natural Language Processing (NLP) to low-resource and multilingual contexts. We work on things like spoken language identification, multilingual dialogue systems, machine translation, and translation quality estimation.

Modern Software Needs Modern Testing: The Test Toolchain, AI, and Risk-Based Thinking

Web and mobile apps are now organizations’ primary connection with their customers. Staying relevant and winning market share requires that firms can make constant changes to these apps. However, can organizations deploy many more small changes - often many per day - with confidence and with managed risk? We'll take a closer look at how a modern testing toolchain combines production safety nets - from canaries, to feature flags, to error reporting - with pre-production intent validation tools for both developers and quality assurance/quality engineering. We'll also see how AI can be used to measure, predict, and limit the risk of a change.

The Rise of Unstructured Data

The word “data” is ubiquitous in narratives of the modern world. And data, the thing itself, is vital to the functioning of that world. This blog discusses quantifications, types, and implications of data. If you’ve ever wondered how much data there is in the world, what types there are and what that means for AI and businesses, then keep reading!

The Evolution of Data, Analytics, and AI - All in Less Than an Hour!

Much of my focus over the last couple of decades has been in analytics, big data, and AI, and Joe DosSantos and I discussed the progression of these fields over time in a recent Data Brilliant podcast episode. My subtitle for that episode might be, “The Promise and Perils of a Hot New Field,” as we addressed several aspects of how these popular concepts have evolved in the first fifth of the 21st century.

The Modern Data Stack Ecosystem - Fall 2021 Edition

In our previous article, The Future of the Modern Data Stack, we examined the motivations behind the modern data stack and its current state, and looked optimistically into the future to see where it is headed. If you’re new to the modern data stack, we highly recommend giving that article a read. A question we often get from new adopters of the modern data stack is “What tech should we be looking into?”

ClearML-Data Lemonade: getting local datasets quickly and easily

Congratulations on creating a clean(ish) dataset to use for training! Now, while the dataset is stored where it’s accessible to everyone, the distribution itself is a hassle! Local workstations, local GPU machines, and cloud machines (that may be spun up and down without disk persistence) are all pulling data from everywhere… and to say it is annoying is an understatement!

Operationalizing AI: Lessons from the Field

A casual stroll through tech headlines from the past few years makes two things abundantly clear: investment in AI is at an all-time high, and companies really struggle to get value out of AI technology. At first glance, these ideas seem to be at odds with each other: why consider investing in a field that hasn’t lived up to the hype? If you dig into the details, you’ll notice that a gap exists between the development and production use of AI in many companies.