Content tagged with "Machine Learning"

A beige analog compass by Ylanite Koppens

Introduction

Open machine learning research is undergoing something of a reproducibility crisis. In fairness, it’s not usually the authors’ fault, or at least not entirely. We’re a fickle industry, and the tools and frameworks that were ‘in vogue’ and state of the art a couple of years ago are now obsolete. Furthermore, academics and open source contributors are under no obligation to keep their code up to date. It is often left up to the reproducer to figure out how to breathe life back into older work.

Read more...

A jar of pickles by Ksenia Charnaya

I recently came across an infuriating problem where an MLFlow Python model I had trained on one system using Python 3.6 would not load on another system with an identical version of Python.

The exact problem was that when I ran mlflow models serve -m <url/to/model/in/bucket> the service would crash saying that the model could not be unserialized because ValueError: unsupported pickle protocol: 5.
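As a rough illustration of the error above (not necessarily the fix used in the full post): the standard library's pickle module only gained protocol 5 in Python 3.8, so an older interpreter's built-in pickle cannot read such a file. The sketch below uses only the standard library; the `model` dictionary and the `pickle_protocol` helper are hypothetical stand-ins, not part of MLFlow.

```python
import pickle


def pickle_protocol(data: bytes) -> int:
    """Return the protocol number declared at the start of a pickle stream.

    Streams written with protocol 2 or later begin with the PROTO opcode
    (byte 0x80) followed by a single byte holding the protocol number.
    """
    if len(data) >= 2 and data[0] == 0x80:
        return data[1]
    raise ValueError("no PROTO opcode: protocol 0/1 stream, or not a pickle")


# Hypothetical stand-in for a trained model object.
model = {"weights": [0.1, 0.2], "label": "demo"}

# Pin the protocol explicitly so that older interpreters can still load it.
portable = pickle.dumps(model, protocol=4)

print(pickle_protocol(portable))       # protocol declared in the stream
print(pickle.loads(portable) == model) # round-trip check
```

Checking the declared protocol of a saved file in this way can help confirm whether a protocol mismatch is really the cause before changing anything else.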

Read more...

MLFlow is a powerful open source MLOps platform with a built-in framework for serving your trained ML models as REST APIs. The REST framework will load data provided in a JSON or CSV format compatible with pandas and pass this directly into your model. This can be handy when your model is expecting a tabular list of numerical and categorical features. However, it is less clear how to serve models and pipelines that expect unstructured text data as their primary input. In this post we will explore how to train and then serve an NLP model using MLFlow, scikit-learn and spaCy.

Read more...

Introduction

When you’re working with large datasets, storing them in git alongside your source code is usually not an optimal solution. Git is famously not well suited to large files, and whilst general purpose solutions exist (Git LFS being perhaps the most famous and widely adopted), DVC is a powerful alternative that does not require a dedicated LFS server and can be used directly with a range of cloud storage systems as well as traditional NFS and SFTP-backed filestores.

Read more...

In an ever-growing digital landscape filled with more content than a person can consume in their lifetime, recommendation engines are a blessing but can also be a curse. Understanding their strengths and weaknesses is a vital skill as part of a balanced media diet.

If you remember when connecting to the internet involved a squawking modem and images that took 5 minutes to load, then you probably discovered your favourite musician after hearing them on the radio, reading about them in NME or being told about them by a friend. Likewise, you probably discovered your favourite TV show by watching live terrestrial TV, your favourite book by taking a chance at your local library and your favourite movie at a cinema. You only saw the movies that had cool TV ads or rave reviews – you couldn’t afford to take a chance on a dud when one ticket, plus bus fare, plus popcorn and a drink cost more than two weeks’ pocket money.

Read more...

In recent weeks and months the impending global climate catastrophe has been at the forefront of many people’s minds. Thanks to movements like Extinction Rebellion and high profile environmentalists like Greta Thunberg and David Attenborough, as well as damning reports from the IPCC, it finally feels like momentum is building behind a significant reduction of carbon emissions. That said, knowing how we can help on an individual level beyond driving and flying less still feels very overwhelming.

Read more...

As adoption of chatbots and conversational interfaces continues to grow, how will businesses keep their brand safe and their customers’ data safer?

From deliberate infiltration of systems to bugs that cause accidental data leakage, the exposure or loss of personal data is a large part of what occupies almost every self-respecting CIO’s mind these days, especially since the EU has just slapped its first defendant with a GDPR fine.

Read more...

As a machine learning professional specialising in computational linguistics (helping machines to extract meaning from human text), I have confused people on multiple occasions by suggesting that their document processing problem could be solved by neural networks trained using a Graphics Processing Unit (GPU). You’d be well within your rights to be confused. To the uninitiated, what I just said was “Let’s solve this problem involving reading lots of text by building a system that runs on specialised computer chips designed specifically to render images at high speed”.

Read more...