Brainsteam

Journalism is a Blurry JPEG of Science (And That's Usually Ok)

Published on February 18, 2023 by James Ravenscroft

Science Journalism is like a blurry image of science. Image courtesy of Joshua Sortino

I really enjoyed the recent New Yorker article by Ted Chiang¹ that draws an analogy between the way that lossy photo formats like JPEG store representations of images and the way that large language models store knowledge. I found this analogy to be very useful. It's a great way to describe the current state of these models to folks tangential to ML and NLP without dropping into transformer architecture and attention mechanisms. The post has also drawn some criticism² from scientists working in deep learning for not being “in keeping with our scientific understanding of LMs or deep learning”. Whilst Chiang may miss the finer strokes, the picture he paints is broadly representative. In a very meta way, his own work is like a blurry JPEG of how LLMs work. You might even consider that scientific journalism in general is like a blurry JPEG of scientific writing. I believe that in this context, such a broad metaphor is OK most of the time. Let me explain.

2023 - Week 4

Published on January 28, 2023 by James Ravenscroft

#personal #ai #music

It’s the end of week 4 of January 2023 already. I can’t believe how fast things are going - as usual. This week has definitely been the week that the January blues really hit me but at the same time, some good stuff has happened too.

On Friday afternoon I got this email:

An Email telling me that my thesis corrections are satisfactory and clearing me to submit my final thesis for binding at the university.

Published on November 20, 2022 by James Ravenscroft

#humour #ai

Reproducing 'ancient' experiments with Pytorch inside docker

Published on March 1, 2021 by James Ravenscroft

#machine-learning #python #ai #devops #mlops #work #phd #open source

A beige analog compass by Ylanite Koppens

Introduction

Open machine learning research is undergoing something of a reproducibility crisis. In fairness it’s not usually the authors’ fault - or at least not entirely. We’re a fickle industry and the tools and frameworks that were ‘in vogue’ and state of the art a couple of years ago are now obsolete. Furthermore, academics and open source contributors are under no obligation to keep their code up to date. It is often left up to the reproducer to figure out how to breathe life back into older work.

Pickle 5 Madness with MLFlow and Python 3.6/3.7

Published on January 14, 2021 by James Ravenscroft

#machine-learning #python #ai #devops #mlops #work #open source

I recently came across an infuriating problem where an MLflow Python model I had trained on one system using Python 3.6 would not load on another system with an identical version of Python.

The exact problem was that when I ran mlflow models serve -m <url/to/model/in/bucket> the service would crash saying that the model could not be unserialized because ValueError: unsupported pickle protocol: 5.
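For context, pickle protocol 5 was added to the standard library in Python 3.8, so the stock pickle module in Python 3.6 and 3.7 can only read protocols 0 to 4. The following is a minimal sketch, not taken from the original post, of how one might check which protocol a pickled payload declares and write a payload that older interpreters can read. The helper name pickle_protocol and the sample payload are hypothetical.

```python
import pickle
import pickletools


def pickle_protocol(data: bytes) -> int:
    """Return the protocol number declared at the start of a pickle.

    Pickles written with protocol 2 or later begin with a PROTO opcode
    whose argument is the protocol number. Protocols 0 and 1 have no
    such header, so this helper (a hypothetical name) reports 0 for them.
    """
    opcode, arg, _pos = next(pickletools.genops(data))
    if opcode.name == "PROTO":
        return arg
    return 0


# Hypothetical payload standing in for a serialized model.
payload = {"weights": [0.1, 0.2], "label": "demo"}

# Pinning protocol=4 keeps the bytes readable by Python 3.4 and later.
portable = pickle.dumps(payload, protocol=4)

print(pickle_protocol(portable))
```

Inspecting the header this way only reveals how the file was written. Whether a given model can be re-saved at a lower protocol depends on the library that serialized it.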

Serving NLP Models with MLflow

Published on December 29, 2020 by James Ravenscroft

#machine-learning #python #ai #devops #mlops #nlp #spacy #work #open source

MLflow is a powerful open source MLOps platform with a built-in framework for serving your trained ML models as REST APIs. The REST framework will load data provided in a JSON or CSV format compatible with pandas and pass this directly into your model. This can be handy when your model is expecting a tabular list of numerical and categorical features. However, it is less clear how to serve models and pipelines that are expecting unstructured text data as their primary input. In this post we will explore how to train and then serve an NLP model using MLflow, scikit-learn and spaCy.

‘Dark’ Recommendation Engines: Algorithmic curation as part of a ‘healthy’ information diet.

Published on September 4, 2020 by James Ravenscroft

#machine-learning #ai #work

In an ever-growing digital landscape filled with more content than a person can consume in their lifetime, recommendation engines are a blessing but can also be a curse. Understanding their strengths and weaknesses is a vital skill as part of a balanced media diet.

If you remember when connecting to the internet involved a squawking modem and images that took 5 minutes to load, then you probably discovered your favourite musician after hearing them on the radio, reading about them in NME or being told about them by a friend. Likewise, you probably discovered your favourite TV show by watching live terrestrial TV, your favourite book by taking a chance at your local library and your favourite movie at a cinema. You only saw the movies that had cool TV ads or rave reviews – you couldn’t afford to take a chance on a dud when one ticket, plus bus fare, plus popcorn and a drink cost more than two weeks’ pocket money.

How can AI practitioners reduce our carbon footprint?

Published on June 20, 2019 by James Ravenscroft

#AI #climate catastrophe #machine learning #nlp #work

In recent weeks and months the impending global climate catastrophe has been at the forefront of many people’s minds. Thanks to movements like Extinction Rebellion and high profile environmentalists like Greta Thunberg and David Attenborough, as well as damning reports from the IPCC, it finally feels like momentum is building behind significant reduction of carbon emissions. That said, knowing how we can help on an individual level beyond driving and flying less still feels very overwhelming.

Applied AI in 2019

Published on January 6, 2019 by James Ravenscroft

#AI #Work

Looking back at some of the biggest AI and ML developments from 2018 and how they might influence applied AI in the coming year.

2018 was a pretty exciting year for AI developments. It’s true to say there is still a lot of hype in the space but it feels like people are beginning to really understand where AI can and can’t help them solve practical problems.

Re-using machine learning models and the “no free lunch” theorem

Published on March 21, 2018 by James Ravenscroft

#machine-learning #ai #work

Why re-use machine learning models?

Model re-use can be a huge cost saver when developing AI systems. But how well will your models perform in their new environment?

Content tagged with "Ai"
