Looking back at some of the biggest AI and ML developments from 2018 and how they might influence applied AI in the coming year.

2018 was a pretty exciting year for AI developments. There is still a lot of hype in the space, but it feels like people are beginning to really understand where AI can and can’t help them solve practical problems.

In this article we’ll take a look at some of the AI innovations that came out of academia and research teams in 2018 and how they might affect practical AI use cases in the coming year.

More Accurate and Faster-to-Train NLP Models with Pre-trained Language Models

Imagine if, instead of going to school and university, you could be given a microchip implant that teaches you most of what you need to know about your subject of choice. You’d still need to learn by doing when you landed a job with your “instant” degree and fine-tune the knowledge that had been given to you, but hey, we’re talking about 6-12 months of learning instead of 12-18 years. That’s the essence of what Transfer Learning is all about within the Machine Learning and Deep Learning space.

BERT is a state-of-the-art neural NLP model unveiled by Google in November 2018. Like a number of other models unveiled in 2018, such as ELMo and ULMFiT, it can be pre-trained on unlabelled text (think news articles, contracts and legal terms, research papers or even Wikipedia) and then used to support supervised/labelled tasks that require much smaller volumes of training data than an end-to-end supervised task. For example, we might want to automate trading of stocks and shares based on sentiment about companies in the news. In the old days we’d have to spend weeks having armies of annotators read news articles and highlight companies and indicators of sentiment. A pre-trained language model may already have captured the underlying semantic relationships needed to understand company sentiment, so we only need to annotate a fraction of the data that we would if we were training from scratch.
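To make that concrete, here’s a minimal fine-tuning sketch using the Hugging Face transformers library and PyTorch (the two-example sentiment dataset is purely illustrative): instead of training a language model from scratch, we load a pre-trained BERT and fit a small classification head on our own labelled examples.

```python
# Minimal fine-tuning sketch (not a production recipe): take a pre-trained BERT
# encoder and fit a sentiment classifier on a handful of labelled examples.
# Assumes the Hugging Face "transformers" library and PyTorch are installed.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Tiny illustrative dataset -- in practice a few thousand labelled examples,
# rather than the millions an end-to-end supervised model might need.
texts = ["Acme Corp shares surge after strong results", "Acme Corp faces fraud probe"]
labels = torch.tensor([1, 0])  # 1 = positive sentiment, 0 = negative

inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):  # a few passes is usually enough when fine-tuning
    optimizer.zero_grad()
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
```

The heavy lifting of learning the structure of language has already been done during pre-training; the loop above only nudges the model towards our specific task.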

Of course, another benefit of using pre-trained models is reduced training time and compute resources (read: server and energy costs). Like half-baked bread, the model still needs some time in the oven to make the connections it needs to perform its final task, but this is a relatively short amount of time compared to training from scratch.

In 2019 we’ll be training lots of NLP models a lot more quickly and effectively thanks to these techniques.

Photo-realistic Image Creation

For those unfamiliar with Generative Adversarial Networks (GANs), we’re talking about unsupervised neural models that can learn to generate photo-realistic images of people, places and things that don’t really exist. Let that sink in for a moment!
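To give a feel for the mechanics, here’s a heavily stripped-down PyTorch sketch of the adversarial training idea (a toy generator and discriminator over made-up vectors, nowhere near a real image GAN): the generator learns to turn random noise into samples, the discriminator learns to tell those samples apart from real data, and the two improve by competing.

```python
# Toy sketch of adversarial training (illustrative only -- nothing like
# StyleGAN's architecture): a generator maps noise to samples while a
# discriminator learns to separate generated samples from real ones.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64  # toy sizes; real image GANs are far larger

generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(32, data_dim)    # stand-in for a batch of real images
    fake = generator(torch.randn(32, latent_dim))

    # Discriminator step: real samples should score 1, generated samples 0.
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + loss_fn(
        discriminator(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator score fakes as real.
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```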

Originally invented by Ian Goodfellow and colleagues in 2014, GANs back then could only generate small, pixelated likenesses, but they’ve come a long way. StyleGAN is a paper from a research team at NVIDIA that came out in December 2018 and might have slipped under your radar in the festive mayhem of that month. However, StyleGAN represents some serious progress in generated photo-realism.

Firstly, StyleGAN can generate images up to 1024×1024 pixels. That’s still not huge in terms of modern photography, but Instagram pictures are 1080×1080 and most social media networks will chop your images down to this kind of ballpark to save bandwidth, so we’re pretty close to having GANs that can generate social-media-ready images.

The second major leap represented by StyleGAN is the ability to exercise tight control over the style of the image being generated. Previous GAN implementations generated their images at random. StyleGAN uses parameters to control the styles of the output images, changing things like hair colour, whether or not the person is wearing glasses, and other physical properties.
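StyleGAN’s actual mechanism feeds a mapped latent vector into the generator at several different resolutions, but the general idea of steering attributes through a latent code can be sketched loosely like this (the generator and the “glasses” direction below are hypothetical placeholders, not the real StyleGAN API):

```python
# Loose conceptual sketch of latent-space attribute control (not StyleGAN's
# actual style mechanism): nudging a latent code along a direction associated
# with an attribute changes that attribute while leaving the rest of the
# image largely alone. The generator and direction here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(512)                  # latent code for one generated face
glasses_direction = rng.standard_normal(512)  # in practice, learned from examples
glasses_direction /= np.linalg.norm(glasses_direction)

for strength in (-2.0, 0.0, 2.0):             # sweep the attribute from "off" to "on"
    styled_z = z + strength * glasses_direction
    # image = generator(styled_z)             # hypothetical pre-trained generator
```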

Figure 8 from the StyleGAN paper shows how manipulating one of the model parameters can give fine-grained control over the appearance of the output.

Brands and digital marketing agencies are already seeing huge success with CGI brand influencers on Instagram. GANs that can be tightly controlled in order to position items, clothing and products in the image could be the next logical evolution of these kinds of accounts.

In 2019 we think fake photos could be the next big thing in digital media.

Hyper-Realistic Voice Assistants

In May 2018 Google showed us a glimpse of Google Duplex, an extension of their assistant product that used hyper-realistic speech recognition and generation to phone a hairdresser and schedule an appointment. There were a few well-argued and important ethical concerns about having AIs pretend to be humans. However, hyper-realistic voice assistant tech is coming.

There are huge advantages to these approaches, not just for end consumers, but for businesses too. Many businesses already have chatbots that allow users to chat to them via WhatsApp or Facebook, and there are early adopters building voice skills for Google Home and Amazon Alexa. Humans are always going to be a necessary and important part of customer interaction handling, since machines will always make mistakes and need re-training. Nonetheless, by handling the most common customer questions, automation can reduce the stress and strain on contact-center operators at peak times and free humans up to deal with the more interesting enquiries.
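As a flavour of how simple the building blocks can be, here’s a minimal sketch of a voice-skill fulfilment handler for one common customer question, using the raw Alexa response JSON format (the intent name and wording are purely illustrative): answer the easy question automatically and hand everything else to a human.

```python
# Minimal sketch of an Alexa skill fulfilment handler (e.g. on AWS Lambda).
# The intent name and reply text are illustrative, not from a real deployment.
def lambda_handler(event, context):
    intent = event.get("request", {}).get("intent", {}).get("name", "")

    if intent == "OpeningHoursIntent":  # hypothetical intent for a common question
        speech = "We're open from nine to five, Monday to Friday."
    else:
        speech = "Let me put you through to one of our team."  # hand off to a human

    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }
```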

In 2019 we expect the voice interface revolution to continue to pick up pace.

GDPR and Model Interpretability

OK, so I’m cheating a bit here since GDPR was not a technical AI/ML development but a legal one. In May 2018, GDPR came into force across Europe, and since the internet knows no borders, most web providers internationally started to adopt GDPR best practices like asking you if it’s OK to track your behavior with cookies and telling you what data they store about you.

GDPR also grants individuals the following right:

not to be subject to a decision, which may include a measure, evaluating personal aspects relating to him or her which is based solely on automated processing and which produces legal effects concerning him or her or similarly significantly affects him or her, such as automatic refusal of an online credit application or e-recruiting practices without any human intervention.

GDPR Recital 71

This isn’t a clear-cut right to an explanation for all automated decisions, but it does mean that extra diligence should be carried out where possible in order to understand automated decisions that affect users’ legal rights. As the provision says, this could massively affect credit scoring bureaus and e-recruitment firms, but it could also affect car insurance firms that use telematics data as part of their decision making process when paying out for claims, or retailers that use algorithms to decide whether to accept returned digital or physical goods.

In 2018, best practice for model interpretability lay in training a “meta model” (sometimes called a surrogate model) that sits on top of your highly accurate deep neural network and tries to guess which features of the data caused it to make a particular decision. These meta-models are normally simple in implementation (e.g. shallow decision trees) so that they themselves can be directly inspected and interpreted.
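Here’s a minimal sketch of that surrogate approach using scikit-learn (the dataset is synthetic, purely for illustration): train the accurate-but-opaque model as normal, then fit a shallow decision tree to mimic its predictions so the tree’s rules give a human-readable approximation of the decision logic.

```python
# Minimal surrogate/"meta model" sketch: fit a black-box model, then train a
# small decision tree on its *predictions* so the tree approximates its logic.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=8, random_state=0)

# The accurate-but-opaque model we actually deploy.
black_box = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
black_box.fit(X, y)

# The surrogate is trained to mimic the black box, not to fit the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# The tree's rules give a readable approximation of the decision logic.
print(export_text(surrogate, feature_names=[f"feature_{i}" for i in range(8)]))
```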

Whether spurred on by the letter of the law or not, understanding why your model made a particular decision can be useful for diagnosing flaws and undesirable biases in your systems anyway.

In 2019 we expect that model interpretability will help providers and developers of AI to improve their approach and offer their users more transparency about decisions made.