Explaining the buzz around Artificial Intelligence

Authors: David Ackerman, David Chan, Byron Chu, Preethi Kumar, Barton Satchwill

PSSSTTT! In case you missed it, Alberta is a hotbed for global machine learning research. With the recent announcement that DeepMind is opening its first non-UK office in Edmonton (in collaboration with the University of Alberta), it’s clear that our city belongs with Toronto-Waterloo and Montreal as one of the giants of the Canadian Artificial Intelligence (AI) ecosystem.

In this blog post, Cybera’s data science team will try to break down exactly why AI is so important, and how some of its much-hyped tools and techniques (like deep learning and neural networks) actually work!

What’s the Big Deal?

The strength of Canada’s AI ecosystem is primarily research-based, with many of the influential papers and leaders in the area of deep learning coming from research labs at universities across the country. The federal government has recognized this strength and, in Budget 2017, announced $125 million in funding for the establishment of the Pan-Canadian AI Strategy. Broadly, the strategy aims to:

  • Increase the number of highly qualified personnel in AI in Canada.

  • Establish and interconnect nodes of scientific AI excellence in Edmonton, Montreal and Toronto.

  • Develop thought leaders in economic, ethical, legal and policy areas related to advances in AI technology.

  • Support the Canadian AI research community.

This injection of funding comes at a critical time, as interest and investment in AI (and related deep learning and machine learning technologies) are peaking, and our top researchers continue to be drawn to Silicon Valley. The commitment by the federal government and the growing number of start-ups in this area will hopefully help retain this Canadian talent!

What’s with all the buzz-words?

The terms AI, machine learning, and deep learning are often used interchangeably. All three refer to specific research fields, but it’s generally understood that machine learning is a sub-field of the larger AI research landscape, and that deep learning is in turn a sub-field of machine learning. There have been a number of significant recent contributions in these areas by Canadian researchers (see Richard S. Sutton, Geoffrey Hinton, Yoshua Bengio, Ian Goodfellow, and others), especially in deep learning.

Fundamental to both machine and deep learning is the concept of artificial neural networks.  These are computational networks that derive inspiration from biology and how neurons in the brain communicate with each other. Artificial neural networks attempt to mimic this communication strategy, and are organized in layers (such as input, hidden and output layers), where signals can be transmitted between neurons and layers. By varying the number, type and organization of the neuronal layers, artificial neural networks can be used to develop models trained to solve various problems including playing games, recognizing images, and processing natural languages (see below for notable examples).  
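To make the idea of layers a little more concrete, here is a minimal sketch of a signal passing through a tiny network with one hidden layer. The weights, the layer sizes and the function names (sigmoid, dense_layer, forward) are hand-picked assumptions for illustration only. A real network would learn its weights from data rather than have them typed in.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases):
    """One fully connected layer: every neuron takes a weighted sum of all
    inputs, adds its bias, and passes the result through the sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the signal through each layer in turn (input -> hidden -> output)."""
    signal = inputs
    for weights, biases in layers:
        signal = dense_layer(signal, weights, biases)
    return signal

# Hypothetical weights: 2 inputs -> 3 hidden neurons -> 1 output neuron
hidden = ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1])
output = ([[0.7, -0.5, 0.2]], [0.05])

prediction = forward([1.0, 0.5], [hidden, output])
print(prediction)  # a single value between 0 and 1
```

Changing the number of entries in the layers list, or the number of rows in each weight matrix, is the code-level version of "varying the number and organization of the neuronal layers."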

           Historical evolution of AI development. (source)

           Google Trends (past 5 years). Can you guess what the March 2016 peak in AI refers to?

 

Peeling back the (Neural Network) layers

The “Big Data” era has been spurred on by advances in computing technology (e.g. data processing and storage, and cloud computing) to the point where data is now commonly treated as a commodity (see “data is the new oil”). Currently, much of the speculated value and investment being derived from this commodity is driven by advances in AI, and specifically deep learning. These technologies enable computational models to learn from this “Big Data” by using multiple layers of abstraction between the input and output layers (see review). With this overly simplified concept in mind, let’s try to give an overview of some of the most common and/or popular deep learning approaches and their applications. It is also important to note that many of these approaches are combined in practice to achieve the best results.


Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are best known for their applications in “computer vision,” where computer models are taught to analyze visual images (e.g. pictures) and classify those images. For example, the model may be tasked to differentiate between images of cats and dogs. The accuracy obtained from using CNNs in these classification problems has improved significantly since the development of “deeper” network architectures, and these models now surpass human performance on the ImageNet Challenge benchmark.
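The operation that gives CNNs their name can be sketched in a few lines. The example below slides a small filter (kernel) over a toy 4x4 "image" and records how strongly each patch matches the filter. The image, the hand-written vertical-edge kernel and the convolve2d helper are all assumptions made for illustration. In a real CNN, the kernel values are learned during training and many kernels are stacked across many layers.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return
    the weighted sum at each position, as a CNN layer does."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Toy image: dark on the left (0), bright on the right (1)
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written kernel that responds to a dark-to-bright vertical edge
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map lights up only in the middle column, which is where the edge sits. A classifier built on top of many such feature maps has far more useful signals to work with than the raw pixels.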

Breakthrough contribution(s):

Where they are currently being applied:

  • Facial recognition, object recognition, handwriting recognition, video analysis, autonomous driving.


Recurrent Neural Networks

Recurrent Neural Networks (RNNs) are commonly applied to problems that require the processing of sequential data (i.e. sequences). They can be used to predict missing or upcoming elements of a sequence. To do this, they maintain memory states and reuse the same parameters at every step of the sequence, which lets the model carry information forward over both the short and the long term.

An important subtype of RNNs, called LSTMs (long short-term memory networks), is a key technology for natural language processing challenges.
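As a rough illustration of the "memory state" idea, the sketch below runs a single recurrent unit over a short sequence. It is a plain (vanilla) recurrent cell, not an LSTM, and the weights and the rnn_forward name are assumptions chosen by hand. The point to notice is that the same weights are reused at every step, and that the hidden state carries information from earlier inputs forward.

```python
import math

def rnn_forward(sequence, w_input, w_hidden, bias, h0=0.0):
    """Run a one-unit recurrent cell over a sequence.

    At each step the new hidden state depends on the current input AND
    the previous hidden state, using the same weights every time."""
    h = h0
    states = []
    for x in sequence:
        h = math.tanh(w_input * x + w_hidden * h + bias)
        states.append(h)
    return states

# A single "pulse" at the first step, followed by silence
states = rnn_forward([1.0, 0.0, 0.0], w_input=0.8, w_hidden=0.5, bias=0.0)
print(states)  # the state stays non-zero after the pulse, but fades
```

Even though the second and third inputs are zero, the hidden state remains positive: the network "remembers" the first input, with the memory decaying over time. LSTMs add gating machinery so that this memory can be kept (or deliberately forgotten) over much longer sequences.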

Breakthrough contribution(s):

  • A machine translation paper that leveraged deep neural networks to read a sentence in one language, and generate a translation in a different language.

Where they are currently being applied:

  • Text prediction, machine translation, conversation modeling, acoustic modeling, autonomous driving.


Generative Adversarial Networks

Generative Adversarial Networks (GANs) are one of the most promising approaches in the branch of AI that deals with unsupervised learning.

The two main components of a GAN are the generator neural network and the discriminator neural network. The former tries to create or generate “fake” data that is similar to the “real” data that we feed it, while the latter tries to tell the real data apart from the generated data.
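The tug-of-war between the two networks can be summarized by the scores they are each trying to improve. The sketch below assumes the discriminator outputs a probability that a sample is real, and uses a commonly seen cross-entropy formulation of the two losses. The function names and the example probabilities are made up for illustration, and no networks are actually trained here.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Low when the discriminator scores real samples near 1 and
    generated samples near 0. Inputs are probabilities of 'real'."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Low when the discriminator is fooled, i.e. when it scores
    generated samples near 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical scores early in training: the fakes are easy to spot
early_d = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
early_g = generator_loss(d_fake=[0.1, 0.2])

# Hypothetical scores later on: the discriminator can no longer tell
late_d = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
late_g = generator_loss(d_fake=[0.5, 0.5])

print(early_d, early_g)
print(late_d, late_g)
```

In an actual GAN, training alternates between nudging the discriminator's weights to lower its loss and nudging the generator's weights to lower its own loss, so each network's progress makes the other's job harder.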

Breakthrough contribution(s):

  • The concept of GANs was first published in 2014, and the adversarial training process has been improved upon since.

Where they are currently being applied:


(Deep) Reinforcement Learning

Reinforcement learning is “an area of machine learning inspired by behaviorist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.” (Wikipedia definition)

(Deep) Reinforcement Learning (RL) networks are currently making news as they are used to play games against human adversaries (e.g. Go). The three main RL concepts are state, action and reward (both immediate and long-term). These models/agents are able to learn continuously from their own mistakes (i.e. by trial and error) while taking in new inputs, using approaches such as Q-learning algorithms and policies.
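State, action and reward are easiest to see in a toy example. The sketch below uses tabular Q-learning (a lookup table instead of a deep network) on a hypothetical five-cell corridor, where the agent starts in cell 0 and earns a reward of 1 for reaching cell 4. The environment, the hyper-parameter values and the function names are all assumptions for illustration. Deep RL replaces the table with a neural network so that it can cope with inputs as large as raw screen pixels.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)    # 0 = step left, 1 = step right

def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of Q-values by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit what we know, sometimes explore
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Immediate reward plus discounted best long-term value
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)  # best action learned for cells 0..3
```

The agent is never told that "right" is the correct direction. It discovers this from the reward signal alone, and the learned values for stepping right shrink the further a cell is from the goal, which reflects the discounting of long-term reward.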

Breakthrough contribution:

  • In 2015, DeepMind published a paper outlining how its Deep RL model successfully learned to play 49 different Atari games, without prior knowledge of the game rules, by observing only the pixels on the screen and the resulting changes in game score.

Where they are currently being applied:


Where to get more info

There is a plethora of blog posts and tutorials that attempt to capture the fundamentals and newest advances in the rapidly developing field of AI. For more information, check out the links embedded in this blog post, and be sure to sign up for newsletters such as Wild Week in AI and O’Reilly’s AI coverage. As well, Cybera’s data science team has been busy and will be sharing more of our experiences and results as they become available. Stay (hyper-parameter-)tuned. 🙂