Learn about neon™ with the Nervana Deep Learning Course

Jul 16, 2016


Hanlin Tang

Principal Engineer, Artificial Intelligence Products Group

Intel Nervana is excited to share a series of short videos and accompanying exercises that teach you how to build deep learning models with neon, our deep learning framework. We start with a basic introduction to deep learning concepts, provide an overview of the neon framework, and discuss key neon concepts such as loading data and defining branching architectures. This will be a living series, so check back for more updates and videos!

You can also find more resources, including pre-trained models, Kaggle challenge scripts, videos from our meetups, and more here.

Video Sessions

01 Deep learning introduction

This video introduces the basic deep learning concepts necessary both to understand the neon codebase and to build your own deep learning models. We discuss how deep learning differs from traditional machine learning, and cover basic concepts such as supervised learning, backpropagation, stochastic gradient descent, activation functions, and the basic linear unit.
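To make these concepts concrete before turning to neon, here is a minimal NumPy sketch (not neon code) of a single linear unit with a logistic activation trained by stochastic gradient descent; the toy data, learning rate, and epoch count are placeholder choices.

```python
import numpy as np

# Toy data: 4 examples, 3 features each; targets are the logical OR of the
# first two features (chosen so a single linear unit can actually fit them)
X = np.array([[0., 0., 1.],
              [0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 1.]])
y = np.array([0., 1., 1., 1.])

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=3)    # weights of the linear unit
b = 0.0                              # bias
lr = 0.5                             # learning rate (placeholder value)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(500):
    for i in rng.permutation(len(X)):    # stochastic: one example at a time
        a = sigmoid(X[i] @ w + b)        # forward pass: the unit's activation
        # Backpropagation for the squared error 0.5 * (a - y)^2:
        # chain rule through the loss and the sigmoid nonlinearity
        grad = (a - y[i]) * a * (1.0 - a)
        w -= lr * grad * X[i]            # stochastic gradient descent update
        b -= lr * grad
```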

02 Recurrent neural networks

For sequence data such as speech or text, recurrent neural networks (RNNs) are often used to capture the short- and long-term temporal dependencies in the data. Training RNNs is challenging because of the vanishing gradient problem. We introduce the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, which are designed to combat the vanishing gradient problem.
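For a sense of how these units look in neon, the sketch below stacks an LSTM ahead of a classifier. The positional (output_size, init) arguments and the activation/gate_activation keywords follow neon 1.x's recurrent layer API; swapping LSTM for GRU is a one-line change, but verify the exact signatures against the neon documentation for your version.

```python
from neon.initializers import GlorotUniform
from neon.layers import LSTM, Affine
from neon.transforms import Logistic, Softmax, Tanh

init = GlorotUniform()
layers = [
    # 128 hidden units; the gates use logistic activations and the
    # cell update uses tanh, the standard LSTM configuration
    LSTM(128, init, activation=Tanh(), gate_activation=Logistic()),
    Affine(nout=10, init=init, activation=Softmax()),
]
```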

03 Convolutional Neural Networks

For images and other data where ordering along the spatial dimensions has meaning, convolutional neural networks (CNNs) have proven effective. In this video, we discuss 1D, 2D, and 3D convolutional networks, and review recent CNN architectures that have enabled deeper and more powerful models (VGG, ResNet, etc.).
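In neon, a small 2D convolutional stack looks like the sketch below; the (height, width, nfilters) fshape tuple and the layer names follow neon 1.x's API and are assumptions to check against your version.

```python
from neon.initializers import Gaussian
from neon.layers import Affine, Conv, Pooling
from neon.transforms import Rectlin, Softmax

init = Gaussian(loc=0.0, scale=0.01)
layers = [
    Conv((5, 5, 16), init=init, activation=Rectlin()),  # sixteen 5x5 filters
    Pooling(2),                                         # 2x2 max pooling
    Conv((5, 5, 32), init=init, activation=Rectlin()),
    Pooling(2),
    Affine(nout=10, init=init, activation=Softmax()),   # classifier head
]
```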

04 Neon Workflow

The neon deep learning framework provides an easy Python-based approach to getting started with deep learning. Here we introduce the basic modules within neon, and show how to construct models and use command line arguments to customize training runs. We recommend viewing this video before trying our Jupyter notebooks. The MNIST Example and Fine-tuning VGG notebooks below are useful companions.
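The sketch below condenses the workflow from the video into a single script, following neon's public MNIST example: generate a backend, load data, stack layers in a Python list, and call fit. The module paths and the MNIST loader's train_iter/valid_iter interface reflect neon 1.x and have shifted between releases, so treat this as an outline rather than copy-paste code.

```python
from neon.backends import gen_backend
from neon.callbacks.callbacks import Callbacks
from neon.data import MNIST
from neon.initializers import Gaussian
from neon.layers import Affine, GeneralizedCost
from neon.models import Model
from neon.optimizers import GradientDescentMomentum
from neon.transforms import CrossEntropyMulti, Rectlin, Softmax

# Generate the backend first; swap 'cpu' for 'gpu' to train on a GPU
be = gen_backend(backend='cpu', batch_size=128)

mnist = MNIST(path='data/')          # downloads MNIST on first use
train_set = mnist.train_iter
valid_set = mnist.valid_iter

# Models are just Python lists of layers
init = Gaussian(loc=0.0, scale=0.01)
layers = [Affine(nout=100, init=init, activation=Rectlin()),
          Affine(nout=10, init=init, activation=Softmax())]

model = Model(layers=layers)
cost = GeneralizedCost(costfunc=CrossEntropyMulti())
optimizer = GradientDescentMomentum(0.1, momentum_coef=0.9)
callbacks = Callbacks(model, eval_set=valid_set)

model.fit(train_set, optimizer=optimizer, num_epochs=10,
          cost=cost, callbacks=callbacks)
```

In neon's shipped examples, a NeonArgparser supplies the backend, batch size, and epoch count as command line arguments, which is what lets the same script run unmodified on CPU or GPU.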

05 Neon Concepts

In this video, we discuss two key concepts within neon: loading data into neon, and defining complex branching architectures. neon provides four different ways to load your data for training, depending on your data's size and complexity. Several notebooks guide you through writing a custom dataset object, custom activation functions and layers, custom callbacks, and defining a complex branching model.
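The simplest of those loading paths is the in-memory ArrayIterator, sketched below with placeholder arrays; the nclass and lshape arguments follow neon 1.x, and the notebooks cover the remaining options, including writing your own dataset object.

```python
import numpy as np
from neon.data import ArrayIterator

# Assumes a backend has already been generated with gen_backend.
# Placeholder data: 1000 flattened 28x28 grayscale images, 10 classes
X = np.random.rand(1000, 784)
y = np.random.randint(10, size=1000)

# lshape tells neon how to unflatten each row: (channels, height, width)
train_set = ArrayIterator(X, y, nclass=10, lshape=(1, 28, 28))
```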

06 Nervana Cloud

Some of our notebooks require GPUs because of memory and speed constraints. Our Nervana Cloud provides an easy interface for launching training jobs on our GPU servers. Trained models can also be deployed on a server to receive incoming inference requests via a REST API. This video demonstrates how to launch jobs, inspect progress, and deploy a trained model for inference.

One of our popular cloud features is interactive mode, where users can launch a Jupyter notebook server running on our GPUs and access the notebook through their web browser to interactively step through code for debugging or exploration.

Exercises

The videos above are accompanied by several Jupyter notebooks, found at https://github.com/NervanaSystems/neon_course, which provide guided exercises on key concepts in neon and common operations.

The Jupyter notebooks in this repository include:

  1. MNIST example

Comprehensive walk-through of how to use neon to build a simple model to recognize handwritten digits. Recommended as an introduction to the neon framework.

  2. Fine-tuning

A popular application of deep learning is to load a pre-trained model and fine-tune on a new dataset that may have a different number of categories. This example walks through how to load a VGG model that has been pre-trained on ImageNet, a large corpus of natural images belonging to 1000 categories, and re-train the final few layers on the CIFAR-10 dataset, which has only 10 categories.

  3. Writing a custom dataset object

neon provides many built-in methods for loading data from images, videos, audio, text, and more. In the rare cases where you may need to implement a custom dataset object, this notebook guides you through building one for a modified version of the Street View House Numbers (SVHN) dataset. You will not only write a custom dataset object, but also design a network that, given an image, draws a bounding box around the digit sequence.

  4. Writing a custom activation function and a custom layer

This notebook walks developers through how to implement custom activation functions and layers within neon. We implement the Affine layer, and demonstrate the speed difference between a Python-based computation and our heavily optimized kernels.

  5. Defining complex branching models

When a simple sequential list of layers does not suffice, this notebook shows how to build complex branching models within neon.

  6. Deep residual network on the CIFAR-10 dataset

In neon, models are constructed as Python lists, which makes it easy to use for-loops to define complex models that have repeated patterns, such as deep residual networks (see the first sketch after this list). This notebook is an end-to-end walkthrough of building a deep residual network, training it on the CIFAR-10 dataset, and then applying the model to predict categories of novel images.

  7. Writing a custom callback

Callbacks allow models to report their progress to users during training. In this notebook, we present a callback that plots the training cost in real time within the Jupyter notebook (a minimal sketch also follows this list).

  8. Detecting overfitting

Overfitting is often encountered when training deep learning models. This tutorial demonstrates how to use our visualization tools to detect when a model has overfit on the training data, and how to apply Dropout layers to correct the problem.
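As a taste of the list-based construction used in the residual network notebook (item 6 above), the sketch below uses nested for-loops to build the repeated convolutional stages as a plain Python list. It omits the skip connections that make the network truly residual, which the full notebook wires in with neon's merge containers; the initializer and layer signatures follow neon 1.x and should be verified against the notebook.

```python
from neon.initializers import Kaiming
from neon.layers import Affine, Conv, Pooling
from neon.transforms import Rectlin, Softmax

init = Kaiming()
layers = [Conv((3, 3, 16), init=init, activation=Rectlin())]

# Three stages of increasing width, each a stack of repeated conv layers
for nfilters in (16, 32, 64):
    for _ in range(3):
        layers.append(Conv((3, 3, nfilters), init=init, activation=Rectlin()))

layers.append(Pooling('all', op='avg'))   # global average pooling
layers.append(Affine(nout=10, init=init, activation=Softmax()))
```

Similarly, for the custom callback notebook (item 7 above), here is a minimal sketch of the idea: subclass neon's Callback and hook the end of each epoch. The on_epoch_end signature and the model.total_cost attribute are assumptions based on neon 1.x; the notebook demonstrates the exact API.

```python
import matplotlib.pyplot as plt
from neon.callbacks.callbacks import Callback

class CostPlotCallback(Callback):
    """Plot the running training cost after every epoch (sketch)."""

    def __init__(self):
        super(CostPlotCallback, self).__init__()
        self.costs = []

    # Assumed hook signature from neon 1.x's Callback base class
    def on_epoch_end(self, callback_data, model, epoch):
        # model.total_cost is assumed to hold the epoch's accumulated cost
        self.costs.append(float(model.total_cost.get()))
        plt.plot(self.costs)
        plt.xlabel('epoch')
        plt.ylabel('training cost')
        plt.show()
```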

