The guy who wrote the convnetjs JavaScript library for running neural nets in your browser or Node has this great series on his blog describing neural nets to coders who like to tinker with code samples rather than read long texts.
Another Deep Learning School
I recently posted about the Deep Learning Summer School 2016.
Here is a link to another one, the Bay Area Deep Learning School, which was held in September of the same year at Stanford, CA.
Introduction to Feedforward Neural Networks
Hugo Larochelle
I will cover some of the fundamental concepts behind feedforward neural networks. I’ll start by briefly reviewing the basic multilayer architecture of feedforward networks, as well as backpropagation from automatic differentiation and stochastic gradient descent (SGD). Then, I’ll discuss the most recent ideas that are now commonly used for training deep neural networks, such as variants of SGD, dropout, batch normalization and unsupervised pretraining.
Deep Learning for Computer Vision
Andrej Karpathy
I will cover the design of convolutional neural network (ConvNet) architectures for image understanding, the history of state of the art models on the ImageNet Large Scale Visual Recognition Challenge, and some of the most recent patterns of developments in this area. I will also talk about ConvNet architectures in the context of related visual recognition tasks such as object detection, segmentation, and video processing.
Deep Learning for NLP
Richard Socher
I will describe the foundations of deep learning for natural language processing: word vectors, recurrent neural networks, tasks and models influenced by linguistics. I will end with some recent models that put together all these basic lego blocks into a very powerful deep architecture called dynamic memory network.
Foundations of Deep Unsupervised Learning
Ruslan Salakhutdinov
Building intelligent systems that are capable of extracting meaningful representations from high-dimensional data lies at the core of solving many Artificial Intelligence tasks, including visual object recognition, information retrieval, speech perception, and language understanding. In this tutorial I will discuss mathematical basics of many popular unsupervised models, including Sparse Coding, Autoencoders, Restricted Boltzmann Machines (RBMs), Deep Boltzmann Machines (DBMs), and Variational Autoencoders (VAE). I will further demonstrate that these models are capable of extracting useful hierarchical representations from high-dimensional data with applications in visual object recognition, information retrieval, and natural language processing. Finally, time permitting, I will briefly discuss models that can generate natural language descriptions (captions) of images, as well as generate images from captions using an attention mechanism.
Deep Reinforcement Learning
John Schulman, OpenAI
I’ll start by providing an overview of the state of the art in deep reinforcement learning, including recent applications to video games (e.g., Atari), board games (AlphaGo) and simulated robotics. Then I’ll give a tutorial introduction to the two methods that lie at the core of these results: policy gradients and Q-learning. Finally, I’ll present a new analysis that shows the close similarity between these two methods. A theme of the talk will be to not only ask “what works?”, but also “when does it work?” and “why does it work?”; and to find the kind of answers that are actionable for tuning one’s implementation and designing better algorithms.
Theano Tutorial
Pascal Lamblin
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multidimensional arrays efficiently, on CPU or GPU. Since its introduction, Theano has been one of the most popular frameworks in the machine learning community, and multiple frameworks for deep learning have been built on top of it (Lasagne, Keras, Blocks, …). This tutorial will focus first on the concepts behind Theano and how to build and evaluate simple expressions, and then we will see how more complex models can be defined and trained.
Deep Learning for Speech
Adam Coates
Traditional speech recognition systems are built from numerous modules, each requiring its own challenging engineering. With deep learning it is now possible to create neural networks that perform most of the tasks of a traditional engine “end to end”, dramatically simplifying the development of new speech systems and opening a path to human-level performance. In this tutorial, we will walk through the steps for constructing one type of end-to-end system similar to Baidu’s “Deep Speech” model. We will put all of the pieces together to form a “scale model” of a state of the art speech system; small-scale versions of the neural networks now powering production speech engines.
Torch Tutorial
Alex Wiltschko
Torch is an open platform for scientific computing in the Lua language, with a focus on machine learning, in particular deep learning. Torch is distinguished from other array libraries by having first-class support for GPU computation, and a clear, interactive and imperative style. Further, through the “NN” library, Torch has broad support for building and training neural networks by composing primitive blocks or layers together in compute graphs. Torch, although benefitting from extensive industry support, is a community owned and community developed ecosystem. All neural net libraries, including Torch NN, TensorFlow and Theano, rely on automatic differentiation (AD) to manage the computation of gradients of complex compositions of functions. I will present some general background on automatic differentiation (AD), which is the fundamental abstraction of gradient based optimization, and demonstrate Twitter’s flexible implementation of AD in the library torch-autograd.
Sequence to Sequence Learning for NLP and Speech
Quoc Le
I will first present the foundations of sequence to sequence (seq2seq) learning and attention models, and their applications in machine translation and speech recognition. Then I will discuss attention with pointers and functions. Finally I will describe how reinforcement learning can play a role in seq2seq and attention models.
Foundations and Challenges of Deep Learning
Yoshua Bengio
Why is deep learning working as well as it does? What are some big challenges that remain ahead? This talk will first survey some key factors in the success of deep learning. First, from the context of the no-free-lunch theorem, we will discuss the expressive power of deep networks to capture abstract distributed representations. Second, we will discuss our surprising ability to actually optimize the parameters of neural networks in spite of their non-convexity. We will then consider a few challenges ahead, including the core representation question of disentangling the underlying explanatory factors of variation, especially with unsupervised learning, why this is important for bringing reinforcement learning to the next level, and optimization questions that remain challenging, such as learning of long-term dependencies, understanding the optimization landscape of deep networks, and how learning in brains remains a mystery worth attacking from the deep learning perspective.
The best free introductory course to machine learning according to the Internet
Many people have recommended the Machine Learning course taught by Andrew Ng at Stanford University, which is available via Coursera. Some say it is the best course they have come across. You can take it for free, but if you want a certificate showing that you completed the course you can buy it for a small fee.
ConvNetJS – A JavaScript library for training Neural Networks in your browser
Here is a link to an open source JavaScript library that allows you to train neural networks in your browser. It was created by Andrej Karpathy, a PhD student at Stanford University, and is currently community maintained.
It currently supports:
 Common Neural Network modules (fully connected layers, nonlinearities)
 Classification (SVM/Softmax) and Regression (L2) cost functions
 Ability to specify and train Convolutional Networks that process images
 An experimental Reinforcement Learning module, based on Deep Q Learning.
Project site: http://cs.stanford.edu/people/karpathy/convnetjs/
Github: https://github.com/karpathy/convnetjs
Deep learning for Self driving cars
I found a great course on building self-driving cars using deep learning. It covers deep reinforcement learning, convolutional neural networks and recurrent neural networks for the different tasks involved in producing an autonomous vehicle that can adapt to traffic, control a car, and learn to drive and steer through time.
Lasso regression
To avoid overfitting in a regression because of too many features, while still keeping enough features to minimize the sum of squared errors (SSE) and get an accurate fit on the test data, you need to regularize the regression.
This can be done with a Lasso regression, where you minimize the sum of squared errors plus a penalty parameter λ times the sum of the absolute values of the regression coefficients β. Every feature with a nonzero coefficient adds to the penalty, so the penalty pushes the coefficients of unhelpful features to zero, and in that sense it reflects the number of features used.
minimize SSE + λ Σ_j |β_j|
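As a rough illustration of the objective above, here is a minimal pure-Python sketch of Lasso fitted by coordinate descent. It assumes the penalty is λ times the sum of the absolute coefficient values (the usual L1 form) and that there is no intercept term. The function names and the tiny dataset are made up for this example and do not come from any particular library.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 if it is within the threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_objective(X, y, beta, lam):
    """SSE + lam * sum(|beta_j|) for a linear model without intercept."""
    sse = sum(
        (yi - sum(b * x for b, x in zip(beta, row))) ** 2
        for row, yi in zip(X, y)
    )
    return sse + lam * sum(abs(b) for b in beta)


def lasso_coordinate_descent(X, y, lam, n_iters=100):
    """Minimize the Lasso objective by updating one coefficient at a time."""
    n_features = len(X[0])
    beta = [0.0] * n_features
    for _ in range(n_iters):
        for j in range(n_features):
            rho = 0.0  # correlation of feature j with the residual that ignores feature j
            z = 0.0    # squared norm of feature j
            for row, yi in zip(X, y):
                pred_without_j = sum(
                    beta[k] * row[k] for k in range(n_features) if k != j
                )
                rho += row[j] * (yi - pred_without_j)
                z += row[j] ** 2
            beta[j] = soft_threshold(rho, lam / 2.0) / z if z > 0 else 0.0
    return beta


# Hypothetical data: y depends strongly on feature 0 and only weakly on feature 1.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [3.1, 2.9, -2.9, -3.1]

beta_plain = lasso_coordinate_descent(X, y, lam=0.0)  # ordinary least squares
beta_lasso = lasso_coordinate_descent(X, y, lam=2.0)  # weak feature is dropped
print(beta_plain, beta_lasso)
```

With the penalty switched on, the coefficient of the weak feature becomes exactly zero while the strong one is only shrunk a little, which is the feature-selection effect that makes Lasso useful against overfitting.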
Deep Learning School
Here you can watch lectures from the 2016 Deep Learning Summer School in Montreal.
Course excerpt:
Deep neural networks that learn to represent data in multiple layers of increasing abstraction have dramatically improved the state-of-the-art for speech recognition, object recognition, object detection, predicting the activity of drug molecules, and many other tasks. Deep learning discovers intricate structure in large datasets by building distributed representations, either via supervised, unsupervised or reinforcement learning.
The Deep Learning Summer School 2016 is aimed at graduate students and industrial engineers and researchers who already have some basic knowledge of machine learning (and possibly but not necessarily of deep learning) and wish to learn more about this rapidly growing field of research.
Here is the schedule, which gives the order in which you could view the presentations:
Time | 01/08/2016 | 02/08/2016 | 03/08/2016 | 04/08/2016 | 05/08/2016 | 06/08/2016 | 07/08/2016
9:00-10:30 | Doina Precup | Rob Fergus | Yoshua Bengio | Kyunghyun Cho | Joelle Pineau | Ruslan Salakhutdinov | Bruno Olshausen, Neuro I
10:30-11:00 | Coffee Break | Coffee Break | Coffee Break | Coffee Break | Coffee Break | Coffee Break | Coffee Break
11:00-12:30 | Hugo Larochelle | Antonio Torralba | Sumit Chopra | Edward Grefenstette | Pieter Abbeel | Shakir Mohamed | Surya Ganguli and Deep Learning Theory
12:30-14:30 | Lunch | Lunch | Lunch, WiDL event | Lunch | Lunch | Lunch | Lunch
14:30-16:00 | Hugo Larochelle, Neural Networks II (click on part II) | Alex Wiltschko, Torch I | Jeff Dean | Julie Bernauer (NVIDIA), GPU programming with CUDA | Joelle, Pieter & Doina, Advanced Topics in RL | Contributed talks Session 4 | Contributed talks Session 4
16:00-16:30 | Coffee Break | Coffee Break | Coffee Break | Coffee Break | Coffee Break | Coffee Break | Coffee Break
16:30-18:00 | Pascal Lamblin | Practical Session, Alex Wiltschko (Torch), Frédéric Bastien | Jeff Dean & TensorFlow (click on part II) | Contributed talks Session 1 | Contributed talks Session 2 | Contributed Posters Session 1 | Contributed Posters Session 2
Evening: Opening Reception (18:00-20:30), by Imagia; Happy Hour (18:45-22:30), buses at 18:30, by Maluuba; Happy Hour (18:30-20:30), by Creative Destruction Lab
(or you can just follow them in consecutive order at http://videolectures.net/deeplearning2016_montreal/ since they seem to be in the order they were presented.)
Contributed talks:
12:55 Chains of Reasoning over Entities, Relations, and Text using Recurrent Neural Networks
Rajarshi Das
14:29 Professor Forcing: A New Algorithm for Training Recurrent Networks
Anirudh Goyal
10:59 Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations
Tegan Maharaj
18:58 Deep multi-view representation learning of brain responses to natural stimuli
Leila Wehbe
14:49 Learning to Communicate with Deep Multi-Agent Reinforcement Learning
Jakob Foerster
13:57 Model-Based Relative Entropy Stochastic Search
Abbas Abdolmaleki
16:33 Learning Nash Equilibrium for General-Sum Markov Games from Batch Data
Julien Pérolat
20:30 A Network-based End-to-End Trainable Task-oriented Dialogue System
Tsung-Hsien Wen
15:28 Inference Learning
Patrick Putzky
16:45 Variational Autoencoders with PixelCNN Decoders
Ishaan Gulrajani
13:33 An Infinite Restricted Boltzmann Machine
Marc-Alexandre Côté
15:15 Deep siamese neural network for prediction of long-range interactions in chromatin
Davide Chicco
14:09 Beam Search Message Passing in Bidirectional RNNs: Applications to Fill-in-the-Blank Image Captioning
Qing Sun
18:40 Analyzing the Behavior of Deep Visual Question Answering Models
Aishwarya Agrawal
13:55 Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation
Sina Honari
Introduction to Generative Adversarial Networks (GANs)
Read these articles:
 Attacking Machine Learning with Adversarial Examples
 Generative Adversarial Networks (GANs) in 50 lines of code (PyTorch)
and then read these papers by Ian Goodfellow et al:
You can also watch this video by Ian Goodfellow:
Outlier Rejection
To detect and get rid of outliers in a dataset (which may for instance have been caused by sensor errors or data entry errors), you first train your model on all of the data, remove the data points with the highest residual errors (for example the worst 10%), and then train again on what is left.
Otherwise, erroneous data entries may give you an incorrect regression line.
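The train, remove and retrain loop can be sketched in a few lines of plain Python. This is only an illustration: it assumes a simple one-variable least-squares line, and it reads the 10% above as "drop the 10% of points with the largest residuals". The function names and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def reject_outliers(xs, ys, fraction=0.1):
    """Fit once, then drop the given fraction of points with the largest residuals."""
    slope, intercept = fit_line(xs, ys)
    residuals = [abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)]
    n_remove = int(len(xs) * fraction)
    # Indices sorted from smallest to largest residual; keep the best ones.
    order = sorted(range(len(xs)), key=lambda i: residuals[i])
    keep = sorted(order[:len(xs) - n_remove])
    return [xs[i] for i in keep], [ys[i] for i in keep]


# Hypothetical data: y = 2x + 1, with one corrupted reading at x = 5.
xs = [float(x) for x in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
ys[5] = 100.0

first_fit = fit_line(xs, ys)                # pulled off course by the outlier
clean_xs, clean_ys = reject_outliers(xs, ys)
second_fit = fit_line(clean_xs, clean_ys)   # close to slope 2, intercept 1
print(first_fit, second_fit)
```

In practice you may need to repeat the remove and retrain step, and the fraction to drop is a judgment call that depends on how noisy the data is.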
Classification vs. Regression
Two somewhat similar concepts in supervised machine learning are classification and regression.
With classification you get a discrete output (a label or boolean value), whereas in regression your output is continuous (i.e. a number).
What you are trying to find differs between the two: a decision boundary in classification and a best-fit line in regression. You evaluate the former with its accuracy, and the latter with the “sum of squared errors” or r².
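To make the evaluation measures concrete, here is a small pure-Python sketch of accuracy, the sum of squared errors and r². It assumes the standard definitions (accuracy is the fraction of correct labels, and r² is 1 minus SSE divided by the total sum of squares). The function names and the numbers are made up for illustration.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels (classification)."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def sum_of_squared_errors(ys, predictions):
    """Sum of squared differences between targets and predictions (regression)."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions))


def r_squared(ys, predictions):
    """1 - SSE / total sum of squares; 1.0 means a perfect fit."""
    mean_y = sum(ys) / len(ys)
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sum_of_squared_errors(ys, predictions) / total


# Classification: discrete labels, scored by accuracy.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))

# Regression: continuous outputs, scored by SSE or r squared.
targets = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
print(sum_of_squared_errors(targets, predicted), r_squared(targets, predicted))
```

One practical difference between the two regression measures is that SSE grows as you add more data points, while r² stays on a comparable scale, which makes r² easier to compare across datasets of different sizes.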