Course 4 [] has been released!

The fourth course, Convolutional Neural Networks of has now been released on coursera. People have been waiting for this one, but i think that the delay was to make the material very up to date with current research results. The four weeks of learning deals with:

  1. Foundations of Convolutional Neural Networks
  2. Deep convolutional models: case studies
  3. Object detection
  4. Special applications: Face recognition & Neural style transfer

You only look once

The “YOLO9000: Better, Faster, Stronger” paper describes the improvements to the YOLO, You only look once, architecture that enables realtime object detection and classification. It can classify over 9000 object categories and outperforms Faster RCNN with ResNet and SSD while being significantly faster. They train on both COCO dataset for detection simultaneously with ImageNet for Classification and combine it with a wordtree so that they can also fallback to “dog” if they cannot classify for instance a specific dog breed.

The first version, and architecture can be seen in this paper.

Here is a video presentation:

Tensorflow Object Detection API

Google has released an opensource framework built on top of Tensorflow, called the Tensorflow Object Detection API which is a tool for making it easy to make and deploy object detection models.

There are different state of the art types of models you can build. It you for instance make models using the Single Shot Multibox Detector (SSD) with MobileNets you will get lightweight models that you can run in real time on mobile devices.

The models you get are Single Shot Multiboc Detector, using MobileNets or Inception V2, RegionBased Fully Convolutional Networks with Resnet 101, and Faster RCMM with Resnet 101 or Inception Resnet v2.

You also get a Jupyter notebook for trying things out


If all these terms above makes no sense, you can read this excellent blog post explaining Deep Learning for Object Detection by Joyce Xu.