You only look once

The “YOLO9000: Better, Faster, Stronger” paper describes improvements to the YOLO (You Only Look Once) architecture, which enables real-time object detection and classification. According to the paper, the system can detect over 9000 object categories and outperforms Faster R-CNN with ResNet and SSD while being significantly faster. The authors train jointly on the COCO dataset for detection and on ImageNet for classification, and combine the labels of both in a WordTree hierarchy so that the model can fall back to “dog” when it cannot identify, for instance, a specific dog breed.
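
To make the fallback idea concrete, here is a minimal sketch of how a prediction could walk down a label hierarchy and stop at “dog” when no breed is confident enough. The tiny tree, the probabilities, the function name predict and the 0.5 threshold are all made up for illustration; this is not the authors' code, only one plausible reading of the idea.

```python
# Hypothetical toy hierarchy; the real WordTree is built from WordNet.
CHILDREN = {
    "physical object": ["animal", "vehicle"],
    "animal": ["dog", "cat"],
    "dog": ["terrier", "husky"],
}


def predict(cond_prob, root="physical object", threshold=0.5):
    """Follow the most likely child at each level of the tree.

    cond_prob[label] is the (assumed) probability of that label given
    its parent. The probability of a node is the product of the
    conditional probabilities along the path from the root. We stop
    before taking a step that would push that product below threshold.
    """
    node, prob = root, 1.0
    while node in CHILDREN:
        best = max(CHILDREN[node], key=lambda child: cond_prob[child])
        next_prob = prob * cond_prob[best]
        if next_prob < threshold:
            break  # not confident enough: keep the more general label
        node, prob = best, next_prob
    return node, prob


# Made-up network outputs: clearly a dog, but the breed is uncertain.
cond_prob = {
    "animal": 0.95, "vehicle": 0.05,
    "dog": 0.90, "cat": 0.10,
    "terrier": 0.55, "husky": 0.45,
}
label, confidence = predict(cond_prob)
print(label, round(confidence, 3))
```

With these numbers the walk reaches “dog” and stops there, because taking the best breed would drop the accumulated probability below the threshold. If the breed probability were higher, the same function would return the breed instead.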

The first version of YOLO and its architecture are described in this paper.

Here is a video presentation: https://www.youtube.com/watch?v=GBu2jofRJtk

Paper on Deep Reinforcement Learning

Paper: A Brief Survey of Deep Reinforcement Learning
Authors: Kai Arulkumaran, Marc Peter Deisenroth, Miles Brundage, Anil Anthony Bharath
Submitted: 19 Aug 2017
Read the PDF

Abstract:

Deep reinforcement learning is poised to revolutionise the field of AI and represents a step towards building autonomous systems with a higher level understanding of the visual world. Currently, deep learning is enabling reinforcement learning to scale to problems that were previously intractable, such as learning to play video games directly from pixels. Deep reinforcement learning algorithms are also applied to robotics, allowing control policies for robots to be learned directly from camera inputs in the real world. In this survey, we begin with an introduction to the general field of reinforcement learning, then progress to the main streams of value-based and policy-based methods. Our survey will cover central algorithms in deep reinforcement learning, including the deep Q-network, trust region policy optimisation, and asynchronous advantage actor-critic. In parallel, we highlight the unique advantages of deep neural networks, focusing on visual understanding via reinforcement learning. To conclude, we describe several current areas of research within the field.

Paper: Introduction to Convolutional Neural Networks by Jianxin Wu

In this recently published paper, Jianxin Wu helps the reader understand how a CNN runs at the mathematical level. It is self-contained, and you should not need any further material to understand CNNs from a mathematical viewpoint.

With CNNs, the important part is understanding what happens when you adjust the different parameters. But it is much easier to make sense of those adjustments when you know the underlying principles.
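
As a small example of what adjusting parameters does, the sketch below computes the output size of a convolution along one dimension from the kernel size, stride and padding. It uses the standard formula floor((n + 2p - k) / s) + 1 and ignores dilation; the function name conv_output_size is my own and is not taken from the paper.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length of a convolution along one spatial dimension.

    Standard formula: floor((n + 2p - k) / s) + 1, without dilation.
    """
    if kernel_size > input_size + 2 * padding:
        raise ValueError("kernel is larger than the padded input")
    return (input_size + 2 * padding - kernel_size) // stride + 1


# A 32-pixel-wide input with a 3-wide kernel:
no_padding = conv_output_size(32, 3)                       # shrinks to 30
same_size = conv_output_size(32, 3, padding=1)             # stays at 32
downsampled = conv_output_size(32, 3, stride=2, padding=1) # halves to 16
print(no_padding, same_size, downsampled)
```

Padding by one pixel keeps the width unchanged for a 3-wide kernel, while a stride of 2 roughly halves it.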

Here you go