AlphaGo Zero beats the previous champion, AlphaGo, 100-0

You have probably heard of DeepMind's AlphaGo, which beat one of the world's best Go players at a game where everyone said computers would still need ten years to beat humans.

That version was first trained on millions of moves from expert human games and then improved by playing against itself through reinforcement learning.

This version skips the human games entirely and learns only by playing against itself, using a novel reinforcement learning method. It is given nothing but the rules of the game, then plays against itself, making adjustments and keeping the versions that improve.
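To make the "play yourself, adjust, keep what improves" idea concrete, here is a toy sketch. It is not DeepMind's method: AlphaGo Zero uses a neural network guided by tree search, while this sketch uses a plain lookup table and random tweaks on a tiny stone-taking game (take 1 to 3 stones, whoever takes the last stone wins). All names here (`train`, `wins`, `mutate` and so on) are made up for illustration.

```python
import random

MAX_TAKE = 3   # a player may take 1 to 3 stones per turn
MAX_PILE = 12  # largest starting pile used when comparing versions


def random_policy(rng):
    # policy[pile] = how many stones to take when `pile` stones remain
    return {pile: rng.randint(1, min(MAX_TAKE, pile))
            for pile in range(1, MAX_PILE + 1)}


def play(first, second, pile):
    """Return 0 if `first` wins, 1 if `second` wins (last stone wins)."""
    players = (first, second)
    turn = 0
    while True:
        pile -= players[turn][pile]
        if pile == 0:
            return turn
        turn = 1 - turn


def wins(candidate, best):
    """Candidate's wins against best, over every start pile and both seats."""
    total = 0
    for pile in range(1, MAX_PILE + 1):
        total += play(candidate, best, pile) == 0  # candidate moves first
        total += play(best, candidate, pile) == 1  # candidate moves second
    return total


def mutate(policy, rng):
    # "Making adjustments": change the move for one random pile size.
    new = dict(policy)
    pile = rng.randint(1, MAX_PILE)
    new[pile] = rng.randint(1, min(MAX_TAKE, pile))
    return new


def train(rounds=500, seed=0):
    rng = random.Random(seed)
    best = random_policy(rng)
    games = 2 * MAX_PILE
    for _ in range(rounds):
        candidate = mutate(best, rng)
        # "Keeping the versions that improve": the candidate replaces the
        # current best only if it wins more than half of the games.
        if wins(candidate, best) > games / 2:
            best = candidate
    return best


if __name__ == "__main__":
    print(train())
```

The only knowledge the program starts with is the rules inside `play`. Everything else comes from comparing a new version against the current one and keeping the winner.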

Blog Post:
Research Page:

If you would like to replicate the research, there is an open source project based on the paper. However, to get the same results as AlphaGo Zero you would need the same weights, and to arrive at similar weights you would need access to the same computing power as DeepMind. On commodity computers it would take 1700 years. The project's aim is to repeat the work as a distributed effort.

Course 4 has been released!

The fourth course, Convolutional Neural Networks, has now been released on Coursera. People have been waiting for this one, but I think the delay was there to bring the material fully up to date with current research results. The four weeks cover:

  1. Foundations of Convolutional Neural Networks
  2. Deep convolutional models: case studies
  3. Object detection
  4. Special applications: Face recognition & Neural style transfer

You only look once

The “YOLO9000: Better, Faster, Stronger” paper describes improvements to the YOLO (You Only Look Once) architecture, which enables real-time object detection and classification. It can detect over 9000 object categories and outperforms Faster R-CNN with ResNet and SSD while being significantly faster. They train on the COCO dataset for detection and on ImageNet for classification at the same time, and combine the labels in a WordTree so that the model can fall back to “dog” when it cannot identify, for instance, a specific dog breed.
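The fallback to “dog” can be sketched roughly like this. The tree and the numbers below are made up for illustration (the real WordTree is built from WordNet and the probabilities come from the network), and `predict` is a hypothetical helper name. The idea, as I understand the paper, is to walk down the tree along the most likely child, multiply the conditional probabilities on the way, and stop when the result would drop below a threshold.

```python
# Hypothetical toy hierarchy: child -> parent.
PARENT = {
    "animal": None,
    "dog": "animal",
    "cat": "animal",
    "terrier": "dog",
    "husky": "dog",
}

# Made-up conditional probabilities P(node | parent) for one detected object.
COND_PROB = {
    "animal": 0.95,
    "dog": 0.90,
    "cat": 0.10,
    "terrier": 0.55,
    "husky": 0.45,
}


def children(node):
    return [name for name, parent in PARENT.items() if parent == node]


def predict(root, cond_prob, threshold):
    """Follow the most likely child at every split and stop before the
    absolute probability falls below `threshold`."""
    node = root
    prob = cond_prob[root]
    while True:
        kids = children(node)
        if not kids:
            return node, prob  # reached a leaf such as a specific breed
        best = max(kids, key=lambda k: cond_prob[k])
        next_prob = prob * cond_prob[best]
        if next_prob < threshold:
            return node, prob  # not confident enough, keep the general label
        node, prob = best, next_prob


if __name__ == "__main__":
    print(predict("animal", COND_PROB, threshold=0.6))
    print(predict("animal", COND_PROB, threshold=0.4))
```

With the higher threshold the walk stops at “dog”, because neither breed is likely enough on its own. With the lower threshold it continues to the most likely breed.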

The first version and its architecture are described in this paper.

Here is a video presentation:

Unity Machine Learning Agents. Super awesome, possibly terrifying

Unity has released a new SDK that supports machine learning agents in the Unity game engine. This enables you to:

  • Study complex multi-agent behaviors in realistic competitive and cooperative scenarios. This is a lot safer than doing it with robots.
  • Study multi-agent behavior for industrial robotics, autonomous vehicles and other applications in a realistic environment.
  • Create intelligent agents for your games.

The benefit of this is that it will be a lot easier to develop and test learning algorithms that can later be used in real life. There is also a potential danger. In the same way that we can test industrial robots, autonomous vehicles and so on and then port them into the real world, we will inevitably see very smart AI agents that drive opponents in realistic war games. These can also be ported into the real world.

Since deep reinforcement learning has already beaten the best human players in a growing number of games, the more realistic the games get, the scarier it is to imagine what would happen if you plugged such an AI into fighter jets or autonomous tanks.

Pretend that you are teaching yourself from two weeks ago

This blog is not intended to draw crowds. In fact, this is my current visitor tsunami:

The purpose of this blog is to be my personal notebook, a tool that lets me remember “what was that link to that page now again?”. Also, if you want to learn, you have to teach others. If you have no platform to teach from, you can blog.

There has been a barrier for me when it comes to posting stuff online: I think that it needs to be perfect. One tends to imagine a certain visitor group and what they will think if you write this or that, or if you don’t know something that should be obvious. Lower the bar, I say. Imagine yourself from two weeks or a month ago and explain the stuff to him. He is all ears, and would actually benefit from what you have to say. Also, he tends to like the things you like. So my recommendation to you, younger self, is to start putting your thoughts into text, and don’t be afraid of what people will think.

Have a nice day.