You only look once

The “YOLO9000: Better, Faster, Stronger” paper describes improvements to the YOLO (You Only Look Once) architecture, which enables real-time object detection and classification. The resulting model can detect over 9000 object categories, and the paper reports that the improved architecture outperforms Faster R-CNN with ResNet and SSD while being significantly faster. The authors train jointly on the COCO dataset for detection and ImageNet for classification, and combine the labels in a WordTree hierarchy so that the model can fall back to “dog” when it cannot identify, for instance, a specific dog breed.
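To make the fallback idea concrete, here is a minimal sketch of how a prediction could walk down such a label hierarchy. The tiny tree, the probabilities and the 0.5 threshold are all made up for illustration, and `predict` is a hypothetical helper, not code from the paper. The idea is to follow the most likely child at each split and stop as soon as the accumulated probability gets too low.

```python
# Hypothetical toy hierarchy (parent -> children); the real WordTree is far larger.
TREE = {
    "physical object": ["animal", "vehicle"],
    "animal": ["dog", "cat"],
    "dog": ["Norfolk terrier", "Yorkshire terrier"],
}


def predict(cond_prob, root="physical object", threshold=0.5):
    """Follow the most likely child at each split.

    cond_prob maps a label to P(label | parent). The walk stops at the
    deepest node whose accumulated probability is still >= threshold.
    """
    node, prob = root, 1.0
    while node in TREE:
        best = max(TREE[node], key=lambda child: cond_prob.get(child, 0.0))
        next_prob = prob * cond_prob.get(best, 0.0)
        if next_prob < threshold:
            break  # not confident enough to go deeper: fall back to `node`
        node, prob = best, next_prob
    return node, prob


# Illustrative scores: clearly a dog, but the breed is uncertain.
scores = {
    "animal": 0.95, "vehicle": 0.05,
    "dog": 0.9, "cat": 0.1,
    "Norfolk terrier": 0.4, "Yorkshire terrier": 0.35,
}
label, prob = predict(scores)
print(label)  # dog
```

With a more confident breed score the same walk continues all the way down to the leaf, so the specific label is only used when the model has earned it.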

The first version and its architecture are described in this paper.

Here is a video presentation:

Unity Machine Learning Agents. Super awesome, possibly terrifying

Unity has released a new SDK that supports machine learning agents in the Unity game engine. This enables you to:

  • Study complex multi-agent behaviors in realistic competitive and cooperative scenarios. This is a lot safer than doing it with robots.
  • Study multi-agent behavior for industrial robotics, autonomous vehicles and other applications in a realistic environment.
  • Create intelligent agents for your games.

The benefit is that it will be a lot easier to develop and test learning algorithms that can later be used in real life. There is also a potential danger. In the same way that we can test industrial robots, autonomous vehicles, etc. and then port them into the real world, we will inevitably see very smart AI agents that drive opponents in realistic war games. These can also be ported into the real world.

Deep reinforcement learning has already beaten top human players in games such as Go and many Atari titles. The more realistic the games get, the scarier it is to imagine what would happen if you plugged such an AI into fighter jets or autonomous tanks.

Pretend that you are teaching yourself from two weeks ago

This blog is not intended to draw crowds. In fact, this is my current visitor tsunami:

The purpose of this blog is to be my personal notebook, a tool that lets me remember “what was that link to that page now again?”. Also, if you want to learn, you have to teach others. If you have no platform to teach from, you can blog. There has been a barrier for me to post stuff online, and that is that I think it needs to be perfect. One tends to imagine a certain visitor group and what they will think if you write this or that, or if you don’t know something that should be obvious. Lower the bar, I say. Imagine yourself from two weeks or a month ago and explain the stuff to him. He is all ears, and would actually benefit from what you have to say. Also, he tends to like the things you like. So my recommendation to you, younger self, is to start putting your thoughts into text, and don’t be afraid of what people will think.

Have a nice day.

Google Brain AMA 2017 TL;DR

The Google Brain team did an AMA (Ask Me Anything) on Reddit. This is the tl;dr:

  • They think PyTorch (made by people at Facebook) is great and that its authors did a good job with it. They also think it is good that many people build machine learning libraries, since you learn from each other when developing your own.
  • One of the hurdles in machine learning is making deep networks stable. Many of the new breakthroughs in ML, such as GANs and deep RL, have yet to have their ‘batch normalization’ moment (that one idea that makes everything work without having to fight it). Moving away from supervised learning will also be difficult. Another challenge is to make systems that solve many problems instead of one.
  • Geoffrey Hinton’s capsules are coming along fine. They have a paper at NIPS on it.
  • They talked about some failures and stuff that hadn’t worked.
  • Their work days involve a lot of reading papers.
  • They recommend using the highest-level API that solves your problem; then you get best practices for free.
  • The line between AI engineer and research scientist is blurry.
  • Give researchers access to more computation power and they will accomplish more.
  • PhD scientists go through the same interview pipeline as all devs
  • Robotics will benefit from the fact that we now have perception
  • A good way to learn is to read papers and re-implement them. If you want to learn a variety of ML topics, pick papers that cover different topics such as image classification, language modeling, GANs, etc. If you want to become an expert in one subfield, pick a bunch of related papers.
  • People are excited about: efficient large-scale optimization, building a theoretical foundation for deep learning, human/AI interaction, bridging the gap between the real world and simulation, imitation learning, generating long structured documents with long-term dependencies in them, and tools.
  • People from many different backgrounds are welcome, provided that they have an interest in AI/ML.
  • Learning tips: TensorFlow tutorials; Geoff Hinton’s Coursera course; Vincent Vanhoucke’s Udacity course; Kaggle, a great site with lots of ML competitions; Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville.
  • You should probably use a GAN if you want to generate samples of continuous valued data or if you want to do semi-supervised learning, and you should use a VAE or FVBN if you want to use discrete data or estimate likelihoods.
  • In biology and genomics, they are involved in a variety of research projects, such as predicting diabetic retinopathy status from fundus images, identifying cancerous cells in pathology images, and using deep learning to call genetic variants in next-generation DNA sequencing data. They even have a recently created Genomics team focused on applying TensorFlow, and extending it where necessary, to genomics problems. Other teams around Google and Alphabet, such as Google Accelerated Sciences, Verily Life Sciences, and Calico, also apply deep learning techniques to biological data.
  • Resources they like and recommend as complements: the Deep Learning textbook, The Elements of Statistical Learning, Hugo Larochelle’s online course, the deep learning summer series, and blog posts such as Sebastian Ruder’s blog.
  • You are welcome for TensorFlow.
  • They keep up with what’s happening in the field through: papers published in top ML conferences, Arxiv Sanity, the “My Updates” feature on Google Scholar, research colleagues pointing out and discussing interesting pieces of work, and interesting-sounding work discussed on Hacker News or the AMA’s own subreddit.

TensorFlow Object Detection API

Google has released an open-source framework built on top of TensorFlow, called the TensorFlow Object Detection API. It is a tool that makes it easy to build and deploy object detection models.

There are several types of state-of-the-art models you can build. If you, for instance, build models using the Single Shot Multibox Detector (SSD) with MobileNets, you get lightweight models that you can run in real time on mobile devices.

The models you get are the Single Shot Multibox Detector using MobileNets or Inception V2, Region-Based Fully Convolutional Networks with ResNet 101, and Faster R-CNN with ResNet 101 or Inception ResNet v2.

You also get a Jupyter notebook for trying things out.


If the terms above make no sense, you can read this excellent blog post by Joyce Xu explaining deep learning for object detection.