AI wins against the best professional Dota 2 players

OpenAI developed an AI that beats the best professional Dota 2 players in the world in 1-on-1 games. It does not use imitation learning or tree search. Instead, it learns by playing against a copy of itself, continuously improving. The game is very complicated, and if you coded the AI by hand you would probably end up with a fairly poor player. By letting the computer teach itself to play, it picks up a lot of tactics on its own.

Read more at:
https://blog.openai.com/dota-2/

Here are tactics it learned by itself:

DeepMind and Blizzard release StarCraft II as an AI research environment

AIs learning to play Atari games are very impressive, and beating Go champions was an eye-opener to the world. Now DeepMind, together with Blizzard, is releasing StarCraft II as an AI research environment.
It will be very interesting to see what happens and to try it out.

I have tried creating AI scripts for Age of Empires II (which is the best game ever, btw), and there are quite good scripts for it. They are, however, limited by the API that the scripting engine in AOE2 exposes: the script's rules are looped over again and again, and whenever a rule's condition is met, that rule is executed.
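To make the "loop over the rules, fire the ones whose condition holds" idea concrete, here is a minimal sketch in Python. It is not the AOE2 scripting language or its API; the Rule class, the rule names, the state keys and the resource costs are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]  # hypothetical game state: resource and unit counters


@dataclass
class Rule:
    name: str
    condition: Callable[[State], bool]
    action: Callable[[State], None]


def run_pass(rules: List[Rule], state: State) -> List[str]:
    """Evaluate every rule once, in order, and fire those whose condition holds."""
    fired = []
    for rule in rules:
        if rule.condition(state):
            rule.action(state)
            fired.append(rule.name)
    return fired


def build_house(state: State) -> None:
    state["wood"] -= 25      # illustrative cost
    state["housing"] += 5


def train_villager(state: State) -> None:
    state["food"] -= 50      # illustrative cost
    state["villagers"] += 1


RULES = [
    # Build a house when we can afford it and are about to run out of room.
    Rule("build-house",
         lambda s: s["wood"] >= 25 and s["housing"] - s["villagers"] < 2,
         build_house),
    # Train a villager when we can afford it and have room for one.
    Rule("train-villager",
         lambda s: s["food"] >= 50 and s["villagers"] < s["housing"],
         train_villager),
]

state = {"food": 120, "wood": 60, "villagers": 4, "housing": 5}
# The engine would repeat this forever; three passes are enough to see the pattern.
history = [run_pass(RULES, state) for _ in range(3)]
print(history)
print(state)
```

The script never plans ahead; it only reacts to whatever the conditions can see. That is why a script like this becomes predictable once you know its rules.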

In this case, you get half a million anonymized game replays, a machine learning API, and a connection between DeepMind's toolset and Blizzard's API.

It will be very interesting to see how deep learning takes this on.
I can imagine we will see pro-like reactions used against player tactics. When you are scripting an AI, for instance for AOE2, you need to take a whole bunch of tactics into account. And once you know how an AI script behaves, you can easily beat it. Even though the “new” AI script made for the newest releases of AOE2HD is considered very difficult, you can beat it by tower rushing it, making it impossible for the computer to gain an economic advantage since the towers keep it from gathering resources. The benefit of the AI is often that it can multitask.

I can imagine that with deep reinforcement learning the computer will generate tactics to counter pro gamers. I guess, however, that it will take a year or two before we see deep learning beat pro gamers.

I hope to see some very interesting games…

On the other hand, I am not sure it is such a good idea to put AI research effort into developing machine learning for war strategy.

Here is the paper.

Andrew Ng’s deeplearning.ai has released 5 deep learning specialization courses on Coursera

Andrew Ng announced that he has launched five new courses in deep learning on Coursera.

Each course takes 2-4 weeks of study, at 3-6 hours of study per week.

  1. Neural Networks and Deep Learning
  2. Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization
  3. Structuring Machine Learning Projects
  4. Convolutional Neural Networks
  5. Sequence Models

The courses will earn you a certificate and are described as follows:

In five courses, you will learn the foundations of Deep Learning, understand how to build neural networks, and learn how to lead successful machine learning projects. You will learn about Convolutional networks, RNNs, LSTM, Adam, Dropout, BatchNorm, Xavier/He initialization, and more. You will work on case studies from healthcare, autonomous driving, sign language reading, music generation, and natural language processing. You will master not only the theory, but also see how it is applied in industry. You will practice all these ideas in Python and in TensorFlow, which we will teach.

Feedback from humans to help machines learn

DeepMind and OpenAI collaborated on an interesting project where they showed how human feedback can help a deep learning algorithm learn by supplying the reward signal. The goal is to help improve AI safety by correcting wrong behavior through human intervention.

An example is helping a robot perform a backflip.
It learns through reinforcement learning, and sometimes it asks a human which of two alternatives is better. The human's choice is used to train a reward predictor, which is then used in the reinforcement learning process.

The idea is that the algorithm tries different behaviours and presents alternatives to the human, who chooses the one that comes closest to the goal of performing a backflip. The algorithm then continues on its own, generating its own reward estimates and learning from them, and later checks in with the human again to see how it has improved and which alternative is now the best. Training a robot to perform a backflip took 900 such inputs.

This method is helpful for situations where it is difficult to create a reward function.
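As a rough illustration of how a reward predictor can be fitted from pairwise human choices, here is a small Python sketch. It is a simplification, not the authors' implementation: the reward model here is linear over a hand-made feature vector per clip, the "human" is simulated by a hidden weight vector, and all names (reward, prefer_prob, train_reward_predictor, hidden) are made up. The probability that one clip is preferred over another is modelled as a logistic function of the difference in predicted reward, a Bradley-Terry style model.

```python
import math
import random


def reward(weights, features):
    """Predicted reward of a clip: a linear function of its features."""
    return sum(w * f for w, f in zip(weights, features))


def prefer_prob(weights, clip_a, clip_b):
    """Modelled probability that the human prefers clip_a over clip_b."""
    diff = reward(weights, clip_a) - reward(weights, clip_b)
    return 1.0 / (1.0 + math.exp(-diff))


def train_reward_predictor(comparisons, n_features, lr=0.1, epochs=200):
    """Fit the weights by stochastic gradient ascent on the log-likelihood
    of the recorded human choices."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for clip_a, clip_b, human_prefers_a in comparisons:
            p = prefer_prob(weights, clip_a, clip_b)
            error = (1.0 if human_prefers_a else 0.0) - p
            for i in range(n_features):
                weights[i] += lr * error * (clip_a[i] - clip_b[i])
    return weights


rng = random.Random(0)
hidden = [2.0, -1.0]  # assumption: stands in for what the human actually wants


def random_clip():
    return [rng.uniform(-1, 1), rng.uniform(-1, 1)]


# Simulate 100 answers to the question "which of these two clips is better?"
comparisons = []
for _ in range(100):
    a, b = random_clip(), random_clip()
    comparisons.append((a, b, reward(hidden, a) > reward(hidden, b)))

learned = train_reward_predictor(comparisons, n_features=2)
agreement = sum(
    (prefer_prob(learned, a, b) > 0.5) == pref for a, b, pref in comparisons
) / len(comparisons)
print(learned, agreement)
```

In the full setup, the agent would then be trained with ordinary reinforcement learning against the learned reward instead of a hand-written reward function, and new comparisons would be requested as its behaviour changes.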

This iterative approach to learning means that a human can spot and correct any undesired behaviours, a crucial part of any safety system. The design also does not put an onerous burden on the human operator, who only has to review around 0.1% of the agent’s behaviour to get it to do what they want. However, this can mean reviewing several hundred to several thousand pairs of clips, something that will need to be reduced to make it applicable to real world problems.

Read about it here:
The paper
A blog post from DeepMind
OpenAI’s blog post

Practical Deep Learning for coders (course.fast.ai)

This is, in my opinion, the best free course for getting into the state of the art in deep learning. It is a free 7-week learning experience taught by Jeremy Howard, a two-years-in-a-row Kaggle winner, entrepreneur and generally nice guy, and Rachel Thomas, a Forbes-featured math PhD, data scientist and full-stack developer: two amazing people in AI. Their approach to teaching deep learning for coders is that it should be accessible to as many people as possible, not just a select few. So instead of abstract, mathy lectures, they let you get your hands dirty from the first lecture and build your intuition for the field, enabling you to create state-of-the-art deep learning solutions from day one.

After starting the course, I immediately realized that these are very talented educators who are sincere about their goal of making AI accessible to everyone and making it benefit others. What I especially like about the course is the way they approach the topic pedagogically. Their method is inspired by the book “Making Learning Whole: How Seven Principles of Teaching can Transform Education” by David Perkins. Perkins compares today's education with learning baseball:

If you would learn baseball the way that math is taught, you would first learn about the shape of a parabola, and then you would learn the material science behind the stitchings in baseballs and so forth. And twenty years later after you have completed your PhD and post-doc, you would be taken to your first baseball game and you would be introduced to the rules of baseball. And then 10 years later you might get to hit. The way that in practice baseball is taught is we take the kid down to the baseball diamond and we say “These people are playing baseball, would you like to play?” And they say, “Yeah! Sure I would”. “Perfect, stand here, I’m gonna throw this. Hit it. Ok, great, now run. Good, you’re playing baseball”

That is why, in the first class of the course, they show you 7 lines of code that perform state-of-the-art image classification using deep learning, and that work for any image classification task you want as long as you structure your data the right way. You may not understand most of it at first, but as you adapt it to your own needs you will have to learn more of the details, and that is how you learn.

The course consists of a 2-hour lecture each week, detailed lecture notes, a community-contributed wiki and Jupyter notebooks, in which you will also do your assignments. (There are also setup instructions for getting a GPU-equipped machine up and running on AWS.) In the first week's assignment you will submit an entry to the Kaggle competition for classifying cat and dog images. By taking advantage of what you learn, you will outperform what was the state of the art when the competition was launched in 2013.

http://course.fast.ai/