Artificial Intelligence Competition Leaderboard

I have not seen this cool leaderboard for AI challenges before.
https://leaderboard.allenai.org/

There are a few similar, very interesting competition leaderboards for machine learning, such as Kaggle and Numerai. AI2 (the Allen Institute for AI) currently hosts four interesting NLP challenges.

Here is the description of one of the challenges:

OpenBookQA: Open Book Question Answering

OpenBookQA is a new kind of question-answering dataset modeled after open book exams for assessing human understanding of a subject. It consists of 5,957 multiple-choice elementary-level science questions (4,957 train, 500 dev, 500 test), which probe the understanding of a small “book” of 1,326 core science facts and the application of these facts to novel situations. For training, the dataset includes a mapping from each question to the core science fact it was designed to probe. Answering OpenBookQA questions requires additional broad common knowledge, not contained in the book. The questions, by design, are answered incorrectly by both a retrieval-based algorithm and a word co-occurrence algorithm. Strong neural baselines achieve around 50% on OpenBookQA, leaving a large gap to the 92% accuracy of crowd-workers.
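To put the numbers above (roughly 50% for strong neural baselines versus 92% for crowd workers) in context, here is a minimal Python sketch that loads a multiple-choice split and scores a random-guess baseline, which should land near 25%. The field names and the dev.jsonl filename are my assumptions about a typical multiple-choice JSONL layout, not the verified OpenBookQA schema, so check them against the actual download.

```python
import json
import random

# Hypothetical loader for an OpenBookQA-style JSONL file.
# Field names ("question", "choices", "label", "answerKey") are assumptions
# about the layout, not a verified schema -- check the actual download.
def load_questions(path):
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

def random_guess_accuracy(questions, seed=0):
    """Accuracy of picking one of the four answer choices at random."""
    rng = random.Random(seed)
    correct = 0
    for q in questions:
        labels = [c["label"] for c in q["question"]["choices"]]
        if rng.choice(labels) == q["answerKey"]:
            correct += 1
    return correct / len(questions)

if __name__ == "__main__":
    dev = load_questions("dev.jsonl")  # the 500 dev questions
    print(f"random-guess accuracy: {random_guess_accuracy(dev):.1%}")  # ~25%
```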

Google Dataset Search

This is quite cool. Google has released a search tool for finding datasets! https://toolbox.google.com/datasetsearch

You can, for instance, find world surface temperature data, real-time assessments of hybridization between wolves and dogs, lots of X-ray datasets, and data from breast cancer screenings.

Much of the data seems to come from research projects that have already applied various machine learning techniques to analyse it.

Now that we have much better machine learning methods, easy access to a lot of related data, and dramatically more compute power, we may well see quite a few improvements on older research results. I welcome this initiative and believe the world will become a better place as we collectively tackle the world's many problems with AI.


Train ImageNet in 18 minutes

Jeremy Howard and the team at fast.ai have achieved what one might consider a huge breakthrough in training deep learning models quickly.

They managed to train an ImageNet model in 18 minutes using publicly available resources that cost them only about $40 to run!

This was their method (a rough sketch of a couple of these ingredients follows after the link below):

  • fast.ai’s progressive resizing for classification, and rectangular image validation
  • NVIDIA’s NCCL with PyTorch’s all-reduce
  • Tencent’s weight decay tuning; a variant of Google Brain’s dynamic batch sizes, gradual learning rate warm-up (Goyal et al 2018, and Leslie Smith 2018).
  • ResNet-50 architecture
  • SGD with momentum.

http://www.fast.ai/2018/08/10/fastai-diu-imagenet/
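To make two of the ingredients above concrete, here is a minimal single-GPU PyTorch sketch of ResNet-50 trained with SGD + momentum and a gradual (linear) learning rate warm-up. This is an illustration with placeholder hyperparameters, not fast.ai's actual multi-GPU training script, which also relies on NCCL all-reduce, progressive resizing, and rectangular validation.

```python
import torch
import torch.nn as nn
from torchvision import models

# Illustration only: ResNet-50 + SGD with momentum + linear LR warm-up.
# Hyperparameters below are placeholders, not fast.ai's settings.
model = models.resnet50(num_classes=1000).cuda()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)

base_lr = 0.1
warmup_iters = 500  # placeholder: warm up over the first 500 iterations

def warmup_lr(it):
    """Linearly ramp the learning rate from ~0 to base_lr over warmup_iters steps."""
    scale = min(1.0, (it + 1) / warmup_iters)
    for group in optimizer.param_groups:
        group["lr"] = base_lr * scale

def train_step(it, images, labels):
    warmup_lr(it)
    optimizer.zero_grad()
    loss = criterion(model(images.cuda()), labels.cuda())
    loss.backward()
    optimizer.step()
    return loss.item()
```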

Deep Drive Dataset available

The Berkeley DeepDrive (BDD100K) dataset, a large dataset for teaching your algorithms to drive, can be downloaded from http://bdd-data.berkeley.edu/.

It contains over 100,000 HD video sequences, making up more than a thousand hours of footage, along with over 100,000 images annotated for object detection across ten categories: bus, traffic light, traffic sign, person, bike, truck, motor, car, train, and rider. It also includes annotations for segmentation, drivable areas, lane markings, and more.
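As a quick first look at the detection labels, here is a hedged Python sketch that counts how often each category appears in a label file. The file name and the "labels"/"category" field names are assumptions about the JSON layout rather than a verified schema, so adjust them to match the actual download.

```python
import json
from collections import Counter

# Hedged sketch: count per-category object annotations in a BDD-style
# label file. Field names and the file name are assumptions about the
# JSON layout -- adjust to the actual download.
def category_counts(label_path):
    with open(label_path, encoding="utf-8") as f:
        frames = json.load(f)  # assumed: a list of per-image records
    counts = Counter()
    for frame in frames:
        for obj in frame.get("labels", []):
            counts[obj["category"]] += 1
    return counts

if __name__ == "__main__":
    for category, n in category_counts("bdd100k_labels_val.json").most_common():
        print(f"{category:>15}: {n}")
```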

I love how data is released to the public for the greater good.

The new fast.ai Part 2 videos are available

The Fast and the Furious 2 of machine learning is now available for your pleasure.

http://course.fast.ai/part2.html

Fast.ai is the very best way to learn practical Deep Learning. Period.

The first iterations of courses 1 and 2 used Keras, while the new versions use fast.ai's own library built on top of PyTorch. Their new library is awesome and has a lot of useful best-practice functions.
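One example of the kind of best practice the library bundles is the learning-rate range test popularized by Leslie Smith. The sketch below shows the idea in plain PyTorch rather than fast.ai's own API; the model, data loader, and bounds are placeholders, and this is not the library's implementation.

```python
import torch

# Hedged, plain-PyTorch sketch of a learning-rate range test (after Leslie
# Smith): train for a few steps while exponentially increasing the LR and
# record the loss, then pick an LR just before the loss blows up.
def lr_range_test(model, criterion, loader, lr_min=1e-6, lr_max=1.0, steps=100):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr_min, momentum=0.9)
    gamma = (lr_max / lr_min) ** (1.0 / steps)  # multiplicative LR step per iteration
    history = []
    data_iter = iter(loader)
    for step in range(steps):
        try:
            images, labels = next(data_iter)
        except StopIteration:
            data_iter = iter(loader)
            images, labels = next(data_iter)
        lr = lr_min * gamma ** step
        for group in optimizer.param_groups:
            group["lr"] = lr
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        history.append((lr, loss.item()))  # plot lr vs. loss to choose a good LR
    return history
```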