This is quite cool. Google has released a search tool for finding datasets! https://toolbox.google.com/datasetsearch
You can, for instance, find world surface temperature data, a real-time assessment of hybridization between wolves and dogs, lots of x-ray datasets, data from breast cancer screenings, and more.
Much of the data seems to come from research projects that used various machine learning techniques to analyse it.
Now that we have much better machine learning tools, easy access to a lot of related data, and dramatically more compute power, we may well see quite a few improvements to older research results. I welcome this initiative and believe that the world will become a better place as we collectively solve the world’s many problems using AI.
GitHub user andri27-ts has put together material for learning Deep Reinforcement Learning in 60 days. If you find DeepMind’s breakthrough with AlphaGo Zero and OpenAI’s work on Dota 2 fascinating and want to learn how they work, the repository offers resources and project suggestions.
Jeremy Howard et al. at fast.ai have achieved what one might consider a huge breakthrough in training deep learning models quickly.
They managed to train ImageNet in 18 minutes using publicly available resources that only cost them $40 to run!
This was their method:
- fast.ai’s progressive resizing for classification, and rectangular image validation
- NVIDIA’s NCCL with PyTorch’s all-reduce
- Tencent’s weight decay tuning; a variant of Google Brain’s dynamic batch sizes, gradual learning rate warm-up (Goyal et al. 2018, and Leslie Smith 2018).
- ResNet-50 architecture
- SGD with momentum.
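To make a few of the ingredients above more concrete, here is a minimal Python sketch of gradual learning rate warm-up, a single SGD-with-momentum update, and a progressive resizing schedule. This is not fast.ai's actual code: the function names, the linear warm-up shape, the momentum value and the image sizes in the schedule are all assumptions chosen for illustration.

```python
def warmup_lr(step, warmup_steps, base_lr):
    """Gradual warm-up: ramp the learning rate linearly up to base_lr.

    The linear shape is an assumption; other ramps are possible.
    """
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr


def sgd_momentum_step(weights, grads, velocity, lr, momentum=0.9):
    """One SGD-with-momentum update on plain Python lists.

    The velocity accumulates past gradients; the weights move against it.
    """
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    new_weights = [w - lr * v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def image_size_for_epoch(epoch, schedule):
    """Progressive resizing: return the image size of the latest phase
    whose start epoch has been reached.

    `schedule` is a list of (start_epoch, image_size) pairs sorted by
    start_epoch. The values used below are hypothetical.
    """
    size = schedule[0][1]
    for start_epoch, phase_size in schedule:
        if epoch >= start_epoch:
            size = phase_size
    return size


if __name__ == "__main__":
    # Hypothetical schedule: small images first, larger ones later.
    schedule = [(0, 128), (10, 224)]
    weights, velocity = [1.0], [0.0]
    for step in range(3):
        lr = warmup_lr(step, warmup_steps=5, base_lr=0.1)
        weights, velocity = sgd_momentum_step(weights, [0.5], velocity, lr)
        print(step, lr, weights, image_size_for_epoch(step, schedule))
```

The idea is that training starts with a small learning rate and small images, which is cheap and stable, and only later moves to the full learning rate and larger images.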
Here is a nice collection of Deep Learning resources including tutorials, papers and courses. Enjoy:
Stanford has released a dataset intended to be used to improve the state of the art in x-ray image classification.
Download it here and enter your own submission to the challenge: https://stanfordmlgroup.github.io/competitions/mura/
The large dataset for teaching your algorithms to drive can be downloaded from http://bdd-data.berkeley.edu/.
It contains over 100,000 HD video sequences that make up over a thousand hours of footage. The data contains over 100,000 annotated images for object detection, covering bus, traffic light, traffic sign, person, bike, truck, motor, car, train, and rider. It also includes annotations for segmentation, drivable area, lane markings, etc.
I love how data is released to the public for the greater good.
The Fast and the Furious 2 of machine learning is now available for your pleasure.
Fast.ai is the very best way to learn practical Deep Learning. Period.
The first iterations of courses 1 and 2 used Keras, while the new versions use fast.ai’s own library built on top of PyTorch. The new library is awesome and has a lot of useful best-practice functions.
If you are interested in learning more about Data Science, you can check out the course page for the CS109 Data Science Course at Harvard University.
The topics covered include, among others:
- Web Scraping
- Regular Expressions
- Data Reshaping
- Data Cleanup
- Frequentist Statistics
- Bias and Regression
- SVM, Decision Trees, Random Forests
- Ensemble Methods
- Bayes’ Theorem and Bayesian Methods
- Interactive Visualization
- Deep Networks
“Thanks everyone for an amazing month of January. It’s been an inspiring, life-changing experience for me.” – Lex Fridman
Several more lecture recordings are soon to be released.
Here is the official webpage of the course: