Jeremy Howard and colleagues at fast.ai have achieved what many would consider a major breakthrough in training deep learning models quickly.
They trained a model on ImageNet in 18 minutes using publicly available cloud resources that cost only about $40 to run!
Their method combined:
- fast.ai’s progressive resizing for classification, and rectangular image validation
- NVIDIA’s NCCL with PyTorch’s all-reduce
- Tencent’s weight decay tuning, a variant of Google Brain’s dynamic batch sizes, and gradual learning rate warm-up (Goyal et al. 2018 and Leslie Smith 2018)
- ResNet-50 architecture
- SGD with momentum
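
To make a few of these ingredients more concrete, here is a minimal pure-Python sketch of three of them: gradual learning rate warm-up, a progressive resizing schedule, and a single SGD-with-momentum update. It is not fast.ai's code. The function names (`warmup_lr`, `image_size_for_epoch`, `sgd_momentum_step`), the linear warm-up shape, and every number in the demo (epoch boundaries, image sizes, learning rate) are illustrative assumptions. The momentum update uses the common "velocity accumulates gradients" form, which is one of several equivalent-in-spirit formulations.

```python
def warmup_lr(step, warmup_steps, base_lr):
    """Gradual warm-up: ramp the learning rate linearly up to base_lr.

    Assumption: a linear ramp over `warmup_steps` steps, then constant.
    """
    if step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps


def image_size_for_epoch(epoch, schedule):
    """Progressive resizing: pick the image side length for an epoch.

    `schedule` is a list of (first_epoch, size) pairs sorted by first_epoch.
    Training starts on small images and moves to larger ones later.
    """
    size = schedule[0][1]
    for first_epoch, phase_size in schedule:
        if epoch >= first_epoch:
            size = phase_size
    return size


def sgd_momentum_step(weight, grad, velocity, lr, momentum=0.9, weight_decay=0.0):
    """One SGD-with-momentum update for a single scalar weight.

    Weight decay is folded into the gradient as an L2 penalty term.
    Returns the updated (weight, velocity) pair.
    """
    grad = grad + weight_decay * weight
    velocity = momentum * velocity + grad
    weight = weight - lr * velocity
    return weight, velocity


if __name__ == "__main__":
    # Hypothetical schedule: values chosen for illustration only.
    schedule = [(0, 128), (10, 224), (20, 288)]
    for epoch in (0, 10, 25):
        print("epoch", epoch, "image size", image_size_for_epoch(epoch, schedule))

    # Minimise f(w) = w**2 (gradient 2*w) with warm-up and momentum.
    w, v = 1.0, 0.0
    for step in range(20):
        lr = warmup_lr(step, warmup_steps=5, base_lr=0.1)
        w, v = sgd_momentum_step(w, 2 * w, v, lr)
    print("final weight", w)
```

In a real multi-GPU run, the gradient passed to the update would first be averaged across workers with an all-reduce; that step is omitted here to keep the sketch dependency-free.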