
Self-Supervised Learning Tutorial (from the University of Oxford & DeepMind)

Self-Supervised Learning
Andrew Zisserman, July 2019
Slides from: Carl Doersch, Ishan Misra, Andrew Owens, AJ Piergiovanni, Carl Vondrick, Richard Zhang
The ImageNet Challenge Story … 1000 categories
• Training: 1000 images for each category
• Testing: 100k images
The ImageNet Challenge Story … strong supervision
The ImageNet Challenge Story … outcomes
Strong supervision:
• Features from networks trained on ImageNet can be used for other visual tasks, e.g. detection, segmentation, action recognition, fine-grained visual classification
• To some extent, any visual task can now be solved by:
1. Constructing a large-scale dataset labelled for that task
2. Specifying a training loss and a neural network architecture
3. Training the network and deploying it
• Are there alternatives to strong supervision for training? Self-supervised learning …
Why Self-Supervision?
1. The expense of producing a new dataset for each new task
2. Some areas are supervision-starved, e.g. medical data, where annotation is hard to obtain
3. Vast numbers of unlabelled images/videos remain untapped
– Facebook: one billion images uploaded per day
– 300 hours of video are uploaded to YouTube every minute
4. How infants may learn …
Self-Supervised Learning
The Scientist in the Crib: What Early Learning Tells Us About the Mind, by Alison Gopnik, Andrew N. Meltzoff and Patricia K. Kuhl
The Development of Embodied Cognition: Six Lessons from Babies, by Linda Smith and Michael Gasser
What is Self-Supervision?
• A form of unsupervised learning where the data provides the supervision
• In general: withhold some part of the data, and task the network with predicting it
• The task defines a proxy loss, and the network is forced to learn what we really care about, e.g. a semantic representation, in order to solve it
• In recent work we might also choose tasks that we care about …
Example: relative positioning
Train a network to predict the relative position of two regions in the same image: randomly sample a patch, then sample a second patch from one of the 8 possible neighbouring locations, and classify which of the 8 it came from.
[Figure: two CNNs with shared weights process the patch pair; a classifier head predicts the relative position (8 possible locations)]
Unsupervised Visual Representation Learning by Context Prediction, Carl Doersch, Abhinav Gupta, Alexei A. Efros, ICCV 2015
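The patch-pair sampling behind this pretext task can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: the function name `sample_patch_pair` and the parameters `patch_size`, `gap`, and `jitter` are assumptions chosen here for clarity (the paper uses a gap and random jitter between patches to prevent the network from exploiting trivial low-level cues such as boundary continuity).

```python
import numpy as np

# The 8 neighbour offsets around the anchor patch, indexed 0..7
# (row offset, column offset). The index is the classification target.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           ( 0, -1),          ( 0, 1),
           ( 1, -1), ( 1, 0), ( 1, 1)]

def sample_patch_pair(image, patch_size=96, gap=48, jitter=7, rng=None):
    """Sample an anchor patch and one of its 8 neighbours (hypothetical sketch).

    Returns (anchor, neighbour, label), where label in 0..7 encodes the
    neighbour's position relative to the anchor.
    """
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    step = patch_size + gap            # centre-to-centre spacing of the 3x3 grid
    margin = step + jitter             # keep every neighbour inside the image
    ay = int(rng.integers(margin, h - margin - patch_size))
    ax = int(rng.integers(margin, w - margin - patch_size))
    label = int(rng.integers(8))
    dy, dx = OFFSETS[label]
    # Small random jitter so the relative position cannot be read off
    # from exact patch alignment.
    jy = int(rng.integers(-jitter, jitter + 1))
    jx = int(rng.integers(-jitter, jitter + 1))
    ny, nx = ay + dy * step + jy, ax + dx * step + jx
    anchor = image[ay:ay + patch_size, ax:ax + patch_size]
    neighbour = image[ny:ny + patch_size, nx:nx + patch_size]
    return anchor, neighbour, label
```

In training, both patches are fed through two CNNs with shared weights, their features are concatenated, and a small classifier head predicts the 8-way label; no human annotation is needed, since the label comes from the sampling itself.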