R-CNNs : combine region proposals with CNNs
we focused on two problems: localizing objects with a deep network and training a high-capacity model with only a small quantity of annotated detection data.
Pedestrian detection with unsupervised multi-stage feature
learning. (fine-tune, unsupervised pre-train)
supervised pre-training on a large auxiliary dataset (ILSVRC), followed by domain-specific fine-tuning on a small dataset (PASCAL), is an effective paradigm for learning high-capacity CNNs when data is scarce.
Modules:
- category-independent region proposals(selective search)
- CNNs
- linear SVM
intersection-over-union(IoU)
Training:
1) supervised pre-train
2) finetune on specific categories. (IoU > 0.5 as positive)
We start SGD at
a learning rate of 0.001 (1/10th of the initial pre-training
rate),
3) extract features from CNNs, Iou < 0.3 as negatives , ->linear SVM
Feature
Histograms of sparse codes for object detection. In CVPR, 2013
DPM