1. Introduction
- DCNNs (Deep Convolutional Neural Networks) trained end-to-end for tasks such as image classification, object detection, and fine-grained categorization outperform approaches that rely on engineered representations such as SIFT or HOG features
- Two technical challenges
  - Signal downsampling
    - Repeated max-pooling and downsampling ('striding') reduce signal resolution
  - Spatial 'insensitivity' (invariance)
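The resolution loss from the first challenge can be sketched in a few lines: a minimal illustration (plain Python, the stride count is an assumption based on a typical VGG-style network with five stride-2 pooling stages) of how repeated pooling shrinks the spatial size of the feature map.

```python
def output_size(size: int, n_pool: int, stride: int = 2) -> int:
    """Spatial size after n_pool pooling layers with the given stride.

    Each stride-2 pooling stage halves the resolution, so a 224x224
    input collapses to a very coarse score map after five stages.
    """
    for _ in range(n_pool):
        size = size // stride
    return size

print(output_size(224, 5))  # 224 -> 112 -> 56 -> 28 -> 14 -> 7
```

A 32x reduction like this is why a naive per-pixel labeling from the final layer is too coarse for segmentation.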
2. Related Work
3. Convolutional Neural Networks for Dense Image Labeling
3.1. Efficient Dense Sliding Window Feature Extraction with the Hole Algorithm
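The 'hole' (atrous) algorithm inserts zeros ('holes') between filter taps so the filter covers a larger extent without extra parameters and without further downsampling. A minimal 1-D sketch (NumPy, correlation form; the function name and interface are illustrative, not the paper's implementation):

```python
import numpy as np

def atrous_conv1d(x: np.ndarray, w: np.ndarray, rate: int) -> np.ndarray:
    """1-D atrous ('hole') convolution, valid padding.

    The filter taps are spaced `rate` samples apart, so the effective
    receptive field grows from k to (k - 1) * rate + 1 while the number
    of parameters and the output sampling density stay the same.
    """
    k = len(w)
    span = (k - 1) * rate + 1          # effective filter extent
    out = np.zeros(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(w[j] * x[i + j * rate] for j in range(k))
    return out

x = np.arange(10, dtype=float)
w = np.array([1.0, 1.0, 1.0])
print(atrous_conv1d(x, w, rate=1))  # ordinary 3-tap convolution
print(atrous_conv1d(x, w, rate=2))  # same 3 taps, receptive field of 5
```

With rate=2 each output still sums three input samples, but they are drawn from a window of width 5, which is how dense score maps are computed at full stride without retraining the filters.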
3.2. Controlling the Receptive Field Size and Accelerating Dense Computation with Convolutional Nets
4. Detailed Boundary Recovery: Fully-Connected Conditional Random Fields and Multi-Scale Prediction
4.1. Deep Convolutional Networks and the Localization Challenge
4.2. Fully-Connected Conditional Random Fields for Accurate Localization
4.3. Multi-Scale Prediction
5. Experimental Evaluation
6. Discussion