Dog classification using CNN transfer learning

This is a project I did for the ECE 539 class. The results turned out well, so I am sharing them here.

Introduction

The primary task of this project is to use a convolutional neural network (CNN) to identify dog breeds based on their images.

The Dataset

This project uses the Stanford Dogs Dataset, which contains 20,580 images from 120 breeds. With only about 170 images per breed on average, the dataset is too small to train a deep network from scratch, so I used a transfer-learning strategy instead.

Baseline Result

The Stanford group that collected this dataset also published a baseline: their learning algorithm reached a mean accuracy of up to 22% when trained on 100 images per breed. My training set is approximately the same size, so this result is a fair point of comparison.
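
A split like the one the baseline describes can be sketched as follows. This is a minimal illustration, not the code used in the project: the function name `split_per_class`, the `(path, label)` layout, and the toy file names are all assumptions made for the example. If exactly 100 images per breed were taken for all 120 breeds, this would give 12,000 training images and leave 8,580 for testing.

```python
import random
from collections import defaultdict


def split_per_class(samples, n_train=100, seed=0):
    """Split (path, label) pairs so each label contributes n_train training items.

    Hypothetical helper for illustration; everything left over goes to the test set.
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        paths = sorted(by_label[label])
        rng.shuffle(paths)
        train += [(p, label) for p in paths[:n_train]]
        test += [(p, label) for p in paths[n_train:]]
    return train, test


# Toy data standing in for the real image list (file names are made up).
samples = [
    (f"{breed}/{i:03d}.jpg", breed)
    for breed in ("beagle", "papillon")
    for i in range(150)
]
train, test = split_per_class(samples, n_train=100)
print(len(train), len(test))
```

Splitting per breed, rather than over the whole image list at once, keeps the number of training images equal across classes, which matches the "100 images per class" setup of the baseline.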

Problem Analysis

The dataset poses two main challenges:

  • The images are not cropped around the dogs. Backgrounds vary widely, and the dogs appear in many different poses. This is the major challenge of the image set.
  • The characteristics that separate classes differ from one pair of breeds to another. For example, the basset hound and the bloodhound have very similar facial features but differ significantly in color, while the Japanese spaniel and the papillon have very similar coloring but very different facial features.

Training Result

I used both the VGG-16 and VGG-19 networks as feature extractors, taking the activations of the ‘fc7’ layer from each. The two 4096-dimensional outputs are concatenated into a single feature vector of dimension 8192, and an SVM is used as the classifier.
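
The feature fusion and classifier can be sketched roughly as below. This is only an illustration under stated assumptions, not the project's actual code: it assumes the fc7 activations have already been extracted into arrays, uses random synthetic data in their place, and picks scikit-learn's `LinearSVC` with feature standardization as one possible SVM setup (the kernel and parameters used in the project are not stated). The helper names `fuse_features`, `train_svm`, and `synthetic_fc7` are hypothetical.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

FC7_DIM = 4096  # width of the fc7 layer in both VGG-16 and VGG-19


def fuse_features(fc7_vgg16, fc7_vgg19):
    """Concatenate per-image fc7 activations from the two networks."""
    fc7_vgg16 = np.asarray(fc7_vgg16, dtype=np.float32)
    fc7_vgg19 = np.asarray(fc7_vgg19, dtype=np.float32)
    if fc7_vgg16.shape != fc7_vgg19.shape or fc7_vgg16.shape[1] != FC7_DIM:
        raise ValueError("expected two arrays of shape (n_images, 4096)")
    return np.concatenate([fc7_vgg16, fc7_vgg19], axis=1)  # (n_images, 8192)


def train_svm(features, labels):
    """Fit a linear SVM on standardized features (an assumed configuration)."""
    clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0))
    clf.fit(features, labels)
    return clf


def synthetic_fc7(centers, labels, rng):
    """Stand-in for real fc7 activations: one noisy cluster per class."""
    return centers[labels] + 0.5 * rng.normal(size=(labels.size, FC7_DIM))


rng = np.random.default_rng(0)
n_classes = 3
centers16, centers19 = rng.normal(size=(2, n_classes, FC7_DIM))
train_labels = np.repeat(np.arange(n_classes), 15)
test_labels = np.repeat(np.arange(n_classes), 5)

train_x = fuse_features(synthetic_fc7(centers16, train_labels, rng),
                        synthetic_fc7(centers19, train_labels, rng))
test_x = fuse_features(synthetic_fc7(centers16, test_labels, rng),
                       synthetic_fc7(centers19, test_labels, rng))

clf = train_svm(train_x, train_labels)
test_accuracy = clf.score(test_x, test_labels)
print(train_x.shape, test_accuracy)
```

Because the pretrained networks are used only to produce fixed features, the SVM is the only part that has to be trained on the dog images, which is what makes this approach workable with a small dataset.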

With this setup, an overall accuracy of 74% is achieved. The confusion matrix is shown below.

The confusion matrix at max accuracy
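
For readers who want to reproduce this kind of evaluation, the sketch below shows how a confusion matrix and two accuracy figures can be computed from predicted labels. It is an illustration with made-up labels, not the project's evaluation code, and the function names are hypothetical. It computes both the overall accuracy (correct predictions over all images) and the mean per-class accuracy, since the baseline is quoted as a mean accuracy and the two can differ when classes have different numbers of test images.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_per_class_accuracy(matrix):
    """Average of per-class recall, skipping classes with no samples."""
    recalls = [row[i] / sum(row) for i, row in enumerate(matrix) if sum(row) > 0]
    return sum(recalls) / len(recalls)


# Toy labels for three classes (made up for the example).
true_labels = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
predicted_labels = [0, 0, 0, 1, 1, 1, 2, 2, 0, 1]

matrix = confusion_matrix(true_labels, predicted_labels, n_classes=3)
print(matrix)                           # [[3, 1, 0], [0, 2, 0], [1, 1, 2]]
print(overall_accuracy(matrix))         # 0.7
print(mean_per_class_accuracy(matrix))  # 0.75
```

Off-diagonal entries show which breeds are mistaken for which, so large values there point to pairs of visually similar breeds.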

The training results include the VGG-16 and VGG-19 models, so the files are too large for GitHub. They can be downloaded via OneDrive. The original dataset is not included in the download.

References

  1. Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. http://www.deeplearningbook.org.
  2. Aditya Khosla, Nityananda Jayadevaprakash, Bangpeng Yao, and Li Fei-Fei. Novel dataset for fine-grained image categorization. In First Workshop on Fine-Grained Visual Categorization, IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, June 2011.
  3. Aditya Khosla, Nityananda Jayadevaprakash, Bangpeng Yao, and Li Fei-Fei. Stanford dogs dataset. Technical report, Stanford University, 2011. http://vision.stanford.edu/aditya86/ImageNetDogs/.
  4. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc., 2012.
