Thierry Pinheiro Moreira
E-mail: thierrypin [at] liv [dot] ic [dot] unicamp [dot] br
Rafael de Oliveira Werneck
Website: Rafael Werneck
E-mail: rafael [dot] werneck [at] ic [dot] unicamp [dot] br
Mauricio Lisboa Perez
Website: Mauricio Perez
E-mail: mauricio [dot] perez [at] students [dot] ic [dot] unicamp [dot] br
Eduardo Valle
Website: Eduardo Valle
E-mail: dovalle [at] dca [dot] fee [dot] unicamp [dot] br
We acquired the Flickr-dog dataset [6] by selecting dog photos from Flickr available under Creative Commons licenses. We cropped the dog faces, rotated them to align the eyes horizontally, and resized them to 250 × 250 pixels.
We selected dogs from two breeds: pugs and huskies. We chose those breeds to represent different degrees of difficulty: we expected pugs to be hard to identify, and huskies to be easy. For each breed, we found 21 individuals, each with at least 5 photos. We labeled the individuals by interpreting the picture metadata (user, title, description, timestamps, etc.), and double-checked the labels by visually identifying the dogs ourselves.
Altogether, the Flickr-dog dataset has 42 classes and 374 photos.
The protocol was stratified k-fold cross-validation, which splits the dataset into k folds while preserving the class proportions among the folds as closely as possible. We used 10 folds: in each round, nine folds were used for training and the remaining one for testing.
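As a rough illustration of how such a split can be built, the sketch below deals the samples of each class round-robin over the folds, so that every class is spread as evenly as possible. It is a minimal sketch and not the code used for the experiments: the function name `stratified_k_fold`, the `seed` parameter, and the toy labels are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs, one pair per fold.

    Hypothetical helper for illustration; labels must be sortable
    (e.g. integer class ids).
    """
    rng = random.Random(seed)

    # Group the sample indices by class.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Deal each class round-robin over the folds. The fold counter keeps
    # running from one class to the next, so fold sizes also stay even.
    folds = [[] for _ in range(k)]
    next_fold = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for index in indices:
            folds[next_fold % k].append(index)
            next_fold += 1

    # Each fold is the test set once; the other k - 1 folds are the training set.
    for test_fold in range(k):
        test = sorted(folds[test_fold])
        train = sorted(
            i for f, fold in enumerate(folds) if f != test_fold for i in fold
        )
        yield train, test


# Toy labels (made up for the example): 4 classes with 5, 6, 7 and 8 samples.
toy_labels = [c for c in range(4) for _ in range(5 + c)]
for train, test in stratified_k_fold(toy_labels, k=10):
    print(len(train), len(test))
```

Note that a class with fewer than k samples cannot appear in the test set of every fold, so with 10 folds the per-fold class proportions can only be approximately preserved.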
Our main metric is the balanced average accuracy, which is the arithmetic mean of the per-class accuracies, so that every individual weighs the same regardless of how many photos it has. We also employ confusion matrices for detailed analyses of the results.
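The two measures are closely related: the per-class accuracy is the diagonal entry of the confusion matrix divided by its row sum. The sketch below shows this on made-up predictions. The function names are hypothetical, and skipping classes that have no test sample is an assumption of the example, not something stated in the protocol.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(true_labels, predicted_labels):
        matrix[true_label][predicted_label] += 1
    return matrix


def balanced_accuracy(matrix):
    """Arithmetic mean of the per-class accuracies.

    Assumption: classes with no test sample (empty rows) are skipped.
    """
    per_class = [
        row[c] / sum(row) for c, row in enumerate(matrix) if sum(row) > 0
    ]
    return sum(per_class) / len(per_class)


# Made-up example: class 0 has four test samples, class 1 has two.
true_labels = [0, 0, 0, 0, 1, 1]
predicted = [0, 0, 0, 1, 1, 0]

matrix = confusion_matrix(true_labels, predicted, num_classes=2)
print(matrix)                     # [[3, 1], [1, 1]]
print(balanced_accuracy(matrix))  # (3/4 + 1/2) / 2 = 0.625
```

In this example the plain accuracy is 4/6, while the balanced accuracy is 0.625, because the errors on the smaller class count for more.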
For the retrieval experiment, we employ top-k recall: for each test image, we rank the classification scores of all classes, and count the test as successful if the correct class is among the k highest scores.
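A minimal sketch of this measure follows. The function name `top_k_recall` and the score values are made up for the example. Ties between scores are broken by class index here, which is an assumption of the sketch.

```python
def top_k_recall(scores, true_labels, k):
    """Fraction of test samples whose true class is among the k best scores.

    scores[i][c] is the classification score of class c for test sample i.
    """
    hits = 0
    for row, true_label in zip(scores, true_labels):
        # Class indices sorted from the highest to the lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Made-up scores for three test samples and three classes.
scores = [
    [0.1, 0.7, 0.2],  # true class 1, ranked first
    [0.5, 0.3, 0.2],  # true class 1, ranked second
    [0.2, 0.3, 0.5],  # true class 0, ranked third
]
true_labels = [1, 1, 0]

for k in (1, 2, 3):
    print(k, top_k_recall(scores, true_labels, k))
```

With k = 1 the measure reduces to plain accuracy, and with k equal to the number of classes it is always 1.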