This dataset does not exist: training models from generated images

Victor Besnier Himalaya Jain Andrei Bursuc Matthieu Cord Patrick Pérez

ICASSP 2020

Abstract

Current generative networks are increasingly proficient in generating high-resolution realistic images. These generative networks, especially the conditional ones, can potentially become a great tool for providing new image datasets. This naturally brings the question: Can we train a classifier only on the generated data? This potential availability of nearly unlimited amounts of training data challenges standard practices for training machine learning models, which have been crafted across the years for limited and fixed size datasets. In this work we investigate this question and its related challenges. We identify ways to improve significantly the performance over naive training on randomly generated images with regular heuristics. We propose three standalone techniques that can be applied at different stages of the pipeline, i.e., data generation, training on generated data, and deploying on real data. We evaluate our proposed approaches on a subset of the ImageNet dataset and show encouraging results compared to classifiers trained on real images.

Results

Effect of applying HSM at different iterations during classifier training. The first image of each category is sampled randomly. The other two images are generated from HSM-computed codes at two different steps during training. The difference between the images shows that the effect of HSM is specific to the classifier.

Results for ImageNet-10 real test images. Performance of classifiers trained on generated images with all combinations of the proposed methods. Each classifier is trained for $150$ epochs (except Long training, where we let DS run for $150$ epochs) over a set of $N = 13K$ images; in case of continuous sampling we replace $50\%$ (i.e., $6,500$) of the images every epoch, while fixed dataset is the usual setup where no images are replaced during training. In all setups we use $N$ images per epoch. First column, without applying any of the proposed methods, is the baseline. Each of the proposed methods individually shows improvement over the baseline. The combination of the methods further improves the results.

Effect of replacement fraction $r$ in DS. Classification accuracy using DS on real images with varying $r$, i.e., fraction of the dataset being replaced with new images every epoch. The figure shows plots for DS with and without BNA.

BibTeX

@inproceedings{besnier2020dataset,
   title={This dataset does not exist: training models from generated images},
   author={Besnier, Victor and Jain, Himalaya and Bursuc, Andrei and Cord, Matthieu and P{\'e}rez, Patrick},
   booktitle={ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
   pages={1--5},
   year={2020},
   organization={IEEE}
}