Diverse Image Generation via Self-Conditioned GANs

Despite the remarkable progress in Generative Adversarial Networks (GANs), unsupervised models fail to generalize to diverse datasets, such as ImageNet or Places365. To tackle such datasets, we rely on class-conditional GANs, which require class labels to train. These labels are often not available or are expensive to obtain.

We propose to increase unsupervised GAN quality by inferring class labels in a fully unsupervised manner. By periodically clustering already present discriminator features, we improve generation quality on large-scale datasets such as ImageNet and Places365. Besides increasing generation quality, we also automatically infer semantically meaningful clusters.

Visualizing Inferred Clusters

Our method is able to infer meaningful clusters on ImageNet and Places365. We visualize some of the inferred clusters and the generated samples conditioned on each cluster below.

The following webpages visualize other clusters on Places365 and on ImageNet.

Visualizing Sample Diversity

We visualize sample diversity by showing for each true class, the samples that a classifier has highest confidence in.

The following webpages visualize additional examples on on Places365 and on ImageNet.

Image Reconstructions

We visualize image reconstructions for an unconditional GAN and for a self-conditioned GAN trained on Places365.

Additional reconstructions of Places365 images can be found here.

Visualizing Generator Neurons

We observe that class-conditional models exhibit strong unit sharing. Here, we show examples of self-conditioned and class-conditioned GAN units corresponding to different concepts for different conditions, and units which correspond to the same concept across conditions.

We thank Phillip Isola, Bryan Russell, Richard Zhang, and our anonymous reviewers for their helpful comments. We are grateful for the support from the DARPA XAI program FA8750-18-C000, NSF 1524817, NSF BIGDATA 1447476, and a GPU donation from NVIDIA. This website template is borrowed from the GAN-seeing website.