How did GANs change the way machine learning works?
The history of deep learning is somewhat unusual. Many techniques, such as convolutional neural networks, were invented in the 1980s and only made a comeback some 20 years later. Unlike these revived methods, Generative Adversarial Networks (GANs) emerged as a genuinely new idea, and one of the most innovative techniques in deep learning of the past decade.
GANs were first proposed by Goodfellow et al. [1] at the University of Montreal. The framework consists of a generator and a discriminator, with the generator attempting to create realistic data to fool the discriminator, while the discriminator learns to differentiate between real and generated data. The two networks compete against each other, pushing each other to improve.
Core Concept of GANs
GANs consist of a generator that maps a latent space to a data distribution and a discriminator that distinguishes real data from generated data. The generator’s goal is to trick the discriminator, while the discriminator works to accurately classify data as real or generated.
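This adversarial objective is the two-player minimax game of the original paper [1]:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big] +
  \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Here \(D(x)\) is the discriminator's estimate of the probability that \(x\) is real, and \(G(z)\) maps a latent sample \(z \sim p_z\) to data space; the discriminator maximizes \(V\) while the generator minimizes it.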
GANs are designed not to memorize training data but to generate novel samples that resemble the training distribution. The adversarial setup is a two-player game whose solution is an equilibrium at which neither network can improve by changing its strategy alone.
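To make the equilibrium concrete, here is a small illustrative NumPy sketch (not code from the original paper): with the theoretically optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_g(x)) from Goodfellow et al. [1], the value of the game is near 0 when the generator is easy to tell apart from the data, and reaches its equilibrium value of −log 4 ≈ −1.386 exactly when p_g = p_data.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def value(mu_data, sigma_data, mu_g, sigma_g, n=100_000):
    """Monte Carlo estimate of V(D*, G) for 1-D Gaussian p_data and p_g,
    using the optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_g(x))."""
    x_real = rng.normal(mu_data, sigma_data, n)
    x_fake = rng.normal(mu_g, sigma_g, n)

    def d_star(x):
        p_d = gaussian_pdf(x, mu_data, sigma_data)
        p_g = gaussian_pdf(x, mu_g, sigma_g)
        return p_d / (p_d + p_g)

    return np.mean(np.log(d_star(x_real))) + np.mean(np.log(1.0 - d_star(x_fake)))

# A poor generator is easy to discriminate, so V(D*, G) is close to 0.
print(value(0.0, 1.0, 5.0, 1.0))
# At equilibrium p_g = p_data, D* = 1/2 everywhere and V = -log 4 ≈ -1.386.
print(value(0.0, 1.0, 0.0, 1.0))
```

The gap between these two values is (twice) the Jensen–Shannon divergence between the data and generator distributions, which is what the generator implicitly minimizes.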
Challenges in GAN Training
Achieving and maintaining this equilibrium in practice is difficult. Moreover, unlike traditional deep learning methods with a single loss to monitor, there is no straightforward way to validate that the generator has learned a realistic data distribution.
The performance of GANs depends heavily on the amount of training data and on network depth, which can demand computational resources beyond what is currently practical for many users.
Applications of GANs
GANs have broad applications in art, fashion, advertising, science, and video games. However, they have also been used maliciously, such as creating fake social media profiles with synthesized images.
Key Developments in GANs
- Deep Convolutional GANs (DCGANs): Replace multilayer perceptrons with CNNs, using transposed convolutions ("deconvolutions") and batch normalization to improve performance.
- Improved GANs: Introduces minibatch discrimination, virtual batch normalization, and feature matching.
- LAPGAN: Generates higher-resolution images using a Laplacian pyramid.
- CycleGAN: Performs image-to-image translation without aligned pairs, leveraging adversarial and cycle consistency loss.
- PacGAN: Mitigates mode collapse by letting the discriminator base its decisions on multiple samples at once.
- SAGAN: Utilizes self-attention and spectral normalization for improved image generation.
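As a concrete illustration of the cycle consistency idea behind CycleGAN, here is a toy NumPy sketch (hypothetical 1-D linear "translators" stand in for the paper's convolutional generators): a translation X → Y → X should return to where it started, and the cycle loss measures how far it misses.

```python
import numpy as np

# Hypothetical 1-D "translators" standing in for CycleGAN's generator
# networks G : X -> Y and F : Y -> X (the paper uses convolutional nets).
def G(x):
    return 2.0 * x + 1.0

def F(y):
    return 0.5 * (y - 1.0)

def cycle_consistency_loss(x, y):
    # L_cyc(G, F) = E[ |F(G(x)) - x| ] + E[ |G(F(y)) - y| ]  (L1 norm)
    return np.mean(np.abs(F(G(x)) - x)) + np.mean(np.abs(G(F(y)) - y))

x = np.array([0.5, -1.25, 2.0])   # samples from domain X
y = np.array([1.5, -0.5, 3.0])    # samples from domain Y
# Here F inverts G exactly, so the cycle loss is (numerically) zero.
print(cycle_consistency_loss(x, y))
```

In the actual method this term is added to the two adversarial losses, discouraging the pair of generators from mapping all inputs to a single plausible-looking output.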
Open Challenges
GANs struggle with generating discrete data and measuring the uncertainty of a trained generative network. Addressing these limitations remains an area of active research.
References
[1] I. Goodfellow et al., “Generative adversarial nets,” Advances in Neural Information Processing Systems, 2014.
[2] A. Radford et al., “Unsupervised representation learning with deep convolutional generative adversarial networks,” arXiv preprint, 2015.
[3] T. Salimans et al., “Improved techniques for training GANs,” Advances in Neural Information Processing Systems, 2016.
[4] E. L. Denton et al., “Deep generative image models using a Laplacian pyramid of adversarial networks,” Advances in Neural Information Processing Systems, 2015.
[5] T. Karras et al., “Progressive growing of GANs for improved quality, stability, and variation,” arXiv preprint, 2017.
[6] J.-Y. Zhu et al., “Unpaired image-to-image translation using cycle-consistent adversarial networks,” Proceedings of the IEEE International Conference on Computer Vision, 2017.
[7] Z. Lin et al., “PacGAN: The power of two samples in generative adversarial networks,” Advances in Neural Information Processing Systems, 2018.
[8] H. Zhang et al., “Self-attention generative adversarial networks,” arXiv preprint, 2018.