- Neural Style Transfer
- Generative Adversarial Networks (GAN)
Neural Style Transfer in a nutshell
- Reproduce an image with a new artistic style provided by another image.
- Blend a content image and a style reference image in a stylized output image.
- First described in A Neural Algorithm of Artistic Style by Gatys et al. (2015). Many refinements and variations since.
Example (Prisma app)
As always: loss minimization.
The content loss
- Content = high-level structure of an image.
- Can be captured by the upper layers of a convolutional neural network.
- Content loss for a layer = distance between the feature maps of the content and generated images.
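The content loss above can be sketched in NumPy (an illustrative sketch: the function name is hypothetical, and plain arrays stand in for the convnet feature maps):

```python
import numpy as np

def content_loss(content_features, generated_features):
    # Both inputs: (height, width, channels) feature maps extracted
    # from the same upper convnet layer for the two images.
    # Distance = sum of squared differences between the activations.
    return np.sum(np.square(generated_features - content_features))
```

Identical feature maps give a loss of zero; the further the generated image drifts from the content image's high-level structure, the larger the loss.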
The style loss
- Style = low-level features of an image (textures, colors, visual patterns).
- Can be captured by using correlations across the different feature maps (filter responses) of a convnet.
- Feature correlations are computed via a Gram matrix (outer product of the feature maps for a given layer).
- Style loss for a layer = distance between the Gram matrices of the feature maps for the style and generated images.
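A NumPy sketch of the Gram matrix and the per-layer style loss (function names are assumptions; the 1/(4N²M²) normalization factor is the one used by Gatys et al., with N = number of channels and M = number of spatial positions):

```python
import numpy as np

def gram_matrix(features):
    # features: (height, width, channels) feature maps from one layer.
    h, w, c = features.shape
    flat = features.reshape(h * w, c)  # each row = one spatial position
    # (channels, channels) matrix of correlations between filter responses
    return flat.T @ flat

def style_loss(style_features, generated_features):
    h, w, c = style_features.shape
    s = gram_matrix(style_features)
    g = gram_matrix(generated_features)
    # Normalization from Gatys et al.: 1 / (4 * N^2 * M^2)
    return np.sum(np.square(s - g)) / (4.0 * (c ** 2) * ((h * w) ** 2))
```

Because the Gram matrix sums over all spatial positions, it discards *where* features occur and keeps only *which* features co-occur, which is why it captures texture rather than layout.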
The total variation loss
- Sum of the absolute differences for neighboring pixel-values in an image. Measures how much noise is in the image.
- Encourages spatial continuity in the generated image (denoising).
- Acts as a regularization loss.
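A minimal NumPy sketch of the total variation loss as defined above (the function name is an assumption):

```python
import numpy as np

def total_variation_loss(image):
    # image: (height, width, channels) pixel values.
    # Absolute differences between vertically adjacent pixels
    dh = np.abs(image[1:, :, :] - image[:-1, :, :])
    # Absolute differences between horizontally adjacent pixels
    dw = np.abs(image[:, 1:, :] - image[:, :-1, :])
    return np.sum(dh) + np.sum(dw)
```

A constant image has zero total variation; a noisy image has many large neighbor differences, so minimizing this term smooths the output.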
- Objective: minimize the total loss.
- Optimizer: L-BFGS (original choice made by Gatys et al.) or Adam.
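The total loss is a weighted sum of the three terms, minimized with respect to the generated image's pixels. A sketch (the weight values are hypothetical; in practice they are tuned per content/style pair):

```python
# Hypothetical weights balancing the three loss terms.
CONTENT_WEIGHT = 1.0
STYLE_WEIGHT = 1e-2
TV_WEIGHT = 1e-4

def total_loss(content_term, style_term, tv_term):
    # Weighted sum, minimized w.r.t. the generated image's pixels
    return (CONTENT_WEIGHT * content_term
            + STYLE_WEIGHT * style_term
            + TV_WEIGHT * tv_term)
```

Raising the style weight yields a more heavily stylized output; raising the content weight preserves more of the original structure.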
Generative Adversarial Networks (GAN)
GAN in a nutshell
- Simultaneously train two models:
- One tries to generate realistic data.
- The other tries to discriminate between real and generated data.
- Each model is trained to best the other.
- First described in Generative Adversarial Nets by Goodfellow et al. (2014).
- NIPS 2016 Tutorial
- The generator creates images from random noise.
- Generated images are mixed with real ones.
- The discriminator is trained on these mixed images.
- The generator’s parameters are updated in a direction that makes the discriminator more likely to classify generated data as “real”.
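The training objectives above can be sketched with toy NumPy models (everything here is an assumption for illustration: a linear generator, a logistic discriminator on scalars, and the non-saturating generator loss commonly used in practice):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def generator(z, theta):
    # Toy generator: maps noise z to a sample via x = a*z + b
    a, b = theta
    return a * z + b

def discriminator(x, phi):
    # Toy discriminator: probability that x is real
    w, c = phi
    return sigmoid(w * x + c)

def gan_losses(real, theta, phi):
    # 1. The generator creates samples from random noise
    z = rng.normal(size=real.shape)
    fake = generator(z, theta)
    # 2. The discriminator scores real and generated samples
    d_real = discriminator(real, phi)
    d_fake = discriminator(fake, phi)
    eps = 1e-8  # numerical safety for the logs
    # 3. Discriminator loss: classify real as 1, fake as 0
    d_loss = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1 - d_fake + eps))
    # 4. Generator loss (non-saturating): push the discriminator
    #    to output "real" on generated samples
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss
```

A real training loop alternates gradient steps on these two losses: one (or more) on `d_loss` with the generator frozen, then one on `g_loss` with the discriminator frozen.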
Specificities and gotchas
- A GAN is a dynamic system that evolves at each training step.
- Interestingly, the generator never sees images from the training set directly: all its information comes from the discriminator.
- Training can be tricky: noisy generated data, vanishing gradients, domination of one side…
- GAN convergence theory is an active area of research.
- GAN Open Questions
GAN progress on face generation
The GAN landscape
Some GAN flavours
- DCGAN (2016): uses deep convolutional networks for the generator and discriminator.
- CycleGAN (2017): image-to-image translation in the absence of any paired training examples.
- StyleGAN (2019): fine control of output images.
- GAN - The Story So Far
GAN use cases: not just images!