Teaching Computers to Imagine with Deep Generative Models

Sergey Tulyakov, Stéphane Lathuilière

November 19 - 26, 2019

The University of Trento, Italy

Abstract

Many recent methods in computer vision can be roughly categorized as those that produce a decision given an input image or video: the number of objects in the input, their types (e.g., car or tree), and so on. In other words, they provide labelling capabilities; we term such methods discriminative. Another group of methods, termed generative models, instead models the distribution of the inputs. Such techniques offer generative capabilities: given some input, they can generate an image, video, audio or text. Moreover, these methods can be conditioned on user input, offering control over what is being generated. This control includes changing a particular attribute of an image while keeping other attributes unchanged, such as turning summer into winter, male into female, or a smiling face into a non-smiling one. For humans, changing such an attribute requires careful training and specialized software, and is time-consuming. These capabilities can therefore be considered a form of learned imagination. Due to this ability to “imagine”, generative techniques have been widely used in a variety of applications: image synthesis, style transfer, image-to-image translation, video synthesis and retargeting. Such models are also used to enhance discriminative techniques with unlabelled or synthetic data, and to learn 3D reconstruction when 3D labels are not available.

Focus: generative models in deep learning and their applications to image and video manipulation and translation, as well as methods that capitalize on such models to perform discriminative tasks.

Prerequisites: this course requires an understanding of the basic building blocks of convolutional neural networks and of machine learning theory, including standard deep learning architectures, cost functions, activations and learning paradigms. Therefore, for PhD students who have not already worked with deep neural networks, an introductory course such as Introduction to Deep Learning or Deep Learning for Image Processing is highly recommended.

November 19

9:30-11:30 Introduction to the course. Introduction to Deep Learning Slides
Generative Adversarial Networks Slides
14:00-16:00 Variational Autoencoders Slides
Derivations References
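
For reference, a minimal sketch of the two training objectives introduced on this day, written in standard notation (the exact formulations in the slides and derivation notes may differ). The GAN minimax game pits a generator G, which maps noise z ~ p(z) to samples, against a discriminator D, which scores how likely its input is to be real; the VAE is trained by maximizing the evidence lower bound (ELBO), where the encoder q_phi(z|x) approximates the posterior and the decoder p_theta(x|z) reconstructs the input:

    % GAN minimax objective
    \min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
      + \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D(G(z))\big)\big]

    % VAE evidence lower bound (ELBO)
    \log p_\theta(x) \;\geq\;
      \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big]
      - \mathrm{KL}\big(q_\phi(z|x) \,\|\, p(z)\big)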

November 20

10:00-12:00 Generative Adversarial Networks (continued). Image-to-image translation Slides
14:00-16:00 Practical session on GANs Colab
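
As a warm-up before the practical session, the sketch below shows the core of a GAN training loop in PyTorch. It is a minimal illustration, not the contents of the Colab notebook: the network sizes, learning rates and the random stand-in data are assumptions chosen only to make it self-contained.

    # Minimal GAN training sketch in PyTorch (illustrative; the Colab notebook may differ).
    import torch
    import torch.nn as nn

    latent_dim, data_dim = 64, 784  # e.g., flattened 28x28 images

    # Generator: noise vector -> data sample in [-1, 1]
    G = nn.Sequential(
        nn.Linear(latent_dim, 256), nn.ReLU(),
        nn.Linear(256, data_dim), nn.Tanh(),
    )
    # Discriminator: data sample -> real/fake logit
    D = nn.Sequential(
        nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
        nn.Linear(256, 1),
    )

    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()

    def train_step(real):
        batch = real.size(0)
        # --- Discriminator step: push real -> 1, fake -> 0 ---
        z = torch.randn(batch, latent_dim)
        fake = G(z).detach()  # do not backprop into G here
        loss_d = bce(D(real), torch.ones(batch, 1)) + \
                 bce(D(fake), torch.zeros(batch, 1))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
        # --- Generator step: fool D (non-saturating loss) ---
        z = torch.randn(batch, latent_dim)
        loss_g = bce(D(G(z)), torch.ones(batch, 1))
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
        return loss_d.item(), loss_g.item()

    # Dummy usage with random "real" data; replace with a real data loader.
    for step in range(3):
        real = torch.rand(32, data_dim) * 2 - 1  # scaled to [-1, 1] to match Tanh
        print(train_step(real))

The detach() call in the discriminator step keeps gradients from flowing into the generator while updating D; the generator step then uses the non-saturating loss, which in practice trains more stably than the original minimax form.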

November 21

10:00-12:00 Pose-guided generation Slides
Video synthesis: generation, prediction, translation, retargeting Slides
Gradient-based style-transfer and adversarial examples Slides
14:00-16:00 Practical session on VAEs Colab
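
Similarly, before the VAE practical, here is a minimal VAE in PyTorch (again an illustrative sketch, not the notebook itself: the dimensions and random data are placeholder assumptions). It implements the ELBO from the November 19 session, with the reparameterization trick making the sampling step differentiable.

    # Minimal VAE sketch in PyTorch (illustrative; the Colab notebook may differ).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class VAE(nn.Module):
        def __init__(self, data_dim=784, latent_dim=16):
            super().__init__()
            self.enc = nn.Linear(data_dim, 256)
            self.mu = nn.Linear(256, latent_dim)      # posterior mean
            self.logvar = nn.Linear(256, latent_dim)  # posterior log-variance
            self.dec = nn.Sequential(
                nn.Linear(latent_dim, 256), nn.ReLU(),
                nn.Linear(256, data_dim),  # decoder outputs logits
            )

        def forward(self, x):
            h = F.relu(self.enc(x))
            mu, logvar = self.mu(h), self.logvar(h)
            # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            return self.dec(z), mu, logvar

    def elbo_loss(recon, x, mu, logvar):
        # Reconstruction term (Bernoulli likelihood on logits)
        rec = F.binary_cross_entropy_with_logits(recon, x, reduction='sum')
        # Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kl  # negative ELBO, to be minimized

    # Dummy usage with random data in [0, 1]; replace with a real data loader.
    model = VAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.rand(32, 784)
    recon, mu, logvar = model(x)
    loss = elbo_loss(recon, x, mu, logvar)
    opt.zero_grad(); loss.backward(); opt.step()
    print(loss.item())

The KL term has a closed form here because both the approximate posterior and the prior are Gaussian; minimizing the returned loss is equivalent to maximizing the ELBO.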

November 22

10:00-12:00 Deep Fakes Slides
Improving discriminative models Slides
Challenge: extending MoCoGAN Slides
Colab