USING CONDITIONAL ADVERSARIAL NETWORKS TO CREATE A DIGITAL MASK
These videos were commissioned by the musical artist Lord Over. Themes surrounding technology, humanity and identity are a constant thread throughout Lord Over's work, and because the artist is reclusive, we were looking for a way to obscure their form while still allowing them to perform and emote as they would in real life.
We use image-to-image translation, as first outlined in the pix2pix paper (Isola et al., 2017), to train a model on existing footage (such as an interview with Italian philosopher Julius Evola, or with 'Human Ken Doll' Rodrigo Alves) and then let the performer control and express themselves, both through the body and appearance of another person and through the fragmented, distorted perception of a slightly sick neural network.
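Concretely, pix2pix trains a generator against a conditional GAN loss combined with an L1 reconstruction term. The sketch below shows that objective in minimal PyTorch; the tiny stand-in networks and the train_step helper are our own illustration (the paper uses a U-Net generator and a PatchGAN discriminator), and lambda_l1 = 100 follows the paper's default weighting.

```python
import torch
import torch.nn as nn

# Placeholder networks, far smaller than the real U-Net / PatchGAN pair.
G = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())    # landmark image -> face
D = nn.Sequential(nn.Conv2d(6, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                  nn.Conv2d(64, 1, 4, stride=2, padding=1))     # (input, output) pair -> real/fake map

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
lambda_l1 = 100.0  # paper's default weight on the L1 term

def train_step(landmarks, target):
    """One optimization step on a (landmark image, source frame) pair."""
    fake = G(landmarks)

    # Discriminator: push real pairs toward 1, generated pairs toward 0.
    opt_d.zero_grad()
    pred_real = D(torch.cat([landmarks, target], dim=1))
    pred_fake = D(torch.cat([landmarks, fake.detach()], dim=1))
    loss_d = bce(pred_real, torch.ones_like(pred_real)) + \
             bce(pred_fake, torch.zeros_like(pred_fake))
    loss_d.backward()
    opt_d.step()

    # Generator: fool D while staying close to the ground-truth frame.
    opt_g.zero_grad()
    pred_fake = D(torch.cat([landmarks, fake], dim=1))
    loss_g = bce(pred_fake, torch.ones_like(pred_fake)) + lambda_l1 * l1(fake, target)
    loss_g.backward()
    opt_g.step()
```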
The facial landmarks of the performing artist are matched to the landmarks extracted from the source material, which is what the network was originally trained on. The resulting image is the network's best approximation of the face and expression those landmarks corresponded to in the original footage.
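A landmark-extraction step along these lines sits at both ends of the pipeline: pulling landmarks from the source footage to build training pairs, and from the live performer at run time. This sketch assumes dlib's stock 68-point predictor and OpenCV; the render_landmarks helper is a hypothetical illustration, not code from the project.

```python
import cv2
import dlib
import numpy as np

# dlib's 68-point model; the .dat file is distributed separately from dlib itself.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def extract_landmarks(frame):
    """Return the 68 (x, y) facial landmarks for the first face in a BGR frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    return np.array([(p.x, p.y) for p in shape.parts()])

def render_landmarks(points, size=(256, 256)):
    """Draw landmarks on a black canvas; this image is what conditions the generator."""
    canvas = np.zeros((size[1], size[0], 3), dtype=np.uint8)
    for x, y in points:
        cv2.circle(canvas, (int(x), int(y)), 1, (255, 255, 255), -1)
    return canvas
```

At performance time, each incoming frame passes through the same two functions and the rendered landmark image is fed to the trained generator, which hallucinates the corresponding face.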
Each model took roughly 3 days to train on a single NVIDIA Tesla M60 GPU on AWS. The Evola model, seen in the image above and used in the video for 'Reflection', was trained for the fewest steps, and its output is much grainier and less specific than that of the more fully trained models. For each video we used different dropout, weights and learning rates to create unique contours and textures, almost like brush strokes, and deliberately overfit the data to create noisy, granular effects, like turning up the gain on an amp. As the facial landmarks move outside what the model has been trained to recognize, the model's guesses produce an image that is wilder, noisier and more abstract.
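To give a sense of how those knobs might be organized, here is a hypothetical per-video configuration sketch; every number is an illustrative assumption, not a value from the project. One relevant detail from the pix2pix paper: dropout is the generator's only noise source and is kept active even at test time, so it contributes directly to the granular texture described above.

```python
import torch.nn as nn

# Hypothetical settings: 'reflection' stands for the shortest-trained (grainiest)
# model, 'video_two' for an imagined longer run. None of these numbers are the
# project's actual values.
CONFIGS = {
    "reflection": {"dropout": 0.5, "lr": 2e-4, "steps": 40_000},
    "video_two":  {"dropout": 0.3, "lr": 1e-4, "steps": 150_000},
}

def make_generator(dropout):
    """Toy stand-in for the U-Net generator, with per-video dropout."""
    return nn.Sequential(
        nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
        nn.Dropout2d(dropout),  # pix2pix-style noise source; .eval() would silence it
        nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
    )

models = {name: make_generator(cfg["dropout"]) for name, cfg in CONFIGS.items()}
```

Keeping the dropout layers in train mode at inference, rather than calling .eval(), is one way to preserve that noise while generating frames.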
Screenshots: