Create Your Own Deep Fake

Mintoadil
3 min read · Apr 1, 2020

Creating a deepfake is an image-processing task at the intersection of computer vision and deep learning: models are trained to generate fake images of people. This post is just an overview of deepfakes; for a practical implementation, please check the GitHub link at the end of the section.

The way deepfakes work is easy to understand. Suppose we want to transfer the face of person A onto a video of person B.

First, we collect hundreds or thousands of pictures of both people. We build an encoder to encode these pictures using a deep convolutional neural network (CNN), and a decoder to reconstruct each image. This autoencoder (the encoder plus the decoder) has over a million parameters, but that is nowhere near enough to memorize all the pictures, so the encoder is forced to extract the most salient features needed to recreate the original input. Think of the features as a witness's description (the encoder), from which a composite sketch artist (the decoder) reconstructs a picture of the suspect.
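A minimal convolutional autoencoder of this kind can be sketched as follows (a simplified PyTorch example; the layer sizes, 64x64 input resolution, and 256-dimensional latent vector are illustrative assumptions, not the exact architecture used in any particular deepfake tool):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Compresses a 64x64 RGB face crop into a small latent feature vector."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),    # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),   # 32x32 -> 16x16
            nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 16x16 -> 8x8
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, latent_dim),          # the "witness description"
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Reconstructs the 64x64 face crop from the latent vector."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 128 * 8 * 8)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # 8x8 -> 16x16
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),   # 16x16 -> 32x32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),    # 32x32 -> 64x64
            nn.Sigmoid(),                                          # pixels in [0, 1]
        )

    def forward(self, z):
        h = self.fc(z).view(-1, 128, 8, 8)
        return self.net(h)
```

The narrow latent vector is the whole point: it cannot hold every pixel, so training forces the encoder to keep only the features that matter for reconstruction.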

To decode the features, we use separate decoders for person A and person B. Using backpropagation, we train the encoder and both decoders so that each output closely matches its input. This process is time-consuming: even with a GPU, it takes roughly three days to get decent results, which is why deep learning tasks like this are often run on cloud services.

After training is complete, we process the video frame by frame to swap one person's face with the other's. Using face detection, we extract person A's face and feed it into the encoder. However, instead of passing the features to A's own decoder, we use person B's decoder to reconstruct the picture; that is, we draw person B with the features of A from the original video. Finally, we merge the newly generated face back into the original frame.
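The swap itself is just one line of routing: encode A's face, decode with B's decoder. A minimal sketch (the stand-in modules below are untrained placeholders with the same shapes; in practice you would load the trained encoder and person-B decoder, and the face detection and blending steps are separate):

```python
import torch
import torch.nn as nn

# Illustrative stand-ins for the trained shared encoder and person-B decoder.
encoder   = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
decoder_b = nn.Sequential(
    nn.Linear(128, 3 * 64 * 64),
    nn.Sigmoid(),
    nn.Unflatten(1, (3, 64, 64)),  # back to an image tensor
)

def swap_face(face_a):
    """Encode person A's detected face crop, then decode it with person B's
    decoder: the result is B's identity rendered with A's pose/expression."""
    with torch.no_grad():  # inference only, no gradients needed
        return decoder_b(encoder(face_a))
```

The returned face crop would then be warped and blended back into the original video frame at the location where A's face was detected.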

Essentially, the encoder captures face angle, skin tone, facial expression, lighting, and the other important features needed to reconstruct person A. When we use the second decoder to reconstruct the image, we draw person B, but in the context of A. In the picture below, the reconstructed image has the facial characteristics of Trump while maintaining the facial expression of the target video.

Please visit my GitHub profile for practical implementation:

https://github.com/adil7860/computer-vision

Credits and reference: https://aliaksandrsiarohin.github.io/first-order-model-website/
