RELATED WORK

Advanced face manipulation can deceive both humans and automated face identification systems. DeepFakes in particular are a significant issue, as manipulated faces/multimedia carrying false information can spread quickly through messaging and social networking platforms. Such spread of fake information can have severe consequences. For instance, an ex-lover can alter the content and the person in a video (e.g., by face swapping) to produce a fake revenge porn video, which may lead to the victim ending their life, especially if the victim is a young person.

Categories of Face Manipulations

Manipulated faces are generally produced by altering facial features (e.g., gender and age), swapping two faces with each other (also known as face morphing), adding imperceptible perturbations (also known as adversarial examples), artificially generating faces, or re-enacting/animating facial expressions in facial videos/images [2]. By analysing the existing face manipulations systematically, we can broadly group them into four categories: face synthesis, face swap, facial attributes, and face expression [3,20].

4.2.1.1 Face Synthesis

This manipulation technique produces human faces that do not exist in the real world, generally by using generative adversarial networks (GANs) [21]. These methods achieve astounding results, yielding face samples that are almost indistinguishable from real ones. For example, Karras et al. [22] proposed the StyleGAN architecture, an enhanced version of the ProGAN approach [23], which can generate entire faces of non-existent people.

4.2.1.1.1 Face Synthesis Generation Methods and Datasets

The algorithmic architecture of a GAN can be described by two neural networks, named the generator and the discriminator. The generator creates fake face images of realistic quality, while the discriminator tries to distinguish real face images from fake samples. When the discriminator can no longer tell real and fake images apart, the result is images that do not correspond to reality but appear identical to it. Building on the GAN, CycleGAN [24] was proposed, which learns unsupervised image-to-image translation. Shen et al. [25] proposed FaceID-GAN, which introduces a facial identity classifier as a third player that competes with the generator by distinguishing the identities of real and synthesised faces.
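
The generator/discriminator game described above can be illustrated with the loss values each network tries to minimise. The following is a minimal sketch, not the implementation of any cited method: it assumes the standard binary cross-entropy formulation with a non-saturating generator loss, and the function names and score values are hypothetical.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy loss of the discriminator.

    d_real: discriminator outputs (probability of "real") on real images.
    d_fake: discriminator outputs on images produced by the generator.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: low when fakes are scored as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical scores: a confident discriminator rates real images high
# and fake images low, so its loss is small.
confident = discriminator_loss([0.9, 0.95], [0.1, 0.05])

# A fooled discriminator outputs 0.5 for every image; it can no longer
# separate real from fake, and its loss equals 2 * ln(2).
fooled = discriminator_loss([0.5, 0.5], [0.5, 0.5])

print(f"confident discriminator loss: {confident:.4f}")
print(f"fooled discriminator loss:    {fooled:.4f}")
print(f"generator loss when fooled:   {generator_loss([0.5, 0.5]):.4f}")
```

In an actual GAN the two losses are minimised alternately by gradient descent on the parameters of each network; the sketch only evaluates them for fixed scores.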

There exist some public face synthesis datasets for research. Figure 4.2 shows examples of three different datasets of synthesised faces. The common feature of these datasets is that none of them contains any real person's pictures. Therefore, researchers focusing on detecting or recognising this kind of manipulation often use real faces from popular public databases to train their systems.

FIGURE 4.2 Examples of different datasets, which are composed of synthesised faces.

In the following, we briefly describe the datasets: [1]

4.2.1.2 Face Swap

In this face alteration, a person's face is replaced by the face of another person. Face swapping can be done with either traditional computer graphics-based schemes or newer DL methods/techniques. Many popular mobile apps exist for this purpose, e.g., Snapchat. Moreover, many works on face swapping have recently been published in the literature. For instance, Marcel et al. [9] created a face DeepFakes dataset using a GAN-based face swapping scheme.

4.2.1.2.1 Face Swap Generation Methods and Datasets

Face Swap is one of the increasingly popular manipulation techniques. Below, we summarise some publicly available datasets.

  • UADFV [26]: It comprises 49 real videos downloaded from YouTube and 49 manipulated videos obtained from these videos using the FakeApp application. Each video shows one person, with a resolution of 294 x 500 pixels and an average duration of 11.14 seconds.
  • FaceForensics++ [1]: This dataset contains 1,000 real videos selected from YouTube and 1,000 manipulated videos. The manipulations were generated using the faceswap application.
  • DFDC: The DeepFake Detection Challenge (DFDC) dataset was presented by Facebook. Its initial release contains 1,131 real videos of 66 actors and 4,119 forged videos. The forged videos were created using two different approaches; the details of these algorithms, however, are not disclosed. On December 11, 2019, the entire DFDC dataset was released and the competition started.
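
The real/fake counts quoted in the list above differ considerably in how balanced they are, which matters when a detector is trained on them. The sketch below only does the arithmetic on the quoted counts; the dictionary layout, the helper names, and the inverse-frequency weighting are illustrative assumptions, not part of any dataset's tooling.

```python
# Real/fake video counts as quoted in the list above.
datasets = {
    "UADFV": {"real": 49, "fake": 49},
    "FaceForensics++": {"real": 1000, "fake": 1000},
    "DFDC (initial release)": {"real": 1131, "fake": 4119},
}

def fake_fraction(counts):
    """Share of forged videos among all videos of a dataset."""
    return counts["fake"] / (counts["real"] + counts["fake"])

def balanced_class_weights(counts):
    """Weights inversely proportional to class frequency.

    A perfectly balanced dataset yields a weight of 1.0 for both classes.
    """
    total = counts["real"] + counts["fake"]
    return {label: total / (2 * n) for label, n in counts.items()}

for name, counts in datasets.items():
    weights = balanced_class_weights(counts)
    print(f"{name}: fake fraction {fake_fraction(counts):.3f}, "
          f"weights real={weights['real']:.2f} fake={weights['fake']:.2f}")
```

With these counts, UADFV and FaceForensics++ are balanced, whereas forged videos make up roughly 78% of the initial DFDC release, so some form of re-weighting or re-sampling is a common precaution there.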

4.2.1.3 Facial Attributes

Some face attributes (e.g., skin colour and gender) are altered in this category. Adobe Photoshop, AgeingBooth, and FaceApp are some of the popular apps for this type of manipulation. In addition, He et al. [27] developed a scheme called AttGAN, which can manipulate face traits such as beard, age, hair colour, and mouth while preserving the identity of the person as well as other facial details.

4.2.1.3.1 Face Attributes Generation Methods and Datasets

Since the code of most GAN techniques is publicly available, researchers can readily generate their own manipulated images, and only a few datasets built with such GAN techniques are known in the literature. Chang et al. [28] proposed a two-stage technique to generate face attributes: a Texture Completion GAN (TC-GAN) and a 3D Attribute GAN (3DA-GAN). The TC-GAN automatically completes the appearance that is missing because of occlusion and supplies a normalised UV texture. The 3DA-GAN operates on the UV texture space to create the target attributes while preserving the identity of the subject as far as possible. Moreover, for complex image alteration, Perarnau et al. [29] presented an approach called IcGAN (Invertible Conditional GAN), which combines an encoder with a conditional GAN. This method achieves the intended attribute changes, although it seriously alters the person's facial identity.
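
The encoder-plus-conditional-GAN idea behind IcGAN can be written compactly. The notation below is an assumption introduced for illustration (the symbols do not appear in the text): x is the input face image, E_z and E_y are encoders that recover a latent code z and an attribute vector y from x, and G is the conditional generator.

```latex
z = E_z(x), \qquad y = E_y(x), \qquad x' = G(z, y')
```

Here y' denotes y with the target attributes changed (e.g., one binary entry flipped), and x' is the manipulated face. Because the identity information has to survive the round trip through z, an imperfect encoder is one plausible reason why the identity can drift in the output.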

Some facial attributes manipulation datasets have been made public, which can be utilised for research purposes. Some of them are detailed in the following.

  • CelebA [30]: The CelebFaces Attributes (CelebA) dataset was obtained by annotating images chosen from a large-scale face dataset, CelebFaces [31]. The dataset consists of 10,177 identities and more than 202,000 facial images, with five landmark locations and 40 binary attribute annotations for each image.
  • PubFig [28]: This dataset consists of 58,797 images belonging to 200 people. Since the images were obtained from the internet under uncontrolled conditions, they show considerable variation in pose, expression, etc. The dataset labels 73 face attributes.
  • Attribute 25K [32]: This dataset contains 24,943 images of people, collected from Facebook. Not every attribute can be labelled for every image, as the images vary greatly in perspective, pose, and occlusion. For example, if the person's head is not visible, the image cannot be labelled for glasses.
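
The annotation schemes described in the list above (a fixed set of landmarks, binary attributes, and attributes that cannot always be labelled) can be pictured as a simple record. This is only a sketch: the class, field names, file name, and attribute names are hypothetical and do not reflect the actual file formats of CelebA, PubFig, or Attribute 25K.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class FaceAnnotation:
    image_id: str
    # (x, y) landmark coordinates, e.g. five points per face.
    landmarks: List[Tuple[float, float]]
    # True/False for a labelled binary attribute,
    # None when the attribute could not be labelled for this image.
    attributes: Dict[str, Optional[bool]]

def labelled_attributes(ann: FaceAnnotation) -> Dict[str, bool]:
    """Return only the attributes that carry an actual label."""
    return {name: value for name, value in ann.attributes.items()
            if value is not None}

# Hypothetical example: the head is partly hidden, so "glasses" is unlabelled.
example = FaceAnnotation(
    image_id="hypothetical_0001.jpg",
    landmarks=[(30.0, 40.0), (70.0, 40.0), (50.0, 60.0),
               (35.0, 80.0), (65.0, 80.0)],
    attributes={"male": False, "long_hair": True, "glasses": None},
)

print(labelled_attributes(example))
```

Keeping "not labelled" distinct from "False" lets a training loop skip missing attributes instead of treating them as negative examples.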

4.2.1.4 Face Expression

This manipulation technique replaces one person's facial expression with the facial expression of another person. Thies et al. [33] presented a technique for facial expression manipulation that works on real-time videos. The presented technique is called Face2Face.

4.2.1.4.1 Face Expression Generation Methods and Datasets

One of the well-known databases that has focused on facial expression manipulation to date is FaceForensics++ [1]. This dataset is an extension of FaceForensics [16]. At first, the FaceForensics dataset concentrated only on Face2Face, a computer graphics method that transfers the expression of a source identity to a target identity while preserving the identity of the target. This was accomplished through manual keyframe selection. Fake samples were then created by transferring the source expressions of every frame to the target video. Next, the same researchers introduced in FaceForensics++ a new learning approach based on NeuralTextures [34]. The approach is rendering-based: it uses real video data to learn the neural appearance of the target person, together with a rendering network. In their implementation, the researchers used a patch-based GAN loss, as in Pix2Pix [35]. Only the facial expression corresponding to the mouth region was changed. The dataset contains 1,000 real videos downloaded from YouTube and a total of 2,000 fake videos, 1,000 for each of the two approaches considered.
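
The patch-based GAN loss mentioned above can be illustrated as follows. In a patch-based discriminator of the kind used in Pix2Pix, the output is a grid of scores, one per image patch, rather than a single score per image, and the loss is averaged over the grid. The sketch below is a simplified illustration under that assumption; the function name and the score maps are hypothetical, and it is not the FaceForensics++ code.

```python
import math

def patch_gan_loss(patch_scores, is_real):
    """Mean binary cross-entropy over a grid of per-patch scores.

    patch_scores: 2D list; each entry is the discriminator's probability
                  that the corresponding image patch is real.
    is_real:      True if the image is real, False if it is generated.
    """
    flat = [p for row in patch_scores for p in row]
    # Probability the discriminator assigns to the correct label per patch.
    prob_correct = [p if is_real else 1.0 - p for p in flat]
    return -sum(math.log(q) for q in prob_correct) / len(flat)

# Hypothetical 2x2 score maps for two generated images.
# Every patch of the first image is recognised as fake.
all_detected = [[0.1, 0.1], [0.1, 0.1]]
# One patch of the second image (e.g. the mouth region) looks real.
one_patch_fools = [[0.1, 0.1], [0.9, 0.1]]

print(f"{patch_gan_loss(all_detected, is_real=False):.4f}")
print(f"{patch_gan_loss(one_patch_fools, is_real=False):.4f}")
```

Scoring patches separately penalises local artefacts, which suits a manipulation that changes only a small region of the face.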

Several apps are available that can be used to manipulate facial expressions. For instance, Face2Face, which is based on existing GAN algorithmic structures, makes it easy to change facial expressions. Similarly, with the StarGAN approach proposed in Ref. [36], the authors showed that a person's face image can be changed to show different expressions. As far as we know, the only database available for research purposes is FaceForensics++ [1].

  • [1] 100K-Faces: This dataset consists of 100,000 synthetic images created by using StyleGAN. For this dataset, StyleGAN was trained with approximately 29,000 images of 69 different identities, and facial samples with a plain background were produced.
  • TPDNE: This dataset contains 150,000 synthetic face images gathered from the website. The synthetic face images are based on the StyleGAN technique trained with the FFHQ dataset.
  • DFFD: Stehouwer et al. [20] presented a dataset called the Diverse Fake Face Dataset (DFFD). The authors used two pre-trained models for face synthesis manipulation. The 100,000 and 200,000 fake images were created using the ProGAN and StyleGAN models, respectively.
 