Application of Deep Convolutional Generative Adversarial Networks to Generate Pose Invariant Facial Image Synthesis Data

Jagad Nabil Tuah Imanda; Fitra Bachtiar; Achmad Ridok

doi:10.29207/resti.v7i5.5112

Jagad Nabil Tuah Imanda, Universitas Brawijaya
Fitra Bachtiar, Universitas Brawijaya
Achmad Ridok, Universitas Brawijaya

Keywords: Face Recognition, Pose Invariant, Generative Adversarial Network, Deep Convolutional Generative Adversarial Network, Hyperparameter Tuning

Abstract

Technology is developing rapidly, and artificial intelligence is one of its most active areas. Artificial intelligence still struggles with problems that are easy for humans to perform but difficult to describe formally for a computer, such as facial recognition. Existing facial recognition models are still unable to recognize faces that are not captured under ideal conditions, owing to factors such as face position, lighting, expression, and obstacles covering the face. Among these factors, face position is the most influential. Therefore, in this study, deep convolutional generative adversarial networks (DCGANs) are applied to generate fake image data with varying face positions. The research proceeds through data collection, data processing, model design and training, hyperparameter tuning, and finally analysis of the test results. Based on hyperparameter tuning performed sequentially, the best hyperparameter combination found is 200 epochs, a Generator learning rate of 0.002, Generator momentum/beta1 of 0.5, Adam as the Generator optimizer, a Discriminator learning rate of 0.0002, Discriminator momentum/beta1 of 0.5, and Adam as the Discriminator optimizer. This combination yields an FID score of 74.05. In testing with human observers, the generated fake images are relatively good, although a few poor results remain.
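The FID score reported in the abstract is the Fréchet distance between Gaussians fitted to feature vectors of real and generated images. The following is a minimal sketch of that distance, not the authors' evaluation code. It assumes the feature vectors have already been extracted (for a true FID these would be Inception-v3 pooling activations), and the function name `frechet_distance` and the random test arrays are hypothetical, chosen only for illustration.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs are (n_samples, n_features) arrays. Feature extraction
    (e.g. with Inception-v3) is assumed to have happened elsewhere.
    """
    mu_r = feats_real.mean(axis=0)
    mu_f = feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)

    # Tr(sqrt(cov_r @ cov_f)) equals the sum of the square roots of the
    # eigenvalues of the product; these are real and non-negative in
    # theory, so small numerical negatives are clipped to zero.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * tr_sqrt)


# Illustrative data only: random vectors stand in for image features.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 8))
shifted = real + 3.0  # same covariance, mean moved by 3 in each of 8 dims

same_score = frechet_distance(real, real)
shifted_score = frechet_distance(real, shifted)
```

Identical feature sets give a distance of approximately zero, and the distance grows as the generated distribution drifts from the real one, which is why a lower FID indicates generated images that are statistically closer to the real data.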
