Implementation of a Convolutional Neural Network for the Classification of Emotional Intensity Variations in Dynamic Image Sequences
Abstract
Facial emotion recognition (FER) is a research topic that focuses on the analysis of human facial expressions. Much FER research has been conducted on single images or photographs, but emotion analysis on single images has several disadvantages compared to dynamic image sequences or videos, because human emotions and expressions unfold over a span of time. Classification becomes more complicated when emotional intensity is taken into account: some people are very expressive, while others show only low or moderate expression. Predicting emotions across varying intensities is less accurate because existing datasets provide only a few intensity levels per emotion. Data annotation is a further obstacle in recognition tasks, since annotating new data requires considerable time and effort. This study aims to recognize facial emotions whose intensity ranges from subtle to sharp in image sequences or videos. The dataset is trained with a convolutional neural network, using augmentation to enlarge the annotated data. The proposed method was evaluated on the public BP4D-Spontaneous dataset. The evaluation results show that the average emotion recognition accuracy on video sequences using the holdout method is 18%. Evaluation of the loss function indicates overfitting, with an excessively large generalization gap between the training and validation curves. The final evaluation compares the actual and predicted emotion classes and yields 14.28% accuracy. These results show that the classification accuracy of emotion recognition on dynamic image sequences is still quite low.
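As a concrete illustration of the pipeline summarized above, the following minimal sketch trains a small CNN on augmented face crops with a holdout split, reports the train/validation loss gap as an overfitting signal, and compares actual against predicted emotion classes with a confusion matrix. It is written with Keras/TensorFlow; the architecture, the 64x64 grayscale input size, the seven-class label set, and the randomly generated placeholder data are illustrative assumptions, not the paper's exact configuration.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

NUM_CLASSES = 7          # assumption: number of emotion/intensity labels
IMG_SHAPE = (64, 64, 1)  # assumption: grayscale face crops

def build_cnn():
    # Small VGG-style CNN for frame-level emotion classification.
    model = models.Sequential([
        layers.Conv2D(32, 3, activation="relu", padding="same",
                      input_shape=IMG_SHAPE),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Placeholder data; in practice X holds face crops extracted from the
# video frames and y holds their integer emotion/intensity labels.
X = np.random.rand(200, *IMG_SHAPE).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=200)

# Holdout evaluation: a single train/test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=42)

# Augmentation enlarges the annotated data with label-preserving transforms.
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.05),
    layers.RandomZoom(0.1),
])

train_ds = (tf.data.Dataset.from_tensor_slices((X_tr, y_tr))
            .shuffle(len(X_tr))
            .batch(32)
            .map(lambda x, t: (augment(x, training=True), t)))

model = build_cnn()
history = model.fit(train_ds, validation_data=(X_te, y_te), epochs=10)

# A large gap between training and validation loss indicates overfitting.
gap = history.history["val_loss"][-1] - history.history["loss"][-1]
print(f"generalization gap (val_loss - train_loss): {gap:.3f}")

# Compare actual vs. predicted emotion classes, as in the final evaluation.
y_pred = model.predict(X_te).argmax(axis=1)
print(confusion_matrix(y_te, y_pred))

With real frames and labels in place of the placeholders, the printed loss gap and confusion matrix correspond to the two evaluations reported in the abstract.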