Scientific Journal of King Faisal University: Basic and Applied Sciences

Deep Capsule Network for Facial Emotion Recognition

(Tegani Salem and Telli Abdelmoutia)

Abstract

Image classification has become one of the most important challenges, and neural networks have been the most successful approach to it; this has shifted the focus from feature engineering to architecture engineering. However, despite the enormous success of the convolutional neural network (CNN), its performance is still far from that of the human brain. In this context, the capsule network, a new and promising algorithm based on activity vectors and dynamic routing between capsules, has emerged as an efficient technique for overcoming the limitations of the artificial neural network (ANN), which is considered one of the most important existing classifiers. This paper presents a new method for facial emotion recognition that combines a capsule network with a light-gradient-boosting-machine (LightGBM) classifier. The technique has two steps. First, the capsule network is used solely for feature extraction. Then, a LightGBM classifier takes the outputs computed by the capsule network and detects seven fundamental facial expressions. Experiments were carried out to evaluate the performance of the proposed facial-expression-recognition system. Tested on the CK+ dataset, the proposed method achieved an accuracy of 91%.
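
The feature-extraction step can be illustrated with a minimal sketch of routing-by-agreement between two capsule layers, as it is commonly described for capsule networks. The abstract does not give the architecture, layer sizes, or number of routing iterations, so everything below is an assumption made for illustration: the function names (`squash`, `dynamic_routing`, `capsule_features`), the three routing iterations, and the toy prediction vectors are hypothetical and are not taken from the paper. The sketch uses only the Python standard library.

```python
import math


def squash(s):
    """Squash nonlinearity: keeps the direction of s, maps its length into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """Route lower-capsule predictions to upper capsules by agreement.

    u_hat[i][j] is the prediction vector that lower capsule i makes for
    upper capsule j. Returns one activity vector per upper capsule.
    num_iters=3 is an assumed value, not one reported in the abstract.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    v = []
    for _ in range(num_iters):
        # Coupling coefficients: softmax over the upper capsules for each lower capsule.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the resulting output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v


def capsule_lengths(v):
    """Length of each activity vector."""
    return [math.sqrt(sum(x * x for x in vec)) for vec in v]


def capsule_features(v):
    """Flatten the activity vectors into one feature row for a downstream classifier."""
    return [x for vec in v for x in vec]


# Toy input: two lower capsules agree on upper capsule 0 and disagree on capsule 1.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
]
v = dynamic_routing(u_hat)
lengths = capsule_lengths(v)
features = capsule_features(v)
```

In a two-step pipeline of the kind the abstract describes, one such feature row per face image would form the input matrix of a gradient-boosting classifier, for example `lightgbm.LGBMClassifier`, trained on the seven expression labels. Whether the authors flatten the activity vectors in this way or use another representation of the capsule outputs is not stated in the abstract.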


KEYWORDS
Image classification, LightGBM, machine learning, computer vision, CNN, deep learning

