Face and Speech Emotion Recognition

Volume: 10 | Issue: 02 | Year: 2024
International Journal of Image Processing and Pattern Recognition
Received Date: 2024-06-28
Acceptance Date: 2024-09-27
Published On: 2024-10-08

By: Akshay Shetye, Shruti Chavan, Akshay Parab, Kaushik Patil, and Sumitra Kulkarni

Abstract

The study “Face and Speech Emotion Recognition” underscores the role of Speech Emotion Recognition (SER) and its applications in fields such as medicine, human-computer interaction, and customer service. SER has gained importance in cognitive psychology because of its potential to enhance user experience, improve patient care, and optimize the interaction between users and products. The process involves identifying and extracting the features of a speech signal that are key to accurately recognizing emotional states. The study further explores the classification algorithms used to categorize these features, showing a transition from traditional AI techniques, such as voice- and energy-based analysis, to more advanced deep learning methods. These modern approaches use large datasets and neural network architectures to improve the accuracy, reliability, and robustness of speech emotion recognition systems.
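The two stages described above, feature extraction followed by classification, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the two features (short-time energy and zero-crossing rate), the nearest-centroid classifier, the labels "calm" and "excited", and the synthetic sine-wave "utterances" that stand in for real recordings.

```python
import math
import random

def frame_signal(signal, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def extract_features(signal, frame_len=256, hop=128):
    """Average the frame-level features into one vector per utterance."""
    frames = frame_signal(signal, frame_len, hop)
    energy = sum(short_time_energy(f) for f in frames) / len(frames)
    zcr = sum(zero_crossing_rate(f) for f in frames) / len(frames)
    return (energy, zcr)

def train_centroids(examples):
    """Compute the mean feature vector per label from (features, label) pairs."""
    sums, counts = {}, {}
    for feats, label in examples:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}

def classify(feats, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))

def synthetic_utterance(amplitude, freq_hz, rng, sample_rate=8000, n=2048):
    """Hypothetical stand-in for a recording: a noisy sine wave."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            + rng.gauss(0, 0.01) for t in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed toy classes: quiet, low-pitched vs. loud, high-pitched signals.
    training = (
        [(extract_features(synthetic_utterance(0.1, 150, rng)), "calm") for _ in range(5)]
        + [(extract_features(synthetic_utterance(0.8, 400, rng)), "excited") for _ in range(5)]
    )
    centroids = train_centroids(training)
    print(classify(extract_features(synthetic_utterance(0.7, 380, rng)), centroids))
```

A practical system would replace the hand-picked features with richer descriptors and the centroid rule with a trained model; the sketch only shows how the two stages connect.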

Additionally, review articles in this field often draw upon data from SER research, offering insight into the challenges involved in designing and implementing such systems. The study traces the evolution of SER technology, emphasizing the benefits of deep learning, particularly its ability to learn directly from raw data, as well as the challenges posed by factors such as large log files and the need to accommodate various devices. Comprehensive reviews serve as resources for researchers, practitioners, and policymakers, giving them a clearer picture of the current state of SER technologies, including their strengths, limitations, and areas for future development. These insights help move the field forward, support the development of new applications, and help realize the potential of SER technologies in real-world scenarios.

Keywords: feature extraction, machine learning, classification algorithm, natural language processing.

Citation:

How to cite this article: Akshay Shetye, Shruti Chavan, Akshay Parab, Kaushik Patil, and Sumitra Kulkarni, Face and Speech Emotion Recognition. International Journal of Image Processing and Pattern Recognition. 2024; 10(02): -p.

How to cite this URL: Akshay Shetye, Shruti Chavan, Akshay Parab, Kaushik Patil, and Sumitra Kulkarni, Face and Speech Emotion Recognition. International Journal of Image Processing and Pattern Recognition. 2024; 10(02): -p. Available from: https://journalspub.com/publication/ijippr-v10i02-11111/
