F0, LPC, and MFCC Analysis for Emotion Recognition Based on Speech (Chapter, Conference Paper)

abstract

  • This work investigates what is needed to build a database for recognising emotions through speech. It examines features that can yield a good success rate for speech emotion recognition, as well as characteristics (symptoms) that can be associated with a specific emotional state and features that can be used to identify particular emotional states. A speech emotion recognition (SER) system was built with an SVM, and the binary analysis was compared with a multi-category analysis. The binary analysis achieved an accuracy of 87.5% and the multi-class analysis 42.6%. The parameters used were Fundamental Frequency (F0), Linear Predictive Coefficients (LPC), and Mel Frequency Cepstral Coefficients (MFCC). The modest accuracy reported here was achieved using only these F0, LPC, and MFCC features.
  • This work has the support of the Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança (IPB), and the School of Sciences and Technology-Engineering Department (UTAD). This project is supported by the European Regional Development Fund (ERDF) through the Regional Operational Program North 2020, within the scope of Project GreenHealth - Digital strategies in biological assets to improve well-being and promote green health, Norte-01-0145-FEDER-000042. The authors are grateful to the Foundation for Science and Technology (FCT, Portugal) for financial support through national funds FCT/MCTES (PIDDAC) to CeDRI (UIDB/05757/2020 and UIDP/05757/2020) and SusTEC (LA/P/0007/2021).
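
As a rough illustration of two of the parameters named in the abstract, the sketch below estimates F0 by autocorrelation peak picking and computes LPC coefficients with the Levinson-Durbin recursion, using only the Python standard library. It is not the authors' pipeline: the function names, the 8 kHz sample rate, the 50 to 500 Hz search range, the LPC order, and the synthetic 200 Hz test tone are all assumptions made for the example, and MFCC extraction and the SVM classifier are left out.

```python
import math


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of a single frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_f0(frame, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate F0 as the lag with the largest autocorrelation in [fmin, fmax]."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    r = autocorrelation(frame, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    return sample_rate / best_lag


def lpc(frame, order):
    """LPC coefficients [1, a1, ..., a_order] via the Levinson-Durbin recursion."""
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Prediction residual correlation at lag i for the current model
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


# Hypothetical input: a 50 ms frame of a pure 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
true_f0 = 200.0
frame = [math.sin(2 * math.pi * true_f0 * n / sample_rate) for n in range(400)]

f0 = estimate_f0(frame, sample_rate)
coeffs = lpc(frame, order=2)
print("Estimated F0 (Hz):", f0)
print("LPC coefficients:", coeffs)
```

On real speech, such per-frame values would typically be computed on windowed frames and summarised per utterance before being passed to a classifier; how the paper does this is not stated in the abstract.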

publication date

  • 2022