Published May 13, 2011 | Version v1
Publication Open

Utterance independent bimodal emotion recognition in spontaneous communication

  • 1. Institute of Automation, Chinese Academy of Sciences

Description

Emotion expressions are sometimes mixed with utterance-related facial movements in spontaneous face-to-face communication, which makes emotion recognition difficult. This article introduces methods for reducing utterance influences in the visual parameters used for audio-visual emotion recognition. The audio and visual channels are first combined under a Multistream Hidden Markov Model (MHMM). Utterance reduction is then performed by computing the residual between the observed visual parameters and the utterance-related visual parameters predicted from the audio. To this end, the article introduces a Fused Hidden Markov Model Inversion method, trained on a neutrally expressed audio-visual corpus. To reduce computational complexity, the inversion model is further simplified to a Gaussian Mixture Model (GMM) mapping. Compared with traditional bimodal emotion recognition methods (e.g., SVM, CART, Boosting), the utterance reduction method yields better emotion recognition results. Experiments also show the effectiveness of the emotion recognition system when used in a live environment.
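The residual idea in the abstract can be illustrated in the single-Gaussian special case of a GMM mapping: fit a joint Gaussian over paired audio and visual features from neutral speech, predict the utterance-driven visual component from the audio via the conditional mean, and keep the residual as the emotion-related cue. A minimal NumPy sketch follows; the function names and feature dimensions are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_joint_gaussian(audio, visual):
    # Fit a joint Gaussian over stacked [audio | visual] features
    # from a neutrally expressed training corpus.
    X = np.hstack([audio, visual])
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    return mu, cov

def predict_visual(audio_frame, mu, cov, da):
    # Conditional mean of visual given audio:
    #   mu_v + S_va @ S_aa^{-1} @ (a - mu_a)
    # where da is the audio feature dimension.
    mu_a, mu_v = mu[:da], mu[da:]
    S_aa = cov[:da, :da]
    S_va = cov[da:, :da]
    return mu_v + S_va @ np.linalg.solve(S_aa, audio_frame - mu_a)

def utterance_residual(audio_frame, visual_frame, mu, cov, da):
    # The residual is what remains of the facial motion after the
    # utterance-driven (speech-articulation) component is removed;
    # it is fed to the downstream emotion classifier.
    return visual_frame - predict_visual(audio_frame, mu, cov, da)
```

In the paper's full method the mapping has multiple mixture components (and, before simplification, a Fused HMM Inversion), so the prediction is a responsibility-weighted sum of per-component conditional means rather than this single conditional mean.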



Files

1687-6180-2011-4.pdf

Files (933.0 kB)

md5:b0e371681298b7a6d8cbd368e3cae0c4

Additional details


Identifiers

Other
https://openalex.org/W2164643209
DOI
10.1186/1687-6180-2011-4

GreSIS Basics Section

Is Global South Knowledge
Yes
Country
China

References

  • https://openalex.org/W1509031088
  • https://openalex.org/W1552278919
  • https://openalex.org/W1581153084
  • https://openalex.org/W1815942593
  • https://openalex.org/W1923034539
  • https://openalex.org/W1971063881
  • https://openalex.org/W1978649172
  • https://openalex.org/W2020944977
  • https://openalex.org/W2021127571
  • https://openalex.org/W2033773055
  • https://openalex.org/W2058787788
  • https://openalex.org/W2059348974
  • https://openalex.org/W2070726616
  • https://openalex.org/W2098790470
  • https://openalex.org/W2103743127
  • https://openalex.org/W2106115875
  • https://openalex.org/W2106390385
  • https://openalex.org/W2109138290
  • https://openalex.org/W2118640726
  • https://openalex.org/W2120157855
  • https://openalex.org/W2122609807
  • https://openalex.org/W2127429655
  • https://openalex.org/W2127462305
  • https://openalex.org/W2127531292
  • https://openalex.org/W2148071321
  • https://openalex.org/W2156503193
  • https://openalex.org/W2159017231
  • https://openalex.org/W2168053878
  • https://openalex.org/W2171939880
  • https://openalex.org/W2173219554
  • https://openalex.org/W3097096317
  • https://openalex.org/W4244952642