Voice quality and f0 cues for affect expression

VOICE QUALITY AND F0 CUES
FOR AFFECT EXPRESSION
By I. Yanushevskaya, C. Gobl and A. Ní Chasaide
OUTLINE
• Introduction
• Synthetic stimuli
• Experiment setup
• Results
• Conclusion

INTRODUCTION
• F0 cues are crucial for emotional speech
• What about voice quality?
• Based on previous work:
   - Adding voice quality cues enhances speech synthesis
   - Several voice quality stimuli give similar results:
       Tense ≈ Harsh
       Breathy ≈ Whispery
   - Varying voice quality can influence listeners' judgments
• We want to know the effect of varying voice quality only.
SYNTHETIC STIMULI
• 15 synthetic stimuli: Ja adjö (Hello Goodbye)
• KLSYN88 as the formant synthesizer
• 3 groups of stimuli: “VQ”, “F0”, “VQ+F0”

KLSYN88
VQ ONLY STIMULI
• Modal, breathy, whispery, lax-creaky and tense stimuli
• Harsh and creaky voice, included in previous work, are omitted
• Modal: copy of the natural utterance in KLSYN88
• Breathy: lower AV, higher OQ, lower SQ, higher TL, wider B1
• Whispery: aspiration noise
• Lax-creaky: Creaky+Breathy-Whispery
• Tense: lower OQ, higher SQ, lower TL, narrower B1, higher F0
• NOT normalized for F0
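
The breathy and tense settings above are stated only as directions of change relative to the modal stimulus. A minimal sketch of how such directions could be encoded and applied is given below. The parameter names (AV, OQ, SQ, TL, B1, F0) are those on the slide; the baseline values, the step sizes and the helper `apply_quality` are hypothetical and are not taken from the study.

```python
# Direction of change relative to the modal baseline, as listed on the slide:
# +1 = higher / wider, -1 = lower / narrower.
VQ_DIRECTIONS = {
    "breathy": {"AV": -1, "OQ": +1, "SQ": -1, "TL": +1, "B1": +1},
    "tense": {"OQ": -1, "SQ": +1, "TL": -1, "B1": -1, "F0": +1},
}


def apply_quality(modal, quality, step):
    """Return a copy of `modal` shifted by `step` in the listed directions.

    `modal` and `step` map parameter names to numbers. Parameters that the
    slide does not mention for a given quality are left unchanged.
    """
    adjusted = dict(modal)
    for name, sign in VQ_DIRECTIONS[quality].items():
        adjusted[name] = modal[name] + sign * step[name]
    return adjusted


# Hypothetical baseline and step sizes (illustration only, NOT study values).
modal = {"AV": 60, "OQ": 50, "SQ": 200, "TL": 5, "B1": 80, "F0": 120}
step = {"AV": 3, "OQ": 15, "SQ": 50, "TL": 5, "B1": 40, "F0": 10}

breathy = apply_quality(modal, "breathy", step)
tense = apply_quality(modal, "tense", step)
```

Keeping the directions separate from the magnitudes makes it clear which part comes from the slide (the signs) and which part is an assumption (the numbers).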

F0 ONLY STIMULI
VQ+F0 STIMULI
Are these good pairs? We’ll see...
EXPERIMENT SETUP
• 20 native speakers
• 10 of 15 stimuli presented
• Responses given on pairs of opposite affective attributes:
   - Sad-happy
   - Intimate-formal
   - Relaxed-stressed
   - Bored-interested
   - Apologetic-indignant
   - Fearless-scared
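
One way to represent a response on these bipolar scales is a signed integer per attribute pair. The sketch below assumes a symmetric scale from -3 to +3 with 0 as neutral; the slides do not state the number of scale points, so `max_abs` and the helper `describe` are assumptions for illustration.

```python
# The six attribute pairs from the slide, as (negative pole, positive pole).
ATTRIBUTE_PAIRS = [
    ("sad", "happy"),
    ("intimate", "formal"),
    ("relaxed", "stressed"),
    ("bored", "interested"),
    ("apologetic", "indignant"),
    ("fearless", "scared"),
]


def describe(pair, rating, max_abs=3):
    """Turn a signed rating into a readable label.

    Negative ratings point to the first attribute of the pair, positive
    ratings to the second, and 0 means neither. `max_abs` is an assumed
    scale size, not a value given in the slides.
    """
    if not -max_abs <= rating <= max_abs:
        raise ValueError(f"rating must lie in [-{max_abs}, {max_abs}]")
    if rating == 0:
        return "neutral"
    left, right = pair
    label = left if rating < 0 else right
    return f"{label} ({abs(rating)}/{max_abs})"


example = describe(ATTRIBUTE_PAIRS[0], -2)  # a rating towards "sad"
```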
ANOVA
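
The slide names ANOVA without stating the design. As a reminder of what the test computes, here is a minimal sketch of a one-way (between-groups) F statistic in plain Python. This is the simplest variant and may well differ from the analysis actually used; the ratings in `toy_ratings` are invented for illustration and are not results from the experiment.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples.

    Each element of `groups` is a list of numeric ratings for one condition.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual ratings around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented sad-happy ratings for three voice qualities (toy data only).
toy_ratings = {
    "modal": [0, 1, 0, -1],
    "breathy": [-2, -1, -2, -3],
    "tense": [2, 3, 1, 2],
}
f_stat, df_b, df_w = one_way_anova_f(list(toy_ratings.values()))
```

A large F relative to the F distribution with (df_b, df_w) degrees of freedom indicates that the mean rating differs between voice qualities.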
RESULTS
CONCLUSION

• Some voice qualities are more strongly associated with certain emotions than others.
   X Intimacy, sadness -> breathy
   O Intimacy, sadness -> lax-creaky
• On average, voice quality cues are more effective than F0 cues in speech synthesis
   - Possibly because the voice quality stimuli already include F0 information
THANKS FOR YOUR ATTENTION
