Pres - 1.5 MB - International Association for Forensic Phonetics and

Transcription

The speaker discriminating power of within-speaker behaviour:
a study based on vowel formants
Gea de Jong, Kirsty McDougall, Francis Nolan and Toby Hudson
__________________________________________________________________
Dynamic variability in speech (DyViS):
a forensic study of British English
Principal investigator: Francis Nolan
Research associates: Kirsty McDougall, Gea de Jong
Research assistant: Toby Hudson
Technical support: Geoffrey Potter
Consultant: Mark Jones
ESRC Award no. RES-000-23-1248
__________________________________________________________________
Linguistics Department, Cambridge University
Hypothesis:
Is diachronic sound change a predictor
of where speaker idiosyncrasy lies?
Are sounds which are undergoing change those
which are most likely to differ between speakers?
__________________________________________________________________
Analysis: F1 and F2 formants
Stable vs changing vowels:
Stable: HEED /i/, HARD /ɑ/, HOARD /ɔ/
Changing: WHO’D /u/, HOOD /ʊ/, HAD /æ/
Measurements:
Frequencies of F1 and F2, measured at the vowel's steady state and,
where possible, close to the centre of the vowel
Measured using Praat
__________________________________________________________________
Materials:
• 50 male SSBE speakers
• Each hVd word preceded by schwa and followed by
today, produced in a sentence with nuclear stress:
– It’s a warning we’d better HEED today
– It’s only one loaf, but it’s all Peter HAD today
– We worked rather HARD today
– We built up quite a HOARD today
– He insisted on wearing a HOOD today
– He hates contracting words, but he said a WHO’D today.
• 6 repetitions
__________________________________________________________________
Previous results: n=20
[Figure: F1/F2 vowel plot (F2 2500-500 Hz, F1 200-1000 Hz) showing / iː /, / uː /, /ʊ/, / ɔː /, / ɑː / and /æ/ for Deterding 1990 (n=8) and DyViS 2006 (n=20)]
__________________________________________________________________
New results: n=50
[Figure: F1/F2 vowel plot (F2 2500-500 Hz, F1 200-1000 Hz) showing / iː /, / uː /, /ʊ/, / ɔː /, / ɑː / and /æ/ for Deterding 1990 (n=8), DyViS 2006 (n=20) and DyViS 2006 (n=50)]
__________________________________________________________________
Results: means of 6 tokens per subject x 50
[Figure: F1/F2 scatter plot of per-subject means (F2 2500-500 Hz, F1 200-1200 Hz) for heed, had, hard, hoard, hood and who'd]
__________________________________________________________________
Formant means and ranges:
[Figure: F1 means and ranges per vowel (HEED, HAD*, HARD, HOARD, *HOOD, WHO'D*), frequency axis 100-1200 Hz, marking min, max, 1 SD and 2 SD. Value labels (Hz): 1184, 854, 805, 648, 550, 495, 507, 470, 443, 405, 406, 376, 343, 315, 305, 330, 239, 236]
__________________________________________________________________
Formant means and ranges:
[Figure: F2 means and ranges per vowel (HEED, HAD*, HARD, HOARD, *HOOD, WHO'D*), frequency axis 400-2800 Hz. Value labels (Hz): 2694, 2281, 2114, 1907, 1905, 1863, 1637, 1513, 1426, 1330, 1336, 1088, 1036, 920, 937, 898, 759, 576]
__________________________________________________________________
Results: Formant ranges: F1 vs F2
[Figure: bar chart of the range in frequency (0-1200 Hz) of F1 and F2 for HEED, HAD*, HARD, HOARD, *HOOD and WHO'D*]
__________________________________________________________________
Results: Formant ranges: F1 vs F2
[Figure: two bar charts of F1 and F2 ranges for HEED, HAD*, HARD, HOARD, *HOOD and WHO'D*: one in Hz (0-1200) and one in ERB (0.0-7.0)]
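The second panel re-expresses the formant ranges on the ERB scale. The slides do not say which conversion was used, so the sketch below assumes the Glasberg and Moore (1990) ERB-rate formula; the function names and the example frequencies are hypothetical and only illustrate why a fixed range in Hz counts for more at low (F1) frequencies than at high (F2) frequencies.

```python
import math


def hz_to_erb_rate(freq_hz: float) -> float:
    """Convert Hz to ERB-rate.

    Assumption: Glasberg & Moore (1990) formula; the slides do not
    state which Hz-to-ERB conversion the authors used.
    """
    return 21.4 * math.log10(0.00437 * freq_hz + 1.0)


def range_in_erb(min_hz: float, max_hz: float) -> float:
    """Width of a formant range after converting both ends to ERB-rate."""
    return hz_to_erb_rate(max_hz) - hz_to_erb_rate(min_hz)


# Hypothetical ranges, both 200 Hz wide, at F1-like and F2-like frequencies.
low_range = range_in_erb(300.0, 500.0)
high_range = range_in_erb(1300.0, 1500.0)
print(f"300-500 Hz:   {low_range:.2f} ERB")
print(f"1300-1500 Hz: {high_range:.2f} ERB")
```

Under this assumption the same 200 Hz spread is wider in ERB at F1 frequencies, which is one reason to compare F1 and F2 ranges on both scales.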
__________________________________________________________________
Within-speaker variability:
Sounds like?
Comparing standard deviations for F2: stable vs non-stable
[Figure: three bar charts of the within-speaker SD of F2 (0-250 Hz) per subject: stable HOARD, non-stable HOOD and non-stable WHO’D]
__________________________________________________________________
Very variable WHO’D: Subject 26, SD 156 Hz
[Figure: bar chart of the within-speaker SD (0-250 Hz) per subject, with Subject 26 highlighted]
‘He hates contracting words, but he said a WHO’D today’ x 6
F2 in /uː/: 1392, 1716, 1392, 1310, 1627, 1461 Hz
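The SD quoted for Subject 26 can be checked against the six F2 values on the slide. The sketch below is only a check under an assumption: it uses the sample standard deviation (n-1 denominator), which appears to match the quoted figure; the variable names are hypothetical.

```python
from statistics import mean, pstdev, stdev

# F2 of /uː/ in the six WHO'D tokens of Subject 26, as listed on the slide (Hz).
f2_tokens = [1392, 1716, 1392, 1310, 1627, 1461]

token_mean = mean(f2_tokens)
sample_sd = stdev(f2_tokens)       # n-1 denominator
population_sd = pstdev(f2_tokens)  # n denominator, shown for comparison

print(f"mean F2       = {token_mean:.0f} Hz")
print(f"sample SD     = {sample_sd:.1f} Hz")
print(f"population SD = {population_sd:.1f} Hz")
```

The sample SD comes out at about 156 Hz, in line with the slide, whereas the population SD would be about 143 Hz.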
__________________________________________________________________
Subject 26
F2 in /uː/: 1392, 1716, 1392, 1310, 1627, 1461 Hz
[Figure: F1/F2 vowel plot (F2 2500-500 Hz, F1 200-1000 Hz) showing / iː /, / ɔː / and /æ/ for Deterding 1990 (n=8) and DyViS 2006 (n=50)]
__________________________________________________________________
Subject 26
F2 in /uː/: 1392, 1716, 1392, 1310, 1627, 1461 Hz
[Figure: F1/F2 vowel plot (F2 2500-500 Hz, F1 200-1000 Hz) showing / iː /, / ɔː / and /æ/ for Deterding 1990 (n=8) and DyViS 2006 (n=50), with Subject 26's six WHO’D tokens marked x]
__________________________________________________________________
Very variable WHO’D: subject 26
F2: 1392, 1716, 1392, 1310, 1627, 1461 Hz
__________________________________________________________________
Speaker discrimination power?
Between speaker
[Figure: F1/F2 scatter plot of per-subject means (F2 2500-500 Hz, F1 200-1200 Hz)]
Within speaker
[Figure: bar chart of the mean standard deviation (0-80) of F1 and F2 for HEED, HAD*, HARD, HOARD, HOOD* and WHO'D*]
__________________________________________________________________
Speaker discrimination power?
[Figure: F1/F2 scatter plot of per-subject means (F2 2500-500 Hz, F1 200-1200 Hz), annotated LARGE SPREAD F2! and LARGE SPREAD F1!]
Within speaker
[Figure: bar chart of the mean standard deviation (0-80) of F1 and F2 for HEED, HAD*, HARD, HOARD, HOOD* and WHO'D*]
__________________________________________________________________
F-RATIO: between-speaker/ within-speaker
[Figure: bar chart of the F-ratio (0-90) of F1 and F2 for HEED, HAD*, HARD, HOARD, HOOD* and WHO'D*]
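The slides define the F-ratio only as between-speaker over within-speaker variation. One common way to compute such a ratio is the one-way ANOVA F statistic, and the sketch below assumes that definition; the function name and the toy data are hypothetical and are not DyViS measurements.

```python
from statistics import mean


def f_ratio(tokens_by_speaker: dict[str, list[float]]) -> float:
    """One-way ANOVA F: between-speaker mean square / within-speaker mean square.

    Assumption: the slides do not give the exact formula used.
    """
    groups = list(tokens_by_speaker.values())
    n_speakers = len(groups)
    n_tokens = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Spread of the speaker means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Spread of each speaker's tokens around that speaker's own mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (n_speakers - 1)
    ms_within = ss_within / (n_tokens - n_speakers)
    return ms_between / ms_within


# Hypothetical F2 values (Hz): same speaker means, different token scatter.
consistent = {"A": [990, 1000, 1010], "B": [1190, 1200, 1210], "C": [1390, 1400, 1410]}
variable = {"A": [900, 1000, 1100], "B": [1100, 1200, 1300], "C": [1300, 1400, 1500]}

print(f_ratio(consistent))
print(f_ratio(variable))
```

With identical speaker means, the more variable speakers produce a much lower ratio, which mirrors the point that large within-speaker variability reduces discriminating power.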
__________________________________________________________________
F-RATIO: between-speaker/ within-speaker
[Figure: bar chart of the F-ratio (0-90) for HEED, HAD*, HARD, HOARD, HOOD* and WHO'D*]
__________________________________________________________________
Within-speaker patterns:
• If fronted WHO’D, then also fronted HOOD?
[Figure: F1/F2 plot of per-speaker means for WHO’D and HOOD (F2 2300-900 Hz, F1 200-550 Hz)]
__________________________________________________________________
• Not necessarily!
[Figure: F1/F2 plot of per-speaker means for WHO’D and HOOD (F2 2300-900 Hz, F1 200-550 Hz)]
__________________________________________________________________
6 most different F2 means: WHO’D very fronted, HOOD not/hardly fronted
[Figure: F1/F2 plot of per-speaker means for WHO’D and HOOD (F2 2300-900 Hz, F1 200-550 Hz)]
__________________________________________________________________
• If fronted HOOD, then also fronted WHO’D?
[Figure: F1/F2 plot of per-speaker means for WHO’D and HOOD (F2 2300-900 Hz, F1 200-550 Hz), with a question mark annotation]
__________________________________________________________________
• No large differences found:
[Figure: F1/F2 plot of per-speaker means for WHO’D and HOOD (F2 2300-900 Hz, F1 200-550 Hz)]
__________________________________________________________________
Correlation: WHO’D fronting / HOOD fronting
R=0.55
[Figure: scatter plot of HOOD F2 against WHO'D F2 (both axes 900-2100 Hz, running from conservative to fronted)]
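The R value relates each speaker's WHO’D F2 to the same speaker's HOOD F2. The slides do not name the coefficient, so the sketch below assumes a Pearson correlation over per-speaker means; the function name and the five data points are hypothetical and are not the DyViS measurements.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical per-speaker mean F2 values (Hz), one pair per speaker.
whod_f2 = [1100.0, 1350.0, 1500.0, 1700.0, 1900.0]
hood_f2 = [1000.0, 1250.0, 1100.0, 1400.0, 1300.0]

print(f"r = {pearson_r(whod_f2, hood_f2):.2f}")
```

For scale, a coefficient of 0.55 corresponds to r squared of about 0.30, so under this reading roughly 30% of the variance in one vowel's F2 is shared with the other.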
__________________________________________________________________
Within-speaker variation (in SD)
[Figure: two F1/F2 plots of per-speaker means, WHO’D* (F2 2300-900 Hz, F1 200-550 Hz) and HOOD* (F2 2300-900 Hz, F1 300-460 Hz), with speakers coded by within-speaker SD band: 0-40, 40-80, 80-120, 120-]
__________________________________________________________________
Within-speaker variation (in SD)
[Figure: three F1/F2 plots of per-speaker means, WHO’D*, HOOD* and HOARD, with speakers coded by within-speaker SD band: 0-40, 40-80, 80-120, 120-]
__________________________________________________________________
Within-speaker variation (in SD)
Matching speakers: S18, S6
[Figure: two F1/F2 plots of per-speaker means, WHO’D* (F2 2300-900 Hz, F1 200-550 Hz) and HOOD* (F2 2300-900 Hz, F1 300-460 Hz), with speakers coded by within-speaker SD band: 0-40, 40-80, 80-120, 120-]
__________________________________________________________________
Within-speaker variation (in SD)
Matching speakers: S51
[Figure: two F1/F2 plots of per-speaker means, WHO’D* (F2 2300-900 Hz, F1 200-550 Hz) and HOOD* (F2 2300-900 Hz, F1 300-460 Hz), with speakers coded by within-speaker SD band: 0-40, 40-80, 80-120, 120-]
__________________________________________________________________
Within-speaker variation (in SD)
Matching speakers: S22
[Figure: two F1/F2 plots of per-speaker means, WHO’D* (F2 2300-900 Hz, F1 200-550 Hz) and HOOD* (F2 2300-900 Hz, F1 300-460 Hz), with speakers coded by within-speaker SD band: 0-40, 40-80, 80-120, 120-]
__________________________________________________________________
Within-speaker variability patterns?
[Figure: F1/F2 plot of per-speaker means for WHO’D and HOOD (F2 2300-900 Hz, F1 200-550 Hz)]
__________________________________________________________________
4:6 HOOD more variable than WHO’D
[Figure: F1/F2 plot of per-speaker means for WHO’D and HOOD (F2 2300-900 Hz, F1 200-550 Hz), with value labels: 42, 69, 66, 79, 62, 63, 57, 80, 79, 41, 83, 31]
__________________________________________________________________
[Figure: F1/F2 plot of per-speaker means for WHO’D and HOOD (F2 2300-900 Hz, F1 200-550 Hz), with value labels: 122, 141, 91, 81, 91, 58, 72, 47, 121, 28, 97, 81]
__________________________________________________________________
Conclusions:
• This study confirmed:
– fronting of /uː ʊ/ → F2 increased
– more open /æ/ → F1 increased
– resulting in extra large formant ranges for these vowels
• Similar results for n=20 and n=50
• Changing vowels in HOOD, WHO’D and HAD provide better speaker
discrimination than historically stable vowels in HOARD and HARD
• WHO’D performs less well than HOOD due to its large within-speaker
variability
• Overall, within-speaker variability is larger for changing vowels
than for stable vowels
• Stable HEED F2 performs best due to large between-speaker
variability and relatively small within-speaker variability
• When WHO’D is fronting first, fronting of HOOD may be delayed
• When HOOD is fronting, WHO’D is also fronting
__________________________________________________________________
RELEVANCE FOR FORENSIC PHONETICS:
• Sounds undergoing change are useful as SPID
parameters: they show large between-speaker variability
and different usage patterns for speakers
• Concerning the stable vowels, F2 of HEED offers good
SPID
• Caution required when measuring changing vowel
formants: large range possible for formant frequencies
__________________________________________________________________
Acknowledgments
This research is supported by the UK Economic and Social Research Council
as part of the project ‘Dynamic Variability in Speech [DyViS]:
A Forensic Phonetic Study of British English’.
Our thanks to Geoff Potter for technical assistance.
ESRC Award no. RES-000-23-1248
__________________________________________________________________
DyViS research findings can be found at:
http://www.ling.cam.ac.uk/dyvis/
__________________________________________________________________
