slides - Sheffield Department of Computer Science

Human perception and listening by machines
Cleo Pike (Chair)
School of Psychology and Neuroscience, University of St Andrews, UK
Amy Beeston (Panelist)
Department of Computer Science, University of Sheffield, UK
Workshop W17 · 140th Audio Engineering Society Convention · Paris, France · 7 June 2016

Research background
Cleo Pike
Academy of Contemporary Music: Music Production
University of Surrey: MSc Psychology
University of Surrey: PhD Psychoacoustics
University of St Andrews: Research: Multi-Sensory Perception
Cleo Pike and Amy Beeston · AES 140 · W17 · Human perception and listening by machines · Paris · 7 June 2016

Research background
Amy Beeston
University of Oxford: Physics
University of Edinburgh: BMus Music Technology
Koncon, The Hague: MMus Sonology
University of Sheffield: PhD Computer Science

This session
Plan
Section 1: What is machine listening?
Section 2: What are the processes involved in machine listening, and the problems encountered?
Section 3: How do humans do it better?

Introduction
What is a machine listener?
What is a machine?
A machine receives input commands and follows rules in software to perform an action.
What is listening?
Registering audio input (hearing) + an effort to interpret (recognize/attend to) input.
Machine listeners:
Hear: mechanical transduction, division by frequency, transduction to neural firing, feature selection
Listen: categorisation, recognition, streaming
Act: follow rules/norms for acting
Learn: ?

Applications of machine listening
ASR
Automatic Speech Recognition:
Siri (Apple)
Cortana (Windows)

Applications of machine listening
ASR
Siri: Karen Jacobsen (Aus)
Siri: Jon Briggs (UK)
Siri: Susan Bennett (USA)
Cortana: Jen Taylor (USA)
https://www.theguardian.com/technology/2015/aug/12/siri-real-voices-apple-ios-assistant-jon-briggs-susan-bennett-karen-jacobsen

Applications of machine listening
ASR
Speech recognition:
Dictation systems
Translation systems
English speech → English word? --- French word? → French speech
Speaker recognition:
Verification (check it's you)
Identification (work out who you are compared to N other people)
http://peterthink.blogs.com/thinking/webtech/
http://www.amiproject.org/ami-scientific-portal/idiap-research-themes
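The difference between verification (one-to-one) and identification (one-to-N) can be sketched in a few lines. This is a minimal illustration only: the three-dimensional "voice prints", the cosine-similarity score and the 0.8 threshold are all assumptions made for the example, not details from the slides or from any particular system.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(probe, claimed_model, threshold=0.8):
    """Verification: one-to-one check against the claimed identity."""
    return cosine(probe, claimed_model) >= threshold

def identify(probe, enrolled):
    """Identification: one-to-N search for the closest enrolled speaker."""
    return max(enrolled, key=lambda name: cosine(probe, enrolled[name]))

# Hypothetical enrolled voice prints; real systems use far richer features.
enrolled = {
    "amy": np.array([1.0, 0.1, 0.0]),
    "cleo": np.array([0.0, 1.0, 0.2]),
}
probe = np.array([0.9, 0.2, 0.1])

print(verify(probe, enrolled["amy"]))   # does the probe match the claimed speaker?
print(identify(probe, enrolled))        # which enrolled speaker is closest?
```

Verification needs a threshold because it can reject everyone; identification as sketched here always returns the best of the N candidates.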

Applications of machine listening
Beyond ASR
para-linguistics: emotion recognition, conversation analysis
event detection: sonic interaction, alarm notification
engagement: sort and search, information retrieval
Image sources:
www.thatwhitepaperguy.com/images/using-voicerecognition.png
www.maximumpc.com/files/u96627/shout.jpg
www.kallbinauralaudio.com/wp-content/uploads/2012/02/fingersnap.jpg
www-labs.iro.umontreal.ca/~pii3205/H09/graphics/spikes.png
http://www.slate.com/blogs/atlas_obscura/2013/10/11/britain_s_giant_concrete_ears_built_to_warn_of_an_enemy_aircraft_attack.html
http://www.bbc.co.uk/staticarchive/94cc9170872b366182a452f7b9ba78e4d6b83342.jpg
http://nordicapis.com/20-emotion-recognition-apis-that-will-leaveyou-impressed-and-concerned/

Applications of machine listening
Example 1. para-linguistics
https://www.newscientist.com/article/mg22229683-800-speech-analyser-monitors-emotion-for-call-centres/
http://news.mit.edu/2016/startup-cogito-voice-analytics-call-centers-ptsd-0120
– helps customer-service reps build better rapport with customers
http://vms.mit.edu/cogito

Applications of machine listening
Example 2. event detection
http://www.audioanalytic.com/uses/
"13,090,191,849 seconds of audio processed"
quoted from http://www.audioanalytic.com/ accessed 5 June 2016

Applications of machine listening
Example 3. engagement
Shazam
– digital fingerprint
http://www.shazam.com/assets/images/website/apps/mobile_ios_and_android-72a04dcf.png
Helps people recognize and engage with the world around them
http://www.shazam.com/company
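To give a feel for what a "digital fingerprint" can be, here is a toy sketch that hashes pairs of spectral peaks. It is not Shazam's actual method: the single-peak-per-frame rule, the `fan_out` parameter, the hash layout and the synthetic tone sequences are all assumptions chosen to keep the example short.

```python
import numpy as np

def peak_bins(x, frame=1024, hop=512):
    """Index of the strongest FFT bin in each windowed frame."""
    win = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    return [int(np.argmax(np.abs(np.fft.rfft(x[i * hop:i * hop + frame] * win))))
            for i in range(n_frames)]

def fingerprint(x, fan_out=3):
    """Set of (bin_now, bin_later, frames_apart) hashes (hypothetical scheme)."""
    peaks = peak_bins(x)
    return {(f1, peaks[i + dt], dt)
            for i, f1 in enumerate(peaks)
            for dt in range(1, fan_out + 1)
            if i + dt < len(peaks)}

def tones(freqs, sr=8000, dur=0.5):
    """Concatenate sine tones, one per frequency, as a stand-in for music."""
    t = np.arange(int(sr * dur)) / sr
    return np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])

song = tones([440, 880, 1320])
excerpt = song[2048:8192]            # a clip that starts on a frame boundary
other = tones([300, 700, 1100])

match = len(fingerprint(excerpt) & fingerprint(song))
mismatch = len(fingerprint(excerpt) & fingerprint(other))
print(match, mismatch)
```

Because the hashes encode only relative timing between peaks, an excerpt can be matched without knowing where in the recording it came from. A practical system would also need to cope with clips that do not start on a frame boundary, and with noise.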

Conceptual explanation of ML
Processes behind machine listening
For any ML we need: hardware and software
Machines:
Engineered peripherals (microphones)
Engineered software-algorithms
Engineered peripherals (loudspeakers)
Humans:
Evolved peripherals (ears, nerves)
Evolved software-algorithms
Evolved peripherals (nerves, mouth)
Materialist vs Dualist Philosophy
Descartes: mind/body dualism. https://en.wikipedia.org/wiki/Mind

Conceptual explanation of ML
Statistical ASR
[Figure: statistical ASR pipeline. Front end: preprocessing, feature extraction. Back end: decoding, drawing on the acoustic model P(X|Q), the pronunciation model P(Q|W) and the language model P(W). Example output: "the cat chased the mail". Source: http://slideplayer.com/slide/6218710/]
Yoshioka et al (2012). IEEE Signal Process Mag, 29(6): 114–126
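The three probabilities named in the statistical ASR diagram combine in the usual decoding rule. The slide does not define its symbols, so the following is a sketch under the common reading that X is the sequence of feature vectors, Q a sequence of sub-word units (for example phone states) and W a word sequence. It also assumes that X depends on W only through Q.

```latex
\hat{W} = \arg\max_{W} P(W \mid X)
        = \arg\max_{W} \; P(W) \sum_{Q} P(X \mid Q)\, P(Q \mid W)
```

The denominator P(X) from Bayes' rule is dropped because it does not depend on W. A strong language model P(W) can override weak acoustic evidence, which is one way a recogniser may arrive at an output such as "the cat chased the mail".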

Conceptual explanation of ML
In general: task and context dependencies
[Figure: a 1-D time-domain audible signal is turned into n-D feature vectors, which are used for classification (supervised), clustering (unsupervised), control parameters, or combining features (multimodal). Image: http://dsii.dsi.unifi.it/~moods/moods/images/note.gif]

Temporal feature (1D)
Intensity tracking (Praat)
Praat
- doing phonetics by computer
- www.praat.org/
1. Track the intensity envelope
– Open file, show intensity
– Extract visible intensity contour
2. Segment signal (voice activity detection)
– Praat Objects window, Intensity > To TextGrid (silences)
3. Sound and TextGrid > View & Edit
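The two Praat steps above (track intensity, then split into sounding and silent stretches) can be approximated outside Praat. The sketch below is an assumption-laden stand-in rather than Praat's algorithm: it uses non-overlapping 20 ms frames, RMS level in dB, and a threshold 25 dB below the loudest frame. The test signal is synthetic.

```python
import numpy as np

def intensity_db(x, sr, frame_s=0.02):
    """RMS level of consecutive non-overlapping frames, in dB re amplitude 1.0."""
    n = int(sr * frame_s)
    n_frames = len(x) // n
    frames = x[: n_frames * n].reshape(n_frames, n)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return 20 * np.log10(np.maximum(rms, 1e-10))

def sounding_segments(level_db, frame_s=0.02, threshold_db=-25.0):
    """(start_s, end_s) for each run of frames within threshold_db of the peak."""
    active = level_db > (level_db.max() + threshold_db)
    edges = np.diff(np.concatenate(([0], active.astype(int), [0])))
    starts = np.flatnonzero(edges == 1)     # silence -> sound transitions
    ends = np.flatnonzero(edges == -1)      # sound -> silence transitions
    return [(float(s * frame_s), float(e * frame_s)) for s, e in zip(starts, ends)]

# Synthetic test signal: 0.25 s silence, 0.5 s tone, 0.25 s silence.
sr = 16000
t = np.arange(sr) / sr
x = np.zeros(sr)
x[4000:12000] = 0.5 * np.sin(2 * np.pi * 220 * t[4000:12000])

segments = sounding_segments(intensity_db(x, sr))
print(segments)
```

A fixed level threshold like this works on clean recordings. It has no way of telling a wanted sound from an equally loud unwanted one.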

Temporal feature (1D)
Intensity tracking fail
[Figure: amplitude against time (mm:ss) for a recording containing snore sounds and a passing aircraft, with a second panel showing frequency (low, mid, high) against time (mm:ss) and the aircraft marked. Images: upload.wikimedia.org/wikipedia/commons/3/3c/Qantas_b747_over_houses_arp.jpg; http://www.snoringmouthpieceguide.com/wp-content/uploads/2013/07/me-snoring.jpg]
schema-driven (top-down)
•  prior knowledge, semantics, pragmatics
primitive (bottom-up) grouping
•  simultaneous (vertical) – common on/offset, harmonicity
•  sequential (horizontal) – continuity, proximity
Beeston and Brown (2015). Newcastle Sleep 2015, UK.

Spectral features (2D)
Pitch estimation (Sonic Visualiser)
Sonic Visualiser
-  for viewing and analysing the contents of music audio files
-  http://www.sonicvisualiser.org
Aubio Pitch Detector
-  http://www.vamp-plugins.org/
1. Reveal the harmonic structure
– Open file
– Pane > Add spectrogram
2. Estimate fundamental frequency in harmonic regions
– Transform > Aubio Pitch Detector
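For readers without Sonic Visualiser to hand, fundamental-frequency estimation in a harmonic region can be illustrated with a plain autocorrelation method. This is a swapped-in textbook technique, not the algorithm used by the Aubio plugin. The search range of 60 to 500 Hz and the synthetic three-harmonic tone are assumptions.

```python
import numpy as np

def estimate_f0(frame, sr, fmin=60.0, fmax=500.0):
    """Pick the autocorrelation peak within the lag range for fmin..fmax."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]  # lags 0..N-1
    lag_min = int(sr / fmax)
    lag_max = int(sr / fmin)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sr / lag

# Synthetic harmonic tone: 200 Hz fundamental plus two weaker harmonics.
sr = 16000
t = np.arange(2048) / sr
tone = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in (1, 2, 3))

f0 = estimate_f0(tone, sr)
print(f0)
```

The method assumes the frame is periodic. On noisy or inharmonic input it still returns a number, so practical trackers usually report a confidence value alongside the estimate.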

Spectral features (2D)
Pitch estimation 'fail'
•  Owen Green's And now for some music (2007)
–  Listen at https://soundcloud.com/gungwho/and-now-for-some-music-2007
•  Implementation
–  First pitch tracker adaptively divides input sound into two classes (pitch or noise)
–  Amount of disagreement between first and second pitch tracker controls signal processing, resulting in more/less perceived roughness/disruption of input
Green (2014). Proc NIME, 1–6.

Multi-dimensional features (nD)
Timbral description (Max)
Max
sound, graphics, music, interactivity
https://cycling74.com/
[analyzer~] object by Tristan Jehan
http://web.media.mit.edu/~tristan/
Matlab alt.
Timbre Toolbox (McGill, Canada)
http://www.cirmmt.org/research/tools
MIRToolbox (Jyväskylä, Finland)
https://twitter.com/mirtoolbox
Python alt.
Essentia (Barcelona, Spain)
http://essentia.upf.edu/
•  …perceived dissimilarity despite same loudness, pitch and duration
•  Brightness => spectral centroid
•  Noisiness => spectral flatness
Di Scipio (2003). Organised Sound, 8(3), 269–277.
Beeston (2015). 2nd Royal Musical Association MPSG workshop.
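The two descriptors named above have simple textbook definitions, sketched here with NumPy rather than with Max, the Matlab toolboxes or Essentia. The Hann window, the single-frame analysis and the synthetic test signals are assumptions made for the example.

```python
import numpy as np

def spectral_centroid(x, sr):
    """Brightness: magnitude-weighted mean frequency, in Hz."""
    mag = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    return float(np.sum(freqs * mag) / np.sum(mag))

def spectral_flatness(x):
    """Noisiness: geometric mean over arithmetic mean of the power spectrum."""
    power = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2 + 1e-12
    return float(np.exp(np.mean(np.log(power))) / np.mean(power))

sr = 16000
t = np.arange(4096) / sr
low_tone = np.sin(2 * np.pi * 300 * t)
high_tone = np.sin(2 * np.pi * 3000 * t)
noise = np.random.default_rng(0).standard_normal(4096)

print(spectral_centroid(low_tone, sr), spectral_centroid(high_tone, sr))
print(spectral_flatness(low_tone), spectral_flatness(noise))
```

Flatness approaches 1 for a flat (noise-like) spectrum and 0 for a spectrum dominated by a few peaks. Both descriptors are computed from the signal at the microphone, so anything that alters the spectrum on the way there alters the description too.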

Spectro-temporal features (nD)
Timbral description 'fail'
•  humans adapt to the room (and fast!)
•  our machine listeners typically don't
Image sources:
zumrotenigel.files.wordpress.com/2013/02/symphony-hall.jpg
www.flickr.com/photos/130738664@N02/16122099133
abstractcritical.com/wp-content/uploads/2014/01/PUMHint1_1.jpg

Problems with ML applications
Machine listening is flawed
•  ML breaks with common environmental problems (noise, channel coloration, reverb)
•  We are only using one source of information to classify sounds. Real systems can use multiple sources
•  Most cues have problems with reverberation/background noise and coloration
•  Human listeners have a means of overcoming these problems and machine listening can incorporate this
•  Humans take the context into account

Compensation for reverberation
Overview
•  Reverberation degrades speech intelligibility
– acoustic content differs with distance
– but phonetic content persists
•  We compensate for reverberation
– monaural/binaural
•  Compensation is reliant on contextual sound
– what factors promote/inhibit compensation?
•  Can machine listeners use equivalent cues?
– same mistakes as humans?
Beeston (2015). PhD Thesis, University of Sheffield, UK.

Compensation for reverberation
Late reverberation
[Figure: amplitude against time (s) for the same signal at near and far distances]
•  Late reverb => noise-like effects
– Increases noise floor
– Reduces dynamic range of temporal envelope
•  Stop consonants => very sensitive to reverb
– identification depends on rapid amplitude modulation, e.g. [t] dip
– peaks prolonged, dips filled
Nábělek et al. (1989). J Acoust Soc Am, 86(4), 1259-1265.
Drullman et al. (1994). J Acoust Soc Am, 95(2), 1053-1064.
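The "dips filled" effect can be reproduced with a synthetic signal. In this sketch the noise bursts and silent gap are a crude stand-in for the closure in a stop consonant. The room response (a direct path plus exponentially decaying noise with an RT60 of 0.8 s and a tail gain of 0.05) is a hypothetical choice, not a measured impulse response.

```python
import numpy as np

rng = np.random.default_rng(0)
sr = 8000

def envelope_db(x, frame=80):
    """RMS level in dB for consecutive 10 ms frames (80 samples at 8 kHz)."""
    frames = x[: len(x) // frame * frame].reshape(-1, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return 20 * np.log10(np.maximum(rms, 1e-6))

# Dry signal: 200 ms noise burst, 100 ms silent gap, 200 ms noise burst.
burst = rng.standard_normal(int(0.2 * sr))
dry = np.concatenate([burst, np.zeros(int(0.1 * sr)), burst])

# Hypothetical room response: direct path, then a tail that decays by 60 dB in rt60.
rt60 = 0.8
t = np.arange(int(rt60 * sr)) / sr
tail = 0.05 * rng.standard_normal(len(t)) * 10 ** (-3 * t / rt60)
wet = np.convolve(dry, np.concatenate(([1.0], tail)))[: len(dry)]

def dip_depth_db(x):
    """Mean level of the first burst minus mean level in the middle of the gap."""
    env = envelope_db(x)
    return float(env[:20].mean() - env[22:28].mean())

dry_dip = dip_depth_db(dry)
wet_dip = dip_depth_db(wet)
print(dry_dip, wet_dip)
```

In the dry signal the gap is as deep as the numerical floor allows. In the reverberant version the decaying tail of the first burst fills it, leaving only a shallow dip in the temporal envelope.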

Compensation for reverberation
Watkins – Next you'll get {sir, stir} to click on
[Figure: position of the sir/stir category boundary for near and far test-words in near and far contexts. Arrows mark the effect of increasing reverb on the test-word, the effect of increasing reverb on the context, and the resulting compensation.]
Watkins (2005). JASA, 118(1), 249-262.

Human audition
Peripheral and central processing
•  Continual recalibration: feedback (centrally and to the periphery)
– low-level/stimulus driven and high-level/attentional effects
https://www.gallaudet.edu/images/clerc/ear1.GIF
Guinan (2011). Auditory and vestibular efferents

Computational model
Efferent-inspired auditory model
•  Efferent processing => reduce response to energy in reverberant tail
[Figure: block diagram of the model. Afferent path: input signal (sir/stir) y_in(t) → OME → y_om(t) → DRNL → y_bm(t,c) → hair cell → y_hc(t,c) → STEP → y_an(n,c). Efferent path labels: metric, window, ATT. Panels show amplitude against time and frequency (100–8000 Hz) against time (0–250 ms) at successive stages.]
Ferry and Meddis (2007). J Acoust Soc Am, 122(6), 3519-3526.
Beeston and Brown (2010). Interspeech, pp 2462-2465.

Compensation for reverberation
Findings
•  Monaural replication and extension of Watkins' work
– real speech, multiple talkers, incl. s + {t, k, p} + vowel
•  Human compensation
– is apparent for {p, t, k} when high freqs are present
– is abolished with time-reverse reverberation
– uses intrinsic info when extrinsic context is ambiguous
– is rapid (c. 500 ms)
•  Compensation model
– does not require phonetic processing
– uses efferent processing to help recover [t] dip
– best version derives info from reverberant tails
Beeston, Brown and Watkins (2014). J Acoust Soc Am, 136(6), 3072-3084.
Beeston and Brown (2014). 7th Forum Acusticum, Krakow, Poland.

Compensation for spectral distortion
Aim
Do humans compensate for spectral distortion (colouration) caused by the environment?
What are the perceptual mechanisms involved?
Can we apply any to benefit ML?

Compensation for spectral distortion
Rationale
Spectrum - key to recognition, e.g. /e/ or /a/
Environment - rooms, loudspeakers, microphone
Spectral distortion/colouration - /e/ physically becomes /a/
Compensation - we still hear the intended /e/ vowel

Compensation for spectral distortion
Experiment 1
Image sources:
Cleo Pike
http://www.esm.rochester.edu/concerts/halls/hatch/
http://www.dogoilpress.com/FDS-375985.html (image: 3784X2592/OFFICE/#375985)

Compensation for spectral distortion
Results
[Figure: results for Condition 1 and Condition 2, labelled 1 2 3 / 2 3 1. Marked "TO BE UPDATED" on the slide.]
Pike, C.D. (2015) Timbral constancy and compensation for spectral distortion caused by loudspeaker and room acoustics

Compensation for spectral distortion
Explanation of results
[Figure: items 1, 2 and 3 separated by longer time gaps]
A memory effect? (Olive et al. 1995)
Condition × Room: F = 187.22, p < .001

Compensation for spectral distortion
Experiment 2
[Figure: Condition 3 and Condition 1 against time, labelled 1 2 3 / 2 3 1]
Is time between listening a cause of compensation?

Compensation for spectral distortion
Results
[Figure: results for Condition 3 and Condition 1, labelled 1 2 3 / 2 3 1. Marked "TO BE UPDATED" on the slide.]

Compensation for spectral distortion
Explanation
[Figure: items 1, 2 and 3 separated by time gaps]
A memory effect? (Olive et al. 1995)
Memory loss should cause noise in ratings, not contraction

Compensation for spectral distortion
Explanation
Aim: Find mechanisms to explain compensation due to time gap
The 'auditory enhancement' effect
[Figure: frequency (low to high) against time, showing Spectrum 1 followed by Spectrum 2]

Compensation for spectral distortion
Overall findings
Auditory enhancement emphasises spectral change in running speech or music.
This raises spectral change in speech/music above a 'colouration floor'.
Additionally, the colouration floor can be removed by a similar process with a longer time course…
The spectral compensation effect

Compensation for spectral distortion
Application to ML
Is this process implemented in machines?
Machine listeners do remove colouration:
'Speaker vocal tract compensation'
– Vocal Tract Length Normalisation
– Cepstral mean subtraction
Can also be used to remove colouration by any channel
Are these even needed for colouration by reverb? Dereverberation processes should also remove colouration
Co-articulation effects could also be compensated for with an 'enhancement' process
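Cepstral mean subtraction rests on one observation: a fixed channel multiplies every frame's spectrum by the same response, so it adds the same vector to every frame's cepstrum, and subtracting the mean over time removes it. The sketch below demonstrates this under idealised assumptions: random frames stand in for speech, the `channel` response is invented, and the colouration is applied exactly per frame in the frequency domain, which real rooms and microphones only approximate.

```python
import numpy as np

def cepstra(frames):
    """Real cepstrum of each row: inverse FFT of the log magnitude spectrum."""
    spec = np.abs(np.fft.rfft(frames, axis=1)) + 1e-12
    return np.fft.irfft(np.log(spec), axis=1)

def cms(c):
    """Cepstral mean subtraction: remove each coefficient's mean over time."""
    return c - c.mean(axis=0, keepdims=True)

rng = np.random.default_rng(0)
frames = rng.standard_normal((50, 256))                 # stand-in for speech frames
channel = 1.0 + 0.8 * np.cos(np.linspace(0, 3 * np.pi, 129))   # hypothetical colouration
coloured = np.fft.irfft(np.fft.rfft(frames, axis=1) * channel, n=256, axis=1)

clean_c = cepstra(frames)
coloured_c = cepstra(coloured)
print(np.abs(clean_c - coloured_c).max())               # channel visible before CMS
print(np.abs(cms(clean_c) - cms(coloured_c)).max())     # and gone after CMS
```

The same subtraction also removes any constant component of the speech itself. It relies on the channel staying fixed over the averaging window.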

Other work
Computational Auditory Scene Analysis
Cocktail party problem: we receive a mix of sound
How do we pick out any one?
ASA – principles in human listening to segregate auditory streams
Knowledge based (Schema) grouping – Prior experience, top down
Primitive grouping – low level, bottom up
Examples:
Common onset
Common AM, FM
Harmonicity

Other work
CASA
Blind source separation, Spatial filtering, Independent Component Analysis – Various drawbacks
CASA mimics the auditory system from the beginning:
Just two microphones →
Gammatone filterbank →
Cochleogram →
Segmentation into TF units →
Segregation vertically and horizontally, based on ASA principles →
Binary/soft mask applied to isolate target from noise

Other work
CASA
The ideal binary mask is a 0-or-1 separation of the target from the background noise
Similar to occlusion in vision
Occlusion · Partial Occlusion · Figure/Ground Separation
However, attention is not absolute, so further psychoacoustic principles could be added…
"Humans can hold the noise streams in mind but this is not often implemented in Machines." (Wang 2005)
http://vanseodesign.com/web-design/pictorial-depth-cues/
http://faculty.gvsu.edu/KEISTERD/design_principles_art/figure_ground.html
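An ideal binary mask can be written down directly when the target and the interference are available separately, which is what makes it "ideal" rather than practical. This sketch uses a plain FFT spectrogram instead of a gammatone cochleogram. The frame size, the 0 dB local criterion (`lc_db`) and the two test tones are assumptions made for the example.

```python
import numpy as np

def power_spectrogram(x, frame=256, hop=128):
    """Power in each time-frequency unit (rows: frames, columns: FFT bins)."""
    win = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    return np.array([np.abs(np.fft.rfft(x[i * hop:i * hop + frame] * win)) ** 2
                     for i in range(n_frames)])

def ideal_binary_mask(target, interference, lc_db=0.0):
    """1 where the local target-to-interference ratio exceeds lc_db, else 0."""
    t_pow = power_spectrogram(target) + 1e-12
    i_pow = power_spectrogram(interference) + 1e-12
    return (10 * np.log10(t_pow / i_pow) > lc_db).astype(int)

sr = 8000
t = np.arange(sr) / sr
target = np.sin(2 * np.pi * 500 * t)          # falls in FFT bin 16 at this frame size
interference = np.sin(2 * np.pi * 2000 * t)   # falls in FFT bin 64
mask = ideal_binary_mask(target, interference)
print(mask.shape, mask[:, 16].min(), mask[:, 64].max())
```

Multiplying the mixture's spectrogram by the mask keeps the units dominated by the target and discards the rest. A working CASA system has to estimate the mask from the mixture alone.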

Other work
Categorical perception
Other important human perceptual mechanisms that have been implemented in machine listening:
Categorical perception
The listener hears distinct categories of musical or speech sounds rather than a continuum
CP can remove variation caused by distortion
Clustering algorithms mimic this
Neural networks - inspired by human perception
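As a small illustration of how clustering turns a continuum into categories, here is a one-dimensional k-means sketch. The cue values are invented for the example (they could stand for any acoustic measurement that varies continuously), and k-means is only one of many clustering algorithms the slide might have in mind.

```python
import numpy as np

def kmeans_1d(values, k=2, iters=20):
    """Plain k-means on scalar values; returns (labels, centres)."""
    values = np.asarray(values, dtype=float)
    centres = np.linspace(values.min(), values.max(), k)   # spread initial centres
    for _ in range(iters):
        # Assign each value to its nearest centre.
        labels = np.argmin(np.abs(values[:, None] - centres[None, :]), axis=1)
        # Move each centre to the mean of the values assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centres[j] = values[labels == j].mean()
    return labels, centres

# Hypothetical measurements along one acoustic cue.
cue = np.array([5, 8, 10, 12, 15, 45, 50, 52, 55, 60])
labels, centres = kmeans_1d(cue)
print(labels, centres)
```

Once the centres are learned, small variations in the cue leave the label unchanged. That is the sense in which categorisation can remove variation caused by distortion.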

The end
Thank you
Please ask us questions:
Amy: a.beeston@sheffield.ac.uk
Cleo: [email protected]

References
Beeston, A.V. (2015). Perceptual compensation for reverberation in human listeners and machines. PhD thesis, University of Sheffield.
Beeston, A.V. (2015). Do we need robust audio interfacing based on psychoacoustic principles of hearing? Royal Musical Association Music Philosophy Study Group Workshop, Sheffield, UK, 27 May.
Beeston, A.V. and Brown, G.J. (2015). Auditory models for robust analysis of snoring in domestic recordings. Proc Newcastle Sleep 2015 - British Sleep Society Scientific conference, Newcastle, UK, 22-24 October.
Beeston, A.V. and Brown, G.J. (2014). Consonant confusions provide further evidence that time-reversed rooms disturb compensation for reverberation. Proc 7th Forum Acusticum, Krakow, Poland.
Beeston, A.V. and Brown, G.J. (2010). Perceptual compensation for effects of reverberation in speech identification: a computer model based on auditory efferent processing. Proc INTERSPEECH, 2462–2465, Makuhari, Chiba, Japan.
Beeston, A.V., Brown, G.J., and Watkins, A.J. (2014). Perceptual compensation for the effects of reverberation on consonant identification: Evidence from studies with monaural stimuli. J Acoust Soc Am, 136(6), 3072–3084.
Di Scipio, A. (2003). Sound is the interface: from interactive to ecosystemic signal processing. Organised Sound, 8(03), 269–277.
Drullman, R., Festen, J.M., & Plomp, R. (1994). Effect of temporal envelope smearing on speech reception. J Acoust Soc Am, 95(2), 1053–1064.
Ferry, R., & Meddis, R. (2007). A computer model of medial efferent suppression in the mammalian auditory system. J Acoust Soc Am, 122(6), 3519–3526.
Green, O. (2014). Musicality and practice-led methods. Proc NIME, 1–6.
Guinan, J.J., Jr. (2011). Physiology of the medial and lateral olivocochlear systems. In D.K. Ryugo, R.R. Fay, and A. Popper, editors, Auditory and vestibular efferents, 39–81. Springer, New York.
Nábělek, A.K., Letowski, T.R., & Tucker, F.M. (1989). Reverberant overlap- and self-masking in consonant identification. J Acoust Soc Am, 86(4), 1259–1265.
Olive, S.E., Schuck, P.L., Sally, S., Bonneville, M. (1995). The variability of loudspeaker sound quality among four domestic-sized rooms. Proc AES 99, New York, USA, preprint 4092 K-1.
Pike, C.D. (2016). Timbral constancy and compensation for spectral distortion caused by loudspeaker and room acoustics. PhD thesis, University of Surrey.
Pike, C., Mason, R., & Brookes, T. (2014). Auditory Compensation for Spectral Coloration. Audio Engineering Society Convention 137.
Watkins, A.J. (2005). Perceptual compensation for effects of reverberation in speech identification. J Acoust Soc Am, 118(1), 249–262.
Wang, D. (2005). On Ideal Binary Mask as the computational goal of auditory scene analysis, in Speech Separation by Humans and Machines. Springer, New York, pp. 181–197.
Yoshioka, T., Sehr, A., Delcroix, M., Kinoshita, K., Maas, R., Nakatani, T., & Kellermann, W. (2012). Making Machines Understand Us in Reverberant Rooms: Robustness Against Reverberation for Automatic Speech Recognition. IEEE Signal Processing Magazine, 29(6), 114–126.