Achieving Equal Loudness between Audio Files

Evaluation and improvements of loudness algorithms
Paul Nygren
Master of Science Thesis
Stockholm, Sweden 2009
Master’s Thesis in Music Acoustics (30 ECTS credits)
at the School of Media Technology
Royal Institute of Technology year 2009
Supervisor at CSC was Svante Granqvist
Examiner was Sten Ternström
TRITA-CSC-E 2009:032
ISRN-KTH/CSC/E--09/032--SE
ISSN-1653-5715
Royal Institute of Technology
School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.csc.kth.se
Achieving equal loudness between audio files
– Evaluation and improvements of loudness algorithms
Abstract
This master’s thesis presents the evaluation of several loudness calculation algorithms. Two of the algorithms are published standards, Replay Gain and ITU-R BS. 1770, and the others are modifications of these two. To evaluate the algorithms, a loudness listening experiment was conducted using audio files containing speech and pop music. The audio files and the corresponding subjective loudness values obtained from the experiment were used to evaluate the algorithms. The precision of the algorithms was also tested using a subjective loudness database from IRT (Institut für Rundfunktechnik). In both evaluations, ITU-R BS. 1770 produced lower errors than the Replay Gain algorithm when compared to the subjective loudness values. Of the modified algorithms, the ITU-based algorithms that included a gate function produced the best result.
Achieving equal loudness between audio files (Att uppnå jämn hörstyrka mellan ljudfiler)
– Evaluation and improvements of loudness algorithms (Utvärdering och förbättringar av hörstyrkealgoritmer)
Summary (Sammanfattning)
This degree project presents an evaluation of several loudness calculation algorithms. Two of the algorithms are published standards, Replay Gain and ITU-R BS. 1770, and the others are modifications of these two. To evaluate the algorithms, a listening experiment was carried out using audio files containing speech and pop music. The audio files and the associated subjective loudness values from the listening experiment were then used to evaluate the algorithms. The precision of the algorithms was also tested with the help of a loudness database from IRT (Institut für Rundfunktechnik). In both evaluations, ITU-R BS. 1770 gave lower errors than the Replay Gain algorithm, compared to the subjective loudness values. Of the modified algorithms, the ITU-based algorithms that included a gate function gave the best result.
Recommendations to Swedish Radio
The conclusions from this project have led to these recommendations:
- Change the loudness calculation method for audio files to ITU-R BS. 1770.
- Use ITU-R BS. 1770 together with a gate function for better precision.
- Study whether an adaptive gate function gives a more precise loudness calculation than a fixed gate function.
- Study whether another approach can be used as a complement to the regular audio file normalization process. One idea for such a complement is what I call “post normalization gain correction”. See chapter 6 for an explanation.
Table of contents
1. Introduction
1.1. Earlier work at Swedish Radio
1.2. Purpose and method
1.3. Limitations
1.4. Overview of the paper
2. Theory
2.1. The auditory system
2.2. Loudness
2.3. Loudness level
2.4. Critical bands
2.5. Spectral effects
2.6. Temporal effects
2.7. Spatial effects
2.8. Gain normalization
2.9. Replay Gain
2.9.1. Original Version
2.9.2. Swedish Radio version
2.10. ITU-R BS. 1770
2.11. Algorithm modifications
3. Method
3.1. Evaluation using listening experiment database from Swedish Radio
3.1.1. The loudness listening experiment at Swedish Radio
3.1.2. Evaluation method
3.2. Evaluation with loudness database from IRT
3.2.1. Evaluation method
4. Results
4.1. Results from the listening experiment at Swedish Radio
4.2. Results from the evaluation using loudness database from Swedish Radio
4.3. Results from the evaluation using loudness database from IRT
5. Discussion
5.1. Listening experiment methodology
5.2. Listening experiment results
5.3. How good can an algorithm become?
6. Conclusions
7. Acknowledgements
8. References
9. Appendix
Appendix A – Test subject listening level
Appendix B – Audio file specifications
Appendix C – Audio file histograms
Appendix D – Matlab code for the Replay Gain (Original) implementation
Appendix E – Matlab code for the ITU-R BS. 1770 implementation
Appendix F – Matlab code for the ITU gate implementation
Appendix G – Matlab code for the ITU strongest section implementation
Appendix H – Matlab code for the ITU with Replay Gain filter implementation
Appendix I – Matlab code for the Replay Gain with ITU filter implementation
Appendix J – Matlab code for the RLB implementation
Appendix K – Matlab code for the RLB gate implementation
Appendix L – Matlab code for the regression analysis
1. Introduction
For companies working with broadcasting, it is important to be able to keep an equal perceived sound level, or equal loudness, between various programs and program parts. Many people with different backgrounds, knowledge and preferences make the programs, and in the current situation this would lead to major level differences if program material for a whole day were broadcast without mixing. These differences would be annoying for the listeners, who would have to increase or decrease the volume every time the program content changes.
Nowadays, with a decreasing number of sound engineers working actively in productions and broadcasts, fewer people are available to correct the levels. For example, what the Swedish broadcast companies call “självkör”, or “self-op” in English, is becoming more and more common. This means that one person can be working as producer, host and sound engineer at the same time in a live broadcast. In such a situation the person has no time, and probably not the right training, to adjust the levels of music and different program parts so that they match each other. This is one of the reasons why broadcasting companies use different forms of processing equipment to calculate and adjust sound levels of audio files and live audio streams.
At Swedish Radio (SR) one processing algorithm is used to calculate and adjust the sound level of audio files ripped from audio CDs before the files are placed in the audio database. Each audio file is adjusted with a single gain value. This normalization process is used to achieve equal perceived sound level between the different audio files in the database. The advantage of this adjustment is that it makes the work easier for both sound engineers and self-operators, who often have very limited time to listen through the music beforehand. The level differences between different music pieces in broadcast will therefore decrease and a more equal loudness is achieved.
The algorithm used at SR is called Replay Gain and is implemented in the software AwaveSR, which is a company-specific version of Awave Audio, developed by FMJ-software (FMJ-software 2008). The algorithm is used on the music played in the pop-music-oriented radio channels P3 and P4. According to several sound engineers at SR, the algorithm works well for most of the music that is played in P3 and P4, but of course there are exceptions. After the normalization process, certain audio files still have a perceived sound level that can differ from other music in the database by more than 6 dB.
1.1. Earlier work at Swedish Radio
In his master’s thesis, Matti Zemark (2007) presents a good overview of loudness aspects in broadcast and gives suggestions on how to implement a standardized loudness measurement in all the steps of the broadcast production at SR. He suggests a new loudness measurement algorithm, ITU-R BS. 1770, which is investigated further in this thesis.
1.2. Purpose and method
The purpose of this project is to evaluate the existing loudness calculation algorithm Replay Gain, compare it to the standard suggested by the International Telecommunication Union in 2006, ITU-R BS. 1770 (ITU-R 2006), and also try to find ways to improve the loudness calculation precision of the algorithms. An underlying purpose is to investigate whether it seems possible to use a normalization algorithm to calculate the loudness of audio files containing speech.
As a first step of the project, relevant theory about psychoacoustics will be studied. After that, an informal interview study with sound engineers and other personnel at SR will be held to get initial knowledge of the specific conditions at SR and to get ideas for how to improve the algorithms. The precision of Replay Gain, ITU-R BS. 1770, and the new algorithm versions will be measured using audio files and corresponding subjective loudness data from a listening experiment that will be conducted at SR. The subjective data from the experiment will be compared to the values that the algorithms give. The result from this evaluation will also be verified using a subjective loudness database from IRT¹.
1.3. Limitations
This master’s project has its focus on the evaluation and improvement of loudness
calculation algorithms concerning whole audio files. The project will not evaluate and
analyze how the algorithms perform on real time audio streams.
Replay Gain is today used on audio files played in the pop music oriented channels
P3 and P4. Therefore the main focus in the evaluation part of this project is on pop
music. To evaluate and analyze the algorithms concerning the broad range of classical
music is beyond the scope of this project.
There are many aspects to take into account concerning the implementation of a
normalization process of speech material at Swedish Radio. This project will only
investigate how well the algorithms calculate loudness values of speech audio files.
1.4. Overview of the paper
The first part of this master’s thesis is an introduction to relevant theory concerning psychoacoustics and loudness. Gain normalization in general and the two loudness algorithms are also described. Evaluation and analysis methods are described in the next part of the thesis. The results are then presented, followed by a discussion of the methods and results. The thesis ends with the conclusions from the evaluations and analysis of the algorithms.
¹ IRT – Institut für Rundfunktechnik. A research institute for public-broadcasting organisations in Germany, Austria and Switzerland (IRT 2009).
2. Theory
This chapter is an introduction to the project area and is a summary of the literature study that was
made during the first part of the project.
2.1. The auditory system
The auditory system will here be presented with focus on functions concerning loudness. For a deeper description see for example An introduction to the psychology of hearing
by Brian Moore (2003).
The outer, middle, and inner ear are the three main parts of the auditory system. The pinna (the visible part of the ear) and the auditory canal form the outer ear. The pinna works as a filter and affects the frequency content of incoming sound depending on the direction of the sound, especially the high-frequency content. This helps us to localize sound sources (Moore 2003, p. 22). The auditory canal also affects the frequency response of the auditory system because of its resonance at three kilohertz. The ear therefore has extra sensitivity in this frequency region (Granqvist & Liljencrants 2004, p. 1-0). The function of the middle ear is to transfer incoming airborne vibrations to liquid vibrations in the inner ear, with the assistance of the ear bones. Furthermore, the muscles of the ear bones can damp the transmission to protect the inner ear from signals that are too strong (Granqvist & Liljencrants 2004, p. 1-0). The complicated structure of the inner ear includes, among other things, the cochlea, where the basilar membrane is located. The basilar membrane starts to oscillate when sound is transferred through the auditory system into the inner ear. The oscillation affects the whole basilar membrane, but has a maximum whose position depends on frequency (Granqvist & Liljencrants 2004, p. 1-1). In the inner ear the oscillation is converted to electrical impulses that reach the brain through nerve paths (Karolinska Universitetssjukhuset 2009).
2.2. Loudness
One of the abilities of the auditory system is to be able to order sounds from weak to
strong. Loudness, or “hörstyrka” in Swedish, is defined as that attribute of the auditory system. One problem concerning loudness is that it is a subjective entity, which
means that it can’t be measured directly (Moore 2003, p. 127). There are however
different ways to represent loudness. One is called “subjective loudness” (SL), and is
defined as “…the auditory sensation that allows us to order sounds on a scale from quiet to loud.
The loudness sensation may be described by either word labels or by numerical magnitude values.”
(Leijon 2007, p. 55). Another way is calculated loudness (CL). CL is defined as a
“…single number, based on physical sound measurements, such that the result can be assumed to
rank different sounds in the same order as the subjective loudness. The traditional unit of calculated
loudness is 1 sone, defined as the loudness of a pure 1000-Hz tone at 40 dB re 20 µPa.” (Leijon
2007, p. 55).
S.S. Stevens suggested one way to approximate the relation between physical intensity² and loudness:
$$ L = k I^{0.3} \tag{1} $$
where L is the loudness, I is the intensity, and k is a constant adapted to the units used and to the subject. This approximation holds for sound levels over 40 dB (Moore 2003, p. 132).
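As a worked illustration of equation (1), the following minimal sketch assumes only the exponent 0.3 given above; the function name `loudness_ratio` is hypothetical. Because k cancels in a ratio, a level increase of Δ dB multiplies the intensity by 10^(Δ/10) and the loudness by 10^(0.3·Δ/10), so an increase of 10 dB roughly doubles the loudness.

```python
def loudness_ratio(level_change_db, exponent=0.3):
    """Loudness ratio L2/L1 predicted by L = k * I**exponent.

    The constant k cancels in the ratio, so it is not needed here.
    """
    # A level change in dB corresponds to this intensity ratio.
    intensity_ratio = 10.0 ** (level_change_db / 10.0)
    return intensity_ratio ** exponent


if __name__ == "__main__":
    for change in (3, 6, 10, 20):
        print(f"{change:+d} dB -> loudness x {loudness_ratio(change):.2f}")
```

Like the approximation itself, the sketch should only be trusted for sound levels above roughly 40 dB.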
How the auditory system, and in the end the brain, does its analysis of how weak or
strong a sound is perceived is not entirely known. One assumption is that it is connected to the total neural activity raised by a specific sound (Moore 2003, p. 133).
2.3. Loudness level
Barkhausen introduced the concept of loudness level (“hörnivå” in Swedish) in the 1920s. It was introduced to make loudness comparisons between different sounds possible. Zwicker and Fastl define the loudness level of a sound in their book Psychoacoustics – Facts and Models (1999, p. 202) as “… the sound pressure level of a 1-kHz tone in a plane wave and frontal incident that is as loud as the sound”. The unit is phon. Loudness level can be measured for any sound, but the most widely known measurements are the “Fletcher-Munson curves”, see Figure 1, which show the loudness level for sinusoids (Zwicker & Fastl 1999, p. 203). The curves are based on tests made on many American recruits (Granqvist & Liljencrants 2004, p. 1-7).
Figure 1. The Fletcher-Munson curves. Loudness level for sinusoids according to ISO standard (Granqvist & Liljencrants 2004, p. 1-7).
It is interesting to note the effect of the ear canal resonance at 3 kHz, and the fact that low-frequency sinusoids need a higher sound level to be perceived as being as loud as middle- and high-frequency sinusoids.
If narrow band noise and a diffuse sound field are used instead of sinusoids the
curves change, see Figure 2.
² Physical intensity – “…sound power transmitted through a given area in a sound field.” (Moore 2003, p. 402). One unit is W/m².
Figure 2. The change in loudness level with narrow band noise and a diffuse sound field (Granqvist & Liljencrants 2004, p. 1-7).
With loudness on a logarithmic vertical axis and loudness level on the horizontal axis, see Figure 3, it can be seen that if the loudness level is raised by 9 phon, the loudness is doubled. The relation is a straight line above approximately 40 phon.
Figure 3. Loudness as a function of loudness level (Granqvist & Liljencrants 2004, p. 1-8).
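The doubling rule can be written as a small conversion. This is a sketch under stated assumptions: 1 sone is anchored at 40 phon (following the definition of the sone quoted in section 2.2), the doubling step of 9 phon is the figure given in the text, and the function name `sone_from_phon` is hypothetical. A step of 10 phon per doubling is the figure more commonly quoted elsewhere, so the step is left as a parameter.

```python
def sone_from_phon(loudness_level_phon, doubling_step_phon=9.0):
    """Approximate loudness in sone for a loudness level in phon.

    Assumes 1 sone at 40 phon and a doubling of loudness for every
    `doubling_step_phon` phon. Only meaningful from about 40 phon upward.
    """
    if loudness_level_phon < 40.0:
        raise ValueError("the doubling rule only holds from about 40 phon")
    return 2.0 ** ((loudness_level_phon - 40.0) / doubling_step_phon)


if __name__ == "__main__":
    for phon in (40, 49, 58, 67):
        print(f"{phon} phon -> {sone_from_phon(phon):.1f} sone")
```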
2.4. Critical bands
An important concept concerning loudness is the critical bands of the auditory system. It is a way to divide the frequency range of the ear into different regions. This makes it possible to calculate the loudness of wideband sounds by adding the loudness in each of the critical bands (Granqvist & Liljencrants 2004, p. 1-9). The frequency range is usually divided into 25 regions with the unit Bark. Every “Bark” has a center frequency and a bandwidth. The bandwidth depends on the center frequency and increases with higher center frequencies. One way to determine the critical bandwidth at a certain center frequency is to let test subjects listen to narrow band noise whose bandwidth is gradually increased while the total sound level is held constant. Up to a certain bandwidth the loudness will be perceived as constant. When the loudness is perceived as higher than before, the critical bandwidth has been exceeded, and consequently the critical band around a specific center frequency can be determined (Granqvist & Liljencrants 2004, p. 1-9).
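To give a feeling for how the bandwidth grows with center frequency, the sketch below uses the analytic approximations published by Zwicker and Terhardt (1980). These formulas are not taken from this thesis; they are brought in here as an assumption, and the function names are hypothetical.

```python
import math


def hz_to_bark(frequency_hz):
    """Critical-band rate in Bark (Zwicker & Terhardt approximation)."""
    return (13.0 * math.atan(0.00076 * frequency_hz)
            + 3.5 * math.atan((frequency_hz / 7500.0) ** 2))


def critical_bandwidth_hz(center_hz):
    """Critical bandwidth in Hz around a center frequency (same source)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (center_hz / 1000.0) ** 2) ** 0.69


if __name__ == "__main__":
    for f in (100, 500, 1000, 4000, 10000):
        print(f"{f:>6} Hz: {hz_to_bark(f):5.2f} Bark, "
              f"bandwidth about {critical_bandwidth_hz(f):6.0f} Hz")
```

With these approximations the critical band is about 100 Hz wide at low frequencies and about 160 Hz wide at 1 kHz, and it widens quickly above that.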
2.5. Spectral effects
Different sounds have different spectra. Certain sounds contain only one frequency, and other sounds have wideband spectra. In their book Psychoacoustics – Facts and Models, Zwicker and Fastl illustrate one aspect of the loudness problem that arises because different sounds have different spectra. For example, UEN³ at 60 dB is perceived as approximately 3.5 times as loud as a 1 kHz sinusoid with the same physical sound pressure level. The loudness therefore depends on the width of the frequency content. Figure 4 illustrates the loudness difference between the 1 kHz sinusoid and the noise, which starts at the vertical dashed line. The line coincides with the bandwidth of the critical band at 1 kHz for the auditory system. If the bandwidth of the noise fits within a critical band, the loudness increase mentioned above does not appear. This is confirmed by similar measurements at various center frequencies (Zwicker & Fastl 1999, p. 211).
Figure 4. The sound level of a 1-kHz tone that is perceived as loud as band pass filtered noise at different bandwidths. The total intensity of the noise is held constant (Zwicker & Fastl, 1999, p. 211).
A background noise can mask another sound, which also affects the loudness of the sound. Figure 5 shows the loudness of a 1-kHz sinusoid as a function of its sound pressure level. The dashed line corresponds to the loudness without the noise, and the two other lines correspond to the loudness with the masking noise, in this case pink noise with a sound pressure level of 40 dB and 60 dB per 1/3 octave band.
Figure 5. The loudness of a 1-kHz tone as a function of its sound pressure level. The dashed line corresponds to the loudness without the noise and the two lines correspond to the loudness when a pink noise source with a sound pressure level of 40 dB and 60 dB per 1/3 octave band is present (Zwicker & Fastl, 1999, p. 214).
³ UEN – Uniform exciting noise. White noise that has been filtered to give equal intensity in each critical band (Zwicker & Fastl 1999, p. 170).
The masking of sounds also depends on how close the masking source is in frequency to the other sound. Figure 6 illustrates the loudness of a 60 dB sinusoid at a varying frequency distance (∆f) from a high pass noise. The high pass noise has a cut-off frequency of 1 kHz and a sound pressure level of 65 dB in each critical band. When ∆f is varied, the loudness of the tone is affected. As ∆f gets smaller, i.e. as the sinusoid gets closer to the noise in frequency, the loudness of the sinusoid decreases. The high pass noise will consequently decrease the loudness of the sinusoid even if they are spectrally separated. Zwicker and Fastl write that these experimental results are important when modeling loudness (1999, p. 216).
Figure 6. The loudness in sone for a sinusoid, when a high pass noise with a cut-off frequency of 1 kHz is present at the same time, as a function of the distance between the sinusoid and the noise (Zwicker & Fastl, 1999, p. 215).
2.6. Temporal effects
Most sounds vary over time, e.g. speech and music. This is why the temporal aspects of sounds are also important for how humans perceive sound levels (Zwicker & Fastl 1999, p. 216). For example, a sinusoid with a length of 10 ms is perceived as weaker than a similar sinusoid with the same sound pressure level but with a length of 100 ms. The length of a sound thus affects its loudness, which is illustrated in Figure 7. This feature of the auditory system is called “temporal integration” and holds up to around 100 ms; beyond that, the loudness of a sound is constant (Zwicker & Fastl 1999, p. 216). Moore (2003, p. 137) writes that extensive studies have been made on the effects of temporal integration, but that the results vary. The effect seems to stop somewhere between 100 and 200 ms. Moore summarizes the measurements by stating that, up to approximately 80 ms, constant energy gives constant loudness.
Figure 7. The loudness of a 2-kHz tone with a sound pressure level of 57 dB, as a function of the tone length in milliseconds (Zwicker & Fastl, 1999, p. 216).
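The rule that constant energy gives constant loudness can be turned into a small calculation. The sketch below is an illustration only: it assumes a hard integration limit of 80 ms (the figure attributed to Moore above; in reality the transition is gradual), and the function name `level_offset_db` is hypothetical. Since energy is intensity times duration, a tone that is eight times shorter needs eight times the intensity, about 9 dB more, to sound equally loud.

```python
import math


def level_offset_db(duration_ms, reference_ms, integration_limit_ms=80.0):
    """Level increase in dB that a tone of `duration_ms` needs in order to
    sound as loud as a tone of `reference_ms`, assuming that constant
    energy gives constant loudness up to `integration_limit_ms`.
    """
    if duration_ms <= 0 or reference_ms <= 0:
        raise ValueError("durations must be positive")
    # Beyond the integration limit, extra duration adds no more loudness.
    effective = min(duration_ms, integration_limit_ms)
    effective_ref = min(reference_ms, integration_limit_ms)
    # Equal energy: the intensity ratio is the inverse of the duration ratio.
    return 10.0 * math.log10(effective_ref / effective)


if __name__ == "__main__":
    print(f"10 ms vs 100 ms: {level_offset_db(10, 100):+.1f} dB")
    print(f"100 ms vs 200 ms: {level_offset_db(100, 200):+.1f} dB")
```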
Another temporal effect that affects the loudness of a sound is masking sounds that
come before or after in time. Figure 8 shows loudness measurements when a 2-kHz
tone is played before noise (UEN). The tone is 5 ms long and has a sound level of
60 dB. ∆t is the distance in time between the tone and the masking noise. The
loudness will decrease when the distance between tone and masker decreases. The
implication of this is that even if the auditory system perceives the tone it takes
some time to create the impression of loudness. If the auditory system is disturbed
by another sound the impression of the earlier tone is interrupted (Zwicker & Fastl
1999, p. 219).
Figure 8. The loudness of a 2-kHz tone that is played before noise, as a function of the distance between tone and masker in milliseconds (Zwicker & Fastl, 1999, p. 219).
The loudness of a sound is also affected if a person is exposed to high sound levels for a longer period of time. Eventually the hearing threshold will rise temporarily. This effect can be measured right after a person has been exposed to the sound and is called “temporary threshold shift” (Moore 2003, p. 146).
2.7. Spatial effects
The filtering of the ear, which depends on the direction of the sound, affects the perception of loudness. The fact that sound often reaches the two ears with a slight difference in time also affects the loudness. This creates a complex phenomenon called “binaural loudness summation”, e.g. when a person listens to music in stereo over two loudspeakers. The phenomenon can cause the perceived loudness with two loudspeakers to increase compared to one loudspeaker when both speaker systems are playing at the same physical sound level (Skovenborg & Nielsen 2004, p. 3).
2.8. Gain normalization
Normalization of audio files is a way to adjust the overall level of the files to match
each other by assigning a single gain to every file. There are many different methods
to normalize audio files. One way is to normalize the peak amplitude, i.e. find the
largest sample value in each file and adjust the overall gain so that the largest sample
in every audio file matches (Vickers 2001, p. 2). Another method is to measure the
mean level in the files and adjust them so that all files get the same mean level.
If equal loudness is the goal of the normalization, a weighting filter is often applied before measuring the mean level in a file. This is done to give different frequencies different weightings depending on how loud the human ear perceives them (Skovenborg & Nielsen 2004, p. 7).
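The two normalization methods described above can be sketched as follows. This is a minimal illustration, not the procedure used at SR: samples are assumed to be floating point values with full scale at ±1.0, the target levels and function names are hypothetical, and the weighting filter is left out.

```python
import math


def peak_gain_db(samples, target_peak_dbfs=0.0):
    """Gain in dB that moves the largest sample to the target peak level."""
    peak = max(abs(x) for x in samples)
    if peak == 0.0:
        raise ValueError("cannot normalize a silent file")
    return target_peak_dbfs - 20.0 * math.log10(peak)


def rms_gain_db(samples, target_rms_dbfs=-20.0):
    """Gain in dB that moves the mean (RMS) level to the target level."""
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0.0:
        raise ValueError("cannot normalize a silent file")
    return target_rms_dbfs - 10.0 * math.log10(mean_square)


def apply_gain(samples, gain_db):
    """Apply one single gain value to the whole file."""
    factor = 10.0 ** (gain_db / 20.0)
    return [x * factor for x in samples]


if __name__ == "__main__":
    audio = [0.5, -0.25, 0.25, -0.5]
    print(f"peak normalization gain: {peak_gain_db(audio):+.2f} dB")
    print(f"RMS normalization gain:  {rms_gain_db(audio):+.2f} dB")
```

Two files with identical peaks can still differ widely in mean level, which is why peak normalization alone is a poor predictor of loudness.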
2.9. Replay Gain
2.9.1. Original Version
Replay Gain is an open standard and is described on the website www.replaygain.org (Replay Gain 2008). The algorithm was presented in 2001, and the main idea is to calculate and store the gain correction needed for an audio file to match the perceived sound level of other audio files to which the algorithm has been applied. Replay Gain sets one gain correction value for a whole audio file, which could mean, for example, that the overall gain of a file is adjusted by -10 dB.
There are two versions of Replay Gain. The first is called “Radio” Replay Gain and works as explained above. The other one, “Audiophile” Replay Gain, calculates the gain correction needed over a whole album, so that the intentional level differences within the album are left unchanged. In this thesis only “Radio” Replay Gain will be described and evaluated.
The calculation process is divided into four steps, where the first is to filter the signal. The filter used is an inverted approximation of the Fletcher-Munson curves
(Replay Gain 2008), see Figure 9 for the filter target response.
Figure 9. The target response for the Replay Gain filter, which is an inverted approximation of Fletcher-Munson curves (Replay Gain, 2008).
Step two is to calculate the energy of the signal. The signal is divided into blocks of 50 milliseconds and the RMS⁴ energy is calculated over each block. Each value is stored in an array. The energy of stereo files is calculated by adding the mean-square values of the two channels and dividing by two before the square root is taken. After the RMS energy calculation, all values are converted into decibels.
⁴ RMS – Root Mean Square.
The third step is to choose one RMS value from all the 50-millisecond blocks. Replay Gain sorts all values into numerical order and picks the value that is stored 5 % down in the array from the largest value, i.e. a level that only about 5 % of the blocks exceed.
The last step is to compare the calculated value with a reference. Replay Gain uses pink noise with an RMS energy of -20 dBFS⁵. The pink noise signal is sent through the algorithm and the result is stored. Then the difference between the reference value and the value calculated for the audio file is taken, and this difference is called the Replay Gain. In Appendix D the Replay Gain algorithm is shown in Matlab code. The code is from the website www.replaygain.org (Replay Gain 2008).
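Steps two to four can be sketched as follows. This is a minimal illustration under stated assumptions and not the Matlab code from Appendix D: the weighting filter of step one is left out (the input is assumed to be filtered already), the reference level is passed in as a number instead of being measured from pink noise, and all function names are hypothetical.

```python
import math


def block_levels_db(channels, sample_rate, block_ms=50.0):
    """RMS level in dB of each complete block.

    `channels` is a list of equal-length sample lists (one for mono,
    two for stereo), assumed to be weighting-filtered already.
    """
    block_len = int(sample_rate * block_ms / 1000.0)
    total = len(channels[0])
    levels = []
    for start in range(0, total - block_len + 1, block_len):
        # Mean square per channel, averaged over the channels.
        mean_square = sum(
            sum(x * x for x in ch[start:start + block_len]) / block_len
            for ch in channels
        ) / len(channels)
        # 10*log10 of the mean square equals 20*log10 of the RMS value;
        # the small offset avoids log(0) for silent blocks.
        levels.append(10.0 * math.log10(mean_square + 1e-10))
    return levels


def representative_level_db(levels_db, fraction=0.95):
    """Value stored 5 % down from the largest value in the sorted array."""
    if not levels_db:
        raise ValueError("the file is shorter than one block")
    ordered = sorted(levels_db)
    index = max(0, int(len(ordered) * fraction + 0.5) - 1)
    return ordered[index]


def replay_gain_db(channels, sample_rate, reference_db):
    """Gain correction: reference level minus the level of the file."""
    file_level = representative_level_db(block_levels_db(channels, sample_rate))
    return reference_db - file_level
```

Choosing a high percentile instead of the mean makes the result follow the louder passages of a file, so that long quiet sections do not pull the calculated level down.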
2.9.2. Swedish Radio version
At SR, Replay Gain is implemented in the software AwaveSR. The difference between the original Replay Gain version and the one used in AwaveSR is the weighting filter. The frequency response of the AwaveSR filter together with the Replay
Gain original filter and the target response curve are shown in Figure 10.
Figure 10. The weighting filter used in AwaveSR together with the Replay Gain original filter and target response.
2.10. ITU-R BS. 1770
The International Telecommunication Union (ITU) has proposed a standard for loudness measurement (ITU-R 2006). The ITU document presents a way to measure loudness for mono, stereo and multi-channel signals, and the proposed algorithm is a development of the Leq(RLB) algorithm.
RLB (Revised Low-frequency B-weighting) is a high-pass filter, and is a development
from the B-weighting curve, see Figure 11.
⁵ dBFS – Decibels relative to full scale. Used in digital systems with a maximum available level.
Figure 11. The RLB weighting curve (ITU-R 2006, p. 4).
The first stage of the algorithm is to apply a filter to account for the acoustic effects
of the head (ITU-R 2006, p. 4). The filter has a frequency response as shown in
Figure 12.
Figure 12. The filter to account for the acoustic effects of the head (ITU-R 2006, p. 4).
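The first-stage filter can be realized as a second-order IIR section. The sketch below is an illustration and not the Matlab code from the appendices: the coefficients are the ones tabulated in the ITU recommendation for a 48 kHz sampling rate, quoted here from the standard rather than from this text, so they should be checked against the standard before being relied on. The function names are hypothetical.

```python
import cmath
import math

# Pre-filter coefficients for 48 kHz (assumption: quoted from the standard).
PRE_B = (1.53512485958697, -2.69169618940638, 1.19839281085285)
PRE_A = (1.0, -1.69065929318241, 0.73248077421585)


def biquad(samples, b, a):
    """Second-order IIR filter, direct form I, with a[0] equal to 1."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def gain_db_at(frequency_hz, b, a, sample_rate=48000.0):
    """Magnitude response in dB of the filter at one frequency."""
    z1 = cmath.exp(-2j * math.pi * frequency_hz / sample_rate)
    z2 = z1 * z1
    h = (b[0] + b[1] * z1 + b[2] * z2) / (a[0] + a[1] * z1 + a[2] * z2)
    return 20.0 * math.log10(abs(h))


if __name__ == "__main__":
    for f in (100, 1000, 4000, 16000):
        print(f"{f:>6} Hz: {gain_db_at(f, PRE_B, PRE_A):+.2f} dB")
```

With these coefficients the filter leaves low frequencies unchanged and raises high frequencies by about 4 dB, which matches the shelving shape in Figure 12.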
The second stage is to filter the signal according to the RLB-weighting curve. The two filters mentioned above are sometimes jointly referred to as Leq(K) or Leq(R2LB) (Lund 2008, p. 3). After the filtering process the measurement is carried out in two steps, and the first is:
$$ z_i = \frac{1}{T}\int_{0}^{T} y_i^{2}\,dt \tag{2} $$

where y_i is the filtered input signal for channel i, and T is the interval of the measurement.
The other step is:

$$ \mathrm{Loudness} = -0.691 + 10\log_{10}\sum_{i=1}^{N} G_i \cdot z_i \tag{3} $$

where G_i is a weighting coefficient for the different channels. For frontal channels G_i
equals 1 and for back channels 1.41. In Appendix E the implementation is shown in
Matlab code.
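Equations (2) and (3) can be illustrated with a short Python sketch. It is not the Appendix E implementation: it assumes that each channel has already been passed through the two filters described above, it replaces the integral by a mean over the samples, and the function names are hypothetical.

```python
import numpy as np

def mean_square(y):
    """Discrete counterpart of eq. (2): average of the squared samples."""
    y = np.asarray(y, dtype=float)
    return float(np.mean(y ** 2))

def itu_loudness(filtered_channels, weights):
    """Eq. (3): -0.691 + 10*log10(sum over channels of G_i * z_i)."""
    total = sum(g * mean_square(y) for y, g in zip(filtered_channels, weights))
    return -0.691 + 10.0 * np.log10(total)
```

For a stereo file the call would be `itu_loudness([left, right], [1.0, 1.0])`, where `left` and `right` are the filtered channels. A back channel would get the weight 1.41 instead.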
2.11. Algorithm modifications
During this project many different ITU and Replay Gain algorithm modifications
were tested. The ideas behind these algorithms originate from discussions
with supervisors, sound engineers and other personnel at SR (except Center of Gravity).
In this thesis only the modified algorithms that produced low errors in relation to the
original algorithm versions are presented.
Center of Gravity – An ITU-R BS. 1770 implementation that includes an adaptive gate
function⁶, described in the AES paper Loudness Descriptors to Characterize Programs and
Music Tracks by Esben Skovenborg and Thomas Lund (2008). (In this thesis it was only
tested using audio files and corresponding subjective loudness data from the SR listening experiment.)
ITU gate – An ITU-R BS. 1770 implementation with a fixed gate threshold value.
Different gate threshold values were tested, e.g. -70 dBFS. In Appendix F the implementation is shown in Matlab code.
ITU strongest section – An ITU-R BS. 1770 implementation, but instead of calculating
the RMS value over the whole file, the calculation was done over the strongest section. Different section sizes relative to the file length were tested, e.g. the strongest 5th
of each file. In Appendix G the implementation is shown in Matlab code.
ITU with Replay Gain filter – An ITU-R BS. 1770 implementation, but with the Replay
Gain filter instead. In Appendix H the implementation is shown in Matlab code.
Replay Gain with ITU filter – A Replay Gain implementation, but with the ITU
weighting instead. The difference from the original Replay Gain algorithm is shown
in Appendix I.
Replay Gain X ms – A Replay Gain implementation, but with a block length of X milliseconds instead of the standard 50 ms. In Appendix D the difference from the
Replay Gain (Original) algorithm is shown in a comment.
Replay Gain Y % – A Replay Gain implementation, but the value Y % down from the
maximum RMS value is chosen instead of the standard 5 %. In Appendix D the
difference from the Replay Gain (Original) algorithm is shown in a comment.
6 Gate function – A gate function disregards periods of audio with levels under a threshold value
during the loudness calculation, so that e.g. silence is ignored.
RLB – An ITU-R BS. 1770 implementation, but without the 4 dB treble gain. The
Matlab implementation is shown in Appendix J.
RLB gate – As above, but with a fixed gate threshold value. Different gate threshold
values were tested, e.g. -70 dBFS. The Matlab code is shown in Appendix K.
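Two of the modifications above, the fixed gate and the strongest section, can be sketched in Python. This is an illustration under stated assumptions and not the code in Appendices F and G: the signal is assumed to be filtered already, the block length is a hypothetical parameter, full scale is taken as 1.0, and the strongest section is assumed to be one contiguous part of the file.

```python
import numpy as np

def block_mean_squares(y, block_len):
    """Mean square of consecutive non-overlapping blocks of a filtered signal."""
    y = np.asarray(y, dtype=float)
    n_blocks = len(y) // block_len
    blocks = y[:n_blocks * block_len].reshape(n_blocks, block_len)
    return np.mean(blocks ** 2, axis=1)

def gated_mean_square(y, block_len, threshold_db=-70.0):
    """Average only the blocks whose level lies above the fixed threshold."""
    ms = block_mean_squares(y, block_len)
    keep = ms > 10.0 ** (threshold_db / 10.0)
    return float(np.mean(ms[keep])) if np.any(keep) else 0.0

def strongest_section_mean_square(y, block_len, fraction=0.2):
    """Largest mean square over any contiguous run covering `fraction` of the blocks."""
    ms = block_mean_squares(y, block_len)
    width = max(1, int(round(len(ms) * fraction)))
    sums = np.convolve(ms, np.ones(width), mode="valid")  # sliding-window sums
    return float(np.max(sums) / width)
```

Either result can replace the sum of z_i terms in equation (3). With `fraction=0.2` the second function corresponds to the "strongest 5th" variant.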
3. Method
This chapter describes the evaluation and improvement methods used in the project.
To evaluate the precision of the algorithms, audio files with corresponding
subjective loudness values collected from listening experiments were needed. The
audio files were run through the algorithms and the results were compared to the subjective loudness values from the listening experiments. An analysis with different
statistical measures was then carried out, and the algorithm with the best precision in
the analysis was considered to be the most accurate.
The ITU-R BS. 1770 algorithm and the Replay Gain (Original) algorithm were implemented in Matlab for the evaluation. The software AwaveSR was used to evaluate
Replay Gain (AwaveSR).
3.1. Evaluation using listening experiment database from Swedish Radio
3.1.1. The loudness listening experiment at Swedish Radio
A loudness listening experiment was carried out at SR to obtain reference material for
evaluating the algorithms. The test procedure was similar to the one that
Skovenborg et al. present in the paper Loudness Assessment of Music and Speech (2004). A
pair of audio files was presented to the test subject, the first one with a fixed sound
level and the second with an adjustable sound level. The task for the test subject was
to adjust the overall sound level of the second audio file so that it perceptually would
match the sound level of the first audio file. When the test subject was satisfied with
the adjustment, the level difference between the two audio files was stored and a new
pair of audio files was presented.
All audio files were normalized before they were used in the listening experiment, in
order to get a fairly equal listening level during the test. The normalization
procedure followed Skovenborg et al. (2004, pp. 4-5), but an RLB-weighting filter was
used instead of a B-weighting filter. The overall level of every audio file was adjusted so
that the (RLB-weighted) RMS value of each file matched the RMS value of a pink
noise signal filtered the same way. During the experiment, before each new level
adjustment, the second audio file in every pair was also given a random level offset between
-6 dB and +6 dB, so that each test subject got a different start level for the second audio
file. The gain change due to the normalization of each audio file was stored and
accounted for after each experiment, so that the original level differences between all
matched pairs were stored.
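The normalization step and the random start offset can be sketched as follows in Python. The sketch assumes that both signals have already been RLB-weighted and that the offset is drawn from a uniform distribution, which the text does not state. The function names are hypothetical.

```python
import numpy as np

def rms_db(x):
    """RMS level of a signal in dB relative to full scale (1.0)."""
    x = np.asarray(x, dtype=float)
    return float(10.0 * np.log10(np.mean(x ** 2)))

def normalization_gain_db(weighted_file, weighted_reference):
    """Gain in dB that moves the file's RMS level onto the reference's RMS level."""
    return rms_db(weighted_reference) - rms_db(weighted_file)

def apply_gain_db(x, gain_db):
    """Scale a signal by a gain given in dB."""
    return np.asarray(x, dtype=float) * 10.0 ** (gain_db / 20.0)

def random_start_offset_db(rng, limit_db=6.0):
    """Start offset in dB, assumed uniform on [-limit_db, +limit_db]."""
    return float(rng.uniform(-limit_db, limit_db))
```

Storing the value returned by `normalization_gain_db` for every file is what makes it possible to undo the normalization afterwards and recover the level differences between the original files.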
The sound pressure level at the listening position was measured with the same pink
noise that was used for the normalization of the audio files. Before an experiment
started, each test subject could adjust the mean listening level to between
60 and 75 dB(C). Most of the test subjects preferred a listening level of 65 dB(C). In
Appendix A the listening level of each test subject is shown.
An interface that controlled the listening experiment was made in Pure Data⁷, see Figure 13.
The interface was connected to a mixing console through MIDI⁸.
Figure 13. The interface in Pure Data.
The test subjects controlled the interface with three faders and two mute buttons on
the mixing console. The mixing console is shown in Figure 14. The subjects could
choose to listen to one or both audio files at the same time, and also choose different
time positions in the files. One fader controlled the level of the second audio file and
the other two controlled the playback positions of the two audio files. By pressing
the buttons, the audio files could be muted. Before an experiment started, the test
subject got instructions and training on how the equipment and interface functioned.
The test equipment was set up and calibrated with the help of sound engineers at SR
and is shown and described in Table 1 and Figures 14 and 15 below. The
loudspeakers were placed 1.7 meters apart at a height of 1.0 meter, at a distance
of 2.4 meters from the listener.
7 Pure Data – A graphical programming environment for audio, video and graphical processing (Pure Data 2009).
8 MIDI – Musical Instrument Digital Interface (MIDI Manufacturers Association 2009).
Figure 14. The interface controller unit: a digital mixing console connected to Pure Data through MIDI. The two
faders to the left controlled the playback positions of the two audio files. The third fader (red) controlled the level of the
second audio file.
Amplifier:             Yamaha DSP-AX1
Audio D/A converter:   Grace design m902 (unbalanced analog outputs)
Computer:              Macintosh Macbook 1.1
Loudspeakers:          Chario Syntar 200
Mixing console:        Yamaha 01V96
Subwoofer:             Martin Logan Descent
Table 1. The test equipment list.
Figure 15. The listening experiment setup.
The audio files used in the test were chosen from three categories:
1. Speech, recorded in a studio or with low background noise (this category is
called Speech).
2. Music, frequently played in the channels P3 and P4 (this category is called P3P4).
3. Music, spectrally different and/or dynamically changing, i.e. difficult to set
one general sound level (this category is called Hard).
Categories and test material were chosen in consultation with supervisors and sound
engineers at SR. The speech sounds were monophonic and collected from the internal audio database and edited to be approximately 30 seconds long. The music material was imported from CDs and all tracks were full-length and in stereo. See
Appendix B for a description of the audio files that were used in the listening experiment together with corresponding index.
The test material consisted of 24 audio files (7 speech files and 17 music tracks).
Every audio file occurred three times in the test, as shown in Table 2. Since each level
adjustment involves two audio files, this implied 24 × 3 / 2 = 36 level adjustments for each
test subject. Each category was matched against the
other two categories approximately the same number of times. The vertical axis (A)
represents the index of the audio files with a fixed sound level, and the horizontal
axis (B) represents the index of the audio files with variable level. For example,
audio file #2 and #13 were matched to audio file #1 and audio file #1 was matched
to audio file #24. Because all audio files occurred with the same frequency in the test,
the possible bias due to a specific reference sound could be avoided (Skovenborg &
Nielsen 2004, p. 13). Also, before each test, the order of the 36 level adjustments was
randomized so that the pair matching sequence would be different for all the test
subjects.
Table 2. How the audio files were paired in the test. The vertical axis represents the index of the audio files with a fixed
sound level, and the horizontal axis represents the index of the audio files with variable level.
Half the test subjects did the listening experiment according to Table 2, and the other
half did the test the opposite way, i.e. with the horizontal axis (B) representing the reference
sounds. This was done to prevent the order of the two audio files in each matching
from influencing the test result.
Sixteen test subjects participated in the test, all males. Eight of them were sound
engineers from SR and the other eight had either some form of audio technology
education or music education.
3.1.2. Evaluation method
The resulting data from the listening experiment consisted of the difference in decibels between all the matched audio file pairs. The decibel difference between an
audio file with index i and an audio file with index j is called DL(i,j) (DifferenceLevel) by
Skovenborg et al. (2004). In the following chapters, the nomenclature used by
Skovenborg et al. (2004) is used where possible. Using all the DifferenceLevel values
from the 16 test subjects, regression analysis was used to obtain the SegmentLevel or
SL(i) values. The SegmentLevel values correspond to the subjective level of each of the
24 audio files used in the listening experiment. Regression analysis is a method to
analyze the relation between a response variable and one or many explanatory
variables (Nationalencyklopedin 2009). Shown below (eq. 4) are the equations that
formed the starting point for the regression analysis. The result from the regression
analysis is an estimate of the SL(i) values and is based on a least-squares error
solution (Skovenborg et al. 2004, p. 10). Appendix L contains the Matlab commands
used for the regression analysis.
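$$
\begin{aligned}
DL(1,2) &= 1 \cdot SL(1) - 1 \cdot SL(2) + 0 \cdot SL(3) + \dots + 0 \cdot SL(24) \\
DL(2,3) &= 0 \cdot SL(1) + 1 \cdot SL(2) - 1 \cdot SL(3) + \dots + 0 \cdot SL(24) \\
&\;\;\vdots \\
DL(24,1) &= -1 \cdot SL(1) + 0 \cdot SL(2) + 0 \cdot SL(3) + \dots + 1 \cdot SL(24)
\end{aligned} \tag{4}
$$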
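The equation system (4) can be solved in the least-squares sense as in the Python sketch below. It uses NumPy instead of the Matlab commands in Appendix L, so it illustrates the technique rather than the code used in the project. The function name and the zero-based pair indices are assumptions.

```python
import numpy as np

def estimate_segment_levels(pairs, difference_levels, n_files):
    """Least-squares estimate of SL from rows DL(i,j) = SL(i) - SL(j).

    pairs holds zero-based (i, j) indices, one per measured DL value.
    """
    design = np.zeros((len(pairs), n_files))
    for row, (i, j) in enumerate(pairs):
        design[row, i] = 1.0
        design[row, j] = -1.0
    # The system is rank deficient (a common offset cancels in every row),
    # so lstsq returns the minimum-norm solution, whose values sum to zero.
    targets = np.asarray(difference_levels, dtype=float)
    levels, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return levels
```

Every test subject contributes one row per matched pair, so with 16 subjects and 36 matchings the design matrix would have 576 rows and 24 columns. Only level differences are determined by the data, which is why a constant offset is removed later by the zero-order correction.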
All 24 audio files were run through the algorithms and the results were stored. The
audio files used here were not normalized. Each algorithm assigned one value to each
audio file, which was compared to the 24 SegmentLevel values from the regression
analysis.
To avoid any constant difference between the algorithm values and the SegmentLevel
values, zero-order correction was added to all algorithm values. The formula used is
shown below:
$$ \mathrm{ModelPrediction}(i) = \mathrm{ModelPrediction}_{\mathrm{uncal}}(i) - \frac{1}{N}\sum_{k=1}^{N}\bigl(\mathrm{ModelPrediction}_{\mathrm{uncal}}(k) - \mathrm{SegmentLevel}(k)\bigr) \tag{5} $$

where ModelPrediction_uncal(i) is the algorithm value of the audio file with index i before
correction.
The difference between a predicted value from an algorithm and the corresponding
subjective value is named SegmentError(i).
$$ \mathrm{SegmentError}(i) = \mathrm{ModelPrediction}(i) - \mathrm{SegmentLevel}(i) \tag{6} $$
In Figure 16, the above-mentioned steps are shown schematically.
[Figure 16 shows two branches. Algorithm predictions, ModelPrediction_uncal(i), pass through zero-order correction and give the zero-order corrected algorithm predictions, ModelPrediction(i). The listening experiment result, DifferenceLevel(i,j), passes through regression analysis and gives the regression analysis result, SegmentLevel(i). The two branches meet in SegmentError(i) = ModelPrediction(i) - SegmentLevel(i).]
Figure 16. From listening experiment result and algorithm predictions to the SegmentError values.
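The zero-order correction in equation (5) and the error in equation (6) can be written as a short Python sketch. The function names are hypothetical and the inputs are assumed to be sequences holding one value per audio file.

```python
import numpy as np

def zero_order_correct(uncalibrated, segment_levels):
    """Eq. (5): subtract the mean difference to the subjective levels."""
    uncalibrated = np.asarray(uncalibrated, dtype=float)
    segment_levels = np.asarray(segment_levels, dtype=float)
    return uncalibrated - np.mean(uncalibrated - segment_levels)

def segment_errors(uncalibrated, segment_levels):
    """Eq. (6) applied to the zero-order corrected predictions."""
    corrected = zero_order_correct(uncalibrated, segment_levels)
    return corrected - np.asarray(segment_levels, dtype=float)
```

After the correction the SegmentError values have a mean of zero, so the statistical measures describe only how the errors are spread and not any constant offset between an algorithm and the listeners.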
Based on all SegmentError values five statistical measures were calculated:
Average Absolute Error (AAE) – the mean of the absolute SegmentError values for one
algorithm.

$$ \mathrm{AAE} = \frac{1}{N}\sum_{i=1}^{N}\lvert\mathrm{SegmentError}(i)\rvert \tag{7} $$
Absolute Standard Deviation (ASD) – the standard deviation of the absolute SegmentError
values for one algorithm.

$$ \mathrm{ASD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl(\lvert\mathrm{SegmentError}(i)\rvert - \mathrm{AAE}\bigr)^{2}} \tag{8} $$
Maximum error (MaxError) – the largest absolute SegmentError that an algorithm
produced over all audio files.
Root Mean Square Error (RMSE) – the root mean square of the SegmentError values.
Gives more weight to the high errors (Skovenborg & Nielsen 2004, p. 17).

$$ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\mathrm{SegmentError}(i)^{2}} \tag{9} $$
95th Percentile Absolute Error (P95AE) – The value that 95 % of the absolute
SegmentError values are below.
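The five measures can be computed as in the Python sketch below. It follows equations (7) to (9) and takes MaxError as the largest absolute error. The percentile uses NumPy's default linear interpolation, which is an assumption, since the text does not say how P95AE was interpolated.

```python
import numpy as np

def error_measures(segment_errors):
    """AAE, ASD, MaxError, RMSE and P95AE for one algorithm, all in dB."""
    err = np.asarray(segment_errors, dtype=float)
    abs_err = np.abs(err)
    aae = float(np.mean(abs_err))
    return {
        "AAE": aae,
        "ASD": float(np.sqrt(np.mean((abs_err - aae) ** 2))),
        "MaxError": float(np.max(abs_err)),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "P95AE": float(np.percentile(abs_err, 95)),
    }
```

With these definitions RMSE² = AAE² + ASD², which gives a quick consistency check on a row of results.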
Using the first results from the listening experiment as a starting point, explanations
for the differences between the algorithm values and the subjective listening experiment results were sought. Among other things, amplitude histograms of the audio
files were plotted and analyzed. An amplitude histogram displays how the levels in an
audio file are distributed. The histograms are shown in Appendix C. The first results
were also used in the development process of the different algorithm modifications
described in section 2.11.
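A histogram of this kind could be computed as in the following Python sketch. How the histograms in Appendix C were produced is not described here, so the sketch makes an assumption: it counts the RMS levels of short non-overlapping blocks. The block length and bin edges are hypothetical parameters.

```python
import numpy as np

def level_histogram(x, block_len, bin_edges_db):
    """Count how many blocks of x fall into each level bin (levels in dBFS)."""
    x = np.asarray(x, dtype=float)
    n_blocks = len(x) // block_len
    blocks = x[:n_blocks * block_len].reshape(n_blocks, block_len)
    # The small offset keeps silent blocks from producing log10(0).
    levels_db = 10.0 * np.log10(np.mean(blocks ** 2, axis=1) + 1e-12)
    counts, _ = np.histogram(levels_db, bins=bin_edges_db)
    return counts
```

A file with nearly constant level puts most blocks into a few neighbouring bins, while a file with long pauses or strong dynamics spreads them over a wide range.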
3.2. Evaluation with loudness database from IRT
At IRT, a loudness listening experiment has been carried out. The loudness database
consisted of many audio segments, both speech and music, recorded from German
radio broadcasts. The segments were short, about 10-15 seconds and included several
music genres, such as: choral music, classical music, pop and metal. The speech segments consisted of male and female voices in different broadcast environments. Fifteen test subjects participated in the listening experiment. The result and audio segments from this listening experiment have been used to evaluate the algorithms in
this project.
3.2.1. Evaluation method
From the IRT database, 68 audio segments were used to evaluate the algorithms. The
algorithms calculated and assigned loudness values for the audio segments and these
values were compared to the IRT database values using the same procedure as described in section 3.1.2, but no regression analysis was made since the database consisted of SegmentLevel values for all audio segments.
4. Results
The result from the listening experiment at SR and from the two algorithm evaluations is presented
with the help of graphs and the five statistical measures described in section 3.1.2.
4.1. Results from the listening experiment at Swedish Radio
In Figure 17 the mean value and standard deviation of the 36 DifferenceLevel values for
the 16 test subjects in the listening experiment are shown. Every matching included a
pair of audio files. In this figure, Index of the audio files (i,j) corresponds to a specific
pair of audio files, see Appendix B for a description of the files.
Figure 17. Mean and standard deviation of the DifferenceLevel values for the 16 test subjects for each matching in
the listening experiment.
4.2. Results from the evaluation using loudness database from Swedish Radio
In Tables 3 and 4 the statistical measures from the first algorithm evaluation are presented. The subjective reference data is from the listening experiment at Swedish
Radio. Table 3 shows the results of the original algorithm versions.
Algorithm                AAE    ASD    MaxError   RMSE   P95AE
ITU-R BS. 1770           1.04   0.69   2.24       1.25   2.17
Replay Gain (Original)   1.18   0.77   2.68       1.41   2.67
Replay Gain (AwaveSR)    1.35   0.96   3.90       1.66   3.28
Table 3. ITU-R BS. 1770, Replay Gain (Original) and Replay Gain (AwaveSR) evaluated with the reference data
from the listening experiment at Swedish Radio. All values are in dB.
Table 4 shows the results of the modified algorithm versions. If several parameter
values of an algorithm version produced low errors in the evaluations, only the three
best parameter values of that algorithm version are presented here.
Algorithm                     AAE    ASD    MaxError   RMSE   P95AE
Center of Gravity             0.98   0.64   2.32       1.17   1.97
ITU gate (-60 dBFS)           1.00   0.64   2.48       1.19   1.95
ITU strongest 6th             1.00   0.78   3.14       1.27   2.44
ITU strongest 5th             1.00   0.80   3.31       1.28   2.55
ITU gate (-70 dBFS)           1.01   0.63   2.40       1.19   1.87
ITU gate (-63 dBFS)           1.01   0.63   2.45       1.19   1.92
Replay Gain 15 %              1.01   0.68   2.96       1.22   2.34
Replay Gain 10 %              1.01   0.70   2.91       1.23   2.12
RLB gate (-60 dBFS)           1.02   0.61   2.15       1.19   2.03
RLB gate (-70 dBFS)           1.03   0.59   2.11       1.18   2.03
RLB gate (-65 dBFS)           1.03   0.59   2.15       1.19   2.04
RLB                           1.04   0.66   2.30       1.23   2.03
ITU strongest 7th             1.04   0.77   3.18       1.30   2.29
Replay Gain 20 %              1.05   0.76   2.99       1.30   2.42
Replay Gain 120 ms            1.06   0.84   2.65       1.35   2.60
ITU with Replay Gain filter   1.06   0.86   2.91       1.36   2.57
Replay Gain with ITU filter   1.08   0.86   3.54       1.39   2.78
Replay Gain 150 ms            1.11   0.82   2.80       1.38   2.54
Replay Gain 100 ms            1.13   0.79   2.65       1.38   2.59
Table 4. The results of the modified algorithm versions. All values are in dB.
In Figures 18 and 19 all 24 SegmentLevel values are shown together with ModelPrediction
values from eight of the above-mentioned algorithms. Three of them are the original
algorithm versions and the other five are chosen for their overall good performance
when considering all five statistical measures. A description of the audio files can be
found in Appendix B.
Figure 18. The subjective loudness values (SegmentLevel values) of audio files 1-12 from the regression analysis
together with corresponding algorithm predictions.
Figure 19. The subjective loudness values (SegmentLevel values) of audio files 13-24 from the regression analysis
together with corresponding algorithm predictions.
4.3. Results from the evaluation using loudness database from IRT
In Tables 5 and 6 the statistical measures from the second algorithm evaluation are
presented. The subjective reference data is from IRT. Table 5 shows the results of
the original algorithm versions.
Algorithm                AAE    ASD    MaxError   RMSE   P95AE
ITU-R BS. 1770           1.27   1.05   5.23       1.65   3.23
Replay Gain (Original)   1.52   1.37   6.15       2.04   4.59
Replay Gain (AwaveSR)    1.86   1.67   8.86       2.50   5.40
Table 5. Replay Gain (AwaveSR), Replay Gain (Original) and ITU-R BS. 1770 evaluated with the reference data
from IRT. All values are in dB.
Table 6 shows the results of the modified algorithm versions. If several parameter
values of an algorithm version produced low errors in the evaluations, only the three
best parameter values of that algorithm version are presented here.
Algorithm                     AAE    ASD    MaxError   RMSE   P95AE
RLB gate (-50 dBFS)           1.15   1.06   4.59       1.56   3.43
ITU strongest 5th             1.15   1.07   4.70       1.57   3.40
RLB gate (-60 dBFS)           1.15   1.07   4.82       1.57   3.36
RLB gate (-65 dBFS)           1.16   1.06   4.88       1.57   3.34
ITU gate (-20 dBFS)           1.18   1.01   4.34       1.55   3.23
RLB                           1.18   1.05   4.80       1.58   3.27
ITU gate (-50 dBFS)           1.22   1.05   5.06       1.61   3.04
ITU gate (-60 dBFS)           1.24   1.05   5.28       1.63   3.06
Replay Gain with ITU filter   1.33   1.20   5.00       1.79   3.74
Replay Gain 25 %              1.42   1.39   6.25       1.98   4.40
Replay Gain 20 %              1.39   1.44   6.41       2.00   4.85
Replay Gain 15 %              1.44   1.37   6.53       2.04   4.97
ITU with Replay Gain filter   1.45   1.31   5.60       1.96   4.29
ITU strongest 6th             1.48   1.26   5.66       1.94   3.62
ITU strongest 7th             1.50   1.31   5.76       1.99   3.92
Replay Gain 80 ms             1.54   1.45   5.88       2.11   4.74
Replay Gain 100 ms            1.54   1.50   5.98       2.15   5.14
Replay Gain 120 ms            1.61   1.51   6.27       2.21   5.35
Table 6. The modified algorithms that produced an equal or lower AAE compared to ITU-R BS. 1770. All
values are in dB.
5. Discussion
In the following chapter several aspects concerning the project methodology and results are discussed.
5.1. Listening experiment methodology
When carrying out a listening experiment it is always difficult to know whether the test
will actually address the specific question at hand. Many bias factors can influence the
test, and it may be difficult to draw any conclusions. In the listening experiment at SR several aspects of the methodology can be challenged, e.g.: Were the audio files too long? Was it good to let the test subjects choose
their own listening level? Was it a good choice to use a subwoofer? Was the mixing
console in the listening experiment a usable interface for the test subjects?
All these questions are relevant and can be discussed at length. The aim of
the listening experiment methodology was to simulate what the Replay Gain
algorithm is used for at SR and to let the test subjects use an intuitive interface. A fader was considered the best way to
control the level, since a fader is the standard tool for a sound engineer when working with level adjustments. Replay Gain calculates one loudness value for a whole
audio file, and at SR most audio files that AwaveSR processes are imported from
CDs, with a length of 3-5 minutes in general. To simulate this, full-length music files
were used in the listening experiment. When sound engineers mix live or recorded
audio in a studio they choose their own listening level, which is why
the test subjects could choose the reference level in the experiment. Whether the test
setup should have included a subwoofer or not is a complex question. Without it, the test
result would probably have been different, since the test subjects would have had different
sensations of the bass content in the audio files, but would it have been more accurate? The
best way might have been to have two speaker setups and to let the test subjects do
the test twice and analyze the differences, but this was not possible due to time limitations.
In Figure 17 the mean value and standard deviation of each matching in the listening
experiment are shown. If the standard deviations had been too large, one could
have assumed that the listening experiment method was impracticable, but most of
the test subjects seem to agree fairly well on the gain adjustment for each audio
pair. The exceptions will be discussed in section 5.2.
In Loudness Assessment of Music and Speech, Skovenborg et al. (2004) model their listening experiment data with a General Linear Model (GLM), which included both regression analysis and an analysis of covariance. The authors also include several bias
factors in the model to try to reduce the influence of possible biases. In this project
only the regression analysis has been made. A deeper statistical analysis would
probably improve the validity of the data from the listening experiment.
5.2. Listening experiment results
When analyzing the mean errors from the evaluations, both ITU-R BS. 1770 and
AwaveSR seem to match the mean subjective loudness values fairly well. ITU-R BS. 1770
has an AAE of 1.0 dB using the SR subjective loudness data, and an AAE of 1.3 dB
using the IRT subjective loudness data. AwaveSR produces higher AAE values
(1.4 dB and 1.9 dB respectively), but is not clearly worse. The biggest
difference between the ITU standard and AwaveSR is the maximum error, which is
considerably higher for AwaveSR, with a MaxError that is 1.7 dB (SR database) and
3.6 dB (IRT database) higher than that of the ITU standard.
When comparing Replay Gain (AwaveSR) and Replay Gain (Original), the second of
the two produced lower errors in both evaluations. The filter differences are mainly
in the bass region, and in Figure 10 it can be seen that the Replay Gain (Original)
filter is closer to the target response in the bass region (<200 Hz) than Replay Gain
(AwaveSR).
The modified algorithms that produced the lowest AAE in both evaluations are
the ITU-based ones that included a gate function. The ITU strongest 5th algorithm also
gave low AAE values in both evaluations but produced higher MaxError and P95AE
in the SR evaluation. It can also be discussed whether ITU strongest 5th can be evaluated using the IRT database, since the audio segments are only 10-15 seconds long. A
loudness value calculated over only 2-3 seconds of such a segment might not be representative of the strongest section of a full-length audio file.
Many of the modified ITU versions that included a gate function produced low errors, but when comparing both evaluations, a single best gate threshold value could
not be found. This finding is confirmed in the paper Loudness Descriptors to Characterize
Programs and Music Tracks written by Skovenborg and Lund (2008, p. 2), where the
authors suggest an adaptive gate function instead of a fixed gate threshold.
One pattern can be seen when analyzing the mean value and standard deviation of
each pair-matching, see Figure 17: the standard deviation is higher when a speech
file is present. It can be discussed whether this is because balancing speech is in general
more difficult than balancing other audio material, or whether the individually preferred
level difference between speech and music is the main cause. I believe that both
these explanations have affected the experiment result. In Appendix C the amplitude
histograms of the 24 audio files are shown. The music files have a much narrower
distribution of levels, and this can perhaps explain why the music files were easier to
balance than the speech files, but I also believe that there is a much larger, what
Skovenborg et al. (2004, p. 7) call, “between-subject disagreement” between music and
speech than there is between different pop music audio files. Different listeners prefer
different levels when listening to speech. Of course, the frequency spectra of the
different audio files also affect the result that an algorithm produces. In the SR listening experiment the P3P4 music files all have quite similar spectra, compared to the
two other categories, Speech and Hard, where the frequency spectra vary much
more. It seems to be easier to match audio files with similar spectra, but
exactly how different frequency spectra affect the loudness calculation of an algorithm has not been closely studied in this project.
Concerning the number of audio files in the listening experiment at SR, one thing
must be noted. Since the number of audio files was low, an algorithm modification
that decreased the maximum error considerably, but perhaps increased the mean error
of the other loudness predictions, could seem to be just as accurate as an algorithm
that has a lower mean error except for one really bad loudness prediction. The question is whether it is better to have an algorithm with a low mean error and a few very bad
exceptions in the loudness predictions, or the other way around. From my point of
view a low mean error is preferable in an audio file normalization process, since it
will probably be easier to locate and correct a few bad exceptions in the database.
The underlying purpose of this thesis was to investigate whether it was possible to use a
normalization algorithm on audio files containing speech. After analyzing the results, this
seems to be possible. All ITU-based algorithms predicted the loudness of the available
speech material fairly well compared to the subjective references. When
comparing ITU-R BS. 1770 with Replay Gain (AwaveSR) and Replay Gain (Original)
concerning only the speech material, the ITU standard was closest to the subjective
reference data and Replay Gain (AwaveSR) made the worst loudness predictions. With
the data from the listening test at SR, the three worst loudness predictions from
Replay Gain (AwaveSR) are on speech material. When analyzing the results from the
evaluation with the subjective loudness database from IRT the same pattern can be
seen.
5.3. How good can an algorithm become?
The result in this thesis indicates that many of the investigated algorithm versions
can produce good results, but the algorithms investigated here are probably not optimal. Neither Replay Gain nor ITU-R BS. 1770 uses an advanced, research-based
loudness calculation model, which might improve the prediction of the algorithms.
On the other hand, Skovenborg and Nielsen (2004) investigated many loudness calculation algorithms and one of the findings was that several well-known models,
such as Leq(A) and the ISO 532-B implementation of a model suggested by Zwicker,
did not predict the loudness of music and speech satisfactorily. The predecessor to
the ITU standard, Leq(RLB), produced lower errors than both these two models. So it
is not obvious that a more advanced model would give a better algorithm.
With audio files and corresponding subjective loudness values retrieved from listening experiments, loudness algorithms can be optimized to match the subjective reference data in the best possible way, and this has also been done during this project.
The problem is that the optimization is done with a very limited amount of data. There
is no guarantee that an “optimized algorithm” will be the best choice when using it
on other audio material.
Loudness calculation models, such as the ones used in Replay Gain and ITU-R BS.
1770, will never be good at predicting loudness for all audio material available. A
possible improvement might be to implement what Zemark (2007, p. 46) calls a
“category meter”, which would classify the audio material, for example into pop music,
classical music and speech, and after that perhaps use different loudness calculation
methods adapted to the different groups of audio material.
In summary, I believe that it is impossible to switch from having sound engineers
in the production chain to using only an automated audio file normalization process
without noticing any difference. But as a complementary tool, an audio file normalization process is a good way to assist sound engineers and especially the self-operators.
6. Conclusions
In this master’s project several loudness calculation algorithms have been evaluated.
Two of them were published standards, Replay Gain and ITU-R BS. 1770, and the
others were modifications of these two. The algorithms were evaluated using two
subjective loudness databases. One was retrieved from a listening experiment realized
at Swedish Radio during the project, and the other one came from the research institute IRT. In both evaluations ITU-R BS. 1770 produced lower error than Replay
Gain. Of the modified algorithms, the ITU-based ones with a gate function produced the
best results. Most of the gate threshold values tested produced slightly better results
than the ITU standard. A gate function seems to increase the precision of algorithms
that are based on mean level measurement, but when comparing the results from the
two evaluations, no single best gate threshold value could be found.
Concerning speech files, the ITU standard and the modifications based on it
predicted loudness fairly well on the available speech material, compared to
the subjective references. This suggests that SR could use a
similar normalization algorithm to adjust the overall level of speech files as well in the
future.
The best choice for the audio file normalization process at Swedish Radio seems to
be an ITU-based loudness calculation algorithm together with a gate function. This
recommendation is based not only on the results from the evaluations made during this
project, but also on the fact that ITU-R BS. 1770 is a suggested professional loudness measurement standard. Such a standard will probably be tested, evaluated and also criticized by many different research groups and companies. The Replay Gain algorithm
is also a suggested standard, but it was developed by one person and has not been thoroughly evaluated. If a better loudness calculation method is found in the future, it will
probably be tested against the ITU standard, but probably not against Replay Gain.
The author therefore considers an algorithm based on the ITU standard to be
a better choice than staying with the Replay Gain algorithm, if Swedish Radio wants
to continue to follow the research concerning loudness calculation algorithms.
Loudness calculation algorithms can be investigated further in many different areas.
One area closely connected to this project is whether an adaptive gate
function increases the precision of the ITU-R BS. 1770 algorithm compared to a
fixed gate function. An optimal fixed gate threshold could not be found in this
project, which is why it would be interesting to see how an adaptive gate would
perform in comparison. One ITU implementation with an
adaptive gate function, Center of Gravity (Skovenborg &
Lund 2008), was tested during this project, but only using the loudness database from SR. The results seem promising, which is another reason to continue studying different gate functions.
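To illustrate what an adaptive gate could look like, the following is a minimal sketch in which the gate threshold is placed a fixed number of dB below the ungated mean level of the file, so that it follows the material instead of being a fixed value. This is not the Center of Gravity method and not one of the algorithms evaluated in this project. The block length (400 ms), the relative gate (-10 dB), the synthetic test signal and all variable names are assumptions chosen for illustration, and the weighting filters are omitted for brevity.

```matlab
% Sketch of a relative (adaptive) gate on block powers.
% All parameter values below are illustrative assumptions.
fs = 48000;
t = (0:fs-1)'/fs;
% Synthetic test signal: 1 s of very quiet tone followed by 1 s of loud tone
x = [0.001*sin(2*pi*1000*t); 0.5*sin(2*pi*1000*t)];

blockLen = round(0.4*fs);                 % 400 ms blocks (assumed)
nBlocks = floor(length(x)/blockLen);
blocks = reshape(x(1:nBlocks*blockLen), blockLen, nBlocks);
blockPower = mean(blocks.^2, 1);          % mean square per block

% Ungated level over all blocks
ungated_dB = 10*log10(mean(blockPower));

% The gate follows the material: keep blocks within 10 dB of the ungated level
relGate_dB = -10;                         % assumed relative gate
keep = 10*log10(blockPower) > (ungated_dB + relGate_dB);
gated_dB = 10*log10(mean(blockPower(keep)));

disp(['Ungated level: ' num2str(ungated_dB) ' dB']);
disp(['Gated level:   ' num2str(gated_dB) ' dB']);
```

With this test signal the two quiet blocks fall below the gate and are ignored, so the gated level is higher than the ungated level. With a fixed gate, the same outcome would depend on how the chosen threshold happens to relate to the level of the file.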
Another aspect to study is whether a different approach can be used as a complement to
the audio file normalization process used today at SR. One idea for such a complement is
what I call “post normalization gain correction”. If an audio file is found to have a considerably different subjective sound level than other audio files, sound engineers
should have the possibility to adjust the level of the file in the database. This could
also be automated by logging the lists of music played together with fader movements and time synchronization. If several sound engineers make (approximately) the
same level adjustment to an audio file, the level of the file should be adjusted.
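The decision rule in the last sentence can be sketched as follows. This is only an illustration of the idea, not an existing system at SR. The logged values, the minimum number of sound engineers and the tolerance for what counts as "approximately the same" adjustment are hypothetical.

```matlab
% Sketch of "post normalization gain correction" for one audio file.
% Hypothetical logged fader adjustments in dB, one value per sound engineer
adjustments_dB = [-2.5 -3.0 -2.0 -2.8];

minEngineers = 3;      % assumed: at least this many must have adjusted the file
maxSpread_dB = 1.5;    % assumed: largest spread still counted as agreement

spread_dB = max(adjustments_dB) - min(adjustments_dB);
if numel(adjustments_dB) >= minEngineers && spread_dB <= maxSpread_dB
    % The engineers agree: correct the stored level by the mean adjustment
    correction_dB = mean(adjustments_dB);
else
    % Too few adjustments or no agreement: leave the file as it is
    correction_dB = 0;
end
disp(['Gain correction: ' num2str(correction_dB) ' dB']);
```

Using the mean of the logged adjustments is one possible choice. The median would be less sensitive to a single deviating adjustment.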
7. Acknowledgements
There are many people who have helped me during this project. I would like to
thank:
- My supervisor at Swedish Radio: Christofer Bustad. For always helping out with my theoretical and practical questions, for giving feedback and for putting so much time into the project.
- My supervisor at Royal Institute of Technology: Ph.D. Svante Granqvist. For giving feedback and for helping me with my questions concerning theory, my report and the examination process.
- Swedish Radio Production & Technical Development staff: Lars Jonsson, Lars Mossberg, Bo Ternström and Hasse Wessman. For all help and support during the project.
- HD development manager Thomas Lund and Ph.D., Senior Research Engineer Esben Skovenborg at TC Electronics. For hospitality, feedback and interesting input to the project.
- Dipl.-Ing. Gerhard Spikofski at IRT. For letting me use the subjective loudness database from IRT.
- Software engineer Markus Dimdal, FMJ-Software. For answering my questions about AwaveSR and for giving me access to the source code of the filter.
- The listening experiment participants. Without you, no project!
- My wife Anna for supporting and believing in me and for being a reminder of the really important things in life.
8. References
FMJ-Software (2008). FMJ-Software. (Online). Available: <http://www.fmjsoft.com>
(Accessed 2008-10-01).
Granqvist, S. & Liljencrants, J. (2004). Kompendium i Elektroakustik. Stockholm: Royal
Institute of Technology.
IRT (2009). IRT (Online). Available: <http://www.irt.de/en/irt.html>
(Accessed 2009-01-26).
ITU-R (2006). Rec. ITU-R BS. 1770-1, Algorithms to measure audio programme loudness
and true-peak audio level. International Telecommunication Union.
Karolinska Universitetssjukhuset (2009). Örats funktion. (Online). Available:
<http://www.karolinska.se/templates/Page____55847.aspx>
(Accessed 2009-01-28).
Leijon, A. (2007). Sound Perception: Introduction and Exercise Problems. Stockholm: Royal
Institute of Technology.
Lund, T. (2008). Inter-program level jumps in broadcast. Conference article from
“Broadcast Asia 2008”, 17-20 June 2008, Singapore.
MIDI Manufacturers Association (2009). Tutorial – MIDI and Music Synthesis.
(Online). Available: <http://www.midi.org/aboutmidi/tut_midimusicsynth.php>
(Accessed 2009-02-28).
Moore, B. (2003). An Introduction to the Psychology of Hearing. Great Britain: Academic
Press.
Nationalencyklopedin (2009). Regressionsanalys. (Online). Available:
<http://www.ne.se/artikel/291872> (Accessed 2009-01-28).
Pure Data (2009). Pure Data. (Online). Available: <http://puredata.info/>
(Accessed 2009-02-23).
Replay Gain (2008). Replay Gain – A Proposed Standard. (Online). Available:
<http://www.replaygain.org/> (Accessed 2008-09-10).
Skovenborg, E. & Lund, T. (2008). Loudness Descriptors to Characterize Programs and
Music Tracks. In Proc. of the AES 125th Convention, San Francisco.
Skovenborg, E. & Nielsen, S.H. (2004). Evaluation of Different Loudness Models with
Music and Speech material. In Proc. of the 117th AES Convention, San Francisco.
Skovenborg, E., Quesnel, R., & Nielsen, S.H. (2004). Loudness Assessment of Music and
Speech. In Proc. of the 116th AES Convention, Berlin.
Vickers, E. (2001). Automatic Long-term Loudness and Dynamics Matching. In Proc. of the
111th AES Convention, New York.
Zemark, M. (2007). Implementing methods for equal Loudness in Broadcasting at Swedish
Radio. Master of Science Thesis. Stockholm: Royal Institute of Technology.
Zwicker, E. & Fastl, H. (1999). Psychoacoustics: Facts and Models. Berlin: Springer.
9. Appendix
Appendix A – Test subject listening level
Test subject    Mean sound level at listener position, dB(C)
                (measured with pink noise)
1               60
2               65
3               65
4               65
5               65
6               65
7               65
8               65
9               65
10              65
11              60.5
12              66
13              66
14              60.5
15              70
16              61.5
Appendix B – Audio file specifications
The audio files used in the test were chosen from three categories:
1. Speech, recorded in a studio or with low background noise (Speech)
2. Music, frequently played in the channels P3 and P4 (P3-P4)
3. Music, spectrally different and/or dynamically changing, i.e. difficult to set
one general sound level (Hard)
Index  Category  Audio file information (if from CD: Title, Artist, Record)
1      P3-P4     Alla vill till himmelen men ingen vill dö, Timbuktu, Alla vill till himmelen men ingen vill dö
2      Speech    Female voice (studio)
3      Hard      You raise me up, Josh Groban, Closer
4      P3-P4     Bills bills bills, Destiny's child, Bills bills bills
5      P3-P4     Viva la Vida, Coldplay, Viva la vida or death and his friends
6      Speech    Male voice (studio)
7      Speech    Female voices (treble dominated)
8      P3-P4     With Every Heartbeat, Kleerup feat. Robyn, Kleerup
9      Speech    Female voice (background noise)
10     P3-P4     Beautiful mourning, Machine Head, The blackening
11     Hard      My heart will go on, Celine Dion, My heart will go on
12     Hard      I will find you there, Michael Ruff, Speaking in melodies
13     Hard      Tennessee Waltz, Alma Cogan, Alma Cogan
14     Speech    Male voice (telephone)
15     P3-P4     Curly Sue, Takida, Bury the lies
16     P3-P4     Du Hast, Rammstein, Sehnsucht
17     Hard      Peach Blossom Spring, Yutaka Yokokura, Yutaka
18     Speech    Male voices (Sport interview)
19     Hard      Nightshift, The Commondores, Nightshift
20     Hard      Somliga går med trasiga skor, Cornelis Wreeswijk, Mäster Cees memoarer (2)
21     Speech    Male voice (studio)
22     P3-P4     Ligga lågt, Tomas Andersson Wij, Blues från Sverige
23     Hard      Cotton fields back home, Creedence Clearwater Revival, Willy and the poor boys
24     P3-P4     Killing me softly, The Fugees, The score
Appendix C – Audio file histograms
(Before the amplitude histograms were calculated, any DC offset was removed)
[Figures: amplitude histograms, one panel per audio file, for audio files #1 to #24. The panels are titled with the file index and category:
#1 (P3-P4), #2 (Speech), #3 (Hard), #4 (P3-P4), #5 (P3-P4), #6 (Speech),
#7 (Speech), #8 (P3-P4), #9 (Speech), #10 (P3-P4), #11 (Hard), #12 (Hard),
#13 (Hard), #14 (Speech), #15 (P3-P4), #16 (P3-P4), #17 (Hard), #18 (Speech),
#19 (Hard), #20 (Hard), #21 (Speech), #22 (P3-P4), #23 (Hard), #24 (P3-P4)]
Appendix D – Matlab code for the Replay Gain
(Original) implementation
The code is from the website www.replaygain.org (Replay Gain 2008).
% replaygainscript
%
% Asks user for name of wavefiles (or folders containing
% wavefiles)
% User gives null response to indicate all files entered
% Calculates replay gain of file using "replaygain" function
% To enter entire folder, append / or \ (i.e. "maskers/"
% processes all files in directory "maskers")
% David Robinson, July 2001. http://www.David.Robinson.org/
clear Vrms filenamematrix
% Get filter co-efs for 44100 Hz Equal Loudness Filter
%- (IN MY THESIS 48000HZ WAS USED INSTEAD)
%- //Paul Nygren, 2009-02-24
[a1,b1,a2,b2]=equalloudfilt(44100);
% Calculate perceived loudness of -20dB FS RMS pink noise
% This is the SMPTE reference signal. It calibrates to:
% 0dB on a studio meter / mixing desk
% 83dB SPL in a listening environment (THIS IS WHAT WE'RE
% USING HERE)
[ref_Vrms]=replaygain('ref_pink.wav',a1,b1,a2,b2);
filename='not empty';
filenumber=0;
filenumber=filenumber+1;
% Ask user for filename to process
filename=input(['enter filename ',num2str(filenumber), ...
' ? '],'s');
% do this loop while ever the user is entering files
% (user hits enter to proceed to calculation)
while length(filename)>0,
% Check if user has entered folder name
if filename(length(filename))=='/' | ...
filename(length(filename))=='\',
% Get directory listing of requested folder
d=dir([filename '*.wav']);
% If the folder exists and contains .wav files in
% the directory
if length(d)>0,
% Store each wavefilename for processing later
for loop=1:length(d)
realfilename=d(loop).name;
filenamematrix(filenumber).name= ...
[filename realfilename (1:length(realfilename)-4)];
filenumber=filenumber+1;
end
filename=input(['enter filename ', ...
num2str(filenumber),' ? '],'s');
else
% If the folder does not exist or contains no .wavs,
% ask the user for another name
filename=input(['NOT FOUND. enter filename ', ...
num2str(filenumber),' ? '],'s');
end
% If the user has entered a file name (rather than folder)
else
% Add .wav to end if user failed to include it
if isempty(findstr(filename,'.wav')), filename= ...
[filename '.wav']; end
% Check the file exists
if ~exist(filename,'file'),
filename=input(['NOT FOUND. enter filename ', ...
num2str(filenumber),' ? '],'s');
else
% If it does, store the file name
% Strip .wav from end
filename=filename(1:length(filename)-4);
filenamematrix(filenumber).name=filename;
filenumber=filenumber+1;
filename=input(['enter filename ', ...
num2str(filenumber),' ? '],'s');
end
end
end
disp(char(13));
% If no files entered, end the program
if filenumber==1
error('Program Aborted: You must type something!!!');
end
% Start a timer to find out how long this takes!
tic;
% Loop through all the files
for loop=1:filenumber-1,
% Calculate the perceived loudness of the file
% using "replaygain" function.
% Subtract this from reference loudness to give
% actual replay gain relative to 83 dB level
Vrms(loop)=ref_Vrms-replaygain ...
(filenamematrix(loop).name,a1,b1,a2,b2);
% Output the result on screen
ref_Vrms
disp([filenamematrix(loop).name '.wav: ' ...
num2str(Vrms(loop)) ' dB']);
end
disp(char(13));
disp('== ReplayGainScript complete ==');
% Stop timer and display elapsed time
toc
function [a1,b1,a2,b2]=equalloudfilt(fs)
% Design a filter to match equal loudness curves
% 9/7/2001
% If the user hasn't specified a sampling frequency, use the CD
% default
if nargin<1,
fs=44100;
end
% Specify the 80 dB Equal Loudness curve
if fs==44100 | fs==48000,
EL80=[0,120;20,113;30,103;40,97;50,93;60,91;70,89;80,87; ...
90,86;100,85;200,78;300,76;400,76;500,76;600,76;700,77; ...
800,78;900,79.5;1000,80;1500,79;2000,77;2500,74;3000,71.5; ...
3700,70;4000,70.5;5000,74;6000,79;7000,84;8000,86;9000,86; ...
10000,85;12000,95;15000,110;20000,125;fs/2,140];
elseif fs==32000,
EL80=[0,120;20,113;30,103;40,97;50,93;60,91;70,89;80,87; ...
90,86;100,85;200,78;300,76;400,76;500,76;600,76;700,77; ...
800,78;900,79.5;1000,80;1500,79;2000,77;2500,74;3000,71.5; ...
3700,70;4000,70.5;5000,74;6000,79;7000,84;8000,86;9000,86; ...
10000,85;12000,95;15000,110;fs/2,115];
else
error('Filter not defined for current sample rate');
end
% convert frequency and amplitude of the equal loudness curve into
% format suitable for yulewalk
f=EL80(:,1)./(fs/2);
m=10.^((70-EL80(:,2))/20);
% Use a MATLAB utility to design a best fit IIR filter
[b1,a1]=yulewalk(10,f,m);
% Add a 2nd order high pass filter at 150Hz to finish the job
[b2,a2]=butter(2,(150/(fs/2)),'high');
function Vrms = replaygain(filename,a1,b1,a2,b2)
% Determine the perceived loudness of a file
% METHOD:
% 1) Calculate Vrms every 50ms
% 2) Sort in ascending order of loudness
% 3) Pick the 95% interval (i.e. go 95% up the list,
% and choose the value at this point)
% 4) Convert this value into dB
% 5) return this value.
% Back in the main program...
% 6) Subtract it from that calculated for -20dB FS
% RMS pink noise
% Result = required correction to replay gain
% (relative to 83dB reference)
% David Robinson, 10th July 2001.
% http://www.David.Robinson.org/
% Get information about file
lngth=wavread(filename,'size');
samples=lngth(1);
channels=lngth(2);
% Read sampling rate and No. of bits
[dummy,fs,bs]=wavread(filename,[1 2]);
% If the file isn't CD sample rate, try to
% generate appropriate equal loudness filter
if fs~=44100 || nargin<2,
[a1,b1,a2,b2]=equalloudfilt(fs);
end
%- BELOW, THE BLOCK LENGTH WAS CHANGED
%- IN THE Replay Gain X ms ALGORITHM
%- VERSION //Paul Nygren, 2009-02-24
% Set the Vrms window to 50ms
rms_window_length=round(50*(fs/1000));
%- BELOW, THE PERCENTAGE WAS CHANGED
%- IN THE Replay Gain Y % ALGORITHM
%- VERSION //Paul Nygren, 2009-02-24
% Set the interval to 95%
% Which rms value to take as typical of whole file
percentage=95;
% Set amount of data (in seconds) which
% Matlab on my PC happily copes with at once
% chunk data in from wave file in 2 second blocks
% - file less than this length will cause an error
block_length=2;
% Determine how many rms value to calculate
% per block of data
rms_per_block=fix((fs*block_length)/rms_window_length);
% Check that the file is long enough to
% process in block_length blocks
if lngth<(fs*block_length),
warning(['skipping ' filename ' because it is too short']);
Vrms=0;
Vrms_all=0;
return
end
% Display a Waitbar to show user how far into file we are
wbh=waitbar(0,'Processing...');
% Loop through all the file in blocks as defined above
for audio_block=0:fix(samples/(fs*block_length))-1,
% Update the waitbar display to reflect progress
waitbar(audio_block/(fix(samples/(fs*block_length))-1));
% Grab a section of audio
inaudio=wavread(filename,[(fs*block_length*audio_block)+1 ...
fs*block_length*(audio_block+1)]);
% Filter it using the equal loudness curve filter:
inaudio=filter(b1,a1,inaudio);
inaudio=filter(b2,a2,inaudio);
% Calculate Vrms:
for rms_block=0:rms_per_block-1,
% Mono signal: just do the one channel
if channels==1,
Vrms_all((audio_block*rms_per_block)+rms_block+1)= ...
mean(inaudio((rms_block*rms_window_length)+ ...
1:(rms_block+1)*rms_window_length).^2);
% Stereo signal: take average Vrms of both channels
elseif channels==2,
Vrms_left=mean(inaudio((rms_block*rms_window_length)+ ...
1:(rms_block+1)*rms_window_length,1).^2);
Vrms_right=mean(inaudio((rms_block*rms_window_length) ...
+1:(rms_block+1)*rms_window_length,2).^2);
Vrms_all((audio_block*rms_per_block)+rms_block+1)= ...
(Vrms_left+Vrms_right)/2;
end
end
end
% Close the waitbar
close(wbh);
% Convert to dB
Vrms_all=10*log10(Vrms_all+10^-10);
% Sort the Vrms values into numerical order
Vrms_all=sort(Vrms_all);
% Pick the 95% value
Vrms=Vrms_all(round(length(Vrms_all)*percentage/100));
return
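The METHOD steps listed in the function header can be illustrated on a synthetic signal. The following is a minimal sketch, not the code from www.replaygain.org: the equal loudness filter and the file handling are left out, and the test signal and the variable names are assumptions made for illustration.

```matlab
% Sketch of the Replay Gain statistics (steps 1 to 4) on a synthetic signal.
fs = 48000;
t = (0:10*fs-1)'/fs;
x = sin(2*pi*1000*t);              % 10 s full-scale 1 kHz tone
x(1:8*fs) = 0.01*x(1:8*fs);        % make the first 8 s 40 dB quieter

% 1) Mean square value of every 50 ms window
win = round(50*(fs/1000));
nWin = floor(length(x)/win);
frames = reshape(x(1:nWin*win), win, nWin);
ms = mean(frames.^2, 1);

% 2) Convert to dB and sort in ascending order
ms_dB = sort(10*log10(ms + 10^-10));

% 3) and 4) Pick the value 95% up the sorted list
percentage = 95;
level_dB = ms_dB(round(nWin*percentage/100));

% For comparison: the mean level over the whole signal
mean_dB = 10*log10(mean(x.^2));
disp(['95% value:  ' num2str(level_dB) ' dB']);
disp(['Mean level: ' num2str(mean_dB) ' dB']);
```

In this example only the last fifth of the signal is loud, yet the 95% value lands in the loud part and is close to the level of the full-scale tone (about -3 dB), while the mean level over the whole signal is several dB lower.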
Appendix E – Matlab code for the ITU-R BS. 1770
implementation
function ITUloudness = ITUOriginal(a1)
%
%--- Loudness calculation according to ITU-R BS. 1770-1 ---
%
% This script calculates the loudness value
% for a mono or stereo .wav-file according to ITU-R BS. 1770-1
% As input the script requires the name of a .wav-file
% on the form: 'name.wav'. The output is the calculated
% loudness value.
% --- 2009-02-17, Paul Nygren ---
%
% Reads an audio file
audiofile1=wavread(a1);
%Filter coefficients for the modeling
%of the acoustic effects of the head (48kHz)
Bhead = [1.53512485958697 -2.69169618940638 1.19839281085285];
Ahead = [1 -1.69065929318241 0.73248077421585];
%RLB filter coefficients (48kHz)
BRLB = [1 -2 1];
ARLB = [1 -1.99004745483398 0.99007225036621];
%filtering and calculation according
%to ITU-R BS. 1770-1 (stereo)
if size(audiofile1,2)==2
af1ch1=filter(Bhead, Ahead, audiofile1(:,1));
af1ch2=filter(Bhead, Ahead, audiofile1(:,2));
af1ch1filt=filter(BRLB, ARLB, af1ch1(:));
af1ch2filt=filter(BRLB, ARLB, af1ch2(:));
af1ch1filtsq=af1ch1filt.^2;
af1ch2filtsq=af1ch2filt.^2;
z11=mean(af1ch1filtsq);
z12=mean(af1ch2filtsq);
loud=-0.691+10*log10(z11+z12);
else
%filtering and calculation according
%to ITU-R BS. 1770-1 (mono)
af1ch1=filter(Bhead, Ahead, audiofile1(:,1));
af1ch1filt=filter(BRLB, ARLB, af1ch1(:));
af1ch1filtsq=af1ch1filt.^2;
z11=mean(af1ch1filtsq);
loud=-0.691+10*log10(z11);
end
ITUloudness=loud;
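A calculation such as the one above can be checked with a signal of known level. ITU-R BS. 1770 specifies that a full-scale 1 kHz sine wave in a single front channel should give a reading of -3.01. The following minimal sketch runs the same filters and formula on a generated tone instead of a .wav-file. The tone length and the variable names are assumptions made for illustration.

```matlab
% Sanity check of the mono calculation with a full-scale 1 kHz tone (48 kHz).
fs = 48000;
t = (0:5*fs-1)'/fs;
x = sin(2*pi*1000*t);              % 5 s, 0 dBFS peak

% Filter coefficients as in the listing above (valid for 48 kHz)
Bhead = [1.53512485958697 -2.69169618940638 1.19839281085285];
Ahead = [1 -1.69065929318241 0.73248077421585];
BRLB = [1 -2 1];
ARLB = [1 -1.99004745483398 0.99007225036621];

% Head filter, RLB filter, mean square, and conversion to a loudness value
y = filter(BRLB, ARLB, filter(Bhead, Ahead, x));
loud = -0.691 + 10*log10(mean(y.^2));
disp(['Loudness of the test tone: ' num2str(loud)]);
```

The result should come out close to -3.01. A clearly different value would suggest that the coefficients do not match the sample rate of the signal, since the coefficients above are given for 48 kHz only.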
Appendix F – Matlab code for the ITU gate
implementation
function ITU = ITUgate(a1)
%
%--- ITU-R BS. 1770 with gate function ---
%
% Calculation according to the ITU standard,
% but values under a given threshold are ignored.
% --- 2009-02-17, Paul Nygren ---
%
% Reads an audio file
audiofile1=wavread(a1);
%Filter coefficients for the modeling
%of the acoustic effects of the head (48kHz)
Bhead = [1.53512485958697 -2.69169618940638 1.19839281085285];
Ahead = [1 -1.69065929318241 0.73248077421585];
%RLB filter coefficients (48kHz)
BRLB = [1 -2 1];
ARLB = [1 -1.99004745483398 0.99007225036621];
%The gate threshold value (this value corresponds to -50dBFS)
%Change to test different threshold values
threshold=0.00001;
%filtering and calculation according
%to ITU-R BS. 1770-1 (stereo)
if size(audiofile1,2)==2
af1ch1=filter(Bhead, Ahead, audiofile1(:,1));
af1ch2=filter(Bhead, Ahead, audiofile1(:,2));
af1ch1filt=filter(BRLB, ARLB, af1ch1(:));
af1ch2filt=filter(BRLB, ARLB, af1ch2(:));
af1ch1filtsq=af1ch1filt.^2;
af1ch2filtsq=af1ch2filt.^2;
%The squared values are sorted
sortedch1=sort(af1ch1filtsq);
sortedch2=sort(af1ch2filtsq);
%Find the index in the vectors where the
%threshold value is.
a=find(sortedch1<threshold);
b=find(sortedch2<threshold);
%The mean value is calculated including only values over
%the threshold in the vectors sortedch1 and sortedch2
z11=mean(sortedch1((length(a)+1):length(sortedch1)));
z12=mean(sortedch2((length(b)+1):length(sortedch2)));
loud=-0.691+10*log10(z11+z12);
else
%filtering and calculation according
%to ITU-R BS. 1770-1 (mono)
af1ch1=filter(Bhead, Ahead, audiofile1(:,1));
af1ch1filt=filter(BRLB, ARLB, af1ch1(:));
af1ch1filtsq=af1ch1filt.^2;
%The squared values are sorted
sortedch1=sort(af1ch1filtsq);
%Find the index in the vectors where the
%threshold value is
a=find(sortedch1<threshold);
%The mean value is calculated including only values over
%the threshold in the vector sortedch1
z11=mean(sortedch1((length(a)+1):length(sortedch1)));
loud=-0.691+10*log10(z11);
end
ITU=loud;
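The threshold in the listing above is compared with squared sample values, so a threshold given in dB is converted with 10^(dB/10). The following minimal sketch shows the conversion and the effect of the gate on a synthetic signal that is half silence. The weighting filters are omitted, and the test signal and the variable names are assumptions made for illustration. Logical indexing is used here instead of sort and find, which gives the same mean value.

```matlab
% Sketch: gate threshold conversion and its effect on the mean level.
threshold_dB = -50;
threshold = 10^(threshold_dB/10);  % = 0.00001, the value used in the listing

fs = 48000;
t = (0:fs-1)'/fs;
% Synthetic signal: 1 s of silence followed by 1 s of tone
x = [zeros(fs,1); 0.1*sin(2*pi*1000*t)];
xsq = x.^2;

% Mean level with and without the gate
ungated_dB = 10*log10(mean(xsq));
gated_dB = 10*log10(mean(xsq(xsq >= threshold)));
disp(['Ungated: ' num2str(ungated_dB) ' dB']);
disp(['Gated:   ' num2str(gated_dB) ' dB']);
```

Since half of the signal is silent, the gated level is about 3 dB higher than the ungated level. The gate also removes the samples of the tone that lie close to its zero crossings, which raises the gated level slightly further.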
Appendix G – Matlab code for the ITU strongest
section implementation
function ITUloudness = ITUGate_Strongest(a1)
%
%--- ITU strongest section ---
%
% This script calculates the loudness value
% for a 2 channel .wav-file according to ITU-R BS. 1770-1
% but only over the strongest section of the file.
% As input the script requires the name of a .wav-file
% on the form: 'name.wav'. The output is the calculated
% loudness value.
% --- 2009-02-24, Paul Nygren ---
%
% Reads an audio file
[audiofile1,FS,nbits]=wavread(a1);
%Filter coefficients for the modeling
%of the acoustic effects of the head (48kHz)
Bhead = [1.53512485958697 -2.69169618940638 1.19839281085285];
Ahead = [1 -1.69065929318241 0.73248077421585];
%RLB filter coefficients (48kHz)
BRLB = [1 -2 1];
ARLB = [1 -1.99004745483398 0.99007225036621];
a=0;
afbothchfiltsq=0;
%filtering and calculation according
%to ITU-R BS. 1770-1 (stereo)
if size(audiofile1,2)==2
af1ch1=filter(Bhead, Ahead, audiofile1(:,1));
af1ch2=filter(Bhead, Ahead, audiofile1(:,2));
af1ch1filt=filter(BRLB, ARLB, af1ch1(:));
af1ch2filt=filter(BRLB, ARLB, af1ch2(:));
af1ch1filtsq=af1ch1filt.^2;
af1ch2filtsq=af1ch2filt.^2;
%The channels are summed and
%each sample are indexed and sorted
afbothchfiltsq=af1ch1filtsq+af1ch2filtsq;
temp=afbothchfiltsq';
temp=[temp;(1:length(temp))];
sortedtemp=sortrows(temp');
sortedtemp1=sortedtemp(:,1);
%Values below -50dBFS are ignored
a=find(sortedtemp1<0.000001);
b=sortedtemp((length(a)+1):length(sortedtemp),1);
c=sortedtemp((length(a)+1):length(sortedtemp),2);
b=[b c];
b=sortrows(b,2);
%The size of the strongest section is set here
lengthStrongest=round(length(audiofile1)/5);
temp2=sum(b(1:lengthStrongest));
strongest=temp2;
%The strongest section is found
for i=2:(length(b)-(lengthStrongest-1))
temp2=(temp2-b(i-1))+b(lengthStrongest+(i-1));
if temp2>strongest
strongest=temp2;
end
end
strongest=strongest/lengthStrongest;
loud1=-0.691+10*log10(strongest);
else
disp('--- Error: not a stereo audio file ---')
end
ITUloudness=loud1;
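The search for the strongest section in the listing above moves a window of fixed length over the remaining values and keeps the largest sum. The same search can be written without a loop by using cumulative sums. The following is a minimal sketch of that search only, with a short made-up vector of squared values and an assumed window length of three values. The filtering and the gate are left out.

```matlab
% Sketch: find the strongest contiguous section with a cumulative sum.
% Made-up squared values, for illustration only
p = [0.1 0.1 0.9 1.0 0.8 0.1 0.1 0.2 0.1 0.1]';
n = 3;                             % assumed section length in values

% Sum of every window of n consecutive values
c = [0; cumsum(p)];
windowSums = c(n+1:end) - c(1:end-n);

% The strongest section and its mean value
[best, startIdx] = max(windowSums);
strongestMean = best/n;
disp(['Strongest section starts at index ' num2str(startIdx)]);
disp(['Mean of the strongest section: ' num2str(strongestMean)]);
```

Here the strongest section is the one covering the values 0.9, 1.0 and 0.8.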
Appendix H – Matlab code for the ITU with
Replay Gain filter implementation
function ITURGloudness = ITUwithRG(a1)
%
%--- ITU with Replay Gain filter ---
%
% This script calculates loudness value for a mono or
% stereo .wav-file according to the ITU standard but using
% the weighting filters from Replay Gain.
%
% As input the script requires the name of a .wav-file
% on the form: 'name.wav'. The output is the calculated
% loudness value.
% --- 2009-02-24, Paul Nygren ---
[audiofile1, fs, NBITS]=wavread(a1);
%Getting the filter coefficients for the
%Replay Gain filters
[A1,B1,A2,B2]=equalloudfilt(fs);
%Stereo file calculation process according to
%the ITU standard but with the Replay Gain
%filter coefficients instead
if size(audiofile1,2)==2
af1ch1=filter(B1,A1, audiofile1(:,1));
af1ch2=filter(B1,A1, audiofile1(:,2));
af1ch1filt=filter(B2, A2, af1ch1(:));
af1ch2filt=filter(B2, A2, af1ch2(:));
af1ch1filtsq=af1ch1filt.^2;
af1ch2filtsq=af1ch2filt.^2;
z11=mean(af1ch1filtsq);
z12=mean(af1ch2filtsq);
loud1=-0.691+10*log10(z11+z12);
%Mono file calculation process according to
%the ITU standard but with the Replay Gain
%filter coefficients instead
else
af1ch1=filter(B1,A1, audiofile1(:,1));
af1ch1filt=filter(B2, A2, af1ch1(:));
af1ch1filtsq=af1ch1filt.^2;
z11=mean(af1ch1filtsq);
loud1=-0.691+10*log10(z11);
end
ITURGloudness=loud1;
The function equalloudfilt used above is from the website www.replaygain.org (Replay Gain 2008) and is listed in Appendix D.
Appendix I – Matlab code for the Replay Gain
with ITU filter implementation
(The Replay Gain algorithm function that differs from the original implementation is
shown below. For the other two functions see Appendix D.)
function Vrms = replaygainITU(filename,a1,b1,a2,b2)
% Determine the perceived loudness of a file
% METHOD:
% 1) Calculate Vrms every 50ms
% 2) Sort in ascending order of loudness
% 3) Pick the 95% interval (i.e. go 95% up the list,
% and choose the value at this point)
% 4) Convert this value into dB
% 5) return this value.
% Back in the main program...
% 6) Subtract it from that calculated for -20dB FS
% RMS pink noise
% Result = required correction to replay gain
% (relative to 83dB reference)
% David Robinson, 10th July 2001.
% http://www.David.Robinson.org/
% Get information about file
lngth=wavread(filename,'size');
samples=lngth(1);
channels=lngth(2);
% Read sampling rate and No. of bits
[dummy,fs,bs]=wavread(filename,[1 2]);
%- FILTER COEFFICIENTS FOR THE MODELING
%- OF THE ACOUSTIC EFFECTS OF THE HEAD (48kHz)
%- //Paul Nygren, 2009-02-24
Bhead = [1.53512485958697 -2.69169618940638 1.19839281085285];
Ahead = [1 -1.69065929318241 0.73248077421585];
%- THE RLB WEIGHTING FILTER COEFFICIENTS
%- //Paul Nygren, 2009-02-24
BRLB = [1 -2 1];
ARLB = [1 -1.99004745483398 0.99007225036621];
% Set the Vrms window to 50ms
rms_window_length=round(50*(fs/1000));
% Set the interval to 95%
% Which rms value to take as typical of whole file
percentage=95;
% Set amount of data (in seconds) which
% Matlab on my PC happily copes with at once
% chunk data in from wave file in 2 second blocks
% - file less than this length will cause an error
block_length=2;
% Determine how many rms value to calculate
% per block of data
rms_per_block=fix((fs*block_length)/rms_window_length);
% Check that the file is long enough to
% process in block_length blocks
if lngth<(fs*block_length),
warning(['skipping ' filename ' because it is too short']);
Vrms=0;
Vrms_all=0;
return
end
% Display a Waitbar to show user how far into file we are
wbh=waitbar(0,'Processing...');
% Loop through all the file in blocks as defined above
for audio_block=0:fix(samples/(fs*block_length))-1,
% Update the waitbar display to reflect progress
waitbar(audio_block/(fix(samples/(fs*block_length))-1));
% Grab a section of audio
inaudio=wavread(filename,[(fs*block_length*audio_block)+1 ...
fs*block_length*(audio_block+1)]);
% Filter it using the ITU filters defined above
% (head filter followed by the RLB weighting filter):
inaudio=filter(Bhead,Ahead,inaudio);
inaudio=filter(BRLB,ARLB,inaudio);
% Calculate Vrms:
for rms_block=0:rms_per_block-1,
% Mono signal: just do the one channel
if channels==1,
Vrms_all((audio_block*rms_per_block)+rms_block+1)= ...
mean(inaudio((rms_block*rms_window_length)+ ...
1:(rms_block+1)*rms_window_length).^2);
% Stereo signal: take average Vrms of both channels
elseif channels==2,
Vrms_left=mean(inaudio((rms_block*rms_window_length)+ ...
1:(rms_block+1)*rms_window_length,1).^2);
Vrms_right=mean(inaudio((rms_block*rms_window_length) ...
+1:(rms_block+1)*rms_window_length,2).^2);
Vrms_all((audio_block*rms_per_block)+rms_block+1)= ...
(Vrms_left+Vrms_right)/2;
end
end
end
% Close the waitbar
close(wbh);
% Convert to dB
Vrms_all=10*log10(Vrms_all+10^-10);
% Sort the Vrms values into numerical order
Vrms_all=sort(Vrms_all);
% Pick the 95% value
Vrms=Vrms_all(round(length(Vrms_all)*percentage/100));
return
Appendix J – Matlab code for the RLB
implementation
function RLBloudness = RLB (a1)
%
%--- RLB loudness calculation ---
%
% This script calculates the RLB loudness value
% for a mono or stereo .wav-file.
% As input the script requires the name of a .wav-file
% on the form: 'name.wav'. The output is the calculated
% loudness value.
% --- 2009-02-24, Paul Nygren ---
%
% Reads an audio file
audiofile1=wavread(a1);
%RLB filter coefficients (48kHz)
BRLB = [1 -2 1];
ARLB = [1 -1.99004745483398 0.99007225036621];
%filtering and calculation according
%to RLB (stereo)
if size(audiofile1,2)==2
af1ch1filt=filter(BRLB, ARLB, audiofile1(:,1));
af1ch2filt=filter(BRLB, ARLB, audiofile1(:,2));
af1ch1filtsq=af1ch1filt.^2;
af1ch2filtsq=af1ch2filt.^2;
z11=mean(af1ch1filtsq);
z12=mean(af1ch2filtsq);
loud=-0.691+10*log10(z11+z12);
else
%filtering and calculation according
%to RLB (mono)
af1ch1filt=filter(BRLB, ARLB, audiofile1(:,1));
af1ch1filtsq=af1ch1filt.^2;
z11=mean(af1ch1filtsq);
loud=-0.691+10*log10(z11);
end
RLBloudness=loud;
Appendix K – Matlab code for the RLB gate
implementation
function RLBgate = RLBgate(a1)
%
%--- RLB loudness calculation with gate function ---
%
% Calculation according to RLB,
% but values under a given threshold are ignored.
% --- 2009-02-24, Paul Nygren ---
%
% Reads an audio file
audiofile1=wavread(a1);
%RLB filter coefficients (48kHz)
BRLB = [1 -2 1];
ARLB = [1 -1.99004745483398 0.99007225036621];
%The gate threshold value (this value corresponds to -50dBFS)
%Change to test different threshold values
threshold=0.00001;
%filtering and calculation according
%to RLB (stereo)
if size(audiofile1,2)==2
af1ch1filt=filter(BRLB, ARLB, audiofile1(:,1));
af1ch2filt=filter(BRLB, ARLB, audiofile1(:,2));
af1ch1filtsq=af1ch1filt.^2;
af1ch2filtsq=af1ch2filt.^2;
%The squared values are sorted
sortedch1=sort(af1ch1filtsq);
sortedch2=sort(af1ch2filtsq);
%Find the index in the vectors where the
%threshold value is.
a=find(sortedch1<threshold);
b=find(sortedch2<threshold);
%The mean value is calculated including only values over
%the threshold in the vectors sortedch1 and sortedch2
z11=mean(sortedch1((length(a)+1):length(sortedch1)));
z12=mean(sortedch2((length(b)+1):length(sortedch2)));
loud=-0.691+10*log10(z11+z12);
else
%filtering and calculation according
%to RLB (mono)
af1ch1filt=filter(BRLB, ARLB, audiofile1(:,1));
af1ch1filtsq=af1ch1filt.^2;
%The squared values are sorted
sortedch1=sort(af1ch1filtsq);
%Find the index in the vectors where the
%threshold value is
a=find(sortedch1<threshold);
%The mean value is calculated including only values over
%the threshold in the vector sortedch1
z11=mean(sortedch1((length(a)+1):length(sortedch1)));
loud=-0.691+10*log10(z11);
end
RLBgate=loud;
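The sort-and-index step in the listing above keeps exactly the squared samples that are at or above the threshold. The following is a minimal sketch of the same selection using a logical mask, run on a small synthetic vector rather than on a filtered audio file. The names sq, viaSort and viaMask are illustrative assumptions and do not appear in the thesis code.

```matlab
% Synthetic squared samples (illustrative values, not thesis data)
sq = [0.5; 0.000001; 0.02; 0.000009; 0.1];
threshold = 0.00001;

% Sort-based gate, following the appendix code
sorted = sort(sq);
a = find(sorted < threshold);
viaSort = mean(sorted((length(a)+1):length(sorted)));

% Same gate using a logical mask (no sorting needed)
viaMask = mean(sq(sq >= threshold));

% Gated loudness value for a mono signal
loud = -0.691 + 10*log10(viaMask);
```

Both variants average the same three values (0.5, 0.02 and 0.1), so they should agree up to floating-point rounding.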
Appendix L – Matlab code for the regression analysis
Matlab code for the regression analysis, including the DifferenceLevel values from the
listening experiment.
%y1 to y16 represent the 16 test subjects and their 36
%DifferenceLevel values (the experiment normalization
%process is accounted for)
y1=[16.31; -0.49; -11.49; -0.27; -5.40; -0.74
NaN %Test subject mistake during experiment
1.97; 13.35; 9.71; 2.66; 2.66; -21.05; -7.03
12.45; 12.37; -7.27; 5.60; 5.26; 0.51; 2.96
5.61; -6.20; 4.20; 10.93; -12.92; -2.04; 10.22
6.40; -8.39; -0.69; 5.08; -12.92; 9.01; 1.20
-11.61]; %Test subject 1
y2=[13.21; -0.89; -13.99; -1.47; -1.10; -3.74
-1.02; 0.72; 16.95; 11.31; -1.29; 2.96; -21.10
-8.33; 14.65; 12.57; -12.07; 6.05; 5.86; 1.41
4.36; 8.96; -5.75; 4.30; 15.28; -15.32; -2.19
10.07; 9.90; -12.33; 0.06; 7.18; -13.72; 15.81
0.86; -8.16]; %Test subject 2
y3=[13.16; -1.49; -10.29; -5.42; -2.55; -1.99
1.18; 0.62; 14.80; 9.41; 0.41; 2.46; -18.10
-10.18; 11.45; 12.02; -6.18; 4.90; 3.41; 0.21
5.01; 8.01; -7.15; 3.15; 10.08; -8.77; -1.49
10.17; 9.15; -11.08; -0.59; 5.93; -12.47; 9.51
-3.55; -10.36]; %Test subject 3
y4=[15.86; 0.66; -12.29; -3.37; -4.35; -3.19
0.48; -1.23; 14.80; 10.81; -0.14; 5.06; -18.10
-5.33; 11.90; 11.02; -7.52; 5.35; 3.96; 1.66
2.51; 7.61; -6.85; 1.50; 14.28; -12.27; -1.54
9.57; 4.50; -12.24; 1.06; 6.13; -15.97; 11.56
-0.65; -11.51]; %Test subject 4
y5=[16.01; 1.41; -9.64; -2.37; -1.95; -4.64
2.68; 0.17; 11.85; 8.86; 3.21; 7.76; -17.95
-7.48; 13.50; 11.07; -8.52; 4.11; 3.16; -0.49
1.16; 5.06; -5.60; 2.85; 11.53; -11.57; -2.09
10.47; 12.70; -11.19; 0.36; 5.28; -14.32; 13.31
-0.69; -12.36]; %Test subject 5
y6=[17.41; -2.39; -19.79; -7.47; -0.45; 3.41
0.28; -0.23; 16.15; 11.36; 1.36; 1.46; -18.80
-8.78; 12.30; 12.97; -14.17; 4.36; 3.16; 4.16
8.06; 11.91; -9.00; 2.90; 14.53; -14.32; 0.71
11.27; 9.10; -11.44; 0.81; 8.93; -15.42; 10.36
1.86; -7.81]; %Test subject 6
y7=[14.61; 0.36; -11.49; -2.97; -3.40; -0.69
1.43; 0.47; 15.90; 10.26; 0.36; 4.96; -18.05
-6.38; 11.75; 11.82; -8.83; 7.31; 5.41; 0.06
3.76; 5.51; -7.50; 4.25; 11.83; -10.92; -0.89
10.27; 8.55; -11.48; 0.06; 4.83; -13.57; 12.16
-0.15; -12.21]; %Test subject 7
y8=[12.46; 2.16; -14.79; -0.07; -2.45; 0.01
3.23; 2.07; 13.70; 5.31; 1.41; 2.91; -21.60
-7.78; 13.10; 7.82; -7.77; 5.01; 3.41; -1.24
2.46; 5.86; -6.60; 4.15; 10.63; -7.37; -0.54
8.12; 6.40; -9.69; 1.01; 3.68; -12.42; 9.86
-1.09; -7.86]; %Test subject 8
y9=[12.16; -0.54; -6.99; 0.08; -7.75; -2.79
2.88; 2.97; 10.25; 8.41; 3.36; 7.16; -16.15
-6.08; 10.55; 11.37; -4.83; 6.35; 2.56; -2.99
1.51; 8.36; -5.65; 3.65; 9.63; -9.32; -2.74
6.27; 7.95; -8.88; -3.64; 4.78; -11.57; 12.71
-2.75; -10.66]; %Test subject 9
y10=[11.91; 1.31; -9.49; -4.97; -4.60; -1.19
2.53; 1.62; 10.15; 6.51; 2.41; 3.96; -17.30
-6.28; 8.85; 11.32; -6.73; 9.06; 5.76; -0.29
1.01; 5.96; -7.75; 2.80; 8.48; -9.47; -1.49
8.97; 6.10; -9.59; -0.89; 2.28; -12.07; 9.86
-0.55; -11.71]; %Test subject 10
y11=[12.56; 0.61; -12.44; -5.12; -3.55; -1.59
0.63; -0.68; 9.35; 7.96; 1.86; 2.21; -18.20
-8.58; 9.25; 11.42; -5.78; 4.91; 4.11; -3.09
2.01; 4.86; -7.65; 3.15; 9.28; -9.82; -1.54
9.42; 7.95; -9.44; -0.54; 4.93; -14.17; 9.71
1.26; -11.26]; %Test subject 11
y12=[13.71; 0.91; -2.84; 1.83; -4.25; -0.39
1.83; 3.47; 12.40; 8.51; 5.61; 6.76; -13.65
-7.13; 12.00; 9.17; -2.93; 7.56; 2.41; -1.24
3.26; 6.86; -7.00; 2.95; 9.38; -9.17; 0.86
9.22; 9.15; -2.09; -0.84; 3.58; -9.97; 14.06
-2.39; -10.21]; %Test subject 12
y13=[16.06; 1.71; -16.79; -1.37; -0.15; -1.34
0.43; 1.67; 15.40; 9.71; -0.34; 2.21; -22.75
-11.83; 15.65; 13.07; -10.73; 4.66; 2.86; -0.54
2.56; 8.11; -5.35; 4.15; 14.83; -14.87; -0.74
9.72; 7.15; -11.09; 0.56; 8.63; -13.67; 12.51
-1.65; -9.21]; %Test subject 13
y14=[10.51; -1.54; -12.59; -1.12; -1.40; -1.09
0.73; 1.77; 14.40; 9.21; 3.81; 4.21; -18.75
-8.18; 8.85; 8.02; -7.12; 7.91; 3.91; -1.74
2.31; 8.61; -6.45; 3.45; 10.43; -7.72; -0.49
9.57; 5.90; -11.19; 1.31; 5.83; -12.42; 12.11
-1.45; -11.96]; %Test subject 14
y15=[15.06; 1.46; -6.44; 0.58; -2.50; -3.49
-0.27; 0.82; 11.05; 11.21; 3.41; 5.41; -17.05
-2.58; 10.65; 11.22; -8.47; 8.05; 6.26; 2.41
3.56; 5.66; -6.80; 1.80; 11.83; -9.42; -1.29
10.12; 4.90; -5.98; -1.69; 4.48; -10.42; 10.01
1.81; -12.21]; %Test subject 15
y16=[13.96; 0.91; -10.69; 0.83; -4.05; -3.79
0.03; 0.92; 12.35; 9.76; 0.46; 6.76; -19.45
-7.78; 11.65; 11.02; -9.22; 6.05; 6.16; -0.34
2.76; 5.11; -6.45; 3.00; 9.73; -11.92; -0.34
12.82; 7.90; -9.64; -0.44; 6.33; -12.77; 9.86
-0.09; -11.66]; %Test subject 16
%The DifferenceLevel values of the 16 test subjects in one column
%vector.
y=[y1;y2;y3;y4;y5;y6;y7;y8;
y9;y10;y11;y12;y13;y14;y15;y16];
%A matrix representing which audio files were matched during
%a listening experiment session
x=zeros(36,24);
x(1,1)=1;
x(1,2)=-1;
x(2,1)=1;
x(2,13)=-1;
x(3,2)=1;
x(3,3)=-1;
x(4,2)=1;
x(4,14)=-1;
x(5,3)=1;
x(5,4)=-1;
x(6,3)=1;
x(6,15)=-1;
x(7,4)=1;
x(7,5)=-1;
x(8,4)=1;
x(8,16)=-1;
x(9,5)=1;
x(9,6)=-1;
x(10,5)=1;
x(10,17)=-1;
x(11,6)=1;
x(11,7)=-1;
x(12,6)=1;
x(12,18)=-1;
x(13,7)=1;
x(13,8)=-1;
x(14,7)=1;
x(14,19)=-1;
x(15,8)=1;
x(15,9)=-1;
x(16,8)=1;
x(16,20)=-1;
x(17,9)=1;
x(17,10)=-1;
x(18,9)=1;
x(18,21)=-1;
x(19,10)=1;
x(19,11)=-1;
x(20,10)=1;
x(20,22)=-1;
x(21,11)=1;
x(21,12)=-1;
x(22,11)=1;
x(22,23)=-1;
x(23,12)=1;
x(23,13)=-1;
x(24,12)=1;
x(24,24)=-1;
x(25,13)=1;
x(25,14)=-1;
x(26,14)=1;
x(26,15)=-1;
x(27,15)=1;
x(27,16)=-1;
x(28,16)=1;
x(28,17)=-1;
x(29,17)=1;
x(29,18)=-1;
x(30,18)=1;
x(30,19)=-1;
x(31,19)=1;
x(31,20)=-1;
x(32,20)=1;
x(32,21)=-1;
x(33,21)=1;
x(33,22)=-1;
x(34,22)=1;
x(34,23)=-1;
x(35,23)=1;
x(35,24)=-1;
x(36,24)=1;
x(36,1)=-1;
%regress needs a column of ones in X for the constant term,
%see Matlab help for information about "regress"
x=[ones(36,1) x];
%One x matrix for each test subject
X=[x;x;x;x;x;x;x;x;x;x;x;x;x;x;x;x];
%B corresponds to the SegmentLevel values, which are the
%result of the regression analysis
B = regress(y,X)
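The element-by-element assignments to x above follow a regular pattern: rows 1 to 24 match file k against file k+1 (odd rows) and against file k+12 (even rows), and rows 25 to 36 match files 13 to 24 against their successor, wrapping from 24 back to 1. A minimal sketch that rebuilds the same 36-by-24 matrix with loops is given below. The name xloop and the loop variables k and j are illustrative assumptions, not part of the thesis code.

```matlab
% Rebuild the 36-by-24 pairing matrix with loops (illustrative sketch)
xloop = zeros(36,24);
for k = 1:12
    % Odd rows 1..23: file k matched against file k+1
    xloop(2*k-1, k)   =  1;
    xloop(2*k-1, k+1) = -1;
    % Even rows 2..24: file k matched against file k+12
    xloop(2*k, k)     =  1;
    xloop(2*k, k+12)  = -1;
end
for j = 13:24
    % Rows 25..36: file j matched against the next file, wrapping 24 -> 1
    xloop(12+j, j)           =  1;
    xloop(12+j, mod(j,24)+1) = -1;
end
```

Every row contains one +1 and one -1 and therefore sums to zero. Because the experiment only measured level differences, the SegmentLevel values are determined only up to a common additive offset, so regress may report that X is rank deficient.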