Automatic chord recognition with known algorithm

Esben Paul Bugge
University of Copenhagen, Department of Computer Science
January 22nd 2010
Contents

1 Introduction
2 Related work
3 The COCHONUT algorithm
4 MusicXML
5 System design
6 Implementation
7 Tests
8 Conclusions and future work
A Supported chord-types
Abstract
Chord recognition is the task of finding the chords that are used in a musical score and where they are used. Chord recognition has several applications, including the analysis and classification of music. Several algorithms exist for recognizing chords from both symbolic and acoustic music data. In this paper I implement and test a system based on the COCHONUT algorithm for chord recognition on symbolic data; in this case, the symbolic data is MusicXML.
COCHONUT was originally made for chord recognition on jazz music, but in this project I test it on classical music to examine whether the algorithm is suitable for this genre. The tests are made on 5 classical scores, and the results show that the COCHONUT algorithm is somewhat useful on classical music. Furthermore, I present some ideas on how to improve the algorithm.
Preface
This report is written as a candidate project in Computer Science at the University of Copenhagen. The reader of the report should have a basic understanding of music theory, although some of the terms used in the report will be briefly explained. Furthermore, the reader should have a good understanding of software programming, in particular the Python programming language, which is used to implement the system. It is also recommended that the reader has some basic knowledge of XML documents and how they are structured. The report has the following sections:
Section 1 contains a short introduction to the subject of chord recognition as well as an introduction to what is examined in this project. Section 2 contains a short study of what others have tried in the field of automatic chord recognition. In Section 3 the COCHONUT algorithm is described in detail; the implemented system is based on this algorithm. Section 4 contains a description of the MusicXML format, as this is the format of the music the system works on. Section 5 describes the design of the system, while Section 6 contains a description of the system's technical implementation. In Section 7, tests are set up and carried out, followed by a discussion of the results. Finally, Section 8 contains the conclusions of the work done plus a discussion of what could be done to enhance the system in the future.
The project files (source code and test data) have been made available as a Google Code project at http://code.google.com/p/cochonut/.
As a final note, I would like to send a special thanks to Louise Birkkjær for reading and commenting on the report.
1 Introduction
A musical chord is a set of musical notes played simultaneously, and chord recognition is the task of determining the chords in a musical score: at any given point in the score we would like to know what chord is being played and when it changes to another chord. Chord recognition can be carried out on either acoustic data (audio recordings) or symbolic data (sheet music or some kind of representation of this). In this project, I deal with automatic chord recognition on symbolic data. The process is made automatic by creating a computer program that is able to read symbolic music data and output the chords found in these data. Automatic chord recognition has some useful applications, such as classification of music: knowing the chords of a musical score, we can classify the score along with other scores that use similar chord sequences.
For the purpose of introduction, let us take a look at a simple example of chord recognition. Figure 1 displays the notes of the first five measures of the popular tune, Happy Birthday To You. In this set of notes, the chords are not labeled; they have not been recognized yet.
Figure 1: The first five measures of Happy Birthday To You.
In Figure 2 the chords of the notes from Figure 1 have been labeled. The notes in the first measure do not represent a chord. The notes of the second measure represent a D-minor7, the notes of the third represent a C7, and so on.
Figure 2: The first five measures of Happy Birthday To You - labeled with chords.
The system developed in this project should, in a similar way, be able to output a description of what chords are used in the score and where each chord is used. In the example of Happy Birthday To You, the notes of each chord are not all played simultaneously; we say that the chords are broken: the notes still represent a chord, but they do not all sound simultaneously. The system in this project should also be able to recognize chords that are broken.
As mentioned, the example of recognizing chords in Happy Birthday To You
is a very simple one. A more complex example is displayed in Figure 3.
These notes are from Bach’s Well-Tempered Clavier, Vol. I. This section of
the score uses dissonances, that is, tones that are unstable in the harmony
of the score. The presence of dissonance means that harmonic rules will be
harder to apply in order to find the chords of the score.
Figure 3: A section of Bach's Well-Tempered Clavier, Vol. I.
Figure 4 displays a short section of Chopin's Minute Waltz. This section includes two grace notes: notes that are used to decorate the music and are not directly part of the melody or harmony. The grace notes in Figure 4 are smaller in print size than the rest of the notes, but grace notes can also be included in the score as ordinary notes. The presence of grace notes is considered to increase the difficulty of chord recognition, as the grace notes are not part of the chords in the score and are therefore a "disturbance" to the chord recognition system.
Figure 4: A short section of Chopin's Minute Waltz that uses grace notes.
The purpose of this project is to build a system that is able to recognize chords in simple cases as well as in complex cases such as the latter two. In this project I deal with classical music, but the system might just as well be used with other music genres.
1.1 Symbolic music data
In the examples from Figures 1-4 the music is displayed symbolically: symbols, and not acoustic sound, are used to define the music. The first of two main advantages of working with symbolic data, when creating a system for chord recognition, is that you do not have to record a lot of audio for both training and testing of your system. Instead, one can use scanned-in music notes, converted to data that is easy for the computer to handle and easy for a programmer to work with. The second main advantage is that the data will not contain noise, as it most likely would if the music were recorded. Because of these advantages, I work on symbolic music data in this project.
As for computer-readable symbolic music data, MIDI and MusicXML are two useful formats. The MIDI protocol [7] was created in 1983 for connecting musical devices such as instruments and computers and controlling them in real-time. The MIDI protocol can, for example, be used to connect a synthesizer to a computer and thereby transfer the music played on the synthesizer to the computer for editing, play-back etc. Later on, a storage format was created to save music in so-called MIDI-files. Files of this format hold symbolic music data. The MIDI format has many applications and is widely used throughout the music industry.
MusicXML [13] was created as an XML specification to represent music notation. According to the creators of MusicXML, Recordare, the format is better for musical notation than MIDI-files because it holds much more information. That said, the downside of MusicXML is that, because of all this information, the format is very verbose and can be difficult to handle. Figure 5 shows an example of a simple MusicXML file. It represents a single whole note on middle C in 4/4 time.
Because of MusicXML's ability to thoroughly describe notes, I have decided to work with this format in the project, and the system that I implement will thereby have to cope with the verbosity of the format. MusicXML is further described in Section 4.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE score-partwise PUBLIC
    "-//Recordare//DTD MusicXML 2.0 Partwise//EN"
    "http://www.musicxml.org/dtds/partwise.dtd">
<score-partwise version="2.0">
  <part-list>
    <score-part id="P1">
      <part-name>Music</part-name>
    </score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <attributes>
        <divisions>1</divisions>
        <key>
          <fifths>0</fifths>
        </key>
        <time>
          <beats>4</beats>
          <beat-type>4</beat-type>
        </time>
        <clef>
          <sign>G</sign>
          <line>2</line>
        </clef>
      </attributes>
      <note>
        <pitch>
          <step>C</step>
          <octave>4</octave>
        </pitch>
        <duration>4</duration>
        <type>whole</type>
      </note>
    </measure>
  </part>
</score-partwise>
Figure 5: The “Hello World” of MusicXML: A whole C note.
2 Related work
In [9] a very general algorithm for chord recognition is described. The algorithm is divided in two parts (although the work of these parts is done simultaneously): one for segmentation of the score and one for chord labeling. Segmentation is the task of finding the points in the music where the chord may change. Chord labeling is the task of labeling each segment with a chord. In the given algorithm, the score is segmented on every note hit (or note-attack). A score is calculated for each segment representing how well the notes of the segment represent a chord. Afterwards, the segments are mapped to the vertices of a directed graph while the scores are mapped to the values of the edges between vertices. The highest-scoring path is then found through the graph, and the result represents the chord labeling of the entire score. To make this algorithm as general as possible, its authors decided not to include any analysis of the musical context; only local information is analyzed.
The COCHONUT algorithm is described in [15] as being an extension to the
algorithm found in [9]. The test-results described in this article show that
by using harmonic contextual information and a few other techniques, the
algorithm from [9] can be improved substantially. As I will be working with
the COCHONUT algorithm throughout this project, it will be described in
more detail in Section 3.
The Melisma Music Analyzer (MMA) described in [16] also works on symbolic data. MMA is divided into several programs, but the so-called Harmony Program is especially relevant in this context, as it is able to output the root note of each segment in the music by analyzing the possible roots according to how well they fit into the circle of fifths.
In [1] chord recognition is used as a tool for classification of music. Automatic genre classification is done by using chord sequences extracted from both symbolic and acoustic data. Patterns are created for each musical genre by looking at chord sequences in scores from that genre. These patterns are then used to classify other scores, simply by comparing them to the patterns found in the test-score.
Along with chord recognition tools for symbolic data, a lot of work has been done on chord recognition for acoustic data. [5], [17], [4] and [6] use chromagrams for this task, which is the most common approach. A chromagram is a feature vector that holds intensities for the 12 pitch classes found in music. To perform chord recognition, chromagrams computed from the audio are compared with chromagrams representing different chords.
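The comparison step can be sketched in a few lines. Note that the template, the intensity values and the use of cosine similarity below are all illustrative assumptions, not taken from the cited papers.

```python
import math

# Hypothetical illustration: a chromagram is a 12-element vector of
# pitch-class intensities (C, C#, D, ..., B). A binary template for a
# C-major chord marks the pitch classes C (0), E (4) and G (7).
C_MAJOR_TEMPLATE = [1 if i in (0, 4, 7) else 0 for i in range(12)]

def cosine_similarity(a, b):
    """Compare two 12-element vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# A chromagram dominated by C, E and G scores high against the template.
chromagram = [0.9, 0.05, 0.1, 0.0, 0.8, 0.1, 0.0, 0.85, 0.05, 0.1, 0.0, 0.05]
score = cosine_similarity(chromagram, C_MAJOR_TEMPLATE)
```

The chord whose template scores highest against the observed chromagram would then be chosen as the label for that frame.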
3 The COCHONUT algorithm
The system I am creating is based on the COCHONUT algorithm developed by Scholz and Ramalho in [15]. The algorithm has three steps:
1. Segmentation of the score.
2. Chord-identification for each segment.
3. Contextual analysis to determine the best chord-candidate for each segment.
Segmentation is accomplished by splitting up the score every time there are three or more note-attacks within a small time-frame. The COCHONUT algorithm was created for chord recognition on jazz music, where chord changes often occur when there are three or more simultaneous note-attacks. This idea is an extension of what was done in [9], where the score was segmented every time a note-attack occurred.
Chord-identification is based on pattern matching. The pitches of each
segment found in the previous step are compared to a set of chord templates,
and from this comparison a list of chord-candidates for the given segment
is formed. Each chord-candidate is scored using a scoring-function from [9],
which gives a score that tells how well the chord-candidate represents the
pitches of the segment.
Contextual analysis is performed by creating a directed graph of the chord-candidates. In this graph, all chord-candidates (represented by vertices) in a given segment are connected (by directed edges) to all chord-candidates in the next segment. The best path through the graph is then found using chord-sequence patterns. These patterns are not explained in the paper, but it is assumed that they are created from chord-sequences found in a training-set of musical scores.
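The best-path search over the candidate graph can be sketched as a simple dynamic program. The candidate chords, their scores and the transition function below are invented for illustration; COCHONUT itself weights edges using its chord-sequence patterns.

```python
# Illustrative sketch of the contextual-analysis step: each segment has
# scored chord-candidates, and the highest-scoring path through the
# candidate graph is selected.
def best_chord_path(candidates, transition_score):
    """candidates: one {chord: score} dict per segment, in order."""
    # Map each candidate in the current segment to (total score, path).
    paths = {c: (s, [c]) for c, s in candidates[0].items()}
    for segment in candidates[1:]:
        new_paths = {}
        for chord, score in segment.items():
            # Extend the best-scoring predecessor path with this candidate.
            total, path = max(
                ((t + transition_score(prev, chord), p)
                 for prev, (t, p) in paths.items()),
                key=lambda x: x[0])
            new_paths[chord] = (total + score, path + [chord])
        paths = new_paths
    return max(paths.values(), key=lambda x: x[0])[1]

# Hypothetical candidates for three segments and a toy transition rule
# favouring the cadence Dm7 -> G7 -> C.
candidates = [{'Dm7': 0.9, 'F': 0.6}, {'G7': 0.8, 'Bdim': 0.5}, {'C': 0.9}]
favoured = {('Dm7', 'G7'), ('G7', 'C')}
path = best_chord_path(candidates,
                       lambda a, b: 0.3 if (a, b) in favoured else 0.0)
```

Because every candidate only needs the best path reaching its predecessor segment, the search is linear in the number of segments rather than exponential in the number of paths.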
4 MusicXML
Prior to reading this section, the reader should be familiar with the concepts of measures and parts in a musical score. Measures are used to divide the music on a time-based scale, while parts are used to divide the music into sections that represent different instruments or voices. In a regular music sheet, measures are divided by vertical bars, while each part of the score has its own horizontal line. These relations are illustrated in Figure 6. Furthermore, the reader should note that all XML-elements mentioned in this section and the following sections are written in a bold font.
Figure 6: An example of a musical score using measures and parts. The score contains three measures divided by the two middle vertical lines. The score contains two parts: one for piano, which contains both a treble and a bass, and one for violin. Parts are not necessarily named for a specific instrument.
MusicXML is based on XML. As in a music sheet, MusicXML divides the music into measures and parts. MusicXML has two ways of describing music: part-wise or time-wise. Using a time-wise description, the hierarchical structure of the MusicXML will be as in Figure 7: the score is divided into measures, and each measure is divided into parts. Using a part-wise description, the score would be divided into parts and each part into measures.
The MusicXML example in Figure 5 has a part-wise structure. This can be seen from the name of the root tag, which is score-partwise, and from the fact that the only part of the score (specified by the part-element) contains a measure (specified by the measure-element): the measure is a descendant of the part.
score
├─ measure 1
│   ├─ part 1
│   ├─ part 2
│   └─ part 3
└─ measure 2
    ├─ part 1
    ├─ part 2
    └─ part 3
Figure 7: The hierarchical structure of a MusicXML-document, when the music is described time-wise. The score is divided in measures which in turn are divided in parts. Each part contains information about the music located in a given part within a given measure.
Recordare, the creators of MusicXML, provide XSLT stylesheets that can be used to convert MusicXML documents from one of these two formats to the other. Regardless of which of the two formats is chosen, the structure of the music data in the children of parts/measures is the same. This music data is described below.
4.1 Details
This section describes the XML-elements that are used to represent the actual music data. The elements mentioned in the following are all needed for the work done in the project.
4.1.1 Key
The attributes-element sets the attributes of the score. Within this element, the key of the score can be set. This is done in the key-element, which contains two elements: fifths and mode. The value of the fifths-element sets how many sharps/flats there are in the key of the score. The number of sharps is set with a positive integer, while the number of flats is set with a negative integer. For example, to specify four flats in the key, the value of the fifths-element is set to -4. The mode-element is optional and is used to specify whether the key is major or minor by setting the value of the element to either "major" or "minor".
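Interpreting the fifths value amounts to walking the circle of fifths from C. The sketch below is not the project's code and, for brevity, handles only major keys:

```python
# A sketch of interpreting the MusicXML fifths value. Positive values
# count sharps, negative values count flats, moving around the circle
# of fifths from C; only major-key tonics are handled here.
SHARP_KEYS = ['C', 'G', 'D', 'A', 'E', 'B', 'F#', 'C#']    # 0..7 sharps
FLAT_KEYS = ['C', 'F', 'Bb', 'Eb', 'Ab', 'Db', 'Gb', 'Cb']  # 0..7 flats

def major_key_from_fifths(fifths):
    """Return the major-key tonic for a fifths value in -7..7."""
    return SHARP_KEYS[fifths] if fifths >= 0 else FLAT_KEYS[-fifths]
```

With the example above, a fifths value of -4 gives Ab major, the key with four flats.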
4.1.2 Notes
The note-element describes a pitch or a rest. A note-element can contain the following elements (among others):
pitch: If present, the note represents a pitch. The pitch element contains three elements: step, octave and alter (optional). The step-element represents the pitch class (the possible values of this element are the letters A to G) and octave represents the octave in which the pitch is set. alter represents the alteration of the pitch, that is, whether the pitch is a sharp or a flat. Regardless of the key specified, the alteration of a pitch must be specified with this element; the value of the element is 1 for a sharp or -1 for a flat.
duration: This is the duration (or length) of the note. As seen in Section 4.1.3, this duration is set in terms of quarter-note parts.
chord: Only used when the note is a pitch. If present, this element states that the pitch should be played as part of a chord (explained in Section 4.1.4).
rest: If present, the note represents a rest where no pitch sounds.
grace: If present, the note is a grace note.
4.1.3 Note-length
The attributes-element holds a divisions-element which is important, as it sets the shortest possible note-length, s. s is defined as follows:

s = 1/4 · 1/d = 1/(4d)

where d is the value of the divisions-element. The attributes are commonly specified within the first measure-element of the score, but it is also possible to change the attributes, and hereby change s, within the score. Each note holds an element, duration, which sets the length, l, of the note using the following:

l = s · r

where r is the value of the duration-element. An example: if d = 12, the shortest possible note has length s = 1/(4·12) = 1/48. Each note must then specify its length in terms of s. A note of length 1/8 should therefore set r = 6, because 1/48 · 6 = 1/8.
Grace notes do not have a duration element, which means that they have
no relevant length.
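The note-length arithmetic above can be written out directly; this small sketch (using exact fractions to avoid rounding) is illustrative and not the project's own code:

```python
from fractions import Fraction

# With divisions d, the shortest possible note is s = 1/(4d), and a
# note whose duration-element holds r has length l = s * r.
def note_length(d, r):
    s = Fraction(1, 4 * d)  # shortest possible note-length
    return s * r
```

With d = 12, note_length(12, 6) returns Fraction(1, 8), matching the worked example above.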
4.1.4 Simultaneous notes
MusicXML maintains a musical counter which can be moved forwards and backwards to set the order of the notes in the score. When a note is specified, the counter is moved forward by the value of the duration-element. Simultaneous notes can be created by placing a chord-element in the next note. This element specifies that the counter should not move forward, but instead start the next note in the same place as the previous one. Figure 8 contains an example of using the chord-element.
<note>
  <pitch>
    <step>C</step>
    <octave>4</octave>
  </pitch>
  <duration>4</duration>
</note>
<note>
  <chord/>
  <pitch>
    <step>E</step>
    <octave>4</octave>
  </pitch>
  <duration>4</duration>
</note>
<note>
  <chord/>
  <pitch>
    <step>G</step>
    <octave>4</octave>
  </pitch>
  <duration>4</duration>
</note>
Figure 8: An example of displaying a chord in MusicXML using the chord-element.
The chord contains three pitches: C, E and G.
The counter can also be moved using the backup- and forward-elements. These elements are specified on the same level as the note-elements of the score; one could say that they are siblings of note. Like note, backup and forward both contain a duration-element that specifies how much the counter should be moved. Figure 9 shows an example of moving the musical counter backwards and forwards using these elements.
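The counter semantics described above can be simulated in a few lines. The event tuples below are an invented stand-in for note-, backup- and forward-elements, not a MusicXML API:

```python
# Illustrative simulation of the musical counter: a plain note advances
# the counter by its duration, a note carrying <chord/> starts where
# the previous note started without advancing the counter, and backup/
# forward move the counter directly.
def place_events(events):
    """events: (kind, duration, is_chord) tuples; returns note positions."""
    counter = 0
    placed = []  # (start position, duration) for each note
    for kind, duration, is_chord in events:
        if kind == 'note':
            if is_chord and placed:
                placed.append((placed[-1][0], duration))
            else:
                placed.append((counter, duration))
                counter += duration
        elif kind == 'backup':
            counter -= duration
        elif kind == 'forward':
            counter += duration
    return placed

# The sequence from Figure 9: F, A, backup 8, D, forward 4, G.
positions = place_events([('note', 4, False), ('note', 4, False),
                          ('backup', 8, False), ('note', 4, False),
                          ('forward', 4, False), ('note', 4, False)])
```

The third note (the D) ends up at position 0, simultaneous with the F, and the final G starts at position 8, after the A, just as the figure describes.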
4.1.5 Parts of the score
The parts of the score are specified in the part-list element. Within this tag, each part is represented by a score-part-element. A score-part-element contains an attribute, id, which is the identifier of the part, and an element, part-name, holding the name of the part.
<note>
  <pitch>
    <step>F</step>
    <octave>4</octave>
  </pitch>
  <duration>4</duration>
</note>
<note>
  <pitch>
    <step>A</step>
    <octave>4</octave>
  </pitch>
  <duration>4</duration>
</note>
<backup>
  <duration>8</duration>
</backup>
<note>
  <pitch>
    <step>D</step>
    <octave>4</octave>
  </pitch>
  <duration>4</duration>
</note>
<forward>
  <duration>4</duration>
</forward>
<note>
  <pitch>
    <step>G</step>
    <octave>4</octave>
  </pitch>
  <duration>4</duration>
</note>
Figure 9: An example of moving the musical counter using backup and forward in MusicXML to construct a chord. In this case, an F is followed by an A. The counter is then moved back, and a D is specified simultaneously with the F. Finally, the counter is moved forward to specify a G after the A.
5 System design
The design of the system and its execution flow are illustrated in Figure 10. The design is based on that of the COCHONUT algorithm. Each element of the system is described below.

MusicXML → Parser → Intervals → Partitioner → Segments → ChordIdentifier → Candidate chords → ContextAnalyzer → Labeled score

Figure 10: The design and flow of the system, implementing the COCHONUT algorithm. Data is represented by ellipses and the elements of the system are represented by squares.
Parser: The Parser parses a MusicXML-document and returns the musical data from this document, split up into so-called intervals, each of which has the length of the shortest possible note in the score.
Partitioner: The Partitioner is given the intervals as input and produces a list of segments as output. A segment in the system corresponds to a segment as defined in [15]. The work done by the Parser and the Partitioner corresponds to the first step of the COCHONUT algorithm.
ChordIdentifier: The ChordIdentifier identifies the chord-candidates of each segment and calculates a score for each candidate. The higher the score, the more likely it is that the notes of the segment represent that chord. This is the second step of the COCHONUT algorithm.
ContextAnalyzer: The ContextAnalyzer examines the chord-candidates and selects the best chord for each segment. In the last step of the COCHONUT algorithm, the contextual analysis is done using chord-sequence patterns, but as I have not been able to obtain these patterns, the ContextAnalyzer in my system has been designed differently: [10] specifies a comprehensive set of chord-transition rules. These rules specify legal transitions from and to different chord-types, such as tonic, subdominant and dominant, and they are used for the contextual analysis in my system.
5.1 Data format
The system design is based on the idea that we need a data format that can hold the music data in a way that makes it easy to analyze. The grammar below describes this data format.
X → T
T → S
T → TS
S → CN
N → M
N → NM
M → LJ
J → I
J → JI
I → AR
R → P
R → RP
P → DO
X: a score; T: a list of segments; S: a segment; C: a (recognized) chord; N: a list of mini-segments; M: a mini-segment; L: the length of the mini-segment in terms of number of intervals; J: a list of intervals; I: an interval; A: the number of note-attacks at the start of the interval; R: a list of pitches sounding in an interval; P: a pitch; D: pitch-class; O: octave.
The format is created from the idea that we want to end up with a list of segments, each containing a chord. Basically, a score is divided in segments, which in turn are divided in mini-segments, which in turn are divided in intervals. The difference between a segment and a mini-segment is that a mini-segment should be created every time the notes change (that is, upon every note-attack), while a segment should be created every time there are three or more note-attacks. The idea of a mini-segment was found in [9].
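One possible Python rendering of the grammar above is a small class hierarchy; the class and attribute names here are hypothetical, chosen only to mirror the grammar symbols:

```python
# score -> segments -> mini-segments -> intervals, as in the grammar.
class Interval:
    def __init__(self, note_attacks, pitches):
        self.note_attacks = note_attacks  # A: attacks at interval start
        self.pitches = pitches            # R: (pitch-class, octave) pairs

class MiniSegment:
    def __init__(self, intervals):
        self.intervals = intervals        # J: list of intervals
        self.length = len(intervals)      # L: length in intervals

class Segment:
    def __init__(self, mini_segments, chord=None):
        self.mini_segments = mini_segments  # N: list of mini-segments
        self.chord = chord                  # C: filled in after recognition

# A minimal score fragment: one segment, one mini-segment, one interval.
seg = Segment([MiniSegment([Interval(1, [(0, 4)])])])
```

The chord slot starts out empty and is only filled in once the ChordIdentifier and ContextAnalyzer have run, which matches the flow in Figure 10.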
6 Implementation
The source code that implements the system described in the previous section is found at the Google Code project at the following URL: http://code.google.com/p/cochonut/source/browse/#svn/trunk/src. The system is implemented using Python 2.6, and Table 1 displays the files that hold the implementation of the elements of the system: Parser, Partitioner, ChordIdentifier and ContextAnalyzer. The remainder of this section contains a description of the implementation of these elements. The file cochonut.py, which is also available at the Google Code project, is used to run the program by calling the relevant functions from all elements, but the implementation of this file will not be described.
Element            File
Parser             parser.py
Partitioner        partitioner.py
ChordIdentifier    chord_identifier.py
ContextAnalyzer    contextanalyzer.py

Table 1: Implementation files.
6.1 Parser
The Parser depends on the MusicXML in the input following the specification given at [12]. Furthermore, it assumes that the XML holds information about the key of the score, provided in a key-element. This information is returned by the Parser and used to find the tonic of the score, as we will see in Section 6.4.
As explained earlier, MusicXML can be used to describe music either part-wise or time-wise. I have made the assumption that a musical score always has one chord at a time. Because of this, it is convenient to have the music data structured in a sequential way, and this is exactly what the time-wise structure in MusicXML provides. Therefore the Parser works on time-wise MusicXML. For the system to support both part-wise and time-wise MusicXML documents, the Parser converts part-wise scores into time-wise scores using the lxml library [2]. The XSL-stylesheet used for the conversion from part-wise to time-wise is provided at http://www.recordare.com/dtds/parttime.xsl. I could have opted to work with a part-wise score in the system; in this case, the data of all measures for each part would have to be read before merging the information measure by measure. I have, however, found it more obvious and easier to work with a time-wise score.
The reading of the XML-elements in MusicXML is done by reading the data
into a tree-like structure. This is done using the lxml module as well: the
module is able to parse an XML-document from a text file and return a
tree-structured object representing the XML. This object can be queried for
specific elements, such as all measure-elements or all child-elements of a
given element. This makes it very easy to retrieve all the elements needed
from the XML.
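The system itself uses lxml; the standard library's ElementTree offers a similar parse-then-query interface and is used here to keep the sketch self-contained. The fragment below is hand-written for illustration:

```python
import xml.etree.ElementTree as ET

# Parse a small measure fragment into a tree and query it for elements,
# in the same style the Parser uses with lxml.
fragment = """
<measure number="1">
  <note>
    <pitch><step>C</step><octave>4</octave></pitch>
    <duration>4</duration>
  </note>
  <note>
    <pitch><step>E</step><octave>4</octave></pitch>
    <duration>4</duration>
  </note>
</measure>
"""
root = ET.fromstring(fragment)
# Query the tree for all pitch-elements and read their step-children.
steps = [pitch.findtext('step') for pitch in root.iter('pitch')]
```

Queries like `root.iter('pitch')` walk the whole subtree, which is what makes retrieving all needed elements from a large MusicXML document straightforward.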
From the Parser, the music of the score is output as a list of intervals. Each interval has the length of the shortest possible note in the score. An interval holds a count of how many pitches of a given pitch class are sounding in that interval, plus a count of how many note-attacks are made at the start of the interval. The length of a note (pitch or rest), that is, how many intervals the note lasts, is found by dividing the length of the note by the length of an interval. If, for example, the largest divisions-value in the score is 8, the shortest possible note is 1/(4·8) = 1/32; this is also the length of an interval. The length of a 1/2-note in terms of number of intervals will then be (1/2)/(1/32) = 16. Table 2 shows an example of how the notes from Figure 11 are mapped to the list of intervals.
Figure 11: A set of notes that can be mapped to intervals.
The Parser maintains a divisor-variable to hold the value of the divisions-element (described in Section 4) for each part in the music. This is needed because the divisions-value can be different for each part in the score. In addition, the Parser maintains a variable that keeps track of the next interval (in which we will place the next note) in each part, as we jump from part to part when using a time-wise structure. This variable corresponds to the musical counter. It is incremented/decremented when note-, backup- and forward-elements are encountered.
A pitch is mapped to a pitch-class number and an octave number (although only the pitch-class is used throughout the system). As grace notes are explicitly defined in MusicXML using the grace-element, it is not hard for the system to cope with these "decoration"-notes: they are simply skipped, as they are not part of the current chord's notes. Grace notes that are given as regular notes are stored in the intervals, however. Whether or not these notes are part of the current chord is up to the ChordIdentifier to decide.
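The mapping from a MusicXML pitch to a pitch-class number can be sketched as a small lookup; this is an illustrative rendering, not the Parser's actual code:

```python
# Map a MusicXML pitch (step plus optional alter) to a pitch-class
# number 0-11, with C = 0 as in the weight-vector layout used by the
# ChordIdentifier.
STEP_TO_PITCH_CLASS = {'C': 0, 'D': 2, 'E': 4, 'F': 5, 'G': 7, 'A': 9, 'B': 11}

def pitch_class(step, alter=0):
    """alter is 1 for a sharp, -1 for a flat, 0 if absent."""
    return (STEP_TO_PITCH_CLASS[step] + alter) % 12
```

The modulo wraps the edge cases: Cb comes out as pitch-class 11 and B# as 0, so enharmonic spellings collapse onto the same pitch-class, which is all the system needs.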
Interval   Note-attacks   Pitches
1          1              [C]
2          0              [C]
3          2              [D, F]
4          1              [D, F]
5          3              [C, E, G]
6          0              []
7          0              []
8          2              [A, A]

Table 2: How the notes from Figure 11 are mapped to a list of eight intervals, each interval having the length of a 1/4-note. A row in the table corresponds to an interval. At the start of the first interval, a C-note is attacked; this note lasts two intervals, as it is a half-note. Therefore one note-attack is registered in the first interval, but not in the second. The pitch C is registered in both intervals, though, as the pitch sounds in both of them. Notice how the last interval holds two A's, as two pitches of pitch-class A sound in this interval.
6.2 Partitioner
The Partitioner's work is done in two steps: first it creates mini-segments from the intervals, and then it creates segments from the mini-segments. As mentioned, a mini-segment should be created every time the sounding pitches change. The Partitioner maintains a list of mini-segments, which is initially empty. It then iterates through the intervals, and every time an interval is encountered that has a different set of pitches than the previous one, a new mini-segment is created and appended to the list of mini-segments. Every mini-segment holds a variable with the count of how many note-attacks were made at the start of the mini-segment.
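The first pass can be sketched as follows. This is a simplified illustration (not the Partitioner's code) that breaks only on pitch-content changes, with intervals given as (note_attacks, pitches) pairs as in Table 2:

```python
# Walk the intervals and open a new mini-segment whenever the set of
# sounding pitches differs from the previous interval's.
def make_mini_segments(intervals):
    mini_segments = []
    prev_pitches = None
    for note_attacks, pitches in intervals:
        if pitches != prev_pitches:
            # New pitch content: open a new mini-segment.
            mini_segments.append({'note_attacks': note_attacks,
                                  'pitches': pitches, 'length': 1})
        else:
            mini_segments[-1]['length'] += 1
        prev_pitches = pitches
    return mini_segments

# The eight intervals from Table 2 collapse into five mini-segments.
table2 = [(1, ['C']), (0, ['C']), (2, ['D', 'F']), (1, ['D', 'F']),
          (3, ['C', 'E', 'G']), (0, []), (0, []), (2, ['A', 'A'])]
minis = make_mini_segments(table2)
```

Note that under this simplified rule a re-attack of the same pitches (interval 4 in Table 2) does not open a new mini-segment; only a change in the sounding pitch set does.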
A segment should be created every time there is a possible chord-change. How many note-attacks are needed for a possible chord-change is specified in the required_attacks parameter (as in the COCHONUT algorithm, this parameter defaults to 3), which is set when calling the Partitioner. The required number of note-attacks for a new segment to begin must occur within a given time-frame, t. The Partitioner is given the length of this time-frame as a parameter which specifies the time-frame in terms of number of intervals. In [15] the time-frame is based on the beat of the music, but the article does not explain how the time-frame is calculated. Therefore the length of the time-frame is set as a parameter, T, in the program; the value of this parameter should be set to 1/8 or 1/16. Recall from Section 4.1.3 that the length, s, of the shortest possible note is

s = 1/(4d)

where d represents the value of the divisions-element. t is calculated (before it is passed to the Partitioner) as

t = T/s = T · 4d
When creating segments, the Partitioner follows the pseudo-code in Algorithm 1. It basically loops through the mini-segments and creates a new segment every time required_attacks or more note-attacks occur. In the pseudo-code, the intervals()-function returns the length of a mini-segment in terms of how many intervals it spans.
Algorithm 1 Pseudo-code for creating segments (comments are in brackets)
i = 0
while i < length(mini_segments) do
  j = i + 1
  m = mini_segments[i] {m is the current mini-segment}
  n = mini_segments[j] {n is the next mini-segment}
  s = list(m) {Create a list, s, with m as the single, initial member}
  total_attacks = m.note_attacks
  total_length = intervals(m) {total_length is the length of s in terms of intervals}
  while total_length + intervals(n) ≤ time_frame do
    total_length += intervals(n)
    total_attacks += n.note_attacks
    s.append(n)
    j += 1
    n = mini_segments[j]
  end while
  last = length(segments) - 1
  if total_attacks ≥ required_attacks then
    segments.append(s) {Mini-segments in s represent a new segment}
  else
    segments[last].add(s) {Mini-segments in s are part of an existing segment}
  end if
  i = j
end while
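A runnable Python rendering of Algorithm 1 might look as follows. The MiniSegment class is an assumption for illustration, and two guards are my additions: the inner loop checks that j stays within the list (the pseudo-code indexes mini_segments[j] past the end on the last mini-segment), and a group that fails the attack-threshold before any segment exists still starts a segment.

```python
from dataclasses import dataclass

@dataclass
class MiniSegment:
    note_attacks: int   # note-attacks at the start of the mini-segment
    n_intervals: int    # length in intervals (the intervals() function)

def make_segments(mini_segments, required_attacks=3, time_frame=4):
    """Group mini-segments into segments, following Algorithm 1.

    A new segment starts when at least required_attacks note-attacks occur
    within time_frame intervals; otherwise the group of mini-segments is
    merged into the previous segment.
    """
    segments = []
    i = 0
    while i < len(mini_segments):
        m = mini_segments[i]
        group = [m]                       # s in the pseudo-code
        total_attacks = m.note_attacks
        total_length = m.n_intervals
        j = i + 1
        while (j < len(mini_segments)
               and total_length + mini_segments[j].n_intervals <= time_frame):
            n = mini_segments[j]
            total_length += n.n_intervals
            total_attacks += n.note_attacks
            group.append(n)
            j += 1
        if total_attacks >= required_attacks or not segments:
            segments.append(group)        # group starts a new segment
        else:
            segments[-1].extend(group)    # group joins the previous segment
        i = j
    return segments
```

With required_attacks = 3 and time_frame = 4, three mini-segments with attack counts (3, 1, 3) and lengths (2, 2, 1) yield two segments: the first two mini-segments fit the time-frame together, the third starts a new segment.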
6.3 ChordIdentifier
The ChordIdentifier uses the following procedure to identify the chord-candidates of each segment:
1. A weight-vector is calculated for each segment. This vector contains
twelve elements, one for each pitch-class. Each element in the vector
is a count of how many times a pitch occurs in the mini-segments
within the segment. Element 0 represents the count of C's, element 1
represents the count of C#/Db's, element 2 the count of D's and so
on. If, for example, a G-pitch is present in three mini-segments within
the segment, the value of element 7 will be 3.
2. The weight-vector of each segment is compared to all chord-templates
using a combination of all possible pitch-classes as roots in each chord.
For each combination, a score is calculated using a scoring-function.
3. The root/template-combinations whose score is at least a fraction min
of the highest score are chosen as chord-candidates.
In the second step, the chord-candidates are scored using the scoring-function
found in [9], which measures the distance from the weight-vector of a segment
to each template in a set of chord templates. A chord template is
represented by an array holding the pitch-class indexes of the pitches
that are in the chord. For example, the array [0, 4, 7] represents a major
triad, meaning that in a major triad, the root plus the pitch-classes four and
seven semitones above it (seen from the root) are sounding. The scoring-function
is implemented as a Python-function, which means that it can easily be replaced
by another function (as discussed in Section 8.1). Furthermore,
the list of templates is given as a parameter to the ChordIdentifier, which
makes it possible for the user to provide her own chord-templates. In the
current function, the score, S for a given segment with a given weight vector,
is calculated in the following way:
S = P − (N + M )
P is the positive factor which is the sum of pitch-counts for the pitch-classes
in the template, N is the negative factor which is the sum of pitch-counts for
the pitch-classes not present in the template and M represents the misses
which is the count of pitches from the template not being played.
As an example of scoring a chord, we will consider the following weight-vector, w:

w = [1, 0, 0, 0, 3, 0, 1, 0, 0, 3, 0, 0]
w represents a segment where one C, three E's, one F#/Gb and three A's
are sounding. As mentioned, the template of a major triad chord is given by
the array [0, 4, 7]. This template is adjusted to a given root when used for
comparison. Adjusting the major triad template so it represents an A-major
triad (A having pitch-class 9) gives ([0 + 9, 4 + 9, 7 + 9]) mod 12 = [9, 1, 4],
which corresponds to the pitch-classes A, C# and E. Using the weight-vector
w above and the template t for an A-major triad, the scoring-function gives
the following calculations:
P = Σ_{i∈t} w(i) = 6, as w(9) = 3, w(1) = 0 and w(4) = 3

N = Σ_{i∉t} w(i) = 2, as w(0) = 1 and w(6) = 1

M = 1, as w(1) = 0

S = 6 − (2 + 1) = 3
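The scoring-function can be sketched in Python as follows. This is a reconstruction from the description above, not the system's actual code:

```python
def score(weights, template, root):
    """Score a segment against a chord template: S = P - (N + M).

    weights  -- 12-element weight-vector (pitch-class counts)
    template -- chord template relative to the root, e.g. [0, 4, 7]
    root     -- pitch-class of the root (0 = C, ..., 9 = A, ...)
    """
    pcs = {(i + root) % 12 for i in template}  # template adjusted to the root
    P = sum(w for pc, w in enumerate(weights) if pc in pcs)      # positive factor
    N = sum(w for pc, w in enumerate(weights) if pc not in pcs)  # negative factor
    M = sum(1 for pc in pcs if weights[pc] == 0)                 # misses
    return P - (N + M)

# The worked example: scoring the A-major triad against w gives S = 3.
w = [1, 0, 0, 0, 3, 0, 1, 0, 0, 3, 0, 0]
print(score(w, [0, 4, 7], 9))  # -> 3
```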
In [15], a directed graph is built from the chord-candidates where all the
chord-candidates of each segment point to the candidates of the next. The
ChordIdentifier in this system does not, however, need to build such a graph:
the segments are already sorted in the order in which they occur in the music,
so the candidates of one segment implicitly "point" to the candidates of the next.
6.4 ContextAnalyzer
As mentioned in Section 5, the ContextAnalyzer uses a set of rules from [10]
specifying legal chord-transitions from and to different chord-types. Not all
of these rules have been implemented, as this paper does not define the
chord-types directly. Instead, I have implemented rules for all chord-types found
in [3]. The system is able to recognize a reasonable number of chord-types
(14 in total). These are given in Appendix A.
The function that determines the chord-types is hard-coded in the ContextAnalyzer,
so the user of the system cannot simply give new chord-types as parameters - at
least not without changing the function. To provide an easier way for the user to
use other chord-types than the ones already defined, a grammar could have been
created to represent the function. The user would then be able to provide a small
function (which obeys the rules of the grammar) when providing a new chord-type.
Before the ContextAnalyzer is able to find legal transitions, the tonic of
the score must be determined, because all chord-types depend on the tonic-chord.
For this purpose, the ContextAnalyzer is given the key of the score.
Using this key, it finds the tonic of the score using a method from [3]1. The
tonic could also have been determined by looking at the last pitch of the
score. This approach has not been taken, though, since several pitches may
sound at the end of the score and we do not know which one of them is the root.
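A circle-of-fifths lookup of this kind can be sketched as below, under the assumption that the key is given as the MusicXML fifths value (number of sharps, negative for flats); the function name is mine:

```python
def tonic_from_fifths(fifths):
    """Derive the tonic pitch-class of a major key from the MusicXML
    <fifths> value by stepping around the circle of fifths from C.
    A fifth is 7 semitones, so each sharp moves the tonic up a fifth.
    """
    return (fifths * 7) % 12

print(tonic_from_fifths(0))   # -> 0 (C major)
print(tonic_from_fifths(2))   # -> 2 (D major, two sharps)
print(tonic_from_fifths(-3))  # -> 3 (Eb major, three flats)
```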
When doing the actual analysis, the ContextAnalyzer iterates through the
segments and selects the most appropriate chords from the list of candidate-chords.
Algorithm 2 shows the pseudo-code used to do this. In this code,
find_legal_transitions(p,c) finds the chords from c that are legal chord-transitions
from the previous chord, p, while find_best_score(l) finds the chord with the
highest score from a list, l, of chords, simply by comparing the scores of all
the chords and selecting the one with the highest score. If two or more
candidates share the highest score, we simply pick the first one. This approach
may seem naive, and it is only taken because the scoring-function returns
imprecise scores. In my opinion the scoring-function should be optimized to
return more precise scores - this is discussed in Section 8.
Algorithm 2 Context-analyzing the segments (comments are in brackets)
p = null {p is the previous chord, which is nothing at first}
for all s in segments do
  c = s.chord_candidates {c is now the list of chord-candidates}
  l = find_legal_transitions(p,c) {l holds the chords from c that are legal transitions from p}
  if length(l) > 0 then
    s.chord = find_best_score(l) {legal transition(s) found: set the chord of s to the chord from l with best score}
    p = s.chord
  else
    if length(c) > 0 then
      s.chord = find_best_score(c) {no legal transition(s) found: set the chord of s to the chord from c with best score}
      p = s.chord
    end if
  end if
end for
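Algorithm 2 can be rendered in Python roughly as follows. The representation of candidates as (chord, score) pairs and the legal_transitions callback are assumptions for illustration:

```python
def analyze(segments, legal_transitions):
    """Pick the best chord per segment, preferring legal transitions.

    segments          -- list of lists of (chord, score) candidate pairs
    legal_transitions -- function (prev_chord, candidates) -> legal candidates
    Returns the chosen chord per segment (None if a segment has none).
    """
    chosen = []
    prev = None
    for candidates in segments:
        legal = legal_transitions(prev, candidates)
        pool = legal if legal else candidates  # fall back to all candidates
        if pool:
            # max() keeps the first candidate on ties, matching the
            # "simply pick the first one" tie-breaking described above
            best = max(pool, key=lambda c: c[1])
            chosen.append(best[0])
            prev = best[0]
        else:
            chosen.append(None)
    return chosen
```

Note that a legal transition with a lower score beats an illegal one with a higher score, exactly as in the pseudo-code: the candidate pool is narrowed to legal transitions before the scores are compared.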
1 This method is quite trivial. It finds the root of the key, and thereby the
tonic, by locating the key in the circle of fifths.
7 Tests
Two types of tests have been made: first, a functional test to check that the
system works as expected on a set of small test data; second, a score-test to
test the system on a set of musical scores. In this section, the two tests are
described and their results are evaluated. All MusicXML-files used for the tests
are found at the Google Code project, at the following URL:
http://code.google.com/p/cochonut/source/browse/#svn/trunk/test.
All tests were made with a set of 48 chord-templates collected from [3]
and [8]. In [15], only chord-patterns appropriate to jazz harmony are used.
I have, however, chosen to go with a single comprehensive set of chord-patterns.
This should make the chord identification more precise, as there
are more templates to choose from. In the end, this should provide better
results. During the tests, chord-candidates are discarded if their score is
less than 85% of the highest score. This is the same approach as the one
taken in [15]. In addition, the time-frame in which a required number of
note-attacks should occur is set to the length of a 1/16-note. The number of
required note-attacks is set to 3.
Ideally, further tests should have been carried out with different parameter-settings,
to see if better results could be obtained; but because of the small
scope of this project, I have opted to use just the settings described above.
7.1
Functional tests
When testing if the system works, the following factors need to be tested:
A. When there are no chord changes, no chord change should be registered.
B. The system should label the music with maximum one chord at any
given point in music.
C. The system should be able to recognize a chord when all its notes are
played simultaneously.
D. The system should be able to recognize a chord even though not all
its notes are played simultaneously.
E. A chord should be recognized even though it is accompanied by notes
that are not regularly part of the chord (grace notes).
F. The system should be able to recognize the most simple chords such
as triads.
G. The system should be able to recognize the more complex chords with
five, six or seven notes.
F. The system should be able to recognize the most simple chords such
as triads.
G. The system should be able to recognize the more complex chords with
five, six or seven notes.
H. Chord sequences from the possible chord-transitions should be recognized.
I. When no regular chord sequences from the possible chord-transitions
are recognized, the system should give a good guess on which chords
are played.
J. If the system encounters a segment of notes that does not represent
a chord, it should still be able to give a guess on what chord is being
played.
Furthermore, the chord recognition should not be influenced by the length
of the notes, except that a note which lasts longer may have a higher weight
in the weight-vector, whereby chord templates that use this note will have
a higher score.
To test the factors above, I have created seven tests, numbered 1-7. Table 3
illustrates the factors they test. Their notes (along with the expected results
and the actual results) are found in Table 4. For each test, the corresponding
XML-file at the URL above is named testX.xml, where X is the test number.
[Table 3 did not survive the extraction; it marks, for each of the seven
functional tests (rows 1-7), which of the factors A-J the test exercises.]

Table 3: The factors tested by the seven functional tests. An 'x' means that the
test tests the given factor. For example: Test 2 tests factors B, C, F and H.
Test 1 contains no chords. Notice that the notes in this test may sound
like two D-major triads, but the system will not recognize these chords,
as it expects three or more simultaneous notes to detect a chord change.
Test 2 contains some simple triad chords: C-major, F-major and G-major,
where all pitches of each chord are played simultaneously. The order of the
chords follows the pattern: tonic → subdominant → dominant → tonic.
Test 3 contains two triad chords, D-major and G-major where not all the
notes of the chords are played simultaneously. The chord-sequence used is:
tonic → subdominant.
Test 4 contains two triad chords, C-minor and F-minor. The F-minor is
broken up, so not all its notes are played simultaneously. In addition, a
grace note (a 1/8-note: Ges), which does not fit into the template of the
chord, is played during the F-minor.
Test 5 contains a set of complex chords. The sequence used is: tonic →
tonic parallel → subdominant sixth → dominant seventh.
Test 6 contains a transition from a subdominant to the tonic, which is not
a legal transition according to the system.
Test 7 contains two sets of simultaneously played notes, neither of which
forms a specific chord.
7.2 Score tests
The score tests are used to test the system on “real” music. Among the
selected scores should be:
• scores with few (one or two) parts
• scores with multiple (more than two) parts
• scores with many short notes
• scores with many simultaneous notes
[Table 4 did not survive the extraction; it lists, for each of the seven
functional tests, the input notes, the expected output and the actual output.
For Test 7, the expected output is that any two chords are recognized.]

Table 4: Functional tests
Score  Composer         Music                                            Sheets  File
1      Schumann, R.     Im wunderschönen Monat Mai, Dichterliebe         2       dichter.xml
2      Mozart, W.A.     Das Veilchen                                     1       veilchen.xml
3      Beethoven, L.v.  An die ferne Geliebte                            1       geliebte.xml
4      Actor, L.        Prelude to a Tragedy                             4       tragedy.xml
5      Bach, J.S.       Brandenburg Concerto No. 2 in F Major, BWV 1047  5       branden.xml

Table 5: Musical scores used for score tests.
Scores 1-4 were retrieved from [14] while Score 5 was retrieved from [11].
Score 1 was chosen because it contains a lot of short notes and few places
with three or more note-attacks. This will result in segments with more
complex weight-vectors. Score 2 also contains a lot of short notes, but it
also contains a lot of simultaneous note-attacks. The same is the case for
Score 3. Scores 4 and 5 have been chosen because of their multi-part music
(Score 4 has 22 parts, Score 5 has four parts).
I have not been able to retrieve scores that were already labeled with chords,
nor have I had access to an expert who could label the scores for me.
Therefore, I have not been able to calculate precise results saying how many
chords were actually labeled correctly. The evaluation of the results is merely
based on my own basic knowledge about chords.
7.3 Results
As seen from Table 4, all functional tests gave results as expected. The score
tests gave reasonable results: the system was able to guess a chord in
every segment, and often the chord was a reasonable guess given the
pitches being played. Figure 12 shows an example of some chords that were
recognized in Score 3.
During the score tests, it turned out that some chords may be given a high
score even though the root of the chord is not played within the segment.
Take for example the following weight-vector which represents the fifth segment of Score 3:
[1, 0, 1, 8, 0, 0, 0, 4, 0, 0, 1, 0]
[Figure: engraved excerpt for voice and piano with chord labels; not
recoverable from the extraction.]

Figure 12: Chord-labeling of the first four measures of Score 3.
Chord-identification with this vector yields a lot of candidates with scores
ranging from 13 to 15. Among these candidates are Fm13 and F13, although
no F-pitches sound in the segment. This is because the score function does
not take the root of the chord into account. It merely scores the weight-vector
based on the pitch-classes that should be in the chord, not taking
into account that the root is more important than the other classes, because
it is rarely left out of the chord. In Figure 12, the second and third segments
are labeled with chords although the roots of these chords are not sounding
in the segments.
The score tests also revealed that the system has some difficulty in
dealing with multi-part scores. This is mainly because the system requires
a number of note-attacks for a new chord to start. In multi-part music,
this number can be quite hard to determine, as the different parts often use
different note-patterns and because there can be many parts. Score 4, for
example, is written for an orchestra and contains 22 parts, where some of
the parts play a lot of short pitches while other parts play only long pitches.
The number of note-attacks at the points at which the chords change in this
score ranges from 3 to 28, and therefore it is almost impossible to set a
number of required note-attacks. Because of this, the results from Score 4
were not very good. The results of running Score 5 were somewhat better,
because this score has fewer parts and because the note-patterns in the parts
vary a lot.
The time that the system uses to process a MusicXML-file depends, naturally,
on the size of the file. Score 2 contains approximately 4800 lines (2
parts of music) and is processed in 400 ms, while Score 4, which contains 42900
lines (22 parts of music), is processed in 5500 ms.
8 Conclusions and future work
A system has been created to parse music from MusicXML-files, partition
this music into segments where the notes of each segment represent a chord,
and finally identify and analyze chord-candidates of each segment to label
each segment with a chord. The system is able to handle music with grace
notes and/or dissonance, and according to the functional tests, the system
is working. According to the score tests, the system is useful on some kinds
of real classical music.
However, as seen from the test results, there are some issues that one could
deal with in the future to improve the system. Apart from these issues, I
am generally satisfied with the results that I was able to achieve with the
system. It is difficult to compare my results to the results obtained in [15],
as I was not able to test pre-labeled scores, and therefore have no precise
indication of how many of the chords were correctly recognized. In [15],
the results show that the COCHONUT algorithm is able to recognize
65-75% of the chords correctly. My guess is that the system developed in this
project has a lower recognition-rate. The recognition-rate does, however,
depend on the number of parts in the music: more parts make it more
difficult to determine the points at which the chord changes, and this
lowers the recognition-rate.
The tests also revealed that the current scoring-function is not optimal. The
main problem is that the function is not taking the chord-root into account:
a set of pitches can achieve a relatively high score for a given chord-template
although the root of the chord is not among the pitches.
The system has been built to work on MusicXML, because this format has
been developed to represent sheet music in an accurate way. Working with
MusicXML presents some problems, though: the format allows the creator of
the XML to leave out some parts of the music-specification, like for example
the key of the music. This made it more difficult to parse MusicXML-files,
and because of this, it was necessary to put some further restrictions on the XML.
The remainder of this section presents some possible enhancements that
could be made to improve the system. The enhancements are presented in
a prioritized order, meaning that the suggestions given first should enhance
the system more effectively than the ones given last.
8.1 New scoring-function
As mentioned, the current scoring-function is not optimal, and replacing
this function may enhance the performance of the system substantially.
The new scoring-function should score the chord-candidates in a way similar
to the current function, but also take the root of each chord into
account: if the root is present, the score should be higher than if it is not.
The new function could also score more precisely, so we are able to
select the chord with a distinct highest score (with the current function,
different chord-candidates often have the same scores). The information
about which pitches are lowest/highest at a given point in the music, in
terms of octaves, is not currently used. This information could be used
to score the chord-candidates more precisely, or to determine the chord-types
during the contextual analysis (some chord-types are defined by which
pitch is the lowest). The data format used in the system is already able to
handle information about octaves, so the information is easily accessible.
Finally, a new scoring-function could be genre-specific and able to
calculate more precise scores for the genre in question.
Currently, a parameter is set to filter out candidates that have a score less
than a given threshold. If a new scoring-function is developed, this function
should calculate the threshold automatically: if, for example, we have
a lot of high scores, the threshold should be set high, as we would only want
to consider the top-scoring candidates.
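One way such a root-aware variant might look is sketched below: the base score S = P − (N + M) is kept, and a bonus is added when the root pitch-class actually sounds. The root_bonus parameter is hypothetical, not part of the original system:

```python
def root_aware_score(weights, template, root, root_bonus=2):
    """Hypothetical variant of the scoring-function that rewards a
    sounding root. Keeps the base score S = P - (N + M) and adds
    root_bonus (an assumed parameter) when the root pitch-class sounds.
    """
    pcs = {(i + root) % 12 for i in template}
    P = sum(w for pc, w in enumerate(weights) if pc in pcs)
    N = sum(w for pc, w in enumerate(weights) if pc not in pcs)
    M = sum(1 for pc in pcs if weights[pc] == 0)
    base = P - (N + M)
    return base + (root_bonus if weights[root % 12] > 0 else 0)
```

With the fifth-segment vector from Score 3 above, candidates whose root is not among the sounding pitch-classes receive no bonus, so rooted candidates would no longer tie with rootless ones.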
8.2 Analyzing specific part(s) of the music
When analyzing multi-part music, the approach of segmenting the score
whenever a given number of note-attacks occur is not optimal, because it
is not easy to determine how many note-attacks make a chord change
in multi-part music. Instead of analyzing all parts of the score, another
approach could be taken, in which one would analyze a single part of a
score. In this case, the algorithm would have to locate the most "important"
part(s) and analyze this/these. This could be set as a requirement on the
input: when giving the input, the most important part of the music should
be marked, and the analysis should then be carried out only on this part.
One could also try to find the most important part(s) automatically within
the system, for example by removing the parts that contain a bass clef:
these parts almost certainly do not hold the treble, and hence do not
hold the melody of the music. This makes the bass-parts unimportant when
recognizing chords.
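Such an automatic filter could be sketched as below, using the MusicXML clef element (sign "F" for a bass clef). The helper name is mine, and the heuristic is only the one suggested above:

```python
import xml.etree.ElementTree as ET

def non_bass_parts(xml_string):
    """Return the ids of parts whose first clef is not a bass clef
    (sign 'F'), assuming bass-clef parts rarely carry the melody."""
    root = ET.fromstring(xml_string)
    keep = []
    for part in root.iter("part"):
        clef = part.find(".//clef/sign")
        if clef is None or clef.text != "F":
            keep.append(part.get("id"))
    return keep
```

A treble-clef part (sign "G") would be kept while a bass-clef part is dropped; the remaining part ids could then be fed to the Partitioner alone.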
8.3 More precise weight-vectors
The concept of a weight-vector is taken from [9]: a single state of pitches
is maintained for each mini-segment, and the weight-vector of the segment is
then based on the pitches of the mini-segments within the segment. Consider
the example in Figure 13. The weight-vector of the first segment for the notes
in this figure will be [1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1], which means that the short
F-pitch has the same importance as the longer G, B and D pitches (played
simultaneously), even though they last four times longer. This may result in
a different chord than the one intended by the composer.
Figure 13: Illustration of how a weight-vector is constructed. The first segment
in these notes would have the weight-vector [1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1], which does
not weigh the tones of the long G, B and D-pitches higher than the F-pitch.
A possible solution to this issue could be to maintain interval-weight-vectors:
in the example, we would get the weight-vector [0, 0, 8, 0, 0, 1, 0, 8, 0, 0, 0, 8]
for the first segment, if each interval is a 1/8-note long. Having such
vectors would mean that a new scoring-function should be created, or that
the score function should be applied to the weight-vector of every interval
in the segment. If not, the misses factor would be too unimportant.
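The proposed interval-weight-vector could be computed as in this sketch, where each interval is represented by the set of sounding pitch-classes:

```python
def interval_weight_vector(intervals):
    """Each sounding pitch-class contributes one count per interval in
    which it sounds, so longer notes weigh more.

    intervals -- list of sets of sounding pitch-classes, one per interval
    """
    weights = [0] * 12
    for sounding in intervals:
        for pc in sounding:
            weights[pc] += 1
    return weights

# Figure 13's first segment: G (7), B (11) and D (2) sound for eight
# 1/8-note intervals, the short F (5) for only the first one.
intervals = [{7, 11, 2, 5}] + [{7, 11, 2}] * 7
print(interval_weight_vector(intervals))
# -> [0, 0, 8, 0, 0, 1, 0, 8, 0, 0, 0, 8]
```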
8.4 Use of chord-sequence rules or additional transitions
As I was not able to get hold of the chord-sequence rules that are used
in [15], implementing these rules in the current system should of course
be tried in the future. If this enhancement is not tried, and the user of the
system chooses to stick to the model with chord-transitions that I have
implemented, the list of possible transitions should be extended. As described
in Section 6.4, the system could be extended so the user would be able to
pass other chord-types and -transitions to the system as parameters.
8.5 Thorough testing
As mentioned during the tests, the system has not been tested with parameters
other than the ones specified in [15]. A more thorough test of the
system may give different results and provide an indication of where the
system should be improved.
8.6 Use of an object-oriented model
Although it may not enhance the performance of the system in terms of
chord-recognition, it may be easier to develop the system further, if it were
built on an object-oriented model.
References
[1] Anglade, A., Ramirez, R. and Dixon, S.: Genre Classification Using
Harmony Rules Induced from Automatic Chord Transcriptions, Proceedings of the 10th International Conference on Music Information
Retrieval (ISMIR 2009), Kobe, Japan, October, 2009.
[2] Behnel, S. et al.: lxml. http://codespeak.net/lxml/. Retrieved December 23rd 2009, 12:25 GMT.
[3] Grønager, J.: Nøgle til musikken - grundlæggende musikteori (eng.:
Key to the music - basic music theory). Systime, 2004.
[4] Khadekevich, M. and Omologo, M.: Use of hidden Markov Models and
Factored Language Models for Automatic Chord Recognition, Proceedings of the 10th International Conference on Music Information Retrieval (ISMIR 2009), Kobe, Japan, October, 2009.
[5] Lee, K.: Automatic Chord Recognition from Audio Using Enhanced
Pitch Class Profile. Proceedings of International Computer Music Conference, 2006.
[6] Lee, K. and Slaney, M.: Automatic Chord Recognition from Audio Using
an HMM with Supervised Learning, Proceedings of the 7th International
Conference on Music Information Retrieval, Victoria, Canada, 2006.
[7] MIDI Manufacturers Association Incorporated: Tutorial: History of
MIDI, 1995-2009. http://www.midi.org/aboutmidi/tut_history.
php. Retrieved November 18th 2009, 10:50 GMT.
[8] Neut, E.v.d.: Chord House, Piano Room. http://www.looknohands.
com/chordhouse/piano/. Retrieved January 8th 2010, 10:00 GMT.
[9] Pardo, B. and Birmingham, W.: The Chordal Analysis of Tonal Music,
Technical Report CSE-TR-439-01. The University of Michigan, Electrical Engineering and Computer Science Department, 2001.
[10] Pedersen, E.: Verificering og generering af koralharmoniseringer (eng.:
Verification and generation of choral harmony). 2006.
[11] Project Gutenberg: Online Book Catalog. http://www.gutenberg.
org/catalog/. Retrieved January 18th, 12:00 GMT.
[12] Recordare: MusicXML 2.0 DTD Index. http://www.recordare.com/
dtds/index.html. Retrieved January 29th 2010, 13:45 GMT.
[13] Recordare: MusicXML FAQ. http://www.recordare.com/xml/faq.
html. Retrieved November 18th 2009, 12:00 GMT.
[14] Recordare: MusicXML Samples. http://www.recordare.com/xml/
samples.html. Retrieved January 18th, 12:00 GMT.
[15] Scholz, R. and Ramalho, G.: COCHONUT: Recognizing Complex
Chords From MIDI Guitar Sequences, Proceedings of the 9th International Conference on Music Information Retrieval, 2008.
[16] Sleator, D. and Temperley, D.: The Melisma Music Analyzer. http://
www.link.cs.cmu.edu/music-analysis/. Retrieved December 26th
2009, 10:40 GMT.
[17] Yoshioka, T. et al.: Automatic Chord Transcription with Concurrent
Recognition of Chord Symbols and Boundaries. Proceedings of the 5th
International Conference on Music Information Retrieval, 2004.
A Supported chord-types
The ContextAnalyzer of the system is able to recognize the chord-types
given below. When referring to a “scale”, it is the scale used in the score.
Type                          Description
Tonic                         Chord built on the first step of the scale
Dominant                      Built on the fifth step of the scale
Dominant seventh              Dominant with an added minor seventh
Subdominant                   Built on the fourth step of the scale
Tonic parallel                Parallel chord of the tonic
Subdominant parallel          Parallel chord of the subdominant
Dominant parallel             Parallel chord of the dominant
In-complete dominant          Dominant seventh with no pitch at the root
Dominant none                 Dominant with an added minor ninth
Dominant quarter-sixth        Dominant where the third is replaced by a fourth
                              and the fifth is replaced by a sixth
Subdominant sixth             Subdominant with an added minor sixth
In-complete subdominant       Subdominant sixth with no fifth
Minor subdominant             Subdominant with an added major sixth
Subdominant parallel seventh  Subdominant parallel with an added seventh