David Birchfield
Neural Network Model for Tonal Harmony Feature Extraction
The model employs a multi-tier architecture of three neural networks. Initially, the lower two tiers are trained by a delta-rule algorithm that incorporates much of the empirical evidence from Carol Krumhansl's chord and key profiles. Given a set of categorized pitch classes as input, the lowest tier produces a ranking for each chord, corresponding to the model's confidence that the input pitch classes form a given major, minor, or diminished triad. This chord output forms part of the input to the second tier. The input to the second-tier network is the sum of the seven most recent chord outputs, each scaled so that chords from earlier time steps have a diminishing influence. This middle tier assigns a ranking to each of the twenty-four major and minor keys, corresponding to a hypothetical listener's perception of a tonal center. Finally, the uppermost tier takes the most recent chord and key outputs from the lower tiers and produces a ranking for each major, minor, and diminished chord, reflecting the model's relative expectation that a given chord will be the next harmonic event. While the training sets for the lower two networks are inherent to the model, the upper tier is modified minutely at each input step, constantly adjusting its weights to align its expectation ratings more accurately with the harmonic progressions to which it is exposed. Chord distance is the inverse of expectation: a high rating that a given chord will occur next corresponds to a small distance.
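The second tier's input, a time-decayed sum of the seven most recent chord outputs, can be illustrated with a minimal sketch. Only the window of seven comes from the description above. The vector length of 36 (twelve roots times three triad qualities), the geometric decay factor, and the names `key_tier_input` and `one_hot` are assumptions made for illustration.

```python
from collections import deque

N_CHORDS = 36  # assumption: 12 roots x (major, minor, diminished)
WINDOW = 7     # the seven most recent chord outputs, per the description
DECAY = 0.5    # hypothetical decay factor; no value is given in the text


def one_hot(index):
    """Hypothetical helper: a chord output that is fully confident in one chord."""
    vector = [0.0] * N_CHORDS
    vector[index] = 1.0
    return vector


def key_tier_input(history):
    """Sum chord outputs (oldest first), weighting older outputs less."""
    total = [0.0] * N_CHORDS
    newest = len(history) - 1
    for i, chord_output in enumerate(history):
        weight = DECAY ** (newest - i)  # the newest output gets weight 1.0
        for j, value in enumerate(chord_output):
            total[j] += weight * value
    return total


# A deque with maxlen discards the oldest output once seven are stored.
history = deque(maxlen=WINDOW)
for chord_index in [0, 5, 7, 0]:
    history.append(one_hot(chord_index))

summary = key_tier_input(history)
```

In this toy run the chord at index 0 appears both as the newest and as the oldest event, so its entry in `summary` combines a full-weight and a heavily decayed contribution.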
Thus, the model establishes a tonal pitch space: given pitch class inputs, it determines a chord and its key context, and at each time step the upper tier produces a distance rating for every major, minor, and diminished triad. Because the upper tier is initially seeded with random values and is trained gradually and exclusively by the input pitch classes, the model can produce a subtly detailed pitch space that is sensitive to the stylistic idiosyncrasies of the music to which it is exposed.
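The upper tier's behavior, randomly seeded weights that are nudged at every input step, might look like the following sketch. It assumes a single-layer linear network updated with the delta rule, which the description specifies only for the lower two tiers. The learning rate, the weight range, the class name `ExpectationTier`, and the mapping `distance = 1 - expectation` are likewise hypothetical.

```python
import random

N_CHORDS = 36        # assumption: 12 roots x (major, minor, diminished)
N_KEYS = 24          # twenty-four major and minor keys, per the description
LEARNING_RATE = 0.1  # hypothetical; no value is given in the text


class ExpectationTier:
    """Hypothetical single-layer linear network for next-chord expectation."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        n_inputs = N_CHORDS + N_KEYS
        # Weights start as small random values, as the description states.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
                        for _ in range(N_CHORDS)]

    def predict(self, chord_output, key_output):
        """Return one expectation rating per chord."""
        x = list(chord_output) + list(key_output)
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]

    def update(self, chord_output, key_output, next_chord):
        """Delta-rule step toward the chord that actually occurred next."""
        x = list(chord_output) + list(key_output)
        expectation = self.predict(chord_output, key_output)
        for j, row in enumerate(self.weights):
            target = 1.0 if j == next_chord else 0.0
            error = target - expectation[j]
            for i, xi in enumerate(x):
                row[i] += LEARNING_RATE * error * xi


def distances(expectation):
    """Assumed mapping: high expectation corresponds to small distance."""
    return [1.0 - e for e in expectation]


# Toy exposure: chord 0 in key 0 is always followed by chord 7.
chord = [0.0] * N_CHORDS
chord[0] = 1.0
key = [0.0] * N_KEYS
key[0] = 1.0

tier = ExpectationTier(seed=1)
for _ in range(50):
    tier.update(chord, key, next_chord=7)

nearest = distances(tier.predict(chord, key))
```

After repeated exposure to the same progression, the chord that reliably follows ends up with the smallest distance, which is the sense in which the pitch space adapts to the music it hears.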
January 26, 2004