Master's thesis

A comparison and evaluation of approaches to the automatic formal analysis of musical audio

Full text PDF, BIB
Presented as a poster at the joint AMS/SMT 2010 Annual Meeting. Poster PDF.
GitHub repo with audio, annotation and evaluation data.

I conducted a comparative evaluation of a handful of algorithms that produce formal analyses of music, running them on a diverse set of corpora, including a new corpus of public domain music. In keeping with the open nature of that corpus, all of the evaluation data (including the full output of each algorithm) is freely available too.

Abstract

Analyzing the form or structure of pieces of music is a fundamental task for music theorists. Several algorithms have been developed to automatically produce formal analyses of music. However, comparing these algorithms to one another and judging their relative merits has been very difficult, principally because the algorithms are usually evaluated on separate data sets, consisting of different songs or representing wholly different genres of music, and methods of evaluating the performance of these algorithms have varied significantly. As a result, there has been little benchmarking of performance in this area of research. This work aims to address this by directly comparing several music structure analysis algorithms.

Five structure analysis algorithms representing a variety of approaches have been executed on three corpora of music, one of which was newly assembled from freely distributable music. The performance of each algorithm on each corpus has been measured using each of an extensive list of performance metrics.
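To give a concrete sense of what such a performance metric can look like, here is a minimal sketch of boundary retrieval precision, recall and F-measure with a tolerance window, a measure commonly used for structural segmentation. It is an illustration only, not necessarily how the thesis computes its metrics: the function name `boundary_f_measure`, the 0.5-second default tolerance and the example boundary times are all assumptions made for this sketch.

```python
def boundary_f_measure(reference, estimated, tolerance=0.5):
    """Score estimated section boundaries against reference boundaries.

    All times are in seconds. An estimated boundary counts as a hit if it
    lies within `tolerance` seconds of a reference boundary that has not
    already been matched (one-to-one matching).
    Hypothetical helper for illustration; not taken from the thesis code.
    """
    ref = sorted(reference)
    est = sorted(estimated)

    hits = 0
    i = j = 0
    # Walk both sorted lists in step, pairing each boundary at most once.
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1  # estimated boundary is too early to match anything left
        else:
            i += 1  # reference boundary was missed by the algorithm

    precision = hits / len(est) if est else 0.0
    recall = hits / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Made-up example: four annotated boundaries, three estimated ones.
reference_times = [0.0, 10.0, 20.0, 30.0]
estimated_times = [0.2, 10.4, 25.0]
precision, recall, f_measure = boundary_f_measure(reference_times, estimated_times)
```

In this made-up example two of the three estimated boundaries fall within the tolerance, so precision is 2/3 and recall is 2/4. A wider tolerance (3 seconds is another common choice) rewards coarser agreement between the algorithm and the annotation.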

Audio and annotation data

The evaluation used:

The Beatles dataset, a standard collection of all the band’s album recordings, with structural annotations by TUT, UPF and QMUL, all based on annotations by Alan Pollack;
The Popular Music Database of the RWC collection, with annotations also created by RWC;
Two new sets of jazz and classical music assembled and annotated for this thesis.

This last corpus consists entirely of public domain (and hence freely shareable) pieces downloaded from the Internet Archive. Both the audio and the annotations are available on GitHub.

Evaluation data

The output of the five algorithms on all of the audio files in the above corpora is also available to download at the same GitHub link.