
Title: Robust Realtime Polyphonic Pitch Detection
Author(s): Thomas, John M.
Advisor(s): Nelson, Jill
Keywords: Music Transcription; Pitch Detection; Optimal Accuracy
Issue Date: 12-Jun-2012
Abstract: Pitch detection is a subset of automatic music transcription, which is the application of signal processing algorithms to automatically extract musical information from audio signals. This field, in various forms, has been the subject of much research over the years because of its wide range of applications. Much work has been done on monophonic signals; however, much less has been done to tackle the problem of polyphonic music. One major issue for polyphonic pitch detection systems is efficiency. Most existing algorithms sacrifice efficiency for accuracy and robustness, while others make the opposite tradeoff. The purpose of this paper is to work toward new systems that are both robust and efficient enough to run in realtime. First, the background information necessary to explore this topic is presented. Musical terminology and concepts are explained, and some common analytical tools and algorithms used in existing systems are described. Three relatively efficient reference systems are then presented. The first is a multiple fundamental frequency (F0) estimator based on the Auto-Correlation Function (ACF) that uses a unique enhancement algorithm to identify individual pitch components easily. The second is a multiple F0 estimator based on the Fast Fourier Transform (FFT) that exploits the harmonic nature of musical sounds. The third reference system outputs pitch numbers directly by using a modified form of an unsupervised learning algorithm called Non-Negative Matrix Factorization (NMF). Investigation of the first two reference systems makes clear that the two opposing camps on fundamental frequency estimation (FFT vs. ACF) are actually quite complementary. Therefore, the remainder of the paper explores the inherent high-frequency versus low-frequency accuracy tradeoffs and proposes potential solutions.
A novel analysis tool called the Combined ACF/FFT Representation (CAFR) is developed and three new pitch detection algorithms are devised from it. These algorithms are then evaluated for both robustness and efficiency and compared against results for the three reference systems.
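As a rough illustration of the ACF-versus-FFT tradeoff mentioned in the abstract, the following minimal sketch estimates a single fundamental frequency two ways: from the strongest autocorrelation lag and from the strongest bin of a naive discrete Fourier transform. It is not the thesis's reference systems or the CAFR; the function names (`make_tone`, `acf_pitch`, `dft_pitch`), the test tone, the sample rate, and the search range are all assumptions chosen for illustration.

```python
import math


def make_tone(f0, sample_rate, n, harmonic_amps=(1.0, 0.5, 0.25)):
    """Synthesize n samples of a harmonic tone with fundamental f0 (Hz)."""
    return [
        sum(a * math.sin(2 * math.pi * f0 * (h + 1) * i / sample_rate)
            for h, a in enumerate(harmonic_amps))
        for i in range(n)
    ]


def acf_pitch(signal, sample_rate, f_min=50.0, f_max=1000.0):
    """Estimate one F0 (Hz) from the lag with the largest autocorrelation."""
    n = len(signal)
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), n - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized (biased) sum: longer lags overlap fewer samples,
        # which favors the first period over its integer multiples.
        val = sum(signal[i] * signal[i + lag] for i in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag


def dft_pitch(signal, sample_rate, f_min=50.0, f_max=1000.0):
    """Estimate one frequency (Hz) from the strongest DFT bin in range."""
    n = len(signal)
    k_min = max(1, math.ceil(f_min * n / sample_rate))
    k_max = min(n // 2, int(f_max * n / sample_rate))
    best_k, best_power = k_min, -1.0
    for k in range(k_min, k_max + 1):
        w = 2 * math.pi * k / n
        re = sum(x * math.cos(w * i) for i, x in enumerate(signal))
        im = sum(x * math.sin(w * i) for i, x in enumerate(signal))
        power = re * re + im * im
        if power > best_power:
            best_k, best_power = k, power
    return best_k * sample_rate / n


# Assumed test setup: a 200 Hz harmonic tone, 8 kHz sampling, 1024 samples.
SAMPLE_RATE = 8000
tone = make_tone(200.0, SAMPLE_RATE, 1024)
acf_estimate = acf_pitch(tone, SAMPLE_RATE)
dft_estimate = dft_pitch(tone, SAMPLE_RATE)
print(acf_estimate, dft_estimate)
```

In this sketch the ACF estimate can only take values of the form sample_rate / lag, so its frequency grid is fine at low pitches and coarse at high ones, while the DFT estimate can only take multiples of sample_rate / n, a uniform grid that is coarse relative to low pitches. That difference is one simple way to see why the two approaches can complement each other.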
Degree: Masters in Electrical Engineering
Type: Thesis
Appears in Collections: The Volgenau School of Engineering

Files in This Item:

File: Thomas_John_thesis_2012.pdf
Size: 1.42 MB
Format: Adobe PDF

Items in MARS are protected by copyright, with all rights reserved, unless otherwise indicated.

