Cheng (Calvin) Yang's Music IR Research Page

Content-Based Music Retrieval

Developments in internet technology have made a large volume of multimedia data, in particular music audio data, available to the general public, yet few search tools help users search through this data. Most existing search tools rely on file names or text labels, and they become useless when meaningful text descriptions are not available. A truly content-based music retrieval system should be able to find similar songs based on their underlying score or melody, regardless of their text description. Past research on content-based music retrieval has primarily focused on score-based data such as MIDI, rather than raw audio music. However, most music data is found in various raw audio formats, and there is no known algorithm to convert raw audio music files into a MIDI-style representation.

My primary research interest is in content-based music retrieval for raw audio databases: both the underlying database and the user query are given in raw audio formats such as .wav.

Similarity is based on the intuitive notion of similarity perceived by humans: two pieces are similar if they are fully or partially based on the same score, even if they are performed by different people or at different tempos. More specifically, we identify five different types of "similar" music pairs, in increasing order of difficulty:

Type I: Identical digital copy
Type II: Same analog source, different digital copies, possibly with noise
- Example: e043.wav vs. e108.wav (Mendelssohn - Spring Song)
Type III: Same instrumental performance, different vocal components
- Example: e116.wav vs. e117.wav (Kirka - Surun Pyyhit Silmistäni)
Type IV: Same score, different performances (possibly at different tempos)
- Example: e071.wav vs. e107.wav (Tchaikovsky - Piano Concerto No. 1)
Type V: Same underlying melody, different otherwise, with possible transposition
- Example: e106.wav vs. e114.wav (Beethoven - Symphony No. 5)

Our current retrieval system can deal with the first four types of "similarity" with reasonable accuracy.
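
As a rough illustration only, the five similarity types and the example pairs listed above can be written down as a small data structure. The names below (SimilarityType, ExamplePair, EXAMPLES, MAX_HANDLED, is_handled) are hypothetical and are not part of the retrieval system; the sketch only encodes the ordering by difficulty and the statement that Types I through IV are handled.

```python
from dataclasses import dataclass
from enum import IntEnum


class SimilarityType(IntEnum):
    """The five similarity types, numbered in increasing order of difficulty."""
    IDENTICAL_COPY = 1        # Type I: identical digital copy
    SAME_ANALOG_SOURCE = 2    # Type II: same analog source, different digital copies
    SAME_INSTRUMENTAL = 3     # Type III: same instrumental performance, different vocals
    SAME_SCORE = 4            # Type IV: same score, different performances
    SAME_MELODY = 5           # Type V: same melody, possibly transposed


@dataclass(frozen=True)
class ExamplePair:
    """One example pair from the list above (hypothetical container)."""
    first: str
    second: str
    piece: str
    kind: SimilarityType


# Example pairs exactly as listed on this page (no example is given for Type I).
EXAMPLES = [
    ExamplePair("e043.wav", "e108.wav", "Mendelssohn - Spring Song",
                SimilarityType.SAME_ANALOG_SOURCE),
    ExamplePair("e116.wav", "e117.wav", "Kirka - Surun Pyyhit Silmistäni",
                SimilarityType.SAME_INSTRUMENTAL),
    ExamplePair("e071.wav", "e107.wav", "Tchaikovsky - Piano Concerto No. 1",
                SimilarityType.SAME_SCORE),
    ExamplePair("e106.wav", "e114.wav", "Beethoven - Symphony No. 5",
                SimilarityType.SAME_MELODY),
]

# Hardest type the page describes as handled with reasonable accuracy.
MAX_HANDLED = SimilarityType.SAME_SCORE


def is_handled(kind: SimilarityType) -> bool:
    """Return True if the given type is among the first four types."""
    return kind <= MAX_HANDLED


if __name__ == "__main__":
    for pair in EXAMPLES:
        status = "handled" if is_handled(pair.kind) else "open problem"
        print(f"Type {int(pair.kind)}: {pair.first} vs. {pair.second} ({status})")
```

Using an ordered enumeration makes "the first four types" a simple comparison, which matches the way the types are ranked by difficulty above.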

Publications

Music Database Retrieval Based on Spectral Similarity. PDF file. (Stanford University Database Group technical report 2001-14.) A shorter version of this paper appears in the International Symposium on Music Information Retrieval, 2001.
MACS: Music Audio Characteristic Sequence Indexing for Similarity Retrieval. PDF file. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2001.
Cheng Yang: The MACSIS Acoustic Indexing Framework for Music Retrieval: An Experimental Study. In International Conference on Music Information Retrieval, 2002.
Cheng Yang: Efficient Acoustic Index for Music Retrieval with Various Degrees of Similarity. In Proc. ACM Multimedia, 2002. PS file, PDF file.
Cheng Yang: Peer-to-Peer Architecture for Content-Based Music Retrieval on Acoustic Data. In International World Wide Web Conference, 2003.

Other Publications

Cheng (Calvin) Yang / yangc@cs.stanford.edu / Stanford University Database Group