Cheng (Calvin) Yang's Music IR Research Page
Content-Based Music Retrieval
Developments in internet technology have made a large volume of
multimedia data, in particular music audio data, available to the
general public, and yet there are not many search tools that can help
users search through these data. Most existing search tools rely on
file names or text labels, but they become useless when meaningful
text descriptions are not available. A truly content-based music
retrieval system should have the ability to find similar songs based
on their underlying score or melody, regardless of their text
description. Past research on content-based music retrieval has
primarily focused on score-based data such as MIDI, rather than raw
audio music. However, most music data is found in various raw audio
formats, and there is no known algorithm to convert raw audio music
files into a MIDI-style representation.
My primary research interest is in content-based music retrieval for
raw audio databases; both the underlying database and the user query
are given in raw audio formats such as .wav.
Similarity is based on the intuitive notion of similarity perceived by
humans: two pieces are similar if they are fully or partially based on
the same score, even if they are performed by different people or at
different tempos. More specifically, we identify five different types
of "similar" music pairs, with increasing levels of difficulty:
- Type I: Identical digital copy
- Type II: Same analog source, different digital copies, possibly with noise
  - Example: e043.wav vs. e108.wav (Mendelssohn - Spring Song)
- Type III: Same instrumental performance, different vocal components
  - Example: e116.wav vs. e117.wav (Kirka - Surun Pyyhit Silmistäni)
- Type IV: Same score, different performances (possibly at different tempos)
  - Example: e071.wav vs. e107.wav (Tchaikovsky - Piano Concerto No. 1)
- Type V: Same underlying melody but otherwise different, possibly transposed
  - Example: e106.wav vs. e114.wav (Beethoven - Symphony No. 5)
Our current retrieval system can handle the first four types of "similarity" with reasonable accuracy.
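As a rough illustration of the taxonomy above (not the retrieval system itself), the five types and the example pairs listed on this page can be written down as a small Python structure. The names `SimilarityType`, `EXAMPLE_PAIRS`, `HARDEST_SUPPORTED`, and `is_supported` are hypothetical and chosen only for this sketch; the ordering encodes the increasing difficulty described above.

```python
from enum import IntEnum


class SimilarityType(IntEnum):
    """The five similarity types from this page, ordered by difficulty."""
    IDENTICAL_COPY = 1      # Type I: identical digital copy
    SAME_ANALOG_SOURCE = 2  # Type II: same analog source, different digital copies
    SAME_INSTRUMENTAL = 3   # Type III: same instrumental performance, different vocals
    SAME_SCORE = 4          # Type IV: same score, different performances
    SAME_MELODY = 5         # Type V: same underlying melody, possibly transposed


# Example pairs named on this page (no example is listed for Type I).
EXAMPLE_PAIRS = {
    SimilarityType.SAME_ANALOG_SOURCE: ("e043.wav", "e108.wav"),
    SimilarityType.SAME_INSTRUMENTAL: ("e116.wav", "e117.wav"),
    SimilarityType.SAME_SCORE: ("e071.wav", "e107.wav"),
    SimilarityType.SAME_MELODY: ("e106.wav", "e114.wav"),
}

# Hypothetical constant: the hardest type the page says is currently handled.
HARDEST_SUPPORTED = SimilarityType.SAME_SCORE


def is_supported(kind: SimilarityType) -> bool:
    """Return True if this page reports the type as handled with reasonable accuracy."""
    return kind <= HARDEST_SUPPORTED
```

With these definitions, `is_supported(SimilarityType.SAME_SCORE)` is true while `is_supported(SimilarityType.SAME_MELODY)` is false, matching the statement that only the first four types are currently handled.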
Publications
- Music Database Retrieval Based on Spectral Similarity.
PDF file.
(Stanford University Database Group technical report 2001-14.)
A shorter version of this paper appears in International Symposium on Music Information Retrieval, 2001.
- MACS: Music Audio Characteristic Sequence Indexing for Similarity Retrieval.
PDF file.
In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2001.
- Cheng Yang: The MACSIS Acoustic Indexing Framework for Music Retrieval: An Experimental Study.
In International Conference on Music Information Retrieval, 2002.
- Cheng Yang: Efficient Acoustic Index for Music Retrieval with Various Degrees of Similarity.
In Proc. ACM Multimedia, 2002.
PS file, PDF file.
- Cheng Yang: Peer-to-Peer Architecture for Content-Based Music Retrieval on Acoustic Data.
In International World Wide Web Conference, 2003.
Other Publications
Cheng (Calvin) Yang / yangc@cs.stanford.edu / Stanford University Database Group