TY - GEN
T1 - Mixed type audio classification with support vector machine
AU - Chen, Lei
AU - Gündüz, Sule
AU - Özsu, M. Tamer
PY - 2006
Y1 - 2006
AB - Content-based classification of audio data is an important problem for various applications such as overall analysis of audio-visual streams, boundary detection of video story segments, extraction of speech segments from video, and content-based video retrieval. Though the classification of audio into a single type such as music, speech, environmental sound, or silence is well studied, classification of mixed type audio data, such as clips having speech with music as background, is still considered a difficult problem. In this paper, we present a mixed type audio classification system based on a Support Vector Machine (SVM). In order to capture the characteristics of different types of audio data, besides selecting audio features, we also design four different representation formats for each feature. Our SVM-based audio classifier can classify audio data into five types: music, speech, environmental sound, speech mixed with music, and music mixed with environmental sound. The experimental results show that our system outperforms other classification systems using k-Nearest Neighbor (k-NN), Neural Network (NN), and Naive Bayes (NB).
UR - http://www.scopus.com/inward/record.url?scp=34247632250&partnerID=8YFLogxK
U2 - 10.1109/ICME.2006.262954
DO - 10.1109/ICME.2006.262954
M3 - Conference contribution
AN - SCOPUS:34247632250
SN - 1424403677
SN - 9781424403677
T3 - 2006 IEEE International Conference on Multimedia and Expo, ICME 2006 - Proceedings
SP - 781
EP - 784
BT - 2006 IEEE International Conference on Multimedia and Expo, ICME 2006 - Proceedings
T2 - 2006 IEEE International Conference on Multimedia and Expo, ICME 2006
Y2 - 9 July 2006 through 12 July 2006
ER -