Abstract
In this paper, we investigate animal sound classification using deep learning and propose a system based on a convolutional neural network (CNN) architecture. As input to the network, sound files were preprocessed to extract Mel Frequency Cepstral Coefficients (MFCCs) using the LibROSA library. To train and test the system, we collected 875 animal sound samples covering 10 different animal types from an online sound source site. We report classification confusion matrices and the results obtained with different gradient descent optimizers. The best accuracy, 75%, was obtained with Nesterov-accelerated Adaptive Moment Estimation (Nadam).
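To illustrate the evaluation step mentioned in the abstract, the following is a minimal sketch of how a classification confusion matrix and the overall accuracy can be computed from true and predicted labels. It is not the authors' code: the function names `confusion_matrix` and `accuracy`, the three class names, and the toy label lists are all assumptions made for illustration, and the numbers are unrelated to the results reported in the paper.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Return a nested list m where m[i][j] counts samples whose
    true class is classes[i] and predicted class is classes[j]."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical toy labels for illustration only (not data from the paper).
classes = ["cat", "dog", "bird"]
y_true = ["cat", "cat", "dog", "dog", "bird", "bird", "bird", "dog"]
y_pred = ["cat", "dog", "dog", "dog", "bird", "cat", "bird", "cat"]

cm = confusion_matrix(y_true, y_pred, classes)
acc = accuracy(cm)

for name, row in zip(classes, cm):
    print(name, row)
print("accuracy:", acc)
```

Rows correspond to true classes and columns to predicted classes, so off-diagonal entries show which animal types are confused with one another. The same two functions could be applied once per optimizer to compare runs.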
Original language | English |
---|---|
Title of host publication | UBMK 2018 - 3rd International Conference on Computer Science and Engineering |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 625-629 |
Number of pages | 5 |
ISBN (Electronic) | 9781538678930 |
Publication status | Published - 6 Dec 2018 |
Externally published | Yes |
Event | 3rd International Conference on Computer Science and Engineering, UBMK 2018 - Sarajevo, Bosnia and Herzegovina Duration: 20 Sept 2018 → 23 Sept 2018 |
Publication series
Name | UBMK 2018 - 3rd International Conference on Computer Science and Engineering |
---|
Conference
Conference | 3rd International Conference on Computer Science and Engineering, UBMK 2018 |
---|---|
Country/Territory | Bosnia and Herzegovina |
City | Sarajevo |
Period | 20/09/18 → 23/09/18 |
Bibliographical note
Publisher Copyright: © 2018 IEEE.
Keywords
- Animal sound classification
- Confusion Matrix (CF)
- Convolution Neural Network (CNN)
- Mel Frequency Cepstral Coefficient (MFCC)