Page 48 - Proceedings of the 12th Scientific Conference on Information Technology and Its Applications in Various Fields (CITA 2023)
microcontroller kit and the system can classify the labeled audio with an accuracy of
about 97%. One of the main limitations of this research is the scarcity of data, which
significantly affects the overall performance; ongoing research is being carried out
with a dataset built on the real hardware. Moreover, despite the positive results in
the training process, further work should be carried out to investigate the stability of
the model when running on different hardware devices, other than the digital MEMS
microphone MPD401 used in this work, and in noisy environments (e.g., with different
types of noise from the surroundings).
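
One way to prepare the noisy-environment evaluation mentioned above is to mix noise into clean clips at controlled signal-to-noise ratios (SNR) and then measure accuracy at each level. The sketch below is only an illustration under stated assumptions, not the procedure used in this work: the helper names `mix_at_snr` and `measured_snr_db`, the 16 kHz sample rate, the 440 Hz test tone, the Gaussian noise, and the chosen SNR levels are all hypothetical stand-ins for real recordings.

```python
import math
import random


def mix_at_snr(signal, noise, snr_db):
    """Return signal plus scaled noise so that the mix has the requested SNR (dB)."""
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    # Average power of each sequence (mean of squared samples).
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_signal == 0 or p_noise == 0:
        raise ValueError("signal and noise must both have non-zero power")
    # SNR_dB = 10*log10(p_signal / (gain^2 * p_noise)); solve for gain.
    gain = math.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy sequence relative to its clean version."""
    p_clean = sum(c * c for c in clean) / len(clean)
    p_resid = sum((y - c) ** 2 for c, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(p_clean / p_resid)


# Synthetic stand-ins (assumptions): one second of a 440 Hz tone sampled
# at 16 kHz, and Gaussian noise in place of recorded ambient noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# One noisy copy per SNR level; each copy would then be passed to the classifier.
noisy_by_snr = {snr: mix_at_snr(clean, noise, snr) for snr in (20, 10, 0)}
```

In practice `clean` would be a clip from the evaluation set and `noise` a recording from the target environment, with accuracy reported per SNR level.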
CITA 2023 ISBN: 978-604-80-8083-9