Data Augmentation Methods for Cross-Device Acoustic Scene Classification

Dang, An; Vu, Toan

Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/4287

Full metadata record

DC Field	Value	Language
dc.contributor.author	Dang, An	-
dc.contributor.author	Vu, Toan	-
dc.date.accessioned	2024-12-06T06:59:30Z	-
dc.date.available	2024-12-06T06:59:30Z	-
dc.date.issued	2024-11	-
dc.identifier.isbn	978-3-031-74126-5	-
dc.identifier.uri	https://elib.vku.udn.vn/handle/123456789/4287	-
dc.identifier.uri	https://doi.org/10.1007/978-3-031-74127-2_24	-
dc.description	Lecture Notes in Networks and Systems (LNNS, volume 882); The 13th Conference on Information Technology and Its Applications (CITA 2024); pp. 283-294.	vi_VN
dc.description.abstract	Recent advances in deep neural network (DNN) methods have improved the accuracy of acoustic scene classification (ASC). However, these DNN systems still struggle to classify audio scenes across domains and to cope with domain imbalance in ASC datasets. In this study, we propose an ASC system that addresses both issues using two data augmentation methods. The first method, MixStyleFreq, reduces device mismatch by combining the frequency-wise means and standard deviations of convolutional feature maps from different audio scenes. The second method, Spectrum Normalization Augmentation (SpecNormAug), generates additional data for minority devices from majority-device data, improving the representation of minority devices and reducing the bias of DNNs toward dominant devices. Our model is built on the efficient MobileNetV2 network, which is suitable for ASC applications on devices with limited computational capacity. We evaluate our methods on the TAU Urban Acoustic Scene 2020 Mobile dataset, which features audio scenes recorded by multiple devices. Our approaches significantly improve generalization performance for ASC tasks compared to other data augmentation methods and achieve results competitive with state-of-the-art methods.	vi_VN
dc.language.iso	en	vi_VN
dc.publisher	Springer Nature	vi_VN
dc.subject	TAU Urban Acoustic Scene 2020 Mobile dataset	vi_VN
dc.subject	Acoustic scene classification (ASC); deep neural network (DNN) methods	vi_VN
dc.title	Data Augmentation Methods for Cross-Device Acoustic Scene Classification	vi_VN
dc.type	Working Paper	vi_VN
Appears in Collections:	CITA 2024 (International)
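
Illustrative note on the abstract: MixStyleFreq is described above only as combining the frequency-wise means and standard deviations of convolutional feature maps from different audio scenes. The NumPy sketch below shows one way such mixing could look. It is not the authors' implementation: the function name mix_style_freq, the (batch, channels, frequency, time) layout, the choice to compute statistics over the channel and time axes, the Beta-distributed mixing weight, and the alpha and eps values are all assumptions borrowed from the general MixStyle idea.

```python
import numpy as np


def mix_style_freq(x, alpha=0.3, eps=1e-6, rng=None):
    """Hypothetical frequency-wise style mixing for a batch of feature maps.

    x is assumed to have shape (batch, channels, frequency, time).
    """
    rng = np.random.default_rng() if rng is None else rng
    batch = x.shape[0]

    # Assumption: one mean and one standard deviation per sample and per
    # frequency bin, computed over the channel and time axes.
    mu = x.mean(axis=(1, 3), keepdims=True)
    sigma = np.sqrt(x.var(axis=(1, 3), keepdims=True) + eps)

    # Remove each sample's own frequency-wise statistics.
    x_norm = (x - mu) / sigma

    # Pair every sample with another sample from a shuffled batch.
    perm = rng.permutation(batch)

    # Assumption: per-sample mixing weight drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha, size=(batch, 1, 1, 1))

    # Blend the statistics of each sample with those of its partner.
    mu_mix = lam * mu + (1.0 - lam) * mu[perm]
    sigma_mix = lam * sigma + (1.0 - lam) * sigma[perm]

    # Re-apply the blended statistics to the normalized features.
    return x_norm * sigma_mix + mu_mix


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 8, 16, 10))  # toy feature maps
    mixed = mix_style_freq(features, rng=rng)
    print(mixed.shape)
```

In this sketch the content of each feature map is kept while its per-frequency statistics are blended with those of another sample, which is the kind of device-related variation the abstract says the method targets. With a batch of one sample the function returns its input unchanged, up to floating-point error.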
