Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/3995
Full metadata record
DC Field | Value | Language
dc.contributor.author | Nguyen, Ket Doan | -
dc.contributor.author | Tran, Nguyen Anh | -
dc.contributor.author | Vo, Van Nam | -
dc.contributor.author | Nguyen, Tran Tien | -
dc.contributor.author | Le, Pham Tuyen | -
dc.contributor.author | Nguyen, Quoc Vuong | -
dc.contributor.author | Nguyen, Huu Nhat Minh | -
dc.date.accessioned | 2024-07-30T01:28:19Z | -
dc.date.available | 2024-07-30T01:28:19Z | -
dc.date.issued | 2024-06 | -
dc.identifier.issn | 1859-3526 | -
dc.identifier.uri | https://doi.org/10.32913/mic-ict-research-vn.v2024.n1.1271 | -
dc.identifier.uri | https://ictmag.vn/cntt-tt/article/view/1271/566 | -
dc.identifier.uri | https://elib.vku.udn.vn/handle/123456789/3995 | -
dc.description | Research and Development on Information and Communication Technology; pp. 49-55. | vi_VN
dc.description.abstract | Automatic Speech Recognition (ASR), which automatically recognizes human speech and converts it into readable text, has grown exponentially over the past decade. However, Vietnamese speech recognition faces critical challenges, such as frequent mispronunciations and wide variation in Vietnamese speech. In this work, we address the difficult challenge of Mispronunciation Detection (MD) in the Vietnamese language. As a tonal language, Vietnamese relies not only on consonants and vowels but also on variations in pitch or tone during pronunciation. In this paper, we propose the DaNangVMD model for detecting mispronunciations in Vietnamese speech based on the speech audio and the canonical transcript. By leveraging a multi-head attention-based multimodal representation built from the embeddings of the phonetic encoder and the linguistic encoder, DaNangVMD aims to provide a robust solution for accurate mispronunciation detection and diagnosis. In an extensive evaluation, the proposed DaNangVMD outperforms the PAPL baseline models by 15% in F1 score and 13% in accuracy. | vi_VN
dc.language.iso | en | vi_VN
dc.publisher | Journal of Information & Communications | vi_VN
dc.subject | Mispronunciation Detection | vi_VN
dc.subject | Multimodal Embedding | vi_VN
dc.subject | Vietnamese Speech Recognition | vi_VN
dc.title | DaNangVMD: Vietnamese Speech Mispronunciation Detection | vi_VN
dc.title.alternative | DaNangVMD: Nhận diện phát âm sai tiếng Việt | vi_VN
dc.type | Working Paper | vi_VN
Collection: Year 2024

Use of materials in the Digital Library must comply with copyright law.