A Hybrid Learning of Lexical and Language Processing for Domain Credibility Classification

Nguyen, Huu Nhat Minh; Nguyen, D. Bao; Ton, That Ron; Truong, The Quoc Dung; Truong, Dinh Dung; Phung, Anh Sang; Pham, Van Nam; Tran, The Son

Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/5067

Title:	A Hybrid Learning of Lexical and Language Processing for Domain Credibility Classification
Authors:	Nguyen, Huu Nhat Minh; Nguyen, D. Bao; Ton, That Ron; Truong, The Quoc Dung; Truong, Dinh Dung; Phung, Anh Sang; Pham, Van Nam; Tran, The Son
Keywords:	Domain credibility; Hybrid learning; Natural language processing
Publication year:	2024
Publisher:	IEEE
Abstract:	Malicious domains and websites pose a significant threat to ordinary users, and their increasing prevalence demands early detection methods. A growing number of domains are registered with malicious intent, and they are becoming increasingly difficult to detect and prevent. Leveraging recent BERT-based language representations together with conventional lexical feature representations, we introduce a novel hybrid learning model that uses both the lexical characteristics and the semantic language features of inspected domains for domain credibility classification. The proposed model combines lexical and language encoders: the lexical encoder processes features such as length, special character count, domain type, domain entropy, and digit count, while pre-trained language models such as Vietnamese PhoBERT and multilingual XLM-RoBERTa are fine-tuned to capture semantic information from the domain. In the experiments, the hybrid learning models outperform baselines that use only the lexical encoder or only the language encoder in differentiating between High and Low credibility domains.
Description:	2024 International Conference on Advanced Technologies for Communications (ATC 2024)
Identifier:	10.1109/ATC63255.2024.10908153; https://elib.vku.udn.vn/handle/123456789/5067
ISBN:	979-8-3503-5397-6
ISSN:	2162-1020
Collection:	NĂM 2024 (Year 2024)
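
The abstract lists the lexical features fed to the lexical encoder (length, special character count, domain type, domain entropy, and digits) without defining them. The following is a minimal sketch of how such features might be computed, not the authors' implementation. The function names `shannon_entropy` and `lexical_features` are hypothetical, "domain type" is assumed here to mean the top-level domain, "special characters" are assumed to be any non-alphanumeric characters other than dots, and entropy is assumed to be character-level Shannon entropy over the part of the name before the top-level domain.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits; 0.0 for an empty string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(domain: str) -> dict:
    """Compute illustrative lexical features for a domain name.

    Assumptions (not taken from the paper): the domain type is the
    top-level domain, and dots are not counted as special characters.
    """
    domain = domain.strip().lower().rstrip(".")
    # Split off the last label as the top-level domain.
    name, _, tld = domain.rpartition(".")
    if not name:
        # No dot in the input: treat the whole string as the name.
        name, tld = domain, ""
    return {
        "length": len(domain),
        "special_chars": sum(1 for c in domain if not c.isalnum() and c != "."),
        "digits": sum(1 for c in domain if c.isdigit()),
        "entropy": shannon_entropy(name),
        "tld": tld,
    }


if __name__ == "__main__":
    print(lexical_features("abc-123.com"))
```

In a hybrid setup like the one the abstract describes, a numeric vector built from these values would be passed to the lexical encoder, while the raw domain string would go to the fine-tuned language model. How the two representations are combined is not specified in this record.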


Use of materials in the Digital Library must comply with copyright law.