Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/6214
Title: A Comparative Study on Domain and Content-Based Approaches for Abusive Website Detection
Authors: Nguyen, Quoc Vuong
Le, Tang Phu Quy
Pham, Van Nam
Ton, That Ron
Phung, Anh Sang
Truong, The Quoc Dung
Nguyen, Ngoc Xuan Quynh
Nguyen, Huu Nhat Minh
Keywords: Abusive website detection
Machine learning
Language model
Feature engineering
Issue Date: Jan-2026
Publisher: Springer Nature
Abstract: The proliferation of abusive websites, particularly those facilitating phishing and fraud, has emerged as a critical cybersecurity threat. Detecting these websites efficiently remains a crucial challenge, necessitating sophisticated feature engineering and advanced machine learning techniques. In this paper, we present a comprehensive comparative study of domain-based and content-based approaches for abusive website detection on two datasets: a Vietnamese abusive website dataset and an international phishing dataset. Through extensive evaluation, we demonstrate that integrating multiple feature types significantly enhances detection accuracy. In particular, hosting-related features exhibit strong independent predictive capability, and machine learning models that take advantage of these features achieve robust performance. Although extracted features contribute substantially to high-accuracy detection, our findings indicate that source code analysis is the most effective method for identifying abusive websites. Language models such as Phishlang excel at capturing the textual patterns within website source code, achieving an accuracy of 0.98 and an F1-score of 0.97.
Description: Lecture Notes in Networks and Systems (LNNS, volume 1581); The 14th Conference on Information Technology and Its Applications (CITA 2025); pp. 273-285
URI: https://doi.org/10.1007/978-3-032-00972-2_21
https://elib.vku.udn.vn/handle/123456789/6214
ISBN: 978-3-032-00971-5 (p)
978-3-032-00972-2 (e)
Appears in Collections: CITA 2025 (International)
