Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/2728
Title: Information Technology Skills Extractor for Job Descriptions in vku-ITSkills Dataset Using Natural Language Processing
Authors: Nguyen, Huu Nhat Minh
Nguyen, Ket Doan
Pham, Quoc Huy
Kieu, Xuan Loc
Hoang, Nguyen Vu
Nguyen, Huy
Huynh, Cong Phap
Keywords: IT Skills Dataset
Named Entity Recognition
Natural Language Processing
Publication year: 2023
Publisher: Springer Nature
Abstract: An IT skills extractor helps job recommendation systems and job seekers find suitable jobs conveniently and efficiently. In this paper, we design an efficient SpaCy pipeline that extracts IT skills from job descriptions using Natural Language Processing (NLP) and Named Entity Recognition (NER) methods. The proposed method extracts potential hard and soft skills, which can later be provided to job recommenders and job seekers. Building on SpaCy, a state-of-the-art open-source NLP framework, we first construct a new IT skills dictionary based on ChatGPT and automatically label a scraped job description dataset, named the vku-ITSkills dataset. The quality of the labels in this dataset is then improved using the Part-of-Speech (POS) function and additional rules. We then fine-tune the pre-trained RoBERTa-base model to provide Transformer-based word embeddings for training the NER model that extracts skills. Thereafter, we define additional logical rules, such as the comma rule, that use syntactic cues to find further skills and enhance the extracted results. In this language pipeline, the RoBERTa embedding, NER, and additional rules play important roles in coping with unseen and new IT skills that do not exist in the vku-ITSkills dataset and are missed by NER. In the evaluation, we test the proposed pipeline on 200 job descriptions manually labeled by our team and demonstrate the efficiency of each step in the pipeline.
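The abstract names a "comma rule" but does not define it. One plausible reading, given here only as a minimal sketch and not as the authors' implementation, is that when a comma-separated list contains at least one skill already recognized (by the dictionary or by NER), its short neighbouring items are proposed as additional skill candidates. The function name `comma_rule`, the `max_words` threshold, and the splitting pattern are all hypothetical choices made for illustration.

```python
import re


def comma_rule(sentence, known_skills, max_words=2):
    """Propose new skill candidates from a comma-separated list.

    Assumption (not taken from the paper): if any list item ends with an
    already-known skill, the other short items are treated as candidates.
    """
    known = {s.lower() for s in known_skills}

    # Split on commas and on the "and"/"or" that usually closes a list.
    parts = re.split(r",|\band\b|\bor\b", sentence)
    items = [p.strip(" .;:") for p in parts]
    items = [p for p in items if p]

    # Require at least three items so that a single comma is not enough.
    if len(items) < 3:
        return []

    def ends_with_known(item):
        # True if some suffix of the item (by words) is a known skill,
        # e.g. "Experience with Python" ends with "python".
        words = item.lower().split()
        return any(" ".join(words[i:]) in known for i in range(len(words)))

    # The rule fires only when the list contains a recognized skill.
    if not any(ends_with_known(item) for item in items):
        return []

    # Keep short items that are not already known.
    return [
        item
        for item in items
        if len(item.split()) <= max_words and not ends_with_known(item)
    ]


if __name__ == "__main__":
    text = "Experience with Python, Django, FastAPI and Docker."
    print(comma_rule(text, {"Python", "Docker"}))
```

A rule of this kind trades precision for recall: it can surface skills that neither the dictionary nor the NER model has seen, which is why the sketch requires at least one recognized skill in the list and limits candidates to short items.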
Description: Lecture Notes in Networks and Systems (LNNS, volume 734); CITA: Conference on Information Technology and its Applications; pp. 250-261.
Identifier: https://link.springer.com/chapter/10.1007/978-3-031-36886-8_21
http://elib.vku.udn.vn/handle/123456789/2728
ISBN: 978-3-031-36886-8
Collection: CITA 2023 (International)

Use of materials in the Digital Library must comply with copyright law.