Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/2728
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Nguyen, Huu Nhat Minh | - |
| dc.contributor.author | Nguyen, Ket Doan | - |
| dc.contributor.author | Pham, Quoc Huy | - |
| dc.contributor.author | Kieu, Xuan Loc | - |
| dc.contributor.author | Hoang, Nguyen Vu | - |
| dc.contributor.author | Nguyen, Huy | - |
| dc.contributor.author | Huynh, Cong Phap | - |
| dc.date.accessioned | 2023-09-26T01:43:53Z | - |
| dc.date.available | 2023-09-26T01:43:53Z | - |
| dc.date.issued | 2023-07 | - |
| dc.identifier.isbn | 978-3-031-36886-8 | - |
| dc.identifier.uri | https://link.springer.com/chapter/10.1007/978-3-031-36886-8_21 | - |
| dc.identifier.uri | http://elib.vku.udn.vn/handle/123456789/2728 | - |
| dc.description | Lecture Notes in Networks and Systems (LNNS, volume 734); CITA: Conference on Information Technology and its Applications; pp. 250-261. | vi_VN |
| dc.description.abstract | An IT skills extractor helps job recommendation systems and job seekers find suitable jobs conveniently and efficiently. In this paper, we design an efficient spaCy pipeline that extracts IT skills from job descriptions using Natural Language Processing (NLP) and Named Entity Recognition (NER) methods. The proposed method extracts potential hard and soft skills, which can later be provided to job recommenders and job seekers. Working within this state-of-the-art open-source NLP framework, we first construct a new IT skills dictionary based on ChatGPT and automatically label a scraped job description dataset, named the vku-ITSkills dataset. The quality of the labels in this dataset can then be improved by the Part-of-Speech (POS) function and additional rules. We then fine-tune the pre-trained RoBERTa-base model to provide Transformer-based word embeddings for training an NER model that extracts skills. Thereafter, we define additional logical rules based on syntactic cues, such as the comma rule, that find further skills and enhance the extracted results. In this language pipeline, the RoBERTa embeddings, NER, and additional rules play important roles in coping with unseen and new IT skills that are absent from the vku-ITSkills dataset and are missed by NER. In the evaluation, we test the proposed pipeline on 200 job descriptions manually labeled by our team and demonstrate the efficiency of each step in the pipeline. | vi_VN |
| dc.language.iso | en | vi_VN |
| dc.publisher | Springer Nature | vi_VN |
| dc.subject | IT Skills Dataset | vi_VN |
| dc.subject | Named Entity Recognition | vi_VN |
| dc.subject | Natural Language Processing | vi_VN |
| dc.title | Information Technology Skills Extractor for Job Descriptions in vku-ITSkills Dataset Using Natural Language Processing | vi_VN |
| dc.type | Working Paper | vi_VN |
Appears in Collections: CITA 2023 (International)
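
The abstract names a "comma rule" among the additional logical rules but this record does not define it. As a rough illustration only, the sketch below assumes one plausible reading: when a term appears next to an already known skill in a comma-separated list, it is treated as a candidate skill. It uses plain Python rather than spaCy, and the `KNOWN_SKILLS` set, the `comma_rule` function, and the `max_words` cutoff are all hypothetical names and values, not taken from the paper.

```python
import re

# Hypothetical seed dictionary; the actual vku-ITSkills dictionary is not shown here.
KNOWN_SKILLS = {"python", "java", "sql", "docker"}


def comma_rule(sentence, known_skills=KNOWN_SKILLS, max_words=3):
    """Return list items that sit directly next to a known skill.

    Assumed interpretation of the "comma rule": neighbours of a known
    skill in a comma-separated list are likely to be skills as well.
    """
    # Split on commas and on the conjunctions that usually close a list.
    parts = re.split(r",|\band\b|\bor\b", sentence)
    items = [p.strip(" .;:") for p in parts]
    items = [p for p in items if p]

    candidates = []
    for i, item in enumerate(items):
        if item.lower() in known_skills:
            continue  # already covered by the dictionary
        # Look at the item immediately before and immediately after.
        neighbours = items[max(0, i - 1):i] + items[i + 1:i + 2]
        next_to_known = any(n.lower() in known_skills for n in neighbours)
        # Long phrases are unlikely to be skill names, so skip them.
        if next_to_known and len(item.split()) <= max_words:
            candidates.append(item)
    return candidates


print(comma_rule("Python, Rust, Docker and Kubernetes"))
```

Here `Rust` and `Kubernetes` are picked up because each sits beside a dictionary entry. A real pipeline would more likely work on tokenized text with entity spans from the NER step rather than on raw string splitting.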

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.