Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/6203
Full metadata record
DC Field | Value | Language
dc.contributor.author | C. Quan, Khanh An | -
dc.contributor.author | Nguyen, Qui Ngoc | -
dc.date.accessioned | 2026-01-20T01:49:22Z | -
dc.date.available | 2026-01-20T01:49:22Z | -
dc.date.issued | 2026-01 | -
dc.identifier.isbn | 978-3-032-00971-5 (p) | -
dc.identifier.isbn | 978-3-032-00972-2 (e) | -
dc.identifier.uri | https://doi.org/10.1007/978-3-032-00972-2_32 | -
dc.identifier.uri | https://elib.vku.udn.vn/handle/123456789/6203 | -
dc.description | Lecture Notes in Networks and Systems (LNNS, volume 1581); The 14th Conference on Information Technology and Its Applications (CITA 2025); pp. 429-443 | vi_VN
dc.description.abstract | Video retrieval is a crucial problem in computer vision and natural language processing, aiming to search and retrieve videos (or scenes, shots, and frames) based on a user-given query. Among recent approaches for this task, keyframe-based content retrieval with a vision-language model has shown significant efficiency and promising results. In this study, we evaluate various recent vision-language models for their effectiveness in replacing traditional CLIP in video retrieval with textual descriptions. The findings indicate that, in terms of retrieval performance, EVA-CLIP and SigLIP notably surpass CLIP and alternative models when evaluated on the large-scale video dataset V3C. Furthermore, we introduce a customized keyframe extraction module based on visual similarity to address the lack of information in the default keyframes provided with the dataset relative to the textual queries. Our customized keyframe extraction not only decreased the number of keyframes but also significantly improved retrieval outcomes compared to the default keyframes provided in the V3C dataset and extracted by Cineast. | vi_VN
dc.language.iso | en | vi_VN
dc.publisher | Springer Nature | vi_VN
dc.subject | Video retrieval | vi_VN
dc.subject | Multimodal retrieval | vi_VN
dc.subject | Keyframes extraction | vi_VN
dc.title | A Video Retrieval System with EVA-CLIP and Customized Keyframes Extraction | vi_VN
dc.type | Working Paper | vi_VN
Appears in Collections:CITA 2025 (International)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.