Please use this identifier to cite or link to this item:
https://elib.vku.udn.vn/handle/123456789/6203
| Title: | A Video Retrieval System with EVA-CLIP and Customized Keyframes Extraction |
| Authors: | C. Quan, Khanh An Nguyen, Qui Ngoc |
| Keywords: | Video retrieval; Multimodal retrieval; Keyframes extraction |
| Issue Date: | Jan-2026 |
| Publisher: | Springer Nature |
| Abstract: | Video retrieval is a crucial problem in computer vision and natural language processing: it aims to search for and retrieve videos (or scenes, shots, and frames) that match a user-given query. Among recent approaches to this task, keyframe-based content retrieval with a vision-language model has shown notable efficiency and promising results. In this study, we evaluate several recent vision-language models as replacements for the traditional CLIP model in video retrieval from textual descriptions. The findings indicate that, in terms of retrieval performance, EVA-CLIP and SigLIP notably surpass CLIP and the alternative models when evaluated on the large-scale video dataset V3C. Furthermore, we introduce a customized keyframe extraction module based on visual similarity to address the lack of information in the default keyframes provided with the dataset relative to the textual queries. Our customized keyframe extraction not only decreased the number of keyframes but also significantly improved retrieval outcomes compared to the default keyframes provided in the V3C dataset and extracted by Cineast. |
| Description: | Lecture Notes in Networks and Systems (LNNS, volume 1581); The 14th Conference on Information Technology and Its Applications (CITA 2025); pp. 429-443 |
| URI: | https://doi.org/10.1007/978-3-032-00972-2_32 https://elib.vku.udn.vn/handle/123456789/6203 |
| ISBN: | 978-3-032-00971-5 (p) 978-3-032-00972-2 (e) |
| Appears in Collections: | CITA 2025 (International) |