Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/6203
Title: A Video Retrieval System with EVA-CLIP and Customized Keyframes Extraction
Authors: C. Quan, Khanh An
Nguyen, Qui Ngoc
Keywords: Video retrieval
Multimodal retrieval
Keyframes extraction
Publication year: 2026
Publisher: Springer Nature
Abstract: Video retrieval is a crucial problem in computer vision and natural language processing. It aims to search for and retrieve videos (or scenes, shots, and frames) based on a query given by a user. Among recent approaches to this task, keyframe-based content retrieval with a vision-language model has shown notable efficiency and promising results. In this study, we evaluate several recent vision-language models for their effectiveness in replacing the traditional CLIP model in video retrieval with textual descriptions. The findings indicate that, in terms of retrieval performance, EVA-CLIP and SigLIP notably surpass CLIP and alternative models when evaluated on the large-scale video dataset V3C. Furthermore, we introduce a customized keyframe extraction module based on visual similarity to address the lack of information in the default keyframes provided with the dataset relative to the textual queries. Our customized keyframe extraction not only decreases the number of keyframes but also significantly improves retrieval outcomes compared to the default keyframes provided in the V3C dataset and extracted by Cineast.
Description: Lecture Notes in Networks and Systems (LNNS, volume 1581); The 14th Conference on Information Technology and Its Applications (CITA 2025); pp. 429-443
Identifier: https://doi.org/10.1007/978-3-032-00972-2_32
https://elib.vku.udn.vn/handle/123456789/6203
ISBN: 978-3-032-00971-5 (p)
978-3-032-00972-2 (e)
Collection: CITA 2025 (International)
