Dual-Transformer Framework for Vietnamese Image-to-Poem Generation

Nguyen, Minh Chau; Dang, Van Thin; Nguyen, Vinh Tiep; Truong, Quoc Truong; Nguyen, Thanh Son

Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/4002

Full metadata record

DC Field	Value	Language
dc.contributor.author	Nguyen, Minh Chau	-
dc.contributor.author	Dang, Van Thin	-
dc.contributor.author	Nguyen, Vinh Tiep	-
dc.contributor.author	Truong, Quoc Truong	-
dc.contributor.author	Nguyen, Thanh Son	-
dc.date.accessioned	2024-07-30T02:48:27Z	-
dc.date.available	2024-07-30T02:48:27Z	-
dc.date.issued	2024-07	-
dc.identifier.isbn	978-604-80-9774-5	-
dc.identifier.uri	https://elib.vku.udn.vn/handle/123456789/4002	-
dc.description	Proceedings of the 13th International Conference on Information Technology and Its Applications (CITA 2024); pp: 2-13.	vi_VN
dc.description.abstract	Image-to-Poem generation is a novel task in Artificial Intelligence that has received researchers' attention in recent years. The task requires a system to automatically generate a poem, a creative text with a specific structure, in an aesthetically pleasing manner based on an input image. However, most existing methods still suffer from topic inconsistency and irregularity. Moreover, benchmark datasets are lacking: because people interpret the same image from different points of view, Image-to-Poem datasets are difficult to create, especially for low-resource languages. Therefore, in this paper, we present Visual-68Poem, a dataset for the Vietnamese Image-to-Poem task consisting of six-eight poems with a variety of content and context. In addition, we propose a Dual-Transformer architecture comprising a component that extracts the main objects and concept keywords from the image and a language model that generates a six-eight poem. Specifically, the generated poems must be consistent with the context of the image and comply with the structural rules of the poetic genre. Experimental results on our dataset show that our proposed models consistently achieve competitive performance compared with other models across different evaluation measures. We release our dataset and code to facilitate future work on this task.	vi_VN
dc.language.iso	en	vi_VN
dc.publisher	Vietnam-Korea University of Information and Communication Technology	vi_VN
dc.relation.ispartofseries	CITA;	-
dc.subject	Language model	vi_VN
dc.subject	Text generation	vi_VN
dc.subject	Poem generation	vi_VN
dc.subject	Image-to-text	vi_VN
dc.subject	Vietnamese language	vi_VN
dc.subject	Transformer architectures	vi_VN
dc.title	Dual-Transformer Framework for Vietnamese Image-to-Poem Generation	vi_VN
dc.type	Working Paper	vi_VN
Appears in Collections:	CITA 2024 (Proceeding - Vol 2)
