article | gold · Citations: 0 · 2025
Diverse Knowledge Selection for Enhanced Zero-shot Visual Question Answering
Seunghoon Han, Min-Gyu Choi, Hyewon Lee, Soyoung Park, Jong-Ryul Lee, Sungsu Lim, Taeho Kim
Abstract

Visual Question Answering (VQA) is one of the important tasks that can help artificial intelligence understand the real world. Recently, with the growing popularity of zero-shot VQA, research has focused on utilizing external knowledge to tackle complex problems, especially those requiring common sense. However, existing studies that attempt to leverage external knowledge often use large amounts of knowledge without any selection process. Since some of this knowledge may not contribute to accurate predictions, the process of selecting relevant knowledge is essential. To address this issue, we propose Diverse Knowledge Selection for Enhanced Visual Question Answering (DKSVQA), which consists of three stages: Image-Context Generation, Similarity-based Knowledge Selection, and Query-Knowledge Graph-based Knowledge Selection. We demonstrate the superior performance of DKSVQA on two VQA benchmark datasets and compare it with zero-shot VQA baseline models. We highlight both the effectiveness and efficiency of DKSVQA through extensive experiments. For reproducibility, the source code is available at https://github.com/gooriiie/DKSVQA.

Keywords
Question answering, Selection (genetic algorithm), Computer science, Zero (linguistics), Shot (pellet), Artificial intelligence, Information retrieval, Chemistry, Linguistics
Type
article
IF / Citations
- / 0
Publication year
2025