Diverse Knowledge Selection for Enhanced Zero-shot Visual Question Answering | Prof. Sungsu Lim's Lab | Chungnam National University


article | gold · Citations: 0 · 2025

Diverse Knowledge Selection for Enhanced Zero-shot Visual Question Answering

Seunghoon Han, Min-Gyu Choi, Hyewon Lee, Soyoung Park, Jong-Ryul Lee, Sungsu Lim, Taeho Kim

Abstract

Visual Question Answering (VQA) is one of the important tasks that can help artificial intelligence understand the real world. Recently, with the growing popularity of zero-shot VQA, research has focused on utilizing external knowledge to tackle complex problems, especially those requiring common sense. However, existing studies that attempt to leverage external knowledge often use large amounts of knowledge without any selection process. Since some of this knowledge may not contribute to accurate predictions, the process of selecting relevant knowledge is essential. To address this issue, we propose Diverse Knowledge Selection for Enhanced Visual Question Answering (DKSVQA), which consists of three stages: Image-Context Generation, Similarity-based Knowledge Selection, and Query-Knowledge Graph-based Knowledge Selection. We demonstrate the superior performance of DKSVQA on two VQA benchmark datasets and compare it with zero-shot VQA baseline models. We highlight both the effectiveness and efficiency of DKSVQA through extensive experiments. For reproducibility, the source code is available at https://github.com/gooriiie/DKSVQA.
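The abstract names a Similarity-based Knowledge Selection stage but does not describe its scoring function. As a minimal sketch of the general idea (ranking candidate knowledge statements by similarity to a query and keeping the top k), the example below uses a bag-of-words cosine similarity. The function names (bow, cosine, select_knowledge), the bag-of-words encoder, and the sample sentences are all illustrative assumptions, not DKSVQA's actual implementation, which is available in the linked repository.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words counts (an assumed stand-in for a real text encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_knowledge(query, candidates, k=2):
    """Return the k candidates most similar to the query (hypothetical helper).

    Ties are broken by original position so the result is deterministic.
    """
    q = bow(query)
    scored = [(cosine(q, bow(c)), i, c) for i, c in enumerate(candidates)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [c for _, _, c in scored[:k]]


# Illustrative inputs only; not taken from the paper or its datasets.
candidates = [
    "umbrellas are used when it rains",
    "cats are mammals",
    "people carry an umbrella to stay dry in the rain",
]
query = "why is the person holding an umbrella"
selected = select_knowledge(query, candidates, k=2)
```

In practice the query would likely combine the question with generated image context, and a learned sentence or image-text encoder would replace the bag-of-words vectors; both of those choices are assumptions here as well.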

Keywords

Question answering, Selection (genetic algorithm), Computer science, Zero (linguistics), Shot (pellet), Artificial intelligence, Information retrieval, Chemistry, Linguistics

Type

article

IF / Citations

- / 0

Original article

https://doi.org/10.1145/3701716.3717563

Publication year

2025