Visual Question Answering (VQA) is an important task for helping artificial intelligence understand the real world. With the growing popularity of zero-shot VQA, recent research has focused on leveraging external knowledge to tackle complex problems, especially those requiring commonsense reasoning. However, existing approaches often inject large amounts of external knowledge without any selection process. Since some of this knowledge may not contribute to accurate predictions, selecting the relevant knowledge is essential. To address this issue, we propose Diverse Knowledge Selection for Enhanced Visual Question Answering (DKSVQA), which consists of three stages: Image-Context Generation, Similarity-based Knowledge Selection, and Query-Knowledge Graph-based Knowledge Selection. We demonstrate the superior performance of DKSVQA on two VQA benchmark datasets against zero-shot VQA baseline models, and we highlight both the effectiveness and efficiency of DKSVQA through extensive experiments. For reproducibility, the source code is available at https://github.com/gooriiie/DKSVQA.
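The abstract does not detail how similarity-based selection works; as a purely illustrative sketch (not the paper's implementation), a knowledge piece can be kept or discarded by ranking candidate facts against a query embedding with cosine similarity. All function names, facts, and toy embedding vectors below are hypothetical stand-ins for real encoder outputs:

```python
import math

def cosine_similarity(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_knowledge(query_emb, knowledge, top_k=2):
    # knowledge: list of (text, embedding) pairs; keep the top_k pieces
    # most similar to the query embedding, most similar first.
    scored = sorted(knowledge,
                    key=lambda kv: cosine_similarity(query_emb, kv[1]),
                    reverse=True)
    return [text for text, _ in scored[:top_k]]

# Toy 3-dimensional embeddings standing in for a real encoder's outputs.
query = [1.0, 0.0, 0.5]
facts = [
    ("a zebra has black and white stripes", [0.9, 0.1, 0.4]),
    ("bananas are rich in potassium",       [0.0, 1.0, 0.0]),
    ("zebras live in grasslands",           [0.8, 0.0, 0.6]),
]
print(select_knowledge(query, facts, top_k=2))
```

In this toy example the off-topic banana fact scores near zero and is filtered out, which is the kind of pruning a selection stage performs before knowledge is passed to the answering model.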