Authors: Prateek Ralhan; M.V. Ranjith Kumar; P. Madhavan
Addresses: Department of Electronics and Communication Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai, India; Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai, India; Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai, India
Abstract: One of the most notable recent advances in intelligent systems is visual question answering (VQA), a problem that combines natural language processing and computer vision to measure a system's ability to understand images and draw inferences beyond object recognition, segmentation and image captioning. A primary goal of research in vision-based artificial intelligence is to design systems that can understand and answer questions about visual data. The first part of this study presents a range of diverse and diagnostic datasets that test different visual reasoning abilities, while the second part details recently developed approaches to VQA. Finally, a qualitative comparison of these approaches is provided, which can serve as a benchmark for analysing and comparing the characteristics and applications of the algorithms and datasets, and which may help those new to the field of VQA and its community.
Keywords: intelligent systems; instance segmentation; object detection; relationships; inferential statistics; Karpathy splits; model and image captioning.
International Journal of High Performance Computing and Networking, 2020 Vol.16 No.2/3, pp.125 - 136
Received: 02 Jul 2020
Accepted: 12 Sep 2020
Published online: 12 Jan 2021