Papers by Ekta Sood have been accepted at the Conference on Computational Natural Language Learning (CoNLL) 2021 and the International Conference on Computer Vision (ICCV) 2021.
Conference on Computational Natural Language Learning (CoNLL) 2021
CoNLL is a yearly conference organized by SIGNLL, focusing on theoretically, cognitively and scientifically motivated approaches to computational linguistics.
The paper “VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering” (Ekta Sood, Fabian Kögel, Florian Strohm, Prajit Dhar and Andreas Bulling) was selected for an oral presentation at CoNLL 2021. It introduces VQA-MHUG, a novel 49-participant dataset of multimodal human gaze on both images and questions during visual question answering (VQA), collected using a high-speed eye tracker.
“We use our dataset to analyze the similarity between human and neural attentive strategies learned by five state-of-the-art VQA models: Modulated Co-Attention Network (MCAN) with either grid or region features, Pythia, Bilinear Attention Network (BAN), and the Multimodal Factorized Bilinear Pooling Network (MFB). While prior work has focused on studying the image modality, our analyses show - for the first time - that for all models, higher correlation with human attention on text is a significant predictor of VQA performance. This finding points at a potential for improving VQA performance and, at the same time, calls for further research on neural text attention mechanisms and their integration into architectures for vision and language tasks, including but potentially also beyond VQA.”, explains Ekta Sood.
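The core of this analysis is a comparison between human gaze-derived attention and a model's text attention over the question tokens. The snippet below is only a minimal sketch of one way such a comparison could look, using Spearman rank correlation; the tokens and attention values are invented for illustration, and the paper's actual correlation measures and preprocessing may differ.

```python
# Illustrative sketch (not the authors' code): comparing human gaze-based
# attention with a model's text attention over question tokens.
import numpy as np
from scipy.stats import spearmanr

# Hypothetical per-token attention for the question "what color is the bus".
tokens = ["what", "color", "is", "the", "bus"]
human_attention = np.array([0.10, 0.35, 0.05, 0.05, 0.45])  # e.g. from gaze fixation durations
model_attention = np.array([0.15, 0.30, 0.10, 0.05, 0.40])  # e.g. from a VQA model's text attention layer

# Normalize both to proper distributions before comparing.
human_attention /= human_attention.sum()
model_attention /= model_attention.sum()

rho, p_value = spearmanr(human_attention, model_attention)
print(f"Spearman correlation: {rho:.3f} (p = {p_value:.3f})")
```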
International Conference on Computer Vision (ICCV) 2021
ICCV is the premier international computer vision event comprising the main conference and several co-located workshops and tutorials. With its high quality and low cost, it provides an exceptional value for students, academics and industry researchers.
In the paper “Neural Photofit: Gaze-based Mental Image Reconstruction” (Florian Strohm, Ekta Sood, Sven Mayer, Philipp Müller, Mihai Bâce, Andreas Bulling), the authors propose a novel method that leverages human fixations to visually decode the image a person has in mind into a photofit (facial composite). The method combines three neural networks: an encoder, a scoring network, and a decoder. The encoder extracts image features and predicts a neural activation map for each face looked at by a human observer. The scoring network compares the human and neural attention and predicts a relevance score for each extracted image feature. Finally, the image features are aggregated into a single feature vector as a linear combination of all features weighted by their relevance, which the decoder turns into the final photofit. The scoring network is trained on a novel dataset containing gaze data of 19 participants looking at collages of synthetic faces. The authors show that their method significantly outperforms a mean baseline predictor and report on a human study showing that the decoded photofits are visually plausible and close to the observer’s mental image. Code and dataset are available upon request.
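As a rough illustration of this pipeline (a minimal sketch, not the authors' released implementation), the snippet below shows how relevance scores predicted from human and neural attention maps could be used to form a relevance-weighted combination of face features that a decoder maps to an image. All module definitions, tensor shapes, and the softmax normalization are assumptions made for illustration only.

```python
# Illustrative sketch of the encoder / scoring / decoder pipeline described above.
import torch
import torch.nn as nn

class ScoringNetwork(nn.Module):
    """Predicts a relevance score per face by comparing human and neural attention maps."""
    def __init__(self, map_size: int = 16 * 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * map_size, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, human_attn, neural_attn):
        # human_attn, neural_attn: (num_faces, H*W) flattened attention maps
        x = torch.cat([human_attn, neural_attn], dim=-1)
        return self.mlp(x).squeeze(-1)  # (num_faces,) raw relevance scores

# Hypothetical stand-ins for encoder outputs; in the paper these come from
# learned networks operating on the faces the observer looked at.
num_faces, feat_dim, map_size = 8, 512, 16 * 16
face_features = torch.randn(num_faces, feat_dim)    # encoder feature per observed face
neural_attention = torch.rand(num_faces, map_size)  # predicted neural activation maps
human_attention = torch.rand(num_faces, map_size)   # gaze-derived attention maps

scorer = ScoringNetwork(map_size=map_size)
relevance = torch.softmax(scorer(human_attention, neural_attention), dim=0)  # weights sum to 1

# Aggregate: linear combination of face features weighted by relevance.
aggregated_feature = (relevance.unsqueeze(-1) * face_features).sum(dim=0)  # (feat_dim,)

# A decoder (in the paper, a generative face model) maps this vector to the photofit.
decoder = nn.Sequential(nn.Linear(feat_dim, 3 * 64 * 64), nn.Tanh())
photofit = decoder(aggregated_feature).view(3, 64, 64)
print(photofit.shape)  # torch.Size([3, 64, 64])
```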
About Ekta Sood
Ekta Sood is a PhD student at the University of Stuttgart, supervised by Prof. Dr. Andreas Bulling in the Perceptual User Interfaces group. After discovering her passion for cognitive systems, with a focus on neuro-linguistics, Ekta explored different avenues for her next educational step and ultimately decided on the M.Sc. program in Computational Linguistics at the University of Stuttgart. In her PhD, Ekta works on modeling fundamental cognitive functions of attention and memory at the intersection of computer vision and natural language processing. She focuses on advancing performance in machine comprehension with gaze-assisted deep learning approaches, bridging the gap between data-driven and cognitive models, and exploring new ways to interpret neural attention using human physiological data. She holds a bachelor's degree in Cognitive Science from the University of California, Santa Cruz. Ekta is a 2019 Google Scholarship awardee and won the best poster award at ComCo 2019. More recently, her work was accepted at NeurIPS and CoNLL 2020.
For more information on her work, see: https://perceptualui.org/people/sood/ and https://perceptualui.org/.