Between Mind and Machine: NLP Discoveries

This Q&A highlight features Anya Ji, a Finalist in the 2023 CRA Outstanding Undergraduate Researchers award program. Anya finished her undergraduate degree in Computer Science and Psychology and her M.Eng in Computer Science at Cornell University.

What brought you to computing research?

My initial interest in cognitive science was broad and exploratory, leading me on a quest to pinpoint areas within the field that truly captivated me. To find them, I started as a research assistant in two labs. At the Affect and Cognition Lab, working on a project examining time perception and the orienting response, I learned how humans understand time. At the Attention, Memory, and Perception Lab, where I worked on event segmentation advised by Professor Khena Swallow, I learned how people segment continuous activities (such as watching a movie, attending a lecture, or observing a sports game) into discrete events (major plot points, topics of the lecture, or winners of the game) for better comprehension and memory. While working on these projects, I found my passion in language and, later, natural language processing (NLP). A project analyzing free recall data from the event segmentation experiments inspired me to look for research opportunities at the intersection of NLP and psychology.

How did you connect with your undergraduate research advisor?

Fascinated by Professor Yoav Artzi’s work in NLP, with its emphasis on learning from dynamic interaction with humans, I reached out to him to express my interest in NLP and cognitive science and to ask about potential research opportunities in his lab. My interdisciplinary background aligned with one of his new research collaborations with Professor Robert Hawkins at the University of Wisconsin-Madison, so I was onboarded to work on this research.

What challenges did you encounter when you first started your research?

There was a lot of ambiguity since it was a new project, and we were starting from scratch. To navigate this, I engaged deeply, posing numerous questions to clarify our objectives and direction. I also proposed ideas for the experiment design, both as a demonstration of my grasp of the subject and as a means to solicit feedback for refinement. Through this process, I learned two things. First, research fundamentally diverges from the structured nature of course assignments, which typically feature clearly delineated questions. Second, it pays to make use of the mentorship available: do not be afraid of asking “dumb” questions, and take the initiative to understand the project fully and actively contribute to it.

Can you tell us about your project?

Abstract visual stimuli like tangrams (puzzles made of seven geometric pieces that form meaningful shapes) are inherently ambiguous, which makes them difficult to interpret for both humans and multimodal models. These puzzles are often used in cognitive science research for their abstract nature, which can drive consensus or foster diverse interpretations. However, existing tangram datasets are small, limiting their utility in machine learning. A larger and more diverse dataset could significantly advance research in abstract visual reasoning across NLP and cognitive science and open up new scientific questions. To address the scarcity of resources for studying abstract visual reasoning in both humans and machine learning models, we developed an interactive crowdsourcing platform that amassed over 13,000 annotations and shape segmentations of 1,016 tangrams, creating the KiloGram dataset. Our evaluation showed that pre-trained vision-language models like CLIP and ViLT struggle with abstract tasks, but that fine-tuning improves their abstraction capabilities, highlighting a generalization gap. I presented this work at EMNLP 2022, where it received the Best Long Paper Award.

What challenges did you encounter throughout the research process?

Faced with daunting tasks in research, such as initiating a crowdsourcing experiment, conducting complex data analysis, and fine-tuning novel models, I did not know where to start. I learned to break these tasks down into smaller, reachable goals and tackle them individually, acquiring many intermediate skills on the way. I learned that the research process could be messy, but the key to navigating it lies in meticulous planning, flexibility, and willingness to learn and adapt when initial strategies falter.

What were some of your favorite aspects of research?

When I joined Prof. Artzi’s lab, Alane Suhr and Noriyuki Kojima (Ph.D. students in the lab at the time) generously shared their knowledge of crowdsourcing and model training with me by discussing their projects, which offered both technical insights and inspiration. This experience catalyzed my learning journey, in which a straightforward data collection task became a gateway to mastering a diverse set of skills.

How has participating in research shaped your professional path?

My research has fueled my curiosity about the possibilities that emerge from applying insights from both human and machine reasoning to practical uses. In my career, I aim to engage in multimodal machine learning research and engineering to develop virtual and physical agents that possess a deep understanding of human cognition and the ability to generalize using vision and language. My goal is to create solutions like software copilots or robotic companions that seamlessly integrate with human activities, enhancing our capabilities and collaboration in everyday scenarios.

Do you have any advice for other students looking to get into research?

Start early and try different things. Expect ambiguity, ask questions, and be prepared to learn a lot of things to solve a problem from scratch.
— Edited by Yasra Chandio and Alejandro Velasco Dimate