Addressing Non-Determinism: Enhancing Reproducibility of ML Models

Stephen Price, BS in Computer Science, Worcester Polytechnic Institute

This Q&A highlight features Stephen Price, an Honorable Mention in the 2023 CRA Outstanding Undergraduate Researchers award program. Stephen finished his BS at the Worcester Polytechnic Institute (WPI) and is now pursuing an MS/PhD in Computer Science there.

What brought you to computing research?

In my freshman year, during the peak of COVID-19, I felt disconnected from my college journey. I initially wanted to graduate early and enter industry, but my academic advisor, Prof. Rodica Neamtu, persuaded me to attend one of her research group meetings as a way to get more involved on campus.

Initially skeptical about joining a research lab, I found myself captivated by the world of research after just one meeting. I found the research discussions intriguing and was fascinated by the challenges and possibilities research offered. What began as a quest to “get through” college transformed into a journey of discovery and growth. Because of Prof. Neamtu, I pursued a Master’s and will work towards my Ph.D.

How did you find your first research project?

My first project was reproducing a computer vision model for our material science collaborators. Initially, I obtained quite favorable results, and we published them. However, after publishing, I demonstrated the model to a friend, and the results, which I expected to be identical, were significantly worse. Concerned about this disparity, I began analyzing the reproducibility of machine learning (ML) models.

Can you tell us about your project?

Prof. Rodica Neamtu selected my first project. The second project, however, grew out of my own discovery that ML models can be non-deterministic: the same code can produce different results each time it runs. For the computer vision model I used, I found that this non-determinism stemmed from the model’s architecture, the software libraries it relied on, and the GPU. Reproducibility in ML is vital, particularly in collaborative work, because discrepancies in results can cause problems for teams, especially when they are unaware of them.

To make the model’s results reproducible, we systematically changed each aspect of the process and checked its effect until repeated runs produced consistent results. This meant identifying the factors that affected reproducibility, applying robust software engineering practices, providing in-depth documentation, and choosing specific software libraries and algorithms. I presented this work at the International Conference on Pattern Recognition and Artificial Intelligence (ICPRAI).
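As a rough illustration of the kind of control this involves, the sketch below fixes the random seeds that commonly cause run-to-run variation and then checks that two runs agree. It is not the procedure from the ICPRAI work: the helper names `seed_everything` and `toy_training_run` are hypothetical, the "training" loop is a stand-in, and the NumPy and PyTorch settings are applied only if those libraries happen to be installed. Seeding alone may not remove every source of non-determinism, since some GPU operations can still vary between runs.

```python
import os
import random


def seed_everything(seed):
    """Seed the random number generators this sketch knows about (hypothetical helper)."""
    # Only affects Python processes started after this point; to fix hash
    # ordering in the current interpreter it must be set before startup.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional in this sketch

    try:
        import torch
        torch.manual_seed(seed)  # seeds the CPU and CUDA generators
        # Ask cuDNN for deterministic kernels and disable autotuning,
        # which can pick different algorithms on different runs.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch is optional in this sketch


def toy_training_run(seed, steps=5):
    """Stand-in for a training run: random init plus noisy updates."""
    seed_everything(seed)
    weight = random.gauss(0.0, 1.0)       # random initialization
    history = []
    for _ in range(steps):
        noise = random.gauss(0.0, 0.1)    # e.g. noise from data shuffling
        weight -= 0.1 * (weight + noise)  # toy update rule
        history.append(weight)
    return history


if __name__ == "__main__":
    first = toy_training_run(seed=42)
    second = toy_training_run(seed=42)
    print("runs identical:", first == second)
```

Comparing the outputs of repeated runs, as the last lines do, is a simple way to notice when a change to the code, libraries, or hardware reintroduces variation.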

What challenges did you encounter when first getting started in research? 

Starting my research in an interdisciplinary group combining computer science with material science, I often struggled with unfamiliar terminology. To address this, I either asked the graduate student on the project, Bryer Sousa, for clarification or noted the terms to research later. During one-on-one meetings with my advisor, I would raise my uncertainties. However, she often encouraged me to research unfamiliar terminology independently and present my findings at our next meeting. This approach pushed me to adopt a self-learning mindset.

What was the most challenging part of the research process? 

The most challenging aspect of the research was managing unfavorable results. The process from data collection to model training is extensive, so a lot of time passes before you can tell whether a solution works. When results are unexpected, especially in uncharted research areas, it is tough but essential to adapt, learn from them, and identify the causes of the suboptimal results so they can guide future decisions.

How did your identity affect your research experience? 

While my personal identities have not affected my research experience, age has been a factor. As an undergraduate student whose highest completed education was a high school diploma, I found it challenging to interact with seasoned professionals at conferences. Most of this stemmed from my own hesitation, but there were instances where I felt undervalued when I tried to engage.

How did you balance your research activities with your coursework?

Throughout my research career, I have used independent studies to set aside time for research and earn credits for it. This not only gave me time to focus on my research but also gave the process more structure, since I had to fulfill certain requirements to receive those credits.

Do you have any advice for other students looking to get into research?

My biggest piece of advice is just to get started. I made mistakes along the way, and there are things I wish I had known earlier, but I only gained that knowledge by actively getting involved. Do not hesitate; reach out to a professor. At worst, they might be unavailable. At best, you get published as a first author, and they might even fly you to Paris to present your work.

— Edited by Yasra Chandio and Alejandro Velasco Dimate
