Researchers studying the basis of visual recognition in two distinct disciplines—computer science and brain science—have put their heads together to advance both fields at once.
Specifically, they’ve captured the brain activity of four people as each viewed every image in a 5,000-image set, then drawn on a deep-learning algorithm trained on the same images to better understand how the brain processes visual information.
The hope is that this insight will, in turn, feed back to improve deep-learning methodologies.
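The paper itself describes a dataset rather than a single analysis, but one common way to relate fMRI recordings to a deep network’s internal representations is representational similarity analysis: compare how each system groups the same images. The Python sketch below illustrates the basic idea; the array shapes and random stand-in data are hypothetical, and real inputs would come from the published recordings and a pretrained network.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

# Hypothetical inputs, one row per image:
#   fmri_responses: (n_images, n_voxels) voxel activations from a brain region
#   net_features:   (n_images, n_units)  activations from one network layer
# Random stand-ins are used here purely for illustration.
rng = np.random.default_rng(0)
n_images = 100
fmri_responses = rng.standard_normal((n_images, 200))
net_features = rng.standard_normal((n_images, 512))

# Representational dissimilarity matrix (RDM): pairwise distances between
# the responses to every pair of images, computed separately for the brain
# data and for the network features.
brain_rdm = pdist(fmri_responses, metric="correlation")
net_rdm = pdist(net_features, metric="correlation")

# Rank-correlate the two RDMs: a high value means the network groups
# images in much the same way the brain region does.
rho, p = spearmanr(brain_rdm, net_rdm)
print(f"brain-network representational similarity: rho={rho:.3f} (p={p:.3g})")
```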
The team, comprising experts from Carnegie Mellon University in Pittsburgh and Fordham University in New York City, described their work in a study published online May 6 in Scientific Data.
The four volunteers who viewed the images each underwent at least 20 hours of functional MRI (fMRI) brain scanning.
“The extreme design decision to run the same individuals over so many sessions was necessary for disentangling the neural responses associated with individual images,” the team explained in a news release sent by Carnegie Mellon.
“As we learn more about the neural basis of visual recognition, we will also be better positioned to contribute to advances in artificial vision,” added study co-author Michael Tarr, who heads Carnegie Mellon’s psychology department.
Tarr pointed out that, while the study’s image dataset was hefty relative to others used in prior research, much larger sets are needed to advance computer-vision models as well as neuroscience’s understanding of biological vision.
A reasonable fMRI dataset “would require at least 50,000 stimulus images and many more volunteers to make headway in light of the fact that the class of deep neural nets used to analyze visual imagery are trained on millions of images,” Tarr said.
The full study and the news release are both freely available online.