IU psychologist leads $700,000 NSF grant to create machines that think like toddlers

500 hours of video, 54 million images from over 100 children to inform machine model of childhood visual recognition and language learning

  • Sept. 18, 2015


BLOOMINGTON, Ind. -- Linda Smith, an internationally recognized expert in human cognition at Indiana University, and IU professor Chen Yu, in collaboration with computer vision researchers from Georgia Tech, have received $700,000 from the National Science Foundation to lead new research that could strengthen understanding of how children learn to recognize discrete categories of objects.

Members of the IU Bloomington College of Arts and Sciences' Department of Psychological and Brain Sciences, Smith and Yu said the work is expected to help in the creation of machines that can learn how to visually recognize objects with the same ease as children. It could also lead to new, more sophisticated digital object-recognition technology.

The IU team will recruit over 100 families to gather first-person image data from infants and toddlers in their own homes using eye-tracking technology and extremely lightweight, GoPro-like, head-mounted mini-cameras. The team from Georgia Tech will then use the images to design machine learning models that mimic the toddlers' ability to recognize objects.

"The study addresses a critical need to better understand the visual side of object name learning," said Smith, a Distinguished Professor and Chancellor’s Professor of Psychological and Brain Sciences. "Emerging evidence from labs across the country suggests that children who are slow word learners also are slower, or weaker, in their visual object recognition skills. It could be that learning object names teaches visual object recognition or that poor or slowly developing visual object recognition limits early word learning.

"Either way, visual object recognition is intricately connected to the early language-learning process."

Participating families will come from Monroe County, Ind. Caregivers will place head cameras on their children, either for six hours in a single day or in multiple sessions over a week, to capture moment-to-moment, "high-density" eye movement information as the children interact and play. In all, the study is expected to yield 54 million images and 500 hours of head camera video.
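As a rough consistency check on these totals, the short sketch below works out the capture rate and per-family recording time they imply. It uses only the figures stated in the release. The resulting frame rate is an inference from those figures, not a recording setting confirmed by the researchers, and the variable names are illustrative.

```python
# Figures stated in the release.
total_hours = 500            # hours of head camera video
total_images = 54_000_000    # individual images (frames)
families = 100               # the release says "over 100", so this is a lower bound

# Images captured per second of video, if every frame counts as one image.
seconds = total_hours * 3600
implied_fps = total_images / seconds

# Average recording time per family; an upper bound, since there are over 100 families.
hours_per_family = total_hours / families

print(f"Implied capture rate: {implied_fps:.0f} images per second")
print(f"Average video per family: at most {hours_per_family:.0f} hours")
```

The totals are consistent with video recorded at a standard 30 frames per second, and with roughly five hours of footage per family.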

"The visual data and footage from these devices will undergo a rigorous data mining and quantitative analysis using computer vision and machine learning techniques, which could ultimately advance how researchers study learning in young infants and toddlers," Yu said.

"Our easy-to-use system was designed to fit parent and toddler needs," Smith added. "Our goal will be to gather everyday toddler-perspective scenes without influencing -- by our own expectations or parent expectations -- the recorded scenes."

IU researchers will follow up with the participating families after one year to record language progression in the children, allowing the team to connect the new data back to the visual information from the initial portion of the study.

The formation of object recognition skills has remained a notoriously "unsolved puzzle" in psychology, said Smith, noting that most object recognition research and technology is based on the assumption that humans acquire these skills through the accumulation of numerous examples of a single object.

Smith and Yu's preliminary work, however, suggests a very different scenario: that numerous views, including partial and limited ones, of a single object lead to the development of visual object recognition skills. So in addition to tallying instances of exposure to different categories of objects -- cars, cups, chairs or ducks, for instance -- IU researchers will record toddlers as they encounter similar objects in different forms -- a toy duck, a soap dish shaped like a duck, a duck-shaped candy dispenser -- as well as extended interactions with a single object.

"The key to this study is capturing egocentric, first-person views of the natural visual environment from the perspective of infants," said Smith, who added that the camera and eye-tracking techniques are unique to her studies. "Objects are not recognized based upon the number of instances of the object in their environment but rather the limits of time and place, and by the young child’s body, activities and needs."

Smith also serves as the director of the Cognitive Development Lab in the Department of Psychological and Brain Sciences. Yu is director of the Computational Cognition and Learning Lab in the department.

The receipt of the NSF grant would not have been possible without support from the IU Bloomington Office of the Provost’s Faculty Research Support Program, Smith said.

Toddler perspective images

Toy car images from a toddler's point of view captured using head-mounted technology. | Photo by Sven Bambach

Toddler wearing head camera

A toddler wears a head-mounted camera. | Photo by Sven Bambach

Linda Smith | Photo by Indiana University

Chen Yu | Photo by Indiana University

Media Contacts

Kevin Fryling

  • Office 812-856-2988
  • kfryling@iu.edu

Liz Rosdeitcher

  • Department of Psychological and Brain Sciences
  • Office 812-855-4507
  • rosdeitc@indiana.edu