The idea of “seeing the world through the eyes of a child” takes on new meaning when the observer is a computer. Institute scientists in the Lab for Vision and Robotics Research took their computer right back to babyhood and used it to ask how infants first learn to identify objects in their visual field.
How do you create an algorithm that imitates the earliest learning processes? What do you assume is already hard-wired into the newborn brain, as opposed to the new information it picks up by repeated observation? And finally, how do you get a computer to make that leap from a data-crunching machine that “learns” from information that humans have already sorted, labeled and annotated to a true learning machine that can figure out how to make sense of visual input just by observing?
These researchers started with a basic theory and some fundamental knowledge of what attracts the attention of an infant. And they did manage to create an algorithm with which a computer could, just by watching videos, learn to identify a hand (and, without giving the whole story away, show that one particular way of learning was much better than others). Then they extended their insight to direction of gaze – something an infant learns sometime between six months and a year – and the computer learned to tell where the person in a photo or video was looking as well as an adult human can.
To understand what an achievement this is, remember that just a few years ago, computer vision researchers were trying to get a computer to tell objects apart in images or, more recently, to group together views of the same object seen from different angles (not that these are trivial, or have been fully resolved). Learning to identify a hand involves not only comprehending that the moving, changing object seen from various viewpoints is one thing, but also beginning to grasp the concept of a “hand” long before the verbal skills to name it develop.
One of the nice things about these models is that they can be tested in real life. In fact, while they hold clear implications for the field of artificial intelligence, the researchers are quite jazzed that a computer model such as theirs has the potential to reveal something so basic about human development.