In Crossing the Uncanny Valley, Joe Kloc explores the unsettled feeling that we get when we see a robot or animation that is not a person but looks a little too much like a person. In Perfect Strangers, Greg Laden comments on this work from the anthropological perspective. I thought that I might give my two cents by considering why, as a computer visionist, human look-alikes disturb me.
The ability to detect faces in imagery, be it still photos or video, is becoming a real bread-and-butter algorithm that industry is willing to pay for. I could write algorithms to detect motorcycles or camels in profile, but nobody is going to pay as much for these capabilities as they are for accurate face detection. In this case accuracy means two things: the ability to detect faces when they are there (true positives) and the ability to not be fooled when confronted with non-faces (avoiding false positives). As always, in order for me to get paid, the true positive rate has to be incredibly high and the false positive rate has to be impossibly low.
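To make the two rates concrete, here is a minimal sketch of how they could be computed from a labeled test set. The function name `detection_rates` and the tiny data set are made up for illustration; they are not taken from any real detector.

```python
from typing import Sequence, Tuple


def detection_rates(labels: Sequence[bool],
                    predictions: Sequence[bool]) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate).

    labels[i] is True when sub-image i really is a face;
    predictions[i] is True when the detector called it a face.
    """
    true_pos = sum(1 for y, p in zip(labels, predictions) if y and p)
    false_pos = sum(1 for y, p in zip(labels, predictions) if not y and p)
    faces = sum(1 for y in labels if y)
    non_faces = len(labels) - faces
    # Guard against empty classes so the sketch never divides by zero.
    tpr = true_pos / faces if faces else 0.0
    fpr = false_pos / non_faces if non_faces else 0.0
    return tpr, fpr


# Hypothetical example: 2 faces and 4 non-faces.
labels = [True, True, False, False, False, False]
predictions = [True, False, True, False, False, False]
tpr, fpr = detection_rates(labels, predictions)
print(f"true positive rate = {tpr}, false positive rate = {fpr}")
```

In this toy case the detector finds one of the two faces and is fooled by one of the four non-faces, which is nowhere near good enough to get paid.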
Let's now start with how many face detectors work. One popular method is known as the sliding window approach. For a given image, an exhaustive set of sub-images ranging over location and scale is extracted. Each sub-window is then classified as either being an image of a face or not. This is a two-class problem: the class of face images and the class of non-face images. Given this distinction, we then go through the process of defining a feature vector that is used to describe all images. Each of these features by itself will only have a limited ability to discriminate between the two image classes. One strategy is to look for features that are highly correlated for faces but are relatively uncorrelated for non-faces. We can use a variety of approaches, such as wide-margin classifiers, to find such descriptors. Given this representation, our model of the world comes down to the following: human faces generate highly correlated feature vector responses and non-faces generate essentially random feature vector responses.
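The sliding window idea can be sketched in a few lines. This is only an illustration under stated assumptions: the image is a plain grid of grayscale values, windows are square, and `mean_brightness` is a toy stand-in for a trained face classifier. All names and parameters (`sliding_windows`, `scale_step`, `stride_frac`, and so on) are hypothetical.

```python
from typing import Callable, Iterator, List, Tuple

Image = List[List[float]]      # grayscale image as rows of pixel intensities
Window = Tuple[int, int, int]  # (top, left, size) of a square sub-window


def sliding_windows(height: int, width: int, min_size: int,
                    scale_step: float = 1.5,
                    stride_frac: float = 0.5) -> Iterator[Window]:
    """Yield square windows over every location and scale that fits."""
    size = min_size
    while size <= min(height, width):
        stride = max(1, int(size * stride_frac))
        for top in range(0, height - size + 1, stride):
            for left in range(0, width - size + 1, stride):
                yield (top, left, size)
        # Grow the window; always advance by at least one pixel.
        size = max(size + 1, int(size * scale_step))


def crop(image: Image, window: Window) -> Image:
    """Extract the sub-image covered by a window."""
    top, left, size = window
    return [row[left:left + size] for row in image[top:top + size]]


def detect(image: Image, score_fn: Callable[[Image], float],
           threshold: float, min_size: int = 2) -> List[Window]:
    """Return every window whose score reaches the threshold."""
    height, width = len(image), len(image[0])
    return [w for w in sliding_windows(height, width, min_size)
            if score_fn(crop(image, w)) >= threshold]


def mean_brightness(patch: Image) -> float:
    """Toy stand-in for a trained classifier: average pixel intensity."""
    values = [p for row in patch for p in row]
    return sum(values) / len(values)


# Hypothetical 4x4 image with a bright 2x2 "face" in the top-left corner.
image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(detect(image, mean_brightness, threshold=0.9))
```

A real detector would replace `mean_brightness` with a classifier trained on a feature vector, but the exhaustive loop over location and scale has the same shape.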
OK, so I am sitting in my office and all of a sudden my manager comes in and tells me that a customer is reporting that our face detector is repeatedly producing feature vectors that are very close to, but not quite at, the face threshold for what they claim are non-faces. I say that this does not make sense: consistently getting that close with a non-face is like hitting the lottery over and over again. I get the queasy feeling that maybe my model for the non-face universe is wrong. If this is the case, we might have to pull back many of our planned moneymaking applications. Maybe they will hire that new kid fresh out of university with his new batch of secret sauce...
When Freud noticed that his loaf of bread cost 62 cents, it was 62 degrees in his office and his last appointment lasted exactly 62 minutes, he might have felt the uncanny because this isolated evidence seemed to contradict his assumption that such events are independent. When it really counts, we need to rely on our models of how the world works - when we observe evidence to the contrary, it just spooks us.

Greg Laden is a blogger, writer and independent scholar who occasionally teaches. He has a PhD from Harvard in Archaeology and Biological Anthropology, as well as a Master's degree in the same subjects. He is a biological anthropologist, but for many years before going to graduate school to study human evolution, he did archaeology in North America. He thinks of himself as a biologist who focuses on humans (past and present) and who uses archaeology as one of the tools of the trade. Greg blogs regularly on ScienceBlogs at http://www.scienceblogs.com/gregladen/.
Dr. Joseph J. Salvo attended Phillips Andover Academy, received his A.B. degree from Harvard University and his Master's and Ph.D. degrees in Molecular Biophysics and Biochemistry from Yale University. Dr. Salvo joined the GE Global Research Center in 1988. His early work focused on the development of genetically modified bacteria and fungi for the production of novel high-performance polymers. In the mid-1990s he turned his group's efforts towards developing large-scale internet-based sensing arrays to manage and oversee business systems. Most recently, he and his team have developed a number of complex decision engines that deliver customer value through system transparency and knowledge-based computational algorithms. Commercial business implementations of his work are currently active in Europe and Asia as well as North and South America.
Dr. Peter Tu received his undergraduate degree in Systems Design Engineering from the University of Waterloo, Canada, and his doctorate from Oxford University, England. In 1997, he joined the Visualization and Computer Vision Group at the GE Global Research Center in Niskayuna, NY. He has developed algorithms for the FBI Automatic Fingerprint Identification System. He is the principal investigator for the ReFace program, which has the goal of automatically computing the appearance of a person’s face from skeletal remains. Dr. Tu has also developed a number of algorithms for the precise measurement of specular and high-curvature objects. His current focus is the development of intelligent video algorithms for surveillance applications.



Comments
How do you feel when somebody says, "You look exactly like another person I know - you even walk like them, and have the same overall pitch and tone of voice!"
Then I think, "would I like to meet that person?" No. And why not? Well, what if I didn't like them? What if I met someone with many similarities to me and I didn't like them?
Ugh.
Posted by: yogi-one | November 16, 2009 11:34 AM
Like hearing your own voice recorded.
Posted by: Greg Laden | November 16, 2009 11:52 AM
Peter, do you think robots should be built to exist on the "other side" (the perfect side) of the uncanny valley (the Bladerunner model) or should they remain clearly non-human (the I, Robot movie model)?
Posted by: anon | November 17, 2009 10:45 AM
Anon,
My thought is that there has to be a reason for trying to develop BladeRunner-esque robots. Applications that come to mind would include: companions for the elderly and/or shut-ins, playmates for children and customer sales/relations.
This being said, enormous resources would need to be devoted to such a project and I am not sure that such applications warrant this level of investment.
Peter.
Posted by: Peter Tu | November 17, 2009 3:49 PM
Of course, it is possible that it has already been done.
Posted by: HF | November 17, 2009 9:22 PM
Like hearing your own voice recorded.
Hmm, I have to disagree on that one; I don't think it's related. The creepy thing about hearing your voice recorded is just that it sounds different than it does in your head (presumably because you hear your own voice through internal sound transmission, not just through your ears). Unlike the uncanny "not quite human" feeling, it's entirely dependent on the intellectual understanding that it is your voice. I doubt anyone could design an experiment where monkeys experience the "hearing your own voice" creepiness.
Posted by: Redshift | November 19, 2009 3:26 PM
Redshift: I may have to disagree with your disagreeing. The "uncanny valley" idea (in all its historical richness) is not strictly related to the "not quite human" case, but to a broader range of cases. Hearing your voice exactly as you think it would have sounded, and hearing your "voice" imitated by a mocking fellow human would define the "normal" terrain. Your voice that SHOULD sound like you think it should sound, but sounding different, occupies the valley.
Posted by: Greg Laden | November 20, 2009 9:20 AM