What part of the body do you listen with? The ear is the obvious answer, but it's only part of the story - your skin is also involved. When we listen to someone else speaking, our brain combines the sounds that our ears pick up with the sight of the speaker's lips and face, and subtle changes in air movements over our skin. Only by melding our senses of hearing, vision and touch do we get a full impression of what we're listening to.
When we speak, many of the sounds we make (such as the English "p" or "t") involve small puffs of air. These are known as "aspirations". We can't hear them, but they can greatly affect the sounds we perceive. For example, syllables like "ba" and "da" are simply versions of "pa" and "ta" without the aspirated puffs.
If you looked at the airflow produced by a puff, you'd see a distinctive pattern - a burst of high pressure at the start, followed by a short round of turbulence. This pressure signature is readily detected by our skin, and it can be easily faked by clever researchers like Bryan Gick and Donald Derrick from the University of British Columbia.
Gick and Derrick used an air compressor to blow small puffs of air, like those made during aspirated speech, onto the skin of blindfolded volunteers. At the same time, they heard recordings of different syllables - either "pa", "ba", "ta" or "da" - all of which had been standardised so they lasted the same time, were equally loud, and had the same frequency.
Gick and Derrick found that the fake puffs of air could fool the volunteers into "hearing" a different syllable to the one that was actually played. They were more likely to mishear "ba" as "pa", and to think that a "da" was a "ta". They were also more likely to correctly identify "pa" and "ta" sounds when they were paired with the inaudible puffs.
This deceptively simple experiment shows that our brain considers the tactile information picked up from our skin when it deciphers the sounds we're listening to. Even parts of our body that are relatively insensitive to touch can provide valuable clues. Gick and Derrick found that their fake air puffs worked if they were blown onto the sensitive skin on the back of the hand, which often picks up air currents that we ourselves create when we speak. But the trick also worked on the back of the neck, which is much less sensitive and unaffected by our own spoken breaths.
While many studies have shown that we hear speech more accurately when it's paired with visual info from a speaker's face, this study clearly shows that touch is important too. In some ways, the integration of hearing and touch isn't surprising - both senses involve detecting the movement of molecules vibrating in the world around us. Gick and Derrick suggest that their result might prove useful in designing aids for people who are hard of hearing.
Reference: Nature doi:10.1038/nature08572
"For example, syllables like "ba" and "da" are simply versions of "pa" and "ta" without the aspirated puffs."
This is a bit confusing, Ed. While it's true in English that ta and pa are more aspirated than da and ba, I think most linguists would agree that the salient difference is that the former feature "voiceless" consonants while the latter are "voiced".
On a related note, the sounds /p/ and /t/ are not always aspirated. Consider the difference between the /p/ in "spin" compared to "pickle". Place a sheet of paper in front of your face and you'll see the difference.
I haven't read the article abstract yet but, from your summary, it seems to imply that we use information on aspiration picked up by the skin to imply voicing, which is still really cool.
If you're wanting to get your head around it a bit more, do a search of "manner of articulation".
Usually a study like this fascinates me, but I don't know...I don't know if I buy this. I think the problem I have is that the only stimuli were those four phonemes.
Surely in regular conversation people differentiate between "pa" and "ba" sounds by semantic context--not gentle air pressure on my skin. How often is a person close enough to the speaker to be able to rely on air pressure?
Sorry about the patronising end to the last post ("If you're wanting to get your head around it a bit more..."!).
On a more relevant note, it occurred to me that this study also relates to the "McGurk effect" where vision influences speech perception.
If your readers are interested, they can read about it here:
Dougal - not at all. Don't mind it if people point out cases where I've gone all writerly at the expense of accuracy. I'll read the links you suggest.
As one of the authors of the article above, I want to thank Ed Yong for the quality of this article. We've seen many news stories about our research at this point, and this is the best and most accurate piece we've seen to date.
I also want to address the second comment because it involves a very important observation: the tiny puffs of air used in our experiment are about as unimportant as anyone can imagine - simulating the very least significant thing produced during a speech act. Yet our brains can still integrate that information in speech perception, and these are things that we do when we speak and perceive speech that we are not even aware of.
Here is an article with quotes from researchers who really understood the significance of this research:
That's interesting. I wonder how hearing is affected when watching tv where there are visual cues but no touch.
Normally when people are conversing they are standing some distance apart, they may not be facing one another, and most of their skin is covered by clothing. They may well also be in an environment where there are all sorts of other air currents. Yet they usually seem to understand one another. I am very skeptical as to whether this phenomenon plays any role in normal speech understanding. (Unlike the McGurk effect, which clearly does.)
It may be the reason why we get such unbearable goose pimples when someone scratches the blackboard. Someone had suggested it was a fear response and I was not at all convinced by that. Now this is much more likely.
As far as normal speech/sound is concerned, how can we then hear nuances of music over headphones?
"syllables like "ba" and "da" are simply versions of "pa" and "ta" without the aspirated puffs."
Not quite. The phonemes /b/, /d/ and /g/ are voiced - the vocal cords vibrate as we enunciate them. The phonemes /p/, /t/ and /k/ can be thought of as unvoiced versions - that is, as identical to /b/, /d/ and /g/ in what we do with lips, tongue and teeth, but different only in that the vocal cords do not vibrate.
(Actually there are very slight differences in enunciation between voiced consonants and their unvoiced equivalents, just to make things more complex for those of us who study phonology.)
In English, unvoiced plosive consonants - /p/, /t/ and /k/ - are usually aspirated, and the corresponding voiced sounds are not. One complication is that the /p/ in "Pit" is heavily aspirated, but the /p/ in "Spit" is very lightly aspirated - the same with "Top" and "Stop", "Kit" and "Skit".
Some languages, for instance Spanish, don't aspirate their plosives at all.
To expand on what Dougal said, the main linguistic difference between the phonemes /p/ and /b/ is a voiceless sound vs a voiced sound. However, when you add the voiced vowel 'a' (and vowels in English are always voiced), the voiceless phoneme /p/ becomes partially voiced and the syllables 'pa' and 'ba' become a lot more similar.
The way I read it, we use the tactile information to imply lack of voicing (which is still really cool). As to the fact that we can distinguish these sounds when we can't feel the aspiration, I think the point isn't that we "need" to feel the airflow, but that feeling the airflow gives us that extra bit of speech information that can make the difference in understanding. Could have applications in all sorts of situations where speech communication is compromised and accurate transmission is crucial.
As an interpreter for the Deaf, I can see how this "sense" of air through the skin works linguistically for those who cannot hear as well... Deaf people use a lot of mouth blowing, slow and fast hand movements and hand/arm impacts (which cause puffs of air around the signs)... signs look different when you attempt to sign in such a way that reduces either the sound or the force of those impacts and movements... change the air flow, and you've dampened the "language"...
very cool article - and very cool research!