Discover why you have trouble hearing in a crowd.
Have you ever wondered why you can’t hear what you want to hear when you’re in a crowded space? Before I explain, let me tell you a bit about how we hear complex sound.
When we hear a complex sound, our hearing system – from the ears to the brain – analyses the sound into its component parts, and then the brain groups the components together into meaningful events.
When a complex sound reaches the ear, it arrives all mixed up as a single combination of pressure waves, and that combined wave represents all sorts of activities occurring in the environment at that instant.
Suppose you are talking to someone and a car goes by. How does your brain know which sound is which, which direction each has come from, or who is speaking? This is more complicated when the sounds you don’t want to hear are also voices, and more complicated still when they are nearby voices carrying meaningful conversation.
Have you ever stopped to think what a smart job we do in working out what the person we want to hear is saying? We manage to focus our auditory attention on a particular stimulus while filtering out a range of other competing sounds and conversations. Isn’t that clever? We use information from both ears to do this. Each of your ears receives a slightly different complex sound wave at any one instant, and each wave carries a relatively complete mixture of the auditory events emanating from an enormous variety of sound sources. The brain has to assign all these components to their events.
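To give a flavour of how much information those two slightly different waves carry, here is a minimal Python sketch of one binaural cue: estimating a sound’s direction from the tiny difference in its arrival time at the two ears. This is only an illustration under idealised assumptions (a single far-field source, no echoes), and the sample rate, ear separation and speed of sound are assumed round numbers, not a model of real auditory processing.

```python
import numpy as np

# Assumed constants for the illustration, not physiological measurements.
SAMPLE_RATE = 44100      # samples per second
EAR_SEPARATION = 0.21    # metres between the two ears (approximate)
SPEED_OF_SOUND = 343.0   # metres per second in air

def estimate_azimuth(left: np.ndarray, right: np.ndarray) -> float:
    """Estimate a source's direction (degrees) from the interaural time
    difference between the two ear signals. Negative angles mean the
    source is to the listener's left."""
    # Cross-correlate the ear signals; the peak's lag is the delay
    # (in samples) between the two copies of the sound.
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)
    itd = lag / SAMPLE_RATE  # interaural time difference, seconds
    # Far-field approximation: sin(angle) = itd * c / ear separation.
    ratio = np.clip(itd * SPEED_OF_SOUND / EAR_SEPARATION, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))

# Toy demo: the same noise burst reaches the right ear 20 samples late,
# as if the source were off to the listener's left.
rng = np.random.default_rng(0)
burst = rng.standard_normal(2048)
delay = 20
left_ear = burst
right_ear = np.concatenate([np.zeros(delay), burst[:-delay]])
print(f"Estimated azimuth: {estimate_azimuth(left_ear, right_ear):.1f} degrees")
# Prints roughly -48 degrees: off to the left, as expected.
```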
Let’s think about this in terms of vision instead. You see a scene before you. The eye detects the colours and light, but the brain organises these cues into meaningful patterns and forms. With two eyes and the brain, we can work out things like how quickly an object is approaching, or how far away it is.
The cocktail party effect

Sorting out the sound we want to hear from background conversation is still sometimes called “the cocktail party effect”. The term was coined in the 1950s by the British cognitive scientist Colin Cherry, who did seminal research into how we hear what we want to hear against other competing conversations. Cherry’s work showed that many factors affect our ability to separate sounds from background noise, including the direction the sound is coming from, its pitch, the rate of speech, and whether there is meaning in the competing noise.
The cocktail party itself was invented in the 1920s by Alec Waugh, brother of the writer Evelyn Waugh, who felt the need for a pleasant interlude before a dinner party in fashionable London. It was still popular in the 1950s when Cherry was doing his research, but cocktail parties dropped from fashion in the ’70s and ’80s. We are now seeing a comeback, so the term has become a relevant descriptor again, although the phenomenon never went away.
Auditory scene analysis
The ability to hear in a cocktail-party-type environment is a subset of a broader phenomenon called auditory scene analysis, a term coined by the Canadian psychologist Albert Bregman. In his laboratory, he and his team research how we understand the sounds around us and extract their meaning. Auditory scene analysis describes how we decompose complex mixtures of sound so that the brain has a separate mental description of each individual sound in the mixture.
Quoting from Bregman’s own website, his team’s research has found:
“A number of auditory phenomena have been related to the grouping of sounds into auditory streams. They include speech perception, the perception of the order and other temporal properties of sound sequences, the combining of evidence from the two ears, the perception of numerosity, the perception of patterns by infants, the detection of patterns embedded in other sounds, the perception of simultaneous “layers” of sounds (e.g., in music), the perceived continuity of sounds through interrupting noise, perceived timbre and rhythm, and the perception of tonal sequences.” (Reviewed in Bregman, 1990/1994)
What an awesome list.
The brain has to deal with two different types of analysis. At any one instant, the sounds that are all mixed up together have to be separated into different sounds. This is called simultaneous integration: essentially, the brain is grouping components with similar patterns. As well as this, the sounds that follow one another to form patterns, such as words, have to be identified. This is called sequential integration: here, the brain is grouping sounds that share features over a period of time. The brain separates out each sequential pattern and latches onto it, so that the separate sources are streamed. In effect, the ear and brain set up their own streaming channels first, as the toy sketch below illustrates.
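To make sequential integration concrete, here is a toy Python sketch of one plausible grouping rule (my own illustration, not Bregman’s model): each tone in a sequence joins the stream whose most recent tone is closest in pitch, provided the jump is small enough, and otherwise starts a new stream. The 3-semitone threshold is an arbitrary assumption, not a measured perceptual limit.

```python
PITCH_THRESHOLD = 3.0  # assumed maximum pitch jump (semitones) within a stream

def group_into_streams(tones):
    """Greedily assign each tone (a pitch in semitone units) to the
    stream whose most recent tone is nearest in pitch, if it is within
    the threshold; otherwise start a new stream."""
    streams = []  # each stream is a list of pitches, in order of arrival
    for pitch in tones:
        best, best_gap = None, PITCH_THRESHOLD
        for stream in streams:
            gap = abs(stream[-1] - pitch)
            if gap <= best_gap:
                best, best_gap = stream, gap
        if best is not None:
            best.append(pitch)
        else:
            streams.append([pitch])
    return streams

# A classic alternating high/low tone sequence: listeners tend to hear
# two separate streams, not one zig-zagging melody.
sequence = [60, 72, 61, 73, 59, 71, 60, 72]  # MIDI-style note numbers
for i, stream in enumerate(group_into_streams(sequence), start=1):
    print(f"Stream {i}: {stream}")
# Stream 1: [60, 61, 59, 60]
# Stream 2: [72, 73, 71, 72]
```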
So, back to explaining why you might be having trouble hearing in background noise. The two functions of simultaneous integration and sequential integration are not entirely separate, and both rely on underlying mechanisms that combine the ear’s ability to code sound with sensitive and accurate neural function, registering differences in the pitch and timing of sound signals. Unfortunately, our ability at both these tasks gets worse as we age: the ear’s coding becomes less precise and neural timing becomes less accurate. So it gets harder to hear in background noise, even if our hearing acuity remains pretty good.
There’s some good news, though. The hearing system is quite good at using predictability to identify sounds. This ability has not been shown to deteriorate with age, and indeed there are some suggestions that it may even get better.
It’s nice to find that we can go to the cocktail party, armed with our life knowledge and experience as an asset.
As a special needs teacher in the late ’70s and early ’80s, I am grateful for the work on auditory figure-ground perception pioneered by Eddie Keir of Melbourne.
As a user of hearing aids (Blamey Saunders, of course), I’m wondering if hearing aids that communicate with each other could provide additional information that would aid users in noisy environments. For instance, could they calculate distance through simple trigonometry and thus filter out conversations a few tables away when I want to focus on the conversation of the person nearby? Could this enhance the current practice of reducing background noise by switching off the rear microphone?
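The geometry behind Andrew’s suggestion can be sketched in a few lines of Python. This is purely an illustration under idealised assumptions: it presumes each device somehow knows its bearing to the talker (which real aids would have to estimate from noisy microphone timing differences), and the 0.17 m baseline and example bearings are made-up values.

```python
import math

def distance_to_talker(bearing_left_deg: float, bearing_right_deg: float,
                       baseline_m: float = 0.17) -> float:
    """Triangulate the distance (metres) from the left device to a sound
    source, given the bearing measured at each of two linked devices a
    known distance apart. Bearings are measured from the line joining
    the two devices. All values here are hypothetical."""
    a = math.radians(bearing_left_deg)
    b = math.radians(bearing_right_deg)
    # The three angles of the triangle must sum to 180 degrees.
    c = math.pi - a - b
    if c <= 0:
        raise ValueError("bearings do not form a valid triangle")
    # Law of sines: the side opposite each angle is proportional to its sine.
    # The side from the left device to the source is opposite angle b.
    return baseline_m * math.sin(b) / math.sin(c)

# A talker slightly off-centre, about a metre away (hypothetical bearings).
print(f"{distance_to_talker(84.0, 86.0):.2f} m")  # prints 0.98 m
```

A talker across the room would make the two bearings nearly parallel, so even a rough estimate like this could, in principle, separate a nearby conversation from a distant one.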
Good comment, Andrew. We’ve actually done some research in this area. Happy to chat more offline.