Critical bands

From Wikipedia, the free encyclopedia

The term critical band, introduced by Harvey Fletcher in the 1940s, referred to the frequency bandwidth of the then loosely defined auditory filter. Since Georg von Békésy’s studies (1960), the term also refers literally to the specific area on the basilar membrane (an elongated thin sheet of fibers located in the inner ear, inside the cochlea) that goes into vibration in response to an incoming sine wave. Its length depends on the elastic properties of the membrane and on active feedback mechanisms operating within the hearing organ. Converging psychophysical and psychophysiological experiments indicate that the average length of the critical band is ~1 mm.

Psychophysiologically, beating and auditory roughness sensations can be linked to the inability of the auditory frequency-analysis mechanism to resolve inputs whose frequency difference is smaller than the critical bandwidth, and to the resulting instability or periodic “tickling” (Campbell and Greated 1987:61) of the mechanical system (the basilar membrane) that resonates in response to such inputs.

Critical bands are also closely related to auditory masking phenomena (i.e. reduced audibility of a sound signal in the presence of a second signal of higher intensity within the same critical band). Masking phenomena have wide implications, ranging from a complex relationship between loudness (perceptual frame of reference) and intensity (physical frame of reference) to sound compression algorithms.

1 What is an auditory filter?
2 Psychoacoustic tuning curves
3 Anatomy and Physiology of the Basilar Membrane
4 Relationship to masking
5 Normal and Impaired Auditory Filters
6 See also
7 Sources:

What is an auditory filter?

Filters are used in many aspects of audiology and psychoacoustics including the peripheral auditory system. A filter is a device which boosts certain frequencies whilst attenuating others. In particular, a band-pass filter allows a range of frequencies within the bandwidth to pass through whilst stopping those which are outside the cut-off frequencies (Gelfand 2004).

Figure 1: A band-pass filter showing the centre frequency (Fc), the lower (Fl) and upper (Fu) cut-off frequencies and the bandwidth. The upper and lower cut-off frequencies are defined as the points where the amplitude falls to 3 dB below the peak amplitude. The bandwidth is the distance between the upper and lower cut-off frequencies and is the range of frequencies passed by the filter.
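As a rough illustration of the cut-off definition in Figure 1, the sketch below computes the 3 dB points of an idealised second-order band-pass resonator. The resonator model, the quality factor `q` and the function names are assumptions made for this example; they are not taken from the cited sources and are not intended as a model of an auditory filter.

```python
import math


def gain_db(f_hz, fc_hz, q):
    """Gain in dB of an idealised second-order band-pass resonator (0 dB at fc)."""
    detune = q * (f_hz / fc_hz - fc_hz / f_hz)
    return -10.0 * math.log10(1.0 + detune * detune)


def cutoff_frequencies(fc_hz, q):
    """Closed-form lower and upper 3 dB cut-off frequencies (Fl, Fu)."""
    half = 1.0 / (2.0 * q)
    root = math.sqrt(1.0 + half * half)
    return fc_hz * (root - half), fc_hz * (root + half)


# Hypothetical example values: 1 kHz centre frequency, Q of 5.
fc_hz, q = 1000.0, 5.0
fl_hz, fu_hz = cutoff_frequencies(fc_hz, q)
bandwidth_hz = fu_hz - fl_hz

print(f"Fl = {fl_hz:.1f} Hz, Fu = {fu_hz:.1f} Hz, bandwidth = {bandwidth_hz:.1f} Hz")
print(f"gain at Fl = {gain_db(fl_hz, fc_hz, q):.2f} dB")
```

For this particular resonator the bandwidth works out to Fc divided by Q, so a higher Q means a narrower, more selective filter.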

The auditory system is thought to contain an array of overlapping band-pass filters known as ‘auditory filters’ (Fletcher 1940, cited in Moore 1998). They occur along the basilar membrane (BM) and increase the frequency selectivity of the cochlea (for anatomy and physiology see later) and therefore the listener’s discrimination between different sounds (Gelfand 2004; Moore 1998). They are non-linear and level-dependent, and their bandwidth increases from the apex to the base of the cochlea as the tuning on the BM changes from low to high frequency (Moore 1986; Moore 1998; Gelfand 2004). The bandwidth of the auditory filter, that is, the band of frequencies passed by the filter, is called the critical bandwidth, a concept first suggested by Fletcher (1940, cited in Gelfand 2004). If a signal and masker are presented simultaneously, only the masker frequencies falling within the critical bandwidth contribute to masking of the signal. The larger the critical bandwidth, the lower the signal-to-noise ratio (SNR) and the more the signal is masked.

Another concept associated with the auditory filter is the Equivalent Rectangular Bandwidth (ERB). The ERB shows the relationship between the auditory filter, frequency and the critical bandwidth. The ERB of an auditory filter is the bandwidth of a rectangular filter that passes the same amount of energy as the auditory filter it corresponds to, and it shows how the filter changes with input frequency (Gelfand 2004; Moore 1998). The ERB is calculated using the following equation:

ERB = 24.7 (4.37F + 1)

Where the ERB is in Hz and F is the centre frequency in kHz (Moore 1998).
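The equation can be evaluated directly. The short sketch below does so for a few centre frequencies; the function name `erb_hz` and the chosen frequencies are only illustrative.

```python
def erb_hz(centre_khz):
    """ERB in Hz for a centre frequency F given in kHz: 24.7 * (4.37 F + 1)."""
    return 24.7 * (4.37 * centre_khz + 1.0)


# Illustrative centre frequencies in kHz.
for f_khz in (0.1, 0.5, 1.0, 4.0, 8.0):
    print(f"F = {f_khz:4.1f} kHz -> ERB = {erb_hz(f_khz):6.1f} Hz")
```

At a centre frequency of 1 kHz this gives an ERB of about 133 Hz, and the value grows linearly with centre frequency.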

Figure 2: ERB related to centre frequency. The diagram shows the ERB averaged from many studies and how it increases with increasing centre frequency. Adapted from Moore (1998)

It is thought that each ERB is the equivalent of around 0.9 mm on the BM (Moore 1986, 1998). The ERB can be converted into a scale that relates to frequency and shows the position of the auditory filter along the BM. For example, an ERB number of 3.36 corresponds to a frequency at the apical end of the BM, whereas an ERB number of 38.9 corresponds to the base, and a value of 19.5 falls half-way between the two (Moore 1998).
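The article does not state how the ERB number is obtained. A minimal sketch follows, assuming the ERB-number formula that accompanies the ERB equation above in Glasberg & Moore (1990), namely 21.4 log10(4.37F + 1) with F in kHz. The function names are hypothetical, and the distance estimate simply multiplies the ERB number by the approximate 0.9 mm per ERB quoted above.

```python
import math

MM_PER_ERB = 0.9  # approximate BM distance per ERB, as quoted in the text


def erb_number(centre_khz):
    """ERB number for a centre frequency in kHz (assumed formula, see lead-in)."""
    return 21.4 * math.log10(4.37 * centre_khz + 1.0)


def centre_khz_from_erb_number(number):
    """Inverse of erb_number: centre frequency in kHz for a given ERB number."""
    return (10.0 ** (number / 21.4) - 1.0) / 4.37


# The three ERB numbers used as examples in the text.
for number in (3.36, 19.5, 38.9):
    f_khz = centre_khz_from_erb_number(number)
    distance_mm = MM_PER_ERB * number  # rough distance from the apical end
    print(f"ERB number {number:5.2f}: ~{f_khz:6.3f} kHz, ~{distance_mm:4.1f} mm along the BM")
```

Under these assumptions an ERB number of 3.36 corresponds to roughly 0.1 kHz and 38.9 to roughly 15 kHz, which is consistent with the apical and basal ends described above.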

Psychoacoustic tuning curves

The shape of an auditory filter is found by rotating a psychoacoustic tuning curve by 180 degrees, which changes its V-like shape into an inverted V. Psychoacoustic tuning curves are graphs which show a subject's threshold for a tone as the range of frequencies present in the masker is varied (Glasberg & Moore 1990).

Psychoacoustic tuning curves can be measured using the notched-noise method. This form of measurement is time-consuming: it can take around 30 minutes to find each masked threshold (Nakaichi et al 2003). In the notched-noise method the subject is presented with a notched noise as the masker and a sinusoid (pure tone) as the signal. Notched noise is used as a masker to prevent the subject hearing beats, which occur if a sinusoidal masker is used (Moore 1986). Notched noise is noise of a certain bandwidth with a notch around the frequency of the signal which the subject is trying to detect. The width of the notch is changed and the masked threshold for the sinusoid is measured at each width. The masked thresholds are measured using simultaneous masking, in which the signal is played to the subject at the same time as the masker rather than after it.
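The idea behind the method can be sketched numerically: as the notch widens, less noise passes through the filter centred on the signal, so the masked threshold falls. The example below assumes a symmetric rounded-exponential filter shape, W(g) = (1 + pg) exp(-pg), where g is the deviation from the centre frequency divided by the centre frequency. The filter shape, the slope `P_SLOPE`, the outer noise edge `outer_g` and the function names are assumptions chosen for illustration, not values given in this article.

```python
import math

P_SLOPE = 25.0  # hypothetical filter slope parameter; larger means sharper tuning


def roex_tail(p, g):
    """Integral of (1 + p*x) * exp(-p*x) for x from g to infinity."""
    return (2.0 + p * g) * math.exp(-p * g) / p


def noise_through_filter(p, notch_g, outer_g=0.8):
    """Relative noise power passed when the noise on each side of the
    signal spans normalised deviations notch_g..outer_g."""
    one_side = roex_tail(p, notch_g) - roex_tail(p, outer_g)
    return 2.0 * one_side


def threshold_shift_db(p, notch_g):
    """Predicted change in masked threshold relative to the no-notch case."""
    ratio = noise_through_filter(p, notch_g) / noise_through_filter(p, 0.0)
    return 10.0 * math.log10(ratio)


notch_widths = (0.0, 0.1, 0.2, 0.3, 0.4)
shifts_db = [threshold_shift_db(P_SLOPE, g) for g in notch_widths]
for g, shift in zip(notch_widths, shifts_db):
    print(f"notch half-width {g:.1f} x Fc -> threshold change {shift:6.1f} dB")
```

Fitting a filter shape to measured thresholds runs this logic in reverse: the slope parameter is adjusted until the predicted thresholds match the measured ones.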

To get a true representation of the auditory filters in one subject, many psychoacoustic tuning curves need to be measured with the signal at different frequencies. For each psychoacoustic tuning curve, at least five but preferably between thirteen and fifteen thresholds have to be measured, each with a different notch width (Nakaichi et al 2003). A large number of thresholds is also needed because the auditory filters are asymmetrical, so thresholds should also be measured with the notch placed asymmetrically about the frequency of the signal (Glasberg & Moore 1990). Because so many measurements are needed, finding the shape of a person's auditory filters takes a very long time. To reduce this time, the ascending method can be used to find the masked thresholds. With the ascending method each threshold takes around two minutes to find, so the time needed to determine the shape of the filter is reduced dramatically (Nakaichi et al 2003). This is because the threshold recorded is the level at which the subject first hears the tone, rather than the level at which they respond to the stimulus a certain percentage of the time.

Anatomy and Physiology of the Basilar Membrane

The human ear is made up of three areas: the outer, middle and inner ear. Within the inner ear sits the cochlea, a snail-shaped structure which enables the transmission of sound via a sensorineural route, rather than through a conductive pathway (Plewes 2006). The cochlea is a complex structure consisting of three fluid-filled compartments. The scala vestibuli and scala media are separated by Reissner’s membrane, whereas the scala media and scala tympani are divided by the basilar membrane (BM) (Plewes 2006). The diagram below illustrates the layout of the compartments and their divisions (Gelfand 2004):

Figure 3: Cross-section through the cochlea, showing the different compartments (as described above)

The BM widens as it progresses from base to apex. The base (the narrowest part) therefore has a greater stiffness than the apex (Gelfand 2004). This means that the amplitude of a wave travelling along the basilar membrane varies as it travels through the cochlea (Plewes 2006). When a vibration is carried through the cochlea, the fluid within the three compartments causes the BM to respond in a wave-like manner. This wave is referred to as a ‘travelling wave’: the basilar membrane does not simply vibrate as one unit, but instead carries a wave that moves along it from the base towards the apex.

When a sound is presented to the human ear, the time taken for the wave to travel through the cochlea is only 5 milliseconds (Plewes 2006).

When a low-frequency travelling wave passes through the cochlea, it grows in amplitude gradually, reaches a maximum and then decays almost immediately. The place of maximum vibration along the cochlea depends upon the frequency of the presented stimulus: lower frequencies mostly stimulate the apex, whereas higher frequencies stimulate the base of the cochlea. This attribute of the physiology of the basilar membrane can be illustrated in the form of the ‘Place-Frequency Map’ (Blatrix 2003).

Figure 4: The Basilar membrane showing the change in frequency from base to apex
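The article gives no formula for the place-frequency map. One commonly used approximation is Greenwood's place-frequency function. The sketch below uses the constants usually quoted for the human cochlea (165.4, 2.1 and 0.88); these constants and the function name are assumptions for illustration and are not taken from the sources cited here.

```python
def greenwood_hz(fraction_from_apex):
    """Approximate characteristic frequency in Hz at a relative position
    along the BM (0.0 = apex, 1.0 = base), using Greenwood's function
    with commonly quoted human constants (assumed, see lead-in)."""
    if not 0.0 <= fraction_from_apex <= 1.0:
        raise ValueError("fraction_from_apex must lie between 0 and 1")
    return 165.4 * (10.0 ** (2.1 * fraction_from_apex) - 0.88)


# Sample the map at five evenly spaced positions from apex to base.
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"{fraction:4.2f} of the way from apex to base: ~{greenwood_hz(fraction):8.0f} Hz")
```

The output rises from about 20 Hz at the apex to about 20 kHz at the base, in line with the low-to-high ordering described above.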

The basilar membrane supports the Organ of Corti, which sits within the scala media (Gelfand 2004). The Organ of Corti comprises both outer and inner hair cells. There are approximately 15,000 to 16,000 of these hair cells in one ear (Plewes 2006). Outer hair cells have stereocilia projecting towards the tectorial membrane, which sits above the Organ of Corti. Stereocilia respond to movement of the tectorial membrane when a sound causes vibration through the cochlea. When this occurs, the stereocilia separate and a channel is formed which allows chemical processes to take place; eventually the signal reaches the eighth nerve, followed by processing in the brain (Plewes 2006).

Relationship to masking

Auditory filters are closely associated with masking, both in the way they are measured and in the way they work in the auditory system. As described previously, the critical bandwidth of the filter increases with increasing frequency; in addition, the filter becomes more asymmetrical with increasing level.

Figure 5: Asymmetry of the auditory filter. The diagram shows the increasing asymmetry of the auditory filter with increasing input level. The highlighted filters show the shape for a 90 dB input level (pink) and a 20 dB input level (green). Diagram adapted from Moore and Glasberg (cited in Gelfand 2004).

These two properties of the auditory filter are thought to contribute to the upward spread of masking, that is, low frequencies mask high frequencies better than the reverse. Increasing the level makes the low-frequency slope of the filter shallower, so more low-frequency energy passes through filters centred on higher frequencies, and low frequencies mask high frequencies more than they do at a lower input level.

The auditory filter can reduce the effects of a masker when listening to a signal in background noise, by means of off-frequency listening. This is possible when the centre frequency of the masker is different from that of the signal. In most situations the listener chooses to listen ‘through’ the auditory filter that is centred on the signal; however, if there is a masker present this may not be appropriate. The auditory filter centred on the signal may also contain a large amount of masker, causing the SNR of the filter to be low and decreasing the listener's ability to detect the signal. If the listener instead listens through a slightly different filter, which still contains a substantial amount of signal but less masker, the SNR is increased, allowing the listener to detect the signal (Gelfand 2004).

Figure 6a and 6b: Off-frequency listening. Diagram 6a shows the auditory filter centred on the signal and how some of the masker falls within that filter, which results in a low SNR. Diagram 6b shows the next filter along the BM, which is not centred on the signal but contains a substantial amount of that signal and less masker. This reduces the effect of the masker by increasing the SNR. Diagram adapted from Gelfand (2004).

The above applies to the power-spectrum model of masking. In general this model relies on the auditory system containing the array of auditory filters and choosing the filter with the signal at its centre or with the best SNR. Only masker that falls into the auditory filter contributes to masking and the person’s threshold for hearing the signal is determined by that masker (Moore 1998).
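A minimal numerical sketch of the power-spectrum model and of off-frequency listening follows. It assumes a symmetric rounded-exponential filter shape whose width is tied to the ERB equation given earlier, a flat band of masking noise lying below the signal, and arbitrary power units. The filter shape, all numerical values and the function names are assumptions for illustration only.

```python
import math


def erb_hz(fc_hz):
    """ERB in Hz for a centre frequency in Hz, from 24.7 * (4.37 F + 1) with F in kHz."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def filter_weight(f_hz, fc_hz):
    """Assumed rounded-exponential weighting (1 + p*g) * exp(-p*g)."""
    p = 4.0 * fc_hz / erb_hz(fc_hz)  # slope chosen so the filter's ERB matches erb_hz
    g = abs(f_hz - fc_hz) / fc_hz
    return (1.0 + p * g) * math.exp(-p * g)


def masker_power(fc_hz, lo_hz, hi_hz, density=1.0, steps=1000):
    """Power of a flat noise band lo..hi passed by the filter (midpoint rule)."""
    width = (hi_hz - lo_hz) / steps
    total = sum(filter_weight(lo_hz + (i + 0.5) * width, fc_hz) for i in range(steps))
    return density * total * width


def snr_db(fc_hz, signal_hz, signal_power, lo_hz, hi_hz):
    """SNR at the output of the filter centred on fc_hz."""
    passed_signal = signal_power * filter_weight(signal_hz, fc_hz)
    return 10.0 * math.log10(passed_signal / masker_power(fc_hz, lo_hz, hi_hz))


# Hypothetical scenario: 1000 Hz tone, masker noise from 800 to 980 Hz.
on_frequency_db = snr_db(1000.0, 1000.0, 100.0, 800.0, 980.0)
off_frequency_db = snr_db(1050.0, 1000.0, 100.0, 800.0, 980.0)

print(f"filter centred on the signal (1000 Hz): SNR = {on_frequency_db:.1f} dB")
print(f"filter centred above it (1050 Hz):      SNR = {off_frequency_db:.1f} dB")
```

In this scenario the filter centred slightly above the signal loses some signal but rejects more of the masker, so its output SNR is higher than that of the filter centred on the signal.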

Normal and Impaired Auditory Filters

In a ‘normal’ ear the auditory filter has a shape similar to the one shown below. This graph reflects the frequency selectivity and the tuning of the basilar membrane.

Figure 7: The auditory filter of a 'normal' cochlea

The tuning of the basilar membrane is due to its mechanical structure. At its base the basilar membrane is narrow and stiff and is most responsive to high frequencies. At the apex, however, it is wide and flexible and is most responsive to low frequencies. Different sections of the basilar membrane therefore vibrate depending on the frequency of the sound, and each section gives a maximum response at its own particular frequency.

In an impaired ear, however, the auditory filter has a different shape from that of a ‘normal’ ear (Moore 2003).

Figure 8: The auditory filter of an impaired cochlea

The auditory filter of an impaired ear is flatter and broader than that of a normal ear. This is because the frequency selectivity and the tuning of the basilar membrane are reduced when the outer hair cells are damaged. When only the outer hair cells are damaged, the filter is broader on the low-frequency side. When both the outer and inner hair cells are damaged, which is less common, the filter is broader on both sides. Because the broadening is mainly on the low-frequency side of the filter, susceptibility to low-frequency masking, i.e. the upward spread of masking described above, is increased (Moore 1998).

See also

As an example, the perceived loudness of a very narrow-band noise-source with a constant sound pressure level initially remains constant as the noise-bandwidth is gradually increased. Beyond a certain noise-bandwidth, called the critical bandwidth, the loudness begins to increase significantly.

Sources:

Blatrix, S. (2003) available from http://www.iurc.montp.inserm.fr/cric/audition/english/cochlea/fcochlea.htm

Gelfand, S. A., (2004), Hearing: an introduction to psychological and physiological acoustics, fourth edition, Marcel Dekker: New York.

Glasberg, B. R., Moore, B. C. J., (1990) Derivation of auditory filter shapes from notched-noise data. Hear. Res. 47:103-138

Moore, B. C. J., (1986), Parallels between frequency selectivity measured psychophysically and in cochlear mechanics. Scand Audiol Suppl 1986;25:139–52

Moore, B. C. J., (1998) Cochlear hearing loss, Whurr Publishers Ltd.: London

Moore, B. C. J., (2003) An introduction to the psychology of hearing, 5th ed., Academic Press: San Diego, CA.

Nakaichi, T., Watanuki, K., Sakamoto, S., (2003) A simplified measurement method of auditory filters for hearing-impaired listeners. Acoust. Sci. and Tech. 24(6):365-375

Plewes, K. (2006) Anatomy and physiology of the ear [online] Available from: http://www.coastnet.com/~mcneill/anatomy/html

Backus, J. (1977). The Acoustical Foundations of Music (2nd ed). New York: W.W. Norton and Company.
Békésy, G. von. (1960) [1989]. Experiments in Hearing. New York: Acoustical Society of America Press.
Campbell, M. and Greated, C. (1987). The Musician’s Guide to Acoustics. New York: Schirmer Books.
Vassilakis, P.N. (2005). Auditory roughness as means of musical expression. Selected Reports in Ethnomusicology, 12: 119-144.
Vassilakis, P.N. and Fitz, K. (2007). SRA: A Web-based Research Tool for Spectral and Roughness Analysis of Sound Signals. Supported by a Northwest Academic Computing Consortium grant to J. Middleton, Eastern Washington University.