Psychoacoustics

Psychoacoustics is the scientific study of sound perception. More specifically, it is the branch of science studying the psychological and physiological responses associated with sound (including speech and music). It can be further categorized as a branch of psychophysics.

1 Background
2 Limits of perception
3 Sound localization
4 Masking effects
5 Missing fundamental
6 Software
7 Music
8 Applied psychoacoustics
9 See also
- 9.1 Related fields
- 9.2 Psychoacoustic topics
10 References
11 External links

Background

Hearing is not a purely mechanical phenomenon of wave propagation; it is also a sensory and perceptual event. When a person hears something, it arrives at the ear as a mechanical sound wave traveling through the air, but within the ear it is transformed into neural action potentials. These nerve pulses then travel to the brain, where they are perceived. Hence, in many problems in acoustics, such as audio processing, it is advantageous to take into account not just the mechanics of the environment, but also the fact that both the ear and the brain are involved in a person's listening experience.

The inner ear, for example, does significant signal processing in converting sound waveforms into neural stimuli, so certain differences between waveforms may be imperceptible.^[1] Data compression techniques, such as MP3, make use of this fact.^[2] In addition, the ear has a nonlinear response to sounds of different loudness levels. Telephone networks and audio noise reduction systems make use of this fact by nonlinearly compressing data samples before transmission, and then expanding them for playback.^[3] Another effect of the ear's nonlinear response is that sounds that are close in frequency produce phantom beat notes, or intermodulation distortion products.^[4]
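The compress-then-expand idea can be sketched in a few lines. The text does not name a particular companding law; the example below assumes μ-law with μ = 255, one scheme widely used in telephony, and the function names are illustrative only.

```python
import math

MU = 255.0  # assumed companding parameter (mu-law as used in telephony)


def mu_law_compress(x: float, mu: float = MU) -> float:
    """Map a sample in [-1, 1] to [-1, 1], stretching small amplitudes."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: float = MU) -> float:
    """Inverse of mu_law_compress: recover the original sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


# Quiet samples occupy a much larger share of the compressed range.
for sample in (0.001, 0.01, 0.1, 1.0):
    compressed = mu_law_compress(sample)
    print(sample, round(compressed, 3), round(mu_law_expand(compressed), 6))
```

Because quiet samples are spread over more of the output range before quantization, quantization noise is pushed toward loud passages, where the ear is less likely to notice it.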

Limits of perception

The human ear can nominally hear sounds in the range 20 Hz to 20,000 Hz (20 kHz). The upper limit tends to decrease with age; most adults are unable to hear above 16 kHz. The lowest frequency that has been identified as a musical tone is 12 Hz under ideal laboratory conditions.^[5] Tones between 4 and 16 Hz can be perceived via the body's sense of touch.

Frequency resolution of the ear is 3.6 Hz within the octave of 1,000–2,000 Hz. That is, changes in pitch larger than 3.6 Hz can be perceived in a clinical setting.^[5] However, even smaller pitch differences can be perceived through other means. For example, when two tones of nearly equal frequency sound together, their interference can often be heard as a slow fluctuation at the difference frequency. This effect of the continuously changing phase relationship between the two tones is known as beating.
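Beating follows from a standard trigonometric identity: the sum of two equal-amplitude sine tones equals a tone at the average frequency multiplied by a slowly varying envelope, whose loudness rises and falls at the difference frequency. The following is a small numerical check of that identity; the function names are illustrative.

```python
import math


def two_tone(t: float, f1: float, f2: float) -> float:
    """Sum of two equal-amplitude sine tones at time t (seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)


def envelope_form(t: float, f1: float, f2: float) -> float:
    """Same signal written as slow envelope times fast carrier."""
    envelope = 2 * math.cos(math.pi * (f1 - f2) * t)
    carrier = math.sin(math.pi * (f1 + f2) * t)
    return envelope * carrier


def beat_frequency(f1: float, f2: float) -> float:
    """Rate (Hz) at which the loudness of the combined tone fluctuates."""
    return abs(f1 - f2)


# Two tones 2 Hz apart: closer than the 3.6 Hz resolution quoted above,
# yet the difference shows up as two loudness swells per second.
print(beat_frequency(1000.0, 1002.0))
```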

The semitone scale used in Western musical notation is not a linear frequency scale but a logarithmic one. Other scales, such as the mel scale and the Bark scale, have been derived directly from experiments on human hearing perception; they are used in studying perception, but not usually in musical composition. These scales are approximately logarithmic in frequency at the high-frequency end, but nearly linear at the low-frequency end.
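As an illustration, the three scales can be written as simple conversion functions. The formulas below are common published approximations (the O'Shaughnessy form for mel and the Zwicker and Terhardt form for Bark); several variants exist, the text does not specify which one it has in mind, and the 440 Hz reference pitch is an assumption.

```python
import math


def semitones_to_hz(n: float, ref_hz: float = 440.0) -> float:
    """Frequency n equal-tempered semitones above the reference pitch."""
    return ref_hz * 2.0 ** (n / 12.0)


def hz_to_mel(f_hz: float) -> float:
    """Mel value of a frequency (one common approximation)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def hz_to_bark(f_hz: float) -> float:
    """Critical-band rate in Bark (Zwicker and Terhardt approximation)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


for f in (100.0, 200.0, 1000.0, 2000.0, 8000.0, 16000.0):
    print(f, round(hz_to_mel(f), 1), round(hz_to_bark(f), 2))
```

Doubling a low frequency nearly doubles its mel value, while doubling a high frequency adds a roughly constant amount, which is the linear-then-logarithmic behavior described above.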

The intensity range of audible sounds is enormous. Human eardrums are sensitive only to variations in sound pressure, but can detect pressure changes from as small as a few micropascals to greater than 1 bar. For this reason, sound pressure level is also measured logarithmically, with all pressures referenced to 20 µPa (or 1.97385×10⁻¹⁰ atm). The lower limit of audibility is therefore defined as 0 dB, but the upper limit is not as clearly defined. It is more a question of the level at which the ear will be physically harmed or at which noise-induced hearing loss becomes possible.
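The logarithmic measure referred to here is the standard sound pressure level formula, 20·log10(p / p_ref), with p_ref = 20 µPa. A minimal sketch (the function name is illustrative):

```python
import math

P_REF_PA = 20e-6  # reference pressure from the text: 20 micropascals


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB relative to 20 µPa."""
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


print(spl_db(20e-6))   # the reference pressure itself
print(spl_db(1.0))     # 1 pascal
print(spl_db(1.0e5))   # 1 bar, the upper figure mentioned in the text
```

The span from the reference pressure to 1 bar covers a pressure ratio of 5×10⁹, which the logarithm compresses to just under 194 dB.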

A more rigorous exploration of the lower limits of audibility shows that the minimum threshold at which a sound can be heard is frequency-dependent. By measuring this minimum intensity for test tones of various frequencies, a frequency-dependent absolute threshold of hearing (ATH) curve may be derived. Typically, the ear shows a peak of sensitivity (i.e., its lowest ATH) between 1 kHz and 5 kHz, though the threshold changes with age, with older ears showing decreased sensitivity above 2 kHz.
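A measured ATH curve is often replaced in practice by a closed-form fit. The sketch below uses Terhardt's widely cited approximation for a young listener with good hearing; it is one fit among several and is not taken from the text above.

```python
import math


def ath_db_spl(f_hz: float) -> float:
    """Approximate absolute threshold of hearing in dB SPL (Terhardt's fit)."""
    k = f_hz / 1000.0  # frequency in kHz
    return (
        3.64 * k ** (-0.8)                        # rising threshold at low frequencies
        - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)   # sensitivity dip near 3.3 kHz
        + 1e-3 * k ** 4                           # steep rise toward high frequencies
    )


# Locate the most sensitive frequency on a 10 Hz grid across the audible range.
grid = range(20, 20001, 10)
most_sensitive_hz = min(grid, key=ath_db_spl)
print(most_sensitive_hz, round(ath_db_spl(most_sensitive_hz), 2))
```

The minimum of this fit falls inside the 1 kHz to 5 kHz band of peak sensitivity described above.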

The ATH is the lowest of the equal-loudness contours. Equal-loudness contours indicate the sound pressure levels (in dB), over the range of audible frequencies, that are perceived as being of equal loudness. Equal-loudness contours were first measured by Fletcher and Munson at Bell Labs in 1933 using pure tones reproduced via headphones, and the data they collected are called Fletcher-Munson curves. Because subjective loudness was difficult to measure, the Fletcher-Munson curves were averaged over many subjects.

Robinson and Dadson refined the process in 1956 to obtain a new set of equal-loudness curves for a frontal sound source measured in an anechoic chamber. The Robinson-Dadson curves were standardized as ISO 226 in 1986. In 2003, ISO 226 was revised using equal-loudness data collected from 12 international studies.

Sound localization

Main article: Sound localization

Sound localization is the process of determining the location of a sound source. The brain utilizes subtle differences in intensity, spectral, and timing cues to allow us to localize sound sources.^[6] Localization can be described in terms of three-dimensional position: the azimuth or horizontal angle, the zenith or vertical angle, and the distance (for static sounds) or velocity (for moving sounds).^[7] Localization rests on the slight differences in loudness, tone, and timing between the two ears. Humans, like most four-legged animals, are adept at detecting direction in the horizontal plane, but less so in the vertical, because the ears are placed symmetrically. Some species of owls have their ears placed asymmetrically and can detect sound in all three planes, an adaptation for hunting small mammals in the dark.^[8]
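To give a sense of scale for the timing cue, the sketch below estimates the interaural time difference with Woodworth's spherical-head formula, ITD = (a / c)(θ + sin θ). This model is an assumption brought in for illustration and is not described in the text; the head radius and speed of sound are typical assumed values.

```python
import math

HEAD_RADIUS_M = 0.0875     # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 °C


def interaural_time_difference(azimuth_deg: float) -> float:
    """Arrival-time difference (seconds) between the ears for a distant source.

    azimuth_deg is measured from straight ahead; the formula is intended
    for angles between 0 and 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


for angle in (0, 15, 45, 90):
    print(angle, round(interaural_time_difference(angle) * 1e6), "microseconds")
```

Even for a source directly to one side, the difference under these assumptions is well under a millisecond, which illustrates how subtle the timing cue is.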

Masking effects

Main article: Auditory masking

In some situations an otherwise clearly audible sound can be masked by another sound. For example, conversation at a bus stop can be completely impossible if a loud bus is driving past. This phenomenon is called masking. A weaker sound is masked if it is made inaudible in the presence of a louder sound.

Missing fundamental

Main article: Missing fundamental

A harmonic series of pitches at 2×f, 3×f, 4×f, 5×f, etc. gives human hearing the psychoacoustic impression that the pitch 1×f is present, even though no component at that frequency is actually sounded.
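One way to see why the impression arises is that the combined waveform still repeats once every 1/f seconds. The sketch below builds a tone from harmonics 2 to 5 of 100 Hz and finds its repetition period by autocorrelation. Autocorrelation is used here only as a simple illustrative pitch estimator, not as a claim about how the auditory system works, and all parameter values are assumptions.

```python
import math


def harmonic_complex(f0, harmonics, sample_rate, n_samples):
    """Sum of equal-amplitude sine harmonics of f0 (f0 itself may be absent)."""
    return [
        sum(math.sin(2 * math.pi * h * f0 * n / sample_rate) for h in harmonics)
        for n in range(n_samples)
    ]


def autocorr_period(signal, min_lag, max_lag):
    """Lag (in samples) within [min_lag, max_lag] with the highest autocorrelation."""
    window = len(signal) - max_lag

    def score(lag):
        return sum(signal[i] * signal[i + lag] for i in range(window))

    return max(range(min_lag, max_lag + 1), key=score)


SAMPLE_RATE = 8000
F0 = 100.0
# Only harmonics 2..5 are present: 200, 300, 400 and 500 Hz, no 100 Hz component.
tone = harmonic_complex(F0, (2, 3, 4, 5), SAMPLE_RATE, 1720)
period = autocorr_period(tone, min_lag=20, max_lag=120)
print(SAMPLE_RATE / period)  # estimated pitch in Hz
```

The estimated repetition rate matches the absent fundamental rather than any of the components that are physically present.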

Software

The psychoacoustic model provides for high quality lossy signal compression by describing which parts of a given digital audio signal can be removed (or aggressively compressed) safely — that is, without significant losses in the (consciously) perceived quality of the sound.

It can explain how a sharp clap of the hands might seem painfully loud in a quiet library, but is hardly noticeable after a car backfires on a busy, urban street. This provides great benefit to the overall compression ratio, and psychoacoustic analysis routinely leads to compressed music files that are 1/10th to 1/12th the size of high quality masters, with a loss in perceived quality that is far smaller than the reduction in size would suggest. Such compression is a feature of nearly all modern lossy audio compression formats. Some of these formats include Dolby Digital (AC-3), MP3, Ogg Vorbis, AAC, WMA, MPEG-1 Layer II (used for digital audio broadcasting in several countries) and ATRAC, the compression used in MiniDisc and some Walkman models.

Psychoacoustics is based heavily on human anatomy, especially the ear's limitations in perceiving sound as outlined previously. To summarize, these limitations include the limited frequency range of hearing, the absolute threshold of hearing, and masking.

Given that the ear will not be at peak perceptive capacity when dealing with these limitations, a compression algorithm can assign a lower priority to sounds outside the range of human hearing. By carefully shifting bits away from the unimportant components and toward the important ones, the algorithm ensures that the sounds a listener is most likely to perceive are of the highest quality.
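The idea of shifting bits toward the perceptually important components can be sketched as a greedy allocation loop. This is a simplified illustration and not the algorithm of any particular codec: the per-band signal-to-mask ratios are hypothetical inputs that a psychoacoustic model would normally supply, and the figure of roughly 6.02 dB of noise reduction per bit is the usual rule of thumb for a uniform quantizer.

```python
DB_PER_BIT = 6.02  # approximate SNR gain per extra quantizer bit


def allocate_bits(smr_db, total_bits, max_bits_per_band=15):
    """Greedily give bits to the band whose quantization noise is most audible.

    smr_db: hypothetical signal-to-mask ratio of each band, in dB.
    Returns the number of bits assigned to each band.
    """
    bits = [0] * len(smr_db)
    for _ in range(total_bits):
        # Noise-to-mask ratio: positive means the noise is above the mask.
        nmr = [
            smr - DB_PER_BIT * b if b < max_bits_per_band else float("-inf")
            for smr, b in zip(smr_db, bits)
        ]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 0:
            break  # all remaining noise is already masked
        bits[worst] += 1
    return bits


# Hypothetical ratios: the third band is fully masked and receives no bits.
print(allocate_bits([20.0, 5.0, -10.0, 12.0], total_bits=8))
```

Bands whose content is already below the masking threshold cost nothing, so the budget is spent where a listener is most likely to hear the difference.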

Music

Psychoacoustics includes topics and studies that are relevant to music psychology and music therapy. Theorists such as Benjamin Boretz consider some of the results of psychoacoustics to be meaningful only in a musical context.^[9]

Applied psychoacoustics

Psychoacoustics is presently applied in many fields: in software development, where developers map proven and experimental mathematical patterns; in digital signal processing, where many audio compression codecs such as MP3 use a psychoacoustic model to increase compression ratios; in the design of high-end audio systems for accurate reproduction of music in theatres and homes; and in defense systems, where scientists have experimented with limited success in creating new acoustic weapons that emit frequencies that may impair, harm, or kill. It is also applied in music, where musicians and artists continue to create new auditory experiences by masking unwanted frequencies of instruments, causing other frequencies to be enhanced. Yet another application is in the design of small or lower-quality loudspeakers, which use the phenomenon of the missing fundamental to give the effect of low-frequency bass notes that the system, due to frequency limitations, cannot actually reproduce (see References).

References

1. Christopher J. Plack (2005). The Sense of Hearing. Routledge. ISBN 0805848843. http://books.google.com/?id=DoGzm3soUoMC&pg=PA65&dq=ear+hearing+cochlea++inauthor:plack.
2. Lars Ahlzen, Clarence Song (2003). The Sound Blaster Live! Book. No Starch Press. ISBN 1886411735. http://books.google.com/?id=tKO-truWww8C&pg=PA310&dq=mp3++imperceptible+ear.
3. Rudolf F. Graf (1999). Modern dictionary of electronics. Newnes. ISBN 0750698667. http://books.google.com/?id=o2I1JWPpdusC&pg=PA137&dq=compression+expansion+noise-reduction+telephone.
4. Jack Katz, Robert F. Burkard, and Larry Medwetsky (2002). Handbook of Clinical Audiology. Lippincott Williams & Wilkins. ISBN 0683307657. http://books.google.com/?id=Aj6nVIegE6AC&pg=PA43&dq=beat+distortion++ear.
5. Olson, Harry F. (1967). Music, Physics and Engineering. Dover Publications. pp. 248–251. ISBN 0486217698. http://books.google.com/books?id=RUDTFBbb7jAC.
6. Thompson, Daniel M. Understanding Audio: Getting the Most out of Your Project or Professional Recording Studio. Boston, MA: Berklee, 2005. Print.
7. Roads, Curtis. The Computer Music Tutorial. Cambridge, MA: MIT, 2007. Print.
8. Lewis, D.P. (2007): Owl ears and hearing. Owl Pages [Online]. Available: http://www.owlpages.com/articles.php?section=Owl+Physiology&title=Hearing [2011, April 5]
9. Sterne, Jonathan (2003). The Audible Past: Cultural Origins of Sound Reproduction. Durham: Duke University Press.

E. Larsen and R.M. Aarts (2004), Audio Bandwidth extension. Application of Psychoacoustics, Signal Processing and Loudspeaker Design., J. Wiley.
Larsen E., Aarts R.M. (March 2002). "Reproducing low-pitched signals through small loudspeakers" (PDF). J. Audio Eng. Soc. 50 (3): 147–164. http://www.extra.research.philips.com/hera/people/aarts/papers/aar02n4.pdf.
Oohashi T., Kawai N., Nishina E., Honda M., Yagi R., Nakamura S., Morimoto M., Maekawa T., Yonekura Y., Shibasaki H. (February 2006). "The role of biological system other than auditory air-conduction in the emergence of the hypersonic effect". Brain Research 1073: 339–347. doi:10.1016/j.brainres.2005.12.096. PMID 16458271.

External links

The musical ear — Perception of sound
Müller C, Schnider P, Persterer A, Opitz M, Nefjodova MV, Berger M (1993). "[Applied psychoacoustics in space flight]" (in German). Wien Med Wochenschr 143 (23–24): 633–5. PMID 8178525. — Simulation of free field hearing by head phones
GPSYCHO — an open source psycho-acoustic and noise shaping model for ISO based MP3 encoders.
Definition of: perceptual audio coding
Java applet demonstrating masking
HyperPhysics Concepts – sound and hearing
- The MP3 as Standard Object