Depth perception

From Wikipedia, the free encyclopedia

Depth perception is the visual ability to perceive the world in three dimensions. It is a trait common to many higher animals. Depth perception allows the beholder to accurately gauge the distance to an object.

In modern terminology, stereopsis is depth perception from binocular vision through exploitation of parallax. Depth perception does rely on binocular vision, but it also uses many monocular cues.

Depth cues

Depth perception combines several types of depth cues grouped into two categories: monocular cues (cues available from the input of just one eye) and binocular cues (cues that require input from both eyes).

Monocular cues

  • Motion parallax - The apparent relative motion of several stationary objects against a background when the observer moves gives hints about their relative distance. This effect can be seen clearly when driving in a car: nearby things pass quickly, while far-off objects appear stationary. Some animals that lack binocular vision due to wide placement of the eyes employ parallax more explicitly than humans for depth cueing (e.g. some types of birds, which bob their heads to achieve motion parallax, and squirrels, which move in lines orthogonal to an object of interest to do the same). (See note 1.)
  • Color vision - Correct interpretation of color, and especially lighting cues, allows the beholder to determine the shape of objects, and thus their arrangement in space. The color of distant objects is also shifted towards the blue end of the spectrum (e.g. distant mountains). Painters, notably Cezanne, employ "warm" pigments (red, yellow and orange) to bring features forward towards the viewer, and "cool" ones (blue, violet, and blue-green) to indicate the part of a form that curves away from the picture plane.
  • Perspective - The property of parallel lines converging at infinity allows us to reconstruct the relative distance of two parts of an object, or of landscape features.
  • Relative size - An automobile that is close to us looks larger than one that is far away; our visual system exploits the relative size of similar (or familiar) objects to judge distance.
  • Distance fog - Due to light scattering by the atmosphere, objects that are a great distance away look hazier. In painting, this is called "atmospheric perspective". The foreground is sharply defined; the background is relatively blurred.
  • Depth from Focus - The lens of the eye can change its shape to bring objects at different distances into focus. Knowing at what distance the lens is focused when viewing an object means knowing the approximate distance to that object.
  • Occlusion - Occlusion (blocking the sight) of objects by others is also a clue which provides information about relative distance. However, this information only allows the observer to create a "ranking" of relative nearness.
  • Peripheral vision - At the outer extremes of the visual field, parallel lines become curved, as in a photo taken through a fish-eye lens. This effect, although it is usually eliminated from both art and photos by the cropping or framing of a picture, greatly enhances the viewer's sense of being positioned within a real, three-dimensional space. (Classical perspective has no use for this so-called "distortion", although in fact the "distortions" strictly obey optical laws and provide perfectly valid visual information, just as classical perspective does for the part of the field of vision that falls within its frame.)
  • Texture gradient - Suppose you are standing on a gravel road. The gravel near you can be clearly seen in terms of shape, size and colour. As your vision shifts towards the distant road, the texture can no longer be clearly differentiated.
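
The relative-size and motion-parallax cues listed above can be illustrated numerically. The following is a minimal sketch, assuming the eye behaves like an ideal pinhole camera; the function names `image_size` and `parallax_shift`, the unit focal length, and the example distances are all hypothetical choices made for illustration.

```python
def image_size(object_size, distance, focal_length=1.0):
    """Projected size of an object under an ideal pinhole model (assumed)."""
    return focal_length * object_size / distance


def parallax_shift(observer_shift, distance, focal_length=1.0):
    """Image displacement of a stationary point straight ahead when the
    observer moves sideways by observer_shift (same pinhole assumption)."""
    return focal_length * observer_shift / distance


# Relative size: the same 1.5 m high car seen at 10 m and at 40 m.
near_car = image_size(1.5, 10.0)
far_car = image_size(1.5, 40.0)
print("near car appears", near_car / far_car, "times taller than far car")

# Motion parallax: the observer moves 1 m sideways.
# A roadside post at 5 m shifts far more in the image than a hill at 500 m.
post_shift = parallax_shift(1.0, 5.0)
hill_shift = parallax_shift(1.0, 500.0)
print("post shifts", post_shift / hill_shift, "times more than the hill")
```

In this simplified model both cues fall off in inverse proportion to distance, which is why nearby objects dominate the impression of motion from a moving car.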

Binocular and oculomotor cues

  • Stereopsis - Animals that have their eyes placed frontally can also use information derived from the different projection of objects onto each retina to judge depth. By using two images of the same scene obtained from slightly different angles, it is possible to triangulate the distance to an object with a high degree of accuracy. If an object is far away, the disparity between its images on the two retinas will be small. If the object is near, the disparity will be large. It is stereopsis that tricks people into thinking they perceive depth when viewing Magic Eye pictures, autostereograms, 3D movies and stereoscopic photos.
  • Accommodation - This is an oculomotor cue for depth perception. When we focus on far away objects, the ciliary muscles relax, and the resulting tension on the fibres that suspend the lens stretches it, making it thinner. The kinesthetic sensations of the contracting and relaxing ciliary muscles (intraocular muscles) are sent to the visual cortex, where they are used for interpreting distance/depth.
  • Convergence - This is also an oculomotor cue for distance/depth perception. By virtue of stereopsis the two eyeballs focus on the same object. In doing so they converge. The convergence stretches the extraocular muscles. Kinesthetic sensations from these extraocular muscles also help in depth/distance perception. The angle of convergence is smaller when the eyes are fixating on far away objects.
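
The triangulation described under stereopsis can be sketched in a few lines. This is an illustration only: it assumes two ideal, parallel pinhole cameras rather than real eyes, and the names `disparity_at` and `depth_from_disparity`, the 0.065 m baseline (a typical adult eye separation, taken here as an assumption) and the 0.017 m focal length are hypothetical values chosen for the example.

```python
def disparity_at(baseline, focal_length, distance):
    """Disparity between the two images of a point straight ahead,
    for two parallel pinhole cameras (an assumed, simplified model)."""
    return focal_length * baseline / distance


def depth_from_disparity(baseline, focal_length, disparity):
    """Triangulate distance from disparity under the same assumptions."""
    if disparity <= 0:
        raise ValueError("disparity must be positive for a finite distance")
    return focal_length * baseline / disparity


BASELINE = 0.065  # metres between the two viewpoints (assumed)
FOCAL = 0.017     # metres (assumed)

for distance in (0.5, 2.0, 20.0):
    d = disparity_at(BASELINE, FOCAL, distance)
    recovered = depth_from_disparity(BASELINE, FOCAL, d)
    print(f"{distance:5.1f} m -> disparity {d * 1000:.4f} mm -> recovered {recovered:.1f} m")
```

The printed disparities shrink as distance grows, matching the statement that far objects produce a small disparity and near objects a large one.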

Of these various cues, only convergence, focus and familiar size provide absolute distance information. All other cues are relative (i.e., they can only be used to tell which objects are closer relative to others). Stereopsis is merely relative because a greater or lesser disparity for nearby objects could mean either that those objects differ more or less substantially in relative depth or that the foveated object is nearer or further away (the further away a scene is, the smaller the retinal disparity indicating the same depth difference).
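
The difference between an absolute cue (convergence) and a relative one (disparity) can be made concrete with simple geometry. The sketch below assumes a point fixated straight ahead of two eyes separated by a baseline; the function names and the 0.065 m eye separation are hypothetical, chosen only for illustration.

```python
import math


def vergence(baseline, distance):
    """Angle in radians between the two lines of sight when both eyes
    fixate a point straight ahead at the given distance (assumed geometry)."""
    return 2.0 * math.atan(baseline / (2.0 * distance))


def distance_from_vergence(baseline, angle):
    """Invert vergence(): the convergence angle alone fixes absolute distance."""
    return (baseline / 2.0) / math.tan(angle / 2.0)


def relative_disparity(baseline, fixation_distance, depth_step):
    """Difference in vergence angle between the fixated point and a second
    point lying depth_step further away along the same line."""
    return vergence(baseline, fixation_distance) - vergence(
        baseline, fixation_distance + depth_step
    )


EYE_SEPARATION = 0.065  # metres (assumed)

# Absolute: a known convergence angle yields a distance directly.
angle = vergence(EYE_SEPARATION, 2.0)
print("recovered distance:", distance_from_vergence(EYE_SEPARATION, angle))

# Relative: the same 10 cm depth step gives very different disparities
# depending on how far away the scene is.
for z in (0.5, 5.0):
    print(z, "m:", relative_disparity(EYE_SEPARATION, z, 0.1), "rad")
```

Because the same depth step produces a much smaller disparity at 5 m than at 0.5 m, a disparity value on its own cannot be converted to a depth difference without some independent estimate of the fixation distance.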

Binocular cues can be directly perceived far more easily and eloquently than they can be described in words. Try looking around at the room you're in with just one eye open. Then look with just the other eye; the difference you notice will probably be negligible. After that, open both eyes, and see what happens.

The geometry of binocular vision (versus classical perspective)

Like gravity and electromagnetism, monocular "classical perspective" obeys the inverse square law. A doubling of distance from the viewer's eye reduces the apparent size (area) of an object to one-quarter of its previous value. Conversely, halving the distance quadruples its apparent size; two squared makes four. One-third the distance increases apparent size by nine times, or three squared. (The height and width of the object will each be increased by a factor of two or three when distance is reduced to a half or a third. But the object's area is what determines its apparent size, and the area results from multiplying its height by its width: 2x2=4, 3x3=9.) Each of your eyes, like a camera lens, sees the world in "correct" classical perspective. Since binocular vision takes our perception of space to a stage beyond what we can see with just one eye, it follows that it operates by exaggerating and enhancing perspective in generating a single new "3-D" image out of the two "2-D" images the brain receives and transforms. Binocular depth perception might be described mathematically through some kind of non-Euclidean geometry. Binocular vision may make use of the fact that the "parallel" lines that converge at the vanishing point are actually curved, rather than straight, although this is rarely noticed except in fish-eye lens photography. The curving is almost imperceptible near the focal point at the center of the visual field, but it may be greatly enhanced even there by the brain's synthesizing the eyes' two separate views. (See also parallax article.)
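
The inverse-square relationship for apparent area can be checked with a short calculation. This sketch measures apparent size as angular extent, which is an assumption made for the example; `angular_size` and `apparent_area_ratio` are hypothetical names, and the product of angular width and height is only a small-angle approximation of apparent area.

```python
import math


def angular_size(extent, distance):
    """Angle in radians subtended by an object of the given extent,
    viewed face-on and centred on the line of sight."""
    return 2.0 * math.atan(extent / (2.0 * distance))


def apparent_area_ratio(width, height, near, far):
    """Approximate ratio of apparent area at distance near to that at far,
    using angular width times angular height (small-angle approximation)."""
    area_near = angular_size(width, near) * angular_size(height, near)
    area_far = angular_size(width, far) * angular_size(height, far)
    return area_near / area_far


# A 1 m x 1 m panel: halving the distance roughly quadruples apparent area,
# and cutting the distance to a third increases it roughly ninefold.
print(apparent_area_ratio(1.0, 1.0, 50.0, 100.0))
print(apparent_area_ratio(1.0, 1.0, 50.0, 150.0))
```

The ratios come out very close to, but not exactly, 4 and 9; the rule is exact only in the limit where the object is small compared with its distance.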

As often happens in science, practical applications go beyond our theoretical understanding of the underlying principles involved in the phenomenon. Artists employ subtle exaggerations and enhancements of classical perspective to suggest depth perception, to give the viewer a sense of subjective involvement in the picture's visual drama. Excellent examples of this can be found in Michelangelo's drawings with their anatomical "distortions", and in the best superhero comics. As explained in the book How to Draw Comics the Marvel Way, a fist or a sword coming at the viewer won't have much gut-level visual impact unless it seems to be flying off the page into the viewer's face. This only happens when it's drawn as if the viewer's eye were just inches away from the closest part of it, even if the panel's overall composition suggests a more remote point of view. Thus the fist, or the tip of the sword, will be drawn significantly larger than indicated by correct (monocular) perspective; its exact dimensions will depend on the artist's intuitive judgment. Similarly, the convergence of parallel lines at their various vanishing points throughout a scene will be subtly exaggerated to create a more convincing sense of spatial depth than "correct" (monocular or classical) perspective can offer. In classical perspective, all vanishing points are exiled to a hypothetical ideal "infinity"; with binocular depth perception, vanishing points are brought into the realm of real space we all inhabit.

Evolution

Most herbivores, especially hooved grazers, have little binocular depth perception. Instead, they have their eyes on the side of the head, providing a panoramic, almost 360°, view of the horizon, enabling them to notice the approach of predators from any direction. But both avian and mammalian predators have frontal eyes, allowing them to precisely judge distances when they pounce, or swoop down, onto their prey. Our own evolutionary line arrived at depth perception by a different route. When the ancestors of lemurs and monkeys ascended into the trees, their survival depended on judging distances from one branch to another. Natural selection would have done its brutal work by dropping a lot of tree shrews or other arboreal proto-primates onto the ground before they reached sexual maturity, leaving specimens with more frontal vision up in the treetops, able to reproduce and pass on their increasing depth perception. Your most human characteristics, not only your hands but also the form of the face you see in the mirror, were in a sense created by trees.

Philosophical implications

Depth perception is, along with sexual reproduction, a convincing real-life example of the "thesis+antithesis>synthesis" model of progress developed by Hegel, Fichte, and Engels. Dialectics, including dialectical materialism, derives from the idea of a dialogue between people representing different points of view on a subject, who arrive through argument at a new way of seeing the subject that preserves whatever remains valid from both sides of the discussion (as in the Socratic dialogues of Plato). Binocular vision is a sort of argument between one eye (thesis) and the other (antithesis), each seeing the organism's environment from a slightly different perspective, which the brain resolves into a three-dimensional image containing contributions from both but transcending their limits. As in any good synthesis, depth perception is an almost magical leap to a higher level, embodying a "qualitative change" that could hardly have been imagined or predicted by examining its component parts.

Depth perception in art

Photographs capturing perspective are two-dimensional images that often illustrate the illusion of depth. (This differs from a painting, which may use the physical matter of the paint to create a real presence of convex forms and spatial depth.) Stereoscopes and Viewmasters, as well as 3-D movies, employ binocular vision by forcing the viewer to see two images created from slightly different positions (points of view). By contrast, a telephoto lens (used in televised sports, for example, to zero in on members of a stadium audience) has the opposite effect. The viewer sees the size and detail of the scene as if it were close enough to touch, but the camera's perspective is still derived from its actual position a hundred meters away, so background faces and objects appear about the same size as those in the foreground.
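
The telephoto effect described above follows from the fact that, in a simple pinhole model, the size ratio between a foreground and a background object depends only on their distances from the camera, not on the focal length. The sketch below illustrates this under that assumption; the function names, the 0.25 m face height and the focal lengths are hypothetical values chosen for the example.

```python
def image_size(object_size, distance, focal_length):
    """Projected size under an ideal pinhole model (assumed)."""
    return focal_length * object_size / distance


def foreground_to_background_ratio(near, far, focal_length, object_size=0.25):
    """How many times larger a face at distance near appears than an
    identical face at distance far, for a given focal length."""
    return image_size(object_size, near, focal_length) / image_size(
        object_size, far, focal_length
    )


# Camera 100 m from the stands; two spectators sit 10 m apart in depth.
# A longer lens magnifies both faces, but their size ratio stays the same.
print(foreground_to_background_ratio(100.0, 110.0, focal_length=0.05))
print(foreground_to_background_ratio(100.0, 110.0, focal_length=0.40))

# The same two spectators photographed from only 2 m away.
print(foreground_to_background_ratio(2.0, 12.0, focal_length=0.05))
```

From 100 m the two faces differ in size by about ten percent whatever the lens, whereas from 2 m the nearer face appears several times larger, which is the depth cue that the distant viewpoint flattens.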

Trained artists are keenly aware of the various methods for indicating spatial depth (color shading, distance fog, perspective and relative size), and take advantage of them to make their works appear "real". The viewer feels it would be possible to reach in and grab the nose of a Rembrandt portrait or an apple in a Cezanne still life, or step inside a landscape and walk around among its trees and rocks.

Cubism was based on the idea of incorporating multiple points of view in a painted image, as if to simulate the visual experience of being physically in the presence of the subject, and seeing it from different angles. The radical "High Cubist" experiments of Braque and Picasso circa 1909 are interesting but more bizarre than convincing in visual terms. Slightly later paintings by their followers, such as Robert Delaunay's views of the Eiffel Tower, or John Marin's Manhattan cityscapes, borrow the explosive angularity of Cubism to exaggerate the traditional illusion of three-dimensional space. A century after the Cubist adventure, the verdict of art history is that the most subtle and successful use of multiple points of view can be found in the pioneering late work of Cezanne, which both anticipated and inspired the first actual Cubists. Cezanne's landscapes and still lifes powerfully suggest the artist's own highly developed depth perception. At the same time, like the other Post-Impressionists, Cezanne had learned from Japanese prints the significance of respecting the flat (two-dimensional) rectangle of the picture itself; Hokusai and Hiroshige ignored or even reversed linear perspective and thereby remind the viewer that a picture can only be "true" when it acknowledges the truth of its own flat surface. By contrast, European "academic" painting was devoted to a sort of Big Lie that the surface of the canvas is only an enchanted doorway to a "real" scene unfolding beyond, and that the artist's main task is to distract the viewer from any disenchanting awareness of the presence of the painted canvas. Cubism, and indeed most of modern art, is a struggle to confront, if not resolve, the paradox of suggesting spatial depth on a flat surface, and to explore that inherent contradiction through innovative ways of seeing, as well as new methods of drawing and painting.


Notes

1 The term 'parallax vision' is often used as a synonym for binocular vision, and should not be confused with motion parallax. The former allows far more accurate gauging of depth than the latter.
