Axiomatic theory of receptive fields


Receptive field profiles registered by cell recordings have shown that mammalian vision has developed receptive fields tuned to different sizes and orientations in the image domain, as well as to different image velocities in space-time.[1][2][3][4][5] Corresponding cell recordings in the auditory system have shown that mammals have developed receptive fields tuned to different frequencies as well as to temporal transients.[6][7][8][9] This article describes normative theories that have been developed to explain these properties of sensory receptive fields based on structural properties of the environment. Beyond the theoretical explanation of biological phenomena, these theories can also be used for computational modelling of biological receptive fields and for building algorithms for artificial perception based on sensory data.

Computational theory of visual receptive fields

Idealized models of visual receptive fields similar to those found in the retina, the lateral geniculate nucleus (LGN) and the primary visual cortex of higher mammals can be derived axiomatically from structural requirements on the first stages of visual processing. These requirements reflect symmetry properties of the surrounding world, combined with additional assumptions that ensure internally consistent image representations at multiple spatial and temporal scales.[10][11] Specifically, idealized functional models for linear spatio-temporal receptive fields can be derived in a principled manner as a combination of Gaussian derivatives over the spatial domain and either non-causal Gaussian derivatives or truly time-causal temporal scale-space kernels over the temporal domain.[10][11][12]

Correspondingly, idealized functional models for purely spatial receptive fields can be expressed as directional derivatives of affine Gaussian kernels. This model specifically generalizes the receptive field model in terms of Gaussian derivatives,[13][14][15][16][17] from directional derivatives of rotationally symmetric Gaussian kernels to directional derivatives of affine Gaussian kernels.
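The idea of a directional derivative of an affine Gaussian kernel can be illustrated with a minimal NumPy sketch. The function names, the parameter names (`cov` for the spatial covariance matrix, `phi` for the derivative direction) and the example values are assumptions made for this illustration and are not taken from the cited works; only a first-order derivative is shown.

```python
import numpy as np

def affine_gaussian(x1, x2, cov):
    """Affine Gaussian kernel with covariance matrix `cov`, sampled at (x1, x2)."""
    cov = np.asarray(cov, dtype=float)
    inv = np.linalg.inv(cov)
    # Quadratic form x^T cov^{-1} x evaluated pointwise on the grid.
    q = inv[0, 0] * x1**2 + 2.0 * inv[0, 1] * x1 * x2 + inv[1, 1] * x2**2
    return np.exp(-0.5 * q) / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))

def directional_derivative_kernel(x1, x2, cov, phi):
    """First-order derivative of the affine Gaussian in the direction `phi` (radians)."""
    cov = np.asarray(cov, dtype=float)
    inv = np.linalg.inv(cov)
    g = affine_gaussian(x1, x2, cov)
    # Analytic gradient of a Gaussian: grad g = -(cov^{-1} x) g.
    g1 = -(inv[0, 0] * x1 + inv[0, 1] * x2) * g
    g2 = -(inv[1, 0] * x1 + inv[1, 1] * x2) * g
    return np.cos(phi) * g1 + np.sin(phi) * g2

# Hypothetical example: a kernel elongated along a 30-degree orientation,
# differentiated perpendicular to its long axis.
axis = np.arange(-20.0, 21.0)
x1, x2 = np.meshgrid(axis, axis, indexing="xy")
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
cov = rot @ np.diag([16.0, 4.0]) @ rot.T
kernel = directional_derivative_kernel(x1, x2, cov, phi=theta + np.pi / 2)
```

Setting `cov` to a multiple of the identity matrix gives the rotationally symmetric special case, so the same code covers both the earlier Gaussian derivative model and its affine generalization.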

Idealized functional models of receptive fields of these forms have been shown to reproduce quite well the shapes of spatial and spatio-temporal receptive fields measured by cell recordings of neurons in the LGN and of simple cells in the primary visual cortex (V1).[10][11][12][3][4]
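A spatio-temporal receptive field of this kind can be sketched in a simplified setting. The sketch below assumes one spatial dimension, a non-causal Gaussian kernel over time, and a spatial kernel that follows a constant image velocity; the names `s` (spatial variance), `tau` (temporal variance) and `v` (velocity), as well as the example values, are hypothetical choices for this illustration.

```python
import numpy as np

def gaussian(u, variance):
    """One-dimensional Gaussian with the given variance."""
    return np.exp(-u**2 / (2.0 * variance)) / np.sqrt(2.0 * np.pi * variance)

def spatiotemporal_kernel(x, t, s, tau, v):
    """Velocity-adapted smoothing kernel: spatial Gaussian centred at x = v*t,
    multiplied by a (non-causal) Gaussian over time."""
    return gaussian(x - v * t, s) * gaussian(t, tau)

def spatial_derivative_kernel(x, t, s, tau, v):
    """First-order spatial derivative of the smoothing kernel."""
    # Only the spatial Gaussian depends on x.
    return -(x - v * t) / s * spatiotemporal_kernel(x, t, s, tau, v)

# Rows index space, columns index time.
x, t = np.meshgrid(np.arange(-30.0, 31.0), np.arange(-30.0, 31.0), indexing="ij")
rf = spatial_derivative_kernel(x, t, s=9.0, tau=16.0, v=0.5)
```

In each time slice the zero crossing of `rf` lies at `x = v * t`, so the profile is tilted in space-time, which is the qualitative signature of velocity tuning. A truly time-causal variant would replace the temporal Gaussian with a kernel that vanishes for negative time.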

Theoretical arguments have been presented for preferring this generalized Gaussian model of receptive fields over a Gabor model of receptive fields, because of the better theoretical properties of the generalized Gaussian model under natural image transformations.[10][18] Specifically, these generalized Gaussian receptive fields can be shown to enable the computation of invariant visual representations under natural image transformations.[18] By these results, the different shapes of receptive field profiles found in biological vision, which are tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time, can be seen as well adapted to the structure of the physical world. They can be explained from the requirement that the visual system should be able to be invariant to the natural types of image transformations that occur in its environment.[10][11][18]
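One such transformation property can be written out for affine image deformations. The following is a sketch in assumed notation that is not taken from the text above: $g(x; \Sigma)$ is the affine Gaussian kernel with covariance matrix $\Sigma$, $A$ is an invertible $2 \times 2$ matrix, $f$ is an image, and $L$ is its smoothed representation.

```latex
% Affine Gaussian kernel
g(x; \Sigma) = \frac{1}{2\pi\sqrt{\det\Sigma}}
               \exp\!\left(-\tfrac{1}{2}\, x^{T} \Sigma^{-1} x\right)

% Transforming both the coordinates and the covariance matrix,
%   x' = A x,  \Sigma' = A \Sigma A^{T},
% leaves the quadratic form unchanged and rescales the normalization:
g(x'; \Sigma') = \frac{1}{|\det A|}\, g(x; \Sigma)

% For a deformed image f'(x') = f(x) and smoothed representations
%   L(\cdot; \Sigma) = g(\cdot; \Sigma) * f,   L'(\cdot; \Sigma') = g(\cdot; \Sigma') * f',
% the factor |\det A| cancels against the change of variables in the convolution:
L'(x'; \Sigma') = L(x; \Sigma)
```

The family of affine Gaussian kernels is thus closed under affine deformations: a deformation of the image can be matched by a corresponding change of the covariance matrix, which is a property that a family of Gabor functions with fixed envelope shapes does not have in this simple form.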

Computational theory of auditory receptive fields

A computational theory for auditory receptive fields can be expressed in a structurally similar way, permitting the derivation of auditory receptive fields in two stages:[19][20] (i) a first stage of temporal receptive fields that computes a spectrogram over time and angular frequency, using a window function with a given temporal scale, where the window function can be chosen as either Gabor functions in the case of non-causal time or Gammatone functions, alternatively generalized Gammatone functions, for a truly time-causal model in which the future cannot be accessed, and (ii) a second stage of spectro-temporal receptive fields applied to the magnitude of the logarithmically transformed spectrogram, with the temporal smoothing kernels chosen as either Gaussian kernels over time in the case of non-causal time or first-order integrators (truncated exponential kernels) coupled in cascade in the case of truly time-causal operations.
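The two stages can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions, not the implementation of the cited works: the first stage uses a Gaussian (Gabor) window only, the second stage smooths with a Gaussian over a logarithmically sampled frequency axis and takes a first-order temporal difference, and all function names, parameter names and example values are hypothetical. The recursive filter is started from its first input sample, which is also an assumption.

```python
import numpy as np

def gaussian_kernel(std):
    """Sampled, normalized Gaussian truncated at four standard deviations."""
    radius = int(np.ceil(4 * std))
    u = np.arange(-radius, radius + 1)
    k = np.exp(-u**2 / (2.0 * std**2))
    return k / k.sum()

def smooth(a, kernel, axis):
    """Convolve every 1-D slice of `a` along `axis` with `kernel`."""
    return np.apply_along_axis(np.convolve, axis, a, kernel, mode="same")

def gabor_spectrogram(signal, sample_rate, freqs_hz, window_std_s):
    """Stage 1: windowed Fourier transform with a Gaussian (Gabor) window."""
    times = np.arange(len(signal)) / sample_rate
    window = gaussian_kernel(window_std_s * sample_rate)
    # Demodulate at each analysis frequency, then smooth over time.
    demod = signal[None, :] * np.exp(-2j * np.pi * freqs_hz[:, None] * times[None, :])
    return smooth(demod, window, axis=1)

def cascade_of_integrators(a, time_constants):
    """Time-causal smoothing along the last axis: first-order recursive
    filters (discrete truncated exponential kernels) coupled in cascade."""
    out = np.array(a, dtype=float)
    for mu in time_constants:
        for n in range(1, out.shape[-1]):
            # Each output sample depends only on present and past samples.
            out[..., n] = out[..., n - 1] + (out[..., n] - out[..., n - 1]) / (1.0 + mu)
    return out

def spectrotemporal_response(spec, freq_std_bins, time_std_samples=None, time_constants=None):
    """Stage 2: smooth log|S| over log-frequency and time, then difference over time."""
    log_mag = np.log(np.abs(spec) + 1e-12)  # small offset avoids log(0)
    smoothed = smooth(log_mag, gaussian_kernel(freq_std_bins), axis=0)
    if time_constants is not None:
        smoothed = cascade_of_integrators(smoothed, time_constants)  # time-causal
    else:
        smoothed = smooth(smoothed, gaussian_kernel(time_std_samples), axis=1)  # non-causal
    return np.diff(smoothed, axis=1, prepend=smoothed[:, :1])

# Hypothetical example: a pure 800 Hz tone sampled at 8 kHz.
sample_rate = 8000.0
t = np.arange(2000) / sample_rate
tone = np.sin(2 * np.pi * 800.0 * t)
freqs = np.geomspace(200.0, 3200.0, 33)  # log-spaced, so bins are uniform in log-frequency
spec = gabor_spectrogram(tone, sample_rate, freqs, window_std_s=0.005)
resp_noncausal = spectrotemporal_response(spec, freq_std_bins=1.0, time_std_samples=40.0)
resp_causal = spectrotemporal_response(spec, freq_std_bins=1.0, time_constants=[20.0, 40.0])
```

The signal must be longer than the temporal window, and the number of frequency bins must exceed the length of the frequency kernel, because `np.convolve(..., mode="same")` returns the length of the longer input. A Gammatone window for the first stage is not covered by this sketch.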

The shapes of the receptive field functions in these models can be determined by necessity from structural properties of the environment, combined with requirements on the internal structure of the auditory system that enable theoretically well-founded processing of sound signals at different temporal and log-spectral scales. Specifically, the resulting spectro-temporal receptive fields in this model obey invariance or covariance properties under natural sound transformations, including: (i) temporal shifts, (ii) variations in sound pressure, (iii) variations in the distance between the sound source and the observer, (iv) shifts in the frequencies of auditory stimuli and (v) glissando transformations.[19][20]
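Two of these properties can be illustrated with a short calculation. The notation is assumed for this sketch and is not taken from the text above: $f$ is the sound signal, $S(t, \omega)$ is its spectrogram computed by a linear transform, $\nu = \log \omega$ is the logarithmic frequency, and $a, b > 0$ are constant factors. The frequency shift is interpreted here as a multiplicative change of all frequencies.

```latex
% (ii) Variation in sound pressure: f'(t) = a f(t)
S'(t, \omega) = a\, S(t, \omega)
\quad\Longrightarrow\quad
\log |S'(t, \omega)| = \log a + \log |S(t, \omega)|

% Any derivative of order one or higher removes the constant offset:
\partial_t \log |S'| = \partial_t \log |S|,
\qquad
\partial_\nu \log |S'| = \partial_\nu \log |S|

% (iv) Frequency shift by a factor b: \omega' = b\, \omega
\nu' = \log \omega' = \nu + \log b
```

A multiplicative change in amplitude thus becomes an additive offset that derivative-based receptive fields ignore (invariance), whereas a multiplicative change in frequency becomes a translation along the log-frequency axis, so that responses are shifted rather than reshaped (covariance).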

Idealized receptive fields of this form can be shown to model well the qualitative shape of spectro-temporal receptive fields as measured by cell recordings in the inferior colliculus (ICC), as well as the linear component of some receptive fields measured in the primary auditory cortex.[19][20]

References

  1. D. Hubel and T. N. Wiesel (1959) "Receptive fields of single neurones in the cat’s striate cortex", J Physiol 148, 574–591.
  2. D. Hubel and T. N. Wiesel (2005) Brain and Visual Perception: The Story of a 25-Year Collaboration. Oxford University Press.
  3. G. C. DeAngelis, I. Ohzawa and R. D. Freeman (1995) "Receptive field dynamics in the central visual pathways". Trends Neurosci. 18(10), 451–457.
  4. G. C. DeAngelis and A. Anzai (2004) "A modern view of the classical receptive field: linear and non-linear spatio-temporal processing by V1 neurons". In: Chalupa, L.M., Werner, J.S. (eds.) The Visual Neurosciences, vol. 1, pp. 704–719. MIT Press, Cambridge.
  5. B. R. Conway and M. S. Livingstone (2006) "Spatial and temporal properties of cone signals in alert macaque primary visual cortex", The Journal of Neuroscience 26(42): 10826-10846.
  6. L. M. Miller, M. A. Escabí, H. L. Read and C. E. Schreiner (2002) "Spectrotemporal receptive fields in the lemniscal auditory thalamus and cortex". J. Neurophysiol. 87: 516-527.
  7. A. Qiu, C. E. Schreiner and M. A. Escabí (2003) "Gabor analysis of auditory midbrain receptive fields: Spectro-temporal and binaural composition", J. of Neurophysiology 90: 456-476.
  8. M. Elhilali, J. Fritz, T. S. Chi and S. Shamma (2007) "Auditory cortical receptive fields: Stable entities with plastic abilities", J. of Neuroscience 27: 10372-10382.
  9. C. A. Atencio and C. E. Schreiner (2012) "Spectrotemporal processing in spectral tuning modules of cat primary auditory cortex", PLOS ONE 7:e31537.
  10. T. Lindeberg (2013) "A computational theory of visual receptive fields", Biological Cybernetics, 107(6): 589-635.
  11. T. Lindeberg (2016) "Time-causal and time-recursive spatio-temporal receptive fields", Journal of Mathematical Imaging and Vision 55(1): 50-88.
  12. T. Lindeberg (2011) "Generalized Gaussian scale-space axiomatics comprising linear scale-space, affine scale-space and spatio-temporal scale-space", Journal of Mathematical Imaging and Vision, 40(1): 36-81.
  13. J. J. Koenderink and A. J. van Doorn (1987) "Representation of local geometry in the visual system", Biological Cybernetics 55:367–375.
  14. R. A. Young (1987) "The Gaussian derivative model for spatial vision: I. Retinal mechanisms", Spatial Vision 2(4): 273-293.
  15. J. J. Koenderink and A. J. van Doorn (1992) "Generic neighbourhood operators", IEEE Transactions on Pattern Analysis and Machine Intelligence, 14: 597-605.
  16. T. Lindeberg (1993) Scale-Space Theory in Computer Vision, Springer, 1993, ISBN 0-7923-9418-6.
  17. T. Lindeberg (1994). "Scale-space theory: A basic tool for analysing structures at different scales". Journal of Applied Statistics. 21 (2). pp. 224–270. doi:10.1080/757582976.
  18. T. Lindeberg (2013) "Invariance of visual operations at the level of receptive fields", PLOS ONE 8(7): e66990, pages 1-33.
  19. T. Lindeberg and A. Friberg (2015) "Idealized computational models of auditory receptive fields", PLOS ONE, 10(3): e0119032, pages 1-58.
  20. T. Lindeberg and A. Friberg (2015) "Scale-space theory for auditory signals", Proc. SSVM 2015: Scale-Space and Variational Methods in Computer Vision, Springer LNCS 9087: 3-15.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.