Visual descriptors

In computer vision, visual descriptors or image descriptors are descriptions of the visual features of the contents of images or videos. The term also covers the algorithms or applications that produce such descriptions. They describe elementary characteristics such as shape, color, texture, and motion, among others.

Introduction

As a result of new communication technologies and the widespread use of the Internet, the amount of audio-visual information available in digital format is growing considerably. It has therefore become necessary to design systems that describe the content of different types of multimedia information so that it can be searched and classified.

Audio-visual descriptors are responsible for describing this content. They capture information about the objects and events found in a video, image, or audio recording, and they allow audio-visual content to be searched quickly and efficiently.

Such a system can be compared to a search engine for textual content. Although it is relatively easy to find text with a computer, it is much more difficult to find specific audio and video segments. For instance, imagine somebody searching for a scene of a happy person. Happiness is a feeling, and how it translates into the shape, color, and texture of an image is not evident.
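For low-level features, by contrast, a search can reduce to comparing descriptor vectors. The following is a minimal sketch of query-by-example retrieval, assuming each item is already represented by a fixed-length numeric descriptor. The function names, file names, and three-element vectors are hypothetical and chosen only for illustration, and the L1 distance is one of several possible similarity measures, not one prescribed by any standard.

```python
from typing import Dict, List, Sequence, Tuple


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two descriptors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_similarity(
    query: Sequence[float], index: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = [(name, l1_distance(query, vec)) for name, vec in index.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical index: each file is summarized by a made-up 3-bin descriptor.
index = {
    "sunset.jpg": [0.7, 0.2, 0.1],
    "forest.jpg": [0.1, 0.8, 0.1],
    "ocean.jpg": [0.1, 0.2, 0.7],
}

# Descriptor of the example image supplied by the user (also made up).
query = [0.6, 0.3, 0.1]
ranking = rank_by_similarity(query, index)

for name, distance in ranking:
    print(f"{name}: {distance:.2f}")
```

A linear scan like this is adequate for a small collection. Larger archives would typically need an index structure for nearest-neighbor search, which is outside the scope of this sketch.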

Describing audio-visual content is not a trivial task, and it is essential for the effective use of this type of archive. The standard that deals with audio-visual descriptors is MPEG-7, developed by the Moving Picture Experts Group (MPEG).

Types of visual descriptors

Descriptors are the first step in finding the connection between the pixels of a digital image and what humans recall some minutes after observing an image or a group of images.

Visual descriptors are divided into two main groups:

  1. General information descriptors: low-level descriptors that describe color, shape, regions, textures, and motion.
  2. Specific domain information descriptors: descriptors that give information about objects and events in the scene. A concrete example is face recognition.

General information descriptors

General information descriptors are a set of descriptors that cover basic and elementary features such as color, texture, shape, motion, and location, among others. These descriptions are generated automatically by means of signal processing.
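As an illustration of how such a low-level description can be generated automatically, the sketch below computes a quantized color histogram from raw pixel values. It is a simplified example and not one of the color descriptors defined by MPEG-7. The function name `color_histogram`, the `bins_per_channel` parameter, and the tiny four-pixel image are assumptions made for this example.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Iterable[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized RGB histogram with bins_per_channel**3 entries."""
    if not 1 <= bins_per_channel <= 256:
        raise ValueError("bins_per_channel must be between 1 and 256")
    counts = [0] * bins_per_channel ** 3
    total = 0
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        ri = r * bins_per_channel // 256
        gi = g * bins_per_channel // 256
        bi = b * bins_per_channel // 256
        # Flatten the three bin indices into a single histogram position.
        counts[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
        total += 1
    if total == 0:
        return [0.0] * len(counts)
    # Normalize so the descriptor does not depend on image size.
    return [c / total for c in counts]


# A tiny 2x2 "image": three pure-red pixels and one pure-blue pixel.
image = [(255, 0, 0), (255, 0, 0), (255, 0, 0), (0, 0, 255)]
descriptor = color_histogram(image, bins_per_channel=2)
print(descriptor)
```

The resulting vector records only the proportion of pixels falling in each color range. It discards where those pixels are located, which is why color is usually combined with descriptors of texture, shape, or location.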

Specific domain information descriptors

These descriptors, which give information about objects and events in the scene, are not easy to extract, especially when the extraction is to be done automatically. Nevertheless, they can be produced manually.

As mentioned above, face recognition is a concrete example of an application that tries to obtain this information automatically.

Descriptor applications

The main applications of descriptors are the search and classification of audio-visual content.

See also

MPEG-7

DSpace

Feature detection

References

B.S. Manjunath (Editor), Philippe Salembier (Editor), and Thomas Sikora (Editor): Introduction to MPEG-7: Multimedia Content Description Interface. Wiley & Sons, April 2002 - ISBN 0-471-48678-7