Information integration
From Wikipedia, the free encyclopedia
Information integration (II) (also called information fusion, deduplication and referential integrity) is the merging of information from disparate sources with differing conceptual, contextual and typographical representations. It is used in data mining and consolidation of data from unstructured or semi-structured resources. Typically, information integration refers to textual representations of knowledge but is sometimes applied to rich media content.
Among the technologies available to integrate information are string metrics that allow detection of similar text in different data sources by fuzzy matching.