Talk:Record linkage
Record linkage and deduplication are NOT the same thing. The first is linking two or more datasets, the second is removing duplicate entries in a single dataset. It's helpful (often very important) to DEDUPLICATE before attempting a RECORD LINKAGE.
Using your definition, deduplication is a simple(r) instance of record linkage. In deduplication the "two" datasets are the same and have the same structure in terms of fields, something that is not always the case with record linkage. The two terms are often used interchangeably (and many other terms are also used to refer to the same concept, which is kind of ironic if you think about it). Ipeirotis 04:46, 1 February 2007 (UTC)
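To make the "deduplication as self-linkage" point concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names (`normalize`, `link`, `deduplicate`), the field names, and the exact-match-after-normalization rule are all hypothetical, and real record linkage systems typically use fuzzier comparison and blocking rather than an all-pairs scan.

```python
from itertools import product


def normalize(value):
    # Hypothetical matching rule: lowercase and collapse runs of whitespace.
    return " ".join(value.lower().split())


def link(left, right, left_key, right_key):
    """Return index pairs (i, j) where left[i] and right[j] agree on the
    normalized key. The two datasets may use different field names."""
    return [
        (i, j)
        for (i, a), (j, b) in product(enumerate(left), enumerate(right))
        if normalize(a[left_key]) == normalize(b[right_key])
    ]


def deduplicate(records, key):
    """Deduplication as self-linkage: link the dataset to itself, then keep
    only pairs with i < j to drop self-matches and mirrored pairs."""
    return [(i, j) for i, j in link(records, records, key, key) if i < j]


# Illustrative data only.
people = [{"name": "Ann Lee"}, {"name": "ann  lee"}, {"name": "Bob Ray"}]
customers = [{"full_name": "Ann Lee"}]

duplicate_pairs = deduplicate(people, "name")
linked_pairs = link(people, customers, "name", "full_name")
```

In this sketch `deduplicate` is just `link` called with the same dataset on both sides, while the general `link` call has to be told which field in each dataset to compare, which is the structural difference mentioned above.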