Watermark (data file)

From Wikipedia, the free encyclopedia

See also Watermark for other uses of the term.

A watermark stored in a data file may refer to a method for ensuring data integrity that combines aspects of data hashing and digital watermarking.

Data hashing and digital watermarking are both useful for tamper detection, but each technique has disadvantages. A typical data hash processes an input file to produce a fixed-length alphanumeric string that is, for practical purposes, unique to that file. If even one bit of the original file changes, the same hash process applied to the modified file produces a completely different string. Thus, if a trusted source calculates the hash of the original data file, subscribers can verify the integrity of the data: a subscriber simply compares a hash of the received data file with the known hash from the trusted source. If the two hashes match, the subscriber can assign an appropriate degree of confidence to the integrity of the received data. If they differ, the subscriber can conclude that the received data file was altered. A disadvantage of this process is that the hash gives no indication of the extent or location of changes within the received data file. It is an all-or-nothing process: either the entire received data file is trustworthy or none of it is.
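The hash comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 (from the standard `hashlib` module) stands in for "a typical data hash", and the names `file_hash` and `verify` and the sample data are hypothetical, not taken from any particular system.

```python
import hashlib


def file_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def verify(received: bytes, trusted_hash: str) -> bool:
    """Compare the hash of the received data with the trusted hash."""
    return file_hash(received) == trusted_hash


# The trusted source hashes the original data file (sample data, assumed).
original = b"The quick brown fox jumps over the lazy dog"
trusted_hash = file_hash(original)

# A modified copy that differs from the original by a single bit.
tampered = bytearray(original)
tampered[0] ^= 0x01
tampered = bytes(tampered)

print(verify(original, trusted_hash))  # hashes match
print(verify(tampered, trusted_hash))  # hashes differ
```

Note that `verify` returns only a yes-or-no answer: nothing in the result indicates which byte changed or how many bytes changed, which is the all-or-nothing limitation described above.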

Digital watermarking differs from data hashing and has its own advantages and disadvantages. It is the process of altering the original data file so that embedded auxiliary data, referred to as a watermark, can later be recovered. A subscriber who knows the watermark and how it is recovered can determine, to some extent, whether significant changes have occurred within the data file. Depending on the specific method used, recovery of the embedded auxiliary data can be robust to post-processing (e.g., lossy compression). For example, if the data file is an image, the provider can embed a watermark for protection purposes. In this case, the process tolerates some change while maintaining an association with the original image file. Researchers have also developed techniques that embed components of the image within the image itself. This can help identify portions of the image that may contain unauthorized changes and can even help recover some of the lost data. A disadvantage of digital watermarking is that some files cannot be significantly altered to carry a watermark without sacrificing the quality or utility of the data. This can be true of various kinds of files, including image data, audio data, and computer code.
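As a rough illustration of embedding and recovering a watermark, the following Python sketch uses least-significant-bit (LSB) embedding on a short sequence of 8-bit pixel values. LSB embedding is chosen here only because it is simple; it is not the method this article describes, and unlike the robust schemes mentioned above it is fragile and would not survive lossy compression. The function names and sample values are hypothetical.

```python
from typing import List


def embed_watermark(pixels: bytes, bits: List[int]) -> bytes:
    """Overwrite the least significant bit of each leading pixel with a watermark bit."""
    if len(bits) > len(pixels):
        raise ValueError("watermark is longer than the pixel data")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | (bit & 1)
    return bytes(out)


def extract_watermark(pixels: bytes, length: int) -> List[int]:
    """Read back the least significant bit of the first `length` pixels."""
    return [p & 1 for p in pixels[:length]]


def mismatched_positions(pixels: bytes, expected: List[int]) -> List[int]:
    """Return the indices where the recovered bit differs from the expected bit."""
    recovered = extract_watermark(pixels, len(expected))
    return [i for i, (r, e) in enumerate(zip(recovered, expected)) if r != e]


# Sample 8-bit pixel values and watermark bits (assumed for illustration).
pixels = bytes([120, 121, 122, 123, 124, 125, 126, 127])
mark = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_watermark(pixels, mark)

# Simulate an unauthorized change to the pixel at index 3.
altered = bytearray(marked)
altered[3] = 200
altered = bytes(altered)

print(mismatched_positions(marked, mark))   # no mismatches in the untouched copy
print(mismatched_positions(altered, mark))  # the mismatch points at the changed pixel
```

Embedding changes each pixel value by at most 1, which shows why the data file must tolerate small alterations. In contrast to a single file hash, a mismatch here indicates where a change occurred, although a change that happens to leave a pixel's least significant bit intact would go unnoticed by this simple scheme.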

Source and further information