Data loss

From Wikipedia, the free encyclopedia

In the field of information technology, data loss refers to the unforeseen loss of data or information. An occurrence of data loss can be called a Data Loss Event and there are several possible root causes. Data loss must be distinguished from data unavailability, such as may arise from a network outage. Although the two have substantially similar effects, data unavailability is temporary while data loss is permanent. Backup and recovery schemes are developed to restore lost data.

Types of Data Loss Events

  • Intentional Action
    • Intentional deletion of a file or program
  • Unintentional Action
    • Accidental deletion of a file or program
    • Misplacement of CDs or floppies
    • Administration errors
    • Inability to read unknown file format
  • Failure
    • Power failure, resulting in data in volatile memory not being saved to permanent memory.
    • Hardware failure, such as a head crash in a hard disk.
    • A software crash or freeze, resulting in data not being saved.
    • Software bugs or poor usability, such as not confirming a file delete command.
    • Data corruption, such as filesystem corruption or database corruption.
  • Disaster
  • Crime
    • Theft of physical media, hacking, or sabotage.
    • Malicious software, such as a worm or virus.

Studies have consistently shown hardware failure and human error to be the two most common causes of data loss, accounting for roughly three quarters of all incidents.[1] A commonly overlooked cause is a natural disaster. Although the probability is small, the only way to recover from data loss due to a natural disaster is to store backup data in a physically separate location.

Cost of data loss

The cost of a Data Loss Event is directly related to the value of the data and the length of time for which it is needed but unavailable. Consider:

  • The cost of continuing without the data.
  • The cost of recreating the data.
  • The cost of notifying users in the event of a compromise.
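
The three components above can be added up in a rough estimate. The following Python sketch is illustrative only: the class name, the field names, and all figures are hypothetical, and it assumes that the cost of continuing without the data grows linearly with the time the data is unavailable.

```python
from dataclasses import dataclass


@dataclass
class DataLossCost:
    """Hypothetical cost components of one data loss event (any currency)."""
    hourly_cost_without_data: float  # cost of continuing without the data, per hour
    hours_unavailable: float         # how long the data is needed but unavailable
    recreation_cost: float           # cost of recreating the data
    notification_cost_per_user: float = 0.0  # only relevant after a compromise
    affected_users: int = 0

    def total(self) -> float:
        """Sum the three components listed above."""
        continuing = self.hourly_cost_without_data * self.hours_unavailable
        notifying = self.notification_cost_per_user * self.affected_users
        return continuing + self.recreation_cost + notifying


# Illustrative numbers only, not taken from any study.
event = DataLossCost(
    hourly_cost_without_data=200.0,
    hours_unavailable=8,
    recreation_cost=1500.0,
    notification_cost_per_user=2.0,
    affected_users=1000,
)
print(event.total())  # 200*8 + 1500 + 2*1000 = 5100.0
```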

Organizational Responsibility

Recent statistics show the number of publicized data loss events involving sensitive data is on the rise[2], in part due to recent legislation, including the landmark California SB 1386, requiring the notification of data loss. This and other legislation has forced organizations to notify victims that their identity has potentially been compromised.

Preventing data loss

There is no guaranteed way to prevent data loss. However, the frequency of data loss events and their impact can be greatly mitigated by taking proper precautions. The different types of data loss events demand different types of precautions. For example, multiple power circuits with battery backup and a generator protect only against power failures. Similarly, a journaling filesystem and RAID storage protect only against certain types of software and hardware failure. Regular data backups are an important asset when recovering after a data loss event, but they do little to prevent user errors or system failures.

A well-rounded approach to data protection has the best chance of avoiding data loss events. It also includes mundane tasks such as maintaining antivirus protection and network firewalls, as well as staying up to date with all published security fixes and system patches. User education is probably the most important, and most difficult, aspect of preventing data loss. Nothing else will prevent users from making mistakes that jeopardize data security.

Also see: Data Loss Prevention

Recovery from data loss

Main article: Disaster recovery

Successful recovery from a Data Loss Event generally requires an effective backup strategy. Without a backup strategy, recovery requires reinstallation of programs and regeneration of data. Even with an effective backup strategy, restoring a system to the precise state it was in prior to the Data Loss Event is extremely difficult. Some level of compromise between granularity of recoverability and cost is necessary. Furthermore, a Data Loss Event may not be immediately apparent. An effective backup strategy must also consider the cost of maintaining the ability to recover lost data for long periods of time.

The most convenient backup system would keep duplicate copies of every file and program, immediately accessible whenever a Data Loss Event was noticed. However, in most situations, there is an inverse correlation between the value of a unit of data and the length of time it takes to notice the loss of that data. Taking this into consideration, many backup strategies decrease the granularity of restorability as more time passes since the potential Data Loss Event. By this logic, recovery from recent Data Loss Events is easier and more complete than recovery from Data Loss Events that happened further in the past.
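
One way to picture this decreasing granularity is a retention policy that thins out older backups. The Python sketch below is a minimal illustration under assumed settings, not a description of any particular backup product: the function name, the seven-day and four-week windows, and the monthly bucket for older backups are all hypothetical choices.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, daily_days=7, weekly_weeks=4):
    """Return the subset of backup dates a thinning policy would retain.

    Hypothetical policy: keep every backup from the last `daily_days` days,
    one per ISO week for the following `weekly_weeks` weeks, and one per
    calendar month for anything older.
    """
    keep = set()
    seen_weeks = set()
    seen_months = set()
    # Walk from newest to oldest so the newest backup in each bucket wins.
    for d in sorted(backup_dates, reverse=True):
        age = (today - d).days
        if age < 0:
            continue  # ignore dates in the future
        if age < daily_days:
            keep.add(d)
        elif age < daily_days + 7 * weekly_weeks:
            week = tuple(d.isocalendar())[:2]  # (ISO year, ISO week)
            if week not in seen_weeks:
                seen_weeks.add(week)
                keep.add(d)
        else:
            month = (d.year, d.month)
            if month not in seen_months:
                seen_months.add(month)
                keep.add(d)
    return sorted(keep)


# Example: 120 consecutive daily backups ending on an arbitrary date.
today = date(2024, 6, 30)
daily_backups = [today - timedelta(days=n) for n in range(120)]
kept = backups_to_keep(daily_backups, today)
print(len(daily_backups), "backups taken,", len(kept), "retained")
```

Under such a policy a file lost yesterday can be restored to within a day of its last state, while a loss noticed months later can only be rolled back to the nearest retained monthly copy.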

Recovery is also related to the type of Data Loss Event. Recovering a single lost file is substantially different from recovering a whole system that was destroyed in a flood. An effective backup regimen will have some proportionality between the magnitude of the data loss and the magnitude of effort required to recover. For example, it should be far easier to restore the single lost file than to recover the whole system destroyed in a flood.

Data recovery

Main article: Data recovery

There are commercial services that attempt to recover data from physically damaged media. These services are typically very expensive.

Filesystem corruption can usually be repaired by the user or the system administrator with the right software tools. A deleted file is rarely overwritten on disk immediately; it is more usual for the operating system to simply delete its entry in the filesystem index. The deletion can often be reversed, provided the space the file occupied has not since been reused for other data.
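
The difference between removing an index entry and erasing the data itself can be shown with a toy model. The Python sketch below is a deliberate simplification and does not correspond to any real filesystem: the `ToyFilesystem` class and its methods are hypothetical, and the caller remembers which blocks the deleted file used, whereas a real recovery tool would have to find them by scanning the disk.

```python
class ToyFilesystem:
    """Toy model: an index maps file names to block numbers on a 'disk'."""

    def __init__(self, num_blocks=8):
        self.blocks = [None] * num_blocks     # raw storage
        self.index = {}                       # name -> list of block numbers
        self.free = list(range(num_blocks))   # blocks available for new writes

    def write(self, name, chunks):
        if len(chunks) > len(self.free):
            raise OSError("disk full")
        used = [self.free.pop(0) for _ in chunks]
        for block_no, chunk in zip(used, chunks):
            self.blocks[block_no] = chunk
        self.index[name] = used

    def delete(self, name):
        # Only the index entry is dropped; the block contents stay in place.
        freed = self.index.pop(name)
        self.free.extend(freed)
        return freed

    def undelete(self, name, block_nos):
        # Recovery works only if none of the blocks has been reused.
        if any(b not in self.free for b in block_nos):
            raise OSError("blocks already reused; data overwritten")
        for b in block_nos:
            self.free.remove(b)
        self.index[name] = list(block_nos)

    def read(self, name):
        return "".join(self.blocks[b] for b in self.index[name])


fs = ToyFilesystem()
fs.write("report.txt", ["quarterly ", "figures"])
old_blocks = fs.delete("report.txt")   # the index entry is gone, the data is not
fs.undelete("report.txt", old_blocks)
print(fs.read("report.txt"))  # quarterly figures
```

If another file is written between the deletion and the recovery attempt, it may claim the freed blocks, and `undelete` then fails. This is why recovery is more likely to succeed the sooner it is attempted after a deletion.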

References

  1. ^ The cost of lost data - Graziadio Business Report
  2. ^ Etiolated Statistics. Etiolated Consumer\Citizen. Retrieved on 2007-06-05.