dd (Unix)

dd is a command-line utility for Unix and Unix-like operating systems whose primary purpose is to convert and copy files.[1]

On Unix, device drivers for hardware (such as hard disk drives) and special device files (such as /dev/zero and /dev/random) appear in the file system just like normal files; dd can read from and write to these files as well, provided that the respective driver implements the operation. As a result, dd can be used for tasks such as backing up the boot sector of a hard drive and obtaining a fixed amount of random data. The dd program can also perform conversions on the data as it is copied, including byte order swapping and conversion to and from the ASCII and EBCDIC text encodings.[2]
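
As a small illustration of the byte order swapping mentioned above, the following sketch pipes four bytes through dd with conv=swab. The variable name swapped is only illustrative, and 2>/dev/null merely hides the transfer statistics; it assumes a dd implementation that supports the swab conversion.

```shell
# conv=swab exchanges each pair of adjacent input bytes.
swapped=$(printf 'abcd' | dd conv=swab 2>/dev/null)
echo "$swapped"    # prints: badc
```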

The name dd is an allusion to the DD statement found in IBM's Job Control Language (JCL),[3][4] in which the initials stand for "Data Definition".[5] The command's syntax resembles the JCL statement more than it does other Unix commands, so the syntax may have been a joke.[3]

Originally intended to convert between ASCII and EBCDIC, dd first appeared in Version 5 Unix.[6] The dd command is specified by IEEE Std 1003.1-2008, which is part of the Single UNIX Specification.

Usage

The command line syntax of dd differs from many other Unix programs, in that it uses the syntax option=value for its command line options, rather than the more-standard -option value or --option=value formats. By default, dd reads from stdin and writes to stdout, but these can be changed by using the if (input file) and of (output file) options.
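
The option=value syntax and the stdin/stdout defaults can be sketched as follows. The variable names and the temporary file are hypothetical and exist only for this illustration.

```shell
# No if= or of=: dd copies standard input to standard output.
# bs=1 count=3 limits the copy to three one-byte blocks.
first3=$(printf 'hello' | dd bs=1 count=3 2>/dev/null)
echo "$first3"       # prints: hel

# The same options, this time with an explicit input file via if=.
tmp=$(mktemp)
printf 'hello' > "$tmp"
from_file=$(dd if="$tmp" bs=1 count=3 2>/dev/null)
echo "$from_file"    # prints: hel
rm -f "$tmp"
```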

Usage varies across operating systems, and certain features of dd depend on the capabilities of the computer system, such as dd's ability to implement an option for direct memory access. Sending a SIGINFO signal (or a USR1 signal on Linux) to a running dd process makes it print I/O statistics to standard error once and then continue copying. dd can also read standard input from the keyboard, in which case it exits when end-of-file (EOF) is reached. How EOF is entered depends on the environment: among Unix tools ported to Windows, for example, Cygwin uses Ctrl+D (the usual Unix EOF) and MKS Toolkit uses Ctrl+Z (the usual Windows EOF).

Output messages

The documentation for the GNU variant of dd, as supplied with coreutils on Linux, does not describe the format of the messages displayed on completion, which are written to standard error. However, the format is described by other implementations, such as the BSD one.

Each of the "Records in" and "Records out" lines shows the number of complete blocks transferred, a plus sign, and the number of partial blocks. A partial block arises, for example, when the physical medium ends before a complete block is read or when a physical error prevents a complete block from being read.
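
A minimal sketch of how a partial block shows up in these lines, using a hypothetical 700-byte scratch file (one complete 512-byte block plus one partial block of 188 bytes). The exact wording of the remaining statistics differs between implementations, so only the record counts are relied on here.

```shell
work=$(mktemp -d)
# Build a 700-byte input file.
dd if=/dev/zero of="$work/in.bin" bs=700 count=1 2>/dev/null
# The statistics go to standard error, so redirect that stream to capture them.
stats=$(dd if="$work/in.bin" of="$work/out.bin" bs=512 2>&1)
# Expect "1+1 records in" and "1+1 records out" among the lines printed.
echo "$stats"
rm -rf "$work"
```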

Block size

A block is a unit measuring the number of bytes that are read, written, or converted at one time. Command line options can specify a different block size for input/reading (ibs) compared to output/writing (obs), though the block size (bs) option will override both ibs and obs. The default value for both input and output block sizes is 512 bytes (the traditional block size of disks, and POSIX-mandated size of "a block"). The count option for copying is measured in blocks, as are both the skip count for reading and seek count for writing. Conversion operations are also affected by the "conversion block size" (cbs).
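
Because count, skip and seek are all measured in blocks, their meaning in bytes depends on the block size. The sketch below uses bs=1 so that one block equals one byte; the file name digits.txt is an assumption made for the example.

```shell
work=$(mktemp -d)
printf '0123456789' > "$work/digits.txt"

# skip= counts input blocks: skip 3 one-byte blocks, then copy 4.
middle=$(dd if="$work/digits.txt" bs=1 skip=3 count=4 2>/dev/null)
echo "$middle"     # prints: 3456

# seek= counts output blocks: write "XY" starting at byte offset 5.
# conv=notrunc keeps the rest of the existing file intact.
printf 'XY' | dd of="$work/digits.txt" bs=1 seek=5 conv=notrunc 2>/dev/null
patched=$(cat "$work/digits.txt")
echo "$patched"    # prints: 01234XY789
rm -rf "$work"
```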

The value provided for block size options is interpreted as a decimal (base 10) integer number of bytes. It can also contain suffixes to indicate that the block size is an integer number of larger units than bytes. The suffix w (words) means multiplication by 2, lowercase b (blocks) means 512, lowercase k (kibibytes) means 1024, uppercase M (mebibytes) means 1024 × 1024, G (gibibytes) means 1024 × 1024 × 1024, and so on for tebibytes, pebibytes, exbibytes, zebibytes, and yobibytes. Some implementations also understand the suffix uppercase B to indicate SI units, such as kB (kilobytes) for 1000 bytes or MB (megabytes) for 1,000,000 bytes. Thus bs=16M indicates a block size of 16 mebibytes (16,777,216 bytes), and bs=3kB specifies 3,000 bytes.
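
The suffix arithmetic can be checked with a short sketch. It relies only on the k suffix, since the SI forms such as kB are not understood by every implementation; the variable names are illustrative.

```shell
work=$(mktemp -d)
# Three blocks of 1k (1024 bytes) each.
dd if=/dev/zero of="$work/k.bin" bs=1k count=3 2>/dev/null
size_k=$(wc -c < "$work/k.bin" | tr -d ' ')
echo "$size_k"         # prints: 3072

# The multiplication behind bs=16M, done with shell arithmetic.
bytes_16m=$((16 * 1024 * 1024))
echo "$bytes_16m"      # prints: 16777216
rm -rf "$work"
```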

Additionally, some implementations understand the x character as a multiplication operator for both block size and count parameters. For example, bs=2x80x18b is interpreted as 2 × 80 × 18 × 512 = 1474560 bytes, the exact size of a 1440 KiB floppy disk.
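
Where the x operator is not available, the same product can be computed by the shell instead. This sketch, with the hypothetical file name floppy.img, creates a blank image of the floppy size given above.

```shell
work=$(mktemp -d)
# 2 sides x 80 tracks x 18 sectors per track, 512 bytes per sector.
sectors=$((2 * 80 * 18))
dd if=/dev/zero of="$work/floppy.img" bs=512 count="$sectors" 2>/dev/null
floppy_bytes=$(wc -c < "$work/floppy.img" | tr -d ' ')
echo "$floppy_bytes"    # prints: 1474560
rm -rf "$work"
```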

For some uses of the dd command, block size has an effect on performance. Doing many small reads or writes is often slower than doing fewer large ones. Using large blocks requires more RAM and can complicate error recovery. When dd is used with variable-block-size devices such as tape drives or networks, the block size may determine the tape record size or packet size, depending on the network protocol used.

Uses

The dd command can be used for a variety of purposes.

Data transfer

dd can duplicate data across files, devices, partitions and volumes. The data may be input or output to and from any of these; but there are important differences concerning the output when going to a partition. Also, during the transfer, the data can be modified using the conv options to suit the medium.

An attempt to copy an entire disk using cp may omit the final block if it is of an unexpected length, whereas dd may succeed. The source and destination disks should have the same size.

Data transfer forms of dd
blocks=$(isosize -d 2048 /dev/sr0)
dd if=/dev/sr0 of=isoimage.iso bs=2048 count="$blocks" status=progress
Creates an ISO disk image from a CD-ROM, DVD or Blu-ray disk.[7]
dd if=system.img of=/dev/sdc bs=4096 conv=noerror
Restores a hard disk drive (or an SD card, for example) from a previously created image.
dd if=/dev/sda2 of=/dev/sdb2 bs=4096 conv=noerror
Clones one partition to another.
dd if=/dev/ad0 of=/dev/ad1 bs=1M conv=noerror
Clones a hard disk drive "ad0" to "ad1".

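The noerror conversion tells dd to keep going after a read error. The sync conversion, often combined with it as conv=noerror,sync, pads each short input block with null bytes up to the full input block size, so that data following an unreadable region keeps its original offset in the output.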
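
The padding effect of sync can be observed without a failing disk, because a short final block is padded in the same way. The sketch below uses a hypothetical 700-byte input file and compares the output sizes with and without sync.

```shell
work=$(mktemp -d)
dd if=/dev/zero of="$work/in.bin" bs=700 count=1 2>/dev/null

# Without sync the trailing 188-byte block is written as it was read.
dd if="$work/in.bin" of="$work/plain.bin" bs=512 conv=noerror 2>/dev/null
# With sync the trailing block is padded with null bytes to 512 bytes.
dd if="$work/in.bin" of="$work/padded.bin" bs=512 conv=noerror,sync 2>/dev/null

plain_size=$(wc -c < "$work/plain.bin" | tr -d ' ')
padded_size=$(wc -c < "$work/padded.bin" | tr -d ' ')
echo "$plain_size $padded_size"    # prints: 700 1024
rm -rf "$work"
```

One consequence is that an image made with sync can be longer than its source when the source size is not a multiple of the block size.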

Master boot record backup and restore

It is possible to repair a master boot record. It can be transferred to and from a repair file.

To duplicate the first two sectors of a floppy drive:

dd if=/dev/fd0 of=MBRboot.img bs=512 count=2

To create an image of the entire x86 master boot record (including an MS-DOS partition table and the MBR magic bytes):

dd if=/dev/sda of=MBR.img bs=512 count=1

To create an image of only the boot code of the master boot record (without the partition table and without the magic bytes required for booting):

dd if=/dev/sda of=MBR_boot.img bs=446 count=1
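
The backup and restore round trip can be rehearsed safely on an ordinary file before touching a real disk. In this sketch the file disk.img is a hypothetical stand-in for /dev/sda; on a real device the same commands would require root privileges and great care.

```shell
work=$(mktemp -d)
# A 1 MiB scratch file that plays the role of the disk.
dd if=/dev/zero of="$work/disk.img" bs=1024 count=1024 2>/dev/null

# Back up the whole first sector, then only the 446 bytes of boot code.
dd if="$work/disk.img" of="$work/MBR.img" bs=512 count=1 2>/dev/null
dd if="$work/disk.img" of="$work/MBR_boot.img" bs=446 count=1 2>/dev/null

# Restore the boot code only. conv=notrunc matters for a regular file:
# without it the scratch "disk" would be cut down to 446 bytes.
dd if="$work/MBR_boot.img" of="$work/disk.img" bs=446 count=1 conv=notrunc 2>/dev/null

mbr_size=$(wc -c < "$work/MBR.img" | tr -d ' ')
boot_size=$(wc -c < "$work/MBR_boot.img" | tr -d ' ')
disk_size=$(wc -c < "$work/disk.img" | tr -d ' ')
echo "$mbr_size $boot_size $disk_size"    # prints: 512 446 1048576
rm -rf "$work"
```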

Data modification

dd can modify data in place. For example, this overwrites the first 512 bytes of a file with null bytes:

dd if=/dev/zero of=path/to/file bs=512 count=1 conv=notrunc

The notrunc conversion option means do not truncate the output file: if the output file already exists, dd replaces only the specified bytes and leaves the rest of the file alone. Without this option, dd would create an output file 512 bytes long.
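
The difference can be sketched on two identical scratch files, one overwritten with notrunc and one without. The file names keep.bin and cut.bin are assumptions made for the example.

```shell
work=$(mktemp -d)
# Two 1024-byte files filled with the letter A.
dd if=/dev/zero bs=1024 count=1 2>/dev/null | tr '\0' 'A' > "$work/keep.bin"
cp "$work/keep.bin" "$work/cut.bin"

# With notrunc only the first 512 bytes change; the file keeps its length.
dd if=/dev/zero of="$work/keep.bin" bs=512 count=1 conv=notrunc 2>/dev/null
# Without notrunc the output file is truncated first and ends up 512 bytes long.
dd if=/dev/zero of="$work/cut.bin" bs=512 count=1 2>/dev/null

keep_size=$(wc -c < "$work/keep.bin" | tr -d ' ')
cut_size=$(wc -c < "$work/cut.bin" | tr -d ' ')
last_byte=$(tail -c 1 "$work/keep.bin")
echo "$keep_size $cut_size $last_byte"    # prints: 1024 512 A
rm -rf "$work"
```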

To duplicate a disk partition as a disk image file on a different partition:

dd if=/dev/sdb2 of=partition.image bs=4096 conv=noerror

Disk wipe

For security reasons, it is sometimes necessary to wipe the disk of a device that is being discarded.

To wipe a disk by writing zeros to it, dd can be used this way:

dd if=/dev/zero of=/dev/sda bs=16M

Another approach could be to wipe a disk by writing random data to it:

dd if=/dev/urandom of=/dev/sda bs=16M

Unlike in the data modification example above, the notrunc conversion option is not required here, as it has no effect when dd's output file is a block device.[8]

The bs=16M option makes dd read and write 16 mebibytes at a time. For modern systems, an even greater block size may be faster. Note that filling the drive with random data may take longer than zeroing the drive, because the random data must be generated by the CPU, while creating zeroes is very fast. On modern hard disk drives, zeroing the drive will render most data it contains permanently irrecoverable.[9] However, with other kinds of drives such as flash memories, much data may still be recoverable by special laboratory techniques.

Modern hard disk drives contain a Secure Erase command designed to permanently and securely erase every accessible and inaccessible portion of a drive. It may also work for some solid-state drives (flash drives). As of 2017, it does not work on USB flash drives or on Secure Digital flash memories. When available, it is both faster and more secure than using dd. On Linux machines it is accessible via the hdparm command's --security-erase-enhanced option.

The shred program offers multiple overwrites as well as more-secure deletion of individual files.

Data recovery

The early history of open-source software for data recovery and restoration of files, drives and partitions included the GNU dd, whose copyright notice starts in 1985,[10] with one block size per dd process, and no recovery algorithm other than the user's interactive session running one form of dd after another. Then, a C program called dd_rescue[11] was written in October 1999, having two block sizes in its algorithm. However, the author of the 2003 shell script dd_rhelp, which enhances dd_rescue's data recovery algorithm, recommends GNU ddrescue,[12][13] a data recovery program unrelated to dd that was initially released in 2004.

To help distinguish the newer GNU program from the older script, alternate names are sometimes used for GNU's ddrescue, including addrescue (the name on freecode.com and freshmeat.net), gddrescue (Debian package name), and gnu_ddrescue (openSUSE package name). Another open-source program called savehd7 uses a sophisticated algorithm, but it also requires the installation of its own programming-language interpreter.

Benchmarking drive performance

To benchmark a drive and measure its sequential (and usually single-threaded) write and read performance with 1024-byte blocks:

dd if=/dev/zero bs=1024 count=1000000 of=file_1GB
dd if=file_1GB of=/dev/null bs=1024

Generating a file with random data

To make a file of 100 random bytes using the kernel random driver:

dd if=/dev/urandom of=myrandom bs=100 count=1

Converting a file to upper case

To convert a file to uppercase:

dd if=filename of=filename1 conv=ucase,notrunc
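
The same conversion also works in a pipeline, which makes it easy to try out. This sketch assumes a dd implementation that supports the ucase conversion; the variable name upper is illustrative.

```shell
# conv=ucase maps lowercase letters to uppercase as the data is copied.
upper=$(printf 'hello, dd\n' | dd conv=ucase 2>/dev/null)
echo "$upper"    # prints: HELLO, DD
```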

Limitations

According to documentation provided by Seagate, "certain disc [sic] utilities, such as DD, which depend on low-level disc [sic] access may not support 48-bit LBAs until they are updated".[14] Using ATA hard disk drives over 128 GiB in size requires system support for 48-bit LBA. In Linux, however, dd uses the kernel to read from or write to raw device files instead of accessing the hardware directly,[note 1] and the kernel has supported 48-bit LBA since version 2.4.23, released in 2003.[15][16]

Dcfldd

dcfldd is an enhanced fork of dd developed by Nick Harbour, who at the time was working for the United States Department of Defense Computer Forensics Lab.[17][18][19] Compared to dd, dcfldd allows for more than one output file, supports simultaneous multiple checksum calculations, provides a verification mode for file matching, and can display the percentage progress of an operation.

Notes

  1. This is verifiable with strace.

References

  1. Austin Group. "POSIX standard: dd invocation". Retrieved 2016-09-29.
  2. Sam Chessman. "How and when to use the dd command?". CodeCoffee. Retrieved 2008-02-19.
  3. Eric S. Raymond. "dd". Retrieved 2008-02-19.
  4. Dennis Ritchie (Feb 17, 2004). "Re: origin of the UNIX dd command". Newsgroup: alt.folklore.computers. Usenet: c0s1he$1atuh9$1@ID-156882.news.uni-berlin.de. Retrieved January 10, 2016. dd was always named after JCL dd cards.
  5. Barry Shein (Apr 22, 1990). "Re: etymology of the Unix "dd" command". Newsgroup: alt.folklore.computers. Usenet: 1990Apr22.191928.11180@world.std.com. Retrieved 2016-07-14.
  6. McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139.
  7. "Reading an ISO image from a CD, DVD, or BD". Arch Linux documentation. Retrieved 2017-01-22.
  8. "linux - Why using conv=notrunc when cloning a disk with dd?". Stack Overflow. 2013-12-11. Retrieved 2014-03-24.
  9. Wright, Craig; Kleiman, Dave; Sundhar R.S., Shyaam (2008). "Overwriting Hard Drive Data: The Great Wiping Controversy". Lecture Notes in Computer Science. Information Systems Security. 5352: 243–257. doi:10.1007/978-3-540-89862-7_21. Retrieved 7 March 2012.
  10. "Savannah Git Hosting – coreutils.git/blob – src/dd.c". git.savannah.gnu.org. Retrieved January 21, 2015.
  11. "dd_rescue". garloff.de.
  12. "Ddrescue - GNU Project - Free Software Foundation (FSF)". gnu.org.
  13. LAB Valentin (19 September 2011). "dd_rhelp author's repository". Important note : For some times, dd_rhelp was the only tool (AFAIK) that did this type of job, but since a few years, it is not true anymore: Antonio Diaz did write a ideal replacement for my tool: GNU 'ddrescue'.
  14. "Windows 137GB (128 GiB) Capacity Barrier". Seagate Technology. March 2003.
  15. "ChangeLog-2.4.23". www.kernel.org. Retrieved 2009-12-07.
  16. "Linux-2.4.23 released". Linux kernel mailing list. 2003.
  17. "DCFLDD at Source Forge". Source Forge. Retrieved 2013-08-17.
  18. Jeremy Faircloth, Chris Hurley (2007). Penetration Tester's Open Source Toolkit. Syngress. pp. 470–472. ISBN 9780080556079.
  19. Jack Wiles, Anthony Reyes (2011). The Best Damn Cybercrime and Digital Forensics Book Period. Syngress. pp. 408–411. ISBN 9780080556086.