Data segment

In computing, a data segment (often denoted .data) is a portion of an object file or the corresponding virtual address space of a program that contains initialized static variables, that is, global variables and static local variables. The size of this segment is determined by the size of the values in the program's source code, and does not change at run time.

The data segment is read-write, since the values of variables can be altered at run time. This is in contrast to the read-only data segment (rodata segment or .rodata), which contains static constants rather than variables; it also contrasts to the code segment, also known as the text segment, which is read-only on many architectures. Uninitialized data, both variables and constants, is instead in the BSS segment.

The PC architecture supports a few basic read-write memory regions in a program: stack, data and code. The heap is another region of address space available to a program, from which memory can be dynamically allocated or freed by the operating system in response to system calls such as malloc and free.

Program memory

The computer program memory is organized into the data segment (data and BSS), heap, stack, and code segment, as further explained below.

Data

The data area contains global and static variables used by the program that are explicitly initialized with a non-zero (or non-NULL) value. This segment can be further classified into a read-only area and read-write area. For instance, the string defined by char s[] = "hello world" in C, or a C statement like int debug = 1 outside the main function, would be stored in initialized read-write area. Also, a C statement like const char* string = "hello world", placed outside the main function, stores the string literal "hello world" in the read-only area (however, some C implementations may store it in the code segment instead), and the character pointer variable string in initialized read-write area.

BSS

The BSS segment, also known as uninitialized data, is usually adjacent to the data segment. The BSS segment contains all global variables and static variables that are initialized to zero or do not have explicit initialization in source code. For instance, a variable defined as static int i; would be contained in the BSS segment.

Heap

The heap area commonly begins at the end of the .bss and .data segments and grows to larger addresses from there. The heap area is managed by malloc, realloc, and free, which may use the brk and sbrk system calls to adjust its size (note that the use of brk/sbrk and a single "heap area" is not required to fulfill the contract of malloc/realloc/free; they may also be implemented using mmap to reserve potentially non-contiguous regions of virtual memory into the process' virtual address space). The heap area is shared by all threads, shared libraries, and dynamically loaded modules in a process.

Stack

Main article: Call stack

The stack area contains the program stack, a LIFO structure, typically located in the higher parts of memory. A "stack pointer" register tracks the top of the stack; it is adjusted each time a value is "pushed" onto the stack. The set of values pushed for one function call is termed a "stack frame". A stack frame consists at minimum of a return address. Automatic variables are also allocated on the stack.

The stack area traditionally adjoined the heap area and they grew towards each other; when the stack pointer met the heap pointer, free memory was exhausted. With large address spaces and virtual memory techniques they tend to be placed more freely, but they still typically grow in opposite directions. On the standard PC x86 architecture the stack grows toward address zero, meaning that more recent items, deeper in the call chain, are at numerically lower addresses and closer to the heap. On some other architectures it grows the opposite direction.

Interpreted languages

Some interpreted languages offer a similar facility to the data segment, notably Perl[1] and Ruby.[2] In these languages, including the line __DATA__ (Perl) or __END__ (Ruby, old Perl) marks the end of the code segment and the start of the data segment. Only the contents prior to this line are executed, and the contents of the source file after this line are available as a file object: PACKAGE::DATA in Perl (e.g., main::DATA) and DATA in Ruby. This can be considered a form of here document (a file literal).

See also

References

External links