C file input/output

From Wikipedia, the free encyclopedia

The C programming language provides many standard library functions for file input and output. These functions make up the bulk of the C standard library header <stdio.h>.

The I/O functionality of C is fairly low-level by modern standards; C abstracts all file operations into operations on streams of bytes, which may be "input streams" or "output streams". Unlike some earlier programming languages, C has no direct support for random-access data files; to read from a record in the middle of a file, the programmer must create a stream, seek to the middle of the file, and then read bytes in sequence from the stream.

The stream model of file I/O was popularized by the Unix operating system, which was developed concurrently with the C programming language itself. The vast majority of modern operating systems have inherited streams from Unix, and many languages in the C programming language family have inherited C's file I/O interface with few if any changes (for example, PHP). The C++ standard library reflects the "stream" concept in its syntax; see iostream.

Opening a file using fopen

A file is opened using fopen, which returns a pointer to an I/O stream attached to the specified file or other device; reading and writing are then done through that stream. If the function fails, it returns a null pointer.

The related C library function freopen performs the same operation after first closing any open stream associated with its parameter.

They are defined as

FILE *fopen(const char *path, const char *mode);
FILE *freopen(const char *path, const char *mode, FILE *fp);
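
As a rough illustration of how freopen reuses an existing stream, the sketch below opens one file and then reattaches the same stream to a second file. The function name first_byte_after_reopen and both path parameters are hypothetical, chosen only for this example.

```c
#include <stdio.h>

/* Hypothetical helper: opens path_a, then reuses the same stream for
   path_b via freopen. Returns the first byte of path_b, or EOF on failure. */
int first_byte_after_reopen(const char *path_a, const char *path_b)
{
    FILE *fp = fopen(path_a, "rb");
    if (fp == NULL)
        return EOF;

    /* freopen closes the file currently attached to fp before opening path_b. */
    FILE *reopened = freopen(path_b, "rb", fp);
    if (reopened == NULL)
        return EOF;              /* the original stream must not be used again */

    int c = fgetc(reopened);     /* now reads from path_b, not path_a */
    fclose(reopened);
    return c;
}
```

A caller would pass the names of two existing files, for example first_byte_after_reopen("a.dat", "b.dat"); both names are placeholders.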

On Unix operating systems, the fopen function is essentially a slightly higher-level wrapper for the open system call. In the same way, fclose is often a thin wrapper for the Unix system call close, and the C FILE structure typically wraps a Unix file descriptor together with a buffer. In POSIX environments, the fdopen function can be used to initialize a FILE structure from a file descriptor; however, file descriptors are a Unix concept not present in standard C.

The mode parameter to fopen and freopen must be a string that begins with one of the following sequences:

mode | description | starts at
r, rb | open for reading | beginning
w, wb | open for writing (creates the file if it doesn't exist; otherwise discards its contents) | beginning
a, ab | open for appending (creates the file if it doesn't exist) | end
r+, rb+, r+b | open for reading and writing | beginning
w+, wb+, w+b | open for reading and writing (creates the file if it doesn't exist; otherwise discards its contents) | beginning
a+, ab+, a+b | open for reading and writing (creates the file if it doesn't exist; writes are appended) | end

The "b" stands for binary. The C standard provides for two kinds of files, text files and binary files, although operating systems are not required to distinguish between the two. A text file is a file consisting of text arranged in lines with some sort of distinguishing end-of-line character or sequence (in Unix, a bare line feed character; in Microsoft Windows, a carriage return followed by a line feed). When bytes are read from a text file, an end-of-line sequence is usually mapped to a line feed for ease of processing. When a text file is written to, a bare line feed is mapped to the OS-specific end-of-line sequence before writing. A binary file is a file whose bytes are read and written "raw", without any kind of mapping.
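
The mapping can be observed with a small experiment, sketched here under the assumption that the program may create a scratch file. The helper name stored_bytes_for_one_line and its path parameter are hypothetical. It writes the two characters "a\n" in text mode and then counts the bytes actually stored by rereading the file in binary mode.

```c
#include <stdio.h>

/* Hypothetical helper: writes "a\n" in text mode, then counts the bytes
   stored in the file by rereading it in binary mode.
   Returns the count, or -1 on failure. */
long stored_bytes_for_one_line(const char *path)
{
    FILE *fp = fopen(path, "w");     /* text mode: '\n' may be translated */
    if (fp == NULL)
        return -1;
    if (fputs("a\n", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) == EOF)
        return -1;

    fp = fopen(path, "rb");          /* binary mode: bytes delivered raw */
    if (fp == NULL)
        return -1;
    long n = 0;
    while (fgetc(fp) != EOF)
        ++n;
    fclose(fp);
    return n;
}
```

The result depends on the platform's end-of-line convention: a system that stores a bare line feed would be expected to report two bytes, and one that stores a carriage return plus line feed would report three.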

When a file is opened with update mode ('+' as the second or third character in the mode argument), both input and output may be performed on the associated stream. However, writes cannot be followed by reads without an intervening call to fflush or to a file positioning function (fseek, fsetpos, or rewind), and reads cannot be followed by writes without an intervening call to a file positioning function, unless the read encountered end-of-file. [1]
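
A minimal sketch of this rule, assuming the standard tmpfile function is available to supply a scratch stream (it opens the stream as if with mode "wb+"). The helper name write_then_read and its parameters are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes text to a temporary update-mode stream,
   repositions, and reads the text back into out (capacity outsize).
   Returns 0 on success, -1 on failure. */
int write_then_read(const char *text, char *out, size_t outsize)
{
    if (outsize == 0)
        return -1;
    FILE *fp = tmpfile();                 /* update mode, as if "wb+" */
    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    /* A positioning call is required between the write and the read. */
    if (fseek(fp, 0L, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }
    size_t n = 0;
    int c;
    while (n + 1 < outsize && (c = fgetc(fp)) != EOF)
        out[n++] = (char)c;
    out[n] = '\0';
    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : 0;
}
```

Omitting the fseek call would leave the behavior of the subsequent read undefined, which is why the sketch treats a failed reposition as an error rather than reading anyway.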

Writing and appending modes will attempt to create a file of the given name, if no such file already exists. As mentioned above, if this operation fails, fopen will return NULL.
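
The sketch below shows one way a caller might check fopen's result before using the stream. The helper name append_line and its parameters are hypothetical; the error message format is likewise only an illustration.

```c
#include <stdio.h>

/* Hypothetical helper: appends one line to the file at path, creating the
   file if it does not exist. Returns 0 on success, -1 on failure. */
int append_line(const char *path, const char *line)
{
    FILE *fp = fopen(path, "a");     /* "a" creates the file if it is missing */
    if (fp == NULL) {
        fprintf(stderr, "cannot open %s for appending\n", path);
        return -1;
    }
    int ok = (fputs(line, fp) != EOF) && (fputc('\n', fp) != EOF);
    if (fclose(fp) == EOF)           /* a failed close also counts as failure */
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling append_line twice with the same path would be expected to leave both lines in the file, since append mode positions every write at the end.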

Closing a stream using fclose

The fclose function takes one argument: a pointer to the FILE structure of the stream to close.

int fclose(FILE *fp);

The function returns zero on success, or EOF on failure.

Reading from a stream using fgetc

The fgetc function is used to read a character from a stream.

int fgetc(FILE *fp);

If successful, fgetc returns the next byte or character from the stream (depending on whether the file is "binary" or "text", as discussed under fopen above), as an unsigned char converted to int. If unsuccessful, fgetc returns EOF. (Whether EOF was caused by end-of-file or by a read error can be determined by calling feof or ferror with the file pointer.)

The standard macro getc, also defined in <stdio.h>, behaves in almost the same way as fgetc, except that, being a macro, it may evaluate its stream argument more than once, so that argument should not be an expression with side effects.

The standard function getchar, also defined in <stdio.h>, takes no arguments, and is equivalent to fgetc(stdin).
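
A typical reading loop built on fgetc might look like the following sketch. The helper name count_bytes is hypothetical; it reads until fgetc returns EOF and then uses ferror to decide whether the loop stopped because of an error.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in fp.
   Returns the count, or -1 if a read error occurred. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;                           /* int, so every byte value and EOF fit */
    while ((c = fgetc(fp)) != EOF)
        ++n;
    return ferror(fp) ? -1 : n;      /* EOF alone does not say why reading stopped */
}
```

A caller could pass any readable stream, for example count_bytes(stdin).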

The EOF pitfall

A common mistake when using fgetc, getc, or getchar is to assign the result to a variable of type char before comparing it to EOF. The following snippets of code exhibit this mistake, and then show the correct approach:

char c;
while ((c = getchar()) != EOF) { /* Bad! */
    putchar(c);
}

int c;
while ((c = getchar()) != EOF) { /* Okay! */
    putchar(c);
}

Consider a system in which the type char is 8 bits wide, representing 256 different values. getchar may return any of the 256 possible characters, and it also may return EOF to indicate end-of-file, for a total of 257 different possible return values.

When getchar's result is assigned to a char, which can represent only 256 different values, there is necessarily some loss of information — when packing 257 items into 256 slots, there must be a collision. The EOF value, when converted to char, becomes indistinguishable from whichever one of the 256 characters shares its numerical value. If that character is found in the file, the above example may mistake it for an end-of-file indicator; or, just as bad, if type char is unsigned, then because EOF is negative, it can never be equal to any unsigned char, so the above example will not terminate at end-of-file. It will loop forever, repeatedly printing the character which results from converting EOF to char.
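
Both failure modes can be reproduced without any file at all. The sketch below assumes that int is wider than char; the two helper names are hypothetical. Each helper stores a value in a narrow character type and returns what an int comparison against EOF would then see.

```c
#include <stdio.h>

/* Hypothetical helper: the value seen after storing value in a signed char.
   The narrowing conversion is implementation-defined for out-of-range values. */
int as_signed_char(int value)
{
    signed char c = (signed char)value;
    return c;                        /* promoted back to int */
}

/* Hypothetical helper: the value seen after storing value in an unsigned char. */
int as_unsigned_char(int value)
{
    unsigned char c = (unsigned char)value;
    return c;                        /* always non-negative when int is wider */
}
```

On a typical system with 8-bit char and EOF defined as -1, as_signed_char(0xFF) compares equal to EOF, which is the collision case, while as_unsigned_char(EOF) is non-negative and therefore never equal to EOF, which is the infinite-loop case.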

On systems where int and char are the same size, even the "good" example will suffer from the indistinguishability of EOF and some character's value. The proper way to handle this situation is to check feof and ferror after getchar returns EOF. If feof indicates that end-of-file has not been reached, and ferror indicates that no errors have occurred, then the EOF returned by getchar can be assumed to represent an actual character. These extra checks are rarely done, because most programmers assume that their code will never need to run on one of these "big char" systems.
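
The extra checks described above could be packaged as follows. This is only a sketch; the helper name read_char and its return-code convention are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads one character from fp.
   Returns 1 and stores the character in *out if one was read,
   0 at end-of-file, and -1 on a read error. */
int read_char(FILE *fp, int *out)
{
    int c = fgetc(fp);
    if (c == EOF) {
        if (feof(fp))
            return 0;                /* genuine end-of-file */
        if (ferror(fp))
            return -1;               /* genuine read error */
        /* Neither indicator is set: EOF is an actual character value. */
    }
    *out = c;
    return 1;
}
```

On systems where int is wider than char, the fall-through branch is never taken, so the helper behaves like an ordinary EOF test there.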

External link

  • Question 12.1 in the C FAQ: using char to hold getc's return value

Writing to a stream using fputc

The fputc function is used to write a character to a stream.

int fputc(int c, FILE *fp);

The parameter c is silently converted to an unsigned char before being output. If successful, fputc returns the character written. If unsuccessful, fputc returns EOF.

The standard macro putc, also defined in <stdio.h>, behaves in almost the same way as fputc, except that, being a macro, it may evaluate its stream argument more than once, so that argument should not be an expression with side effects.

The standard function putchar, also defined in <stdio.h>, takes only the first argument, and is equivalent to fputc(c, stdout) where c is that argument.
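
The conversion to unsigned char can be made visible with a value that does not fit in one byte. The sketch below assumes 8-bit char and uses the standard tmpfile function for a scratch stream; the helper name put_and_read_back is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes value with fputc to a temporary file and
   reads the stored byte back. fputc's own return value goes to *returned.
   Returns the byte read back, or EOF on failure. */
int put_and_read_back(int value, int *returned)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return EOF;
    *returned = fputc(value, fp);    /* value is converted to unsigned char */
    rewind(fp);                      /* reposition before switching to reading */
    int stored = fgetc(fp);
    fclose(fp);
    return stored;
}
```

With 8-bit char, passing 0x141 would be expected to store and return 0x41, the low-order eight bits of the argument.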

Example usage

The following C program opens a binary file called myfile.dat, reads five bytes from it, and then closes the file.

#include <stdio.h>

int main(void)
{
    char buffer[5] = {0};  /* initialized to zeros */
    int i;
    FILE *fp = fopen("myfile.dat", "rb");  /* open for reading in binary mode */
    if (fp == NULL) {
        fprintf(stderr, "The file didn't open.\n");
        return 1;
    }
    for (i = 0; i < 5; ++i) {
        int rc = fgetc(fp);  /* int, not char, so EOF stays distinguishable */
        if (rc == EOF) {
            fprintf(stderr, "The file ended early or a read error occurred.\n");
            break;
        }
        buffer[i] = (char)rc;
    }
    fclose(fp);
    if (i == 5) {
        printf("The bytes read were...\n");
        for (i = 0; i < 5; ++i) {
            putchar(buffer[i]);
        }
        putchar('\n');
    }
    return 0;
}
