User:Omphaloscope/CSV
Comma-separated values | |
---|---|
File name extension | .csv |
Internet media type | text/csv |
The comma-separated values (or CSV) file format is a delimited data format commonly used for storing tabular data, such as an electronic spreadsheet. Data in CSV format typically appears like this, although there are variants:
"ID","Last name","First name","Email"
"42","Adams","Douglas","douglas.adams@wikipedia.org"
Each line has a number of fields separated (delimited) by commas. Rows are separated by line breaks (newlines). Fields that contain a comma, newline, or double quotation mark, or that start or end with whitespace, must be enclosed in double quotation marks. Furthermore, if a line contains a single entry that is the empty string, that entry must be enclosed in double quotation marks. A double quotation mark inside a field's value is escaped by placing another double quotation mark next to it. The CSV file format does not require a specific character encoding, byte order, or line terminator format.
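The quoting and escaping rules above can be sketched with the `csv` module from Python's standard library. The sample rows are invented for illustration, and the choice of `QUOTE_MINIMAL` with a `"\n"` line terminator is an assumption made to keep the output easy to read. Note that this module does not, by default, quote fields merely because they start or end with whitespace.

```python
import csv
import io

# Hypothetical sample data: one field of each kind that needs quoting.
rows = [
    ["ID", "Note"],
    ["1", "plain text"],
    ["2", "contains, a comma"],
    ["3", 'contains a "quoted" word'],
    ["4", "spans\ntwo lines"],
]

buffer = io.StringIO()
# QUOTE_MINIMAL quotes only the fields that need it; embedded quotes are doubled.
writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

The plain field is written unquoted, while the fields containing a comma, a double quotation mark, or a line break are each enclosed in double quotation marks.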
Specification
While no formal specification for CSV exists, RFC 4180 describes a common format and establishes "text/csv" as the MIME type registered with the IANA. Many informal documents exist that describe the CSV format. How To: The Comma Separated Value (CSV) File Format provides an overview of the CSV format in the most widely used applications and explains how it can best be used and supported.
Example
1997 | Ford | E350 | ac, abs, moon | 3000.00 |
1999 | Chevy | Venture "Extended Edition" | | 4900.00 |
1996 | Jeep | Grand Cherokee | MUST SELL!<br>air, moon roof, loaded | 4799.00 |
The above table of data may be represented in CSV format as follows:
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""",,4900.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
This CSV example illustrates that:
- fields that contain commas, double-quotes, or line-breaks must be quoted,
- a quote within a field must be escaped with an additional quote immediately preceding the literal quote,
- spaces before and after delimiter commas may be trimmed by some implementations (RFC 4180, by contrast, treats spaces as part of a field), and
- a line break within an element must be preserved.
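As a rough check of these points, the example records can be parsed with Python's standard `csv` module. This is only a sketch: it assumes that the description field of the last record contains a line break after "MUST SELL!", and the variable names are illustrative.

```python
import csv
import io

# The three example records; the last one has a line break inside a quoted field
# (the exact position of that break is an assumption).
csv_text = (
    '1997,Ford,E350,"ac, abs, moon",3000.00\n'
    '1999,Chevy,"Venture ""Extended Edition""",,4900.00\n'
    '1996,Jeep,Grand Cherokee,"MUST SELL!\nair, moon roof, loaded",4799.00\n'
)

# newline="" leaves line endings untouched so the reader sees embedded breaks.
records = list(csv.reader(io.StringIO(csv_text, newline="")))

for record in records:
    print(record)
```

Four physical lines yield three records of five fields each: the quoted commas stay inside their field, the doubled quotes collapse to single ones, and the embedded line break is preserved.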
Application support
The CSV file format is a very simple data file format that is supported by almost all spreadsheet software, such as Excel (although some localized versions use semicolons instead of commas), Calc, and Gnumeric. Any programming language that has input/output and string processing functionality can read and write CSV files.
CSV files are as ubiquitous for tabular data as ASCII files are for text data.
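To illustrate that basic string processing is enough, here is a minimal sketch of a CSV parser written without any CSV library. The function name `parse_csv` and its `delimiter` parameter are hypothetical, and the sketch makes no attempt to validate malformed input; the `delimiter` argument covers the semicolon variant mentioned above.

```python
def parse_csv(text, delimiter=","):
    """Split CSV text into rows of fields using only basic string handling."""
    rows, row, field = [], [], []
    in_quotes = False
    pending = False  # True once the current row has consumed any character
    i = 0
    while i < len(text):
        ch = text[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < len(text) and text[i + 1] == '"':
                    field.append('"')  # doubled quote stands for one literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                field.append(ch)  # delimiters and line breaks are literal here
        elif ch == '"':
            in_quotes = True
        elif ch == delimiter:
            row.append("".join(field))
            field = []
        elif ch == "\n":
            row.append("".join(field))
            rows.append(row)
            row, field, pending = [], [], False
            i += 1
            continue
        elif ch == "\r":
            i += 1  # simplification: carriage returns outside quotes are ignored
            continue
        else:
            field.append(ch)
        pending = True
        i += 1
    if pending:  # last row without a trailing line break
        row.append("".join(field))
        rows.append(row)
    return rows


# Semicolon-delimited input, as produced by some localized spreadsheet versions.
sample = '1997;Ford;"ac; abs"\n1999;Chevy;"Venture ""Extended Edition"""\n'
parsed = parse_csv(sample, delimiter=";")
print(parsed)
```

The parser is a two-state machine (inside or outside quotes), which is the main design point: every special character is ordinary data while inside quotes.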
Programming language tools
Language | Tool | Notes |
---|---|---|
BASIC | none required | supported internally |
C/C++ | Free Tools: | No comments in code. separated documentation. Well documented, includes a CSV BNF grammar. |
Haskell | Text.CSV module by Jaap Weel | |
Java | Several free CSV tools exist: CSVReader/Writer, CSVFile; and commercial tools: Ricebridge Java CSV Component. There are also JDBC drivers available and an ODBC driver. | |
LISP | fare-csv, csv-parser | fare-csv is an ASDF package, csv-parser is a .lisp file |
Mathematica | | |
MATLAB | csvread, dlmread | In the standard library. |
.Net | FileHelpers - An Automatic File Import/Export Framework by Marcos Meli (LGPL). Fast CSV Reader by Sébastien Lorion, open source class (MIT licence). GemBox.Spreadsheet by GemBox Software for CSV <==> XLS conversion. | |
OCaml | OCaml CSV. Col: conversion between lists of records and CSV files with header (Camlp4 syntax extension) | |
Perl | Text::CSV_XS, Text::CSV_PP, or using a Perl DBI interface: | from CPAN |
PHP | fgetcsv() function | In the standard library. Does not support newlines within element. |
Python | Python CSV module | In the standard library. |
R | read.csv | In the standard library. |
Ruby | Ruby CSV module, or FasterCSV by James Gray | In the standard library. |
Scheme | Chicken Scheme CSV module | |
Utilities
The csvprint utility will reformat CSV input based on a format string. This can be useful for reordering fields or generating source code or tables as illustrated in the following example:
$ csvprint data.csv "\t{ %0, %1, %2, \"%3\" },\n"
{ 0xC0000008, 0x00060001, NT_STATUS_INVALID_HANDLE, "The handle is invalid." },
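The idea of format-string substitution can be sketched in a few lines of Python. This is not the csvprint implementation: the function name `format_records` is hypothetical, the contents of data.csv are not shown in the example and are assumed here, and only the `%N` placeholders are handled.

```python
import csv
import io
import re


def format_records(csv_text, template):
    """Replace %0, %1, ... in template with the fields of each CSV record."""
    pieces = []
    for record in csv.reader(io.StringIO(csv_text, newline="")):
        # A placeholder that refers to a missing field raises IndexError.
        pieces.append(
            re.sub(r"%(\d+)", lambda m: record[int(m.group(1))], template)
        )
    return "".join(pieces)


# Assumed input line, chosen to match the output shown in the example.
data = "0xC0000008,0x00060001,NT_STATUS_INVALID_HANDLE,The handle is invalid.\n"
result = format_records(data, '\t{ %0, %1, %2, "%3" },\n')
print(result, end="")
```

Each record produces one copy of the template, so the same approach can reorder fields or emit rows of a source-code table.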
csvdiff is a Perl script that compares two comma-separated (or otherwise delimited) files. Unlike standard diff, it reports the number of the record in which a difference occurs and the field (column) that differs. The separator can be set to any value, not just a comma. A third file containing the column names, on a single line separated by the same separator, can also be supplied; if it is, the column name is shown for each difference found. Example:
$ perl csvdiff.pl -a act.csv -e exp.csv -s ";" -c col_names.csv -k "2" -t -i
Record with key "200100500" is different:
Actual line 006 > 200100500;200100500;6;;;;;;000;0;2005-12-20;55 <
Expected line 008 > 200100500;200100500;6;;;;;;000;0;2005-12-19;55 <
Difference in field no.: 11 - field name: Dat_Rueckgabe
Actual > 2005-12-20 <
Expected > 2005-12-19 <
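A field-level comparison of this kind might be sketched as follows. This is not the csvdiff script itself: the function name `diff_records`, its parameters, and every column name except Dat_Rueckgabe are hypothetical, and records that appear in only one input are simply skipped.

```python
import csv
import io


def diff_records(actual_text, expected_text, key_field, delimiter=";", names=None):
    """Return (key, field_no, field_name, actual, expected) for each differing field.

    key_field and the reported field numbers are 1-based.
    """
    def load(text):
        reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
        return {row[key_field - 1]: row for row in reader if row}

    actual, expected = load(actual_text), load(expected_text)
    differences = []
    for key, actual_row in actual.items():
        expected_row = expected.get(key)
        if expected_row is None:
            continue  # sketch only: unmatched records are not reported
        pairs = zip(actual_row, expected_row)
        for number, (a, e) in enumerate(pairs, start=1):
            if a != e:
                name = names[number - 1] if names else None
                differences.append((key, number, name, a, e))
    return differences


actual = "200100500;200100500;6;;;;;;000;0;2005-12-20;55\n"
expected = "200100500;200100500;6;;;;;;000;0;2005-12-19;55\n"
# Placeholder column names; only the eleventh name comes from the example.
names = [f"col{n}" for n in range(1, 13)]
names[10] = "Dat_Rueckgabe"

differences = diff_records(actual, expected, key_field=2, names=names)
print(differences)
```

Matching records by key rather than by line position is what allows the two files to list the same record on different lines, as in the example (line 006 versus line 008).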
External links
- RFC 4180: Common Format and MIME Type for Comma-Separated Values (CSV) Files
- How To: The Comma Separated Value (CSV) File Format