Netstrings
In computer programming, a netstring is a self-delimiting encoding of a byte string in which the length of the data, in bytes, is declared before the data itself. This makes it possible to pass text and binary data unambiguously, even when the data includes values that could otherwise be interpreted as delimiters or terminators (such as a null character).
Netstrings are described in a document by D. J. Bernstein and are used, among other places, in the Simple Common Gateway Interface (SCGI) and the Quick Mail Queuing Protocol (QMQP).
The format consists of the string's length written using ASCII digits, followed by a colon, the data, and a comma. For example, "hello world!" encodes as:
12:hello world!,
And an empty string as:
0:,
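The encoding rule above can be sketched in a few lines of Python. The function name encode_netstring is an illustrative choice, not part of any standard library; the sketch assumes the input is already a bytes object, so the length is counted in bytes rather than characters.

```python
def encode_netstring(data: bytes) -> bytes:
    """Frame data as <decimal length>:<data>, (hypothetical helper)."""
    # The length prefix is the byte count, written with ASCII digits.
    prefix = str(len(data)).encode("ascii")
    return prefix + b":" + data + b","


print(encode_netstring(b"hello world!"))  # b'12:hello world!,'
print(encode_netstring(b""))              # b'0:,'
```

Text should be encoded to bytes (for example as UTF-8) before framing, since a multi-byte character counts as more than one byte in the length prefix.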
The comma makes it slightly simpler for humans to read netstrings that are used as adjacent records, and provides weak verification of correct parsing. Note that without the comma, the format mirrors how Bencode encodes strings.
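As a rough illustration of how the comma acts as a parsing check, the following Python sketch decodes one netstring from the front of a buffer and rejects input whose declared length does not land on a comma. The name decode_netstring and the choice to raise ValueError are assumptions of this sketch, not part of the format description.

```python
from typing import Tuple


def decode_netstring(buf: bytes) -> Tuple[bytes, bytes]:
    """Decode one netstring from the start of buf; return (payload, rest)."""
    # Split at the first colon only; the payload may itself contain colons.
    head, sep, tail = buf.partition(b":")
    if not sep or not head.isdigit():
        raise ValueError("missing or malformed length prefix")
    length = int(head)
    if len(tail) < length + 1:
        raise ValueError("input ends before payload and comma")
    # The byte right after the payload must be a comma; otherwise the
    # length prefix and the data disagree.
    if tail[length:length + 1] != b",":
        raise ValueError("missing trailing comma")
    return tail[:length], tail[length + 1:]
```

With this sketch, decode_netstring(b"12:hello world!,") yields the payload b"hello world!" and an empty remainder, while b"3:abcd," is rejected because the byte after the three declared payload bytes is "d" rather than a comma.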
In practice, netstrings are often used to simplify the exchange of byte strings, or lists of byte strings, between programs. Because the format is easy to generate and to parse, it is easily supported by programs written in different programming languages, and it avoids the problems that may cause programmers to resort to custom formats.
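One simple way to exchange a list of byte strings is to concatenate their netstrings. The Python sketch below assumes that convention; dump_list and load_list are hypothetical names, and the list is represented only by adjacency, with no outer wrapper.

```python
def dump_list(items):
    """Concatenate the netstring encodings of each item (hypothetical helper)."""
    return b"".join(b"%d:%s," % (len(item), item) for item in items)


def load_list(buf):
    """Split a buffer of adjacent netstrings back into a list of bytes."""
    items = []
    pos = 0
    while pos < len(buf):
        colon = buf.index(b":", pos)   # raises ValueError if no colon is left
        length = int(buf[pos:colon])   # raises ValueError on a non-numeric prefix
        end = colon + 1 + length
        if buf[end:end + 1] != b",":
            raise ValueError("missing trailing comma")
        items.append(buf[colon + 1:end])
        pos = end + 1
    return items


records = [b"a,b", b"", b"x:y\x00z"]
assert load_list(dump_list(records)) == records
```

The sample records deliberately contain a comma, a colon, and a null byte: none of them needs escaping, because the parser relies on the declared length rather than on scanning the data for a terminator.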
Netstrings avoid complications that arise in trying to embed arbitrary data in delimited formats. For example, XML may not contain certain byte values and requires a nontrivial combination of escaping and delimiting, while generating multipart MIME messages involves choosing a delimiter that must not clash with the content of the data. Note that since no limitations are imposed on the contents of the data, netstrings cannot be embedded in a delimited format without the possibility that their contents interfere with the delimiting of the containing format.
In the context of network programming it is potentially useful that the receiving end is informed of the size of the data that follows, as it can allocate exactly enough memory and avoid the need for continuous reallocation to accommodate more data.
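A receiver that takes advantage of the length prefix might look like the following Python sketch, which reads the prefix from a binary stream, allocates the buffer once, and fills it in place. The name read_netstring and the max_len cap are assumptions of this sketch; the cap is included because the declared length comes from the sender and should probably not be trusted blindly.

```python
import io


def read_netstring(stream, max_len=1 << 20):
    """Read one netstring from a binary stream (hypothetical helper)."""
    digits = b""
    while True:
        ch = stream.read(1)
        if ch == b":":
            break
        # An empty read (end of stream) or a non-digit byte is an error.
        if not ch.isdigit() or len(digits) >= 10:
            raise ValueError("malformed or truncated length prefix")
        digits += ch
    if not digits:
        raise ValueError("empty length prefix")
    length = int(digits)
    if length > max_len:
        raise ValueError("declared length exceeds max_len")

    # Allocate exactly once, then fill the buffer without reallocating.
    buf = bytearray(length)
    view = memoryview(buf)
    got = 0
    while got < length:
        n = stream.readinto(view[got:])
        if not n:
            raise ValueError("stream ended inside payload")
        got += n

    if stream.read(1) != b",":
        raise ValueError("missing trailing comma")
    return bytes(buf)


stream = io.BytesIO(b"12:hello world!,0:,")
print(read_netstring(stream))  # b'hello world!'
print(read_netstring(stream))  # b''
```

The sketch assumes a blocking, buffered binary stream such as io.BytesIO or the object returned by socket.makefile("rb"); a non-blocking socket would need additional handling.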