Indent style

In computer programming, an indentation style is a convention governing the indenting of blocks of code to convey program structure. This article largely addresses free-form languages, such as C and its descendants, but the conventions can be (and often are) applied to most other programming languages (especially those in the curly bracket family) where whitespace is otherwise insignificant. Indentation style is only one aspect of programming style.

Most programming languages do not require indenting; it is used as secondary notation that helps convey the structure of a program to human readers. In particular, it clarifies the link between control flow constructs, such as conditions or loops, and the code contained within and outside of them. However, some languages (such as Python and occam) use indenting to determine the structure instead of using braces or keywords; this is termed the off-side rule. In such languages, indenting is meaningful to the compiler or interpreter; it is more than only a clarity or style issue.

This article uses "brackets" to refer to what are termed parentheses in American English, and "braces" to refer to what are termed curly brackets in American English.

Brace placement in compound statements

The main difference between indent styles lies in the placing of the braces of the compound statement ({...}) that often follows a control statement (if, while, for...). The table below shows this placement for the style of statements discussed in this article; function declaration style is another case. The style for brace placement in statements may differ from the style for brace placement of a function definition. For consistency, the indent depth has been kept constant at 4 spaces, regardless of the preferred indent depth of each style.

Brace placement by style

K&R and variants (1TBS, Stroustrup, Linux kernel, BSD KNF):

while (x == y) {
    something();
    somethingelse();
}

Allman:

while (x == y)
{
    something();
    somethingelse();
}

GNU:

while (x == y)
  {
    something();
    somethingelse();
  }

Whitesmiths:

while (x == y)
    {
    something();
    somethingelse();
    }

Horstmann:

while (x == y)
{   something();
    somethingelse();
}

Pico:

while (x == y)
{   something();
    somethingelse(); }

Ratliff:

while (x == y) {
    something();
    somethingelse();
    }

Lisp:

while (x == y) {
    something();
    somethingelse(); }

Tabs, spaces, and size of indent

Many early programs used tab characters to indent, for simplicity and to save on source file size. Unix editors generally view tabs as equaling eight characters, while Macintosh and Windows environments would set them to four, creating confusion when code was transferred between environments. Modern programming editors can now often set arbitrary indent sizes, and will insert the proper mix of tabs and spaces.
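The confusion caused by differing tab widths can be illustrated with a minimal sketch. The function name indent_columns and its parameters are hypothetical, chosen only for this illustration, and tab_width is assumed to be greater than zero. It computes the column at which the first visible character of a line appears for a given tab stop interval.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the display column of the first
 * non-blank character of `line` when tab stops fall every
 * `tab_width` columns. Assumes tab_width > 0.
 */
size_t indent_columns(const char *line, size_t tab_width)
{
    size_t col = 0;

    for (; *line == ' ' || *line == '\t'; line++) {
        if (*line == '\t')
            col += tab_width - (col % tab_width);   /* jump to the next tab stop */
        else
            col++;                                  /* a space is one column */
    }
    return col;
}
```

Under these assumptions, a line that starts with one tab followed by two spaces is displayed at column 10 with eight-column tab stops but at column 6 with four-column tab stops, so code aligned in one environment may look misaligned in the other.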

The issue of using hard tabs or spaces is an ongoing debate in the programming community. Some programmers, such as Jamie Zawinski, state that using spaces instead of tabs increases cross-platform portability.[1] Others, such as the writers of the WordPress coding standards, state the opposite: that hard tabs increase portability.[2]

The size of the indent is usually independent of the style. In an experiment from 1983 performed on PASCAL code, a significant influence of indentation size on comprehensibility was found. The results indicate that indentation levels in the range from 2 to 4 characters ensure best comprehensibility.[3] For Ruby, many shell scripting languages, and some forms of HTML formatting, two spaces per indent level is generally used.

Tools

There are many computer programs that automatically correct indent styles (according to the preferences of the program author) and the length of indents associated with tabs. A famous one is indent, a program included with many Unix-like operating systems.

In Emacs, various commands are available to automatically fix indenting problems, including hitting Tab on a given line (in the default configuration). M-x indent-region can be used to properly indent large sections of code. Depending on the mode, Emacs can also replace leading indent spaces with the proper number of tabs followed by spaces, which results in the minimal number of characters for indenting each source line.
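The arithmetic behind replacing leading spaces with the minimal mix of tabs and spaces can be sketched as follows. This is not Emacs code; the structure indent_mix and the function minimal_indent are hypothetical names used only for illustration, and tab_width is assumed to be greater than zero.

```c
#include <stddef.h>

/* Hypothetical result type: how many tabs, then how many spaces. */
struct indent_mix {
    size_t tabs;
    size_t spaces;
};

/*
 * Splits an indent of `columns` columns into as many tabs as fit,
 * followed by the remaining spaces. Assumes tab_width > 0.
 */
struct indent_mix minimal_indent(size_t columns, size_t tab_width)
{
    struct indent_mix mix;

    mix.tabs = columns / tab_width;     /* whole tab stops covered */
    mix.spaces = columns % tab_width;   /* leftover columns as spaces */
    return mix;
}
```

For example, with eight-column tab stops a 12-column indent becomes one tab followed by four spaces, which is five characters instead of twelve.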

Elastic tabstops is a tabulation style which requires support from the text editor, where entire blocks of text are kept automatically aligned when the length of one line in the block changes.

Styles

K&R

The K&R style is commonly used in C, C++, and other curly brace programming languages. Used in Kernighan and Ritchie's book The C Programming Language, it had its origins in Kernighan and Plauger's The Elements of Programming Style and Software Tools.

When following K&R, each function has its opening brace on the next line, at the same indent level as its header; the statements within the braces are indented; and the closing brace is on a line of its own, at the same indent level as the function header. The blocks inside a function, however, have their opening braces on the same line as their respective control statements; closing braces remain on a line of their own, unless followed by the keyword else or while.

int main(int argc, char *argv[])
{
    ...
    while (x == y) {
        something();
        somethingelse();

        if (some_error)
            do_correct();
        else
            continue_as_usual();
    }

    finalthing();
    ...
}

In old versions of the C language, functions were braced differently. The opening brace of a function was placed on the line following the declaration section, at the same indent level as the function header. This is because in the original C language, argument types had to be declared on the lines immediately after the function header; when a function took no arguments, the opening brace still went on its own line rather than on the same line as the header. The argument declarations were thus an exception to the now-basic rule that the declarations, statements and blocks of a function are all enclosed in the function's braces.

/* Original pre-ISO C style without function prototypes */
int main(argc, argv)
    int   argc;
    char  *argv[];
{
    ...
}

Variant: 1TBS (OTBS)

Advocates of this style sometimes refer to it as "the one true brace style"[4] (abbreviated as 1TBS or OTBS). The main difference from the K&R style is that the braces are not omitted for a control statement with only a single statement in its scope.

In this style, the constructs that allow insertions of new code lines are on separate lines, and constructs that prohibit insertions are on one line. This principle is amplified by bracing every if, else, while, etc., including single-line conditionals, so that insertion of a new line of code anywhere is always safe (i.e., such an insertion will not make the flow of execution disagree with the source code indenting).

Suggested advantages of this style are that the starting brace needs no extra line alone; and the ending brace lines up with the statement it conceptually belongs to. One cost of this style is that the ending brace of a block needs a full line alone, which can be partly resolved in if/else blocks and do/while blocks:

    if (x < 0) {
        puts("Negative");
    } else {
        nonnegative(x);
    }
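The insertion hazard that mandatory bracing is meant to prevent can be shown with a small sketch. Both function names are hypothetical and exist only for this illustration; each returns how many times its counter was incremented. Some compilers can warn about the misleading indentation in the first function.

```c
/*
 * K&R without braces: a second line added under the `if` is indented
 * like the body but is not controlled by the condition.
 */
int count_without_braces(int x)
{
    int calls = 0;

    if (x < 0)
        calls++;
        calls++;    /* inserted later; runs regardless of x */
    return calls;
}

/* 1TBS: the braces make the same insertion safe. */
int count_with_braces(int x)
{
    int calls = 0;

    if (x < 0) {
        calls++;
        calls++;    /* inserted later; still guarded by the condition */
    }
    return calls;
}
```

For a non-negative x, the unbraced version still increments the counter once, while the braced version leaves it at zero, which is what the indentation suggests in both cases.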

Variant: Java

While Java is sometimes written in other styles, a significant body of Java code uses a minor variant of the K&R style in which the opening brace is on the same line not only for the blocks inside a function, but also for class or method declarations. This style is widespread largely because Sun Microsystems's original style guides[5][6][7] used this K&R variant, and as a result most of the standard source code for the Java API is written in this style. It is also a popular indent style for ActionScript and JavaScript, along with the Allman style.

The C Programming Language does not explicitly specify this style, though it is followed consistently throughout the book. From the book:

The position of braces is less important, although people hold passionate beliefs. We have chosen one of several popular styles. Pick a style that suits you, then use it consistently.

Variant: Stroustrup

Stroustrup style is Bjarne Stroustrup's adaptation of K&R style for C++, as used in his books, such as Programming: Principles and Practice using C++ and The C++ Programming Language.[8]

Unlike the variants above, Stroustrup does not use a “cuddled else”. Thus, Stroustrup would write[8]

    if (x < 0) {
        puts("Negative");
        negative(x);
    }
    else {
        puts("Non-negative");
        nonnegative(x);
    }

Stroustrup extends K&R style for classes, writing them as follows:

    class Vector { 
    public:
        Vector(int s) :elem(new double[s]), sz(s) { }   // construct a Vector
        double& operator[](int i) { return elem[i]; }   // element access: subscripting
        int size() { return sz; } 
    private:
        double * elem;    // pointer to the elements
        int sz;           // number of elements
    };

Stroustrup does not indent the labels public: and private:. Also, in this style, while the opening brace of a function starts on a new line, the opening brace of a class is on the same line as the class name.

Stroustrup allows writing short functions all on one line. Stroustrup style is a named indentation style available in the editor Emacs. As of 2015, Stroustrup encourages a more Allman-style layout with C++ as stated in his modern C++ Core Guidelines.[9]

Variant: Linux kernel

The kernel style is known for its extensive use in the source tree of the Linux kernel. Linus Torvalds strongly advises all contributors to follow it. A detailed description of the style (which covers indenting as well as naming conventions, comments, and various other aspects) can be found at kernel.org. The style borrows some elements from K&R, described above.

The kernel style uses tabs (with tab stops set at 8 characters) for indenting. Opening curly braces of a function go to the start of the line following the function header. Any other opening curly braces go on the same line as the corresponding statement, separated by a space. Labels in a switch statement are aligned with the enclosing block (there is only one level of indents). A single-statement body of a compound statement (such as if, while, and do-while) need not be surrounded by curly braces. If, however, one or more of the substatements in an if-else statement require braces, then both substatements should be wrapped inside curly braces. Line length is limited to 80 characters.

int power(int x, int y)
{
        int result;

        if (y < 0) {
                result = 0;
        } else {
                result = 1;
                while (y-- > 0)
                        result *= x;
        }

        return result;
}
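The rule that switch labels are aligned with the enclosing block can be sketched as follows. The function suffix_shift is a hypothetical example written for this illustration and is not taken from the kernel source; as in the sample above, each indent level is shown as eight columns, where real kernel code would use a hard tab.

```c
/*
 * Hypothetical helper: maps a size suffix to a bit shift.
 * The case labels sit in the same column as the switch keyword,
 * so the body of each case is only one indent level deep.
 */
int suffix_shift(char suffix)
{
        int shift;

        switch (suffix) {
        case 'G':
        case 'g':
                shift = 30;
                break;
        case 'M':
        case 'm':
                shift = 20;
                break;
        case 'K':
        case 'k':
                shift = 10;
                break;
        default:
                shift = 0;
                break;
        }
        return shift;
}
```

Keeping the labels at the level of the switch avoids spending a second eight-column indent inside every switch statement, which matters under an 80-character line limit.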

Variant: BSD KNF

Also termed Kernel Normal Form, this is the form of most of the code used in the Berkeley Software Distribution (BSD) operating systems. Although mostly intended for kernel code, it is also widely used in userland code. It is essentially a thoroughly-documented variant of K&R style as used in the Bell Labs Version 6 & 7 Unix source code.[10]

The SunOS kernel and userland use a similar indenting style.[10] Like KNF, it was based on AT&T style documents, and it is sometimes termed Bill Joy Normal Form.[11] The SunOS guideline was published in 1996; ANSI C is discussed briefly. The correctness of the indenting of a list of source files can be verified by the cstyle program written by Bill Shannon.[12]

In this style, the hard tabulator (ts in vi) is kept at eight columns, while a soft tabulator is often defined as a helper also (sw in vi), and set at four. The hard tabulators are used to indent code blocks, while a soft tabulator (four spaces) of additional indent is used for all continuing lines that must be split over multiple lines.

Moreover, function calls do not use a space before the parenthesis, although C language native statements such as if, while, do, switch and return do (in the case where return is used with parentheses). Functions that declare no local variables in their top-level block should also leave an empty line after their opening block brace.

Here follow a few samples:

while (x == y) {
        something();
        somethingelse();
}
finalthing();

if (data != NULL && res > 0) {
        if (JS_DefineProperty(cx, o, "data",
            STRING_TO_JSVAL(JS_NewStringCopyN(cx, data, res)),
            NULL, NULL, JSPROP_ENUMERATE) != 0) {
                QUEUE_EXCEPTION("Internal error!");
                goto err;
        }
        PQfreemem(data);
} else {
        if (JS_DefineProperty(cx, o, "data", OBJECT_TO_JSVAL(NULL),
            NULL, NULL, JSPROP_ENUMERATE) != 0) {
                QUEUE_EXCEPTION("Internal error!");
                goto err;
        }
}

static JSBool
pgresult_constructor(JSContext *cx, JSObject *obj, uintN argc,
    jsval *argv, jsval *rval)
{

        QUEUE_EXCEPTION("PGresult class not user-instantiable");

        return (JS_FALSE);
}

Allman style

The Allman style is named after Eric Allman. It is also sometimes termed BSD style since Allman wrote many of the utilities for BSD Unix (although this should not be confused with the different "BSD KNF style"; see above).

This style puts the brace associated with a control statement on the next line, indented to the same level as the control statement. Statements within the braces are indented to the next level.

while (x == y)
{
    something();
    somethingelse();
}

finalthing();

This style is similar to the standard indenting used by the Pascal languages and Transact-SQL, where the braces are equivalent to the keywords begin and end.

(* Example Allman code indenting style in Pascal *)
procedure dosomething(x, y: Integer);
begin
    while x = y do
    begin
        something();
        somethingelse();
    end;
end;

Consequences of this style are that the indented code is clearly set apart from the containing statement by lines that are almost all whitespace and the closing brace lines up in the same column as the opening brace. Some people feel this makes it easy to find matching braces. The blocking style also delineates the block of code from the associated control statement. Commenting out or removing a control statement or block of code, or code refactoring, are all less likely to introduce syntax errors via dangling or missing braces. Also, it is consistent with brace placement for the outer-function block.

For example, the following is still correct syntactically:

// while (x == y)
{
    something();
    somethingelse();
}

As is this:

// for (int i=0; i < x; i++)
// while (x == y)
if (x == y)
{
    something();
    somethingelse();
}

Even like this, with conditional compilation:

    int c;
#ifdef HAS_GETCH
    while ((c = getch()) != EOF)
#else
    while ((c = getchar()) != EOF)
#endif
    {
        do_something(c);
    }

Variant: Allman-8

A popular variant for use in education, Allman-8 uses the 8-space indentation tabs and 80-column limit of the Linux kernel variant of K&R. The style purportedly helps improve readability on projectors. Also, the indent size and column restriction help create a visual cue for identifying excessive nesting of code blocks. These advantages combine to give newer developers and learners implicit guidance in managing code complexity.

Whitesmiths style

The Whitesmiths style, also sometimes termed Wishart style, was originally used in the documentation for the first commercial C compiler, the Whitesmiths Compiler. It was also popular in the early days of Windows, since it was used in three influential Windows programming books, Programmer's Guide to Windows by Durant, Carlson & Yao, Programming Windows by Petzold, and Windows 3.0 Power Programming Techniques by Norton & Yao.

According to the Jargon File, Whitesmiths and Allman have been the most common bracing styles, with about equal popularity.[4]

This style puts the brace associated with a control statement on the next line, indented. Statements within the braces are indented to the same level as the braces.

while (x == y)
    {
    something();
    somethingelse();
    }

finalthing();

The advantages of this style are similar to those of the Allman style. Blocks are clearly set apart from control statements. The alignment of the braces with the block emphasizes that the full block is conceptually, and programmatically, one compound statement. Indenting the braces emphasizes that they are subordinate to the control statement. The ending brace no longer lines up with the statement, but instead with the opening brace.

An example:

if (data != NULL && res > 0)
    {
    if (!JS_DefineProperty(cx, o, "data", STRING_TO_JSVAL(JS_NewStringCopyN(cx, data, res)), NULL, NULL, JSPROP_ENUMERATE))
        {
        QUEUE_EXCEPTION("Internal error!");
        goto err;
        }
    PQfreemem(data);
    }
else if (!JS_DefineProperty(cx, o, "data", OBJECT_TO_JSVAL(NULL), NULL, NULL, JSPROP_ENUMERATE))
    {
    QUEUE_EXCEPTION("Internal error!");
    goto err;
    }

An else if is treated as a single statement, much like the #elif preprocessor directive.

GNU style

Like the Allman and Whitesmiths styles, GNU style puts braces on a line by themselves, indented by two spaces, except when opening a function definition, where they are not indented.[13] In either case, the contained code is indented by two spaces from the braces.

Popularised by Richard Stallman, the layout may be influenced by his background of writing Lisp code.[14] In Lisp, the equivalent to a block (a progn) is a first-class data entity, and giving it its own indent level helps to emphasize that, whereas in C, a block is only syntax. Although not directly related to indenting, GNU coding style also includes a space before the bracketed list of arguments to a function.

static char *
concat (char *s1, char *s2)
{
  while (x == y)
    {
      something ();
      somethingelse ();
    }
  finalthing ();
}

[13]

This style combines the advantages of Allman and Whitesmiths, thereby removing the possible Whitesmiths disadvantage of braces not standing out from the block. One disadvantage is that the ending brace no longer lines up with the statement it conceptually belongs to. Another possible disadvantage is that it might waste space by using two visual levels of indents for one conceptual level, but in reality this is unlikely because, in systems with single-level indenting, each level is usually at least 4 spaces, same as 2 * 2 spaces in GNU style.

The GNU Coding Standards recommend this style, and nearly all maintainers of GNU project software use it.

The GNU Emacs text editor and the GNU systems' indent command will reformat code according to this style by default. Those who do not use GNU Emacs, or similarly extensible/customisable editors, may find that the automatic indenting settings of their editor are unhelpful for this style. However, many editors defaulting to KNF style cope well with the GNU style when the tab width is set to two spaces; likewise, GNU Emacs adapts well to KNF style by simply setting the tab width to eight spaces. In both cases, automatic reformatting destroys the original spacing, but automatic line indenting will work properly.

Steve McConnell, in his book Code Complete, advises against using this style: he marks a code sample which uses it with a "Coding Horror" icon, symbolizing especially dangerous code, and states that it impedes readability.[15] The Linux kernel coding style documentation also strongly recommends against this style, urging readers to burn a copy of the GNU coding standards as a "great symbolic gesture".[16]

Horstmann style

The 1997 edition of Computing Concepts with C++ Essentials by Cay S. Horstmann adapts Allman by placing the first statement of a block on the same line as the opening brace.

while (x == y)
{   something();
    somethingelse();
    //...
    if (x < 0)
    {   printf("Negative");
        negative(x);
    }
    else
    {   printf("Non-negative");
        nonnegative(x);
    }
}
finalthing();

This style combines the advantages of Allman by keeping the vertical alignment of the braces for readability, and identifying blocks easily, with the saving of a line of the K&R style. However, the 2003 edition now uses Allman style throughout.[17]

Pico style

The style used most commonly in the language Pico by its designers is different from the aforementioned styles. Pico lacks return statements, and uses semicolons as statement separators instead of terminators. It yields this syntax:

stuff(n):
{ x: 3 * n;
  y: doStuff(x);
  y + x }

The advantages and disadvantages are similar to those of saving screen real estate with K&R style. An added advantage is that the starting and closing braces are consistent in application (both share space with a line of code), relative to K&R style, where one brace shares space with a line of code and one brace has a line alone.

Ratliff style

In the book Programmers at Work,[18] C. Wayne Ratliff discussed using the style below. The style begins much like 1TBS but then the closing brace lines up with the indent of the nested block. Ratliff was the original programmer behind the popular dBase-II and -III fourth-generation programming languages. He indicated that it was originally documented in material from Digital Research Inc. This style has sometimes been termed banner style,[19] possibly for the resemblance to a banner hanging from a pole. In this style, which is to Whitesmiths as K&R is to Allman, the closing control is indented as the last item in the list (and thus properly loses salience). The style can make visual scanning easier for some, since the headers of any block are the only thing exdented at that level (the theory being that the closing control of the prior block interferes with the visual flow of the next block header in the K&R and Allman styles).

 // In C
 for (i = 0; i < 10; i++) {
     if (i % 2 == 0) {
         doSomething(i);
         }
     else {
         doSomethingElse(i);
         }
     }

or, in a markup language...

<table>
  <tr>
    <td> lots of stuff...
      more stuff
      </td>
    <td> alternative for short lines </td>
    <td> etc. </td>
    </tr>
  </table>

<table>
  <tr> ... etc
  </table>

Lisp style

A programmer may even go as far as to insert closing braces in the last line of a block. This style makes indenting the only way to distinguish blocks of code, but has the advantage of containing no uninformative lines. This could easily be called the Lisp style (because this style is very common in Lisp code) or the Python style (Python has no braces, but the layout is very similar, as shown in the code blocks below). In Python, layout is a part of the language, called the off-side rule.

 // In C
 for (i = 0; i < 10; i++) {
     if (i % 2 == 0)
         doSomething(i);
     else {
         doSomethingElse(i);
         doThirdThing(i);}}

 # In Python
 for i in range(10):
     if i % 2 == 0:
         do_something(i)
     else:
         do_something_else(i)
         do_third_thing(i)

 ;; In Lisp
 (dotimes (i 10)
   (if (= (rem i 2) 0)
       (do-something i)
       (progn
         (do-something-else i)
         (do-third-thing i))))

Haskell style

Haskell layout can make the placement of braces optional, although braces and semicolons are allowed in the language.[20] The two segments below are equally acceptable to the compiler:

braceless = do
  text <- getContents
  let
    firstWord = head $ words text
    bigWord = map toUpper firstWord
  putStrLn bigWord

braceful = do
  { text <- getContents
  ; let
      { firstWord = head $ words text
      ; bigWord = map toUpper firstWord
      }
  ; putStrLn bigWord
  }

In Haskell, indentation is significant, and layout can replace braces. Usually the braces and semicolons are omitted for procedural do sections and the program text in general, but the style is commonly used for lists, records and other syntactic elements made up of some pair of parentheses or braces, which are separated with commas or semicolons.[21]

Other considerations

Losing track of blocks

In some situations, there is a risk of losing track of block boundaries. This is often seen in large sections of code containing many compound statements nested to many levels of indents. By the time the programmer scrolls to the bottom of a huge set of nested statements, he or she may have lost track of which control statements go where. However, overly long code could have other causes, such as being too complex, and a programmer facing this problem might instead consider whether code refactoring would help in the longer term.

Programmers who rely on counting the opening braces may have difficulty with indenting styles such as K&R, where the starting brace is not visually separated from its control statement. Programmers who rely more on indenting will gain more from styles that are vertically compact, such as K&R, because the blocks are shorter.

To avoid losing track of control statements such as for, a large indent can be used, such as an 8-unit-wide hard tab, along with breaking up large functions into smaller and more readable functions. The Linux kernel is written this way, using the K&R style.

In text editors of the vi family, one way to track block boundaries is to position the text cursor over one of the braces and press the % key; the cursor then jumps to the matching brace. Because the text cursor's next key (viz., the n key) retains directional positioning information (whether the up or down key was formerly pressed), the dot macro (the . key) can then be used to place the text cursor on the next brace,[22] given a suitable coding style. Alternatively, inspecting the block boundaries with the % key can be used to enforce a coding standard.
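The idea behind jumping to the matching brace can be sketched with a depth counter. This is not how any particular editor implements it; the function match_brace is a hypothetical name used for illustration, and the sketch deliberately ignores braces that appear inside string literals, character constants, or comments.

```c
#include <stddef.h>

/*
 * Hypothetical helper: given the index `pos` of a brace in `text`,
 * returns the index of its matching brace, or -1 if `pos` is not on
 * a brace or no match exists.
 */
long match_brace(const char *text, size_t pos)
{
    int depth = 0;
    size_t i;

    if (text[pos] == '{') {
        /* Scan forward; the match is where the depth returns to zero. */
        for (i = pos; text[i] != '\0'; i++) {
            if (text[i] == '{')
                depth++;
            else if (text[i] == '}' && --depth == 0)
                return (long)i;
        }
    } else if (text[pos] == '}') {
        /* Scan backward from pos down to index 0. */
        for (i = pos + 1; i-- > 0; ) {
            if (text[i] == '}')
                depth++;
            else if (text[i] == '{' && --depth == 0)
                return (long)i;
        }
    }
    return -1;
}
```

With the text "a{b{c}d}e", starting on the outer opening brace at index 1 yields index 7, and starting on the inner closing brace at index 5 yields index 3.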

Another way is to use inline comments added after the closing brace:

for (int i = 0; i < total; i++) {
    foo(bar);
} //for (i)
if (x < 0) {
    bar(foo);
} //if (x < 0)

The major disadvantage of this method is maintaining duplicate code in multiple locations.

Another solution is implemented in a folding editor, which can hide or reveal blocks of code via their indent level or compound statement structure. Many editors will also highlight matching brackets or braces when the cursor is positioned next to one.

Statement insertion

K&R style prevents another common error suffered when using the standard Unix line editor, ed. A statement mistakenly inserted between the control statement and the opening brace of the loop block turns the body of the loop into a single trip.

for (int i = 0; i < 10; i++)
    whoops(bar);   /* repeated 10 times, with i from 0 to 9 */
{
    only_once();   /* Programmer intended this to be done 10 times */
} //for (i) <-- This comment is no longer valid, and is very misleading!

K&R style avoids this problem by keeping the control statement and the opening brace on the same line.

References

  1. Zawinski, Jamie (2000). "Tabs versus Spaces: An Eternal Holy War". Retrieved 6 June 2016.
  2. "WordPress Coding Standards". Retrieved 6 June 2016.
  3. Miara, Richard J.; Musselman, Joyce A.; Navarro, Juan A. & Shneiderman, Ben (November 1983). "Program Indentation and Comprehensibility" (PDF). Communications of the ACM. 26 (11): 861–867. Retrieved 3 August 2017.
  4. "The Jargon File". 4.4.7. 29 December 2003. Retrieved 18 August 2014.
  5. Reddy, Achut (30 March 2000). "Java Coding Style Guide" (PDF). Sun Microsystems. Archived from the original (PDF) on 28 February 2006. Retrieved 30 May 2008.
  6. "Java Code Conventions" (PDF). Sun Microsystems. 12 September 1997. Archived from the original (PDF) on 13 May 2008. Retrieved 30 May 2008.
  7. "Code Conventions for the Java Programming Language". Sun Microsystems. 20 March 1997. Retrieved 30 May 2008.
  8. Stroustrup, Bjarne (September 2010). "PPP Style Guide" (PDF).
  9. Stroustrup, Bjarne. "C++ Core Guidelines". GitHub. Retrieved 17 December 2015.
  10. Shannon, Bill (19 August 1996). "C Style and Coding Standards for SunOS" (PDF). 1.8. Sun Microsystems, Inc. Retrieved 6 February 2015.
  11. Gregg, Brendan. "DTraceToolkit Style Guide". Retrieved 6 February 2015.
  12. Shannon, Bill (9 September 1998). "cstyle.pl". illumos-gate. 1.58. Sun Microsystems, Inc. Retrieved 6 February 2015.
  13. "Formatting Your Source Code". GNU Coding Standards. Retrieved 6 June 2016.
  14. Stallman, Richard (28 October 2002). "My Lisp Experiences and the Development of GNU Emacs (Transcript of speech at the International Lisp Conference)". Retrieved 6 June 2016.
  15. McConnell, Steve (2004). Code Complete: A practical handbook of software construction. Redmond, WA: Microsoft Press. pp. 746–747. ISBN 0-7356-1967-0.
  16. "Linux kernel coding style". Retrieved 1 January 2017.
  17. Horstmann Style Guide
  18. Lammers, Susan (1986). Programmers at Work. Microsoft Press. ISBN 0-914845-71-3.
  19. Pattee, Jim. "Artistic Style 2.05 Documentation". Artistic Style. Retrieved 24 April 2015.
  20. "The Haskell 98 Report". haskell.org. Retrieved 3 March 2016.
  21. Lipovača, Miran. "Making Our Own Types and Typeclasses". learnyouahaskell.com. Retrieved 3 February 2016.
  22. Lamb, Linda. Learning the vi editor. O'Reilly.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.