Wikipedia:Algorithms on Wikipedia

From Wikipedia, the free encyclopedia

Guidance on style
Manual of Style
Supplementary manuals

Abbreviations
Biographies
Capital letters
Command-line examples
Dashes
Dates and numbers
Headings
Links
Mathematics
Pronunciation
Sister projects
Text formatting
Titles
Trademarks

Special article styles

Disambiguation pages
Arabic transliteration
China-related articles
Ethiopia-related articles
Indic-related articles
Ireland-related articles
Islam-related articles
Japan-related articles
Korea-related articles

Other guidance

How to edit a page
Guide to layout
Captions
Categorization
Categorization of people
Cite sources
Explain jargon
Footnotes
Writing better articles
Lists
Music samples
Naming conventions
Overlinking
Picture tutorial
Proper names
Sections
Technical terms
and definitions

Words to avoid
Writing about fiction

This is a draft of hints for good sample implementations of algorithms for Wikipedia. Further discussion is invited.


Algorithms, like all other content in Wikipedia, should be traceable to reliable, published sources. Every presentation of an algorithm (both pseudocode and sample code) should include a citation.

Where possible, algorithms should be presented in pseudocode. Pseudocode provides far more flexibility with regard to the level of implementation detail, allowing algorithms to be presented at however high a level is required to focus on the algorithm and its core ideas, rather than the details of how it is implemented. In addition, suitably high-level pseudocode provides the most accessible presentation of an algorithm, particularly for non-programmers. Finally, use of pseudocode is completely language agnostic.

Here is an example of a high-level pseudocode description that focuses on the essentials of the algorithm rather than specific implementation details:

algorithm ford-fulkerson is
    input: Graph G with flow capacity c, 
           source node s, 
           sink node t
    output: Flow f such that f is maximal from s to t

    (Note that f(u,v) is the flow from node u to node v,
     and c(u,v) is the flow capacity from node u to node v)

    for each edge (u, v) in GE do
        f(u, v) ← 0
        f(v, u) ← 0

    while there exists a path p from s to t in the residual network Gf do
        let cf be the flow capacity of the residual network Gf
        cf(p) ← min{cf(u, v) | (u, v) in p}
        for each edge (u, v) in p do
            f(u, v)f(u, v) + cf(p)
            f(v, u) ← −f(u, v)

    return f

Contents

[edit] General guidelines for writing pseudocode

There is no defined standard for pseudocode on Wikipedia. A past attempt at a standardized Wikipedia pseudocode is archived at User:Dcoetzee/Wikicode, though "[t]he author advises that such a proposal not be advanced again, as it is unlikely to gain consent".

In general, the best approach is to use whatever notation provides the clearest expression of the algorithm in question. However, it is a good idea to try to follow the style guidelines in this section where possible, since that helps to make algorithm presentation more consistent across different articles. Some general guideline are:

  • Try to focus on outlining the algorithm, and where possible keep discussion and explanation of the algorithm outside of the pseudocode.
  • Try to ensure the algorithm is understandable without having to read this page first.
  • All assignment, comparison, and other mathematical operators should be rendered with proper mathematical symbols wherever possible:
Operator Result Entity Example
Assignment ← or := ←, := c ← 2πr, c := 2πr
Comparison =, ≠, <, >, ≤, ≥ =, &ne;, <, >, &le;, &ge;
Arithmetic +, −, ×, /, mod +, &minus;, &times;, /, mod
Floor/ceiling ⌊, ⌋, ⌈, ⌉ &lfloor;, &rfloor;, &lceil;, &rceil; a ← ⌊b⌋ + ⌈c
Logical and, or '''and''', '''or''' ab and ac
Sums ∑ ∏ &sum; &prod; h ← ∑aA 1/a

[edit] Low Level Pseudocode

  • All control structure keywords should be bolded, comments should be in italics (in addition to whatever other manner for denoting comments is used)
  • Try to keep the algorithm description sufficiently high level so as to avoid most implementation specific details.
  • Try to avoid using structures or techniques that are idiomatic to a particular language or programming paradigm.
  • If idiomatic structures or techniques are unavoidable try to present them in a generic high level way in line with the style outlined below:
  • The preferred conditional structure is
if condition then
    code path
else if condition then
    code path
else
    code path
end if
  • The preferred looping constructs are
while condition do
    code block
repeat
and
for loop_control do
    code block
repeat
where loop_control is any suitable clause to control a for loop, such as item in list or 1 ≤ i ≤ n etc.
  • The preferred function definition structure is
function function_name is
    code block
    return variable
end function
where function_name is any reasonable format of function name and arguments. Alternatively inputs and outputs can be specified within the function block:
function function_name is
    input: description of inputs
    output: description of outputs
    code block
    return variable
end function
  • Code blocks should be indented. If sufficiently clear, block closing keywords (end, repeat etc.) may be omitted.

Example of pseudocode roughly hewing to these guidelines is provided as the example on the Algorithm page, and in Bucket sort and Topological sort.

[edit] High Level Pseudocode

Inputs description of input arguments
Output description of outputs
  1. (description of a step in the algorithm)
  2. (the next step in the algorithm)
    1. (substep)
  3. (etc.)
using markup:
 :'''Inputs''' ''description of input arguments''
 :'''Output''' ''description of outputs''
 :# ''(description of a step in the algorithm)''
 :# ''(the next step in the algorithm)''
 :## ''(substep)''
 :# ''(etc.)'' 
 
  • Descriptions of steps should be high level, and may simply be English sentences.
  • Substeps of the algorithm, due to branch conditions or loop structures should be indented and subnumbered.
  • Termination of the algorithm with a return value should be denoted using the keyword return

Examples of algorithms in this format include Buchberger's algorithm, Pohlig-Hellman algorithm, Itoh-Tsujii inversion algorithm, Baby-step giant-step, Pollard's p-1 algorithm, and Pollard's rho algorithm.

[edit] Code samples

Samples of actual source get included in articles for a variety of reasons, although the most typical reasons are to demonstrate the "look" of a particular language, to provide examples of language-specific constructs or features, and to provide examples of algorithms not easily expressed in pseudocode. While there's nothing inherently wrong with including sample code, excessive amounts of it can detract from the content of the article itself; avoid writing sample code unless it contributes significantly to a fundamental understanding of the encyclopedic content.

Wikipedia is not a source code repository. Code that is not relevant to encyclopedic content should be moved out of Wikipedia. WikiBooks is the appropriate WikiMedia project for existing GFDL-compatible code.

Some general guidelines on code samples:

  • Sample implementations of algorithms are fine, but every algorithm article should include a pseudocode description of the core algorithm when possible, so that anyone can understand how the algorithm works. See above for guidelines on pseudocode.
  • The sample should use a language that clearly illustrates the algorithm to a reader who is relatively unfamiliar with the language — even if you believe that the language is well-known. To remain language-neutral, choose languages based on clarity, not popularity. Avoid esoteric or language-specific operations.
  • Source code implementations must be compatible with the GFDL (which is incompatible with the GPL).
  • Multiple source code implementations are not appropriate unless they contrast specific aspects of the code and that contrast is important to the encyclopedic content of the article. If possible, accentuate differences by providing the alternate implementation in the same language as the original.
  • All code samples should be typeset in a monospaced typeface to ensure that spacing is preserved. Monospace fonts can be obtained by indenting lines by one or more spaces, by surrounding a block of code in <pre> tags, or by enclosing inline text in <code> tags. Doing this is particulary important for languages where whitespace has syntactic significance — ideally we'd like people to be able to copy and paste sample code into a text editor or IDE. For example,
int main(void) {
    printf("hello, world\n");
    return 0;
}
  • All code samples should be enclosed in <code> tags, regardless of whether such a tag is required for proper formatting. This will allow greater flexibility in future MediaWiki revisions, as well as providing additional information to screen readers and users with customized CSS.