Wikipedia:How to create graphs for Wikipedia articles

From Wikipedia, the free encyclopedia

PNG version of a graph
SVG version of a graph meant to look similar to the PNG

Graphs and other pictures can contribute substantially to an article. Here are some hints on how to create a graph. The source code for each of the example images on this page can be accessed by clicking the image to go to the image description page.

  1. Use the SVG format whenever possible.
    • If you can't, use any software to create the plot in a bitmap format but make it very large, for instance 6000 x 4500 pixel size with Postscript Times or Symbol font size 48 and a line thickness of 17 pixels. Then use software like Photoshop or GIMP to Gaussian blur it at 2 pixels. Then reduce it down to about 1000 pixels on a side (e.g. 1300 x 975) using bicubic interpolation. This gives a plot with no jagged lines. It is, however, big enough so that someone could download it and use it for projection purposes without pixellation. Save as PNG.
  2. Plots should be as language-free as possible, and uploaded to Wikimedia Commons, so that they may be used in any language version of Wikipedia.
  3. The descriptive text should be confined to the caption as much as possible. (Try to put as little text as possible in the image itself.) You can also put additional text on the image description page.
  4. Color coding should not be the only thing that differentiates parts of the graph. The graph should be understandable even when the article is printed on a gray-scale printer, viewed by colorblind readers, or seen on a monochrome display. Use dashed or dotted lines or differently shaped symbols to identify different objects. To make the picture clearer, you can use color to add redundant information: for example, you could plot two different functions with a solid and a dotted line in two different colors. If you really have to use only colors to convey information, choose them so that they remain distinguishable by contrast when the picture is converted to gray-scale. See Wikipedia:Manual of Style#Color_coding.
  5. Include the commands by which you created the plot on the image description page so others can replicate your work to make additions, fixes, translations, and so on. Ideally someone else can copy and paste the commands and obtain the same result. One way to test that is to write a script and execute the script (instead of interactively typing the commands). Then just copy and paste the script into the image description page. Commenting your code is very helpful.
  6. Be sure to include a licensing tag (GFDL, CC, public domain, etc.) on the image description page.
  7. If you are creating an SVG picture and want to insert Greek or other special characters in it, make sure they display correctly in the output. The most common problem is characters in the Symbol font, which do not display correctly. If, for instance, "π" is displayed as "p" in MediaWiki's output, your software is generating characters with the Symbol font. Unicode characters usually display correctly; you can substitute them in a program like Inkscape, or convert the characters to paths. The complete Greek alphabet is available on Commons, where you can copy and paste any letter already in the right format.
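
The gray-scale advice in item 4 can be checked numerically before plotting. The following is a minimal Python sketch, not part of any plotting tool: it converts RGB colors to a gray level with the Rec. 601 luma weights (0.299, 0.587, 0.114), which is one common convention and may differ from what a particular printer or image editor does. The function names, the sample palette, and the 0.2 threshold are illustrative assumptions.

```python
from itertools import combinations

# Rec. 601 luma weights: one common RGB-to-gray convention (an assumption here;
# printers and image editors may use other weights).
WEIGHTS = (0.299, 0.587, 0.114)


def gray_level(rgb):
    """Return the gray level (0 = black, 1 = white) of an RGB triplet in 0..1."""
    return sum(w * c for w, c in zip(WEIGHTS, rgb))


def too_similar(colors, min_gap=0.2):
    """Return pairs of color names whose gray levels differ by less than min_gap.

    min_gap is an arbitrary illustrative threshold, not a standard.
    """
    clashes = []
    for (name_a, rgb_a), (name_b, rgb_b) in combinations(colors.items(), 2):
        if abs(gray_level(rgb_a) - gray_level(rgb_b)) < min_gap:
            clashes.append((name_a, name_b))
    return clashes


# Hypothetical palette for a four-line plot.
palette = {
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 0.5, 0.0),
    "blue": (0.0, 0.0, 1.0),
    "black": (0.0, 0.0, 0.0),
}
print(too_similar(palette))
```

Pure red and dark green, for example, come out at almost the same gray level under these weights, so lines in those two colors should also differ in dash pattern or symbol shape.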

See also the graphics tutorials on how to create pictures, and the picture tutorial on how to include them in articles. There is additional discussion of plotting on Template talk:Probability distribution#Standard_Plots.

Plotting

gnuplot

Many of the graphs on Wikipedia were made with the free software program gnuplot. It can be used by itself or in conjunction with other software.

For example, to plot the data in file "data" as filled bars:

set xlabel "steps"
set ylabel "result"
unset key
# use bars in plot: with boxes
# choose line color/style in plot: linetype n
# plot filled bars (fs): pattern n
set style fill pattern 2
plot "data" with boxes linetype 3 fs

SVG

A plot of Hermite polynomials, generated by gnuplot in SVG format
A plot of the floor function, generated by gnuplot in SVG format
List of line and symbols types in the gnuplot svg terminal

Now that MediaWiki supports SVG, it is usually best to generate SVG images directly. SVG images have many advantages, such as being fully resizable and easier to modify, though they are sometimes inferior to raster images. Decide on a case-by-case basis.

A typical plt file could start with:

set terminal svg enhanced size 1000 1000 fname "Times" fsize 36
set output "filename.svg"
size: Sets the size of the plot. This controls the size of features in the PNG rendered by Wikipedia.
fname: Sets the font.
fsize: Sets the font size. Also sets the size of plotted points.
set output: Sets the filename for saving the SVG information.

Raster

A plot of the normal distribution, generated by gnuplot
The lines and symbols available in gnuplot with the postscript terminal and the default options
The lines and symbols available in gnuplot with the postscript terminal and the option color
The lines and symbols available in gnuplot with the postscript terminal and the options color and solid

Gnuplot can also generate raster images (PNG). For the best results, generate a PostScript file and convert it to PNG in an external program, such as the GIMP. PostScript output is selected with set terminal postscript enhanced:

set terminal postscript enhanced color solid lw 2 "Times-Roman" 20
set output "filename.ps"
color: Makes a color plot instead of black-and-white.
solid: Makes all lines solid instead of dashed. You may want to remove this to make dashed lines which are distinguishable on both color and black-and-white versions of the same plot.
lw 2: Sets the linewidth of all the lines at once.
"Times-Roman" 20: Sets the font and font size.
set output: Sets the filename for saving the PostScript information.

You should use a large number of samples for high-quality plots:

set samples 1001

This is important to prevent aliasing or jagged linear interpolation (see Image:Exponentialchirp.png and its history for an example of aliasing). Labels are helpful, but remember to keep language-specific information in the caption if it's not too inconvenient. Including the source code and/or an image without text helps other users create versions in their own language if text is included in the image.

set xlabel "Time (s)"
set ylabel "Amplitude"
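
The effect of the sample count discussed above can be estimated with a short calculation. The sketch below is plain Python and independent of gnuplot: it measures the worst-case gap between a curve and the straight-line segments joining n equally spaced samples, which is how a sampled curve is drawn. The test curve, the interval, and the helper name are illustrative choices, and 100 is used only as a contrasting low sample count.

```python
import math


def max_interpolation_error(func, x_min, x_max, samples, checks_per_segment=20):
    """Largest |func(x) - linear interpolation| found over [x_min, x_max].

    The curve is sampled at `samples` equally spaced points and neighbouring
    samples are joined by straight lines, as a plotting program would do.
    """
    step = (x_max - x_min) / (samples - 1)
    worst = 0.0
    for i in range(samples - 1):
        x0 = x_min + i * step
        y0, y1 = func(x0), func(x0 + step)
        # Probe interior points of the segment, including its midpoint.
        for k in range(1, checks_per_segment):
            t = k / checks_per_segment
            interpolated = y0 + t * (y1 - y0)
            worst = max(worst, abs(func(x0 + t * step) - interpolated))
    return worst


def curve(x):
    """A rapidly oscillating test curve: sin(40 x)."""
    return math.sin(40 * x)


for n in (100, 1001):
    print(n, max_interpolation_error(curve, 0.0, 2 * math.pi, n))
```

For smooth curves the interpolation error shrinks roughly with the square of the sample spacing, so ten times as many samples gives an error on the order of a hundred times smaller.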

The legend or key is positioned according to the coordinate system you used for the graph itself:

set key 4,0

Most other options are not Wikipedia-graph-specific, and should be gleaned from documentation or the source code included with other plots. An example of a plot generated with gnuplot is shown on the right, with source code on the image description page.

Maxima

A plot of the Hilbert transform of a square wave, generated by gnuplot from Maxima

Maxima is a computer algebra system licensed under the GPL, similar to Mathematica or Maple. It uses gnuplot as its default plotter, though others are available, such as openmath. Plotting directly to PostScript from Maxima is supported, but gnuplot's PostScript output is more powerful.

The most-used commands are plot2d and plot3d:

plot2d (sin(x), [x, 0, 2*%pi], [nticks, 500]);
plot3d (x^2-y^2, [x, -2, 2], [y, -2, 2], [grid, 12, 12]);

Since the plot is sent to gnuplot as a series of samples, not as a function, the Maxima nticks option is used to set the number of sampling points instead of gnuplot's set samples. Additional plot options are included in brackets inside the plot command. To use the same options as in the above gnuplot example, add these lines to the end of the plot command:

PostScript output:

[gnuplot_term, ps]
[gnuplot_ps_term_command, "set term postscript enhanced color solid lw 2 'Times-Roman' 20"]

SVG output (gnuplot_term stays ps; the svg terminal is selected by overriding gnuplot_ps_term_command):

[gnuplot_term, ps]
[gnuplot_ps_term_command, "set terminal svg enhanced size 1000 1000 fname 'Times' fsize 36"]

Output filename:

[gnuplot_out_file, "filename.ps"]

Additional gnuplot commands:

[gnuplot_preamble, "set xlabel 'Time (s)'; set ylabel 'Amplitude'; set key 4,0"]

Like so:

 plot2d (sin(x), [x, 0, 2*%pi], [nticks, 500], [gnuplot_term, ps], [gnuplot_ps_term_command, "set term postscript enhanced color solid lw 2 'Times-Roman' 20"], [gnuplot_out_file, "filename.ps"], [gnuplot_preamble, "set xlabel 'Time (s)'; set ylabel 'Amplitude'; set key 4,0"]);

Similarly, for SVG output:

 plot2d (sin(x), [x, 0, 2*%pi], [nticks, 500], [gnuplot_term, ps], [gnuplot_ps_term_command, "set terminal svg enhanced size 1000 1000 fname 'Times' fsize 36"], [gnuplot_out_file, "filename.svg"]);

Note that the font and labels are in single quotes now, nested inside double quotes. Multiple commands are separated by semicolons.

An example of a plot generated with gnuplot in Maxima is shown on the right, with source code on the image description page.

GNU Octave

A graph of the envelope of a wave in GNU octave and gnuplot

GNU Octave is a numerical computation program, effectively a MATLAB clone. It uses gnuplot extensively, though it also offers interfaces to Grace and other graphing software.

The commands are plot (2D) and splot (surface plot), or gplot and gsplot ("almost exactly" the same).

gnuplot settings are accessed with the gset command:

t = [0 : .01 : 1];  
y = sin (2*pi*t);
gset terminal postscript enhanced color solid lw 2 "Times-Roman" 20
gset output "filename.ps"     
gset xlabel "Time (s)"
gset ylabel "Amplitude"
gset key 4,0
plot (t,y)

If x functions are plotted, separated by commas, they will all appear on page x of the resulting .ps file.

R

An example of a scatterplot created by R

The statistical package R (see R programming language) can produce a wide variety of high-quality graphics. It is especially effective for displaying statistical data.

Post-processing

Modifying SVG images

SVG images can be post-processed in Inkscape. Line styles and colors can be changed with the Fill and Stroke tool. Objects can be moved in front of or behind other objects with the Object → Raise and Object → Lower menu commands. Saving from Inkscape also seems to add helpful CSS information that isn't present in gnuplot's default output (Firefox will not render the file natively without it).

Converting PostScript to SVG

PostScript can be converted to SVG with a bash script in Linux, if necessary. Direct SVG output is probably better if the program supports it. See Wikipedia:WikiProject Electronics/How to draw SVG circuits using Xcircuit for an example.

Editing PostScript colors and linestyles manually

Setting colors and linestyles in gnuplot is not easy. They can be changed more easily after the PostScript file is generated, by editing the file itself in a regular text editor.

This avoids having to open the file in proprietary software, and it is not difficult, which helps if you are unfamiliar with other PostScript editing software.

Find the section of the .ps file with several lines starting with /LT. Identify the line you want either by its color ("the arrow is currently magenta and I want it to be black. Ah, there is the entry with 1 0 1, red + blue = magenta") or by its number, which is the gnuplot linestyle minus 1 (for instance, gnuplot's linestyle 3 corresponds to the ps file's /LT2). Then you can edit the colors and dashes by hand.

/LT0 { PL [] 1 0 0 DL } def

/LT0 corresponds to gnuplot's linestyle 1. The [] represents a solid line. 1 0 0 is the color of the line; an RGB triplet with values from 0 to 1. This line is red.

/LT2 { PL [2 dl 3 dl] 0 0 1 DL } def

/LT2 corresponds to gnuplot's linestyle 3. The [2 dl 3 dl] represents a dashed line. There are 2 units of line followed by 3 units of empty space, and so on. 0 0 1 represents the color blue.

/LT5 { PL [5 dl 2 dl 1 dl 2 dl] 0.5 0.5 0.5 DL } def

/LT5 corresponds to gnuplot's linestyle 6. The [5 dl 2 dl 1 dl 2 dl] represents a dash-dot line. There are 5 units of line (the dash) followed by 2 units of empty space, 1 unit of line (the dot), 2 more units of empty space, and then it starts over again. 0.5 0.5 0.5 represents the color gray.

/LTb is the graph's border, and /LTa is for the zero axes.
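
The hand edits described above can also be scripted. The following Python sketch rewrites the color and dash pattern of one /LT definition. It assumes each definition sits on a single line with exactly the layout shown in the examples above; PostScript written by other gnuplot versions may be laid out differently, in which case the pattern would need adjusting. The function name restyle and its parameters are hypothetical, not part of gnuplot.

```python
import re

# Matches lines such as: /LT2 { PL [2 dl 3 dl] 0 0 1 DL } def
# (layout assumed from the examples in the text)
LT_LINE = re.compile(
    r"^/(?P<name>LT\w+) \{ PL \[(?P<dash>[^\]]*)\] "
    r"(?P<r>[\d.]+) (?P<g>[\d.]+) (?P<b>[\d.]+) DL \} def$"
)


def restyle(ps_text, name, rgb=None, dash=None):
    """Return ps_text with the color and/or dash pattern of /<name> replaced.

    rgb  -- (r, g, b) with components from 0 to 1, or None to keep the color
    dash -- list of dash lengths (empty list = solid), or None to keep it
    """
    out = []
    for line in ps_text.splitlines():
        m = LT_LINE.match(line.strip())
        if m and m.group("name") == name:
            dash_text = m.group("dash")
            if dash is not None:
                dash_text = " ".join(f"{d} dl" for d in dash)
            color = (m.group("r"), m.group("g"), m.group("b"))
            if rgb is not None:
                color = tuple(f"{c:g}" for c in rgb)
            color_text = " ".join(color)
            line = f"/{name} {{ PL [{dash_text}] {color_text} DL }} def"
        out.append(line)
    return "\n".join(out)


sample = "\n".join([
    "/LT0 { PL [] 1 0 0 DL } def",
    "/LT2 { PL [2 dl 3 dl] 0 0 1 DL } def",
])
# Make linestyle 1 (/LT0) black and dashed; /LT2 is left untouched.
print(restyle(sample, "LT0", rgb=(0, 0, 0), dash=[4, 2]))
```

Lines that do not match the assumed layout are passed through unchanged, so running the sketch on an unexpected file leaves it as it was.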

Converting PostScript to PNG and editing with the GIMP

To post-process PostScript files for raster output (vector is preferred):

Manual conversion

  1. Open the file in the GIMP (make sure you have Ghostscript installed; see the Windows Ghostscript installation instructions)
    • Enter 500 in the "resolution" input box
    • You may need to uncheck "try bounding box", since the bounding box sometimes cuts off part of the image.
      • Enter large values for Height and Width if not using the bounding box
    • Select color
    • Select strong anti-aliasing for both graphics and text
  2. Crop off extra whitespace (shift+C if you can't find it in the toolbox)
  3. Image → Transform → Rotate 90 degrees clockwise
  4. Filters → Blur → Gaussian blur (No need to blur if you use strong anti-aliasing during conversion. No significant difference between end results.)
    • 2.0 px
  5. Image → Scale Image...
    • 25%
    • Cubic interpolation
  6. You can view at normal size if you want by pressing 1, Ctrl+E
  7. Save as File_name.png

Command-line method

Another way to convert a PostScript (ps or eps) file to PNG is to use ImageMagick, available on many operating systems. A single command is needed:

convert -density 300 file.ps file.png

The density parameter is the output resolution, expressed in dots per inch. With the standard 5x3.5in size of a gnuplot graph, this results in a 1500x1050 pixel PNG image. ImageMagick automatically applies antialiasing, so no post-processing is needed, making this technique especially suited to batch processing. The following Makefile automatically compiles all gnuplot files in a directory into eps figures, converts them to PNG, and then clears the intermediate eps files. It assumes that all gnuplot files have a .plt extension and that each produces an eps file with the same name and the .eps extension:

GNUPLOT_FILES = $(wildcard *.plt)
# create the target file list by substituting the extensions of the plt files
FICHIERS_PNG = $(patsubst %.plt,%.png,$(GNUPLOT_FILES))

all: $(FICHIERS_PNG)

# run gnuplot on each .plt file; the script itself must write the matching .eps
%.eps: %.plt
	@echo "compilation of $<"
	@gnuplot $<

# rasterize the eps at 300 dpi; make deletes the intermediate .eps it created
%.png: %.eps
	@echo "conversion to png format"
	@convert -density 300 $< $@
	@echo "end"
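
The pixel dimensions quoted above follow from multiplying the page size in inches by the density. A minimal Python sketch of that arithmetic is given here so a density can be picked for a desired image width. The function names are illustrative, the 5 x 3.5 in page size is the figure given in the text, and the converter itself may round or crop slightly differently.

```python
def png_size(width_in, height_in, density_dpi):
    """Pixel dimensions of a page of the given size rendered at density_dpi."""
    return round(width_in * density_dpi), round(height_in * density_dpi)


def density_for_width(width_in, target_px):
    """Density (dots per inch) needed to obtain target_px pixels across."""
    return target_px / width_in


print(png_size(5, 3.5, 300))       # (1500, 1050), the size quoted in the text
print(density_for_width(5, 1000))  # 200.0
```

For instance, a PNG about 1000 pixels wide from the same 5 in page would call for -density 200 under these assumptions.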

See also