Wikipedia:Preparing images for upload
From Wikipedia, the free encyclopedia
When uploading image files to Wikipedia or, preferably, to the Wikimedia Commons, it is important to use the right file format for the content. Space on the images server is not at a premium, and we should not throw information away.
While some formats offer multiple compression systems, in general the format and the compression system are tied together. The basic choices are SVG for simple diagrams (especially those that need to be scaled), PNG for diagrams that can't be easily or efficiently created as vector graphics, and JPEG for photographs or similar images (e.g., screenshots of modern 3D video games, or anything else that's not mainly solid colors).
Other image formats should be avoided in most cases:
- BMP - Images are uncompressed, resulting in larger file sizes. Should usually be converted to PNG.
- GIF - Files may be larger, less scalable, and not as colorful. Should usually be converted to PNG unless animated.
- TIFF - Should usually be converted to PNG or JPEG as discussed above.
Diagrams
The editability and scalability of SVG (Scalable Vector Graphics) make the format the obvious choice for graphic representation of data and illustrations. However, it is not always easy to convert raster images like GIFs, PNGs, or JPEGs to SVG, and some images (especially photos) are not conducive to this treatment. If a diagram cannot be produced in SVG then a PNG image is preferred over a JPEG.
The PNG compression algorithm is designed to work with large areas of solid colour that have sharp boundaries. It is therefore a good format for diagrams and cartoons. But it does not automatically give you the smallest possible file size. There are some things that need to be done by hand.
There is a myth among many web designers that PNGs are larger than GIFs. This myth stems from two facts:
- Many people compare 24-bit PNG with 8-bit GIF, which is an invalid comparison.
- Photoshop is historically known for being poor at creating PNG files.
Replace captions in the image with text
[Figure: a title as a caption under the image. Image with title: 1248 bytes. Image without title: 854 bytes (+74 bytes caption).]
Does the diagram contain a title? If so, consider removing the part of the image containing the title and adding a text caption to the image instead. Plain text:
- Takes up less space than the equivalent text in an image;
- Can easily be changed;
- Scales up into larger or smaller font sizes;
- Can be searched;
- Can be copied and pasted;
- Can be translated into other languages so that the same diagram can be used in other Wikipedias.
Choose a colour depth appropriate for the number of colours
[Figure: 1-bit colour, no anti-aliasing: 180 bytes. 4-bit colour, anti-aliased: 309 bytes. Enlarged view of anti-aliased image.]
Does the number of bits per pixel fit the number of colours in the image? Diagrams usually have few colours. If a diagram has 4 colours, there is no need to store it in a 24-bit (truecolour) format capable of distinguishing 16 million colours. The lower colour depth versions of PNG store colours in a palette. Paletted images can have a bit depth of 1, 2, 4, or 8 bit (2, 4, 16, or 256 colour). Use the lowest bit depth that can handle all colours in your image, although some image editing programs cannot create 2-bit colour images.
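The rule in the paragraph above (use the lowest palette depth that can hold every colour) can be sketched in a few lines of Python. The function name `min_palette_bit_depth` is made up for this illustration; the allowed depths are the ones listed above.

```python
PALETTE_BIT_DEPTHS = (1, 2, 4, 8)  # paletted PNG depths listed in the text


def min_palette_bit_depth(colour_count: int) -> int:
    """Return the smallest palette bit depth that can index colour_count colours.

    Raises ValueError when more than 256 colours are needed, in which case
    the image does not fit a palette without reducing colours first.
    """
    if colour_count < 1:
        raise ValueError("an image has at least one colour")
    for depth in PALETTE_BIT_DEPTHS:
        if colour_count <= 2 ** depth:
            return depth
    raise ValueError("more than 256 colours: does not fit a palette")


for n in (2, 4, 5, 16, 17, 256):
    print(n, "colours ->", min_palette_bit_depth(n), "bit")
```

Note that a 5-colour image already needs the 4-bit palette, since there is no 3-bit option.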
If you are converting an image with many colours to a PNG (perhaps because somebody saved the original as a JPEG, which should be avoided), you may want to reduce the number of colours at the same time; see Wikipedia:How to reduce colors for saving a JPEG as PNG.
If your image is anti-aliased you may be using more colours than you suspect, because anti-aliasing smooths jagged edges by adding shades of grey where once there was black or white. Anti-aliased black-and-white images usually need to be saved as 16-colour or 256-colour images instead. See the illustration at the right.
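To see why anti-aliasing raises the colour count, here is a minimal Python sketch. It assumes a very crude form of anti-aliasing (averaging 4×4 blocks of a hard-edged drawing); real editors use better filters, and the helper names are hypothetical.

```python
def render_disc(size: int, radius: int) -> list[list[int]]:
    """Black disc (0) on white (255) with hard edges: a 2-colour image."""
    c = size // 2
    return [[0 if (x - c) ** 2 + (y - c) ** 2 <= radius ** 2 else 255
             for x in range(size)] for y in range(size)]


def downsample(image: list[list[int]], factor: int) -> list[list[int]]:
    """Average each factor x factor block, which smooths the jagged edge."""
    size = len(image) // factor
    out = []
    for by in range(size):
        row = []
        for bx in range(size):
            block = [image[by * factor + dy][bx * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def count_levels(image: list[list[int]]) -> int:
    """Number of distinct grey levels used anywhere in the image."""
    return len({value for row in image for value in row})


hard = render_disc(32, 13)
smooth = downsample(hard, 4)
print("hard-edged:  ", count_levels(hard), "levels")
print("anti-aliased:", count_levels(smooth), "levels")
```

The smoothed version uses intermediate greys along the edge of the disc, so it no longer fits a 1-bit palette.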
Do not save diagrams as JPEG
[Figure: image showing lossy JPEG compression, with an enlarged view showing compression artifacts.]
To the right is an example of a file saved as JPEG when it should have been saved as PNG. JPEG uses lossy data compression meant for photographs. Compressing drawings or diagrams with JPEG results in an image of poor quality, because the human eye can spot the compression artifacts around the edges.
Another drawback is the large file size you will end up with. JPEG compression has many options but most commonly only two colour spaces: 24-bit RGB (8 bits per sample) and 8-bit greyscale. Most importantly, JPEG by its nature cannot support indexed colour. In the example on the right, a 4-colour image is inflated by being stored in an inappropriate colour format, which results in the rather large file size.
If you do not have an original file but only a JPEG that really should be a PNG, do not simply save the JPEG as PNG because this will result in an even larger file. There is a nice tutorial at Wikipedia:How to reduce colors for saving a JPEG as PNG.
Use SVG over PNG
[Figure: expanded PNG, expanded SVG, and simple changes to SVG.]
PNG is a raster graphics format, encoding the value of each individual pixel, while SVG is a vector graphics format that encodes an image as a series of geometric constructs. If this confuses you, don't worry; you don't need to understand the technical aspects to create or upload images. What this means in practice is that an SVG image scales to different sizes far better than an equivalent PNG. Therefore, for images that consist largely or entirely of polygons, lines, and curves (national flags, road signs, etc.), SVG is the preferred format. Shown here are two example enlarged crops of an image, one in SVG format, one in PNG format. The difference in quality is obvious.
SVGs can also be easily altered with a text editor. This makes updating and translating illustrations much easier. Unfortunately, text rendering is the most inconsistent part of SVG implementations so some users find it necessary to convert text to outlines, which removes this advantage. If it is felt necessary to convert text to outlines then a version with unconverted text should be uploaded first (for editing) followed by a version with converted text (for consistent rendering). Editors on Linux and other UNIX-like systems have the fewest difficulties with fonts in SVG because they usually have fonts in common with the Wikipedia servers, and they can use rsvg-view to preview SVGs exactly as Mediawiki will render them.
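Because an SVG is plain XML, the kind of edit described above can also be scripted. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `translate_labels` and the sample drawing are made up for illustration, and it assumes that each label is a plain `<text>` element with no nested `<tspan>` and has not been converted to outlines.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # keep plain <svg>, <text> tags on output


def translate_labels(svg_source: str, translations: dict[str, str]) -> str:
    """Replace the text of <text> elements that have an entry in translations."""
    root = ET.fromstring(svg_source)
    for node in root.iter(f"{{{SVG_NS}}}text"):
        if node.text in translations:
            node.text = translations[node.text]
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-element drawing used only as sample input.
EXAMPLE = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 40">
  <rect width="100" height="40" fill="white"/>
  <text x="10" y="25">Water</text>
</svg>"""

print(translate_labels(EXAMPLE, {"Water": "Wasser"}))
```

The geometry is untouched; only the label changes, which is what makes translated versions of the same diagram cheap to produce.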
JPEG tips
As stated above, JPEG (Joint Photographic Experts Group) was developed with photographic images in mind. Although the JPEG algorithms are quite good, there are a couple of tips that will help to get the smallest file size possible without sacrificing quality:
- When saving a JPEG, the graphics program will let you choose the compression level. Usually the values range from 0 to 100, where 100 is the best quality possible with very little compression applied (some apps, most notably Paint Shop Pro, have this scale in reverse, with 0 as the highest quality and 100 as the lowest). Don't mistake the 0 to 100 scale for a percentage: halving the setting does not halve the quality, nor does it halve the file size. Also, 100 does not mean "100%", as the image is still compressed, resulting in some minute loss of detail. Since most JPEGs in Wikipedia will be rescaled anyway before appearing on pages, a quality setting of 95 is appropriate.
- JPEG compression works better on slightly blurred images, so don't sharpen the images too much, as it will result in a larger file.
- Always work from the original image and not from the already saved JPEG file, as quality gradually decreases the more you save it. For this reason, it may be good to keep the main copy here in a lossless format like PNG. However, as of right now, scaled versions are forced to be in the same format as the original image and having two copies of the image is a maintenance nightmare.
- JPEG files can be losslessly compressed with jpegtran -optimize. Jpegtran is part of libjpeg. A package called littleutils contains a script called opt-jpg that automates JPEG optimization, using jpegtran as the underlying engine.
- JPEG files can also be losslessly compressed with a tiny free Windows program called JPGExtra, which removes all hidden "extras" that are typically added by digital cameras and image editing software.
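The jpegtran call mentioned in the list above can be wrapped in a small Python helper. This is only a sketch: the function names are hypothetical, and it assumes a libjpeg-style `jpegtran` on the PATH that accepts the `-optimize`, `-copy` and `-outfile` switches.

```python
import shutil
import subprocess


def jpegtran_command(source: str, destination: str,
                     keep_metadata: bool = True) -> list[str]:
    """Build a lossless jpegtran call.

    "-copy all" keeps the extra markers (such as comments and camera data);
    "-copy none" drops them.
    """
    copy_mode = "all" if keep_metadata else "none"
    return ["jpegtran", "-optimize", "-copy", copy_mode,
            "-outfile", destination, source]


def optimise_jpeg(source: str, destination: str,
                  keep_metadata: bool = True) -> bool:
    """Run jpegtran if it is installed; return False when it is missing."""
    if shutil.which("jpegtran") is None:
        return False
    subprocess.run(jpegtran_command(source, destination, keep_metadata),
                   check=True)
    return True
```

Writing to a separate destination file leaves the original untouched, so the two can be compared before anything is uploaded.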
PNG tips
Images which are not photographs, such as diagrams and screen captures of applications or older video games, use few colors. If it makes sense, save the image in indexed mode. A truecolour PNG can often be converted to indexed mode without changing the look of the image, while saving on file size. (See color depth for information on indexed mode and truecolour.)
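The effect of indexed mode on file size can be checked without any image editor. The sketch below writes the same 4-colour picture twice with a hand-rolled minimal PNG encoder (standard library only): once as 24-bit truecolour and once as a 2-bit indexed image. The palette and the random test pattern are assumptions chosen for illustration; a random pattern is used so that the deflate step cannot hide the difference, and real diagrams with large flat areas will show a different ratio.

```python
import random
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(kind: bytes, data: bytes) -> bytes:
    """Length, type, data, CRC-32 of type+data: the layout of every PNG chunk."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def finish(ihdr: bytes, extra: bytes, scanlines: bytes) -> bytes:
    """Assemble signature, IHDR, optional extra chunks, IDAT and IEND."""
    return (PNG_SIGNATURE + chunk(b"IHDR", ihdr) + extra
            + chunk(b"IDAT", zlib.compress(scanlines, 9))
            + chunk(b"IEND", b""))


def encode_truecolour(rows, palette) -> bytes:
    """Write every pixel as 3 bytes of RGB (colour type 2, 8 bits per sample)."""
    width, height = len(rows[0]), len(rows)
    scanlines = b"".join(
        b"\x00" + b"".join(bytes(palette[i]) for i in row)  # filter 0 (none)
        for row in rows)
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return finish(ihdr, b"", scanlines)


def encode_indexed(rows, palette, bit_depth: int) -> bytes:
    """Write palette indices packed bit_depth bits each (colour type 3)."""
    width, height = len(rows[0]), len(rows)
    per_byte = 8 // bit_depth
    scanlines = bytearray()
    for row in rows:
        scanlines.append(0)  # filter 0 (none)
        for start in range(0, width, per_byte):
            packed = 0
            for k, index in enumerate(row[start:start + per_byte]):
                # The leftmost pixel goes into the highest bits of the byte.
                packed |= index << (8 - bit_depth * (k + 1))
            scanlines.append(packed)
    ihdr = struct.pack(">IIBBBBB", width, height, bit_depth, 3, 0, 0, 0)
    plte = chunk(b"PLTE", b"".join(bytes(rgb) for rgb in palette))
    return finish(ihdr, plte, bytes(scanlines))


PALETTE = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]
rng = random.Random(0)  # fixed seed so the test pattern is reproducible
ROWS = [[rng.randrange(4) for _ in range(128)] for _ in range(128)]

truecolour = encode_truecolour(ROWS, PALETTE)
indexed = encode_indexed(ROWS, PALETTE, 2)
print("truecolour:   ", len(truecolour), "bytes")
print("2-bit indexed:", len(indexed), "bytes")
```

Both outputs decode to the same picture; only the storage layout differs.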
Image editing programs commonly produce poorly compressed PNGs, even when run at their maximum compression settings. As a result, there are a variety of tools to compress PNGs without any loss of quality. However, if the image will be scaled by MediaWiki before viewing, then these steps are pointless. Some such tools, and information on using them, are shown below.
- PNGOUT (gratis)
- OptiPNG (open source)
- Pngcrush (open source)
- AdvPNG, part of the AdvanceCOMP compilation (GNU GPL)
OptiPNG is generally better than pngcrush and usually significantly faster. AdvPNG can be used after OptiPNG to further improve the results. AdvPNG is straightforward to use, as it optimizes only the compression itself.
For quick compression, simply use optipng with no options at all:
optipng file.png
If smallest results are desired and time is not important, a chain of this sort produces the smallest possible results:
optipng -o7 file.png
advpng -z4 file.png
Each of these utilities applies a different, more sophisticated variant of the "deflate" compression method to the PNG, and each generally produces a smaller file when run after the other tools. If the smallest result matters, try both to see which produces the best result.
After any compression, the image should be compared to the original. It's occasionally the case that quirks in the original cause transparency to be lost even in compression which is intended to be lossless. This commonly, but not always, shows up as a change in the background colour which is obviously visible at a glance.
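The optipng and advpng chain and the advice to compare against the original can be combined in one helper. The following is a sketch only: the function names and the `.orig` backup suffix are hypothetical, the commands are the two given earlier in this section, and tools that are not installed are simply skipped.

```python
import os
import shutil
import subprocess


def optimisation_chain(path: str) -> list[list[str]]:
    """The two commands from the text, in the order given there."""
    return [["optipng", "-o7", path], ["advpng", "-z4", path]]


def optimise_png(path: str) -> tuple[int, int]:
    """Run the chain in place, keeping an untouched copy at path + '.orig'.

    Returns (size_before, size_after) in bytes.
    """
    backup = path + ".orig"
    shutil.copy2(path, backup)  # kept so the result can be compared by eye
    before = os.path.getsize(path)
    for command in optimisation_chain(path):
        if shutil.which(command[0]) is not None:
            subprocess.run(command, check=True)
    return before, os.path.getsize(path)
```

Open the `.orig` copy and the optimized file side by side before uploading, paying particular attention to transparency and background colour.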
For quick-and-dirty optimization, the opt-png script (found in the littleutils package) might be useful. It automates PNG optimization, utilizing pngcrush and a variant of pngrewrite as underlying engines.
Note also that these chains, particularly the pngrewrite step, will discard non-image blocks, often including copyright or creator details. Check the pngrewrite and other program options if you want to preserve this information.
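To check which non-image blocks a file carries before and after optimization, the chunk list of a PNG can be read with the Python standard library. This is a minimal sketch with hypothetical function names; it relies on the PNG convention that chunk types beginning with a lowercase letter are ancillary, that is, not required to decode the pixels.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data: bytes):
    """Yield (type, payload) for each chunk, checking the CRC as we go."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset < len(data):
        (length,) = struct.unpack_from(">I", data, offset)
        kind = data[offset + 4:offset + 8]
        payload = data[offset + 8:offset + 8 + length]
        (crc,) = struct.unpack_from(">I", data, offset + 8 + length)
        if zlib.crc32(kind + payload) & 0xFFFFFFFF != crc:
            raise ValueError(f"bad CRC in {kind!r} chunk")
        yield kind.decode("ascii"), payload
        offset += 12 + length  # 4 length + 4 type + payload + 4 CRC


def ancillary_chunks(data: bytes) -> list[str]:
    """Types of the chunks an optimizer could drop without breaking decoding."""
    return [kind for kind, _ in iter_chunks(data) if kind[0].islower()]
```

Calling `ancillary_chunks(open("file.png", "rb").read())` on the original and on the optimized file shows what was discarded. Text chunks such as `tEXt` typically hold author or copyright notes, and `tRNS` holds palette transparency, so a missing `tRNS` is one possible cause of lost transparency.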
See also
This tutorial came about after a discussion on the Image:Covalent.png talk page.