Talk:Chroma subsampling

From Wikipedia, the free encyclopedia

Glad for the info. Yes, Chroma formats should be merged with Chroma subsampling. The formats by themselves don't mean much without some explanation of how subsampling occurs.


Great article - seems to be the only resource on the web that handles this topic. But I have a problem. Look at the picture on the left: wouldn't you say that the 4:2:0 picture is wrong? We're talking about horizontal lines. 4:2:0 switches the U and V components on every odd line, so I would say (from logic) that there should be a long blue line and then (in the next line) a long red line. --Abdull 14:19, 23 Mar 2005 (UTC)

Nope, I take back the previous statement. As I think I figured out correctly, each of the reddish or bluish blocks represents something like "needed area that is covered with full information". Right? Maybe the bluish and reddish rectangles can be explained? --Abdull 14:25, 23 Mar 2005 (UTC)

I'm actually very curious about your diagram as well. I read it as follows: blue squares represent pixels with luma samples, while red squares represent pixels with both luma and chroma samples associated. This makes sense to me for the 4:1:1 diagram, which looks like every other chroma/luma sampling diagram for 4:1:1 subsampling I have ever seen. Things start to get tricky for me, though, when I get to, for example, the 4:4:4 diagram, which, if I am reading this correctly, I know to be false. Video sampled in a 4:4:4 Y:Cb:Cr ratio should have an equal signal bandwidth for each channel, and thus should have a chroma sample in both Cb and Cr channels for every pixel, right? (Typically, I have seen chroma samples treated as though there is complete overlap between pixels defined in Cb and Cr channels, although I am aware that in DVCPROHD and perhaps other examples this is not true.) At any rate, I have always seen 4:4:4 video implied to be one chroma sample for every luma sample per channel, and thus would expect your diagram for this sampling pattern to be entirely red or blue. Looking further at the diagram, my reading of it would also seem to suggest that all these different chroma sampling patterns contain the same relative amount of information in luma and chroma channels. This contradicts both everything I have ever read on the subject and the statement you make about 4:2:2 requiring less bandwidth than 4:4:4. It is possible that A) my knowledge of codec construction is deeply flawed, and/or that B) I have no idea how to read the diagram. In either case, I would greatly appreciate a caption on the diagram explaining how it should be read, as I appear not to be the only one confused by it. -Evan (06/04/07)

Also I had a tough time wading through this page to get a sense for the talk; a lot of it centers around, well, what the first heading says, so I moved everything that was originally its own separate heading that really was about that to this category, and bolded the old category separators. I hope no one is offended; I simply wanted to make this easier to read. I have seen 22:11:11 (about HD sampling) and 8:8:8 (in regards to a DaVinci setup, too) mentioned before, just to throw my two cents in on that topic. Probably the right answer is in there somewhere. -Evan (06/04/07)


Discussion about the numbers > 4 in sampling ratios and thus meaning of the a:b:c notation

what do the numbers actually mean?

You seem to treat the groups of numbers as simply identifiers with no intrinsic meaning. I find it unlikely that this would be the case. Does anyone know what the individual numbers actually mean? Plugwash 17:37, 8 Apr 2005 (UTC)

Here is your answer: http://www.quantel.com/domisphere/infopool.nsf/HTML/6CE9156EC8F04A5280256C7D00529E43?Open --Dulldull 17:48, 10 May 2005 (UTC)

I can't see any information on the individual numbers there. Plugwash 12:48, 23 May 2005 (UTC)

I just noticed someone had filled it in here since I last checked.

Reply/Note:

The above link is gone. The best way to think of what the numbers mean is that the 4 is the standard luminance resolution for the system you are talking about. The 4 is different for SDTV (NTSC/PAL) than for HDTV. In NTSC/PAL CCIR 601 digital video there are 720 luminance samples per line; this equals the 4 in 4:2:2. There are 360 chrominance samples per line; this is the 2 in 4:2:2. Telecine Guy 3/8/07

a:b:c notation - gone?

  • This article appears to contradict itself about the third field in a:b:c notation. Please see the discussion on the talk page.
  • It looks like this has been deleted from Chroma subsampling and the contradiction notice should be moved. My guess is that this was a topic about MPEG encoding. Anyone agree?

—The preceding unsigned comment was added by Telecineguy (talkcontribs) 23:15, 20 February 2007 (UTC).

8:8:8

  • Where should 8:8:8 go?
  • Reply:

The only 8:8:8 systems I have seen are ones where 8:4:4, 4:4:4 or 4:2:2 is sub-pixel upsampled to 8:8:8. This is done for finer resolution when color correcting a video signal. I have only seen this done inside a DaVinci 8:8:8 SDTV Color Corrector, which would still output 4:4:4 or 4:2:2. A common problem in video arises when an operator has the wrong settings: you cannot mix chroma subsampling formats. 4:4:4 must feed 4:4:4, and 4:2:2 must feed 4:2:2. An up (or down) converter must be used to go from one format to the other; otherwise sampling errors will occur (usually lines in the video). As such, I am not sure 8:8:8 needs to be added to the main page as a heading. Do you have another example of 8:8:8?

4:2:0 and 4:1:0 sampling

The provided information is not correct. The numbers in x:y:z don't indicate which horizontal lines contain which chroma samples.

4:2:0 does not mean that there is no V or Cr information stored at all, it means that in each line, only one color difference channel is stored with half the horizontal resolution. The channel which is stored flips each line, so the ratio is 4:2:0 for one line, 4:0:2 in the next, then 4:2:0 again, and so on. This leads to half the horizontal as well as half the vertical resolution, giving a quarter of the color resolution overall.

So, this is not totally true. There are three types of 4:2:0 sampling: MPEG-1 (U and V have the same placement, between 2x2 luma pixels), MPEG-2 (U and V have the same placement, between two vertical luma pixels) and PAL DV (U and V alternating, just as you described above).

4:1:0 sampling means that chroma is shared between 4x4 luma pixels (and not 2:4). So it's 18 bytes for 16 pixels, or 9 bits per pixel. Indeo uses such sampling.
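The byte counts quoted above can be checked with a little arithmetic. The sketch below assumes 8-bit samples and exactly one Cb plus one Cr sample shared by each block of luma pixels; the function name and the block sizes in the table are illustrative, not taken from any specification.

```python
def bits_per_pixel(block_w, block_h, bits_per_sample=8):
    """Average bits per pixel when one Cb and one Cr sample
    are shared by a block_w x block_h block of luma samples."""
    luma_samples = block_w * block_h
    chroma_samples = 2  # one Cb + one Cr per block (assumption)
    total_bits = (luma_samples + chroma_samples) * bits_per_sample
    return total_bits / luma_samples


# Luma block sizes sharing one chroma pair, as described in this thread
# (assumed here for illustration).
BLOCKS = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
    "4:1:0 (YUV9)": (4, 4),
}

for name, (w, h) in BLOCKS.items():
    print(name, bits_per_pixel(w, h))
```

For the 4x4 case this gives (16 + 2) x 8 / 16 = 9 bits per pixel, which matches the figure of 18 bytes per 16 pixels.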

http://www.avisynth.org/Sampling (plus references in this article) (Section References in that page has a link "4:2:0" taking you to a page that states the sampling in 4:2:0 effectively happens between every other line: http://www.quantel.com/domisphere/infopool.nsf/HTML/dfb420?OpenDocument)

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwmt/html/YUVFormats.asp http://www.fourcc.org/fccyuv.htm#Planar%20YUV%20Formats (YUV9 is 4:1:0 sampling)

Btw, I don't understand the exact naming convention x:y:z either :) But here are some random thoughts about it: http://forum.doom9.org/showthread.php?t=87991 131.155.54.18 15:02, 17 October 2005 (UTC)

I'd always heard that the numbers were: first the (relative) number of luminance samples, then chrominance (both U and V) on even scan lines, then chrominance on odd scan lines. But I was dealing with planar image file formats and therefore didn't need to pay attention to the question of whether all the chrominance was really on half the scan lines, or if U and V were alternately represented, etc.
I suspect that the numbers started out as something like this, then got applied traditionally with some vagueness to anything with essentially the same spatial resolutions in the channels. There's some regularity to the format designations but I wouldn't be surprised to hear that it's a little loose. -66.31.108.41 02:02, 23 March 2006 (UTC)


Origins of J:a:b notation

Originally, the first number meant that luma was sampled at 4 times the frequency of the color subcarrier (4fsc). The two numbers following referred to the sampling frequency of the Cb and Cr components. (may be wrong) However, when SD was standardized, they changed the sampling rates for better NTSC and PAL interchange. So while 4fsc is 14.3 MHz for NTSC and 17.7 MHz for PAL, for SD the luma sampling is 13.5 MHz.
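The 4fsc figures can be reproduced from the subcarrier frequencies. This is only a quick check: the subcarrier values below are the standard NTSC and PAL ones and are not given in the comment above, and the names are made up for illustration.

```python
from fractions import Fraction

# Colour subcarrier frequencies in MHz (standard values, assumed here).
NTSC_FSC = Fraction(315, 88)            # about 3.579545 MHz
PAL_FSC = Fraction(443361875, 10**8)    # 4.43361875 MHz

# Luma sampling rate chosen for SD component video, in MHz.
SD_LUMA_RATE = Fraction(27, 2)          # 13.5 MHz


def four_fsc(fsc):
    """Sampling rate at four times the colour subcarrier."""
    return 4 * fsc


print(round(float(four_fsc(NTSC_FSC)), 3))  # NTSC 4fsc in MHz
print(round(float(four_fsc(PAL_FSC)), 3))   # PAL 4fsc in MHz
print(float(SD_LUMA_RATE / 4))              # one "unit" of the 13.5 MHz rate
```

With 13.5 MHz as the "4", one unit is 3.375 MHz, so the "2" in 4:2:2 would correspond to chroma sampled at 6.75 MHz.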

With subsequent high definition formats, engineers didn't want to go with higher numbers and liked the 4:x:x scheme (and 4fsc became meaningless anyways). So 4:x:x now refers to chroma subsampling scheme used.

(may be wrong) In my opinion, the whole notation is not a good idea since some of the numberings are ambiguous... notably 4:2:0 and 4:4:4.

Re. HD formats - 22:11:11 is sometimes used instead of 4:2:2. 82.127.93.74 11:22, 19 January 2007 (UTC)

4:4:4 Y'CbCr

This is the best color sampling ratio (it yields a nearly perfect representation of each pixel's color), and is used as an intermediate format in high-end film scanners and cinematic postproduction. We could explain why the representation is imperfect... when converting from R'G'B' to Y'CbCr, you tend to incur quantization error.

The subjective comment about it being the best isn't necessarily true.

This is not true! If you keep R'G'B', then you have to use a worse quantization level on each of the R', G' and B' components to make them fit in the limited bandwidth. When you use Y'CbCr, you can use better quantization for Y', which is what your eye will first detect, including for movement (where your eye is very accurate at detecting very small differences in luminance!).
Remember that quantization always happens in the compression of images in MPEG, and every channel is affected by quantization. With Y'CbCr, the impact of quantization can be made much less severe on luma than on chroma, and in the end you get a better perceived quality once the image is decompressed!
If you are not convinced, look at static high-quality JPEG images (Q=100, i.e. 9 bits per color pixel): those compressed within the Y'CbCr or YUV color models are definitely crisper than those in sRGB at the same bitrate (or file size). This is very visible when you look at textured areas of images, or at the borders of contrasted diagonals. Using Y'CbCr really enhances the final resolution of the image with no extra cost in terms of final
The cost of the color space transformation is only in chroma, but the eye does not perceive chromatic differences as well as differences in luminance (especially in textured areas like fields of grass, stones, walls, characters' hair and clothing), or when looking at very smooth areas (like clouds in a natural sky, or waves on the sea).
If you have an (8,8,8)-bit sRGB image, converting it to a (14,5,5)-bit Y'CbCr color space is almost lossless for the eye (and if you take a mathematical measure of the error when performing the conversion back to sRGB, the SNR is extremely high, much more than what your eye can perceive), but it compresses much better; almost all of the loss in image compression does not come from this conversion. 90.5.134.25 17:19, 18 February 2007 (UTC)
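For anyone who wants to measure the conversion error being argued about here, the following is a rough sketch of an 8-bit R'G'B' to Y'CbCr round trip. It assumes full-range BT.601 coefficients (as used in JFIF) and 8 bits for every component; it does not model the (14,5,5)-bit split mentioned above, and the function names are made up for illustration.

```python
KR, KB = 0.299, 0.114      # BT.601 luma weights (assumed)
KG = 1.0 - KR - KB


def clamp8(x):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(x))))


def rgb_to_ycbcr(r, g, b):
    y = KR * r + KG * g + KB * b
    cb = 128 + 0.5 * (b - y) / (1 - KB)
    cr = 128 + 0.5 * (r - y) / (1 - KR)
    return clamp8(y), clamp8(cb), clamp8(cr)


def ycbcr_to_rgb(y, cb, cr):
    r = y + 2 * (1 - KR) * (cr - 128)
    b = y + 2 * (1 - KB) * (cb - 128)
    g = (y - KR * r - KB * b) / KG
    return clamp8(r), clamp8(g), clamp8(b)


def max_roundtrip_error(step=17):
    """Largest per-channel error over a coarse grid of 8-bit colours."""
    worst = 0
    levels = range(0, 256, step)
    for r in levels:
        for g in levels:
            for b in levels:
                back = ycbcr_to_rgb(*rgb_to_ycbcr(r, g, b))
                worst = max(worst, *(abs(p - q) for p, q in zip((r, g, b), back)))
    return worst


print(ycbcr_to_rgb(*rgb_to_ycbcr(0, 0, 255)))
print(max_roundtrip_error())
```

Under these assumptions the round trip is not exact for every colour, but the error stays at the level of a single code value per channel, which is consistent with both points made above: the conversion does incur quantization error, and that error is small.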

8:4:4

This is really just 4:2:2 subsampling right? I was under the impression that the first number lost its meaning a long time ago, and now all that matters is the ratios. --Ray andrew 01:54, 9 March 2007 (UTC)

Reply: In NTSC/PAL digital video there are normally 720 luminance samples per line; this equals the 4 in 4:2:2. There are 360 chrominance samples per line; this is the 2 in 4:2:2. So in 8:4:4 there are 1440 luminance samples per line and 720 chrominance samples per line. This gives twice the resolution for luminance. Since 8 is twice 4, 8:4:4 is a high-end system; this is not subsampling, but true high resolution. Calling 8:4:4 the same ratio as 4:2:2 will not work, as few devices can handle 1440 luminance samples per line. 8:8:8 noted above is subsampling depending on the input. 8:4:4 upped to 8:8:8 is subsampled only in chrominance. See CCIR 601 for more info on 4:2:2 video. When connecting devices together it is important to have the same samples per line on the output and the input. Otherwise sampling errors will occur; these can show up as faint lines from top to bottom in the video. Telecine Guy
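Under the reading given in this reply, each unit in the notation stands for a fixed number of samples per line, anchored at 4 = 720 for standard definition. A minimal sketch of that reading follows; the constant and function names are hypothetical, and other comments on this page dispute that the numbers are absolute outside standard definition.

```python
SD_LUMA_SAMPLES = 720                     # samples per line for the "4" (CCIR 601)
SAMPLES_PER_UNIT = SD_LUMA_SAMPLES // 4   # 180 samples per unit


def samples_per_line(notation):
    """Map 'J:a:b' to (luma, Cb, Cr) samples per line under the
    absolute reading. Schemes with a zero field (e.g. 4:2:0) do not
    fit this reading and are not handled."""
    j, a, b = (int(n) for n in notation.split(":"))
    if 0 in (j, a, b):
        raise ValueError("zero fields are not covered by this reading")
    return tuple(n * SAMPLES_PER_UNIT for n in (j, a, b))


for scheme in ("4:2:2", "4:1:1", "8:4:4"):
    print(scheme, samples_per_line(scheme))
```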

I've never heard of 8:4:4 before (i.e. it doesn't appear in manufacturer literature, it doesn't appear in Charles Poynton's book or other video engineering books, etc.). Is there a source to back this up? 74.102.238.157 08:40, 21 April 2007 (UTC)

It is mostly used by The Spirit DataCine and Da Vinci Systems' 888 and 2k color correctors. Telecine Guy 09:56, 22 April 2007 (UTC)

YUV Notation should really be avoided!

YUV refers to an analog encoding scheme, whereas this article really talks about Y'CbCr (digital). One difference is that the scale factors on the U and V are different than those on Cb and Cr.

Similarly, Y' does not directly represent the luminance from color science.

See http://poynton.com/papers/YUV_and_luminance_harmful.html

While I'm not aware of the difference between YUV and Y'CbCr myself (though I've certainly heard both used), it should be noted that the term YUV is very widely used in digital video texts, papers, etc., even if incorrectly. So even if the article is changed, that misusage should be noted. AlyM 15:51, 6 July 2006 (UTC)

The Chroma Bug

The chroma bug is a problem related to the misinterpretation of chroma subsampling (since it interacts with interlacing).

see http://www.hometheaterhifi.com/volume_8_2/dvd-benchmark-special-report-chroma-bug-4-2001.html

Terminology

Note that the luma (Y') of video engineering deviates from the luminance (Y) of color science (as defined by CIE). Luma is formed as the weighted sum of gamma-corrected (tristimulus) RGB components. Luminance is formed as a weighted sum of linear (tristimulus) RGB components.
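The difference between the two weighted sums can be shown numerically. The sketch below is only illustrative: it assumes Rec. 709 weights and a pure power-law gamma of 2.2 rather than any real transfer function, and all names are made up.

```python
# Rec. 709 weights; a pure power-law gamma of 2.2 is assumed for simplicity.
WEIGHTS = (0.2126, 0.7152, 0.0722)
GAMMA = 2.2


def encode(value):
    """Gamma-correct a linear value in [0, 1]."""
    return value ** (1.0 / GAMMA)


def luminance(rgb_linear):
    """Y: weighted sum of linear RGB components."""
    return sum(w * c for w, c in zip(WEIGHTS, rgb_linear))


def luma(rgb_linear):
    """Y': weighted sum of gamma-corrected R'G'B' components."""
    return sum(w * encode(c) for w, c in zip(WEIGHTS, rgb_linear))


gray = (0.5, 0.5, 0.5)
blue = (0.0, 0.0, 1.0)
for name, colour in (("gray", gray), ("blue", blue)):
    print(name, luma(colour), encode(luminance(colour)))
```

For neutral grays the luma equals the gamma-corrected luminance, but for a saturated colour such as pure blue the two values differ noticeably, which is why the two terms are kept apart.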

In practice, the CIE symbol Y is often incorrectly used to denote luma. In 1993, SMPTE adopted Engineering Guideline EG 28, clarifying the two terms. The luma of video engineering is to be denoted by the symbol Y', whereas the luminance of color science is to be denoted by the symbol Y. Note that the prime symbol ' is used to indicate gamma correction. In current practice, the symbol Y is often used to refer to the luma of video engineering and not the luminance of color science. The exact intention has to be discerned from the context of the term's usage.

Similarly, the chroma/chrominance of video engineering differs from the chrominance of color science.

  • This is mostly derived from Charles Poynton's writings

[[1]] Glennchan 22:44, 13 July 2006 (UTC)

Major Changes

I went ahead and made some fairly major changes to correct technical inaccuracies. My intention was not to step on anyone's toes.

To do:

  • I propose getting rid of the bitstream stuff - to me, it's too obvious
  • Good diagrams showing the schemes and where the chroma samples are taken from
  • Add article on chroma bug, link to it
  • Correct inaccuracies in the luminance (video), YIQ, YUV articles; then link to them
  • Perhaps add information on chroma interpolation and reconstruction
  • Perhaps add information on chroma filtering

Glennchan 06:48, 14 July 2006 (UTC)



Confusing

I felt the presentation is a bit confusing, as it tries to make a correspondence between the bitstream and the sampling format. The bitstream structure can be discussed separately. The book Video Demystified by Keith Jack provides a good explanation of chroma subsampling, and the corresponding chapters of the book are freely available at http://books.elsevier.com/bookscat/samples/1878707566/1878707566.pdf

Yeah, I think it should be changed... although I really gutted the article already.  :D I have no idea how the bitstream stuff is relevant to most people reading Wikipedia... and as you point out, it's confusing / not organized. Perhaps push it into its own section, or remove it entirely. Glennchan 02:55, 20 July 2006 (UTC)

Digital component video

Hey, I think we should try to be more specific when we talk about which formats use what. What exactly does "digital component video" refer to? To me, it is an umbrella term for the various digital video formats that use chroma subsampling, which could be any one of the many schemes available. (Skimming through Charles Poynton's book, digital component video interface seems to typically refer to SDI.) Perhaps we should include HDMI on there, although I don't know too much about it. HDMI seems to support 4:2:2, 4:4:4, the colorspaces Y'CbCr, R'G'B' and xvYCC, and different bit depths. Glennchan 19:59, 7 December 2006 (UTC)

Adding contradiction tag

The "Sampling systems and ratios" section claims that the third component of the ratio is always either equal to the second, or zero. Nonetheless, the article also contains a (brief) section discussing 4:2:1. If 4:2:1 really exists, then the "Sampling systems and ratios" section can't be right. 216.59.230.140 01:35, 2 February 2007 (UTC)

This is not really a contradiction. The section forgets to describe the 4:2:1 case, which is extremely rare (not used in standard video, because the target viewing device will not be able to reproduce such detail in interlaced mode; this can only be viewed on a progressive-scan computer screen). Such notation means that one of the two chroma channels has doubled bandwidth. It means that the chroma channels are not symmetric.
It can't be one of the standard CIE colorspaces, but it could be used for example in HSV or HSL colorspaces:
  • V, the value, or L, the lightness, encodes the luma information on every pixel of a 2x2 macro-block
  • H and S encode the chroma, but the eye is more sensitive to H, the hue, than to S, the saturation, so:
  • you can give twice as much bandwidth to H as to S.
  • This is a minor improvement that helps give more color
  • Such a transform is very computationally intensive, notably because it involves gamma correction, which requires a much higher precision for the intermediate results; for HDTV (which requires very high pixel frequency) the cost would be too high (and it would not work on mobile devices due to excessive power dissipation, too many transistors or gates, and too much capacitance per bit, also because such a transform requires ROM lookups that are extremely energy-expensive, or requires adding DRAM or registers for caching such data; it also requires a generic ALU, instead of simpler static multipliers that are hardware efficient).
For all these reasons, the 4:2:1 subsampling type is reserved for software-only implementations, and its performance is poor (it can't cope now with HDTV signals); there are some cameras that use it, but they all suffer from the cost and heavy weight of their batteries, and their battery life is poor...
Really, forget 4:2:1, and use 4:2:0, with which you can much more easily increase the resolution to compensate for the fact that chroma information is subsampled in alternating frames. 90.5.134.25 17:39, 18 February 2007 (UTC)


Macropixel

The word "macropixel" is not defined, but used. Though I can imagine what it means, it's still unclear to me if "each macropixel of two neighbouring pixels uses 4 bytes of memory" means that for the whole frame the average memory per pixel is 2 or 4 bytes. I guess 4. Macfreek 15:29, 17 December 2005 (UTC)
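One way to settle the question is to pack a line and count bytes. The sketch below assumes the quoted sentence refers to a packed 4:2:2 layout with 8-bit samples in Y0 Cb Y1 Cr order (the ordering used by the YUY2 FourCC); the article may have meant something else, and the function name is made up.

```python
def pack_yuyv_line(luma, cb, cr):
    """Pack one line as Y0 Cb Y1 Cr macropixels: two neighbouring
    pixels share one Cb and one Cr sample, 4 bytes per macropixel."""
    if len(luma) % 2 or len(cb) != len(luma) // 2 or len(cr) != len(cb):
        raise ValueError("need an even width and half as many chroma samples")
    out = bytearray()
    for i in range(0, len(luma), 2):
        out += bytes((luma[i], cb[i // 2], luma[i + 1], cr[i // 2]))
    return bytes(out)


line = pack_yuyv_line([16, 17, 18, 19], [128, 129], [130, 131])
print(len(line) / 4)  # average bytes per pixel for a 4-pixel line
```

Under that assumption the 4 bytes belong to the pair, so the average over the frame is 2 bytes per pixel rather than 4.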

a:b:c notation

I don't understand, even after reading the article, what 4:2:1 exactly does. How would 4:2:1 look in the image on the left?

I don't understand the reason for this notation. Why don't they give the factors for the horizontal and vertical directions separately, e.g. "2:2" could mean 2 times horizontally and 2 times vertically (like 4:2:0)? Why has 0 got a meaning?

At the main entry I added a link to an image that I think better illustrates what the different chroma subsampling notations refer to. That black and white image is more explanatory compared to what the current pink & blue tiled image on the entry does. I would actually suggest to remove that pink and blue image, just keep linking to other one. —Preceding unsigned comment added by Raulsaavedraf (talk • contribs) 19:12, 22 May 2008 (UTC)

8:4:4

I propose including 8:4:4 under 4:2:2 instead of its own section. 8:4:4 is marketing speak (by telecine vendors). For example, if I subsampled a 4096 x 3076 image's chroma components to 2048 x 1536, it would still be called 4:2:0 instead of something like 32:16:0, even though the analog bandwidth (an obsolete measure when talking about digital video) is ~30 times that of standard definition. The numbers in the notation correspond to absolute sample values (4 = 720 samples per line, etc.) only when talking about digitizing analog standard definition video. For everything except standard definition, the numbers just mean ratios. So if the chroma components have half the number of luma samples in the horizontal direction, the subsampling is 4:2:2, no matter if it's full-HD, 4K, or whatever pixel count you can imagine. If you disagree, then I propose adding 16:16:16, 16:8:0, 16:8:8 (and any other meaningless terms) to correspond to different sampling resolutions (I guess 12:6:6 would then be for 2048 x 1536 video with horizontal chroma subsampling :^) ). —Preceding unsigned comment added by 130.233.243.229 (talk) 22:44, 17 December 2007 (UTC)
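The "numbers just mean ratios" reading can be sketched as follows. The rule used for the third field (zero means chroma is also halved vertically) is one common interpretation and is an assumption here, as are the function name and the example frame size; schemes such as 4:1:0 as described elsewhere on this page would not fit it.

```python
def chroma_plane_size(width, height, notation):
    """Chroma plane size for 'J:a:b' read purely as ratios.

    Horizontal factor is J / a; a zero third field is taken to mean
    that chroma is also halved vertically (assumed convention)."""
    j, a, b = (int(n) for n in notation.split(":"))
    h_factor = j // a
    v_factor = 2 if b == 0 else 1
    # Ceiling division so odd dimensions still get a chroma sample.
    return (-(-width // h_factor), -(-height // v_factor))


for scheme in ("4:4:4", "4:2:2", "8:4:4", "4:2:0", "4:1:1"):
    print(scheme, chroma_plane_size(1920, 1080, scheme))
```

Read this way, 8:4:4 and 4:2:2 give the same chroma plane for any frame size, which is the argument for folding 8:4:4 into the 4:2:2 section.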

I agree with you. -- J7n


Video codecs doing 4:4:4

As correctly pointed out in the Controversy section, rudimentary color subsampling is a method of the past – much like interlacing. I never use it when creating JPEG files. Yet all video formats known to me require it. What are the video codecs capable of handling 4:4:4? -- J7n —Preceding unsigned comment added by 83.99.184.75 (talk) 22:01, 6 May 2008 (UTC)

X:Y:Z: Cr and Cb, not even/odd scanlines

In X:Y:Z notation Y and Z correspond to Cr and Cb bandwidth rate, not to even and odd scanlines.

4:2:1 codecs halve Cr and quarter Cb bandwidth rather than sampling odd lines two times less often than even ones.

Bandwidth may be unrelated to sampling rate. Some codecs do really downsample the video, others sample color difference at luma rate then average them and transmit the average. —Preceding unsigned comment added by Abolen (talkcontribs) 15:41, 22 May 2008 (UTC)
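The two approaches mentioned in the last paragraph can be compared on a toy example. This is a sketch only: real codecs use longer filters, and the function names and sample values are made up.

```python
def decimate(samples, factor=2):
    """Keep every factor-th colour-difference sample and drop the rest."""
    return samples[::factor]


def box_average(samples, factor=2):
    """Sample at the luma rate, then average each group of `factor`
    neighbouring samples and keep only the average."""
    groups = (samples[i:i + factor] for i in range(0, len(samples), factor))
    return [sum(group) / len(group) for group in groups]


cb_line = [10, 30, 50, 70]      # hypothetical Cb samples at the luma rate
print(decimate(cb_line))        # true downsampling
print(box_average(cb_line))     # averaged, then transmitted
```

Both produce half as many chroma samples per line, so the notation is the same, but the transmitted values differ.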