YUV color space was invented as a broadcast solution to send color information through channels built for monochrome signals. Color is incorporated to a monochrome signal by combining the monochrome signal (also called brightness, luminance, or luma, and represented by the Y symbol), with two chrominance signals (also called chroma and represented by UV or CbCr symbols). This allows for full color definition and image quality on the receiving end of the transmission.
Storing or transferring video over IP can be taxing on network infrastructure. Chroma subsampling is a way to represent this video at a fraction of the original bandwidth, therefore reducing the strain on the network. This takes advantage of the human eye’s sensitivity to brightness as opposed to color. By reducing the detail required in the color information, video can be transferred at a lower bitrate in a way that's barely noticeable to the viewers.
4:4:4 full color depth
Full color depth is usually referred to as 4:4:4. The first number indicates that there are four pixels across, the second indicates that there are four unique colors, and the third indicates that there are four changes in color for the second row. These numbers are unrelated to the size of individual pixels.
Each pixel then receives three signals, one luma (brightness) component represented by Y, and two color difference components known as chroma represented by Cr (U) and Cb (V).
4:4:4 chroma subsampling
Subsampling is a way of sharing color across multiple pixels and using the eye and brain’s natural tendency to mix neighboring pixels. Subsampling reduces the color resolution by sampling chroma information at lower rate than luma information.
YUV 4:2:2 vs. 4:2:0
4:2:2 subsampling implies that the chroma components are only sampled at half the frequency of the luma:
4:2:2 chroma subsampling
The chroma components from pixels one, three, five, and seven will be shared with pixels two, four, six, and eight respectively. This reduces the overall image bandwidth by 33%.
Similarly, in 4:2:0 sub-sampling, the chroma components are sampled at a fourth of the frequency of the luma.
4:2:0 chroma subsampling
The components are shared by four pixels in a square pattern, which reduces the overall image bandwidth by 50%.
Several other chroma subsampling methods exist, but these are the two principles of chroma subsampling reducing the image bandwidth by reducing the pixel color sampling frequency remains the same.
The image below details how a 4x2 pixel region is represented in 4:2:0 and 4:2:2 subsampling.
YUV 4:4:4 vs 4:2:2 vs 4:2:0
In the example below, the three frames (one luminosity and two chromas) can be combined to create the final colored image:
Since most displays are black by default, the simplest way to portray an image is by brightness only. This is known as a monochromatic image:
Monochromatic image of a plant
In such cases, the incoming signal will only have a luma (Y) component, and no chroma components (U or V).
Subsampling size saving
With 8 bits per component,
- In 4:4:4, each pixel will require three bytes of data (since all three components are sent per pixel).
- In 4:2:2, every two pixels will have four bytes of data. This gives an average1 of two bytes per pixel (33% bandwidth reduction).
- In 4:2:0, every four pixels will have six bytes of data. This gives an average of 1.5 bytes per pixel (50% bandwidth reduction).
||Average size per pixel
||Bandwidth required for HD60 content
||Bandwidth required for UHD60 content
||3 bytes (24-bit)
||2 bytes (16-bit)
||1.5 bytes (12-bit)
When to use chroma subsampling and when to avoid?
Chroma subsampling is a useful method to use for natural content, where lower chroma resolution isn't noticeable.
On the other hand, for complex and precise synthetic content (for example, CGI content), full color depth is needed to prevent visible artifacts (edge blurring), since the pixel precise content may exacerbate them.
The images below show how CGI data can be impacted by subsampling.
Chroma sub-sampling 4:4:4 vs 4:2:2 vs 4:2:0
The finer details are lost when this image is displayed using chroma subsampling. This can be dangerous in mission-critical environments where key decisions are made based on the presented data. When sampling text at 4:2:2 or 4:2:0, then the quality will drop making said text increasingly difficult to read.
When choosing products for video walls, for example, it's crucial to choose technologies that allow versatility with regards to color space. Take a control room for instance. Part of the control room wall may display charts or graphs where every detail matters. In this case, a capture, encoding, decoding, and display product which has the capability to handle 4:4:4 is better suited. On the other hand, if watching a feed of high motion content, say a sports event, then the overall network bandwidth could be reduced by having this video play at 4:2:0. Versatility is key when choosing products for capture, streaming, recording, decoding, and display as it allows the user to reach a wider range of functionality.
Is YUV 4:4:4 the same as RGB?
While the output image will look very similar, and the bandwidth required to transfer the image will be the same2, the storage and transfer of data will differ between the two.
RGB will transmit content with a predetermined color depth per component. This means that each of the R, G, and B will contain data for each of the red, green, and blue color components respectively to collectively formulate the overall color of each pixel.
YUV, on the other hand, will transmit each pixel with the associated luma component, and two chroma components.