rav1e Chroma Subsampling: 4:2:2 and 4:4:4 Support
This article explores how the librav1e AV1 encoder
handles high-fidelity chroma subsampling formats, specifically 4:2:2 and
4:4:4. We will examine how the encoder configures these formats, their
impact on video quality and encoding performance, and how developers can
utilize them for professional video workflows.
Chroma Subsampling in AV1 and rav1e
Chroma subsampling reduces the amount of color data in a video by storing chrominance (color) at a lower resolution than luminance (brightness), which exploits the eye's lower sensitivity to color detail. While consumer video typically uses 4:2:0 subsampling, professional editing, screen recording, and high-end streaming often require 4:2:2 or 4:4:4 formats to preserve fine color details and sharp text.
As an encoder for the AV1 video format, librav1e follows the
profile structure of the AV1 specification, which determines which
subsampling formats and bit depths a stream may use. Specifically:
- Main Profile (Profile 0): Supports 8-bit and 10-bit YUV 4:2:0.
- High Profile (Profile 1): Supports 8-bit and 10-bit YUV 4:4:4.
- Professional Profile (Profile 2): Supports 8-bit, 10-bit, and 12-bit YUV 4:2:2, as well as 12-bit 4:2:0 and 4:4:4.
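The profile rules above can be sketched as a small lookup. This is a minimal illustration, not code from librav1e: the `Chroma` enum and the `av1_profile` function are hypothetical names introduced here, and grouping monochrome with 4:2:0 for 8-bit and 10-bit input is an assumption that goes beyond the list above.

```rust
/// Local stand-in for the chroma formats discussed in this article.
/// (Hypothetical type for illustration; it is not imported from rav1e.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Chroma {
    Cs400,
    Cs420,
    Cs422,
    Cs444,
}

/// Returns the AV1 profile number (0, 1, or 2) implied by a chroma format
/// and bit depth, or `None` if the bit depth is not 8, 10, or 12.
pub fn av1_profile(chroma: Chroma, bit_depth: u8) -> Option<u8> {
    match bit_depth {
        // Every 12-bit stream falls under the Professional profile.
        12 => Some(2),
        8 | 10 => Some(match chroma {
            // 4:2:2 is only listed under the Professional profile.
            Chroma::Cs422 => 2,
            // 4:4:4 at 8 or 10 bits is the High profile.
            Chroma::Cs444 => 1,
            // 4:2:0 at 8 or 10 bits is the Main profile. Monochrome is
            // grouped here as an assumption of this sketch.
            Chroma::Cs420 | Chroma::Cs400 => 0,
        }),
        _ => None,
    }
}
```

For example, `av1_profile(Chroma::Cs444, 10)` yields profile 1, while the same format at 12 bits yields profile 2.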
How librav1e Processes 4:2:2 and 4:4:4
The librav1e library manages chroma subsampling formats
natively through its configuration API. It handles these formats using a
structured pipeline:
1. Configuration via the API
Developers configure the desired subsampling format using the
ChromaSampling enum in the rav1e API. This enum specifies
how the chroma channels are sampled relative to the luma channel:
- Cs420: Half horizontal and half vertical resolution
- Cs422: Half horizontal, full vertical resolution
- Cs444: Full horizontal and vertical resolution
- Cs400: Monochrome, no chroma
The library also allows setting the ChromaSamplePosition
to define where the chroma samples are located relative to the
luma grid (Unknown, Vertical, or Colocated).
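The relationship between the four enum values and the chroma resolution can be expressed as a pair of right-shifts applied to the luma dimensions. The sketch below uses a local `ChromaFormat` enum that mirrors the variant names above; the type, the `shifts` method, and the `parse_format` helper are hypothetical and are not the rav1e API itself.

```rust
/// Local mirror of the four chroma formats named above.
/// (Hypothetical type for illustration; it is not imported from rav1e.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ChromaFormat {
    Cs420,
    Cs422,
    Cs444,
    Cs400,
}

impl ChromaFormat {
    /// Returns the (horizontal, vertical) right-shift that maps luma
    /// dimensions to chroma dimensions, or `None` for monochrome,
    /// which has no chroma planes.
    pub fn shifts(self) -> Option<(u32, u32)> {
        match self {
            ChromaFormat::Cs420 => Some((1, 1)), // halved in both directions
            ChromaFormat::Cs422 => Some((1, 0)), // halved horizontally only
            ChromaFormat::Cs444 => Some((0, 0)), // full resolution
            ChromaFormat::Cs400 => None,
        }
    }
}

/// Parses the conventional "J:a:b" notation into a format.
pub fn parse_format(s: &str) -> Option<ChromaFormat> {
    match s.trim() {
        "4:2:0" => Some(ChromaFormat::Cs420),
        "4:2:2" => Some(ChromaFormat::Cs422),
        "4:4:4" => Some(ChromaFormat::Cs444),
        "4:0:0" => Some(ChromaFormat::Cs400),
        _ => None,
    }
}
```

A caller could combine the two, for instance `parse_format("4:2:2").and_then(ChromaFormat::shifts)`, to go from a user-supplied string to the shift pair `(1, 0)`.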
2. Internal Pixel Representation
Inside librav1e, video frames are stored in planes. For
a 4:4:4 input, the U and V chroma planes are allocated at the same
resolution as the Y (luma) plane. For 4:2:2, the chroma planes have full
vertical resolution but half horizontal resolution. The encoder
processes these planes without downsampling them to 4:2:0, ensuring that
the high-fidelity color information is preserved throughout the motion
estimation, transform, and quantization steps.
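The plane layout described above can be illustrated with a simplified allocation routine. This is a sketch under stated assumptions: the `Planes` struct, `chroma_dims`, and `alloc_planes` are hypothetical names, samples are stored as `u16` regardless of bit depth, and the padding and stride handling that a real encoder needs is left out.

```rust
/// Simplified three-plane frame. (Hypothetical type for illustration;
/// a real encoder also tracks stride and padding.)
pub struct Planes {
    pub y: Vec<u16>,
    pub u: Vec<u16>,
    pub v: Vec<u16>,
    pub chroma_width: usize,
    pub chroma_height: usize,
}

/// Computes chroma plane dimensions from luma dimensions and the
/// horizontal/vertical subsampling shifts, rounding up for odd sizes.
pub fn chroma_dims(luma_w: usize, luma_h: usize, ss_x: u32, ss_y: u32) -> (usize, usize) {
    let cw = (luma_w + (1usize << ss_x) - 1) >> ss_x;
    let ch = (luma_h + (1usize << ss_y) - 1) >> ss_y;
    (cw, ch)
}

/// Allocates zero-filled Y, U, and V planes for the given format.
/// Shifts of (1, 1) mean 4:2:0, (1, 0) mean 4:2:2, and (0, 0) mean 4:4:4.
pub fn alloc_planes(luma_w: usize, luma_h: usize, ss_x: u32, ss_y: u32) -> Planes {
    let (cw, ch) = chroma_dims(luma_w, luma_h, ss_x, ss_y);
    Planes {
        y: vec![0; luma_w * luma_h],
        u: vec![0; cw * ch],
        v: vec![0; cw * ch],
        chroma_width: cw,
        chroma_height: ch,
    }
}
```

For a 1920x1080 frame, this gives 960x1080 chroma planes for 4:2:2 and 1920x1080 chroma planes for 4:4:4, matching the description above.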
3. Encoding Optimization and Bit Depth
librav1e supports 8-bit, 10-bit, and 12-bit depths. When
encoding 4:2:2 or 4:4:4 video at higher bit depths (such as 10-bit or
12-bit), librav1e can use optimized SIMD (Single
Instruction, Multiple Data) assembly paths where they are available for
the target platform. This helps keep the increased data load of
processing higher-resolution color channels from severely
bottlenecking the encoding process.
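To get a feel for the data load involved, the raw size of one uncompressed frame can be estimated from the chroma format and bit depth. The sketch below assumes that 8-bit samples occupy one byte and that 10-bit and 12-bit samples are stored in 16-bit words, which is a common convention; the function names are hypothetical and the figures describe raw input buffers, not encoded output.

```rust
/// Bytes used to store one sample, assuming 8-bit samples fit in one
/// byte and 10/12-bit samples are stored in 16-bit words (an assumption
/// of this sketch). Returns `None` for unsupported bit depths.
pub fn bytes_per_sample(bit_depth: u8) -> Option<usize> {
    match bit_depth {
        8 => Some(1),
        10 | 12 => Some(2),
        _ => None,
    }
}

/// Estimates the unpadded size in bytes of one raw frame.
/// `ss_x` and `ss_y` are the chroma subsampling shifts:
/// (1, 1) for 4:2:0, (1, 0) for 4:2:2, (0, 0) for 4:4:4.
pub fn raw_frame_bytes(
    width: usize,
    height: usize,
    ss_x: u32,
    ss_y: u32,
    bit_depth: u8,
) -> Option<usize> {
    let bps = bytes_per_sample(bit_depth)?;
    // Chroma dimensions, rounded up for odd luma sizes.
    let cw = (width + (1usize << ss_x) - 1) >> ss_x;
    let ch = (height + (1usize << ss_y) - 1) >> ss_y;
    // One luma plane plus two chroma planes.
    Some((width * height + 2 * cw * ch) * bps)
}
```

Under these assumptions, a 1920x1080 frame takes 3,110,400 bytes as 8-bit 4:2:0 and 12,441,600 bytes as 10-bit 4:4:4, a fourfold increase in memory traffic per frame.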
Performance and Quality Impact
Using 4:2:2 or 4:4:4 in librav1e has distinct
trade-offs:
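- Color Fidelity: 4:4:4 avoids the color bleeding and “fuzzy” red text caused by chroma subsampling, which is highly beneficial for screen-casting, remote desktop applications, and gaming content. Artifacts introduced by lossy compression itself can still appear at low bitrates.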
- Compression Efficiency: Because there is more raw data to compress, 4:4:4 and 4:2:2 files require a higher bitrate to achieve the same perceived quality level as a 4:2:0 file, though the output remains highly efficient due to AV1’s advanced coding tools.
- Encoding Speed: Processing 4:2:2 and 4:4:4 formats increases the computational workload. The encoder must perform motion search, intra-prediction, and transform loops on larger chroma blocks compared to 4:2:0.
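The extra workload can be roughly quantified by counting samples. The sketch below compares the total sample count of a format against 4:2:0 and shows how the chroma block that accompanies a given luma block grows. The function names are hypothetical, and sample count is only a proxy: it does not predict actual bitrate or encoding time.

```rust
/// Total samples per frame (one luma plane plus two chroma planes).
/// `ss_x` and `ss_y` are the chroma subsampling shifts:
/// (1, 1) for 4:2:0, (1, 0) for 4:2:2, (0, 0) for 4:4:4.
pub fn total_samples(width: u64, height: u64, ss_x: u32, ss_y: u32) -> u64 {
    let cw = (width + (1u64 << ss_x) - 1) >> ss_x;
    let ch = (height + (1u64 << ss_y) - 1) >> ss_y;
    width * height + 2 * cw * ch
}

/// Ratio of a format's sample count to that of 4:2:0 at the same size.
pub fn relative_to_420(width: u64, height: u64, ss_x: u32, ss_y: u32) -> f64 {
    total_samples(width, height, ss_x, ss_y) as f64
        / total_samples(width, height, 1, 1) as f64
}

/// Size of the chroma block that corresponds to a luma block of the
/// given size under the given subsampling shifts.
pub fn chroma_block_dims(luma_bw: u32, luma_bh: u32, ss_x: u32, ss_y: u32) -> (u32, u32) {
    (luma_bw >> ss_x, luma_bh >> ss_y)
}
```

For a 1920x1080 frame, 4:2:2 carries about 1.33 times the samples of 4:2:0 and 4:4:4 carries exactly twice as many. A 16x16 luma block is paired with 8x8 chroma blocks in 4:2:0, 8x16 in 4:2:2, and 16x16 in 4:4:4.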