How librav1e Handles HDR Metadata
This article explains how the Rust-based AV1 encoder, librav1e, processes High Dynamic Range (HDR) metadata. It covers how the encoder signals color properties, handles static HDR10 metadata like Mastering Display Color Volume (MDCV) and Content Light Level (CLL), and manages dynamic HDR metadata within the AV1 ecosystem.
To properly encode HDR video, an encoder must write specific metadata into the video bitstream. This metadata informs playback devices how to map the colors and brightness of the content to the capabilities of the viewer’s screen. librav1e handles this through specific color-description signaling and metadata injection.
1. Signaling Color Properties
HDR video requires precise color space configuration. librav1e allows users to define these properties using command-line interface (CLI) parameters or the library’s programming API. For standard HDR10 content, the following parameters are used:
- Color Primaries (`--primaries`): Typically set to `BT2020` to define the wide color gamut.
- Transfer Characteristics (`--transfer`): Set to `SMPTE2084` (for Perceptual Quantizer / PQ) or `HLG` (for Hybrid Log-Gamma, standardized as ARIB STD-B67).
- Matrix Coefficients (`--matrix`): Typically set to `BT2020NCL` (Non-Constant Luminance).
Specifying these flags ensures the output bitstream is flagged correctly so decoders interpret the color values properly.
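To make the signaling concrete, the sketch below maps the three properties to the numeric code points from ITU-T H.273 (CICP) that an AV1 sequence header carries. The enums and the `ColorDescription` struct are local, hypothetical stand-ins that only mirror the names used above; they are not the library's own types.

```rust
// Hypothetical local types for illustration; discriminants are the
// ITU-T H.273 (CICP) code points written into the AV1 sequence header.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum ColorPrimaries {
    Bt709 = 1,
    Bt2020 = 9,
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum TransferCharacteristics {
    Bt709 = 1,
    Smpte2084 = 16, // PQ
    Hlg = 18,       // ARIB STD-B67
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum MatrixCoefficients {
    Bt709 = 1,
    Bt2020Ncl = 9,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct ColorDescription {
    primaries: ColorPrimaries,
    transfer: TransferCharacteristics,
    matrix: MatrixCoefficients,
}

impl ColorDescription {
    /// Typical HDR10 signaling: BT.2020 primaries, PQ transfer, BT.2020 NCL matrix.
    fn hdr10() -> Self {
        Self {
            primaries: ColorPrimaries::Bt2020,
            transfer: TransferCharacteristics::Smpte2084,
            matrix: MatrixCoefficients::Bt2020Ncl,
        }
    }

    /// Typical HLG signaling: same gamut and matrix, HLG transfer.
    fn hlg() -> Self {
        Self { transfer: TransferCharacteristics::Hlg, ..Self::hdr10() }
    }

    /// The (primaries, transfer, matrix) triple as raw code points.
    fn cicp(&self) -> (u8, u8, u8) {
        (self.primaries as u8, self.transfer as u8, self.matrix as u8)
    }

    /// True when the transfer function is one of the two HDR curves.
    fn is_hdr(&self) -> bool {
        matches!(
            self.transfer,
            TransferCharacteristics::Smpte2084 | TransferCharacteristics::Hlg
        )
    }
}

fn main() {
    let desc = ColorDescription::hdr10();
    println!("HDR10 code points: {:?}, hdr = {}", desc.cicp(), desc.is_hdr());
    println!("HLG code points:   {:?}", ColorDescription::hlg().cicp());
}
```

A decoder reads these three bytes to decide how to interpret sample values, which is why a mismatch (for example, PQ content flagged as BT.709) produces washed-out or oversaturated playback.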
2. Static HDR Metadata (HDR10)
Static HDR metadata describes the brightness limits of the mastering display and the content itself. librav1e handles this via two main inputs:
- Mastering Display Color Volume (MDCV): Configured via the `--mastering-display` parameter. It accepts a string defining the chromaticity coordinates of the mastering display’s red, green, blue, and white points, along with the maximum and minimum luminance in nits.
  - CLI Format: `G(x,y)B(x,y)R(x,y)WP(x,y)L(max,min)`
- Content Light Level (CLL): Configured via the `--content-light` parameter. It specifies the Maximum Content Light Level (MaxCLL) and Maximum Frame-Average Light Level (MaxFALL) in nits.
  - CLI Format: `max_cll,max_fall` (e.g., `1000,400`)
In the Rust API, these correspond to the `MasteringDisplay` and `ContentLight` structs, which are passed to the encoder configuration before the encoding session begins.
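The two CLI strings described above can be parsed with a few lines of standard-library Rust. The sketch below is not the encoder's own parser: `MasteringDisplayFixed`, `take_pair`, and the other names are hypothetical, and it assumes the coordinates are given as decimal CIE xy values with luminance in nits. The fixed-point scaling follows the AV1 specification's MDCV metadata fields (0.16 for chromaticity, 24.8 for maximum luminance, 18.14 for minimum luminance).

```rust
/// Hypothetical container for MDCV values in AV1 fixed-point form.
#[derive(Debug, PartialEq)]
struct MasteringDisplayFixed {
    green: (u16, u16),       // 0.16 fixed point
    blue: (u16, u16),        // 0.16 fixed point
    red: (u16, u16),         // 0.16 fixed point
    white_point: (u16, u16), // 0.16 fixed point
    max_luminance: u32,      // 24.8 fixed point
    min_luminance: u32,      // 18.14 fixed point
}

/// Parses a leading `TAG(a,b)` and returns the two numbers plus the remaining input.
fn take_pair<'a>(s: &'a str, tag: &str) -> Option<((f64, f64), &'a str)> {
    let s = s.strip_prefix(tag)?.strip_prefix('(')?;
    let (inner, rest) = s.split_once(')')?;
    let (a, b) = inner.split_once(',')?;
    let a = a.trim().parse::<f64>().ok()?;
    let b = b.trim().parse::<f64>().ok()?;
    Some(((a, b), rest))
}

/// Converts a decimal xy pair to 0.16 fixed point.
fn chroma(v: (f64, f64)) -> (u16, u16) {
    let q = |c: f64| (c * 65536.0).round() as u16;
    (q(v.0), q(v.1))
}

/// Parses `G(x,y)B(x,y)R(x,y)WP(x,y)L(max,min)`; returns None on any mismatch.
fn parse_mastering_display(s: &str) -> Option<MasteringDisplayFixed> {
    let (g, s) = take_pair(s, "G")?;
    let (b, s) = take_pair(s, "B")?;
    let (r, s) = take_pair(s, "R")?;
    let (wp, s) = take_pair(s, "WP")?;
    let (l, s) = take_pair(s, "L")?;
    if !s.is_empty() {
        return None; // trailing garbage after the luminance pair
    }
    Some(MasteringDisplayFixed {
        green: chroma(g),
        blue: chroma(b),
        red: chroma(r),
        white_point: chroma(wp),
        max_luminance: (l.0 * 256.0).round() as u32,
        min_luminance: (l.1 * 16384.0).round() as u32,
    })
}

/// Parses `max_cll,max_fall` into whole nits.
fn parse_content_light(s: &str) -> Option<(u16, u16)> {
    let (cll, fall) = s.split_once(',')?;
    Some((cll.trim().parse().ok()?, fall.trim().parse().ok()?))
}

fn main() {
    // Illustrative values only: P3-D65 primaries, 1000-nit peak.
    let md = "G(0.265,0.690)B(0.150,0.060)R(0.680,0.320)WP(0.3127,0.3290)L(1000,0.01)";
    println!("{:?}", parse_mastering_display(md));
    println!("{:?}", parse_content_light("1000,400"));
}
```

Note that the CLI string lists the primaries in green, blue, red order, while the AV1 bitstream stores them as red, green, blue, so a real implementation has to reorder them when writing the metadata.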
3. Dynamic HDR Metadata
Unlike static metadata, which applies to the entire video, dynamic metadata changes scene-by-scene or frame-by-frame to optimize playback.
- HDR10+: librav1e does not natively parse or generate dynamic HDR10+ JSON files. Instead, the AV1 bitstream carries dynamic metadata in metadata Open Bitstream Units (OBUs) whose payload is an ITU-T T.35 message. Users supply this pre-formatted payload through external tools or developer APIs, and it is written into the AV1 bitstream as T.35 OBUs.
- Dolby Vision: Dolby Vision metadata is proprietary. For AV1 encoding, Dolby Vision is handled at the container level (such as MP4 or MKV) or through external muxers rather than being processed directly by the core librav1e encoding engine.
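For the HDR10+ case, the byte layout of a T.35 metadata OBU can be sketched directly from the AV1 specification. The function names below are hypothetical, and this is not the encoder's internal writer: it only shows how a pre-formatted payload is wrapped. The sketch assumes no OBU extension header and does not handle the `0xFF` country code, which requires an extra extension byte.

```rust
/// Encodes an unsigned integer as LEB128, the variable-length format AV1 uses for sizes.
fn leb128(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let mut byte = (value & 0x7F) as u8;
        value >>= 7;
        if value != 0 {
            byte |= 0x80; // more bytes follow
        }
        out.push(byte);
        if value == 0 {
            break;
        }
    }
    out
}

const OBU_METADATA: u8 = 5; // obu_type for metadata OBUs
const METADATA_TYPE_ITUT_T35: u64 = 4; // metadata_type for ITU-T T.35 payloads

/// Wraps a T.35 payload in a metadata OBU with a size field.
fn t35_metadata_obu(country_code: u8, payload: &[u8]) -> Vec<u8> {
    // OBU payload: metadata_type, country code, T.35 bytes, then trailing bits.
    let mut body = leb128(METADATA_TYPE_ITUT_T35);
    body.push(country_code);
    body.extend_from_slice(payload);
    body.push(0x80); // trailing one bit followed by zero padding

    // Header byte: forbidden(1) | obu_type(4) | extension(1) | has_size(1) | reserved(1).
    let header = (OBU_METADATA << 3) | (1 << 1);
    let mut obu = vec![header];
    obu.extend(leb128(body.len() as u64));
    obu.extend(body);
    obu
}

fn main() {
    // Prefix used by HDR10+ in AV1: provider code 0x003C, provider-oriented
    // code 0x0001, application identifier 4, application mode 1. The actual
    // per-frame tone-mapping data would follow; it is omitted in this sketch.
    let hdr10plus_prefix = [0x00, 0x3C, 0x00, 0x01, 0x04, 0x01];
    let obu = t35_metadata_obu(0xB5, &hdr10plus_prefix);
    let hex: Vec<String> = obu.iter().map(|b| format!("{:02X}", b)).collect();
    println!("{}", hex.join(" "));
}
```

Because the encoder treats the payload as opaque bytes, any tool that can produce a valid T.35 message can feed it, which is why HDR10+ JSON parsing lives outside the core encoder.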