Abstract:
A processor for video coding receives a full-frame-rate (FFR) HDR video signal and a corresponding FFR SDR video signal. An encoder generates a scalable bitstream that allows decoders to generate half-frame-rate (HFR) SDR, FFR SDR, HFR HDR, or FFR HDR signals. Given odd and even frames of the input FFR SDR signal, the scalable bitstream combines a base layer of coded even SDR frames with an enhancement layer of coded packed frames, where each packed frame includes a downscaled odd SDR frame, a downscaled even HDR residual frame, and a downscaled odd HDR residual frame. In an alternative implementation, the scalable bitstream combines four signal layers: a base layer of even SDR frames, an enhancement layer of odd SDR frames, a base layer of even HDR residual frames, and an enhancement layer of odd HDR residual frames. Corresponding decoder architectures are also presented.
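As an illustrative sketch only, the packed-frame arrangement described above might be organized as follows. The function names, the dictionary layout of a packed frame, and the `downscale` callable are hypothetical; no actual coding or scaling is performed, and frames are treated as opaque objects.

```python
def split_even_odd(frames):
    """Split a full-frame-rate sequence into even- and odd-indexed frames."""
    return frames[0::2], frames[1::2]


def build_layers(sdr_frames, hdr_residual_frames, downscale):
    """Return (base_layer, enhancement_layer) for the packed-frame arrangement.

    base_layer: the even SDR frames (these would be coded as the base layer).
    enhancement_layer: one packed frame per even/odd pair, holding the
    downscaled odd SDR frame and the downscaled even and odd HDR residuals.
    `downscale` is a caller-supplied placeholder for the real scaler.
    """
    sdr_even, sdr_odd = split_even_odd(sdr_frames)
    res_even, res_odd = split_even_odd(hdr_residual_frames)
    base_layer = list(sdr_even)
    enhancement_layer = [
        {
            "sdr_odd": downscale(s_odd),
            "res_even": downscale(r_even),
            "res_odd": downscale(r_odd),
        }
        for s_odd, r_even, r_odd in zip(sdr_odd, res_even, res_odd)
    ]
    return base_layer, enhancement_layer
```

In this sketch, the base layer alone carries only the even SDR frames, which corresponds to the half-frame-rate SDR output; the other three outputs would additionally draw on the packed frames.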
Abstract:
A sparse FIR filter can be used to process an image in order to rectify imaging artifacts. In a first example application, the sparse FIR filter is applied as a selective sparse FIR filter that examines a set of selected neighboring pixels of an original pixel in order to identify smooth areas of the image and to apply filtering only to those smooth areas. The parameters of selective filtering are selected based on the characteristics of an inter-layer predictor. In a second example application, the sparse FIR filter is applied as an edge-aware selective sparse FIR filter that examines neighboring pixels in addition to the set of selected pixels in order to identify edges and carry out selective filtering of smooth areas of the image. Examples for detecting and removing banding artifacts during the coding of high-dynamic range images are provided.
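A minimal one-dimensional sketch of the first example application is given below. It assumes equal tap weights, a fixed set of sparse tap offsets, and a simple absolute-difference smoothness test; these choices, the function name, and the default parameter values are illustrative assumptions rather than the parameters derived from the inter-layer predictor.

```python
def selective_sparse_fir(row, offsets=(-4, -2, 0, 2, 4), threshold=2):
    """Filter only pixels whose sparse neighbors are all close in value.

    row: list of pixel values along one image row.
    offsets: sparse tap positions relative to the current pixel (assumed).
    threshold: maximum |neighbor - center| for the area to count as smooth.
    """
    n = len(row)
    out = list(row)
    for i in range(n):
        # Gather the sparse taps, clamping indices at the row borders.
        taps = [row[min(max(i + d, 0), n - 1)] for d in offsets]
        # Smoothness test: every tap must be near the center pixel.
        if all(abs(t - row[i]) <= threshold for t in taps):
            out[i] = sum(taps) / len(taps)  # equal-weight sparse FIR
        # Otherwise the pixel is left untouched (likely an edge or texture).
    return out
```

With a row containing a one-codeword step (a banding-like contour) followed by a large step (a true edge), the pixels near the small step are averaged while the pixels next to the large step keep their original values.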
Abstract:
In an embodiment, a control map of false contour filtering is generated for a predicted image. The predicted image is predicted from a low dynamic range image mapped from a wide dynamic range image. Based at least in part on the control map of false contour filtering and the predicted image, one or more filter parameters for a sparse finite-impulse-response (FIR) filter are determined. The sparse FIR filter is applied to filter pixel values in a portion of the predicted image based at least in part on the control map of false contour filtering. The control map of false contour filtering is encoded into a part of a multi-layer video signal that includes the low dynamic range image.
Abstract:
Input VDR images are received. A candidate set of function parameter values for a mapping function is selected from multiple candidate sets. A set of image blocks of non-zero standard deviations in VDR code words in at least one input VDR image is constructed. Mapped code values are generated by applying the mapping function with the candidate set of function parameter values to VDR code words in the set of image blocks in the at least one input VDR image. Based on the mapped code values, a subset of image blocks of standard deviations below a threshold value in mapped code words is determined as a subset of the set of image blocks. Based at least in part on the subset of image blocks, it is determined whether the candidate set of function parameter values is optimal for the mapping function to map the at least one input VDR image.
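The block-selection steps above can be sketched as follows, assuming each image block is a flat list of VDR code words and the mapping function is a plain Python callable. The function name is hypothetical, and the final optimality decision is left out because the abstract does not specify how the subset is evaluated.

```python
from statistics import pstdev


def low_variance_mapped_blocks(blocks, mapping, threshold):
    """Return blocks that have texture in VDR but nearly none after mapping.

    blocks: list of blocks, each a list of VDR code words.
    mapping: candidate mapping function applied to each code word.
    threshold: standard-deviation threshold in the mapped domain.
    """
    # Step 1: keep only blocks with non-zero standard deviation in VDR.
    textured = [b for b in blocks if pstdev(b) > 0]
    # Step 2: map each remaining block and keep those whose mapped
    # standard deviation falls below the threshold.
    return [
        b for b in textured
        if pstdev([mapping(v) for v in b]) < threshold
    ]
```

A block that is textured in the VDR domain but collapses to a near-constant value after mapping is the kind of block this subset is meant to expose for a given candidate parameter set.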
Abstract:
For each content-mapped frame of a scene, it is determined whether the content-mapped frame is susceptible to object fragmentation with respect to texture in a homogeneous region, based on statistical values derived from the content-mapped image and a source image mapped into the content-mapped image. The homogeneous region is a region of consistent texture in the source image. Based on a count of content-mapped frames susceptible to object fragmentation in a homogeneous region, it is determined whether the scene is susceptible to object fragmentation in a homogeneous region. If so, the upper limit on mapped codewords is adjusted for the prediction function that predicts codewords of a predicted image from the mapped codewords in the content-mapped image. Mapped codewords above the upper limit are clipped to the upper limit.
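The scene-level decision and the clipping step might look like the sketch below. The per-frame susceptibility flags are taken as given, and the rule that a scene is susceptible once a minimum count of frames is flagged is an assumption; the abstract states only that the decision is based on the count. Both function names are hypothetical.

```python
def scene_is_susceptible(frame_flags, min_count):
    """Decide at scene level from per-frame susceptibility flags.

    frame_flags: one boolean per content-mapped frame in the scene.
    min_count: assumed minimum number of flagged frames.
    """
    return sum(1 for flag in frame_flags if flag) >= min_count


def clip_mapped_codewords(codewords, upper_limit):
    """Clip every mapped codeword above the upper limit to that limit."""
    return [min(c, upper_limit) for c in codewords]
```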
Abstract:
Pixel data of a video sequence with enhanced dynamic range (EDR) are predicted based on pixel data of a corresponding video sequence with standard dynamic range (SDR) and an inter-layer predictor. Under a highlights clipping constraint, conventional SDR-to-EDR prediction is adjusted as follows: a) given a highlights threshold, the SDR-to-EDR predictor is adjusted to output a fixed output value for all input SDR pixel values larger than the highlights threshold, and b) given a dark-regions threshold, the residual values between the input EDR signal and its predicted value are set to zero for all input SDR pixel values lower than the dark-regions threshold. Example processes to determine the highlights and dark-regions thresholds and whether highlights clipping is occurring are provided.
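Adjustments a) and b) can be sketched as below, assuming co-located SDR and EDR pixel values in flat lists and a caller-supplied `predict` function. The choice of the fixed output value (here, the predictor output at the highlights threshold) is an assumption, as the abstract says only that the value is fixed; the function and parameter names are hypothetical.

```python
def adjust_prediction(sdr, edr, predict, highlights_thr, darks_thr):
    """Return (predicted, residual) lists under a highlights clipping constraint.

    sdr, edr: co-located pixel values of the SDR and EDR signals.
    predict: conventional SDR-to-EDR predictor (any callable).
    highlights_thr: SDR values above this get a fixed predicted value.
    darks_thr: SDR values below this get a zero residual.
    """
    # Assumption: the fixed output is the predictor value at the threshold.
    fixed_out = predict(highlights_thr)
    predicted, residual = [], []
    for s, e in zip(sdr, edr):
        p = fixed_out if s > highlights_thr else predict(s)  # adjustment a)
        r = 0 if s < darks_thr else e - p                    # adjustment b)
        predicted.append(p)
        residual.append(r)
    return predicted, residual
```

In this arrangement the residual carries all EDR detail above the highlights threshold, while dark regions contribute nothing to the residual signal.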
Abstract:
An encoder receives an input enhanced dynamic range (EDR) image and a corresponding lower dynamic range (LDR) image to be coded at a given target rate. Before coding, a pre-dithering process is applied to the input LDR image to generate a dithered LDR image at a second bit depth, lower than its original bit depth. The pre-dithering process includes: generating uniformly distributed noise, applying a spatial filter to the noise to generate low-pass or high-pass filtered noise, applying a temporal high-pass or low-pass filter to the spatially filtered noise to generate output noise, adding the output noise to the input LDR image to generate a noise-enhanced LDR image, and quantizing the noise-enhanced image to generate the dithered LDR image. The characteristics of the dithering filters are selected based on both the target bit rate and the luminance characteristics of the pixels in the input LDR image.
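The five pre-dithering steps can be sketched as follows for frames represented as single rows of pixel values. The particular filters (a 3-tap spatial high-pass and a first-order temporal difference), the `strength` parameter, and the bit depths are assumptions chosen for illustration; the selection of filter characteristics from the target bit rate and luminance is not modeled.

```python
import random


def spatial_high_pass(noise_row):
    """Assumed spatial filter: subtract a 3-tap moving average."""
    n = len(noise_row)
    out = []
    for i in range(n):
        left = noise_row[max(i - 1, 0)]
        right = noise_row[min(i + 1, n - 1)]
        out.append(noise_row[i] - (left + noise_row[i] + right) / 3.0)
    return out


def pre_dither(frames, in_bits=10, out_bits=8, strength=1.0, seed=0):
    """Dither a sequence of frames (lists of pixels) down to out_bits."""
    rng = random.Random(seed)
    step = 1 << (in_bits - out_bits)  # input codewords per output codeword
    max_out = (1 << out_bits) - 1
    prev_spatial = None
    dithered = []
    for frame in frames:
        # 1) uniformly distributed noise
        noise = [rng.uniform(-0.5, 0.5) for _ in frame]
        # 2) spatial filtering of the noise
        spatial = spatial_high_pass(noise)
        # 3) temporal filtering: difference against the previous frame's noise
        if prev_spatial is None:
            temporal = spatial
        else:
            temporal = [(a - b) / 2.0 for a, b in zip(spatial, prev_spatial)]
        prev_spatial = spatial
        # 4) add the output noise to the input LDR frame
        noisy = [p + strength * step * t for p, t in zip(frame, temporal)]
        # 5) quantize to the lower bit depth, clamped to the valid range
        dithered.append(
            [min(max(int(round(v / step)), 0), max_out) for v in noisy]
        )
    return dithered
```

Setting `strength=0.0` reduces the sketch to plain requantization, which makes it easy to compare dithered and undithered output on the same input.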
Abstract:
An encoder receives one or more input pictures of enhanced dynamic range (EDR) to be encoded in a coded bit stream comprising a base layer and one or more enhancement layers. The encoder comprises a base layer quantizer (BLQ) and an enhancement layer quantizer (ELQ) and selects parameters of the BLQ and the ELQ by a joint BLQ-ELQ adaptation method. Given a plurality of candidate sets of parameters for the BLQ, the method computes, for each candidate set, a joint BLQ-ELQ distortion value based on a BLQ distortion function, an ELQ distortion function, and at least in part on the number of input pixels to be quantized by the ELQ. The encoder selects as the output BLQ parameter set the candidate set for which the computed joint BLQ-ELQ distortion value is the smallest. Example ELQ, BLQ, and joint BLQ-ELQ distortion functions are provided.
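The candidate search can be sketched as a minimum over a joint cost. The way the three quantities are combined here (BLQ distortion plus the ELQ pixel count times the ELQ distortion) is an assumption made for illustration; the abstract states only that the joint value depends on all three. All names are hypothetical, and the distortion functions are supplied by the caller.

```python
def select_blq_parameters(candidates, blq_distortion, elq_distortion,
                          elq_pixel_count):
    """Return the candidate BLQ parameter set with the smallest joint cost.

    candidates: iterable of candidate BLQ parameter sets.
    blq_distortion, elq_distortion: callables mapping a candidate to a cost.
    elq_pixel_count: callable giving the number of input pixels that the
        ELQ would quantize for that candidate.
    """
    def joint_distortion(candidate):
        # Assumed combination: ELQ distortion weighted by its pixel count.
        return (blq_distortion(candidate)
                + elq_pixel_count(candidate) * elq_distortion(candidate))

    return min(candidates, key=joint_distortion)
```

For instance, if each candidate is a hypothetical `(low, high)` range handled by the BLQ, a narrower range lowers the BLQ cost but pushes more pixels to the ELQ, and the joint cost arbitrates between the two.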
Abstract:
In some embodiments, an encoder device is disclosed to generate single-channel standard dynamic range/high dynamic range content predictors. The device receives standard dynamic range image content and a representation of high dynamic range image content. The device determines a first mapping function to map the standard dynamic range image content to the high dynamic range image content. The device generates single-channel prediction metadata based on the first mapping function, such that a decoder device can subsequently render predicted high dynamic range image content by applying the metadata to transform the standard dynamic range image content to the predicted high dynamic range image content.
Abstract:
Given a standard dynamic range (SDR) video input, techniques for generating and compressing composer metadata describing inverse luma and chroma reshaping functions are described. Given the SDR input, the composer metadata allow a decoder to generate a corresponding output in high dynamic range. Three techniques are proposed: a static, sequence-based architecture; a two-stage, scene-based, distributed solution with a centralized post-processing method; and a single-stage distributed solution using overlapped segments. Techniques to reduce the amount of transmitted composer metadata are also described.