Abstract:
Methods for defining decoder capability for decoding multi-layer bitstreams containing video information, in which the decoder is implemented based on multiple single-layer decoder cores, are disclosed. In one aspect, the method may include identifying at least one allocation of layers of the bitstream into at least one set of layers. The method may further include detecting whether each set of layers is capable of being exclusively assigned to one of the decoder cores for the decoding of the bitstream. The method may also include determining whether the decoder is capable of decoding the bitstream based at least in part on detecting whether each set of layers is capable of being exclusively assigned to one of the decoder cores.
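The capability check can be pictured as an assignment problem: each set of layers must fit on one single-layer core. Below is a minimal sketch, assuming a per-core throughput limit expressed in luma samples per second; the capacity metric, the greedy smallest-feasible-core assignment, and all names are illustrative assumptions, not the claimed method itself.

```python
# Sketch: can each set of layers be exclusively assigned to one core?
def core_can_decode(core_max_luma_sps, layer_set):
    """A set of layers fits a single-layer core if the aggregate
    throughput it requires does not exceed the core's limit."""
    return sum(layer["luma_sps"] for layer in layer_set) <= core_max_luma_sps

def decoder_can_decode(core_limits, layer_sets):
    """core_limits: one throughput limit per decoder core.
    layer_sets: one allocation of the bitstream's layers into sets."""
    if len(layer_sets) > len(core_limits):
        return False  # not enough cores for an exclusive assignment
    remaining = list(core_limits)
    # Assign the most demanding set first, to the smallest core that fits.
    for layer_set in sorted(layer_sets,
                            key=lambda s: -sum(l["luma_sps"] for l in s)):
        candidates = [c for c in remaining if core_can_decode(c, layer_set)]
        if not candidates:
            return False
        remaining.remove(min(candidates))
    return True

# Example: two 1080p30-class cores, layers allocated into two sets.
cores = [62_208_000, 62_208_000]
sets = [[{"luma_sps": 62_208_000}], [{"luma_sps": 31_104_000}]]
print(decoder_can_decode(cores, sets))  # True
```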
Abstract:
In one example, the disclosure is directed to techniques that include receiving a bitstream comprising at least a syntax element, a first network abstraction layer unit type, and a coded access unit comprising a plurality of pictures. The techniques further include determining a value of the syntax element, which indicates whether the access unit was coded using cross-layer alignment. The techniques further include determining the first network abstraction layer unit type for a picture in the access unit and determining whether the first network abstraction layer unit type equals a value in a range of type values. If it does, the techniques further include setting the network abstraction layer unit type for all other pictures in the coded access unit equal to the first network abstraction layer unit type.
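A minimal sketch of the alignment rule follows. The range 16..23 mirrors HEVC's IRAP NAL unit types (BLA_W_LP through RSV_IRAP_VCL23) but is an assumption here, and `cross_layer_aligned` merely stands in for the signalled syntax element.

```python
# Sketch: propagate the first picture's NAL unit type across the access
# unit when cross-layer alignment is signalled and the type is IRAP-like.
IRAP_RANGE = range(16, 24)  # assumed range, modeled on HEVC IRAP types

def align_nal_unit_types(access_unit, cross_layer_aligned):
    """access_unit: list of picture dicts with a 'nal_unit_type' key."""
    if not cross_layer_aligned:
        return access_unit
    first_type = access_unit[0]["nal_unit_type"]
    if first_type in IRAP_RANGE:
        # Every other picture in the access unit inherits the type.
        for pic in access_unit[1:]:
            pic["nal_unit_type"] = first_type
    return access_unit

au = [{"nal_unit_type": 19}, {"nal_unit_type": 1}, {"nal_unit_type": 1}]
align_nal_unit_types(au, cross_layer_aligned=True)
print([p["nal_unit_type"] for p in au])  # [19, 19, 19]
```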
Abstract:
In general, techniques are described for performing residual prediction in video coding. As one example, a device configured to code scalable or multi-view video data may comprise one or more processors configured to perform the techniques. The processors may determine a difference picture, for a current picture, based on a first reference picture in the same layer or view as the current picture and a decoded picture in a different layer or view from the current picture. The decoded picture may be in the same access unit as the first reference picture. The processors may perform bi-prediction based on the difference picture to code at least a portion of the current picture.
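A minimal numeric sketch of the difference-picture idea, using plain lists as "pictures". The 128 offset (keeping the difference in a typical 8-bit sample range) and the plain bi-prediction average are illustrative assumptions.

```python
# Sketch: build a difference picture, then use it in bi-prediction.
def difference_picture(same_layer_ref, other_layer_decoded, offset=128):
    """Per-sample difference between a same-layer reference picture and
    the co-timed decoded picture from another layer/view, biased by an
    assumed offset so the result stays in sample range."""
    return [a - b + offset for a, b in zip(same_layer_ref, other_layer_decoded)]

def bi_predict(pred0, pred1):
    """Ordinary bi-prediction average; pred1 may be (a block from) the
    difference picture."""
    return [(a + b + 1) >> 1 for a, b in zip(pred0, pred1)]

ref = [100, 102, 98]
ilp = [90, 101, 97]                    # decoded picture from another layer/view
diff = difference_picture(ref, ilp)    # [138, 129, 129]
print(bi_predict(ref, diff))           # [119, 116, 114]
```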
Abstract:
In one example, a device for coding video data includes a video coder configured to code a value for a syntax element representative of whether any two reference layer samples, collocated with two respective enhancement layer picture samples within a common enhancement layer tile, must be within a common reference layer tile, and code the enhancement layer picture samples based at least in part on the value of the syntax element.
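The constraint the syntax element expresses can be checked directly. A minimal sketch follows, assuming a simple coordinate scaling for collocation and a `rl_tile_of(x, y)` lookup; both are placeholders, not the coded syntax.

```python
# Sketch: do all reference-layer samples collocated with one enhancement-
# layer tile fall inside a single reference-layer tile?
def constraint_holds(el_tile_samples, rl_tile_of, scale_x, scale_y):
    """el_tile_samples: (x, y) positions belonging to one EL tile.
    rl_tile_of(x, y): index of the RL tile containing RL sample (x, y)."""
    rl_tiles = {rl_tile_of(int(x / scale_x), int(y / scale_y))
                for (x, y) in el_tile_samples}
    return len(rl_tiles) <= 1

rl_tile_of = lambda x, y: 0 if x < 960 else 1      # two side-by-side RL tiles
el_tile = [(x, 0) for x in range(0, 1920)]         # one EL tile row, 2x scale
print(constraint_holds(el_tile, rl_tile_of, 2.0, 2.0))   # True: RL tile 0 only
wide_tile = [(x, 0) for x in range(0, 3840)]
print(constraint_holds(wide_tile, rl_tile_of, 2.0, 2.0)) # False: spans both
```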
Abstract:
An apparatus configured to code video information includes a memory and a processor in communication with the memory. The memory is configured to store video information associated with a reference layer and an enhancement layer, the reference layer comprising a reference layer (RL) picture having a first slice and a second slice, and the enhancement layer comprising an enhancement layer (EL) picture corresponding to the RL picture. The processor is configured to generate an inter-layer reference picture (ILRP) by upsampling the RL picture, the ILRP having a single slice associated therewith, set slice information of the single slice of the ILRP equal to slice information of the first slice, and use the ILRP to code at least a portion of the EL picture. The processor may encode or decode the video information.
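A minimal sketch of generating the inter-layer reference picture, assuming a nearest-neighbour 2x upsampler and dictionary-style picture and slice records; real upsampling filters are multi-tap, so this is illustrative only.

```python
# Sketch: upsample the RL picture into an ILRP with exactly one slice
# whose slice information is copied from the RL picture's first slice.
def upsample_2x(picture):
    """Nearest-neighbour 2x upsampling of a 2-D list of samples."""
    out = []
    for row in picture:
        wide = [s for s in row for _ in (0, 1)]  # duplicate horizontally
        out.append(wide)
        out.append(list(wide))                   # duplicate vertically
    return out

def make_ilrp(rl_picture, rl_first_slice_info):
    return {
        "samples": upsample_2x(rl_picture["samples"]),
        "slices": [dict(rl_first_slice_info)],   # a single slice
    }

rl = {"samples": [[10, 20], [30, 40]]}
ilrp = make_ilrp(rl, {"slice_type": "P", "qp": 30})
print(ilrp["samples"][0], ilrp["slices"])  # [10, 10, 20, 20] [{'slice_type': 'P', 'qp': 30}]
```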
Abstract:
An example method of decoding video data includes obtaining, from a video bitstream, a representation of a difference between a motion vector (MV) predictor and a MV that identifies a predictor block for a current block of video data in a current picture; obtaining, from the video bitstream, a syntax element indicating whether adaptive motion vector resolution (AMVR) is used for the current block; determining, based on the representation of the difference between the MV predictor and the MV that identifies the predictor block, a value of the MV; storing the value of the MV at fractional-pixel resolution regardless of whether AMVR is used for the current block and regardless of whether the predictor block is included in the current picture; determining, based on the value of the stored MV, pixel values of the predictor block; and reconstructing the current block based on the pixel values of the predictor block.
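A minimal sketch of the reconstruction-and-storage rule, assuming quarter-pel storage units and a left-shift by 2 to rescale an integer-pel difference; those specifics are modeled on common practice rather than quoted from the source.

```python
# Sketch: reconstruct MV = MVP + MVD and always store it at fractional
# (here, quarter-pel) resolution.
def reconstruct_mv(mvp_qpel, mvd, amvr_integer):
    """mvp_qpel: predictor, already in quarter-pel units.
    mvd: signalled difference (integer-pel units when AMVR selects
    integer resolution, quarter-pel units otherwise)."""
    if amvr_integer:
        mvd = (mvd[0] << 2, mvd[1] << 2)  # integer-pel -> quarter-pel
    # Stored at quarter-pel resolution regardless of AMVR and regardless
    # of whether the predictor block lies in the current picture
    # (e.g., intra block copy) or in another picture.
    return (mvp_qpel[0] + mvd[0], mvp_qpel[1] + mvd[1])

print(reconstruct_mv((5, -3), (2, 1), amvr_integer=True))   # (13, 1)
print(reconstruct_mv((5, -3), (2, 1), amvr_integer=False))  # (7, -2)
```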
Abstract:
In one example, a device for coding video data includes a video coder configured to code data representative of whether a tile of an enhancement layer picture can be predicted using inter-layer prediction, and predict data of the tile using inter-layer prediction only when the data indicates that the tile can be predicted using inter-layer prediction.
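The gating logic is simple enough to state as a few lines; the sketch below assumes hypothetical names for the per-tile flag and the two prediction routines.

```python
# Sketch: use inter-layer prediction for a tile only when the coded data
# says the tile may be predicted that way.
def predict_tile(tile, inter_layer_allowed, ilp_predict, in_layer_predict):
    if inter_layer_allowed[tile]:
        return ilp_predict(tile)
    return in_layer_predict(tile)

allowed = {0: True, 1: False}
ilp = lambda t: f"inter-layer prediction of tile {t}"
in_layer = lambda t: f"in-layer prediction of tile {t}"
print(predict_tile(0, allowed, ilp, in_layer))  # inter-layer prediction of tile 0
print(predict_tile(1, allowed, ilp, in_layer))  # in-layer prediction of tile 1
```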
Abstract:
An apparatus for coding video information according to certain aspects includes a memory unit and a processor in communication with the memory unit. The memory unit stores video information associated with a reference layer and a corresponding enhancement layer. The processor determines a value of a video unit at a position within the enhancement layer based at least in part on an intra prediction value weighted by a first weighting factor, wherein the intra prediction value is determined based on at least one additional video unit in the enhancement layer, and a value of a co-located video unit in the reference layer weighted by a second weighting factor, wherein the co-located video unit is located at a position in the reference layer corresponding to the position of the video unit in the enhancement layer. In some embodiments, at least one of the first and second weighting factors is between 0 and 1.
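The combination reduces to value = w1 * intra + w2 * rl_colocated. A minimal sketch follows; making the weights complementary (w2 = 1 - w1) is an assumption, since the abstract only requires a factor between 0 and 1.

```python
# Sketch: weighted blend of an intra prediction with the co-located
# reference-layer sample value.
def weighted_prediction(intra_value, rl_colocated_value, w1):
    """value = w1 * intra prediction + w2 * co-located RL value,
    with w2 = 1 - w1 assumed here."""
    w2 = 1.0 - w1
    return w1 * intra_value + w2 * rl_colocated_value

print(weighted_prediction(120, 100, w1=0.75))  # 115.0
```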
Abstract:
A device for encoding or decoding video data may clip first residual data based on a bit depth of the first residual data. The device may generate second residual data at least in part by applying an inverse Adaptive Color Transform (IACT) to the first residual data. Furthermore, the device may reconstruct, based on the second residual data, a coding block of a coding unit (CU) of the video data.
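A minimal sketch of the clip-then-inverse-transform flow, using the reversible YCgCo-R lifting transform (as in HEVC screen content coding) as the inverse ACT; both that choice and the clipping range, one extra bit of headroom beyond the sample bit depth, are assumptions for illustration.

```python
# Sketch: clip the first residual data based on bit depth, then apply an
# inverse color transform to obtain the second residual data.
def clip_residual(value, bit_depth):
    """Assumed range: one bit of headroom beyond the sample bit depth."""
    lo, hi = -(1 << (bit_depth + 1)), (1 << (bit_depth + 1)) - 1
    return max(lo, min(hi, value))

def inverse_act(y, cg, co, bit_depth):
    """Reversible YCgCo-R inverse, recovering (R, G, B)-domain residuals.
    Python's >> floors for negatives, matching arithmetic shift."""
    y, cg, co = (clip_residual(v, bit_depth) for v in (y, cg, co))
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b

print(inverse_act(2, 4, -2, bit_depth=8))  # (-1, 4, 1)
```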