Abstract:
In an example, a method of decoding video data includes decoding data that indicates a picture order count (POC) reset for a POC value of a first picture of a first layer of multi-layer video data, wherein the first picture is included in an access unit. The example method also includes, based on the data that indicates the POC reset for the POC value of the first picture and prior to decoding the first picture, decrementing POC values of all pictures stored to a decoded picture buffer (DPB) that precede the first picture in coding order including at least one picture of a second layer of the multi-layer video data.
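The decrement step described above can be sketched as follows. This is a minimal model, not an implementation of the patented method: `DpbPicture` and `apply_poc_reset` are hypothetical names, and the decoded picture buffer is modeled as a flat list of pictures spanning all layers.

```python
from dataclasses import dataclass

@dataclass
class DpbPicture:
    layer_id: int  # identifier of the layer the picture belongs to
    poc: int       # picture order count value

def apply_poc_reset(dpb, reset_poc_value):
    """On a POC reset signaled for a picture whose POC value is
    `reset_poc_value`, decrement the POC of every picture already in the
    DPB (in any layer), before the resetting picture is decoded."""
    for pic in dpb:
        pic.poc -= reset_poc_value
```

After the decrement, pictures of all layers in the DPB are expressed relative to the resetting picture's POC.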
Abstract:
A computing device obtains a Network Abstraction Layer (NAL) unit header of a NAL unit of multi-layer video data. The NAL unit header comprises a layer identifier syntax element having a value that specifies an identifier of a layer of the NAL unit. The layer identifier syntax element comprises a plurality of bits that represent the value within a defined range of values. For the bitstream to conform to a video coding standard, the value of the layer identifier syntax element must be less than the maximum value of the range of values.
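As an illustration, a two-byte HEVC-style NAL unit header can be unpacked with plain bit operations. The field widths follow the HEVC NAL unit header layout (1-bit forbidden zero bit, 6-bit NAL unit type, 6-bit layer identifier, 3-bit temporal identifier), and the check mirrors the abstract's requirement that the layer identifier stay below the maximum of its range; the function name is a placeholder.

```python
def parse_nal_unit_header(b0: int, b1: int):
    """Unpack the two bytes of an HEVC-style NAL unit header."""
    forbidden_zero_bit = b0 >> 7
    nal_unit_type = (b0 >> 1) & 0x3F               # 6 bits
    nuh_layer_id = ((b0 & 0x01) << 5) | (b1 >> 3)  # 6 bits, range 0..63
    nuh_temporal_id_plus1 = b1 & 0x07              # 3 bits
    # Conformance requirement from the abstract: the layer identifier
    # must be less than the maximum value of its range (here, 63).
    if nuh_layer_id >= 63:
        raise ValueError("non-conforming nuh_layer_id: %d" % nuh_layer_id)
    return nal_unit_type, nuh_layer_id, nuh_temporal_id_plus1
```

For example, the header bytes 0x40 0x01 yield NAL unit type 32, layer identifier 0, and temporal identifier plus one equal to 1.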
Abstract:
A video processing device includes a memory storing video data and one or more processors configured to: receive a first network abstraction layer (NAL) unit comprising a first picture of an access unit; in response to determining the first NAL unit comprises an intra random access point (IRAP) picture and in response to a NAL unit type for the first NAL unit indicating the presence of an instantaneous decoding refresh (IDR) picture without any associated leading pictures for a second NAL unit of the access unit comprising another IRAP picture, determine a NAL unit type for the second NAL unit to be a NAL unit type indicating the presence of an IDR picture without any associated leading pictures; and, process the first NAL unit and the second NAL unit based on the NAL unit type for the second NAL unit.
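A minimal sketch of the cross-layer inference described above, assuming the HEVC nal_unit_type constants IDR_W_RADL (19) and IDR_N_LP (20); `infer_second_irap_type` is a hypothetical helper, and the sketch only covers the single case the abstract describes.

```python
# HEVC nal_unit_type values for IDR pictures
IDR_W_RADL = 19  # IDR that may have associated RADL leading pictures
IDR_N_LP = 20    # IDR with no associated leading pictures

def infer_second_irap_type(first_nal_type: int):
    """If the first IRAP picture of the access unit is an IDR with no
    associated leading pictures, infer the same NAL unit type for the
    second IRAP NAL unit of that access unit; otherwise make no
    inference in this sketch."""
    if first_nal_type == IDR_N_LP:
        return IDR_N_LP
    return None
```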
Abstract:
In an example, a method of coding video data includes determining a location of a reference sample associated with a reference picture of video data based on one or more scaled offset values, where the reference picture is included in a first layer of a multi-layer bitstream and the one or more scaled offset values indicate a difference in scale between the first layer and a second, different layer. The method also includes determining a location of a collocated reference block of video data in the first layer based on the location of the reference sample, and coding a current block of video data in the second layer relative to the collocated reference block.
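The reference-sample derivation can be illustrated with 16.16 fixed-point arithmetic. This is a sketch in the spirit of the abstract, not the exact derivation from any standard: the function name and the `scaled_offset_x`/`scaled_offset_y` parameters are hypothetical stand-ins for the abstract's scaled offset values.

```python
def reference_sample_location(x_el, y_el, ref_w, ref_h, cur_w, cur_h,
                              scaled_offset_x=0, scaled_offset_y=0):
    """Map an enhancement-layer sample position (x_el, y_el) to a
    reference-layer position using fixed-point scale factors derived
    from the two layers' dimensions, adjusted by scaled offsets."""
    # 16.16 fixed-point scale factors between the two layers
    scale_x = ((ref_w << 16) + (cur_w >> 1)) // cur_w
    scale_y = ((ref_h << 16) + (cur_h >> 1)) // cur_h
    # Apply the offsets, scale, and round back to integer sample units
    x_ref = ((x_el - scaled_offset_x) * scale_x + (1 << 15)) >> 16
    y_ref = ((y_el - scaled_offset_y) * scale_y + (1 << 15)) >> 16
    return x_ref, y_ref
```

With a 960x540 reference layer under a 1920x1080 enhancement layer and zero offsets, the scale factor is 0.5 and position (100, 100) maps to (50, 50); the collocated reference block would then be located from that sample position.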
Abstract:
In an example, a method of coding video data includes coding data of a video parameter set (VPS) of a multi-layer bitstream, including at least one of data that indicates whether any layers of the multi-layer bitstream have an inter-layer prediction restriction or data that indicates whether tile boundaries are aligned between at least two of the layers of the multi-layer bitstream, and coding the multi-layer bitstream in accordance with the data of the VPS.
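The two VPS indications can be sketched as a two-bit pack/unpack pair standing in for coding them in the VPS; the flag names are assumptions chosen for readability, not syntax element names taken from the abstract.

```python
def write_vps_ilp_flags(ilp_restricted: bool, tiles_aligned: bool) -> int:
    """Pack the two indications into two bits: whether any layer has an
    inter-layer prediction restriction, and whether tile boundaries are
    aligned between at least two layers."""
    return (int(ilp_restricted) << 1) | int(tiles_aligned)

def read_vps_ilp_flags(bits: int):
    """Recover the two indications from their two-bit packed form."""
    return bool((bits >> 1) & 1), bool(bits & 1)
```

The multi-layer bitstream would then be coded in accordance with the recovered indications.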
Abstract:
In one example, a video coder is configured to: code a value for a syntax element indicating whether at least a portion of a picture order count (POC) value of a picture is to be reset to a value of zero; when the value for the syntax element indicates that the portion of the POC value is to be reset to the value of zero, reset at least the portion of the POC value such that the portion of the POC value is equal to zero; and code video data using the reset POC value. Coding video data using the reset POC value may include inter-predicting a block of a subsequent picture relative to the picture, where the block may include a motion parameter that identifies the picture using the reset POC value. The block may be coded using temporal inter-prediction or inter-layer prediction.
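Resetting "at least a portion" of the POC value can be sketched as zeroing either the least-significant bits or the whole value; `reset_poc_portion` and its parameters are hypothetical names for illustration.

```python
def reset_poc_portion(poc: int, num_lsb_bits: int,
                      reset_full: bool = False) -> int:
    """Reset a portion of a POC value to zero: either the whole value,
    or only its `num_lsb_bits` least-significant bits (keeping the
    most-significant part unchanged)."""
    if reset_full:
        return 0
    return poc & ~((1 << num_lsb_bits) - 1)  # zero the LSB portion
```

For instance, with 4 LSBs, POC 45 (0b101101) becomes 32 (0b100000): the LSB portion now equals zero while the MSB portion is retained.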
Abstract:
A device for processing video data includes a memory; a receiver configured to receive real-time transport protocol (RTP) packets; and one or more processors configured to: receive a first RTP packet comprising a first network abstraction layer (NAL) unit, and in response to a transmission mode for the first RTP packet being a single session transmission mode and a first parameter being equal to a first value, determine a decoding order number for the first NAL unit based on a transmission order of the first NAL unit.
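The derivation can be sketched as follows: in the single-session case with the parameter equal to zero, decoding order numbers simply follow transmission order. The function name is a placeholder, and other transmission modes (which would carry explicit decoding order information) are deliberately left out of this sketch.

```python
def decoding_order_numbers(nal_units_in_tx_order, single_session: bool,
                           param: int):
    """When the transmission mode is single-session and the signaled
    parameter equals 0, derive each NAL unit's decoding order number
    from its transmission (arrival) order."""
    if single_session and param == 0:
        return list(range(len(nal_units_in_tx_order)))
    raise NotImplementedError(
        "other modes carry explicit decoding order information")
```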
Abstract:
In one example, a device for coding video data includes a video coder configured to code data indicating whether tile boundaries of different layers of video data are aligned and whether inter-layer prediction is allowed along or across tile boundaries of enhancement layer blocks, code an enhancement layer block in an enhancement layer tile of the video data without using inter-layer prediction from a collocated base layer block for which inter-layer filtering or reference layer filtering across tile boundaries in a reference layer picture in an access unit including both the enhancement layer tile and the base layer block is enabled, and code the collocated base layer block.
Abstract:
In one example, a device for coding video data includes a video coder configured to code data representative of whether a tile of an enhancement layer picture can be predicted using inter-layer prediction, and predict data of the tile using inter-layer prediction only when the data indicates that the tile can be predicted using inter-layer prediction.
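The gating rule above can be sketched as a simple conditional; the function names and the string tags are hypothetical, and the returned pair only labels which prediction source a real coder would use.

```python
def can_use_inter_layer_prediction(tile_ilp_allowed: bool) -> bool:
    """Per the abstract: inter-layer prediction may be used for the tile
    only when the coded data indicates the tile can be so predicted."""
    return tile_ilp_allowed

def predict_tile(tile, base_tile, tile_ilp_allowed: bool):
    """Choose the prediction source for an enhancement-layer tile."""
    if can_use_inter_layer_prediction(tile_ilp_allowed):
        return ("inter_layer", base_tile)
    return ("intra_or_temporal", tile)
```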
Abstract:
Techniques for encapsulating video streams containing multiple coded views in a media file are described herein. In one example, a method includes parsing a track of multiview video data, wherein the track includes one or more views, including only one of a texture view of a particular view and a depth view of the particular view. The method further includes parsing a track reference to determine a dependency of the track to a referenced track indicated in the track reference. Track reference types include ‘deps’, which indicates that the track includes the depth view of the particular view and the referenced track includes the texture view of the particular view; ‘tref’, which indicates that the track depends on the texture view which is stored in the referenced track; and ‘dref’, which indicates that the track depends on the depth view which is stored in the referenced track.
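The three track reference types can be summarized in a lookup table; the dictionary and helper below are illustrative names, with the semantics taken directly from the abstract.

```python
# Semantics of the three track reference types named in the abstract
TRACK_REF_TYPES = {
    "deps": "track contains the depth view; "
            "referenced track contains the texture view",
    "tref": "track depends on the texture view "
            "stored in the referenced track",
    "dref": "track depends on the depth view "
            "stored in the referenced track",
}

def describe_track_reference(ref_type: str) -> str:
    """Resolve a four-character track reference type to the dependency
    it signals between a track and its referenced track."""
    if ref_type not in TRACK_REF_TYPES:
        raise ValueError("unknown track reference type: " + ref_type)
    return TRACK_REF_TYPES[ref_type]
```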