Abstract:
In an example, a method of decoding video data includes decoding, from a video parameter set (VPS) of a multi-layer bitstream, data that indicates at least one of a tile configuration for layers of the multi-layer bitstream or a parallel processing configuration for layers of the multi-layer bitstream. The method also includes decoding the multi-layer bitstream in accordance with the data decoded from the VPS.
Abstract:
In one example, the disclosure is directed to techniques that include receiving a bitstream comprising at least a syntax element, a first network abstraction layer unit type, and a coded access unit comprising a plurality of pictures. The techniques further include determining a value of the syntax element which indicates whether the access unit was coded using cross-layer alignment. The techniques further include determining the first network abstraction layer unit type for a picture in the access unit and determining whether the first network abstraction layer unit type equals a value in a range of type values. The techniques further include setting a network abstraction layer unit type for all other pictures in the coded access unit to equal the value of the first network abstraction layer unit type if the first network abstraction layer unit type is equal to a value in the range of type values.
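The alignment rule described above can be sketched in a few lines. This is a minimal illustration, not the claimed method: the function name, the list-of-integers representation of an access unit, and the bounds of `ALIGNED_TYPE_RANGE` are all assumptions, since the abstract does not state which type values fall in the range. The sketch also assumes the rule applies only when the syntax element signals cross-layer alignment.

```python
from typing import List

# Hypothetical range of NAL unit type values; the abstract gives no bounds.
ALIGNED_TYPE_RANGE = range(16, 24)


def align_nal_unit_types(cross_layer_aligned: bool, nal_types: List[int]) -> List[int]:
    """Return the NAL unit types for the pictures of one access unit.

    nal_types[0] is treated as the "first" NAL unit type. If alignment is
    signaled and that type lies in the range, every other picture in the
    access unit takes the same type; otherwise the types are left unchanged.
    """
    if not nal_types:
        return []
    first = nal_types[0]
    if cross_layer_aligned and first in ALIGNED_TYPE_RANGE:
        return [first] * len(nal_types)
    return list(nal_types)
```

For example, `align_nal_unit_types(True, [19, 1, 1])` propagates type 19 to all three pictures, while the same call with a first type outside the assumed range leaves the list as it was.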
Abstract:
Systems, methods, and devices for coding multilayer video data are disclosed that may include encoding, decoding, transmitting, or receiving a non-entropy encoded set of profile, tier, and level syntax structures, potentially at a position within a video parameter set (VPS) extension. The systems, methods, and devices may refer to one of the profile, tier, and level syntax structures for each of a plurality of output layer sets. The systems, methods, and devices may encode or decode video data of one of the output layer sets based on information from the profile, tier, and level syntax structure referred to for the output layer set.
Abstract:
A device for processing video data includes a memory, a receiver configured to receive real-time transport protocol (RTP) packets, and one or more processors configured to receive a first aggregation packet according to a real-time transport protocol (RTP), wherein the first aggregation packet comprises a payload header and one or more aggregation units; parse a first aggregation unit that is the first aggregation unit of the first aggregation packet to determine a value for a first parameter, wherein the first parameter specifies a decoding order number for a NAL unit included in the first aggregation packet; parse a second aggregation unit to determine a value for a second parameter, wherein the second aggregation unit follows the first aggregation unit in the first aggregation packet; and, based on the first parameter and the second parameter, determine a decoding order for a NAL unit included in the second aggregation unit.
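One plausible way to combine the two parameters is sketched below. The abstract does not give the arithmetic, so the sketch borrows the convention of the RTP payload format for HEVC (RFC 7798): the first aggregation unit carries an absolute 16-bit decoding order number, and each later unit carries a difference that is added, plus one, to the previous number modulo 65536. The function and variable names are hypothetical.

```python
from typing import List

# Assumes a 16-bit decoding order number, as in RFC 7798.
DON_MODULUS = 1 << 16


def decoding_order_numbers(first_don: int, don_deltas: List[int]) -> List[int]:
    """Derive a decoding order number for every NAL unit in one aggregation packet.

    first_don  -- the first parameter, parsed from the first aggregation unit
    don_deltas -- the second parameter of each following aggregation unit,
                  interpreted here as a difference from the previous unit
    """
    dons = [first_don % DON_MODULUS]
    for delta in don_deltas:
        # Each following unit advances by (delta + 1), wrapping at 2**16.
        dons.append((dons[-1] + delta + 1) % DON_MODULUS)
    return dons
```

With `first_don=10` and deltas `[0, 2]`, the three NAL units receive the numbers 10, 11, and 14. The wrap-around keeps the sequence valid when the counter overflows.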
Abstract:
A device for processing video data includes a memory; a receiver configured to receive real-time transport protocol (RTP) packets; and one or more processors configured to receive a first fragmentation unit comprising a subset of a fragmented network abstraction layer (NAL) unit; parse a start bit of the first fragmentation unit to determine if the first fragmentation unit comprises a start of the fragmented NAL unit; in response to the first fragmentation unit comprising the start of the fragmented NAL unit and one or both of a transmission mode for the first fragmentation unit being a multi-session transmission mode and a first parameter being greater than a first value, parse a second parameter to determine a decoding order for the fragmented NAL unit; and decode the fragmented NAL unit based on the determined decoding order.
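The condition under which the second parameter is parsed can be written as a single predicate. The sketch below is illustrative only: it assumes a one-byte fragmentation unit header whose most significant bit is the start bit (the layout used by RFC 7798), and the names `needs_decoding_order_field`, `first_param`, and `threshold` are hypothetical stand-ins for the abstract's "first parameter" and "first value".

```python
START_BIT_MASK = 0x80  # assumed position of the start bit in a one-byte FU header


def needs_decoding_order_field(fu_header: int, multi_session: bool,
                               first_param: int, threshold: int = 0) -> bool:
    """Return True when the decoding order field should be parsed.

    The field is read only for the fragment that starts a fragmented NAL
    unit, and only if the transmission mode is multi-session, the first
    parameter exceeds the threshold, or both.
    """
    is_start = bool(fu_header & START_BIT_MASK)
    return is_start and (multi_session or first_param > threshold)
```

A fragment whose start bit is clear never triggers the parse, whatever the transmission mode or parameter value.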
Abstract:
A method of coding video data includes upsampling at least a portion of a reference layer picture to an upsampled picture having an upsampled picture size. The upsampled picture size has a horizontal upsampled picture size and a vertical upsampled picture size. At least one of the horizontal or vertical upsampled picture sizes may be different than a horizontal picture size or vertical picture size, respectively, of an enhancement layer picture. In addition, position information associated with the upsampled picture may be signaled. An inter-layer reference picture may be generated based on the upsampled picture and the position information.
Abstract:
In one example, a device for coding video data includes a video coder configured to code a value for a syntax element representative of whether any two reference layer samples, collocated with two respective enhancement layer picture samples within a common enhancement layer tile, must be within a common reference layer tile, and code the enhancement layer picture samples based at least in part on the value of the syntax element.
Abstract:
Techniques for encapsulating video streams containing multiple coded views in a media file are described herein. In one example, a method includes parsing a track of multiview video data, wherein the track includes at least one depth view. The method further includes parsing information to determine a spatial resolution associated with the depth view, wherein decoding the spatial resolution does not require parsing of a sequence parameter set of the depth view. Another example method includes composing a track of multiview video data, wherein the track includes the one or more views. The example method further includes composing information to indicate a spatial resolution associated with the depth view, wherein decoding the spatial resolution does not require parsing of a sequence parameter set of the depth view.
Abstract:
Techniques for low-delay buffering in a video coding process are disclosed. Video decoding techniques may include receiving a first decoded picture buffer (DPB) output delay and a second DPB output delay for a decoded picture, determining, for the decoded picture, a first DPB output time using the first DPB output delay in the case that a hypothetical reference decoder (HRD) setting for a video decoder indicates operation at a picture level, and determining, for the decoded picture, a second DPB output time using the second DPB output delay in the case that the HRD setting for the video decoder indicates operation at a sub-picture level.
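The selection between the two delays can be sketched as follows. The abstract does not state how an output time is computed from a delay, so this sketch assumes the usual HRD form: output time equals the removal time of the picture from the coded picture buffer plus the delay multiplied by a clock tick, with a finer sub-tick used in sub-picture mode. All names and the tick values in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PictureTiming:
    cpb_removal_time: float      # seconds; assumed to be known already
    picture_output_delay: int    # first DPB output delay, in clock ticks
    sub_picture_output_delay: int  # second DPB output delay, in clock sub-ticks


def dpb_output_time(timing: PictureTiming, sub_picture_mode: bool,
                    clock_tick: float, clock_sub_tick: float) -> float:
    """Pick the delay that matches the HRD operating level and apply it."""
    if sub_picture_mode:
        return timing.cpb_removal_time + clock_sub_tick * timing.sub_picture_output_delay
    return timing.cpb_removal_time + clock_tick * timing.picture_output_delay
```

For a picture removed at 1.0 s with delays of 4 ticks and 8 sub-ticks, a 0.04 s tick and a 0.01 s sub-tick give output times of about 1.16 s at the picture level and about 1.08 s at the sub-picture level, which shows how sub-picture operation can shorten the output delay.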
Abstract:
Techniques are described for signaling decoding unit identifiers for decoding units of an access unit. The video decoder determines which network abstraction layer (NAL) units are associated with which decoding units based on the decoding unit identifiers. Techniques are also described for including one or more copies of supplemental enhancement information (SEI) messages in an access unit.
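The association of NAL units with decoding units can be illustrated with a small grouping routine. The abstract does not say how the identifiers are carried, so the sketch assumes a simplified stream in which an integer marks the identifier of a decoding unit and the strings that follow stand for the NAL units belonging to it. The function name and the stream representation are hypothetical.

```python
from typing import Dict, List, Optional, Union


def associate_nal_units(stream: List[Union[int, str]]) -> Dict[int, List[str]]:
    """Group NAL units (strings) under the most recent decoding unit identifier (int)."""
    units: Dict[int, List[str]] = {}
    current: Optional[int] = None
    for item in stream:
        if isinstance(item, int):
            # A new identifier opens (or reopens) a decoding unit.
            current = item
            units.setdefault(current, [])
        elif current is not None:
            units[current].append(item)
        # NAL units seen before any identifier are ignored in this sketch.
    return units
```

Given the stream `[0, "slice_a", "slice_b", 1, "slice_c"]`, the routine assigns the first two slices to decoding unit 0 and the third to decoding unit 1.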