Abstract:
Methods for coding an inter-layer reference picture set (RPS) and coding end of bitstream (EoB) network abstraction layer (NAL) units in multi-layer coding are disclosed. In one aspect, the method includes determining whether a candidate inter-layer reference picture is present in the video information. The video information includes an inter-layer RPS including a plurality of subsets. The method further includes determining an inter-layer RPS subset to which the candidate inter-layer reference picture belongs in response to determining that the candidate inter-layer reference picture is not present, and indicating that no reference picture is present in the inter-layer RPS subset to which the candidate inter-layer reference picture belongs.
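The handling of a missing candidate can be pictured with a minimal Python sketch. It is illustrative only: the function name, the `subset_of` callback that decides which subset a candidate belongs to, and the use of `None` as the "no reference picture" marker are all assumptions, not details given in the abstract.

```python
# Hypothetical marker meaning "no reference picture" in an RPS subset entry.
NO_REFERENCE_PICTURE = None

def build_inter_layer_rps(candidates, present_ids, subset_of, num_subsets):
    """Place each candidate in its inter-layer RPS subset.

    candidates  -- layer ids of candidate inter-layer reference pictures
    present_ids -- set of layer ids whose pictures are actually present
    subset_of   -- assumed callback mapping a layer id to a subset index
    """
    subsets = [[] for _ in range(num_subsets)]
    for layer_id in candidates:
        target = subsets[subset_of(layer_id)]
        if layer_id in present_ids:
            target.append(layer_id)
        else:
            # Candidate is missing: record "no reference picture" in the
            # subset the candidate would have belonged to.
            target.append(NO_REFERENCE_PICTURE)
    return subsets
```

In this sketch the missing candidate still occupies a slot in its subset, so the positions of the remaining entries do not shift.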
Abstract:
An apparatus configured to decode video information includes a memory and a processor in communication with the memory. The memory is configured to store video information associated with a bitstream. The processor is configured to determine that a reference layer is not included in the bitstream and to receive, from an external source, a decoded base layer picture. The processor is further configured to receive, from the external source, a first indication that the picture is an intra random access point (IRAP) picture. The processor is also configured to receive a second indication of whether the picture is one of an instantaneous decoder refresh (IDR) picture, a clean random access (CRA) picture, or a broken link access (BLA) picture, and to decode the video information based at least in part on the first and second indications.
Abstract:
An apparatus configured to code video information includes a memory and a processor in communication with the memory. The memory is configured to store video information associated with a bitstream. The processor is configured to determine whether a reference layer is included in the bitstream. The processor is further configured to determine, based on whether the reference layer is included in the bitstream, whether to process an indication and, if the reference layer is included in the bitstream, to process the indication in a video bitstream. The processor is also configured to code the video information based at least in part on the processed indication.
Abstract:
A computing device generates a file that comprises a track box that contains metadata for a track in the file. Media data for the track comprises a sequence of samples. Each of the samples is a video access unit of multi-layer video data. As part of generating the file, the computing device generates, in the file, an additional box that documents all of the samples containing at least one Intra Random Access Point (IRAP) picture.
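As a rough illustration of what such an additional box would document, the Python sketch below collects the numbers of the samples that contain at least one IRAP picture. The input layout (each sample given as a list of per-picture `nal_unit_type` values) and the function name are assumptions made for the example; the abstract does not specify the box syntax, and the sketch does not attempt to serialize a box.

```python
# In HEVC, nal_unit_type values 16 through 23 are reserved for IRAP pictures
# (BLA, IDR, CRA, and two reserved IRAP types).
IRAP_NAL_TYPES = range(16, 24)

def irap_sample_numbers(samples):
    """Return the 1-based numbers of samples with at least one IRAP picture.

    samples -- list of access units; each access unit is a list holding the
               nal_unit_type of each coded picture (one per layer) in it.
    """
    return [
        number
        for number, picture_types in enumerate(samples, start=1)
        if any(t in IRAP_NAL_TYPES for t in picture_types)
    ]
```

A sample counts as soon as any one of its layers carries an IRAP picture, which is the "at least one" condition the abstract describes.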
Abstract:
In one example, a device for processing video data includes a memory for storing an enhancement layer of video data coded according to an extension of a video coding standard, and one or more processors configured to decode a hierarchy extension descriptor for an elementary stream including the enhancement layer, wherein the hierarchy extension descriptor includes data representative of two or more reference layers on which the enhancement layer depends, wherein the two or more reference layers include a first enhancement layer, conforming to a first scalability dimension, and a second enhancement layer, conforming to a second scalability dimension, and wherein the first scalability dimension is different than the second scalability dimension, and to process the video data based at least in part on the data representative of the two or more reference layers.
Abstract:
A computing device may obtain, from a first bitstream that includes a coded representation of the video data, a Supplemental Enhancement Information (SEI) message that includes an indication of an extraction mode that was used to produce the first bitstream. If the extraction mode is the first extraction mode, the first bitstream includes one or more coded pictures not needed for correct decoding of the target output layer set. If the extraction mode is the second extraction mode, the first bitstream does not include the one or more coded pictures not needed for correct decoding of the target output layer set.
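The difference between the two extraction modes can be sketched in Python as follows. The enum values, the `needed` flag on each picture, and both function names are hypothetical; the abstract does not define how the mode is encoded in the SEI message.

```python
from enum import Enum

class ExtractionMode(Enum):
    FIRST = 0   # extracted bitstream may keep pictures not needed for decoding
    SECOND = 1  # extracted bitstream drops those pictures

def extract(pictures, mode):
    """Produce the extracted picture list for the given mode.

    pictures -- list of dicts with a boolean "needed" key saying whether the
                picture is needed to decode the target output layer set.
    """
    if mode is ExtractionMode.FIRST:
        return list(pictures)
    return [p for p in pictures if p["needed"]]

def may_contain_unneeded_pictures(mode):
    """What a receiver can conclude from the mode signalled in the SEI message."""
    return mode is ExtractionMode.FIRST
```

A receiver reading the mode from the SEI message would know, without parsing the whole bitstream, whether further pruning could still remove pictures.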
Abstract:
A method of decoding video data includes receiving an encoded video bitstream that includes a plurality of pictures and storing the plurality of pictures in one or more sub-DPBs. The method further includes receiving a respective set of sub-DPB parameters for each respective operation point of the encoded video bitstream, applying the respective set of sub-DPB parameters to all layers of an output layer set for each respective operation point, and performing a sub-DPB management process on the one or more sub-DPBs in accordance with the received respective sets of sub-DPB parameters.
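A minimal Python sketch of applying one parameter set per operation point to every layer of its output layer set is given here. The field names are modeled loosely on HEVC buffering parameters and, like the function name and the dictionary layout, are assumptions for illustration rather than syntax taken from the abstract.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SubDpbParams:
    # Hypothetical fields; the abstract does not enumerate the parameters.
    max_dec_pic_buffering: int
    max_num_reorder_pics: int
    max_latency_increase: int

def apply_sub_dpb_params(operation_points):
    """Map each operation point to per-layer parameters.

    operation_points -- dict: operation point id -> (layers of its output
                        layer set, the single SubDpbParams received for it)
    Returns dict: operation point id -> {layer id: SubDpbParams}.
    """
    return {
        op_id: {layer: params for layer in layers}
        for op_id, (layers, params) in operation_points.items()
    }
```

Every layer of an output layer set ends up sharing the same parameter object, so a sub-DPB management process in this sketch would need only one set of limits per operation point.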
Abstract:
Techniques are described for signaling of representation format information in multi-layer bitstreams. Representation format information is signaled using representation format syntax structures included in a video parameter set (VPS) for a video sequence in a multi-layer bitstream. When syntax elements associated with the representation format syntax structures are not present in the VPS, a mapping of representation formats to layers in the multi-layer bitstream may be inferred. According to the techniques, in the absence of the syntax elements, a video decoder infers which of the representation format syntax structures is applied to which of the layers in the bitstream based on a number of the representation format syntax structures included in the VPS for the video sequence. By basing the inference on the number of representation format syntax structures for the video sequence, the inference may be accurate for the type of multi-layer video extension used in the multi-layer bitstream.
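One way such an inference could work is sketched below in Python. The specific rule used, mapping layer i to representation format min(i, N - 1) where N is the number of representation format syntax structures, is an assumption chosen for illustration; the abstract states only that the inference depends on N.

```python
def infer_rep_format_idx(num_layers, num_rep_formats):
    """Infer, for each layer, the index of the representation format to apply.

    Assumed rule: layer i uses format min(i, num_rep_formats - 1). With a
    single format every layer shares it; with one format per layer the
    mapping is one-to-one.
    """
    if num_rep_formats < 1:
        raise ValueError("at least one representation format is required")
    return [min(i, num_rep_formats - 1) for i in range(num_layers)]
```

Under this rule, a VPS with one representation format yields the same format for all layers, while a VPS with as many formats as layers yields a distinct format per layer.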
Abstract:
A method, apparatus, and manufacture for processing video data. A list of output layer sets in a video bitstream is received, and an index to at least one target output layer set in the list of output layer sets is received. Next, target output layers in the at least one target output layer set are determined based on the index. At least the target output layers from the video bitstream are decoded. Then, the decoded target output layers are output without outputting layers that are not targeted for output.
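The selection and output steps can be sketched in Python as follows. The data layout (output layer sets as lists of layer ids, decoded pictures keyed by layer id) and the function names are assumptions for the example, and decoding itself is not modeled.

```python
def select_target_output_layers(output_layer_sets, target_index):
    """Look up the target output layers from the received list and index."""
    return output_layer_sets[target_index]

def output_decoded_layers(decoded, target_layers):
    """Keep only decoded layers that are targeted for output.

    decoded -- dict: layer id -> decoded picture (any object). It may hold
               extra layers that were decoded only as references.
    """
    targets = set(target_layers)
    return {layer: pic for layer, pic in decoded.items() if layer in targets}
```

Layers decoded only because a target layer depends on them are filtered out at the output step, which mirrors the last sentence of the abstract.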
Abstract:
A video encoder may generate a bitstream that includes a syntax element that indicates whether inter-layer prediction is enabled for decoding a tile of a picture of the video data. Similarly, a video decoder may obtain, from a bitstream, a syntax element that indicates whether inter-layer prediction is enabled. The video decoder may determine, based on the syntax element, whether inter-layer prediction is enabled for decoding a tile of a picture of the video data, and decode the tile based on the determination.