Abstract:
A video processing device can receive a network abstraction layer (NAL) unit in an encoded bitstream of video data and parse a first syntax element in a header of the NAL unit to determine a temporal identification (ID) for the NAL unit, wherein a value of the first syntax element is one greater than the temporal ID.
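As a concrete illustration, the sketch below derives the temporal ID from a two-byte NAL unit header laid out as in HEVC, where nuh_temporal_id_plus1 occupies the last three bits and is coded as one greater than the temporal ID; the abstract does not name a specific codec, so the field layout here is an assumption.

/* Minimal sketch: deriving the temporal ID from an HEVC-style two-byte
 * NAL unit header, where nuh_temporal_id_plus1 (the last three bits) is
 * coded as one greater than the temporal ID. */
#include <stdio.h>
#include <stdint.h>

int temporal_id_from_nal_header(const uint8_t header[2]) {
    /* byte 0: forbidden_zero_bit (1) | nal_unit_type (6) | nuh_layer_id MSB (1)
     * byte 1: nuh_layer_id low bits (5) | nuh_temporal_id_plus1 (3)           */
    int temporal_id_plus1 = header[1] & 0x07;   /* low three bits */
    return temporal_id_plus1 - 1;               /* signaled value is ID + 1 */
}

int main(void) {
    uint8_t hdr[2] = { 0x40, 0x01 };            /* example header: VPS, TemporalId 0 */
    printf("TemporalId = %d\n", temporal_id_from_nal_header(hdr));
    return 0;
}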
Abstract:
In one example, the disclosure is directed to techniques that include receiving a bitstream comprising at least a syntax element, a first network abstraction layer unit type, and a coded access unit comprising a plurality of pictures. The techniques further include determining a value of the syntax element which indicates whether the access unit was coded using cross-layer alignment. The techniques further include determining the first network abstraction layer unit type for a picture in the access unit and determining whether the first network abstraction layer unit type equals a value in a range of type values. The techniques further include setting a network abstraction layer unit type for all other pictures in the coded access unit to equal the value of the first network abstraction layer unit type if the first network abstraction layer unit type is equal to a value in the range of type values.
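The following sketch illustrates the alignment rule just described under assumed names: if a cross-layer alignment flag is set and the first picture's NAL unit type falls within a given range (the HEVC IRAP range 16..23 is used purely as an example), every other picture in the access unit receives the same type.

/* Sketch of the cross-layer alignment rule; flag and range values are
 * illustrative, with 19 (IDR_W_RADL in HEVC) as the example type. */
#include <stdio.h>

#define NUM_PICS 3

void align_nal_unit_types(int types[], int n, int cross_layer_aligned,
                          int range_lo, int range_hi) {
    if (cross_layer_aligned && types[0] >= range_lo && types[0] <= range_hi) {
        for (int i = 1; i < n; i++)
            types[i] = types[0];   /* propagate the first picture's type */
    }
}

int main(void) {
    int types[NUM_PICS] = { 19, 1, 1 };
    align_nal_unit_types(types, NUM_PICS, 1, 16, 23);
    for (int i = 0; i < NUM_PICS; i++)
        printf("picture %d: nal_unit_type %d\n", i, types[i]);
    return 0;
}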
Abstract:
Systems and methods for inter-layer reference picture set derivation based on sub-layer reference prediction dependency are described herein. One aspect of the subject matter described in the disclosure provides a video encoder comprising a memory unit configured to store one or more direct reference layer pictures of one or more current pictures in a sequence, wherein the one or more current pictures are associated with a current layer, the current layer being associated with one or more direct reference layers. The video encoder further comprises a processor in communication with the memory unit. The processor is configured to set an indication associated with a current picture to indicate whether all of the one or more direct reference layer pictures of the current picture that are not restricted for use in inter-layer prediction are included in an inter-layer reference picture set associated with the current picture.
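A minimal sketch of the indication described above, with illustrative structure and field names: when the flag is set, every direct reference layer picture of the current picture that is not restricted from inter-layer prediction is placed in the inter-layer reference picture set.

/* Sketch of building an inter-layer reference picture set from direct
 * reference layer pictures, controlled by an "all included" indication.
 * Types and field names are placeholders, not spec syntax. */
#include <stdio.h>

#define MAX_REF 4

typedef struct { int poc; int restricted; } ref_layer_pic;

int build_inter_layer_rps(const ref_layer_pic refs[], int n,
                          int all_refs_indicated, int out_poc[]) {
    int count = 0;
    if (!all_refs_indicated)
        return 0;                       /* subset would be signaled elsewhere */
    for (int i = 0; i < n; i++)
        if (!refs[i].restricted)        /* skip pictures barred from inter-layer prediction */
            out_poc[count++] = refs[i].poc;
    return count;
}

int main(void) {
    ref_layer_pic refs[2] = { { 8, 0 }, { 8, 1 } };
    int rps[MAX_REF];
    printf("inter-layer RPS size = %d\n", build_inter_layer_rps(refs, 2, 1, rps));
    return 0;
}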
Abstract:
A device for processing video data includes a memory; a receiver configured to receive real-time transport protocol (RTP) packets; and one or more processors configured to receive a first fragmentation unit comprising a subset of a fragmented network abstraction layer (NAL) unit; parse a start bit of the first fragmentation unit to determine if the first fragmentation unit comprises a start of the fragmented NAL unit; in response to the first fragmentation unit comprising the start of the fragmented NAL unit and one or both of a transmission mode for the first fragmentation unit being a multi-session transmission mode and a first parameter being greater than a first value, parse a second parameter to determine a decoding order for the fragmented NAL unit; and decode the fragmented NAL unit based on the determined decoding order.
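The sketch below follows the parsing steps listed in the abstract, assuming a packet layout loosely modeled on the H.265 RTP payload format (a two-byte payload header, a one-byte FU header whose top bit is the start bit, and a conditional 16-bit decoding-order field); parameter names such as multi_session and max_don_diff are placeholders.

/* Hedged sketch of fragmentation-unit parsing: read the start (S) bit and,
 * for a starting fragment under multi-session transmission or a positive
 * decoding-order parameter, read a 16-bit decoding-order field. */
#include <stdio.h>
#include <stdint.h>

typedef struct { int is_start; int has_donl; uint16_t donl; } fu_info;

fu_info parse_fu(const uint8_t *payload, int multi_session, int max_don_diff) {
    fu_info info = { 0, 0, 0 };
    const uint8_t fu_header = payload[2];        /* after the 2-byte payload header */
    info.is_start = (fu_header & 0x80) != 0;     /* S bit */
    if (info.is_start && (multi_session || max_don_diff > 0)) {
        info.has_donl = 1;
        info.donl = (uint16_t)((payload[3] << 8) | payload[4]);
    }
    return info;
}

int main(void) {
    uint8_t pkt[] = { 0x62, 0x01, 0x93, 0x00, 0x07 };  /* FU start, decoding order 7 */
    fu_info fi = parse_fu(pkt, 1, 0);
    printf("start=%d donl_present=%d donl=%u\n", fi.is_start, fi.has_donl, fi.donl);
    return 0;
}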
Abstract:
A method of coding video data includes receiving one or more layers of video information. Each layer may include at least one picture. The method can include processing an indicator within at least one of a video parameter set (VPS), a sequence parameter set (SPS), or a picture parameter set (PPS) that indicates whether all direct reference layer pictures associated with the at least one of the video parameter set (VPS), the sequence parameter set (SPS), or the picture parameter set (PPS) are added to an inter-layer reference picture set. Based on the indicator, the method can further include refraining from further signaling inter-layer reference picture information in any video slice associated with the at least one of the video parameter set (VPS), the sequence parameter set (SPS), or the picture parameter set (PPS). Alternatively, based on the indicator, the method can further include adding to the inter-layer reference picture set all direct reference layer pictures for any video slice associated with the at least one of the video parameter set (VPS), the sequence parameter set (SPS), or the picture parameter set (PPS).
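A hedged sketch of the branching the indicator controls, using placeholder names: when the parameter-set flag is set, all direct reference layer pictures are added to the inter-layer reference picture set and no per-slice inter-layer reference information is read; otherwise the slice-level signaling (stubbed out here) would be parsed.

/* Sketch of acting on a parameter-set level indicator for inter-layer
 * reference picture set derivation. Names are placeholders. */
#include <stdio.h>

#define MAX_LAYERS 8

typedef struct { int all_ref_layers_active_flag; } parameter_set;

static void parse_slice_inter_layer_refs(void) {
    printf("parsing per-slice inter-layer reference info\n");
}

int derive_slice_ilrps(const parameter_set *ps,
                       const int direct_ref_layer_poc[], int num_direct,
                       int ilrps[]) {
    if (ps->all_ref_layers_active_flag) {
        for (int i = 0; i < num_direct; i++)
            ilrps[i] = direct_ref_layer_poc[i];   /* add every direct reference layer picture */
        return num_direct;                        /* nothing further signaled in slices */
    }
    parse_slice_inter_layer_refs();               /* fall back to slice-level signaling */
    return 0;
}

int main(void) {
    parameter_set vps = { 1 };
    int refs[2] = { 4, 4 }, ilrps[MAX_LAYERS];
    printf("ILRPS size = %d\n", derive_slice_ilrps(&vps, refs, 2, ilrps));
    return 0;
}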
Abstract:
In general, techniques are described for coding picture order count values identifying long-term reference pictures. A video decoding device comprising a processor may perform the techniques. The processor may determine least significant bits (LSBs) of a picture order count (POC) value that identifies a long-term reference picture (LTRP). The LSBs do not uniquely identify the POC value with respect to the LSBs of any other POC value identifying any other picture in a decoded picture buffer (DPB). The processor may determine most significant bits (MSBs) of the POC value. The MSBs combined with the LSBs are sufficient to distinguish the POC value from any other POC value that identifies any other picture in the DPB. The processor may retrieve the LTRP from the decoded picture buffer based on the LSBs and MSBs of the POC value, and decode a current picture of the video data using the retrieved LTRP.
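The arithmetic behind the LSB/MSB combination can be sketched as follows, assuming MaxPocLsb = 256 for the example: the full POC is reconstructed as the MSB cycle times MaxPocLsb plus the LSBs, and that value selects the intended long-term reference picture from a DPB whose entries share the same LSBs.

/* Sketch of combining signaled POC LSBs and MSBs to pick the intended
 * LTRP out of a DPB with colliding LSBs. MAX_POC_LSB of 256 is only an
 * example value for 2^(log2 of the maximum POC LSB). */
#include <stdio.h>

#define MAX_POC_LSB 256
#define DPB_SIZE 3

int reconstruct_poc(int poc_msb_cycle, int poc_lsb) {
    return poc_msb_cycle * MAX_POC_LSB + poc_lsb;   /* full POC from MSB cycle and LSBs */
}

int find_ltrp(const int dpb_poc[], int n, int poc_msb_cycle, int poc_lsb) {
    int target = reconstruct_poc(poc_msb_cycle, poc_lsb);
    for (int i = 0; i < n; i++)
        if (dpb_poc[i] == target)
            return i;                 /* index of the matching DPB picture */
    return -1;
}

int main(void) {
    /* Two DPB pictures share LSB 16 (POC 16 and 272); the MSBs disambiguate. */
    int dpb_poc[DPB_SIZE] = { 16, 272, 300 };
    printf("LTRP index = %d\n", find_ltrp(dpb_poc, DPB_SIZE, 1, 16));
    return 0;
}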
Abstract:
A device for coding video data includes a memory comprising a decoded picture buffer (DPB) configured to store video data, and a video coder configured to code data representative of a value for a picture order count (POC) resetting period identifier, wherein the data is included in a slice segment header for a slice associated with a coded picture of a layer of video data, and wherein the value of the POC resetting period identifier indicates a POC resetting period including the coded picture, and to reset at least part of a POC value for the coded picture in the POC resetting period in the layer and POC values for one or more pictures in the layer that are currently stored in the DPB.
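A simple, illustrative sketch of a POC reset of the kind described (all names and the full-reset-to-zero behavior are assumptions): the current picture's POC is reduced and the POC values of same-layer pictures already in the DPB are shifted by the same amount so their relative order is preserved.

/* Illustrative POC reset: zero the current picture's POC and shift the
 * stored same-layer pictures by the same delta. */
#include <stdio.h>

#define DPB_SIZE 3

void apply_poc_reset(int *current_poc, int dpb_poc[], int dpb_size) {
    int delta = *current_poc;          /* amount removed from the current POC */
    *current_poc -= delta;             /* current picture's POC becomes 0 */
    for (int i = 0; i < dpb_size; i++)
        dpb_poc[i] -= delta;           /* shift stored pictures consistently */
}

int main(void) {
    int current_poc = 256;
    int dpb_poc[DPB_SIZE] = { 248, 252, 254 };
    apply_poc_reset(&current_poc, dpb_poc, DPB_SIZE);
    printf("current POC = %d, first DPB POC = %d\n", current_poc, dpb_poc[0]);
    return 0;
}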
Abstract:
Systems, methods, and devices for coding multilayer video data are disclosed that may include encoding, decoding, transmitting, or receiving multilayer video data. The systems, methods, and devices may transmit or receive a video parameter set (VPS) including information for a series of layers, each layer including visual signal information. The systems, methods, and devices may code (encode or decode) video data based on the visual signal information signaled per layer in the VPS.
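The per-layer signaling can be pictured with the structure below; the three fields mirror the familiar VUI-style video_format, video_full_range_flag, and colour_primaries values, but the struct layout is only an illustration of carrying such information once per layer in the VPS.

/* Illustrative VPS-like structure carrying visual signal information per
 * layer. Field names follow common VUI terminology; the layout is assumed. */
#include <stdio.h>

#define MAX_LAYERS 4

typedef struct {
    int video_format;            /* e.g. 5 = unspecified */
    int video_full_range_flag;
    int colour_primaries;        /* e.g. 1 = BT.709, 9 = BT.2020 */
} visual_signal_info;

typedef struct {
    int num_layers;
    visual_signal_info vsi[MAX_LAYERS];   /* one entry per layer */
} video_parameter_set;

int main(void) {
    video_parameter_set vps = { 2, { { 5, 0, 1 }, { 5, 1, 9 } } };
    for (int i = 0; i < vps.num_layers; i++)
        printf("layer %d: full_range=%d primaries=%d\n",
               i, vps.vsi[i].video_full_range_flag, vps.vsi[i].colour_primaries);
    return 0;
}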
Abstract:
During a coding process, systems, methods, and apparatus may code information indicating whether gradual decoder refresh (GDR) of a picture is enabled. When GDR is enabled, the systems, methods, and apparatus may code information that indicates whether one or more slices of the picture belong to a foreground region of the picture. In another example, during a coding process, systems, methods, and apparatus may decode video data corresponding to an ISP identification (ISP ID) for one of the ISPs for slices of a picture. The systems, methods, and apparatus may decode video data corresponding to a region of interest (ROI) using the ISP.
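The first example can be sketched as follows, with placeholder flag names rather than spec syntax: a flag indicates whether GDR is enabled, and only when it is does each slice carry a flag marking it as part of the picture's refreshed foreground region.

/* Sketch of GDR-related signaling: a per-slice foreground flag is only
 * assigned when GDR is enabled. Names are placeholders. */
#include <stdio.h>

#define NUM_SLICES 3

typedef struct { int foreground_flag; } slice_info;

void mark_foreground(int gdr_enabled, slice_info slices[], int n,
                     const int refreshed[]) {
    if (!gdr_enabled)
        return;                         /* flag only coded when GDR is on */
    for (int i = 0; i < n; i++)
        slices[i].foreground_flag = refreshed[i];
}

int main(void) {
    slice_info slices[NUM_SLICES] = { { 0 }, { 0 }, { 0 } };
    int refreshed[NUM_SLICES] = { 1, 1, 0 };   /* first two slices already refreshed */
    mark_foreground(1, slices, NUM_SLICES, refreshed);
    for (int i = 0; i < NUM_SLICES; i++)
        printf("slice %d foreground=%d\n", i, slices[i].foreground_flag);
    return 0;
}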
Abstract:
A video coding device, such as a video encoder or a video decoder, may be configured to code a duration between the coded picture buffer (CPB) removal time of a first decoding unit (DU) in an access unit (AU) and the CPB removal time of a second DU, wherein the second DU is subsequent to the first DU in decoding order and in the same AU as the first DU. The video coding device may further determine a removal time of the first DU based at least on the coded duration. The coding device may also code a sub-picture timing supplemental enhancement information (SEI) message associated with the first DU. The video coding device may further determine the removal time of the first DU based at least in part on the sub-picture timing SEI message.
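The timing relation can be illustrated with the small calculation below; the 90 kHz clock and the direction of the offset (subtracting the coded duration from the later DU's removal time to obtain the earlier DU's) are assumptions for the example, not the normative derivation.

/* Hedged arithmetic sketch: derive one DU's CPB removal time from the
 * other's using a coded duration expressed in clock ticks. */
#include <stdio.h>

#define CLOCK_TICK (1.0 / 90000.0)      /* assumed 90 kHz time base */

double first_du_removal_time(double second_du_removal_time, int coded_duration_ticks) {
    /* the earlier DU is removed before the later one, so subtract the duration */
    return second_du_removal_time - coded_duration_ticks * CLOCK_TICK;
}

int main(void) {
    double second_du_removal = 0.050;                /* seconds */
    int duration_ticks = 900;                        /* coded duration: 10 ms */
    printf("first DU removal time = %.3f s\n",
           first_du_removal_time(second_du_removal, duration_ticks));
    return 0;
}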