Abstract:
Systems, methods, and devices for coding multilayer video data are disclosed that may include encoding, decoding, transmitting, or receiving a non-entropy-encoded set of profile, tier, and level syntax structures, potentially at a position within a video parameter set (VPS) extension. The systems, methods, and devices may refer to one of the profile, tier, and level syntax structures for each of a plurality of output layer sets. The systems, methods, and devices may encode or decode video data of one of the output layer sets based on information from the profile, tier, and level syntax structure referred to for the output layer set.
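The referencing scheme described above can be sketched as a shared list of profile, tier, and level structures that each output layer set points into by index. This is a minimal illustration, not the syntax of any standard: the class names, the `ptl_idx` field, the `decodable` helper, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileTierLevel:
    # Hypothetical, simplified stand-in for a profile/tier/level syntax structure.
    profile: str
    tier: str
    level: float


@dataclass(frozen=True)
class OutputLayerSet:
    layer_ids: tuple
    ptl_idx: int  # hypothetical index into the shared list of PTL structures


def ptl_for(ols, ptl_structures):
    """Return the PTL structure that the output layer set refers to."""
    return ptl_structures[ols.ptl_idx]


def decodable(ols, ptl_structures, supported_profiles, max_level):
    """Assumed capability check: profile must be supported and level within range.

    The tier is ignored here to keep the sketch short.
    """
    ptl = ptl_for(ols, ptl_structures)
    return ptl.profile in supported_profiles and ptl.level <= max_level


# Illustrative values only.
PTL_STRUCTURES = [
    ProfileTierLevel("Main", "Main", 3.1),
    ProfileTierLevel("Scalable Main", "Main", 4.1),
]
OUTPUT_LAYER_SETS = [
    OutputLayerSet(layer_ids=(0,), ptl_idx=0),
    OutputLayerSet(layer_ids=(0, 1), ptl_idx=1),
]
```

Because the structures are stored once and referenced by index, several output layer sets could share one structure without repeating it.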
Abstract:
Techniques are described for signaling decoding unit identifiers for decoding units of an access unit. The video decoder determines which network abstraction layer (NAL) units are associated with which decoding units based on the decoding unit identifiers. Techniques are also described for including one or more copies of supplemental enhancement information (SEI) messages in an access unit.
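As a rough illustration of the association step, the sketch below groups the NAL units of one access unit by decoding unit identifier. It assumes, purely for the example, that each NAL unit has already been tagged with its identifier; the `NalUnit` class, its fields, and `group_by_decoding_unit` are hypothetical names, and the abstract does not say where the identifier is carried.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class NalUnit:
    nal_type: str        # hypothetical label, e.g. "SEI" or "VCL"
    du_id: int           # hypothetical decoding unit identifier for this unit
    payload: bytes = b""


def group_by_decoding_unit(access_unit):
    """Map each decoding unit identifier to its NAL units, in arrival order."""
    groups = defaultdict(list)
    for nal in access_unit:
        groups[nal.du_id].append(nal)
    return dict(groups)
```

A decoder built this way could begin processing a decoding unit as soon as all of its NAL units have arrived, rather than waiting for the whole access unit.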
Abstract:
Techniques are described for modal sub-bitstream extraction. For example, a network entity may select a sub-bitstream extraction mode from a plurality of sub-bitstream extraction modes. Each sub-bitstream extraction mode may define a particular manner in which to extract coded pictures from views or layers to allow a video decoder to decode target output views or layers for display. In this manner, the network entity may adaptively select the appropriate sub-bitstream extraction technique, rather than applying a rigid, fixed sub-bitstream extraction technique.
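One way to picture mode selection is a single extraction routine whose behavior depends on the chosen mode. The two modes below are invented for illustration and are not taken from the abstract: one keeps every picture of every layer the targets depend on, the other drops pictures in non-target layers that no other layer predicts from. The `used_for_reference` flag and the `deps` dependency map are likewise assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class ExtractionMode(Enum):
    # Hypothetical modes, named for this sketch only.
    KEEP_ALL_REFERENCE_LAYERS = "keep_all"
    KEEP_REFERENCED_PICTURES = "keep_referenced"


@dataclass(frozen=True)
class CodedPicture:
    layer_id: int
    poc: int                   # picture order count
    used_for_reference: bool   # assumed flag: another layer predicts from this picture


def required_layers(targets, deps):
    """Collect the target layers plus every layer they depend on, transitively."""
    needed, stack = set(), list(targets)
    while stack:
        layer = stack.pop()
        if layer not in needed:
            needed.add(layer)
            stack.extend(deps.get(layer, ()))
    return needed


def extract(pictures, targets, deps, mode):
    """Return the pictures kept in the sub-bitstream under the given mode."""
    needed = required_layers(targets, deps)
    target_set = set(targets)
    kept = []
    for pic in pictures:
        if pic.layer_id not in needed:
            continue  # layer is irrelevant to the target output layers
        if (mode is ExtractionMode.KEEP_REFERENCED_PICTURES
                and pic.layer_id not in target_set
                and not pic.used_for_reference):
            continue  # non-target picture that nothing predicts from
        kept.append(pic)
    return kept
```

Under these assumptions both modes leave the target layers decodable; the second simply yields a smaller sub-bitstream at the cost of needing per-picture reference information.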
Abstract:
Systems, methods, and devices for coding multilayer video data are disclosed that may include encoding, decoding, transmitting, or receiving multilayer video data. The systems, methods, and devices may receive or transmit a first output layer set for a layer set and receive or transmit a second output layer set for the layer set. The systems, methods, and devices may code (encode or decode) video data for at least one of the first output layer set and the second output layer set.
Abstract:
In one example, a device for coding video data includes a video coder configured to code data representative of whether a tile of an enhancement layer picture can be predicted using inter-layer prediction, and predict data of the tile using inter-layer prediction only when the data indicates that the tile can be predicted using inter-layer prediction.
Abstract:
A video coder can select which reference pictures should be signaled in a parameter set such as a picture parameter set (PPS) and which reference pictures should be signaled in a slice header such that when a video decoder constructs a reference picture set, the video decoder does not need to reorder the reference picture set to construct an initial reference picture list for a slice of video data.
Abstract:
In one example, a video coder, such as a video encoder or video decoder, is configured to code a video parameter set (VPS) for one or more layers of video data, wherein each of the one or more layers of video data refers to the VPS, and code the one or more layers of video data based at least in part on the VPS. The video coder may code the VPS for video data conforming to High-Efficiency Video Coding, Multiview Video Coding, Scalable Video Coding, or other video coding standards or extensions of video coding standards. The VPS may include data specifying parameters for corresponding sequences of video data within various different layers (e.g., views, quality layers, or the like). The parameters of the VPS may provide indications of how the corresponding video data is coded.
Abstract:
Techniques are described for sending output indications in codec-hybrid multi-layer video coding, in which a base layer of video data is provided by an external system and conforms to a different video codec standard than one or more enhancement layers of the video data. An enhancement layer video decoder receives an enhancement layer bitstream that includes at least one enhancement layer to be decoded, an indication that the base layer is provided externally, and an indication of which layers are target output layers to be output for display. The external system does not receive a target output layer indication in a base layer bitstream. The disclosed techniques enable the enhancement layer video decoder, when the base layer is provided by the external system, to send an output indication to the external system indicating whether the base layer or specific base layer decoded pictures need to be output for display.
Abstract:
In one example, a device includes one or more media decoders configured to decode media data, a network interface configured to receive a layered coding transport (LCT) Session Instance Description (LSID), the LSID including information representing a plurality of LCT sessions, each of the LCT sessions including data of a respective one of a plurality of representations of a DASH media presentation and data of one or more of the LCT sessions, and a processor configured to initiate consumption of one or more of the representations of the DASH media presentation using the LSID and without using a manifest file for the DASH media presentation, wherein to initiate consumption, the processor is configured to receive, via the network interface, packets of the LCT sessions including portions of data of the one or more of the representations; and provide data of the packets to the one or more media decoders.
Abstract:
A method for processing video data in a real-time transport protocol (RTP) payload includes encapsulating video data in a single network abstraction layer (NAL) unit packet for an RTP session. The single NAL unit packet contains a single NAL unit. The method may also include encapsulating decoding order number information in the single NAL unit packet based on at least one of: the RTP session being in a multi-stream transmission (MST) mode, or a maximum number of NAL units that may precede the NAL unit in a de-packetization buffer in reception order and follow the NAL unit in decoding order being greater than 0.
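The condition for carrying decoding order number information can be sketched as a simple predicate plus a packet builder. The byte layout below (a two-byte header, then an optional 16-bit decoding order number, then the rest of the NAL unit) is a simplification assumed for illustration, and the function and parameter names are hypothetical.

```python
import struct


def needs_decoding_order_number(mst_mode: bool, max_preceding_nal_units: int) -> bool:
    """True when the session is in MST mode or reordering in the buffer is possible."""
    return mst_mode or max_preceding_nal_units > 0


def build_single_nal_packet(nal_unit: bytes, don: int,
                            mst_mode: bool, max_preceding_nal_units: int) -> bytes:
    """Build the payload of a single NAL unit packet under the assumed layout."""
    if len(nal_unit) < 2:
        raise ValueError("NAL unit must include its two-byte header")
    header, body = nal_unit[:2], nal_unit[2:]
    if needs_decoding_order_number(mst_mode, max_preceding_nal_units):
        # Insert the decoding order number as a 16-bit big-endian field.
        return header + struct.pack("!H", don & 0xFFFF) + body
    return nal_unit
```

When neither condition holds, reception order already matches decoding order, so the extra field would only add overhead.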