Abstract:
In general, techniques are described for separately processing depth and texture components of video data. A device configured to process video data including a view component comprising a depth component and a texture component may perform various aspects of the techniques. The device may comprise a processor configured to determine a supplemental enhancement information (SEI) message that applies when processing the view component of the video data, and determine a nested SEI message that applies, in addition to the SEI message, when processing the depth component of the view component.
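A minimal sketch of the applicability rule described above, assuming supplemental enhancement information (SEI) messages are modeled as opaque list entries (the names are illustrative, not the coded syntax):

```python
def applicable_sei(view_seis, nested_depth_seis, component):
    """Return the SEI messages that apply when processing `component`.

    View-level SEI messages apply to the whole view component; the
    nested SEI messages additionally apply only when processing the
    depth component.
    """
    if component == "depth":
        return list(view_seis) + list(nested_depth_seis)
    return list(view_seis)
```

For example, processing the texture component would use only the view-level messages, while the depth component would use both.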
Abstract:
Techniques are described related to output and removal of decoded pictures from a decoded picture buffer (DPB). The example techniques may remove a decoded picture from the DPB prior to coding a current picture. For instance, the example techniques may remove the decoded picture if that decoded picture is not identified in the reference picture set of the current picture.
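The removal rule can be sketched as follows, assuming each decoded picture is keyed by its picture order count and carries a flag indicating whether it is still needed for output (both assumptions of this sketch, not the claimed syntax):

```python
def remove_unused_pictures(dpb, reference_picture_set):
    """Prior to coding the current picture, keep only decoded pictures
    that are identified in the current picture's reference picture set
    or are still awaiting output; remove the rest from the DPB."""
    rps = set(reference_picture_set)
    return [pic for pic in dpb
            if pic["poc"] in rps or pic["needed_for_output"]]
```

A picture that is neither referenced by the current picture nor pending output is removed before the current picture is coded.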
Abstract:
A video encoder is configured to determine a picture size for one or more pictures included in a video sequence. The picture size associated with the video sequence may be a multiple of an aligned coding unit size for the video sequence. In one example, the aligned coding unit size for the video sequence may comprise a minimum coding unit size where the minimum coding unit size is selected from a plurality of smallest coding unit sizes corresponding to different pictures in the video sequence. A video decoder is configured to obtain syntax elements to determine the picture size and the aligned coding unit size for the video sequence. The video decoder decodes the pictures included in the video sequence with the picture size, and stores the decoded pictures in a decoded picture buffer.
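Under the assumption that the aligned coding unit size is the minimum of the smallest coding unit sizes of the pictures in the sequence, the size alignment can be sketched as rounding each picture dimension up to a multiple of that size:

```python
def aligned_picture_size(width, height, smallest_cu_sizes):
    """Round the picture dimensions up to a multiple of the aligned
    coding unit size, taken here as the minimum of the smallest coding
    unit sizes of the pictures in the sequence (an assumption of this
    sketch)."""
    align = min(smallest_cu_sizes)

    def round_up(v):
        return (v + align - 1) // align * align

    return round_up(width), round_up(height)
```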
Abstract:
File formats and the parsing and coding of video data are defined to promote more efficient random access to coded video data. Constraints may be imposed on the placement of parameter sets and on the definition of sync samples in video files. For a non-sync sample, parameter set data may be coded only in the sample entry for the sample, in the sample itself, in a previous sample in decoding order that is a sync sample, or in a sample occurring in decoding order between the sample and that previous sync sample.
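The placement constraint for a non-sync sample can be sketched as a walk backward in decoding order (sample entries are carried out of band and are not modeled here; field names are illustrative):

```python
def allowed_parameter_set_sources(samples, idx):
    """For the non-sync sample at decoding-order position `idx`, return
    the positions whose in-band parameter set data may be used: the
    most recent preceding sync sample, every sample between it and
    `idx`, and `idx` itself."""
    start = idx
    while start > 0 and not samples[start]["sync"]:
        start -= 1  # walk back to the preceding sync sample
    return list(range(start, idx + 1))
```

A parser needing parameter sets for a random-access point therefore never has to scan earlier than the preceding sync sample.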
Abstract:
A device for processing media data is configured to receive media data including virtual reality (VR) video data; determine, based at least in part on data signaled at an adaptation set level of a media presentation description for a media presentation, a projection mapping used in the media presentation; and process segments of a video representation of the media presentation based on the projection mapping used in the media presentation. A device for processing media data is configured to generate media data that includes VR video data; include, in the media data, data signaled at an adaptation set level of a media presentation description that identifies a projection mapping used in the media presentation included in the media data; and send segments of a video representation of the media presentation based on the projection mapping used in the media presentation.
Abstract:
A media device pre-fetches media data that is likely to be retrieved. An example media device includes a memory for storing media data, and one or more processors implemented in circuitry and configured to receive information indicating at least one data structure of a plurality of data structures that is likely to be retrieved by a plurality of user devices operated by a respective plurality of users, the data structure including media data, and retrieve the media data of the data structure before receiving requests for the media data from the user devices. The information may be included in, e.g., a manifest file, a special Parameters Enhancing Delivery (PED) message, and/or a separate track of a video file multiplexed with other tracks of the video file.
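A minimal sketch of the pre-fetching behavior, with `fetch` standing in for network retrieval and identifiers standing in for the signaled data structures:

```python
class PrefetchCache:
    """Pre-fetches data structures flagged as likely to be retrieved
    (e.g. via a manifest-file hint or PED message), before any user
    device requests them."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable performing the actual retrieval
        self._cache = {}

    def handle_hint(self, likely_ids):
        # retrieve hinted data structures ahead of user requests
        for sid in likely_ids:
            if sid not in self._cache:
                self._cache[sid] = self._fetch(sid)

    def request(self, sid):
        # a later user request is served from cache when pre-fetched
        if sid not in self._cache:
            self._cache[sid] = self._fetch(sid)
        return self._cache[sid]
```

A hinted structure is thus fetched once, ahead of demand, and subsequent user requests incur no additional retrieval.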
Abstract:
In general, the disclosure relates to techniques for regional random access within a picture of video data. For example, a video coding device receives a plurality of pictures in a coding order. Each respective picture of the plurality of pictures comprises a plurality of regions. For a first region in a first picture of the plurality of pictures, the video coding device determines that the first region is codable independently of each other region of the first picture and of a first region in a second picture preceding the first picture in the coding order and, responsive to making such a determination, determines that the first region in the first picture has random accessibility. The video coding device codes each video block in the first region independently of any video blocks outside of the first region.
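The random-accessibility determination can be sketched with an explicit dependency map (a representation assumed for illustration, not part of the described techniques):

```python
def region_is_random_access(deps, pic, region=0):
    """`deps` maps a (picture, region) pair to the set of (picture,
    region) pairs its blocks reference. The first region of `pic` has
    random accessibility when it depends on no other region of the same
    picture and not on the first region of the preceding picture."""
    refs = deps.get((pic, region), set())
    depends_same_picture = any(p == pic and r != region for (p, r) in refs)
    depends_prev_first = (pic - 1, region) in refs
    return not depends_same_picture and not depends_prev_first
```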
Abstract:
An example method includes processing a file that includes fisheye video data, the file containing a syntax structure with a plurality of syntax elements that specify attributes of the fisheye video data. The plurality of syntax elements includes a first syntax element that explicitly indicates whether the fisheye video data is monoscopic or stereoscopic, and one or more syntax elements that implicitly indicate whether the fisheye video data is monoscopic or stereoscopic. The method further includes determining, based on the first syntax element, whether the fisheye video data is monoscopic or stereoscopic, and rendering, based on the determination, the fisheye video data as monoscopic or stereoscopic.
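A sketch of the mono/stereo determination, with illustrative field names for the explicit flag and for an implicit indication (here, assumed to be the number of circular images carried in the syntax structure):

```python
def fisheye_is_stereoscopic(syntax):
    """Field names are illustrative, not the actual file-format syntax.
    The explicit element is authoritative when present; otherwise an
    implicit indication is used (the assumption here: stereoscopic
    content carries two or more circular images)."""
    if syntax.get("explicit_stereo_flag") is not None:
        return bool(syntax["explicit_stereo_flag"])
    return syntax["num_circular_images"] >= 2
```

Note that the explicit element takes precedence even when the implicit indication would suggest otherwise.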
Abstract:
A video coder may reconstruct a current picture of video data. A current region of the current picture is associated with a temporal index indicating a temporal layer to which the current region belongs. Furthermore, for each respective array of a plurality of arrays that correspond to different temporal layers, the video coder may store, in the respective array, sets of adaptive loop filtering (ALF) parameters used in applying ALF filters to samples of regions of pictures of the video data that are decoded prior to the current region and that are in the temporal layer corresponding to the respective array or a lower temporal layer than the temporal layer corresponding to the respective array. The video coder determines, based on a selected set of ALF parameters in the array corresponding to the temporal layer to which the current region belongs, an applicable set of ALF parameters.
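The per-temporal-layer storage rule can be sketched as follows; the array capacity and first-in-first-out eviction are assumptions of the sketch, not part of the described techniques:

```python
class AlfParameterStore:
    """One array of ALF parameter sets per temporal layer. Parameters
    used at temporal layer t are stored into the array for layer t and
    for every higher layer, so the array for a given layer holds the
    parameters of that layer and of all lower layers."""

    def __init__(self, num_layers, capacity=8):
        self.arrays = [[] for _ in range(num_layers)]
        self.capacity = capacity

    def store(self, temporal_id, params):
        for t in range(temporal_id, len(self.arrays)):
            arr = self.arrays[t]
            arr.append(params)
            if len(arr) > self.capacity:
                arr.pop(0)  # drop the oldest set (FIFO, an assumption)

    def candidates(self, temporal_id):
        # a current region at this layer selects only from its own array
        return list(self.arrays[temporal_id])
```

A region in a given temporal layer thus never selects ALF parameters that originated in a higher layer, which preserves temporal scalability when upper layers are dropped.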
Abstract:
In one example, a device for retrieving media data includes one or more processors implemented in circuitry and configured to parse system level information of a media bitstream encapsulating a video elementary stream, the system level information indicating that the video elementary stream includes one or more supplemental enhancement information (SEI) messages and payload types for each of the SEI messages, extract the one or more SEI messages and the payload types from the system level information, and send the one or more SEI messages and the payload types to one or more other processing units of the device.
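A minimal sketch of the extraction step, with illustrative field names in place of a real system-level descriptor syntax:

```python
def extract_sei_info(system_info):
    """Pull the SEI messages and their payload types out of parsed
    system-level information, in a form that can be handed to other
    processing units. Field names are illustrative."""
    return [(entry["payload_type"], entry.get("payload"))
            for entry in system_info.get("sei_messages", [])]
```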