Abstract:
A device for processing video data can be configured to receive, in a video parameter set, one or more syntax elements that include information related to session negotiation; receive, in the video data, a first sequence parameter set comprising a first syntax element identifying the video parameter set; receive, in the video data, a second sequence parameter set comprising a second syntax element identifying the video parameter set; and process, based on the one or more syntax elements, a first set of video blocks associated with the first sequence parameter set and a second set of video blocks associated with the second sequence parameter set.
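As an illustrative sketch only (not the claimed implementation), the following Python models the parsing flow described above; the VPS/SPS classes, the vps_id and sps_id fields, and process_blocks are assumptions for illustration, not actual HEVC syntax element names.

class VPS:
    """Video parameter set carrying session-negotiation information."""
    def __init__(self, vps_id, negotiation_info):
        self.vps_id = vps_id
        self.negotiation_info = negotiation_info

class SPS:
    """Sequence parameter set with a syntax element identifying its VPS."""
    def __init__(self, sps_id, vps_id):
        self.sps_id = sps_id
        self.vps_id = vps_id

def process_blocks(blocks, vps):
    # Process blocks using the information shared through the VPS.
    return [f"decoded({b} | {vps.negotiation_info})" for b in blocks]

vps_table = {0: VPS(vps_id=0, negotiation_info={"profile": "Main", "level": 4.0})}
sps1 = SPS(sps_id=0, vps_id=0)   # first SPS identifying the VPS
sps2 = SPS(sps_id=1, vps_id=0)   # second SPS identifying the same VPS

# Two sets of video blocks, each associated with one of the SPSs, are
# processed against the single VPS that both SPSs identify.
for sps, blocks in [(sps1, ["b0", "b1"]), (sps2, ["b2"])]:
    print(process_blocks(blocks, vps_table[sps.vps_id]))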
Abstract:
In general, techniques are described for separately coding depth and texture components of video data. A video coding device configured to process video data including a view component comprised of a depth component and a texture component may perform the techniques. The video coding device may comprise a decoded picture buffer and a processor configured to store the depth component in the decoded picture buffer, analyze a view dependency to determine whether the depth component is used for inter-view prediction, and remove the depth component from the decoded picture buffer in response to determining that the depth component is not used for inter-view prediction.
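A minimal Python sketch of the buffer-management rule just described, under the simplifying assumption that a view dependency is a map from each view to the views it predicts from; all names are illustrative.

def used_for_inter_view_prediction(depth, view_dependencies):
    # A depth component is still needed if any view lists its
    # view as an inter-view prediction reference.
    return any(depth["view_id"] in refs for refs in view_dependencies.values())

def prune_dpb(dpb, view_dependencies):
    # Remove depth components not used for inter-view prediction.
    return [d for d in dpb if used_for_inter_view_prediction(d, view_dependencies)]

dpb = [{"view_id": 0}, {"view_id": 1}]
view_dependencies = {2: [0]}   # view 2 predicts from view 0 only
print(prune_dpb(dpb, view_dependencies))  # view 1's depth is removed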
Abstract:
As one example, techniques for decoding video data include receiving a bitstream that includes one or more pictures of a coded video sequence (CVS), decoding a first picture according to a decoding order, wherein the first picture is a random access point (RAP) picture that is not an instantaneous decoding refresh (IDR) picture, and decoding at least one other picture following the first picture according to the decoding order based on the decoded first picture. As another example, techniques for encoding video data include generating a bitstream that includes one or more pictures of a CVS, wherein a first picture according to the decoding order is a RAP picture that is not an IDR picture, and avoiding including, in the bitstream, at least one other picture that corresponds to a leading picture associated with the first picture.
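The encoder-side behavior can be sketched in Python as a filter over pictures in decoding order. The picture records below are illustrative assumptions; a CRA picture is one kind of non-IDR RAP picture in HEVC, and a leading picture follows the RAP picture in decoding order but precedes it in output order.

def build_bitstream(pictures, rap_output_cnt):
    out = []
    for pic in pictures:  # pictures arrive in decoding order
        is_leading = (pic["decode_idx"] > 0 and
                      pic["output_cnt"] < rap_output_cnt)
        if not is_leading:   # omit leading pictures of the first picture
            out.append(pic)
    return out

pics = [
    {"decode_idx": 0, "output_cnt": 4, "type": "CRA"},  # non-IDR RAP, kept
    {"decode_idx": 1, "output_cnt": 2, "type": "B"},    # leading, dropped
    {"decode_idx": 2, "output_cnt": 5, "type": "P"},    # trailing, kept
]
print([p["type"] for p in build_bitstream(pics, rap_output_cnt=4)])
# -> ['CRA', 'P']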
Abstract:
A video encoder generates a first network abstraction layer (NAL) unit. The first NAL unit contains a first fragment of a parameter set associated with video data. The video encoder also generates a second NAL unit. The second NAL unit contains a second fragment of the parameter set. A video decoder may receive a bitstream that includes the first and second NAL units. The video decoder decodes, based at least in part on the parameter set, one or more coded pictures of the video data.
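A minimal Python sketch of fragmenting one parameter set across two NAL units and reassembling it before decoding; the (type, fragment_index) header representation is an illustrative assumption.

def fragment_parameter_set(payload: bytes, n: int):
    # Split the parameter set payload into n NAL-unit fragments.
    size = -(-len(payload) // n)  # ceiling division
    return [(("PS_FRAG", i), payload[i * size:(i + 1) * size])
            for i in range(n)]

def reassemble(nal_units):
    # Restore the parameter set from its fragments before decoding.
    frags = sorted(nal_units, key=lambda u: u[0][1])
    return b"".join(body for _, body in frags)

ps = b"\x42\x01\x01\x60\x00\x00\x03\x00"
nals = fragment_parameter_set(ps, 2)   # first and second NAL units
assert reassemble(nals) == ps
print(nals)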
Abstract:
Techniques described herein for coding video data include techniques for coding pictures partitioned into tiles, in which each of the plurality of tiles in a picture is assigned to one of a plurality of tile groups. One example method for coding video data comprising a picture that is partitioned into a plurality of tiles comprises coding video data in a bitstream, and coding, in the bitstream, information that indicates one of a plurality of tile groups to which each of the plurality of tiles is assigned. The techniques for grouping tiles described herein may facilitate improved parallel processing for both encoding and decoding of video bitstreams, improved error resilience, and more flexible region of interest (ROI) coding.
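A minimal Python sketch of how a signaled tile-to-tile-group assignment could feed parallel decoding; the assignment table and the worker function are illustrative assumptions.

from concurrent.futures import ThreadPoolExecutor

# tile -> tile group, as signaled in the bitstream (illustrative values)
tile_group_of = {0: 0, 1: 0, 2: 1, 3: 1}

def decode_group(group_id, tiles):
    return f"group {group_id}: decoded tiles {tiles}"

groups = {}
for tile, group in tile_group_of.items():
    groups.setdefault(group, []).append(tile)

# Each tile group can be handed to an independent worker, enabling the
# parallel processing described above.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda kv: decode_group(*kv), groups.items()))
print(results)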
Abstract:
Techniques are described related to output and removal of decoded pictures from a decoded picture buffer (DPB). The example techniques may remove a decoded picture from the DPB prior to coding a current picture. For instance, the example techniques may remove the decoded picture if that decoded picture is not identified in the reference picture set of the current picture.
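A minimal Python sketch of the removal rule, assuming (as an illustrative simplification) that the DPB is keyed by picture order count and the reference picture set is a set of those counts.

def update_dpb(dpb, reference_picture_set):
    # Prior to coding the current picture, remove decoded pictures
    # not identified in the current picture's reference picture set.
    return {poc: pic for poc, pic in dpb.items()
            if poc in reference_picture_set}

dpb = {0: "pic0", 4: "pic4", 8: "pic8"}
rps_of_current = {4, 8}
print(update_dpb(dpb, rps_of_current))  # pic0 is removed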
Abstract:
A block-request streaming system provides improvements in user experience and bandwidth efficiency, typically using an ingestion system that generates data in a form to be served by a conventional file server (HTTP, FTP, or the like), wherein the ingestion system intakes content and prepares it as files or data elements to be served by the file server. The system might include controlling the sequence, timing, and construction of block requests; time-based indexing; variable block sizing; optimal block partitioning; control of random access point placement, including across multiple presentation versions; dynamic updating of presentation data; and/or efficient presentation of live content and time shifting.
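One small piece of such a system, time-based indexing, can be sketched in Python as follows; the BlockIndex class and segment names are illustrative assumptions, not the system's actual format.

import bisect

class BlockIndex:
    def __init__(self, entries):
        # entries: (start_time_in_seconds, url) pairs in ascending order,
        # as an ingestion system might emit alongside the media files
        self.times = [t for t, _ in entries]
        self.urls = [u for _, u in entries]

    def block_for(self, seek_time):
        # Find the last block starting at or before the seek time.
        i = bisect.bisect_right(self.times, seek_time) - 1
        return self.urls[max(i, 0)]

index = BlockIndex([(0.0, "seg0.bin"), (4.0, "seg1.bin"), (8.0, "seg2.bin")])
print(index.block_for(5.3))  # -> seg1.bin, requested from the file server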
Abstract:
A video encoder is configured to determine a picture size for one or more pictures included in a video sequence. The picture size associated with the video sequence may be a multiple of an aligned coding unit size for the video sequence. In one example, the aligned coding unit size for the video sequence may comprise a minimum coding unit size selected from a plurality of smallest coding unit sizes corresponding to different pictures in the video sequence. A video decoder is configured to obtain syntax elements to determine the picture size and the aligned coding unit size for the video sequence. The video decoder decodes the pictures included in the video sequence with the picture size, and stores the decoded pictures in a decoded picture buffer.
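A minimal Python sketch of one plausible reading of the size derivation: the aligned coding unit size is taken as the minimum over per-picture smallest coding unit sizes, and each picture dimension is rounded up to a multiple of it. Names and values are illustrative.

def aligned_cu_size(smallest_cu_sizes):
    # One plausible selection rule: the minimum of the smallest coding
    # unit sizes across the pictures of the sequence.
    return min(smallest_cu_sizes)

def padded_dimension(dim, cu):
    # Round a picture dimension up to the next multiple of the
    # aligned coding unit size.
    return ((dim + cu - 1) // cu) * cu

cu = aligned_cu_size([8, 16, 8])   # -> 8
print(padded_dimension(1918, cu))  # -> 1920
print(padded_dimension(1080, cu))  # -> 1080 (already aligned)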
Abstract:
Techniques and systems are provided for determining features for one or more objects in one or more video frames. For example, an image of an object, such as a face, can be received, and features of the object in the image can be identified. A size of the object can be determined based on the image, for example based on inter-eye distance of a face. Based on the size, either a high-resolution set of features or a low-resolution set of features is selected to compare to the features of the object. The object can be identified by matching the features of the object to matching features from the selected set of features.
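A minimal Python sketch of the selection-then-match flow; the inter-eye threshold, the toy 2-D features, and the nearest-neighbor matcher are illustrative assumptions.

def select_feature_set(inter_eye_px, hi_res_set, lo_res_set, threshold=40):
    # Larger faces carry enough detail to match against the
    # high-resolution set; smaller faces use the low-resolution set.
    return hi_res_set if inter_eye_px >= threshold else lo_res_set

def identify(query_feat, gallery):
    # Nearest neighbor by squared distance over toy 2-D features.
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(gallery, key=lambda item: dist2(query_feat, item["feat"]))

hi = [{"id": "alice", "feat": (0.9, 0.1)}, {"id": "bob", "feat": (0.2, 0.8)}]
lo = [{"id": "alice", "feat": (0.8, 0.2)}, {"id": "bob", "feat": (0.3, 0.7)}]
gallery = select_feature_set(inter_eye_px=55, hi_res_set=hi, lo_res_set=lo)
print(identify((0.85, 0.15), gallery)["id"])  # -> alice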
Abstract:
This disclosure describes techniques for signaling and processing information indicating simplified depth coding (SDC) for depth intra-prediction and depth inter-prediction modes in a 3D video coding process, such as a process defined by the 3D-HEVC extension to HEVC. In some examples, the disclosure describes techniques for unifying the signaling of SDC for depth intra-prediction and depth inter-prediction modes in 3D video coding. The signaling of SDC can be unified so that a video encoder or video decoder uses the same syntax element for signaling SDC for both the depth intra-prediction mode and the depth inter-prediction mode. Also, in some examples, a video coder may signal and/or process a residual value generated in the SDC mode using the same syntax structure, or same type of syntax structure, for both the depth intra-prediction mode and depth inter-prediction mode.
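A minimal Python sketch of the unified signaling idea: one shared flag and one residual structure serve both depth intra- and inter-prediction, instead of separate per-mode signaling. The element names (sdc_flag, sdc_residual) are illustrative, not the actual 3D-HEVC syntax.

def write_depth_cu(bitstream, cu):
    bitstream.append(("pred_mode", cu["pred_mode"]))  # intra or inter
    bitstream.append(("sdc_flag", cu["sdc_flag"]))    # same element for both
    if cu["sdc_flag"]:
        # Same residual syntax structure regardless of prediction mode.
        bitstream.append(("sdc_residual", cu["residual"]))

bs = []
write_depth_cu(bs, {"pred_mode": "intra", "sdc_flag": 1, "residual": -3})
write_depth_cu(bs, {"pred_mode": "inter", "sdc_flag": 1, "residual": 5})
print(bs)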