Abstract:
Techniques and systems are provided for performing predictive random access using a background picture. For example, a method of decoding video data includes obtaining an encoded video bitstream comprising a plurality of pictures. The plurality of pictures include a plurality of predictive random access pictures. A predictive random access picture is at least partially encoded using inter-prediction based on at least one background picture. The method further includes determining, for a time instance of the video bitstream, a predictive random access picture of the plurality of predictive random access pictures with a time stamp closest in time to the time instance. The method further includes determining a background picture associated with the predictive random access picture, and decoding at least a portion of the predictive random access picture using inter-prediction based on the background picture.
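The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: the `PraPicture` record, its `timestamp` and `background_id` fields, and the function name `closest_pra_picture` are hypothetical, and time stamps are assumed to be plain numbers in a common unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PraPicture:
    """Hypothetical record for a predictive random access picture."""
    timestamp: float     # time stamp in an arbitrary common unit (assumption)
    background_id: int   # id of the associated background picture (assumption)


def closest_pra_picture(pictures, time_instance):
    """Return the picture whose time stamp is closest to time_instance.

    On a tie, the picture that appears first in `pictures` is returned;
    the abstract does not say how ties are resolved.
    """
    if not pictures:
        raise ValueError("no predictive random access pictures available")
    return min(pictures, key=lambda p: abs(p.timestamp - time_instance))


pictures = [PraPicture(0.0, 1), PraPicture(2.0, 1), PraPicture(4.0, 2)]
chosen = closest_pra_picture(pictures, 2.9)
# A decoder would next fetch the background picture identified by
# chosen.background_id and use it as the inter-prediction reference.
```

A linear scan is enough for illustration; a real player would more likely keep the time stamps sorted and use a binary search.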
Abstract:
A video coder can be configured to perform texture first coding for a first texture view, a first depth view, a second texture view, and a second depth view; for a macroblock of the second texture view, locate a depth block of the first depth view that corresponds to the macroblock; based on at least one depth value of the depth block, derive a disparity vector for the macroblock; code a first sub-block of the macroblock based on the derived disparity vector; and code a second sub-block of the macroblock based on the derived disparity vector.
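One way to picture the derivation is the sketch below. It rests on assumptions the abstract does not state: 8-bit depth samples that quantize inverse depth between `z_near` and `z_far`, a purely horizontal camera arrangement, and the use of the maximum of the four corner samples as the representative depth value. All function and parameter names are hypothetical.

```python
def depth_to_disparity(depth_sample, focal_length, baseline, z_near, z_far):
    """Convert an 8-bit depth sample to a horizontal disparity.

    Assumes the sample linearly quantizes inverse depth (1/Z) between
    1/z_far (sample 0) and 1/z_near (sample 255).
    """
    inv_z = depth_sample / 255.0 * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return focal_length * baseline * inv_z


def macroblock_disparity_vector(depth_block, **camera):
    """Derive one (horizontal, vertical) disparity vector for a macroblock.

    The four-corner maximum is an assumption; the abstract only says the
    vector is based on at least one depth value of the depth block.
    """
    corners = (depth_block[0][0], depth_block[0][-1],
               depth_block[-1][0], depth_block[-1][-1])
    return (depth_to_disparity(max(corners), **camera), 0.0)


camera = dict(focal_length=1000.0, baseline=0.1, z_near=10.0, z_far=100.0)
depth_block = [[0, 10], [20, 255]]
dv = macroblock_disparity_vector(depth_block, **camera)
# The same dv would then be used for every sub-block of the macroblock.
```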
Abstract:
A video coder can be configured to code a random access point (RAP) picture and code one or more decodable leading pictures (DLPs) for the RAP picture such that all pictures that are targeted for discard precede the DLPs associated with the RAP picture in display order.
Abstract:
Techniques for encapsulating video streams containing multiple coded views in a media file are described herein. In one example, a method includes parsing a track of video data, wherein the track includes one or more views. The method further includes parsing information to determine whether a texture view or a depth view of a reference view is required for decoding at least one of the one or more views in the track. Another example method includes composing a track of video data, wherein the track includes one or more views and composing information that indicates whether a texture view or a depth view of a reference view is required for decoding at least one of the one or more views in the track.
Abstract:
As one example, a method of coding video data includes storing one or more decoding units of video data in a picture buffer. The method further includes obtaining a respective buffer removal time for the one or more decoding units, wherein obtaining the respective buffer removal time comprises receiving a respective signaled value indicative of the respective buffer removal time for at least one of the decoding units. The method further includes removing the decoding units from the picture buffer in accordance with the obtained buffer removal time for each of the decoding units. The method further includes coding video data corresponding to the removed decoding units, wherein coding the video data comprises decoding the at least one of the decoding units.
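The buffer behavior described above can be illustrated with a small sketch. It assumes each decoding unit arrives with an identifier and a signaled removal time expressed as a plain number; the function name `removal_order` and the tuple layout are hypothetical.

```python
import heapq


def removal_order(decoding_units):
    """Yield (removal_time, du_id) pairs in increasing removal time.

    decoding_units is an iterable of (du_id, removal_time) pairs, where
    removal_time stands in for the signaled buffer removal time.
    """
    heap = [(removal_time, du_id) for du_id, removal_time in decoding_units]
    heapq.heapify(heap)
    while heap:
        # Pop the unit with the earliest removal time; decoding of that
        # unit would happen at this point.
        yield heapq.heappop(heap)


stored = [("du0", 0.03), ("du1", 0.01), ("du2", 0.02)]
schedule = list(removal_order(stored))
```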
Abstract:
In one example, a video coder is configured to: code a value for a syntax element indicating whether at least a portion of a picture order count (POC) value of a picture is to be reset to zero; when the value for the syntax element indicates that the portion of the POC value is to be reset, reset at least that portion of the POC value so that it is equal to zero; and code video data using the reset POC value. Coding video data using the reset POC value may include inter-predicting a block of a subsequent picture relative to the picture, where the block may include a motion parameter that identifies the picture using the reset POC value. The block may be coded using temporal inter-prediction or inter-layer prediction.
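A partial POC reset can be sketched as below. The split of the POC value into most significant and least significant bits, the two boolean flags, and the name `reset_poc` are assumptions made for illustration; the abstract only states that a syntax element indicates whether at least a portion of the POC value is reset to zero. The sketch also assumes a non-negative POC value.

```python
def reset_poc(poc, lsb_bits, reset_msb, reset_lsb):
    """Return poc with the selected portion(s) set to zero.

    poc       -- non-negative picture order count value (assumption)
    lsb_bits  -- number of bits treated as the LSB portion (assumption)
    reset_msb -- hypothetical flag: zero the bits above the LSB portion
    reset_lsb -- hypothetical flag: zero the LSB portion
    """
    if poc < 0:
        raise ValueError("this sketch only handles non-negative POC values")
    lsb_mask = (1 << lsb_bits) - 1
    msb = poc & ~lsb_mask
    lsb = poc & lsb_mask
    if reset_msb:
        msb = 0
    if reset_lsb:
        lsb = 0
    return msb | lsb


msb_only = reset_poc(0x123, 4, reset_msb=True, reset_lsb=False)
full = reset_poc(0x123, 4, reset_msb=True, reset_lsb=True)
```

Subsequent pictures that reference this picture would identify it by the returned value rather than by the original POC.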
Abstract:
In one example, a device for decoding video data includes a video decoder configured to decode one or more syntax elements of a current reference picture set (RPS) prediction data structure, wherein at least one of the syntax elements represents a picture order count (POC) difference between a POC value associated with the current RPS and a POC value associated with a previously decoded RPS, form a current RPS based at least in part on the RPS prediction data structure and the previously decoded RPS, and decode one or more pictures using the current RPS. A video encoder may be configured to perform a substantially similar process during video encoding.
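A simplified sketch of forming a current RPS from a previously decoded one follows. It is an illustration under assumptions, not the normative process: an RPS is modeled as a list of POC deltas relative to its own picture, the signaled POC difference is taken as the POC of the picture associated with the previous RPS minus the POC of the current picture, and one hypothetical keep flag is signaled per candidate entry.

```python
def predict_rps(prev_deltas, poc_difference, keep_flags):
    """Form the current RPS from a previously decoded RPS.

    prev_deltas    -- POC deltas of the previous RPS, relative to its picture
    poc_difference -- POC of the previous RPS's picture minus the POC of
                      the current picture (assumed sign convention)
    keep_flags     -- one flag per candidate; the final flag covers the
                      picture associated with the previous RPS itself
    """
    # Re-express each previous reference relative to the current picture,
    # then add the previous picture itself as a further candidate.
    candidates = [d + poc_difference for d in prev_deltas] + [poc_difference]
    if len(keep_flags) != len(candidates):
        raise ValueError("expected one keep flag per candidate entry")
    # A delta of zero would point at the current picture, so it is dropped.
    return sorted(d for d, keep in zip(candidates, keep_flags) if keep and d != 0)


current_rps = predict_rps([-1, -2, -4], -1, [True, False, True, True])
```

Signaling only the difference and the flags avoids re-sending every POC delta when consecutive pictures share most of their references.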
Abstract:
Techniques are described for determining a disparity vector for a current block to be predicted, based on disparity motion vectors of one or more regions that spatially or temporally neighbor the current block. Each spatially or temporally neighboring region includes one or more blocks, and the disparity motion vector represents a single vector in one reference picture list for the plurality of blocks within that region. The determined disparity vector may be used by coding tools that utilize information between different views, such as merge mode, advanced motion vector prediction (AMVP) mode, inter-view motion prediction, and inter-view residual prediction.
Abstract:
A device obtains, from a bitstream that includes an encoded representation of the video data, a non-nested Supplemental Enhancement Information (SEI) message that is not nested within another SEI message in the bitstream. Furthermore, the device determines a layer of the bitstream to which the non-nested SEI message is applicable. The non-nested SEI message is applicable to layers for which video coding layer (VCL) network abstraction layer (NAL) units of the bitstream have layer identifiers equal to a layer identifier of an SEI NAL unit that encapsulates the non-nested SEI message. A temporal identifier of the SEI NAL unit is equal to a temporal identifier of an access unit containing the SEI NAL unit. Furthermore, the device processes, based in part on one or more syntax elements in the non-nested SEI message, video data of the layer of the bitstream to which the non-nested SEI message is applicable.
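The applicability rule can be sketched as a simple filter on layer identifiers. The `NalUnit` record and its field names are hypothetical stand-ins for parsed NAL unit headers; only the matching rule itself comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NalUnit:
    """Hypothetical view of a parsed NAL unit header."""
    layer_id: int
    temporal_id: int
    is_vcl: bool


def applicable_vcl_units(sei_nal, nal_units):
    """Return the VCL NAL units a non-nested SEI message applies to.

    A VCL NAL unit qualifies when its layer identifier equals that of
    the SEI NAL unit encapsulating the message.
    """
    return [n for n in nal_units if n.is_vcl and n.layer_id == sei_nal.layer_id]


sei = NalUnit(layer_id=1, temporal_id=0, is_vcl=False)
units = [
    NalUnit(layer_id=0, temporal_id=0, is_vcl=True),
    NalUnit(layer_id=1, temporal_id=0, is_vcl=True),
    sei,
]
targets = applicable_vcl_units(sei, units)
```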
Abstract:
In an example, a method of processing video data includes determining a candidate motion vector for deriving motion information of a current block of video data, where the motion information indicates motion of the current block relative to reference video data. The method also includes determining a derived motion vector for the current block based on the determined candidate motion vector, where determining the derived motion vector comprises performing a motion search for a first set of reference data that corresponds to a second set of reference data outside of the current block.
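One possible reading of this search is template matching, sketched below with plain Python lists. Here the second set of reference data is taken to be a template of already reconstructed samples next to (and outside) the current block, and the first set is the best matching region in a reference picture. The sum of absolute differences cost, the search range, and all names are assumptions for illustration.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized 2D lists."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def crop(frame, top, left, height, width):
    """Extract a height x width region of frame starting at (top, left)."""
    return [row[left:left + width] for row in frame[top:top + height]]


def derive_motion_vector(template, reference, top, left, candidate, search_range=1):
    """Refine candidate (dy, dx) by matching the template in reference.

    template  -- samples neighboring the current block, located at
                 (top, left) in the current picture
    candidate -- starting motion vector as (dy, dx)
    Returns the best (dy, dx), or None if no position fits in the picture.
    """
    h, w = len(template), len(template[0])
    best_mv, best_cost = None, None
    for dy in range(candidate[0] - search_range, candidate[0] + search_range + 1):
        for dx in range(candidate[1] - search_range, candidate[1] + search_range + 1):
            y, x = top + dy, left + dx
            # Skip positions that fall outside the reference picture.
            if y < 0 or x < 0 or y + h > len(reference) or x + w > len(reference[0]):
                continue
            cost = sad(template, crop(reference, y, x, h, w))
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv


reference = [[r * 10 + c for c in range(6)] for r in range(6)]
template = crop(reference, 2, 3, 1, 2)   # content displaced by (1, 2) from (1, 1)
derived = derive_motion_vector(template, reference, top=1, left=1, candidate=(1, 1))
```

Because the search uses only data available at the decoder, the derived vector does not have to be signaled in the bitstream.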