Abstract:
Systems, methods, and devices for coding multilayer video data are disclosed that may include encoding, decoding, transmitting, or receiving multilayer video data. The systems, methods, and devices may transmit or receive a video parameter set (VPS) including information for a series of layers, each layer including visual signal information. The systems, methods, and devices may code (encode or decode) video data based on the visual signal information signaled per layer in the VPS.
Abstract:
Techniques and systems are provided for coding video data. For example, a method of coding video data includes determining one or more illumination compensation parameters for a current block and coding the current block as part of an encoded bitstream using the one or more illumination compensation parameters. In some cases, the method can include determining one or more spatially neighboring samples for the current block and deriving the one or more illumination compensation parameters for the current block based on at least one of the one or more spatially neighboring samples. The method can further include signaling, individually, for the current block, an illumination compensation status in the encoded bitstream. The method can further include signaling at least one of the one or more illumination compensation parameters for the current block in the encoded bitstream.
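The abstract above does not fix a particular derivation, but one common way to obtain a scale and offset from spatially neighboring samples is a least-squares fit of a linear model between the neighbors of the current block and the neighbors of its reference block. The sketch below assumes that model and a floating-point fit; the function names and the fallback behavior are illustrative, not taken from the source.

```python
def derive_ic_params(neigh_cur, neigh_ref):
    """Least-squares fit of a linear model cur ~ a * ref + b.

    neigh_cur / neigh_ref are the reconstructed spatially neighboring
    samples of the current block and of its reference block.
    (Illustrative derivation; the abstract does not specify one.)
    """
    n = len(neigh_cur)
    sum_x = sum(neigh_ref)
    sum_y = sum(neigh_cur)
    sum_xx = sum(x * x for x in neigh_ref)
    sum_xy = sum(x * y for x, y in zip(neigh_ref, neigh_cur))
    denom = n * sum_xx - sum_x * sum_x
    if denom == 0:  # flat neighborhood: fall back to an offset-only model
        return 1.0, (sum_y - sum_x) / n
    a = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - a * sum_x) / n
    return a, b


def apply_ic(pred_block, a, b):
    # Compensate each prediction sample with the derived linear model.
    return [a * p + b for p in pred_block]
```

For neighbors that differ by an exact linear relation, the fit recovers the scale and offset exactly; a real codec would use fixed-point arithmetic and clipping instead of floats.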
Abstract:
An example method of entropy coding video data includes determining a window size of a plurality of window sizes for a context of a plurality of contexts used in a context-adaptive coding process to entropy code a value for a syntax element of the video data; entropy coding, based on a probability state of the context, a bin of the value for the syntax element; and updating the probability state of the context based on the window size and the coded bin. The example method also includes entropy coding a next bin with the same context based on the updated probability state of the context.
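A per-context window size is commonly realized as the adaptation rate of a recursive-average probability update: with window size 2**w, each coded bin moves the probability state 1/2**w of the way toward that bin's value, so larger windows adapt more slowly. The 15-bit precision and the exact update arithmetic below are assumptions for illustration.

```python
PROB_BITS = 15  # probability state precision (assumed for this sketch)


def update_state(prob, bin_val, log2_window):
    """One-bin update of a context's probability state.

    The adaptation rate is 1 / 2**log2_window: a larger window size
    adapts the estimate more slowly.  This recursive-average form is a
    common realization of the window-size idea, not the claimed method.
    """
    target = bin_val << PROB_BITS
    return prob + ((target - prob) >> log2_window)
```

Starting from the mid-point state, coding a 1-bin with a small window moves the state further toward "likely 1" than the same bin coded with a large window, which is exactly the trade-off between fast tracking and noise resistance that selectable window sizes expose.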
Abstract:
An example method of entropy coding video data includes obtaining a pre-defined initialization value for a context of a plurality of contexts used in a context-adaptive entropy coding process to entropy code a value for a syntax element in a slice of the video data, wherein the pre-defined initialization value is stored with N-bit precision; determining, using a look-up table and based on the pre-defined initialization value, an initial probability state of the context for the slice of the video data, wherein a number of possible probability states for the context is greater than two raised to the power of N; and entropy coding, based on the initial probability state of the context, a bin of the value for the syntax element.
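The key point above is that the stored initialization values use fewer bits than the probability-state space they index into (more than 2**N states), so a look-up table bridges the two precisions. The table below is a hypothetical mapping (bit replication from N = 8 bits up to 15) chosen only to make the precision gap concrete; the abstract requires a table but does not specify this one.

```python
N = 8          # precision of the stored initialization values (assumed)
PROB_BITS = 15  # state precision; 2**PROB_BITS states > 2**N init values

# Hypothetical look-up table: expand each N-bit initialization value to
# a PROB_BITS-bit probability state by bit replication.
INIT_LUT = [(v << (PROB_BITS - N)) | (v >> (2 * N - PROB_BITS))
            for v in range(1 << N)]


def initial_state(init_value):
    """Map a pre-defined N-bit initialization value to an initial
    probability state for the context at the start of a slice."""
    return INIT_LUT[init_value & ((1 << N) - 1)]
```

The table maps the 256 storable values onto distinct states spread across the full 32768-state range, so low-precision storage still addresses the high-precision state space.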
Abstract:
In general, techniques are described for separately coding depth and texture components of video data. A video coding device for coding video data that includes a view component comprising a depth component and a texture component may perform the techniques. The video coding device may comprise, as one example, a processor configured to activate a parameter set as a texture parameter set for the texture component of the view component, and code the texture component of the view component based on the activated texture parameter set.
Abstract:
In general, techniques are described for coding picture order count values identifying long-term reference pictures. A video decoding device comprising a processor may perform the techniques. The processor may be configured to determine a number of bits used to represent least significant bits of the picture order count value that identifies a long-term reference picture to be used when decoding at least a portion of a current picture, and parse the determined number of bits from a bitstream representative of the encoded video data. The parsed bits represent the least significant bits of the picture order count value. The processor may then retrieve the long-term reference picture from a decoded picture buffer based on the least significant bits, and decode at least the portion of the current picture using the retrieved long-term reference picture.
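The retrieval step above amounts to matching the parsed least significant bits against the POC values of the long-term pictures held in the decoded picture buffer. A minimal sketch, assuming the DPB is exposed as (POC, picture) pairs and that the match is a simple masked comparison:

```python
def retrieve_ltrp(dpb, poc_lsb, num_bits):
    """Find the long-term reference picture in the decoded picture
    buffer whose POC least significant bits match the parsed bits.

    dpb: iterable of (poc, picture) pairs marked as long-term
         (an assumed interface for this sketch).
    num_bits: the determined number of bits representing the LSBs.
    """
    mask = (1 << num_bits) - 1
    for poc, picture in dpb:
        if poc & mask == poc_lsb & mask:
            return picture
    return None  # no long-term picture with matching LSBs
```

Signaling only the LSBs keeps the bitstream compact; it is sufficient as long as the masked bits uniquely identify one long-term picture in the buffer.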
Abstract:
A prediction unit (PU) of a coding unit (CU) is split into two or more sub-PUs including a first sub-PU and a second sub-PU. A first motion vector of a first type is obtained for the first sub-PU and a second motion vector of the first type is obtained for the second sub-PU. A third motion vector of a second type is obtained for the first sub-PU and a fourth motion vector of the second type is obtained for the second sub-PU, such that the second type is different from the first type. A first portion of the CU corresponding to the first sub-PU is coded according to advanced residual prediction (ARP) using the first and third motion vectors. A second portion of the CU corresponding to the second sub-PU is coded according to ARP using the second and fourth motion vectors.
Abstract:
Techniques are described for encoding and decoding depth data for three-dimensional (3D) video data represented in a multiview plus depth format using depth coding modes that are different than high-efficiency video coding (HEVC) coding modes. Examples of additional depth intra coding modes available in a 3D-HEVC process include at least two of a Depth Modeling Mode (DMM), a Simplified Depth Coding (SDC) mode, and a Chain Coding Mode (CCM). In addition, an example of an additional depth inter coding mode includes an Inter SDC mode. In one example, the techniques include signaling depth intra coding modes used to code depth data for 3D video data in a depth modeling table that is separate from the HEVC syntax. In another example, the techniques of this disclosure include unifying signaling of residual information of depth data for 3D video data across two or more of the depth coding modes.
Abstract:
As one example, techniques for decoding video data include receiving a bitstream that includes one or more pictures of a coded video sequence (CVS), decoding a first picture according to a decoding order, wherein the first picture is a random access point (RAP) picture that is not an instantaneous decoding refresh (IDR) picture, and decoding at least one other picture following the first picture according to the decoding order based on the decoded first picture. As another example, techniques for encoding video data include generating a bitstream that includes one or more pictures of a CVS, wherein a first picture according to the decoding order is a RAP picture that is not an IDR picture, and refraining from including in the bitstream at least one other picture, other than the first picture, that corresponds to a leading picture associated with the first picture.
Abstract:
Techniques for encapsulating video streams containing multiple coded views in a media file are described herein. In one example, a method includes parsing a track of multiview video data, wherein the track includes one or more views, including only one of a texture view of a particular view and a depth view of the particular view. The method further includes parsing a track reference to determine a dependency of the track on a referenced track indicated in the track reference. Track reference types include ‘deps’, which indicates that the track includes the depth view and the referenced track includes the texture view; ‘tref’, which indicates that the track depends on the texture view stored in the referenced track; and ‘dref’, which indicates that the track depends on the depth view stored in the referenced track.
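The three track-reference types above form a small fixed vocabulary, so resolving what a parsed reference signals reduces to a table lookup. The helper below is purely illustrative; only the four-character codes and their meanings come from the description.

```python
# Track-reference types and the dependency each one signals, as
# described above.  The lookup helper itself is an illustrative sketch.
TRACK_REF_TYPES = {
    "deps": "track carries the depth view; referenced track carries the texture view",
    "tref": "track depends on the texture view stored in the referenced track",
    "dref": "track depends on the depth view stored in the referenced track",
}


def describe_track_reference(ref_type):
    """Return the dependency signaled by a four-character track
    reference type, or None for an unknown type."""
    return TRACK_REF_TYPES.get(ref_type)
```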