Abstract:
In general, techniques are described for coding picture order count values that identify long-term reference pictures. A video decoding device comprising a processor may perform the techniques. The processor may be configured to determine a number of bits used to represent the least significant bits of the picture order count value that identifies a long-term reference picture to be used when decoding at least a portion of a current picture, and to parse the determined number of bits from a bitstream representative of the encoded video data. The parsed bits represent the least significant bits of the picture order count value. The processor then retrieves the long-term reference picture from a decoded picture buffer based on the least significant bits and decodes at least the portion of the current picture using the retrieved long-term reference picture.
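The decoding steps above can be sketched as follows. This is a minimal illustration, not the codec's actual syntax: the `BitReader` class, the byte layout, and the LSB-matching lookup into the decoded picture buffer are all assumptions made for the sketch.

```python
class BitReader:
    """Reads bits MSB-first from a byte string (illustrative bitstream parser)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def read_bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def find_ltrp(dpb: dict, poc_lsb: int, num_bits: int):
    """Return the buffered picture whose POC least-significant bits match
    the parsed value (dpb maps full POC -> picture)."""
    mask = (1 << num_bits) - 1
    for poc, picture in dpb.items():
        if poc & mask == poc_lsb:
            return picture
    return None
```

For example, parsing 4 bits yielding the value 5 would select a buffered picture whose full picture order count ends in those four bits, such as POC 21.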
Abstract:
Techniques are described for receiving a first decoded frame of video data, wherein the first decoded frame is associated with a first resolution; determining, based on the first resolution, whether a decoded picture buffer is available to store the first decoded frame; and, in the event the decoded picture buffer is available, storing the first decoded frame in the decoded picture buffer. The techniques further include determining, based on the first resolution and a second resolution, whether the decoded picture buffer is available to store a second decoded frame of video data that is associated with the second resolution, wherein the second decoded frame is different than the first decoded frame.
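A minimal sketch of the resolution-dependent availability check, assuming a toy buffer whose capacity is counted in 4:2:0 samples; the `DecodedPictureBuffer` class and its capacity units are illustrative assumptions, not a normative buffer definition.

```python
class DecodedPictureBuffer:
    """Toy DPB whose capacity is counted in luma+chroma samples (4:2:0 assumed),
    so availability depends on the resolution of each stored frame."""

    def __init__(self, capacity_samples: int):
        self.capacity = capacity_samples
        self.frames = []  # list of (frame_id, width, height)

    @staticmethod
    def frame_samples(width: int, height: int) -> int:
        # Luma plane plus two quarter-size chroma planes (4:2:0).
        return int(width * height * 1.5)

    def used(self) -> int:
        return sum(self.frame_samples(w, h) for _, w, h in self.frames)

    def can_store(self, width: int, height: int) -> bool:
        return self.used() + self.frame_samples(width, height) <= self.capacity

    def store(self, frame_id, width: int, height: int) -> bool:
        if not self.can_store(width, height):
            return False
        self.frames.append((frame_id, width, height))
        return True
```

With a capacity of two 1080p frames, a stored 1080p frame leaves room for a second 1080p frame but not for a 4K frame, illustrating how the decision depends on both resolutions.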
Abstract:
In one implementation, an apparatus is provided for encoding or decoding video information. The apparatus comprises a memory configured to store inter-layer reference pictures associated with a current picture that is being coded, and a processor operatively coupled to the memory. In one embodiment, the processor is configured to indicate a number of inter-layer reference pictures to use to predict the current picture using inter-layer prediction, and to indicate which of the inter-layer reference pictures to use for that prediction. The processor is further configured to determine an inter-layer reference picture set associated with the current picture using both indications.
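The two indications and the derived reference picture set can be sketched as follows; the function name and list-based signaling are illustrative assumptions rather than the actual syntax elements.

```python
def build_inter_layer_rps(reference_layers, num_active, active_indices):
    """Assemble the inter-layer reference picture set for the current picture
    from a signaled count (first indication) and the signaled layer indices
    (second indication). Purely illustrative of the two-indication scheme."""
    if num_active > len(reference_layers):
        raise ValueError("more active pictures signaled than layers available")
    return [reference_layers[i] for i in active_indices[:num_active]]
```

For instance, with three candidate layer pictures, signaling a count of two and the indices 0 and 2 yields a reference picture set holding only those two pictures.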
Abstract:
Techniques for encapsulating video streams containing multiple coded views in a media file are described herein. In one example, a method includes parsing a track of multiview video data, wherein the track includes at least one depth view. The method further includes parsing information to determine a spatial resolution associated with the depth view, wherein determining the spatial resolution does not require parsing a sequence parameter set of the depth view. Another example method includes composing a track of multiview video data, wherein the track includes one or more views including at least one depth view. The example method further includes composing information to indicate a spatial resolution associated with the depth view, wherein determining the spatial resolution does not require parsing a sequence parameter set of the depth view.
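A minimal sketch of the compose/parse pair, assuming a hypothetical fixed-layout field in the track metadata that stores the depth view's width and height directly, so a reader never has to open the depth view's sequence parameter set; the byte layout is an assumption for illustration only.

```python
import struct

def compose_depth_resolution(width: int, height: int) -> bytes:
    """Write the depth view's spatial resolution into track metadata as two
    big-endian 32-bit fields (hypothetical layout)."""
    return struct.pack(">II", width, height)

def parse_depth_resolution(payload: bytes):
    """Read the spatial resolution back without touching the depth view's SPS."""
    width, height = struct.unpack(">II", payload[:8])
    return width, height
```

Round-tripping a resolution through the pair shows the reader recovering width and height from the track alone.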
Abstract:
A video decoder assembles, in a buffer model, an access unit from a plurality of elementary streams of a video data stream. The video data stream may be a transport stream or a program stream. The same buffer model is used regardless of whether the elementary streams contain Scalable High Efficiency Video Coding (SHVC), Multi-View HEVC (MV-HEVC), or 3D-HEVC bitstreams. Furthermore, the video decoder decodes the access unit.
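The shared buffer model can be sketched as a set of per-stream FIFOs from which an access unit is assembled by timestamp; the `BufferModel` class and the timestamp-matching rule are illustrative assumptions, and the same code path would serve any of the bitstream types named above.

```python
from collections import deque

class BufferModel:
    """One buffer model shared by all elementary streams: each stream feeds a
    FIFO, and an access unit gathers every unit tagged with one timestamp."""

    def __init__(self, stream_ids):
        self.buffers = {sid: deque() for sid in stream_ids}

    def push(self, stream_id, timestamp, payload):
        self.buffers[stream_id].append((timestamp, payload))

    def assemble_access_unit(self, timestamp):
        unit = []
        for sid, fifo in self.buffers.items():
            while fifo and fifo[0][0] == timestamp:
                unit.append((sid, fifo.popleft()[1]))
        return unit
```

Pushing units from a base and an enhancement stream and then assembling by timestamp groups the co-timed units into a single access unit for decoding.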
Abstract:
Techniques and systems are provided for maintaining blob trackers for one or more video frames. For example, a blob tracker can be identified for a current video frame. The blob tracker is associated with a blob detected for the current video frame, and the blob includes pixels of at least a portion of one or more objects in the current video frame. One or more characteristics of the blob tracker are determined. The one or more characteristics are based on a bounding region history of the blob tracker. A confidence value is determined for the blob tracker based on the determined one or more characteristics, and a status of the blob tracker is determined based on the determined confidence value. The status of the blob tracker indicates whether to maintain the blob tracker for the one or more video frames. For example, the determined status can include a first type of blob tracker that is output as an identified blob tracker-blob pair, a second type of blob tracker that is maintained for further analysis, or a third type of blob tracker that is removed from a plurality of blob trackers maintained for the one or more video frames.
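A minimal sketch of the confidence-based status decision, assuming a single characteristic (the fraction of frames in the bounding-region history where the tracker matched a blob) and illustrative thresholds; real systems would combine several characteristics, which are not specified here.

```python
def tracker_status(history, output_threshold=0.7, remove_threshold=0.3):
    """Classify a blob tracker from its bounding-region history.
    Confidence here is simply the matched-frame fraction (an assumption)."""
    if not history:
        return "remove"
    confidence = sum(1 for matched in history if matched) / len(history)
    if confidence >= output_threshold:
        return "output"    # first type: emitted as a tracker-blob pair
    if confidence >= remove_threshold:
        return "maintain"  # second type: kept for further analysis
    return "remove"        # third type: dropped from the maintained trackers
```

A tracker matched in 8 of 10 frames would be output, one matched in half its frames would be maintained, and one rarely matched would be removed.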
Abstract:
Techniques and systems are provided for generating a background picture. The background picture can be used for coding one or more pictures. For example, a method of generating a background picture includes generating a long-term background model for one or more pixels of a background picture. The long-term background model includes a statistical model for detecting long-term motion of the one or more pixels in a sequence of pictures. The method further includes generating a short-term background model for the one or more pixels of the background picture. The short-term background model detects short-term motion of the one or more pixels between two or more pictures. The method further includes determining a value for the one or more pixels of the background picture using the long-term background model and the short-term background model.
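The long-term/short-term combination can be sketched per pixel as follows, assuming a running-mean long-term model and a frame-difference short-term model; the blend weight and motion threshold are illustrative assumptions, not values from the described method.

```python
class BackgroundPixel:
    """Per-pixel background estimate combining a long-term statistical model
    (running mean) with a short-term frame-difference motion check."""

    def __init__(self, alpha=0.05, diff_threshold=12):
        self.alpha = alpha                  # long-term learning rate (assumed)
        self.diff_threshold = diff_threshold  # short-term motion gate (assumed)
        self.mean = None
        self.prev = None

    def update(self, value):
        if self.mean is None:
            self.mean, self.prev = float(value), float(value)
            return self.mean
        # Short-term model: is the pixel moving between consecutive pictures?
        moving = abs(value - self.prev) > self.diff_threshold
        self.prev = float(value)
        # Long-term model: absorb the sample only when no short-term motion,
        # so transient foreground does not contaminate the background value.
        if not moving:
            self.mean = (1 - self.alpha) * self.mean + self.alpha * value
        return self.mean
```

A small fluctuation nudges the background value, while a sudden large jump (a passing object) is rejected by the short-term check and leaves the background unchanged.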
Abstract:
An example method of entropy coding video data includes obtaining a pre-defined initialization value for a context of a plurality of contexts used in a context-adaptive entropy coding process to entropy code a value for a syntax element in a slice of the video data, wherein the pre-defined initialization value is stored with N-bit precision; determining, using a look-up table and based on the pre-defined initialization value, an initial probability state of the context for the slice of the video data, wherein a number of possible probability states for the context is greater than two raised to the power of N; and entropy coding, based on the initial probability state of the context, a bin of the value for the syntax element.
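The mapping from an N-bit initialization value into a probability-state space larger than 2^N can be sketched with a look-up table as follows; the table contents here are an illustrative even spread, not a real codec's derivation.

```python
N = 8                    # precision of the stored initialization values
NUM_STATES = 1 << 15     # more probability states than 2**N

# Illustrative look-up table: spread the 2**N initialization values evenly
# over the larger probability-state space (a real codec's table would be
# derived per context and is not reproduced here).
INIT_LUT = [(v * (NUM_STATES - 1)) // ((1 << N) - 1) for v in range(1 << N)]

def initial_probability_state(init_value: int) -> int:
    """Map an N-bit pre-defined initialization value to the context's
    initial probability state for the slice."""
    if not 0 <= init_value < (1 << N):
        raise ValueError("init value exceeds N-bit precision")
    return INIT_LUT[init_value]
```

The point of the table is that the stored values stay compact at N bits while the context still starts from one of far more than 2^N probability states.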
Abstract:
Automatic adaptive zoom enables computing devices that receive video streams to use a higher resolution stream when the user enables zoom, so that the quality of the output video is preserved. In some examples, a tracking video stream and a target video stream are obtained and are processed. The tracking video stream has a first resolution, and the target video stream has a second resolution that is higher than the first resolution. The tracking video stream is processed to define regions of interest for frames of the tracking video stream. The target video stream is processed to generate zoomed-in regions of frames of the target video stream. A zoomed-in region of the target video stream corresponds to a region of interest defined using the tracking video stream. The zoomed-in regions of the frames of the target video stream are then provided for display on a client device.
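Mapping a region of interest defined on the low-resolution tracking stream onto the high-resolution target stream reduces to coordinate scaling; the following is a minimal sketch whose function name and tuple layout are assumptions.

```python
def zoom_region(roi, tracking_res, target_res):
    """Scale an (x, y, w, h) region of interest from tracking-stream
    coordinates into target-stream pixel coordinates."""
    sx = target_res[0] / tracking_res[0]
    sy = target_res[1] / tracking_res[1]
    x, y, w, h = roi
    return (round(x * sx), round(y * sy), round(w * sx), round(h * sy))
```

For example, a region defined on a 640x360 tracking stream maps onto a 1920x1080 target stream by a factor of three in each dimension, so the zoomed-in crop is taken from the higher-resolution frames and output quality is preserved.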
Abstract:
Example techniques are described to determine transforms to be used during video encoding and video decoding. A video encoder and a video decoder may select transform subsets that each identify one or more candidate transforms. The video encoder and the video decoder may determine transforms from the selected transform subsets.
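The subset-then-index selection shared by the encoder and decoder can be sketched as follows; the subset contents (DST/DCT variants) and the two-step indexing are illustrative assumptions about how such a scheme could be organized.

```python
# Illustrative transform subsets: each subset names candidate transforms.
# Encoder and decoder derive the same subset, then an index selects the
# transform within it.
TRANSFORM_SUBSETS = [
    ["DST-VII", "DCT-VIII"],
    ["DST-VII", "DST-I"],
    ["DST-VII", "DCT-V"],
]

def select_transform(subset_id: int, transform_idx: int) -> str:
    """Pick a transform by first choosing a subset, then a candidate in it."""
    subset = TRANSFORM_SUBSETS[subset_id]
    return subset[transform_idx]
```

Because both sides hold the same subsets, only the small subset and candidate indices need to agree for the encoder and decoder to determine the same transform.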