Abstract:
Techniques and systems are provided for classifying objects in one or more video frames. An object tracker associated with an object in a current video frame can be selected for object classification. Object classification can be determined to be performed in a next video frame (instead of the current video frame) for the object associated with the selected tracker. An image patch to use for the object classification can be obtained from the next video frame. The image patch can be based on a first bounding region associated with the object tracker in the current video frame, can be based on a second bounding region associated with the tracker in the next video frame, or can be based on both the first and second bounding regions. The object classification can be performed for the object associated with the selected object tracker using the image patch from the next video frame.
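The abstract states that the image patch may be based on the bounding region from the current frame, the one from the next frame, or both. One way to combine two bounding regions is to take their union; the following sketch illustrates that idea. The union-based strategy, the `(x, y, w, h)` box convention, and all names are illustrative assumptions, not the claimed implementation.

```python
# Hypothetical sketch: deriving a patch region for next-frame object
# classification from the tracker's bounding regions in the current and
# next video frames. The union of the two boxes is one assumed way to
# cover the object in both frames.

def union_region(current_bbox, next_bbox):
    """Return the smallest (x, y, w, h) box covering both input boxes."""
    x1 = min(current_bbox[0], next_bbox[0])
    y1 = min(current_bbox[1], next_bbox[1])
    x2 = max(current_bbox[0] + current_bbox[2], next_bbox[0] + next_bbox[2])
    y2 = max(current_bbox[1] + current_bbox[3], next_bbox[1] + next_bbox[3])
    return (x1, y1, x2 - x1, y2 - y1)

# The resulting region would then be cropped from the next video frame
# and passed to the classifier.
patch_region = union_region((10, 10, 40, 40), (20, 15, 40, 40))
```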
Abstract:
Techniques and systems are provided for tracking objects in a sequence of video frames. For example, an object tracker maintained for the sequence of video frames is identified. An object tracked by the object tracker is detected based on an application of an object detector to at least one key frame in the sequence of video frames. The object detector can include a complex object detector. A status of the object tracker can be updated to a still status in a current video frame of the sequence of video frames. A tracker having the still status is associated with an object that is static in one or more video frames of the sequence of video frames. The object can be tracked in the current video frame using the object tracker based on the status of the object tracker being updated to the still status in the current video frame. For example, a bounding region of the object tracker in the current frame can be replaced with a previous bounding region of the object tracker in a previous frame based on the status of the object tracker being updated to the still status in the current video frame.
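The still-status behavior described above, reusing the previous frame's bounding region once a tracker is marked still, can be sketched as follows. The `Tracker` class, the status string, and the update logic are illustrative assumptions; the abstract does not disclose an implementation.

```python
# Hypothetical sketch: a tracker whose status can be set to "still".
# When still, the previous frame's bounding region is kept rather than
# the new measurement, per the behavior described in the abstract.

from dataclasses import dataclass

STILL = "still"


@dataclass
class Tracker:
    status: str
    bbox: tuple  # (x, y, w, h) bounding region from the previous frame


def track(tracker, measured_bbox):
    """Return the bounding region for the current frame.

    A still tracker replaces the current measurement with its previous
    bounding region; otherwise the measurement is adopted.
    """
    if tracker.status == STILL:
        return tracker.bbox
    tracker.bbox = measured_bbox
    return measured_bbox
```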
Abstract:
Techniques and systems are provided for tracking objects in one or more video frames. For example, based on an application of an object detector to at least one key frame in the one or more video frames, a first set of bounding regions for a video frame can be obtained. A group of bounding regions can be determined from the first set of bounding regions. A bounding region from the group of bounding regions can be removed based on one or more metrics associated with the bounding region. Object tracking for the video frame can be performed using an updated set of bounding regions that is based on removal of the bounding region from the group of bounding regions.
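The abstract does not name the metric used to remove a bounding region from the group; a detection-confidence threshold is one plausible choice. The sketch below assumes that metric and a simple dictionary representation of each bounding region, purely for illustration.

```python
# Hypothetical sketch: pruning a group of detector bounding regions
# before tracking. Here the assumed metric is a per-region confidence
# score; regions below the threshold are removed and the remaining
# (updated) set is what the tracker would consume.

def prune_group(group, min_confidence=0.5):
    """Remove bounding regions whose confidence falls below a threshold."""
    return [region for region in group if region["confidence"] >= min_confidence]


detections = [
    {"bbox": (10, 10, 40, 40), "confidence": 0.92},
    {"bbox": (12, 11, 38, 39), "confidence": 0.20},  # removed by the metric
]
updated_set = prune_group(detections)
```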
Abstract:
An apparatus configured to code video information includes a memory unit and a processor in communication with the memory unit. The memory unit is configured to store video information associated with an enhancement layer having a first block and a base layer having a second block, the second block in the base layer corresponding to the first block in the enhancement layer. The processor is configured to predict, by inter layer prediction, the first block in the enhancement layer based on information derived from the second block in the base layer. At least a portion of the second block is located outside of a reference region of the base layer, the reference region being available for use for the inter layer prediction of the first block. The processor may encode or decode the video information.
Abstract:
In one example, a device includes a video coder (e.g., a video encoder or a video decoder) configured to code parameter set information for a video bitstream, code video data of a base layer of the video bitstream using the parameter set information, and code video data of an enhancement layer of the video bitstream using at least a portion of the parameter set information. The parameter set information may include, for example, profile and level information and/or hypothetical reference decoder (HRD) parameters. For example, the video coder may code a sequence parameter set (SPS) for a video bitstream, code video data of a base layer of the video bitstream using the SPS, and code video data of an enhancement layer of the video bitstream using at least a portion of the SPS, without using any other SPS for the enhancement layer.
Abstract:
A video decoder generates an initial reference picture list (RPL). Furthermore, the video decoder determines that an ordered set of reference picture list modification (RPLM) syntax elements does not include any additional syntax elements when a syntax element in the ordered set of RPLM syntax elements has a particular value. Furthermore, the video decoder generates a final RPL. For each respective RPLM syntax element in the ordered set of RPLM syntax elements, when the respective RPLM syntax element does not have the particular value, the final RPL includes, at an insertion position for the respective RPLM syntax element, a particular reference picture. The respective RPLM syntax element indicates a position in the initial RPL of the particular reference picture. The insertion position for the respective RPLM syntax element corresponds to the position of the respective RPLM syntax element in the ordered set of RPLM syntax elements.
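The list-construction rule described above can be sketched as follows: each RPLM syntax element names a position in the initial RPL, the picture at that position is placed at the element's own position in the final RPL, and a syntax element with the particular terminating value ends the ordered set. The sentinel value and picture representation below are assumptions for illustration, not the coded syntax.

```python
# Hypothetical sketch of final reference picture list (RPL) construction
# from an initial RPL and an ordered set of RPLM syntax elements.

END = -1  # assumed "particular value" signaling no further RPLM elements


def build_final_rpl(initial_rpl, rplm_elements):
    """Build the final RPL.

    Each non-terminating RPLM element indicates a position in the initial
    RPL; the picture at that position is inserted into the final RPL at
    the position the element occupies in the ordered set.
    """
    final_rpl = []
    for element in rplm_elements:
        if element == END:
            break  # the ordered set includes no additional syntax elements
        final_rpl.append(initial_rpl[element])
    return final_rpl


# Pictures identified by picture order count, for illustration only.
final = build_final_rpl([100, 101, 102], [2, 0, END])
```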
Abstract:
Systems, methods, and devices are disclosed that code a supplemental enhancement information (SEI) message. In some examples, the SEI message may contain an identifier of an active video parameter set (VPS). In some examples, the identifier may be fixed-length coded.
Abstract:
Systems, methods, and devices for processing video data are disclosed. Some example systems, methods, and devices receive an external indication at a video decoder. The example systems, methods, and devices treat a clean random access (CRA) picture as a broken link access (BLA) picture based on the external indication.
Abstract:
A device comprising a video file creation module is configured to obtain a plurality of slices of coded video content. Parameter sets are associated with the coded video content. The video file creation module encapsulates the plurality of slices of coded video content within one or more access units of a video stream. A first type of parameter set may be encapsulated within one or more access units of the video stream. A second type of parameter set may be encapsulated within a sample description. The sample description may include an indicator identifying a number of parameter sets stored within one or more access units of the video stream.
Abstract:
A block-request streaming system provides for low-latency streaming of a media presentation. A plurality of media segments are generated according to an encoding protocol, and each media segment includes a random access point. A plurality of media fragments are encoded according to the same encoding protocol, and the media segments are aggregated from the media fragments.