Abstract:
A video encoding system encodes video streams for multiple bit rate video streaming using an approach that permits the encoded resolution to vary based, at least in part, on motion complexity. The video encoding system dynamically decides an encoding resolution for segments of the multiple bit rate video streams that varies with video complexity so as to achieve a better visual experience for multiple bit rate streaming. Motion complexity may be considered separately, or along with spatial complexity, in making the resolution decision.
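The resolution decision described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the abstract: the resolution ladder, the normalized complexity scores, the `motion_weight` blend, the linear bits-per-pixel cost model, and the fixed 30 fps frame rate.

```python
from dataclasses import dataclass

# Hypothetical resolution ladder (width, height), highest first.
RESOLUTIONS = [(1280, 720), (960, 540), (640, 360), (320, 180)]
FRAMES_PER_SECOND = 30  # assumed frame rate


@dataclass
class SegmentStats:
    motion_complexity: float   # assumed normalized to [0, 1]
    spatial_complexity: float  # assumed normalized to [0, 1]


def choose_resolution(stats, bitrate_kbps, motion_weight=0.6):
    """Pick an encoding resolution for one segment of one stream."""
    # Blend motion and spatial complexity into a single score.
    complexity = (motion_weight * stats.motion_complexity
                  + (1.0 - motion_weight) * stats.spatial_complexity)
    # Assumed cost model: bits needed per pixel grow linearly with complexity.
    bits_per_pixel = 0.05 + 0.15 * complexity
    for width, height in RESOLUTIONS:
        needed_kbps = width * height * FRAMES_PER_SECOND * bits_per_pixel / 1000.0
        if needed_kbps <= bitrate_kbps:
            return (width, height)
    # Nothing fits the budget: fall back to the smallest resolution.
    return RESOLUTIONS[-1]
```

Under this cost model, a low-complexity segment keeps the full resolution at a given bit rate, while a high-motion segment at the same bit rate drops to a smaller frame size so that each pixel receives more bits.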
Abstract:
A video encoding system encodes video streams for multiple bit rate video streaming using an approach that permits the encoded bit rate to vary subject to peak bit rate and average bit rate constraints for higher quality streams, while a bottom bit rate stream is encoded to achieve a constant chunk rate. The video encoding system also dynamically decides an encoding resolution for segments of the multiple bit rate video streams that varies with video complexity so as to achieve a better visual experience for multiple bit rate streaming.
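As a rough illustration of the two constraints on the higher quality streams, the sketch below checks a list of encoded chunk sizes against a peak rate and an average rate. The function name, the per-chunk interpretation of the peak constraint, and the fixed chunk duration are assumptions made for this example, not details taken from the abstract.

```python
def satisfies_rate_constraints(chunk_bits, chunk_seconds, peak_bps, average_bps):
    """Return True if no chunk exceeds the peak rate and the stream
    as a whole stays at or under the average rate."""
    total_bits = 0
    for bits in chunk_bits:
        # Peak constraint, applied per chunk (an assumption of this sketch).
        if bits / chunk_seconds > peak_bps:
            return False
        total_bits += bits
    duration = len(chunk_bits) * chunk_seconds
    # Average constraint over the whole stream; an empty stream passes.
    return duration == 0 or total_bits / duration <= average_bps
```

A chunk is free to spend more bits than its neighbors, so long as it stays under the peak and the stream stays under the average.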
Abstract:
A semantic object tracking method tracks general semantic objects with multiple non-rigid motions, disconnected components and multiple colors throughout a vector image sequence. The method accurately tracks these general semantic objects by spatially segmenting image regions from a current frame and then classifying these regions as to which semantic object they originated from in the previous frame. To classify each region, the method performs a region-based motion estimation between each spatially segmented region and the previous frame to compute the position of a predicted region in the previous frame. The method then classifies each region in the current frame as being part of a semantic object based on which semantic object in the previous frame contains the most overlapping points of the predicted region. Using this method, each region in the current image is tracked to one semantic object from the previous frame, with no gaps or overlaps. The method propagates few or no errors because it projects regions into a frame where the semantic object boundaries are previously computed rather than trying to project and adjust a boundary in a frame where the object's boundary is unknown.
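The classification step can be sketched as a majority vote over the points of the predicted region. The sketch assumes the motion of a region is a pure translation and that the previous frame's segmentation is available as a point-to-label map. Both are simplifications chosen for illustration, and the function and parameter names are hypothetical.

```python
from collections import Counter


def classify_region(region_pixels, motion_vector, previous_labels):
    """Assign a current-frame region to a semantic object of the previous frame.

    region_pixels:   iterable of (x, y) points in one spatially segmented region.
    motion_vector:   (dx, dy) that projects the region into the previous frame
                     (a pure translation is an assumption of this sketch).
    previous_labels: dict mapping (x, y) in the previous frame to an object id.
    """
    dx, dy = motion_vector
    votes = Counter()
    for x, y in region_pixels:
        # Project the point into the previous frame and look up its object.
        label = previous_labels.get((x + dx, y + dy))
        if label is not None:
            votes[label] += 1
    if not votes:
        return None  # the predicted region overlaps no known object
    # The object containing the most overlapping points wins.
    return votes.most_common(1)[0][0]
```

Because every region receives exactly one label, the union of the labeled regions partitions the current frame, which is how the method avoids gaps and overlaps between tracked objects.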
Abstract:
Homogeneous moving objects of arbitrary shapes are segmented and tracked with respect to the motion of the objects. In an intraframe mode of operation, a segmentation method includes obtaining a motion representation of corresponding pixels in the selected video image frame and a preceding video image frame to form motion-segmented video image features. Video image features are also segmented according to their spatial image characteristics (e.g., color) to form spatially-segmented video image features. Finally, the video image features are jointly segmented as a weighted combination of the motion-segmented video image features and the spatially-segmented video image features. The joint motion and spatial segmentation of image features provides enhanced accuracy in representing moving image features. This enhanced accuracy is particularly beneficial because the motion of image features is a significant display characteristic for human observers.
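One way to picture the weighted combination is as a joint dissimilarity measure that mixes a motion term and a color term, with each pixel then assigned to the closest segment. The sketch below is only an illustration under stated assumptions: the squared-distance terms, the `motion_weight` parameter, the dictionary layout of pixels and segment centers, and the assumption that both features are already scaled to comparable ranges are not taken from the abstract.

```python
def joint_distance(pixel, center, motion_weight=0.5):
    """Weighted combination of motion and color dissimilarity."""
    motion_d = sum((a - b) ** 2 for a, b in zip(pixel["motion"], center["motion"]))
    color_d = sum((a - b) ** 2 for a, b in zip(pixel["color"], center["color"]))
    return motion_weight * motion_d + (1.0 - motion_weight) * color_d


def assign_segments(pixels, centers, motion_weight=0.5):
    """Return, for each pixel, the index of the closest segment center."""
    return [
        min(range(len(centers)),
            key=lambda i: joint_distance(p, centers[i], motion_weight))
        for p in pixels
    ]
```

Raising `motion_weight` makes pixels that move together group together even when their colors differ; lowering it favors color-homogeneous segments.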
Abstract:
A multiple bitrate (MBR) video encoding management tool utilizes available processing units for parallel MBR video encoding. Instead of focusing only on multi-threading of encoding tasks for a single picture or group of pictures (GOP), the management tool parallelizes the encoding of multiple GOPs between different processing units and/or different computing systems. With this parallel MBR video encoding architecture, different GOPs can be encoded in parallel. To facilitate such parallel encoding, data dependencies between GOPs are removed. The management tool can adjust the number of GOPs to encode in parallel on a computing system so as to favor parallelism of encoding for different GOPs at the expense of parallelism of encoding inside a GOP, or vice versa, and thereby set a suitable balance between encoding latency and throughput.
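The trade-off between GOP-level and intra-GOP parallelism can be sketched as a simple split of the available processing units. In the sketch below, `encode_gop` is a placeholder rather than a real encoder, a thread pool stands in for the processing units, and all names are hypothetical. It assumes each GOP is closed, so no GOP depends on data from another.

```python
from concurrent.futures import ThreadPoolExecutor


def split_workers(total_units, gops_in_parallel):
    """Divide processing units between GOP-level and intra-GOP parallelism."""
    gops_in_parallel = max(1, min(gops_in_parallel, total_units))
    threads_per_gop = max(1, total_units // gops_in_parallel)
    return gops_in_parallel, threads_per_gop


def encode_gop(gop_index, frames, threads_per_gop):
    """Placeholder encoder: a real one would compress the frames."""
    # GOPs are assumed closed, so each can be encoded independently.
    return gop_index, f"gop{gop_index}:{len(frames)}frames:{threads_per_gop}threads"


def encode_parallel(gops, total_units, gops_in_parallel):
    """Encode several GOPs concurrently and return results in stream order."""
    workers, threads_per_gop = split_workers(total_units, gops_in_parallel)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(encode_gop, i, frames, threads_per_gop)
                   for i, frames in enumerate(gops)]
        results = [f.result() for f in futures]
    # Reassemble in GOP order regardless of completion order.
    return [payload for _, payload in sorted(results)]
```

With 8 units, `split_workers(8, 4)` encodes four GOPs at once with two threads each, which favors throughput. `split_workers(8, 1)` gives all eight threads to a single GOP, which favors latency.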
Abstract:
The subject disclosure relates to face recognition in video. Face detection data in frames of input data are used to generate face galleries, which are labeled and used in recognizing faces throughout the video. Metadata that associates the video frame and the face are generated and maintained for subsequent identification. Faces other than those found by face detection may be found by face tracking, in which facial landmarks found by the face detection are used to track a face over previous and/or subsequent video frames. Once generated, the maintained metadata may be accessed to efficiently determine the identity of a person corresponding to a viewer-selected face.
Abstract:
A device for dynamically extracting and compressing information for a streaming media asset is provided. One embodiment of the device provides a computing device comprising a processor and memory comprising instructions stored therein that are executable by the processor. The instructions stored in the memory are executable to provide to a requesting computing device dynamically compressed information for a streaming media asset, the dynamically compressed information derived from an information file comprising variable data elements arranged in one or more data fields according to a well-known structure. For example, the instructions are executable to receive from the requesting computing device a request for the compressed information, extract the variable data elements from the information file, compress the variable data elements to form compressed data elements, and send to the requesting computing device a compressed file comprising the compressed data elements.
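The extract-then-compress flow can be sketched as follows. The sketch assumes the information file is XML with `chunk` elements and that both sides already know the field layout, so only the variable values need to be sent. The element name, the `FIELDS` tuple, the comma and newline separators, and the use of `zlib` are all illustrative assumptions, and the values are assumed to contain no commas or newlines.

```python
import zlib
import xml.etree.ElementTree as ET

# Hypothetical field layout, known in advance to both server and client.
FIELDS = ("index", "duration", "bitrate")


def compress_info(xml_text):
    """Extract only the variable data elements and compress them."""
    root = ET.fromstring(xml_text)
    rows = []
    for chunk in root.iter("chunk"):
        # Drop tag and attribute names; keep the values in a fixed order.
        rows.append(",".join(chunk.get(name, "") for name in FIELDS))
    return zlib.compress("\n".join(rows).encode("utf-8"))


def decompress_info(blob):
    """Rebuild the records on the requesting device from the known layout."""
    text = zlib.decompress(blob).decode("utf-8")
    if not text:
        return []
    return [dict(zip(FIELDS, line.split(","))) for line in text.split("\n")]
```

Because the structure is known to the receiver, the repeated markup never crosses the network; only the values that change from file to file do.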
Abstract:
A video encoder uses previously calculated motion information for inter frame coding to achieve faster computation speed for video compression. In a multi bit rate application, motion information produced by motion estimation for inter frame coding of a compressed video bit stream at one bit rate is passed on to a subsequent encoding of the video at a lower bit rate. The video encoder chooses to use the previously calculated motion information for inter frame coding at the lower bit rate if the video resolution is unchanged. A multi-core motion information pre-calculation produces motion information prior to encoding by dividing motion estimation of each inter frame among separate CPU cores.
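The reuse rule can be sketched as a cache lookup keyed by frame, guarded by a resolution check. The function name, the cache layout, and the `estimate` callback are hypothetical; a real encoder would store motion information in its own internal structures.

```python
def motion_vectors_for(frame_id, resolution, cache, estimate):
    """Return motion vectors for a frame, reusing an earlier pass when possible.

    cache:    dict filled by the encoding pass at the higher bit rate.
    estimate: callable (frame_id, resolution) -> vectors, the costly search.
    """
    cached = cache.get(frame_id)
    # Reuse is only valid when the resolution is unchanged between passes.
    if cached is not None and cached["resolution"] == resolution:
        return cached["vectors"]
    # Resolution changed or nothing cached: run motion estimation again.
    vectors = estimate(frame_id, resolution)
    cache[frame_id] = {"resolution": resolution, "vectors": vectors}
    return vectors
```

The resolution check matters because motion vectors are expressed in pixel units of a particular frame size, so vectors computed at one resolution do not directly apply at another.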