Abstract:
A method includes receiving first data defining a first bounding box for a first image of a sequence of images. The first bounding box corresponds to a region of interest including a tracked object. The method also includes receiving object tracking data for a second image of the sequence of images, the object tracking data defining a second bounding box. The second bounding box corresponds to the region of interest including the tracked object in the second image. The method further includes determining a similarity metric for first pixels within the first bounding box and search pixels within each of multiple search bounding boxes. Search coordinates of each of the search bounding boxes correspond to second coordinates of the second bounding box shifted in one or more directions. The method also includes determining a modified second bounding box based on the similarity metric.
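The shift-and-compare search described above can be sketched as a small template-matching routine. This is an illustrative stand-in, not the patented method: the `refine_box` name, the `(x, y, w, h)` box convention, the ±2-pixel search range, and the mean-absolute-difference similarity metric are all assumptions for the sketch.

```python
import numpy as np

def refine_box(ref_img, cur_img, ref_box, cur_box, max_shift=2):
    """Pick the shifted candidate box whose pixels best match the reference
    patch. Boxes are (x, y, w, h); the MAD metric here is illustrative."""
    x0, y0, w, h = ref_box
    ref_patch = ref_img[y0:y0 + h, x0:x0 + w].astype(np.float64)
    bx, by = cur_box[0], cur_box[1]
    best_box, best_score = tuple(cur_box), float("inf")
    # Search bounding boxes: the second box shifted in each direction.
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            sx, sy = bx + dx, by + dy
            if sx < 0 or sy < 0:
                continue                      # shifted box leaves the image
            patch = cur_img[sy:sy + h, sx:sx + w].astype(np.float64)
            if patch.shape != ref_patch.shape:
                continue                      # shifted box leaves the image
            score = np.abs(patch - ref_patch).mean()  # lower = more similar
            if score < best_score:
                best_score, best_box = score, (sx, sy, w, h)
    return best_box                           # the "modified second bounding box"
```

The best-scoring shifted box plays the role of the modified second bounding box; a real tracker would likely use a normalized correlation score rather than raw pixel differences.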
Abstract:
A method of processing data includes receiving, at a computing device, data representative of an image captured by an image sensor. The method also includes determining a first scene clarity score. The method further includes determining whether the first scene clarity score satisfies a threshold, and if the first scene clarity score satisfies the threshold, determining a second scene clarity score based on second data extracted from the data.
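The two-stage scoring above amounts to a gated cascade: a cheap first score decides whether a costlier second score is worth computing. The sketch below is illustrative only; the actual clarity metrics and the threshold value are assumptions, with downsampled contrast standing in for the first score and mean gradient magnitude for the second.

```python
import numpy as np

def classify_scene(image, threshold=12.0):
    """Two-stage clarity check: a cheap first score gates a costlier second
    one. Both scoring functions are illustrative stand-ins."""
    # First score: contrast of a heavily downsampled copy (cheap to compute).
    small = image[::8, ::8].astype(np.float64)
    first_score = small.std()
    if first_score >= threshold:
        # Threshold satisfied: compute the second score from further data
        # extracted from the image (here, gradient magnitudes).
        grad_y, grad_x = np.gradient(image.astype(np.float64))
        second_score = np.hypot(grad_x, grad_y).mean()
        return first_score, second_score
    return first_score, None
```

The gating keeps per-frame cost low when most frames fail the first test, which is the usual motivation for this kind of cascade.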
Abstract:
A method of generating metadata includes using at least one digital image to select at least one among a plurality of objects, wherein the at least one digital image depicts the plurality of objects in relation to a physical space. The method also includes, in response to selecting the at least one object, determining a position of the at least one object in a location space. The method also includes, based on said determined position, producing metadata that identifies one among a plurality of separate regions that divide the location space, wherein said plurality of separate regions includes regions of unequal size.
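The position-to-region mapping above can be sketched as a lookup against a set of rectangles of unequal size that tile the location space. Everything here is illustrative: the region names, the rectangle convention `(x0, y0, x1, y1)`, and the example layout are assumptions, not the patented scheme.

```python
def region_for_position(pos, regions):
    """Return the id of the first region containing pos; the region id is
    the metadata produced for the object's position. Illustrative only."""
    x, y = pos
    for region_id, (x0, y0, x1, y1) in regions:
        if x0 <= x < x1 and y0 <= y < y1:
            return region_id
    return None

# Regions of unequal size dividing a 10x10 location space (hypothetical).
REGIONS = [
    ("kitchen", (0, 0, 6, 10)),   # larger region
    ("hallway", (6, 0, 10, 4)),   # smaller region
    ("bedroom", (6, 4, 10, 10)),
]
```

For example, an object located at `(2, 5)` would yield the metadata `"kitchen"` under this layout.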
Abstract:
Systems, devices, and methods for improved tracking with an electronic device are disclosed. The disclosed approaches employ advanced exposure compensation and/or stabilization techniques. The tracking features may therefore be used in an electronic device to improve tracking performance under dramatically changing lighting conditions and/or when exposed to destabilizing influences, such as jitter. Historical data related to the lighting conditions and/or to the movement of a region of interest containing the tracked object are advantageously employed to improve the tracking system under such conditions.
Abstract:
A method of generating a temporal saliency map is disclosed. In a particular embodiment, the method includes receiving an object bounding box from an object tracker. The method includes cropping a video frame based at least in part on the object bounding box to generate a cropped image. The method further includes performing spatial dual segmentation on the cropped image to generate an initial mask and performing temporal mask refinement on the initial mask to generate a refined mask. The method also includes generating a temporal saliency map based at least in part on the refined mask.
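The crop → segment → refine → saliency pipeline above can be sketched end to end in a toy form. This is an illustrative reduction, not the disclosed algorithm: simple mean-thresholding stands in for spatial dual segmentation, and exponential blending with the previous frame's mask stands in for temporal mask refinement; the `alpha` weight is an assumption.

```python
import numpy as np

def temporal_saliency(frame, box, prev_mask=None, alpha=0.7):
    """Toy crop -> segment -> refine -> saliency pipeline. The segmentation
    and refinement steps are crude stand-ins for the disclosed ones."""
    x, y, w, h = box
    crop = frame[y:y + h, x:x + w].astype(np.float64)       # crop to the box
    initial_mask = (crop > crop.mean()).astype(np.float64)  # crude segmentation
    if prev_mask is not None:
        # Temporal refinement: blend with the previous frame's mask.
        refined = alpha * initial_mask + (1 - alpha) * prev_mask
    else:
        refined = initial_mask
    saliency = np.zeros(frame.shape[:2])
    saliency[y:y + h, x:x + w] = refined    # paste refined mask into the frame
    return saliency, refined
```

Carrying `refined` forward as `prev_mask` on the next frame is what makes the saliency map temporal rather than purely per-frame.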
Abstract:
A method for determining a region of an image is described. The method includes presenting an image of a scene including one or more objects. The method also includes receiving an input selecting a single point on the image corresponding to a target object. The method further includes obtaining a motion mask based on the image. The motion mask indicates a local motion section and a global motion section of the image. The method further includes determining a region in the image based on the selected point and the motion mask.
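Determining a region from a single selected point plus a motion mask can be sketched as a flood fill: grow outward from the point through connected local-motion pixels. The sketch assumes a boolean mask where `True` marks the local-motion section; the 4-connectivity and the empty-set fallback for a global-motion point are illustrative choices, not the claimed method.

```python
from collections import deque

def region_from_point(motion_mask, point):
    """Grow a region from the selected point through connected local-motion
    pixels (True in motion_mask). Illustrative flood fill, 4-connected."""
    rows, cols = len(motion_mask), len(motion_mask[0])
    x, y = point
    if not motion_mask[y][x]:
        return set()                  # point falls in the global-motion section
    region, queue = {(x, y)}, deque([(x, y)])
    while queue:
        cx, cy = queue.popleft()
        for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            if (0 <= nx < cols and 0 <= ny < rows
                    and motion_mask[ny][nx] and (nx, ny) not in region):
                region.add((nx, ny))
                queue.append((nx, ny))
    return region
```

The returned pixel set bounds the target object's region; a real system would likely fit a bounding box or contour around it.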
Abstract:
An electronic device is described. The electronic device includes a processor. The processor is configured to obtain a plurality of images. The processor is also configured to obtain global motion information indicating global motion between at least two of the plurality of images. The processor is further configured to obtain object tracking information indicating motion of a tracked object between the at least two of the plurality of images. The processor is additionally configured to perform automatic zoom based on the global motion information and the object tracking information. Performing automatic zoom produces a zoom region including the tracked object. The processor is configured to determine a motion response speed for the zoom region based on a location of the tracked object within the zoom region.
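The last limitation, a motion response speed that depends on where the tracked object sits inside the zoom region, can be sketched as a speed that ramps up as the object drifts toward the region's edge. The function name, the linear ramp, and the speed bounds are all assumptions for illustration.

```python
def response_speed(obj_center, zoom_region, min_speed=0.05, max_speed=0.5):
    """Move the zoom region faster when the tracked object drifts toward its
    edge. zoom_region is (x, y, w, h); speeds are illustrative per-frame
    fractions, not values from the disclosure."""
    x, y, w, h = zoom_region
    cx, cy = x + w / 2, y + h / 2
    ox, oy = obj_center
    # Normalized distance of the object from the region center (0 at center,
    # 1 at the border), taken as the worse of the two axes.
    dist = max(abs(ox - cx) / (w / 2), abs(oy - cy) / (h / 2))
    dist = min(dist, 1.0)
    return min_speed + (max_speed - min_speed) * dist
```

The intuition is that a centered object needs only slow, smooth re-centering, while an object near the border risks leaving the zoom region and warrants a faster response.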
Abstract:
A method includes receiving information that identifies a reference position in a location space. The method also includes receiving data that identifies one among a plurality of candidate geometrical arrangements. The method also includes producing a representation that depicts a plurality of objects which are arranged, relative to the reference position in the location space, according to the identified candidate geometrical arrangement.
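Producing positions for objects arranged relative to a reference position, according to a selected candidate geometrical arrangement, can be sketched as below. The arrangement names (`"ring"`, `"row"`), the spacing parameter, and the formulas are hypothetical examples, not the claimed arrangements.

```python
import math

def arrange(n, ref, arrangement, spacing=1.0):
    """Compute n object positions relative to ref for a named candidate
    geometrical arrangement. Arrangement names here are illustrative."""
    rx, ry = ref
    if arrangement == "ring":
        # Objects evenly spaced on a circle around the reference position.
        return [(rx + spacing * math.cos(2 * math.pi * i / n),
                 ry + spacing * math.sin(2 * math.pi * i / n))
                for i in range(n)]
    if arrangement == "row":
        # Objects in a line starting at the reference position.
        return [(rx + spacing * i, ry) for i in range(n)]
    raise ValueError(f"unknown arrangement: {arrangement}")
```

A renderer would then depict the objects at these computed positions to form the claimed representation.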
Abstract:
Automatic adaptive zoom enables computing devices that receive video streams to use a higher resolution stream when the user enables zoom, so that the quality of the output video is preserved. In some examples, a tracking video stream and a target video stream are obtained and are processed. The tracking video stream has a first resolution, and the target video stream has a second resolution that is higher than the first resolution. The tracking video stream is processed to define regions of interest for frames of the tracking video stream. The target video stream is processed to generate zoomed-in regions of frames of the target video stream. A zoomed-in region of the target video stream corresponds to a region of interest defined using the tracking video stream. The zoomed-in regions of the frames of the target video stream are then provided for display on a client device.
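The core coordinate step, mapping a region of interest defined on the low-resolution tracking stream onto the high-resolution target stream, can be sketched as a simple scale transform. The helper name and the `(x, y, w, h)` / `(width, height)` conventions are assumptions for illustration.

```python
def map_roi(roi, track_res, target_res):
    """Scale a region of interest from the tracking stream's coordinate space
    to the target stream's higher resolution. Illustrative helper."""
    x, y, w, h = roi
    sx = target_res[0] / track_res[0]   # horizontal scale factor
    sy = target_res[1] / track_res[1]   # vertical scale factor
    return (round(x * sx), round(y * sy), round(w * sx), round(h * sy))
```

For instance, an ROI found on a 640x360 tracking stream maps to a 3x-larger crop on a 1920x1080 target stream, so the zoomed-in region is cut from full-resolution pixels rather than upscaled ones.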