Abstract:
A method, computer readable medium and apparatus for verifying an identity of an individual based upon facial expressions as exhibited in a query video of the individual are disclosed. The method includes receiving a reference video for each one of a plurality of different individuals, wherein a plurality of facial gesture encoders is extracted from at least one frame of the reference video describing one or more facial expressions of each one of the plurality of different individuals, receiving the query video, calculating a similarity score for the reference video for the each one of the plurality of different individuals based on an analysis that compares the plurality of facial gesture encoders of the at least one frame of the reference video for the each one of the plurality of different individuals to a plurality of facial gesture encoders extracted from at least one frame of the query video.
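The abstract does not say how the similarity score is computed. As a minimal sketch, assume each video has already been reduced to a single numeric feature vector (a hypothetical stand-in for the facial gesture encoders) and that cosine similarity is the comparison; both are illustrative assumptions, not the disclosed method.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

def score_references(query_vec, reference_vecs):
    """Return one similarity score per enrolled individual.

    reference_vecs maps a hypothetical individual id to that
    individual's reference feature vector.
    """
    return {pid: cosine_similarity(query_vec, vec)
            for pid, vec in reference_vecs.items()}

def best_match(query_vec, reference_vecs):
    """Id of the individual whose reference scores highest."""
    scores = score_references(query_vec, reference_vecs)
    return max(scores, key=scores.get)
```

A real system would likely also apply an acceptance threshold to the top score before declaring the identity verified.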
Abstract:
Methods and systems for automatically synchronizing videos acquired via two or more cameras with overlapping views in a multi-camera network. Reference lines within an overlapping field of view of the two (or more) cameras in the multi-camera network can be determined wherein the reference lines connect two or more pairs of corresponding points. Spatiotemporal maps of the reference lines can then be obtained. An optimal alignment between video segments obtained from the cameras is then determined based on the registration of the spatiotemporal maps.
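The alignment step can be illustrated with a small sketch. It assumes each spatiotemporal map is a list of rows, one row per frame, holding the pixel intensities sampled along a reference line, and it assumes a brute-force search over integer frame offsets with a mean-squared-difference cost. The function names and the cost are illustrative choices; the abstract does not specify the registration technique.

```python
def alignment_cost(map_a, map_b, offset):
    """Mean squared difference when row t of map_a is paired with
    row t + offset of map_b. Returns infinity if no rows overlap."""
    total, count = 0.0, 0
    for t, row_a in enumerate(map_a):
        u = t + offset
        if 0 <= u < len(map_b):
            total += sum((x - y) ** 2 for x, y in zip(row_a, map_b[u]))
            count += len(row_a)
    return total / count if count else float("inf")

def best_offset(map_a, map_b, max_shift):
    """Integer frame offset in [-max_shift, max_shift] with lowest cost."""
    return min(range(-max_shift, max_shift + 1),
               key=lambda s: alignment_cost(map_a, map_b, s))
```

Because the cost is averaged over the overlapping rows only, very large shifts with little overlap can score deceptively well; a practical version would enforce a minimum overlap.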
Abstract:
A method, non-transitory computer-readable medium, and apparatus for localizing a region of interest using a dynamic hand gesture are disclosed. For example, the method captures an ego-centric video containing the dynamic hand gesture, analyzes a frame of the ego-centric video to detect pixels that correspond to a fingertip using a hand segmentation algorithm, temporally analyzes one or more frames of the ego-centric video to compute a path of the fingertip in the dynamic hand gesture, localizes the region of interest based on the path of the fingertip, and performs an action based on an object in the region of interest.
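One simple way to turn a fingertip path into a region of interest is to take the axis-aligned bounding box of the tracked points. The sketch below assumes the path is already available as a list of (x, y) pixel coordinates; the hand segmentation and fingertip detection steps are out of scope, and the bounding-box rule is an assumption rather than the disclosed localization method.

```python
def region_from_path(path, margin=0):
    """Bounding box (x_min, y_min, x_max, y_max) enclosing a fingertip path.

    path: non-empty list of (x, y) pixel coordinates, one per frame.
    margin: optional padding in pixels added on every side.
    """
    if not path:
        raise ValueError("path must contain at least one point")
    xs = [x for x, _ in path]
    ys = [y for _, y in path]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)
```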
Abstract:
A system and method for detection of drive-arounds in a retail setting. An embodiment includes acquiring images of a retail establishment, analyzing the images to detect entry of a customer onto the premises of the retail establishment, tracking a detected customer's location as the customer traverses the premises of the retail establishment, analyzing the images to detect exit of the detected customer from the premises of the retail establishment, and generating a drive-around notification if the customer does not enter a prescribed area or remain on the premises of the retail location for at least a prescribed minimum period of time.
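The notification rule can be sketched as a check over a completed track. The sketch assumes a track is a time-ordered list of (timestamp_seconds, x, y) samples from entry to exit and that the prescribed area is an axis-aligned rectangle. It also assumes one reading of the abstract's condition: a drive-around is flagged when the customer neither entered the prescribed area nor stayed for the minimum time. All names here are hypothetical.

```python
def is_drive_around(track, prescribed_area, min_dwell_seconds):
    """Return True if the track looks like a drive-around.

    track: time-ordered list of (timestamp_seconds, x, y).
    prescribed_area: (x_min, y_min, x_max, y_max), e.g. an order point.
    """
    if not track:
        return False  # nothing was tracked, so nothing to report
    x0, y0, x1, y1 = prescribed_area
    entered = any(x0 <= x <= x1 and y0 <= y <= y1 for _, x, y in track)
    dwell = track[-1][0] - track[0][0]
    # Assumed reading: flag only when both conditions fail.
    return not entered and dwell < min_dwell_seconds
```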
Abstract:
A system and method for detection of a goods-received event includes acquiring images of a retail location including a vehicular drive-thru, determining a region of interest within the images, the region of interest including at least a portion of a region in which goods are delivered to a customer, and analyzing the images using at least one computer vision technique to determine when goods are received by a customer. The analyzing includes identifying at least one item belonging to a class of items, the at least one item's presence in the region of interest being indicative of a goods-received event.
Abstract:
A mobile electronic device processes a sequence of images to identify and re-identify an object of interest in the sequence. An image sensor of the device receives a sequence of images. The device detects an object in a first image as well as positional parameters of the device that correspond to the object in the first image. The device determines a range of positional parameters within which the object may appear in a field of view of the device. When the device detects that the object of interest has exited the field of view and subsequently uses motion sensor data to determine that the object of interest has likely re-entered the field of view, it analyzes the current frame to confirm that the object of interest has re-entered the field of view.
Abstract:
A method and system for identifying content relevance comprises acquiring video data, mapping the acquired video data to a feature space to obtain a feature representation of the video data, assigning the acquired video data to at least one action class based on the feature representation of the video data, and determining a relevance of the acquired video data.
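As an illustration of the classify-then-judge flow, the sketch below assumes the feature representation is a plain numeric vector, uses a nearest-centroid rule to assign an action class, and treats a clip as relevant when its class is in a caller-supplied set. The abstract names neither the classifier nor the relevance criterion, so both are assumptions.

```python
def assign_action_class(feature, centroids):
    """Name of the action class whose centroid is closest to the feature.

    centroids maps a hypothetical class name to a centroid vector.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda name: sq_dist(feature, centroids[name]))

def is_relevant(feature, centroids, relevant_classes):
    """Assumed criterion: relevant if the assigned class is of interest."""
    return assign_action_class(feature, centroids) in relevant_classes
```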
Abstract:
Block-based motion estimation for video compression estimates the direction and magnitude of motion of objects in the scene in a computationally efficient manner and accurately predicts the optimal search direction/neighborhood location for motion vectors. A system can include a motion detection module that detects apparent motion in the scene, a motion direction and magnitude prediction module that estimates the direction and magnitude of motion of the objects detected to be in motion by the motion detection module, and a block-based motion estimation module that performs searches in reduced neighborhoods of the target block according to the motion estimated by the motion direction and magnitude prediction module, and only for the blocks determined to be in motion by the motion detection module. The invention is particularly well suited for stationary traffic cameras that monitor roads and highways for traffic law enforcement purposes.
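The reduced-neighborhood search can be sketched as block matching restricted to a small window centered on a predicted displacement. The sketch assumes grayscale frames stored as lists of rows, a sum-of-absolute-differences (SAD) cost, and a prediction supplied by an upstream module; motion detection and prediction themselves are not shown, and all function names are hypothetical.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks whose
    top-left corners are (ax, ay) in frame_a and (bx, by) in frame_b."""
    return sum(abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
               for j in range(size) for i in range(size))

def estimate_motion_vector(ref, cur, bx, by, size, predicted, radius):
    """Find the displacement (dx, dy) into the reference frame that best
    matches the target block at (bx, by) in the current frame.

    Only displacements within `radius` of `predicted` are tested, instead
    of a full window around the block. Returns ((dx, dy), cost), or
    (None, None) if no candidate lies inside the frame.
    """
    height, width = len(ref), len(ref[0])
    px, py = predicted
    best, best_cost = None, None
    for dy in range(py - radius, py + radius + 1):
        for dx in range(px - radius, px + radius + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue  # candidate block falls outside the frame
            cost = sad(cur, ref, bx, by, x, y, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost
```

With radius r the search tests at most (2r + 1)^2 candidates per block, so a good prediction with a small radius is where the savings come from. Skipping blocks that the motion detector marks as static would be handled by the caller.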
Abstract:
A system and method of providing annotated trajectories by receiving image frames from a video camera and determining a location based on the image frames from the video camera. The system and method can further include the steps of determining that the location is associated with a preexisting annotation and displaying the preexisting annotation. Additionally or alternatively, the system and method can further include the steps of generating a new annotation automatically or based on a user input and associating the new annotation with the current location.
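The lookup-or-create behavior can be sketched as follows, assuming the location has already been estimated from the image frames as a 2D coordinate and that annotations are stored as (point, text) pairs. The distance threshold and the function names are assumptions made for illustration.

```python
import math

def find_annotation(location, annotations, radius):
    """Text of the nearest stored annotation within `radius`, else None.

    annotations: list of ((x, y), text) pairs.
    """
    best_text, best_dist = None, radius
    for point, text in annotations:
        dist = math.dist(location, point)
        if dist <= best_dist:
            best_text, best_dist = text, dist
    return best_text

def annotate(location, annotations, radius, new_text=None):
    """Return the preexisting annotation for this location if there is
    one; otherwise store and return new_text when it is provided."""
    existing = find_annotation(location, annotations, radius)
    if existing is not None:
        return existing
    if new_text is not None:
        annotations.append((location, new_text))
    return new_text
```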