Abstract:
A method, non-transitory computer-readable medium, and apparatus for adaptively sampling an ego-centric video to extract features for performing an analysis are disclosed. For example, the method captures the ego-centric video, determines a spatio-temporal location of interest within the ego-centric video, applies adaptive sampling centered around the spatio-temporal location of interest to obtain one or more spatio-temporal patches, extracts one or more features using the one or more spatio-temporal patches, and performs an analysis based on the one or more features.
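
The sampling and feature-extraction steps can be illustrated with a minimal sketch. The abstract does not specify the sampling pattern or the features, so everything here is an assumption made for illustration: the video is a nested list indexed as video[t][y][x], "adaptive" is interpreted as taking several patches around the same center with increasing stride (dense near the location of interest, coarser farther away), and the features are simply the mean and standard deviation of each patch. The names sample_patch, adaptive_patches, and patch_features are hypothetical.

```python
from statistics import mean, pstdev


def clamp(index, limit):
    """Keep an index inside [0, limit - 1]."""
    return max(0, min(index, limit - 1))


def sample_patch(video, center, size, stride):
    """Sample one spatio-temporal patch centered on center = (t, y, x).

    video is indexed as video[t][y][x]; size is (frames, rows, cols).
    A larger stride spreads the same number of samples over a wider area.
    """
    dims = (len(video), len(video[0]), len(video[0][0]))
    ts, ys, xs = [
        [clamp(c + (k - n // 2) * stride, d) for k in range(n)]
        for c, n, d in zip(center, size, dims)
    ]
    return [[[video[t][y][x] for x in xs] for y in ys] for t in ts]


def adaptive_patches(video, center, size=(2, 4, 4), strides=(1, 2, 4)):
    """Assumed scheme: one patch per stride, all sharing the same center."""
    return [sample_patch(video, center, size, s) for s in strides]


def patch_features(patch):
    """Illustrative features only: mean and standard deviation of a patch."""
    values = [v for frame in patch for row in frame for v in row]
    return (mean(values), pstdev(values))
```

In this sketch, a downstream analysis would consume the list of patch_features results, one per patch; any real system would likely substitute richer descriptors.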
Abstract:
A method, non-transitory computer-readable medium, and apparatus for training hand detection in an ego-centric video are disclosed. For example, the method prompts a user to provide a hand gesture, captures the ego-centric video containing the hand gesture, analyzes the hand gesture in a frame of the ego-centric video to identify a set of pixels in the frame corresponding to a hand region, generates a training set of features from the set of pixels that correspond to the hand region, and trains a head-mounted video device to detect the hand in subsequently captured ego-centric video images based on the training set of features.
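
The training and detection steps can be sketched as follows. The abstract names neither the features nor the classifier, so this example assumes both: features are quantized RGB color bins, and a pixel is labeled as hand when its bin was seen more often inside the hand region than outside it. The hand mask (the set of pixels identified from the gesture) is taken as given, and the names quantize, train_hand_model, and detect_hand are hypothetical.

```python
from collections import Counter


def quantize(pixel, bins=8):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse color bin."""
    step = 256 // bins
    return tuple(channel // step for channel in pixel)


def train_hand_model(frame, hand_mask, bins=8):
    """Count color bins inside and outside the hand region of one frame.

    frame is a 2-D list of (r, g, b) tuples; hand_mask is a 2-D list of
    booleans of the same shape, True where the pixel belongs to the hand.
    """
    hand, background = Counter(), Counter()
    for row, mask_row in zip(frame, hand_mask):
        for pixel, is_hand in zip(row, mask_row):
            target = hand if is_hand else background
            target[quantize(pixel, bins)] += 1
    return hand, background


def detect_hand(frame, model, bins=8):
    """Label a pixel as hand if its bin was more common in the hand region."""
    hand, background = model
    return [
        [hand[quantize(p, bins)] > background[quantize(p, bins)] for p in row]
        for row in frame
    ]
```

For example, training on a frame whose masked pixels are skin-toned and then calling detect_hand on a later frame returns a boolean mask of the same shape, with True for pixels whose color bin matches the trained hand colors.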