LEARNING RELIABLE KEYPOINTS IN SITU WITH INTROSPECTIVE SELF-SUPERVISION

    Publication Number: US20240257374A1

    Publication Date: 2024-08-01

    Application Number: US18565791

    Filing Date: 2021-09-23

    CPC classification number: G06T7/579 G06T7/73 G06T2207/20081 G06T2207/20084

    Abstract: An apparatus to facilitate learning reliable keypoints in situ with introspective self-supervision is disclosed. The apparatus includes one or more processors to provide a view-overlapped keyframe pair from a pose graph that is generated by a visual simultaneous localization and mapping (VSLAM) process executed by the one or more processors; determine a keypoint match from the view-overlapped keyframe pair based on a keypoint detection and matching process, the keypoint match corresponding to a keypoint; calculate an inverse reliability score based on matched pixels corresponding to the keypoint match in the view-overlapped keyframe pair; identify a supervision signal associated with the keypoint match, the supervision signal comprising a keypoint reliability score of the keypoint based on a final pose output of the VSLAM process; and train a keypoint detection neural network using the keypoint match, the inverse reliability score, and the keypoint reliability score.
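    The reliability-scoring step in this abstract can be sketched as follows. This is an illustrative guess, not the patented method: the use of reprojection distance as the inverse reliability score, and the `scale` parameter and the mapping to a [0, 1] supervision signal, are assumptions.

    ```python
    import math

    def inverse_reliability(matched_px, reprojected_px):
        """Hypothetical inverse-reliability score: the pixel distance between
        where a keypoint was matched in the second keyframe and where the
        final VSLAM pose output says it should reproject. A larger distance
        suggests a less reliable match."""
        return math.dist(matched_px, reprojected_px)

    def reliability_label(inv_score, scale=5.0):
        """Hypothetical supervision signal: map the inverse score to a
        keypoint-reliability score in [0, 1] used to train the detector
        network (scale chosen arbitrarily for illustration)."""
        return 1.0 / (1.0 + inv_score / scale)
    ```

    A perfectly reprojected match (`inv_score == 0`) would yield a reliability label of 1.0, with the label decaying toward 0 as the reprojection residual grows.
    
    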

    ONLINE LEARNING METHOD AND SYSTEM FOR ACTION RECOGNITION

    Publication Number: US20230410487A1

    Publication Date: 2023-12-21

    Application Number: US18250498

    Filing Date: 2020-11-30

    Abstract: Performing online learning for a model to detect unseen actions in an action recognition system is disclosed. The method includes extracting semantic features in a semantic domain from semantic action labels, transforming the semantic features from the semantic domain into mixed features in a mixed domain, and storing the mixed features in a feature database. The method further includes extracting visual features in a visual domain from a video stream and determining if the visual features indicate an unseen action in the video stream. If no unseen action is determined, applying an offline classification model to the visual features to identify seen actions, assigning identifiers to the identified seen actions, transforming the visual features from the visual domain into mixed features in the mixed domain, and storing the mixed features and seen action identifiers in the feature database. If an unseen action is determined, transforming the visual features from the visual domain into mixed features in the mixed domain, applying a continual learner model to mixed features from the feature database to identify unseen actions in the video stream, assigning identifiers to the identified unseen actions, and storing the unseen action identifiers in the feature database.
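    The seen/unseen dispatch described in this abstract can be sketched as below. The class and function names, and the representation of the feature database as a list of (feature, identifier) pairs, are illustrative assumptions; the abstract does not specify the data structures.

    ```python
    class ActionFeatureDB:
        """Hypothetical feature database holding mixed-domain features
        paired with their assigned action identifiers."""

        def __init__(self):
            self.mixed_features = []  # list of (mixed_feature, action_id)

        def store(self, mixed_feature, action_id):
            self.mixed_features.append((mixed_feature, action_id))

    def process_clip(visual_feature, is_unseen, offline_classifier,
                     continual_learner, db, to_mixed):
        """Dispatch per the abstract: seen actions go through the offline
        classification model; unseen actions go through the continual
        learner, which consults the feature database. Either way, the
        visual feature is transformed into the mixed domain and stored."""
        mixed = to_mixed(visual_feature)
        if not is_unseen:
            action_id = offline_classifier(visual_feature)
        else:
            action_id = continual_learner(mixed, db.mixed_features)
        db.store(mixed, action_id)
        return action_id
    ```

    In use, `offline_classifier`, `continual_learner`, and `to_mixed` would be trained models; here they are stand-in callables.
    
    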

    Methods and apparatus to match images using semantic features

    Publication Number: US11341736B2

    Publication Date: 2022-05-24

    Application Number: US16768559

    Filing Date: 2018-03-01

    Abstract: Methods and apparatus to match images using semantic features are disclosed. An example apparatus includes a semantic labeler to determine a semantic label for each of a first set of points of a first image and each of a second set of points of a second image; a binary robust independent element features (BRIEF) determiner to determine semantic BRIEF descriptors for a first subset of the first set of points and a second subset of the second set of points based on the semantic labels; and a point matcher to match first points of the first subset of points to second points of the second subset of points based on the semantic BRIEF descriptors.
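    The semantic BRIEF idea can be sketched as below: classic BRIEF builds a binary descriptor by comparing pixel intensities at sampled point pairs, and the abstract's variant bases the descriptor on semantic labels instead. The bit test and the greedy Hamming-distance matcher are illustrative assumptions, not the patented formulation.

    ```python
    def semantic_brief(label_at, pairs):
        """Hypothetical semantic BRIEF descriptor: each bit records whether
        the semantic labels at two sampled positions agree (classic BRIEF
        compares pixel intensities; here the comparison is on labels)."""
        return [1 if label_at(p) == label_at(q) else 0 for p, q in pairs]

    def hamming(a, b):
        """Bit-wise distance between two binary descriptors."""
        return sum(u != v for u, v in zip(a, b))

    def match_points(descs_a, descs_b, max_dist=1):
        """Greedy nearest-neighbour matching on Hamming distance (the
        abstract does not specify the matcher; this is an assumed choice)."""
        matches = []
        for i, da in enumerate(descs_a):
            j, d = min(((j, hamming(da, db)) for j, db in enumerate(descs_b)),
                       key=lambda t: t[1])
            if d <= max_dist:
                matches.append((i, j))
        return matches
    ```

    `label_at` maps an image position to its semantic label, e.g. the output of a segmentation model; any callable (such as a dict's `.get`) works.
    
    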

    Estimation of human orientation in images using depth information from a depth camera

    Publication Number: US11164327B2
    公开(公告)号:US11164327B2

    公开(公告)日:2021-11-02

    申请号:US16098649

    申请日:2016-06-02

    Abstract: Techniques are provided for estimation of human orientation and facial pose in images that include depth information. A methodology embodying the techniques includes detecting a human in an image generated by a depth camera and estimating an orientation category associated with the detected human. The estimation is based on application of a random forest classifier, with leaf node template matching, to the image. The orientation category defines a range of angular offsets relative to an angle corresponding to the human facing the depth camera. The method also includes performing a three dimensional (3D) facial pose estimation of the detected human, based on detected facial landmarks, in response to a determination that the estimated orientation category includes the angle corresponding to the human facing the depth camera.
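    The gating logic of this abstract, mapping an estimated angle to an orientation category and triggering facial pose estimation only when that category contains the camera-facing angle (0°), can be sketched as below. The four bins and their angular ranges are illustrative assumptions; the patent does not disclose the specific categories.

    ```python
    # Hypothetical orientation categories as angular ranges (degrees)
    # relative to the subject directly facing the depth camera (0 deg).
    ORIENTATION_BINS = {
        "front": (-45.0, 45.0),
        "left": (45.0, 135.0),
        "back": (135.0, 225.0),
        "right": (225.0, 315.0),
    }

    def category_for(angle_deg):
        """Map an estimated orientation angle (as would come from the
        random forest classifier) to its orientation category."""
        a = angle_deg % 360.0
        for name, (lo, hi) in ORIENTATION_BINS.items():
            if lo <= a < hi or lo <= a - 360.0 < hi:
                return name
        return "front"

    def should_estimate_facial_pose(category):
        """Per the abstract, 3D facial pose estimation runs only when the
        estimated category includes the camera-facing angle (0 deg)."""
        lo, hi = ORIENTATION_BINS[category]
        return lo < 0.0 < hi
    ```

    With these bins, an estimate of -30° falls in the "front" category and would trigger facial landmark-based pose estimation, while 90° ("left") would not.
    
    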
