REAL-TIME VISUAL OBJECT TRACKING FOR UNMANNED AERIAL VEHICLES (UAVS)

    Publication No.: US20220114739A1

    Publication Date: 2022-04-14

    Application No.: US17558588

    Filing Date: 2021-12-21

    IPC Classification: G06T7/262 B64C39/02 G06V10/20

    Abstract: Embodiments described herein provide various examples of real-time visual object tracking. In one aspect, a process for performing a local re-identification of a target object that was earlier detected in a video but later lost during tracking is disclosed. This process begins by receiving a current video frame of the video and a predicted location of the target object. The process then places a current search window in the current video frame, centered on or in the vicinity of the predicted location of the target object. Next, the process extracts a feature map from an image patch within the current search window. The process further retrieves a set of stored feature maps computed at a set of previously-determined locations of the target object from a set of previously-processed video frames in the video. The process next computes a set of correlation maps between the extracted feature map and each of the stored feature maps. The process then attempts to re-identify the target object locally in the current video frame based on the set of computed correlation maps.
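The matching step described above can be sketched as a minimal re-identification routine that scores a freshly extracted feature map against each stored feature map. Normalized cross-correlation is used here as an illustrative score, and the `threshold` parameter is an assumption; the patent's actual correlation-map computation and decision rule are not reproduced.

```python
import numpy as np

def correlation_score(f, g):
    """Normalized cross-correlation between two feature maps (illustrative choice)."""
    f = f.ravel() - f.mean()
    g = g.ravel() - g.mean()
    denom = np.linalg.norm(f) * np.linalg.norm(g)
    return float(f @ g / denom) if denom else 0.0

def local_reidentify(current_map, stored_maps, threshold=0.5):
    """Score the current feature map against each stored map; re-identify the
    target only when the best match clears the (assumed) threshold."""
    scores = [correlation_score(current_map, m) for m in stored_maps]
    best = int(np.argmax(scores))
    if scores[best] >= threshold:
        return best, scores[best]
    return None, scores[best]
```

A perfect match scores 1.0, so the routine returns the index of the stored map that best explains the current window, or `None` when no stored appearance is similar enough.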

    Joint face-detection and head-pose-angle-estimation using small-scale convolutional neural network (CNN) modules for embedded systems

    Publication No.: US10467458B2

    Publication Date: 2019-11-05

    Application No.: US15789957

    Filing Date: 2017-10-20

    IPC Classification: G06K9/00 G06T7/20

    Abstract: Embodiments described herein provide various examples of a joint face-detection and head-pose-angle-estimation system based on a small-scale hardware CNN module, such as the built-in CNN module in the HiSilicon Hi3519 system-on-chip. In some embodiments, the disclosed joint face-detection and head-pose-angle-estimation system is configured to jointly perform multiple tasks: detecting most or all faces in a sequence of video frames, generating pose-angle estimations for the detected faces, tracking detected faces of the same person across the sequence of video frames, and generating a “best-pose” estimation for the person being tracked. The disclosed joint face-detection and pose-angle-estimation system can be implemented on resource-limited embedded systems, such as smart camera systems that are integrated with only one or more small-scale CNN modules. The proposed system, in conjunction with a subimage-based technique, makes it possible to perform multiple face-detection and face-recognition tasks on high-resolution input images with small-scale, low-cost CNN modules.
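The “best-pose” bookkeeping described above can be sketched as follows: per tracked person, keep the detection whose head-pose angles are closest to frontal. The `Detection` structure, the frontalness criterion (sum of absolute angles), and all field names are illustrative assumptions, not the patent's definitions.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    person_id: int   # track ID assigned across video frames (assumed)
    box: tuple       # face bounding box (x, y, w, h)
    pose: tuple      # (yaw, pitch, roll) head-pose angles in degrees

def frontalness(pose):
    # smaller summed absolute angles = closer to a frontal view (assumed criterion)
    return sum(abs(a) for a in pose)

def update_best_pose(best_by_person, detections):
    """Keep, for each tracked person, the detection whose head pose is closest
    to frontal -- a stand-in for the abstract's 'best-pose' estimation."""
    for det in detections:
        cur = best_by_person.get(det.person_id)
        if cur is None or frontalness(det.pose) < frontalness(cur.pose):
            best_by_person[det.person_id] = det
    return best_by_person
```

Calling `update_best_pose` once per frame keeps a single best face crop per person, which is what a downstream face-recognition stage would consume.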

    Face detection using small-scale convolutional neural network (CNN) modules for embedded systems

    Publication No.: US10268947B2

    Publication Date: 2019-04-23

    Application No.: US15657109

    Filing Date: 2017-07-21

    Abstract: Embodiments described herein provide various examples of a face detection system based on a small-scale hardware convolutional neural network (CNN) module configured into a multi-task cascaded CNN. In some embodiments, a subimage-based CNN system can be configured to be equivalent to a large-scale CNN that processes the entire input image without partitioning, such that the output of the subimage-based CNN system is exactly identical to the output of the large-scale CNN. Based on this observation, some embodiments of this patent disclosure apply the subimage-based CNN system and technique to one or more stages of a cascaded CNN or a multi-task cascaded CNN (MTCNN), so that a larger input image to a given stage of the cascaded CNN or the MTCNN can be partitioned into a set of smaller subimages. As a result, each stage of the cascaded CNN or the MTCNN can use the same small-scale hardware CNN module that is associated with a maximum input image size constraint.
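The equivalence claim — that partitioned subimages can reproduce the full-image output exactly — hinges on overlapping the partitions by the receptive-field size minus one. A toy NumPy sketch with a single VALID convolution layer illustrates this; the kernel, strip size, and single-layer setup are assumptions for illustration, not the patent's configuration.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Plain 2-D VALID cross-correlation (no padding)."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def tiled_conv2d(img, kernel, strip_rows):
    """Convolve overlapping horizontal strips and stitch the results.
    Strips overlap by kh-1 rows, so the stitched output matches the
    full-image convolution exactly (strip_rows assumed to tile H evenly)."""
    kh = kernel.shape[0]
    outs, i = [], 0
    while i < img.shape[0] - kh + 1:
        outs.append(conv2d_valid(img[i:i + strip_rows], kernel))
        i += strip_rows - (kh - 1)
    return np.vstack(outs)
```

With a 3×3 kernel, 6-row strips overlapping by 2 rows stitch bit-exactly into the full-image result, which is the property that lets every strip run through the same size-constrained hardware CNN module.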

    VIDEO-BASED FALL RISK ASSESSMENT SYSTEM
    Invention Application

    Publication No.: US20200205697A1

    Publication Date: 2020-07-02

    Application No.: US16731025

    Filing Date: 2019-12-30

    Abstract: Various embodiments of a video-based fall risk assessment system are disclosed. During operation, this fall risk assessment system can receive a sequence of video frames including a person being monitored for fall risk assessment. The system next generates a sequence of action labels for the sequence of video frames by, for each video frame in the sequence: estimating a pose of the person within the video frame; and classifying the estimated pose as a given action among a set of predetermined actions. Next, the system identifies a subset of action labels within the sequence of action labels. The system next extracts a set of gait features for the person from the subset of video frames corresponding to the subset of action labels. Subsequently, the system analyzes the set of extracted gait features to generate a fall risk assessment for the person. In some embodiments, the sequence of video frames is captured during a predetermined time period, such as an hour, a day, or a week.
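The label-select-extract-assess pipeline above can be sketched end to end. Everything concrete here is an assumption for illustration: the "walking" action label, the single gait feature (mean horizontal ankle speed), and the threshold rule are stand-ins for the patent's unspecified features and classifier.

```python
def walking_frames(action_labels, target="walking"):
    # indices of frames labeled with the gait-relevant action (assumed label)
    return [i for i, lab in enumerate(action_labels) if lab == target]

def gait_features(ankle_x, frame_indices, fps=30.0):
    """Toy gait feature: mean horizontal ankle speed (pixels/second) over the
    selected frames. The patent's actual gait features are not specified here."""
    xs = [ankle_x[i] for i in frame_indices]
    if len(xs) < 2:
        return {"mean_speed": 0.0}
    speeds = [abs(b - a) * fps for a, b in zip(xs, xs[1:])]
    return {"mean_speed": sum(speeds) / len(speeds)}

def fall_risk(features, slow_threshold=10.0):
    # hypothetical rule: unusually slow gait flags elevated fall risk
    return "elevated" if features["mean_speed"] < slow_threshold else "normal"
```

The point of the sketch is the data flow — per-frame labels gate which frames feed the gait-feature extractor, and the features alone drive the assessment.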

    HIGH-QUALITY TRAINING DATA PREPARATION FOR HIGH-PERFORMANCE FACE RECOGNITION SYSTEMS

    Publication No.: US20190205620A1

    Publication Date: 2019-07-04

    Application No.: US15859652

    Filing Date: 2017-12-31

    IPC Classification: G06K9/00 G06K9/62 G06K9/46

    Abstract: Embodiments described herein provide various examples of a face-image training data preparation system for performing large-scale face-image training data acquisition, pre-processing, cleaning, balancing, and post-processing. The disclosed training data preparation system can collect a very large set of loosely-labeled images of different people from the public domain and then generate a raw training dataset that includes a set of incorrectly-labeled face images. The system can then perform cleaning and balancing operations on the raw training dataset to generate a high-quality face-image training dataset free of the incorrectly-labeled face images. The processed high-quality training dataset can subsequently be used to train deep-neural-network-based face recognition systems to achieve high performance in various face recognition applications. Compared to conventional face recognition systems and techniques, the disclosed training data preparation system and technique provide a fully-automatic, highly-deterministic, high-quality training data preparation procedure that does not rely heavily on assumptions.
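One common way to implement the cleaning step is outlier rejection in embedding space: drop images whose face embedding sits far from the centroid of their label group. This sketch uses cosine similarity with an assumed threshold; the patent's actual cleaning criterion is not specified here.

```python
import numpy as np

def clean_label_group(embeddings, cos_threshold=0.6):
    """Keep only face images whose embedding is close (cosine similarity) to
    the group centroid; far-off embeddings are treated as incorrectly labeled.
    The threshold and centroid rule are illustrative assumptions."""
    E = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    centroid = E.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    sims = E @ centroid
    return np.flatnonzero(sims >= cos_threshold)
```

Running this once per identity yields the indices of images to retain; a balancing pass would then cap or augment each identity's retained set to a common size.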

    Real-time visual object tracking for unmanned aerial vehicles (UAVs)

    Publication No.: US11645765B2

    Publication Date: 2023-05-09

    Application No.: US17558588

    Filing Date: 2021-12-21

    Abstract: Embodiments described herein provide various examples of real-time visual object tracking. In one aspect, a process for performing a local re-identification of a target object that was earlier detected in a video but later lost during tracking is disclosed. This process begins by receiving a current video frame of the video and a predicted location of the target object. The process then places a current search window in the current video frame, centered on or in the vicinity of the predicted location of the target object. Next, the process extracts a feature map from an image patch within the current search window. The process further retrieves a set of stored feature maps computed at a set of previously-determined locations of the target object from a set of previously-processed video frames in the video. The process next computes a set of correlation maps between the extracted feature map and each of the stored feature maps. The process then attempts to re-identify the target object locally in the current video frame based on the set of computed correlation maps.
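The window-placement step above can be sketched as a small clamping routine that centers a square search window on the predicted location while keeping it inside the frame. The window size, the (x, y) pixel convention, and the clamping behavior are illustrative assumptions.

```python
def place_search_window(pred_xy, win, frame_hw):
    """Return (left, top, width, height) of a square search window centered
    on the predicted target location, clamped to stay inside the frame.
    Window size `win` and the coordinate convention are assumed, not from
    the patent."""
    x, y = pred_xy
    H, W = frame_hw
    left = max(0, min(int(x) - win // 2, W - win))
    top = max(0, min(int(y) - win // 2, H - win))
    return left, top, win, win
```

Near the frame border the window slides inward rather than shrinking, so the feature extractor always sees a full-size patch.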

    OBSTACLE AVOIDANCE SYSTEM BASED ON EMBEDDED STEREO VISION FOR UNMANNED AERIAL VEHICLES

    Publication No.: US20190304120A1

    Publication Date: 2019-10-03

    Application No.: US15943978

    Filing Date: 2018-04-03

    IPC Classification: G06T7/593 G06K9/00 B64C39/02

    Abstract: Embodiments described herein provide various examples of an automatic obstacle avoidance system for unmanned vehicles using embedded stereo vision techniques. In one aspect, a UAV capable of performing autonomous obstacle detection and avoidance is disclosed. This UAV includes: a stereo vision camera set, coupled to one or more processors and a memory, to capture a sequence of stereo images; and a stereo vision module configured to: receive a pair of stereo images captured by a pair of stereo vision cameras; perform a border cropping operation on the pair of stereo images to obtain a pair of cropped stereo images; perform a subsampling operation on the pair of cropped stereo images to obtain a pair of subsampled stereo images; and perform a dense stereo matching operation on the pair of subsampled stereo images to generate a dense three-dimensional (3D) point map of the space corresponding to the pair of stereo images.
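The crop → subsample → match pipeline can be sketched in a few lines. The margin, subsampling factor, and single-pixel matching cost are assumptions for illustration; real dense stereo matching aggregates block costs rather than comparing lone pixels.

```python
import numpy as np

def crop_border(img, margin):
    # drop a fixed margin on every side (the border cropping step)
    return img[margin:img.shape[0] - margin, margin:img.shape[1] - margin]

def subsample(img, factor=2):
    # keep every `factor`-th pixel in both directions (the subsampling step)
    return img[::factor, ::factor]

def row_disparity(left, right, max_disp=4):
    """Naive per-pixel matching along one scanline: choose the horizontal
    shift with the smallest absolute intensity difference. Only a sketch of
    the dense stereo matching step."""
    disp = []
    for x in range(len(left)):
        costs = [abs(left[x] - right[x - d]) for d in range(min(max_disp, x) + 1)]
        disp.append(costs.index(min(costs)))
    return disp
```

Cropping and subsampling shrink the images before matching, which is what makes dense disparity estimation tractable on an embedded processor; depth then follows from disparity via the stereo baseline and focal length.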