VIDEO-BASED ACTIVITY RECOGNITION
    Patent application

    Publication No.: US20220076039A1

    Publication date: 2022-03-10

    Application No.: US17185800

    Filing date: 2021-02-25

    Abstract: Systems and techniques are provided for performing video-based activity recognition. For example, a process can include extracting, using a first machine learning model, first one or more features from a first frame and second one or more features from a second frame. The first one or more features and the second one or more features are associated with a person driving a vehicle. The process can include processing, using a second machine learning model, the first one or more features and the second one or more features. The process can include determining, based on processing of the first one or more features and the second one or more features using the second machine learning model, at least one activity associated with the person driving the vehicle.

    COMBINING CONVOLUTION AND DECONVOLUTION FOR OBJECT DETECTION

    Publication No.: US20190303715A1

    Publication date: 2019-10-03

    Application No.: US15940907

    Filing date: 2018-03-29

    Abstract: Provided are systems, methods, and computer-readable media for operating a neural network. In various implementations, the neural network can receive an input image that includes an object to be identified. The neural network can generate a plurality of initial feature maps using convolution layers, wherein a first initial feature map is generated using the input image. The neural network can generate an up-sampled feature map using a de-convolution layer that takes an initial feature map as input, where the up-sampled feature map has the same resolution as the previous initial feature map. The neural network can combine the up-sampled feature map and the previous initial feature map, and use the combined feature map to more accurately identify the object.
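
The up-sample-and-combine step in this abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: nearest-neighbor 2x up-sampling stands in for the learned de-convolution layer, the two maps are combined by element-wise addition, and feature maps are plain nested lists. The names `upsample_2x` and `combine` are hypothetical.

```python
def upsample_2x(feature_map):
    """Nearest-neighbor 2x up-sampling (a stand-in for a learned de-convolution layer)."""
    out = []
    for row in feature_map:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def combine(upsampled, previous):
    """Element-wise sum of two feature maps that share the same resolution."""
    same_rows = len(upsampled) == len(previous)
    if not same_rows or any(len(a) != len(b) for a, b in zip(upsampled, previous)):
        raise ValueError("feature maps must have the same resolution")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(upsampled, previous)]


coarse = [[1, 2]]                            # 1x2 map from a deeper layer
previous = [[10, 10, 10, 10],
            [10, 10, 10, 10]]                # 2x4 map from an earlier layer
combined = combine(upsample_2x(coarse), previous)
print(combined)  # [[11, 11, 12, 12], [11, 11, 12, 12]]
```

A real network would learn the de-convolution weights and might concatenate channels instead of adding them; the sketch only shows how the resolutions are made to match before the maps are merged.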

    SEGMENTATION WITH MONOCULAR DEPTH ESTIMATION

    Publication No.: US20240394893A1

    Publication date: 2024-11-28

    Application No.: US18695794

    Filing date: 2021-12-01

    Abstract: Systems, methods, and computer-readable media are provided for performing image segmentation with depth filtering. In some examples, a method can include obtaining a frame capturing a scene; generating, based on the frame, a first segmentation map including a target segmentation mask identifying a target of interest and one or more background masks identifying one or more background regions of the frame; and generating a second segmentation map including the first segmentation map with the one or more background masks filtered out, the one or more background masks being filtered from the first segmentation map based on a depth map associated with the frame.
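
The depth-filtering step can be sketched as follows. The abstract does not say how the depth map is used, so the rule here is an assumption: a mask is dropped when its mean depth differs from the target mask's mean depth by more than a threshold. The function name `filter_by_depth`, the parameters `max_depth_gap` and `empty_label`, and the integer-label representation of the segmentation map are all hypothetical.

```python
from collections import defaultdict


def filter_by_depth(seg_map, depth_map, target_label, max_depth_gap=1.0, empty_label=0):
    """Return a copy of seg_map that keeps only masks whose mean depth is near the target's."""
    depth_sum = defaultdict(float)
    count = defaultdict(int)
    # Accumulate per-label depth statistics over every pixel.
    for seg_row, depth_row in zip(seg_map, depth_map):
        for label, depth in zip(seg_row, depth_row):
            depth_sum[label] += depth
            count[label] += 1
    mean_depth = {label: depth_sum[label] / count[label] for label in count}
    target_depth = mean_depth[target_label]
    # Assumed rule: keep a mask only if it lies within max_depth_gap of the target.
    keep = {label for label, d in mean_depth.items() if abs(d - target_depth) <= max_depth_gap}
    return [[label if label in keep else empty_label for label in row] for row in seg_map]


first_map = [[1, 1, 2],
             [1, 2, 2]]              # 1 = target mask, 2 = background mask
depth = [[1.0, 1.2, 5.0],
         [0.8, 5.5, 4.5]]            # per-pixel depth estimates
second_map = filter_by_depth(first_map, depth, target_label=1)
print(second_map)  # [[1, 1, 0], [1, 0, 0]]
```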

    CONVOLUTION AND TRANSFORMER-BASED IMAGE SEGMENTATION

    Publication No.: US20240378727A1

    Publication date: 2024-11-14

    Application No.: US18316823

    Filing date: 2023-05-12

    Abstract: Techniques are provided for image processing. For instance, a process can include obtaining an image; extracting a first set of features at a first scale resolution; extracting a second set of features at a second scale resolution (lower than the first scale resolution); performing a self-attention transform to generate similarity scores for the second set of features; adding the similarity scores to the second set of features to generate a first feature extractor output; up-sampling the first feature extractor output to generate a second feature extractor output; adding the second feature extractor output to the first set of features to generate a third feature extractor output; receiving an instance query; performing a cross-attention transform on the instance query and the first feature extractor output to generate a set of weights; and matrix multiplying the set of weights and the third feature extractor output to generate instance masks.
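
The data flow in this abstract can be traced with a small pure-Python sketch. Several simplifications are assumptions, not details from the abstract: features are flattened into token lists (one vector of C channels per position), attention is unparameterized dot-product attention with no learned projections, the "similarity scores" are modeled as the attention output, and up-sampling repeats each low-resolution token twice. All function names are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    out = []
    for row in m:
        peak = max(row)                       # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attend(queries, keys_values):
    """Dot-product attention, softmax(Q K^T / sqrt(C)) V, with K = V and no learned weights."""
    scale = math.sqrt(len(queries[0]))
    scores = [[v / scale for v in row] for row in matmul(queries, transpose(keys_values))]
    return matmul(softmax_rows(scores), keys_values)


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def instance_masks(high_res, low_res, instance_queries):
    out1 = add(low_res, attend(low_res, low_res))          # self-attention added to low-res features
    out2 = [list(tok) for tok in out1 for _ in range(2)]   # 2x up-sampling by repetition
    out3 = add(out2, high_res)                             # fuse with high-res features
    weights = attend(instance_queries, out1)               # cross-attention: K x C weights
    return matmul(weights, transpose(out3))                # K x N mask logits


high_res = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]  # 4 positions, 2 channels
low_res = [[1.0, 0.0], [0.0, 1.0]]                            # 2 positions, 2 channels
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                # 3 instance queries
masks = instance_masks(high_res, low_res, queries)
print(len(masks), len(masks[0]))  # 3 4
```

Each row of `masks` holds one instance's logits over the high-resolution positions; a real model would threshold or apply a sigmoid to these, which the abstract does not cover.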
