Pedestrian attribute and gesture detection

    Publication No.: US12236705B1

    Publication date: 2025-02-25

    Application No.: US17320678

    Application date: 2021-05-14

    Applicant: Zoox, Inc.

    Abstract: Techniques for detecting attributes and/or gestures associated with pedestrians in an environment are described herein. The techniques may include receiving sensor data associated with a pedestrian in an environment of a vehicle and inputting the sensor data into a machine-learned model that is configured to determine a gesture and/or an attribute of the pedestrian. Based on the input data, an output may be received from the machine-learned model that indicates the gesture and/or the attribute of the pedestrian and the vehicle may be controlled based at least in part on the gesture and/or the attribute of the pedestrian. The techniques may also include training the machine-learned model to detect the attribute and/or the gesture of the pedestrian.

    Depth data model training with upsampling, losses, and loss balancing

    Publication No.: US11157774B2

    Publication date: 2021-10-26

    Application No.: US16684568

    Application date: 2019-11-14

    Applicant: Zoox, Inc.

    Abstract: Techniques for training a machine learned (ML) model to determine depth data based on image data are discussed herein. Training can use stereo image data and depth data (e.g., lidar data). A first (e.g., left) image can be input to a ML model, which can output predicted disparity and/or depth data. The predicted disparity data can be used with second image data (e.g., a right image) to reconstruct the first image. Differences between the first and reconstructed images can be used to determine a loss. Losses may include pixel, smoothing, structural similarity, and/or consistency losses. Further, differences between the depth data and the predicted depth data and/or differences between the predicted disparity data and the predicted depth data can be determined, and the ML model can be trained based on the various losses. Thus, the techniques can use self-supervised training and supervised training to train a ML model.
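The mix of self-supervised and supervised losses described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the single-scanline simplification, the L1 error, the loss weights, and the convention that a left pixel at column x matches the right image at column x - d are all assumptions made for illustration. Only the pixel loss and the lidar depth loss are shown; the smoothing, structural similarity, and consistency losses are omitted.

```python
def reconstruct_left_row(right_row, disparity_row):
    """Rebuild a left-image scanline by sampling the right image at x - d.

    Assumes rectified stereo; uses linear interpolation and clamps at borders.
    """
    width = len(right_row)
    out = []
    for x, d in enumerate(disparity_row):
        src = min(max(x - d, 0.0), width - 1.0)  # clamp to the valid range
        x0 = int(src)
        x1 = min(x0 + 1, width - 1)
        frac = src - x0
        out.append((1.0 - frac) * right_row[x0] + frac * right_row[x1])
    return out


def mean_abs_error(a, b):
    """L1 difference between two equally long sequences."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def combined_loss(left_row, right_row, pred_disparity,
                  pred_depth, lidar_depth, w_pixel=1.0, w_depth=1.0):
    """Weighted sum of a self-supervised pixel loss and a supervised depth loss.

    lidar_depth uses None where there is no lidar return (hypothetical encoding).
    """
    # Self-supervised term: compare the left image with its reconstruction.
    recon = reconstruct_left_row(right_row, pred_disparity)
    pixel_loss = mean_abs_error(left_row, recon)

    # Supervised term: compare predicted depth with lidar where it exists.
    pairs = [(p, g) for p, g in zip(pred_depth, lidar_depth) if g is not None]
    depth_loss = sum(abs(p - g) for p, g in pairs) / len(pairs) if pairs else 0.0

    return w_pixel * pixel_loss + w_depth * depth_loss
```

In practice the two weights would be tuned or balanced during training; the fixed defaults here are placeholders.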

    Machine learning techniques
    Type: Granted patent

    Publication No.: US10936922B2

    Publication date: 2021-03-02

    Application No.: US16013729

    Application date: 2018-06-20

    Applicant: Zoox, Inc.

    Abstract: Improved techniques for training a machine learning (ML) model are discussed herein. Training the ML model can be based on a subset of examples. In particular, the training can include identifying a reference region associated with an area of the image representing an object, and selecting, based at least in part on a first confidence score associated with a first bounding box, a first hard example for inclusion in the subset of examples. In some cases, the first confidence score and the first bounding box can be associated with a first portion of the feature map. Next, the training can include determining that a first degree of alignment of the first bounding box to the reference region is above a threshold degree of alignment, and in response, replacing the first hard example with a second hard example.
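One possible reading of the selection step in this abstract is sketched below. It assumes that "degree of alignment" can be approximated by intersection over union (IoU), that hard examples are the highest-confidence candidates, and that a candidate aligned too closely with the reference region is skipped in favor of the next one. The function names, the box format `(x1, y1, x2, y2)`, and the 0.5 threshold are hypothetical and do not come from the patent.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def select_hard_examples(candidates, reference_box, k, iou_threshold=0.5):
    """Pick up to k hard examples from (confidence, box) candidates.

    Candidates are visited from highest to lowest confidence. A candidate
    whose box aligns with the reference region above the threshold is
    replaced by the next candidate in the ranking.
    """
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    selected = []
    for confidence, box in ranked:
        if len(selected) == k:
            break
        if iou(box, reference_box) > iou_threshold:
            continue  # too well aligned: fall through to the next candidate
        selected.append((confidence, box))
    return selected
```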

    Instance segmentation inferred from machine learning model output

    Publication No.: US10817740B2

    Publication date: 2020-10-27

    Application No.: US16013764

    Application date: 2018-06-20

    Applicant: Zoox, Inc.

    Abstract: Techniques for using instance segmentation with machine learning (ML) models are discussed herein. An image can be provided as input to a ML model, which can generate, as an output from the ML model, a feature map comprising a plurality of features. Each feature of the plurality of features can comprise a confidence score, classification information, and a region of interest (ROI) determined in accordance with a non-maximal suppression (NMS) technique. Individual ROIs that are similar can be associated together for segmentation purposes. That is, instead of requiring a second ML model and/or a second operation to segment the image (e.g., identify which pixels correspond with the detected object, for example, by outputting a mask or set of lines and/or curves), the techniques discussed herein substantially simultaneously detect an object (e.g., determine an ROI) and segment the image.
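The idea of associating similar ROIs so that detection and segmentation come from one model output can be sketched roughly as follows. This is an illustrative guess, not the method claimed in the patent: it assumes each feature is a `(cell, confidence, box)` tuple, that "similar" means IoU at or above a threshold, and that the feature-map cells grouped with a surviving ROI form a coarse mask for that object. All names and thresholds are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def segment_from_features(features, nms_iou=0.5, similar_iou=0.7):
    """Return (kept_boxes, masks) from per-cell (cell, confidence, box) features.

    masks[i] lists the feature-map cells whose ROI is similar to kept_boxes[i].
    """
    # Greedy non-maximum suppression: keep a box only if it does not
    # overlap an already kept, higher-confidence box too strongly.
    ranked = sorted(features, key=lambda f: f[1], reverse=True)
    kept = []
    for _cell, _conf, box in ranked:
        if all(iou(box, k) <= nms_iou for k in kept):
            kept.append(box)

    # Associate every cell with the kept ROI its own ROI resembles most.
    masks = [[] for _ in kept]
    for cell, _conf, box in features:
        best = max(range(len(kept)), key=lambda i: iou(box, kept[i]))
        if iou(box, kept[best]) >= similar_iou:
            masks[best].append(cell)
    return kept, masks
```

The cells in each mask are feature-map coordinates, so a real pipeline would still need to scale them back to image pixels.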

    DEPTH DATA MODEL TRAINING
    Type: Patent application

    Publication No.: US20210150278A1

    Publication date: 2021-05-20

    Application No.: US16684554

    Application date: 2019-11-14

    Applicant: Zoox, Inc.

    Abstract: Techniques for training a machine learned (ML) model to determine depth data based on image data are discussed herein. Training can use stereo image data and depth data (e.g., lidar data). A first (e.g., left) image can be input to a ML model, which can output predicted disparity and/or depth data. The predicted disparity data can be used with second image data (e.g., a right image) to reconstruct the first image. Differences between the first and reconstructed images can be used to determine a loss. Losses may include pixel, smoothing, structural similarity, and/or consistency losses. Further, differences between the depth data and the predicted depth data and/or differences between the predicted disparity data and the predicted depth data can be determined, and the ML model can be trained based on the various losses. Thus, the techniques can use self-supervised training and supervised training to train a ML model.

    Machine Learning Techniques
    Type: Patent application

    Publication No.: US20190392268A1

    Publication date: 2019-12-26

    Application No.: US16013729

    Application date: 2018-06-20

    Applicant: Zoox, Inc.

    Abstract: Improved techniques for training a machine learning (ML) model are discussed herein. Training the ML model can be based on a subset of examples. In particular, the training can include identifying a reference region associated with an area of the image representing an object, and selecting, based at least in part on a first confidence score associated with a first bounding box, a first hard example for inclusion in the subset of examples. In some cases, the first confidence score and the first bounding box can be associated with a first portion of the feature map. Next, the training can include determining that a first degree of alignment of the first bounding box to the reference region is above a threshold degree of alignment, and in response, replacing the first hard example with a second hard example.

    INSTANCE SEGMENTATION INFERRED FROM MACHINE-LEARNING MODEL OUTPUT

    Publication No.: US20190392242A1

    Publication date: 2019-12-26

    Application No.: US16013764

    Application date: 2018-06-20

    Applicant: Zoox, Inc.

    Abstract: Techniques for using instance segmentation with machine learning (ML) models are discussed herein. An image can be provided as input to a ML model, which can generate, as an output from the ML model, a feature map comprising a plurality of features. Each feature of the plurality of features can comprise a confidence score, classification information, and a region of interest (ROI) determined in accordance with a non-maximal suppression (NMS) technique. Individual ROIs that are similar can be associated together for segmentation purposes. That is, instead of requiring a second ML model and/or a second operation to segment the image (e.g., identify which pixels correspond with the detected object, for example, by outputting a mask or set of lines and/or curves), the techniques discussed herein substantially simultaneously detect an object (e.g., determine an ROI) and segment the image.

    Key point detection
    Type: Granted patent

    Publication No.: US12100224B1

    Publication date: 2024-09-24

    Application No.: US17246016

    Application date: 2021-04-30

    Applicant: Zoox, Inc.

    CPC classification number: G06V20/58 G06N20/00 G06V10/462 G06V40/10

    Abstract: Techniques for detecting key points associated with objects in an environment are described herein. The techniques may include receiving sensor data representing a portion of an environment in which the vehicle is operating and inputting the sensor data into a machine-learned model. Based on the input sensor data, the machine-learned model may detect one or more key points corresponding to physical features (e.g., hands, feet, eyes, etc.) of a pedestrian who is in the environment. Based on the one or more key points, a bounding box associated with the pedestrian may be generated and the vehicle may be controlled based on at least one of the key points or the bounding box. The techniques may also include training the machine-learned model to detect key points associated with pedestrians.
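Generating a bounding box from detected key points, as this abstract describes, could look something like the sketch below. The abstract does not say how the box is computed; this version simply takes the extent of the confident key points and optionally pads it. The key point names, the `(x, y, score)` format, the score cutoff, and the margin are assumptions for illustration.

```python
def bounding_box_from_key_points(key_points, min_score=0.3, margin=0.0):
    """Return (x_min, y_min, x_max, y_max) enclosing the confident key points.

    key_points maps a name such as "left_hand" to (x, y, score).
    Returns None when no key point reaches min_score.
    """
    points = [(x, y) for x, y, score in key_points.values() if score >= min_score]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)
```

A small positive margin is one way to allow for body parts that extend beyond the outermost key points, such as the top of the head above the eyes.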
