-
1.
Publication Number: US20240331268A1
Publication Date: 2024-10-03
Application Number: US18127949
Filing Date: 2023-03-29
Inventors: Jiading Fang, Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Gregory Shakhnarovich, Matthew R. Walter, Adrien David Gaidon
CPC Classification: G06T15/08, G06N5/022, G06T15/06, G06T15/503, G06T2210/21, G06T2210/56
Abstract: Systems, methods, and other embodiments described herein relate to generating an image by interpolating features estimated from a learning model. In one embodiment, a method includes sampling three-dimensional (3D) points of a light ray that crosses a frustum space associated with a single-view camera, the 3D points reflecting depth estimates derived from data that the single-view camera generates for a scene. The method also includes deriving feature values for the 3D points using tri-linear interpolation across feature planes of the frustum space, the feature planes being estimated by a learning model. The method also includes inferring a two-dimensional (2D) image by translating the feature values and compositing the data with volumetric rendering for the scene. The method also includes executing a control task by a controller using the image.
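The compositing step in this abstract corresponds to the standard volumetric-rendering quadrature: opacities accumulate along the sampled ray and weight each point's feature value. A minimal NumPy sketch (the function name and array shapes are illustrative, not taken from the patent):

```python
import numpy as np

def composite_along_ray(densities, features, deltas):
    """Alpha-composite per-point feature values along one ray.

    densities: (N,) non-negative volume densities at the sampled 3D points
    features:  (N, C) feature values at those points
    deltas:    (N,) distances between consecutive samples
    Returns the composited (C,) value, as in standard volumetric rendering.
    """
    alphas = 1.0 - np.exp(-densities * deltas)                       # per-sample opacity
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))   # transmittance to each sample
    weights = trans * alphas                                          # compositing weights
    return (weights[:, None] * features).sum(axis=0)
```

Samples with zero density contribute nothing, and a fully opaque first sample dominates the result, matching the usual front-to-back compositing behavior.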
-
2.
Publication Number: US12033341B2
Publication Date: 2024-07-09
Application Number: US17390760
Filing Date: 2021-07-30
Inventors: Vitor Guizilini, Rares Andrei Ambrus, Adrien David Gaidon, Igor Vasiljevic, Gregory Shakhnarovich
IPC Classification: G06T7/55, B60R1/00, B60W60/00, G05D1/00, G05D1/248, G05D1/646, G06F18/214, G06N3/08, G06T3/04, G06T3/18, G06T3/40, G06T7/11, G06T7/292, G06T7/579, H04N23/90
CPC Classification: G06T7/55, B60R1/00, B60W60/001, G05D1/0212, G05D1/0246, G05D1/248, G05D1/646, G06F18/214, G06F18/2148, G06N3/08, G06T3/04, G06T3/18, G06T3/40, G06T7/11, G06T7/292, G06T7/579, H04N23/90, B60R2300/102, B60W2420/403, G06T2207/10028, G06T2207/20081, G06T2207/20084, G06T2207/30244, G06T2207/30252
Abstract: A method for scale-aware depth estimation using multi-camera projection loss is described. The method includes determining a multi-camera photometric loss associated with a multi-camera rig of an ego vehicle. The method also includes training a scale-aware depth estimation model and an ego-motion estimation model according to the multi-camera photometric loss. The method further includes predicting a 360° point cloud of a scene surrounding the ego vehicle according to the scale-aware depth estimation model and the ego-motion estimation model. The method also includes planning a vehicle control action of the ego vehicle according to the 360° point cloud of the scene surrounding the ego vehicle.
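A multi-camera photometric loss of the kind described can be sketched as a per-camera reconstruction error averaged over the rig. Published self-supervised methods typically combine SSIM with an L1 term; this illustration, with hypothetical names, uses L1 only:

```python
import numpy as np

def multi_camera_photometric_loss(targets, warped):
    """Mean absolute photometric error, averaged over all cameras in the rig.

    targets, warped: lists of (H, W, 3) images; warped[i] is the view
    synthesized for camera i from a neighboring frame via predicted
    depth and ego-motion.
    """
    per_cam = [np.abs(t - w).mean() for t, w in zip(targets, warped)]
    return float(np.mean(per_cam))
```

In training, this scalar would be minimized jointly over the depth and ego-motion networks.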
-
3.
Publication Number: US20240029286A1
Publication Date: 2024-01-25
Application Number: US18110421
Filing Date: 2023-02-16
Inventors: Vitor Guizilini, Igor Vasiljevic, Adrien D. Gaidon, Jiading Fang, Gregory Shakhnarovich, Matthew R. Walter, Rares A. Ambrus
CPC Classification: G06T7/593, G06T7/85, G06T2207/10028, G06T2207/10024, G06T2207/20081
Abstract: A method of generating additional supervision data to improve learning of a geometrically-consistent latent scene representation with a geometric scene representation architecture is provided. The method includes receiving, with a computing device, a latent scene representation encoding a pointcloud from images of a scene captured by a plurality of cameras, each with known intrinsics and poses; generating a virtual camera having a viewpoint different from the viewpoints of the plurality of cameras; projecting information from the pointcloud onto the viewpoint of the virtual camera; and decoding the latent scene representation based on the virtual camera, thereby generating an RGB image and a depth map corresponding to the viewpoint of the virtual camera for implementation as additional supervision data.
-
4.
Publication Number: US20220245843A1
Publication Date: 2022-08-04
Application Number: US17722360
Filing Date: 2022-04-17
Abstract: Systems and methods for self-supervised learning for visual odometry using camera images may include: estimating correspondences between keypoints of a target camera image and keypoints of a context camera image; based on the keypoint correspondences, lifting a set of 2D keypoints to 3D using a neural camera model; and projecting the 3D keypoints into the context camera image using the neural camera model. Some embodiments may use the neural camera model to achieve the lifting and projecting of keypoints without a known or calibrated camera model.
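The patent lifts and projects keypoints with a learned neural camera model; the sketch below substitutes an ordinary calibrated pinhole model (the function names and intrinsics are illustrative, not from the patent) to show the lift-then-project cycle:

```python
import numpy as np

def lift(keypoints, depths, K):
    """Back-project 2D pixel keypoints to 3D points in the camera frame."""
    ones = np.ones((keypoints.shape[0], 1))
    pix = np.hstack([keypoints, ones])              # homogeneous pixel coordinates
    rays = (np.linalg.inv(K) @ pix.T).T             # unit-depth viewing rays
    return rays * depths[:, None]                   # scale each ray by its depth

def project(points_3d, K, R, t):
    """Project 3D points into another camera given its relative pose (R, t)."""
    cam = (R @ points_3d.T).T + t                   # transform into the target frame
    pix = (K @ cam.T).T
    return pix[:, :2] / pix[:, 2:3]                 # perspective divide
```

With an identity relative pose, projecting the lifted keypoints recovers the original pixel coordinates, which is a useful sanity check for any lift/project pair.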
-
5.
Publication Number: US20240354991A1
Publication Date: 2024-10-24
Application Number: US18486619
Filing Date: 2023-10-13
Inventors: Vitor Campagnolo Guizilini, Igor Vasiljevic, Dian Chen, Adrien David Gaidon, Rares A. Ambrus
CPC Classification: G06T7/80, G06T7/50, G06T2207/10028, G06T2207/20081
Abstract: Systems, methods, and other embodiments described herein relate to estimating scaled depth maps by sampling variational representations of an image using a learning model. In one embodiment, a method includes encoding data embeddings by a learning model to form conditioned latent representations using attention networks, the data embeddings including features about an image from a camera and calibration information about the camera. The method also includes computing a probability distribution of the conditioned latent representations by factoring scale priors. The method also includes sampling the probability distribution to generate variations for the data embeddings. The method also includes estimating scaled depth maps of a scene from the variations at different coordinates using the attention networks.
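Sampling a conditioned latent distribution as described is commonly implemented with the reparameterization trick; the sketch below shows only that generic sampling step (the names are illustrative, and the conditioning on attention networks and scale priors is omitted):

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterized draw z = mu + sigma * eps, with eps ~ N(0, I).

    mu, log_var: mean and log-variance of a diagonal Gaussian over latents
    rng: a numpy Generator supplying the noise
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```

Because the noise is separated from the distribution parameters, gradients can flow through mu and log_var during training, which is the usual motivation for this form.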
-
6.
Publication Number: US20240161471A1
Publication Date: 2024-05-16
Application Number: US18364946
Filing Date: 2023-08-03
Inventors: Vitor Guizilini, Rares A. Ambrus, Jiading Fang, Sergey Zakharov, Vincent Sitzmann, Igor Vasiljevic, Adrien Gaidon
IPC Classification: G06V10/774, G06V20/40, G06V20/56, G06V20/64
CPC Classification: G06V10/7747, G06V20/41, G06V20/56, G06V20/64
Abstract: Systems and methods described herein support enhanced computer vision capabilities which may be applicable to, for example, autonomous vehicle operation. An example method includes generating, through training, a shared latent space based on (i) image data that include multiple images, where each image has a different viewing frame of a scene, and (ii) first and second types of embeddings, and training a decoder based on the first type of embeddings. The method also includes generating an embedding based on the first type of embeddings that is representative of a novel viewing frame of the scene, decoding, with the decoder, the shared latent space using cross-attention with the generated embedding, and generating the novel viewing frame of the scene based on an output of the decoder.
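Decoding a shared latent space with cross-attention, as described, reduces at its core to scaled dot-product attention in which the novel-view embedding supplies the queries and the latent space supplies the keys and values. A minimal sketch (names and shapes are illustrative, not from the patent):

```python
import numpy as np

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: queries attend over a latent set.

    queries: (Q, d) embeddings for the novel viewing frame
    keys, values: (N, d) and (N, c) projections of the shared latent space
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)    # subtract max for numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)        # softmax over the latent entries
    return attn @ values
```

A query aligned with one key almost exclusively retrieves that key's value, which is the retrieval behavior cross-attention decoders rely on.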
-
7.
Publication Number: US11727589B2
Publication Date: 2023-08-15
Application Number: US17377684
Filing Date: 2021-07-16
Inventors: Vitor Guizilini, Rares Andrei Ambrus, Adrien David Gaidon, Igor Vasiljevic, Gregory Shakhnarovich
IPC Classification: G06T7/55, B60R1/00, G06T3/00, G05D1/02, G06N3/08, G06T7/579, G06T7/292, G06T7/11, B60W60/00, G06T3/40, G06F18/214, H04N23/90
CPC Classification: G06T7/55, B60R1/00, B60W60/001, G05D1/0212, G05D1/0246, G06F18/214, G06F18/2148, G06N3/08, G06T3/0012, G06T3/0093, G06T3/40, G06T7/11, G06T7/292, G06T7/579, H04N23/90, B60R2300/102, B60W2420/42, G05D2201/0213, G06T2207/10028, G06T2207/20081, G06T2207/20084, G06T2207/30244, G06T2207/30252
Abstract: A method for multi-camera monocular depth estimation using pose averaging is described. The method includes determining a multi-camera photometric loss associated with a multi-camera rig of an ego vehicle. The method also includes determining a multi-camera pose consistency constraint (PCC) loss associated with the multi-camera rig of the ego vehicle. The method further includes adjusting the multi-camera photometric loss according to the multi-camera PCC loss to form a multi-camera PCC photometric loss. The method also includes training a multi-camera depth estimation model and an ego-motion estimation model according to the multi-camera PCC photometric loss. The method further includes predicting a 360° point cloud of a scene surrounding the ego vehicle according to the trained multi-camera depth estimation model and the ego-motion estimation model.
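The pose consistency constraint penalizes disagreement between per-camera ego-motion estimates. As a hedged illustration only, the sketch below measures spread around a rig-averaged translation; the patent's PCC loss also involves rotations and the rig extrinsics, which are omitted here:

```python
import numpy as np

def pose_consistency_loss(translations):
    """Penalize disagreement between per-camera ego-motion estimates.

    translations: (num_cams, 3) per-camera predicted translation vectors,
    expressed in a common rig frame. The loss is the mean distance of each
    estimate from the rig-averaged ("consensus") translation.
    """
    mean_t = translations.mean(axis=0)
    return float(np.linalg.norm(translations - mean_t, axis=1).mean())
```

The loss vanishes exactly when every camera agrees on the rig motion, which is the consistency the constraint is meant to enforce.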
-
8.
Publication Number: US11688090B2
Publication Date: 2023-06-27
Application Number: US17377161
Filing Date: 2021-07-15
Inventors: Vitor Guizilini, Rares Andrei Ambrus, Adrien David Gaidon, Igor Vasiljevic, Gregory Shakhnarovich
IPC Classification: G06T7/55, G06N3/08, G06T7/579, B60R1/00, G06T3/00, G05D1/02, G06T7/292, G06T7/11, B60W60/00, G06T3/40, G06F18/214, H04N23/90
CPC Classification: G06T7/55, B60R1/00, B60W60/001, G05D1/0212, G05D1/0246, G06F18/214, G06F18/2148, G06N3/08, G06T3/0012, G06T3/0093, G06T3/40, G06T7/11, G06T7/292, G06T7/579, H04N23/90, B60R2300/102, B60W2420/42, G05D2201/0213, G06T2207/10028, G06T2207/20081, G06T2207/20084, G06T2207/30244, G06T2207/30252
Abstract: A method for multi-camera self-supervised depth evaluation is described. The method includes training a self-supervised depth estimation model and an ego-motion estimation model according to a multi-camera photometric loss associated with a multi-camera rig of an ego vehicle. The method also includes generating a single-scale correction factor according to a depth map of each camera of the multi-camera rig during a time-step. The method further includes predicting a 360° point cloud of a scene surrounding the ego vehicle according to the self-supervised depth estimation model and the ego-motion estimation model. The method also includes scaling the 360° point cloud according to the single-scale correction factor to form an aligned 360° point cloud.
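A single-scale correction factor aligns predicted depth to a reference scale so the whole point cloud can be rescaled consistently. Median-ratio scaling, shown below with hypothetical inputs, is a familiar stand-in for the per-camera derivation the abstract describes:

```python
import numpy as np

def scale_correction_factor(pred_depths, ref_depths):
    """Single multiplicative factor aligning predicted depths to a reference.

    pred_depths, ref_depths: flat arrays of corresponding depth samples
    pooled across all cameras of the rig for one time-step. The median
    ratio is robust to outlier depth predictions.
    """
    return float(np.median(ref_depths) / np.median(pred_depths))
```

Multiplying every predicted depth (and hence every point in the cloud) by this factor yields the aligned, metrically consistent point cloud.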
-
9.
Publication Number: US11615544B2
Publication Date: 2023-03-28
Application Number: US17021978
Filing Date: 2020-09-15
Abstract: Systems and methods for map construction using a video sequence captured on a camera of a vehicle in an environment, comprising: receiving a video sequence from the camera, the video sequence including a plurality of image frames capturing a scene of the environment of the vehicle; using a neural camera model to predict a depth map and a ray surface for the plurality of image frames in the received video sequence; and constructing a map of the scene of the environment based on image data captured in the plurality of image frames and depth information in the predicted depth maps.
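A ray-surface camera model of the kind described lifts pixels to 3D by scaling each pixel's predicted ray direction by its predicted depth; a minimal sketch (function name and array shapes are illustrative, not from the patent):

```python
import numpy as np

def unproject(ray_surface, depth_map):
    """Lift pixels to 3D using a per-pixel ray surface and a depth map.

    ray_surface: (H, W, 3) predicted unit ray direction for every pixel
    depth_map:   (H, W) predicted depth along each ray
    Returns an (H*W, 3) point cloud in the camera frame.
    """
    points = ray_surface * depth_map[..., None]   # scale each ray by its depth
    return points.reshape(-1, 3)
```

Accumulating such per-frame point clouds, transformed by the estimated camera trajectory, is one straightforward way to assemble a map of the scene.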
-
10.
Publication Number: US11494927B2
Publication Date: 2022-11-08
Application Number: US17021951
Filing Date: 2020-09-15
Abstract: Systems and methods for self-supervised depth estimation using image frames captured from a vehicle-mounted camera may include: receiving a first image captured by the camera while the camera is mounted at a first location on the vehicle, the first image comprising pixels representing a scene of the environment of the vehicle; receiving a reference image captured by the camera while the camera is mounted at a second location on the vehicle, the reference image comprising pixels representing a scene of the environment; predicting a depth map for the first image comprising predicted depth values for pixels of the first image; warping the first image to a perspective of the camera at the second location on the vehicle to arrive at a warped first image; projecting the warped first image onto the reference image; determining a loss based on the projection; and updating predicted depth values for the first image.