-
Publication No.: US20240428576A1
Publication date: 2024-12-26
Application No.: US18613263
Filing date: 2024-03-22
Applicant: QUALCOMM Incorporated
Inventor: Tianyu JIANG , Manish Kumar SINGH , Hsin-Pai CHENG , Hong CAI , Mingu LEE , Kartikeya BHARDWAJ , Christopher LOTT , Fatih Murat PORIKLI
Abstract: Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. A transformed version of image pixels is accessed as input to an attention layer of a machine learning model. A number of local attention operations to apply, in one transformer, to the transformed version of image pixels is selected based at least in part on a size of the transformed version of image pixels. A transformer output for the attention layer of the machine learning model is generated based on applying the number of local attention operations and at least one global attention operation to the transformed version of image pixels.
-
Publication No.: US20240412493A1
Publication date: 2024-12-12
Application No.: US18537404
Filing date: 2023-12-12
Applicant: QUALCOMM Incorporated
Inventor: Risheek GARREPALLI , Yunxiao SHI , Hong CAI , Yinhao ZHU , Shubhankar Mangesh BORSE , Jisoo JEONG , Debasmit DAS , Manish Kumar SINGH , Rajeev YASARLA , Shizhong Steve HAN , Fatih Murat PORIKLI
IPC: G06V10/776 , G06T7/50 , G06V10/764 , G06V10/82 , G06V20/70
Abstract: Systems and techniques are provided for processing image data. According to some aspects, a computing device can generate a gradient (e.g., a classifier gradient using a trained classifier) associated with a current sample. The computing device can combine the gradient with an iterative model estimated score function or data associated with the current sample to generate a score function estimate. The computing device can predict, using the diffusion machine learning model and based on the score function estimate, a new sample.
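The combination step in this abstract can be sketched as follows. This is a minimal illustration that assumes the combination is additive with a scalar weight, as in common classifier-guidance formulations; the abstract does not state the exact rule. The names `guided_score` and `guidance_scale` are hypothetical.

```python
def guided_score(model_score, classifier_grad, guidance_scale=1.0):
    """Combine a model-estimated score with a classifier gradient.

    Assumption: a weighted element-wise sum. The abstract only says the
    two quantities are combined into a score function estimate.
    """
    return [s + guidance_scale * g for s, g in zip(model_score, classifier_grad)]


# Toy vectors standing in for per-element scores and gradients.
estimate = guided_score([0.5, -1.0], [0.25, 0.5], guidance_scale=2.0)
```

A diffusion sampler would then use `estimate` in place of the unguided score when predicting the next sample.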
-
Publication No.: US20240404093A1
Publication date: 2024-12-05
Application No.: US18327380
Filing date: 2023-06-01
Applicant: QUALCOMM Incorporated
Inventor: Jisoo JEONG , Hong CAI , Risheek GARREPALLI , Fatih Murat PORIKLI , Mathew SAM , Khalid TAHBOUB , Bing HAN
IPC: G06T7/593
Abstract: Systems and techniques are provided for generating disparity information from two or more images. For example, a process can include obtaining first disparity information corresponding to a pair of images, the pair of images including a first image of a scene and a second image of the scene. The process can include obtaining confidence information associated with the first disparity information. The process can include processing, using a machine learning network, the first disparity information and the confidence information to generate second disparity information corresponding to the pair of images. The process can include combining, based on the confidence information, the first disparity information with the second disparity information to generate a refined disparity map corresponding to the pair of images.
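The final combining step can be illustrated with a per-pixel blend. This sketch assumes confidence values in [0, 1] and a convex combination of the two disparity estimates; the abstract says only that the combination is based on the confidence information. The function name `blend_disparity` is hypothetical.

```python
def blend_disparity(first, second, confidence):
    """Blend two disparity maps pixel by pixel.

    Assumption: where confidence is high the first estimate is kept,
    and where it is low the network-refined estimate takes over.
    """
    refined = []
    for row_f, row_s, row_c in zip(first, second, confidence):
        refined.append(
            [c * f + (1.0 - c) * s for f, s, c in zip(row_f, row_s, row_c)]
        )
    return refined


first = [[10.0, 12.0], [8.0, 6.0]]       # first disparity information
second = [[11.0, 11.0], [9.0, 7.0]]      # output of the machine learning network
confidence = [[1.0, 0.0], [0.5, 0.5]]    # confidence in the first estimate
refined = blend_disparity(first, second, confidence)
```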
-
Publication No.: US20240303913A1
Publication date: 2024-09-12
Application No.: US18180797
Filing date: 2023-03-08
Applicant: QUALCOMM Incorporated
Inventor: Yinhao ZHU , Rui ZHU , Hong CAI , Fatih Murat PORIKLI
CPC classification number: G06T15/506 , G06T7/593
Abstract: Systems and techniques are provided for physical-based light estimation for inverse rendering of indoor scenes. For example, a computing device can obtain an estimated scene geometry based on a multi-view observation of a scene. The computing device can further obtain a light emission mask based on the multi-view observation of the scene. The computing device can also obtain an emitted radiance field based on the multi-view observation of the scene. The computing device can then determine, based on the light emission mask and the emitted radiance field, a geometry of at least one light source of the estimated scene geometry.
-
Publication No.: US20240161312A1
Publication date: 2024-05-16
Application No.: US18477493
Filing date: 2023-09-28
Applicant: QUALCOMM Incorporated
Inventor: Jisoo JEONG , Risheek GARREPALLI , Hong CAI , Fatih Murat PORIKLI
IPC: G06T7/246
CPC classification number: G06T7/248 , G06T2207/10016 , G06T2207/20081 , G06T2207/20084
Abstract: A computer-implemented method includes generating a first augmented frame by combining a first image and a first frame of a first frame pair. The computer-implemented method also includes generating, via an optical flow estimation model, a first flow estimation based on a second frame of the first frame pair and the first augmented frame. The computer-implemented method further includes updating one or both of parameters or weights of the optical flow estimation model based on a first loss between the first flow estimation and a training target.
-
Publication No.: US20240020844A1
Publication date: 2024-01-18
Application No.: US18349726
Filing date: 2023-07-10
Applicant: QUALCOMM Incorporated
Inventor: Debasmit DAS , Shubhankar Mangesh BORSE , Hyojin PARK , Kambiz AZARIAN YAZDI , Hong CAI , Risheek GARREPALLI , Fatih Murat PORIKLI
IPC: G06T7/11
CPC classification number: G06T7/11 , G06T2207/20081 , G06T2207/20004
Abstract: Systems and techniques are provided for processing data (e.g., image data). For instance, according to some aspects of the disclosure, a method may include receiving, at a transformer of a machine learning system, learnable queries, keys, and values obtained from a feature map of a segmentation model of the machine learning system. The method may further include learning, via the transformer, a mapping between an unsupervised output and a supervised output of the segmentation model based on the feature map.
-
Publication No.: US20250094793A1
Publication date: 2025-03-20
Application No.: US18469909
Filing date: 2023-09-19
Applicant: QUALCOMM Incorporated
Inventor: Manish Kumar SINGH , Tianyu JIANG , Hsin-Pai CHENG , Kartikeya BHARDWAJ , Hong CAI , Mingu LEE , Munawar HAYAT , Christopher LOTT , Fatih Murat PORIKLI
IPC: G06N3/0499
Abstract: A processor-implemented method for image or text processing includes receiving, by an artificial neural network (ANN) model, a set of tokens corresponding to an input. A token interaction block of the ANN model processes the set of tokens according to each channel of the input to generate a spatial mixture of a set of features for each channel of the input. A feed forward network block of the ANN model generates a mixture of channel features based on the spatial mixture of the set of features for each channel of the input. An attention block of the ANN model determines a set of attended features of the mixture of channel features according to a set of attention weights. In turn, the ANN model generates an inference based on the set of attended features of the mixture of channel features.
-
Publication No.: US20240386650A1
Publication date: 2024-11-21
Application No.: US18509113
Filing date: 2023-11-14
Applicant: QUALCOMM Incorporated
Inventor: Farhad GHAZVINIAN ZANJANI , Leyla MIRVAKHABOVA , Yinhao ZHU , Hong CAI , Fatih Murat PORIKLI
Abstract: Systems and techniques are provided for processing image data corresponding to a scene. A process can include generating a planar distance map including a planar distance value for each pixel of at least one image corresponding to the scene. Planar segmentation is performed based on the planar distance map, a normal map corresponding to the at least one image, and positional encoding information of the planar distance map. A triangular mesh fragment is initialized based on sampling points from each planar segment of a plurality of planar segments from the planar segmentation. Ray-triangle intersections are determined based on performing ray casting for a reconstructed planar mesh including a plurality of triangular mesh fragments each corresponding to a different image. A planar reconstruction and segmentation machine learning network is optimized for the scene, based on training the planar reconstruction and segmentation machine learning network using one or more loss functions.
-
Publication No.: US20240171727A1
Publication date: 2024-05-23
Application No.: US18470326
Filing date: 2023-09-19
Applicant: QUALCOMM Incorporated
Inventor: Yunxiao SHI , Hong CAI , Fatih Murat PORIKLI , Amin ANSARI , Sai Madhuraj JADHAV
IPC: H04N13/363 , G06T7/50 , G06V10/44 , G06V10/771 , H04N13/351
CPC classification number: H04N13/363 , G06T7/50 , G06V10/44 , G06V10/771 , H04N13/351 , G06V2201/07
Abstract: Systems and techniques are provided for processing image data. For example, a process can include obtaining a plurality of input images associated with a plurality of different spatial views. The process can include generating a set of features based on the plurality of input images. The process can include generating a set of projected features based on the set of features, wherein an embedding size associated with the set of projected features is smaller than an embedding size associated with the set of features. The process can include determining a cross-view attention associated with the plurality of different spatial views, the cross-view attention determined using the set of projected features.
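The idea of computing cross-view attention on projected, lower-dimensional features can be sketched in plain Python. This is an illustration under assumptions not stated in the abstract: a single linear projection shared by both views, scaled dot-product scores, and a softmax over the tokens of the other view. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_view_attention(feats_a, feats_b, proj):
    """Attention weights from view A tokens to view B tokens.

    Both feature sets are first projected to a smaller embedding size,
    so the score matrix is computed on the cheaper projected features.
    """
    q = matmul(feats_a, proj)                  # (tokens_a, d_small)
    k = matmul(feats_b, proj)                  # (tokens_b, d_small)
    k_t = [list(col) for col in zip(*k)]       # transpose of k
    scale = math.sqrt(len(proj[0]))
    scores = matmul(q, k_t)                    # (tokens_a, tokens_b)
    return [softmax([s / scale for s in row]) for row in scores]


feats_a = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 0.0]]                         # view A: 2 tokens, size 4
feats_b = [[1.0, 1.0, 0.0, 0.0], [0.0, 2.0, 1.0, 1.0], [1.0, 0.0, 1.0, 0.0]]   # view B: 3 tokens, size 4
proj = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.5], [0.0, 1.0]]                        # projects size 4 to size 2
weights = cross_view_attention(feats_a, feats_b, proj)
```

Each row of `weights` sums to 1 and describes how strongly one token in view A attends to each token in view B.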
-
Publication No.: US20230252658A1
Publication date: 2023-08-10
Application No.: US17650027
Filing date: 2022-02-04
Applicant: QUALCOMM Incorporated
Inventor: Hong CAI , Shichong PENG , Janarbek MATAI , Jamie Menjay LIN , Debasmit DAS , Fatih Murat PORIKLI
CPC classification number: G06T7/50 , G06T7/10 , G06N3/0454 , G06T2207/20084 , G06T2207/20212
Abstract: Certain aspects of the present disclosure provide techniques for generating fine depth maps for images of a scene based on semantic segmentation and segment-based refinement neural networks. An example method generally includes generating, through a segmentation neural network, a segmentation map based on an image of a scene. The segmentation map generally comprises a map segmenting the scene into a plurality of regions, and each region of the plurality of regions is generally associated with one of a plurality of categories. A first depth map of the scene is generated through a first depth neural network based on a depth measurement of the scene. A second depth map of the scene is generated through a depth refinement neural network based on the segmentation map and the first depth map. One or more actions are taken based on the second depth map of the scene.