Patent search ap:("QUALCOMM Incorporated") AND inv:"Hong CAI" Page 2

11.

Invention application
TRANSFORMER WITH MULTI-SCALE MULTI-CONTEXT ATTENTIONS (In force)

Publication (announcement) number: US20240428576A1

Publication (announcement) date: 2024-12-26

Application number: US18613263

Application date: 2024-03-22

Applicant: QUALCOMM Incorporated

Inventor: Tianyu JIANG, Manish Kumar SINGH, Hsin-Pai CHENG, Hong CAI, Mingu LEE, Kartikeya BHARDWAJ, Christopher LOTT, Fatih Murat PORIKLI

IPC: G06V10/82, G06V10/70, G06V10/77

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. A transformed version of image pixels is accessed as input to an attention layer of a machine learning model. A number of local attention operations to apply, in one transformer, to the transformed version of image pixels is selected based at least in part on a size of the transformed version of image pixels. A transformer output for the attention layer of the machine learning model is generated based on applying the number of local attention operations and at least one global attention operation to the transformed version of image pixels.
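
The abstract does not say how the number of local attention operations is derived from the input size. As a minimal sketch only, the following assumes a hypothetical rule: one local attention operation per candidate window size that fits inside the feature map, followed by a single global attention operation. The function names and the default window sizes are illustrative assumptions, not taken from the patent.

```python
def select_num_local_ops(height, width, window_sizes=(3, 7, 15)):
    """Hypothetical rule: count the candidate windows that fit in the feature map."""
    shorter_side = min(height, width)
    return sum(1 for w in window_sizes if w <= shorter_side)


def attention_plan(height, width, window_sizes=(3, 7, 15)):
    """List the attention operations one layer would apply under the assumed rule."""
    num_local = select_num_local_ops(height, width, window_sizes)
    # Local operations first, then one global operation over the whole map.
    return ["local"] * num_local + ["global"]


if __name__ == "__main__":
    for size in (2, 8, 64):
        print(size, attention_plan(size, size))
```

Under this assumed rule, a larger feature map gets more local attention scales, while a very small one falls back to global attention alone.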

12.

Invention application
TEST-TIME SELF-SUPERVISED GUIDANCE FOR DIFFUSION MODELS (In force)

Publication (announcement) number: US20240412493A1

Publication (announcement) date: 2024-12-12

Application number: US18537404

Application date: 2023-12-12

Applicant: QUALCOMM Incorporated

Inventor: Risheek GARREPALLI, Yunxiao SHI, Hong CAI, Yinhao ZHU, Shubhankar Mangesh BORSE, Jisoo JEONG, Debasmit DAS, Manish Kumar SINGH, Rajeev YASARLA, Shizhong Steve HAN, Fatih Murat PORIKLI

IPC: G06V10/776, G06T7/50, G06V10/764, G06V10/82, G06V20/70

Abstract: Systems and techniques are provided for processing image data. According to some aspects, a computing device can generate a gradient (e.g., a classifier gradient using a trained classifier) associated with a current sample. The computing device can combine the gradient with an iterative model estimated score function or data associated with the current sample to generate a score function estimate. The computing device can predict, using the diffusion machine learning model and based on the score function estimate, a new sample.
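
The combination step in this abstract resembles the familiar guided-diffusion pattern of adding a scaled gradient to the model's score. The sketch below assumes that additive form and uses a plain score-ascent update as a stand-in for the diffusion model's own sampler. `guidance_scale`, `step_size`, and both function names are hypothetical, and samples are flat lists of floats for simplicity.

```python
def guided_score(model_score, gradient, guidance_scale=1.0):
    """Combine the model's estimated score with a guidance gradient (assumed additive)."""
    return [s + guidance_scale * g for s, g in zip(model_score, gradient)]


def predict_next_sample(sample, score_estimate, step_size=0.1):
    """Stand-in update: move the current sample along the score estimate."""
    return [x + step_size * s for x, s in zip(sample, score_estimate)]


if __name__ == "__main__":
    current = [0.0, 1.0]
    score = guided_score([1.0, -2.0], [0.5, 0.5], guidance_scale=2.0)
    print(score)
    print(predict_next_sample(current, score, step_size=0.5))
```

A real sampler would also add noise and follow a noise schedule; both are omitted here to keep the combination step visible.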

13.

Invention application
DISPARITY-BASED DEPTH REFINEMENT USING CONFIDENCE INFORMATION AND STEREOSCOPIC DEPTH INFORMATION (In force)

Publication (announcement) number: US20240404093A1

Publication (announcement) date: 2024-12-05

Application number: US18327380

Application date: 2023-06-01

Applicant: QUALCOMM Incorporated

Inventor: Jisoo JEONG, Hong CAI, Risheek GARREPALLI, Fatih Murat PORIKLI, Mathew SAM, Khalid TAHBOUB, Bing HAN

IPC: G06T7/593

Abstract: Systems and techniques are provided for generating disparity information from two or more images. For example, a process can include obtaining first disparity information corresponding to a pair of images, the pair of images including a first image of a scene and a second image of the scene. The process can include obtaining confidence information associated with the first disparity information. The process can include processing, using a machine learning network, the first disparity information and the confidence information to generate second disparity information corresponding to the pair of images. The process can include combining, based on the confidence information, the first disparity information with the second disparity information to generate a refined disparity map corresponding to the pair of images.
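
The final step, combining the two disparity estimates based on the confidence information, could take the form of a per-pixel weighted blend. The sketch below assumes confidence values in [0, 1] and a linear blend that trusts the first disparity where confidence is high and the network output elsewhere. This blending rule and the function name are assumptions, and the abstract does not specify the actual combination.

```python
def combine_disparity(first, second, confidence):
    """Per-pixel blend of two disparity maps, weighted by confidence in the first.

    All three arguments are 2D lists of floats with the same shape;
    confidence values are assumed to lie in [0, 1].
    """
    refined = []
    for d1_row, d2_row, c_row in zip(first, second, confidence):
        refined.append(
            [c * d1 + (1.0 - c) * d2 for d1, d2, c in zip(d1_row, d2_row, c_row)]
        )
    return refined


if __name__ == "__main__":
    first = [[10.0, 20.0]]       # initial disparity estimate
    second = [[12.0, 16.0]]      # hypothetical network-refined estimate
    confidence = [[1.0, 0.5]]    # confidence in the initial estimate
    print(combine_disparity(first, second, confidence))
```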

14.

Invention publication
PHYSICALLY-BASED EMITTER ESTIMATION FOR INDOOR SCENES (Under examination, published)

Publication (announcement) number: US20240303913A1

Publication (announcement) date: 2024-09-12

Application number: US18180797

Application date: 2023-03-08

Applicant: QUALCOMM Incorporated

Inventor: Yinhao ZHU, Rui ZHU, Hong CAI, Fatih Murat PORIKLI

IPC: G06T15/50, G06T7/593

CPC classification number: G06T15/506, G06T7/593

Abstract: Systems and techniques are provided for physical-based light estimation for inverse rendering of indoor scenes. For example, a computing device can obtain an estimated scene geometry based on a multi-view observation of a scene. The computing device can further obtain a light emission mask based on the multi-view observation of the scene. The computing device can also obtain an emitted radiance field based on the multi-view observation of the scene. The computing device can then determine, based on the light emission mask and the emitted radiance field, a geometry of at least one light source of the estimated scene geometry.

15.

Invention publication
REALISTIC DISTRACTION AND PSEUDO-LABELING REGULARIZATION FOR OPTICAL FLOW ESTIMATION (Under examination, published)

Publication (announcement) number: US20240161312A1

Publication (announcement) date: 2024-05-16

Application number: US18477493

Application date: 2023-09-28

Applicant: QUALCOMM Incorporated

Inventor: Jisoo JEONG, Risheek GARREPALLI, Hong CAI, Fatih Murat PORIKLI

IPC: G06T7/246

CPC classification number: G06T7/248, G06T2207/10016, G06T2207/20081, G06T2207/20084

Abstract: A computer-implemented method includes generating a first augmented frame by combining a first image and a first frame of a first frame pair. The computer-implemented method also includes generating, via an optical flow estimation model, a first flow estimation based on a second frame of the first frame pair and the first augmented frame. The computer-implemented method further includes updating one or both of parameters or weights of the optical flow estimation model based on a first loss between the first flow estimation and a training target.

16.

Invention publication
FEATURE CONDITIONED OUTPUT TRANSFORMER FOR GENERALIZABLE SEMANTIC SEGMENTATION (Under examination, published)

Publication (announcement) number: US20240020844A1

Publication (announcement) date: 2024-01-18

Application number: US18349726

Application date: 2023-07-10

Applicant: QUALCOMM Incorporated

Inventor: Debasmit DAS, Shubhankar Mangesh BORSE, Hyojin PARK, Kambiz AZARIAN YAZDI, Hong CAI, Risheek GARREPALLI, Fatih Murat PORIKLI

IPC: G06T7/11

CPC classification number: G06T7/11, G06T2207/20081, G06T2207/20004

Abstract: Systems and techniques are provided for processing data (e.g., image data). For instance, according to some aspects of the disclosure, a method may include receiving, at a transformer of a machine learning system, learnable queries, keys, and values obtained from a feature map of a segmentation model of the machine learning system. The method may further include learning, via the transformer, a mapping between an unsupervised output and a supervised output of the segmentation model based on the feature map.

17.

Invention application
RE-ARRANGING FEED FORWARD NETWORKS (FFNs) IN TRANSFORMER-BASED MODELS (In force)

Publication (announcement) number: US20250094793A1

Publication (announcement) date: 2025-03-20

Application number: US18469909

Application date: 2023-09-19

Applicant: QUALCOMM Incorporated

Inventor: Manish Kumar SINGH, Tianyu JIANG, Hsin-Pai CHENG, Kartikeya BHARDWAJ, Hong CAI, Mingu LEE, Munawar HAYAT, Christopher LOTT, Fatih Murat PORIKLI

IPC: G06N3/0499

Abstract: A processor-implemented method for image or text processing includes receiving, by an artificial neural network (ANN) model, a set of tokens corresponding to an input. A token interaction block of the ANN model processes the set of tokens according to each channel of the input to generate a spatial mixture of a set of features for each channel of the input. A feed forward network block of the ANN model generates a mixture of channel features based on the spatial mixture of the set of features for each channel of the input. An attention block of the ANN model determines a set of attended features of the mixture of channel features according to a set of attention weights. In turn, the ANN model generates an inference based on the set of attended features of the mixture of channel features.

18.

Invention application
PLANAR MESH RECONSTRUCTION USING IMAGES FROM MULTIPLE CAMERA POSES (In force)

Publication (announcement) number: US20240386650A1

Publication (announcement) date: 2024-11-21

Application number: US18509113

Application date: 2023-11-14

Applicant: QUALCOMM Incorporated

Inventor: Farhad GHAZVINIAN ZANJANI, Leyla MIRVAKHABOVA, Yinhao ZHU, Hong CAI, Fatih Murat PORIKLI

IPC: G06T15/06, G06T7/10, G06T7/50, G06T15/10, G06T17/20

Abstract: Systems and techniques are provided for processing image data corresponding to a scene. A process can include generating a planar distance map including a planar distance value for each pixel of at least one image corresponding to the scene. Planar segmentation is performed based on the planar distance map, a normal map corresponding to the at least one image, and positional encoding information of the planar distance map. A triangular mesh fragment is initialized based on sampling points from each planar segment of a plurality of planar segments from the planar segmentation. Ray-triangle intersections are determined based on performing ray casting for a reconstructed planar mesh including a plurality of triangular mesh fragments each corresponding to a different image. A planar reconstruction and segmentation machine learning network is optimized for the scene, based on training the planar reconstruction and segmentation machine learning network using one or more loss functions.

19.

Invention publication
CROSS-VIEW ATTENTION FOR VISUAL PERCEPTION TASKS USING MULTIPLE CAMERA INPUTS (Under examination, published)

Publication (announcement) number: US20240171727A1

Publication (announcement) date: 2024-05-23

Application number: US18470326

Application date: 2023-09-19

Applicant: QUALCOMM Incorporated

Inventor: Yunxiao SHI, Hong CAI, Fatih Murat PORIKLI, Amin ANSARI, Sai Madhuraj JADHAV

IPC: H04N13/363, G06T7/50, G06V10/44, G06V10/771, H04N13/351

CPC classification number: H04N13/363, G06T7/50, G06V10/44, G06V10/771, H04N13/351, G06V2201/07

Abstract: Systems and techniques are provided for processing image data. For example, a process can include obtaining a plurality of input images associated with a plurality of different spatial views. The process can include generating a set of features based on the plurality of input images. The process can include generating a set of projected features based on the set of features, wherein an embedding size associated with the set of projected features is smaller than an embedding size associated with the set of features. The process can include determining a cross-view attention associated with the plurality of different spatial views, the cross-view attention determined using the set of projected features.

20.

Invention publication
DEPTH MAP COMPLETION IN VISUAL CONTENT USING SEMANTIC AND THREE-DIMENSIONAL INFORMATION (Under examination, published)

Publication (announcement) number: US20230252658A1

Publication (announcement) date: 2023-08-10

Application number: US17650027

Application date: 2022-02-04

Applicant: QUALCOMM Incorporated

Inventor: Hong CAI, Shichong PENG, Janarbek MATAI, Jamie Menjay LIN, Debasmit DAS, Fatih Murat PORIKLI

IPC: G06T7/50, G06T7/10, G06N3/04

CPC classification number: G06T7/50, G06T7/10, G06N3/0454, G06T2207/20084, G06T2207/20212

Abstract: Certain aspects of the present disclosure provide techniques for generating fine depth maps for images of a scene based on semantic segmentation and segment-based refinement neural networks. An example method generally includes generating, through a segmentation neural network, a segmentation map based on an image of a scene. The segmentation map generally comprises a map segmenting the scene into a plurality of regions, and each region of the plurality of regions is generally associated with one of a plurality of categories. A first depth map of the scene is generated through a first depth neural network based on a depth measurement of the scene. A second depth map of the scene is generated through a depth refinement neural network based on the segmentation map and the first depth map. One or more actions are taken based on the second depth map of the scene.
