Patent search ap:("QUALCOMM Incorporated") AND inv:"Yinhao ZHU" Page 1

1.

Invention application
THREE-DIMENSIONAL (3D) OBJECT DETECTION BASED ON MULTIPLE TWO-DIMENSIONAL (2D) VIEWS CORRESPONDING TO DIFFERENT VIEWPOINTS (Status: In force)

Publication (announcement) number: US20250166395A1

Publication (announcement) date: 2025-05-22

Application number: US18585444

Application date: 2024-02-23

Applicant: QUALCOMM Incorporated

Inventor: Shizhong Steve HAN, Hong CAI, Haiyan WANG, Yinhao ZHU, Yunxiao SHI, Fatih Murat PORIKLI, Sourab BAPU SRIDHAR, Senthil Kumar YOGAMANI

IPC: G06V20/64, G06T7/30, G06T15/00, G06V10/75, G06V10/80

Abstract: Certain aspects of the present disclosure provide techniques for performing 3D object detection. Such techniques may include obtaining a first set of features based on a first 2D view; obtaining a second set of features based on a second 2D view; obtaining a third set of features based on a third 2D view; and obtaining a fourth set of features based on a fourth 2D view, wherein the first 2D view and the second 2D view are based on input from a first input sensor and the third 2D view and the fourth 2D view are based on input from a second input sensor. The techniques may also include performing cross-attention between the first set of features and the second set of features and between the third set of features and the fourth set of features; and performing 3D object detection.
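As a rough illustration of the cross-attention step named in the abstract, the sketch below lets the features of one 2D view attend to the features of another. It is a minimal sketch under stated assumptions, not the claimed method: the function names (`softmax`, `cross_attention`), the toy feature values, and the use of identity projections in place of learned query/key/value weights are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention with identity projections.

    queries:      feature vectors from one view (assumed layout: list of lists)
    keys_values:  feature vectors from the other view, used as keys and values
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for q in queries:
        # Similarity of this query to every feature of the other view.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys_values]
        weights = softmax(scores)
        # Weighted average of the other view's features.
        fused.append(
            [sum(w * v[i] for w, v in zip(weights, keys_values)) for i in range(dim)]
        )
    return fused


# Toy features standing in for the first and second 2D views (assumed values).
view1 = [[1.0, 0.0], [0.0, 1.0]]
view2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(view1, view2)
```

Each output row is a convex combination of the second view's features, so the result has one fused vector per feature of the first view. A real detector would apply learned projections before and after this step.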

2.

Invention application
DEPTH ESTIMATION BASED ON FEATURE RECONSTRUCTION WITH ADAPTIVE MASKING AND MOTION PREDICTION (Status: In force)

Publication (announcement) number: US20250148633A1

Publication (announcement) date: 2025-05-08

Application number: US18666502

Application date: 2024-05-16

Applicant: QUALCOMM Incorporated

Inventor: Rajeev YASARLA, Hong CAI, Risheek GARREPALLI, Yinhao ZHU, Jisoo JEONG, Yunxiao SHI, Manish Kumar SINGH, Fatih Murat PORIKLI

IPC: G06T7/593, G06T7/20

Abstract: Systems and techniques are provided for generating depth information. For example, a process can include obtaining a first feature volume including visual features corresponding to each respective frame included in a first set of frames. A first query generator network can generate reconstruction features associated with a reconstructed feature volume corresponding to the first feature volume. Based on the first feature volume, a second query generator network can generate motion features associated with predicted future motion corresponding to the first feature volume. An initial depth prediction can be generated for each respective frame based on cross-attention between features of a depth prediction decoder, the reconstruction features, and the motion features. A refined depth prediction can be generated for each respective frame based on cross-attention between the initial depth prediction, the reconstruction features, and the motion features.

3.

Invention publication
THREE-DIMENSIONAL OBJECT PART SEGMENTATION USING A MACHINE LEARNING MODEL (Status: Under examination, published)

Publication (announcement) number: US20240144589A1

Publication (announcement) date: 2024-05-02

Application number: US18177028

Application date: 2023-03-01

Applicant: QUALCOMM Incorporated

Inventor: Minghua LIU, Yinhao ZHU, Hong CAI, Fatih Murat PORIKLI, Hao SU

IPC: G06T17/00, G06T7/12, G06V10/25, G06V20/70

CPC classification number: G06T17/00, G06T7/12, G06V10/25, G06V20/70, G06T2207/10028, G06V2201/07

Abstract: Systems and techniques are provided for part segmentation. For example, a process for performing part segmentation can include obtaining a three-dimensional capture of an object. The method can include generating one or more two-dimensional images of the object from the three-dimensional capture of the object. The method can further include processing the one or more two-dimensional images of the object to generate at least one two-dimensional bounding box associated with a part of the object. The method can include performing three-dimensional part segmentation of the part of the object based on a three-dimensional point cloud generated from the one or more two-dimensional images of the object and the at least one two-dimensional bounding box and based on semantically labeled super points which are merged into subgroups associated with the part of the object.

4.

Invention application
PLANAR MESH RECONSTRUCTION USING IMAGES FROM MULTIPLE CAMERA POSES (Status: In force)

Publication (announcement) number: US20240386650A1

Publication (announcement) date: 2024-11-21

Application number: US18509113

Application date: 2023-11-14

Applicant: QUALCOMM Incorporated

Inventor: Farhad GHAZVINIAN ZANJANI, Leyla MIRVAKHABOVA, Yinhao ZHU, Hong CAI, Fatih Murat PORIKLI

IPC: G06T15/06, G06T7/10, G06T7/50, G06T15/10, G06T17/20

Abstract: Systems and techniques are provided for processing image data corresponding to a scene. A process can include generating a planar distance map including a planar distance value for each pixel of at least one image corresponding to the scene. Planar segmentation is performed based on the planar distance map, a normal map corresponding to the at least one image, and positional encoding information of the planar distance map. A triangular mesh fragment is initialized based on sampling points from each planar segment of a plurality of planar segments from the planar segmentation. Ray-triangle intersections are determined based on performing ray casting for a reconstructed planar mesh including a plurality of triangular mesh fragments each corresponding to a different image. A planar reconstruction and segmentation machine learning network is optimized for the scene, based on training the planar reconstruction and segmentation machine learning network using one or more loss functions.
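The abstract mentions determining ray-triangle intersections during ray casting but does not say which intersection test is used. One standard choice is the Möller-Trumbore algorithm, sketched below as an assumption; the helper names (`sub`, `dot`, `cross`, `ray_triangle`) and the example triangle are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test.

    Returns the ray parameter t of the hit point (origin + t * direction),
    or None if the ray misses the triangle or is parallel to its plane.
    """
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:  # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(t_vec, e1)
    v = dot(direction, q) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    return t if t > eps else None  # only hits in front of the origin count


# A triangle in the plane z = 1 and a ray shot along +z (assumed example data).
tri = ((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0))
hit = ray_triangle((0.25, 0.25, 0.0), (0.0, 0.0, 1.0), *tri)
miss = ray_triangle((2.0, 2.0, 0.0), (0.0, 0.0, 1.0), *tri)
```

In a mesh-reconstruction setting this test would be run for each camera ray against each candidate triangle, usually behind an acceleration structure, which is omitted here.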

5.

Invention application
TRANSFORMER-BASED ARCHITECTURE FOR TRANSFORM CODING OF MEDIA (Status: In force)

Publication (announcement) number: US20230100413A1

Publication (announcement) date: 2023-03-30

Application number: US17486732

Application date: 2021-09-27

Applicant: QUALCOMM Incorporated

Inventor: Yinhao ZHU, Yang YANG, Taco Sebastiaan COHEN

IPC: H04N19/60

Abstract: Systems and techniques are described herein for processing media data using a neural network system. For instance, a process can include obtaining a latent representation of a frame of encoded image data and generating, by a plurality of decoder transformer layers of a decoder sub-network using the latent representation of the frame of encoded image data as input, a frame of decoded image data. At least one decoder transformer layer of the plurality of decoder transformer layers includes: one or more transformer blocks for generating one or more patches of features and determining self-attention locally within one or more window partitions and shifted window partitions applied over the one or more patches; and a patch un-merging engine for decreasing a respective size of each patch of the one or more patches.
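The "window partitions and shifted window partitions" in the abstract can be pictured on a small grid of patch indices. The sketch below only shows how patches might be grouped before local self-attention; the attention itself is omitted. The function names (`window_partition`, `cyclic_shift`), the 4x4 grid, and the use of a cyclic shift to realize the shifted partition are assumptions, not details taken from the patent.

```python
def window_partition(grid, window):
    """Split a 2D grid of patches into non-overlapping window x window blocks.

    Assumes the grid height and width are multiples of `window`.
    """
    height, width = len(grid), len(grid[0])
    windows = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            windows.append(
                [grid[r][left:left + window] for r in range(top, top + window)]
            )
    return windows


def cyclic_shift(grid, shift):
    """Roll the grid up and left by `shift` patches, wrapping around the edges."""
    height, width = len(grid), len(grid[0])
    return [
        [grid[(r + shift) % height][(c + shift) % width] for c in range(width)]
        for r in range(height)
    ]


# A 4x4 grid whose entries are patch indices 0..15 (assumed toy layout).
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

regular = window_partition(grid, window=2)
shifted = window_partition(cyclic_shift(grid, shift=1), window=2)
```

With the regular partition, patch 5 shares a window with patches 0, 1 and 4; after the shift it shares one with patches 6, 9 and 10. Alternating the two partitions lets information cross window boundaries while attention stays local.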

6.

Invention application
VARIABLE BIT RATE COMPRESSION USING NEURAL NETWORK MODELS (Status: In force)

Publication (announcement) number: US20220224926A1

Publication (announcement) date: 2022-07-14

Application number: US17573568

Application date: 2022-01-11

Applicant: QUALCOMM Incorporated

Inventor: Yadong LU, Yang YANG, Yinhao ZHU, Amir SAID, Reza POURREZA, Taco Sebastiaan COHEN

IPC: H04N19/42, H04N19/30, H04N19/13, H04N19/136, H04N19/124

Abstract: A computer-implemented method for operating an artificial neural network (ANN) includes receiving an input by the ANN. The ANN generates a latent representation of the input. The latent representation is communicated according to a bit rate based on a learned latent scaling parameter. The latent scaling parameter is learned based on a channel index and a tradeoff parameter value that corresponds to a value that balances the bit rate and a distortion.
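To make the role of a latent scaling parameter concrete, the sketch below scales each latent channel before rounding and undoes the scaling afterwards. It is a toy under stated assumptions: in the abstract the scaling parameter is learned from a channel index and a tradeoff parameter value, whereas here the table `scale_table`, its keys, and its values are hard-coded and hypothetical, as are the function names.

```python
def quantize_latent(latent, scales):
    """Scale each channel, then round to integer symbols for entropy coding."""
    return [[round(v * s) for v in channel] for channel, s in zip(latent, scales)]


def dequantize_latent(symbols, scales):
    """Undo the per-channel scaling on the decoder side."""
    return [[q / s for q in channel] for channel, s in zip(symbols, scales)]


def abs_error(latent, reconstructed):
    """Total absolute reconstruction error over all channels."""
    return sum(
        abs(a - b)
        for ch_a, ch_b in zip(latent, reconstructed)
        for a, b in zip(ch_a, ch_b)
    )


# Hypothetical lookup: scale_table[tradeoff setting][channel index].
# Larger scales mean finer quantization, hence more bits and less distortion.
scale_table = {
    "low_rate": [0.5, 0.25],
    "high_rate": [4.0, 2.0],
}

latent = [[0.3, -1.2, 2.6], [0.9, 0.1, -0.4]]  # two channels, assumed values

coarse = quantize_latent(latent, scale_table["low_rate"])
fine = quantize_latent(latent, scale_table["high_rate"])

coarse_err = abs_error(latent, dequantize_latent(coarse, scale_table["low_rate"]))
fine_err = abs_error(latent, dequantize_latent(fine, scale_table["high_rate"]))
```

The low-rate setting collapses most values to zero, which is cheap to code but lossy; the high-rate setting keeps more distinct symbols and reconstructs the latent more closely. A single model can then serve several bit rates by switching the scale set.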

7.

Invention application
THREE-DIMENSIONAL (3D) OBJECT DETECTION BASED ON MULTIPLE TWO-DIMENSIONAL (2D) VIEWS (Status: In force)

Publication (announcement) number: US20250166391A1

Publication (announcement) date: 2025-05-22

Application number: US18585480

Application date: 2024-02-23

Applicant: QUALCOMM Incorporated

Inventor: Shizhong Steve HAN, Hong CAI, Haiyan WANG, Yinhao ZHU, Yunxiao SHI, Fatih Murat PORIKLI, Sourab BAPU SRIDHAR, Senthil Kumar YOGAMANI

IPC: G06V20/58, G06V10/10, G06V10/84

Abstract: Certain aspects of the present disclosure provide techniques for performing 3D object detection. Such techniques may include obtaining one or more inputs associated with one or more two-dimensional (2D) views of a scene; selecting a set of 2D views of the scene from a plurality of 2D views of the scene based on the one or more inputs, the set of 2D views comprising a first 2D view of the scene and a second 2D view of the scene; and performing three-dimensional (3D) object detection in the scene based on the set of 2D views.

8.

Invention application
DEPTH COMPLETION USING ATTENTION-BASED REFINEMENT OF FEATURES (Status: In force)

Publication (announcement) number: US20250148628A1

Publication (announcement) date: 2025-05-08

Application number: US18633302

Application date: 2024-04-11

Applicant: QUALCOMM Incorporated

Inventor: Yunxiao SHI, Hong CAI, Manish Kumar SINGH, Shizhong Steve HAN, Yinhao ZHU, Fatih Murat PORIKLI

IPC: G06T7/50, G06T3/40, G06V10/44

Abstract: Systems and techniques are provided for generating depth information from one or more images. For example, a process can include obtaining a first depth map corresponding to an input comprising an image of the one or more images and a sparse depth measurement. A three-dimensional (3D) point cloud can be generated based on the first depth map and multi-scale visual features of the input, wherein the 3D point cloud includes a plurality of 3D point features uplifted from the multi-scale visual features. At least a portion of the plurality of 3D point features can be processed using one or more self-attention layers to generate refined 3D point features. A two-dimensional (2D) projection of the refined 3D point features can be generated and a second depth map can be generated based on the 2D projection of the refined 3D point features.
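The geometric part of this pipeline, lifting a depth map to 3D points and projecting points back to the image plane, can be sketched with a pinhole camera model. This is only the geometry: the abstract also attaches multi-scale visual features to the points and refines them with self-attention, which is omitted. The pinhole model, the intrinsics (`fx`, `fy`, `cx`, `cy`), the function names, and the toy depth values are all assumptions.

```python
def backproject(depth, fx, fy, cx, cy):
    """Lift every pixel with a valid (positive) depth to a 3D camera-space point."""
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d > 0:  # zero marks a missing measurement in this toy map
                points.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
    return points


def project(points, fx, fy, cx, cy):
    """Project 3D points to pixel coordinates, keeping depth as a third value."""
    return [(fx * x / z + cx, fy * y / z + cy, z) for x, y, z in points]


# A 2x2 depth map with one missing pixel and assumed camera intrinsics.
depth = [[1.0, 2.0],
         [0.0, 4.0]]
intrinsics = dict(fx=2.0, fy=2.0, cx=0.5, cy=0.5)

cloud = backproject(depth, **intrinsics)
pixels = project(cloud, **intrinsics)
```

Projecting the unmodified cloud returns each valid pixel to its original location with its original depth, which is a quick sanity check on the two functions. In a refinement pipeline the point features would change between the two calls.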

9.

Invention application
TEST-TIME SELF-SUPERVISED GUIDANCE FOR DIFFUSION MODELS (Status: In force)

Publication (announcement) number: US20240412493A1

Publication (announcement) date: 2024-12-12

Application number: US18537404

Application date: 2023-12-12

Applicant: QUALCOMM Incorporated

Inventor: Risheek GARREPALLI, Yunxiao SHI, Hong CAI, Yinhao ZHU, Shubhankar Mangesh BORSE, Jisoo JEONG, Debasmit DAS, Manish Kumar SINGH, Rajeev YASARLA, Shizhong Steve HAN, Fatih Murat PORIKLI

IPC: G06V10/776, G06T7/50, G06V10/764, G06V10/82, G06V20/70

Abstract: Systems and techniques are provided for processing image data. According to some aspects, a computing device can generate a gradient (e.g., a classifier gradient using a trained classifier) associated with a current sample. The computing device can combine the gradient with an iterative model estimated score function or data associated with the current sample to generate a score function estimate. The computing device can predict, using the diffusion machine learning model and based on the score function estimate, a new sample.
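The idea of combining a guidance gradient with a model-estimated score can be shown on a one-dimensional toy problem. In the sketch below the "model" score is that of a standard normal distribution and the guidance term is the gradient of a quadratic log-likelihood; both are stand-ins chosen for illustration. The function names, the `guidance_scale` and `step_size` parameters, and the noise-free update rule are assumptions, not the patent's procedure.

```python
def model_score(x):
    """Score of a standard normal prior: d/dx log N(x; 0, 1) = -x."""
    return -x


def guidance_gradient(x, target):
    """Gradient of the toy log-likelihood -(x - target)^2 / 2 with respect to x."""
    return -(x - target)


def guided_step(x, target, guidance_scale=1.0, step_size=0.1):
    """One deterministic update along the combined score estimate."""
    score_estimate = model_score(x) + guidance_scale * guidance_gradient(x, target)
    return x + step_size * score_estimate


# Start at the prior mean and take repeated guided steps (noise omitted).
x = 0.0
for _ in range(200):
    x = guided_step(x, target=2.0)
```

With equal weight on both terms, the sample settles midway between the prior mean (0) and the guidance target (2). Raising `guidance_scale` pulls the result closer to the target, which is the knob that guidance methods typically expose.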

10.

Invention publication
PHYSICALLY-BASED EMITTER ESTIMATION FOR INDOOR SCENES (Status: Under examination, published)

Publication (announcement) number: US20240303913A1

Publication (announcement) date: 2024-09-12

Application number: US18180797

Application date: 2023-03-08

Applicant: QUALCOMM Incorporated

Inventor: Yinhao ZHU, Rui ZHU, Hong CAI, Fatih Murat PORIKLI

IPC: G06T15/50, G06T7/593

CPC classification number: G06T15/506, G06T7/593

Abstract: Systems and techniques are provided for physically based light estimation for inverse rendering of indoor scenes. For example, a computing device can obtain an estimated scene geometry based on a multi-view observation of a scene. The computing device can further obtain a light emission mask based on the multi-view observation of the scene. The computing device can also obtain an emitted radiance field based on the multi-view observation of the scene. The computing device can then determine, based on the light emission mask and the emitted radiance field, a geometry of at least one light source of the estimated scene geometry.
