Patent search ap:("QUALCOMM Incorporated") AND inv:"Yicheng LIN"

1.

Invention application
SYSTEMS AND METHODS FOR STATIC CACHED DECODING (Granted)

Publication number: US20250094775A1

Publication date: 2025-03-20

Application number: US18468574

Filing date: 2023-09-15

Applicant: QUALCOMM Incorporated

Inventor: Shaojie ZHUO, Ramchalam KINATTINKARA RAMAKRISHNAN, Xiaopeng ZHANG, Yicheng LIN, Chenzheng SU, Liang SHEN

IPC: G06N3/0455

Abstract: Cached decoding systems and techniques are described. A system (e.g., decoder) receives an input token (e.g., input vector). The system applies a projection tensor (e.g., a projection matrix) to the input token to generate a feature tensor (e.g., a key tensor or a value tensor). The system processes at least the feature tensor and at least one previous feature tensor using at least one attention calculation to generate an output token. The at least one previous feature tensor is retrieved from a buffer. The at least one previous feature tensor can be stored in the buffer after having been previously calculated based on application of the projection tensor to a previous input token (e.g., from a previous iteration before the iteration in which the input token is received).
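The decoding loop described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `CachedDecoderStep`, the query projection `w_q`, the scaled dot-product form of the attention calculation, and the unbounded Python lists used as the buffer are all assumptions made for illustration. Only the key/value projections, the buffer of previous feature tensors, and the attention step come from the abstract.

```python
import math


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class CachedDecoderStep:
    """Hypothetical single-head decoder step with a key/value buffer."""

    def __init__(self, w_q, w_k, w_v):
        # Projection matrices; w_q is an assumption, the abstract only
        # names key and value feature tensors.
        self.w_q, self.w_k, self.w_v = w_q, w_k, w_v
        self.key_buffer = []    # keys computed in previous iterations
        self.value_buffer = []  # values computed in previous iterations

    def step(self, token):
        # Apply the projections to the current input token only.
        query = matvec(self.w_q, token)
        key = matvec(self.w_k, token)
        value = matvec(self.w_v, token)

        # Retrieve previous feature tensors from the buffer instead of
        # recomputing them from earlier tokens.
        keys = self.key_buffer + [key]
        values = self.value_buffer + [value]

        # Attention over the cached and current features.
        scale = math.sqrt(len(query))
        scores = [sum(q * k for q, k in zip(query, kk)) / scale for kk in keys]
        weights = softmax(scores)
        output = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(value))
        ]

        # Store the new features for use in later iterations.
        self.key_buffer.append(key)
        self.value_buffer.append(value)
        return output
```

Each call to `step` projects only the newest token, so the per-iteration projection cost stays constant while the attention calculation still sees every earlier key and value.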

2.

Invention publication
SINGLE SEARCH FOR ARCHITECTURES ON EMBEDDED DEVICES (Under examination - published)

Publication number: US20240152726A1

Publication date: 2024-05-09

Application number: US18363487

Filing date: 2023-08-01

Applicant: QUALCOMM Incorporated

Inventor: Chen FENG, Xiaopeng ZHANG, Shaojie ZHUO, Ramchalam KINATTINKARA RAMAKRISHNAN, Chenzheng SU, Liang SHEN, Zi Wen HAN, Yicheng LIN

IPC: G06N3/04 , G06N3/084

CPC classification number: G06N3/04 , G06N3/084

Abstract: A processor-implemented method for a neural architecture search (NAS) starts by generating an over-parameterized super network having multiple layers. The super network has multiple operator types. Each of the layers includes a largest super kernel corresponding to a search space. The method also includes performing gradient descent to evolve a largest super kernel to a small kernel corresponding to the search space in order to generate a range of kernel encodings. The method further includes identifying a subset of kernel encodings from the range of kernel encodings, for each layer of the super network, based on the gradient descent. The method determines a set of candidate architectures based on the subset of kernel encodings, each of the candidate architectures having a different model size. The method selects a target model, from the set of architectures, based on meeting hardware specifications, and then applies the target model.
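The last two steps of this abstract (building candidate architectures from per-layer kernel encodings, then selecting a target model against a hardware specification) can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: it treats each kernel encoding as a square kernel size, measures model size as a convolution parameter count, models the hardware specification as a single parameter budget `max_params`, and picks the largest candidate that fits. The gradient-descent search that produces the encodings is not shown, and all function names are hypothetical.

```python
from itertools import product


def param_count(kernel_sizes, channels):
    """Parameter count of a stack of k x k convolutions (bias ignored).

    channels is a list of (c_in, c_out) pairs, one per layer.
    """
    return sum(
        k * k * c_in * c_out
        for k, (c_in, c_out) in zip(kernel_sizes, channels)
    )


def candidate_architectures(per_layer_encodings, channels):
    """Combine per-layer kernel choices into candidates keyed by model size.

    Keeping one architecture per distinct size mirrors the abstract's
    statement that each candidate has a different model size.
    """
    by_size = {}
    for choice in product(*per_layer_encodings):
        arch = list(choice)
        by_size.setdefault(param_count(arch, channels), arch)
    return by_size


def select_target(per_layer_encodings, channels, max_params):
    """Return the largest candidate within the budget, or None if none fits."""
    candidates = candidate_architectures(per_layer_encodings, channels)
    feasible = [size for size in candidates if size <= max_params]
    if not feasible:
        return None
    return candidates[max(feasible)]


# Two layers, each with kernel sizes 3 and 5 surviving the search (assumed).
encodings = [[3, 5], [3, 5]]
layer_channels = [(1, 2), (2, 2)]
target = select_target(encodings, layer_channels, max_params=100)
```

With these example values the four combinations have 54, 86, 118, and 150 parameters, so a budget of 100 selects the 86-parameter candidate with kernel sizes `[5, 3]`.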
