Patent search: ap:("Advanced Micro Devices, Inc.") AND inv:"Xianwei Zhang"

1.

Invention application
Selecting a Precision Level for Executing a Workload in an Electronic Device (Under examination, published)

Publication (announcement) number: US20190310864A1

Publication (announcement) date: 2019-10-10

Application number: US15948795

Application date: 2018-04-09

Applicant: Advanced Micro Devices, Inc.

Inventor: Anthony T. Gutierrez, Sergey Blagodurov, Scott A. Moe, Xianwei Zhang, Jieming Yin, Matthew D. Sinclair

IPC: G06F9/445, G06N3/10, G06N3/08, G06N3/04

Abstract: An electronic device includes a controller functional block and a computational functional block. During operation, while the computational functional block executes a test portion of a workload at at least one precision level, the controller functional block monitors a behavior of the computational functional block. Based on the behavior of the computational functional block while executing the test portion of the workload at the at least one precision level, the controller functional block selects a given precision level from among a set of two or more precision levels at which the computational functional block is to execute a remaining portion of the workload. The controller functional block then configures the computational block to execute the remaining portion of the workload at the given precision level.
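The selection step in this abstract can be illustrated with a minimal sketch. It assumes the monitored "behavior" is a numeric error measured on the test portion and that the controller prefers the lowest precision whose error is acceptable; the abstract specifies neither. The names `select_precision`, `quantization_error`, and `max_error` are hypothetical and do not come from the patent.

```python
from typing import Callable, Sequence


def select_precision(
    run_test: Callable[[int], float],
    precision_bits: Sequence[int],
    max_error: float,
) -> int:
    """Pick the lowest precision whose test-portion error is acceptable.

    Assumption: the monitored behavior is a single error number.
    """
    for bits in sorted(precision_bits):
        if run_test(bits) <= max_error:
            return bits
    # No candidate met the target: fall back to the highest precision.
    return max(precision_bits)


def quantization_error(bits: int) -> float:
    """Hypothetical test portion: worst rounding error on a fixed-point
    grid with `bits` fractional bits."""
    samples = [0.1, 0.25, 0.7, 0.9]
    scale = 2 ** bits
    return max(abs(x - round(x * scale) / scale) for x in samples)


# The "remaining portion" of the workload would then run at this precision.
chosen = select_precision(quantization_error, [4, 8, 16], max_error=1e-3)
print(chosen)  # 16
```

Loosening `max_error` to `2e-3` selects 8 bits in this toy setup, which shows how the test portion drives the choice.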

2.

Granted invention patent
Memory request priority assignment techniques for parallel processors (In force)

Publication (announcement) number: US11507522B2

Publication (announcement) date: 2022-11-22

Application number: US16706421

Application date: 2019-12-06

Applicant: Advanced Micro Devices, Inc.

Inventor: Sooraj Puthoor, Kishore Punniyamurthy, Onur Kayiran, Xianwei Zhang, Yasuko Eckert, Johnathan Alsop, Bradford Michael Beckmann

IPC: G06F13/18, G06F13/16

Abstract: Systems, apparatuses, and methods for implementing memory request priority assignment techniques for parallel processors are disclosed. A system includes at least a parallel processor coupled to a memory subsystem, where the parallel processor includes at least a plurality of compute units for executing wavefronts in lock-step. The parallel processor assigns priorities to memory requests of wavefronts on a per-work-item basis by indexing into a first priority vector, with the index generated based on lane-specific information. If a given event is detected, a second priority vector is generated by applying a given priority promotion vector to the first priority vector. Then, for subsequent wavefronts, memory requests are assigned priorities by indexing into the second priority vector with lane-specific information. The use of priority vectors to assign priorities to memory requests helps to reduce the memory divergence problem experienced by different work-items of a wavefront.
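A minimal sketch of the priority-vector idea follows. It assumes the lane-specific index is the lane ID modulo the vector length, and that "applying" the promotion vector means an element-wise saturating add; the abstract does not say how either is done. The function names and `max_priority` are hypothetical.

```python
from typing import List, Sequence


def assign_priorities(priority_vector: Sequence[int],
                      lane_ids: Sequence[int]) -> List[int]:
    """Give each work-item's memory request a priority by indexing the
    vector with lane-specific information (assumed: lane ID modulo length)."""
    n = len(priority_vector)
    return [priority_vector[lane % n] for lane in lane_ids]


def promote(priority_vector: Sequence[int],
            promotion_vector: Sequence[int],
            max_priority: int = 3) -> List[int]:
    """Build the second priority vector from the first (assumed:
    element-wise add, saturating at max_priority)."""
    return [min(p + d, max_priority)
            for p, d in zip(priority_vector, promotion_vector)]


first = [3, 2, 1, 0]
lanes = list(range(8))
before = assign_priorities(first, lanes)

# After the triggering event, later wavefronts index the promoted vector.
second = promote(first, [0, 1, 1, 2])
after = assign_priorities(second, lanes)

print(before)  # [3, 2, 1, 0, 3, 2, 1, 0]
print(after)   # [3, 3, 2, 2, 3, 3, 2, 2]
```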

3.

Granted invention patent
Data compression system using base values and methods thereof (In force)

Publication (announcement) number: US11740791B2

Publication (announcement) date: 2023-08-29

Application number: US17497286

Application date: 2021-10-08

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Seyed Mohammad Seyedzadehdelcheh, Xianwei Zhang, Bradford Beckmann, Shomit N. Das

IPC: G06F3/06, G06F12/0875, G06T1/20

CPC classification number: G06F3/0608, G06F3/064, G06F3/0659, G06F3/0673, G06F12/0875, G06F2212/1044, G06T1/20

Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
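The base-plus-delta scheme can be sketched as below. This is a generic illustration, not the patented design: it assumes the base value is already available (standing in for the base value cache), that all deltas share one byte width, and that only the deltas are stored when the compressed size does not exceed the threshold. The function names and `threshold_bytes` are hypothetical.

```python
from typing import List, Optional, Sequence


def delta_width_bytes(deltas: Sequence[int]) -> int:
    """Smallest signed width (1, 2 or 4 bytes) that holds every delta;
    8 is returned if none of those fit."""
    for width in (1, 2, 4):
        lo = -(1 << (8 * width - 1))
        hi = (1 << (8 * width - 1)) - 1
        if all(lo <= d <= hi for d in deltas):
            return width
    return 8


def compress_block(block: Sequence[int], base: int,
                   threshold_bytes: int) -> Optional[List[int]]:
    """Return the deltas if the compressed block fits the threshold,
    otherwise None (the block would be stored uncompressed)."""
    deltas = [value - base for value in block]
    size = len(deltas) * delta_width_bytes(deltas)
    return deltas if size <= threshold_bytes else None


def decompress_block(deltas: Sequence[int], base: int) -> List[int]:
    """Rebuild the block from the cached base value and stored deltas."""
    return [base + d for d in deltas]


block = [1000, 1003, 998, 1010]
deltas = compress_block(block, base=1000, threshold_bytes=8)
print(deltas)                          # [0, 3, -2, 10]
print(decompress_block(deltas, 1000))  # [1000, 1003, 998, 1010]
```

A block with widely spread values, such as `[1000, 70000, 5, 1000]`, needs 4-byte deltas and exceeds the 8-byte threshold here, so `compress_block` returns `None`.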

4.

Granted invention patent
Selecting a precision level for executing a workload in an electronic device (In force)

Publication (announcement) number: US11150899B2

Publication (announcement) date: 2021-10-19

Application number: US15948795

Application date: 2018-04-09

Applicant: Advanced Micro Devices, Inc.

Inventor: Anthony T. Gutierrez, Sergey Blagodurov, Scott A. Moe, Xianwei Zhang, Jieming Yin, Matthew D. Sinclair

IPC: G06F9/30, G06F8/71, G06N3/063, G06N3/08

Abstract: An electronic device includes a controller functional block and a computational functional block. During operation, while the computational functional block executes a test portion of a workload at at least one precision level, the controller functional block monitors a behavior of the computational functional block. Based on the behavior of the computational functional block while executing the test portion of the workload at the at least one precision level, the controller functional block selects a given precision level from among a set of two or more precision levels at which the computational functional block is to execute a remaining portion of the workload. The controller functional block then configures the computational block to execute the remaining portion of the workload at the given precision level.

5.

Invention application
MEMORY REQUEST PRIORITY ASSIGNMENT TECHNIQUES FOR PARALLEL PROCESSORS (In force)

Publication (announcement) number: US20210173796A1

Publication (announcement) date: 2021-06-10

Application number: US16706421

Application date: 2019-12-06

Applicant: Advanced Micro Devices, Inc.

Inventor: Sooraj Puthoor, Kishore Punniyamurthy, Onur Kayiran, Xianwei Zhang, Yasuko Eckert, Johnathan Alsop, Bradford Michael Beckmann

IPC: G06F13/18, G06F13/16

Abstract: Systems, apparatuses, and methods for implementing memory request priority assignment techniques for parallel processors are disclosed. A system includes at least a parallel processor coupled to a memory subsystem, where the parallel processor includes at least a plurality of compute units for executing wavefronts in lock-step. The parallel processor assigns priorities to memory requests of wavefronts on a per-work-item basis by indexing into a first priority vector, with the index generated based on lane-specific information. If a given event is detected, a second priority vector is generated by applying a given priority promotion vector to the first priority vector. Then, for subsequent wavefronts, memory requests are assigned priorities by indexing into the second priority vector with lane-specific information. The use of priority vectors to assign priorities to memory requests helps to reduce the memory divergence problem experienced by different work-items of a wavefront.

6.

Granted invention patent
GPU cache management based on locality type detection (In force)

Publication (announcement) number: US11487671B2

Publication (announcement) date: 2022-11-01

Application number: US16446119

Application date: 2019-06-19

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Xianwei Zhang, John Kalamatianos, Bradford Beckmann

IPC: G06F12/0891, G06F12/0888, G06F12/0895, G06F9/54, G06F9/38

Abstract: Wavefront loading in a processor is managed and includes monitoring a selected wavefront of a set of wavefronts. Reuse of memory access requests for the selected wavefront is counted. A cache hit rate in one or more caches of the processor is determined based on the counted reuse. Based on the cache hit rate, subsequent memory requests of other wavefronts of the set of wavefronts are modified by including a type of reuse of cache lines in requests to the caches. In the caches, storage of data in the caches is based on the type of reuse indicated by the subsequent memory access requests. Reused cache lines are protected by preventing cache line contents from being replaced by another cache line for a duration of processing the set of wavefronts. Caches are bypassed when streaming access requests are made.
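The monitoring and classification steps can be sketched roughly as follows. The sketch assumes reuse is counted per cache line for one sampled wavefront, that the estimated hit rate is the fraction of accesses to an already-seen line, and that a fixed threshold separates the "reuse" type from the "streaming" type; none of these details come from the abstract. `classify_locality`, `should_bypass_cache`, `line_bytes`, and `reuse_threshold` are hypothetical.

```python
from collections import Counter
from typing import Sequence


def classify_locality(addresses: Sequence[int],
                      line_bytes: int = 64,
                      reuse_threshold: float = 0.5) -> str:
    """Classify a sampled wavefront's access pattern as 'reuse' or
    'streaming' from its estimated cache hit rate."""
    lines = [addr // line_bytes for addr in addresses]
    if not lines:
        return "streaming"
    counts = Counter(lines)
    # Every access to a line after the first counts as one reuse.
    reused = sum(c - 1 for c in counts.values())
    hit_rate = reused / len(lines)
    return "reuse" if hit_rate >= reuse_threshold else "streaming"


def should_bypass_cache(locality_type: str) -> bool:
    """Streaming requests skip the cache; reused lines are kept."""
    return locality_type == "streaming"


sampled = [0, 8, 16, 64, 72, 0]        # touches only two 64-byte lines
print(classify_locality(sampled))      # reuse

streaming = [0, 64, 128, 192]          # every access is a new line
print(classify_locality(streaming))    # streaming
```

The type computed from the sampled wavefront would then be attached to the memory requests of the other wavefronts in the set, so the caches can decide whether to protect or bypass each line.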

7.

Granted invention patent
Data compression system using base values and methods thereof (In force)

Publication (announcement) number: US11144208B2

Publication (announcement) date: 2021-10-12

Application number: US16724609

Application date: 2019-12-23

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: SeyedMohammad Seyedzadehdelcheh, Xianwei Zhang, Bradford Beckmann, Shomit N. Das

IPC: G06K9/36, G06F3/06, G06F12/0875, G06T1/20

Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
