Patent search: ap:("Advanced Micro Devices Inc.") AND inv:"Michael Estlick"

21.

Granted invention patent
Masked multi-lane instruction memory fault handling using fast and slow execution paths
Legal status: In force

Publication (announcement) number: US11847463B2

Publication (announcement) date: 2023-12-19

Application number: US16585973

Application date: 2019-09-27

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Kai Troester, Scott Thomas Bingham, John M. King, Michael Estlick, Erik Swanson, Robert Weidner

IPC: G06F9/38, G06F9/30

CPC classification number: G06F9/3861, G06F9/30036, G06F9/30038, G06F9/30043, G06F9/3887, G06F9/30018

Abstract: A processor includes a load/store unit and an execution pipeline to execute an instruction that represents a single-instruction-multiple-data (SIMD) operation, and which references a memory block storing operand data for one or more lanes of a plurality of lanes and a mask vector indicating which lanes of a plurality of lanes are enabled and which are disabled for the operation. The execution pipeline executes an instruction in a first execution mode unless a memory fault is generated during execution of the instruction in the first execution mode. In response to the memory fault, the execution pipeline re-executes the instruction in a second execution mode. In the first execution mode, a single load operation is attempted to access the memory block via the load/store unit. In the second execution mode, a separate load operation is performed by the load/store unit for each enabled lane of the plurality of lanes prior to executing the SIMD operation.
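
The two execution modes in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the `Memory` class, the `MemoryFault` exception, and the `masked_load` function are hypothetical names, one address stands for one lane, and the single block-wide load of the first mode is modeled as touching every lane's address, enabled or not.

```python
class MemoryFault(Exception):
    """Raised when a load touches an unmapped address (hypothetical model)."""


class Memory:
    """Toy memory in which only some addresses are mapped."""

    def __init__(self, mapped):
        self.mapped = dict(mapped)  # address -> value

    def load(self, addr):
        if addr not in self.mapped:
            raise MemoryFault(addr)
        return self.mapped[addr]


def masked_load(mem, base, mask):
    """Return (lanes, mode): per-lane values (None if disabled) and the mode used."""
    n = len(mask)
    try:
        # First mode (fast): one block-wide access that covers every lane,
        # including disabled ones, so it can fault on data nobody needs.
        block = [mem.load(base + i) for i in range(n)]
        return [v if m else None for v, m in zip(block, mask)], "fast"
    except MemoryFault:
        # Second mode (slow): re-execute with a separate load per enabled lane.
        # A fault here is a real fault and propagates to the caller.
        lanes = [mem.load(base + i) if m else None for i, m in enumerate(mask)]
        return lanes, "slow"
```

With addresses 0 to 2 mapped and a four-lane mask whose last lane is disabled, the fast attempt faults on address 3 and the slow re-execution succeeds, because the only unmapped address belongs to a disabled lane.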

22.

Granted invention patent
Differential pipeline delays in a coprocessor
Legal status: In force

Publication (announcement) number: US11709681B2

Publication (announcement) date: 2023-07-25

Application number: US15837974

Application date: 2017-12-11

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Jay Fleischman, Michael Estlick, Michael Christopher Sedmak, Erik Swanson, Sneha V. Desai

IPC: G06F9/38

CPC classification number: G06F9/3867, G06F9/3836

Abstract: A coprocessor such as a floating-point unit includes a pipeline that is partitioned into a first portion and a second portion. A controller is configured to provide control signals to the first portion and the second portion of the pipeline. A first physical distance traversed by control signals propagating from the controller to the first portion of the pipeline is shorter than a second physical distance traversed by control signals propagating from the controller to the second portion of the pipeline. A scheduler is configured to cause a physical register file to provide a first subset of bits of an instruction to the first portion at a first time. The physical register file provides a second subset of the bits of the instruction to the second portion at a second time subsequent to the first time.

23.

Invention application
THREAD FORWARD PROGRESS AND/OR QUALITY OF SERVICE
Legal status: In force

Publication (announcement) number: US20230034933A1

Publication (announcement) date: 2023-02-02

Application number: US17390149

Application date: 2021-07-30

Applicant: Advanced Micro Devices, Inc.

Inventor: Michael Estlick, Erik Swanson, Eric Dixon

IPC: G06F9/48, G06F9/38

Abstract: Methods, systems, and apparatuses provide support for allowing thread forward progress in a processing system and for improving quality of service. One system includes a processor; a bus coupled to the processor; a memory coupled to the processor via the bus; and a floating point unit coupled to the processor via the bus, wherein the floating point unit comprises hardware control logic operative to: store for each thread, by a scheduler of the floating point unit, a counter; increase, by the scheduler, a value of the counter corresponding to a thread when at least one source ready operation exists for the thread; compare, by the scheduler, the value of the counter to a predetermined threshold; and make other threads ineligible to be picked by the scheduler when the counter is greater than or equal to the predetermined threshold.
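
The counter-and-threshold mechanism in this abstract can be sketched in software as follows. This is an illustrative model only: the `Scheduler` class and `pick` method are hypothetical names, and two details are assumptions that the abstract does not state, namely that the default pick policy favors the lowest thread id and that a thread's counter resets to zero when it is picked.

```python
class Scheduler:
    """Toy model of a scheduler with per-thread forward-progress counters."""

    def __init__(self, num_threads, threshold):
        self.threshold = threshold
        self.counters = [0] * num_threads  # one counter stored per thread

    def pick(self, ready):
        """ready[t] is True when thread t has at least one source-ready operation.

        Returns the id of the picked thread, or None if nothing is ready.
        """
        candidates = [t for t, r in enumerate(ready) if r]
        if not candidates:
            return None
        # Compare each counter to the threshold; once a thread reaches it,
        # every other thread becomes ineligible for this pick.
        starved = [t for t in candidates if self.counters[t] >= self.threshold]
        eligible = starved if starved else candidates
        chosen = eligible[0]  # assumed default policy: lowest thread id wins
        for t in candidates:
            if t == chosen:
                self.counters[t] = 0   # assumption: reset on being picked
            else:
                self.counters[t] += 1  # ready but passed over this cycle
        return chosen
```

With two always-ready threads and a threshold of 3, thread 0 wins three picks in a row, after which thread 1's counter reaches the threshold and thread 1 is picked.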

24.

Invention application
APPARATUS AND METHODS EMPLOYING A SHARED READ PORT REGISTER FILE
Legal status: In force

Publication (announcement) number: US20230034072A1

Publication (announcement) date: 2023-02-02

Application number: US17389838

Application date: 2021-07-30

Applicant: Advanced Micro Devices, Inc.

Inventor: Michael Estlick, Erik Swanson, Eric Dixon, Todd Baumgartner

IPC: G06F9/38, G06F9/30

Abstract: In some implementations, a processor includes a plurality of parallel instruction pipes, a register file includes at least one shared read port configured to be shared across multiple pipes of the plurality of parallel instruction pipes. Control logic controls multiple parallel instruction pipes to read from the at least one shared read port. In certain examples, the at least one shared register file read port is coupled as a single read port for one of the parallel instruction pipes and as a shared register file read port for a plurality of other parallel instruction pipes.

25.

Granted invention patent
Setting values of portions of registers based on bit values
Legal status: In force

Publication (announcement) number: US11451241B2

Publication (announcement) date: 2022-09-20

Application number: US15842027

Application date: 2017-12-14

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Erik Swanson, Sneha V. Desai, Michael Estlick

IPC: G06F12/08, H03M7/20, G06F16/16, G06F16/903, G06F12/0891

Abstract: A processor employs a set of bits to indicate values of portions of registers of a register file. In response to a specified instruction indicating an expected change of instruction types to be executed, the processor sets one or more of the bits and, for subsequent instructions, interprets corresponding portions of the registers as having a specified value (e.g., zero). By employing the set of bits to set the values of the register portions, rather than setting the individual portions of the registers to the specified value, the processor conserves processor resources (e.g., power) when the processor transitions between executing instructions of different types.
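
The idea of marking register portions as zero with flag bits, rather than writing zeros into them, can be sketched as below. This is a simplified illustration, not the patented design: the `RegisterFile` class, the split of each register into a `low` and a `high` half, and the `zero_upper_all` method are all assumptions introduced for the example.

```python
class RegisterFile:
    """Toy register file whose upper halves can be flagged as zero."""

    def __init__(self, num_regs):
        self.low = [0] * num_regs
        self.high = [0] * num_regs
        self.high_is_zero = [False] * num_regs  # one flag bit per register
        self.high_writes = 0  # counts physical writes to upper halves

    def write(self, r, low, high):
        self.low[r] = low
        self.high[r] = high
        self.high_is_zero[r] = False  # a real write clears the flag
        self.high_writes += 1

    def zero_upper_all(self):
        # Only the flag bits change; the stored upper halves are not rewritten,
        # which is where the resource saving in this model comes from.
        self.high_is_zero = [True] * len(self.high_is_zero)

    def read(self, r):
        # Later reads interpret a flagged portion as the specified value (zero).
        high = 0 if self.high_is_zero[r] else self.high[r]
        return self.low[r], high
```

After `zero_upper_all()`, `read` reports zero for every upper half even though the `high` array still holds the old values and `high_writes` has not increased.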

26.

Granted invention patent
Register renaming after a non-pickable scheduler queue
Legal status: In force

Publication (announcement) number: US11281466B2

Publication (announcement) date: 2022-03-22

Application number: US16660495

Application date: 2019-10-22

Applicant: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC

Inventor: Arun A. Nair, Michael Estlick, Erik Swanson, Sneha V. Desai, Donglin Ji

IPC: G06F9/30, G06F9/38

Abstract: A floating point unit includes a non-pickable scheduler queue (NSQ) that offers a load operation concurrently with a load store unit retrieving load data for an operand that is to be loaded by the load operation. The floating point unit also includes a renamer that renames architectural registers used by the load operation and allocates physical register numbers to the load operation in response to receiving the load operation from the NSQ. The floating point unit further includes a set of pickable scheduler queues that receive the load operation from the renamer and store the load operation prior to execution. A physical register file is implemented in the floating point unit and a free list is used to store physical register numbers of entries in the physical register file that are available for allocation.
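
The ordering described in this abstract, where an operation waits in the non-pickable queue and is only given a physical register number when it moves on to the renamer, can be sketched as follows. The `FloatingPointUnit` class and its method names are hypothetical, and the model is an assumption-laden simplification: it tracks only destination registers and does not return previously mapped physical registers to the free list.

```python
from collections import deque


class FloatingPointUnit:
    """Toy model: NSQ -> renamer (free list) -> pickable scheduler queue."""

    def __init__(self, num_physical):
        self.nsq = deque()                           # non-pickable scheduler queue
        self.free_list = deque(range(num_physical))  # available physical register numbers
        self.rename_map = {}                         # architectural -> physical
        self.pickable = deque()                      # renamed ops awaiting execution

    def dispatch(self, dest_arch_reg):
        # Entering the NSQ does not consume a physical register.
        self.nsq.append(dest_arch_reg)

    def rename_next(self):
        """Move one op from the NSQ through the renamer; return its physical register."""
        if not self.nsq or not self.free_list:
            return None  # nothing to rename, or no physical register is free
        arch = self.nsq.popleft()
        phys = self.free_list.popleft()  # allocation happens only at this point
        self.rename_map[arch] = phys
        self.pickable.append((arch, phys))
        return phys
```

In this model, three dispatched operations can wait in the NSQ even when only two physical registers exist; the third simply stays queued until a register becomes available.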

27.

Granted invention patent
Computer-based square root and division operations
Legal status: In force

Publication (announcement) number: US09910638B1

Publication (announcement) date: 2018-03-06

Application number: US15247416

Application date: 2016-08-25

Applicant: Advanced Micro Devices, Inc.

Inventor: Hanbing Liu, John Kelley, Michael Estlick, Erik Swanson, Jay Fleischman

IPC: G06F7/38, G06F7/552, G06F7/535

CPC classification number: G06F7/5525, G06F7/535, G06F2207/5523

Abstract: Square root operations in a computer processor are disclosed. A first iteration for calculating partial results of a square root operation is performed in a larger number of cycles than remaining iterations. The first iteration requires calculation of a first digit that is larger than the subsequent digits. The first iteration thus requires multiplication of values that are larger than corresponding values for the subsequent other digits. By splitting the first digit into two parts, the required multiplications can be performed in less time than if the first digit were not split. Performing these multiplications in less time reduces the total delay for clock cycles associated with the first digit calculations, which increases the possible clock frequency allowed. A multiply-and-accumulate unit that performs either packed-single operations or double-precision operations may be used, along with a combined division/square root unit for simultaneous execution of division and square root operations.

28.

Invention application
COMPUTER-BASED SQUARE ROOT AND DIVISION OPERATIONS
Legal status: In force

Publication (announcement) number: US20180060039A1

Publication (announcement) date: 2018-03-01

Application number: US15247416

Application date: 2016-08-25

Applicant: Advanced Micro Devices, Inc.

Inventor: Hanbing Liu, John Kelley, Michael Estlick, Erik Swanson, Jay Fleischman

IPC: G06F7/552, G06F7/535

CPC classification number: G06F7/5525, G06F7/535, G06F2207/5523

Abstract: Square root operations in a computer processor are disclosed. A first iteration for calculating partial results of a square root operation is performed in a larger number of cycles than remaining iterations. The first iteration requires calculation of a first digit that is larger than the subsequent digits. The first iteration thus requires multiplication of values that are larger than corresponding values for the subsequent other digits. By splitting the first digit into two parts, the required multiplications can be performed in less time than if the first digit were not split. Performing these multiplications in less time reduces the total delay for clock cycles associated with the first digit calculations, which increases the possible clock frequency allowed. A multiply-and-accumulate unit that performs either packed-single operations or double-precision operations may be used, along with a combined division/square root unit for simultaneous execution of division and square root operations.

29.

Invention application
PROCESSOR AND METHODS FOR FLOATING POINT REGISTER ALIASING
Legal status: Under examination, published

Publication (announcement) number: US20150121040A1

Publication (announcement) date: 2015-04-30

Application number: US14523660

Application date: 2014-10-24

Applicant: Advanced Micro Devices, Inc.

Inventor: Robert E. Weidner, Jay E. Fleischman, Michael C. Sedmak, Michael Estlick, Richard McGowen, II, Emil Talpes

IPC: G06F9/30

CPC classification number: G06F9/3013, G06F9/30036, G06F9/30112, G06F9/3017, G06F9/384

Abstract: Methods, devices, and systems for accessing packed registers are presented. A state of the packed registers may be tracked and it may be determined whether the register is directly accessible based on the state. If the register is not directly accessible, an action may be performed which allows the register to be accessed directly. The action may include injecting at least one uop for reorganizing the physical storage of the register such that it is directly accessible. The action may include aligning the data with the least significant bit of a physical register or otherwise aligning the data with the datapath. The action may also include changing the state of the packed registers.
