Patent search ap:("QUALCOMM Incorporated") AND inv:"Gregory Michael Wright" Page 1

1.

Granted invention patent
Replay of partially executed instruction blocks in a processor-based system employing a block-atomic execution model [Status: active]

Publication number: US11188336B2

Publication date: 2021-11-30

Application number: US15252323

Application date: 2016-08-31

Applicant: QUALCOMM Incorporated

Inventor: Gregory Michael Wright

IPC: G06F9/30, G06F9/38

Abstract: Replay of partially executed instruction blocks in a processor-based system employing a block-atomic execution model is disclosed. In one aspect, a partial replay controller is provided in a processor(s) of a central processing unit (CPU). If an instruction is detected in the instruction block associated with a potential architectural state modification, or an exception occurs during execution of instructions, the instruction block is re-executed. During re-execution of the instruction block, the partial replay controller is configured to record produced results from load/store instructions. Thus, if an exception occurs during re-execution of the instruction block, previously recorded produced results for the executed load/store instructions before the exception occurred are replayed during re-execution of the instruction block after the exception is resolved. Thus, execution of instructions leading up to side-effect operations in the instruction block can be deterministically repeated with previously produced results, without repeating the side-effects.
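
The record-and-replay behavior described in this abstract can be illustrated with a minimal Python sketch. It is not the patented design: the class `PartialReplayLog`, the helper `make_reader`, and the simulated fault are all hypothetical names chosen for illustration, and only load results are modeled.

```python
class PartialReplayLog:
    """Hypothetical model: records load results on one pass, replays them on the next."""

    def __init__(self):
        self.recorded = []  # results produced by loads, in program order
        self.cursor = 0     # index of the next load in the current pass

    def begin_pass(self):
        self.cursor = 0

    def load(self, read_fn):
        # Replay a recorded result if one exists for this position; otherwise
        # perform the real (possibly side-effecting) read and record it.
        if self.cursor < len(self.recorded):
            value = self.recorded[self.cursor]
        else:
            value = read_fn()
            self.recorded.append(value)
        self.cursor += 1
        return value


reads = []  # stands in for externally visible side effects of real reads


def make_reader(name, value, fail_first=False):
    state = {"failed": False}

    def read():
        if fail_first and not state["failed"]:
            state["failed"] = True
            raise RuntimeError("simulated exception")
        reads.append(name)
        return value

    return read


def run_block(log, readers):
    log.begin_pass()
    return [log.load(r) for r in readers]


log = PartialReplayLog()
readers = [make_reader("a", 1), make_reader("b", 2, fail_first=True)]
try:
    run_block(log, readers)      # second load raises on the first pass
except RuntimeError:
    pass                         # exception "resolved" here
result = run_block(log, readers)  # first load is replayed, not repeated
```

In this sketch the read named "a" reaches the underlying device only once, even though the block runs twice.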

2.

Granted invention patent
Providing variable interpretation of usefulness indicators for memory tables in processor-based systems [Status: active]

Publication number: US10725782B2

Publication date: 2020-07-28

Application number: US15701926

Application date: 2017-09-12

Applicant: QUALCOMM Incorporated

Inventor: Anil Krishna, Yongseok Yi, Eric Rotenberg, Vignyan Reddy Kothinti Naresh, Gregory Michael Wright

IPC: G06F12/00, G06F9/38, G06N3/063, G06N5/02, G06F12/123, G06N20/00

Abstract: Providing variable interpretation of usefulness indicators for memory tables in processor-based systems is disclosed. In one aspect, a memory system comprises a memory table providing multiple memory table entries, each including a usefulness indicator. A memory controller of the memory system comprises a global polarity indicator representing how the usefulness indicator for each memory table entry is interpreted and updated by the memory controller. If the global polarity indicator is set, the memory controller interprets a value of each usefulness indicator as directly corresponding to the usefulness of the corresponding memory table entry. Conversely, if the global polarity indicator is not set, the polarity is reversed such that the memory controller interprets the usefulness indicator value as inversely corresponding to the usefulness of the corresponding memory table entry. In this manner, the interpretation and updating of usefulness indicators by the memory controller can be varied using the global polarity indicator.
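
As a rough illustration of the global polarity idea, the following Python sketch models a table whose stored indicator values are read either directly or inversely. The class name `MemoryTable`, the 2-bit indicator width (`MAX_U = 3`), and the saturating update rule are assumptions made for the example, not details taken from the patent.

```python
MAX_U = 3  # assumed: 2-bit usefulness indicator, values 0..3


class MemoryTable:
    """Hypothetical table with one usefulness indicator per entry."""

    def __init__(self, size):
        self.u = [0] * size
        self.polarity = True  # set: stored value corresponds directly to usefulness

    def usefulness(self, i):
        # Interpret the stored value according to the global polarity indicator.
        return self.u[i] if self.polarity else MAX_U - self.u[i]

    def mark_more_useful(self, i):
        # The update direction also depends on the polarity.
        if self.polarity:
            self.u[i] = min(MAX_U, self.u[i] + 1)
        else:
            self.u[i] = max(0, self.u[i] - 1)

    def flip_polarity(self):
        self.polarity = not self.polarity


table = MemoryTable(2)
table.mark_more_useful(0)
table.mark_more_useful(0)
before = (table.usefulness(0), table.usefulness(1))
table.flip_polarity()
after = (table.usefulness(0), table.usefulness(1))
```

One consequence visible in this sketch is that toggling the single flag changes how every entry is read without writing any entry.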

3.

Invention patent application
PROVIDING COHERENT MERGING OF COMMITTED STORE QUEUE ENTRIES IN UNORDERED STORE QUEUES OF BLOCK-BASED COMPUTER PROCESSORS [Status: active]

Publication number: US20170091102A1

Publication date: 2017-03-30

Application number: US14863577

Application date: 2015-09-24

Applicant: QUALCOMM Incorporated

Inventor: Gregory Michael Wright

IPC: G06F12/08, G06F12/12

CPC classification number: G06F12/0833, G06F9/3834, G06F9/3855, G06F12/0855, G06F12/12, G06F12/123, G06F12/128, G06F2212/621, G06F2212/69, G06F2212/70

Abstract: Providing coherent merging of committed store queue entries in unordered store queues of block-based computer processors is disclosed. In one aspect, a block-based computer processor provides a merging logic circuit communicatively coupled to an unordered store queue and cache memory. The merging logic circuit is configured to select a first store queue entry in the unordered store queue, and read its memory address, an age indicator, and a data value. The age indicator and the data value are stored in merged data bytes within a merged data buffer. The merging logic circuit then locates a remaining store queue entry having a memory address identical to the first selected store queue entry, and reads its age indicator and data value. Based on the age indicator and one or more age indicators of the merged data bytes within the merged data buffer, the data value is merged into the merged data buffer.
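
The age-based merge described above can be sketched in Python as follows. This is a simplified model under stated assumptions: `StoreEntry` and `merge_entries` are hypothetical names, a larger age value is assumed to mean a younger store, and each entry is assumed to carry a map of byte offsets to byte values.

```python
from dataclasses import dataclass


@dataclass
class StoreEntry:
    address: int  # assumed: address shared by entries that can be merged
    age: int      # assumed: larger value means a younger (later) store
    data: dict    # assumed: byte offset -> byte value written by the store


def merge_entries(queue, address):
    """Merge all entries for `address`, keeping the youngest write per byte."""
    merged = {}  # byte offset -> (age, value), mirrors the merged data buffer
    for entry in queue:  # the queue is unordered, so age decides, not position
        if entry.address != address:
            continue
        for offset, value in entry.data.items():
            current = merged.get(offset)
            if current is None or entry.age > current[0]:
                merged[offset] = (entry.age, value)
    return {offset: value for offset, (_, value) in merged.items()}


queue = [
    StoreEntry(0x40, 2, {0: 0xAA, 1: 0xBB}),
    StoreEntry(0x80, 9, {0: 0xFF}),
    StoreEntry(0x40, 1, {1: 0x11, 2: 0x22}),
    StoreEntry(0x40, 3, {2: 0x33}),
]
merged_line = merge_entries(queue, 0x40)
```

Tracking an age per merged byte is what lets the entries be visited in any order while still producing the result of program-order stores.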

4.

Invention patent application
PROVIDING EFFICIENT HANDLING OF BRANCH DIVERGENCE IN VECTORIZABLE LOOPS BY VECTOR-PROCESSOR-BASED DEVICES [Status: under examination, published]

Publication number: US20200065098A1

Publication date: 2020-02-27

Application number: US16107136

Application date: 2018-08-21

Applicant: QUALCOMM Incorporated

Inventor: Hadi Parandeh Afshar, Eric Rotenberg, Gregory Michael Wright

IPC: G06F9/30, G06F9/48, G06F9/52, G06F1/06

Abstract: Providing efficient handling of branch divergence in vectorizable loops by vector-processor-based devices is disclosed. In some aspects, a vector-processor-based device provides a plurality of processing elements (PEs) coupled to a scheduler circuit comprising a clock cycle threshold and a mask register comprising a plurality of bits corresponding to a plurality of loop iterations of a vectorizable loop to be executed. The scheduler circuit initiates a first execution interval, during which loop iterations of the vectorizable loop are assigned to PEs for parallel execution. If a loop iteration's execution time exceeds the clock cycle threshold, the scheduler circuit sets a mask register bit corresponding to the loop iteration indicating that the loop iteration is incomplete, and defers its execution. After the first execution interval is complete, the scheduler circuit initiates a second execution interval, during which incomplete loop iterations indicated by the mask register are executed in parallel by the PEs.
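
The two-interval scheduling described in this abstract can be approximated by the Python sketch below. It is a simplification: the function `schedule` is a hypothetical name, and the per-iteration cycle counts are supplied up front, whereas a real scheduler circuit would observe them while the iterations run.

```python
def schedule(iteration_cycles, threshold):
    """Split loop iterations into two execution intervals using a mask register.

    iteration_cycles: assumed cycle count of each loop iteration.
    threshold: clock cycle threshold held by the scheduler.
    """
    mask = 0
    first_interval = []
    for i, cycles in enumerate(iteration_cycles):
        if cycles > threshold:
            mask |= 1 << i           # mark iteration i as incomplete and defer it
        else:
            first_interval.append(i)  # completes within the first interval
    # The second interval runs exactly the iterations flagged in the mask.
    second_interval = [i for i in range(len(iteration_cycles)) if (mask >> i) & 1]
    return first_interval, mask, second_interval


first, mask, second = schedule([3, 10, 2, 12], threshold=5)
```

Here iterations 1 and 3 exceed the threshold, so their mask bits are set and they run together in the second interval.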

5.

Invention patent application
PROVIDING RECONFIGURABLE FUSION OF PROCESSING ELEMENTS (PEs) IN VECTOR-PROCESSOR-BASED DEVICES [Status: under examination, published]

Publication number: US20200012618A1

Publication date: 2020-01-09

Application number: US16028072

Application date: 2018-07-05

Applicant: QUALCOMM Incorporated

Inventor: Hadi Parandeh Afshar, Amrit Panda, Eric Rotenberg, Gregory Michael Wright

IPC: G06F15/80, G06F15/78, G06F9/30

Abstract: Providing reconfigurable fusion of processing elements (PEs) in vector-processor-based devices is disclosed. In this regard, a vector-processor-based device provides a vector processor including a plurality of PEs and a decode/control circuit. The decode/control circuit receives an instruction block containing a vectorizable loop comprising a loop body. The decode/control circuit determines how many PEs of the plurality of PEs are required to execute the loop body, and reconfigures the plurality of PEs into one or more fused PEs, each including the determined number of PEs required to execute the loop body. The plurality of PEs, reconfigured into one or more fused PEs, then executes one or more loop iterations of the loop body. Some aspects further include a PE communications link interconnecting the plurality of PEs, to enable communications between PEs of a fused PE and communications of inter-iteration data dependencies between PEs without requiring vector register file access operations.

6.

Invention patent application
ENABLING PARALLEL MEMORY ACCESSES BY PROVIDING EXPLICIT AFFINE INSTRUCTIONS IN VECTOR-PROCESSOR-BASED DEVICES [Status: under examination, published]

Publication number: US20190384606A1

Publication date: 2019-12-19

Application number: US16012347

Application date: 2018-06-19

Applicant: QUALCOMM Incorporated

Inventor: Amrit Panda, Eric Rotenberg, Hadi Parandeh Afshar, Gregory Michael Wright

IPC: G06F9/345, G06F9/38, G06F8/41

Abstract: Enabling parallel memory accesses by providing explicit affine instructions in vector-processor-based devices is disclosed. In this regard, a vector-processor-based device implementing a block-based dataflow instruction set architecture (ISA) includes a decoder circuit configured to provide an affine instruction that specifies a base parameter indicating a base value B, a stride parameter indicating a stride interval value S, and a count parameter indicating a count value C. The decoder circuit of the vector-processor-based device decodes the affine instruction, and generates an output stream comprising one or more output values, wherein a count of the output values of the output stream equals the count value C. Using an index X where 0≤X

7.

Invention patent application
REPLAY OF PARTIALLY EXECUTED INSTRUCTION BLOCKS IN A PROCESSOR-BASED SYSTEM EMPLOYING A BLOCK-ATOMIC EXECUTION MODEL [Status: under examination, published]

Publication number: US20170185408A1

Publication date: 2017-06-29

Application number: US15252323

Application date: 2016-08-31

Applicant: QUALCOMM Incorporated

Inventor: Gregory Michael Wright

IPC: G06F9/30

CPC classification number: G06F9/30181, G06F9/30043, G06F9/3832, G06F9/3861

Abstract: Replay of partially executed instruction blocks in a processor-based system employing a block-atomic execution model is disclosed. In one aspect, a partial replay controller is provided in a processor(s) of a central processing unit (CPU). If an instruction is detected in the instruction block associated with a potential architectural state modification, or an exception occurs during execution of instructions, the instruction block is re-executed. During re-execution of the instruction block, the partial replay controller is configured to record produced results from load/store instructions. Thus, if an exception occurs during re-execution of the instruction block, previously recorded produced results for the executed load/store instructions before the exception occurred are replayed during re-execution of the instruction block after the exception is resolved. Thus, execution of instructions leading up to side-effect operations in the instruction block can be deterministically repeated with previously produced results, without repeating the side-effects.

8.

Invention patent application
MANAGING ALLOCATION OF PHYSICAL REGISTERS IN A BLOCK-BASED INSTRUCTION SET ARCHITECTURE (ISA), AND RELATED APPARATUSES AND METHODS [Status: under examination, published]

Publication number: US20160179532A1

Publication date: 2016-06-23

Application number: US14578913

Application date: 2014-12-22

Applicant: QUALCOMM Incorporated

Inventor: Gregory Michael Wright

IPC: G06F9/30, G06F12/08

CPC classification number: G06F12/0875, G06F9/30123, G06F9/3836, G06F9/384, G06F9/3857, G06F9/3859, G06F12/084, G06F2212/314, G06F2212/452

Abstract: Managing allocation of physical registers in a block-based instruction set architecture (ISA), and related apparatuses and methods, are disclosed. In one aspect, an apparatus provides an instruction processing circuit communicatively coupled to multiple physical registers. The instruction processing circuit includes a register rename map that comprises an association between at least one architectural register and at least one of the multiple physical registers. The instruction processing circuit further comprises an in-use indicator set associated with the register rename map, the in-use indicator set indicative of an in-use physical register among the multiple physical registers. The instruction processing circuit is configured to copy the in-use indicator set to an output in-use indicator set, and modify the output in-use indicator set upon detection of a block-based write instruction to mark the in-use physical register as unused.
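
The copy-then-modify handling of the in-use indicator set can be sketched in Python as below. The function `output_in_use`, the register names, and the use of a Python set to stand in for the indicator bits are all assumptions made for illustration.

```python
def output_in_use(rename_map, in_use, block_writes):
    """Derive the output in-use indicator set for a block.

    rename_map: architectural register -> currently mapped physical register.
    in_use: set of physical registers currently marked in use.
    block_writes: architectural registers written by the block (assumed input).
    """
    out = set(in_use)  # copy, so the original indicator set is left untouched
    for arch_reg in block_writes:
        # The physical register currently mapped to a written architectural
        # register is marked unused in the output set.
        out.discard(rename_map[arch_reg])
    return out


rename_map = {"r1": "p4", "r2": "p7", "r3": "p9"}
in_use = {"p4", "p7", "p9"}
out = output_in_use(rename_map, in_use, ["r2"])
```

Working on a copy keeps the input set available unchanged while the output set reflects the block's writes.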


9.

Granted invention patent
Providing multi-element multi-vector (MEMV) register file access in vector-processor-based devices [Status: active]

Publication number: US11048509B2

Publication date: 2021-06-29

Application number: US16000580

Application date: 2018-06-05

Applicant: QUALCOMM Incorporated

Inventor: Hadi Parandeh Afshar, Amrit Panda, Eric Rotenberg, Gregory Michael Wright

IPC: G06F9/30, G06F15/78, G06F15/80

Abstract: Providing multi-element multi-vector (MEMV) register file access in vector-processor-based devices is disclosed. In this regard, a vector-processor-based device includes a vector processor comprising multiple processing elements (PEs) communicatively coupled via a corresponding plurality of channels to a vector register file comprising a plurality of memory banks. The vector processor provides a direct memory access (DMA) controller that is configured to receive a plurality of vectors that each comprise a plurality of vector elements representing operands for processing a loop iteration. The DMA controller arranges the vectors in the vector register file such that, for each group of vectors to be accessed in parallel, vector elements for each vector are stored consecutively, but corresponding vector elements of consecutive vectors are stored in different memory banks of the vector register file. As a result, multiple elements of multiple vectors may be accessed with a single vector register file access operation.
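
One way to picture the bank arrangement in this abstract is the Python sketch below. The skew formula in `bank_of` is an assumption chosen so that the stated properties hold in a small example; the abstract does not give the actual mapping, and all names here are hypothetical.

```python
def bank_of(vector, element, elems_per_access, num_banks):
    # Assumed skew: elements of one vector occupy consecutive banks, and
    # vector v starts elems_per_access banks after vector v - 1.
    return (vector * elems_per_access + element) % num_banks


def banks_for_access(num_vectors, elems_per_access, num_banks):
    """Banks touched when reading several elements of several vectors at once."""
    return [
        bank_of(v, e, elems_per_access, num_banks)
        for v in range(num_vectors)
        for e in range(elems_per_access)
    ]


banks = banks_for_access(num_vectors=2, elems_per_access=2, num_banks=4)
conflict_free = len(set(banks)) == len(banks)  # every element in its own bank
```

With four banks, two elements from each of two vectors land in four different banks, so under this assumed mapping they could be read in one access.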

10.

Granted invention patent
Providing memory dependence prediction in block-atomic dataflow architectures [Status: active]

Publication number: US10684859B2

Publication date: 2020-06-16

Application number: US15269254

Application date: 2016-09-19

Applicant: QUALCOMM Incorporated

Inventor: Chen-Han Ho, Gregory Michael Wright

IPC: G06F9/30, G06F9/38

Abstract: Memory dependence prediction in block-atomic dataflow architectures is provided, in one aspect, by a memory dependence prediction circuit. The memory dependence prediction circuit comprises a predictor table configured to store multiple predictor table entries, each comprising a store instruction identifier, a block reach set, and a load set. Using this data, the memory dependence prediction circuit determines, upon a fetch of an instruction block by an execution pipeline, whether the instruction block contains store instructions that reach dependent load instructions. If so, the store instructions are marked as having dependent load instructions to wake. In some aspects, the memory dependence prediction circuit is configured to determine whether the instruction block contains dependent load instructions reached by store instructions. If so, the memory dependence prediction circuit delays execution of the dependent load instructions.
