Patent search: ap:("Andrew T. Forsyth" OR "Dennis R. Bradford" OR "Jonathan C. Hall") AND inv:"Dennis R. Bradford" (page 1)

1.

Patent application
PROCESSING MEMORY ACCESS INSTRUCTIONS THAT HAVE DUPLICATE MEMORY INDICES (in force)

Publication No.: US20140095779A1

Publication date: 2014-04-03

Application No.: US13631378

Filing date: 2012-09-28

Applicants: Andrew T. Forsyth, Dennis R. Bradford, Jonathan C. Hall

Inventors: Andrew T. Forsyth, Dennis R. Bradford, Jonathan C. Hall

IPC classification: G06F12/00, G06F12/02

CPC classification: G06F12/00, G06F3/06, G06F3/0608, G06F3/0641, G06F9/30, G06F9/30018, G06F9/30036, G06F9/30043, G06F11/1453, G06F12/0246

Abstract: A method of an aspect includes receiving an instruction indicating a first source packed memory indices, a second source packed data operation mask, and a destination storage location. Memory indices of the packed memory indices are compared with one another. One or more sets of duplicate memory indices are identified. Data corresponding to each set of duplicate memory indices is loaded only once. The loaded data corresponding to each set of duplicate memory indices is replicated for each of the duplicate memory indices in the set. A packed data result is stored in the destination storage location in response to the instruction. The packed data result includes data elements from memory locations that are indicated by corresponding memory indices of the packed memory indices when not blocked by corresponding elements of the packed data operation mask.
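The load-once-and-replicate behaviour described in this abstract can be sketched in software as follows. This is only an illustrative model, not the patented hardware: the function name `masked_gather_dedup`, the list-based memory, and the choice to leave masked-out destination elements unchanged are all assumptions.

```python
def masked_gather_dedup(memory, indices, mask, dest):
    """Gather memory[indices[i]] into position i wherever mask[i] is set.

    Each distinct index is loaded from memory only once; the loaded value is
    then replicated to every position that uses the same index.
    Masked-out positions keep the old destination value (an assumption).
    Returns the packed result and the number of memory loads performed.
    """
    loaded = {}          # index -> value already fetched from memory
    loads = 0
    result = list(dest)
    for i, (idx, m) in enumerate(zip(indices, mask)):
        if not m:
            continue     # element blocked by the operation mask
        if idx not in loaded:
            loaded[idx] = memory[idx]   # the single load for this index
            loads += 1
        result[i] = loaded[idx]         # replicate for duplicates
    return result, loads


memory = [10, 20, 30, 40, 50]
result, loads = masked_gather_dedup(memory, [3, 1, 3, 0], [1, 1, 1, 0], [-1] * 4)
print(result, loads)
```

In this usage, index 3 appears twice but is fetched once, so three unmasked elements cost two loads.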

2.

Patent application
APPARATUS AND METHOD FOR EFFICIENT GATHER AND SCATTER OPERATIONS (in force)

Publication No.: US20140095831A1

Publication date: 2014-04-03

Application No.: US13631071

Filing date: 2012-09-28

Applicants: Edward T. Grochowski, Dennis R. Bradford, George Z. Chrysos, Andrew T. Forsyth, Michael D. Upton, Lisa K. Wu

Inventors: Edward T. Grochowski, Dennis R. Bradford, George Z. Chrysos, Andrew T. Forsyth, Michael D. Upton, Lisa K. Wu

IPC classification: G06F9/30

CPC classification: G06F9/30036, G06F9/30018, G06F9/30043, G06F9/30145, G06F9/345, G06F9/355

Abstract: An apparatus and method are described for performing efficient gather operations in a pipelined processor. For example, a processor according to one embodiment of the invention comprises: gather setup logic to execute one or more gather setup operations in anticipation of one or more gather operations, the gather setup operations to determine one or more addresses of vector data elements to be gathered by the gather operations; and gather logic to execute the one or more gather operations to gather the vector data elements using the one or more addresses determined by the gather setup operations.
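The split between a setup step that determines addresses and a gather step that uses them can be sketched as two functions. The base-plus-scaled-index address formula and the names `gather_setup` and `gather` are assumptions made for illustration; the abstract does not specify how addresses are computed.

```python
def gather_setup(base, indices, scale):
    """Setup step: determine the address of each vector element in advance.

    Assumes addresses are base + index * scale (a hypothetical formula).
    """
    return [base + i * scale for i in indices]


def gather(memory, addresses):
    """Gather step: fetch the elements at the precomputed addresses."""
    return [memory[a] for a in addresses]


memory = list(range(100, 132))               # toy memory, one value per address
addresses = gather_setup(base=4, indices=[0, 2, 5], scale=4)
print(addresses, gather(memory, addresses))
```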

3.

Patent application
METHOD AND APPARATUS FOR EFFICIENTLY MANAGING ARCHITECTURAL REGISTER STATE OF A PROCESSOR (in force)

Publication No.: US20160179527A1

Publication date: 2016-06-23

Application No.: US14581535

Filing date: 2014-12-23

Applicants: Jesus Corbal, Dennis R. Bradford, Benjamin C. Chaffin, Taraneh Bahrami, Jonathan C. Hall, Thomas B. Maciukenas, Roger Gramunt, Rohan Sharma

Inventors: Jesus Corbal, Dennis R. Bradford, Benjamin C. Chaffin, Taraneh Bahrami, Jonathan C. Hall, Thomas B. Maciukenas, Roger Gramunt, Rohan Sharma

IPC classification: G06F9/30

CPC classification: G06F9/30036, G06F9/30018, G06F9/30032, G06F9/30072, G06F9/30101, G06F15/8084

Abstract: An apparatus and method for efficiently managing the architectural state of a processor. For example, one embodiment of a processor comprises: a source mask register to be logically subdivided into at least a first portion to store a usable portion of a mask value and a second portion to store an indication of whether the usable portion of the mask value has been updated; a control register to store an unusable portion of the mask value; architectural state management logic to read the indication to determine whether the mask value has been updated prior to performing a store operation, wherein if the mask value has been updated, then the architectural state management logic is to read the usable portion of the mask value from the first portion of the source mask register and zero out bits of the unusable portion of the mask value to generate a final mask value to be saved to memory, and wherein if the mask value has not been updated, then the architectural state management logic is to concatenate the usable portion of the mask value with the unusable portion of the mask value read from the control register to generate a final mask value to be saved to memory.
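The two save paths in this abstract can be sketched as a single function. The 16-bit width of the usable portion, the placement of the unusable portion in the upper bits, and the name `final_mask_to_save` are assumptions; the abstract gives no widths or bit layout.

```python
USABLE_BITS = 16  # assumed width of the usable portion of the mask value


def final_mask_to_save(usable, updated, unusable_from_control):
    """Build the mask value that would be written to memory on a state save.

    usable: usable portion held in the source mask register
    updated: the indication of whether the usable portion has been updated
    unusable_from_control: unusable portion held in the control register
    """
    usable &= (1 << USABLE_BITS) - 1
    if updated:
        # Updated mask: the unusable bits are zeroed out.
        return usable
    # Not updated: concatenate the unusable portion (assumed to occupy the
    # upper bits) with the usable portion.
    return (unusable_from_control << USABLE_BITS) | usable


print(hex(final_mask_to_save(0xABCD, True, 0x1234)))
print(hex(final_mask_to_save(0xABCD, False, 0x1234)))
```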

4.

Granted patent
Processors having fully-connected interconnects shared by vector conflict instructions and permute instructions (in force)

Publication No.: US10678541B2

Publication date: 2020-06-09

Application No.: US13977126

Filing date: 2011-12-29

Applicants: Andrew Thomas Forsyth, Dennis R. Bradford

Inventors: Andrew Thomas Forsyth, Dennis R. Bradford

IPC classification: G06F9/30, G06F9/38

Abstract: An apparatus includes a decode unit to decode a permute instruction and a vector conflict instruction. A vector execution unit is coupled with the decode unit and includes a fully-connected interconnect. The fully-connected interconnect has at least four inputs to receive at least four corresponding data elements of at least one source vector. The fully-connected interconnect has at least four outputs. Each of the at least four inputs is coupled with each of the at least four outputs. The execution unit also includes a permute instruction execution logic coupled with the at least four outputs and operable to store a first vector result in response to the permute instruction. The execution unit also includes a vector conflict instruction execution logic coupled with the at least four outputs and operable to store a second vector result in a destination storage location in response to the vector conflict instruction.
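Both instructions need every output lane to see every input element, which is why they can share one fully-connected interconnect. The sketch below models that in software. The result encoding for the conflict operation (a bitmask of earlier matching elements, in the style of AVX-512 conflict detection) is an assumption, since the abstract does not define the result format; the function names are also illustrative.

```python
def permute(src, control):
    """Each output lane selects any input element named by its control index."""
    return [src[c % len(src)] for c in control]


def vector_conflict(src):
    """For each lane, return a bitmask of earlier lanes holding the same value.

    Bit j of output i is set when src[j] == src[i] and j < i (assumed encoding).
    """
    out = []
    for i, x in enumerate(src):
        bits = 0
        for j in range(i):
            if src[j] == x:
                bits |= 1 << j
        out.append(bits)
    return out


print(permute([10, 20, 30, 40], [3, 3, 0, 1]))
print(vector_conflict([7, 5, 7, 7]))
```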

5.

Patent application
SYSTEMS, APPARATUSES, AND METHODS FOR JUMPS USING A MASK REGISTER (under examination, published)

Publication No.: US20120254593A1

Publication date: 2012-10-04

Application No.: US13078901

Filing date: 2011-04-01

Applicants: Jesus Corbal San Adrian, Bret Toll, Robert C. Valentine, Milind Baburao Girkar, Andrew Thomas Forsyth, George Z. Chrysos, Edward Thomas Grochowski, Dennis R. Bradford

Inventors: Jesus Corbal San Adrian, Bret Toll, Robert C. Valentine, Milind Baburao Girkar, Andrew Thomas Forsyth, George Z. Chrysos, Edward Thomas Grochowski, Dennis R. Bradford

IPC classification: G06F9/38, G06F9/312

CPC classification: G06F9/324, G06F9/30018, G06F9/30058, G06F9/30094

Abstract: Embodiments of systems, apparatuses, and methods for performing a jump instruction in a computer processor are described. In some embodiments, the execution of a jump instruction causes a conditional jump to an address of a target instruction when all bits of a writemask are zero, wherein the address of the target instruction is calculated using an instruction pointer of the instruction and a relative offset.
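A minimal sketch of the jump-on-zero-mask behaviour follows. It assumes the target is the instruction's own pointer plus the relative offset, and that a non-taken jump falls through to the next instruction; the instruction length parameter and the name `jump_if_mask_zero` are illustrative and not taken from the application.

```python
def jump_if_mask_zero(ip, instr_len, writemask, rel_offset):
    """Return the next instruction pointer for a jump conditioned on a mask.

    The jump is taken only when every bit of the writemask is zero.
    Target = ip + rel_offset (an assumed reading of the abstract).
    """
    if writemask == 0:
        return ip + rel_offset      # taken: go to the target instruction
    return ip + instr_len           # not taken: fall through


print(hex(jump_if_mask_zero(0x1000, 6, 0b0000, 0x40)))
print(hex(jump_if_mask_zero(0x1000, 6, 0b0100, 0x40)))
```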

6.

Patent application
Reducing Power Consumption In A Fused Multiply-Add (FMA) Unit Responsive To Input Data Values (in force)

Publication No.: US20140122554A1

Publication date: 2014-05-01

Application No.: US13664689

Filing date: 2012-10-31

Applicants: Brian J. Hickmann, Dennis R. Bradford, Thomas D. Fletcher

Inventors: Brian J. Hickmann, Dennis R. Bradford, Thomas D. Fletcher

IPC classification: G06F7/60

CPC classification: G06F7/60, G06F1/324, G06F1/3243, G06F7/48, G06F7/483, G06F7/5443, G06F7/57, G06F9/30, G06F9/3001, G06F2207/3884, Y02D10/126, Y02D10/152

Abstract: In an embodiment, a fused multiply-add (FMA) circuit is configured to receive a plurality of input data values to perform an FMA instruction on the input data values. The circuit includes a multiplier unit and an adder unit coupled to an output of the multiplier unit, and a control logic to receive the input data values and to reduce switching activity and thus reduce power consumption of one or more components of the circuit based on a value of one or more of the input data values. Other embodiments are described and claimed.
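The abstract does not say which input values trigger the power saving. The sketch below assumes the simplest case, a zero multiplicand, in which the product is known without activating the multiplier. It models only the control decision: it ignores IEEE 754 special cases (infinities, NaNs, signed zeros), and Python's `a * b + c` rounds twice rather than once as a true fused operation would. The name `fma_gated` is hypothetical.

```python
def fma_gated(a, b, c):
    """Compute a * b + c, reporting whether the multiplier had to be used.

    When a or b is zero the product is zero, so the multiplier is bypassed
    and the addend passes straight through (assumed gating condition).
    """
    if a == 0.0 or b == 0.0:
        return c, False             # multiplier idle, no switching activity
    return a * b + c, True          # normal path through multiplier and adder


print(fma_gated(0.0, 3.0, 2.5))
print(fma_gated(2.0, 3.0, 1.0))
```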

7.

Patent application
SYSTEM, APPARATUS, AND METHOD FOR ALIGNING REGISTERS (under examination, published)

Publication No.: US20120254589A1

Publication date: 2012-10-04

Application No.: US13078868

Filing date: 2011-04-01

Applicants: Jesus Corbal San Adrian, Roger Espasa Sans, Milind Baburao Girkar, Lisa K. Wu, Dennis R. Bradford, Victor W. Lee

Inventors: Jesus Corbal San Adrian, Roger Espasa Sans, Milind Baburao Girkar, Lisa K. Wu, Dennis R. Bradford, Victor W. Lee

IPC classification: G06F9/30, G06F9/315

CPC classification: G06F9/30036, G06F9/30018, G06F9/30032, G06F9/30192

Abstract: Embodiments of systems, apparatuses, and methods for performing an align instruction in a computer processor are described. In some embodiments, the execution of an align instruction causes data elements of two concatenated sources to be selectively stored in a destination.
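One way to picture an align operation is as a sliding window over the concatenation of the two sources, with a mask deciding which destination elements are written. The element ordering, the shift parameter, the merge behaviour for masked-out elements, and the name `align` are assumptions; the abstract only states that elements of two concatenated sources are selectively stored.

```python
def align(src_hi, src_lo, shift, mask, dest):
    """Store a shifted window of the concatenated sources into the destination.

    src_lo supplies the low elements and src_hi the high elements of the
    concatenation; shift (0..len(src_lo)) picks where the window starts.
    Destination elements whose mask entry is 0 are left unchanged (assumption).
    """
    n = len(src_lo)
    window = (src_lo + src_hi)[shift:shift + n]
    return [w if m else d for w, m, d in zip(window, mask, dest)]


print(align([4, 5, 6, 7], [0, 1, 2, 3], 1, [1, 1, 0, 1], [9, 9, 9, 9]))
```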

8.

Patent application
PROCESSORS HAVING FULLY-CONNECTED INTERCONNECTS SHARED BY VECTOR CONFLICT INSTRUCTIONS AND PERMUTE INSTRUCTIONS (under examination, published)

Publication No.: US20140181466A1

Publication date: 2014-06-26

Application No.: US13977126

Filing date: 2011-12-29

Applicants: Andrew Thomas Forsyth, Dennis R. Bradford

Inventors: Andrew Thomas Forsyth, Dennis R. Bradford

IPC classification: G06F9/30, G06F9/38

Abstract: An apparatus includes a decode unit to decode a permute instruction and a vector conflict instruction. A vector execution unit is coupled with the decode unit and includes a fully-connected interconnect. The fully-connected interconnect has at least four inputs to receive at least four corresponding data elements of at least one source vector. The fully-connected interconnect has at least four outputs. Each of the at least four inputs is coupled with each of the at least four outputs. The execution unit also includes a permute instruction execution logic coupled with the at least four outputs and operable to store a first vector result in response to the permute instruction. The execution unit also includes a vector conflict instruction execution logic coupled with the at least four outputs and operable to store a second vector result in a destination storage location in response to the vector conflict instruction.

9.

Patent application
VECTOR FRIENDLY INSTRUCTION FORMAT AND EXECUTION THEREOF (under examination, published)

Publication No.: US20130305020A1

Publication date: 2013-11-14

Application No.: US13976707

Filing date: 2011-09-30

Applicants: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C. Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount

Inventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C. Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount

IPC classification: G06F9/30

CPC classification: G06F9/30145, G06F9/3001, G06F9/30014, G06F9/30018, G06F9/30025, G06F9/30032, G06F9/30036, G06F9/30047, G06F9/30149, G06F9/30181, G06F9/30185, G06F9/30192, G06F9/34

Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.

10.

Patent application
SYSTEMS, APPARATUSES, AND METHODS FOR BLENDING TWO SOURCE OPERANDS INTO A SINGLE DESTINATION USING A WRITEMASK (under examination, published)

Publication No.: US20120254588A1

Publication date: 2012-10-04

Application No.: US13078864

Filing date: 2011-04-01

Applicants: Jesus Corbal San Adrian, Bret L. Toll, Robert C. Valentine, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Andrew Thomas Forsyth, Elmoustapha Ould-Ahmed-Vall, Dennis R. Bradford, Lisa K. Wu

Inventors: Jesus Corbal San Adrian, Bret L. Toll, Robert C. Valentine, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Andrew Thomas Forsyth, Elmoustapha Ould-Ahmed-Vall, Dennis R. Bradford, Lisa K. Wu

IPC classification: G06F9/30

CPC classification: G06F9/30192, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/30043

Abstract: Embodiments of systems, apparatuses, and methods for performing a blend instruction in a computer processor are described. In some embodiments, the execution of a blend instruction causes an element-by-element selection between the data elements of first and second source operands, using the corresponding bit positions of a writemask as the selector, and storage of each selected data element at the corresponding position in the destination.
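The per-element selection can be sketched in a few lines. Which source a set mask bit selects is an assumption here (a set bit picks the second source), and the name `blend` is illustrative.

```python
def blend(src1, src2, writemask):
    """Select each destination element from src1 or src2 using a writemask.

    Bit i of writemask chooses the source for element i:
    1 selects src2[i], 0 selects src1[i] (assumed polarity).
    """
    return [
        b if (writemask >> i) & 1 else a
        for i, (a, b) in enumerate(zip(src1, src2))
    ]


print(blend([1, 2, 3, 4], [10, 20, 30, 40], 0b0101))
```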
