Patent search ap:"Dennis R Bradford" Page 1

1.

Invention application
VECTOR FRIENDLY INSTRUCTION FORMAT AND EXECUTION THEREOF (status: under examination, published)

Publication number: US20140149724A1

Publication date: 2014-05-29

Application number: US14170397

Filing date: 2014-01-31

Applicants: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C. Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount

Inventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C. Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount

IPC classification: G06F9/30

CPC classification: G06F9/30181, G06F9/3001, G06F9/30014, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/30047, G06F9/30145, G06F9/30149, G06F9/30185, G06F9/30192, G06F9/34

Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.

2.

Invention application
PROCESSING MEMORY ACCESS INSTRUCTIONS THAT HAVE DUPLICATE MEMORY INDICES (status: in force)

Publication number: US20140095779A1

Publication date: 2014-04-03

Application number: US13631378

Filing date: 2012-09-28

Applicants: Andrew T. Forsyth, Dennis R. Bradford, Jonathan C. Hall

Inventors: Andrew T. Forsyth, Dennis R. Bradford, Jonathan C. Hall

IPC classification: G06F12/00, G06F12/02

CPC classification: G06F12/00, G06F3/06, G06F3/0608, G06F3/0641, G06F9/30, G06F9/30018, G06F9/30036, G06F9/30043, G06F11/1453, G06F12/0246

Abstract: A method of an aspect includes receiving an instruction indicating a first source packed memory indices, a second source packed data operation mask, and a destination storage location. Memory indices of the packed memory indices are compared with one another. One or more sets of duplicate memory indices are identified. Data corresponding to each set of duplicate memory indices is loaded only once. The loaded data corresponding to each set of duplicate memory indices is replicated for each of the duplicate memory indices in the set. A packed data result is stored in the destination storage location in response to the instruction. The packed data result includes data elements from memory locations that are indicated by corresponding memory indices of the packed memory indices when not blocked by corresponding elements of the packed data operation mask.
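
Illustrative sketch (not taken from the patent): the abstract describes a masked gather in which each set of duplicate indices is loaded only once and the loaded value is replicated to every lane that shares the index. The Python model below shows that idea in software. The function name `gather_dedup`, the list-based "memory", and the choice to leave masked-off lanes at their previous destination value are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def gather_dedup(
    memory: Sequence[int],
    indices: Sequence[int],
    mask: Sequence[bool],
    dest: Sequence[int],
) -> Tuple[List[int], int]:
    """Masked gather that loads each distinct index once.

    Returns the packed result and the number of memory loads performed.
    """
    loaded: Dict[int, int] = {}  # index -> value, filled on the first load
    loads = 0
    result = list(dest)
    for lane, (idx, enabled) in enumerate(zip(indices, mask)):
        if not enabled:
            # Assumption: a masked-off lane keeps its old destination value.
            continue
        if idx not in loaded:
            loaded[idx] = memory[idx]  # the only real memory access
            loads += 1
        result[lane] = loaded[idx]  # replicate the value for duplicate indices
    return result, loads


memory = [10, 20, 30, 40]
result, loads = gather_dedup(
    memory,
    indices=[2, 0, 2, 2],
    mask=[True, True, False, True],
    dest=[0, 0, 0, 0],
)
print(result, loads)
```

In this example three enabled lanes request two distinct indices, so only two loads are issued.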

3.

Invention application
Method and apparatus to implement cache-coherent network interfaces (status: under examination, published)

Publication number: US20080052463A1

Publication date: 2008-02-28

Application number: US11510021

Filing date: 2006-08-25

Applicants: Nagabhushan Chitlur, Linda J. Rankin, Paul M. Stillwell, Dennis R. Bradford

Inventors: Nagabhushan Chitlur, Linda J. Rankin, Paul M. Stillwell, Dennis R. Bradford

IPC classification: G06F12/00

CPC classification: G06F12/0813, G06F12/0831

Abstract: A cache-coherent network interface includes registers or buffers addressable by a processor with reference to an address space of the processor. The processor and the cache-coherent network interface both share a common system bus. The registers or buffers are further cacheable into a cache of the processor with reference to the address space.

4.

Invention application
INSTRUCTIONS FOR MERGING MASK PATTERNS (status: under examination, published)

Publication number: US20160041827A1

Publication date: 2016-02-11

Application number: US13995944

Filing date: 2011-12-23

Applicants: Jesus Corbal, Matthew J Craighead, Dennis R Bradford, Jonathan C. Hall, Andrew T. Forsyth

Inventors: Jesus Corbal, Matthew J Craighead, Dennis R Bradford, Jonathan C. Hall, Andrew T. Forsyth

IPC classification: G06F9/30

CPC classification: G06F9/30018, G06F9/30032, G06F9/30036

Abstract: A method is described that includes fetching an instruction and decoding the instruction. The method further includes fetching a first mask vector from a first mask register space location identified by the instruction. The method further includes fetching a second mask vector from a second mask register space location identified by the instruction. The method also includes executing the instruction by merging the first and second mask vectors into a single data structure and causing the single data structure to be written into a memory location identified by the instruction.
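
Illustrative sketch (not taken from the patent): the abstract does not say how the two mask vectors are merged. The Python model below assumes the simplest possibility, concatenating two masks of equal width into one wider bit pattern, with the first mask in the low bits. The function name `merge_masks`, the default width of 8 lanes, and the bit ordering are all hypothetical.

```python
def merge_masks(first: int, second: int, width: int = 8) -> int:
    """Concatenate two `width`-bit masks into one 2*width-bit value.

    Assumption: `first` occupies the low bits and `second` the high bits.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    lane_bits = (1 << width) - 1  # keep only the valid mask bits
    return ((second & lane_bits) << width) | (first & lane_bits)


merged = merge_masks(0b10110001, 0b00001111)
print(f"{merged:016b}")
```

A real implementation would then write the merged value to the memory location named by the instruction; that step is omitted here.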

5.

Invention application
Reducing Power Consumption In A Fused Multiply-Add (FMA) Unit Responsive To Input Data Values (status: in force)

Publication number: US20140122554A1

Publication date: 2014-05-01

Application number: US13664689

Filing date: 2012-10-31

Applicants: Brian J. Hickmann, Dennis R. Bradford, Thomas D. Fletcher

Inventors: Brian J. Hickmann, Dennis R. Bradford, Thomas D. Fletcher

IPC classification: G06F7/60

CPC classification: G06F7/60, G06F1/324, G06F1/3243, G06F7/48, G06F7/483, G06F7/5443, G06F7/57, G06F9/30, G06F9/3001, G06F2207/3884, Y02D10/126, Y02D10/152

Abstract: In an embodiment, a fused multiply-add (FMA) circuit is configured to receive a plurality of input data values to perform an FMA instruction on the input data values. The circuit includes a multiplier unit and an adder unit coupled to an output of the multiplier unit, and a control logic to receive the input data values and to reduce switching activity and thus reduce power consumption of one or more components of the circuit based on a value of one or more of the input data values. Other embodiments are described and claimed.
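
Illustrative sketch (not taken from the patent): the abstract says control logic reduces switching activity based on input values but does not list the conditions. The Python model below assumes one plausible case, bypassing the multiplier when a multiplicand is zero and bypassing the adder when the addend is zero. The names `FmaResult` and `fma_gated` are hypothetical. The model assumes finite inputs, ignores signed zeros, and rounds twice (`a * b + c`), so it is not a bit-accurate fused operation.

```python
from dataclasses import dataclass


@dataclass
class FmaResult:
    value: float
    multiplier_active: bool  # False means the multiplier could be gated
    adder_active: bool       # False means the adder could be gated


def fma_gated(a: float, b: float, c: float) -> FmaResult:
    """Compute a*b + c, reporting which units a gating scheme could idle."""
    if a == 0.0 or b == 0.0:
        # Product is zero for finite inputs: forward the addend unchanged.
        return FmaResult(c, multiplier_active=False, adder_active=False)
    if c == 0.0:
        # Nothing to add: the product passes straight through.
        return FmaResult(a * b, multiplier_active=True, adder_active=False)
    return FmaResult(a * b + c, multiplier_active=True, adder_active=True)


print(fma_gated(0.0, 5.0, 3.0))
print(fma_gated(2.0, 3.0, 1.0))
```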

6.

Invention application
SYSTEM, APPARATUS, AND METHOD FOR ALIGNING REGISTERS (status: under examination, published)

Publication number: US20120254589A1

Publication date: 2012-10-04

Application number: US13078868

Filing date: 2011-04-01

Applicants: Jesus Corbal San Adrian, Roger Espasa Sans, Milind Baburao Girkar, Lisa K. Wu, Dennis R. Bradford, Victor W. Lee

Inventors: Jesus Corbal San Adrian, Roger Espasa Sans, Milind Baburao Girkar, Lisa K. Wu, Dennis R. Bradford, Victor W. Lee

IPC classification: G06F9/30, G06F9/315

CPC classification: G06F9/30036, G06F9/30018, G06F9/30032, G06F9/30192

Abstract: Embodiments of systems, apparatuses, and methods for performing an align instruction in a computer processor are described. In some embodiments, the execution of an align instruction causes the selective storage of data elements of two concatenated sources to be stored in a destination.
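
Illustrative sketch (not taken from the patent): one common way to read "selective storage of data elements of two concatenated sources" is to concatenate the sources, skip a given number of elements, and keep one vector's worth of what follows. The Python model below shows that reading. The function name `align`, the element-count `shift` parameter, and the choice to place `src2` in the low positions are assumptions.

```python
from typing import List, Sequence


def align(src1: Sequence[int], src2: Sequence[int], shift: int) -> List[int]:
    """Concatenate src1:src2, drop `shift` low elements, keep len(src1) elements.

    Assumption: src2 supplies the low elements and src1 the high elements.
    """
    n = len(src1)
    if len(src2) != n:
        raise ValueError("sources must have the same number of elements")
    if not 0 <= shift <= n:
        raise ValueError("shift must be between 0 and the vector length")
    concatenated = list(src2) + list(src1)
    return concatenated[shift:shift + n]


print(align([4, 5, 6, 7], [0, 1, 2, 3], shift=1))
```

With `shift=0` the result is `src2` unchanged, and with `shift` equal to the vector length it is `src1`.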

7.

Invention application
PROCESSORS HAVING FULLY-CONNECTED INTERCONNECTS SHARED BY VECTOR CONFLICT INSTRUCTIONS AND PERMUTE INSTRUCTIONS (status: under examination, published)

Publication number: US20140181466A1

Publication date: 2014-06-26

Application number: US13977126

Filing date: 2011-12-29

Applicants: Andrew Thomas Forsyth, Dennis R. Bradford

Inventors: Andrew Thomas Forsyth, Dennis R. Bradford

IPC classification: G06F9/30, G06F9/38

Abstract: An apparatus includes a decode unit to decode a permute instruction and a vector conflict instruction. A vector execution unit is coupled with the decode unit and includes a fully-connected interconnect. The fully-connected interconnect has at least four inputs to receive at least four corresponding data elements of at least one source vector. The fully-connected interconnect has at least four outputs. Each of the at least four inputs is coupled with each of the at least four outputs. The execution unit also includes a permute instruction execution logic coupled with the at least four outputs and operable to store a first vector result in response to the permute instruction. The execution unit also includes a vector conflict instruction execution logic coupled with the at least four outputs and operable to store a second vector result in a destination storage location in response to the vector conflict instruction.
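
Illustrative sketch (not taken from the patent): both operations need every output lane to see every input element, which is why they can share one all-to-all interconnect. The Python model below shows that sharing. All function names are hypothetical, and the conflict semantics used here (each lane reports a bitmask of earlier lanes holding an equal value) are an assumption, since the abstract does not define the result.

```python
from typing import List, Sequence


def crossbar(source: Sequence[int]) -> List[List[int]]:
    """Model a fully-connected interconnect: each output sees all inputs."""
    return [list(source) for _ in source]


def permute(source: Sequence[int], control: Sequence[int]) -> List[int]:
    """Each output lane selects one routed input, chosen by `control`."""
    if len(control) != len(source):
        raise ValueError("control must have one selector per lane")
    routed = crossbar(source)
    return [routed[lane][sel] for lane, sel in enumerate(control)]


def conflict(source: Sequence[int]) -> List[int]:
    """Each output lane compares its value with all earlier routed inputs."""
    routed = crossbar(source)
    result = []
    for lane, value in enumerate(source):
        bits = 0
        for earlier in range(lane):
            if routed[lane][earlier] == value:
                bits |= 1 << earlier  # mark the earlier lane as a duplicate
        result.append(bits)
    return result


print(permute([10, 20, 30, 40], [3, 3, 0, 1]))
print(conflict([7, 3, 7, 7]))
```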

8.

Invention application
APPARATUS AND METHOD FOR EFFICIENT GATHER AND SCATTER OPERATIONS (status: in force)

Publication number: US20140095831A1

Publication date: 2014-04-03

Application number: US13631071

Filing date: 2012-09-28

Applicants: Edward T. Grochowski, Dennis R. Bradford, George Z. Chrysos, Andrew T. Forsyth, Michael D. Upton, Lisa K. Wu

Inventors: Edward T. Grochowski, Dennis R. Bradford, George Z. Chrysos, Andrew T. Forsyth, Michael D. Upton, Lisa K. Wu

IPC classification: G06F9/30

CPC classification: G06F9/30036, G06F9/30018, G06F9/30043, G06F9/30145, G06F9/345, G06F9/355

Abstract: An apparatus and method are described for performing efficient gather operations in a pipelined processor. For example, a processor according to one embodiment of the invention comprises: gather setup logic to execute one or more gather setup operations in anticipation of one or more gather operations, the gather setup operations to determine one or more addresses of vector data elements to be gathered by the gather operations; and gather logic to execute the one or more gather operations to gather the vector data elements using the one or more addresses determined by the gather setup operations.
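
Illustrative sketch (not taken from the patent): the abstract splits a gather into a setup phase that determines element addresses and a gather phase that uses them. The Python model below separates the two phases in the same way. The function names, the `base + index * scale` address formula, and the dictionary used as "memory" are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def gather_setup(base: int, indices: Sequence[int], scale: int) -> List[int]:
    """Setup phase: compute the address of every element ahead of time.

    Assumption: addresses follow the usual base + index * scale form.
    """
    return [base + index * scale for index in indices]


def gather(memory: Dict[int, int], addresses: Sequence[int]) -> List[int]:
    """Gather phase: fetch each element using the precomputed addresses."""
    return [memory[address] for address in addresses]


# Toy memory: eight 4-byte elements starting at address 100, holding i*i.
memory = {100 + 4 * i: i * i for i in range(8)}
addresses = gather_setup(base=100, indices=[3, 0, 5], scale=4)
print(addresses)
print(gather(memory, addresses))
```

Keeping address generation separate lets the setup work run before the gather itself is issued, which is the overlap the abstract points to.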

9.

Granted invention patent
Processors having fully-connected interconnects shared by vector conflict instructions and permute instructions (status: in force)

Publication number: US10678541B2

Publication date: 2020-06-09

Application number: US13977126

Filing date: 2011-12-29

Applicants: Andrew Thomas Forsyth, Dennis R. Bradford

Inventors: Andrew Thomas Forsyth, Dennis R. Bradford

IPC classification: G06F9/30, G06F9/38

Abstract: An apparatus includes a decode unit to decode a permute instruction and a vector conflict instruction. A vector execution unit is coupled with the decode unit and includes a fully-connected interconnect. The fully-connected interconnect has at least four inputs to receive at least four corresponding data elements of at least one source vector. The fully-connected interconnect has at least four outputs. Each of the at least four inputs is coupled with each of the at least four outputs. The execution unit also includes a permute instruction execution logic coupled with the at least four outputs and operable to store a first vector result in response to the permute instruction. The execution unit also includes a vector conflict instruction execution logic coupled with the at least four outputs and operable to store a second vector result in a destination storage location in response to the vector conflict instruction.

10.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR CHAINED FUSED MULTIPLY ADD (status: under examination, published)

Publication number: US20180113708A1

Publication date: 2018-04-26

Application number: US15299420

Filing date: 2016-10-20

Applicants: JESUS CORBAL, ROBERT VALENTINE, ROMAN S. DUBTSOV, NIKITA A. SHUSTROV, MARK J. CHARNEY, DENNIS R. BRADFORD, MILIND B. GIRKAR, EDWARD T. GROCHOWSKI, THOMAS D. FLETCHER, WARREN E. FERGUSON

Inventors: JESUS CORBAL, ROBERT VALENTINE, ROMAN S. DUBTSOV, NIKITA A. SHUSTROV, MARK J. CHARNEY, DENNIS R. BRADFORD, MILIND B. GIRKAR, EDWARD T. GROCHOWSKI, THOMAS D. FLETCHER, WARREN E. FERGUSON

IPC classification: G06F9/30, G06F7/544

CPC classification: G06F9/3001, G06F7/483, G06F7/5443, G06F9/30036, G06F9/30109, G06F9/30112, G06F9/3016

Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.
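
Illustrative sketch (not taken from the patent): the abstract describes a chain of multiply-accumulate iterations in which each packed source is multiplied by one sub-element of a scalar value and added to the running result. The Python model below reproduces only that accumulation pattern. The function name `chained_fma` and the plain-list representation are hypothetical, and the model does not cover the second-size source operands, operand encoding, or single-rounding behaviour.

```python
from typing import List, Sequence


def chained_fma(
    sources: Sequence[Sequence[float]],
    scalar_parts: Sequence[float],
    initial: Sequence[float],
) -> List[float]:
    """Accumulate sources[i] * scalar_parts[i] onto `initial`, one per iteration."""
    if len(sources) != len(scalar_parts):
        raise ValueError("need one scalar sub-element per packed source")
    acc = list(initial)  # the first iteration adds to the initial value
    for vector, part in zip(sources, scalar_parts):
        if len(vector) != len(acc):
            raise ValueError("all vectors must have the same length")
        # Later iterations add to the result of the previous iteration.
        acc = [a + x * part for a, x in zip(acc, vector)]
    return acc


print(chained_fma([[1, 2], [3, 4]], [10, 100], [0, 1]))
```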
