-
Publication No.: US12056461B2
Publication date: 2024-08-06
Application No.: US17484845
Filing date: 2021-09-24
Applicant: Intel Corporation
Inventors: Martin Langhammer, Dongdong Chen
CPC classification: G06F7/5443, G06F7/485, G06F2207/382, G06F2207/3824
Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.
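The numerical flow described in this abstract (products and their sum in a first precision, an optional cast to a second, higher precision, then a final add) can be sketched in software. The sketch below is only an illustration under stated assumptions: it takes the first precision to be IEEE 754 half precision and the second to be single precision, which the abstract does not specify, and it models rounding behavior only, not the partial product generators, compressors, or carry-propagate adders. The names `to_fp16`, `to_fp32`, and `block_dot` are hypothetical.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def block_dot(a, b, adder_input=0.0):
    """Dot product with low-precision multiply/sum and a higher-precision final add.

    Assumption: first precision = fp16, second precision = fp32.
    adder_input stands in for zero, a general-purpose input, or another
    dot product term fed to the second-precision adder.
    """
    low_sum = 0.0
    for x, y in zip(a, b):
        # Multiplier data path: each product is rounded to the first precision.
        product = to_fp16(to_fp16(x) * to_fp16(y))
        # Floating-point adder of the first precision.
        low_sum = to_fp16(low_sum + product)
    # Cast to the second precision, then add in the second precision.
    return to_fp32(to_fp32(low_sum) + to_fp32(adder_input))


if __name__ == "__main__":
    print(block_dot([1.0, 2.0], [3.0, 4.0]))
    print(block_dot([1.0, 2.0], [3.0, 4.0], adder_input=0.5))
```

Passing `adder_input=0.0` corresponds to combining the result with zero; chaining calls by feeding one result into the next call's `adder_input` loosely corresponds to combining with other dot product terms.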
-
Publication No.: US12033237B2
Publication date: 2024-07-09
Application No.: US18306033
Filing date: 2023-04-24
Applicant: Intel Corporation
IPC classification: G06T1/20, G06F5/01, G06F7/501, G06F7/523, G06F7/544, G06F17/15, G06F17/16, G06N3/044, G06N3/045, G06N3/063, G06N3/084
CPC classification: G06T1/20, G06F5/01, G06F7/501, G06F7/523, G06F7/5443, G06F17/153, G06F17/16, G06N3/044, G06N3/045, G06N3/063, G06N3/084, G06F2207/382, G06F2207/4824
Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to convert elements of a floating-point tensor to convert the floating-point tensor into a fixed-point tensor.
-
Publication No.: US20230342111A1
Publication date: 2023-10-26
Application No.: US18216797
Filing date: 2023-06-30
Applicant: Intel Corporation
Inventors: Martin Langhammer, Dongdong Chen, Kevin Hurd
CPC classification: G06F7/4876, G06F7/5443, G06F2207/3824, G06F2207/382
Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.
-
Publication No.: US11669933B2
Publication date: 2023-06-06
Application No.: US17730364
Filing date: 2022-04-27
Applicant: Intel Corporation
IPC classification: G06T1/20, G06F5/01, G06F7/501, G06F7/523, G06F7/544, G06F17/15, G06F17/16, G06N3/063, G06N3/084, G06N3/044, G06N3/045
CPC classification: G06T1/20, G06F5/01, G06F7/501, G06F7/523, G06F7/5443, G06F17/153, G06F17/16, G06N3/044, G06N3/045, G06N3/063, G06N3/084, G06F2207/382, G06F2207/4824
Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to quantize elements of a floating-point tensor to convert the floating-point tensor into a dynamic fixed-point tensor.
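To make "quantize elements of a floating-point tensor into a dynamic fixed-point tensor" concrete, here is a minimal software sketch of one common interpretation: all elements share a single fractional bit count chosen from the largest magnitude in the tensor. This is an assumption for illustration, not the scheme claimed in the patent; the function names, the 8-bit default width, and the per-tensor choice of scale are all hypothetical.

```python
import math


def quantize_dynamic_fixed_point(values, bits=8):
    """Quantize floats to signed `bits`-wide integers with a shared scale.

    Returns (ints, frac_bits), where each value is approximated by
    int * 2**-frac_bits. frac_bits is chosen per tensor (the "dynamic" part,
    as assumed here) so the largest magnitude just fits.
    """
    if not values:
        return [], 0
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0] * len(values), 0
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    # Integer bits needed to hold the largest magnitude (may be <= 0).
    int_bits = math.floor(math.log2(max_abs)) + 1
    frac_bits = bits - 1 - int_bits
    scale = 2.0 ** frac_bits
    # round() uses round-half-to-even; results are clamped to the signed range.
    ints = [min(qmax, max(qmin, round(v * scale))) for v in values]
    return ints, frac_bits


def dequantize(ints, frac_bits):
    """Map fixed-point integers back to floats."""
    return [i * 2.0 ** -frac_bits for i in ints]


if __name__ == "__main__":
    q, f = quantize_dynamic_fixed_point([0.5, -0.25, 0.75])
    print(q, f, dequantize(q, f))
```

A tensor with larger values would receive fewer fractional bits under this sketch, trading resolution for range.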
-
Publication No.: US20190205094A1
Publication date: 2019-07-04
Application No.: US15857998
Filing date: 2017-12-29
Applicant: Facebook, Inc.
IPC classification: G06F7/523
CPC classification: G06F7/523, G06F7/5443, G06F2207/382, G06N3/0481, G06N3/063, G06N3/08, G06N5/022
Abstract: The disclosed method may include (1) receiving a precision level of each weight associated with each input of a node of a computational model, (2) identifying, for each weight, one of a plurality of multiplier groups, where each multiplier group may include a plurality of hardware multipliers of a corresponding bit width, and where the corresponding bit width of the plurality of hardware multipliers of the one of the plurality of multiplier groups may be sufficient to multiply the weight by the associated input, and (3) multiplying each weight by its associated input using an available hardware multiplier of the one of the plurality of multiplier groups identified for the weight. Various other processing elements, methods, and systems are also disclosed.
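The three steps in this abstract can be sketched as a small dispatch routine. The sketch is a simplification under stated assumptions: it selects a group from the weight's precision alone (the abstract requires the width to be sufficient for multiplying the weight by its input), it picks the narrowest sufficient group, and it does not model multiplier availability. The group widths `(4, 8, 16)` and the names `pick_group` and `multiply_node` are hypothetical.

```python
def pick_group(weight_bits, group_widths):
    """Return the narrowest multiplier-group width that can hold the weight."""
    for width in sorted(group_widths):
        if width >= weight_bits:
            return width
    raise ValueError(f"no multiplier group is wide enough for {weight_bits} bits")


def multiply_node(weights, weight_bits, inputs, group_widths=(4, 8, 16)):
    """Multiply each weight by its input on a group chosen by weight precision.

    Returns a list of (group_width, product) pairs, one per weight.
    """
    results = []
    for w, bits, x in zip(weights, weight_bits, inputs):
        width = pick_group(bits, group_widths)  # step (2): identify a group
        results.append((width, w * x))          # step (3): multiply
    return results


if __name__ == "__main__":
    # Step (1): precision levels arrive alongside the weights.
    print(multiply_node([3, 100, 1000], [3, 8, 12], [2, 2, 2]))
```

Routing low-precision weights to narrow multipliers leaves the wide multipliers free for the weights that need them, which appears to be the motivation for grouping by bit width.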
-
Publication No.: US10073676B2
Publication date: 2018-09-11
Application No.: US15272231
Filing date: 2016-09-21
Applicant: Altera Corporation
Inventor: Martin Langhammer
CPC classification: G06F7/485, G06F7/4876, G06F7/49957, G06F7/5443, G06F9/30014, G06F17/16, G06F2207/382, G06F2207/483
Abstract: The present embodiments relate to performing reduced-precision floating-point arithmetic operations using specialized processing blocks with higher-precision floating-point arithmetic circuitry. A specialized processing block may receive four floating-point numbers that represent two single-precision floating-point numbers, each separated into an LSB portion and an MSB portion, or four half-precision floating-point numbers. A first partial product generator may generate a first partial product of first and second input signals, while a second partial product generator may generate a second partial product of third and fourth input signals. A compressor circuit may generate carry and sum vector signals based on the first and second partial products; and circuitry may anticipate rounding and normalization operations by generating in parallel based on the carry and sum vector signals at least two results when performing the single-precision floating-point operation and at least four results when performing the two half-precision floating-point operations.
-
Publication No.: US20170322808A1
Publication date: 2017-11-09
Application No.: US15147642
Filing date: 2016-05-05
IPC classification: G06F9/30
CPC classification: G06F9/30112, G06F7/38, G06F9/3001, G06F9/30014, G06F9/30189, G06F2207/382
Abstract: Multiple data wordlengths may be supported by a processor through a single data path and/or a single set of registers. For example, the processor may support 16-bit wordlengths and 24-bit wordlengths through a single datapath. For supported data wordlengths that are less than the wordlength of the registers and datapath, the data may be left-aligned within the registers and datapath. The left alignment of data may allow saturation detection in the processor to be performed by examining the same saturation point regardless of the wordlength of the data being operated on. A special saturation mode of the processor may set the lower bits to zero when a configuration register or instruction-bit is set and saturation is detected.
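The left-alignment idea can be illustrated with a short sketch. It assumes a 24-bit signed datapath carrying 16-bit or 24-bit data, as in the abstract's example; the function names and the `clear_low_bits` flag (standing in for the configuration register or instruction bit) are hypothetical. Because shorter data is shifted to the top of the register, overflow is always checked against the same 24-bit limits, and the special mode zeroes the padding bits of a saturated result.

```python
REG_BITS = 24  # width of the registers and datapath (from the abstract's example)


def left_align(value, wordlength):
    """Place a signed `wordlength`-bit value in the top bits of the register."""
    return value << (REG_BITS - wordlength)


def saturating_add(a, b, wordlength, clear_low_bits=True):
    """Add two left-aligned values, saturating at the 24-bit limits.

    The saturation check is identical for every wordlength. When
    clear_low_bits is set and saturation occurs, the bits below the
    data's wordlength are forced to zero.
    """
    hi = (1 << (REG_BITS - 1)) - 1
    lo = -(1 << (REG_BITS - 1))
    total = a + b
    if lo <= total <= hi:
        return total
    saturated = hi if total > hi else lo
    if clear_low_bits:
        pad = REG_BITS - wordlength
        saturated &= ~((1 << pad) - 1)  # zero the padding bits
    return saturated


if __name__ == "__main__":
    x = left_align(30000, 16)
    y = left_align(10000, 16)
    result = saturating_add(x, y, wordlength=16)
    print(hex(result), result >> 8)
```

With the low bits cleared, shifting the saturated result back down yields exactly the 16-bit maximum or minimum, which is why the special mode is useful for the shorter wordlength.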
-
Publication No.: US09753690B2
Publication date: 2017-09-05
Application No.: US15151062
Filing date: 2016-05-10
CPC classification: G06F7/49936, G06F7/00, G06F7/483, G06F7/49915, G06F9/30014, G06F9/30036, G06F9/30189, G06F9/3887, G06F15/78, G06F15/8053, G06F2207/382
Abstract: A tool for supporting vector operations in a scalar data path. The tool determines a location for a split in the scalar mode configuration to enable vector mode operations. The tool determines the number of coarse shift multiplexers in conflict at a bit position receiving data from both the left half and the right half of the vector mode configuration. The tool duplicates a coarse shift multiplexer in conflict at the bit position receiving data from both the left half and the right half of the vector mode configuration. The tool duplicates an intermediate data signal in conflict at a signal position receiving data from both the left half and the right half of the vector mode configuration. The tool receives a control signal to split the scalar mode configuration and shift the leading zeros across the left half and the right half of the vector mode configuration.
-
Publication No.: US20160253152A1
Publication date: 2016-09-01
Application No.: US15151062
Filing date: 2016-05-10
IPC classification: G06F7/499
CPC classification: G06F7/49936, G06F7/00, G06F7/483, G06F7/49915, G06F9/30014, G06F9/30036, G06F9/30189, G06F9/3887, G06F15/78, G06F15/8053, G06F2207/382
Abstract: A tool for supporting vector operations in a scalar data path. The tool determines a location for a split in the scalar mode configuration to enable vector mode operations. The tool determines the number of coarse shift multiplexers in conflict at a bit position receiving data from both the left half and the right half of the vector mode configuration. The tool duplicates a coarse shift multiplexer in conflict at the bit position receiving data from both the left half and the right half of the vector mode configuration. The tool duplicates an intermediate data signal in conflict at a signal position receiving data from both the left half and the right half of the vector mode configuration. The tool receives a control signal to split the scalar mode configuration and shift the leading zeros across the left half and the right half of the vector mode configuration.
-
Publication No.: US09223753B2
Publication date: 2015-12-29
Application No.: US13793240
Filing date: 2013-03-11
CPC classification: G06F17/10, G06F7/483, G06F7/5443, G06F9/30014, G06F9/30189, G06F9/3861, G06F2207/382, G06F2207/3828
Abstract: A floating point execution unit is capable of selectively repurposing a subset of the significand bits in a floating point value for use as additional exponent bits to dynamically provide an extended range for floating point calculations. A significand field of a floating point operand may be considered to include first and second portions, with the first portion capable of being concatenated with the second portion to represent the significand for a floating point value, or, to provide an extended range, being concatenated with the exponent field of the floating point operand to represent the exponent for a floating point value.
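The two interpretations of the significand field in the floating point execution unit abstract above can be sketched as a decoder. Every layout detail here is an assumption made for illustration: a 32-bit word with an 8-bit exponent field and a 23-bit significand field, a first portion consisting of the 4 most significant significand bits, those bits appended as the low-order bits of the widened exponent, and a bias of 2^(w-1) - 1 for a w-bit exponent. The abstract fixes none of these. Only normal numbers are handled (no zero, subnormal, infinity, or NaN encodings), and very large extended exponents would exceed the range of a Python float.

```python
def decode(bits, extended=False, ext_bits=4, exp_bits=8, sig_bits=23):
    """Decode a toy floating-point word in normal or extended-range mode.

    Normal mode:   exponent = exponent field,
                   significand = first portion || second portion.
    Extended mode: exponent = exponent field || first portion,
                   significand = second portion only.
    All field widths are hypothetical.
    """
    sign = -1.0 if (bits >> (exp_bits + sig_bits)) & 1 else 1.0
    exp_field = (bits >> sig_bits) & ((1 << exp_bits) - 1)
    sig_field = bits & ((1 << sig_bits) - 1)

    if extended:
        second_bits = sig_bits - ext_bits
        first = sig_field >> second_bits              # repurposed as exponent bits
        second = sig_field & ((1 << second_bits) - 1)
        exponent = (exp_field << ext_bits) | first
        exp_width = exp_bits + ext_bits
        fraction = second / (1 << second_bits)
    else:
        exponent = exp_field
        exp_width = exp_bits
        fraction = sig_field / (1 << sig_bits)

    bias = (1 << (exp_width - 1)) - 1
    return sign * (1.0 + fraction) * 2.0 ** (exponent - bias)


if __name__ == "__main__":
    word = 0x3F800000
    print(decode(word), decode(word, extended=True))
```

The same bit pattern decodes to different values in the two modes: the extended mode gains exponent range at the cost of `ext_bits` bits of significand precision.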