Integrated circuits with machine learning extensions

    Publication No.: US12056461B2

    Publication date: 2024-08-06

    Application No.: US17484845

    Filing date: 2021-09-24

    Applicant: Intel Corporation

    IPC classification: G06F7/544 G06F7/485

    Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.
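
    The data flow in this abstract (multiply and sum at a first precision, optionally cast to a second precision, then combine with zero, an input, or another dot-product term) can be sketched in software. This is only an illustration under assumptions: half precision stands in for the "first precision" and single precision for the "second precision", which the abstract does not specify, and the names to_half, to_single, and dot_mixed_precision are hypothetical. The sketch does not model partial product generators, compressors, or carry-propagate adders.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_single(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot_mixed_precision(a, b, acc=0.0):
    """Dot product summed in half precision, then accumulated in single.

    Assumption: half/single are stand-ins for the first/second precision.
    """
    total = 0.0
    for x, y in zip(a, b):
        # Multiplier data path: product rounded to the first precision.
        product = to_half(to_half(x) * to_half(y))
        # Floating-point adder of the first precision.
        total = to_half(total + product)
    # Optional cast to the second precision, then the second-precision adder
    # combines the result with zero, a general-purpose input, or another term.
    return to_single(to_single(total) + to_single(acc))
```

    Passing acc=0.0 corresponds to "combine with zero"; passing the result of another call models chaining dot-product terms.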

    INTEGRATED CIRCUITS WITH MACHINE LEARNING EXTENSIONS

    Publication No.: US20230342111A1

    Publication date: 2023-10-26

    Application No.: US18216797

    Filing date: 2023-06-30

    Applicant: Intel Corporation

    IPC classification: G06F7/487 G06F7/544

    Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.

    Reduced floating-point precision arithmetic circuitry

    Publication No.: US10073676B2

    Publication date: 2018-09-11

    Application No.: US15272231

    Filing date: 2016-09-21

    Inventor: Martin Langhammer

    IPC classification: G06F7/487 G06F17/16

    Abstract: The present embodiments relate to performing reduced-precision floating-point arithmetic operations using specialized processing blocks with higher-precision floating-point arithmetic circuitry. A specialized processing block may receive four floating-point numbers that represent two single-precision floating-point numbers, each separated into an LSB portion and an MSB portion, or four half-precision floating-point numbers. A first partial product generator may generate a first partial product of first and second input signals, while a second partial product generator may generate a second partial product of third and fourth input signals. A compressor circuit may generate carry and sum vector signals based on the first and second partial products; and circuitry may anticipate rounding and normalization operations by generating in parallel based on the carry and sum vector signals at least two results when performing the single-precision floating-point operation and at least four results when performing the two half-precision floating-point operations.
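
    Two ideas from this abstract can be illustrated on integer significands: splitting an operand into MSB and LSB portions, and preparing more than one normalized result so that the correct one is selected once the product is known. The sketch below is not the patented circuit. It assumes a 24-bit single-precision significand split into two 12-bit portions, uses the textbook four-term decomposition (the abstract does not say how the portions are paired), truncates instead of rounding, and all function names are hypothetical.

```python
def split_significand(sig24):
    """Split a 24-bit significand into 12-bit MSB and LSB portions."""
    return sig24 >> 12, sig24 & 0xFFF

def multiply_by_parts(a, b):
    """Rebuild the full product from partial products of the portions."""
    a_hi, a_lo = split_significand(a)
    b_hi, b_lo = split_significand(b)
    high = (a_hi * b_hi) << 24
    middle = (a_hi * b_lo + a_lo * b_hi) << 12
    low = a_lo * b_lo
    return high + middle + low

def normalized_candidates(product48):
    """Compute both possible 24-bit results (truncated, no rounding)."""
    no_shift = (product48 >> 23) & 0xFFFFFF  # product in [1, 2)
    shifted = (product48 >> 24) & 0xFFFFFF   # product in [2, 4)
    return no_shift, shifted

def select(product48):
    """Pick a candidate; return (significand, exponent increment)."""
    no_shift, shifted = normalized_candidates(product48)
    overflow = (product48 >> 47) & 1  # set when the product is >= 2
    return (shifted, 1) if overflow else (no_shift, 0)
```

    For example, with 0xC00000 representing 1.5, select(multiply_by_parts(0xC00000, 0xC00000)) picks the shifted candidate because 1.5 × 1.5 = 2.25 needs a one-bit normalization.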

    LOW-POWER PROCESSOR WITH SUPPORT FOR MULTIPLE PRECISION MODES

    Publication No.: US20170322808A1

    Publication date: 2017-11-09

    Application No.: US15147642

    Filing date: 2016-05-05

    IPC classification: G06F9/30

    Abstract: Multiple data wordlengths may be supported by a processor through a single data path and/or a single set of registers. For example, the processor may support 16-bit wordlengths and 24-bit wordlengths through a single datapath. For supported data wordlengths that are less than the wordlength of the registers and datapath, the data may be left-aligned within the registers and datapath. The left alignment of data may allow saturation detection in the processor to be performed by examining the same saturation point regardless of the wordlength of the data being operated on. A special saturation mode of the processor may set the lower bits to zero when a configuration register or instruction-bit is set and saturation is detected.
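
    The left-alignment idea can be sketched with the abstract's own example of 16-bit data in a 24-bit datapath. Because the data sits in the top bits, overflow is checked against the same 24-bit limits for either wordlength. This is a minimal sketch assuming two's-complement signed values; the names left_align and saturating_add and the clear_low_bits flag (standing in for the configuration register or instruction bit) are hypothetical.

```python
REG_BITS = 24
REG_MAX = (1 << (REG_BITS - 1)) - 1   # 0x7FFFFF
REG_MIN = -(1 << (REG_BITS - 1))      # -0x800000

def left_align(value, wordlength):
    """Place a signed value of the given wordlength in the top register bits."""
    return value << (REG_BITS - wordlength)

def saturating_add(x, y, wordlength=REG_BITS, clear_low_bits=False):
    """Add two left-aligned values, saturating at the 24-bit limits.

    The saturation check is the same for every wordlength. With
    clear_low_bits set, the unused low bits of a saturated result are zeroed.
    """
    total = x + y
    if total > REG_MAX:
        result = REG_MAX
    elif total < REG_MIN:
        result = REG_MIN
    else:
        return total  # no saturation: result is already left-aligned
    if clear_low_bits:
        unused = REG_BITS - wordlength
        result = (result >> unused) << unused
    return result
```

    With 16-bit data, a positive overflow saturates to 0x7FFFFF by default and to 0x7FFF00 when clear_low_bits is set, which is the largest 16-bit value in left-aligned form.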

    SPLITABLE AND SCALABLE NORMALIZER FOR VECTOR DATA
    Document type: Patent application; legal status: In force

    Publication No.: US20160253152A1

    Publication date: 2016-09-01

    Application No.: US15151062

    Filing date: 2016-05-10

    IPC classification: G06F7/499

    Abstract: A tool for supporting vector operations in a scalar data path. The tool determines a location for a split in the scalar mode configuration to enable vector mode operations. The tool determines the number of coarse shift multiplexers in conflict at a bit position receiving data from both the left half and the right half of the vector mode configuration. The tool duplicates a coarse shift multiplexer in conflict at the bit position receiving data from both the left half and the right half of the vector mode configuration. The tool duplicates an intermediate data signal in conflict at a signal position receiving data from both the left half and the right half of the vector mode configuration. The tool receives a control signal to split the scalar mode configuration and shift the leading zeros across the left half and the right half of the vector mode configuration.
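
    The behavior that a splittable normalizer provides can be sketched at the functional level: in scalar mode the leading zeros of the whole word are shifted out, while in vector mode each half is normalized independently. The sketch below models only this behavior, not the multiplexer duplication the abstract describes. The 32-bit width, the two 16-bit halves, and the names normalize and normalize_split are assumptions for illustration.

```python
def normalize(value, width):
    """Shift left until the MSB is set; return (shifted value, shift amount).

    A zero input is returned unchanged with a shift amount equal to width
    (a convention chosen for this sketch).
    """
    if value == 0:
        return 0, width
    shift = width - value.bit_length()
    return (value << shift) & ((1 << width) - 1), shift

def normalize_split(word, width=32, vector_mode=False):
    """Normalize the whole word, or each half separately in vector mode."""
    if not vector_mode:
        return [normalize(word, width)]
    half = width // 2
    left = word >> half
    right = word & ((1 << half) - 1)
    return [normalize(left, half), normalize(right, half)]
```

    For the word 0x00F00003, scalar mode yields one result with a shift of 8, while vector mode yields two results with shifts of 8 and 14, since the right half has more leading zeros of its own.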


    Dynamic range adjusting floating point execution unit
    Document type: Granted patent; legal status: In force

    Publication No.: US09223753B2

    Publication date: 2015-12-29

    Application No.: US13793240

    Filing date: 2013-03-11

    Abstract: A floating point execution unit is capable of selectively repurposing a subset of the significand bits in a floating point value for use as additional exponent bits to dynamically provide an extended range for floating point calculations. A significand field of a floating point operand may be considered to include first and second portions, with the first portion capable of being concatenated with the second portion to represent the significand for a floating point value, or, to provide an extended range, being concatenated with the exponent field of the floating point operand to represent the exponent for a floating point value.
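
    The field reinterpretation described here can be sketched as bit manipulation on a packed word. The layout is an assumption: a 32-bit word with 1 sign bit, 8 exponent bits, and 23 significand bits, where the "first portion" is taken to be the top bits of the significand field and is appended below the exponent field in extended-range mode. The abstract fixes neither the widths nor the concatenation order, and the name decode_fields is hypothetical. The sketch returns raw fields only and does not apply a bias or compute a value.

```python
def decode_fields(word, extra_exp_bits=0, exp_bits=8, sig_bits=23):
    """Return (sign, exponent field, significand field) of a packed word.

    With extra_exp_bits > 0, that many top significand bits are treated as
    additional low-order exponent bits (an assumed ordering).
    """
    sign = (word >> (exp_bits + sig_bits)) & 1
    exponent = (word >> sig_bits) & ((1 << exp_bits) - 1)
    significand = word & ((1 << sig_bits) - 1)
    if extra_exp_bits:
        remaining = sig_bits - extra_exp_bits
        first_portion = significand >> remaining
        second_portion = significand & ((1 << remaining) - 1)
        # Extended range: exponent field concatenated with the first portion.
        exponent = (exponent << extra_exp_bits) | first_portion
        significand = second_portion
    return sign, exponent, significand
```

    With extra_exp_bits=3, the same word carries an 11-bit exponent field and a 20-bit significand field, trading precision for range.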
