摘要:
Apparatus for performing mixed precision calculations in the floating point unit of a microprocessor from a single instruction opcode. 80-bit floating-point registers (44) may be specified as the source or destination address of a floating-point instruction. When the address range of the destination indicates (26) that a floating point register is addressed, the result of that operation is not rounded to the precision specified by the instruction, but is rounded (58) to extended 80-bit precision and loaded into the floating point register (FP-44). When the address range of the source indicates (26) that an FP register is addressed, the data is loaded from the FP register in extended precision, regardless of the precision specified by the instruction. In this way, real and long-real operations can be made to use extended precision numbers without explicitly specifying that in the opcode.
摘要:
Systems and methods for training a machine learning model. The methods comprise receiving a plurality of first data points, each data point of the first data points being represented in a floating-point representation. The methods further comprise converting the plurality of first data points into a corresponding plurality of second data points. Each of the second data points is represented in a dynamic fixed-point representation. The plurality of second data points may include: for each second data point, the sign component of the corresponding first data point, for each second data point, a dynamic fixed-point mantissa component, and one or more shared fraction components. At least two of the second data points share a value of a shared fraction component of the one or more shared fraction components. The methods further comprise performing integer computations during training of the machine learning model using the second data points.
摘要:
A log circuit for piecewise linear approximation is disclosed. The log circuit identifies an input associated with a logarithm operation to be performed using piecewise linear approximation. The log circuit then identifies a range that the input falls within from various ranges associated with piecewise linear approximation (PLA) equations for the logarithm operation, where the identified range corresponds to one of the PLA equations. The log circuit computes a result of the corresponding PLA equation based on the respective operands of the equation. The log circuit then returns an output associated with the logarithm operation, which is based at least partially on the result of the PLA equation.
摘要:
A round-for-reround mode (preferably in a BID encoded Decimal format) of a floating point instruction prepares a result for later rounding to a variable number of digits by detecting that the least significant digit may be a 0, and if so changing it to 1 when the trailing digits are not all 0. A subsequent reround instruction is then able to round the result to any number of digits at least 2 fewer than the number of digits of the result. An optional embodiment saves a tag indicating the fact that the low order digit of the result is 0 or 5 if the trailing bits are non-zero in a tag field rather than modify the result. Another optional embodiment also saves a half-way-and-above indicator when the trailing digits represent a decimal with a most significant digit having a value of 5. An optional subsequent reround instruction is able to round the result to any number of digits fewer or equal to the number of digits of the result using the saved tags.
摘要:
A microprocessor comprises an instruction pipeline, a shared memory, and first and second arithmetic processing units in the instruction pipeline, each capable of reading or receiving operands from and writing or providing results to the shared memory. The first arithmetic processing unit performs a first portion of a mathematical operation to produce an intermediate result vector that is not a complete, final result of the mathematical operation. The first arithmetic processing unit generates a plurality of non-architectural calculation control indicators that indicate how subsequent calculations to generate a final result from the intermediate result vector should proceed. The second arithmetic processing unit performs a second portion of the mathematical operation, in accordance with the calculation control indicators, to produce a complete, final result of the mathematical operation.
摘要:
An instruction to perform a comparison of a first value and a second value is executed. Based on a control of the instruction, a compare function to be performed is determined. The compare function is one of a plurality of compare functions configured for the instruction, and the compare function has a plurality of options for comparison. A compare option based on the first value and the second value is selected from the plurality of options defined for the compare function, and used to compare the first value and the second value. A result of the comparison is then placed in a select location, the result to be used in processing within a computing environment.
摘要:
A microprocessor splits a fused multiply-accumulate operation of the form A*B+C into first and second multiply-accumulate sub-operations to be performed by a multiplier and an adder. The first sub-operation at least multiplies A and B, and conditionally also accumulates C to the partial products of A and B to generate an unrounded nonredundant sum. The unrounded nonredundant sum is stored in memory shared by the multiplier and adder for an indefinite time period, enabling the multiplier and adder to perform other operations unrelated to the multiply-accumulate operation. The second sub-operation conditionally accumulates C to the unrounded nonredundant sum if C is not already incorporated into the value, and then generates a final rounded result.
摘要:
In accordance with some embodiments, a floating point number datapath circuitry, e.g., within an integrated circuit programmable logic device is provided. The datapath circuitry may be used for computing a rounded absolute value of a mantissa of a floating point number. The floating point datapath circuitry may have only a single adder stage for computing a rounded absolute value of a mantissa of the floating point number based on one or more bits of an unrounded mantissa of the floating point number. The unrounded and rounded mantissas may include a sign bit, a sticky bit, a round bit, and/or a least significant bit, and/or other bits. The unrounded mantissa may be in a format that includes negative numbers (e.g., 2's complement) and the rounded mantissa may be in a format that may include a portion of the floating point number represented as a positive number, (e.g., signed magnitude).
摘要:
An arithmetic operation is performed using a first instruction execution unit to generate an intermediate result vector and a plurality of calculation control indicators that indicate how subsequent calculations to generate a final result from the intermediate result vector should proceed. The intermediate result vector and the plurality of calculation control indicators are stored in memory external to the instruction execution unit, and later read by a second instruction execution unit to complete the arithmetic operation.
摘要:
A microprocessor comprises an instruction pipeline, a shared memory, and first and second arithmetic processing units in the instruction pipeline, each capable of reading or receiving operands from and writing or providing results to the shared memory. The first arithmetic processing unit performs a first portion of a mathematical operation to produce an intermediate result vector that is not a complete, final result of the mathematical operation. The first arithmetic processing unit generates a plurality of non-architectural calculation control indicators that indicate how subsequent calculations to generate a final result from the intermediate result vector should proceed. The second arithmetic processing unit performs a second portion of the mathematical operation, in accordance with the calculation control indicators, to produce a complete, final result of the mathematical operation.