-
1.
Publication No.: US20240361797A1
Publication Date: 2024-10-31
Application No.: US18140107
Filing Date: 2023-04-27
Applicant: International Business Machines Corporation
Inventor: Geoffrey Burr , Milos Stanisavljevic , Yasuteru Kohda
CPC classification number: G06F1/03 , G06F7/5443 , G06N3/048
Abstract: Special-purpose digital-compute hardware for fully-programmable look-up-tables is provided. In one aspect, a system for implementing a continuous function by piecewise linear approximation includes: at least one memory programmatically loaded with an indexed table of slope/intercept values of linear segments along a gradient of the continuous function approximating a plurality of contiguous ranges of the continuous function; at least one Bin ID logic having data registers programmatically loaded with bin-threshold values corresponding to the plurality of contiguous ranges defining a series of arbitrarily-spaced bins; and a Fused-Multiply-Add circuit configured to multiply an incoming data element by a slope value and add an intercept value from the indexed table of slope/intercept values selected based on the bin-threshold values. Comparators in the Bin ID logic can be configured to compare the incoming data element with the bin-threshold values. A method for implementing a continuous function by piecewise linear approximation is also provided.
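The sketch below is a minimal software model of the scheme described in this abstract: comparators (here, a binary search over programmed bin thresholds) pick a bin, an indexed slope/intercept table is read, and a fused multiply-add produces the approximation. The tanh target and the particular bin thresholds are illustrative assumptions, not values from the patent.

import bisect
import math

# Illustrative, arbitrarily-spaced bin thresholds; in the described hardware
# these would be programmed into the Bin ID logic's data registers.
thresholds = [-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0]

def build_segments(f, thresholds):
    # One (slope, intercept) pair per contiguous range, i.e. the indexed table.
    segs = []
    for lo, hi in zip(thresholds[:-1], thresholds[1:]):
        slope = (f(hi) - f(lo)) / (hi - lo)
        segs.append((slope, f(lo) - slope * lo))
    return segs

def piecewise_eval(x, thresholds, segs):
    i = bisect.bisect_right(thresholds, x) - 1     # Bin ID: locate the bin
    i = max(0, min(i, len(segs) - 1))              # clamp out-of-range inputs
    slope, intercept = segs[i]                     # indexed table lookup
    return slope * x + intercept                   # fused multiply-add

segs = build_segments(math.tanh, thresholds)
print(piecewise_eval(0.5, thresholds, segs), math.tanh(0.5))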
-
2.
Publication No.: US20240281212A1
Publication Date: 2024-08-22
Application No.: US18588604
Filing Date: 2024-02-27
Applicant: Achronix Semiconductor Corporation
Inventor: Christopher C. LaFrieda , Virantha N. Ekanayake
CPC classification number: G06F7/575 , G06F1/03 , G06F7/5045 , H03K19/20 , H03K19/21
Abstract: A four-input lookup table (“LUT4”) is modified to operate in a first mode as an ordinary LUT4 and in a second mode as a 1-bit adder providing a sum output and a carry output. A six-input lookup table (“LUT6”) is modified to operate in a first mode as an ordinary LUT6 with a single output and in a second mode as a 2-bit adder providing a sum output and a carry output. Both possible results for the two different possible carry inputs can be determined and selected between when the carry input is available, implementing a 2-bit carry-select adder when in the second mode and retaining the ability to operate as an ordinary LUT6 in the first mode. Using the novel LUT6 design in a circuit chip fabric allows a 2-bit adder slice to be built that efficiently makes use of the LUT6 without requiring additional logic blocks.
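As a purely behavioural illustration of the carry-select idea in this abstract, the Python sketch below computes both candidate 2-bit results (for carry-in 0 and carry-in 1) up front and lets the actual carry-in select one of them; the LUT6 circuit details are not modeled.

def two_bit_carry_select(a, b, carry_in):
    # Add two 2-bit operands; precompute both outcomes, then late-select
    # on the carry-in, as a carry-select adder does.
    assert 0 <= a < 4 and 0 <= b < 4 and carry_in in (0, 1)
    candidates = {}
    for cin in (0, 1):
        total = a + b + cin
        candidates[cin] = (total & 0b11, total >> 2)   # (2-bit sum, carry-out)
    return candidates[carry_in]

# Exhaustive check against ordinary addition.
for a in range(4):
    for b in range(4):
        for cin in (0, 1):
            s, cout = two_bit_carry_select(a, b, cin)
            assert s + (cout << 2) == a + b + cin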
-
3.
Publication No.: US20240241693A1
Publication Date: 2024-07-18
Application No.: US18426504
Filing Date: 2024-01-30
Applicant: Intel Corporation
CPC classification number: G06F7/483 , G06F1/03 , G06F7/4873 , G06F7/4988 , G06F7/544
Abstract: A processor to facilitate execution of a single-precision floating point operation on an operand is disclosed. The processor includes one or more execution units, each having a plurality of floating point units to execute one or more instructions to perform the single-precision floating point operation on the operand, including performing a floating point operation on an exponent component of the operand; and performing a floating point operation on a mantissa component of the operand, comprising dividing the mantissa component into a first sub-component and a second sub-component, determining a result of the floating point operation for the first sub-component and determining a result of the floating point operation for the second sub-component, and returning a result of the floating point operation.
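One way to picture the mantissa splitting in this abstract is sketched below, using multiplication as the example operation: the mantissa is separated into a high and a low sub-component, the operation is applied to each sub-component, and the partial results are recombined with the exponent. The choice of multiplication, the 24-bit mantissa width, and the 12-bit split point are assumptions made only for illustration.

import math

def split_mul(x, y, split_bits=12):
    m, e = math.frexp(x)                      # x = m * 2**e, 0.5 <= |m| < 1
    scaled = int(m * (1 << 24))               # 24-bit fixed-point mantissa
    hi = (scaled >> split_bits) << split_bits # first (upper) sub-component
    lo = scaled - hi                          # second (lower) sub-component
    part_hi = (hi / (1 << 24)) * y            # operate on each sub-component
    part_lo = (lo / (1 << 24)) * y
    return math.ldexp(part_hi + part_lo, e)   # recombine with the exponent

print(split_mul(3.14159, 2.71828), 3.14159 * 2.71828)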
-
4.
Publication No.: US12039288B2
Publication Date: 2024-07-16
Application No.: US17072692
Filing Date: 2020-10-16
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ihor Vasyltsov , Wooseok Chang , Youngnam Hwang
CPC classification number: G06F7/4988 , G06F1/03 , G06F17/10
Abstract: A processor-implemented data processing method includes: normalizing input data of an activation function comprising a division operation; determining dividend data corresponding to a dividend of the division operation by reading, from a memory, a value of a first lookup table addressed by the normalized input data; determining divisor data corresponding to a divisor of the division operation by accumulating the dividend data; and determining output data of the activation function corresponding to an output of the division operation obtained by reading, from the memory, a value of a second lookup table addressed by the dividend data and the divisor data.
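For illustration only, the sketch below applies the two-lookup-table pattern of this abstract to softmax (a common activation containing a division): inputs are normalized, a first table supplies the dividend terms, those terms are accumulated into the divisor, and a second table supplies the quotient. The table sizes, ranges, and quantization steps are assumptions.

import numpy as np

# First LUT: exp(x) for normalized inputs quantized over [-8, 0].
EXP_STEPS = 256
exp_lut = np.exp(np.linspace(-8.0, 0.0, EXP_STEPS))

# Second LUT: reciprocals 1/d for quantized divisors in (0, DMAX].
DIV_STEPS, DMAX = 1024, 64.0
recip_lut = 1.0 / np.linspace(DMAX / DIV_STEPS, DMAX, DIV_STEPS)

def lut_softmax(x):
    x = np.asarray(x, dtype=np.float64)
    x = x - x.max()                                   # normalize the input data
    idx = np.clip(((x + 8.0) / 8.0 * (EXP_STEPS - 1)).astype(int), 0, EXP_STEPS - 1)
    dividend = exp_lut[idx]                           # first-table lookup
    divisor = dividend.sum()                          # accumulate the dividend data
    didx = np.clip((divisor / DMAX * DIV_STEPS).astype(int) - 1, 0, DIV_STEPS - 1)
    return dividend * recip_lut[didx]                 # second-table lookup

print(lut_softmax([1.0, 2.0, 3.0]))                   # close to an exact softmax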
-
5.
Publication No.: US12032911B2
Publication Date: 2024-07-09
Application No.: US17144695
Filing Date: 2021-01-08
Applicant: Nice Ltd.
Inventor: Stephen Lauber
IPC: G06F40/289 , G06F1/03 , G06N20/00 , G10L15/26
CPC classification number: G06F40/289 , G06F1/03 , G06N20/00 , G10L15/26
Abstract: A system and method for training and using a text embedding model may include creating structured phrases from an input text; creating turn input samples from the input text, each turn input sample based only on, or consisting of, input from a single turn within the text and formed by removing structure from the structured phrases; and training an embedding model using the structured phrases and turn input samples. Call input samples may be created based on input from more than one turn within the text. At each level of resolution (e.g. phrase, speaker, call), a different level of resolution may be used to create input samples. At inference, an embedding may be based on a weighted combination of the sub-terms within an input phrase, each weight being based on an inverse document frequency measure for the sub-term associated with the weight.
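A small sketch of the inference-time weighting mentioned at the end of this abstract: a phrase embedding is formed as a weighted combination of sub-term vectors, with each weight derived from an inverse document frequency measure. The toy corpus, the random stand-in vectors, and the smoothed-IDF formula are illustrative assumptions.

import math
from collections import Counter

import numpy as np

docs = [["reset", "my", "password"], ["password", "expired"], ["billing", "question"]]

# Inverse document frequency per sub-term (smoothed variant, assumed).
df = Counter(term for doc in docs for term in set(doc))
idf = {t: math.log((1 + len(docs)) / (1 + df[t])) + 1.0 for t in df}

rng = np.random.default_rng(0)
dim = 8
term_vecs = {t: rng.normal(size=dim) for t in df}     # stand-in for a trained model

def embed_phrase(terms):
    # Weighted combination of sub-term vectors; the weights are the IDF values.
    weights = np.array([idf.get(t, 1.0) for t in terms])
    vecs = np.stack([term_vecs.get(t, np.zeros(dim)) for t in terms])
    return (weights[:, None] * vecs).sum(axis=0) / weights.sum()

print(embed_phrase(["reset", "password"]))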
-
6.
Publication No.: US20240211537A1
Publication Date: 2024-06-27
Application No.: US18433974
Filing Date: 2024-02-06
Applicant: Micron Technology, Inc.
Inventor: Fa-Long Luo
IPC: G06F17/16 , G06F1/03 , G06F12/02 , G06F17/14 , G11C11/4094
CPC classification number: G06F17/16 , G06F1/03 , G06F12/0207 , G06F17/147 , G11C11/4094
Abstract: Video processing matrix operations within a memory fabric, including converting a memory array into a matrix fabric for discrete cosine transform (DCT) matrix transformations and performing DCT matrix operations therein. For example, DCT matrix-matrix multiplication operations are performed within a memory device that includes a matrix fabric and matrix multiplication unit (MMU). Matrix-matrix multiplication operations may be obtained using separate matrix-vector products. The matrix fabric may use a crossbar construction of resistive elements. Each resistive element stores a level of impedance that represents the corresponding matrix coefficient value. The crossbar connectivity can be driven with an electrical signal representing the input vector as an analog voltage. The resulting signals can be converted from analog voltages to digital values by an MMU to yield a vector-matrix product. In some cases, the MMU may additionally perform various other logical operations within the digital domain.
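A numerical sketch of the decomposition this abstract describes: the DCT matrix stands in for the coefficients a crossbar would store as impedance levels, each column of the input block is treated as one input vector, and the matrix-matrix product is assembled from separate matrix-vector products. The 8x8 orthonormal DCT-II definition is standard; the block size and data are illustrative.

import numpy as np

N = 8

def dct_matrix(n):
    # Orthonormal DCT-II matrix (the coefficient values a crossbar would hold).
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def matmul_by_columns(matrix, block):
    # Matrix-matrix product built from separate matrix-vector products,
    # mirroring one input vector driven through the crossbar at a time.
    cols = [matrix @ block[:, j] for j in range(block.shape[1])]
    return np.column_stack(cols)

block = np.random.default_rng(1).integers(0, 256, size=(N, N)).astype(float)
D = dct_matrix(N)
assert np.allclose(matmul_by_columns(D, block), D @ block)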
-
7.
Publication No.: US11967119B2
Publication Date: 2024-04-23
Application No.: US17643837
Filing Date: 2021-12-12
Applicant: ZHEJIANG DAHUA TECHNOLOGY CO., LTD.
Inventor: Pan Yu , Zhuqing Zhu , Qi Chen , Wei Fang , Yinchang Yang
CPC classification number: G06T9/00 , G06F1/03 , H03M7/3079 , H03M7/6011
Abstract: The present disclosure relates to systems and methods for coding. The methods may include receiving at least two contexts, for each of the at least two contexts, obtaining at least one coding parameter corresponding to the context from at least one lookup table, determining a probability interval value corresponding to the context based on a previous probability interval value and the at least one coding parameter, determining a normalized probability interval value corresponding to the context by performing a normalization operation on the probability interval value, determining a probability interval lower limit corresponding to the context based on a previous probability interval lower limit and the at least one coding parameter, determining a normalized probability interval lower limit corresponding to the context by performing the normalization operation on the probability interval lower limit, and outputting at least one byte based on the normalized probability interval lower limit.
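For orientation, the sketch below is a generic LZMA-style binary range encoder showing the same ingredients this abstract names: a per-context probability parameter (which, per the abstract, would come from a lookup table), an interval value ("range") that is renormalized, an interval lower limit ("low"), and bytes output as the lower limit settles. It is not the patented coder; the 12-bit probability scale and 32-bit interval width are conventional assumptions.

class RangeEncoder:
    # Minimal binary range encoder, illustrative only.

    def __init__(self):
        self.low = 0                    # interval lower limit (may carry past 32 bits)
        self.range = 0xFFFFFFFF         # interval value
        self.cache = 0
        self.cache_size = 1
        self.out = bytearray()

    def encode_bit(self, p0, bit):
        # p0: probability of bit == 0, fixed point on a 1 << 12 scale.
        bound = (self.range >> 12) * p0
        if bit == 0:
            self.range = bound
        else:
            self.low += bound
            self.range -= bound
        while self.range < (1 << 24):   # normalization of the interval value
            self.range <<= 8
            self._shift_low()

    def _shift_low(self):
        # Output a byte of the lower limit once it can no longer change (carry-safe).
        if self.low < 0xFF000000 or self.low > 0xFFFFFFFF:
            carry = self.low >> 32
            self.out.append((self.cache + carry) & 0xFF)
            self.out.extend([(0xFF + carry) & 0xFF] * (self.cache_size - 1))
            self.cache = (self.low >> 24) & 0xFF
            self.cache_size = 0
        self.cache_size += 1
        self.low = (self.low << 8) & 0xFFFFFFFF

    def flush(self):
        for _ in range(5):
            self._shift_low()
        return bytes(self.out)

enc = RangeEncoder()
for b in [0, 1, 1, 0, 1]:
    enc.encode_bit(p0=2048, bit=b)      # 2048/4096: equiprobable bits
print(enc.flush().hex())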
-
8.
Publication No.: US11948086B2
Publication Date: 2024-04-02
Application No.: US18305297
Filing Date: 2023-04-21
Applicant: Google LLC
Inventor: Rahul Nagarajan , Lifeng Nai , George Kurian , Hema Hariharan
Abstract: Methods, systems, and apparatus, including computer-readable media, are described for performing neural network computations using a system configured to implement a neural network on a hardware circuit. The system includes a host that receives a batch of inputs to a neural network layer. Each of the inputs is stored in a memory location identified by an address. The system identifies one or more duplicate addresses in a listing of addresses for one or more inputs. For each duplicate address: the system generates a unique identifier that identifies the duplicate address in the listing of addresses. The system (i) obtains first inputs from memory locations identified by addresses corresponding to the unique identifiers and (ii) generates an output of the layer from the obtained first inputs.
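A small sketch of the duplicate-address handling in this abstract: duplicate addresses in the batch are mapped to one unique identifier each, every distinct memory location is read once, and the gathered first inputs feed the layer computation. The toy memory contents and the stand-in layer (a plain sum) are assumptions.

import numpy as np

memory = {0x10: np.array([1.0, 2.0]),
          0x20: np.array([3.0, 4.0]),
          0x30: np.array([5.0, 6.0])}

addresses = [0x10, 0x20, 0x10, 0x30, 0x20]            # listing with duplicates

# Assign one unique identifier per distinct address in the listing.
unique_ids = {}
for addr in addresses:
    unique_ids.setdefault(addr, len(unique_ids))

# Fetch each distinct memory location exactly once (the "first inputs").
first_inputs = np.stack([memory[a] for a in unique_ids])

# Generate the layer output from the gathered first inputs (toy layer: a sum).
gathered = first_inputs[[unique_ids[a] for a in addresses]]
print(gathered.sum(axis=0))   # equals summing duplicated reads, with fewer fetches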
-
9.
Publication No.: US20240045722A1
Publication Date: 2024-02-08
Application No.: US18488674
Filing Date: 2023-10-17
Applicant: NVIDIA Corporation
Inventor: Ravi P. Singh , Ching-Yu Hung , Jagadeesh Sankaran , Ahmad Itani , Yen-Te Shih
CPC classification number: G06F9/5027 , G06F7/76 , G06F1/03 , G06F9/5077
Abstract: In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two-point and two-by-two-point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
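The items above are hardware capabilities; as a rough functional picture of just one of them, the sketch below mimics a transposed load with a stride parameter, gathering a strided column of a row-major buffer into contiguous vector lanes. The buffer layout and sizes are assumptions.

import numpy as np

def transposed_load(buffer, base, stride, lanes):
    # Gather `lanes` elements starting at `base`, spaced `stride` apart,
    # so a strided column lands in contiguous vector lanes.
    idx = base + stride * np.arange(lanes)
    return buffer[idx]

# Row-major 4x8 tile stored flat; stride 8 walks down one column.
tile = np.arange(32, dtype=np.int32)
print(transposed_load(tile, base=3, stride=8, lanes=4))   # column 3: [3 11 19 27]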
-
10.
Publication No.: US20230421629A1
Publication Date: 2023-12-28
Application No.: US17850522
Filing Date: 2022-06-27
Applicant: Dell Products L.P.
Inventor: Ofir Ezrielev , Jehuda Shemer
Abstract: Methods and systems for managing distribution of inference models throughout a distributed system are disclosed. To manage distribution of inference models, a system may include a data aggregator and one or more data collectors. The data aggregator may obtain a threshold, the threshold indicating an acceptable inference error rate for an inference model. The data aggregator may obtain an inference model based on the threshold by training an inference model, performing a lookup in an inference model lookup table, or via other methods. The data aggregator may optimize the inference model to determine the minimum quantity of computing resources consumed by an inference model in order to generate inferences accurate within the threshold. In order to do so, the data aggregator may simulate the operation of more computationally-costly inference models and less computationally-costly inference models.
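A toy sketch of the selection step this abstract describes: given an acceptable inference error rate, pick the candidate inference model that meets the threshold while consuming the fewest computing resources. The candidate models and their numbers are invented for illustration.

# (error_rate, compute_cost) per candidate inference model -- invented values.
candidates = {
    "tiny":   (0.12, 1.0),
    "small":  (0.07, 2.5),
    "medium": (0.04, 6.0),
    "large":  (0.02, 15.0),
}

def pick_model(threshold):
    # Cheapest model whose error rate stays within the acceptable threshold.
    ok = [(cost, name) for name, (err, cost) in candidates.items() if err <= threshold]
    if not ok:
        raise ValueError("no candidate meets the threshold")
    return min(ok)[1]

print(pick_model(0.05))   # -> "medium": accurate enough, cheaper than "large"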