Patent search ap:("Intel Corporation") AND inv:"Mikhail Smelyanskiy" Page 1

1.

Invention patent application
HARDWARE APPARATUSES AND METHODS TO PREFETCH A MULTIDIMENSIONAL BLOCK OF ELEMENTS FROM A MULTIDIMENSIONAL ARRAY [Status: under examination, published]

Publication (announcement) number: US20190138309A1

Publication (announcement) date: 2019-05-09

Application number: US16004081

Application date: 2018-06-08

Applicant: INTEL CORPORATION

Inventor: VICTOR LEE , Mikhail Smelyanskiy , Alexander Heinecke

IPC: G06F9/30 , G06F12/02 , G06F12/0862 , G06F9/345 , G06F12/0875 , G06F9/34 , G06F12/0811

Abstract: Methods and apparatuses relating to a prefetch instruction to prefetch a multidimensional block of elements from a multidimensional array into a cache. In one embodiment, a hardware processor includes a decoder to decode a prefetch instruction to prefetch a multidimensional block of elements from a multidimensional array into a cache, wherein at least one operand of the prefetch instruction is to indicate a system memory address of an element of the multidimensional block of elements, a stride of the multidimensional block of elements, and boundaries of the multidimensional block of elements, and an execution unit to execute the prefetch instruction to generate system memory addresses of the other elements of the multidimensional block of elements, and load the multidimensional block of elements into the cache from the system memory addresses.
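
The address generation described in this abstract can be illustrated with a minimal sketch. It assumes a two-dimensional, row-major array and a 64-byte cache line; the names `block_addresses`, `cache_lines`, `row_stride`, and `elem_size` are hypothetical and are not taken from the patent, which does not fix an operand encoding in the abstract.

```python
def block_addresses(base, row_stride, rows, cols, elem_size):
    """Yield the byte address of every element in a rows x cols block.

    base:       address of the block's first element (assumed operand)
    row_stride: bytes between the starts of consecutive array rows
    rows, cols: block boundaries, in elements
    elem_size:  bytes per element
    """
    for r in range(rows):
        for c in range(cols):
            yield base + r * row_stride + c * elem_size


def cache_lines(addresses, line_size=64):
    """Collapse element addresses into the distinct cache lines they touch."""
    return sorted({addr // line_size * line_size for addr in addresses})


# A 2 x 4 block of 4-byte elements in an array whose rows are 256 bytes apart.
addrs = list(block_addresses(base=0x1000, row_stride=256, rows=2, cols=4, elem_size=4))
lines = cache_lines(addrs)
```

Here the eight element addresses fall into two cache lines, one per block row, so a single instruction of this kind could stand in for one conventional prefetch per row.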

2.

Invention patent application
Texture Unit for General Purpose Computing [Status: under examination, published]

Publication (announcement) number: US20150228091A1

Publication (announcement) date: 2015-08-13

Application number: US14693056

Application date: 2015-04-22

Applicant: Intel Corporation

Inventor: Victor W. Lee , Mikhail Smelyanskiy , Ganesh S. Dasika , Jose Gonzalez , Jatin Chhugani , Yen-Kuang Chen , Changkyu Kim , Julio Gago , Santiago Galan , Victor Moya Del Barrio

IPC: G06T11/00

CPC classification number: G06T11/001 , G06F17/16 , G06T1/00

Abstract: A texture unit may be used to perform general purpose mathematical computations such as dot products. This enables some general purpose computations and operations to be offloaded from a central processing unit to the texture unit. The texture unit may use linear interpolators in order to perform the dot product calculations.
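
To see why a linear interpolator is a plausible building block for dot products, note that `lerp(a, b, t) = (1 - t) * a + t * b` is already a two-term weighted sum. The sketch below rescales that sum to handle arbitrary weights. This is only an illustrative identity, not the mechanism claimed in the patent, and the function names are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return (1.0 - t) * a + t * b


def dot2_via_lerp(x, w):
    """Compute x[0]*w[0] + x[1]*w[1] with one lerp and one rescale."""
    total = w[0] + w[1]
    if total == 0.0:
        # The weights cancel, so the lerp form is undefined; evaluate directly.
        return x[0] * w[0] + x[1] * w[1]
    return total * lerp(x[0], x[1], w[1] / total)


def dot_via_lerp(xs, ws):
    """Dot product built from pairwise lerps; a leftover odd term is multiplied directly."""
    if len(xs) != len(ws):
        raise ValueError("vectors must have the same length")
    acc = 0.0
    for i in range(0, len(xs) - 1, 2):
        acc += dot2_via_lerp(xs[i:i + 2], ws[i:i + 2])
    if len(xs) % 2:
        acc += xs[-1] * ws[-1]
    return acc


result = dot_via_lerp([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

The result agrees with the direct dot product up to floating-point rounding.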

3.

Granted invention patent
Optimized compute hardware for machine learning operations [Status: in force]

Publication (announcement) number: US11334796B2

Publication (announcement) date: 2022-05-17

Application number: US16983107

Application date: 2020-08-03

Applicant: Intel Corporation

Inventor: Dipankar Das , Roger Gramunt , Mikhail Smelyanskiy , Jesus Corbal , Dheevatsa Mudigere , Naveen K. Mellempudi , Alexander F. Heinecke

IPC: G06F17/16 , G06F9/30 , G06F9/38 , G06F7/544 , G06N3/08 , G06N3/063 , G06N3/04

Abstract: A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.
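
The relationship between operand bit length and the number of multiplies per 32-bit lane can be sketched in software. The example assumes signed two's-complement fields packed lowest-first into the lane and a second operand supplied as already-unpacked wider values. The helper names and the packing order are assumptions for illustration, not details from the patent.

```python
LANE_BITS = 32


def multiplies_per_lane(smallest_bits):
    """Number of parallel multiplies that fit in one 32-bit lane."""
    return LANE_BITS // smallest_bits


def pack_lane(values, bits):
    """Pack signed integers into one lane, lowest field first."""
    mask = (1 << bits) - 1
    lane = 0
    for i, v in enumerate(values):
        lane |= (v & mask) << (i * bits)
    return lane


def unpack_lane(lane, bits):
    """Split a lane into signed `bits`-wide integers, lowest field first."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    values = []
    for i in range(multiplies_per_lane(bits)):
        field = (lane >> (i * bits)) & mask
        values.append(field - (1 << bits) if field & sign else field)
    return values


def lane_dot(a_lane, a_bits, b_values, acc=0):
    """Multiply each narrow field of a_lane by the matching wider value and accumulate."""
    a_values = unpack_lane(a_lane, a_bits)
    if len(a_values) != len(b_values):
        raise ValueError("operand element counts differ")
    return acc + sum(x * y for x, y in zip(a_values, b_values))


# Four 8-bit values share one lane; the wider operand supplies four matching values.
a = pack_lane([1, -2, 3, -4], bits=8)
total = lane_dot(a, 8, [10, 20, 30, 40], acc=5)
```

With 8-bit inputs the lane holds four products; with 16-bit inputs it would hold two, and with 4-bit inputs eight.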

4.

Granted invention patent
Sorting data and merging sorted data in an instruction set architecture [Status: in force]

Publication (announcement) number: US10198264B2

Publication (announcement) date: 2019-02-05

Application number: US14969864

Application date: 2015-12-15

Applicant: Intel Corporation

Inventor: Asit K. Mishra , Deborah T. Marr , Jong Soo Park , Nadathur Rajagopalan Satish , Mikhail Smelyanskiy , Michael Anderson , Mostofa Ali Patwary , Narayanan Sundaram , Sheng Li

IPC: G06F9/30

Abstract: A processing device includes a sorting module, which adds to each of a plurality of elements a position value of a corresponding position in a register rest resulting in a plurality of transformed elements in corresponding positions. The plurality of elements include a plurality of bits. The sorting module compares each of the plurality of transformed elements to itself and to one another. The sorting module also assigns one of an enabled or disabled indicator to each of the plurality of the transformed elements based on the comparison. The sorting module further counts a number of the enabled indicators assigned to each of the plurality of the transformed elements to generate a sorted sequence of the plurality of elements.
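
The steps in this abstract resemble a comparison-counting (rank) sort, which can be sketched as follows. The sketch assumes integer elements and assumes that the position value is added after shifting each element left, so that positions only break ties. The abstract says only that a position value is added, so the shift and the name `rank_sort` are hypothetical.

```python
def rank_sort(elements):
    """Sort integers by counting, for each element, how many elements it is >= to."""
    n = len(elements)
    # Enough low bits to hold any position index 0..n-1.
    pos_bits = max(1, (n - 1).bit_length())
    # Transformed elements are unique: the position occupies the low bits.
    transformed = [(e << pos_bits) + i for i, e in enumerate(elements)]

    out = [None] * n
    for i, t in enumerate(transformed):
        # One enabled/disabled indicator per comparison, including against itself.
        indicators = [t >= other for other in transformed]
        rank = sum(indicators)        # the count of enabled indicators is a 1-based rank
        out[rank - 1] = elements[i]
    return out


sorted_values = rank_sort([3, 1, 2, 1])
```

Because every transformed element is distinct, each rank is used exactly once, and equal inputs keep their original relative order.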

5.

Granted invention patent
Hardware apparatuses and methods to prefetch a multidimensional block of elements from a multidimensional array [Status: in force]

Publication (announcement) number: US09996350B2

Publication (announcement) date: 2018-06-12

Application number: US14583651

Application date: 2014-12-27

Applicant: Intel Corporation

Inventor: Victor Lee , Mikhail Smelyanskiy , Alexander Heinecke

IPC: G06F9/30 , G06F9/34 , G06F12/0875 , G06F9/345

CPC classification number: G06F9/30047 , G06F9/30145 , G06F9/34 , G06F9/3455 , G06F12/0207 , G06F12/0811 , G06F12/0862 , G06F12/0875 , G06F2212/452 , G06F2212/6026

Abstract: Methods and apparatuses relating to a prefetch instruction to prefetch a multidimensional block of elements from a multidimensional array into a cache. In one embodiment, a hardware processor includes a decoder to decode a prefetch instruction to prefetch a multidimensional block of elements from a multidimensional array into a cache, wherein at least one operand of the prefetch instruction is to indicate a system memory address of an element of the multidimensional block of elements, a stride of the multidimensional block of elements, and boundaries of the multidimensional block of elements, and an execution unit to execute the prefetch instruction to generate system memory addresses of the other elements of the multidimensional block of elements, and load the multidimensional block of elements into the cache from the system memory addresses.

6.

Granted invention patent
Methods, systems and apparatus to optimize sparse matrix applications [Status: in force]

Publication (announcement) number: US09720663B2

Publication (announcement) date: 2017-08-01

Application number: US14750635

Application date: 2015-06-25

Applicant: Intel Corporation

Inventor: Hongbo Rong , Jong Soo Park , Mikhail Smelyanskiy , Geoff Lowney

IPC: G06F9/45 , G06F9/445 , G06F17/16

CPC classification number: G06F8/41 , G06F8/443 , G06F8/4434 , G06F8/4435 , G06F9/44521 , G06F17/16

Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to optimize sparse matrix execution. An example disclosed apparatus includes a context former to identify a matrix function call from a matrix function library, the matrix function call associated with a sparse matrix, a pattern matcher to identify an operational pattern associated with the matrix function call, and a code generator to associate a function data structure with the matrix function call exhibiting the operational pattern, the function data structure stored external to the matrix function library, and facilitate a runtime link between the function data structure and the matrix function call.

7.

Invention patent application
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS [Status: in force]

Publication (announcement) number: US20210019631A1

Publication (announcement) date: 2021-01-21

Application number: US16983107

Application date: 2020-08-03

Applicant: Intel Corporation

Inventor: Dipankar Das , Roger Gramunt , Mikhail Smelyanskiy , Jesus Corbal , Dheevatsa Mudigere , Naveen K. Mellempudi , Alexander F. Heinecke

IPC: G06N3/08 , G06N3/063 , G06N3/04 , G06F17/16 , G06F9/30

Abstract: A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.

8.

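Invention patent application
VIRTUAL VECTOR PROCESSING [Status: under examination, published]

Publication (announcement) number: US20190026158A1

Publication (announcement) date: 2019-01-24

Application number: US15872762

Application date: 2018-01-16

Applicant: Intel Corporation

Inventor: Anthony Nguyen , Engin Ipek , Victor Lee , Daehyun Kim , Mikhail Smelyanskiy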

IPC: G06F9/50 , G06F15/80

Abstract: Methods and apparatus to provide virtualized vector processing are described. In one embodiment, one or more operations corresponding to a virtual vector request are distributed to one or more processor cores for execution.
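
A loose software analogy for distributing one virtual vector request across cores is to split a long element-wise operation into contiguous chunks and hand each chunk to a worker. In the sketch below, threads stand in for processor cores, and the names `chunk_bounds` and `virtual_vector_add` are hypothetical. The patent describes a hardware mechanism, which this does not model.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(length, parts):
    """Split range(length) into at most `parts` contiguous (start, stop) pairs."""
    parts = max(1, min(parts, length))
    size, extra = divmod(length, parts)
    bounds, start = [], 0
    for p in range(parts):
        stop = start + size + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def virtual_vector_add(a, b, workers=4):
    """Element-wise a + b, with each chunk computed by a separate worker."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    bounds = chunk_bounds(len(a), workers)

    def add_chunk(bound):
        start, stop = bound
        return [x + y for x, y in zip(a[start:stop], b[start:stop])]

    with ThreadPoolExecutor(max_workers=len(bounds)) as pool:
        # map() yields the chunk results in submission order.
        pieces = pool.map(add_chunk, bounds)
        return [value for piece in pieces for value in piece]


summed = virtual_vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], workers=2)
```

The caller sees a single vector operation of arbitrary length, while the work is carried out in fixed-size pieces behind that interface.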

9.

Invention patent application
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS [Status: under examination, published]

Publication (announcement) number: US20180322390A1

Publication (announcement) date: 2018-11-08

Application number: US15869564

Application date: 2018-01-12

Applicant: Intel Corporation

Inventor: Dipankar Das , Roger Gramunt , Mikhail Smelyanskiy , Jesus Corbal , Dheevatsa Mudigere , Naveen K. Mellempudi , Alexander F. Heinecke

IPC: G06N3/08 , G06F17/16 , G06N3/04 , G06N3/063

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit length of the smaller bit length.

10.

Granted invention patent
Texture unit for general purpose computing [Status: in force]

Publication (announcement) number: US09076254B2

Publication (announcement) date: 2015-07-07

Application number: US14054933

Application date: 2013-10-16

Applicant: Intel Corporation

Inventor: Victor W. Lee , Mikhail Smelyanskiy , Ganesh S. Dasika , Jose Gonzalez , Jatin Chhugani , Yen-Kuang Chen , Changkyu Kim , Julio Gago , Santiago Galan , Victor Moya Del Barrio

IPC: G09G5/00 , G06T11/00 , G06F17/16 , G06T1/00

CPC classification number: G06T11/001 , G06F17/16 , G06T1/00

Abstract: A texture unit may be used to perform general purpose mathematical computations such as dot products. This enables some general purpose computations and operations to be offloaded from a central processing unit to the texture unit. The texture unit may use linear interpolators in order to perform the dot product calculations.
