Patent search ap:("NVIDIA Corporation") AND inv:"Stuart Oberman"

1.

Granted invention patent
Generalized acceleration of matrix multiply accumulate operations (in force)

Publication (announcement) number: US11797303B2

Publication (announcement) date: 2023-10-24

Application number: US17351175

Application date: 2021-06-17

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06T1/20, G06F9/38

CPC classification number: G06F9/30014, G06F9/3001, G06F9/3012, G06F9/30036, G06F9/3851, G06T1/20

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.
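
The three dot-product steps named in this abstract (generate partial products, align them by exponent, accumulate) can be sketched in software. This is only an illustration under stated assumptions, not the patented datapath: the function name `aligned_dot`, the fixed-point width `frac_bits`, truncation toward zero, and the use of a single Python integer in place of the result queue are all hypothetical choices, and inputs are assumed to be finite floats.

```python
import math


def aligned_dot(a, b, frac_bits=40):
    """Dot product using exponent-aligned partial products (illustrative sketch)."""
    # Step 1: one partial product per pair of corresponding elements.
    # Each element is split as value == mantissa * 2**exponent, so the
    # product's exponent is the sum of the two element exponents.
    terms = []
    for x, y in zip(a, b):
        mx, ex = math.frexp(x)
        my, ey = math.frexp(y)
        terms.append((mx * my, ex + ey))

    exponents = [e for m, e in terms if m != 0.0]
    if not exponents:
        return 0.0
    max_exp = max(exponents)

    # Step 2: align every partial product to the largest exponent by
    # shifting its fixed-point mantissa right (low bits are truncated).
    # Step 3: accumulate the aligned values with integer addition.
    acc = 0
    for m, e in terms:
        if m == 0.0:
            continue
        sign = -1 if m < 0 else 1
        magnitude = int(abs(m) * (1 << frac_bits)) >> (max_exp - e)
        acc += sign * magnitude

    # Convert the fixed-point sum back to a float.
    return math.ldexp(acc, max_exp - frac_bits)


print(aligned_dot([1.5, 2.0, -0.25], [4.0, 0.5, 8.0]))
```

In this sketch a partial product much smaller than the largest one is shifted out entirely, which is the usual cost of aligning to a common exponent with a finite accumulator width.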

2.

Granted invention patent
Generalized acceleration of matrix multiply accumulate operations (in force)

Publication (announcement) number: US10884734B2

Publication (announcement) date: 2021-01-05

Application number: US16459191

Application date: 2019-07-01

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06T1/20, G06F9/38

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.
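
The abstract states that each element of the result matrix comes from at least one dot product of corresponding vector pairs. A minimal sketch of that structure follows, assuming the common form D = A x B + C, where the vector pairs are a row of A and a column of B. The abstract does not spell out this form, and the name `mma` and the list-of-lists layout are hypothetical.

```python
def mma(a, b, c):
    """Matrix multiply-accumulate, D = A x B + C, built from dot products."""
    rows, inner, cols = len(a), len(b), len(b[0])
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Vector pair for element (i, j): row i of A and column j of B.
            column = [b[k][j] for k in range(inner)]
            dot = sum(x * y for x, y in zip(a[i], column))
            # Accumulate the dot product onto the matching element of C.
            d[i][j] = c[i][j] + dot
    return d


print(mma([[1, 2], [3, 4]], [[5, 6], [7, 8]], [[1, 0], [0, 1]]))
```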

3.

Invention patent application
GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS (under examination, published)

Publication (announcement) number: US20180321938A1

Publication (announcement) date: 2018-11-08

Application number: US15826435

Application date: 2017-11-29

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06F9/38, G06T1/20

CPC classification number: G06F9/30014, G06F9/3001, G06F9/30036, G06F9/3012, G06F9/3851, G06T1/20

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

4.

Granted invention patent
Generalized acceleration of matrix multiply accumulate operations (in force)

Publication (announcement) number: US11816481B2

Publication (announcement) date: 2023-11-14

Application number: US17890540

Application date: 2022-08-18

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06F9/38, G06T1/20

CPC classification number: G06F9/30014, G06F9/3001, G06F9/3012, G06F9/30036, G06F9/3851, G06T1/20

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

5.

Granted invention patent
Generalized acceleration of matrix multiply accumulate operations (in force)

Publication (announcement) number: US11797301B2

Publication (announcement) date: 2023-10-24

Application number: US17141082

Application date: 2021-01-04

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06T1/20, G06F9/38

CPC classification number: G06F9/30014, G06F9/3001, G06F9/3012, G06F9/30036, G06F9/3851, G06T1/20

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

6.

Published invention application
INLINE DATA INSPECTION FOR WORKLOAD SIMPLIFICATION (under examination, published)

Publication (announcement) number: US20230221957A1

Publication (announcement) date: 2023-07-13

Application number: US18112923

Application date: 2023-02-22

Applicant: NVIDIA Corporation

Inventor: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman

IPC: G06F9/30

CPC classification number: G06F9/30043, G06F9/30021, G06F9/30145

Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.

7.

Invention patent application
GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS (in force)

Publication (announcement) number: US20210311734A1

Publication (announcement) date: 2021-10-07

Application number: US17351175

Application date: 2021-06-17

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06T1/20, G06F9/38

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

8.

Granted invention patent
Generalized acceleration of matrix multiply accumulate operations (in force)

Publication (announcement) number: US11816482B2

Publication (announcement) date: 2023-11-14

Application number: US17890706

Application date: 2022-08-18

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06F9/38, G06T1/20

CPC classification number: G06F9/30014, G06F9/3001, G06F9/3012, G06F9/30036, G06F9/3851, G06T1/20

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

9.

Invention patent application
GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS (in force)

Publication (announcement) number: US20220405098A1

Publication (announcement) date: 2022-12-22

Application number: US17890706

Application date: 2022-08-18

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06T1/20, G06F9/38

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

10.

Invention patent application
INLINE DATA INSPECTION FOR WORKLOAD SIMPLIFICATION (under examination, published)

Publication (announcement) number: US20190065195A1

Publication (announcement) date: 2019-02-28

Application number: US15693345

Application date: 2017-08-31

Applicant: NVIDIA Corporation

Inventor: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman

IPC: G06F9/30

Abstract: A method, computer readable medium, and system are disclosed for inline data inspection. The method includes the steps of receiving, by a load/store unit, a load instruction and obtaining, by an inspection circuit that is coupled to the load/store unit, data specified by the load instruction. Additional steps include determining that the data equals zero and transmitting the data and a predicate signal to the load/store unit, wherein the predicate signal indicates that the data equals zero. Alternative additional steps include computing a predicate value based on a comparison between the data and a threshold value and transmitting the data and the predicate value to the load/store unit, wherein the predicate value is asserted when the data is less than the threshold value and is negated when the data is not less than the threshold value.
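
The two predicate variants described in this abstract (data equals zero, or data is less than a threshold) can be modeled in a few lines. This is a software sketch of the behavior only, not the load/store unit or inspection circuit itself. The names `inspected_load` and `sparse_dot`, the list standing in for memory, and the use of the predicate to skip multiplications are assumptions made for illustration.

```python
def inspected_load(memory, address, threshold=None):
    """Return (data, predicate) for a load, mimicking inline inspection."""
    data = memory[address]
    if threshold is None:
        # Zero-detection variant: predicate signals that the data equals zero.
        predicate = data == 0
    else:
        # Threshold variant: asserted when data < threshold, negated otherwise.
        predicate = data < threshold
    return data, predicate


def sparse_dot(memory_a, memory_b):
    """Hypothetical consumer: skip multiplies whose loaded operand is zero."""
    total = 0.0
    for address in range(len(memory_a)):
        value, is_zero = inspected_load(memory_a, address)
        if is_zero:
            continue  # the product would be zero, so the work is skipped
        total += value * memory_b[address]
    return total


print(inspected_load([0.0, 0.3, 2.5], 1, threshold=0.5))
print(sparse_dot([0.0, 2.0, 0.0, 1.0], [9.0, 3.0, 7.0, 4.0]))
```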
