-
Publication number: US12223427B2
Publication date: 2025-02-11
Application number: US18325744
Filing date: 2023-05-30
Applicant: Intel Corporation
Inventor: Lev Faivishevsky, Tomer Bar-On, Yaniv Fais, Jacob Subag, Jeremie Dreyfuss, Amit Bleiweiss, Tomer Schwartz, Raanan Yonatan Yehezkel Rohekar, Michael Behar, Amitai Armon, Uzi Sarel
Abstract: In an example, an apparatus comprises a plurality of execution units and logic, at least partially including hardware logic, to receive a plurality of data inputs for training a neural network, wherein the data inputs comprise training data and weight inputs; represent the data inputs in a first form; and represent the weight inputs in a second form. Other embodiments are also disclosed and claimed.
-
Publication number: US20240119255A1
Publication date: 2024-04-11
Application number: US18392761
Filing date: 2023-12-21
Applicant: Intel Corporation
Inventor: Yaniv Fais, Moshe Maor
Abstract: An example apparatus to perform a convolution on an input tensor includes a parameters generator to: generate a horizontal hardware execution parameter for a horizontal dimension of the input tensor based on a kernel parameter and a layer parameter; and generate a vertical hardware execution parameter for a vertical dimension of the input tensor based on the kernel parameter and the layer parameter; an accelerator interface to configure hardware accelerator circuitry based on the horizontal and vertical hardware execution parameters; a horizontal iterator controller to determine when the hardware accelerator circuitry completes the first horizontal iteration of the convolution; and a vertical iterator controller to determine when the hardware accelerator circuitry completes the first vertical iteration of the convolution.
-
Publication number: US20220237850A1
Publication date: 2022-07-28
Application number: US17669126
Filing date: 2022-02-10
Applicant: Intel Corporation
Inventor: Uzi Sarel, Ehud Cohen, Tomer Schwartz, Amitai Armon, Yahav Shadmiy, Itamar Ben-Ari, Amit Bleiweiss, Lev Faivishevsky, Tomer Bar-On, Yaniv Fais, Jacob Subag, Michael Behar, Guy Jacob, Gal Leibovich, Jeremie Dreyfuss
Abstract: In an example, an apparatus comprises a plurality of execution units; and logic, at least partially including hardware logic, to determine a sub-graph of a network that can be executed in a frequency domain and apply computations in the sub-graph in the frequency domain. Other embodiments are also disclosed and claimed.
-
Publication number: US20220076118A1
Publication date: 2022-03-10
Application number: US17404153
Filing date: 2021-08-17
Applicant: Intel Corporation
Inventor: Lev Faivishevsky, Tomer Bar-On, Yaniv Fais, Jacob Subag, Jeremie Dreyfuss, Amit Bleiweiss, Tomer Schwartz, Raanan Yonatan Yehezkel Rohekar, Michael Behar, Amitai Armon, Uzi Sarel
Abstract: In an example, an apparatus comprises a plurality of execution units and logic, at least partially including hardware logic, to receive a plurality of data inputs for training a neural network, wherein the data inputs comprise training data and weight inputs; represent the data inputs in a first form; and represent the weight inputs in a second form. Other embodiments are also disclosed and claimed.
-
Publication number: US11087206B2
Publication date: 2021-08-10
Application number: US15581045
Filing date: 2017-04-28
Applicant: Intel Corporation
Inventor: Tomer Schwartz, Ehud Cohen, Uzi Sarel, Amitai Armon, Yaniv Fais, Lev Faivishevsky, Amit Bleiweiss, Yahav Shadmiy, Jacob Subag
Abstract: A mechanism is described for facilitating memory handling and data management in machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting multiple tables associated with multiple neural networks at multiple autonomous machines, where each of the multiple tables includes an index. The method may further include combining the multiple tables and multiple indexes associated with the multiple tables into a single table and a single index, respectively, where the single table is communicated to the multiple autonomous machines to allow simultaneous processing of one or more portions of the single table using one or more memory devices and one or more processors of one or more of the multiple autonomous machines.
-
Publication number: US20180314933A1
Publication date: 2018-11-01
Application number: US15499899
Filing date: 2017-04-28
Applicant: Intel Corporation
Inventor: Amit Bleiweiss, Lev Faivishevsky, Tomer Schwartz, Yaniv Fais, Jacob Subag
CPC classification number: G06N5/003, G06N3/0445, G06N3/0481, G06N3/063, G06N99/005
Abstract: In an example, an apparatus comprises a plurality of execution units and logic, at least partially including hardware logic, to implement training of a deep tree application at a data center. Other embodiments are also disclosed and claimed.
-
Publication number: US20180314931A1
Publication date: 2018-11-01
Application number: US15499896
Filing date: 2017-04-28
Applicant: Intel Corporation
Inventor: Uzi Sarel, Ehud Cohen, Tomer Schwartz, Amitai Armon, Yahav Shadmiy, Amit Bleiweiss, Gal Leibovich, Jeremie Dreyfuss, Lev Faivishevsky, Tomer Bar-On, Yaniv Fais, Jacob Subag
CPC classification number: G06T1/20, G06F9/30014, G06F9/30025, G06F9/30043, G06N3/00
Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to expose embedded cast operations in at least one of a load instruction or a store instruction; determine a target precision level for the cast operations; and load the cast operations at the target precision level. Other embodiments are also disclosed and claimed.
-
Publication number: US20180314926A1
Publication date: 2018-11-01
Application number: US15581045
Filing date: 2017-04-28
Applicant: Intel Corporation
Inventor: Tomer Schwartz, Ehud Cohen, Uzi Sarel, Amitai Armon, Yaniv Fais, Lev Faivishevsky, Amit Bleiweiss, Yahav Shadmiy, Jacob Subag
Abstract: A mechanism is described for facilitating memory handling and data management in machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting multiple tables associated with multiple neural networks at multiple autonomous machines, where each of the multiple tables includes an index. The method may further include combining the multiple tables and multiple indexes associated with the multiple tables into a single table and a single index, respectively, where the single table is communicated to the multiple autonomous machines to allow simultaneous processing of one or more portions of the single table using one or more memory devices and one or more processors of one or more of the multiple autonomous machines.
-
Publication number: US12223413B2
Publication date: 2025-02-11
Application number: US18392761
Filing date: 2023-12-21
Applicant: Intel Corporation
Inventor: Yaniv Fais, Moshe Maor
Abstract: An example apparatus to perform a convolution on an input tensor includes a parameters generator to: generate a horizontal hardware execution parameter for a horizontal dimension of the input tensor based on a kernel parameter and a layer parameter; and generate a vertical hardware execution parameter for a vertical dimension of the input tensor based on the kernel parameter and the layer parameter; an accelerator interface to configure hardware accelerator circuitry based on the horizontal and vertical hardware execution parameters; a horizontal iterator controller to determine when the hardware accelerator circuitry completes the first horizontal iteration of the convolution; and a vertical iterator controller to determine when the hardware accelerator circuitry completes the first vertical iteration of the convolution.
-
Publication number: US20250045560A1
Publication date: 2025-02-06
Application number: US18922038
Filing date: 2024-10-21
Applicant: Intel Corporation
Inventor: Yaniv Fais, Moshe Maor
Abstract: An example apparatus to perform a convolution on an input tensor includes a parameters generator to: generate a horizontal hardware execution parameter for a horizontal dimension of the input tensor based on a kernel parameter and a layer parameter; and generate a vertical hardware execution parameter for a vertical dimension of the input tensor based on the kernel parameter and the layer parameter; an accelerator interface to configure hardware accelerator circuitry based on the horizontal and vertical hardware execution parameters; a horizontal iterator controller to determine when the hardware accelerator circuitry completes the first horizontal iteration of the convolution; and a vertical iterator controller to determine when the hardware accelerator circuitry completes the first vertical iteration of the convolution.