Patent search ap:("Intel Corporation") AND inv:"Yuval Yosef" Page 1

1.

发明授权
Apparatus and method for low-latency invocation of accelerators 有权

公开(公告)号：US10083037B2

公开(公告)日：2018-09-25

申请号：US15281944

申请日：2016-09-30

Applicant: Intel Corporation

Inventor： Oren Ben-Kiki , Ilan Pardo , Robert Valentine , Eliezer Weissmann , Dror Markovich , Yuval Yosef

IPC: G06F9/38 , G06F9/30 , G06F11/07 , G06F9/54 , G06F12/0875

CPC classification number: G06F9/3802 , G06F9/3004 , G06F9/30043 , G06F9/30076 , G06F9/30101 , G06F9/30145 , G06F9/3016 , G06F9/384 , G06F9/3877 , G06F9/3879 , G06F9/3881 , G06F9/54 , G06F11/0721 , G06F11/0724 , G06F11/0772 , G06F12/0875 , G06F2212/452

Abstract: An apparatus and method are described for providing low-latency invocation of accelerators. For example, a processor according to one embodiment comprises: a plurality of simultaneous multithreading (SMT) cores, at least one shared cache circuit to be shared among the SMT cores, and at least one L2 cache circuit to store both instructions and data. The processor further comprises a communication interconnect circuit including a PCIe circuit to communicatively couple one or more of the SMT cores to an accelerator device, the PCIe circuit to provide the accelerator device access to resources of the processor including the at least one shared cache circuit. The processor further comprises a memory access circuit to identify an accelerator context save/restore region in a memory determined by an accelerator context save/restore value, the accelerator context save/restore region to store an accelerator context state.

2.

发明申请
Instruction Set Architecture-Based Inter-Sequencer Communications With A Heterogeneous Resource 审中-公开
Title translation: 指令集基于异构资源的基于架构的间隔符间通信

公开(公告)号：US20150070368A1

公开(公告)日：2015-03-12

申请号：US14541933

申请日：2014-11-14

Applicant: Intel Corporation

Inventor： Hong Wang , John Shen , Hong Jiang , Richard Hankins , Per Hammarlund , Dion Rodgers , Gautham Chinya , Baiju Patel , Shiv Kaushik , Bryant Bigbee , Gad Sheaffer , Yoav Talgam , Yuval Yosef , James P. Held

IPC: G06F9/30 , G06T1/20

CPC classification number: G06F9/30145 , G06F9/30 , G06F9/30181 , G06F9/3877 , G06F9/3879 , G06T1/20

Abstract: In one embodiment, the present invention includes a method for directly communicating between an accelerator and an instruction sequencer coupled thereto, where the accelerator is a heterogeneous resource with respect to the instruction sequencer. An interface may be used to provide the communication between these resources. Via such a communication mechanism a user-level application may directly communicate with the accelerator without operating system support. Further, the instruction sequencer and the accelerator may perform operations in parallel. Other embodiments are described and claimed.

Abstract translation: 在一个实施例中，本发明包括一种用于在加速器和与其耦合的指令定序器之间直接通信的方法，其中加速器是相对于指令定序器的异构资源。可以使用接口来提供这些资源之间的通信。通过这种通信机制，用户级应用可以直接与加速器进行通信，而无需操作系统支持。此外，指令定序器和加速器可以并行地执行操作。描述和要求保护其他实施例。

3.

发明申请
APPARATUS AND METHOD FOR A HYBRID LATENCY-THROUGHPUT PROCESSOR 审中-公开

公开(公告)号：US20190196838A1

公开(公告)日：2019-06-27

申请号：US16289075

申请日：2019-02-28

Applicant: Intel Corporation

Inventor： Oren Ben-Kiki , Yuval Yosef , Ilan Pardo , Dror Markovich

IPC: G06F9/38 , G06F15/80 , G06F9/46 , G06F9/30

CPC classification number: G06F9/3851 , G06F9/30079 , G06F9/3836 , G06F9/3855 , G06F9/3861 , G06F9/3877 , G06F9/46 , G06F15/7867 , G06F15/7892 , G06F15/80

Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.

4.

发明授权
Memory address collision detection of ordered parallel threads with bloom filters 有权
Title translation: 带有绽放滤波器的有序并行线程的内存地址冲突检测

公开(公告)号：US09542193B2

公开(公告)日：2017-01-10

申请号：US13730704

申请日：2012-12-28

Applicant: Intel Corporation

Inventor： Enrique De Lucas , Pedro Marcuello , Oren Ben-Kiki , Ilan Pardo , Yuval Yosef

IPC: G06F9/38

CPC classification number: G06F9/30043 , G06F9/3009 , G06F9/3834 , G06F9/3851 , G06F9/3857 , G06F9/3867

Abstract: A semiconductor chip is described having a load collision detection circuit comprising a first bloom filter circuit. The semiconductor chip has a store collision detection circuit comprising a second bloom filter circuit. The semiconductor chip has one or more processing units capable of executing ordered parallel threads coupled to the load collision detection circuit and the store collision detection circuit. The load collision detection circuit and the store collision detection circuit is to detect younger stores for load operations of said threads and younger loads for store operations of said threads.

Abstract translation: 描述了一种具有负载碰撞检测电路的半导体芯片，该电路包括第一起爆滤波器电路。半导体芯片具有存储冲突检测电路，该电路包括第二盛盛滤波器电路。该半导体芯片具有能够执行与负载碰撞检测电路和存储冲突检测电路耦合的有序并行线程的一个或多个处理单元。负载碰撞检测电路和存储碰撞检测电路是检测较年轻的存储器用于所述线程和较小负载的负载操作，用于所述线程的存储操作。

5.

发明授权
Apparatus and method for a hybrid latency-throughput processor 有权
Title translation: 用于混合延迟吞吐量处理器的装置和方法

公开(公告)号：US09417873B2

公开(公告)日：2016-08-16

申请号：US13730055

申请日：2012-12-28

Applicant: INTEL CORPORATION

Inventor： Oren Ben-Kiki , Yuval Yosef , Ilan Pardo , Dror Markovich

IPC: G06F15/78 , G06F9/30 , G06F9/38 , G06F9/46

CPC classification number: G06F9/3851 , G06F9/30079 , G06F9/3836 , G06F9/3855 , G06F9/3861 , G06F9/3877 , G06F9/46 , G06F15/7867 , G06F15/7892 , G06F15/80

Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.

Abstract translation: 描述了用于在处理设备上执行延迟优化的执行逻辑和吞吐量优化的执行逻辑的装置和方法。例如，根据一个实施例的处理器包括：执行第一类型的程序代码的等待时间优化的执行逻辑; 吞吐量优化执行逻辑以执行第二类型的程序代码，其中所述第一类型的程序代码和所述第二类型的程序代码被设计用于相同的指令集架构; 识别过程中的第一类型的程序代码和第二类型的程序代码的逻辑，并且将用于执行的第一类型的程序代码分配在延迟优化的执行逻辑和第二类型的程序代码上以便在吞吐量 - 优化的执行逻辑。

6.

发明授权
Apparatus and method for a hybrid latency-throughput processor 有权

公开(公告)号：US10664284B2

公开(公告)日：2020-05-26

申请号：US16289075

申请日：2019-02-28

Applicant: Intel Corporation

Inventor： Oren Ben-Kiki , Yuval Yosef , Ilan Pardo , Dror Markovich

IPC: G06F9/38 , G06F9/30 , G06F9/46 , G06F15/80 , G06F15/78

Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.

7.

发明授权
Memory address collision detection of ordered parallel threads with bloom filters 有权

公开(公告)号：US10101999B2

公开(公告)日：2018-10-16

申请号：US15403101

申请日：2017-01-10

Applicant: INTEL CORPORATION

Inventor： Enrique De Lucas , Pedro Marcuello , Oren Ben-Kiki , Ilan Pardo , Yuval Yosef

IPC: G06F9/30 , G06F9/38

Abstract: A semiconductor chip is described having a load collision detection circuit comprising a first bloom filter circuit. The semiconductor chip has a store collision detection circuit comprising a second bloom filter circuit. The semiconductor chip has one or more processing units capable of executing ordered parallel threads coupled to the load collision detection circuit and the store collision detection circuit. The load collision detection circuit and the store collision detection circuit is to detect younger stores for load operations of said threads and younger loads for store operations of said threads.

8.

发明授权
Instruction set architecture-based inter-sequencer communications with a heterogeneous resource 有权
Title translation: 与异构资源的指令集基于架构的间隔器通信

公开(公告)号：US09588771B2

公开(公告)日：2017-03-07

申请号：US13791298

申请日：2013-03-08

Applicant: Intel Corporation

Inventor： Hong Wang , John Shen , Hong Jiang , Richard Hankins , Per Hammarlund , Dion Rodgers , Gautham Chinya , Baiju Patel , Shiv Kaushik , Bryant Bigbee , Gad Sheaffer , Yoav Talgam , Yuval Yosef , James P. Held

IPC: G06F9/38 , G06F9/30 , G06T1/20

CPC classification number: G06F9/30145 , G06F9/30 , G06F9/30181 , G06F9/3877 , G06F9/3879 , G06T1/20

Abstract: In one embodiment, the present invention includes a method for directly communicating between an accelerator and an instruction sequencer coupled thereto, where the accelerator is a heterogeneous resource with respect to the instruction sequencer. An interface may be used to provide the communication between these resources. Via such a communication mechanism a user-level application may directly communicate with the accelerator without operating system support. Further, the instruction sequencer and the accelerator may perform operations in parallel. Other embodiments are described and claimed.

Abstract translation: 在一个实施例中，本发明包括一种用于在加速器和与其耦合的指令定序器之间直接通信的方法，其中加速器相对于指令定序器是异质资源。可以使用接口来提供这些资源之间的通信。通过这种通信机制，用户级应用可以直接与加速器进行通信，而无需操作系统支持。此外，指令定序器和加速器可以并行地执行操作。描述和要求保护其他实施例。

9.

发明授权
Apparatus and method for low-latency invocation of accelerators 有权
Title translation: 用于低延迟调用加速器的装置和方法

公开(公告)号：US09361116B2

公开(公告)日：2016-06-07

申请号：US13729915

申请日：2012-12-28

Applicant: Intel Corporation

Inventor： Oren Ben-Kiki , Ilan Pardo , Robert Valentine , Eliezer Weissmann , Dror Markovich , Yuval Yosef

IPC: G06F9/30 , G06F9/38 , G06F11/07 , G06F9/54

CPC classification number: G06F9/3802 , G06F9/3004 , G06F9/30043 , G06F9/30076 , G06F9/30101 , G06F9/30145 , G06F9/3016 , G06F9/384 , G06F9/3877 , G06F9/3879 , G06F9/3881 , G06F9/54 , G06F11/0721 , G06F11/0724 , G06F11/0772 , G06F12/0875 , G06F2212/452

Abstract: An apparatus and method are described for providing low-latency invocation of accelerators. For example, a processor according to one embodiment comprises: a command register for storing command data identifying a command to be executed; a result register to store a result of the command or data indicating a reason why the command could not be executed; execution logic to execute a plurality of instructions including an accelerator invocation instruction to invoke one or more accelerator commands; and one or more accelerators to read the command data from the command register and responsively attempt to execute the command identified by the command data.

Abstract translation: 描述了一种用于提供加速器的低延迟调用的装置和方法。例如，根据一个实施例的处理器包括：命令寄存器，用于存储标识要执行的命令的命令数据; 用于存储命令结果的结果寄存器或表示无法执行命令的原因的数据; 执行逻辑以执行包括调用一个或多个加速器命令的加速器调用指令的多个指令; 以及一个或多个加速器，用于从命令寄存器读取命令数据，并且响应地尝试执行由命令数据识别的命令。

10.

发明授权
Apparatus and method for low-latency invocation of accelerators 有权

公开(公告)号：US10089113B2

公开(公告)日：2018-10-02

申请号：US15282082

申请日：2016-09-30

Applicant: Intel Corporation

Inventor： Oren Ben-Kiki , Ilan Pardo , Robert Valentine , Eliezer Weissmann , Dror Markovich , Yuval Yosef

IPC: G06F9/38 , G06F9/30 , G06F11/07 , G06F9/54 , G06F12/0875

Abstract: An apparatus and method are described for providing low-latency invocation of accelerators. For example, a system according to one embodiment comprises: a processor includes a plurality of simultaneous multithreading (SMT) cores, at least one shared cache circuit to be shared among two or more of the SMT cores; and at least one of the SMT cores including at least one level 2 (L2) cache circuit to store both instructions and data and communicatively coupled to the instruction cache circuit and the data cache circuit, a communication interconnect circuit including a peripheral component interconnect express (PCIe) circuit to communicatively couple one or more of the SMT cores to an accelerator device and a memory access circuit to identify an accelerator context save/restore region in a memory responsive to a context save/restore value, the accelerator context save/restore region to share an accelerator context state.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification