Patent search ap:"Doron Orenstein" Page 1

1.

Granted invention patent
Unpacking packed data in multiple lanes (status: Active)

Publication (announcement) number: US09086872B2

Publication (announcement) date: 2015-07-21

Application number: US12494667

Application date: 2009-06-30

Applicant: Asaf Hargil, Doron Orenstein

Inventor: Asaf Hargil, Doron Orenstein

IPC: G06F15/00, G06F15/76, G06F9/30

CPC classification number: G06F9/30145, G06F9/30032, G06F9/30036

Abstract: Receiving an instruction indicating first and second operands. Each of the operands having packed data elements that correspond in respective positions. A first subset of the data elements of the first operand and a first subset of the data elements of the second operand each corresponding to a first lane. A second subset of the data elements of the first operand and a second subset of the data elements of the second operand each corresponding to a second lane. Storing result, in response to instruction, including: (1) in first lane, only lowest order data elements from first subset of first operand interleaved with corresponding lowest order data elements from first subset of second operand; and (2) in second lane, only highest order data elements from second subset of first operand interleaved with corresponding highest order data elements from second subset of second operand.
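
The lane behaviour in this abstract can be illustrated with a minimal Python sketch. It assumes two lanes of equal, even size, operands given as lists with index 0 as the lowest-order element, and the first operand's element placed before the second's in each interleaved pair. The function name `unpack_mixed_lanes` and these conventions are illustrative assumptions, not taken from the patent.

```python
def unpack_mixed_lanes(a, b, lane_size):
    """Interleave low halves in lane 0 and high halves in lane 1 (illustrative)."""
    # Assumption: exactly two lanes, each with an even number of elements.
    assert len(a) == len(b) == 2 * lane_size and lane_size % 2 == 0
    half = lane_size // 2
    result = []
    # Lane 0: only the lowest-order elements of each operand's first-lane subset.
    for i in range(half):
        result += [a[i], b[i]]
    # Lane 1: only the highest-order elements of each operand's second-lane subset.
    for i in range(lane_size + half, 2 * lane_size):
        result += [a[i], b[i]]
    return result
```

With `a = [0, ..., 7]`, `b = [10, ..., 17]` and `lane_size=4`, the call returns `[0, 10, 1, 11, 6, 16, 7, 17]`: the low pair of lane 0 followed by the high pair of lane 1.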

2.

Invention application
COMPRESSED INSTRUCTION FORMAT (status: Under examination, published)

Publication (announcement) number: US20140310505A1

Publication (announcement) date: 2014-10-16

Application number: US14307468

Application date: 2014-06-17

Applicant: Robert Valentine, Doron Orenstein, Brett L. Toll

Inventor: Robert Valentine, Doron Orenstein, Brett L. Toll

IPC: G06F9/30

CPC classification number: G06F9/30, G06F9/30145, G06F9/30149, G06F9/3017, G06F9/30174, G06F9/30178, G06F9/30185, G06F9/3816, G06F9/382

Abstract: A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described, in which legacy, present, and future instruction set extensions are supported, and increased functionality is provided, without expanding the code size and, in some cases, reducing the code size.

3.

Granted invention patent
Vector shuffle instructions operating on multiple lanes each having a plurality of data elements using a common set of per-lane control bits (status: Active)

Publication (announcement) number: US08078836B2

Publication (announcement) date: 2011-12-13

Application number: US11967211

Application date: 2007-12-30

Applicant: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein

Inventor: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein

IPC: G06F15/16

CPC classification number: G06F9/30032, G06F9/30036, G06F9/3885, G06F9/3887

Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
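
As a rough illustration of the single-source embodiment, the sketch below applies one shared control word to every lane. It assumes a power-of-two lane size and a control word that packs one selector per destination position, lowest position in the lowest bits. The function name `in_lane_shuffle` and this bit layout are assumptions made for the example, not the encoding claimed in the patent.

```python
def in_lane_shuffle(src, control, lane_size=4):
    """Shuffle elements within each lane using one shared control word (illustrative)."""
    # Assumption: lane_size is a power of two and len(src) is a multiple of it.
    assert lane_size & (lane_size - 1) == 0 and len(src) % lane_size == 0
    bits = (lane_size - 1).bit_length()  # selector width per destination position
    dst = []
    for base in range(0, len(src), lane_size):       # visit each lane in turn
        for pos in range(lane_size):                 # each destination position
            sel = (control >> (bits * pos)) & (lane_size - 1)
            dst.append(src[base + sel])              # the source never leaves its lane
    return dst
```

For example, `in_lane_shuffle(list(range(8)), 0b00011011)` reverses the elements inside each four-element lane and returns `[3, 2, 1, 0, 7, 6, 5, 4]`, because the same selectors are reused for both lanes.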

4.

Invention application
METHOD AND APPARATUS FOR AFFINITY-GUIDED SPECULATIVE HELPER THREADS IN CHIP MULTIPROCESSORS (status: Active)

Publication (announcement) number: US20110035555A1

Publication (announcement) date: 2011-02-10

Application number: US12909774

Application date: 2010-10-21

Applicant: Hong Wang, Perry H. Wang, Jeffery A. Brown, Per Hammarlund, George Z. Chrysos, Doron Orenstein, Steve Shih-wei Liao, John P. Shen

Inventor: Hong Wang, Perry H. Wang, Jeffery A. Brown, Per Hammarlund, George Z. Chrysos, Doron Orenstein, Steve Shih-wei Liao, John P. Shen

IPC: G06F12/08, G06F9/44, G06F12/00, G06F9/312

CPC classification number: G06F9/3842, G06F9/383, G06F9/3851, G06F12/0862

Abstract: Apparatus, system and methods are provided for performing speculative data prefetching in a chip multiprocessor (CMP). Data is prefetched by a helper thread that runs on one core of the CMP while a main program runs concurrently on another core of the CMP. Data prefetched by the helper thread is provided to the helper core. For one embodiment, the data prefetched by the helper thread is pushed to the main core. It may or may not be provided to the helper core as well. A push of prefetched data to the main core may occur during a broadcast of the data to all cores of an affinity group. For at least one other embodiment, the data prefetched by a helper thread is provided, upon request from the main core, to the main core from the helper core's local cache.

5.

Granted invention patent
Method and apparatus for reducing clock frequency during low workload periods (status: Active)

Publication (announcement) number: US07721129B2

Publication (announcement) date: 2010-05-18

Application number: US11330647

Application date: 2006-01-12

Applicant: Itamar S. Kazachinsky, Doron Orenstein

Inventor: Itamar S. Kazachinsky, Doron Orenstein

IPC: G06F1/32

CPC classification number: G06F1/324, G06F1/3203, Y02D10/126

Abstract: A clock frequency control unit for an integrated circuit (IC) includes a clock generator, a finite state machine (FSM), and a gating circuit (GC). The FSM has at least first and second states corresponding to non-low workload and low workload states, respectively. In the first state, the GC provides a clock signal to functional units of the IC with the same frequency as the clock generator output. In the second state, the GC reduces the frequency of the clock signal. In one embodiment, the GC masks out selected cycles of the clock generator output to reduce the clock signal frequency. The FSM monitors the operation of the IC to transition from the first state to the second state when selected “low workload” conditions are detected (e.g., long latency cache miss). Similarly, the FSM transitions from the second state to the first state when selected “non-low workload” conditions are detected.
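
The two-state FSM and cycle-masking gate can be modelled in a few lines of Python. This is a behavioural sketch only: the event names `long_latency_miss` and `miss_data_returned`, the class name `ClockControl`, and the policy of keeping one cycle in every `keep_every` are assumptions chosen for illustration; the abstract names only a long-latency cache miss as an example low-workload condition.

```python
from enum import Enum


class Workload(Enum):
    NON_LOW = 1   # first state: clock passes through unchanged
    LOW = 2       # second state: selected cycles are masked out


class ClockControl:
    """Behavioural model of the FSM plus gating circuit (illustrative)."""

    def __init__(self, keep_every=4):
        self.state = Workload.NON_LOW
        self.keep_every = keep_every  # assumed masking ratio in the LOW state
        self._cycle = 0

    def observe(self, event):
        # Hypothetical event names; the source gives only one example condition.
        if self.state is Workload.NON_LOW and event == "long_latency_miss":
            self.state = Workload.LOW
        elif self.state is Workload.LOW and event == "miss_data_returned":
            self.state = Workload.NON_LOW

    def gate(self):
        """Return 1 if this generator cycle reaches the functional units, else 0."""
        self._cycle += 1
        if self.state is Workload.NON_LOW:
            return 1
        return 1 if self._cycle % self.keep_every == 0 else 0
```

Calling `gate()` once per generator cycle yields a full-rate clock in the non-low state and, with `keep_every=4`, a quarter-rate clock while the FSM is in the low workload state.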

6.

Invention application
Method and apparatus for varying energy per instruction according to the amount of available parallelism (status: Active)

Publication (announcement) number: US20060095807A1

Publication (announcement) date: 2006-05-04

Application number: US10952627

Application date: 2004-09-28

Applicant: Edward Grochowski, John Shen, Hong Wang, Doron Orenstein, Gad Sheaffer, Ronny Ronen, Murali Annavaram

Inventor: Edward Grochowski, John Shen, Hong Wang, Doron Orenstein, Gad Sheaffer, Ronny Ronen, Murali Annavaram

IPC: G06F1/30

CPC classification number: G06F9/3851, G06F1/206, G06F1/3203, G06F1/329, G06F9/3885, G06F9/3891, G06F9/3897, Y02D10/16, Y02D10/24, Y02D50/20

Abstract: A method and apparatus for changing the configuration of a multi-core processor is disclosed. In one embodiment, a throttle module (or throttle logic) may determine the amount of parallelism present in the currently-executing program, and change the execution of the threads of that program on the various cores. If the amount of parallelism is high, then the processor may be configured to run a larger number of threads on cores configured to consume less power. If the amount of parallelism is low, then the processor may be configured to run a smaller number of threads on cores configured for greater scalar performance.
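
The throttle decision described here reduces to a simple policy, sketched below in Python. The function name `choose_configuration`, the threshold, and the core counts are all hypothetical values picked for the example; the abstract does not specify how parallelism is measured or where the cut-off lies.

```python
def choose_configuration(parallel_threads, threshold=4, small_cores=8, big_cores=2):
    """Pick a core configuration from the measured parallelism (illustrative)."""
    if parallel_threads >= threshold:
        # High parallelism: spread more threads over cores tuned for low power.
        return {"core_type": "low_power",
                "threads": min(parallel_threads, small_cores)}
    # Low parallelism: run fewer threads on cores tuned for scalar performance.
    return {"core_type": "high_scalar_performance",
            "threads": min(parallel_threads, big_cores)}
```

Under these assumed defaults, six runnable threads map to six low-power cores, while a single-threaded phase runs on one high-performance core.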

7.

Granted invention patent
Memory system for multiple data types (status: Expired)

Publication (announcement) number: US06944720B2

Publication (announcement) date: 2005-09-13

Application number: US10402827

Application date: 2003-03-27

Applicant: Zeev Sperber, Guy Peled, Doron Orenstein, Ehud Cohen, Gabi Malka

Inventor: Zeev Sperber, Guy Peled, Doron Orenstein, Ehud Cohen, Gabi Malka

IPC: G06F12/08, G06F12/10

CPC classification number: G06F12/0875, G06F12/1054, G06F2212/401

Abstract: A memory system is provided for storing multiple data types. The memory system includes a main memory, a local cache, and a translation unit. The local cache has multiple entries, each of which includes a data field to store data and a status field to indicate a storage state for the stored data. The translation unit includes a translation lookaside buffer (TLB) and a status-cache (STC). The TLB stores address translations for data in the main memory, and the STC stores storage states for data indicated by the address translations.
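
One way to picture the translation unit is as two lookup tables consulted together, as in the Python sketch below. The class and method names, the 4096-byte page size, the choice to index the STC by physical page, and the state label `"compressed"` are assumptions for illustration; the abstract says only that the STC holds storage states for data indicated by the address translations.

```python
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    data: bytes    # data field of a local cache entry
    status: str    # status field: storage state of the stored data


@dataclass
class TranslationUnit:
    page_size: int = 4096                      # assumed page size
    tlb: dict = field(default_factory=dict)    # virtual page -> physical page
    stc: dict = field(default_factory=dict)    # physical page -> storage state

    def add_mapping(self, vpage, ppage, state):
        self.tlb[vpage] = ppage
        self.stc[ppage] = state

    def lookup(self, vaddr):
        """Return (physical address, storage state) for a virtual address."""
        vpage, offset = divmod(vaddr, self.page_size)
        ppage = self.tlb[vpage]                # raises KeyError on a TLB miss
        return ppage * self.page_size + offset, self.stc[ppage]
```

A lookup thus yields both the translated address and the storage state, which could then be copied into the `status` field of the `CacheEntry` filled from that address.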

8.

Granted invention patent
Method and apparatus for resuming memory operations from a low latency wake-up low power state (status: Active)

Publication (announcement) number: US06886105B2

Publication (announcement) date: 2005-04-26

Application number: US09504003

Application date: 2000-02-14

Applicant: Opher Kahn, Doron Orenstein

Inventor: Opher Kahn, Doron Orenstein

IPC: G06F1/32, G11C7/22, G11C11/406

CPC classification number: G06F1/3275, G06F1/3203, G11C7/22, G11C11/406, G11C2207/2227, G11C2211/4067, Y02D10/14, Y02D50/20

Abstract: A method and apparatus for resuming operations from a low latency wake-up low power state. One embodiment provides a system including a processor, an operating system, and a memory subsystem that requires initialization commands to exit a memory low power state. Control logic detects exit from an operating system low latency low power state and responsively generates a plurality of initialization commands to remove the memory subsystem from the memory low power state prior to deasserting a stop clock signal and allowing execution to resume.

9.

Granted invention patent
Method and apparatus for providing memory access in a processor pipeline (status: Expired)

Publication (announcement) number: US5787026A

Publication (announcement) date: 1998-07-28

Application number: US575780

Application date: 1995-12-20

Applicant: Doron Orenstein, Millind Mittal, Ofri Wechsler

Inventor: Doron Orenstein, Millind Mittal, Ofri Wechsler

IPC: G06F9/38, G06F9/302

CPC classification number: G06F9/3826, G06F9/3867

Abstract: The invention provides a method and apparatus for providing operand reads in a processor pipeline. According to one aspect of the invention, a method is described for executing an instruction in a computer pipeline that requires different operands be read from the same register file in different stages of the computer pipeline. According to another aspect of the invention, a method is described for executing an instruction in a processor pipeline. According to this method, at least a first operand is read from a register file in a first stage of the processor pipeline. If execution of the instruction causes the processor to place the first operand in a storage area other than the register file, then the first operand is written to that storage area in a subsequent stage of the processor pipeline. Otherwise, one or more ALU operations are performed on the first operand and at least a second operand in a different subsequent stage of the processor pipeline.

10.

Granted invention patent
Compressed instruction format (status: Active)

Publication (announcement) number: US09569208B2

Publication (announcement) date: 2017-02-14

Application number: US14307468

Application date: 2014-06-17

Applicant: Robert Valentine, Doron Orenstein, Brett L. Toll

Inventor: Robert Valentine, Doron Orenstein, Brett L. Toll

IPC: G06F9/30, G06F9/38

CPC classification number: G06F9/30, G06F9/30145, G06F9/30149, G06F9/3017, G06F9/30174, G06F9/30178, G06F9/30185, G06F9/3816, G06F9/382

Abstract: A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described, in which legacy, present, and future instruction set extensions are supported, and increased functionality is provided, without expanding the code size and, in some cases, reducing the code size.
