Patent search ap:("Sudarshan Kadambi" OR "Vijay Balakrishnan" OR "Wayne I. Yamamoto") AND inv:"Sudarshan Kadambi" Page 1

1.

Granted invention patent
Method and apparatus for reducing the effects of hot spots in cache memories [Status: in force]

Publication number: US06948032B2

Publication date: 2005-09-20

Application number: US10354327

Application date: 2003-01-29

Applicant: Sudarshan Kadambi, Vijay Balakrishnan, Wayne I. Yamamoto

Inventor: Sudarshan Kadambi, Vijay Balakrishnan, Wayne I. Yamamoto

IPC: G06F12/00, G06F12/08

CPC classification number: G06F12/0897

Abstract: One embodiment of the present invention provides a system that uses a hot spot cache to alleviate the performance problems caused by hot spots in cache memories, wherein the hot spot cache stores lines that are evicted from hot spots in the cache. Upon receiving a memory operation at the cache, the system performs a lookup for the memory operation in both the cache and the hot spot cache in parallel. If the memory operation is a read operation that causes a miss in the cache and a hit in the hot spot cache, the system reads a data line for the read operation from the hot spot cache, writes the data line to the cache, performs the read operation on the data line in the cache, and then evicts the data line from the hot spot cache.
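The read path described in this abstract can be sketched in a few lines of Python. This is only an illustrative model, not the patented design: the direct-mapped organization, the FIFO replacement in the hot spot cache, the `hot_sets` parameter, and every class and method name are assumptions made for the example. The abstract's parallel lookup is modeled as two independent checks.

```python
from collections import OrderedDict


class HotSpotCachedMemory:
    """Toy direct-mapped cache backed by a small hot spot cache (all names hypothetical)."""

    def __init__(self, num_sets=4, hot_sets=(0,), hsc_capacity=2, backing=None):
        self.num_sets = num_sets
        self.hot_sets = set(hot_sets)        # assumed to be identified elsewhere
        self.cache = {}                      # set index -> (address, data)
        self.hot_spot_cache = OrderedDict()  # address -> data, FIFO order (assumption)
        self.hsc_capacity = hsc_capacity
        self.backing = backing if backing is not None else {}

    def _install(self, addr, data):
        index = addr % self.num_sets
        victim = self.cache.get(index)
        if victim is not None and victim[0] != addr and index in self.hot_sets:
            # A line evicted from a hot spot is kept in the hot spot cache.
            if len(self.hot_spot_cache) >= self.hsc_capacity:
                self.hot_spot_cache.popitem(last=False)
            self.hot_spot_cache[victim[0]] = victim[1]
        self.cache[index] = (addr, data)

    def read(self, addr):
        index = addr % self.num_sets
        entry = self.cache.get(index)
        # Both structures are looked up for every operation.
        cache_hit = entry is not None and entry[0] == addr
        hsc_hit = addr in self.hot_spot_cache
        if cache_hit:
            return entry[1], "cache hit"
        if hsc_hit:
            # Simplification: the line is removed from the hot spot cache first
            # so that the slot is free for the victim of the install below.
            data = self.hot_spot_cache.pop(addr)
            self._install(addr, data)
            return self.cache[index][1], "hot spot cache hit"
        data = self.backing.get(addr, 0)
        self._install(addr, data)
        return data, "miss"
```

In this model, two addresses that map to the same hot set can alternate without going back to the backing store, because each one is recovered from the hot spot cache when the other displaces it.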

2.

Granted invention patent
Method and apparatus for predicting hot spots in cache memories [Status: in force]

Publication number: US06976125B2

Publication date: 2005-12-13

Application number: US10354329

Application date: 2003-01-29

Applicant: Sudarshan Kadambi, Vijay Balakrishnan, Wayne I. Yamamoto

Inventor: Sudarshan Kadambi, Vijay Balakrishnan, Wayne I. Yamamoto

IPC: G06F12/08, G06F12/12, G06F12/02

CPC classification number: G06F12/12, G06F12/0897

Abstract: One embodiment of the present invention provides a system for predicting hot spots in a cache memory. Upon receiving a memory operation at the cache, the system determines a target location within the cache for the memory operation. Once the target location is determined, the system increments a counter associated with the target location. If the counter reaches a pre-determined threshold value, the system generates a signal indicating that the target location is a hot spot in the cache memory.
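A minimal Python sketch of the counter scheme in this abstract follows. It assumes that a "target location" is a cache set selected by address modulo the number of sets; the class and method names are hypothetical. The abstract does not say whether counters are ever reset or decayed, so this sketch leaves them saturating at whatever value they reach.

```python
class HotSpotPredictor:
    """One counter per cache set; signals a hot spot at a threshold (names hypothetical)."""

    def __init__(self, num_sets, threshold):
        self.num_sets = num_sets
        self.threshold = threshold
        self.counters = [0] * num_sets

    def access(self, address):
        # Determine the target location for the memory operation
        # (assumption: set index = address modulo number of sets).
        index = address % self.num_sets
        # Increment the counter associated with that location.
        self.counters[index] += 1
        # The boolean plays the role of the hot spot signal.
        return index, self.counters[index] >= self.threshold
```

With `num_sets=8` and `threshold=3`, three operations on addresses 0, 8, and 16 all land in set 0, and the third one raises the signal.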

3.

Granted invention patent
Method and apparatus for reducing register file access times in pipelined processors [Status: in force]

Publication number: US06934830B2

Publication date: 2005-08-23

Application number: US10259721

Application date: 2002-09-26

Applicant: Sudarshan Kadambi, Adam R. Talcott, Wayne I. Yamamoto

Inventor: Sudarshan Kadambi, Adam R. Talcott, Wayne I. Yamamoto

IPC: G06F9/30, G06F9/38

CPC classification number: G06F9/30138, G06F9/3824, G06F9/3857

Abstract: One embodiment of the present invention provides a system that reduces the time required to access registers from a register file within a processor. During operation, the system receives an instruction to be executed, wherein the instruction identifies at least one operand to be accessed from the register file. Next, the system looks up the operands in a register pane, wherein the register pane is smaller and faster than the register file and contains copies of a subset of registers from the register file. If the lookup is successful, the system retrieves the operands from the register pane to execute the instruction. Otherwise, if the lookup is not successful, the system retrieves the operands from the register file, and stores the operands into the register pane. This triggers the system to reissue the instruction to be executed again, so that the re-issued instruction retrieves the operands from the register pane.
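The lookup, fill, and reissue sequence in this abstract can be modeled roughly as below. This is a sketch under stated assumptions: the least-recently-filled replacement policy, the `capacity` parameter, and all names are invented for the example, and register writes are not modeled, so copies in the pane never go stale here.

```python
from collections import OrderedDict
import operator


class RegisterPane:
    """Small, fast copy of a subset of the register file (names hypothetical)."""

    def __init__(self, register_file, capacity=4):
        self.register_file = register_file   # list of register values
        self.capacity = capacity
        self.pane = OrderedDict()            # register number -> cached value
        self.reissues = 0

    def _fill(self, regs):
        for r in regs:
            if r in self.pane:
                self.pane.move_to_end(r)     # keep needed operands resident
            else:
                if len(self.pane) >= self.capacity:
                    self.pane.popitem(last=False)   # assumed replacement policy
                self.pane[r] = self.register_file[r]

    def issue(self, op, src_regs):
        if len(set(src_regs)) > self.capacity:
            raise ValueError("more source operands than pane entries")
        if all(r in self.pane for r in src_regs):
            # Lookup succeeded: operands come from the pane.
            return op(*(self.pane[r] for r in src_regs))
        # Lookup failed: read the register file, store into the pane,
        # then reissue the instruction so that it hits in the pane.
        self._fill(src_regs)
        self.reissues += 1
        return self.issue(op, src_regs)
```

For example, `RegisterPane(list(range(0, 80, 10)), capacity=2).issue(operator.add, (1, 2))` misses once, fills registers 1 and 2, and the reissued instruction then reads both operands from the pane.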

4.

Invention patent application
Method and apparatus for avoiding cache pollution due to speculative memory load operations in a microprocessor [Status: in force]

Publication number: US20050055533A1

Publication date: 2005-03-10

Application number: US10658663

Application date: 2003-09-08

Applicant: Sudarshan Kadambi, Vijay Balakrishnan

Inventor: Sudarshan Kadambi, Vijay Balakrishnan

IPC: G06F9/38, G06F15/00, G06F15/76

CPC classification number: G06F9/3842, G06F9/3834, G06F9/3838, G06F9/3861

Abstract: A cache pollution avoidance unit includes a dynamic memory dependency table for storing a dependency state condition between a first load instruction and a sequentially later second load instruction, which may depend on the completion of execution of the first load instruction for operand data. The cache pollution avoidance unit logically ANDs the dependency state condition stored in the dynamic memory dependency table with a cache memory “miss” state condition returned by the cache pollution avoidance unit for operand data produced by the first load instruction and required by the second load instruction. If the logical ANDing is true, memory access to the second load instruction is squashed and the execution of the second load instruction is re-scheduled.
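The squash decision in this abstract reduces to a logical AND of two conditions, which the following Python sketch makes explicit. The table layout (a mapping from the later load to the earlier load it depends on), the string load identifiers, and all names are assumptions for illustration only.

```python
class CachePollutionAvoidanceUnit:
    """Toy model of the dependency table and squash decision (names hypothetical)."""

    def __init__(self):
        # later load id -> earlier load id whose result it needs as operand data
        self.dependency_table = {}

    def record_dependency(self, later_load, earlier_load):
        self.dependency_table[later_load] = earlier_load

    def should_squash(self, load_id, missed_loads):
        producer = self.dependency_table.get(load_id)
        depends_on_earlier_load = producer is not None
        producer_missed_in_cache = producer in missed_loads
        # Squash and re-schedule only when both conditions hold.
        return depends_on_earlier_load and producer_missed_in_cache
```

A load that depends on an earlier load which missed in the cache would otherwise access memory with an operand that is not yet valid, which is the speculative access the title describes as cache pollution.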

5.

Granted invention patent
Method and apparatus for avoiding cache pollution due to speculative memory load operations in a microprocessor [Status: in force]

Publication number: US07010648B2

Publication date: 2006-03-07

Application number: US10658663

Application date: 2003-09-08

Applicant: Sudarshan Kadambi, Vijay Balakrishnan

Inventor: Sudarshan Kadambi, Vijay Balakrishnan

IPC: G06F12/00

CPC classification number: G06F9/3842, G06F9/3834, G06F9/3838, G06F9/3861

Abstract: A cache pollution avoidance unit includes a dynamic memory dependency table for storing a dependency state condition between a first load instruction and a sequentially later second load instruction, which may depend on the completion of execution of the first load instruction for operand data. The cache pollution avoidance unit logically ANDs the dependency state condition stored in the dynamic memory dependency table with a cache memory “miss” state condition returned by the cache pollution avoidance unit for operand data produced by the first load instruction and required by the second load instruction. If the logical ANDing is true, memory access to the second load instruction is squashed and the execution of the second load instruction is re-scheduled.

6.

Granted invention patent
Converting victim writeback to a fill [Status: in force]

Publication number: US08364907B2

Publication date: 2013-01-29

Application number: US13359547

Application date: 2012-01-27

Applicant: Ramesh Gunna, Sudarshan Kadambi

Inventor: Ramesh Gunna, Sudarshan Kadambi

IPC: G06F12/00

CPC classification number: G06F12/0833, G06F12/0804, G06F12/0862, G06F12/1045, G06F12/126, G06F2212/6028

Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
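The second embodiment in this abstract, converting a victim block writeback into a fill when it hits pending stores, might look roughly like the Python sketch below. The abstract only states the detection and the conversion. How the fill data is formed is an assumption here (the pending store bytes are merged over the victim block), as are the `PendingStore` record, the function name, and the returned tuple.

```python
from dataclasses import dataclass


@dataclass
class PendingStore:
    """A store waiting in the memory request buffer (hypothetical layout)."""
    block_addr: int   # address of the cache block the store targets
    offset: int       # byte offset within the block
    data: bytes


def handle_victim_writeback(block_addr, victim_data, pending_stores):
    """Return ("writeback", data) or ("fill", merged_data) for an evicted block."""
    hits = [s for s in pending_stores if s.block_addr == block_addr]
    if not hits:
        # No pending store touches this block: the writeback proceeds to memory.
        return "writeback", bytes(victim_data)
    merged = bytearray(victim_data)
    for s in hits:
        if s.offset + len(s.data) > len(merged):
            raise ValueError("store crosses the block boundary")
        # Assumption: store bytes overwrite the corresponding victim bytes.
        merged[s.offset:s.offset + len(s.data)] = s.data
    # The writeback hit a pending store, so it is converted to a fill.
    return "fill", bytes(merged)
```

The design point illustrated is that a block which is about to be needed again by a buffered store does not have to make a round trip to memory.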

7.

Granted invention patent
Converting victim writeback to a fill [Status: in force]

Publication number: US08131946B2

Publication date: 2012-03-06

Application number: US12908535

Application date: 2010-10-20

Applicant: Ramesh Gunna, Sudarshan Kadambi

Inventor: Ramesh Gunna, Sudarshan Kadambi

IPC: G06F12/00

CPC classification number: G06F12/0833, G06F12/0804, G06F12/0862, G06F12/1045, G06F12/126, G06F2212/6028

Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.

8.

Granted invention patent
Multi-stride prefetcher with a recurring prefetch table [Status: in force]

Publication number: US07487296B1

Publication date: 2009-02-03

Application number: US11062266

Application date: 2005-02-17

Applicant: Sorin Iacobovici, Sudarshan Kadambi, Yuan C. Chou

Inventor: Sorin Iacobovici, Sudarshan Kadambi, Yuan C. Chou

IPC: G06F12/06

CPC classification number: G06F12/0862, G06F9/3455, G06F9/383, G06F2212/6026

Abstract: A multi-stride prefetcher includes a recurring prefetch table that in turn includes a stream table and an index table. The stream table includes a valid field and a tag field. The stream table also includes a thread number field to help support multi-threaded processor cores. The tag field stores a tag from an address associated with a cache miss. The index table includes fields for storing information characterizing a state machine. The fields include a learning bit. The multi-stride prefetcher prefetches data into a cache for a plurality of streams of cache misses, each stream having a plurality of strides.
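To make the idea of a stream with several recurring strides concrete, here is a simplified Python sketch. It is not the patented state machine, which the abstract does not describe: the stream table and index table are collapsed into one `StreamEntry`, the pattern length is fixed in advance, the tag is taken as the address bits above a 4 KiB region, and all names are hypothetical. Only the valid, tag, thread number, and learning fields come from the abstract.

```python
from dataclasses import dataclass, field

TAG_SHIFT = 12  # assumption: tag = address bits above a 4 KiB region


@dataclass
class StreamEntry:
    valid: bool
    tag: int
    thread: int
    last_addr: int
    learning: bool = True                         # learning bit
    strides: list = field(default_factory=list)   # strides observed while learning
    pattern: list = field(default_factory=list)   # learned recurring strides
    pos: int = 0                                  # next position in the pattern


class MultiStridePrefetcher:
    def __init__(self, pattern_len=2):
        self.pattern_len = pattern_len
        self.streams = {}   # (thread, tag) -> StreamEntry

    def on_miss(self, thread, addr):
        """Record a cache miss; return an address to prefetch, or None."""
        key = (thread, addr >> TAG_SHIFT)
        e = self.streams.get(key)
        if e is None:
            self.streams[key] = StreamEntry(True, key[1], thread, addr)
            return None
        stride = addr - e.last_addr
        e.last_addr = addr
        n = self.pattern_len
        if e.learning:
            e.strides.append(stride)
            # Leave learning once the last n strides repeat the n before them.
            if len(e.strides) < 2 * n or e.strides[-n:] != e.strides[-2 * n:-n]:
                return None
            e.pattern, e.learning, e.pos = e.strides[-n:], False, 0
        elif stride != e.pattern[e.pos]:
            # The stream broke the pattern: go back to learning.
            e.learning, e.strides = True, [stride]
            return None
        else:
            e.pos = (e.pos + 1) % n
        return addr + e.pattern[e.pos]
```

With misses at 0x1000, 0x1008, 0x1020, 0x1028, and 0x1040 from the same thread, the strides 8, 24, 8, 24 are recognized as a two-stride pattern, and the fifth miss yields a prefetch for 0x1048.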

9.

Granted invention patent
Processor that eliminates mis-steering instruction fetch resulting from incorrect resolution of mis-speculated branch instructions [Status: in force]

Publication number: US07076640B2

Publication date: 2006-07-11

Application number: US10095397

Application date: 2002-03-11

Applicant: Sudarshan Kadambi

Inventor: Sudarshan Kadambi

IPC: G06F9/34

CPC classification number: G06F9/30058, G06F9/3867

Abstract: A processor avoids or eliminates repetitive replay conditions and frequent instruction resteering through various techniques including resteering the fetch after the branch instruction retires, and delaying branch resolution. A processor resolves conditional branches and avoids repetitive resteering by delaying branch resolution. The processor has an instruction pipeline with inserted delay in branch condition and replay control pathways. For example, an instruction sequence that includes a load instruction followed by a subtract instruction then a conditional branch, delays branch resolution to allow time for analysis to determine whether the conditional branch has resolved correctly. Eliminating incorrect branch resolutions prevents flushing of correctly predicted branches.
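The effect of delaying branch resolution can be illustrated with a very small decision function in Python. This is a behavioral sketch only, with no pipeline timing: it assumes that, by the time a delayed resolution is made, the processor knows whether the instruction producing the branch condition (for example the load in the load, subtract, branch sequence) had to be replayed. The function name, parameters, and returned strings are all hypothetical.

```python
def resolve_branch(predicted_taken, computed_taken, producer_replayed, delayed):
    """Return the fetch action for one execution of a conditional branch.

    predicted_taken   -- direction chosen by the branch predictor
    computed_taken    -- direction computed from the operands the branch saw
    producer_replayed -- True if the instruction feeding the condition was
                         replayed, so the computed direction used invalid data
    delayed           -- True if resolution waits for the replay information
    """
    if delayed and producer_replayed:
        # The outcome is based on invalid data: replay the branch
        # instead of resteering instruction fetch.
        return "replay"
    if computed_taken != predicted_taken:
        return "resteer"
    return "continue"
```

Without the delay, a correctly predicted branch that computed its condition from invalid data would trigger a resteer; with the delay, the same case results in a replay and the correct prediction is kept.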

10.

Invention patent application
Prefetch Unit [Status: in force]

Publication number: US20110264864A1

Publication date: 2011-10-27

Application number: US13165297

Application date: 2011-06-21

Applicant: Sudarshan Kadambi, Puneet Kumar, Po-Yung Chang

Inventor: Sudarshan Kadambi, Puneet Kumar, Po-Yung Chang

IPC: G06F12/02

CPC classification number: G06F12/0862, G06F9/30047, G06F9/3455, G06F9/383, G06F2212/6028

Abstract: In one embodiment, a processor comprises a prefetch unit coupled to a data cache. The prefetch unit is configured to concurrently maintain a plurality of separate, active prefetch streams. Each prefetch stream is either software initiated via execution by the processor of a dedicated prefetch instruction or hardware initiated via detection of a data cache miss by one or more load/store memory operations. The prefetch unit is further configured to generate prefetch requests responsive to the plurality of prefetch streams to prefetch data into the data cache.
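A prefetch unit that keeps several streams active at once, some started by software and some by hardware, can be sketched in Python as follows. The stream limit, the oldest-first replacement, the fixed 64-byte line size, the stride parameter, and every name are assumptions for the example; the abstract specifies none of them.

```python
from dataclasses import dataclass

LINE_BYTES = 64  # assumed cache line size


@dataclass
class PrefetchStream:
    next_addr: int
    stride: int
    source: str   # "software" or "hardware"


class PrefetchUnit:
    def __init__(self, max_streams=4):
        self.max_streams = max_streams
        self.streams = []

    def _start_stream(self, addr, stride, source):
        if len(self.streams) >= self.max_streams:
            self.streams.pop(0)   # assumption: replace the oldest stream
        self.streams.append(PrefetchStream(addr + stride, stride, source))

    def prefetch_instruction(self, addr, stride=LINE_BYTES):
        """Software-initiated stream: a dedicated prefetch instruction executed."""
        self._start_stream(addr, stride, "software")

    def cache_miss(self, addr):
        """Hardware-initiated stream: a load/store missed in the data cache."""
        self._start_stream(addr, LINE_BYTES, "hardware")

    def generate_requests(self):
        """Emit one prefetch address per active stream and advance each stream."""
        requests = []
        for s in self.streams:
            requests.append(s.next_addr)
            s.next_addr += s.stride
        return requests
```

After `prefetch_instruction(0x1000, stride=128)` and `cache_miss(0x2000)`, one call to `generate_requests()` yields one address from each of the two streams, and the next call yields the following address in each.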
