Direct access to low-latency memory
    1.
    Granted invention patent
    Direct access to low-latency memory (in force)

    Publication number: US07594081B2

    Publication date: 2009-09-22

    Application number: US11024002

    Application date: 2004-12-28

    CPC classification number: G06F9/3824 G06F9/3885 G06F12/0888

    Abstract: A content aware application processing system is provided for allowing directed access to data stored in a non-cache memory thereby bypassing cache coherent memory. The processor includes a system interface to cache coherent memory and a low latency memory interface to a non-cache coherent memory. The system interface directs memory access for ordinary load/store instructions executed by the processor to the cache coherent memory. The low latency memory interface directs memory access for non-ordinary load/store instructions executed by the processor to the non-cache memory, thereby bypassing the cache coherent memory. The non-ordinary load/store instruction can be a coprocessor instruction. The memory can be a low-latency type memory. The processor can include a plurality of processor cores.

    Selective replication of data structures
    2.
    Granted invention patent
    Selective replication of data structures (in force)

    Publication number: US07558925B2

    Publication date: 2009-07-07

    Application number: US11335189

    Application date: 2006-01-18

    CPC classification number: G06F12/06 G06F12/0653 G06F2212/174

    Abstract: Methods and apparatus are provided for selectively replicating a data structure in a low-latency memory. The memory includes multiple individual memory banks configured to store replicated copies of the same data structure. Upon receiving a request to access the stored data structure, a low-latency memory access controller selects one of the memory banks, then accesses the stored data from the selected memory bank. Selection of a memory bank can be accomplished using a thermometer technique comparing the relative availability of the different memory banks. Exemplary data structures that benefit from the resulting efficiencies include deterministic finite automata (DFA) graphs and other data structures that are loaded (i.e., read) more often than they are stored (i.e., written).
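The bank-selection step in this abstract can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that "relative availability" is measured by a per-bank count of pending requests; the abstract does not spell out how the thermometer technique is implemented, and the names `Bank`, `select_bank`, and `read` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bank:
    index: int
    pending: int = 0  # assumed availability measure: requests queued at this bank


def select_bank(banks, replicas):
    """Pick the bank holding a replica that has the fewest pending requests.

    Ties are broken by the lower bank index so the choice is deterministic.
    """
    candidates = [banks[i] for i in replicas]
    return min(candidates, key=lambda b: (b.pending, b.index)).index


def read(banks, replicas, address):
    """Route one read of a replicated structure and record the added load."""
    chosen = select_bank(banks, replicas)
    banks[chosen].pending += 1
    return chosen, address


banks = [Bank(0, pending=3), Bank(1, pending=1), Bank(2, pending=1), Bank(3, pending=0)]
replicas = [0, 1, 2]  # the data structure is replicated in banks 0, 1 and 2 only
first = read(banks, replicas, 0x40)
second = read(banks, replicas, 0x40)
```

Because every replica holds the same data, the controller is free to spread reads across banks; this only pays off for read-mostly structures such as DFA graphs, since each write would have to update every copy.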

    Deterministic finite automata (DFA) instruction
    4.
    Granted invention patent
    Deterministic finite automata (DFA) instruction (in force)

    Publication number: US08301788B2

    Publication date: 2012-10-30

    Application number: US11220899

    Application date: 2005-09-07

    CPC classification number: G06F9/30003 H04L1/0045

    Abstract: A computer-readable instruction is described for traversing deterministic finite automata (DFA) graphs to perform a pattern search in the in-coming packet data in real-time. The instruction includes one or more pre-defined fields. One of the fields includes a DFA graph identifier for identifying one of several previously-stored DFA graphs. Another one of the fields includes an input reference for identifying input data to be processed using the identified DFA graphs. Yet another one of the fields includes an output reference for storing results generated responsive to the processed input data. The instructions are forwarded to a DFA engine adapted to process the input data using the identified DFA graph and to provide results as instructed by the output reference.
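The three instruction fields named in this abstract (graph identifier, input reference, output reference) can be sketched in Python as follows. This is an illustrative model only: the field names, the dictionary standing in for memory, and the single-literal DFA builder (the standard KMP automaton construction) are assumptions, not the encoding the patent describes.

```python
from dataclasses import dataclass


@dataclass
class DfaInstruction:
    graph_id: int    # selects one of several previously stored DFA graphs
    input_ref: str   # hypothetical key locating the input bytes
    output_ref: str  # hypothetical key where results are stored


def build_literal_dfa(pattern: bytes):
    """Build a DFA that recognises every occurrence of one non-empty literal."""
    m = len(pattern)
    dfa = [[0] * 256 for _ in range(m + 1)]
    dfa[0][pattern[0]] = 1
    restart = 0  # state reached by the pattern with its first byte dropped
    for state in range(1, m + 1):
        dfa[state] = list(dfa[restart])          # mismatch: behave like restart state
        if state < m:
            dfa[state][pattern[state]] = state + 1  # match: advance
            restart = dfa[restart][pattern[state]]
    return dfa, m  # state m is the accepting state


def run_dfa_engine(instr, graphs, memory):
    """Traverse the selected graph over the input, one byte per transition."""
    dfa, accept = graphs[instr.graph_id]
    state, matches = 0, []
    for offset, byte in enumerate(memory[instr.input_ref]):
        state = dfa[state][byte]
        if state == accept:
            matches.append(offset)  # offset of the last byte of a match
    memory[instr.output_ref] = matches


graphs = {7: build_literal_dfa(b"abab")}
memory = {"pkt0": b"xababab"}
run_dfa_engine(DfaInstruction(graph_id=7, input_ref="pkt0", output_ref="res0"), graphs, memory)
```

Each input byte costs exactly one table lookup, which is why DFA traversal suits real-time packet inspection: the work per packet depends on its length, not on the number of patterns compiled into the graph.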

    System and method to provide non-coherent access to a coherent memory system
    5.
    Granted invention patent
    System and method to provide non-coherent access to a coherent memory system (in force)

    Publication number: US08850125B2

    Publication date: 2014-09-30

    Application number: US13280756

    Application date: 2011-10-25

    CPC classification number: G06F12/0888 G06F12/08 G06F12/0831

    Abstract: In one embodiment, a system comprises a memory and a memory controller that provides a cache access path to the memory and a bypass-cache access path to the memory, receives requests to read graph data from the memory on the bypass-cache access path and receives requests to read non-graph data from the memory on the cache access path. A method comprises receiving a request at a memory controller to read graph data from a memory on a bypass-cache access path, receiving a request at the memory controller to read non-graph data from the memory through a cache access path, and arbitrating, in the memory controller, among the requests using arbitration.

    Mechanism for synchronizing multiple skewed source-synchronous data channels with automatic initialization feature

    Publication number: US07024533B2

    Publication date: 2006-04-04

    Application number: US10441451

    Application date: 2003-05-20

    CPC classification number: G06F13/1689

    Abstract: A computer system has a memory controller that includes read buffers coupled to a plurality of memory channels. The memory controller advantageously eliminates the inter-channel skew caused by memory modules being located at different distances from the memory controller. The memory controller preferably includes a channel interface and synchronization logic circuit for each memory channel. This circuit includes read and write buffers and load and unload pointers for the read buffer. Unload pointer logic generates the unload pointer and load pointer logic generates the load pointer. The pointers preferably are free-running pointers that increment in accordance with two different clock signals. The load pointer increments in accordance with a clock generated by the memory controller but that has been routed out to and back from the memory modules. The unload pointer increments in accordance with a clock generated by the computer system itself. Because the trace length of each memory channel may differ, the time that it takes for a memory module to provide read data back to the memory controller may differ for each channel. The “skew” is defined as the difference in time between when the data arrives on the earliest channel and when data arrives on the latest channel. During system initialization, the pointers are synchronized. After initialization, the pointers are used to load and unload the read buffers in such a way that the effects of inter-channel skew are eliminated.

    Mechanism for synchronizing multiple skewed source-synchronous data channels with automatic initialization feature
    7.
    Granted invention patent
    Mechanism for synchronizing multiple skewed source-synchronous data channels with automatic initialization feature (lapsed)

    Publication number: US06636955B1

    Publication date: 2003-10-21

    Application number: US09652480

    Application date: 2000-08-31

    CPC classification number: G06F13/1689

    Abstract: A computer system has a memory controller that includes read buffers coupled to a plurality of memory channels. The memory controller advantageously eliminates the inter-channel skew caused by memory modules being located at different distances from the memory controller. The memory controller preferably includes a channel interface and synchronization logic circuit for each memory channel. This circuit includes read and write buffers and load and unload pointers for the read buffer. Unload pointer logic generates the unload pointer and load pointer logic generates the load pointer. The pointers preferably are free-running pointers that increment in accordance with two different clock signals. The load pointer increments in accordance with a clock generated by the memory controller but that has been routed out to and back from the memory modules. The unload pointer increments in accordance with a clock generated by the computer system itself. Because the trace length of each memory channel may differ, the time that it takes for a memory module to provide read data back to the memory controller may differ for each channel. The “skew” is defined as the difference in time between when the data arrives on the earliest channel and when data arrives on the latest channel. During system initialization, the pointers are synchronized. After initialization, the pointers are used to load and unload the read buffers in such a way that the effects of inter-channel skew are eliminated.
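The load/unload pointer scheme can be pictured with a small Python simulation. It is a sketch under stated assumptions: time is counted in whole ticks, each channel has a fixed round-trip delay, and the unload pointer is started once the slowest channel has delivered its first word. The function `simulate` and its parameters are hypothetical, and the real circuit's initialization sequence is not modeled.

```python
def simulate(delays, words, depth=4):
    """Deskew several channels using per-channel load pointers and one unload pointer.

    delays[ch] is the assumed number of ticks before a word sent at tick k
    arrives back on channel ch. The buffer depth must exceed the skew
    (max delay minus min delay) so unread slots are never overwritten.
    """
    if depth <= max(delays) - min(delays):
        raise ValueError("read buffer too shallow for this skew")
    channels = range(len(delays))
    buffers = [[None] * depth for _ in channels]
    load_ptr = [0] * len(delays)  # advances with each channel's returned clock
    unload_ptr = 0                # advances with the system clock
    start = max(delays)           # unloading begins when the latest channel has data
    aligned = []
    for tick in range(len(words) + start):
        for ch in channels:
            k = tick - delays[ch]  # index of the word arriving on this channel now
            if 0 <= k < len(words):
                buffers[ch][load_ptr[ch]] = words[k]
                load_ptr[ch] = (load_ptr[ch] + 1) % depth
        if tick >= start:
            # All channels are read at the same slot, so the words line up.
            aligned.append([buffers[ch][unload_ptr] for ch in channels])
            unload_ptr = (unload_ptr + 1) % depth
    return aligned


rows = simulate(delays=[1, 3], words=list("abcdef"))
```

Each returned row holds the same word index from every channel, even though the fast channel delivered it two ticks earlier than the slow one; the read buffer absorbs the difference.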

    Computer resource management and allocation system
    8.
    Granted invention patent
    Computer resource management and allocation system (in force)

    Publication number: US06754739B1

    Publication date: 2004-06-22

    Application number: US09651945

    Application date: 2000-08-31

    CPC classification number: G06F9/544

    Abstract: A method and architecture for improved system resource management and allocation for the processing of request and response messages in a computer system. The resource management scheme provides for dynamically sharing system resources, such as data buffers, between request and response messages or transactions. In particular, instead of simply dedicating a portion of the system resources to requests and the remaining portion to responses, a minimum amount of resources are reserved for responses and a minimum amount for requests, while the remaining resources are dynamically shared between both types of messages. The method and architecture of the present invention allows for more efficient use of system resources, while avoiding deadlock conditions and ensuring a minimum service rate for requests.
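The reservation scheme in this abstract (a guaranteed minimum for each message type, with the remainder shared) can be sketched in Python. The class `SharedBufferPool` and its method names are hypothetical, and buffers are modeled as simple counters rather than real storage.

```python
class SharedBufferPool:
    """Buffer pool with a reserved minimum per message type and a shared remainder."""

    KINDS = ("request", "response")

    def __init__(self, total, min_requests, min_responses):
        if min_requests + min_responses > total:
            raise ValueError("reservations exceed pool size")
        self.total = total
        self.reserved = {"request": min_requests, "response": min_responses}
        self.in_use = {"request": 0, "response": 0}

    def try_allocate(self, kind):
        """Grant one buffer unless doing so would eat into the other type's reserve."""
        other = "response" if kind == "request" else "request"
        free = self.total - sum(self.in_use.values())
        # Part of the other type's reservation that is not yet in use must stay free.
        held_back = max(0, self.reserved[other] - self.in_use[other])
        if free - held_back <= 0:
            return False
        self.in_use[kind] += 1
        return True

    def release(self, kind):
        if self.in_use[kind] == 0:
            raise ValueError(f"no {kind} buffer to release")
        self.in_use[kind] -= 1


pool = SharedBufferPool(total=8, min_requests=2, min_responses=2)
granted_requests = sum(pool.try_allocate("request") for _ in range(10))
granted_responses = sum(pool.try_allocate("response") for _ in range(10))
```

Holding back the response reserve is what avoids deadlock in this sketch: a flood of requests can never consume the buffers needed to complete the responses that would eventually free them.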

    System and method to reduce memory access latencies using selective replication across multiple memory ports
    9.
    Granted invention patent
    System and method to reduce memory access latencies using selective replication across multiple memory ports (in force)

    Publication number: US08560757B2

    Publication date: 2013-10-15

    Application number: US13280738

    Application date: 2011-10-25

    Abstract: In one embodiment, a system includes memory ports distributed into subsets identified by a subset index, where each memory port has an individual wait time based on a respective workload. The system further comprises a first address hashing unit configured to receive a read request including a virtual memory address associated with a replication factor and referring to graph data. The first address hashing unit translates the replication factor into a corresponding subset index based on the virtual memory address, and converts the virtual memory address to a hardware based memory address referring to graph data in the memory ports within a subset indicated by the corresponding subset index. The system further comprises a memory replication controller configured to direct read requests to the hardware based address to the one of the memory ports within the subset indicated by the corresponding subset index with a lowest individual wait time.
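The two stages described here, hashing an address to a subset of ports and then choosing the least-busy port in that subset, can be sketched in Python. The hash (address bits above an assumed 128-byte line), the contiguous grouping of ports into subsets, and the function names are illustrative assumptions; the abstract does not specify them.

```python
def subset_ports(replication, subset_index):
    """Ports in one subset, assuming subsets are contiguous groups of `replication` ports."""
    start = subset_index * replication
    return list(range(start, start + replication))


def route_read(vaddr, replication, wait_times):
    """Map a virtual address to a subset, then pick that subset's least-busy port.

    wait_times[p] is the current wait time of port p. A replication factor of R
    means the data is stored on every port of an R-port subset.
    """
    num_ports = len(wait_times)
    if replication <= 0 or num_ports % replication:
        raise ValueError("replication factor must divide the port count")
    num_subsets = num_ports // replication
    subset_index = (vaddr >> 7) % num_subsets  # hypothetical address hash
    ports = subset_ports(replication, subset_index)
    port = min(ports, key=lambda p: (wait_times[p], p))
    return subset_index, port


wait_times = [5, 1, 7, 2]
two_way = route_read(0x80, replication=2, wait_times=wait_times)
four_way = route_read(0x80, replication=4, wait_times=wait_times)
unreplicated = route_read(0x100, replication=1, wait_times=wait_times)
```

A higher replication factor gives the controller more ports to choose from for a given address, trading memory capacity for lower expected read latency.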

    SYSTEM AND METHOD TO REDUCE MEMORY ACCESS LATENCIES USING SELECTIVE REPLICATION ACROSS MULTIPLE MEMORY PORTS
    10.
    Invention patent application
    SYSTEM AND METHOD TO REDUCE MEMORY ACCESS LATENCIES USING SELECTIVE REPLICATION ACROSS MULTIPLE MEMORY PORTS (in force)

    Publication number: US20130103904A1

    Publication date: 2013-04-25

    Application number: US13280738

    Application date: 2011-10-25

    Abstract: In one embodiment, a system comprises multiple memory ports distributed into multiple subsets, each subset identified by a subset index and each memory port having an individual wait time. The system further comprises a first address hashing unit configured to receive a read request including a virtual memory address associated with a replication factor, and referring to graph data. The first address hashing unit translates the replication factor into a corresponding subset index based on the virtual memory address, and converts the virtual memory address to a hardware based memory address that refers to graph data in the memory ports within a subset indicated by the corresponding subset index. The system further comprises a memory replication controller configured to direct read requests to the hardware based address to the one of the memory ports within the subset indicated by the corresponding subset index with a lowest individual wait time.
