Temporary storage of memory line while waiting for cache eviction
    1.
    Granted patent
    Temporary storage of memory line while waiting for cache eviction (Expired)

    Publication No.: US07594080B2

    Publication Date: 2009-09-22

    Application No.: US10661802

    Filing Date: 2003-09-12

    IPC Classes: G06F12/00 G06F12/08

    CPC Classes: G06F12/0859 G06F12/12

    Abstract: The temporary storage of a memory line to be stored in a cache while waiting for another memory line to be evicted from the cache is disclosed. A method includes evicting a first memory line currently stored in the cache and storing a second memory line, not currently stored in the cache, in its place. While the first memory line is being evicted, such as by first being inserted into an eviction queue, the second memory line is temporarily stored in a buffer. The buffer may be a data transfer buffer (DTB). Upon eviction of the first memory line, the second memory line is moved from the buffer into the cache.
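
    The buffering scheme described above can be illustrated with a small software model. The C sketch below is only an illustration of the idea, not the patented hardware; the structure and function names (begin_fill, complete_fill, and so on) are invented, and the eviction queue and data transfer buffer (DTB) are modeled as plain structs.

```c
#include <stdio.h>

#define LINE_BYTES 64

/* Hypothetical model of one cache slot plus the two holding structures
 * named in the abstract: an eviction queue for the outgoing (first) line
 * and a data transfer buffer (DTB) for the incoming (second) line. */
struct line        { unsigned long tag; unsigned char data[LINE_BYTES]; };
struct cache_slot  { int valid; struct line line; };
struct evict_queue { int busy;  struct line pending; };
struct dtb         { int full;  struct line held; };

/* Step 1: the victim goes to the eviction queue; the new line parks in
 * the DTB instead of waiting for the eviction to finish. */
static void begin_fill(struct cache_slot *slot, struct evict_queue *eq,
                       struct dtb *buf, const struct line *incoming)
{
    if (slot->valid) {
        eq->pending = slot->line;   /* first memory line: being evicted */
        eq->busy = 1;
        slot->valid = 0;
    }
    buf->held = *incoming;          /* second memory line: waits in DTB */
    buf->full = 1;
}

/* Step 2: once the eviction has drained, move the line from DTB to cache. */
static void complete_fill(struct cache_slot *slot, struct evict_queue *eq,
                          struct dtb *buf)
{
    eq->busy = 0;                   /* eviction (e.g. write-back) done */
    if (buf->full) {
        slot->line = buf->held;
        slot->valid = 1;
        buf->full = 0;
    }
}

int main(void)
{
    struct cache_slot slot = { .valid = 1, .line = { .tag = 0x100 } };
    struct evict_queue eq = { .busy = 0 };
    struct dtb buf = { .full = 0 };
    struct line incoming = { .tag = 0x200 };

    begin_fill(&slot, &eq, &buf, &incoming);
    printf("during eviction: slot valid=%d, DTB full=%d\n", slot.valid, buf.full);
    complete_fill(&slot, &eq, &buf);
    printf("after eviction:  slot tag=0x%lx\n", slot.line.tag);
    return 0;
}
```

    The point of the DTB in this model is that the incoming line has somewhere to wait, so the fill does not have to stall until the write-back of the victim line finishes.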


    Method and system for communicating interrupts between nodes of a multinode computer system
    2.
    Granted patent
    Method and system for communicating interrupts between nodes of a multinode computer system (Expired)

    Publication No.: US06247091B1

    Publication Date: 2001-06-12

    Application No.: US08848545

    Filing Date: 1997-04-28

    Applicant: Thomas D. Lovett

    Inventor: Thomas D. Lovett

    IPC Classes: G06F13/24

    CPC Classes: G06F13/24

    Abstract: Each node of a multinode computer system includes an interrupt controller, a pair of send and receive queues, and a state machine for communicating interrupts between nodes. The communication among the interrupt controller, the state machine, and the queues is coordinated by a queue manager. To send an interrupt, the interrupt controller accepts an interrupt placed on a bus within the node and intended for another node and stores it in the send queue. The controller then notifies the interrupt source that the interrupt has been accepted before it is transmitted to the other node. The interrupt has a first form suitable for transmission on the bus. A state machine within the node takes the interrupt from the send queue and puts it into a second form suitable for transmission across a network connecting the multiple nodes. To receive an interrupt, the state machine accepts an interrupt from another node and stores it in the receive queue, notifying the interrupt source that the interrupt has been accepted before it is placed on the node bus. The interrupt has the second form suitable for transmission across the network. The interrupt controller takes the interrupt from the receive queue and puts it into the first form suitable for transmission on the bus.
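
    As a rough illustration of the send path described above, the C sketch below models the interrupt controller accepting a bus-form interrupt, queuing it, and acknowledging the source before transmission, with a state machine later converting the entry to a network form. The struct layouts and field names are assumptions for illustration only; the patent does not specify them.

```c
#include <stdio.h>

/* Hypothetical forms of an interrupt: the bus form used inside a node and
 * the network form used between nodes (field names are invented). */
struct bus_intr { int vector; int dest_node; int src_cpu; };
struct net_intr { int vector; int dest_node; int src_node; };

#define QLEN 8
struct send_queue { struct bus_intr slot[QLEN]; int head, tail; };

/* Interrupt controller: accept a bus interrupt bound for another node,
 * store it in the send queue, and acknowledge the source immediately,
 * i.e. before the interrupt is actually transmitted. */
static int accept_interrupt(struct send_queue *q, struct bus_intr in)
{
    int next = (q->tail + 1) % QLEN;
    if (next == q->head)
        return 0;                 /* queue full: caller must retry */
    q->slot[q->tail] = in;
    q->tail = next;
    return 1;                     /* early acknowledgement to the source */
}

/* State machine: pull from the send queue and convert the interrupt to
 * the second (network) form for transmission to the destination node. */
static int to_network_form(struct send_queue *q, int local_node,
                           struct net_intr *out)
{
    if (q->head == q->tail)
        return 0;
    struct bus_intr in = q->slot[q->head];
    q->head = (q->head + 1) % QLEN;
    out->vector = in.vector;
    out->dest_node = in.dest_node;
    out->src_node = local_node;
    return 1;
}

int main(void)
{
    struct send_queue q = { .head = 0, .tail = 0 };
    struct bus_intr irq = { .vector = 33, .dest_node = 2, .src_cpu = 1 };

    if (accept_interrupt(&q, irq))
        puts("source acknowledged; interrupt queued for node 2");

    struct net_intr pkt;
    if (to_network_form(&q, 0, &pkt))
        printf("network form: vector %d, node %d -> node %d\n",
               pkt.vector, pkt.src_node, pkt.dest_node);
    return 0;
}
```

    The receive path would be symmetric: the state machine fills a receive queue and acknowledges the sender, and the interrupt controller later converts entries back to the bus form.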


    Multiple-stage pipeline for transaction conversion
    3.
    Granted patent
    Multiple-stage pipeline for transaction conversion (Expired)

    Publication No.: US07210018B2

    Publication Date: 2007-04-24

    Application No.: US10334855

    Filing Date: 2002-12-30

    IPC Classes: G06F12/00

    CPC Classes: G06F12/0815 G06F13/1615

    Abstract: A multiple-stage pipeline for transaction conversion is disclosed. A method is disclosed that converts a transaction into a set of concurrently performable actions. In a first pipeline stage, the transaction is decoded into an internal protocol evaluation (PE) command, such as by utilizing a look-up table (LUT). In a second pipeline stage, an entry within a PE random access memory (RAM) is selected based on the internal PE command. This may be accomplished by converting the internal PE command into a PE RAM base address and an associated qualifier thereof. In a third pipeline stage, the entry within the PE RAM is converted to the set of concurrently performable actions, such as based on the PE RAM base address and its associated qualifier.
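
    A minimal software analogue of the three stages might look like the following C sketch. The LUT contents, PE RAM layout, and action encodings are invented for illustration; the patent only specifies the stage boundaries (decode to a PE command, select a PE RAM entry via a base address and qualifier, expand the entry into concurrently performable actions).

```c
#include <stdio.h>

/* Hypothetical encoding of the three pipeline stages. */
enum pe_cmd { PE_READ = 0, PE_WRITE = 1, PE_INVAL = 2 };

/* Stage 1: decode the incoming transaction into an internal PE command
 * with a look-up table indexed by the transaction opcode. */
static const enum pe_cmd decode_lut[4] = { PE_READ, PE_READ, PE_WRITE, PE_INVAL };

/* Stage 2: choose a PE RAM entry from the command (base address) and a
 * qualifier derived from the transaction (here, just two flag bits). */
static unsigned pe_ram_index(enum pe_cmd cmd, unsigned qualifier)
{
    unsigned base = (unsigned)cmd * 4;   /* invented base-address layout */
    return base + (qualifier & 0x3);
}

/* Stage 3: the PE RAM entry is a bitmask of concurrently performable
 * actions (send reply, update tag, forward request, ...). */
static const unsigned pe_ram[12] = {
    0x1, 0x3, 0x1, 0x5,   /* PE_READ variants  */
    0x2, 0x6, 0x2, 0x7,   /* PE_WRITE variants */
    0x4, 0x4, 0x5, 0x7,   /* PE_INVAL variants */
};

int main(void)
{
    unsigned opcode = 2, qualifier = 1;              /* example transaction */
    enum pe_cmd cmd = decode_lut[opcode & 0x3];      /* stage 1 */
    unsigned idx = pe_ram_index(cmd, qualifier);     /* stage 2 */
    unsigned actions = pe_ram[idx];                  /* stage 3 */
    printf("cmd=%d ram_index=%u actions=0x%x\n", (int)cmd, idx, actions);
    return 0;
}
```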


    Maintaining order of write operations in a multiprocessor for memory consistency
    4.
    Granted patent
    Maintaining order of write operations in a multiprocessor for memory consistency (Expired)

    Publication No.: US06493809B1

    Publication Date: 2002-12-10

    Application No.: US09493782

    Filing Date: 2000-01-28

    IPC Classes: G06F13/00

    CPC Classes: G06F13/4243

    Abstract: A method of invalidating shared cache lines, such as those on a sharing list, by issuing an invalidate acknowledgement before actually invalidating a cache line. The method is useful in multiprocessor systems such as distributed shared memory (DSM) or non-uniform memory access (NUMA) machines that include a number of interconnected processor nodes, each having local memory and caches that store copies of the same data. In such a multiprocessor system using the Scalable Coherent Interface (SCI) protocol, an invalidate request is sent from the head node on the sharing list to a succeeding node on the list. In response to the invalidate request, the succeeding node issues an invalidate acknowledgement before the cache line is actually invalidated. After issuing the invalidate acknowledgement, the succeeding node initiates invalidation of the cache line. The invalidate acknowledgement can take the form of a response to the head node or a forwarding of the invalidate request to the next succeeding node on the list. To maintain processor consistency, a flag is set each time an invalidate acknowledgement is sent. The flag is cleared after the invalidation of the cache line is completed. Cacheable transactions received at the succeeding node while a flag is set are delayed until the flag is cleared.
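
    The early-acknowledgement idea can be sketched in C as follows. The flag handling mirrors the abstract (set on acknowledgement, cleared on completion, cacheable traffic delayed while set); everything else, including the names, is an invented simplification of the sharing-list protocol.

```c
#include <stdio.h>
#include <stdbool.h>

/* Hypothetical per-node state for the early-acknowledgement scheme: the
 * node acknowledges an invalidate before the line is actually invalid and
 * sets a flag; cacheable transactions are held while the flag is set. */
struct node_state {
    bool line_valid;
    bool inval_pending;     /* flag set when the early ack is sent */
    int  deferred;          /* cacheable transactions delayed so far */
};

/* Receive an invalidate request from the head of the sharing list. */
static void on_invalidate_request(struct node_state *n)
{
    n->inval_pending = true;   /* ack (or forward) is issued here, early */
    puts("invalidate acknowledgement sent before invalidation completes");
}

/* Cacheable traffic arriving while the flag is set must be delayed. */
static bool accept_cacheable(struct node_state *n)
{
    if (n->inval_pending) {
        n->deferred++;
        return false;          /* caller retries once the flag clears */
    }
    return true;
}

/* The invalidation itself finishes later and clears the flag. */
static void on_invalidate_done(struct node_state *n)
{
    n->line_valid = false;
    n->inval_pending = false;
}

int main(void)
{
    struct node_state n = { .line_valid = true };
    on_invalidate_request(&n);
    printf("transaction accepted while pending? %d\n", accept_cacheable(&n));
    on_invalidate_done(&n);
    printf("transaction accepted after done?    %d\n", accept_cacheable(&n));
    return 0;
}
```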


    TRAFFIC CLASS ARBITRATION BASED ON PRIORITY AND BANDWIDTH ALLOCATION
    5.
    Patent application
    TRAFFIC CLASS ARBITRATION BASED ON PRIORITY AND BANDWIDTH ALLOCATION (Pending, published)

    Publication No.: US20160373362A1

    Publication Date: 2016-12-22

    Application No.: US15120038

    Filing Date: 2015-02-18

    Abstract: This disclosure describes systems, devices, methods and computer readable media for enhanced network communication for use in higher performance applications including storage, high performance computing (HPC) and Ethernet-based fabric interconnects. In some embodiments, a network controller may include a transmitter circuit configured to transmit packets on a plurality of virtual lanes (VLs), each VL associated with a defined VL priority and an allocated share of network bandwidth. The network controller may also include a bandwidth monitor module configured to measure bandwidth consumed by the packets and an arbiter module configured to adjust the VL priority based on a comparison of the measured bandwidth to the allocated share of network bandwidth. The transmitter circuit may be further configured to transmit the packets based on the adjusted VL priority.
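
    The monitor/arbiter interaction can be approximated with the C sketch below. The specific adjustment policy (demoting a VL that has exceeded its allocated share below all compliant VLs) is an assumption made for the example; the abstract only requires that priority be adjusted based on comparing measured to allocated bandwidth.

```c
#include <stdio.h>

#define NUM_VLS 4

/* Hypothetical per-VL arbitration state: a configured priority, an
 * allocated share of bandwidth for the current interval, and the
 * bandwidth actually consumed (tracked by the bandwidth monitor). */
struct vl_state {
    int      priority;      /* defined VL priority (higher wins) */
    unsigned alloc_bytes;   /* allocated share of network bandwidth */
    unsigned used_bytes;    /* measured consumption */
};

/* Bandwidth monitor: account for a transmitted packet. */
static void monitor_tx(struct vl_state *vl, unsigned pkt_bytes)
{
    vl->used_bytes += pkt_bytes;
}

/* Arbiter: compare measured to allocated bandwidth and adjust priority,
 * then pick the VL with the highest adjusted priority that has traffic. */
static int arbitrate(const struct vl_state vls[], const int has_pkt[])
{
    int winner = -1, best = -1000;
    for (int i = 0; i < NUM_VLS; i++) {
        if (!has_pkt[i])
            continue;
        int prio = vls[i].priority;
        if (vls[i].used_bytes > vls[i].alloc_bytes)
            prio -= NUM_VLS;    /* adjusted VL priority (assumed policy) */
        if (prio > best) {
            best = prio;
            winner = i;
        }
    }
    return winner;
}

int main(void)
{
    struct vl_state vls[NUM_VLS] = {
        { .priority = 3, .alloc_bytes = 1000 },
        { .priority = 2, .alloc_bytes = 4000 },
        { .priority = 1, .alloc_bytes = 4000 },
        { .priority = 0, .alloc_bytes = 4000 },
    };
    int has_pkt[NUM_VLS] = { 1, 1, 0, 0 };

    printf("winner before VL0 exceeds its share: VL%d\n", arbitrate(vls, has_pkt));
    monitor_tx(&vls[0], 2000);      /* VL0 now over its allocated share */
    printf("winner after VL0 exceeds its share:  VL%d\n", arbitrate(vls, has_pkt));
    return 0;
}
```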


    Method and apparatus of using global snooping to provide cache coherence to distributed computer nodes in a single coherent system
    6.
    Granted patent
    Method and apparatus of using global snooping to provide cache coherence to distributed computer nodes in a single coherent system (Expired)

    Publication No.: US06973544B2

    Publication Date: 2005-12-06

    Application No.: US10045927

    Filing Date: 2002-01-09

    IPC Classes: G06F12/00 G06F12/08 G06F13/00

    CPC Classes: G06F12/0813 G06F12/0817

    Abstract: A method and apparatus for providing cache coherence in a multiprocessor system that is configured into two or more nodes, each with local memory, interconnected by a tag and address crossbar system and a data crossbar system. The disclosure is applicable to multiprocessor computer systems that utilize system memory distributed over more than one node and snooping of data states in each node that utilizes memory local to that node. Global snooping is used to provide a single point of serialization of data tags. A central crossbar controller examines the cache state tags of a given address line for all nodes simultaneously and issues an appropriate reply back to the node requesting data, while generating other data requests to any other node in the system for the purpose of maintaining cache coherence and supplying the requested data. The system utilizes memory local to each node by dividing such memory into local and remote categories that are mutually exclusive for any given cache line. The disclosure provides support for a third-level remote cache for each node.
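
    The central controller's behavior for a single address line can be caricatured in C as below. The tag encoding, the downgrade-to-shared policy, and the handling of local versus remote memory are simplifications invented for the sketch; the disclosure itself covers a full tag and address crossbar with a third-level remote cache per node.

```c
#include <stdio.h>

#define NUM_NODES 4

/* Hypothetical cache-state tags kept by a central tag-and-address
 * crossbar controller, one entry per node for one tracked address line. */
enum tag_state { T_INVALID, T_SHARED, T_MODIFIED };

struct line_tags { enum tag_state node[NUM_NODES]; };

/* Single point of serialization: for one address, examine every node's
 * tag at once, reply to the requesting node, and generate a data request
 * to whichever node holds a modified copy. */
static void global_snoop_read(struct line_tags *t, int requester)
{
    int owner = -1;
    for (int n = 0; n < NUM_NODES; n++)
        if (t->node[n] == T_MODIFIED)
            owner = n;

    if (owner >= 0 && owner != requester) {
        printf("data request to node %d; reply routed to node %d\n",
               owner, requester);
        t->node[owner] = T_SHARED;      /* simplified downgrade policy */
    } else {
        printf("reply to node %d from the memory holding the line\n",
               requester);
    }
    t->node[requester] = T_SHARED;
}

int main(void)
{
    struct line_tags t = { .node = { T_INVALID, T_MODIFIED, T_INVALID, T_INVALID } };
    global_snoop_read(&t, 0);   /* node 0 reads a line last modified by node 1 */
    return 0;
}
```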


    Building block removal from partitions
    7.
    Granted patent
    Building block removal from partitions (In force)

    Publication No.: US06934835B2

    Publication Date: 2005-08-23

    Application No.: US10045774

    Filing Date: 2002-01-09

    IPC Classes: G06F3/00 G06F9/00 G06F9/50

    CPC Classes: G06F9/5061

    Abstract: Removing building blocks from partitions to which they have been bound is disclosed. A building block of a platform is removed from a partition of the platform by first halting activity by the partition on the building block. A first partition identifier of the building block indicates the partition of the building block. The building block joined the partition in a masterless manner. The resources of the building block are withdrawn from the partition, and the building block is deauthorized from the platform.
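
    The removal sequence reads as a three-step procedure, which the following C sketch models in the simplest possible way (halt activity, withdraw resources, deauthorize). The struct fields are invented stand-ins for the partition identifier and authorization state mentioned in the abstract.

```c
#include <stdio.h>

/* Hypothetical state for one building block: the first partition
 * identifier named in the abstract, plus flags for partition activity
 * and platform authorization. Field names are invented. */
struct building_block {
    int partition_id;   /* first partition identifier (-1 = unbound) */
    int active;         /* partition currently using the block       */
    int authorized;     /* block authorized on the platform          */
};

/* Removal sequence from the abstract: halt activity by the partition on
 * the block, withdraw the block's resources, deauthorize the block. */
static void remove_from_partition(struct building_block *bb)
{
    bb->active = 0;                               /* 1. halt activity  */
    printf("withdrawing resources from partition %d\n", bb->partition_id);
    bb->partition_id = -1;                        /* 2. withdraw       */
    bb->authorized = 0;                           /* 3. deauthorize    */
}

int main(void)
{
    struct building_block bb = { .partition_id = 7, .active = 1, .authorized = 1 };
    remove_from_partition(&bb);
    printf("active=%d authorized=%d partition=%d\n",
           bb.active, bb.authorized, bb.partition_id);
    return 0;
}
```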


    Method and apparatus for maintaining an order of write operations by processors in a multiprocessor computer to maintain memory consistency
    8.
    Granted patent
    Method and apparatus for maintaining an order of write operations by processors in a multiprocessor computer to maintain memory consistency (Expired)

    Publication No.: US5900020A

    Publication Date: 1999-05-04

    Application No.: US678372

    Filing Date: 1996-06-27

    Abstract: A method and apparatus for maintaining processor consistency in a multiprocessor computer such as a multinode computer system are disclosed. A processor proceeds with write operations before its previous write operations complete, while processor consistency is maintained. A write operation begins with a request by the processor to invalidate copies of the data stored in other nodes. This current invalidate request is queued while acknowledging to the processor that the request is complete, even though it has not actually completed. The processor proceeds to complete the write operation by changing the data. It can then execute subsequent operations, including other write operations. The queued request, however, is not transmitted to other nodes in the computer until all previous invalidate requests by the processor are complete. This ensures that the current invalidate request will not pass a previous invalidate request. Invalidate requests are added to and removed from a processor's outstanding invalidate list as they arise and are completed. An invalidate request is completed by notifying the nodes in a linked list related to the current invalidate request that the data shared by those nodes is now invalid.
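
    The ordering mechanism amounts to a per-processor FIFO of outstanding invalidate requests: acknowledge immediately, but transmit strictly in order. The C sketch below models just that FIFO; the names and the fixed queue depth are assumptions for the example.

```c
#include <stdio.h>

#define QMAX 16

/* Hypothetical per-processor list of outstanding invalidate requests.
 * A new request is acknowledged to the processor right away, but it is
 * transmitted to other nodes only after every earlier request completes,
 * so invalidates never pass one another. */
struct inval_list {
    unsigned line[QMAX];    /* cache-line addresses, oldest first */
    int count;
};

/* Issue: queue the invalidate and acknowledge completion immediately. */
static int issue_invalidate(struct inval_list *q, unsigned line_addr)
{
    if (q->count == QMAX)
        return 0;
    q->line[q->count++] = line_addr;
    return 1;               /* early "complete" ack; the write proceeds */
}

/* Only the oldest request may be transmitted to the other nodes. */
static int next_to_transmit(const struct inval_list *q, unsigned *out)
{
    if (q->count == 0)
        return 0;
    *out = q->line[0];
    return 1;
}

/* Completion: the head request has finished notifying its sharing list. */
static void complete_oldest(struct inval_list *q)
{
    for (int i = 1; i < q->count; i++)
        q->line[i - 1] = q->line[i];
    q->count--;
}

int main(void)
{
    struct inval_list q = { .count = 0 };
    unsigned addr;

    issue_invalidate(&q, 0x100);    /* write A acknowledged immediately */
    issue_invalidate(&q, 0x200);    /* write B acknowledged immediately */
    if (next_to_transmit(&q, &addr))
        printf("transmit first: 0x%x\n", addr);
    complete_oldest(&q);
    if (next_to_transmit(&q, &addr))
        printf("then transmit:  0x%x\n", addr);
    return 0;
}
```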


    Traffic class arbitration based on priority and bandwidth allocation
    9.
    Granted patent

    Publication No.: US10237191B2

    Publication Date: 2019-03-19

    Application No.: US15120038

    Filing Date: 2015-02-18

    Abstract: This disclosure describes systems, devices, methods and computer readable media for enhanced network communication for use in higher performance applications including storage, high performance computing (HPC) and Ethernet-based fabric interconnects. In some embodiments, a network controller may include a transmitter circuit configured to transmit packets on a plurality of virtual lanes (VLs), each VL associated with a defined VL priority and an allocated share of network bandwidth. The network controller may also include a bandwidth monitor module configured to measure bandwidth consumed by the packets and an arbiter module configured to adjust the VL priority based on a comparison of the measured bandwidth to the allocated share of network bandwidth. The transmitter circuit may be further configured to transmit the packets based on the adjusted VL priority.

    HIERARCHICAL/LOSSLESS PACKET PREEMPTION TO REDUCE LATENCY JITTER IN FLOW-CONTROLLED PACKET-BASED NETWORKS
    10.
    Patent application
    HIERARCHICAL/LOSSLESS PACKET PREEMPTION TO REDUCE LATENCY JITTER IN FLOW-CONTROLLED PACKET-BASED NETWORKS (Pending, published)

    Publication No.: US20150180799A1

    Publication Date: 2015-06-25

    Application No.: US14136293

    Filing Date: 2013-12-20

    Abstract: Methods, apparatus, and systems for implementing hierarchical and lossless packet preemption and interleaving to reduce latency jitter in flow-controlled packet-based networks. Fabric packets are divided into a plurality of data units, with data units for different fabric packets buffered in separate buffers. Data units are pulled from the buffers and added to a transmit stream in which groups of data units are interleaved. Upon receipt by a receiver, the groups of data units are separated out and buffered in separate buffers in which data units for the same fabric packet are grouped together. In one aspect, each buffer is associated with a respective virtual lane (VL), and the fabric packets are effectively transferred over fabric links using virtual lanes. VLs may have different levels of priority, under which data units for fabric packets in higher-priority VLs may preempt fabric packets in lower-priority VLs. By transferring data units rather than entire packets, transmission of a packet can be temporarily paused in favor of a higher-priority packet. Multiple levels of preemption and interleaving in a nested manner are supported.
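
    The preemption behavior on the transmit side can be sketched as a per-VL queue of data units with a priority pick at each transmission slot, as in the C model below. The two-VL setup, the tick loop, and the names are invented for illustration; the actual design supports multiple nested preemption levels and lossless reassembly at the receiver.

```c
#include <stdio.h>

#define NUM_VLS 2

/* Hypothetical transmit-side model: each fabric packet is split into data
 * units queued per virtual lane (VL); at every transmission slot the
 * highest-priority VL with queued units sends one unit, so a high-priority
 * packet can preempt a lower-priority packet mid-stream. */
struct vl_queue {
    int priority;           /* higher value preempts lower */
    int units_left;         /* data units of the current fabric packet */
    const char *name;
};

/* Pick the VL to send the next data unit from. */
static int pick_vl(const struct vl_queue vls[])
{
    int best = -1;
    for (int i = 0; i < NUM_VLS; i++)
        if (vls[i].units_left > 0 &&
            (best < 0 || vls[i].priority > vls[best].priority))
            best = i;
    return best;
}

int main(void)
{
    struct vl_queue vls[NUM_VLS] = {
        { .priority = 0, .units_left = 4, .name = "bulk"    },
        { .priority = 1, .units_left = 0, .name = "latency" },
    };

    for (int t = 0; t < 8; t++) {
        if (t == 2)
            vls[1].units_left = 2;  /* high-priority packet arrives and
                                       preempts the bulk transfer */
        int vl = pick_vl(vls);
        if (vl < 0)
            break;
        printf("slot %d: send one data unit on VL %s\n", t, vls[vl].name);
        vls[vl].units_left--;
    }
    return 0;
}
```

    Running the loop shows the bulk packet pausing for two slots while the latency-sensitive packet's data units are interleaved, then resuming without loss.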
