Shared Memory Space in a Unified Memory Model
    1.
    Patent application
    Shared Memory Space in a Unified Memory Model (status: active)

    Publication No.: US20140040565A1

    Publication date: 2014-02-06

    Application No.: US13562985

    Filing date: 2012-07-31

    IPC classification: G06F12/06

    Abstract: Methods and systems are provided for mapping a memory instruction to a shared memory address space in a computer arrangement having a CPU and an APD. A method includes receiving a memory instruction that refers to an address in the shared memory address space, mapping the memory instruction based on the address to a memory resource associated with either the CPU or the APD, and performing the memory instruction based on the mapping.
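
The receive, map, perform sequence in this abstract can be illustrated with a minimal Python sketch. The region layout, the `Region` type, the `"cpu"`/`"apd"` labels, and the dictionary-backed memories are all assumptions made for illustration; the abstract does not specify how the mapping is implemented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    start: int      # first address in the region (inclusive)
    end: int        # one past the last address (exclusive)
    resource: str   # hypothetical label: "cpu" or "apd"

# Hypothetical layout of the shared address space (not from the abstract).
REGIONS = [
    Region(0x0000, 0x8000, "cpu"),
    Region(0x8000, 0x10000, "apd"),
]

# One backing store per memory resource; plain dicts stand in for real memory.
MEMORY = {"cpu": {}, "apd": {}}

def map_address(addr):
    """Map a shared-space address to the memory resource that owns it."""
    for region in REGIONS:
        if region.start <= addr < region.end:
            return region.resource
    raise ValueError(f"address {addr:#x} is outside the shared address space")

def perform(op, addr, value=None):
    """Perform a load or store on whichever resource the address maps to."""
    store = MEMORY[map_address(addr)]
    if op == "store":
        store[addr] = value
        return None
    if op == "load":
        return store.get(addr, 0)  # unwritten locations read as 0 in this sketch
    raise ValueError(f"unknown memory instruction: {op}")
```

In this sketch, `perform("store", 0x9000, 7)` lands in the APD-side store because the address falls in the second region, and a later `perform("load", 0x9000)` reads it back from the same place.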

    Cache Management for Memory Operations
    2.
    Patent application
    Cache Management for Memory Operations (status: active)

    Publication No.: US20130262775A1

    Publication date: 2013-10-03

    Application No.: US13436767

    Filing date: 2012-03-30

    IPC classification: G06F12/08

    Abstract: Embodiments of the present invention provide for the execution of threads and/or work-items on multiple processors of a heterogeneous computing system in a manner that lets them share data correctly and efficiently. Disclosed method, system, and article of manufacture embodiments include, responsive to an instruction from a sequence of instructions of a work-item, determining an ordering of visibility to other work-items of one or more other data items in relation to a particular data item, and performing at least one cache operation upon at least one of the particular data item or the other data items present in any one or more cache memories in accordance with the determined ordering. The semantics of the instruction include a memory operation upon the particular data item.

    Writeback cancellation processing system for use in a packet switched cache coherent multiprocessor system
    3.
    Granted patent
    Writeback cancellation processing system for use in a packet switched cache coherent multiprocessor system (status: expired)

    Publication No.: US5684977A

    Publication date: 1997-11-04

    Application No.: US415040

    Filing date: 1995-03-31

    IPC classification: G06F12/08

    CPC classification: G06F12/0828 G06F12/0822

    Abstract: A multiprocessor computer system is provided having a multiplicity of sub-systems and a main memory coupled to a system controller. An interconnect module interconnects the main memory and sub-systems in accordance with interconnect control signals received from the system controller. At least two of the sub-systems are data processors, each having a respective cache memory that stores multiple blocks of data and a set of master cache tags (Etags), including one cache tag for each data block stored by the cache memory. Each data processor includes a master interface for sending memory transaction requests to the system controller. The system controller processes each memory transaction and maintains a set of duplicate cache tags (Dtags) for each data processor. Finally, the system controller contains transaction execution circuitry for activating a transaction for servicing by the interconnect. The transaction execution circuitry pipelines memory access requests from the data processors, and includes invalidation circuitry for processing each writeback request from a given data processor prior to activation to determine if the Dtag index corresponding to the victimized cache line is invalid. Thereafter, the invalidation circuitry activates writeback requests only if the Dtag index is not invalid and cancels the writeback request if the Dtag index is invalid.
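
The activate-or-cancel decision at the end of this abstract can be sketched in a few lines of Python. The nested-dictionary representation of the Dtags and the function name `process_writeback` are hypothetical; the patent describes hardware circuitry, and this is only a software analogy of the check.

```python
def process_writeback(dtags, processor, index):
    """Decide what to do with a writeback request before activation.

    dtags maps processor id -> {cache-line index -> valid bit}.
    A missing entry is treated as invalid (an assumption of this sketch).
    """
    valid = dtags.get(processor, {}).get(index, False)
    # Activate only if the duplicate tag is still valid; otherwise cancel.
    return "activate" if valid else "cancel"

# Hypothetical Dtag state for one data processor.
dtags = {0: {3: True, 7: False}}
```

With this state, a writeback from processor 0 for line index 3 is activated, while one for line index 7 is cancelled because its Dtag is already invalid.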

    Fast, dual ported cache controller for data processors in a packet switched cache coherent multiprocessor system
    4.
    Granted patent
    Fast, dual ported cache controller for data processors in a packet switched cache coherent multiprocessor system (status: expired)

    Publication No.: US5644753A

    Publication date: 1997-07-01

    Application No.: US714965

    Filing date: 1996-09-17

    IPC classification: G11C11/41 G06F12/08 G06F13/00

    Abstract: A multiprocessor computer system has data processors and a main memory coupled to a system controller. Each data processor has a cache memory. Each cache memory has a cache controller with two ports for receiving access requests. A first port receives access requests from the associated data processor and a second port receives access requests from the system controller. All cache memory access requests include an address value; access requests from the system controller also include a mode flag. A comparator in the cache controller processes the address value in each access request and generates a hit/miss signal indicating whether the data block corresponding to the address value is stored in the cache memory. The cache controller has two modes of operation, including a first standard mode of operation in which read/write access to the cache memory is preceded by generation of the hit/miss signal by the comparator, and a second accelerated mode of operation in which read/write access to the cache memory is initiated without waiting for the comparator to process the access request's address value. The first mode of operation is used for all access requests by the data processor and for system controller access requests when the mode flag has a first value. The second mode of operation is used for the system controller access requests when the mode flag has a second value distinct from the first value.
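
The two modes of operation can be modelled with a small Python sketch. The direct-mapped organisation, the `STANDARD`/`ACCELERATED` flag values, the port names, and the `CacheController` class are assumptions for illustration only; the abstract does not state the cache organisation or the flag encoding.

```python
STANDARD, ACCELERATED = 0, 1  # hypothetical encodings of the mode flag

class CacheController:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.tags = [None] * num_lines
        self.data = [None] * num_lines

    def _split(self, addr):
        # Direct-mapped split of a block address into (index, tag).
        return addr % self.num_lines, addr // self.num_lines

    def fill(self, addr, value):
        index, tag = self._split(addr)
        self.tags[index] = tag
        self.data[index] = value

    def read(self, addr, port, mode_flag=STANDARD):
        index, tag = self._split(addr)
        if port == "system" and mode_flag == ACCELERATED:
            # Accelerated mode: access the line without the tag comparison.
            return self.data[index]
        # Standard mode: the comparator's hit/miss result gates the access.
        if self.tags[index] == tag:
            return self.data[index]
        return None  # miss
```

The sketch makes one consequence visible: in accelerated mode the controller returns whatever sits at the indexed line, so the requester must already know that the line holds the block it wants. Requests on the processor port ignore the flag and always take the standard path.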

    Distributed Cache Coherence at Scalable Requestor Filter Pipes that Accumulate Invalidation Acknowledgements from other Requestor Filter Pipes Using Ordering Messages from Central Snoop Tag
    6.
    Patent application
    Distributed Cache Coherence at Scalable Requestor Filter Pipes that Accumulate Invalidation Acknowledgements from other Requestor Filter Pipes Using Ordering Messages from Central Snoop Tag (status: active)

    Publication No.: US20070186054A1

    Publication date: 2007-08-09

    Application No.: US11307413

    Filing date: 2006-02-06

    IPC classification: G06F13/28

    CPC classification: G06F12/082 G06F12/0828

    Abstract: A multi-processor, multi-cache system has filter pipes that store entries for request messages sent to a central coherency controller. The central coherency controller orders requests from filter pipes using coherency rules but does not track completion of invalidations. The central coherency controller reads snoop tags to identify sharing caches having a copy of a requested cache line. The central coherency controller sends an ordering message to the requesting filter pipe. The ordering message has an invalidate count indicating the number of sharing caches. Each sharing cache receives an invalidation message from the central coherency controller, invalidates its copy of the cache line, and sends an invalidation acknowledgement message to the requesting filter pipe. The requesting filter pipe decrements the invalidate count until all sharing caches have acknowledged invalidation. All ordering, data, and invalidation acknowledgement messages must be received by the requesting filter pipe before loading the data into its cache.
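
The bookkeeping done by the requesting filter pipe can be sketched as a small Python class. The class and method names are hypothetical, and allowing acknowledgements to arrive before the ordering message (so the counter briefly goes negative) is an assumption of this sketch rather than something the abstract states.

```python
class FilterPipeEntry:
    """Tracks one outstanding request at the requesting filter pipe."""

    def __init__(self):
        self.ordering_seen = False
        self.data_seen = False
        self.data = None
        # Acknowledgements still owed; may be negative if acks arrive
        # before the ordering message (assumed message reordering).
        self.pending_acks = 0

    def on_ordering(self, invalidate_count):
        # The ordering message carries the number of sharing caches.
        self.ordering_seen = True
        self.pending_acks += invalidate_count

    def on_ack(self):
        # One sharing cache has invalidated its copy of the line.
        self.pending_acks -= 1

    def on_data(self, data):
        self.data_seen = True
        self.data = data

    def ready(self):
        # The line may be loaded only after ordering, data, and all acks.
        return self.ordering_seen and self.data_seen and self.pending_acks == 0
```

Because the completion check lives in the requester, the central controller in this model never has to count acknowledgements, which matches the abstract's statement that it does not track completion of invalidations.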

    Apparatus and method to speculatively initiate primary memory accesses
    7.
    Granted patent
    Apparatus and method to speculatively initiate primary memory accesses (status: expired)

    Publication No.: US5761708A

    Publication date: 1998-06-02

    Application No.: US658874

    Filing date: 1996-05-31

    IPC classification: G06F12/08 G06F13/16 G06F13/18

    CPC classification: G06F13/161 G06F12/0884

    Abstract: A central processing unit with an external cache controller and a primary memory controller is used to speculatively initiate primary memory access in order to improve average primary memory access times. The external cache controller processes an address request during an external cache latency period and selectively generates an external cache miss signal or an external cache hit signal. If no other primary memory access demands exist at the beginning of the external cache latency period, the primary memory controller is used to speculatively initiate a primary memory access corresponding to the address request. The speculative primary memory access is completed in response to an external cache miss signal. The speculative primary memory access is aborted if an external cache hit signal is generated or a non-speculative primary memory access demand is generated during the external cache latency period.
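
The start, complete, and abort conditions in this abstract reduce to a small decision function, sketched here in Python. The function name, its boolean parameters, and the returned strings are hypothetical labels chosen for illustration.

```python
def speculative_access_outcome(demand_at_start, cache_hit, demand_during_latency):
    """Return the fate of a speculative primary memory access.

    demand_at_start:        another primary memory demand exists when the
                            external cache latency period begins
    cache_hit:              the external cache controller signals a hit
    demand_during_latency:  a non-speculative demand arrives during the
                            external cache latency period
    """
    if demand_at_start:
        # Primary memory is busy, so no speculative access is initiated.
        return "not started"
    if cache_hit or demand_during_latency:
        # The speculation was unnecessary or must yield to real demand.
        return "aborted"
    # External cache miss with no competing demand: the access completes.
    return "completed"
```

The only path that ends in a completed speculative access is the one where memory was idle at the start, the external cache missed, and no non-speculative demand arrived in the meantime.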

    Transaction activation processor for controlling memory transaction execution in a packet switched cache coherent multiprocessor system
    8.
    Granted patent
    Transaction activation processor for controlling memory transaction execution in a packet switched cache coherent multiprocessor system (status: expired)

    Publication No.: US5655100A

    Publication date: 1997-08-05

    Application No.: US414772

    Filing date: 1995-03-31

    IPC classification: G06F12/08

    CPC classification: G06F12/0828 G06F12/0822

    Abstract: A multiprocessor computer system has a multiplicity of sub-systems and a main memory coupled to a system controller. Some of the sub-systems are data processors, each having a respective cache memory that stores multiple blocks of data and a respective set of master cache tags (Etags), including one Etag for each data block stored by the cache memory. Each data processor includes an interface for sending memory transaction requests to the system controller and for receiving cache transaction requests from the system controller corresponding to memory transaction requests by other ones of the data processors. The system controller includes transaction activation logic for activating each said memory transaction request when it meets predefined activation criteria, and for blocking each said memory transaction request until the predefined activation criteria are met. An active transaction status table stores status data representing memory transaction requests that have been activated, including an address value for each activated transaction. The transaction activation logic includes comparator logic for comparing each memory transaction request with the active transaction status data for all activated memory transaction requests so as to detect whether activation of a particular memory transaction request would violate the predefined activation criteria. With certain exceptions concerning writeback transactions, an incoming transaction for accessing a data block that maps to the same cache line as a pending, previously activated transaction will be blocked until the pending transaction that maps to the same cache line is completed.
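
The blocking rule in the last sentence of this abstract can be sketched in Python. The sketch assumes addresses are block numbers that map to a cache line by a simple modulo, uses a FIFO for blocked requests, and omits the writeback exceptions the abstract mentions; the class and method names are hypothetical.

```python
class TransactionActivator:
    def __init__(self, num_cache_lines):
        self.num_cache_lines = num_cache_lines
        self.active = {}    # cache-line index -> address of the active transaction
        self.blocked = []   # requests waiting for activation, oldest first

    def line_index(self, addr):
        # Assumed mapping from a block address to a cache-line index.
        return addr % self.num_cache_lines

    def request(self, addr):
        """Activate the request, or block it if its cache line is busy."""
        idx = self.line_index(addr)
        if idx in self.active:
            self.blocked.append(addr)
            return False
        self.active[idx] = addr
        return True

    def complete(self, addr):
        """Retire an active transaction and activate the oldest waiter, if any."""
        idx = self.line_index(addr)
        del self.active[idx]
        for i, waiting in enumerate(self.blocked):
            if self.line_index(waiting) == idx:
                self.blocked.pop(i)
                self.active[idx] = waiting
                return waiting
        return None
```

With four cache lines, a request for block 5 is blocked while block 1 is active because both map to line index 1; completing block 1 then activates block 5.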

    Managing coherent memory between an accelerated processing device and a central processing unit
    9.
    Granted patent
    Managing coherent memory between an accelerated processing device and a central processing unit (status: active)

    Publication No.: US09430391B2

    Publication date: 2016-08-30

    Application No.: US13601126

    Filing date: 2012-08-31

    Abstract: Existing multiprocessor computing systems often have insufficient memory coherency and, consequently, are unable to efficiently utilize separate memory systems. Specifically, a CPU cannot effectively write to a block of memory and then have a GPU access that memory unless there is explicit synchronization. In addition, because the GPU is forced to statically split memory locations between itself and the CPU, existing multiprocessor computing systems are unable to efficiently utilize the separate memory systems. Embodiments described herein overcome these deficiencies by receiving a notification within the GPU that the CPU has finished processing data that is stored in coherent memory, and invalidating data in the CPU caches that the GPU has finished processing from the coherent memory. Embodiments described herein also include dynamically partitioning a GPU memory into coherent memory and local memory through use of a probe filter.

    Shared memory space in a unified memory model
    10.
    Granted patent
    Shared memory space in a unified memory model (status: active)

    Publication No.: US09009419B2

    Publication date: 2015-04-14

    Application No.: US13562985

    Filing date: 2012-07-31

    IPC classification: G06F12/02 G06F12/06

    Abstract: Methods and systems are provided for mapping a memory instruction to a shared memory address space in a computer arrangement having a CPU and an APD. A method includes receiving a memory instruction that refers to an address in the shared memory address space, mapping the memory instruction based on the address to a memory resource associated with either the CPU or the APD, and performing the memory instruction based on the mapping.
