-
Publication No.: US20210182197A1
Publication date: 2021-06-17
Application No.: US16717835
Filing date: 2019-12-17
Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, HUGH SHEN, LUKE MURRAY
IPC classification: G06F12/0817, G06F12/0891, G06F12/0842
Abstract: A cache memory includes a data array, a directory of contents of the data array that specifies coherence state information, and snoop logic that processes operations snooped from a system fabric by reference to the data array and the directory. The snoop logic, responsive to snooping on the system fabric a request of a flush or clean memory access operation of an initiating coherence participant, determines whether the directory indicates the cache memory has coherence ownership of a target address of the request. Based on determining the directory indicates the cache memory has coherence ownership of the target address, the snoop logic provides a coherence response to the request that causes coherence ownership of the target address to be transferred to the initiating coherence participant, such that the initiating coherence participant can protect the target address against conflicting requests.
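The ownership hand-off described in this abstract can be pictured with a toy model. The sketch below is only an illustration under stated assumptions: the class `Participant`, the response strings, and the direct update of the initiator's state are hypothetical simplifications, not the patent's implementation. In real hardware the transfer would be driven by the coherence response travelling over the system fabric.

```python
class Participant:
    """Hypothetical coherence participant; `owned` stands in for the directory's
    record of which target addresses this cache has coherence ownership of."""

    def __init__(self, name):
        self.name = name
        self.owned = set()

    def snoop(self, op, addr, initiator):
        """Return a (made-up) coherence response string for a snooped request."""
        if op in ("flush", "clean") and addr in self.owned:
            # Directory shows ownership: hand ownership to the initiator so that
            # it can protect the address while the flush/clean is in flight.
            self.owned.discard(addr)
            initiator.owned.add(addr)
            return "ownership_transferred"
        if addr in self.owned:
            # Assumption: the current owner protects the address by forcing
            # conflicting requests to retry.
            return "retry"
        return "null"


initiator = Participant("A")
snooper = Participant("B")
other = Participant("C")
snooper.owned.add(0x80)

print(snooper.snoop("flush", 0x80, initiator))  # ownership moves from B to A
print(initiator.snoop("store", 0x80, other))    # A now protects the address
```

After the flush is snooped, the former owner answers later requests with a null response, while the initiator is the one that turns away conflicting requests.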
-
62.
Publication No.: US20200183696A1
Publication date: 2020-06-11
Application No.: US16216659
Filing date: 2018-12-11
Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, HUGH SHEN, SANJEEV GHAI
IPC classification: G06F9/38, G06F9/34, G06F9/30, G06F9/54, G06F12/0811
Abstract: A data processing system includes multiple processing units all having access to a shared memory. A processing unit of the data processing system includes a processor core including an upper level cache, core reservation logic that records addresses in the shared memory for which the processor core has obtained reservations, and an execution unit that executes memory access instructions including a fronting load instruction. Execution of the fronting load instruction generates a load request that specifies a load target address. The processing unit further includes lower level cache that, responsive to receipt of the load request and based on the load request indicating an address match for the load target address in the core reservation logic, protects the load target address against access by any conflicting memory access request during a protection interval following servicing of the load request.
-
63.
Publication No.: US20200034146A1
Publication date: 2020-01-30
Application No.: US16048884
Filing date: 2018-07-30
Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, SANJEEV GHAI, HUGH SHEN
IPC classification: G06F9/30
Abstract: A data processing system includes multiple processing units all having access to a shared memory. A processing unit includes a processor core that executes memory access instructions including a fronting load instruction, wherein execution of the fronting load instruction generates a load request that specifies a load target address. The processing unit also includes reservation logic that records addresses in the shared memory for which the processor core has obtained reservations. In addition, the processing unit includes a read-claim state machine that, responsive to receipt of the load request and based on an address match for the load target address in the reservation logic, protects the load target address against access by any conflicting memory access request during a protection interval following servicing of the load request.
-
Publication No.: US20190188138A1
Publication date: 2019-06-20
Application No.: US15846392
Filing date: 2017-12-19
IPC classification: G06F12/0831, G06F13/16
CPC classification: G06F12/0831, G06F13/1663, G06F2212/1032, G06F2212/507
Abstract: A data processing system includes first and second processing nodes and response logic coupled by an interconnect fabric. A first coherence participant in the first processing node is configured to issue a memory access request specifying a target memory block, and a second coherence participant in the second processing node is configured to issue a probe request regarding a memory region tracked in a memory coherence directory. The first coherence participant is configured to, responsive to receiving the probe request after the memory access request and before receiving a systemwide coherence response for the memory access request, detect an address collision between the probe request and the memory access request and, responsive thereto, transmit a speculative coherence response. The response logic is configured to, responsive to the speculative coherence response, provide a systemwide coherence response for the probe request that prevents the probe request from succeeding.
-
65.
Publication No.: US20190065380A1
Publication date: 2019-02-28
Application No.: US15819458
Filing date: 2017-11-21
Inventors: GUY L. GUTHRIE, JODY B. JOYNER, RONALD N. KALLA, MICHAEL S. SIEGEL, JEFFREY A. STUECHELI, CHARLES D. WAIT, FREDERICK J. ZIEGLER
IPC classification: G06F12/0864, G06F12/1009
Abstract: Reducing translation latency within a memory management unit (MMU) using external caching structures including requesting, by the MMU on a node, page table entry (PTE) data and coherent ownership of the PTE data from a page table in memory; receiving, by the MMU, the PTE data, a source flag, and an indication that the MMU has coherent ownership of the PTE data, wherein the source flag identifies a source location of the PTE data; performing a lateral cast out to a local high-level cache on the node in response to determining that the source flag indicates that the source location of the PTE data is external to the node; and directing at least one subsequent request for the PTE data to the local high-level cache.
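The decision this abstract describes (cast the PTE out to a local cache only when it came from outside the node, then serve later requests locally) can be sketched as follows. This is a rough illustration, not the patented design: `Node`, `receive_pte`, `lookup_target`, the use of a node ID as the source flag, and the check on the ownership indication are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node holding an MMU and a local high-level cache."""
    node_id: int
    high_level_cache: dict = field(default_factory=dict)  # PTE address -> PTE data
    redirect: set = field(default_factory=set)  # PTE addresses now served locally


def receive_pte(node, pte_addr, pte_data, source_node_id, has_ownership):
    """Handle the PTE response; return True if a lateral cast out was performed.

    Assumption: the source flag is modeled as the ID of the node that supplied
    the data, and the cast out is only done when ownership was also granted.
    """
    if has_ownership and source_node_id != node.node_id:
        node.high_level_cache[pte_addr] = pte_data  # lateral cast out
        node.redirect.add(pte_addr)
        return True
    return False


def lookup_target(node, pte_addr):
    """Say where a subsequent request for this PTE would be directed."""
    return "local_cache" if pte_addr in node.redirect else "page_table"


node = Node(node_id=0)
receive_pte(node, 0x40, b"pte", source_node_id=1, has_ownership=True)
print(lookup_target(node, 0x40))  # PTE came from another node, so it is now local
```

A PTE supplied from within the node is left alone in this sketch, since a cast out would not shorten the path for later requests.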
-
Publication No.: US20180349138A1
Publication date: 2018-12-06
Application No.: US15825387
Filing date: 2017-11-29
Inventors: GUY L. GUTHRIE, DEREK E. WILLIAMS
IPC classification: G06F9/30, G06F12/0875
CPC classification: G06F9/30043, G06F9/467, G06F12/0875, G06F2212/452
Abstract: A data processing system implementing a weak memory model includes a plurality of processing units coupled to an interconnect fabric. In response to execution of a multicopy atomic store instruction, an initiating processing unit broadcasts a store request on the interconnect fabric to obtain coherence ownership of a target cache line. The initiating processing unit posts a kill request to at least one of the plurality of processing units to request invalidation of a copy of the target cache line. In response to successful posting of the kill request, the initiating processing unit broadcasts a store complete request on the interconnect fabric to enforce completion of the invalidation of the copy of the target cache line. In response to the store complete request receiving a coherence response indicating success, the initiating processing unit permits an update to the target cache line requested by the multicopy atomic store instruction to be atomically visible.
-
67.
Publication No.: US20180052788A1
Publication date: 2018-02-22
Application No.: US15243601
Filing date: 2016-08-22
Inventors: GUY L. GUTHRIE, DEREK E. WILLIAMS
IPC classification: G06F13/28, G06F12/084, G06F12/0897
CPC classification: G06F13/28, G06F12/084, G06F12/0897, G06F2212/621
Abstract: In a data processing system implementing a weak memory model, a lower level cache receives, from a processor core, a plurality of copy-type requests and a plurality of paste-type requests that together indicate a memory move to be performed. The lower level cache also receives, from the processor core, a barrier request that requests enforcement of ordering of memory access requests prior to the barrier request with respect to memory access requests after the barrier request. Prior to completion of processing of the barrier request by the lower level cache, the lower level cache speculatively issues a request on the interconnect fabric to obtain a copy of a data granule specified by a memory access request among the pluralities of requests that follows the barrier request in program order.
-
Publication No.: US20180052608A1
Publication date: 2018-02-22
Application No.: US15243581
Filing date: 2016-08-22
Inventors: LAKSHMINARAYANA B. ARIMILLI, GUY L. GUTHRIE, WILLIAM J. STARKE, JEFFREY A. STUECHELI, DEREK E. WILLIAMS
IPC classification: G06F3/06, G06F12/0897, G06F12/10, G06F9/30
CPC classification: G06F3/065, G06F3/061, G06F3/0656, G06F3/0659, G06F3/0673, G06F9/30032, G06F9/3004, G06F9/52, G06F12/0292, G06F12/0811, G06F12/0833, G06F12/0897, G06F2212/1016, G06F2212/1041, G06F2212/206
Abstract: A processor core of a data processing system, in response to a first instruction, generates a copy-type request specifying a source real address and transmits it to a lower level cache. In response to a second instruction, the processor core generates a paste-type request specifying a destination real address associated with a memory-mapped device and transmits it to the lower level cache. In response to receipt of the copy-type request, the lower level cache copies a data granule from a storage location specified by the source real address into a non-architected buffer. In response to receipt of the paste-type request, the lower level cache issues a command to write the data granule from the non-architected buffer to the memory-mapped device. In response to receipt from the memory-mapped device of a busy response, the processor core abandons the memory move instruction sequence and performs alternative processing.
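The copy/paste flow with a busy fallback that this abstract outlines can be modeled in a few lines. The sketch is a simplified software analogy under stated assumptions: `Device`, `LowerLevelCache`, `BusyDevice`, `memory_move`, and the `fallback` callback are hypothetical names, and a Python exception stands in for the busy response.

```python
from dataclasses import dataclass, field
from typing import Optional


class BusyDevice(Exception):
    """Stands in for the busy response returned by the memory-mapped device."""


@dataclass
class Device:
    """Hypothetical memory-mapped device that refuses writes while busy."""
    busy: bool = False
    received: list = field(default_factory=list)

    def write(self, granule: bytes) -> None:
        if self.busy:
            raise BusyDevice()
        self.received.append(granule)


@dataclass
class LowerLevelCache:
    memory: dict   # source real address -> data granule
    devices: dict  # destination real address -> Device
    buffer: Optional[bytes] = None  # models the non-architected buffer

    def copy_request(self, src_addr: int) -> None:
        # Copy-type request: load the granule into the buffer.
        self.buffer = self.memory[src_addr]

    def paste_request(self, dst_addr: int) -> None:
        # Paste-type request: write the buffered granule to the device.
        if self.buffer is None:
            raise RuntimeError("paste without a preceding copy")
        self.devices[dst_addr].write(self.buffer)
        self.buffer = None


def memory_move(cache, src, dst, fallback):
    """Run one copy/paste pair; on a busy response, abandon it and fall back."""
    cache.copy_request(src)
    try:
        cache.paste_request(dst)
        return "moved"
    except BusyDevice:
        cache.buffer = None  # abandon the sequence
        fallback(src, dst)   # alternative processing chosen by the caller
        return "fallback"


device = Device()
cache = LowerLevelCache(memory={0x1000: b"abc"}, devices={0x2000: device})
print(memory_move(cache, 0x1000, 0x2000, lambda s, d: None))
```

What the alternative processing consists of is left to the caller here, because the abstract does not specify it.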
-
Publication No.: US20180052607A1
Publication date: 2018-02-22
Application No.: US15243489
Filing date: 2016-08-22
IPC classification: G06F3/06, G06F12/0842, G06F12/0811, G06F12/10, G06F9/30, G06F13/40
CPC classification: G06F3/065, G06F3/061, G06F3/0656, G06F3/0673, G06F9/30032, G06F9/3004, G06F9/30047, G06F9/3005, G06F12/0811, G06F12/0833, G06F12/0842, G06F12/0897, G06F12/10, G06F13/4068, G06F2212/1056, G06F2212/206, G06F2212/62
Abstract: A data processing system includes at least one processor core each having an associated store-through upper level cache and an associated store-in lower level cache. In response to execution of a memory move instruction sequence including a plurality of copy-type instructions and a plurality of paste-type instructions, the at least one processor core transmits a corresponding plurality of copy-type and paste-type requests to its associated lower level cache, where each copy-type request specifies a source real address and each paste-type request specifies a destination real address. In response to receipt of each copy-type request, the associated lower level cache copies a respective data granule from a respective storage location specified by the source real address of that copy-type request into a non-architected buffer. In response to receipt of each paste-type request, the associated lower level cache writes a respective one of the data granules from the non-architected buffer to a respective storage location specified by the destination real address. The memory move instruction sequence begins execution on a first hardware thread and continues on a second hardware thread.
-
Publication No.: US20180052599A1
Publication date: 2018-02-22
Application No.: US15243554
Filing date: 2016-08-22
Inventors: LAKSHMINARAYANA B. ARIMILLI, GUY L. GUTHRIE, WILLIAM J. STARKE, JEFFREY A. STUECHELI, DEREK E. WILLIAMS
IPC classification: G06F3/06, G06F12/0897, G06F12/10, G06F13/40, G06F13/28
CPC classification: G06F3/061, G06F3/065, G06F3/0656, G06F3/0659, G06F3/0673, G06F12/0897, G06F12/10, G06F13/28, G06F13/4068, G06F2212/60
Abstract: A data processing system includes a processor core having a store-in lower level cache, a memory controller, a memory-mapped device, and an interconnect fabric communicatively coupling the lower level cache and the memory-mapped device. In response to a first instruction in the processor core, a copy-type request specifying a source real address is transmitted to the lower level cache. In response to a second instruction in the processor core, a paste-type request specifying a destination real address associated with the memory-mapped device is transmitted to the lower level cache. In response to receipt of the copy-type request, the lower level cache copies a data granule from a storage location specified by the source real address into a non-architected buffer. In response to receipt of the paste-type request, the lower level cache issues on the interconnect fabric a command that writes the data granule from the non-architected buffer to the memory-mapped device.
-