-
Publication No.: US12182428B2
Publication date: 2024-12-31
Application No.: US17124872
Filing date: 2020-12-17
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov, Johnathan Alsop, SeyedMohammad SeyedzadehDelcheh
Abstract: Systems, apparatuses, and methods for determining data placement based on packet metadata are disclosed. A system includes a traffic analyzer that determines data placement across connected devices based on observed values of the metadata fields in actively exchanged packets across a plurality of protocol types. In one implementation, the protocol that is supported by the system is the compute express link (CXL) protocol. The traffic analyzer performs various actions in response to events observed in a packet stream that match items from a pre-configured list. Data movement is handled underneath the software applications by changing the virtual-to-physical address translation once the data movement is completed. After the data movement is finished, threads will pull in the new host physical address into their translation lookaside buffers (TLBs) via a page table walker or via an address translation service (ATS) request.
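As a rough illustration of the matching step described in this abstract, the sketch below counts packets whose metadata matches a pre-configured list and reports pages that could be considered for movement. The `Packet` fields, the watch-list entries, the action names, and the 4 KiB page size are all assumptions made for the example; none of them come from the patent.

```python
from collections import Counter
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass(frozen=True)
class Packet:
    protocol: str  # illustrative value such as "CXL.mem"
    opcode: str    # hypothetical metadata field
    address: int   # host physical address carried by the packet


# Hypothetical pre-configured list: metadata pattern -> action name.
WATCH_LIST = {
    ("CXL.mem", "read"): "count_remote_read",
    ("CXL.cache", "read"): "count_cached_read",
}


def analyze(packets, watch_list=WATCH_LIST):
    """Count watch-list matches per (action, page number)."""
    hits = Counter()
    for pkt in packets:
        action = watch_list.get((pkt.protocol, pkt.opcode))
        if action is not None:
            hits[(action, pkt.address // PAGE_SIZE)] += 1
    return hits


def migration_candidates(hits, action, min_hits):
    """Pages whose match count for `action` reaches `min_hits`."""
    return sorted(
        page for (act, page), n in hits.items()
        if act == action and n >= min_hits
    )
```

A real analyzer would sit in hardware on the link; this only models the bookkeeping, and the remapping of virtual-to-physical translations after a move is not shown.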
-
Publication No.: US20240111421A1
Publication date: 2024-04-04
Application No.: US17957732
Filing date: 2022-09-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Nathaniel Morris, Kevin Yu-Cheng Cheng, Atul Kumar Sujayendra Sandur, Sergey Blagodurov
IPC: G06F3/06
CPC classification number: G06F3/0611, G06F3/0653, G06F3/0673
Abstract: Connection modification based on traffic pattern is described. In accordance with the described techniques, a traffic pattern of memory operations across a set of connections between at least one device and at least one memory is monitored. The traffic pattern is then compared to a threshold traffic pattern condition, such as an amount of data traffic in different directions across the connections. A traffic direction of at least one connection of the set of connections is modified based on the traffic pattern corresponding to the threshold traffic pattern condition.
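One way to picture the comparison against a threshold traffic pattern condition is the sketch below. It assumes each connection is currently assigned either a "read" or a "write" direction, that the condition is a share of total bytes, and that one connection is flipped per decision; these are illustrative choices, not details from the publication.

```python
def rebalance(directions, read_bytes, write_bytes, threshold=0.75):
    """Return a new direction list, flipping at most one connection.

    directions: list of "read" / "write" labels, one per connection.
    threshold:  assumed share of traffic that triggers a flip.
    """
    result = list(directions)
    total = read_bytes + write_bytes
    if total == 0:
        return result  # no traffic observed, nothing to decide
    read_share = read_bytes / total
    # Keep at least one connection in each direction.
    if read_share >= threshold and result.count("write") > 1:
        result[result.index("write")] = "read"
    elif (1 - read_share) >= threshold and result.count("read") > 1:
        result[result.index("read")] = "write"
    return result
```

For example, with four connections split evenly and 90% of observed bytes being reads, one write connection is turned around to carry reads.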
-
Publication No.: US11677813B2
Publication date: 2023-06-13
Application No.: US17347116
Filing date: 2021-06-14
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov
IPC: H04L67/1008, H04L41/0816, H04L45/30, H04L41/0896, H04L47/80
CPC classification number: H04L67/1008, H04L41/0816, H04L41/0896, H04L45/30, H04L47/805
Abstract: A server includes a plurality of nodes that are connected by a network that includes an on-chip network or an inter-chip network that connects the nodes. The server also includes a controller to configure the network based on relative priorities of workloads that are executing on the nodes. Configuring the network can include allocating buffers to virtual channels supported by the network based on the relative priorities of the workloads associated with the virtual channels, configuring routing tables that route the packets over the network based on the relative priorities of the workloads that generate the packets, or modifying arbitration weights to favor granting access to the virtual channels to packets generated by higher priority workloads.
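The buffer-allocation option mentioned in this abstract can be sketched as a proportional split. The policy below (one guaranteed buffer per virtual channel, the remainder divided in proportion to workload priority with largest-remainder rounding) is an assumption chosen for the example; the patent does not prescribe this formula.

```python
def allocate_buffers(priorities, total_buffers):
    """Split `total_buffers` across virtual channels by priority.

    priorities: dict mapping a virtual-channel name to a positive
                priority value (hypothetical representation).
    """
    if total_buffers < len(priorities):
        raise ValueError("need at least one buffer per virtual channel")
    spare = total_buffers - len(priorities)
    total_priority = sum(priorities.values())  # assumed to be > 0
    shares = {vc: spare * p / total_priority for vc, p in priorities.items()}
    # Every channel gets one buffer plus the whole part of its share.
    alloc = {vc: 1 + int(shares[vc]) for vc in priorities}
    leftover = total_buffers - sum(alloc.values())
    # Hand out what rounding left over, largest fractional part first.
    by_fraction = sorted(
        priorities, key=lambda vc: shares[vc] - int(shares[vc]), reverse=True
    )
    for vc in by_fraction[:leftover]:
        alloc[vc] += 1
    return alloc
```

The same priority values could also be used as arbitration weights, which is the third option the abstract lists.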
-
Publication No.: US20230169015A1
Publication date: 2023-06-01
Application No.: US17539189
Filing date: 2021-11-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Kishore Punniyamurthy, SeyedMohammad SeyedzadehDelcheh, Sergey Blagodurov, Ganesh Dasika, Jagadish B. Kotra
IPC: G06F12/126
CPC classification number: G06F12/126, G06F2212/6042
Abstract: A method includes storing a function representing a set of data elements stored in a backing memory and, in response to a first memory read request for a first data element of the set of data elements, calculating a function result representing the first data element based on the function.
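A minimal sketch of the idea, assuming the stored function is linear in the element index: a read that falls inside a covered region is answered by evaluating the function rather than fetching from backing memory. The class name, the linear form, and the dictionary standing in for backing memory are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionRegion:
    base: int       # first address covered by the function
    count: int      # number of data elements represented
    slope: int      # assumed linear form: value = slope * index + intercept
    intercept: int

    def covers(self, addr):
        return self.base <= addr < self.base + self.count

    def value_at(self, addr):
        return self.slope * (addr - self.base) + self.intercept


def read(addr, regions, backing):
    """Serve a read from a stored function when one covers `addr`."""
    for region in regions:
        if region.covers(addr):
            return region.value_at(addr)  # calculated, not fetched
    return backing[addr]                  # fall back to backing memory
```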
-
Publication No.: US11586539B2
Publication date: 2023-02-21
Application No.: US16713940
Filing date: 2019-12-13
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Weon Taek Na, Jagadish B. Kotra, Yasuko Eckert, Steven Raasch, Sergey Blagodurov
IPC: G06F12/08, G06F12/0811, G06F12/0871, G06F9/30, G06F12/0882, G06F12/1027, G06F12/0831
Abstract: A processing system selectively allocates space to store a group of one or more cache lines at a cache level of a cache hierarchy having a plurality of cache levels based on memory access patterns of a software application executing at the processing system. The processing system generates bit vectors indicating which cache levels are to allocate space to store groups of one or more cache lines based on the memory access patterns, which are derived from data granularity and movement information. Based on the bit vectors, the processing system provides hints to the cache hierarchy indicating the lowest cache level that can exploit the reuse potential for particular data.
-
Publication No.: US11526449B2
Publication date: 2022-12-13
Application No.: US17007133
Filing date: 2020-08-31
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Johnathan Alsop, Pouya Fotouhi, Bradford Beckmann, Sergey Blagodurov
IPC: G06F12/08, G06F12/0891, G06F9/30, G06F12/0882, G06F12/0811
Abstract: A processing system limits the propagation of unnecessary memory updates by bypassing writing back dirty cache lines to other levels of a memory hierarchy in response to receiving an indication from software executing at a processor of the processing system that the value of the dirty cache line is dead (i.e., will not be read again or will not be read until after it has been overwritten). In response to receiving an indication from software that data is dead, a cache controller prevents propagation of the dead data to other levels of memory in response to eviction of the dead data or flushing of the cache at which the dead data is stored.
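The write-back bypass can be modeled in a few lines. This is a toy software model, assuming a single cache level in front of a dictionary that stands in for the next level of memory; the `mark_dead` method is a hypothetical stand-in for the software indication the abstract mentions.

```python
class Cache:
    """Toy write-back cache that drops dirty lines marked as dead."""

    def __init__(self, backing):
        self.backing = backing  # next level of memory (a dict here)
        self.lines = {}         # addr -> {"value", "dirty", "dead"}

    def write(self, addr, value):
        self.lines[addr] = {"value": value, "dirty": True, "dead": False}

    def mark_dead(self, addr):
        # Software indicates the value will not be read again.
        if addr in self.lines:
            self.lines[addr]["dead"] = True

    def evict(self, addr):
        line = self.lines.pop(addr)
        # Only live dirty data is propagated to the next level.
        if line["dirty"] and not line["dead"]:
            self.backing[addr] = line["value"]

    def flush(self):
        for addr in list(self.lines):
            self.evict(addr)
```

After writing two lines, marking one dead, and flushing, only the live line reaches the backing store.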
-
Publication No.: US11385983B1
Publication date: 2022-07-12
Application No.: US17130665
Filing date: 2020-12-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov, Jinyoung Choi
Abstract: An approach is provided for implementing memory profiling aggregation. A hardware aggregator provides memory profiling aggregation by controlling the execution of a plurality of hardware profilers that monitor memory performance in a system. For each hardware profiler of the plurality of hardware profilers, a hardware counter value is compared to a threshold value. When a threshold value is satisfied, execution of a respective hardware profiler of the plurality of hardware profilers is initiated to monitor memory performance. Multiple hardware profilers of the plurality of hardware profilers may execute concurrently and each generate a result counter value. The result counter values generated by each hardware profiler of the plurality of hardware profilers are aggregated to generate an aggregate result counter value. The aggregate result counter value is stored in memory that is accessible by software processes for use in optimizing memory-management policy decisions.
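The control flow in this abstract (compare counter to threshold, start the profiler, aggregate the results) can be sketched in software as follows. The `Profiler` structure, the "greater than or equal" test for a satisfied threshold, and summation as the aggregation are assumptions for illustration; the hardware in the patent may differ on each point.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Profiler:
    counter: str               # name of the hardware counter to watch
    threshold: int             # value that triggers this profiler
    run: Callable[[], int]     # returns the profiler's result counter


def aggregate(profilers, counters):
    """Run every triggered profiler and sum the result counters.

    counters: dict of current hardware counter values (hypothetical).
    Returns (aggregate_value, number_of_profilers_run).
    """
    total = 0
    started = 0
    for prof in profilers:
        # Assumed meaning of "threshold satisfied": counter >= threshold.
        if counters.get(prof.counter, 0) >= prof.threshold:
            total += prof.run()
            started += 1
    return total, started
```

The loop runs profilers one after another for simplicity, whereas the abstract allows them to execute concurrently.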
-
Publication No.: US20220100668A1
Publication date: 2022-03-31
Application No.: US17094989
Filing date: 2020-11-11
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov, Marko Scrbak, Brandon K. Potter
IPC: G06F12/0877, G06F12/0815
Abstract: Methods and apparatus provide monitoring of memory access traffic in a data processing system by tracking, such as by data fabric hardware control logic, a number of cache line accesses to a page of memory associated with one or more memory devices, and producing spike indication data that indicates a spike in cache line accesses to a given page of memory. Pages are moved from a slower memory to a faster memory based on the spike indication data. In some implementations, the tracking is done by updating a cache directory with data representing the tracked number of cache line accesses.
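One plausible reading of "spike indication data" is a per-page comparison between consecutive sampling intervals. The sketch below assumes that definition, along with a growth factor and a minimum access count that are purely illustrative; the publication does not specify these values.

```python
def find_spikes(prev_counts, cur_counts, factor=4, min_accesses=64):
    """Return page numbers whose access count spiked this interval.

    prev_counts / cur_counts: dicts mapping page number to the number
    of cache line accesses seen in the previous / current interval.
    A page spikes if it has at least `min_accesses` accesses and at
    least `factor` times its previous count (both assumed criteria).
    """
    return sorted(
        page
        for page, n in cur_counts.items()
        if n >= min_accesses and n >= factor * prev_counts.get(page, 0)
    )


def migrate(spiking_pages, slow_pages, fast_pages):
    """Move spiking pages from the slower set to the faster set."""
    for page in spiking_pages:
        if page in slow_pages:
            slow_pages.remove(page)
            fast_pages.add(page)
```

In the abstract the counts are tracked by data fabric hardware, for instance in a cache directory; the dictionaries here only stand in for that state.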
-
Publication No.: US10678702B2
Publication date: 2020-06-09
Application No.: US15167038
Filing date: 2016-05-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov, Andrew G. Kegel
IPC: G06F12/10, G06F12/1009, G06F12/1081, G06F12/1027, G06F3/06
Abstract: The described embodiments include an input-output memory management unit (IOMMU) with two or more memory elements and a controller. The controller is configured to select, based on one or more factors, one or more selected memory elements from among the two or more memory elements for performing virtual address to physical address translations in the IOMMU. The controller then performs the virtual address to physical address translations using the one or more selected memory elements.
-
Publication No.: US09916265B2
Publication date: 2018-03-13
Application No.: US14569825
Filing date: 2014-12-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov, Gabriel H. Loh, Yasuko Eckert
CPC classification number: G06F13/1694, G06F11/3414, G06F12/023, G06F13/161, G06F2212/1044, Y02D10/14
Abstract: A system includes a plurality of memory classes and a set of one or more processing units coupled to the plurality of memory classes. The system further includes a data migration controller to select a traffic rate as a maximum traffic rate for transferring data between the plurality of memory classes based on a net benefit metric associated with the traffic rate, and to enforce the maximum traffic rate for transferring data between the plurality of memory classes.
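The selection step can be illustrated by scoring a set of candidate traffic rates and keeping the one with the highest net benefit. Defining net benefit as benefit minus cost, and enforcing the limit by clamping each transfer request, are assumptions for this sketch; the patent's metric may be defined differently.

```python
def select_max_rate(candidates, benefit, cost):
    """Pick the candidate rate with the highest net benefit.

    benefit / cost: callables mapping a traffic rate to a number
    (hypothetical models of migration gain and migration overhead).
    """
    return max(candidates, key=lambda rate: benefit(rate) - cost(rate))


def enforce(requested_rate, max_rate):
    """Clamp a requested migration rate to the selected maximum."""
    return min(requested_rate, max_rate)
```

With a linear benefit and a quadratic cost, for example, net benefit rises and then falls, so the selected maximum is an interior candidate rather than the largest rate available.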
-