Apparatus and method for performing address translation using buffered address translation data

    Publication number: US11163691B2

    Publication date: 2021-11-02

    Application number: US16451384

    Filing date: 2019-06-25

    Applicant: Arm Limited

    Abstract: Examples of the present disclosure relate to an apparatus comprising processing circuitry to perform data processing operations, storage circuitry to store data for access by the processing circuitry, address translation circuitry to maintain address translation data for translating virtual memory addresses into corresponding physical memory addresses, and prefetch circuitry. The prefetch circuitry is arranged to prefetch first data into the storage circuitry in anticipation of the first data being required for performing the data processing operations. The prefetching comprises, based on a prediction scheme, predicting a first virtual memory address associated with the first data, accessing the address translation circuitry to determine a first physical memory address corresponding to the first virtual memory address, and retrieving the first data based on the first physical memory address corresponding to the first virtual memory address. The prefetch circuitry is further arranged, based on the prediction scheme, to predict a second virtual memory address associated with second data in anticipation of the second data being prefetched, and to provide the predicted second virtual memory address to the address translation circuitry to enable the address translation circuitry to obtain the address translation data for the second virtual memory address.
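
    A minimal software sketch of the scheme described in this abstract, assuming a simple stride-based prediction scheme and page-granular translations; the class and member names (Tlb, Prefetcher, prewarm, and so on) are illustrative, not taken from the patent.

```python
PAGE_SIZE = 4096

class Tlb:
    """Toy address-translation store mapping virtual pages to physical pages."""
    def __init__(self, page_table):
        self.page_table = page_table      # backing "page walk" data: vpage -> ppage
        self.entries = {}                 # cached translations

    def translate(self, vaddr):
        vpage = vaddr // PAGE_SIZE
        if vpage not in self.entries:     # miss: perform the (slow) table walk
            self.entries[vpage] = self.page_table[vpage]
        return self.entries[vpage] * PAGE_SIZE + vaddr % PAGE_SIZE

    def prewarm(self, vaddr):
        """Obtain the translation ahead of use without returning any data."""
        self.translate(vaddr)

class Prefetcher:
    def __init__(self, tlb, memory, stride):
        self.tlb, self.memory, self.stride = tlb, memory, stride
        self.cache = {}                   # models the storage circuitry

    def on_access(self, vaddr):
        # Predict the first virtual address and prefetch its data now.
        first_va = vaddr + self.stride
        pa = self.tlb.translate(first_va)
        self.cache[first_va] = self.memory.get(pa)
        # Predict the second virtual address but only warm its translation,
        # so the translation data is already present when that prefetch issues.
        second_va = first_va + self.stride
        self.tlb.prewarm(second_va)
```

    In this model the second prediction only touches the Tlb, which is the point of the abstract: the translation work for the next prefetch is started before that prefetch is actually issued.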

    Cache eviction control for a private cache in an out-of-order data processing apparatus

    Publication number: US11720494B1

    Publication date: 2023-08-08

    Application number: US17692305

    Filing date: 2022-03-11

    Applicant: Arm Limited

    CPC classification number: G06F12/0802 G06F12/12 G06F2212/60

    Abstract: Apparatuses and methods relating to controlling cache evictions are disclosed. Processing circuitry which executes instructions out-of-order is provided with a private cache into which blocks of data are copied from a shared storage location to which the processing circuitry shares access. The processing circuitry also has a read-after-read buffer, into which an entry comprising an address accessed by a load instruction is allocated when out-of-order execution of that load instruction occurs. The address remains as a valid entry in the read-after-read buffer until the load instruction is committed. Eviction of an eviction candidate block of data from the private cache to the shared storage location is controlled in dependence on whether the eviction candidate block of data has a corresponding valid entry in the read-after-read buffer.
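
    The gating of evictions on the read-after-read (RAR) buffer can be pictured with a small, hedged model; the structures and eviction policy below are assumptions for illustration, not the patented design.

```python
class RarBuffer:
    """Read-after-read buffer: addresses of loads executed out of order."""
    def __init__(self):
        self.valid = set()

    def allocate(self, addr):                 # on out-of-order execution of a load
        self.valid.add(addr)

    def retire(self, addr):                   # on commit of that load
        self.valid.discard(addr)

class PrivateCache:
    def __init__(self, capacity, rar):
        self.capacity, self.rar = capacity, rar
        self.blocks = []                      # LRU order, oldest first

    def fill(self, block_addr):
        if block_addr in self.blocks:
            self.blocks.remove(block_addr)    # refresh LRU position
        elif len(self.blocks) >= self.capacity:
            self._evict_one()
        self.blocks.append(block_addr)

    def _evict_one(self):
        # Eviction is controlled by the RAR buffer: prefer a candidate with no
        # valid entry, since evicting a block that an uncommitted load has
        # already read could force that load to be replayed.
        for candidate in self.blocks:
            if candidate not in self.rar.valid:
                self.blocks.remove(candidate)
                return
        self.blocks.pop(0)                    # every block is protected: plain LRU
```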

    Increasing effective cache associativity

    Publication number: US11138119B2

    Publication date: 2021-10-05

    Application number: US16247912

    Filing date: 2019-01-15

    Applicant: Arm Limited

    Abstract: There is provided an apparatus that includes storage circuitry. The storage circuitry is made up from a plurality of sets, each of the sets having at least one storage location. Receiving circuitry receives an access request that includes an input address. Lookup circuitry obtains a plurality of candidate sets that correspond with an index part of the input address. The lookup circuitry determines a selected storage location from the candidate sets using an access policy. The access policy causes the lookup circuitry to iterate through the candidate sets to attempt to locate an appropriate storage location. The appropriate storage location is accessed in response to the appropriate storage location being found.
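
    A rough sketch of the lookup policy, under the assumption that the candidate sets are derived from the index part of the address by a small family of index permutations; the real indexing and victim-selection schemes are not specified in this abstract.

```python
class MultiIndexCache:
    def __init__(self, num_sets, ways, num_candidates=2):
        self.num_sets, self.ways = num_sets, ways
        self.num_candidates = num_candidates
        self.sets = [dict() for _ in range(num_sets)]    # set -> {block address: data}

    def _candidate_sets(self, addr):
        index = addr % self.num_sets                     # the index part of the address
        # Each candidate set is produced by a different permutation of the index.
        return [(index * (2 * k + 1) + k) % self.num_sets
                for k in range(self.num_candidates)]

    def lookup(self, addr):
        # Iterate through the candidate sets until an appropriate location is found.
        for s in self._candidate_sets(addr):
            if addr in self.sets[s]:
                return self.sets[s][addr]                # hit in this candidate set
        return None                                      # miss in every candidate set

    def insert(self, addr, data):
        candidates = self._candidate_sets(addr)
        for s in candidates:
            if addr in self.sets[s] or len(self.sets[s]) < self.ways:
                self.sets[s][addr] = data                # matching or free way found
                return
        victim_set = self.sets[candidates[0]]            # all candidates full: evict
        victim_set.pop(next(iter(victim_set)))           # from the first candidate set
        victim_set[addr] = data
```

    Because an access may land in any of the candidate sets, the effective associativity seen by a given address is larger than the number of ways in a single set.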

    Identifying read-set information based on an encoding of replaceable-information values

    Publication number: US10783031B2

    Publication date: 2020-09-22

    Application number: US16105129

    Filing date: 2018-08-20

    Applicant: Arm Limited

    Abstract: An apparatus comprises processing circuitry, transactional memory support circuitry and a cache. The processing circuitry processes threads of data processing, and the transactional memory support circuitry supports execution of a transaction within a thread, including tracking a read set of addresses, comprising addresses accessed by read instructions within the transaction. A transaction comprises instructions for which the processing circuitry is configured to prevent commitment of the results of speculatively executed instructions until the transaction has completed. The cache has a plurality of entries, each associated with an address and specifying a replaceable-information value for that address that comprises information for which, outside of the transaction, processing would be functionally correct even if the information were incorrect. While the transaction is pending, the transactional memory support circuitry identifies, based on an encoding of the replaceable-information values, read-set information identifying addresses in the read set tracked for the transaction.
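
    As a hedged toy model, the replaceable information can be pictured as a per-entry replacement-age field, with one reserved encoding marking membership of the transaction's read set; the actual field and encoding are not stated by this abstract, so the names and values below are assumptions.

```python
READ_SET_MARK = 0xF              # assumed reserved age value: "read inside the current transaction"

class Entry:
    def __init__(self, addr, data):
        self.addr, self.data = addr, data
        self.age = 0             # replaceable info: a wrong value only costs performance

class TxCache:
    def __init__(self):
        self.entries = {}        # address -> Entry (fills and evictions simplified away)
        self.in_tx = False

    def read(self, addr):
        entry = self.entries.setdefault(addr, Entry(addr, None))
        if self.in_tx:
            entry.age = READ_SET_MARK      # encode read-set membership in the age field
        else:
            entry.age = 0                  # ordinary replacement bookkeeping
        return entry.data

    def read_set(self):
        # Read-set information is recovered purely from the encoding of the
        # replaceable-information values; no separate tracking structure exists.
        return {a for a, e in self.entries.items() if e.age == READ_SET_MARK}
```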

    Technique for managing a cache memory in a system employing transactional memory including storing a backup copy of initial data for quick recovery from a transaction abort

    Publication number: US10956206B2

    Publication date: 2021-03-23

    Application number: US16372690

    Filing date: 2019-04-02

    Applicant: Arm Limited

    Abstract: A technique is described for managing a cache structure in a system employing transactional memory. The apparatus comprises processing circuitry to perform data processing operations in response to instructions, the processing circuitry comprising transactional memory support circuitry to support execution of a transaction, and a cache structure comprising a plurality of cache entries for storing data for access by the processing circuitry. Each cache entry has associated therewith an allocation tag, and allocation tag control circuitry is provided to control use of a plurality of allocation tags and to maintain an indication of a current state of each of those allocation tags. The transactional memory support circuitry is arranged, when initial data in a chosen cache entry is to be written to during the transaction, to cause a backup copy of the initial data to be stored in a further cache entry and to cause the allocation tag control circuitry to associate with that further cache entry a selected allocation tag selected for the transaction. The current state of that selected allocation tag is updated to a first state which prevents the processing circuitry from accessing that further cache entry. In the event that the transaction is aborted prior to reaching a transaction end point, the transactional memory support circuitry causes the chosen cache entry to be invalidated, and the allocation tag control circuitry changes the state of the selected allocation tag to a second state that allows the processing circuitry to access the further cache entry. As a result, this enables a hit to subsequently be detected within the cache structure for the initial data without a requirement to refetch the initial data into the cache structure. This can give rise to significant performance enhancements.
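
    A simplified software sketch of the backup-copy scheme, assuming one allocation tag per transaction and a single per-tag state flag; the entry layout, state names and abort handling below are assumptions for illustration, not the patented implementation.

```python
BLOCKED, VISIBLE = "blocked", "visible"    # the two allocation-tag states

class Entry:
    def __init__(self, addr, data, tag=None):
        self.addr, self.data, self.tag, self.valid = addr, data, tag, True

class TxCache:
    def __init__(self):
        self.entries = {}      # address -> list of entries (working copy + backup)
        self.tag_state = {}    # allocation tag -> BLOCKED or VISIBLE

    def tx_write(self, addr, new_data, tx_tag):
        working = self._visible(addr)
        group = self.entries[addr]
        if not any(e.tag == tx_tag for e in group):
            # First transactional write to this entry: store a backup of the
            # initial data in a further entry, hidden behind the transaction's
            # allocation tag while the transaction is pending.
            self.tag_state[tx_tag] = BLOCKED
            group.append(Entry(addr, working.data, tag=tx_tag))
        working.data = new_data

    def abort(self, tx_tag):
        for group in self.entries.values():
            if any(e.tag == tx_tag for e in group):
                for e in group:
                    if e.tag is None:
                        e.valid = False        # drop the speculatively written copy
        # Flip the tag state so the backup now hits without refetching the data.
        self.tag_state[tx_tag] = VISIBLE

    def read(self, addr):
        return self._visible(addr).data

    def _visible(self, addr):
        for e in self.entries.setdefault(addr, [Entry(addr, None)]):
            if e.valid and (e.tag is None or self.tag_state.get(e.tag) == VISIBLE):
                return e
        raise LookupError(f"no accessible entry for {addr:#x}")
```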

    Prefetching in data processing circuitry

    Publication number: US10776274B2

    Publication date: 2020-09-15

    Application number: US15910122

    Filing date: 2018-03-02

    Applicant: Arm Limited

    Abstract: Data processing circuitry comprises a cache memory to cache a subset of data elements from a main memory, and a processing element to execute program code to access data elements having respective memory addresses. The processing element is configured to access the data elements in the cache memory and, in the case of a cache miss, to fetch the data elements from the main memory. Prefetch circuitry, responsive to an access to a current data element, initiates prefetching into the cache memory of a data element at a memory address defined by a current offset value relative to the address of the current data element. Offset value selection circuitry comprises an address table to store memory addresses for which a data element accessed by the processing element resulted in a cache miss or an access to a previously prefetched data element, and detector circuitry to detect, for each of a group of candidate offset values, one or more respective metrics representing a proportion of a set of data element accesses which resulted in a cache miss or an access to a previously prefetched data element, and for which the memory address for that data element access differs by the candidate offset value from a memory address in the address table. The detector circuitry is configured to process the group of candidate offset values as successive complementary sub-groups of one or more of the group of candidate offset values, and to set a next instance of the current offset value in response to processing each sub-group, in dependence upon the proportions indicated by the one or more detected metrics for that sub-group and the one or more metrics previously detected for the current offset value.
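
    A rough model in the spirit of this offset-selection loop, assuming a fixed observation window per sub-group; the sub-group size, window length and scoring below are illustrative parameters, not values from the patent.

```python
from collections import deque

class OffsetSelector:
    def __init__(self, candidates, subgroup_size=4, window=256, table_size=64):
        self.candidates = list(candidates)
        self.subgroup_size = subgroup_size
        self.window = window
        self.address_table = deque(maxlen=table_size)   # addresses of qualifying accesses
        self.current_offset = self.candidates[0]
        self.current_score = 0
        self._queue = list(self.candidates)
        self._scores = {}
        self._seen = 0

    def record_access(self, addr):
        """Call for each access that missed or hit previously prefetched data."""
        subgroup = self._queue[:self.subgroup_size]
        for off in subgroup:
            # Metric: does some earlier tracked address differ by this candidate offset?
            if addr - off in self.address_table:
                self._scores[off] = self._scores.get(off, 0) + 1
        self.address_table.append(addr)
        self._seen += 1
        if self._seen >= self.window:
            self._finish_subgroup(subgroup)

    def _finish_subgroup(self, subgroup):
        best = max(subgroup, key=lambda off: self._scores.get(off, 0))
        # Set the next current offset only if this sub-group's best candidate
        # beats the score previously observed for the current offset.
        if self._scores.get(best, 0) > self.current_score:
            self.current_offset = best
            self.current_score = self._scores.get(best, 0)
        self._queue = self._queue[self.subgroup_size:] or list(self.candidates)
        self._scores, self._seen = {}, 0
```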

    Prefetching in data processing circuitry

    Publication number: US10769069B2

    Publication date: 2020-09-08

    Application number: US15910137

    Filing date: 2018-03-02

    Applicant: Arm Limited

    Abstract: Data processing circuitry comprises a cache memory to cache a subset of data elements from a main memory, and a processing element to execute program code to access data elements having respective memory addresses. The processing element is configured to access the data elements in the cache memory and, in the case of a cache miss, to fetch the data elements from the main memory. Prefetch circuitry, responsive to an access to a current data element, initiates prefetching into the cache memory of a data element at a memory address defined by a current offset value relative to the address of the current data element. Offset value selection circuitry comprises an address table to store memory addresses for which a data element accessed by the processing element resulted in a cache miss or an access to a previously prefetched data element, and detector circuitry to detect, for each of a group of candidate offset values, one or more respective metrics representing a proportion of a set of data element accesses which resulted in a cache miss or an access to a previously prefetched data element, and for which the memory address for that data element access differs by the candidate offset value from a memory address in the address table. The detector circuitry is configured to set a next instance of the current offset value in response to the one or more detected metrics. Verification circuitry detects, at one or more predetermined stages with respect to the processing of the group of candidate offset values by the offset value selection circuitry, one or more verification metrics representing a proportion of a set of data element accesses which resulted in a cache miss or an access to a previously prefetched data element, and for which the memory address for that data element access differs by the current offset value from a memory address in the address table, and detects whether the one or more verification metrics comply with a predetermined condition. Control circuitry inhibits prefetching, at least until a next selection of a current offset value by the offset value selection circuitry, in response to a detection by the verification circuitry that the one or more verification metrics do not comply with the predetermined condition.
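
    The verification and inhibition step can be sketched as follows, layered on a simple offset-prefetcher model; the threshold, check interval and method names are assumptions for illustration.

```python
from collections import deque

class VerifiedPrefetcher:
    def __init__(self, offset, threshold=0.3, check_interval=128, table_size=64):
        self.offset = offset
        self.threshold = threshold
        self.check_interval = check_interval
        self.address_table = deque(maxlen=table_size)
        self.hits = 0
        self.seen = 0
        self.inhibited = False

    def record_access(self, addr):
        """Call for each access that missed or hit previously prefetched data."""
        if addr - self.offset in self.address_table:
            self.hits += 1                     # the current offset would have covered this
        self.address_table.append(addr)
        self.seen += 1
        if self.seen >= self.check_interval:
            self._verify()

    def _verify(self):
        # Verification metric: the fraction of qualifying accesses that the current
        # offset covers. Falling below the threshold inhibits prefetching until a
        # new current offset is selected.
        if self.hits / self.seen < self.threshold:
            self.inhibited = True
        self.hits = self.seen = 0

    def prefetch_address(self, addr):
        return None if self.inhibited else addr + self.offset

    def on_new_offset_selected(self, offset):
        self.offset = offset
        self.inhibited = False                 # inhibition lasts until re-selection
```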
