Register freeing latency
    Granted invention patent

    Publication number: US12112169B2

    Publication date: 2024-10-08

    Application number: US18096141

    Filing date: 2023-01-12

    Applicant: Arm Limited

    CPC classification number: G06F9/30098 G06F9/30094 G06F9/384

    Abstract: A data processing apparatus is provided. Instruction send circuitry sends an instruction to an external processor to be executed by the external processor. Allocation circuitry allocates a specified one of several registers for a result of the instruction having been executed on the external processor and data receive circuitry receives the result of the instruction having been executed on the external processor and stores the result in the specified one of the several registers. In response to a condition being met: the specified one of the several registers is dereserved prior to the result being received by the data receive circuitry, and the result is discarded by the data receive circuitry when the result is received by the data receive circuitry.
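
A minimal Python sketch of the early-dereservation behaviour described in this abstract. It is an illustration only, not the patented circuitry. The class name `ResultRegisterFile`, the instruction `tag`, and the `cancel` method standing in for "a condition being met" are all hypothetical; the abstract does not say what the condition is.

```python
class ResultRegisterFile:
    """Toy model of registers reserved for results coming back from an external processor."""

    def __init__(self, num_registers):
        self.values = [None] * num_registers
        self.free = set(range(num_registers))
        self.pending = {}      # tag -> index of the register reserved for that result
        self.discard = set()   # tags whose late results must be dropped

    def send(self, tag):
        """Reserve the lowest free register for the instruction identified by tag."""
        reg = min(self.free)
        self.free.remove(reg)
        self.pending[tag] = reg
        return reg

    def cancel(self, tag):
        """Assumed condition: dereserve the register before the result has arrived."""
        reg = self.pending.pop(tag)
        self.free.add(reg)
        self.discard.add(tag)

    def receive(self, tag, result):
        """Store the result, or discard it if its register was already dereserved."""
        if tag in self.discard:
            self.discard.remove(tag)
            return False
        self.values[self.pending.pop(tag)] = result
        return True
```

In this sketch the freed register can be handed to a newer instruction straight away, and the stale result is dropped on arrival rather than overwriting the new owner's data.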

    Technique for improving efficiency of data processing operations in an apparatus that employs register renaming

    Publication number: US12099847B2

    Publication date: 2024-09-24

    Application number: US18101726

    Filing date: 2023-01-26

    Applicant: Arm Limited

    CPC classification number: G06F9/384

    Abstract: A data processing apparatus comprises: execution circuitry to execute instructions in order to perform data processing operations specified by those instructions; a plurality of registers to store data values for access by the execution circuitry when performing the data processing operations, each register having an associated physical register identifier; register rename circuitry to select physical register identifiers to associate with architectural register identifiers specified by the instructions; and rename storage having a plurality of entries, each entry being associated with one of the architectural register identifiers and used by the register rename circuitry to indicate a physical register identifier selected for association with that one of the architectural register identifiers; the register rename circuitry comprising an execute unit, and being responsive to detection of an early execute condition for a given instruction, the early execute condition requiring at least detection that each source value required to execute the given instruction is available to the register rename circuitry without accessing the plurality of registers, to cause the execute unit to perform the data processing operation specified by the given instruction in order to generate a result value, and to cause the generated result value to be stored in an entry of the rename storage associated with a destination architectural register identifier specified by the given instruction.
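
The early execute condition in this abstract can be sketched as follows. This is a simplified model under stated assumptions: `Renamer`, `RenameEntry`, `rename_add` and `rename_mov_imm` are hypothetical names, only an add operation is modelled, and the immediate move is assumed as one way a value could first reach rename storage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RenameEntry:
    phys_reg: Optional[int] = None  # physical register mapped to the architectural register
    value: Optional[int] = None     # result value held directly in rename storage, if known


class Renamer:
    def __init__(self, num_arch_regs, num_phys_regs):
        self.table = [RenameEntry() for _ in range(num_arch_regs)]
        self.free_phys = list(range(num_phys_regs))

    def rename_mov_imm(self, dst, imm):
        # Assumption: an immediate needs no register read, so it can be stored at rename.
        self.table[dst] = RenameEntry(value=imm)

    def rename_add(self, dst, src1, src2):
        """Rename 'dst = src1 + src2' and report whether it executed early."""
        a, b = self.table[src1].value, self.table[src2].value
        if a is not None and b is not None:
            # Early execute condition: every source is available in rename storage.
            self.table[dst] = RenameEntry(value=a + b)
            return "early"
        # Otherwise allocate a physical register and leave execution to the pipeline.
        self.table[dst] = RenameEntry(phys_reg=self.free_phys.pop(0))
        return "issued"
```

The abstract says the condition requires "at least" source availability, so a real design may add further checks that this sketch omits.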

    Cache eviction control for a private cache in an out-of-order data processing apparatus

    Publication number: US11720494B1

    Publication date: 2023-08-08

    Application number: US17692305

    Filing date: 2022-03-11

    Applicant: Arm Limited

    CPC classification number: G06F12/0802 G06F12/12 G06F2212/60

    Abstract: Apparatuses and methods relating to controlling cache evictions are disclosed. Processing circuitry which executes instructions out-of-order is provided with a private cache into which blocks of data are copied from a shared storage location to which the processing circuitry shares access. The processing circuitry also has a read-after-read buffer, into which an entry comprising an address accessed by a load instruction is allocated when out-of-order execution of the load instruction occurs. The address remains as a valid entry in the read-after-read buffer until the load instruction is committed. Eviction of an eviction candidate block of data from the private cache to the shared storage location is controlled in dependence on whether the eviction candidate block of data has a corresponding valid entry in the read-after-read buffer.
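
A small sketch of one possible eviction policy consistent with this abstract. The abstract only says eviction is controlled "in dependence on" the read-after-read buffer; refusing the eviction outright, as below, is an assumption. The names `PrivateCache`, `speculative_load`, `commit` and `try_evict` are hypothetical.

```python
class PrivateCache:
    def __init__(self):
        self.blocks = set()  # block addresses currently held in the private cache
        self.rar = {}        # load id -> block address, valid until the load commits

    def speculative_load(self, load_id, addr):
        """An out-of-order load brings in the block and allocates a read-after-read entry."""
        self.blocks.add(addr)
        self.rar[load_id] = addr

    def commit(self, load_id):
        """Committing the load invalidates its read-after-read entry."""
        self.rar.pop(load_id, None)

    def try_evict(self, addr):
        """Assumed policy: refuse eviction while an uncommitted load still tracks the block."""
        if addr in self.rar.values():
            return False
        self.blocks.discard(addr)
        return True
```

Once every load that touched a block has committed, the same `try_evict` call succeeds.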

    Cache control circuitry and methods

    Publication number: US11132202B2

    Publication date: 2021-09-28

    Application number: US16580158

    Filing date: 2019-09-24

    Applicant: Arm Limited

    Abstract: An apparatus comprises execution circuitry to perform operations on source data values and to generate result data values; issue circuitry comprising one or more issue queues identifying pending operations awaiting performance by the execution circuitry, and selection circuitry to select pending operations to issue to the execution circuitry; data value cache storage comprising first and second cache regions; and cache control circuitry to control the storing to the first cache region of result data values generated by the execution circuitry and the eviction of stored result data values from the first cache region in response to newly generated result data values being stored in the first cache region; the cache control circuitry being configured to store to the second cache region result data values required as source data values for one or more oldest pending operations identified by the one or more issue queues and to inhibit eviction of a given result data value stored in the second cache region until initiation of execution of a pending operation which requires that given result data value as a source data value.
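
The two-region data value cache can be sketched roughly as below. This is an illustrative model only: the oldest-first replacement in the first region, the `needed_by_oldest` flag supplied by the caller, and removing a pinned value as soon as its consumer starts are all assumptions, as are the names `DataValueCache`, `write_result`, `lookup` and `start_execution`.

```python
from collections import OrderedDict


class DataValueCache:
    def __init__(self, first_capacity):
        self.first = OrderedDict()  # tag -> value; evicted as new results arrive
        self.second = {}            # tag -> value; protected from eviction
        self.first_capacity = first_capacity

    def write_result(self, tag, value, needed_by_oldest):
        """Store a new result; values needed by the oldest pending ops go to region two."""
        if needed_by_oldest:
            self.second[tag] = value
            return
        self.first[tag] = value
        if len(self.first) > self.first_capacity:
            self.first.popitem(last=False)  # assumed policy: drop the oldest entry

    def lookup(self, tag):
        if tag in self.second:
            return self.second[tag]
        return self.first.get(tag)

    def start_execution(self, source_tags):
        """Read the sources; starting the consumer ends the protection of its pinned values."""
        values = [self.lookup(t) for t in source_tags]
        for t in source_tags:
            self.second.pop(t, None)
        return values
```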

    Dynamic SIMD instruction issue target selection

    Publication number: US10725964B2

    Publication date: 2020-07-28

    Application number: US16005790

    Filing date: 2018-06-12

    Applicant: Arm Limited

    Abstract: Apparatuses and methods of data processing are disclosed. An apparatus comprises two data processing clusters each having multiple data processing lanes to perform single instruction multiple data (SIMD) processing. Decoded instructions are issued to at least one of the two data processing clusters. A decoded SIMD instruction specifying a vector length which is more than the width of the data processing lanes of the first data processing cluster has a first part issued to the first data processing cluster for execution. An issuance target for a second remaining part of the decoded SIMD instruction is selected in dependence on a dynamic performance condition. When the dynamic performance condition has a first state the issuance target is the first data processing cluster and when the dynamic performance condition has a second state the issuance target is the second data processing cluster. When the issuance target is the first data processing cluster, the first and second parts of the decoded SIMD instruction are scheduled in series.
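
The issue-target selection can be illustrated with a short planning function. The abstract does not say what the dynamic performance condition measures, so the boolean `second_cluster_available` is a hypothetical stand-in for its two states. The function name `plan_issue`, the cluster labels, and the slot numbering are also assumptions, and the remainder is assumed to fit in a single second part.

```python
def plan_issue(vector_bits, cluster_bits, second_cluster_available):
    """Return a list of (cluster, slot, bits) parts for one decoded SIMD instruction.

    Parts sharing a slot run in parallel; a higher slot on the same cluster
    means the part is scheduled in series after the earlier one.
    """
    if vector_bits <= cluster_bits:
        return [("cluster0", 0, vector_bits)]
    first = ("cluster0", 0, cluster_bits)
    rest = vector_bits - cluster_bits
    if second_cluster_available:
        # Second state: the remaining part goes to the second cluster.
        return [first, ("cluster1", 0, rest)]
    # First state: both parts run on the first cluster, one after the other.
    return [first, ("cluster0", 1, rest)]
```

For a 256-bit vector on 128-bit clusters, the plan is either two parallel halves on different clusters or two serial halves on the first cluster.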

    Apparatus and method for controlling branch prediction

    Publication number: US10649782B2

    Publication date: 2020-05-12

    Application number: US15939827

    Filing date: 2018-03-29

    Applicant: Arm Limited

    Abstract: An apparatus and method are provided for controlling branch prediction. The apparatus has processing circuitry for executing instructions, and branch prediction circuitry that comprises a plurality of branch prediction mechanisms used to predict target addresses for branch instructions to be executed by the processing circuitry. The branch instructions comprise a plurality of branch types, where one branch type is a return instruction. The branch prediction mechanisms include a return prediction mechanism used by default to predict a target address when a return instruction is detected by the branch prediction circuitry. However, the branch prediction circuitry is responsive to a trigger condition indicative of misprediction of the target address when using the return prediction mechanism to predict the target address for a given return instruction, to switch to using an alternative branch prediction mechanism for predicting the target address for the given return instruction. This has been found to improve performance in certain situations.
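
A rough sketch of switching away from the default return predictor after mispredictions. The abstract names neither mechanism, so several things here are assumptions: a return address stack as the default, a last-observed-target table as the alternative, and a per-instruction misprediction count with a `threshold` as the trigger condition. `ReturnPredictor` and its methods are hypothetical names.

```python
class ReturnPredictor:
    def __init__(self, threshold=2):
        self.ras = []          # assumed default mechanism: return address stack
        self.mispredicts = {}  # return pc -> mispredictions made by the default mechanism
        self.last_target = {}  # assumed alternative: last observed target per return pc
        self.threshold = threshold

    def call(self, return_addr):
        """A call pushes its return address for the default mechanism."""
        self.ras.append(return_addr)

    def predict(self, pc):
        """Predict the target of the return at pc; report which mechanism was used."""
        ras_guess = self.ras.pop() if self.ras else None
        if self.mispredicts.get(pc, 0) >= self.threshold and pc in self.last_target:
            return self.last_target[pc], "alternative"
        return ras_guess, "return_stack"

    def resolve(self, pc, predicted, actual, used):
        """Record the real target and count mispredictions made by the default mechanism."""
        if used == "return_stack" and predicted != actual:
            self.mispredicts[pc] = self.mispredicts.get(pc, 0) + 1
        self.last_target[pc] = actual
```

This models code in which a particular return repeatedly goes somewhere other than its matching call site, so the stack-based guess keeps missing and the per-instruction history does better.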
