-
Publication No.: US11972264B2
Publication date: 2024-04-30
Application No.: US17838713
Filing date: 2022-06-13
Applicant: Arm Limited
Inventor: Guillaume Bolbenes, Thibaut Elie Lanois, Houdhaifa Bouzguarrou, Luca Nassi
CPC classification number: G06F9/3806, G06F9/325
Abstract: Processing circuitry performs processing operations in response to micro-operations. Front end circuitry supplies the micro-operations to be processed by the processing circuitry. Prediction circuitry generates a prediction of a number of loop iterations for which one or more micro-operations per loop iteration are to be supplied by the front end circuitry, where an actual number of loop iterations to be processed by the processing circuitry is resolvable by the processing circuitry based on at least one operand corresponding to a first loop iteration to be processed by the processing circuitry. The front end circuitry varies, based on a level of confidence in the prediction of the number of loop iterations, a supply rate with which the one or more micro-operations for at least a subset of the loop iterations are supplied to the processing circuitry.
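Illustrative sketch (not part of the patent text): the abstract says only that the supply rate varies with the confidence in the predicted iteration count. The Python below assumes, purely for illustration, that lower confidence maps linearly to a lower rate; the function name, the rate bounds, and the linear mapping are all hypothetical.

```python
def supply_rate(confidence: float, max_rate: int = 4, min_rate: int = 1) -> int:
    """Return a micro-operations-per-cycle supply rate for predicted loop iterations.

    Assumption (not from the abstract): confidence is a value in [0, 1] and
    the rate scales linearly between min_rate and max_rate.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    # Truncate toward zero so the rate never exceeds max_rate.
    return min_rate + int(confidence * (max_rate - min_rate))
```

Under these assumptions, a fully confident prediction is supplied at the maximum rate, while an unconfident one is throttled so that fewer unneeded iterations are in flight when the actual count resolves.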
-
Publication No.: US10977044B2
Publication date: 2021-04-13
Application No.: US16561383
Filing date: 2019-09-05
Applicant: Arm Limited
Inventor: Remi Marius Teyssier, Luca Nassi, Albin Pierrick Tonnerre, François Donati
Abstract: An apparatus comprising processing circuitry is provided, the processing circuitry comprising execution circuitry, commit circuitry, issue circuitry comprising an issue queue and selection circuitry, and a branch predictor. The processing circuitry is configured to identify a speculation barrier instruction in the commit queue. While an entry in the commit queue identifies a speculation barrier instruction, when a branch instruction that follows the speculation barrier instruction in the program order is selected for issue, the processing circuitry performs a first execution of the instruction, inhibiting updating of branch prediction data items associated with the branch instruction and inhibiting the selection circuitry from invalidating the associated issue queue entry. When the speculation barrier instruction completes, the processing circuitry is configured to perform a second execution of the instruction, updating the branch prediction data items associated with the branch instruction and allowing the issue circuitry to invalidate the associated issue queue entry.
-
Publication No.: US12260218B2
Publication date: 2025-03-25
Application No.: US18343294
Filing date: 2023-06-28
Applicant: Arm Limited
Inventor: Quentin Éric Nouvel, Luca Nassi, Nicola Piano, Albin Pierrick Tonnerre, Geoffray Matthieu Lacourba
Abstract: There is provided an apparatus and method for data processing. The apparatus comprises post decode cracking circuitry responsive to receipt of decoded instructions from decode circuitry of a processing pipeline, to crack the decoded instructions into micro-operations to be processed by processing circuitry of the processing pipeline. The post decode cracking circuitry is responsive to receipt of a decoded instruction suitable for cracking into a plurality of micro-operations including at least one pair of micro-operations having a producer-consumer data dependency, to generate the plurality of micro-operations including a producer micro-operation and a consumer micro-operation, and to assign a transfer register to transfer data between the producer micro-operation and the consumer micro-operation.
-
Publication No.: US12182427B2
Publication date: 2024-12-31
Application No.: US17966071
Filing date: 2022-10-14
Applicant: Arm Limited
Inventor: Stefano Ghiggini, Natalya Bondarenko, Luca Nassi, Geoffray Matthieu Lacourba, Huzefa Moiz Sanjeliwala, Miles Robert Dooley, Abhishek Raja
IPC: G06F3/06
Abstract: An apparatus is provided for controlling the operating mode of control circuitry, such that the control circuitry may change between two operating modes. In an allocation mode, data that is loaded in response to an instruction is allocated into storage circuitry from an intermediate buffer, and the data is read from the storage circuitry. In a non-allocation mode, the data is not allocated to the storage circuitry, and is read directly from the intermediate buffer. The control of the operating mode may be performed by mode control circuitry, and the mode may be changed in dependence on the type of instruction that calls the data, and whether the data may be used again in the near future, or whether it is expected to be used only once.
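Illustrative sketch (not part of the patent text): the two operating modes can be modelled in a few lines of Python. The class name `LoadPath`, the dictionary-based buffers, and the boolean `expect_reuse` flag standing in for the mode control decision are all hypothetical.

```python
class LoadPath:
    """Toy model of allocation vs. non-allocation mode for loaded data."""

    def __init__(self):
        self.intermediate_buffer = {}  # address -> data awaiting consumption
        self.storage = {}              # address -> data allocated for reuse

    def fill(self, addr, data):
        """Loaded data first arrives in the intermediate buffer."""
        self.intermediate_buffer[addr] = data

    def read(self, addr, expect_reuse):
        """Read loaded data using the mode implied by expect_reuse.

        Assumption: the mode is reduced to a single boolean; the abstract
        says it depends on instruction type and expected reuse.
        """
        data = self.intermediate_buffer.pop(addr)
        if expect_reuse:
            # Allocation mode: allocate into storage, then read from storage.
            self.storage[addr] = data
            return self.storage[addr]
        # Non-allocation mode: read directly from the intermediate buffer.
        return data
```

In this model, data expected to be used once never occupies an entry in `storage`, which leaves that capacity for data likely to be reused.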
-
Publication No.: US11157277B2
Publication date: 2021-10-26
Application No.: US16561430
Filing date: 2019-09-05
Applicant: Arm Limited
Abstract: Data processing apparatus comprises a processing element configured to access an architectural register representing a given system register; mapping circuitry to map the architectural register representing the given system register to a physical register selected from a set of physical registers; a register bank having a set of two or more respective banked versions of the given system register, in which a respective one of the banked versions of the system register is associated with each of a plurality of current operating states of the processing element; in which, when the processing element changes operating state from a first operating state associated with a first one of the banked versions of the system register to a second operating state associated with a second, different, one of the banked versions of the system register, the processing element is configured to store the current contents of the architectural register representing the given system register to the first one of the banked versions of the system register and to copy the contents of the second one of the banked versions of the system register to the architectural register representing the given system register.
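Illustrative sketch (not part of the patent text): the save-and-restore step performed on a change of operating state can be modelled as below. The class name, the string state names, and the zero reset value are hypothetical; register renaming itself is not modelled.

```python
class BankedSystemRegister:
    """Toy model: one architectural register backed by per-state banked copies."""

    def __init__(self, states, initial_state):
        self.banks = {state: 0 for state in states}  # assumed reset value of 0
        self.state = initial_state
        self.architectural = self.banks[initial_state]

    def write(self, value):
        """Writes go to the architectural register, not directly to a bank."""
        self.architectural = value

    def change_state(self, new_state):
        """Store the current contents to the old bank, then load the new bank."""
        self.banks[self.state] = self.architectural
        self.architectural = self.banks[new_state]
        self.state = new_state
```

For example, writing a value in one state, switching away, and switching back returns the original value, because each state's copy is preserved in its own bank.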
-
Publication No.: US11847056B1
Publication date: 2023-12-19
Application No.: US17824199
Filing date: 2022-05-25
Applicant: Arm Limited
Inventor: Damien Matthieu Valentin Cathrine, Ugo Castorina, Luca Nassi
IPC: G06F12/08, G06F12/0862
CPC classification number: G06F12/0862, G06F2212/602
Abstract: An apparatus comprises prefetch circuitry, and a cache having a plurality of entries to store data for access by processing circuitry and blocks of metadata for reference by the prefetch circuitry. The prefetch circuitry can detect one or more access sequences in dependence on training inputs derived from demand accesses processed by the cache in response to memory access operations performed by the processing circuitry. On detecting a given access sequence, this causes an associated given block of metadata providing information indicative of the given access sequence to be stored in a selected entry of the cache. Eviction control circuitry, responsive to a victimisation event, performs an operation to select a victim entry in the cache, the victim entry being selected from one or more candidate victim entries. Each entry has an associated age indication value used to determine whether that entry is allowed to be a candidate victim entry, and the eviction control circuitry is arranged to perform a dynamic ageing operation to determine an ageing control value used to control updating of the associated age indication value for any entry storing a block of metadata. The dynamic ageing operation is arranged to determine the ageing control value in dependence on at least a training rate indication for the prefetch circuitry, where the training rate indication is indicative of a number of training inputs per memory access operation performed by the processing circuitry.
-
Publication No.: US11531547B2
Publication date: 2022-12-20
Application No.: US17326864
Filing date: 2021-05-21
Applicant: Arm Limited
Inventor: Damian Maiorano, Luca Nassi, Cédric Denis Robert Airaud, Christophe Laurent Carbonne, Jocelyn François Orion Jaubert, Pasquale Ranone
Abstract: Data processing circuitry comprises out-of-order instruction execution circuitry; register mapping circuitry to map zero or more architectural processor registers relating to execution of that program instruction to respective ones of a set of physical processor registers; commit circuitry to commit, in a program code order, the results of executed program instructions, the commit circuitry being configured to access a data store which stores register tag data to indicate which physical registers mapped by the register mapping circuitry relate to a given program instruction; fault detection circuitry to detect a memory access fault in respect of a vector memory access operation and to generate fault indication data indicative of an element earliest in the element order for which a memory access fault was detected; a fault indication register to store the fault indication data, in which the register mapping circuitry is configured to generate a register mapping for a program instruction for any architectural processor registers relating to execution of that program instruction other than the fault indication register; and control circuitry to encode the fault indication data, applicable to a program instruction not yet committed by the commit circuitry, to register tag data associated with that program instruction.
-
Publication No.: US11010159B2
Publication date: 2021-05-18
Application No.: US16118528
Filing date: 2018-08-31
Applicant: Arm Limited
Inventor: Xiaoyang Shen, Cedric Denis Robert Airaud, Luca Nassi, Damien Robin Martin
Abstract: Apparatus comprises counter and bit-shift circuitry to provide a succession of processing stages each comprising a count operation stage and a corresponding bit-shift stage, each processing stage operating with respect to a set of contiguous n-bit groups of bit positions, where n is 1 for a first processing stage and n doubles from one processing stage in the succession of processing stages to a next processing stage in the succession of processing stages; each count operation stage being configured to generate, for a first set of alternate instances of the n-bit groups of bit positions, count values indicating a respective number of bits of a predetermined bit value in a mask data word; and each bit-shift stage being configured to generate a bit-shifted data word by bit-shifting bits of a data word to be processed, for a second set of alternate instances of the n-bit groups of bit positions complementary to the first set, by respective numbers of bit positions dependent upon the count values generated by the respective count operation stage, in which the bit-shifted data word for one bit-shift stage in the succession of processing stages is used as the data word to be processed by the next bit-shift stage in the succession of processing stages.
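Illustrative sketch (not part of the patent text): a staged count-and-shift scheme of this shape can implement a bit "compress" operation, packing the data bits selected by the mask toward the least significant end. The Python below is one possible reading, under several assumptions: the compress direction, the rule that the upper group of each pair is shifted down by n minus the count of the lower group, and the function names are all hypothetical. A bit-by-bit reference is included for comparison.

```python
def popcount(x: int) -> int:
    return bin(x).count("1")

def compress_staged(data: int, mask: int, width: int = 8) -> int:
    """Pack the bits of data selected by mask toward bit 0, stage by stage."""
    assert width > 0 and width & (width - 1) == 0, "width must be a power of two"
    word = data & mask  # unselected bits are cleared before the first stage
    n = 1
    while n < width:
        group = (1 << n) - 1
        result = 0
        for base in range(0, width, 2 * n):
            low = (word >> base) & group         # lower group: already packed
            high = (word >> (base + n)) & group  # upper group: to be shifted
            # Count stage: set mask bits in the lower (alternate) group.
            count = popcount((mask >> base) & group)
            # Shift stage: the upper group lands right above the lower
            # group's packed bits, i.e. it moves down by (n - count).
            result |= (low | (high << count)) << base
        word = result
        n *= 2  # group size doubles from one stage to the next
    return word

def compress_reference(data: int, mask: int, width: int = 8) -> int:
    """Bit-by-bit reference implementation of the same operation."""
    out, k = 0, 0
    for i in range(width):
        if (mask >> i) & 1:
            out |= ((data >> i) & 1) << k
            k += 1
    return out
```

With `width` equal to 8 the loop runs three stages (n = 1, 2, 4), and the word produced by each stage is the input to the next, matching the chaining described in the abstract.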
-
Publication No.: US10915327B2
Publication date: 2021-02-09
Application No.: US16220050
Filing date: 2018-12-14
Applicant: Arm Limited
Inventor: Luca Nassi, Remi Marius Teyssier, François Donati, Damian Maiorano
Abstract: Aspects of the present disclosure relate to an apparatus comprising a plurality of clusters, each cluster having a plurality of execution units to execute instructions. The apparatus comprises dispatch circuitry to determine, for each instruction to be executed, a chosen cluster from amongst the plurality of clusters to which to dispatch that instruction for execution. This determination is performed by selecting between a default dispatch policy wherein said chosen cluster is a cluster to which an earlier instruction to generate at least one source operand of said instruction was dispatched for execution, and an alternative dispatch policy for selecting said chosen cluster. Said selecting is based on a selection parameter. The dispatch circuitry is further configured to dispatch said instruction to the chosen cluster for execution.
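Illustrative sketch (not part of the patent text): the choice between the default and alternative dispatch policies can be modelled as below. The abstract does not say what the alternative policy or the selection parameter is; here the alternative is assumed to be "least loaded cluster" and the selection parameter is reduced to a boolean, both hypothetical.

```python
def choose_cluster(producer_cluster, cluster_loads, use_alternative):
    """Pick the cluster index to which an instruction is dispatched.

    producer_cluster: index of the cluster that ran the instruction producing
        a source operand, or None if there is no such producer.
    cluster_loads: pending-instruction count per cluster (assumed metric).
    use_alternative: stand-in for the selection parameter (assumption).
    """
    if use_alternative or producer_cluster is None:
        # Alternative policy (assumed): pick the least loaded cluster,
        # breaking ties in favour of the lowest index.
        return min(range(len(cluster_loads)), key=cluster_loads.__getitem__)
    # Default policy: follow the producer of a source operand.
    return producer_cluster
```

Following the producer keeps operand forwarding local to a cluster, while the alternative lets dispatch spread work when one cluster becomes congested.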
-
Publication No.: US10558462B2
Publication date: 2020-02-11
Application No.: US15987002
Filing date: 2018-05-23
Applicant: Arm Limited
Abstract: An apparatus and method are provided for storing source operands for operations. The apparatus comprises execution circuitry for performing operations on data values, and a register file comprising a plurality of registers to store the data values operated on by the execution circuitry. Issue circuitry is also provided that has a pending operations storage identifying pending operations awaiting performance by the execution circuitry and selection circuitry to select pending operations from the pending operation storage to issue to the execution circuitry. The pending operations storage comprises an entry for each pending operation, each entry storing attribute information identifying the operation to be performed, where that attribute information includes a source identifier field for each source operand of the pending operation. The source identifier field has a field size sufficient to enable a register identifier to be stored within the source identifier field to identify the register used to store the data value forming the source operand. However, the field size is insufficient to store the data value as stored in the register. Value analysis circuitry is responsive to the execution circuitry generating a data value that will be used as a source operand for a pending operation, to determine whether a reduced size representation of that generated data value can be accommodated within the associated source identifier field of the entry for that pending operation. If so, the reduced size representation is generated and a control signal is issued to the issue circuitry to cause the register identifier for that source operand to be replaced by the reduced size representation of the data value. By such an approach, it is possible to increase the performance of the apparatus and/or to simplify the construction of the register file.
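Illustrative sketch (not part of the patent text): the value analysis step can be modelled as a check on whether a generated value fits in the source identifier field. The 8-bit field size, the tuple encoding with a "value"/"reg" tag, and the choice of "small unsigned integer" as the reduced-size representation are all assumptions made for illustration.

```python
def update_source_field(register_id: int, value: int, field_bits: int = 8):
    """Return the new contents of a pending operation's source identifier field.

    Assumption: a value has a reduced-size representation when it is a
    non-negative integer that fits in field_bits bits. The returned tag
    models a flag telling the issue logic how to interpret the field.
    """
    if 0 <= value < (1 << field_bits):
        # The value itself replaces the register identifier, so the
        # operation no longer needs to read this operand from the register file.
        return ("value", value)
    # Otherwise the field keeps identifying the register holding the value.
    return ("reg", register_id)
```

For instance, a result of 5 can be captured directly in the field, whereas a result that needs more than 8 bits still has to be read from its register.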