Abstract:
A data processing apparatus and a method of controlling performance of speculative vector operations are provided. The apparatus comprises processing circuitry for performing a sequence of speculative vector operations on vector operands, each vector operand comprising a plurality of vector elements, and speculation control circuitry for maintaining a speculation width indication indicating the number of vector elements of each vector operand to be subjected to the speculative vector operations. The speculation width indication is set to an initial value prior to performance of the sequence of speculative vector operations. The processing circuitry generates progress indications during performance of the sequence of speculative vector operations, and the speculation control circuitry detects, with reference to the progress indications and speculation reduction criteria, presence of a speculation reduction condition. The speculation reduction condition is a condition indicating that a reduction in the speculation width indication is expected to improve at least one performance characteristic of the data processing apparatus relative to continued operation without the reduction in the speculation width indication. The speculation control circuitry is responsive to detection of the speculation reduction condition to reduce the speculation width indication. This can significantly increase performance (for example in terms of throughput and/or energy consumption) when performing speculative vector operations.
Abstract:
A vector data access unit includes data access ordering circuitry, for issuing data access requests indicated by elements of earlier and a later vector instructions, one being a write instruction. An element indicating the next data access for each of the instructions is determined. The next data accesses for the earlier and the later instructions may be reordered. The next data access of the earlier instruction is selected if the position of the earlier instruction's next data element is less than or equal to the position of the later instruction's next data element minus a predetermined value. The next data access of the later instruction may be selected if the position of the earlier instruction's next data element is higher than the position of the later instruction's next data element minus a predetermined value. Thus data accesses from earlier and later instructions are partially interleaved.
Abstract:
A data processing apparatus comprises branch prediction circuitry adapted to store at least one branch prediction state entry in relation to a stream of instructions, input circuitry to receive at least one input to generate a new branch prediction state entry, wherein the at least one input comprises a plurality of bits; and coding circuitry adapted to perform an encoding operation to encode at least some of the plurality of bits based on a value associated with a current execution environment in which the stream of instructions is being executed. This guards against potential attacks which exploit the ability for branch prediction entries trained by one execution environment to be used by another execution environment as a basis for branch predictions.
Abstract:
A processing apparatus includes floating point arithmetic circuitry coupled to monitoring circuitry. The monitoring circuitry stores exponent limit data indicating at least one of a maximum exponent value and a minimum exponent value processed when performing the floating point arithmetic operations. The monitoring circuitry may be selectively enabled in dependence upon a virtual machine identifier, an application specific identifier or a program counter value range. Exponent limit data may be gathered in respect of different portions of the floating point arithmetic circuitry and/or may be aggregated to form global exponent limit data for the system.
Abstract:
An apparatus comprises: processing circuitry 18 to process instructions from a plurality of software workloads; a branch prediction cache 40-42 to cache branch prediction state data selected from a plurality of sets of branch prediction state data 60 stored in a memory system 30, 32, 34, each set of branch prediction state data corresponding to one of said plurality of software workloads; and branch prediction circuitry 4 to predict an outcome of a branch instruction of a given software workload based on branch prediction state data cached in the branch prediction cache from the set of branch prediction state data corresponding to said given software workload. This is useful for mitigating against speculation side-channel attacks which exploit branch mispredictions caused by malicious training of a branch predictor.
Abstract:
A data processing apparatus and method for performing speculative vector access operations are provided. The data processing apparatus has a reconfigurable buffer accessible to vector data access circuitry and comprising a storage array for storing up to M vectors of N vectors elements. The vector data access circuitry performs speculative data write operations in order to cause vector elements from selected vector operands in a vector register bank to be stored into the reconfigurable buffer. On occurrence of a commit condition, the vector elements currently stored in the reconfigurable buffer are then written to a data store. Speculation control circuitry maintains a speculation width indication indicating the number of vector elements of each selected vector operand stored in the reconfigurable buffer. The speculation width indication is initialized to an initial value, but on detection of an overflow condition within the reconfigurable buffer the speculation width indication is modified to reduce the number of vector elements of each selected vector operand stored in the reconfigurable buffer. The reconfigurable buffer then responds to a change in the speculation width indication by reconfiguring the storage array to increase the number of vectors M and reduce the number of vector elements N per vector. This provides an efficient mechanism for supporting performance of speculative data write operations.
Abstract:
A vector data access unit includes data access ordering circuitry, for issuing data access requests indicated by elements of earlier and a later vector instructions, one being a write instruction. An element indicating the next data access for each of the instructions is determined. The next data accesses for the earlier and the later instructions may be reordered. The next data access of the earlier instruction is selected if the position of the earlier instruction's next data element is less than or equal to the position of the later instruction's next data element minus a predetermined value. The next data access of the later instruction may be selected if the position of the earlier instruction's next data element is higher than the position of the later instruction's next data element minus a predetermined value. Thus data accesses from earlier and later instructions are partially interleaved.
Abstract:
A data processing apparatus is provided which controls the use of data in respect of a further operation. The data processing apparatus identifies whether data is trusted or untrusted by identifying whether or not the data was determined by a speculatively executed resolve-pending operation. A permission control unit is also provided to control how the data can be used in respect of a further operation according to a security policy while the speculatively executed operation is still resolve-pending.
Abstract:
A data processing apparatus and method are provided for performing segmented operations. The data processing apparatus comprises a vector register store for storing vector operands, and vector processing circuitry providing N lanes of parallel processing, and arranged to perform a segmented operation on up to N data elements provided by a specified vector operand, each data element being allocated to one of the N lanes. The up to N data elements forms a plurality of segments, and performance of the segmented operation comprises performing a separate operation on the data elements of each segment, the separate operation involving interaction between the lanes containing the data elements of the associated segment. Predicate generation circuitry is responsive to a compute descriptor instruction specifying an input vector operand comprising a plurality of segment descriptors, to generate per lane predicate information used by the vector processing circuitry when performing the segmented operation to maintain a boundary between each of the plurality of segments. As a result, interaction between lanes containing data elements from different segments is prevented. This allows very effective utilisation of the lanes of parallel processing within the vector processing circuitry to be achieved.