摘要:
A processor 6 is provided with an instruction decoder 18 which is responsive to memory access instructions to determine whether the base register value being used matches a null value and if such a match occurs then branches to a null value exception handler.
摘要:
A data processing apparatus and method are provided for performing a cache lookup in an energy efficient manner. The data processing apparatus has at least one processing unit for performing operations and a cache having a plurality of cache lines for storing data values for access by that at least one processing unit when performing those operations. The at least one processing unit provides a plurality of sources from which access requests are issued to the cache, and each access request, in addition to specifying an address, further includes a source identifier indicating the source of the access request. A storage element is provided for storing for each source an indication as to whether the last access request from that source resulted in a hit in the cache, and cache line identification logic determines, for each access request, whether that access request is seeking to access the same cache line as the last access request issued by that source. The cache control logic is operable when handling an access request to constrain the lookup procedure to only a subset of the storage blocks within the cache if it is determined that the access request is to the same cache line as the last access request issued by the relevant source, and the storage element indicates that the last access request from that source resulted in a hit in the cache. This yields significant energy savings when accessing the cache.
摘要:
A multi-threaded in-order superscalar processor 2 is described having a fetch stage 8 within which thread interleaving circuitry 36 interleaves instructions taken from different program threads to form an interleaved stream of instructions which is then decoded and subject to issue. Hint generation circuitry 62 within the fetch stage 8 adds hint data to the threads indicating that parallel issue of an associated instruction is permitted with one of more other instructions.
摘要:
A multi-threaded in-order superscalar processor 2 is described having a fetch stage 8 within which thread interleaving circuitry 36 interleaves instructions taken from different program threads to form an interleaved stream of instructions which is then decoded and subject to issue. Hint generation circuitry 62 within the fetch stage 8 adds hint data to the threads indicating that parallel issue of an associated instruction is permitted with one of more other instructions.
摘要:
A data processing system includes an instruction fetching circuit 2, an instruction queue 4 and further processing circuits 6. A branch target cache, which maybe a branch target address cache 8, a branch target instruction cache 10 or both, is used to store branch target addresses or blocks of instructions starting at the branch target respectively. A control circuit 12 is responsive to the contents of the instruction queue 4 when a branch instruction is encountered to determine whether or not storage resources within the branch target cache 8, 10 should be allocated to that branch instruction. Storage resources within the branch target cache 8, 10 will be allocated when the number of program instructions within the instruction queue is below a threshold number and/or the estimated execution time of the program instructions is below a threshold time.
摘要:
A branch prediction mechanism 16, 18 within a multithreaded processor having hardware scheduling logic 6, 8, 10, 12 uses a shared global history table 18 which is indexed by respective branch history registers 20, 22 for each program thread. Different mappings are used between preceding branch behaviour and the prediction value stored within respective branch history registers 20, 22. These different mappings may be provided by inverters placed into the shift in paths for the branch history registers 20, 22 or by adders 40, 42 or in some other way. The different mappings help to equalise the probability of use of the particular storage locations within the global history table 18 such that the plurality of program threads are not competing excessively for the same storage locations corresponding to the more commonly occurring patterns of preceding branch behaviour.
摘要:
A data processing system 2 is provided with breakpoint circuitry 28 having breakpoint registers 30 which can specify a variety of different types of breakpoint conditions. These breakpoint conditions include register access breakpoints which are triggered when an access is made to either a general purpose register 8 or a configuration register 22, 24. The breakpoints can also include input/output port access breakpoints which are triggered when an access is made to a predetermined one of a plurality of input/output ports 26 by an appropriate program instruction or in another way.
摘要:
A multithreading processor 4 interleaves program instructions from different program threads to perform fine grained multithreading. Thread performance monitoring circuitry 30 monitors performance parameters of individual program threads to generate performance values. Issue control circuitry 28 reads these performance values to determine which program thread is next selected to be active when a thread switch event occurs. The performance parameters measured may include the proportion of cycles in which a program thread is able to provide a program instruction for execution by the execution circuitry 12 within the processor 4.
摘要:
A multithreading processor 4 interleaves program instructions from different program threads to perform fine grained multithreading. Thread performance monitoring circuitry 30 monitors performance parameters of individual program threads to generate performance values. Issue control circuitry 28 reads these performance values to determine which program thread is next selected to be active when a thread switch event occurs. The performance parameters measured may include the proportion of cycles in which a program thread is able to provide a program instruction for execution by the execution circuitry 12 within the processor 4.