Abstract:
A digital signal processing system comprising a central processing unit core 2, a memory 8 and a coprocessor 4 operates using coprocessor memory access instructions (e.g. LDC, STC). The addressing mode information within these coprocessor memory access instructions (P, U, W, Offset) not only controls the addressing mode used by the central processing unit core 2 but is also used by the coprocessor 4 to determine the number of data words in the transfer being specified such that the coprocessor 4 can terminate the transfer at the appropriate time. Knowledge in advance of the number of words in a transfer is also advantageous in some bus systems, such as those that can be used with synchronous DRAM. The Offset field within the instruction may be used to specify changes to be made in the value provided by the central processing unit core 2 upon execution of a particular instruction and also to specify the number of words in the transfer. This arrangement is well suited to working through a regular array of data such as in digital signal processing operations. If the Offset field is not being used, then the number of words to be transferred may default to 1.
Abstract:
A data processing system is provided with an instruction (ADD8TO16) that unpacks non-adjacent portions of a data word using sign or zero extension and combines this with a single-instruction-multiple-data type arithmetic operation, such as an add, performed in response to the same instruction. The instruction is well suited to use within systems having a data path (2) including a shifting circuit (6) upstream of an arithmetic circuit (8).
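As an illustration only, the combined unpack-and-add behaviour can be modelled in Python. The choice of bytes 0 and 2 as the non-adjacent portions, the 32-bit word holding two 16-bit lanes, and the names `sign_extend8` and `add8to16` are assumptions made for this sketch; the abstract does not specify them.

```python
def sign_extend8(x):
    """Interpret the low 8 bits of x as a signed two's-complement value."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def add8to16(acc, src, signed=True):
    """Model of an ADD8TO16-style operation (hypothetical semantics).

    acc: 32-bit word treated as two independent 16-bit lanes.
    src: 32-bit word; bytes 0 and 2 (assumed) are unpacked to 16 bits
         with sign or zero extension and added lane by lane.
    """
    b0 = src & 0xFF
    b2 = (src >> 16) & 0xFF
    if signed:
        b0, b2 = sign_extend8(b0), sign_extend8(b2)
    # Each lane wraps at 16 bits; no carry crosses between lanes.
    lo = ((acc & 0xFFFF) + b0) & 0xFFFF
    hi = (((acc >> 16) & 0xFFFF) + b2) & 0xFFFF
    return (hi << 16) | lo
```

With `signed=True` the bytes 0xFE and 0xFF are treated as -2 and -1; with `signed=False` they are treated as 254 and 255.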
Abstract:
The present invention provides a data processing apparatus and method for performing aligned access operations. The data processing apparatus comprises a register data store having a plurality of registers operable to store data elements, and a processor operable to perform a data processing operation on one or more data elements accessed in at least one of the registers. Further, access logic is provided which is operable in response to an access instruction to perform an access operation in order to move a number of data elements between specified registers and a portion of a memory, the portion having a start address specified by the access instruction. Further, the access instruction has an alignment specifier associated therewith which is settable either to a first value or to one of a plurality of second values. The first value indicates that the start address is to be treated as byte aligned, and each of the second values indicates a different predetermined alignment that the start address is to be treated as conforming to. The access logic is then operable to adapt the access operation in dependence on the value of the alignment specifier. This provides significantly improved flexibility in the performance of access operations.
Abstract:
The present invention provides a system, method and computer program for performing a modular multiplication a*b*2^(-N) modulo n, where a, b and n are N-bit integers. The system comprises a multiplier for multiplying a Y-bit number by a Z-bit number, and partitioning logic for partitioning the integer a into a plurality of first sections, each first section being of a size which is a multiple of Y, and for partitioning the integer b into a plurality of second sections, each second section being of a size which is a multiple of Z. A multiplication unit is then provided to apply operations to control the multiplier to perform a sequence of multiplications to multiply one of said first sections by one of said second sections in order to generate a number of output operands for use in subsequent operations performed by the multiplication unit. A controller is used to sequentially input one of said first sections and one of said second sections into the multiplication unit along with predetermined ones of said output operands from preceding operations performed by the multiplication unit, until each first section has been multiplied by each second section. By this approach, a multiplication unit can be provided which is of a fixed size, irrespective of the size of the input integers a, b and n. This alleviates the requirement for increasingly larger fast storage, the size of the fast storage being dependent not on the ultimate size of the N-bit integers, but rather on the predetermined size of the sections into which those integers are partitioned.
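The quantity a*b*2^(-N) mod n is the result of a Montgomery multiplication. The Python sketch below uses a standard word-serial formulation in which a and b are split into w-bit sections and every section of a is multiplied by every section of b. It is not claimed to be the sequence of operations used by the system described above; taking Y = Z = w, requiring n to be odd, and the name `montgomery_multiply` are assumptions made for illustration. The three-argument `pow` with a negative exponent needs Python 3.8 or later.

```python
def montgomery_multiply(a, b, n, N, w=32):
    """Return a*b*2^(-N) mod n for odd n < 2^N and 0 <= a, b < n.

    N must be a multiple of the assumed section width w.
    """
    assert n % 2 == 1 and N % w == 0 and 0 <= a < n and 0 <= b < n
    mask = (1 << w) - 1
    sections = N // w
    n_prime = (-pow(n, -1, 1 << w)) & mask  # -n^(-1) mod 2^w
    t = 0
    for i in range(sections):
        a_i = (a >> (i * w)) & mask
        # Multiply one section of a by each section of b in turn,
        # so only w-bit by w-bit products are ever formed.
        for j in range(sections):
            b_j = (b >> (j * w)) & mask
            t += (a_i * b_j) << (j * w)
        # Choose m so that t + m*n is divisible by 2^w, then shift.
        m = ((t & mask) * n_prime) & mask
        t = (t + m * n) >> w
    # The loop leaves t < 2n, so one conditional subtraction suffices.
    return t - n if t >= n else t
```

The result can be checked against a direct computation such as `a * b * pow(2, -N, n) % n`, which is practical only for small operands.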
Abstract:
A data processing apparatus is provided comprising processing circuitry and an instruction decoder responsive to program instructions to control processing circuitry to perform the data processing. The instruction decoder is responsive to an address calculating instruction to perform an address calculating operation for calculating a partial address result from a non-fixed reference address and a partial offset value such that a full address specifying a memory location of an information entity is calculable from said partial address result using at least one supplementary program instruction. The partial offset value has a bit-width greater than or equal to said instruction size and is encoded within at least one partial offset field of said address calculating instruction. A corresponding data processing method, virtual machine and computer program product are also provided.
Abstract:
A data processing apparatus 2 comprises a processing circuit 4 and instruction decoder 6. A bitfield manipulation instruction controls the processing apparatus 2 to generate at least one result data element from corresponding first and second source data elements src1, src2. Each result data element includes a portion corresponding to a bitfield bf of the corresponding first source data element src1. Bits of the result data element that are more significant than the inserted bitfield bf have a prefix value p that is selected, based on a control value specified by the instruction, as one of a first prefix value having a zero value, a second prefix value having the value of a portion of the corresponding second source data element src2, and a third prefix value corresponding to a sign extension of the bitfield bf of the first source data element src1.
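The three prefix options can be sketched in Python as follows. The abstract does not say how the bitfield position, width or insertion point are encoded, nor what fills the bits below the inserted bitfield, so the parameters `lsb`, `width` and `pos`, the string-valued `prefix` selector, and the choice to copy the low bits from src2 are all assumptions of this sketch.

```python
def bitfield_insert(src1, src2, lsb, width, pos, prefix, size=32):
    """Hypothetical model of the bitfield manipulation for one data element.

    The bitfield bf is src1[lsb+width-1 : lsb]; it is placed at bit `pos`
    of the result. Bits above it take the selected prefix value; bits
    below it are copied from src2 (an assumption, not from the abstract).
    """
    assert pos + width <= size
    all_ones = (1 << size) - 1
    bf = (src1 >> lsb) & ((1 << width) - 1)
    high_mask = all_ones & ~((1 << (pos + width)) - 1)  # bits above bf
    low_mask = (1 << pos) - 1                            # bits below bf
    if prefix == "zero":
        p = 0
    elif prefix == "src2":
        p = src2 & high_mask
    elif prefix == "sign":
        # Replicate the top bit of the bitfield into every higher bit.
        p = high_mask if (bf >> (width - 1)) & 1 else 0
    else:
        raise ValueError("prefix must be 'zero', 'src2' or 'sign'")
    return p | (bf << pos) | (src2 & low_mask)
```

For a 4-bit bitfield 0xF taken from bits 7:4 of src1 and inserted at bit 8, the three selectors differ only in bits 31:12 of the result.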
Abstract:
A method of controlling data processing logic to generate a sign-extended or zero-extended bit field extracted data value, in which: a data value is rotated by a number of bits in order to generate a rotated data value; a number of least significant bits of the rotated data value are masked, with other bits of said rotated data value not being masked, in order to generate a masked rotated data value; a selected bit of said rotated data value is masked, with other bits of said rotated data value not being masked, in order to generate a bit preset rotated data value; and either said sign-extended bit field extracted data value is generated by subtracting said masked rotated data value from said bit preset rotated data value, or said zero-extended bit field extracted data value is generated by performing a logical exclusive-OR operation on said masked rotated data value and said bit preset rotated data value.
Abstract:
A floating point unit is provided with a register bank comprising 32 registers that may be used as either vector registers or scalar registers. A data processing instruction includes at least one register specifying field pointing to a register containing a data value to be used in that operation. An increase in the instruction bit space available to encode more opcodes or to allow for more registers is provided by encoding whether a register is to be treated as a vector or a scalar within the register field itself. Further, the register field for one register of the instruction may encode whether another register is a vector or a scalar. The registers can be initially accessed using the values within the register fields of the instruction independently of the opcode, allowing for easier decode.
Abstract:
A floating point unit is described that performs addition operations. An adder 16 within the floating point unit receives a first input and a second input to generate a sum. This sum is subject to subsequent normalization by a normalizer 60 and rounding by an incrementer 64. If an operation is performed that is immediately followed by an addition operation using the result of the preceding operation, then the normalized but unrounded sum is fed back to the adder 16 together with an indication of its rounding requirement. The required rounding can then be performed by the adder 16 in parallel with the execution of the following addition, by using the carry-in bit of the adder 16 to apply any increment required for rounding of the preceding result.
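The carry-in idea can be illustrated with an integer model of the significand adder. This Python sketch assumes the operands are already aligned to a common exponent and that the rounding increment does not overflow the significand; the 24-bit width and the function names are hypothetical and chosen only for the example.

```python
WIDTH = 24  # assumed significand width, for illustration only


def adder(a, b, carry_in=0, width=WIDTH):
    """Model a width-bit adder with a carry-in; return (sum, carry_out)."""
    total = a + b + carry_in
    return total & ((1 << width) - 1), total >> width


def round_then_add(unrounded, needs_increment, operand):
    """Conventional path: finish rounding first, then do the next add."""
    rounded, _ = adder(unrounded, needs_increment)
    return adder(rounded, operand)


def add_with_forwarded_rounding(unrounded, needs_increment, operand):
    """Forwarded path: the pending increment rides on the carry-in bit."""
    return adder(unrounded, operand, carry_in=needs_increment)
```

Under the stated assumptions both paths give the same sum, but the forwarded path needs only one pass through the adder.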
Abstract:
A data processing system having a processor core 4, a memory management unit 6 and a cache memory 8 uses the memory management unit 6 to produce a confirm signal C that indicates that a memory access request will be processed no further, i.e. the outcome is fully determined. The next memory access request is initiated prior to this confirm signal C being available and accordingly, if the confirm signal C indicates a result different to that predicted, then a stall of the system is required until the non-confirmed memory access request can be dealt with. A result prediction unit 14 is responsive to one or more variable signals characterising the memory access request and serves to produce a result prediction signal RP indicating in which of a plurality of ways said memory access request will complete, which is available before the confirm signal C, and upon which an appropriate predicted next memory access request may be prepared for use if the confirm signal C indicates that the memory access request will be processed no further. In this way, the performance impact of late indication of partial processing can be reduced, particularly in the case of common and relatively easy to predict circumstances such as cache storage line wrap.