Abstract:
A method of predicting a backward conditional branch instruction used in a vector partitioning loop includes detecting the first conditional branch instruction that occurs after consumption of a dependency index vector by a predicate-generating instruction. The dependency index vector includes information indicative of a number of iterations of the vector partitioning loop, and the conditional branch instruction may branch backwards when taken. The conditional branch instruction may then be predicted to be taken a number of times that is determined by the dependency index vector.
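A minimal Python sketch of this prediction scheme follows. Since the abstract does not define the DIV encoding, the sketch assumes the number of loop iterations can be approximated as one per distinct nonzero DIV entry plus one; the class and method names are illustrative only.

```python
class VectorPartitioningBranchPredictor:
    """Illustrative sketch of predicting the backward branch of a
    vector partitioning loop from a dependency index vector (DIV).

    Assumption: the iteration count is approximated as one per distinct
    nonzero DIV entry, plus one for the final partition; the real DIV
    encoding is not specified in the abstract.
    """

    def __init__(self):
        self.remaining_taken = 0  # predicted remaining "taken" outcomes

    def observe_predicate_generation(self, div):
        # A predicate-generating instruction consumed the DIV: seed the
        # predictor with the iteration count the DIV implies.
        partitions = len({i for i in div if i != 0}) + 1
        # The backward branch is taken once per extra iteration.
        self.remaining_taken = partitions - 1

    def predict_backward_branch(self):
        # First backward conditional branch seen after DIV consumption:
        # predict "taken" while iterations remain, then "not taken".
        if self.remaining_taken > 0:
            self.remaining_taken -= 1
            return True
        return False


pred = VectorPartitioningBranchPredictor()
pred.observe_predicate_generation(div=[0, 0, 3, 0, 3, 6, 0, 0])
print([pred.predict_backward_branch() for _ in range(4)])  # [True, True, False, False]
```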
Abstract:
In an embodiment, a processor may include a completion time prediction circuit. The completion time prediction circuit may be configured to track one or more aspects of previous instances of a vector memory operation, and may be configured to predict a completion time for a current instance of the vector memory operation. The prediction may be used by an issue circuit to schedule operations dependent on the vector memory operation, if any.
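The tracking and prediction could be modeled in software as a small history table. The sketch below assumes the operation is identified by its program counter and that the prediction is an average of recent latencies, neither of which is specified in the abstract.

```python
from collections import defaultdict, deque


class CompletionTimePredictor:
    """Illustrative software model of a completion-time prediction table.

    Assumption: predictions are keyed by the program counter of the vector
    memory operation and computed as the average of the last few observed
    latencies; the abstract does not specify which aspects are tracked or
    how the prediction is formed.
    """

    def __init__(self, history=4):
        self.history = defaultdict(lambda: deque(maxlen=history))

    def record_completion(self, pc, latency_cycles):
        self.history[pc].append(latency_cycles)

    def predict(self, pc, default=20):
        h = self.history[pc]
        if not h:
            return default  # no history yet: fall back to a default latency
        return sum(h) // len(h)


ctp = CompletionTimePredictor()
for lat in (18, 22, 20):
    ctp.record_completion(pc=0x4000, latency_cycles=lat)
print(ctp.predict(0x4000))  # 20: the issue logic could schedule dependents accordingly
```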
Abstract:
Systems, methods, and devices for dynamically mapping and remapping memory when a portion of memory is activated or deactivated are provided. In accordance with an embodiment, an electronic device may include several memory banks, one or more processors, and a memory controller. The memory banks may store data in hardware memory locations and may be independently deactivated. The processors may request the data using physical memory addresses, and the memory controller may translate the physical addresses to hardware memory locations. The memory controller may use a first memory mapping function when a first number of memory banks is active and a second memory mapping function when a second number of memory banks is active. When one of the memory banks is to be deactivated, the memory controller may copy data only from the memory bank that is to be deactivated into the remaining active memory banks.
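One way to picture such a pair of mapping functions, as a hedged Python sketch: addresses are interleaved across all banks while every bank is active, and after a bank is deactivated only the addresses that used to hit that bank are redirected (here into an arbitrary spill row range), so only that bank's data needs to be copied. The interleaving and spill scheme are assumptions for illustration.

```python
class BankedMemory:
    """Illustrative model of remapping physical addresses to hardware
    memory locations when a bank is deactivated.

    Assumption: the first mapping function interleaves addresses across
    all banks; the second keeps data in still-active banks where it was
    and redirects only addresses that hit the deactivated bank into a
    spill region of the remaining banks. The actual mapping functions
    are not given in the abstract.
    """

    def __init__(self, num_banks=4):
        self.num_banks = num_banks
        self.banks = [dict() for _ in range(num_banks)]
        self.closed = None  # bank that has been deactivated, if any

    def _map(self, addr):
        bank, row = addr % self.num_banks, addr // self.num_banks
        if bank != self.closed:
            return bank, row                      # first mapping function
        # Second mapping function: spread the closed bank's addresses
        # over the remaining banks, in a separate "spill" row range.
        survivors = [b for b in range(self.num_banks) if b != self.closed]
        return survivors[row % len(survivors)], 10_000 + row // len(survivors)

    def write(self, addr, value):
        b, r = self._map(addr)
        self.banks[b][r] = value

    def read(self, addr):
        b, r = self._map(addr)
        return self.banks[b][r]

    def deactivate(self, bank, touched_addrs):
        # Only the deactivated bank's data has to move.
        victims = [(a, self.read(a)) for a in touched_addrs if a % self.num_banks == bank]
        self.closed = bank                        # switch mapping functions
        self.banks[bank].clear()
        for a, v in victims:
            self.write(a, v)                      # lands in a remaining active bank


mem = BankedMemory()
for a in range(16):
    mem.write(a, a * 10)
mem.deactivate(bank=1, touched_addrs=range(16))
print([mem.read(a) for a in range(16)])  # contents survive the remapping
```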
Abstract:
Systems, apparatuses and methods for utilizing enhanced Macroscalar vector operations which take an enhanced predicate operand that designates the element width and which elements are to be processed. The element width and the number of elements per vector are determined at run-time rather than being defined in the architectural definition of the instruction. This enables additional parallelism when processing smaller-sized data. The instruction performs the requested operation on the elements specified by the enhanced predicate, assuming an element width also specified by the enhanced predicate, and returns the result as a vector of elements of the same element width.
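A rough Python model of one such operation (an add, chosen arbitrarily) is shown below. The enhanced predicate is represented as a dict with a byte width and per-element enable flags, and inactive elements pass the first source through; these are assumptions about an encoding the abstract does not give.

```python
def enhanced_vector_add(a_bytes, b_bytes, pred):
    """Illustrative sketch of an enhanced-predicate vector operation.

    Assumption: the enhanced predicate is modeled as a dict carrying an
    element width in bytes and a per-element enable list. Inactive
    elements pass the first source through unchanged (one common
    predication policy), and results wrap at the element width.
    """
    width = pred["width"]                 # element width chosen at run time
    mask = (1 << (8 * width)) - 1
    out = bytearray(a_bytes)
    for i, active in enumerate(pred["enable"]):
        if not active:
            continue
        lo, hi = i * width, (i + 1) * width
        x = int.from_bytes(a_bytes[lo:hi], "little")
        y = int.from_bytes(b_bytes[lo:hi], "little")
        out[lo:hi] = ((x + y) & mask).to_bytes(width, "little")
    return bytes(out)


# 8 bytes interpreted as four 16-bit elements; only elements 0 and 2 are active.
a = (100).to_bytes(2, "little") * 4
b = (7).to_bytes(2, "little") * 4
pred = {"width": 2, "enable": [1, 0, 1, 0]}
print(list(enhanced_vector_add(a, b, pred)))  # [107, 0, 100, 0, 107, 0, 100, 0]
```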
Abstract:
Systems, apparatuses and methods for utilizing enhanced Macroscalar predicate operations which take enhanced predicate operands that designate the element width and which elements are to be processed. The element width and the number of elements per vector are determined at run-time rather than being defined in the architectural definition of the instruction. This enables additional parallelism when processing smaller-sized data. The instruction performs the requested operation on the elements specified by the enhanced control predicate, assuming an element width also specified by the enhanced control predicate, and returns the result as an enhanced predicate of the same element width.
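The sketch below models one such predicate operation (a per-element AND under a control predicate). The dict-based predicate representation and the policy of clearing result elements where the control predicate is inactive are assumptions made for illustration.

```python
def enhanced_predicate_and(p1, p2, ctrl):
    """Illustrative sketch of an enhanced-predicate predicate operation.

    Assumption: enhanced predicates are modeled as a dict with an element
    width and one flag per element; the operation here is a per-element
    logical AND, performed only where the control predicate is active,
    with inactive result elements cleared. The actual set of predicate
    operations and their inactive-element policy are not given in the
    abstract.
    """
    assert p1["width"] == p2["width"] == ctrl["width"]
    result = []
    for f1, f2, c in zip(p1["enable"], p2["enable"], ctrl["enable"]):
        result.append(int(f1 and f2) if c else 0)
    # The result carries the same element width as its inputs.
    return {"width": ctrl["width"], "enable": result}


p1 = {"width": 4, "enable": [1, 1, 0, 1]}
p2 = {"width": 4, "enable": [1, 0, 1, 1]}
ctrl = {"width": 4, "enable": [1, 1, 1, 0]}
print(enhanced_predicate_and(p1, p2, ctrl))  # {'width': 4, 'enable': [1, 0, 0, 0]}
```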
Abstract:
Embodiments of a system and a method in which a processor may execute instructions that cause the processor to receive an input vector and a control vector are disclosed. The executed instructions may also cause the processor to perform a negation operation dependent upon the input vector and the control vector.
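A minimal sketch of such a predicated negation, assuming elements are negated where the control vector is nonzero and copied through otherwise (the abstract only states that the negation depends on both vectors):

```python
def predicated_negate(input_vec, control_vec):
    """Illustrative sketch of a negation controlled by a second vector.

    Assumption: elements are negated where the control vector is nonzero
    and copied through unchanged where it is zero.
    """
    return [-x if c else x for x, c in zip(input_vec, control_vec)]


print(predicated_negate([3, -5, 7, 9], [1, 0, 1, 0]))  # [-3, -5, -7, 9]
```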
Abstract:
In the described embodiments, a processor generates a result vector when executing a RunningShiftForDivide1P or RunningShiftForDivide2P instruction. In these embodiments, upon executing a RunningShiftForDivide1P/2P instruction, the processor receives a first input vector and a second input vector. The processor then records a base value from an element at a key element position in the first input vector. Next, when generating the result vector, for each active element in the result vector to the right of the key element position, the processor generates a shifted base value using shift values from the second input vector. The processor then corrects the shifted base value when a predetermined condition is met. Next, the processor sets that element of the result vector to the shifted base value.
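The steps above can be sketched in Python as a running shift. The correction modeled here (adjusting a negative base before the shift so the result rounds toward zero, as integer division would) is only one plausible reading of the "predetermined condition", and the handling of inactive elements is likewise an assumption.

```python
def running_shift_for_divide(first, second, pred, key):
    """Illustrative sketch of RunningShiftForDivide-style semantics.

    Assumptions: `key` is the key element position, `pred` marks active
    elements, and the running shift accumulates the shift counts from
    `second` for active positions to the right of `key`. The correction
    (adding 2**shift - 1 to a negative base before shifting, so the shift
    rounds toward zero like integer division) is one plausible reading of
    the predetermined condition. Inactive elements are copied from the
    first input.
    """
    result = list(first)
    base = first[key]                 # record the base value at the key position
    total_shift = 0
    for i in range(key + 1, len(first)):
        if not pred[i]:
            continue
        total_shift += second[i]      # accumulate shift counts (running shift)
        shifted = base
        if shifted < 0:               # correction: make >> round toward zero
            shifted += (1 << total_shift) - 1
        result[i] = shifted >> total_shift
    return result


first = [-100, 0, 0, 0]
second = [0, 1, 1, 1]                 # divide by 2 at each active step
pred = [1, 1, 1, 1]
print(running_shift_for_divide(first, second, pred, key=0))  # [-100, -50, -25, -12]
```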
Abstract:
Embodiments of a system and a method in which a processor may execute instructions that cause the processor to receive an operand vector, a selection vector, and a control vector are disclosed. The executed instructions may also cause the processor to perform a wrapping rotate previous operation dependent upon the operand, selection, and control vectors.
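Since the abstract does not define the operation's semantics, the following sketch is speculative: it assumes the selection vector picks which elements participate in the rotation, each participating position receives the value of the previous participating element (wrapping around), and the control vector gates which result elements are written.

```python
def wrapping_rotate_prev(operand, selection, control):
    """Speculative sketch of a wrapping "rotate previous" operation.

    Assumptions: selected positions form the rotation group, each
    selected position receives the value of the previous selected
    element (the first wraps around to the last), and positions not
    written (control clear, or not selected) keep the operand value.
    """
    selected = [i for i, s in enumerate(selection) if s]
    rotated = {}
    if selected:
        prev_values = [operand[selected[-1]]] + [operand[i] for i in selected[:-1]]
        rotated = dict(zip(selected, prev_values))
    return [rotated.get(i, operand[i]) if control[i] else operand[i]
            for i in range(len(operand))]


operand = [10, 20, 30, 40]
selection = [1, 1, 0, 1]
control = [1, 1, 1, 1]
print(wrapping_rotate_prev(operand, selection, control))  # [40, 10, 30, 20]
```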
Abstract:
A memory permissions model for a processor that is based on the memory address accessed by an instruction as well as the program counter of the instruction. These permissions may be stored in permissions tables and indexed using the memory address of the instruction and the addresses of the memory locations that it accesses. Those indexes may be obtained from a page table in some cases. These memory permissions may be used in conjunction with other permissions, such as execute permissions and secondary execution privileges that are based on whether the instruction belongs to a particular instruction group.
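A small Python model of this kind of lookup follows, assuming each page-table entry carries a region index and the permissions table is keyed by the (instruction-page region, data-page region) pair; the index widths and table contents here are invented for illustration.

```python
READ, WRITE = 1, 2


class PcBasedMemoryPermissions:
    """Illustrative model of permissions keyed by both the instruction's
    address and the accessed address.

    Assumption: each page-table entry carries a small region index, and
    the permissions table is indexed by (region of the instruction's
    page, region of the accessed page).
    """

    def __init__(self, page_region_index, table):
        self.page_region_index = page_region_index  # page number -> region index
        self.table = table                          # (pc_region, data_region) -> perm bits

    def check(self, pc, addr, want):
        pc_region = self.page_region_index[pc >> 12]     # indexes come from the page table
        data_region = self.page_region_index[addr >> 12]
        allowed = self.table.get((pc_region, data_region), 0)
        return (allowed & want) == want


perms = PcBasedMemoryPermissions(
    page_region_index={0x10: 0, 0x20: 1},   # code page -> region 0, data page -> region 1
    table={(0, 1): READ | WRITE},            # region-0 code may read/write region-1 data
)
print(perms.check(pc=0x10_200, addr=0x20_080, want=WRITE))  # True
print(perms.check(pc=0x20_100, addr=0x20_080, want=WRITE))  # False: no entry for (1, 1)
```

These table lookups would then be combined with any execute permissions or instruction-group-based privileges that apply to the instruction, as the abstract notes.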