Abstract:
A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have a same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, to perform the masked version of the given packed data operation.
Abstract:
A heterogeneous processor architecture and a method of booting a heterogeneous processor is described. A processor according to one embodiment comprises: a set of large physical processor cores; a set of small physical processor cores having relatively lower performance processing capabilities and relatively lower power usage relative to the large physical processor cores; and a package unit, to enable a bootstrap processor. The bootstrap processor initializes the homogeneous physical processor cores, while the heterogeneous processor presents the appearance of a homogeneous processor to a system firmware interface.
Abstract:
A processor includes multiple physical cores that support multiple logical cores of different core types, where the core types include a big core type and a small core type. A multi-threaded application includes multiple software threads are concurrently executed by a first subset of logical cores in a first time slot. Based on data gathered from monitoring the execution in the first time slot, the processor selects a second subset of logical cores for concurrent execution of the software threads in a second time slot. Each logical core in the second subset has one of the core types that matches the characteristics of one of the software threads.
Abstract:
A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have a same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, to perform the masked version of the given packed data operation.
Abstract:
A heterogeneous processor architecture is described. For example, a processor according to one embodiment of the invention comprises: a set of large physical processor cores; a set of small physical processor cores having relatively lower performance processing capabilities and relatively lower power usage relative to the large physical processor cores; virtual-to-physical (V-P) mapping logic to expose the set of large physical processor cores to software through a corresponding set of virtual cores and to hide the set of small physical processor core from the software.
Abstract:
A processor is operable to process conditional branches. The processor includes instruction fetch logic to fetch a conditional short forward branch. The conditional short forward branch is to include a conditional branch instruction and a set of one or more instructions that are to sequentially follow the conditional branch instruction in program order. The set of the one or more instructions are between the conditional branch instruction and a forward branch target instruction that is to be indicated by the conditional branch instruction. The processor also includes instruction conversion logic coupled with the instruction fetch logic. The instruction conversion logic is to convert the conditional short forward branch to a computationally equivalent set of one or more predicated instructions. Other processors are also disclosed, as are various methods and systems.
Abstract:
A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have a same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, to perform the masked version of the given packed data operation.
Abstract:
A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have a same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, to perform the masked version of the given packed data operation.
Abstract:
A processor of an aspect includes a set of registers capable of storing packed data. An execution unit is coupled with the set of registers. The execution unit is to access the set of registers in at least two different ways in response to instructions. The at least two different ways include a first way in which the set of registers are to represent a plurality of N-bit registers. The at least two different ways also include a second way in which the set of registers are to represent a single register of at least 2N-bits. In one aspect, the at least 2N-bits is to be at least 256-bits.
Abstract:
A processor of an aspect includes at least one lower processing capability and lower power consumption physical compute element and at least one higher processing capability and higher power consumption physical compute element. Migration performance benefit evaluation logic is to evaluate a performance benefit of a migration of a workload from the at least one lower processing capability compute element to the at least one higher processing capability compute element, and to determine whether or not to allow the migration based on the evaluated performance benefit. Available energy and thermal budget evaluation logic is to evaluate available energy and thermal budgets and to determine to allow the migration if the migration fits within the available energy and thermal budgets. Workload migration logic is to perform the migration when allowed by both the migration performance benefit evaluation logic and the available energy and thermal budget evaluation logic.