-
Publication No.: US10365988B2
Publication date: 2019-07-30
Application No.: US15705854
Application date: 2017-09-15
Applicant: Intel Corporation
Abstract: Embodiments disclosed herein provide for monitoring performance of a processing device to manage non-precise events. A processing device includes a performance counter to track a non-precise event and to increment upon occurrence of the non-precise event, wherein the non-precise event comprises a first type of performance event that is not linked to an instruction in an instruction trace. The processing device also includes a first handler circuit to generate and store a first record, the first record comprising architectural metadata defining a state of the processing device at a time of generation of the first record, wherein the first handler circuit is to generate records corresponding to precise events. The processing device further includes a second handler circuit communicably coupled to the first handler circuit, the second handler circuit to cause the first handler circuit to generate a second record for the non-precise event upon overflow of the performance counter.
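The overflow-driven flow described in this abstract can be pictured with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class names `RecordHandler` and `NonPreciseSampler`, the `period` parameter, and the event name `cycles_stalled` are all hypothetical, and the "architectural metadata" is reduced to a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class RecordHandler:
    """Hypothetical stand-in for the first handler circuit: it stores records."""
    records: list = field(default_factory=list)

    def generate(self, event: str, state: dict) -> None:
        # Snapshot the (simulated) architectural state at record-generation time.
        self.records.append({"event": event, "state": dict(state)})


@dataclass
class NonPreciseSampler:
    """Hypothetical stand-in for the counter plus the second handler circuit."""
    handler: RecordHandler
    event: str
    period: int      # assumed: number of events between two records
    count: int = 0

    def on_event(self, state: dict) -> None:
        self.count += 1
        if self.count >= self.period:                  # simulated counter overflow
            self.handler.generate(self.event, state)   # ask the handler for a record
            self.count = 0                             # re-arm the counter


handler = RecordHandler()
sampler = NonPreciseSampler(handler, event="cycles_stalled", period=4)
for cycle in range(10):
    sampler.on_event({"ip": 0x1000 + cycle, "cycle": cycle})

sampled_cycles = [r["state"]["cycle"] for r in handler.records]
```

With a period of 4, ten events produce two records, taken at the fourth and eighth event. The point of the model is that the record carries the state at the time of generation, since a non-precise event is not tied to a specific instruction.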
-
Publication No.: US20200210178A1
Publication date: 2020-07-02
Application No.: US16811242
Application date: 2020-03-06
Applicant: Intel Corporation
Inventors: Michael W. Chynoweth, Jonathan D. Combs, Joseph K. Olivas, Beeman C. Strong, Rajshree A. Chabukswar, Ahmad Yasin, Jason W. Brandt, Ofer Levy, John M. Esper, Andreas Kleen, Christopher M. Chrulski
IPC classification: G06F9/30
Abstract: A processor includes a counter to store a cycle count that tracks a number of cycles between retirement of a first branch instruction and retirement of a second branch instruction during execution of a set of instructions. The processor further includes a stack of registers coupled to the counter, wherein the stack of registers is to store branch type information including: a first value of the counter when the first branch instruction is retired; a second value of the counter when the second branch instruction is retired; a first type information value indicating a type of the first branch instruction; and a second type information value indicating a type of the second branch instruction.
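As a rough illustration of the register stack in this abstract, the sketch below keeps a bounded stack of branch records, each holding a counter value and a branch type captured at retirement. The names `BranchRecord` and `BranchStack`, the stack depth, the branch type strings, and the use of a free-running counter whose values are subtracted afterwards are all assumptions made for the example; the patent text does not prescribe them.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchRecord:
    from_ip: int
    to_ip: int
    branch_type: str   # assumed labels such as "call", "ret", "jcc"
    cycle_count: int   # counter value captured when the branch retired


class BranchStack:
    """Hypothetical model of a stack of branch-record registers."""

    def __init__(self, depth: int = 4):
        self.entries = deque(maxlen=depth)   # oldest record is dropped when full
        self.cycle = 0                       # assumed free-running cycle counter

    def tick(self, cycles: int = 1) -> None:
        self.cycle += cycles

    def retire_branch(self, from_ip: int, to_ip: int, branch_type: str) -> None:
        self.entries.append(BranchRecord(from_ip, to_ip, branch_type, self.cycle))

    def cycles_between_last_two(self):
        if len(self.entries) < 2:
            return None
        return self.entries[-1].cycle_count - self.entries[-2].cycle_count


stack = BranchStack(depth=4)
stack.tick(5)
stack.retire_branch(0x400100, 0x400800, "call")
stack.tick(12)
stack.retire_branch(0x400820, 0x400105, "ret")
elapsed = stack.cycles_between_last_two()
```

Here `elapsed` is 12, the number of simulated cycles between the two retirements. A profiler could pair that figure with the two type values to attribute the time to the call/return pair.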
-
Publication No.: US10067762B2
Publication date: 2018-09-04
Application No.: US15201218
Application date: 2016-07-01
Applicant: Intel Corporation
IPC classification: G06F9/30
Abstract: Apparatuses, methods, and systems relating to memory disambiguation are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction, an execution unit to execute the decoded instruction, a retirement unit to retire an executed instruction in program order, and a memory disambiguation circuit to allocate an entry in a memory disambiguation table for a first load instruction that is to be flushed for a memory ordering violation, the entry comprising a counter value and an instruction pointer for the first load instruction.
-
Publication No.: US09766999B2
Publication date: 2017-09-19
Application No.: US14292140
Application date: 2014-05-30
Applicant: Intel Corporation
CPC classification: G06F11/348, G06F11/3466, G06F2201/86, G06F2201/88
Abstract: In accordance with embodiments disclosed herein, there are provided systems and methods for monitoring performance of a processing device to manage non-precise events. A processing device includes a performance counter to increment upon occurrence of a non-precise event in the processing device. The processing device also includes a precise event based sampling (PEBS) enable control communicably coupled to the performance counter. The processing device also includes a PEBS handler to generate and store a PEBS record including architectural metadata defining a state of the processing device at a time of generation of the PEBS record. The processing device further includes a non-precise event based sampling (NPEBS) module communicably coupled to the PEBS control and the PEBS handler. The NPEBS module causes the PEBS handler to generate the PEBS record for the non-precise event upon overflow of the performance counter.
-
Publication No.: US20240220253A1
Publication date: 2024-07-04
Application No.: US18148397
Application date: 2022-12-29
Applicant: Intel Corporation
Inventors: Mathew Lowes, Martin J. Licht, Jonathan D. Combs
IPC classification: G06F9/30, G06F12/0875
CPC classification: G06F9/30047, G06F12/0875
Abstract: Techniques for implementing a variable width unaligned fetch for instructions are described. In certain examples, a hardware processor core includes fetch circuitry to perform a single fetch operation to fetch from a paged memory: (i) a multiple cache line width of instruction data, between a minimum width that is greater than one cache line and a maximum width that is a plurality of cache lines, when the multiple cache line width of the instruction data does not include a page boundary of the paged memory, and (ii) less than or equal to one cache line width of the instruction data when the multiple cache line width of the instruction data does include the page boundary of the paged memory; decoder circuitry to decode a single instruction, comprising an opcode, from the instruction data into a decoded instruction; and execution circuitry to execute the decoded instruction according to the opcode.
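The width selection in cases (i) and (ii) can be worked through numerically. The function below is a minimal sketch, assuming 64-byte cache lines, 4 KiB pages, and a fixed wide fetch of two cache lines; the name `fetch_width`, those constants, and the choice to stop at the end of the current cache line in the narrow case are illustrative assumptions rather than details taken from the patent.

```python
CACHE_LINE = 64    # assumed cache line size in bytes
PAGE_SIZE = 4096   # assumed page size in bytes


def fetch_width(addr: int, max_lines: int = 2) -> int:
    """Return how many bytes a single fetch starting at addr would cover."""
    wide = max_lines * CACHE_LINE
    bytes_to_page_end = PAGE_SIZE - (addr % PAGE_SIZE)
    if wide <= bytes_to_page_end:
        # Case (i): the wide, unaligned window stays inside the page.
        return wide
    # Case (ii): the wide window would cross the page boundary, so fetch
    # at most one cache line, ending at the current line boundary.
    return CACHE_LINE - (addr % CACHE_LINE)


widths = {
    "page_start": fetch_width(0x1000),
    "ends_at_page_end": fetch_width(0x1F80),
    "mid_line_near_end": fetch_width(0x1FA0),
    "last_line_aligned": fetch_width(0x1FC0),
    "last_line_unaligned": fetch_width(0x1FF0),
}
```

Under these assumptions the fetch is 128 bytes wherever the window fits in the page, and it shrinks to 64 bytes or fewer near the end of the page, so a single fetch never needs a second address translation.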
-
Publication No.: US10592244B2
Publication date: 2020-03-17
Application No.: US15423143
Application date: 2017-02-02
Applicant: Intel Corporation
Inventors: Michael W. Chynoweth, Jonathan D. Combs, Joseph K. Olivas, Beeman C. Strong, Rajshree A. Chabukswar, Ahmad Yasin, Jason W. Brandt, Ofer Levy, John M. Esper, Andreas Kleen, Christopher M. Chrulski
IPC classification: G06F9/30
Abstract: An example processor includes a decoder, an execution circuit, a counter, and a last branch recorder (LBR) register. The decoder may decode a branch instruction for a program. The execution circuit may be coupled to the decoder, where the execution circuit may execute the branch instruction. The counter may be coupled to the execution circuit, where the counter may store a cycle count. The LBR register may be coupled to the execution circuit, where the LBR register may include a counter field to store a first value of the counter when the branch instruction is executed and a type field to store type information indicating a type of the branch instruction.
-
Publication No.: US10417001B2
Publication date: 2019-09-17
Application No.: US13728416
Application date: 2012-12-27
Applicant: Intel Corporation
Abstract: Embodiments of an invention for a physical register table for eliminating move instructions are disclosed. In one embodiment, a processor includes a physical register file, a register allocation table, and a physical register table. The register allocation table is to store mappings of logical registers to physical registers. The physical register table is to store entries including pointers to physical registers in the mappings. The number of entry locations in the physical register table is less than the number of physical registers in the physical register file.
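One way to see why a small physical register table is enough is a toy renamer in which the table tracks only the physical registers currently shared by more than one logical register. This is a sketch built on assumptions the abstract does not state: the class `Renamer`, the use of sharing counts as the table contents, and the fallback of allocating a fresh register when the table is full are all hypothetical, and in-flight instructions are ignored.

```python
class Renamer:
    """Toy register renamer with a bounded table of shared physical registers."""

    def __init__(self, num_phys: int = 8, prt_entries: int = 2):
        self.free = list(range(num_phys))   # free physical registers
        self.rat = {}                       # logical name -> physical register
        self.prt = {}                       # shared physical register -> sharer count
        self.prt_entries = prt_entries      # assumed to be smaller than num_phys

    def _release(self, logical: str) -> None:
        phys = self.rat.pop(logical, None)
        if phys is None:
            return
        if phys in self.prt:
            self.prt[phys] -= 1
            if self.prt[phys] == 1:         # one owner left: no longer shared
                del self.prt[phys]
        else:
            self.free.append(phys)          # last owner gone: register is free

    def write(self, dst: str) -> None:
        """An ordinary instruction writes dst, so dst gets a new physical register."""
        self._release(dst)
        self.rat[dst] = self.free.pop(0)

    def move(self, dst: str, src: str) -> bool:
        """Try to eliminate 'mov dst, src'; return True if it was eliminated."""
        if dst == src:
            return True
        phys = self.rat[src]
        if phys in self.prt or len(self.prt) < self.prt_entries:
            self._release(dst)
            self.prt[phys] = self.prt.get(phys, 1) + 1
            self.rat[dst] = phys            # dst now aliases src's register
            return True
        self.write(dst)                     # table full: treat as a normal write
        return False


r = Renamer(num_phys=8, prt_entries=1)
r.write("rax")
r.write("rbx")
first = r.move("rcx", "rax")    # eliminated: rcx shares rax's register
second = r.move("rdx", "rbx")   # table full: rdx gets its own register
```

In this run the first move is eliminated and the second is not, because the single table entry is already in use. Only shared registers need an entry, which is the intuition for a table smaller than the register file.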
-
Publication No.: US20180088956A1
Publication date: 2018-03-29
Application No.: US15280460
Application date: 2016-09-29
Applicant: Intel Corporation
Inventor: Jonathan D. Combs
CPC classification: G06F9/3802, G06F9/3822, G06F9/3836
Abstract: A processor includes a back end to execute decoded instructions and a front end. The front end includes two decode clusters and circuitry to receive data elements representing undecoded instructions, in program order, and to direct subsets of the data elements to the decode clusters. An IP generator directs one subset of data elements to the first cluster, detects a condition indicating that a load balancing action should be taken, and directs a subset of data elements immediately following the first subset in program order to the first or second decode cluster dependent on the action taken. The action may include annotating a BTB entry, inserting a fake branch in the BTB, forcing a cluster switch, or suppressing a cluster switch. The detected condition may be a predicted taken branch or an annotation thereof, or a heuristic based on a queue state, a count of uops, or a latency value.
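The steering decision in this abstract can be sketched as a loop that assigns each block of instructions to one of two clusters and switches clusters when a condition is detected. The function `steer`, the block names, and the `max_uops_per_run` threshold are hypothetical; only two of the listed conditions are modeled (a predicted taken branch and a uop-count heuristic that forces a switch), and BTB annotation and switch suppression are left out.

```python
def steer(blocks, taken_branch_ends, max_uops_per_run=8):
    """Assign each block, in program order, to decode cluster 0 or 1.

    blocks: list of (name, uop_count) tuples in program order.
    taken_branch_ends: names of blocks ending in a predicted taken branch.
    max_uops_per_run: assumed threshold after which a switch is forced.
    """
    cluster = 0
    run_uops = 0
    assignment = []
    for name, uops in blocks:
        assignment.append((name, cluster))
        run_uops += uops
        if name in taken_branch_ends or run_uops >= max_uops_per_run:
            cluster ^= 1      # the following block goes to the other cluster
            run_uops = 0
    return assignment


blocks = [("A", 3), ("B", 2), ("C", 6), ("D", 4), ("E", 1)]
assignment = steer(blocks, taken_branch_ends={"B"})
```

For this input the switch after block B comes from the predicted taken branch and the switch after block D comes from the uop-count heuristic, which keeps a long branch-free run from starving the second cluster.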
-
Publication No.: US20230092268A1
Publication date: 2023-03-23
Application No.: US17992407
Application date: 2022-11-22
Applicant: Intel Corporation
Inventors: Michael W. Chynoweth, Jonathan D. Combs, Joseph K. Olivas, Beeman C. Strong, Rajshree A. Chabukswar, Ahmad Yasin, Jason W. Brandt, Ofer Levy, John M. Esper, Andreas Kleen, Christopher M. Chrulski
IPC classification: G06F9/30
Abstract: A processor includes a counter to store a cycle count that tracks a number of cycles between retirement of a first branch instruction and retirement of a second branch instruction during execution of a set of instructions. The processor further includes a stack of registers coupled to the counter, wherein the stack of registers is to store branch type information including: a first value of the counter when the first branch instruction is retired; a second value of the counter when the second branch instruction is retired; a first type information value indicating a type of the first branch instruction; and a second type information value indicating a type of the second branch instruction.
-
Publication No.: US10331454B2
Publication date: 2019-06-25
Application No.: US15280460
Application date: 2016-09-29
Applicant: Intel Corporation
Inventor: Jonathan D. Combs
IPC classification: G06F9/38
Abstract: A processor includes a back end to execute decoded instructions and a front end. The front end includes two decode clusters and circuitry to receive data elements representing undecoded instructions, in program order, and to direct subsets of the data elements to the decode clusters. An IP generator directs one subset of data elements to the first cluster, detects a condition indicating that a load balancing action should be taken, and directs a subset of data elements immediately following the first subset in program order to the first or second decode cluster dependent on the action taken. The action may include annotating a BTB entry, inserting a fake branch in the BTB, forcing a cluster switch, or suppressing a cluster switch. The detected condition may be a predicted taken branch or an annotation thereof, or a heuristic based on a queue state, a count of uops, or a latency value.
-