-
公开(公告)号:US11579934B2
公开(公告)日:2023-02-14
申请号:US17208928
申请日:2021-03-22
Applicant: Apple Inc.
Inventor: Jeremy C. Andrus , John G. Dorsey , James M. Magee , Daniel A. Chimene , Cyril de la Cropte de Chanterac , Bryan R. Hinch , Aditya Venkataraman , Andrei Dorofeev , Nigel R. Gamble , Russell A. Blaine , Constantin Pistol , James S. Ismail
IPC: G06F9/50 , G06F9/48 , G06F1/3234 , G06F1/329 , G06F1/3296 , G06F9/38 , G06F9/26 , G06F9/54 , G06F1/20 , G06F1/324 , G06F1/3206 , G06F9/30
Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.
-
公开(公告)号:US10942850B2
公开(公告)日:2021-03-09
申请号:US16513225
申请日:2019-07-16
Applicant: Apple Inc.
Inventor: John G. Dorsey , Aditya Venkataraman , Bryan R. Hinch , Daniel A. Chimene , Andrei Dorofeev , Constantin Pistol
IPC: G06F12/00 , G06F12/0802
Abstract: A processing system can include a plurality of processing clusters. Each processing cluster can include a plurality of processor cores and a last level cache. Each processor core can include one or more dedicated caches and a plurality of counters. The plurality of counters may be configured to count different types of cache fills. The plurality of counters may be configured to count different types of cache fills, including at least one counter configured to count total cache fills and at least one counter configured to count off-cluster cache fills. Off-cluster cache fills can include at least one of cross-cluster cache fills and cache fills from system memory. The processing system can further include one or more controllers configured to control performance of one or more of the clusters, the processor cores, the fabric, and the memory responsive to cache fill metrics derived from the plurality of counters.
-
公开(公告)号:US20180349176A1
公开(公告)日:2018-12-06
申请号:US15870763
申请日:2018-01-12
Applicant: Apple Inc.
Inventor: Jeremy C. Andrus , John G. Dorsey , James M. Magee , Daniel A. Chimene , Cyril de la Cropte de Chanterac , Bryan R. Hinch , Aditya Venkataraman , Andrei Dorofeev , Nigel R. Gamble , Russell A. Blaine , Constantin Pistol
CPC classification number: G06F9/505 , G06F1/206 , G06F1/3206 , G06F1/324 , G06F1/3243 , G06F1/329 , G06F1/3296 , G06F9/268 , G06F9/30145 , G06F9/3851 , G06F9/3891 , G06F9/4856 , G06F9/4881 , G06F9/4893 , G06F9/5044 , G06F9/5094 , G06F9/54 , G06F2209/501 , G06F2209/5018 , G06F2209/509
Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.
-
公开(公告)号:US11080095B2
公开(公告)日:2021-08-03
申请号:US15870766
申请日:2018-01-12
Applicant: Apple Inc.
Inventor: Jeremy C. Andrus , John G. Dorsey , James M. Magee , Daniel A. Chimene , Cyril de la Cropte de Chanterac , Bryan R. Hinch , Aditya Venkataraman , Andrei Dorofeev , Nigel R. Gamble , Russell A. Blaine , Constantin Pistol
IPC: G06F9/46 , G06F9/50 , G06F9/48 , G06F1/3234 , G06F1/329 , G06F1/3296 , G06F9/38 , G06F9/26 , G06F9/54 , G06F1/20 , G06F1/324 , G06F1/3206 , G06F9/30
Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.
-
公开(公告)号:US10599481B2
公开(公告)日:2020-03-24
申请号:US15870770
申请日:2018-01-12
Applicant: Apple Inc.
Inventor: Constantin Pistol , Daniel A. Chimene , Jeremy C. Andrus , Russell A. Blaine , Kushal Dalmia
IPC: G06F9/50 , G06F9/48 , G06F1/3234 , G06F1/329 , G06F1/3296 , G06F9/38 , G06F9/26 , G06F9/54 , G06F1/20 , G06F1/324 , G06F1/3206 , G06F9/30
Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.
-
公开(公告)号:US10445102B1
公开(公告)日:2019-10-15
申请号:US15707834
申请日:2017-09-18
Applicant: Apple Inc.
Inventor: Constantin Pistol , Ian D. Kountanis
Abstract: Systems, apparatuses, and methods for efficient program flow prediction. After receiving a current fetch address, a first predictor performs a lookup of a first table. When the lookup results in a miss and the first table has no available entries, the first predictor overwrites a given entry of the first table with the received fetch address, in response to detecting a strength value for the given entry is below a threshold. Otherwise, in response to detecting no entries of the first table have a strength value below the threshold, the first predictor allocates an entry in the second table for the received fetch address. When an indication of a target address for the received fetch address is a return address for a function call, a third predictor allocates an entry of a third table with the received fetch address.
-
公开(公告)号:US20210318909A1
公开(公告)日:2021-10-14
申请号:US17208928
申请日:2021-03-22
Applicant: Apple Inc.
Inventor: Jeremy C. Andrus , John G. Dorsey , James M. Magee , Daniel A. Chimene , Cyril de la Cropte de Chanterac , Bryan R. Hinch , Aditya Venkataraman , Andrei Dorofeev , Nigel R. Gamble , Russell A. Blaine , Constantin Pistol , James S. Ismail
IPC: G06F9/50 , G06F9/48 , G06F1/3234 , G06F1/329 , G06F1/3296 , G06F9/38 , G06F9/26 , G06F9/54 , G06F1/20 , G06F1/324
Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.
-
公开(公告)号:US10747539B1
公开(公告)日:2020-08-18
申请号:US15350486
申请日:2016-11-14
Applicant: Apple Inc.
Inventor: James Robert Howard Hakewill , Constantin Pistol
IPC: G06F9/30 , G06F9/38 , G06F12/0875
Abstract: Systems, apparatuses, and methods for instruction next fetch prediction. A scan-on-fill target predictor in a processor generates a predicted next fetch address for the instruction fetch unit. When a group of instructions is used to fill an instruction cache but is not currently being retrieved from the instruction cache for processing by other pipeline stages, the group of instructions are scanned to identify exit points of basic blocks within the group. An entry of a table in the scan-on-fill target predictor is allocated for an instruction in a basic block in the group when the basic block has an exit point with a target address that can be resolved within a single clock cycle. The scan-on-fill target predictor may perform a lookup of the table with the current fetch address. The prediction may be compared to a main branch predictor at a later pipeline stage for training purposes.
-
9.
公开(公告)号:US20180349186A1
公开(公告)日:2018-12-06
申请号:US15870764
申请日:2018-01-12
Applicant: Apple Inc.
Inventor: Jeremy C. Andrus , John G. Dorsey , James M. Magee , Daniel A. Chimene , Cyril de la Cropte de Chanterac , Bryan R. Hinch , Aditya Venkataraman , Andrei Dorofeev , Nigel R. Gamble , Russell A. Blaine , Constantin Pistol
Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.
-
公开(公告)号:US12182003B1
公开(公告)日:2024-12-31
申请号:US17647539
申请日:2022-01-10
Applicant: Apple Inc.
Inventor: Jaidev P. Patwardhan , Matthias Knoth , Shekhar S. Srikantaiah , Prakhar Malhotra , Matthew C. Widmann , Dmitriy B. Solomonov , Constantin Pistol
Abstract: An apparatus includes a processor circuit that includes a memory circuit, one or more processor cores, and a debug circuit. The debug circuit may be configured, in response to activation of a trace mode to record information indicative of instructions executing on the one or more processor cores, to write a trace data stream to the memory circuit that includes trace data collected on the instructions executing on the one or more processor cores. In response to a particular instruction within one of the processor cores specifying a write of a data value to an architecturally visible trace register, the debug circuit may be further configured to output the data value to the trace data stream as part of executing the particular instruction.
-
-
-
-
-
-
-
-
-