Patent search: ap:("Apple Inc.") AND inv:"Mridul Agarwal" (page 1)

1.

Granted invention patent
Shared learning table for load value prediction and load address prediction (status: in force)

Publication No.: US12067398B1

Publication date: 2024-08-20

Application No.: US17661491

Filing date: 2022-04-29

Applicant: Apple Inc.

Inventors: Yuan C. Chou; Debasish Chandra; Mridul Agarwal; Haoyan Jia

IPC classification: G06F9/38

CPC classification: G06F9/3842, G06F9/383, G06F9/3832

Abstract: Techniques are disclosed relating to load value prediction. In some embodiments, a processor includes learning table circuitry that is shared for both address and value prediction. Loads may be trained for value prediction when they are eligible for both value and address prediction. Entries in the learning table may be promoted to an address prediction table or a load value prediction table for prediction, e.g., when they reach a threshold confidence level in the training table. In some embodiments, the learning table stores a hash of a predicted load value and control circuitry uses a probing load to retrieve the actual predicted load value for the value prediction table.
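
The promotion flow in this abstract can be sketched roughly as follows. This is only an illustration: the class and field names, the 16-bit fold hash, and the default threshold of 3 are assumptions, not details from the patent, and the probing load that retrieves the actual value is not modeled.

```python
from dataclasses import dataclass


def fold_hash(value: int) -> int:
    """Hypothetical 16-bit hash of a load value (not specified in the abstract)."""
    return (value ^ (value >> 16)) & 0xFFFF


@dataclass
class LearningEntry:
    kind: str        # "value" or "address"
    observed: int    # value hash or address seen on the previous execution
    confidence: int = 0


class SharedLearningTable:
    """One training table shared by value and address prediction."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold   # assumed promotion threshold
        self.entries = {}            # load PC -> LearningEntry
        self.promoted = {}           # load PC -> prediction table it was promoted to

    def train(self, pc, address, value, value_eligible, address_eligible):
        # A load eligible for both kinds is trained for value prediction.
        if value_eligible:
            kind, observed = "value", fold_hash(value)
        elif address_eligible:
            kind, observed = "address", address
        else:
            return None
        entry = self.entries.get(pc)
        if entry is None or entry.kind != kind or entry.observed != observed:
            # Allocate a new entry, or restart training after a mismatch.
            self.entries[pc] = LearningEntry(kind, observed)
            return None
        entry.confidence += 1
        if entry.confidence >= self.threshold:
            # Promote to the matching prediction table and free the entry.
            del self.entries[pc]
            self.promoted[pc] = kind
            return kind
        return None
```

In this sketch `train` returns the name of the table an entry was promoted to, or `None` while the load is still training.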

2.

Granted invention patent
Load/store ordering violation management (status: in force)

Publication No.: US10983801B2

Publication date: 2021-04-20

Application No.: US16562675

Filing date: 2019-09-06

Applicant: Apple Inc.

Inventors: Kulin N. Kothari; Mridul Agarwal

IPC classification: G06F9/38, G06F9/30

Abstract: A processor includes a load/store unit that includes one or more load pipelines and one or more store pipelines. Load operations may be issued into the load pipelines out of order with respect to older store operations. If a load operation is executed out of order with an older store operation that writes one or more bytes read by the load operation, and if the store operation is issued shortly after the load operation, such that the load operation is still in the load pipeline when the store operation is issued, some cases of flushing may be converted to replays by detecting the ordering violation while the load operation is still in the load pipeline.

3.

Granted invention patent
Load/store dependency predictor optimization for replayed loads (status: in force)

Publication No.: US10437595B1

Publication date: 2019-10-08

Application No.: US15070435

Filing date: 2016-03-15

Applicant: Apple Inc.

Inventors: Pradeep Kanapathipillai; Stephan G. Meier; Gerard R. Williams, III; Mridul Agarwal; Kulin N. Kothari

IPC classification: G06F9/38, G06F9/30

Abstract: Systems, apparatuses, and methods for optimizing a load-store dependency predictor (LSDP) are described. When a younger load instruction is issued before an older store instruction and the younger load is dependent on the older store, the LSDP is trained on this ordering violation. A replay/flush indicator is stored in a corresponding entry in the LSDP to indicate whether the ordering violation resulted in a flush or replay. On subsequent executions, a dependency may be enforced for the load-store pair if a confidence counter is above a threshold, with the threshold varying based on the status of the replay/flush indicator. If a given load matches on multiple entries in the LSDP, and if at least one of the entries has a flush indicator, then the given load may be marked as a multimatch case and forced to wait to issue until all older stores have issued.
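
A minimal sketch of the two decisions this abstract describes, in Python. The entry layout and the threshold values are hypothetical. In particular, using a lower threshold for entries that caused a flush is an assumption made here for illustration; the abstract only says the threshold varies with the indicator.

```python
from dataclasses import dataclass

# Hypothetical thresholds: which indicator gets the lower one is an assumption.
FLUSH_THRESHOLD = 1
REPLAY_THRESHOLD = 3


@dataclass
class LsdpEntry:
    load_pc: int
    store_pc: int
    confidence: int
    caused_flush: bool  # replay/flush indicator: True = flush, False = replay


def enforce_dependency(entry: LsdpEntry) -> bool:
    """Enforce the load-store dependency when confidence exceeds the
    threshold selected by the entry's replay/flush indicator."""
    threshold = FLUSH_THRESHOLD if entry.caused_flush else REPLAY_THRESHOLD
    return entry.confidence > threshold


def is_multimatch(table: list[LsdpEntry], load_pc: int) -> bool:
    """A load that hits several entries, at least one marked as a flush,
    is treated as a multimatch case and waits for all older stores."""
    matches = [e for e in table if e.load_pc == load_pc]
    return len(matches) > 1 and any(e.caused_flush for e in matches)
```

With these assumed thresholds, two entries with the same confidence of 2 behave differently: the flush-marked entry enforces the dependency, the replay-marked one does not.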

4.

Granted invention patent
Content-directed prefetch circuit with quality filtering (status: in force)

Publication No.: US09886385B1

Publication date: 2018-02-06

Application No.: US15247421

Filing date: 2016-08-25

Applicant: Apple Inc.

Inventors: Tyler J. Huberty; Stephan G. Meier; Mridul Agarwal

IPC classification: G06F12/08, G06F12/0862, G06F12/0897, G06F12/0864

CPC classification: G06F12/0862, G06F12/0864, G06F12/0897, G06F2212/1024, G06F2212/6022, G06F2212/6024

Abstract: In a content-directed prefetcher, a pointer detection circuit identifies a given memory pointer candidate within a data cache line fill from a lower level cache (LLC), where the LLC is at a lower level of a memory hierarchy relative to the data cache. A pointer filter circuit initiates a prefetch request to the LLC for the candidate, dependent on determining that a given counter in a quality factor (QF) table satisfies a QF counter threshold value. The QF table is indexed dependent upon a program counter address and relative cache line offset of the candidate. Upon initiation of the prefetch request, the given counter is updated to reflect a prefetch cost. In response to determining that a subsequent data cache line fill arriving from the LLC corresponds to the prefetch request for the given memory pointer candidate, a particular counter of the QF table may be updated to reflect a successful prefetch credit.
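
The cost/credit accounting in this abstract might look like the sketch below. Everything concrete here is assumed for illustration: the table size, the XOR index function, the counter values, and the reading of "satisfies" as "greater than or equal to".

```python
class QualityFilter:
    """Toy quality factor (QF) table for filtering pointer prefetches."""

    def __init__(self, size=256, threshold=4, initial=8,
                 cost=1, credit=2, max_count=15):
        self.counters = [initial] * size
        self.threshold = threshold
        self.cost = cost
        self.credit = credit
        self.max_count = max_count

    def _index(self, pc: int, line_offset: int) -> int:
        # Hypothetical index: the abstract only says the table is indexed
        # by program counter and relative cache line offset.
        return (pc ^ line_offset) % len(self.counters)

    def try_prefetch(self, pc: int, line_offset: int) -> bool:
        """Issue a prefetch only if the counter meets the threshold,
        then charge the prefetch cost to that counter."""
        i = self._index(pc, line_offset)
        if self.counters[i] < self.threshold:
            return False
        self.counters[i] = max(0, self.counters[i] - self.cost)
        return True

    def on_useful_fill(self, pc: int, line_offset: int) -> None:
        """Credit the counter when a later fill matches the prefetch."""
        i = self._index(pc, line_offset)
        self.counters[i] = min(self.max_count, self.counters[i] + self.credit)
```

A candidate whose prefetches are never followed by a matching fill drains its counter below the threshold and stops generating requests; a matching fill restores it.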

5.

Published invention application
Processing of Synchronization Barrier Instructions (status: under examination, published)

Publication No.: US20240329990A1

Publication date: 2024-10-03

Application No.: US18740430

Filing date: 2024-06-11

Applicant: Apple Inc.

Inventors: Deepankar Duggal; Kulin N Kothari; Mridul Agarwal; Chang Xu; Yanran Yang; Richard F Russo; Yuan C Chou; Douglas C Holman

IPC classification: G06F9/30, G06F9/38, G06F9/52

CPC classification: G06F9/30087, G06F9/3802, G06F9/522

Abstract: A system, e.g., a system on a chip (SOC), may include one or more processors. A processor may execute an instruction synchronization barrier (ISB) instruction to enforce an ordering constraint on instructions. To execute the ISB instruction, the processor may determine whether contexts of the processor required for execution of instructions older than the ISB instruction are consumed for the older instructions. Responsive to determining that the contexts are consumed for the older instructions, the processor may initiate fetching of an instruction younger than the ISB instruction, without waiting for the older instructions to retire.

6.

Published invention application
Scalable Interrupts (status: under examination, published)

Publication No.: US20240311319A1

Publication date: 2024-09-19

Application No.: US18674203

Filing date: 2024-05-24

Applicant: Apple Inc.

Inventors: Jeffrey E. Gonion; Charles E. Tucker; Tal Kuzi; Richard F. Russo; Mridul Agarwal; Christopher M. Tsay; Gideon N. Levinsky; Shih-Chieh Wen; Lior Zimet

IPC classification: G06F13/24, G06F1/26

CPC classification: G06F13/24, G06F1/26

Abstract: An interrupt delivery mechanism for a system includes an interrupt controller and a plurality of cluster interrupt controllers coupled to respective pluralities of processors in an embodiment. The interrupt controller may serially transmit an interrupt request to respective cluster interrupt controllers, which may acknowledge (Ack) or non-acknowledge (Nack) the interrupt based on attempting to deliver the interrupt to processors to which the cluster interrupt controller is coupled. In a soft iteration, the cluster interrupt controller may attempt to deliver the interrupt to processors that are powered on, without attempting to power on processors that are powered off. If the soft iteration does not result in an Ack response from one of the plurality of cluster interrupt controllers, a hard iteration may be performed in which the powered-off processors may be powered on.
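
The soft and hard iterations can be sketched as two passes over the clusters. The `Processor` and `Cluster` types, the `can_accept` flag, and the string return values are hypothetical modeling choices, not part of the patent text.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    powered_on: bool
    can_accept: bool  # hypothetical: whether this core would take the interrupt


@dataclass
class Cluster:
    processors: list

    def try_deliver(self, allow_power_on: bool) -> bool:
        """Return True for Ack, False for Nack."""
        for cpu in self.processors:
            if not cpu.powered_on:
                if not allow_power_on:
                    continue           # soft iteration skips powered-off cores
                cpu.powered_on = True  # hard iteration may power them on
            if cpu.can_accept:
                return True
        return False


def deliver_interrupt(clusters: list) -> str:
    """Query clusters serially: a soft pass first, then a hard pass."""
    for cluster in clusters:
        if cluster.try_deliver(allow_power_on=False):
            return "soft"
    for cluster in clusters:
        if cluster.try_deliver(allow_power_on=True):
            return "hard"
    return "undelivered"
```

The point of the two-pass order in this sketch is that no core is powered on unless every cluster has already Nacked the soft attempt.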

7.

Invention application
DSB Operation with Excluded Region (status: in force)

Publication No.: US20220083338A1

Publication date: 2022-03-17

Application No.: US17469504

Filing date: 2021-09-08

Applicant: Apple Inc.

Inventors: Jeff Gonion; John H. Kelm; James Vash; Pradeep Kanapathipillai; Mridul Agarwal; Gideon N. Levinsky; Richard F. Russo; Christopher M. Tsay

IPC classification: G06F9/30, G06F12/0875, G06F12/02

Abstract: Techniques are disclosed relating to data synchronization barrier operations. A system includes a first processor that may receive a data barrier operation request from a second processor included in the system. Based on receiving that data barrier operation request from the second processor, the first processor may ensure that outstanding load/store operations executed by the first processor that are directed to addresses outside of an exclusion region have been completed. The first processor may respond to the second processor that the data barrier operation request is complete at the first processor, even in the case that one or more load/store operations that are directed to addresses within the exclusion region are outstanding and not complete when the first processor responds that the data barrier operation request is complete.
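
The completion condition in this abstract reduces to a simple predicate, sketched here under the assumption that the exclusion region is a single half-open address range and that outstanding operations are tracked by address. Both are modeling simplifications; the function name is hypothetical.

```python
def barrier_may_complete(outstanding_addrs, excl_start: int, excl_end: int) -> bool:
    """The barrier can be reported complete once every load/store still
    outstanding targets an address inside [excl_start, excl_end)."""
    return all(excl_start <= addr < excl_end for addr in outstanding_addrs)
```

Operations inside the exclusion region never hold up the response; any outstanding operation outside it does.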

8.

Granted invention patent
Out of order store commit (status: in force)

Publication No.: US10228951B1

Publication date: 2019-03-12

Application No.: US14831661

Filing date: 2015-08-20

Applicant: Apple Inc.

Inventors: Kulin N. Kothari; Mridul Agarwal; Pradeep Kanapathipillai

IPC classification: G06F9/38, G06F9/30

Abstract: Systems, apparatuses, and methods for committing store instructions out of order from a store queue are described. A processor may store a first store instruction and a second store instruction in the store queue, wherein the first store instruction is older than the second store instruction. In response to determining the second store instruction is ready to commit to the memory hierarchy, the processor may allow the second store instruction to commit before the first store instruction, in response to determining that all store instructions in the store queue older than the second store instruction are non-speculative. However, if it is determined that at least one store instruction in the store queue older than the second store instruction is speculative, the processor may prevent the second store instruction from committing to the memory hierarchy before the first store instruction.
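
The commit rule in this abstract can be expressed as a check over the older entries of the store queue. The `StoreEntry` type and the oldest-first list ordering are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class StoreEntry:
    speculative: bool
    ready: bool  # ready to commit to the memory hierarchy


def may_commit(queue: list, index: int) -> bool:
    """`queue` is ordered oldest first. The store at `index` may commit
    ahead of older stores only if it is ready and every older store in
    the queue is non-speculative."""
    store = queue[index]
    if not store.ready:
        return False
    return all(not older.speculative for older in queue[:index])
```

Note that an older store does not have to be ready itself for the younger one to pass it; it only has to be non-speculative.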

9.

Published invention application
Decoupling Atomicity from Operation Size (status: under examination, published)

Publication No.: US20240248844A1

Publication date: 2024-07-25

Application No.: US18587289

Filing date: 2024-02-26

Applicant: Apple Inc.

Inventors: Francesco Spadini; Gideon Levinsky; Mridul Agarwal

IPC classification: G06F12/0804, G06F9/30, G06F9/38

CPC classification: G06F12/0804, G06F9/30043, G06F9/3826, G06F9/3834, G06F2212/601

Abstract: In an embodiment, a processor implements a different atomicity size (for memory consistency order) than the operation size. More particularly, the processor may implement a smaller atomicity size than the operation size. For example, for multiple register loads, the atomicity size may be the register size. In another example, the vector element size may be the atomicity size for vector load instructions. In yet another example, multiple contiguous vector elements, but fewer than all the vector elements in a vector register, may be the atomicity size for vector load instructions.

10.

Granted invention patent
Load-store unit with banked queue (status: in force)

Publication No.: US10133571B1

Publication date: 2018-11-20

Application No.: US15171369

Filing date: 2016-06-02

Applicant: Apple Inc.

Inventors: Aditya Kesiraju; Mridul Agarwal; Pradeep Kanapathipillai; Sean M. Reynolds

IPC classification: G06F9/30, G06F9/38

Abstract: A load-store unit having one or more banked queues is disclosed. In one embodiment, a load-store unit includes at least one queue that is subdivided into multiple banks. Although divided into multiple banks, the queue logically appears to software as a single queue. A first bank of the queue includes a first plurality of entries, and a second bank of the queue includes a second plurality of entries, wherein each of the entries is arranged to store memory instructions. Each of the banks is associated with corresponding logic circuitry that controls one or more pointers for that bank. The pointer information may be exchanged between the logic circuits associated with the banks. Based on the pointer information that is exchanged, each bank may output (e.g., for retirement) one entry per cycle.
