-
Publication No.: US11941428B2
Publication Date: 2024-03-26
Application No.: US17657506
Filing Date: 2022-03-31
Applicant: Apple Inc.
Inventor: Sagi Lahav , Lital Levy-Rubin , Gaurav Garg , Gerard R. Williams, III , Samer Nassar , Per H. Hammarlund , Harshavardhan Kaushikkar , Srinivasa Rangan Sridharan , Jeff Gonion
CPC classification number: G06F9/466 , G06F13/1668
Abstract: Techniques are disclosed relating to an I/O agent circuit. The I/O agent circuit may include one or more queues and a transaction pipeline. The I/O agent circuit may issue, to the transaction pipeline from a queue of the one or more queues, a transaction of a series of transactions enqueued in a particular order. The I/O agent circuit may generate, at the transaction pipeline, a determination to return the transaction to the queue based on a detection of one or more conditions being satisfied. Based on the determination, the I/O agent circuit may reject, at the transaction pipeline, up to a threshold number of transactions that issued from the queue after the transaction issued. The I/O agent circuit may insert the transaction at a head of the queue such that the transaction is enqueued at the queue sequentially first for the series of transactions according to the particular order.
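As an illustration of the rejection behavior this abstract describes, the sketch below models the queue and the in-flight transactions in Python; the REJECT_THRESHOLD value and the Txn fields are hypothetical assumptions, not details from the patent.
```python
from collections import deque
from dataclasses import dataclass

REJECT_THRESHOLD = 4  # hypothetical cap on younger transactions rejected alongside

@dataclass
class Txn:
    seq: int   # position in the original enqueue order
    name: str

def reject_to_queue_head(queue: deque, in_flight: list, txn: Txn) -> None:
    """Return `txn` to the head of its queue and reject up to REJECT_THRESHOLD
    transactions that issued from the queue after it, preserving the order."""
    younger = sorted((t for t in in_flight if t.seq > txn.seq),
                     key=lambda t: t.seq)[:REJECT_THRESHOLD]
    for t in reversed(younger):          # re-enqueue youngest first so order holds
        in_flight.remove(t)
        queue.appendleft(t)
    queue.appendleft(txn)                # txn is now sequentially first again

queue = deque([Txn(5, "T5"), Txn(6, "T6")])
in_flight = [Txn(1, "T1"), Txn(2, "T2"), Txn(3, "T3"), Txn(4, "T4")]
reject_to_queue_head(queue, in_flight, Txn(0, "T0"))
print([t.name for t in queue])   # ['T0', 'T1', 'T2', 'T3', 'T4', 'T5', 'T6']
```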
-
Publication No.: US20230125798A1
Publication Date: 2023-04-27
Application No.: US18069033
Filing Date: 2022-12-20
Applicant: Apple Inc.
Inventor: Steven Fishwick , Jeffry E. Gonion , Per H. Hammarlund , Eran Tamari , Lior Zimet , Gerard R. Williams, III
IPC: G06F3/06 , G06F12/02 , G06F12/0871 , G06F12/0882 , G06F12/1045 , G06F12/06 , G06F12/1018 , G06F13/16
Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
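The sketch below illustrates one way programmable bit-hashing of this kind could map addresses to controllers; the controller count, mask values, and parity-fold scheme are assumptions for illustration, not the patented mapping.
```python
NUM_CONTROLLERS = 8  # assumption: power-of-two controller count

# One programmable mask per controller-select bit; each selected address bit
# is XOR-folded (parity) into that bit of the controller index.
CTRL_BIT_MASKS = [0x0000_0040, 0x0000_0180, 0x0000_0600]  # hypothetical values

def controller_for_address(addr: int) -> int:
    index = 0
    for bit, mask in enumerate(CTRL_BIT_MASKS):
        parity = bin(addr & mask).count("1") & 1   # XOR of the masked address bits
        index |= parity << bit
    return index

# Consecutive 64-byte blocks within a page hash to different controllers.
for block in range(4):
    addr = 0x1000 + block * 64
    print(hex(addr), "-> controller", controller_for_address(addr))
```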
-
Publication No.: US20220342806A1
Publication Date: 2022-10-27
Application No.: US17519284
Filing Date: 2021-11-04
Applicant: Apple Inc.
Inventor: Steven Fishwick , Jeffry E. Gonion , Per H. Hammarlund , Eran Tamari , Lior Zimet , Gerard R. Williams, III
IPC: G06F12/02 , G06F12/0871 , G06F12/0882 , G06F12/1045
Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
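Since this abstract is shared with the related publication above, the sketch below illustrates a different element, the compacted pipe address: bits consumed at one routing level are dropped before the address is forwarded downstream. The bit positions are hypothetical.
```python
def drop_bits(addr: int, positions: list) -> int:
    """Remove the given bit positions from addr, packing the remaining bits."""
    out, out_pos = 0, 0
    for pos in range(addr.bit_length() + 1):
        if pos in positions:
            continue                     # consumed at this routing level, drop it
        out |= ((addr >> pos) & 1) << out_pos
        out_pos += 1
    return out

full_addr = 0x1234_5678
# Assume bits 6-8 selected the memory controller and bit 9 selected the channel.
compacted = drop_bits(full_addr, positions=[6, 7, 8, 9])
print(hex(full_addr), "->", hex(compacted))  # 4 fewer bits travel down the pipe
```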
-
Publication No.: US11042373B2
Publication Date: 2021-06-22
Application No.: US16928752
Filing Date: 2020-07-14
Applicant: Apple Inc.
Inventor: Eric Bainville , Jeffry E. Gonion , Ali Sazegari , Gerard R. Williams, III
Abstract: In an embodiment, a computation engine is configured to perform vector multiplications, producing either vector results or outer product (matrix) results. The instructions provided to the computation engine specify a matrix mode or a vector mode for the instructions. The computation engine performs the specified operation. The computation engine may perform numerous computations in parallel, in an embodiment. In an embodiment, the instructions may also specify an offset within the input memories, providing additional flexibility in the location of operands. More particularly, the computation engine may be configured to perform numerous multiplication operations in parallel and to accumulate results in a result memory, performing multiply-accumulate operations for each matrix/vector element in the targeted locations of the output memory.
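The two instruction modes described here can be pictured with a small numpy sketch; the mode flag, operand offsets, and memory names below are assumptions for illustration, not the engine's actual ISA.
```python
import numpy as np

def multiply_accumulate(z, x_mem, y_mem, mode, x_off=0, y_off=0, n=4):
    """Accumulate either an elementwise vector product or an outer product
    of two operand slices into the result memory z."""
    x = x_mem[x_off:x_off + n]
    y = y_mem[y_off:y_off + n]
    if mode == "vector":
        z[:n] += x * y                 # elementwise multiply-accumulate
    else:  # "matrix" mode: accumulate the outer product
        z[:n, :n] += np.outer(x, y)
    return z

x_mem = np.arange(8, dtype=np.float32)
y_mem = np.ones(8, dtype=np.float32)
print(multiply_accumulate(np.zeros(4, np.float32), x_mem, y_mem, "vector", x_off=2))
print(multiply_accumulate(np.zeros((4, 4), np.float32), x_mem, y_mem, "matrix"))
```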
-
Publication No.: US20190294541A1
Publication Date: 2019-09-26
Application No.: US16436635
Filing Date: 2019-06-10
Applicant: Apple Inc.
Inventor: Ali Sazegari , Charles E. Tucker , Jeffry E. Gonion , Gerard R. Williams, III , Chris Cheng-Chieh Lee
IPC: G06F12/08 , H03M7/30 , G06F12/0886
Abstract: Systems, apparatuses, and methods for efficiently moving data for storage and processing are described. In various embodiments, a compression unit within a processor includes multiple hardware lanes, selects two or more input words to compress, and assigns them to two or more of the multiple hardware lanes. As each assigned input word is processed, each word is compared to an entry of a plurality of entries of a table. If it is determined that each of the assigned input words indexes the same entry of the table, the hardware lane with the oldest input word generates a single read request for the table entry and the hardware lane with the youngest input word generates a single write request for updating the table entry upon completing compression. Each hardware lane generates a compressed packet based on its assigned input word.
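The read/write coalescing described here is illustrated by the sketch below, which plans one table read for the lane holding the oldest word and one table write for the lane holding the youngest word in each group of words that index the same entry; the table size and hash function are hypothetical.
```python
TABLE_SIZE = 16

def table_index(word: int) -> int:
    return (word >> 4) % TABLE_SIZE      # hypothetical index hash

def plan_table_accesses(words):
    """Return (reads, writes): one read for the oldest word and one write for
    the youngest word in each group of words that share a table entry."""
    groups = {}
    for lane, word in enumerate(words):  # lower lane index = older word
        groups.setdefault(table_index(word), []).append(lane)
    reads = {idx: lanes[0] for idx, lanes in groups.items()}    # oldest lane reads
    writes = {idx: lanes[-1] for idx, lanes in groups.items()}  # youngest lane writes
    return reads, writes

reads, writes = plan_table_accesses([0x123, 0x125, 0x480, 0x12F])
print("reads:", reads)    # entry 2 -> lane 0, entry 8 -> lane 2
print("writes:", writes)  # entry 2 -> lane 3, entry 8 -> lane 2
```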
-
Publication No.: US10289191B2
Publication Date: 2019-05-14
Application No.: US15866014
Filing Date: 2018-01-09
Applicant: Apple Inc.
Inventor: David J. Williamson , Gerard R. Williams, III
IPC: G06F1/32 , G06F1/3293 , G06F1/3287 , G06F9/46 , G06F1/3206 , G06F1/3234 , G06F1/3296
Abstract: In an embodiment, an integrated circuit may include one or more processors. Each processor may include multiple processor cores, and each core has a different design/implementation and performance level. For example, a core may be implemented for high performance, but may have a higher minimum voltage at which it operates correctly. Another core may be implemented at a lower maximum performance, but may be optimized for efficiency and may operate correctly at a lower minimum voltage. The processor may support multiple processor states (PStates). Each PState may specify an operating point and may be mapped to one of the processor cores. During operation, one of the cores is active: the core to which the current PState is mapped. If a new PState is selected and is mapped to a different core, the processor may automatically context switch the processor state to the newly-selected core and may begin execution on that core.
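The PState-to-core mapping can be pictured with the small model below; the operating-point table, core names, and migration hook are illustrative assumptions, not the patented mechanism.
```python
PSTATE_TABLE = {
    # PState: (core, frequency_mhz, voltage_mv) -- hypothetical operating points
    0: ("efficiency", 600, 550),
    1: ("efficiency", 1200, 650),
    2: ("performance", 2000, 800),
    3: ("performance", 3200, 950),
}

class Processor:
    def __init__(self):
        self.pstate = 0
        self.active_core = PSTATE_TABLE[0][0]

    def set_pstate(self, new_pstate: int) -> None:
        core, freq, volt = PSTATE_TABLE[new_pstate]
        if core != self.active_core:
            # Hardware would save the architectural state of the old core and
            # restore it on the new core before resuming execution there.
            print(f"context switch: {self.active_core} -> {core}")
            self.active_core = core
        self.pstate = new_pstate
        print(f"PState {new_pstate}: {freq} MHz @ {volt} mV on {core} core")

cpu = Processor()
cpu.set_pstate(1)   # stays on the efficiency core
cpu.set_pstate(3)   # migrates to the performance core
```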
-
Publication No.: US09959120B2
Publication Date: 2018-05-01
Application No.: US13750013
Filing Date: 2013-01-25
Applicant: Apple Inc.
Inventor: Josh P. de Cesare , Gerard R. Williams, III , Michael J. Smith , Wei-Han Lien
CPC classification number: G06F9/322 , G06F9/30076
Abstract: In an embodiment, an integrated circuit includes at least one processor. The processor may include a reset vector base address register configured to store a reset vector address for the processor. Responsive to a reset, the processor may be configured to capture a reset vector address on an input, updating the reset vector base address register. Upon release from reset, the processor may initiate instruction execution at the reset vector address. The integrated circuit may further include a logic circuit that is coupled to provide the reset vector address. The logic circuit may include a register that is programmable with the reset vector address. More particularly, in an embodiment, the register may be programmable via a write operation issued by the processor (e.g. a memory-mapped write operation). Accordingly, the reset vector address may be programmable in the integrated circuit, and may be changed from time to time.
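The sketch below models the programmable reset vector as a register in a logic circuit outside the core, written via a memory-mapped store; the register address, class names, and reset sequencing are hypothetical.
```python
RVBAR_ADDR = 0xFFFF_0000          # hypothetical memory-mapped register address
DEFAULT_RESET_VECTOR = 0x0000_0000

class LogicCircuit:
    """Holds the programmable reset vector register outside the processor."""
    def __init__(self):
        self.reset_vector = DEFAULT_RESET_VECTOR

    def mmio_write(self, addr: int, value: int) -> None:
        if addr == RVBAR_ADDR:
            self.reset_vector = value  # software reprograms the reset vector

class Core:
    def __init__(self, logic: LogicCircuit):
        self.logic = logic
        self.pc = None

    def reset(self) -> None:
        # On reset, capture the reset vector input and start fetching there.
        self.pc = self.logic.reset_vector
        print(f"core released from reset, fetching at {self.pc:#010x}")

logic = LogicCircuit()
core = Core(logic)
core.reset()                                  # boots at the default vector
logic.mmio_write(RVBAR_ADDR, 0x8000_1000)     # e.g. point at a new boot image
core.reset()                                  # next reset boots at the new vector
```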
-
Publication No.: US09311100B2
Publication Date: 2016-04-12
Application No.: US13735694
Filing Date: 2013-01-07
Applicant: Apple Inc.
Inventor: Sandeep Gupta , Shyam Sundar , Wei-Han Lien , Gerard R. Williams, III , Conrado Blasco-Allue
CPC classification number: G06F9/3844 , G06F9/30072 , G06F9/3806 , G06F9/3848
Abstract: A circuit for implementing a branch target buffer. The branch target buffer may include a memory that stores a plurality of entries. Each entry may include a tag value, a target value, and a prediction accuracy value. A received index value corresponding to an indirect branch instruction may be used to select one or more entries of the plurality of entries, and a received tag value may then be compared to the tag value of the selected entries in the memory. An entry in the memory may be selected for allocation in response to a determination that the received tag value does not match the tag value of the compared entries. The selected entry may be allocated to the indirect branch instruction dependent upon the prediction accuracy values of the plurality of entries.
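The allocation policy described here is sketched below, with the victim chosen as the entry holding the lowest stored prediction-accuracy value; the set size and the accuracy update rule are assumptions for illustration.
```python
from dataclasses import dataclass

@dataclass
class BTBEntry:
    tag: int
    target: int
    accuracy: int   # small saturating counter tracking prediction accuracy

def lookup_or_allocate(btb_set, tag, fallthrough_target):
    for entry in btb_set:
        if entry.tag == tag:
            return entry.target            # hit: predict the stored target
    # Miss: allocate over the entry with the lowest prediction accuracy.
    victim = min(btb_set, key=lambda e: e.accuracy)
    victim.tag, victim.target, victim.accuracy = tag, fallthrough_target, 0
    return fallthrough_target

btb_set = [BTBEntry(0x1A, 0x4000, 3), BTBEntry(0x2B, 0x5000, 1)]
print(hex(lookup_or_allocate(btb_set, 0x1A, 0x6000)))  # hit  -> 0x4000
print(hex(lookup_or_allocate(btb_set, 0x3C, 0x6000)))  # miss -> evicts the 0x2B entry
```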
-
Publication No.: US09280471B2
Publication Date: 2016-03-08
Application No.: US14081549
Filing Date: 2013-11-15
Applicant: Apple Inc.
Inventor: Manu Gulati , Harshavardhan Kaushikkar , Gurjeet S. Saund , Wei-Han Lien , Gerard R. Williams, III , Sukalpa Biswas , Brian P. Lilly , Shinye Shiu
CPC classification number: G06F12/084 , G06F12/0802 , G06F12/0806 , G06F12/0811 , G06F12/0815 , G06F12/0817 , G06F12/0822 , G06F2212/1028 , Y02D10/13
Abstract: Systems, processors, and methods for sharing an agent's private cache with other agents within a SoC. Many agents in the SoC have a private cache in addition to the shared caches and memory of the SoC. If an agent's processor is shut down or operating at less than full capacity, the agent's private cache can be shared with other agents. When a requesting agent generates a memory request and the memory request misses in the memory cache, the memory cache can allocate the memory request in a separate agent's cache rather than allocating the memory request in the memory cache.
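The allocation decision is sketched below: a request that misses in the shared memory cache may be allocated into the private cache of an agent whose processor is shut down. The agent names, capacities, and idle test are illustrative assumptions.
```python
class Agent:
    def __init__(self, name, cache_lines, processor_active=True):
        self.name = name
        self.processor_active = processor_active
        self.private_cache = {}          # line address -> data
        self.capacity = cache_lines

    def can_lend_cache(self):
        # A shut-down (or underutilized) processor's private cache is lendable.
        return not self.processor_active and len(self.private_cache) < self.capacity

def allocate_on_miss(line_addr, data, memory_cache, agents):
    lender = next((a for a in agents if a.can_lend_cache()), None)
    if lender is not None:
        lender.private_cache[line_addr] = data   # borrow the idle agent's cache
        return f"allocated in {lender.name}'s private cache"
    memory_cache[line_addr] = data
    return "allocated in the shared memory cache"

agents = [Agent("gpu", 1024, processor_active=False), Agent("cpu", 4096)]
print(allocate_on_miss(0x8000, b"line", memory_cache={}, agents=agents))
```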
-
Publication No.: US20140215188A1
Publication Date: 2014-07-31
Application No.: US13749999
Filing Date: 2013-01-25
Applicant: Apple Inc.
Inventor: John H. Mylius , Gerard R. Williams, III , Shyam Sundar Balasubramanian , Conrado Blasco-Allue
IPC: G06F9/30
CPC classification number: G06F9/3836 , G06F9/30145 , G06F9/4881 , G06F9/4887
Abstract: In an embodiment, a processor includes a multi-level dispatch circuit configured to supply operations for execution by multiple parallel execution pipelines. The multi-level dispatch circuit may include multiple dispatch buffers, each of which is coupled to multiple reservation stations. Each reservation station may be coupled to a respective execution pipeline and may be configured to schedule instruction operations (ops) for execution in the respective execution pipeline. The sets of reservation stations coupled to each dispatch buffer may be non-overlapping. Thus, if a given op is to be executed in a given execution pipeline, the op may be sent to the dispatch buffer which is coupled to the reservation station that provides ops to the given execution pipeline.
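The routing rule can be illustrated with the small model below, in which each dispatch buffer feeds a disjoint set of reservation stations and each station serves exactly one pipeline; the pipeline names and the buffer split are hypothetical.
```python
from collections import deque

# reservation station -> execution pipeline (one-to-one)
RS_TO_PIPE = {"rs0": "alu0", "rs1": "alu1", "rs2": "lsu0", "rs3": "lsu1"}

# dispatch buffer -> the non-overlapping set of reservation stations it feeds
DISPATCH_BUFFERS = {"db0": ["rs0", "rs1"], "db1": ["rs2", "rs3"]}

buffers = {name: deque() for name in DISPATCH_BUFFERS}

def dispatch(op: str, target_pipeline: str) -> str:
    """Send an op to the only dispatch buffer whose reservation stations
    reach the pipeline that will execute it."""
    for buf, stations in DISPATCH_BUFFERS.items():
        if any(RS_TO_PIPE[rs] == target_pipeline for rs in stations):
            buffers[buf].append(op)
            return buf
    raise ValueError(f"no dispatch buffer reaches {target_pipeline}")

print(dispatch("add r1, r2, r3", "alu1"))   # -> db0
print(dispatch("ldr r4, [r5]", "lsu0"))     # -> db1
```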
-