Hashing with Soft Memory Folding
    3.
    发明申请

    公开(公告)号:US20220342806A1

    公开(公告)日:2022-10-27

    申请号:US17519284

    申请日:2021-11-04

    Applicant: Apple Inc.

    Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.

    Computation engine that operates in matrix and vector modes

    公开(公告)号:US11042373B2

    公开(公告)日:2021-06-22

    申请号:US16928752

    申请日:2020-07-14

    Applicant: Apple Inc.

    Abstract: In an embodiment, a computation engine is configured to perform vector multiplications, producing either vector results or outer product (matrix) results. The instructions provided to the computation engine specify a matrix mode or a vector mode for the instructions. The computation engine performs the specified operation. The computation engine may perform numerous computations in parallel, in an embodiment. In an embodiment, the instructions may also specify an offset with the input memories, providing additional flexibility in the location of operands. More particularly, the computation engine may be configured to perform numerous multiplication operations in parallel and to accumulate results in a result memory, performing multiply-accumulate operations for each matrix/vector element in the targeted locations of the output memory.

    SYSTEMS AND METHODS FOR PERFORMING MEMORY COMPRESSION

    公开(公告)号:US20190294541A1

    公开(公告)日:2019-09-26

    申请号:US16436635

    申请日:2019-06-10

    Applicant: Apple Inc.

    Abstract: Systems, apparatuses, and methods for efficiently moving data for storage and processing are described. In various embodiments, a compression unit within a processor includes multiple hardware lanes, selects two or more input words to compress, and for assigns them to two or more of the multiple hardware lanes. As each assigned input word is processed, each word is compared to an entry of a plurality of entries of a table. If it is determined that each of the assigned input words indexes the same entry of the table, the hardware lane with the oldest input word generates a single read request for the table entry and the hardware lane with the youngest input word generates a single write request for updating the table entry upon completing compression. Each hardware lane generates a compressed packet based on its assigned input word.

    Processor including multiple dissimilar processor cores

    公开(公告)号:US10289191B2

    公开(公告)日:2019-05-14

    申请号:US15866014

    申请日:2018-01-09

    Applicant: Apple Inc.

    Abstract: In an embodiment, an integrated circuit may include one or more processors. Each processor may include multiple processor cores, and each core has a different design/implementation and performance level. For example, a core may be implemented for high performance, but may have higher minimum voltage at which it operates correctly. Another core may be implemented at a lower maximum performance, but may be optimized for efficiency and may operate correctly at a lower minimum voltage. The processor may support multiple processor states (PStates). Each PState may specify an operating point and may be mapped to one of the processor cores. During operation, one of the cores is active: the core to which the current PState is mapped. If a new PState is selected and is mapped to a different core, the processor may automatically context switch the processor state to the newly-selected core and may begin execution on that core.

    Persistent relocatable reset vector for processor

    公开(公告)号:US09959120B2

    公开(公告)日:2018-05-01

    申请号:US13750013

    申请日:2013-01-25

    Applicant: Apple Inc.

    CPC classification number: G06F9/322 G06F9/30076

    Abstract: In an embodiment, an integrated circuit includes at least one processor. The processor may include a reset vector base address register configured to store a reset vector address for the processor. Responsive to a reset, the processor may be configured to capture a reset vector address on an input, updating the reset vector base address register. Upon release from reset, the processor may initiate instruction execution at the reset vector address. The integrated circuit may further include a logic circuit that is coupled to provide the reset vector address. The logic circuit may include a register that is programmable with the reset vector address. More particularly, in an embodiment, the register may be programmable via a write operation issued by the processor (e.g. a memory-mapped write operation). Accordingly, the reset vector address may be programmable in the integrated circuit, and may be changed from time to time.

    Usefulness indication for indirect branch prediction training
    8.
    发明授权
    Usefulness indication for indirect branch prediction training 有权
    间接分支预测训练的实用指标

    公开(公告)号:US09311100B2

    公开(公告)日:2016-04-12

    申请号:US13735694

    申请日:2013-01-07

    Applicant: Apple Inc.

    CPC classification number: G06F9/3844 G06F9/30072 G06F9/3806 G06F9/3848

    Abstract: A circuit for implementing a branch target buffer. The branch target buffer may include a memory that stores a plurality of entries. Each entry may include a tag value, a target value, and a prediction accuracy value. A received index value corresponding to an indirect branch instruction may be used to select one of entries of the plurality of entries, and a received tag value may then be compared to the tag value of the selected entries in the memory. An entry in the memory may be selected in response to a determination that the received tag does not match the tag value of compared entries. The selected entry may be allocated to the indirect instruction branch dependent upon the prediction accuracy values of the plurality of entries.

    Abstract translation: 用于实现分支目标缓冲器的电路。 分支目标缓冲器可以包括存储多个条目的存储器。 每个条目可以包括标签值,目标值和预测精度值。 对应于间接分支指令的接收到的索引值可以用于选择多个条目中的一个条目,然后将接收到的标签值与存储器中所选条目的标签值进行比较。 响应于接收到的标签与被比较的条目的标签值不匹配的确定,可以选择存储器中的条目。 所选择的条目可以根据多个条目的预测精度值分配给间接指令分支。

    Mechanism for sharing private caches in a SoC
    9.
    发明授权
    Mechanism for sharing private caches in a SoC 有权
    在SoC中共享私有缓存的机制

    公开(公告)号:US09280471B2

    公开(公告)日:2016-03-08

    申请号:US14081549

    申请日:2013-11-15

    Applicant: Apple Inc.

    Abstract: Systems, processors, and methods for sharing an agent's private cache with other agents within a SoC. Many agents in the SoC have a private cache in addition to the shared caches and memory of the SoC. If an agent's processor is shut down or operating at less than full capacity, the agent's private cache can be shared with other agents. When a requesting agent generates a memory request and the memory request misses in the memory cache, the memory cache can allocate the memory request in a separate agent's cache rather than allocating the memory request in the memory cache.

    Abstract translation: 与SoC中的其他代理程序共享代理的私有缓存的系统,处理器和方法。 SoC中的许多代理除了SoC的共享缓存和内存之外还有一个专用缓存。 如果代理的处理器关闭或以小于满容量运行,代理的私有缓存可以与其他代理共享。 当请求代理产生存储器请求并且存储器请求丢失在存储器高速缓存中时,存储器高速缓存可以在单独的代理的高速缓存中分配存储器请求,而不是在存储器高速缓存中分配存储器请求。

    Multi-Level Dispatch for a Superscalar Processor
    10.
    发明申请
    Multi-Level Dispatch for a Superscalar Processor 有权
    超标量处理器的多级调度

    公开(公告)号:US20140215188A1

    公开(公告)日:2014-07-31

    申请号:US13749999

    申请日:2013-01-25

    Applicant: APPLE INC.

    CPC classification number: G06F9/3836 G06F9/30145 G06F9/4881 G06F9/4887

    Abstract: In an embodiment, a processor includes a multi-level dispatch circuit configured to supply operations for execution by multiple parallel execution pipelines. The multi-level dispatch circuit may include multiple dispatch buffers, each of which is coupled to multiple reservation stations. Each reservation station may be coupled to a respective execution pipeline and may be configured to schedule instruction operations (ops) for execution in the respective execution pipeline. The sets of reservation stations coupled to each dispatch buffer may be non-overlapping. Thus, if a given op is to be executed in a given execution pipeline, the op may be sent to the dispatch buffer which is coupled to the reservation station that provides ops to the given execution pipeline.

    Abstract translation: 在一个实施例中,处理器包括被配置为提供由多个并行执行管线执行的操作的多级调度电路。 多级调度电路可以包括多个调度缓冲器,每个调度缓冲器耦合到多个保留站。 每个保留站可以耦合到相应的执行流水线,并且可以被配置为调度用于在相应的执行流水线中执行的指令操作(op)。 耦合到每个调度缓冲器的保留站组可以是不重叠的。 因此,如果在给定的执行流水线中执行给定的操作,则操作可以被发送到调度缓冲器,该调度缓冲器耦合到向给定的执行流水线提供操作的保留站。

Patent Agency Ranking