Coprocessors with Bypass Optimization, Variable Grid Architecture, and Fused Vector Operations

    公开(公告)号:US20250094381A1

    公开(公告)日:2025-03-20

    申请号:US18959080

    申请日:2024-11-25

    Applicant: Apple Inc.

    Abstract: In an embodiment, a coprocessor may include a plurality of processing element circuits arranged in a first grid, where a given coprocessor instruction of an instruction set for the coprocessor is defined to cause evaluation of a second plurality of processing element circuits arranged in a second grid, where the second grid includes more processing element circuits than the first grid. The coprocessor may further include a scheduler circuit configured to issue instruction operations to the plurality of processing element circuits, where the scheduler circuit is configured to issue a given instruction operation corresponding to the given coprocessor instruction a plurality of times to complete the given coprocessor instruction, wherein different issuances of the given instruction operation are configured to cause respective different portions of the evaluation defined by the given coprocessor instruction to be performed.

    Coprocessor operation bundling
    2.
    发明授权

    公开(公告)号:US11210100B2

    公开(公告)日:2021-12-28

    申请号:US16242151

    申请日:2019-01-08

    Applicant: Apple Inc.

    Abstract: In an embodiment, a processor includes a buffer in an interface unit. The buffer may be used to accumulate coprocessor instructions to be transmitted to a coprocessor. In an embodiment, the processor issues the coprocessor instructions to the buffer when ready to be issued to the coprocessor. The interface unit may accumulate the coprocessor instructions in the buffer, generating a bundle of instructions. The bundle may be closed based on various predetermined conditions and then the bundle may be transmitted to the coprocessor. If a sequence of coprocessor instructions appears consecutively in a program, the rate at which the instructions are provided to the coprocessor (on average) at least matches the rate at which the coprocessor consumes the instructions, in an embodiment.

    Coprocessors with Bypass Optimization, Variable Grid Architecture, and Fused Vector Operations

    公开(公告)号:US20200272597A1

    公开(公告)日:2020-08-27

    申请号:US16286170

    申请日:2019-02-26

    Applicant: Apple Inc.

    Abstract: In an embodiment, a coprocessor may include a bypass indication which identifies execution circuitry that is not used by a given processor instruction, and thus may be bypassed. The corresponding circuitry may be disabled during execution, preventing evaluation when the output of the circuitry will not be used for the instruction. In another embodiment, the coprocessor may implement a grid of processing elements in rows and columns, where a given coprocessor instruction may specify an operation that causes up to all of the processing elements to operate on vectors of input operands to produce results. Implementations of the coprocessor may implement a portion of the processing elements. The coprocessor control circuitry may be designed to operate with the full grid or partial grid, reissuing instructions in the partial grid case to perform the requested operation. In still another embodiment, the coprocessor may be able to fuse vector mode operations.

    Coprocessor operation bundling
    6.
    发明授权

    公开(公告)号:US11755328B2

    公开(公告)日:2023-09-12

    申请号:US17527872

    申请日:2021-11-16

    Applicant: Apple Inc.

    Abstract: In an embodiment, a processor includes a buffer in an interface unit. The buffer may be used to accumulate coprocessor instructions to be transmitted to a coprocessor. In an embodiment, the processor issues the coprocessor instructions to the buffer when ready to be issued to the coprocessor. The interface unit may accumulate the coprocessor instructions in the buffer, generating a bundle of instructions. The bundle may be closed based on various predetermined conditions and then the bundle may be transmitted to the coprocessor. If a sequence of coprocessor instructions appears consecutively in a program, the rate at which the instructions are provided to the coprocessor (on average) at least matches the rate at which the coprocessor consumes the instructions, in an embodiment.

    Coprocessors with Bypass Optimization, Variable Grid Architecture, and Fused Vector Operations

    公开(公告)号:US20220358082A1

    公开(公告)日:2022-11-10

    申请号:US17869620

    申请日:2022-07-20

    Applicant: Apple Inc.

    Abstract: In an embodiment, a coprocessor may include a bypass indication which identifies execution circuitry that is not used by a given processor instruction, and thus may be bypassed. The corresponding circuitry may be disabled during execution, preventing evaluation when the output of the circuitry will not be used for the instruction. In another embodiment, the coprocessor may implement a grid of processing elements in rows and columns, where a given coprocessor instruction may specify an operation that causes up to all of the processing elements to operate on vectors of input operands to produce results. Implementations of the coprocessor may implement a portion of the processing elements. The coprocessor control circuitry may be designed to operate with the full grid or partial grid, reissuing instructions in the partial grid case to perform the requested operation. In still another embodiment, the coprocessor may be able to fuse vector mode operations.

    Coprocessor Synchronizing Instruction Suppression

    公开(公告)号:US20220214887A1

    公开(公告)日:2022-07-07

    申请号:US17668869

    申请日:2022-02-10

    Applicant: Apple Inc.

    Abstract: An instruction set architecture including instructions for a processor and instructions for a coprocessor may include synchronizing instructions that may be used to begin and end instruction sequences that include coprocessor instructions (coprocessor sequences). If a terminating synchronizing instruction is followed by an initial synchronizing instruction and the pair are detected in the coprocessor concurrently, the coprocessor may suppress execution of the pair of instructions.

    Coprocessor synchronizing instruction suppression

    公开(公告)号:US11249766B1

    公开(公告)日:2022-02-15

    申请号:US17077654

    申请日:2020-10-22

    Applicant: Apple Inc.

    Abstract: An instruction set architecture including instructions for a processor and instructions for a coprocessor may include synchronizing instructions that may be used to begin and end instruction sequences that include coprocessor instructions (coprocessor sequences). If a terminating synchronizing instruction is followed by an initial synchronizing instruction and the pair are detected in the coprocessor concurrently, the coprocessor may suppress execution of the pair of instructions.

    Coprocessor memory ordering table
    10.
    发明授权

    公开(公告)号:US11055102B2

    公开(公告)日:2021-07-06

    申请号:US16991858

    申请日:2020-08-12

    Applicant: Apple Inc.

    Abstract: In an embodiment, at least one CPU processor and at least one coprocessor are included in a system. The CPU processor may issue operations to the coprocessor to perform, including load/store operations. The CPU processor may generate the addresses that are accessed by the coprocessor load/store operations, as well as executing its own CPU load/store operations. The CPU processor may include a memory ordering table configured to track at least one memory region within which there are outstanding coprocessor load/store memory operations that have not yet completed. The CPU processor may delay CPU load/store operations until the outstanding coprocessor load/store operations are complete. In this fashion, the proper ordering of CPU load/store operations and coprocessor load/store operations may be maintained.

Patent Agency Ranking