Acceleration of in-memory-compute arrays

    公开(公告)号:US12230361B2

    公开(公告)日:2025-02-18

    申请号:US18346565

    申请日:2023-07-03

    Applicant: Apple Inc.

    Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.

    Mappable filter for neural processor circuit

    公开(公告)号:US12141679B2

    公开(公告)日:2024-11-12

    申请号:US17065428

    申请日:2020-10-07

    Applicant: Apple Inc.

    Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping or process at least the portion of the input data with the coefficient data as mapped according to the second mapping.

    Acceleration of In-Memory-Compute Arrays

    公开(公告)号:US20230059200A1

    公开(公告)日:2023-02-23

    申请号:US17406817

    申请日:2021-08-19

    Applicant: Apple Inc.

    Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.

    MAPPABLE FILTER FOR NEURAL PROCESSOR CIRCUIT

    公开(公告)号:US20220108155A1

    公开(公告)日:2022-04-07

    申请号:US17065428

    申请日:2020-10-07

    Applicant: Apple Inc.

    Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping or process at least the portion of the input data with the coefficient data as mapped according to the second mapping.

    MAPPABLE FILTER FOR NEURAL PROCESSOR CIRCUIT

    公开(公告)号:US20250021808A1

    公开(公告)日:2025-01-16

    申请号:US18903466

    申请日:2024-10-01

    Applicant: Apple Inc.

    Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping or process at least the portion of the input data with the coefficient data as mapped according to the second mapping.

    Acceleration of In-Memory-Compute Arrays
    7.
    发明公开

    公开(公告)号:US20240005972A1

    公开(公告)日:2024-01-04

    申请号:US18346565

    申请日:2023-07-03

    Applicant: Apple Inc.

    CPC classification number: G11C7/222 H03M1/82 G11C7/1087 G11C7/106

    Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.

Patent Agency Ranking