-
Publication No.: US20250165763A1
Publication date: 2025-05-22
Application No.: US19030985
Filing date: 2025-01-17
Applicant: Apple Inc.
Inventor: Christopher L. Mills
Abstract: Embodiments relate to a neural processor circuit that includes multiple neural engine circuits, a data buffer, and a kernel fetcher circuit. At least one of the neural engine circuits receives multiple sub-channels of a portion of input data from the data buffer. The neural engine circuit further receives a kernel of one or more kernels from the kernel fetcher circuit, where the kernel was decomposed into a corresponding sub-kernel for each sub-channel of the portion of the input data. The neural engine circuit performs a convolution operation on each sub-channel of the portion of the input data and the corresponding sub-kernel. The neural engine circuit then accumulates the corresponding outputs of the convolution operation for each sub-channel to generate a single channel of the output data.
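Note: The decomposition described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the sub-channels are formed, so the interleaved even/odd split used here is an assumption chosen for illustration, and every function name is hypothetical. Under that assumption, a stride-2 one-dimensional convolution equals the sum of two stride-1 convolutions, one per sub-channel with its matching sub-kernel.

```python
def correlate_valid(signal, taps, stride=1):
    """Valid-mode 1D cross-correlation (the 'convolution' used in neural networks)."""
    last_start = len(signal) - len(taps)
    return [
        sum(signal[start + k] * taps[k] for k in range(len(taps)))
        for start in range(0, last_start + 1, stride)
    ]


def split_into_sub_channels(values, factor):
    """Assumed split: sub-channel p keeps every factor-th element starting at offset p."""
    return [values[p::factor] for p in range(factor)]


def sub_channel_convolution(signal, taps, factor):
    """Convolve each sub-channel with its sub-kernel, then accumulate the partial outputs."""
    sub_signals = split_into_sub_channels(signal, factor)
    sub_kernels = split_into_sub_channels(taps, factor)
    partials = [correlate_valid(s, k) for s, k in zip(sub_signals, sub_kernels)]
    # Accumulation step: add the partial outputs position by position.
    return [sum(column) for column in zip(*partials)]


signal = [1, 2, 3, 4, 5, 6, 7, 8]
taps = [1, 0, -1, 2]

direct = correlate_valid(signal, taps, stride=2)
decomposed = sub_channel_convolution(signal, taps, factor=2)
print(direct, decomposed)  # both are [6, 10, 14]
```

The sketch uses even-length inputs so that both sub-channels yield the same number of outputs; other lengths would need padding, which is omitted here.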
-
Publication No.: US12229586B2
Publication date: 2025-02-18
Application No.: US17155878
Filing date: 2021-01-22
Applicant: Apple Inc.
Inventor: Christopher L. Mills, Kenneth W. Waters
Abstract: A neural processor includes neural engines for performing convolution operations on input data corresponding to one or more tasks to generate output data. The neural processor also includes a data processor circuit coupled to external system memory. The data processor circuit includes a buffer for storing the output data from the neural engines. The neural processor further includes a task manager coupled to the data processor circuit. The task manager receives a context-switch task. The context-switch task specifies a switch of the data processor circuit from handling an outgoing task to an incoming task. The task manager sends configuration data of the context-switch task to cause the data processor circuit to transmit the output data corresponding to the outgoing task from the buffer to the external system memory. The data processor circuit also fetches data corresponding to the incoming task from the external system memory to the buffer.
-
Publication No.: US12124943B2
Publication date: 2024-10-22
Application No.: US18120218
Filing date: 2023-03-10
Applicant: Apple Inc.
Inventor: Christopher L. Mills, Kenneth W. Waters, Youchang Kim
Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. A planar engine circuit can be configured in multiple modes. In an elementwise mode, the planar engine circuit may combine two tensors by performing operations element by element. The planar engine circuit may support elementwise operations on two tensors of different sizes and ranks. The planar engine circuit may perform a broadcasting operation that duplicates one or more values across one or more channels so that the smaller tensor matches the size of the larger tensor.
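Note: The broadcasting behaviour in this abstract resembles the broadcasting used by array libraries. The following is a minimal pure-Python sketch, not the circuit's actual behaviour: tensors are modelled as nested lists indexed as [channel][position], and the helper names are hypothetical.

```python
import operator


def broadcast_across_channels(small, channels):
    """Duplicate a one-channel tensor so that it has `channels` channels."""
    # Rank promotion: treat a flat list (rank 1) as a single channel (rank 2).
    if small and not isinstance(small[0], list):
        small = [small]
    if len(small) == channels:
        return small
    if len(small) != 1:
        raise ValueError("only a one-channel tensor can be broadcast")
    return [list(small[0]) for _ in range(channels)]


def elementwise(large, small, op):
    """Combine two tensors element by element after broadcasting the smaller one."""
    expanded = broadcast_across_channels(small, len(large))
    return [
        [op(a, b) for a, b in zip(large_row, small_row)]
        for large_row, small_row in zip(large, expanded)
    ]


large = [[1, 2, 3], [4, 5, 6]]  # rank 2: two channels, three values each
bias = [10, 20, 30]             # rank 1: three values, no channel axis

result = elementwise(large, bias, operator.add)
print(result)  # [[11, 22, 33], [14, 25, 36]]
```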
-
Publication No.: US20240320470A1
Publication date: 2024-09-26
Application No.: US18125554
Filing date: 2023-03-23
Applicant: Apple Inc.
Inventor: Christopher L. Mills
Abstract: A system-on-a-chip circuit may include a neural processor circuit coupled to a central processor unit. The neural processor circuit may include a plurality of neural engines and a data processor circuit. The central processor unit is configured to execute a compiler, which is in turn configured to determine a data broadcast mode and an input data dimension configuration mode based on a neural network description. The compiler is configured to generate one or more task descriptors, the task descriptors distributed to components of the neural processor circuit. The data processor circuit is configured to broadcast data from the buffer to the plurality of neural engines based on the determined data dimension configuration mode. The neural engines are configured to perform computational operations according to the determined input data dimension configuration mode.
-
Publication No.: US11972348B2
Publication date: 2024-04-30
Application No.: US17086023
Filing date: 2020-10-30
Applicant: Apple Inc.
Inventor: Christopher L. Mills
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Embodiments of the present disclosure relate to a texture unit circuit in a neural processor circuit. The neural processor circuit includes a tensor access operation circuit with the texture unit circuit, a data processor circuit, and at least one neural engine circuit. The texture unit circuit fetches a source tensor from a system memory by referencing an index tensor in the system memory representing indexing information into the source tensor. The data processor circuit stores an output version of the source tensor obtained from the tensor access operation circuit and sends the output version of the source tensor as multiple units of input data to the at least one neural engine circuit. The at least one neural engine circuit performs at least convolution operations on the units of input data and at least one kernel to generate output data.
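Note: Fetching a source tensor through an index tensor is similar to a gather operation. The sketch below is an illustration under simplifying assumptions: the index tensor is a flat list of row indices, a "unit of input data" is a fixed number of rows, and the function names are hypothetical.

```python
def fetch_by_index(source, index_tensor):
    """Gather rows of `source` in the order given by `index_tensor`."""
    return [list(source[i]) for i in index_tensor]


def split_into_units(tensor, rows_per_unit):
    """Cut the gathered tensor into consecutive units of `rows_per_unit` rows."""
    return [
        tensor[start:start + rows_per_unit]
        for start in range(0, len(tensor), rows_per_unit)
    ]


source = [[0, 1], [10, 11], [20, 21], [30, 31]]
index_tensor = [3, 0, 0, 2]  # rows may repeat or appear in any order

gathered = fetch_by_index(source, index_tensor)
units = split_into_units(gathered, rows_per_unit=2)
print(gathered)  # [[30, 31], [0, 1], [0, 1], [20, 21]]
print(units)     # [[[30, 31], [0, 1]], [[0, 1], [20, 21]]]
```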
-
Publication No.: US11934941B2
Publication date: 2024-03-19
Application No.: US17989275
Filing date: 2022-11-17
Applicant: Apple Inc.
Inventor: Christopher L. Mills, Kenneth W. Waters
Abstract: A neural processor circuit includes one or more planar engine circuits that perform non-convolution operations in parallel with convolution operations performed by one or more neural engine circuits. The neural engine circuits perform the convolution operations on neural input data corresponding to one or more neural engine tasks to generate neural output data. The planar engine circuits perform non-convolution operations on planar input data corresponding to one or more planar engine tasks to generate planar output data. A data processor circuit in the neural processor circuit addresses data dependency between the one or more neural engine tasks and the one or more planar engine tasks by controlling reading of the neural output data as the planar input data by the planar engine circuits or reading of the planar output data as the neural input data by the neural engine circuits.
-
Publication No.: US20230206051A1
Publication date: 2023-06-29
Application No.: US18120218
Filing date: 2023-03-10
Applicant: Apple Inc.
Inventor: Christopher L. Mills, Kenneth W. Waters, Youchang Kim
Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. A planar engine circuit can be configured in multiple modes. In an elementwise mode, the planar engine circuit may combine two tensors by performing operations element by element. The planar engine circuit may support elementwise operations on two tensors of different sizes and ranks. The planar engine circuit may perform a broadcasting operation that duplicates one or more values across one or more channels so that the smaller tensor matches the size of the larger tensor.
-
Publication No.: US20230121448A1
Publication date: 2023-04-20
Application No.: US17505426
Filing date: 2021-10-19
Applicant: Apple Inc.
Inventor: Christopher L. Mills, Youchang Kim
IPC: G06N3/063
Abstract: Embodiments of the present disclosure relate to a reduction operation in a neural processor circuit where results of the reduction operation are retained for multiple post-processing operations. The neural processor circuit includes neural engine circuits and a planar engine circuit coupled to the neural engine circuits. At least one neural engine circuit performs a convolution operation to generate output data. The planar engine circuit includes a filter circuit and a line buffer coupled to the filter circuit. The filter circuit performs a reduction operation for each patch of a tensor from the output data to generate a respective reduced value associated with a corresponding channel of the tensor. The line buffer stores reduced values each being associated with a respective channel of the tensor. The line buffer retains the reduced values for a defined number of operating cycles as indicated by a refresh flag defining resetting of the line buffer.
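Note: The idea of retaining reduced values for several post-processing operations can be sketched in Python. This is an illustration only: the abstract does not specify the reduction or the post-processing steps, so the per-channel maximum, the two follow-up passes, the refresh semantics, and all names below are assumptions.

```python
class LineBuffer:
    """Holds one reduced value per channel until a refresh is requested."""

    def __init__(self):
        self.values = None

    def update(self, reduced_values, refresh):
        # Assumption: a set refresh flag resets the buffer and loads new values;
        # a cleared flag keeps whatever was stored before.
        if refresh or self.values is None:
            self.values = list(reduced_values)
        return self.values


def reduce_patch(patch, reduce_fn=max):
    """patch[channel] is a flat list of values; returns one reduced value per channel."""
    return [reduce_fn(channel_values) for channel_values in patch]


patch = [[1, 4, 2], [8, 3, 5]]  # two channels, three values each
line_buffer = LineBuffer()

# First pass: reduce once and store the per-channel maxima.
maxima = line_buffer.update(reduce_patch(patch), refresh=True)
centered = [[v - m for v in ch] for ch, m in zip(patch, maxima)]

# Second pass: the refresh flag is clear, so the stored maxima are reused.
retained = line_buffer.update([0, 0], refresh=False)
scaled = [[v / m for v in ch] for ch, m in zip(patch, retained)]

print(centered)  # [[-3, 0, -2], [0, -5, -3]]
print(scaled)    # [[0.25, 1.0, 0.5], [1.0, 0.375, 0.625]]
```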
-
Publication No.: US11475283B2
Publication date: 2022-10-18
Application No.: US16662789
Filing date: 2019-10-24
Applicant: Apple Inc.
Inventor: Christopher L. Mills, Sung Hee Park
Abstract: Embodiments of the present disclosure relate to a neural engine of a neural processor circuit having multiple multiply-add circuits and an accumulator circuit coupled to the multiply-add circuits. The multiply-add circuits perform multiply-add operations of a three-dimensional convolution on a work unit of input data using a kernel to generate at least a portion of output data in a processing cycle. The accumulator circuit includes multiple batches of accumulators. Each batch of accumulators receives and stores, after the processing cycle, the portion of the output data for each output depth plane of multiple output depth planes. A corresponding batch of accumulators stores, after the processing cycle, the portion of the output data for a subset of the output channels and for each output depth plane.
-
Publication No.: US20220156575A1
Publication date: 2022-05-19
Application No.: US16953033
Filing date: 2020-11-19
Applicant: Apple Inc.
Inventor: Christopher L. Mills
IPC: G06N3/08
Abstract: Embodiments of the present disclosure relate to a tensor access operation circuit in a neural processor circuit. The neural processor circuit further includes a data processor circuit and at least one neural engine circuit. The tensor access operation circuit indirectly accesses at least a region of a source tensor in a system memory having a rank, and maps one or more source components of the source tensor into an input tensor having another rank. The data processor circuit stores an output version of the input tensor obtained from the tensor access operation circuit and sends the output version of the input tensor as multiple units of input data to the at least one neural engine circuit. The at least one neural engine circuit performs at least convolution operations on the units of input data and at least one kernel to generate output data.