Processors and methods for pipelined runtime services in a spatial array

    Publication number: US10467183B2

    Publication date: 2019-11-05

    Application number: US15640538

    Application date: 2017-07-01

    Abstract: Methods and apparatuses relating to pipelined runtime services in spatial arrays are described. In one embodiment, a processor includes processing elements; an interconnect network between the processing elements; a first configuration controller coupled to a first subset of the processing elements; and a second configuration controller coupled to a second, different subset of the processing elements, wherein the first configuration controller and the second configuration controller are to configure the first subset and the second, different subset according to configuration information for a first context, and, for a context switch, the first configuration controller is to configure the first subset according to configuration information for a second context after pending operations of the first context are completed in the first subset and to block second context dataflow into the second, different subset's input from the first subset's output until pending operations of the first context are completed in the second, different subset.
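
The ordering described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: `Subset`, `step_context_switch`, and the `pending` counter are hypothetical names, and the model assumes the downstream subset is reconfigured as soon as its own first-context operations drain, which the abstract does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Subset:
    context: int  # context the subset is currently configured for
    pending: int  # in-flight operations that belong to that context


def step_context_switch(upstream, downstream, new_context):
    """Advance the switch by one step.

    Returns True while new-context data leaving `upstream` must be held
    back from `downstream` (hypothetical model of the blocking rule).
    """
    # A subset is reconfigured only after its old-context work has drained.
    if upstream.context != new_context and upstream.pending == 0:
        upstream.context = new_context
    # Assumption: the downstream subset follows the same drain-then-configure rule.
    if downstream.context != new_context and downstream.pending == 0:
        downstream.context = new_context
    # Block while the producer is already in the new context but the consumer is not.
    return upstream.context == new_context and downstream.context != new_context
```

The point of the design is that the two subsets switch at different times, so the upstream subset can start the second context while the downstream subset is still finishing the first.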

    Methods and apparatus to map single static assignment instructions onto a data flow graph in a data flow architecture

    Publication number: US10346144B2

    Publication date: 2019-07-09

    Application number: US15721454

    Application date: 2017-09-29

    Abstract: Methods, apparatus, systems and articles of manufacture to map a set of instructions onto a data flow graph are disclosed herein. An example apparatus includes a variable handler to modify a variable in the set of instructions. The variable is used multiple times in the set of instructions and the set of instructions are in a static single assignment form. The apparatus also includes a PHI handler to replace a PHI instruction contained in the set of instructions with a set of control data flow instructions and a data flow graph generator to map the set of instructions modified by the variable handler and the PHI handler onto a data flow graph without transforming the instructions out of the static single assignment form.
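
As a rough illustration of the PHI handling described above, the sketch below replaces a two-input PHI with a predicate-controlled `pick` operation and then derives data flow edges, leaving every value with a single definition. The `Instr` type, the `pick` opcode name, and the `predicate_of` map are assumptions made for the example; a real compiler would derive the controlling predicate from the control flow graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str    # value defined by this instruction (assigned exactly once)
    op: str      # opcode name
    args: tuple  # names of the operand values


def replace_phi(instrs, predicate_of):
    """Replace each two-input 'phi' with a 'pick' steered by a predicate.

    predicate_of maps a phi destination to the boolean value that selects
    the incoming operand (hypothetical input for this sketch).
    """
    out = []
    for ins in instrs:
        if ins.op == "phi":
            if_true, if_false = ins.args
            out.append(Instr(ins.dest, "pick",
                             (predicate_of[ins.dest], if_true, if_false)))
        else:
            out.append(ins)
    return out


def to_dataflow_edges(instrs):
    """Return one (producer, consumer) edge per use of a locally defined value."""
    defined = {ins.dest for ins in instrs}
    return [(arg, ins.dest) for ins in instrs for arg in ins.args if arg in defined]


program = [
    Instr("c", "lt", ("a", "b")),
    Instr("x1", "add", ("a", "b")),
    Instr("x2", "sub", ("a", "b")),
    Instr("x3", "phi", ("x1", "x2")),
]
lowered = replace_phi(program, {"x3": "c"})
edges = to_dataflow_edges(lowered)
```

Here `x3` still has exactly one definition after the rewrite, so the instruction list stays in static single assignment form while the merge becomes an ordinary data flow node.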

    Runtime address disambiguation in acceleration hardware

    Publication number: US20180188983A1

    Publication date: 2018-07-05

    Application number: US15396049

    Application date: 2016-12-30

    Abstract: An integrated circuit includes a processor to execute instructions and to interact with memory, and acceleration hardware to execute a sub-program corresponding to the instructions. A set of input queues includes a store address queue to receive, from the acceleration hardware, a first address of the memory, the first address associated with a store operation, and a store data queue to receive, from the acceleration hardware, first data to be stored at the first address of the memory. The set of input queues also includes a completion queue to buffer response data for a load operation. A disambiguator circuit, coupled to the set of input queues and the memory, is to, responsive to determining that the load operation, which succeeds the store operation, has an address conflict with the first address, copy the first data from the store data queue into the completion queue for the load operation.
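
The forwarding behaviour in this abstract can be modelled in software as follows. This is a minimal sketch, not the circuit itself: the `Disambiguator` class and its method names are hypothetical, memory is a plain dictionary, and the choice that the youngest matching store wins is an assumption the abstract does not spell out.

```python
from collections import deque


class Disambiguator:
    def __init__(self, memory):
        self.memory = memory         # dict: address -> data
        self.store_addr_q = deque()  # addresses of pending stores
        self.store_data_q = deque()  # data of pending stores, same order
        self.completion_q = deque()  # response data for loads

    def enqueue_store(self, addr, data):
        self.store_addr_q.append(addr)
        self.store_data_q.append(data)

    def load(self, addr):
        # Scan pending stores from youngest to oldest (assumed policy).
        for i in range(len(self.store_addr_q) - 1, -1, -1):
            if self.store_addr_q[i] == addr:
                # Address conflict: copy the store data straight into the
                # completion queue instead of reading stale memory.
                self.completion_q.append(self.store_data_q[i])
                return "forwarded"
        self.completion_q.append(self.memory.get(addr, 0))
        return "memory"

    def drain_store(self):
        # Commit the oldest pending store to memory.
        addr = self.store_addr_q.popleft()
        self.memory[addr] = self.store_data_q.popleft()
```

A load that follows a store to the same address therefore receives the store's data even though memory has not yet been updated.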

    Memory circuits and methods for distributed memory hazard detection and error recovery

    Publication number: US10515049B1

    Publication date: 2019-12-24

    Application number: US15640541

    Application date: 2017-07-01

    Abstract: Methods and apparatuses relating to distributed memory hazard detection and error recovery are described. In one embodiment, a memory circuit includes a memory interface circuit to service memory requests from a spatial array of processing elements for data stored in a plurality of cache banks; and a hazard detection circuit in each of the plurality of cache banks, wherein a first hazard detection circuit for a speculative memory load request from the memory interface circuit, that is marked with a potential dynamic data dependency, to an address within a first cache bank of the first hazard detection circuit, is to mark the address for tracking of other memory requests to the address, store data from the address in speculative completion storage, and send the data from the speculative completion storage to the spatial array of processing elements when a memory dependency token is received for the speculative memory load request.
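
The per-bank behaviour described above (mark the address, hold the loaded data, release it when the dependency token arrives) might be sketched as below. Everything named here is hypothetical: `BankHazardDetector`, the request identifiers, and in particular the `hazards` list, which merely records a conflicting store. The abstract does not describe the recovery action, so none is modelled.

```python
class BankHazardDetector:
    def __init__(self, bank):
        self.bank = bank        # dict: address -> data for this cache bank
        self.tracked = set()    # addresses marked for tracking
        self.spec_storage = {}  # request id -> (address, speculatively loaded data)
        self.hazards = []       # addresses where a tracked location was written

    def speculative_load(self, req_id, addr):
        # Mark the address and hold the data instead of returning it right away.
        self.tracked.add(addr)
        self.spec_storage[req_id] = (addr, self.bank.get(addr, 0))

    def store(self, addr, data):
        self.bank[addr] = data
        if addr in self.tracked:
            # Assumption: only record the conflict; recovery is out of scope.
            self.hazards.append(addr)

    def dependency_token(self, req_id):
        # The token releases the held data to the requester.
        addr, data = self.spec_storage.pop(req_id)
        if all(a != addr for a, _ in self.spec_storage.values()):
            self.tracked.discard(addr)
        return data
```

In this model a store that lands between the speculative load and the token is what makes the held data stale, which is exactly the case the tracking is meant to expose.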

    Apparatus, methods, and systems with a configurable spatial accelerator

    Publication number: US10445250B2

    Publication date: 2019-10-15

    Application number: US15859454

    Application date: 2017-12-30

    Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform a second operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements.
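
The firing rule in this abstract, where an operation is performed when an incoming operand set arrives at a dataflow operator, can be sketched in a few lines. The `DataflowOperator` class, its port numbering, and the `results` list are illustrative assumptions and do not describe the actual processing elements.

```python
from collections import deque


class DataflowOperator:
    def __init__(self, fn, n_inputs):
        if n_inputs < 1:
            raise ValueError("an operator needs at least one input port")
        self.fn = fn
        self.inputs = [deque() for _ in range(n_inputs)]  # one queue per port
        self.consumers = []  # (operator, port) pairs fed by this operator
        self.results = []    # every value this operator has produced

    def connect(self, consumer, port):
        self.consumers.append((consumer, port))

    def push(self, port, value):
        self.inputs[port].append(value)
        self._try_fire()

    def _try_fire(self):
        # Fire only when every input port holds at least one operand.
        while all(self.inputs):
            operands = [q.popleft() for q in self.inputs]
            result = self.fn(*operands)
            self.results.append(result)
            for op, port in self.consumers:
                op.push(port, result)


# Graph for (a + b) * c: the add node feeds port 0 of the multiply node.
add = DataflowOperator(lambda x, y: x + y, 2)
mul = DataflowOperator(lambda x, y: x * y, 2)
add.connect(mul, 0)
mul.push(1, 4)  # c arrives first; nothing fires yet
add.push(0, 2)  # a arrives; add still waits for b
add.push(1, 3)  # b arrives; add fires, then mul fires
```

No program counter orders the two nodes; the arrival of operands alone decides when each one executes.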
