Accelerator controller hub
    11.
    发明授权

    公开(公告)号:US12229069B2

    公开(公告)日:2025-02-18

    申请号:US17083200

    申请日:2020-10-28

    Abstract: Methods and apparatus for an accelerator controller hub (ACH). The ACH may be a stand-alone component or integrated on-die or on package in an accelerator such as a GPU. The ACH may include a host device link (HDL) interface, one or more Peripheral Component Interconnect Express (PCIe) interfaces, one or more high performance accelerator link (HPAL) interfaces, and a router, operatively coupled to each of the HDL interface, the one or more PCIe interfaces, and the one or more HPAL interfaces. The HDL interface is configured to be coupled to a host CPU via an HDL link and the one or more HPAL interfaces are configured to be coupled to one or more HPALs that are used to access high performance accelerator fabrics (HPAFs) such as NVlink fabrics and CCIX (Cache Coherent Interconnect for Accelerators) fabrics. Platforms including ACHs or accelerators with integrated ACHs support RDMA transfers using RDMA semantics to enable transfers between accelerator memory on initiators and targets without CPU involvement.

    Mission-critical computing architecture

    公开(公告)号:US10514990B2

    公开(公告)日:2019-12-24

    申请号:US15823313

    申请日:2017-11-27

    Abstract: Operational faults, including transient faults, are detected within computing hardware for mission-critical applications. Operational requests received from a requestor node are to be processed by shared agents to produce corresponding responses. A first request is duplicated to be redundantly processed independently and asynchronously by distinct shared agents to produce redundant counterpart responses including a first redundant response and a second redundant response. The first redundant response is compared against the second redundant response. In response to a match, the redundant responses are merged to produce a single final response to the first request to be read by the requestor node. In response to a non-match, an exception response is performed.

    MISSION-CRITICAL COMPUTING ARCHITECTURE
    15.
    发明申请

    公开(公告)号:US20190163583A1

    公开(公告)日:2019-05-30

    申请号:US15823313

    申请日:2017-11-27

    Abstract: Operational faults, including transient faults, are detected within computing hardware for mission-critical applications. Operational requests received from a requestor node are to be processed by shared agents to produce corresponding responses. A first request is duplicated to be redundantly processed independently and asynchronously by distinct shared agents to produce redundant counterpart responses including a first redundant response and a second redundant response. The first redundant response is compared against the second redundant response. In response to a match, the redundant responses are merged to produce a single final response to the first request to be read by the requestor node. In response to a non-match, an exception response is performed.

    Branch Predictor with Empirical Branch Bias Override

    公开(公告)号:US20180173533A1

    公开(公告)日:2018-06-21

    申请号:US15383832

    申请日:2016-12-19

    Abstract: A processor may include a baseline branch predictor and an empirical branch bias override circuit. The baseline branch predictor may receive a branch instruction associated with a given address identifier, and generate, based on a global branch history, an initial prediction of a branch direction for the instruction. The empirical branch bias override circuit may determine, dependent on a direction of an observed branch direction bias in executed branch instruction instances associated with the address identifier, whether the initial prediction should be overridden, may determine, in response to determining that the initial prediction should be overridden, a final prediction that matches the observed branch direction bias, or may determine, in response determining that the initial prediction should not be overridden, a final prediction that matches the initial prediction. The predictor may update an entry in the global branch history reflecting the resolved branch direction for the instruction following its execution.

    Changing cache ownership in clustered multiprocessor

    公开(公告)号:US09940238B2

    公开(公告)日:2018-04-10

    申请号:US15602603

    申请日:2017-05-23

    CPC classification number: G06F12/084 G06F2212/2542 G06F2212/271

    Abstract: A chip multiprocessor may include a first cluster and a second cluster, each having multiple cores of a processor, multiple co-located cache slices, and a memory controller. The processor stores directory information in a memory to indicate cluster cache ownership of a first address space to the first cluster. In response to a request to change the cluster cache ownership of the first address space to a second address space of the second cluster, the processor provides a quiesce period during which to block new read or write requests to the first cluster and the second cluster; drain read or write requests issued on the first cluster and the second cluster; and remove the block on new read or write requests. The processor may also update the directory information to change the cluster cache ownership of the first address space to the second address space of the second cluster.

    Hardware lockstep checking within a fault detection interval in a system on chip

    公开(公告)号:US10831628B2

    公开(公告)日:2020-11-10

    申请号:US16218078

    申请日:2018-12-12

    Abstract: A method to check for redundancy in two or more data lines comprises receiving data on a first data line, computing a first cyclic redundancy check (CRC) value on the data of the first data line, performing an exclusive OR (XOR) function on the first CRC value with a stored memory value, and updating the stored memory value with a result of the XOR function, and repeating on additional data lines until a last line is processed such that an error is indicated if a final stored memory value is not zero. An apparatus to check that two cores are operating in lockstep comprises a first core comprising a first data checker, a second core comprising a second data checker, and a lockstep checker to compare an output of the first data checker with an output of the second data checker.

Patent Agency Ranking