DEEP LEARNING INFERENCE EFFICIENCY TECHNOLOGY WITH EARLY EXIT AND SPECULATIVE EXECUTION

    Publication No.: US20190180168A1

    Publication date: 2019-06-13

    Application No.: US16266880

    Filing date: 2019-02-04

    Applicant: Intel Corporation

    IPC classification: G06N3/04 G06F9/30 G06F17/50

    Abstract: Systems, apparatuses and methods may provide for technology that processes an inference workload in a first subset of layers of a neural network that prevents or inhibits data-dependent branch operations, conducts an exit determination as to whether an output of the first subset of layers satisfies one or more exit criteria, and selectively bypasses processing of the output in a second subset of layers of the neural network based on the exit determination. The technology may also speculatively initiate the processing of the output in the second subset of layers while the exit determination is pending. Additionally, when the inference workload includes a plurality of batches, the technology may mask one or more of the plurality of batches from processing in the second subset of layers.
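The flow described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the functions `first_subset` and `second_subset`, the softmax-confidence exit criterion, and the 0.9 threshold are all hypothetical stand-ins chosen for illustration, and plain Python lists stand in for tensors.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def first_subset(x):
    """Hypothetical stand-in for the early layers: returns 2-class logits."""
    return [x, -x]


def second_subset(logits):
    """Hypothetical stand-in for the deeper layers."""
    return [2.0 * v for v in logits]


def infer_with_early_exit(batch, threshold=0.9):
    """Run the early layers on every item, then mask confident items
    so that they bypass the deeper layers."""
    early = [first_subset(x) for x in batch]
    # Exit determination (assumed criterion): top softmax probability
    # meets the threshold.
    exit_mask = [max(softmax(l)) >= threshold for l in early]
    outputs = [l if done else second_subset(l)
               for l, done in zip(early, exit_mask)]
    return outputs, exit_mask


def infer_speculative(x, threshold=0.9):
    """Start the deeper layers before the exit determination completes,
    then keep or discard the speculative result."""
    early = first_subset(x)
    with ThreadPoolExecutor(max_workers=1) as pool:
        speculative = pool.submit(second_subset, early)
        exited = max(softmax(early)) >= threshold
        if exited:
            speculative.cancel()  # best effort; the result is discarded
            return early, True
        return speculative.result(), False
```

In this sketch a confident item such as `3.0` exits after the early layers, while an ambiguous item such as `0.1` continues through the deeper layers. On real hardware the mask would typically be applied as a predicate over the batch rather than as a per-item Python conditional.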

    Deep learning inference efficiency technology with early exit and speculative execution

    Publication No.: US11562200B2

    Publication date: 2023-01-24

    Application No.: US16266880

    Filing date: 2019-02-04

    Applicant: Intel Corporation

    IPC classification: G06N3/04 G06F30/33

    Abstract: Systems, apparatuses and methods may provide for technology that processes an inference workload in a first subset of layers of a neural network that prevents or inhibits data-dependent branch operations, conducts an exit determination as to whether an output of the first subset of layers satisfies one or more exit criteria, and selectively bypasses processing of the output in a second subset of layers of the neural network based on the exit determination. The technology may also speculatively initiate the processing of the output in the second subset of layers while the exit determination is pending. Additionally, when the inference workload includes a plurality of batches, the technology may mask one or more of the plurality of batches from processing in the second subset of layers.

    EARLY EXIT FOR NATURAL LANGUAGE PROCESSING MODELS

    Publication No.: US20190266236A1

    Publication date: 2019-08-29

    Application No.: US16411763

    Filing date: 2019-05-14

    Applicant: Intel Corporation

    IPC classification: G06F17/27

    Abstract: The disclosure provides a natural language processing (NLP) model arranged to operate on two lexicons, where one lexicon is a subset of the other. The NLP model can be arranged to generate output based on the subset lexicon and then exit processing, potentially saving computation cycles.
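The two-lexicon idea can be sketched as follows. The abstract does not state the exit criterion, so the score threshold here is an assumption, as are the toy embeddings and the names `FULL_LEXICON`, `SUB_LEXICON`, and `predict`.

```python
def dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical toy output embeddings; the sub-lexicon is a subset of
# the full lexicon, for example its most frequent words.
FULL_LEXICON = {
    "the": [1.0, 0.0],
    "cat": [0.0, 1.0],
    "zygote": [-1.0, 0.0],
    "quark": [0.0, -1.0],
}
SUB_LEXICON = ("the", "cat")


def predict(hidden, threshold=0.8):
    """Score the small lexicon first and exit early if the best score
    meets the (assumed) threshold; otherwise score the full lexicon."""
    sub_scores = {w: dot(hidden, FULL_LEXICON[w]) for w in SUB_LEXICON}
    best = max(sub_scores, key=sub_scores.get)
    if sub_scores[best] >= threshold:
        return best, "sub"  # early exit: full lexicon is never scored
    full_scores = {w: dot(hidden, v) for w, v in FULL_LEXICON.items()}
    return max(full_scores, key=full_scores.get), "full"
```

When the hidden state aligns well with a common word, only the two sub-lexicon entries are scored; a rare word such as "zygote" falls through to the full pass. The saving grows with the size gap between the two lexicons.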