Methods and devices for computing a memory size for software optimization

    Publication number: US11656854B2

    Publication date: 2023-05-23

    Application number: US17460749

    Filing date: 2021-08-30

    CPC classification number: G06F8/441 G06F8/37 G06F8/433 G06F8/452 G06F8/457

    Abstract: There are provided methods and devices for computing a tile size for software optimization. A method includes receiving, by a computing device, information indicative of one or more of a set of loop bounds and a set of data shapes; processing, by the computing device, the information to determine a computation configuration based on the obtained information, the computation configuration implementable by a compiler, said processing including evaluating at least the computation configuration based on a build cost model, the build cost model representative of a data transfer cost and a data efficiency of the computation configuration; and transmitting, by the computing device, instructions directing the compiler to implement the computation configuration.
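
To make the idea of evaluating a computation configuration against a cost model concrete, here is a minimal Python sketch. It is not the patented method. It assumes a tiled matrix multiply with loop bounds `m`, `n`, `k`, a fast buffer of `capacity` bytes, and a hypothetical score equal to the bytes transferred divided by the buffer utilization. All function names and the traffic model are illustrative assumptions.

```python
from itertools import product


def divisors(n):
    """Return the positive divisors of n in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def footprint_bytes(tm, tn, tk, elem_bytes=4):
    """Fast-memory bytes for one A tile (tm x tk), one B tile (tk x tn), one C tile (tm x tn)."""
    return (tm * tk + tk * tn + tm * tn) * elem_bytes


def transfer_bytes(m, n, k, tm, tn, tk, elem_bytes=4):
    """Bytes moved for C[m,n] += A[m,k] @ B[k,n], assuming A and B tiles are
    reloaded at every tile step and C is written back once (assumed traffic model)."""
    steps = (m // tm) * (n // tn) * (k // tk)
    return steps * (tm * tk + tk * tn) * elem_bytes + m * n * elem_bytes


def score(m, n, k, tile, capacity, elem_bytes=4):
    """Hypothetical cost: transfer volume divided by buffer utilization (lower is better)."""
    utilization = footprint_bytes(*tile, elem_bytes) / capacity
    return transfer_bytes(m, n, k, *tile, elem_bytes) / utilization


def choose_tile(m, n, k, capacity, elem_bytes=4):
    """Return the lowest-cost (tm, tn, tk) that fits in capacity bytes, or None."""
    candidates = [
        t for t in product(divisors(m), divisors(n), divisors(k))
        if footprint_bytes(*t, elem_bytes) <= capacity
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: score(m, n, k, t, capacity, elem_bytes))
```

Calling `choose_tile(8, 8, 8, 192)` searches every tile shape that divides the loop bounds and fits in 192 bytes. A real compiler pass would replace the exhaustive search and the toy score with its own model of the target memory hierarchy.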

    Method and apparatus for enabling autonomous acceleration of dataflow AI applications

    Publication number: US11144290B2

    Publication date: 2021-10-12

    Application number: US16570822

    Filing date: 2019-09-13

    Abstract: A method includes analyzing a dataflow graph representing data dependencies between operators of a dataflow application to identify a plurality of candidate groups of the operators. Based on characteristics of a given hardware accelerator and the operators of a given candidate group of the plurality of candidate groups, determining whether the operators of the given candidate group are to be combined. In response to determining that the operators of the given candidate group are to be combined, retrieving executable binary code segments corresponding to the operators of the given candidate group, generating a unit of binary code including the executable binary code segments and metadata representing an execution control flow among the executable binary code segments, and dispatching the unit of code to the given hardware accelerator for execution of the unit of code.
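
The grouping-and-combining flow in this abstract can be sketched in Python as follows. This is an illustrative assumption, not the claimed implementation: candidate groups are taken to be linear operator chains, the accelerator is described only by a set of supported operator types and a maximum group size, and `Accelerator`, `CodeUnit`, and the helper functions are hypothetical names. Dispatch to real hardware is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    supported_ops: frozenset  # operator types the device can run (assumed)
    max_fused_ops: int        # assumed limit on operators per combined unit


@dataclass
class CodeUnit:
    segments: list      # precompiled binary segments, in execution order
    control_flow: list  # metadata: (from_segment, to_segment) index pairs


def candidate_chains(edges):
    """Return maximal linear chains (length >= 2) of a graph given as
    node -> list of successors. Every node, including sinks, must be a key."""
    indegree = {node: 0 for node in edges}
    for successors in edges.values():
        for s in successors:
            indegree[s] += 1

    def chainable(u):
        # u merges with its successor v only if u -> v is u's sole output
        # and v's sole input.
        return len(edges[u]) == 1 and indegree[edges[u][0]] == 1

    has_chain_pred = {edges[u][0] for u in edges if chainable(u)}
    chains = []
    for head in edges:
        if head in has_chain_pred:
            continue  # not the start of a chain
        chain = [head]
        while chainable(chain[-1]):
            chain.append(edges[chain[-1]][0])
        if len(chain) >= 2:
            chains.append(chain)
    return chains


def should_combine(chain, op_type, accel):
    """Decide from accelerator and operator characteristics whether to combine."""
    return (len(chain) <= accel.max_fused_ops
            and all(op_type[n] in accel.supported_ops for n in chain))


def build_unit(chain, op_type, binaries):
    """Collect the binary segment of each operator plus sequential control-flow metadata."""
    segments = [binaries[op_type[n]] for n in chain]
    control_flow = [(i, i + 1) for i in range(len(chain) - 1)]
    return CodeUnit(segments, control_flow)
```

For a graph `load -> conv -> relu` where `relu` feeds two consumers, `candidate_chains` yields the single group `["load", "conv", "relu"]`, and `build_unit` packages three segments with the control flow `[(0, 1), (1, 2)]`.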

    Re-playable Execution Optimized for Page Sharing in a Managed Runtime Environment

    Publication number: US20190087321A1

    Publication date: 2019-03-21

    Application number: US15710678

    Filing date: 2017-09-20

    Abstract: Embodiments of this disclosure allow non-position-independent-code to be shared between a closed application and a subsequent application without converting the non-position-independent-code into position-independent-code. In particular, embodiment techniques store live data of a closed application during runtime of the closed application, and thereafter page a portion of the live data that is common to both the closed application and a subsequent application back into volatile memory at the same virtual memory address in which the portion of live data was stored during runtime of the closed application so that the paged live data may be re-used to execute the subsequent application in the managed runtime environment. Because the paged live data is stored at the same virtual memory address during the runtimes of both applications, non-position-independent-code can be shared between the applications.
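
Why the same virtual address matters can be shown with a small Python simulation. It is only a toy model under stated assumptions: the address space is a dictionary from page base address to page bytes, storage is a plain copy, and all function names are hypothetical. A real runtime would rely on operating-system paging rather than dictionaries.

```python
PAGE_SIZE = 4096


def make_page(payload):
    """Pad payload with zero bytes to exactly one page."""
    return payload + bytes(PAGE_SIZE - len(payload))


def save_live_pages(address_space):
    """Copy live pages (page base address -> bytes); stands in for writing them to storage."""
    return dict(address_space)


def page_in_common(saved, needed_bases):
    """Restore only the pages the subsequent application also uses,
    keeping each page at its original base address."""
    return {base: page for base, page in saved.items() if base in needed_bases}


def load_byte(address_space, address):
    """Read one byte through an absolute virtual address."""
    base = address - address % PAGE_SIZE
    return address_space[base][address % PAGE_SIZE]


def read_pointer(address_space, address):
    """Read an 8-byte little-endian absolute address embedded in a page."""
    base, offset = address - address % PAGE_SIZE, address % PAGE_SIZE
    return int.from_bytes(address_space[base][offset:offset + 8], "little")
```

If the page at `0x1000` embeds the absolute address `0x2004`, that pointer still resolves after `page_in_common` because the page at `0x2000` returns to the same base. Had the page been restored elsewhere, the embedded address would need rewriting, which is the conversion to position-independent code that the abstract says is avoided.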

    Method and apparatus for enabling autonomous acceleration of dataflow AI applications

    Publication number: US11573777B2

    Publication date: 2023-02-07

    Application number: US17186352

    Filing date: 2021-02-26

    Abstract: A method includes analyzing a dataflow graph representing data dependencies between operators of a dataflow application to identify a plurality of candidate groups of the operators. Based on characteristics of a given hardware accelerator and the operators of a given candidate group of the plurality of candidate groups, determining whether the operators of the given candidate group are to be combined. In response to determining that the operators of the given candidate group are to be combined, retrieving executable binary code segments corresponding to the operators of the given candidate group, generating a unit of binary code including the executable binary code segments and metadata representing an execution control flow among the executable binary code segments, and dispatching the unit of code to the given hardware accelerator for execution of the unit of code.

    Re-Playable Execution Optimized for Page Sharing in a Managed Runtime Environment

    Publication number: US20190087211A1

    Publication date: 2019-03-21

    Application number: US15890256

    Filing date: 2018-02-06

    Abstract: Embodiments of this disclosure allow non-position-independent-code to be shared between a closed application and a subsequent application without converting the non-position-independent-code into position-independent-code. In particular, embodiment techniques store live data of a closed application during runtime of the closed application, and thereafter page a portion of the live data that is common to both the closed application and a subsequent application back into volatile memory at the same virtual memory address in which the portion of live data was stored during runtime of the closed application so that the paged live data may be re-used to execute the subsequent application in the managed runtime environment. Because the paged live data is stored at the same virtual memory address during the runtimes of both applications, non-position-independent-code can be shared between the applications.

    Re-playable execution optimized for page sharing in a managed runtime environment

    Publication number: US11243790B2

    Publication date: 2022-02-08

    Application number: US15890256

    Filing date: 2018-02-06

    Abstract: Embodiments of this disclosure allow non-position-independent-code to be shared between a closed application and a subsequent application without converting the non-position-independent-code into position-independent-code. In particular, embodiment techniques store live data of a closed application during runtime of the closed application, and thereafter page a portion of the live data that is common to both the closed application and a subsequent application back into volatile memory at the same virtual memory address in which the portion of live data was stored during runtime of the closed application so that the paged live data may be re-used to execute the subsequent application in the managed runtime environment. Because the paged live data is stored at the same virtual memory address during the runtimes of both applications, non-position-independent-code can be shared between the applications.
