DISTRIBUTED DATA PROCESSING IN MULTI-TENANT ENVIRONMENTS

    Publication No.: US20200004858A1

    Publication Date: 2020-01-02

    Application No.: US16024264

    Filing Date: 2018-06-29

    Abstract: Methods, systems, and devices for data processing within a distributed data system are described. In a multi-tenant distributed data system, a provider may supply executable code for processing data using declarative processing instructions received from a tenant. For example, a tenant may provide tenant-specific processing instructions for a requested set of data. The processing instructions may indicate input information (e.g., a data structure, tenant-specific fields, etc.), transformation information (e.g., from a set of pre-defined transformations), and output information. The provider-supplied code may use the tenant-specific processing instructions to process and generate the requested set of data, where the code may be executed by multiple nodes within the system. As such, the code executed by multiple nodes may utilize the input information, transformation information, and output information from the tenant-specific processing instructions to generate the requested data and provide the data to the tenant.
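
    The split between provider-supplied code and tenant-supplied declarative instructions can be illustrated with a minimal sketch. Everything named here (the `TRANSFORMS` registry, the `process` function, the instruction layout, and the sample field names) is hypothetical and is not taken from the patent; it only shows one way input, transformation, and output information could drive generic code that each node runs on its own partition.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical set of pre-defined transformations offered by the provider.
TRANSFORMS: Dict[str, Callable[[Row, Dict[str, Any]], Row]] = {
    "uppercase": lambda row, args: {**row, args["field"]: str(row[args["field"]]).upper()},
    "scale": lambda row, args: {**row, args["field"]: row[args["field"]] * args["factor"]},
}


def process(rows: List[Row], instructions: Dict[str, Any]) -> List[Row]:
    """Provider-supplied code: apply one tenant's instructions to a partition of rows."""
    in_fields = instructions["input"]["fields"]
    out_fields = instructions["output"]["fields"]
    result = []
    for row in rows:
        # Input information: keep only the fields the tenant declared.
        current = {f: row[f] for f in in_fields}
        # Transformation information: named steps from the pre-defined set.
        for step in instructions["transformations"]:
            current = TRANSFORMS[step["name"]](current, step.get("args", {}))
        # Output information: project the fields the tenant asked for.
        result.append({f: current[f] for f in out_fields})
    return result


# Tenant-specific declarative instructions (assumed layout).
tenant_instructions = {
    "input": {"fields": ["name", "amount__c"]},
    "transformations": [
        {"name": "uppercase", "args": {"field": "name"}},
        {"name": "scale", "args": {"field": "amount__c", "factor": 2}},
    ],
    "output": {"fields": ["name", "amount__c"]},
}

# One node's partition; other nodes would call process() on their own partitions.
partition = [{"name": "acme", "amount__c": 10, "internal_id": 7}]
processed = process(partition, tenant_instructions)
print(processed)
```

    Because the tenant supplies only data, not code, the same `process` function can run unchanged on every node and for every tenant.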

    Systems, methods, and apparatuses for implementing concurrent dataflow execution with write conflict protection within a cloud based computing environment

    Publication No.: US10685034B2

    Publication Date: 2020-06-16

    Application No.: US15786448

    Filing Date: 2017-10-17

    Abstract: In accordance with disclosed embodiments, there are provided systems, methods, and apparatuses for implementing concurrent dataflow execution with write conflict protection within a cloud based computing environment. For instance, an exemplary system having at least a processor and a memory therein includes means for: creating a dataflow definition for a first dataflow type, wherein the dataflow definition includes at least one or more datasets to be accessed by the dataflow and a plurality of functional operations to be performed on the one or more datasets when the dataflow is executed; generating and storing a dataflow version identifying all datasets accessed by the dataflow based on the dataflow definition created; receiving multiple requests for the first dataflow type; enqueuing the multiple requests into a message queue pending execution; selecting, from the message queue, a first runnable dataflow having been earliest enqueued of the first dataflow type for execution based on (i) the first dataflow type being allowable within system limits and based further on (ii) verification that the selected first runnable dataflow is not already executing and based further on (iii) verification there is no write conflict for any dataset accessed by the selected first runnable dataflow. Other related embodiments are disclosed.
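
    The three selection checks listed in this abstract (type within system limits, not already executing, no write conflict) can be sketched as a scan over a first-in, first-out queue. The class and function names, the per-type limit table, and the exact conflict rule are assumptions made for illustration, not the patented implementation.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class DataflowRequest:
    dataflow_id: str
    dataflow_type: str
    reads: FrozenSet[str]   # datasets read, as recorded in the dataflow version
    writes: FrozenSet[str]  # datasets written, as recorded in the dataflow version


def has_write_conflict(candidate: DataflowRequest, running: DataflowRequest) -> bool:
    """Assumed rule: conflict if either side writes a dataset the other accesses."""
    candidate_all = candidate.reads | candidate.writes
    running_all = running.reads | running.writes
    return bool(candidate.writes & running_all) or bool(running.writes & candidate_all)


def select_runnable(
    queue: Deque[DataflowRequest],
    running: List[DataflowRequest],
    type_limits: Dict[str, int],
) -> Optional[DataflowRequest]:
    """Dequeue and return the earliest-enqueued request that passes all checks."""
    for req in list(queue):  # snapshot, oldest first
        same_type = sum(1 for r in running if r.dataflow_type == req.dataflow_type)
        if same_type >= type_limits.get(req.dataflow_type, 0):
            continue  # (i) type not allowable within system limits
        if any(r.dataflow_id == req.dataflow_id for r in running):
            continue  # (ii) this dataflow is already executing
        if any(has_write_conflict(req, r) for r in running):
            continue  # (iii) write conflict on an accessed dataset
        queue.remove(req)
        return req
    return None


df_a = DataflowRequest("df-A", "replication", frozenset({"accounts_raw"}), frozenset({"accounts"}))
df_b = DataflowRequest("df-B", "replication", frozenset({"accounts"}), frozenset({"report"}))
df_c = DataflowRequest("df-C", "replication", frozenset({"orders_raw"}), frozenset({"orders"}))

pending = deque([df_a, df_b, df_c])
selected = select_runnable(pending, running=[df_a], type_limits={"replication": 2})
print(selected.dataflow_id if selected else None)
```

    In this run `df-A` is skipped because it is already executing and `df-B` is skipped because it reads a dataset that the running dataflow writes, so `df-C` is selected even though it was enqueued last.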

    Dataflow life cycles
    Granted invention patent

    Publication No.: US10853131B2

    Publication Date: 2020-12-01

    Application No.: US15817582

    Filing Date: 2017-11-20

    Abstract: System and methods for implementing dataflow life cycles are described and include forming, by a first server computing system, a dataflow life cycle by associating a dataflow with a customized code; associating, by the first server computing system, the customized code of the dataflow life cycle with context information, the customized code including one or more of pre-processing customized code and post-processing customized code; scheduling, by the first server computing system, the dataflow of the dataflow life cycle to be executed by a second server computing system when the customized code includes the pre-processing customized code and when the pre-processing customized code is successfully executed by the first server computing system; and executing, by the first server computing system, the post-processing customized code when the customized code includes the post-processing customized code and when the dataflow of the dataflow life cycle is successfully executed by the second server computing system.
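
    The gating described in this abstract (the dataflow is scheduled only after pre-processing succeeds, and post-processing runs only after the dataflow succeeds) can be sketched as below. The function `run_life_cycle`, the hook signature, and the log strings are hypothetical. The sketch also assumes that a life cycle without pre-processing code schedules the dataflow directly, which the abstract does not state.

```python
from typing import Callable, Dict, List, Optional

# A hook is customized code that receives context information and reports success.
Hook = Callable[[Dict[str, str]], bool]


def run_life_cycle(
    run_dataflow: Callable[[], bool],  # stands in for execution on the second server
    context: Dict[str, str],
    pre: Optional[Hook] = None,
    post: Optional[Hook] = None,
) -> List[str]:
    """Run pre-processing, the dataflow, and post-processing with success gating."""
    log: List[str] = []
    if pre is not None:
        if not pre(context):
            log.append("pre-processing failed; dataflow not scheduled")
            return log
        log.append("pre-processing succeeded")
    if not run_dataflow():
        log.append("dataflow failed; post-processing skipped")
        return log
    log.append("dataflow succeeded")
    if post is not None:
        post(context)
        log.append("post-processing executed")
    return log


def is_admin(ctx: Dict[str, str]) -> bool:
    return ctx["user"] == "admin"


events = run_life_cycle(lambda: True, {"user": "admin"}, pre=is_admin, post=lambda ctx: True)
blocked = run_life_cycle(lambda: True, {"user": "guest"}, pre=is_admin, post=lambda ctx: True)
print(events)
print(blocked)
```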

    ORCHESTRATION FOR DATA PIPELINE EXECUTION PLANS

    Publication No.: US20210240519A1

    Publication Date: 2021-08-05

    Application No.: US16779040

    Filing Date: 2020-01-31

    Abstract: Methods, systems, and devices supporting dynamic process orchestration are described. An orchestration server may receive a request defining a data modification process from a user device. The orchestration server may generate an execution file based on the request, and the execution file may include a set of tasks for performing the data modification process and an order for performing the set of tasks. The orchestration server may execute, for the execution file, a first subset of tasks according to the order for performing the set of tasks and, in some cases, may update the execution file based on executing the first subset of tasks. For example, updating the execution file may involve modifying a second subset of tasks of the set of tasks. The orchestration server may execute, for the updated execution file, the modified second subset of tasks according to the order for performing the set of tasks.
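
    A minimal sketch of the execute, update, execute sequence follows. The execution file is modeled as an ordered list of task dictionaries; the task names, the stand-in row count, and the partition-sizing heuristic are all invented for illustration and do not come from the patent.

```python
from typing import Any, Dict, List

Task = Dict[str, Any]


def build_execution_file(request: Dict[str, Any]) -> List[Task]:
    """Turn a request into an ordered list of tasks (the 'execution file')."""
    return [{"name": name, "params": {}} for name in request["steps"]]


def run_task(task: Task, state: Dict[str, Any]) -> None:
    """Stand-in task runner: records the task and fakes an extract result."""
    if task["name"] == "extract":
        state["rows"] = 2500  # hypothetical row count discovered at run time
    state["log"].append((task["name"], dict(task["params"])))


def update_execution_file(plan: List[Task], done: int, state: Dict[str, Any]) -> None:
    """Modify only the tasks not yet executed, keeping their order unchanged."""
    partitions = max(1, state["rows"] // 1000)  # assumed sizing heuristic
    for task in plan[done:]:
        task["params"]["partitions"] = partitions


plan = build_execution_file({"steps": ["extract", "transform", "load"]})
state: Dict[str, Any] = {"log": []}

first_subset = 1
for task in plan[:first_subset]:   # execute the first subset in order
    run_task(task, state)
update_execution_file(plan, first_subset, state)  # update based on what was learned
for task in plan[first_subset:]:   # execute the modified second subset in order
    run_task(task, state)

print(state["log"])
```

    The point of the pattern is that later tasks can be tuned with information that only becomes available once earlier tasks have run, without regenerating the whole plan.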

    Pseudo-synchronous processing by an analytic query and build cluster

    Publication No.: US10515089B2

    Publication Date: 2019-12-24

    Application No.: US15589728

    Filing Date: 2017-05-08

    Abstract: The technology disclosed relates to creating and frequently updating multiple online analytic processing (OLAP) analytic databases from an online transaction processing (OLTP) transaction updatable system that includes transaction commit, rollback, and field level security capabilities. It also relates to transparently decoupling extraction from rebuilding of frequently updated OLAP analytic databases from the OLTP transaction updatable system.

    DATAFLOW LIFE CYCLES
    Invention application

    Publication No.: US20190155642A1

    Publication Date: 2019-05-23

    Application No.: US15817582

    Filing Date: 2017-11-20

    Abstract: System and methods for implementing dataflow life cycles are described and include forming, by a first server computing system, a dataflow life cycle by associating a dataflow with a customized code; associating, by the first server computing system, the customized code of the dataflow life cycle with context information, the customized code including one or more of pre-processing customized code and post-processing customized code; scheduling, by the first server computing system, the dataflow of the dataflow life cycle to be executed by a second server computing system when the customized code includes the pre-processing customized code and when the pre-processing customized code is successfully executed by the first server computing system; and executing, by the first server computing system, the post-processing customized code when the customized code includes the post-processing customized code and when the dataflow of the dataflow life cycle is successfully executed by the second server computing system.
