DESIGN-TIME INFORMATION BASED ON RUN-TIME ARTIFACTS IN A DISTRIBUTED COMPUTING CLUSTER

    Publication No.: US20200065136A1

    Publication date: 2020-02-27

    Application No.: US16667609

    Filing date: 2019-10-29

    Applicant: Cloudera, Inc.

    IPC classification: G06F9/46 G06F16/21

    Abstract: Techniques are disclosed for inferring design-time information based on run-time artifacts generated by services operating in a distributed computing cluster. In an embodiment, a metadata system extracts metadata including run-time artifacts generated by services in a distributed computing cluster while processing a workflow including multiple jobs. The extracted metadata is processed to identify entities and entity relationships which can then be used to generate lineage information. Using the lineage information, the metadata system can infer design-time information associated with the workflow. The inferred design-time information can then be utilized to, for example, recreate the workflow, recreate previous versions of the workflow, optimize the workflow, etc.
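
The inference step described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the artifact record layout (`job`, `reads`, `writes`), the paths, and the function names are all hypothetical, and the only design-time information recovered here is the job ordering of the workflow, derived from which job produced the data another job consumed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical run-time artifacts: one record per job execution, listing the
# paths the job read and wrote. Real services would emit far richer metadata.
ARTIFACTS = [
    {"job": "aggregate", "reads": ["/data/clean"], "writes": ["/data/report"]},
    {"job": "ingest", "reads": ["/raw/events"], "writes": ["/data/staged"]},
    {"job": "clean", "reads": ["/data/staged"], "writes": ["/data/clean"]},
]


def infer_job_dependencies(artifacts):
    """Build lineage: map each job to the set of jobs whose output it read."""
    # Entity relationship 1: which job produced each path.
    producer = {path: a["job"] for a in artifacts for path in a["writes"]}
    deps = {a["job"]: set() for a in artifacts}
    # Entity relationship 2: a job depends on the producer of every path it reads.
    for a in artifacts:
        for path in a["reads"]:
            upstream = producer.get(path)
            if upstream is not None and upstream != a["job"]:
                deps[a["job"]].add(upstream)
    return deps


def infer_workflow_order(artifacts):
    """Infer an execution order for the workflow from the lineage graph."""
    return list(TopologicalSorter(infer_job_dependencies(artifacts)).static_order())


if __name__ == "__main__":
    print(infer_workflow_order(ARTIFACTS))
```

The artifacts are deliberately listed out of order to show that the ordering comes from the read/write relationships rather than from the order in which records were collected.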

    Design-time information based on run-time artifacts in a distributed computing cluster

    Publication No.: US10929173B2

    Publication date: 2021-02-23

    Application No.: US16667609

    Filing date: 2019-10-29

    Applicant: Cloudera, Inc.

    IPC classification: G06F9/46 G06F16/14 G06F16/21

    Abstract: Techniques are disclosed for inferring design-time information based on run-time artifacts generated by services operating in a distributed computing cluster. In an embodiment, a metadata system extracts metadata including run-time artifacts generated by services in a distributed computing cluster while processing a workflow including multiple jobs. The extracted metadata is processed to identify entities and entity relationships which can then be used to generate lineage information. Using the lineage information, the metadata system can infer design-time information associated with the workflow. The inferred design-time information can then be utilized to, for example, recreate the workflow, recreate previous versions of the workflow, optimize the workflow, etc.

    QUERYING OPERATING SYSTEM STATE ON MULTIPLE MACHINES DECLARATIVELY
    Patent application (granted)

    Publication No.: US20160103874A1

    Publication date: 2016-04-14

    Application No.: US14510006

    Filing date: 2014-10-08

    Applicant: Cloudera, Inc.

    Inventor: Philip Zeyliger

    Abstract: A sysSQL technology for querying operating system states of multiple hosts in a cluster using a Structured Query Language (SQL) query is disclosed. An administrator of a cluster can use a graphical or text-based user interface to submit an SQL query to determine the operating system states of multiple hosts in parallel. The technology parses the SQL query to determine the datasets needed to execute the SQL query and aggregates those datasets from the multiple hosts. The technology then creates a temporary database to execute the SQL query and provides the results from the SQL query for display on the user interface.
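
The four steps named in this abstract (parse the query, aggregate the needed datasets, build a temporary database, execute) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `processes` table, its schema, the host names, and the in-memory `HOST_DATA` stand-in for per-host collection are all hypothetical, the table-name extraction is a naive regular expression rather than a real SQL parser, and hosts are read sequentially rather than in parallel.

```python
import re
import sqlite3

# Hypothetical schema for one operating-system dataset.
SCHEMAS = {"processes": "host TEXT, pid INTEGER, name TEXT, cpu REAL"}

# Hypothetical stand-in for data that agents on each host would report.
HOST_DATA = {
    "host1": {"processes": [(1, "init", 0.1), (42, "java", 35.0)]},
    "host2": {"processes": [(1, "init", 0.0), (77, "python", 12.5)]},
}


def tables_needed(query):
    """Naively find dataset names: identifiers that follow FROM or JOIN."""
    return set(re.findall(r"\b(?:FROM|JOIN)\s+(\w+)", query, flags=re.IGNORECASE))


def run_cluster_query(query, host_data=HOST_DATA):
    """Load the needed datasets from every host into a temporary database,
    then execute the query against it."""
    conn = sqlite3.connect(":memory:")  # temporary database, discarded on close
    try:
        for table in tables_needed(query):
            schema = SCHEMAS[table]  # KeyError for datasets this sketch does not know
            conn.execute(f"CREATE TABLE {table} ({schema})")
            placeholders = ", ".join("?" for _ in schema.split(","))
            for host, datasets in host_data.items():
                # Tag each row with the host it came from.
                rows = [(host, *row) for row in datasets.get(table, [])]
                conn.executemany(
                    f"INSERT INTO {table} VALUES ({placeholders})", rows
                )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_cluster_query(
        "SELECT host, name FROM processes WHERE cpu > 10 ORDER BY host"
    ):
        print(row)
```

Loading only the tables the query references keeps the temporary database small, which is the point of parsing the query before collecting anything.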

    Information based on run-time artifacts in a distributed computing cluster

    Publication No.: US10514948B2

    Publication date: 2019-12-24

    Application No.: US15808805

    Filing date: 2017-11-09

    Applicant: Cloudera, Inc.

    IPC classification: G06F9/46 G06F16/14 G06F16/21

    Abstract: Techniques are disclosed for inferring design-time information based on run-time artifacts generated by services operating in a distributed computing cluster. In an embodiment, a metadata system extracts metadata including run-time artifacts generated by services in a distributed computing cluster while processing a workflow including multiple jobs. The extracted metadata is processed to identify entities and entity relationships which can then be used to generate lineage information. Using the lineage information, the metadata system can infer design-time information associated with the workflow. The inferred design-time information can then be utilized to, for example, recreate the workflow, recreate previous versions of the workflow, optimize the workflow, etc.

    CENTRALIZED CONFIGURATION OF A DISTRIBUTED COMPUTING CLUSTER
    Patent application (granted)

    Publication No.: US20150039735A1

    Publication date: 2015-02-05

    Application No.: US14509300

    Filing date: 2014-10-08

    Applicant: Cloudera, Inc.

    IPC classification: H04L12/24 H04L29/08

    Abstract: Systems and methods for centralized configuration of a distributed computing cluster are disclosed. One embodiment of the disclosed technology provides a user environment that facilitates a selection of a service to be run on hosts in the distributed computing cluster and configuration of the service or hosts in the distributed computer cluster. The disclosed technology can further configure each of the hosts in the distributed computing cluster to run the service based on a set of configuration settings.

    DESIGN-TIME INFORMATION BASED ON RUN-TIME ARTIFACTS IN A DISTRIBUTED COMPUTING CLUSTER

    Publication No.: US20210173696A1

    Publication date: 2021-06-10

    Application No.: US17179155

    Filing date: 2021-02-18

    Applicant: Cloudera, Inc.

    IPC classification: G06F9/46 G06F16/21

    Abstract: Techniques are disclosed for inferring design-time information based on run-time artifacts generated by services operating in a distributed computing cluster. In an embodiment, a metadata system extracts metadata including run-time artifacts generated by services in a distributed computing cluster while processing a workflow including multiple jobs. The extracted metadata is processed to identify entities and entity relationships which can then be used to generate lineage information. Using the lineage information, the metadata system can infer design-time information associated with the workflow. The inferred design-time information can then be utilized to, for example, recreate the workflow, recreate previous versions of the workflow, optimize the workflow, etc.

    INFORMATION BASED ON RUN-TIME ARTIFACTS IN A DISTRIBUTED COMPUTING CLUSTER

    Publication No.: US20190138345A1

    Publication date: 2019-05-09

    Application No.: US15808805

    Filing date: 2017-11-09

    Applicant: Cloudera, Inc.

    IPC classification: G06F9/46

    CPC classification: G06F9/46 G06F16/14 G06F16/211

    Abstract: Techniques are disclosed for inferring design-time information based on run-time artifacts generated by services operating in a distributed computing cluster. In an embodiment, a metadata system extracts metadata including run-time artifacts generated by services in a distributed computing cluster while processing a workflow including multiple jobs. The extracted metadata is processed to identify entities and entity relationships which can then be used to generate lineage information. Using the lineage information, the metadata system can infer design-time information associated with the workflow. The inferred design-time information can then be utilized to, for example, recreate the workflow, recreate previous versions of the workflow, optimize the workflow, etc.