Managing distributed execution of programs

    Publication number: US09826031B2

    Publication date: 2017-11-21

    Application number: US14885776

    Filing date: 2015-10-16

    CPC classification number: H04L67/1008 G06F9/485 H04L29/08135 H04L67/16

    Abstract: Techniques are described for managing distributed execution of programs. In some situations, the techniques include determining configuration information to be used for executing a particular program in a distributed manner on multiple computing nodes and/or include providing information and associated controls to a user regarding ongoing distributed execution of one or more programs to enable the user to modify the ongoing distributed execution in various manners. Determined configuration information may include, for example, configuration parameters such as a quantity of computing nodes and/or other measures of computing resources to be used for the executing, and may be determined in various manners, including by interactively gathering values for at least some types of configuration information from an associated user (e.g., via a GUI that is displayed to the user) and/or by automatically determining values for at least some types of configuration information (e.g., for use as recommendations to a user).

    System and method for conditionally updating an item with attribute granularity

    Publication number: US09507818B1

    Publication date: 2016-11-29

    Application number: US14092779

    Filing date: 2013-11-27

    Abstract: A system that implements a scalable data storage service may maintain tables in a non-relational data store on behalf of clients. Each table may include multiple items. Each item may include one or more attributes, each containing a name-value pair. Attribute values may be scalars or sets of numbers or strings. The system may provide an API usable to request that values of one or more of an item's attributes be updated. An update request may be conditional on expected values of one or more item attributes (e.g., the same or different item attributes). In response to a request to update the values of one or more item attributes, the previous values and/or updated values may be optionally returned for the updated item attributes or for all attributes of an item targeted by an update request. Items stored in tables may be indexed using a simple or composite primary key.
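
The conditional update described in this abstract can be illustrated with a minimal in-memory sketch. It is not the patented service or any real API: the function name `conditional_update`, the `ConditionFailed` exception, and the `return_values` option names are all hypothetical, and a plain dictionary stands in for the table.

```python
from typing import Any, Dict, Optional


class ConditionFailed(Exception):
    """Raised when an expected attribute value does not match the stored item."""


def conditional_update(
    table: Dict[str, Dict[str, Any]],
    key: str,
    updates: Dict[str, Any],
    expected: Optional[Dict[str, Any]] = None,
    return_values: str = "NONE",  # hypothetical option names, see below
) -> Optional[Dict[str, Any]]:
    """Update attributes of one item only if the expected values match."""
    item = table.get(key, {})

    # Check every expected attribute before changing anything.
    for name, want in (expected or {}).items():
        if item.get(name) != want:
            raise ConditionFailed(
                f"attribute {name!r} is {item.get(name)!r}, expected {want!r}"
            )

    old = dict(item)
    new = {**item, **updates}
    table[key] = new

    # Optionally return previous or updated values, either for the
    # updated attributes only or for the whole item.
    if return_values == "UPDATED_OLD":
        return {n: old[n] for n in updates if n in old}
    if return_values == "UPDATED_NEW":
        return {n: new[n] for n in updates}
    if return_values == "ALL_OLD":
        return old
    if return_values == "ALL_NEW":
        return dict(new)
    return None


table = {"user#1": {"name": "Ann", "visits": 3}}
previous = conditional_update(
    table, "user#1", {"visits": 4},
    expected={"visits": 3}, return_values="UPDATED_OLD",
)
print(previous)  # {'visits': 3}
```

Because the check and the write happen in one call, a caller that read a stale value receives `ConditionFailed` instead of silently overwriting a newer value.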

    Detecting and reconciling system resource metadata anomolies in a distributed storage system
    Type: Granted invention patent (in force)

    Publication number: US09244958B1

    Publication date: 2016-01-26

    Application number: US13917320

    Filing date: 2013-06-13

    Abstract: A system that implements detection and reconciliation of system resource metadata for a distributed storage system is described. A node may obtain resource metadata specific to the node from another node that maintains system resource metadata for a distributed storage system. Based on the resource metadata specific to the node, a determination may be made that the node is not reconciled with the system resource metadata. A corrective operation may be performed to reconcile the node with the system resource metadata. A corrective operation may include terminating a resource, making unavailable a resource, modifying resource attributes, or sending a resource metadata update to system resource metadata for correction.
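
The detect-then-correct flow in this abstract can be sketched as a comparison between what a node hosts and what the system metadata says it should host. The `Resource` type, the `plan_corrections` function, and the policy that maps each kind of mismatch to a corrective operation are assumptions made for illustration; the abstract lists the possible operations but does not say when each one applies.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Resource:
    resource_id: str
    size_gb: int


def plan_corrections(
    local: Dict[str, Resource],
    authoritative: Dict[str, Resource],
) -> List[Tuple[str, str]]:
    """Return (operation, resource_id) pairs that would reconcile the node."""
    actions: List[Tuple[str, str]] = []
    for rid in sorted(local.keys() | authoritative.keys()):
        mine = local.get(rid)
        theirs = authoritative.get(rid)
        if theirs is None:
            # Node hosts a resource the system metadata does not know about.
            actions.append(("terminate", rid))
        elif mine is None:
            # System metadata lists a resource the node does not host.
            actions.append(("send_metadata_update", rid))
        elif mine != theirs:
            # Both sides know the resource but disagree on its attributes.
            actions.append(("modify_attributes", rid))
    return actions


local = {"v1": Resource("v1", 10), "v2": Resource("v2", 20), "v3": Resource("v3", 5)}
system = {"v1": Resource("v1", 10), "v2": Resource("v2", 40), "v4": Resource("v4", 8)}
print(plan_corrections(local, system))
# [('modify_attributes', 'v2'), ('terminate', 'v3'), ('send_metadata_update', 'v4')]
```

An empty result means the node is already reconciled with the system resource metadata.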

    DYNAMICALLY MODIFYING A CLUSTER OF COMPUTING NODES USED FOR DISTRIBUTED EXECUTION OF A PROGRAM
    Type: Invention application (under examination, published)

    Publication number: US20160234300A1

    Publication date: 2016-08-11

    Application number: US15133098

    Filing date: 2016-04-19

    CPC classification number: H04L67/1029 G06F9/5072 G06F9/5083

    Abstract: Techniques are described for managing distributed execution of programs. In some situations, the techniques include dynamically modifying the distributed program execution in various manners, such as based on monitored status information. The dynamic modifying of the distributed program execution may include adding and/or removing computing nodes from a cluster that is executing the program, modifying the amount of computing resources that are available for the distributed program execution, terminating or temporarily suspending execution of the program (e.g., if an insufficient quantity of computing nodes of the cluster are available to perform execution), etc.
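
A rough sketch of one way monitored status could drive the modifications listed in this abstract. The function `decide_action`, its parameters, and the sizing rule (one node per `tasks_per_node` pending tasks) are hypothetical; the abstract does not specify a decision rule.

```python
from typing import Tuple


def decide_action(
    available_nodes: int,
    min_nodes: int,
    pending_tasks: int,
    tasks_per_node: int,
) -> Tuple[str, int]:
    """Pick a cluster modification from monitored status (illustrative rule)."""
    if available_nodes < min_nodes:
        # Too few nodes are available to keep executing: suspend the program.
        return ("suspend", 0)

    # Ceiling division: nodes needed to cover the pending work.
    needed = -(-pending_tasks // tasks_per_node)
    wanted = max(min_nodes, needed)

    if wanted > available_nodes:
        return ("add_nodes", wanted - available_nodes)
    if wanted < available_nodes:
        return ("remove_nodes", available_nodes - wanted)
    return ("no_change", 0)


print(decide_action(available_nodes=3, min_nodes=3, pending_tasks=20, tasks_per_node=4))
# ('add_nodes', 2)
```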

    Dynamic replica failure detection and healing
    Type: Granted invention patent (in force)

    Publication number: US09304815B1

    Publication date: 2016-04-05

    Application number: US13917317

    Filing date: 2013-06-13

    Abstract: Detecting replica faults within a replica group and dynamically scheduling replica healing operations are described. Status metadata for one or more replica groups may be accessed. Based, at least in part, on the status metadata, a number of available replicas for at least one replica group may be determined to be incompliant with a healthy state definition for the replica group. One or more healing operations to restore the number of available replicas for the at least one replica group to the respective healthy state definition may be dynamically scheduled. In some embodiments, one or more resource constraints for performing healing operations and one or more resource requirements for each of the one or more healing operations may be used to order the one or more healing operations.
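
The scheduling idea in this abstract can be sketched as follows. Everything concrete here is an assumption: the healthy state is reduced to a single replica count, the only resource is a bandwidth budget, and the ordering (most degraded group first, then cheapest operation) is one plausible policy rather than the one claimed.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HealingOp:
    group: str
    missing: int    # replicas short of the healthy state
    bandwidth: int  # hypothetical resource requirement of the operation


def schedule_healing(
    available: Dict[str, int],
    healthy_count: int,
    op_bandwidth: Dict[str, int],
    bandwidth_budget: int,
) -> List[HealingOp]:
    """Order healing operations and keep those that fit the resource budget."""
    # Detect replica groups that fall short of the healthy state definition.
    ops = [
        HealingOp(group, healthy_count - count, op_bandwidth[group])
        for group, count in available.items()
        if count < healthy_count
    ]
    # Most degraded groups first; cheaper operations break ties.
    ops.sort(key=lambda op: (-op.missing, op.bandwidth, op.group))

    scheduled: List[HealingOp] = []
    remaining = bandwidth_budget
    for op in ops:
        if op.bandwidth <= remaining:
            scheduled.append(op)
            remaining -= op.bandwidth
    return scheduled


plan = schedule_healing(
    available={"a": 3, "b": 1, "c": 2, "d": 2},
    healthy_count=3,
    op_bandwidth={"a": 5, "b": 40, "c": 30, "d": 50},
    bandwidth_budget=80,
)
print([op.group for op in plan])  # ['b', 'c']
```

Group `d` is also unhealthy but its operation does not fit the remaining budget, so in this sketch it would wait for a later scheduling pass.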

    Configurable-capacity time-series tables
    Type: Granted invention patent (in force)

    Publication number: US09128965B1

    Publication date: 2015-09-08

    Application number: US13961778

    Filing date: 2013-08-07

    Abstract: Methods and apparatus for configurable-capacity time-series tables are disclosed. A schedule of database table management operations, including at least an operation to change a throughput constraint associated with a table in response to a triggering event, is generated. The table is instantiated with an initial throughput constraint in accordance with the schedule. Work requests directed to the table are accepted based on the initial throughput constraint. The throughput constraint is modified in response to the triggering event. Subsequent work requests are accepted based on the modified throughput constraint.
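
The sequence in this abstract (instantiate with an initial constraint, accept work against it, change the constraint on a triggering event) can be sketched with a small class. The `ScheduledTable` name, the per-interval request counter, and the event name `month_ended` are hypothetical illustrations, not details from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ScheduledTable:
    name: str
    throughput_limit: int  # max work requests accepted per interval
    # Schedule of management operations: triggering event -> new limit.
    schedule: Dict[str, int] = field(default_factory=dict)
    accepted_this_interval: int = 0

    def accept(self) -> bool:
        """Accept a work request only if the current constraint allows it."""
        if self.accepted_this_interval < self.throughput_limit:
            self.accepted_this_interval += 1
            return True
        return False

    def on_event(self, event: str) -> None:
        """Apply the scheduled throughput change for a triggering event."""
        if event in self.schedule:
            self.throughput_limit = self.schedule[event]

    def next_interval(self) -> None:
        """Start a new accounting interval."""
        self.accepted_this_interval = 0


table = ScheduledTable("metrics-2013-08", throughput_limit=2, schedule={"month_ended": 1})
print([table.accept() for _ in range(3)])  # [True, True, False]

table.on_event("month_ended")  # older time-series data needs less capacity
table.next_interval()
print([table.accept() for _ in range(2)])  # [True, False]
```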

    DYNAMIC SCALING OF A CLUSTER OF COMPUTING NODES
    Type: Invention application (under examination, published)

    Publication number: US20150135185A1

    Publication date: 2015-05-14

    Application number: US14598137

    Filing date: 2015-01-15

    Abstract: Techniques are described for managing distributed execution of programs, including by dynamically scaling a cluster of multiple computing nodes performing ongoing distributed execution of a program, such as to increase and/or decrease computing node quantity. An architecture may be used that has core nodes that each participate in a distributed storage system for the distributed program execution, and that has one or more other auxiliary nodes that do not participate in the distributed storage system. Furthermore, as part of performing the dynamic scaling of a cluster, computing nodes that are only temporarily available may be selected and used, such as computing nodes that might be removed from the cluster during the ongoing program execution to be put to other uses and that may also be available for a different fee (e.g., a lower fee) than other computing nodes that are available throughout the ongoing use of the cluster.

    System and method for performing live partitioning in a data store

    Publication number: US10712950B2

    Publication date: 2020-07-14

    Application number: US14733851

    Filing date: 2015-06-08

    Abstract: A system that implements a scalable data storage service may maintain tables in a data store on behalf of storage service clients. The service may maintain table data in multiple replicas of partitions that are stored on respective computing nodes in the system. In response to detecting an anomaly in the system, detecting a change in data volume on a partition or service request traffic directed to a partition, or receiving a service request from a client to split a partition, the data storage service may create additional copies of a partition replica using a physical copy mechanism. The data storage service may issue a split command defined in an API for the data store to divide the original and additional replicas into multiple replica groups, and to configure each replica group to maintain a respective portion of the table data that was stored in the partition before the split.
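
The copy-then-split sequence in this abstract can be sketched with dictionaries standing in for partition replicas. The `split_partition` function and the use of a single integer split key are assumptions for illustration; the abstract does not describe how the table data is divided between the resulting replica groups.

```python
from typing import Dict, List, Tuple

Replica = Dict[int, str]  # key -> value; a stand-in for one partition replica


def split_partition(
    replicas: List[Replica],
    split_key: int,
) -> Tuple[List[Replica], List[Replica]]:
    """Copy every replica in full, then divide the data between two groups."""
    # Physical copy step: each additional replica starts as a full copy.
    copies = [dict(replica) for replica in replicas]

    # Split step: the originals keep keys below split_key and the copies
    # keep the rest, so each group holds its own portion of the data.
    low_group = [{k: v for k, v in r.items() if k < split_key} for r in replicas]
    high_group = [{k: v for k, v in r.items() if k >= split_key} for r in copies]
    return low_group, high_group


original = [{1: "a", 5: "b", 9: "c"} for _ in range(3)]
low, high = split_partition(original, split_key=5)
print(low[0], high[0])  # {1: 'a'} {5: 'b', 9: 'c'}
```

Each resulting group has the same number of replicas as the original partition, which is why the copies are made before the split.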
