Patent search: ap:("Amazon Technologies, Inc.") AND inv:"Barry Bailey Hunter"

1.

Granted invention patent, in force
Dynamic replica failure detection and healing

Publication number: US09304815B1

Publication date: 2016-04-05

Application number: US13917317

Application date: 2013-06-13

Applicant: Amazon Technologies, Inc.

Inventor: Jai Vasanth , Barry Bailey Hunter, Jr. , Kiran-Kumar Muniswamy-Reddy , David Alan Lutz , Jian Wang , Maximiliano Maccanti

IPC: G06F9/48 , G06F11/07 , G06F3/06 , G06F17/30

CPC classification number: G06F17/30575 , G06F3/0617 , G06F9/4881 , G06F11/006 , G06F11/07 , G06F11/0709 , G06F11/0793 , G06F2201/86 , H04L67/1095

Abstract: Detecting replica faults within a replica group and dynamically scheduling replica healing operations are described. Status metadata for one or more replica groups may be accessed. Based, at least in part, on the status metadata, a number of available replicas for at least one replica group may be determined to be noncompliant with a healthy state definition for the replica group. One or more healing operations to restore the number of available replicas for the at least one replica group to the respective healthy state definition may be dynamically scheduled. In some embodiments, one or more resource constraints for performing healing operations and one or more resource requirements for each of the one or more healing operations may be used to order the one or more healing operations.
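
The flow in this abstract can be illustrated with a minimal Python sketch. Everything named here (`ReplicaGroup`, `schedule_healing`, a single numeric `budget`, a uniform `cost_per_heal`) is a hypothetical stand-in, and the "fewest available replicas first" ordering is only one assumed policy; the abstract says only that resource constraints and requirements may be used to order healing operations.

```python
from dataclasses import dataclass


@dataclass
class ReplicaGroup:
    name: str
    available: int  # replicas currently reported available (status metadata)
    required: int   # assumed healthy state definition: minimum replica count


def schedule_healing(groups, budget, cost_per_heal=1):
    """Return (group name, replicas to restore) pairs within the budget."""
    # Keep only groups that do not satisfy their healthy state definition.
    unhealthy = [g for g in groups if g.available < g.required]
    # Assumed policy: groups with the fewest available replicas go first.
    unhealthy.sort(key=lambda g: g.available)
    plan = []
    for g in unhealthy:
        missing = g.required - g.available
        affordable = min(missing, budget // cost_per_heal)
        if affordable == 0:
            break  # resource constraint reached; remaining groups wait
        plan.append((g.name, affordable))
        budget -= affordable * cost_per_heal
    return plan


groups = [ReplicaGroup("a", 3, 3), ReplicaGroup("b", 1, 3), ReplicaGroup("c", 2, 3)]
plan = schedule_healing(groups, budget=2)
```

With a budget of 2, only group "b" is healed in this round; group "c" would be picked up once more resources are available.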

2.

Granted invention patent, in force
Scheduling and tracking control plane operations for distributed storage systems

Publication number: US09438665B1

Publication date: 2016-09-06

Application number: US13921084

Application date: 2013-06-18

Applicant: Amazon Technologies, Inc.

Inventor: Jai Vasanth , Kiran-Kumar Muniswamy-Reddy , David Alan Lutz , Barry Bailey Hunter, Jr.

IPC: G06F15/173 , H04L29/08

CPC classification number: H04L67/10 , H04L67/02 , H04L67/32 , H04L67/322

Abstract: A system that implements distributed storage may schedule and track control plane operations for performance at the distributed storage service. Information may be maintained for control plane events detected at a distributed storage system. Resource utilization for currently performing control plane operations and currently scheduled control plane operations of the distributed storage system may be determined. The information about detected control plane events may be analyzed to schedule control plane operations to be performed in response to detecting the control plane events. As part of scheduling control plane operations, resource constraints may be applied to the determined resource utilization for the distributed storage system.
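
As a rough illustration of the scheduling step described in this abstract, the Python sketch below admits operations for pending events only while total utilization stays within a limit. The function name, the single scalar `capacity`, and the per-operation costs are all assumptions made for the example; the abstract does not specify how utilization or constraints are represented.

```python
def schedule_operations(pending_events, running, scheduled, capacity):
    """pending_events: list of (event_id, cost); running/scheduled: lists of costs."""
    # Utilization covers operations in progress plus those already scheduled.
    utilization = sum(running) + sum(scheduled)
    newly_scheduled = []
    for event_id, cost in pending_events:
        # Apply the resource constraint to the determined utilization.
        if utilization + cost <= capacity:
            newly_scheduled.append(event_id)
            utilization += cost
    return newly_scheduled, utilization


pending = [("split-1", 3), ("move-2", 4), ("backup-3", 2)]
admitted, used = schedule_operations(pending, running=[2], scheduled=[1], capacity=8)
```

Here "move-2" is deferred because admitting it would exceed the assumed capacity of 8, while the cheaper "backup-3" still fits.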

3.

Granted invention patent, in force
Dynamic replica failure detection and healing

Publication number: US09971823B2

Publication date: 2018-05-15

Application number: US15090547

Application date: 2016-04-04

Applicant: Amazon Technologies, Inc.

Inventor: Jai Vasanth , Barry Bailey Hunter, Jr. , Kiran-Kumar Muniswamy-Reddy , David Alan Lutz , Jian Wang , Maximiliano Maccanti

IPC: G06F17/30 , G06F11/07 , G06F9/48 , G06F3/06 , G06F11/00 , H04L29/08

CPC classification number: G06F17/30575 , G06F3/0617 , G06F9/4881 , G06F11/006 , G06F11/07 , G06F11/0709 , G06F11/0793 , G06F2201/86 , H04L67/1095

Abstract: Detecting replica faults within a replica group and dynamically scheduling replica healing operations are described. Status metadata for one or more replica groups may be accessed. Based, at least in part, on the status metadata, a number of available replicas for at least one replica group may be determined to be noncompliant with a healthy state definition for the replica group. One or more healing operations to restore the number of available replicas for the at least one replica group to the respective healthy state definition may be dynamically scheduled. In some embodiments, one or more resource constraints for performing healing operations and one or more resource requirements for each of the one or more healing operations may be used to order the one or more healing operations.

4.

Granted invention patent, in force
Distributed system capacity dial-up

Publication number: US09996573B1

Publication date: 2018-06-12

Application number: US14222377

Application date: 2014-03-21

Applicant: Amazon Technologies, Inc.

Inventor: Akshat Vig , Wei Xiao , Somasundaram Perianayagam , Timothy Andrew Rath , Barry Bailey Hunter, Jr. , Kiran-Kumar Muniswamy-Reddy , Yijun Lu , Qiang Liu , Ying Lin , Stuart Henry Seelye Marshall

IPC: G06F17/30

CPC classification number: G06F17/30584

Abstract: A hosted service may limit access to a table initially comprising one or more partitions. Access to the table may be limited to a provisioned capacity. A client of the service may request an increased capacity. A minimum number of partitions for providing the increased capacity may be determined. Proportions of the increased capacity may be allocated among members of successive generations of partitions to be provided by a member of a generation or its descendants. The proportions may be allocated to minimize the costs associated with splitting partitions based on the minimum number of partitions.
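
The arithmetic behind "a minimum number of partitions" can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: `plan_dial_up`, the fixed `per_partition_limit`, and the even spreading of descendant partitions across existing ones are all hypothetical choices made for the example.

```python
from fractions import Fraction


def plan_dial_up(current_partitions, requested_capacity, per_partition_limit):
    """Return (target partition count, descendants per partition, capacity shares)."""
    # Minimum number of partitions able to serve the request (ceiling division).
    minimum = -(-requested_capacity // per_partition_limit)
    target = max(minimum, current_partitions)
    # Assumed policy: spread descendants as evenly as possible, so as few
    # existing partitions as possible need an additional split.
    base, extra = divmod(target, current_partitions)
    leaves = [base + 1 if i < extra else base for i in range(current_partitions)]
    # Each existing partition's share of capacity follows its descendant count.
    shares = [Fraction(n, target) for n in leaves]
    return target, leaves, shares


target, leaves, shares = plan_dial_up(3, 1000, 300)
```

With three existing partitions, a request for 1000 units, and an assumed limit of 300 units per partition, four partitions are needed, so only one existing partition is split and it carries half of the new capacity across its two descendants.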

5.

Granted invention patent, in force
Distributed computing fault management

Publication number: US09274902B1

Publication date: 2016-03-01

Application number: US13961720

Application date: 2013-08-07

Applicant: Amazon Technologies, Inc.

Inventor: Adam Douglas Morley , Barry Bailey Hunter, Jr. , Yijun Lu , Timothy Andrew Rath , Kiran-Kumar Muniswamy-Reddy , Xianglong Huang , Jiandan Zheng

IPC: G06F11/00 , G06F11/20 , G06F11/07

CPC classification number: G06F11/2002 , G06F11/0709 , G06F11/0787 , G06F11/079 , G06F11/0793 , G06F11/2094

Abstract: An automated system may be employed to perform detection, analysis and recovery from faults occurring in a distributed computing system. Faults may be recorded in a metadata store for verification and analysis by an automated fault management process. Diagnostic procedures may confirm detected faults. The automated fault management process may perform recovery workflows involving operations such as rebooting faulting devices and excommunicating unrecoverable computing nodes from affected clusters.
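
The recovery workflow in this abstract (confirm the fault, try rebooting, excommunicate the node if it stays unrecoverable) can be sketched in Python as below. The names `handle_fault`, `has_fault`, `reboot`, and the `max_reboots` limit are hypothetical, and the metadata store is reduced to a plain dictionary; the abstract does not prescribe any of these details.

```python
def handle_fault(node, has_fault, reboot, max_reboots=2):
    """Return the outcome of a simple recovery workflow for one node."""
    # Diagnostic step: confirm the recorded fault before acting on it.
    if not has_fault(node):
        return "not confirmed"
    for _ in range(max_reboots):
        reboot(node)
        if not has_fault(node):
            return "recovered"
    # Still faulting after the allowed reboots: remove the node from its cluster.
    return "excommunicated"


# Toy state standing in for a metadata store: reboots each node still needs.
reboots_needed = {"node-a": 0, "node-b": 1, "node-c": 5}


def has_fault(node):
    return reboots_needed[node] > 0


def reboot(node):
    reboots_needed[node] -= 1


results = {n: handle_fault(n, has_fault, reboot) for n in list(reboots_needed)}
```

In this toy run, "node-a" has no confirmed fault, "node-b" recovers after one reboot, and "node-c" is excommunicated after two unsuccessful reboots.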
