-
Publication No.: US20140195486A1
Publication date: 2014-07-10
Application No.: US13736861
Filing date: 2013-01-08
Applicant: Facebook, Inc.
Inventor: Sachin Kulkarni, Sanjeev Kumar, Harry Li, Laurent Demailly, Liat Atsmon Guz
IPC: G06F17/30
CPC classification number: G06F17/30581, G06F11/1662, G06F11/2082, G06F11/2094
Abstract: Disclosed are a method and system for recovering a distributed system from a failure of a data storage unit. The distributed system includes a plurality of computer systems, each having a read-write computer and a data storage unit. Data is replicated from a particular data storage unit to other data storage units using a publish-subscribe model. A read-write computer receives the replicated data, processes the data for any conflicts and stores it in the data storage unit. If a data storage unit fails, another data storage unit that has the latest data corresponding to the failed data storage unit is determined and the latest data is replicated to other data storage units. Accordingly, the distributed system continues to have the data of the failed data storage unit. The failed data storage unit may be reconstructed using data from one of the other data storage units in the distributed system.
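The recovery flow described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: the names `Update`, `StorageUnit`, `publish`, and `recover` are hypothetical, the per-origin sequence number is an assumed way of deciding which unit holds the "latest data", in-order delivery is assumed, and the conflict rule (highest `(seq, origin)` wins) is a placeholder, since the abstract does not say how conflicts are processed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Update:
    origin: str  # storage unit where the write originated
    seq: int     # per-origin sequence number (assumption, used for ordering)
    key: str
    value: str


@dataclass
class StorageUnit:
    name: str
    data: dict = field(default_factory=dict)  # key -> (seq, origin, value)
    log: dict = field(default_factory=dict)   # origin -> updates received, in seq order

    def last_seq(self, origin: str) -> int:
        """Highest sequence number this unit has received from `origin`."""
        seen = self.log.get(origin, [])
        return seen[-1].seq if seen else 0

    def apply(self, u: Update) -> None:
        """Read-write computer role: check for conflicts, then store."""
        if u.seq <= self.last_seq(u.origin):
            return  # duplicate or stale update, ignore it
        self.log.setdefault(u.origin, []).append(u)
        current = self.data.get(u.key)
        # Placeholder conflict rule (assumption): highest (seq, origin) wins.
        if current is None or (u.seq, u.origin) > current[:2]:
            self.data[u.key] = (u.seq, u.origin, u.value)


def publish(u: Update, subscribers: list) -> None:
    """Publish-subscribe replication: deliver an update to each subscriber."""
    for unit in subscribers:
        unit.apply(u)


def recover(failed: str, survivors: list) -> StorageUnit:
    """Find the survivor with the latest data from `failed` and replicate it."""
    leader = max(survivors, key=lambda s: s.last_seq(failed))
    for unit in survivors:
        if unit is not leader:
            for u in leader.log.get(failed, []):
                unit.apply(u)  # apply() skips updates the unit already has
    return leader


if __name__ == "__main__":
    b, c = StorageUnit("B"), StorageUnit("C")
    publish(Update("A", 1, "x", "v1"), [b, c])
    publish(Update("A", 2, "x", "v2"), [b])  # C misses this update before A fails
    leader = recover("A", [b, c])
    print(leader.name, c.data == b.data)
```

In this sketch each surviving unit keeps the updates it received from every origin, so after a failure the survivors can be compared by the last sequence number they hold for the failed unit. The same retained log is what a replacement unit could be reconstructed from.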
-
Publication No.: US09824132B2
Publication date: 2017-11-21
Application No.: US13736861
Filing date: 2013-01-08
Applicant: Facebook, Inc.
Inventor: Sachin Kulkarni, Sanjeev Kumar, Harry Li, Laurent Demailly, Liat Atsmon Guz
CPC classification number: G06F17/30581, G06F11/1662, G06F11/2082, G06F11/2094
Abstract: Disclosed are a method and system for recovering a distributed system from a failure of a data storage unit. The distributed system includes a plurality of computer systems, each having a read-write computer and a data storage unit. Data is replicated from a particular data storage unit to other data storage units using a publish-subscribe model. A read-write computer receives the replicated data, processes the data for any conflicts and stores it in the data storage unit. If a data storage unit fails, another data storage unit that has the latest data corresponding to the failed data storage unit is determined and the latest data is replicated to other data storage units. Accordingly, the distributed system continues to have the data of the failed data storage unit. The failed data storage unit may be reconstructed using data from one of the other data storage units in the distributed system.
-