Abstract:
A method of prioritizing data for recovery in a distributed storage system includes, for each stripe of a file having chunks, determining whether the stripe comprises high-availability chunks or low-availability chunks and determining an effective redundancy value for each stripe. The effective redundancy value is based on the chunks and any system domains associated with the corresponding stripe. The distributed storage system has a system hierarchy including system domains. Chunks of a stripe associated with a system domain in an active state are accessible, whereas chunks of a stripe associated with a system domain in an inactive state are inaccessible. The method also includes reconstructing substantially immediately inaccessible, high-availability chunks having an effective redundancy value less than a threshold effective redundancy value and reconstructing the inaccessible low-availability and other inaccessible high-availability chunks, after a threshold period of time.
Abstract:
A method of prioritizing data for recovery in a distributed storage system includes, for each stripe of a file having chunks, determining whether the stripe comprises high-availability chunks or low-availability chunks and determining an effective redundancy value for each stripe. The effective redundancy value is based on the chunks and any system domains associated with the corresponding stripe. The distributed storage system has a system hierarchy including system domains. Chunks of a stripe associated with a system domain in an active state are accessible, whereas chunks of a stripe associated with a system domain in an inactive state are inaccessible. The method also includes reconstructing substantially immediately inaccessible, high-availability chunks having an effective redundancy value less than a threshold effective redundancy value and reconstructing the inaccessible low-availability and other inaccessible high-availability chunks, after a threshold period of time.
Abstract:
A method of distributing data in a distributed storage system includes receiving a file into non-transitory memory and dividing the received file into chunks using a computer processor in communication with the non-transitory memory. The method also includes distributing chunks to storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance units each having active and inactive states. Moreover, each storage device is associated with a maintenance unit. The chunks are distributed across multiple maintenance units to maintain accessibility of the file when a maintenance unit is in an inactive state.
Abstract:
A method of prioritizing data for recovery in a distributed storage system includes, for each stripe of a file having chunks, determining whether the stripe comprises high-availability chunks or low-availability chunks and determining an effective redundancy value for each stripe. The effective redundancy value is based on the chunks and any system domains associated with the corresponding stripe. The distributed storage system has a system hierarchy including system domains. Chunks of a stripe associated with a system domain in an active state are accessible, whereas chunks of a stripe associated with a system domain in an inactive state are inaccessible. The method also includes reconstructing substantially immediately inaccessible, high-availability chunks having an effective redundancy value less than a threshold effective redundancy value and reconstructing the inaccessible low-availability and other inaccessible high-availability chunks, after a threshold period of time.
Abstract:
A method of distributing data in a distributed storage system includes receiving a file and dividing the received file into chunks. The chunks are data-chunks and non-data chunks. The method further includes grouping chunks into a group and determining a distribution of the chunks of the group among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes hierarchical maintenance levels and maintenance domains. Each maintenance domain has an active state or an inactive state; and each storage device is associated with at least one maintenance domain. The method also includes distributing the chunks of the group to the storage devices based on the determined distribution. The chunks of the group are distributed across multiple maintenance domains to maintain an ability to reconstruct chunks of the group when a maintenance domain is in the inactive state.
Abstract:
The present technology proposes techniques for ensuring globally consistent transactions. This technology may allow distributed systems to ensure the causal order of read and write transactions across different partitions of a distributed database. By assigning causally generated timestamps to the transactions based on one or more globally coherent time services, the timestamps can be used to preserve and represent the causal order of the transactions in the distributed system. In this regard, certain transactions may wait for a period of time after choosing a timestamp in order to delay the start of any second transaction that might depend on it. The wait may ensure that the effects of the first transaction are not made visible until its timestamp is guaranteed to be in the past. This may ensure that a consistent snapshot of the distributed database can be determined for any past timestamp.
Abstract:
A method of distributing data in a distributed storage system includes receiving a file into non-transitory memory and dividing the received file into chunks using a computer processor in communication with the non-transitory memory. The method also includes distributing chunks to storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance units each having active and inactive states. Moreover, each storage device is associated with a maintenance unit. The chunks are distributed across multiple maintenance units to maintain accessibility of the file when a maintenance unit is in an inactive state.
Abstract:
A method of prioritizing data for recovery in a distributed storage system includes, for each stripe of a file having chunks, determining whether the stripe comprises high-availability chunks or low-availability chunks and determining an effective redundancy value for each stripe. The effective redundancy value is based on the chunks and any system domains associated with the corresponding stripe. The distributed storage system has a system hierarchy including system domains. Chunks of a stripe associated with a system domain in an active state are accessible, whereas chunks of a stripe associated with a system domain in an inactive state are inaccessible. The method also includes reconstructing substantially immediately inaccessible, high-availability chunks having an effective redundancy value less than a threshold effective redundancy value and reconstructing the inaccessible low-availability and other inaccessible high-availability chunks, after a threshold period of time.
Abstract:
A method of distributing data in a distributed storage system includes receiving a file and dividing the received file into chunks. The chunks are data-chunks and non-data chunks. The method further includes grouping chunks into a group and determining a distribution of the chunks of the group among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes hierarchical maintenance levels and maintenance domains. Each maintenance domain has an active state or an inactive state; and each storage device is associated with at least one maintenance domain. The method also includes distributing the chunks of the group to the storage devices based on the determined distribution. The chunks of the group are distributed across multiple maintenance domains to maintain an ability to reconstruct chunks of the group when a maintenance domain is in the inactive state.
Abstract:
A method of distributing data in a distributed storage system includes receiving a file into non-transitory memory and dividing the received file into chunks. The chunks are data-chunks and non-data chunks. The method also includes grouping one or more of the data chunks and one or more of the non-data chunks in a group. One or more chunks of the group is capable of being reconstructed from other chunks of the group. The method also includes distributing the chunks of the group to storage devices of the distributed storage system based on a hierarchy of the distributed storage system. The hierarchy includes maintenance domains having active and inactive states, each storage device associated with a maintenance domain, the chunks of a group are distributed across multiple maintenance domains to maintain the ability to reconstruct chunks of the group when a maintenance domain is in an inactive state.