Abstract:
Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup, and the digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index, and matching digests of previously constructed synthetic backups are located. Each located matching digest references stored data that is included in the synthetic backup and is similar to the input backup data. Data matches are then found between the input backup data and the data of the synthetic backup.
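By way of illustration only, the following Python sketch shows one possible form of the digest-index flow described above: digests are calculated over the stored synthetic backup, kept in an index, and searched when new backup data arrives. The fixed chunk size, the SHA-256 digest function, and the index layout are assumptions of the example, not details given in the abstract.

import hashlib

# Illustrative sketch: fixed-size chunking and SHA-256 digests stand in for
# whatever chunking and digest scheme the storage system actually uses.
CHUNK_SIZE = 8 * 1024

def chunks(data, size=CHUNK_SIZE):
    for offset in range(0, len(data), size):
        yield offset, data[offset:offset + size]

def index_synthetic_backup(synthetic_data, digests_index, backup_id):
    """Calculate digests over the synthetic backup data and store them in the index."""
    for offset, chunk in chunks(synthetic_data):
        digest = hashlib.sha256(chunk).hexdigest()
        # Each stored digest references the location of the data it was calculated from.
        digests_index[digest] = (backup_id, offset, len(chunk))

def deduplicate(input_data, digests_index):
    """Search digests of new backup data in the index and report matches."""
    matches, unmatched = [], []
    for offset, chunk in chunks(input_data):
        digest = hashlib.sha256(chunk).hexdigest()
        reference = digests_index.get(digest)
        if reference is not None:
            matches.append((offset, reference))    # similar stored data located
        else:
            unmatched.append((offset, chunk))      # new data to be stored
    return matches, unmatched

A caller would first run index_synthetic_backup over a constructed synthetic backup and then pass subsequent backup streams to deduplicate.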
Abstract:
Segment sizes are controlled by setting segment boundaries in a hash-based backup deduplication system in a distributed computing environment. A subsequence of size K of a sequence of characters S is selected. A segment boundary is set if one of a sequence of decreasingly restrictive logical tests returns a true value when applied to the subsequence of S.
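By way of illustration only, the following Python sketch applies a sequence of decreasingly restrictive bit-mask tests to a hash of a sliding subsequence of size K and declares a segment boundary when one of the applicable tests returns true. The window size, the particular bit masks, the hash function, and the rule that relaxes the tests as a segment grows are assumptions of the example.

import hashlib

K = 48                               # hypothetical window (subsequence) size
MIN_SIZE, MAX_SIZE = 2048, 65536     # hypothetical segment size limits

# Decreasingly restrictive logical tests: each requires fewer low-order
# hash bits to be zero than the previous one.
TESTS = [
    lambda h: h & 0x1FFF == 0,   # most restrictive
    lambda h: h & 0x0FFF == 0,
    lambda h: h & 0x07FF == 0,   # least restrictive
]

def window_hash(window: bytes) -> int:
    return int.from_bytes(hashlib.md5(window).digest()[:8], "big")

def find_boundaries(S: bytes):
    boundaries, start = [], 0
    for pos in range(K, len(S) + 1):
        length = pos - start
        if length < MIN_SIZE:
            continue
        h = window_hash(S[pos - K:pos])
        # Allow progressively less restrictive tests as the segment grows,
        # so a boundary is usually found before MAX_SIZE has to force one.
        level = min(len(TESTS) - 1, (length * len(TESTS)) // MAX_SIZE)
        if any(test(h) for test in TESTS[:level + 1]) or length >= MAX_SIZE:
            boundaries.append(pos)
            start = pos
    return boundaries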
Abstract:
Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup, and the digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index, and matching digests of previously constructed synthetic backups are located. Each located matching digest references stored data that is included in the synthetic backup and is similar to the input backup data. Data matches are then found between the input data and the data of the synthetic backup.
Abstract:
Systems, methods, and computer program products are provided for managing global cache coherency in distributed shared caching for a clustered file system (CFS). The CFS manages access permissions to an entire space of data segments by using a DSM module. In response to receiving a request to access one of the data segments, a calculation operation is performed to obtain the most recent contents of that data segment. The calculation operation performs one of: providing the most recent contents via communication with a remote DSM module, which obtains the data segment from an associated external cache memory; instructing, by the DSM module, to read the data segment from storage; and determining that any existing contents of the data segment in the local external cache are the most recent contents.
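By way of illustration only, the following Python sketch outlines the three outcomes of the calculation operation described above. The DSM, cache, and storage interfaces used here are hypothetical placeholders; a real CFS node would communicate with remote DSM modules over the network rather than call methods directly.

class Outcome:
    REMOTE_CACHE = "most recent contents provided by a remote DSM module's external cache"
    READ_STORAGE = "instructed to read the data segment from storage"
    LOCAL_VALID = "existing local external-cache contents are the most recent"

def obtain_latest(segment_id, local_dsm, local_cache, storage):
    """Hypothetical calculation operation run when access to a data segment is requested."""
    owner = local_dsm.current_owner(segment_id)   # DSM module holding the latest version

    if owner is not None and owner is not local_dsm:
        # Outcome 1: a remote DSM module provides the most recent contents,
        # obtained from its associated external cache memory.
        data = owner.fetch_from_cache(segment_id)
        local_cache.put(segment_id, data)
        return data, Outcome.REMOTE_CACHE

    if not local_cache.contains(segment_id) or local_dsm.is_invalidated(segment_id):
        # Outcome 2: the DSM module instructs the node to read the segment from storage.
        data = storage.read(segment_id)
        local_cache.put(segment_id, data)
        return data, Outcome.READ_STORAGE

    # Outcome 3: the existing contents in the local external cache are the most recent.
    return local_cache.get(segment_id), Outcome.LOCAL_VALID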
Abstract:
A deduplication storage system and a backup application create a synthetic backup. Metadata instructions are provided to the deduplication storage system. Each metadata instruction specifies a data segment of an originating backup and a designated location of that data segment in the synthetic backup. A set of metadata instructions is transformed into a transformed set of metadata instructions.
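By way of illustration only, the following Python sketch models a metadata instruction with the fields named in the abstract and shows one conceivable transformation, splitting instructions on fixed alignment boundaries. The field names and the choice of transformation are assumptions; the abstract does not specify what the transformation does.

from dataclasses import dataclass

@dataclass
class MetadataInstruction:
    source_backup_id: str   # originating backup
    source_offset: int      # location of the data segment in the originating backup
    length: int             # size of the data segment
    target_offset: int      # designated location of the segment in the synthetic backup

ALIGNMENT = 1 << 20   # hypothetical 1 MiB alignment used only for this example

def transform(instructions):
    """One conceivable transformation: split instructions so none crosses an alignment boundary."""
    transformed = []
    for ins in instructions:
        src, tgt, remaining = ins.source_offset, ins.target_offset, ins.length
        while remaining > 0:
            step = min(remaining, ALIGNMENT - (tgt % ALIGNMENT))
            transformed.append(MetadataInstruction(ins.source_backup_id, src, step, tgt))
            src, tgt, remaining = src + step, tgt + step, remaining - step
    return transformed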
Abstract:
A deduplication storage system and a backup application create a synthetic backup. Metadata instructions are provided to the deduplication storage system. Each metadata instruction specifies a data segment of an originating backup and a designated location of that data segment in the synthetic backup. Each metadata instruction is processed by locating the data sub-segments in the deduplication storage system that constitute the specified data segment, creating metadata references to each of these data sub-segments, and adding the metadata references to the metadata of the synthetic backup being created.
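By way of illustration only, the following Python sketch processes one metadata instruction (with the fields shown in the sketch above) by locating the stored sub-segments that make up the specified data segment and appending references to them to the synthetic backup's metadata. The repository lookup and the shape of a metadata reference are hypothetical.

def process_instruction(instruction, repository, synthetic_metadata):
    """Process one metadata instruction against a hypothetical repository interface."""
    # Locate the stored data sub-segments that make up the specified data segment
    # of the originating backup.
    sub_segments = repository.sub_segments_in_range(
        instruction.source_backup_id,
        instruction.source_offset,
        instruction.length,
    )
    target = instruction.target_offset
    for sub in sub_segments:
        # Create a metadata reference to the existing sub-segment (no data is copied)
        # and add it to the metadata of the synthetic backup being created.
        synthetic_metadata.append({
            "storage_block": sub.block_id,
            "block_offset": sub.offset,
            "length": sub.length,
            "synthetic_offset": target,
        })
        target += sub.length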
Abstract:
Exemplary method, system, and computer program product embodiments for full exploitation of parallel processors for data processing are provided. In one embodiment, by way of example only, a set of parallel processors is partitioned into disjoint subsets according to the indices of the parallel processors. The size of each disjoint subset corresponds to the number of processors assigned to process the data chunks at one of the layers. The processors are assigned to different layers in different data chunks such that each processor is busy and each data chunk is fully processed within a number of time steps equal to the number of layers. A transition function is devised from the indices of the parallel processors at one time step to their indices at a following time step.
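By way of illustration only, the following Python sketch partitions processor indices into disjoint subsets, one per layer, and uses a cyclic transition function so that each physical processor works on a different layer of a different chunk at successive time steps. It assumes the processor count is a multiple of the number of layers and that every layer needs the same number of processors; the specific cyclic transition rule is one illustrative choice.

def partition(P, L):
    """Partition processor indices 0..P-1 into L disjoint, equally sized subsets,
    one subset per layer (assumes every layer needs P // L processors)."""
    size = P // L
    return [list(range(k * size, (k + 1) * size)) for k in range(L)]

def transition(index, P, L):
    """Illustrative transition function: the index held at one time step advances by
    P // L positions at the following step, so over L steps a processor visits
    every layer's subset exactly once."""
    return (index + P // L) % P

def assignments(P, L, steps):
    """Track, per time step, which layer and chunk each physical processor works on."""
    size = P // L
    index_of = list(range(P))            # index currently held by each physical processor
    plan = []
    for t in range(steps):
        for proc in range(P):
            layer = index_of[proc] // size   # the layer subset this index belongs to
            chunk = t - layer                # chunk that entered the pipeline 'layer' steps ago
            if chunk >= 0:
                plan.append((t, proc, layer, chunk))
        index_of = [transition(i, P, L) for i in index_of]   # reassign indices for the next step
    return plan

With P = 6 processors and L = 3 layers, all six processors are busy from the third time step onward, and every chunk passes through layers 0, 1, and 2 in exactly three time steps.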