-
Publication No.: US09059728B2
Publication date: 2015-06-16
Application No.: US14015228
Filing date: 2013-08-30
Applicant: International Business Machines Corporation
Inventor: David D. Chambliss, Mihail C. Constantinescu, Joseph S. Glider, Dilip N. Simha
IPC: H03M7/30
CPC classification number: H03M7/3084, H03M7/6005
Abstract: Aspects of the invention are provided for decoding a selected span of data within a compressed code stream. A selection of data within the compressed code stream from an arbitrary position is presented for decompression. The arbitrary position is the starting point in the compressed code stream for decompression, and a phrase within the compressed code stream containing the starting point is identified. From the arbitrary starting point, a back pointer may provide direction to the literal. The literal is extracted as a decoding of the compressed data associated with the starting point.
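The abstract above describes locating the phrase that contains an arbitrary starting point and following back pointers until a literal is reached. A minimal sketch of that idea follows, assuming an LZ77-style stream that has already been parsed into phrases. The `Literal` and `Copy` types, the `distance`/`length` fields, and the function names are hypothetical and are not taken from the patent; a real code stream is a bit sequence and would need its own way of finding phrase boundaries.

```python
import bisect
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Literal:
    byte: int  # one raw byte of output


@dataclass
class Copy:
    distance: int  # back pointer: how far back in the output the copy starts
    length: int    # number of output bytes this phrase produces


Phrase = Union[Literal, Copy]


def phrase_starts(phrases: List[Phrase]) -> Tuple[List[int], int]:
    """Return the output offset at which each phrase begins, plus total size."""
    starts, pos = [], 0
    for p in phrases:
        starts.append(pos)
        pos += 1 if isinstance(p, Literal) else p.length
    return starts, pos


def decode_byte(phrases: List[Phrase], starts: List[int], pos: int) -> int:
    """Resolve one output byte by chasing back pointers to a literal."""
    while True:
        # Identify the phrase that contains output position `pos`.
        p = phrases[bisect.bisect_right(starts, pos) - 1]
        if isinstance(p, Literal):
            return p.byte
        # A copy phrase: the byte at `pos` equals the byte `distance` earlier.
        pos -= p.distance


def decode_span(phrases: List[Phrase], start: int, length: int) -> bytes:
    """Decode `length` bytes beginning at an arbitrary output offset."""
    starts, total = phrase_starts(phrases)
    if start < 0 or length < 0 or start + length > total:
        raise ValueError("span lies outside the decoded data")
    return bytes(decode_byte(phrases, starts, q)
                 for q in range(start, start + length))


# Usage: the phrases below decode in full to b"abcabcab".
example = [Literal(ord("a")), Literal(ord("b")), Literal(ord("c")), Copy(3, 5)]
span = decode_span(example, 4, 3)
```

The sketch assumes every back pointer refers to earlier output (distance of at least 1 and no larger than the phrase's own start offset), which guarantees the loop terminates at a literal.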
-
Publication No.: US20140358870A1
Publication date: 2014-12-04
Application No.: US14016268
Filing date: 2013-09-03
Applicant: International Business Machines Corporation
Inventor: David D. Chambliss, Mihail C. Constantinescu, Joseph S. Glider, Maohua Lu
IPC: G06F17/30
CPC classification number: G06F17/30156
Abstract: Assignment of files to a de-duplication domain. Address space of data files is divided into multiple containers. For each of the containers, a file metadata scan is performed to obtain file system metadata, which is aggregated and summarized in a content feature summary. A content feature summary prediction measurement is measured between containers from the generated content feature summary, and files from each container are assigned to a de-duplication domain based upon the content similarity prediction measurement.
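The flow in the abstract above (scan file metadata per container, summarize it, compare summaries, assign to a domain) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the feature choice (file extension plus a power-of-two size bucket), the weighted Jaccard similarity, the greedy grouping, and the `threshold` parameter are not stated in the source.

```python
from collections import Counter
from typing import Dict, List, Tuple

FileMeta = Tuple[str, int]  # (file name, size in bytes), metadata only


def summarize(files: List[FileMeta]) -> Counter:
    """Aggregate file metadata into a content feature summary (a histogram)."""
    summary = Counter()
    for name, size in files:
        ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        summary[(ext, size.bit_length())] += 1  # bit_length = size bucket
    return summary


def similarity(a: Counter, b: Counter) -> float:
    """Weighted Jaccard similarity between two summaries, in [0, 1]."""
    keys = a.keys() | b.keys()
    union = sum(max(a[k], b[k]) for k in keys)
    inter = sum(min(a[k], b[k]) for k in keys)
    return inter / union if union else 0.0


def assign_domains(containers: Dict[str, List[FileMeta]],
                   threshold: float = 0.5) -> List[List[str]]:
    """Greedily place each container in the most similar existing domain."""
    domains = []  # each item: (merged summary, [container names])
    for name, files in containers.items():
        s = summarize(files)
        best = max(domains, key=lambda d: similarity(d[0], s), default=None)
        if best is not None and similarity(best[0], s) >= threshold:
            best[0].update(s)      # fold the container into the domain summary
            best[1].append(name)
        else:
            domains.append((s, [name]))
    return [names for _, names in domains]


# Usage: two photo containers group together, the log container stands alone.
result = assign_domains({
    "c1": [("a.jpg", 1000), ("b.jpg", 1010)],
    "c2": [("c.jpg", 1020), ("d.jpg", 900)],
    "c3": [("x.log", 50)],
})
```

The point of working from metadata alone is that no file content has to be read to predict which containers are likely to share duplicate data.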
-
Publication No.: US10754550B2
Publication date: 2020-08-25
Application No.: US16107750
Filing date: 2018-08-21
Applicant: International Business Machines Corporation
Inventor: Mihail C. Constantinescu, Abdullah Gharaibeh, Maohua Lu, David A. Pease, Anurag Sharma
IPC: G06F3/16, G06F16/174, G06F16/2455, G06F3/06
Abstract: Data deduplication for data storage tapes includes intercepting tape control commands for a single data storage tape. The intercepted tape control commands are modified for adding processing logic and parameters for placement of deduplicated file data on the single data storage tape. Deduplication metadata is written to a metadata portion of the single data storage tape. The deduplicated file data is written to a data portion of the single data storage tape based on the placement to increase read throughput for a deduplicated set of individual files and to reduce an average number of per-file gaps on the single data storage tape without re-duplicating deduplicated data for meeting optimization of individual file accesses.
-
Publication No.: US10101916B2
Publication date: 2018-10-16
Application No.: US14927151
Filing date: 2015-10-29
Applicant: International Business Machines Corporation
Inventor: Mihail C. Constantinescu, Abdullah Gharaibeh, Maohua Lu, David A. Pease, Anurag Sharma
IPC: G06F17/30, G06F3/06, G11B5/86, G11B27/032
Abstract: Data deduplication for data storage tapes includes intercepting tape control commands for a single data storage tape. The intercepted tape control commands are modified by adding processing logic and parameters for placement of deduplicated file data on the single data storage tape. Deduplication metadata is written to a metadata portion of the single data storage tape. The deduplicated file data is written to a data portion of the single data storage tape based on the placement to increase read throughput for a deduplicated set of individual files and to reduce an average number of per-file gaps on the single data storage tape without re-duplicating deduplicated data for meeting optimization of individual file accesses.
-
Publication No.: US09280551B2
Publication date: 2016-03-08
Application No.: US13908955
Filing date: 2013-06-03
Applicant: International Business Machines Corporation
Inventor: David D. Chambliss, Mihail C. Constantinescu, Joseph S. Glider, Maohua Lu
IPC: G06F17/30
CPC classification number: G06F17/30156
Abstract: Assignment of files to a de-duplication domain. Address space of data files is divided into multiple containers. For each of the containers, a file metadata scan is performed to obtain file system metadata, which is aggregated and summarized in a content feature summary. A content feature summary prediction measurement is measured between containers from the generated content feature summary, and files from each container are assigned to a de-duplication domain based upon the content similarity prediction measurement.
-
Publication No.: US20150363457A1
Publication date: 2015-12-17
Application No.: US14835268
Filing date: 2015-08-25
Applicant: International Business Machines Corporation
Inventor: David D. Chambliss, Mihail C. Constantinescu, Joseph S. Glider, Maohua Lu
IPC: G06F17/30
CPC classification number: G06F17/30371, G06F17/30489, G06F17/30876
Abstract: Detecting data duplication includes maintaining a fingerprint directory including one or more entries. Each entry includes a data fingerprint and a data location for a data chunk. A shadow list including a record of fingerprint values not contained in the fingerprint directory is maintained. Each entry is associated with a seen-count attribute, which is an indication of how often a data fingerprint has been seen in arriving data chunks to be written in a storage system, and distinguishes multiply-seen entries for data fingerprints present in at least two data chunks from once-seen entries for data fingerprints present in no more than a single data chunk. Each entry retrieved from the shadow list relates to twice-seen fingerprints.
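One way to read the abstract above is that a fingerprint earns a full directory entry only once it has been seen a second time, while the shadow list cheaply remembers fingerprints seen once. The sketch below follows that reading; it is an interpretation, not the claimed method. The class name `DedupIndex`, the use of SHA-256 as the fingerprint, and the in-memory `store` list standing in for physical storage are all assumptions.

```python
import hashlib
from typing import Dict, List, Set


class DedupIndex:
    """Fingerprint directory plus shadow list, with a seen-count per entry."""

    def __init__(self) -> None:
        # fingerprint -> [data location, seen_count]
        self.directory: Dict[bytes, List[int]] = {}
        # fingerprints seen exactly once and not (yet) in the directory
        self.shadow: Set[bytes] = set()
        # stand-in for the storage system; index = data location
        self.store: List[bytes] = []

    def write(self, chunk: bytes) -> int:
        """Write a chunk and return the location where its data lives."""
        fp = hashlib.sha256(chunk).digest()
        entry = self.directory.get(fp)
        if entry is not None:
            entry[1] += 1          # multiply-seen: deduplicate against it
            return entry[0]
        # Not in the directory, so the data has to be stored.
        self.store.append(chunk)
        location = len(self.store) - 1
        if fp in self.shadow:
            # Second sighting: promote from shadow list to directory.
            self.shadow.discard(fp)
            self.directory[fp] = [location, 2]
        else:
            self.shadow.add(fp)    # once-seen: remember the fingerprint only
        return location


# Usage: the third write of the same chunk is deduplicated.
index = DedupIndex()
first = index.write(b"block")
second = index.write(b"block")
third = index.write(b"block")
```

In this sketch the second copy of a chunk is still stored, because the shadow list keeps no location for the first copy; that trade-off keeps once-seen fingerprints from occupying directory space.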
-
Publication No.: US09158468B2
Publication date: 2015-10-13
Application No.: US13732472
Filing date: 2013-01-02
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Duane M. Baldwin, Clodoaldo Barrera, Mihail C. Constantinescu, Sandeep R. Patil, Riyazahamad M. Shiraguppi
CPC classification number: G06F3/0641, G06F3/0604, G06F3/0608, G06F3/0611, G06F3/0619, G06F3/064, G06F3/0643, G06F3/067, G06F3/0674, G06F3/0676, G06F17/30182
Abstract: Methods, systems, and computer program products are provided for deduplicating data by mapping a plurality of file blocks of selected data to a plurality of logical blocks, deduplicating the plurality of logical blocks to thereby associate each logical block with a corresponding physical block of a plurality of physical blocks located on a physical memory device, two or more of the corresponding physical blocks being non-contiguous with each other, determining whether one or more of the corresponding physical blocks are one or more frequently accessed physical blocks being accessed at a frequency above a threshold frequency and being referred to by a common set of applications, and relocating data stored at the one or more frequently accessed physical blocks to different ones of the plurality of physical blocks, the different ones of the plurality of physical blocks being physically contiguous.
-
Publication No.: US20140359244A1
Publication date: 2014-12-04
Application No.: US13909050
Filing date: 2013-06-03
Applicant: International Business Machines Corporation
Inventor: David D. Chambliss, Mihail C. Constantinescu, Joseph S. Glider, Bhushan P. Jain, Maohua Lu
IPC: G06F3/06
CPC classification number: G06F17/30156, G06F3/0604, G06F3/0641, G06F3/0647, G06F3/0683, G06F11/1453
Abstract: Migrating a sub-volume in data storage with at least two de-duplication domains, each of the domains having at least one sub-volume. A first sub-volume is assigned to a de-duplication domain and a first content summary is computed for the first sub-volume. Similarly, a second sub-volume is assigned to a second de-duplication domain and a second content summary is computed for the second sub-volume. A first content affinity is calculated between the first sub-volume and a third sub-volume, and a second content affinity is calculated between the second sub-volume and the third sub-volume. A domain placement is selected for the third sub-volume based on comparison of the first content affinity and the second content affinity.
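The placement decision in the abstract above compares the affinity of a third sub-volume to sub-volumes already resident in each domain. A minimal sketch follows, assuming the content summary is simply the set of chunk fingerprints and the affinity is Jaccard overlap. Both choices, along with the function names and the domain labels, are hypothetical; a practical summary would likely be a sample or sketch rather than the full fingerprint set.

```python
import hashlib
from typing import Dict, FrozenSet, Iterable


def content_summary(chunks: Iterable[bytes]) -> FrozenSet[bytes]:
    """Summarize a sub-volume as the set of its chunk fingerprints."""
    return frozenset(hashlib.sha256(c).digest() for c in chunks)


def affinity(a: FrozenSet[bytes], b: FrozenSet[bytes]) -> float:
    """Content affinity as Jaccard overlap between two summaries."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def select_domain(resident: Dict[str, FrozenSet[bytes]],
                  candidate: FrozenSet[bytes]) -> str:
    """Pick the domain whose resident sub-volume has the highest affinity."""
    return max(resident, key=lambda d: affinity(resident[d], candidate))


# Usage: the third sub-volume shares two chunks with the first sub-volume.
first = content_summary([b"a", b"b", b"c"])
second = content_summary([b"x", b"y"])
third = content_summary([b"a", b"b", b"z"])
placement = select_domain({"domain-1": first, "domain-2": second}, third)
```

Placing the third sub-volume with the higher-affinity neighbor is what lets its shared chunks be deduplicated within a single domain.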
-
Publication No.: US20140333457A1
Publication date: 2014-11-13
Application No.: US14015228
Filing date: 2013-08-30
Applicant: International Business Machines Corporation
Inventor: David D. Chambliss, Mihail C. Constantinescu, Joseph S. Glider, Dilip N. Simha
IPC: H03M7/30
CPC classification number: H03M7/3084, H03M7/6005
Abstract: Aspects of the invention are provided for decoding a selected span of data within a compressed code stream. A selection of data within the compressed code stream from an arbitrary position is presented for decompression. The arbitrary position is the starting point in the compressed code stream for decompression, and a phrase within the compressed code stream containing the starting point is identified. From the arbitrary starting point, a back pointer may provide direction to the literal. The literal is extracted as a decoding of the compressed data associated with the starting point.
-
Publication No.: US08823557B1
Publication date: 2014-09-02
Application No.: US13891241
Filing date: 2013-05-10
Applicant: International Business Machines Corporation
Inventor: David D. Chambliss, Mihail C. Constantinescu, Joseph S. Glider, Dilip N. Simha
CPC classification number: H03M7/3084, H03M7/6005
Abstract: Aspects of the invention are provided for decoding a selected span of data within a compressed code stream. A selection of data within the compressed code stream from an arbitrary position is presented for decompression. The arbitrary position is the starting point in the compressed code stream for decompression, and a phrase within the compressed code stream containing the starting point is identified. From the arbitrary starting point, a back pointer may provide direction to the literal. The literal is extracted as a decoding of the compressed data associated with the starting point.