Abstract:
A method and system for providing information management of data from hosted services receives information management policies for a hosted account of a hosted service, requests data associated with the hosted account from the hosted service, receives data associated with the hosted account from the hosted service, and provides a preview version of the received data to a computing device. In some examples, the system indexes the received data to associate the received data with a user of an information management system, and/or provides index information related to the received data to the computing device.
Abstract:
Various systems and methods may be used for performing data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods for content indexing data stored within a cloud environment may facilitate later searching, including collaborative searching. Methods for performing containerized deduplication may reduce the strain on a system namespace, effectuate cost savings, etc. Methods may identify suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, the systems and methods may be used for providing a cloud gateway and a scalable data object store within a cloud environment.
Abstract:
An illustrative cloud-based air-gapped data storage management (destination) system obtains authorized access to other (source) systems' backup copies, replicates those copies within the destination system, parses supplemental metadata included in the source backup copies, and integrates the replica copies into the destination system as though natively created there. Replica copies are integrated as backup copies without first restoring the source backup copies to a native data format. The source system lacks knowledge of or connectivity with the destination system, thus maintaining an “air gap” between the systems. The destination system preferably operates in a cloud computing environment. The destination system uses supplemental metadata from the replica copies to re-create or mimic the source's computing environment and to restore backed up data from the replica copies. The destination system also operates as an autonomous analytics engine, applying value-added services to backed up data pulled from source system(s).
Abstract:
A deduplicated storage system is provided according to certain embodiments that uses one or more mechanisms to update the deduplication database and remove records corresponding to data blocks that have been or will be erased from the secondary copies, without using or tracking reference counting values. Some embodiments described herein use a secondary table to identify the corresponding records from the primary table that can be removed and/or moved to another table for storing “zero-reference” data blocks. In some embodiments, the system then traverses the “zero-reference” table and removes those data blocks from secondary storage devices.
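The pruning flow in this abstract can be illustrated with a minimal sketch. The abstract does not specify what the secondary table contains, so the layout below is an assumption: the hypothetical `secondary` table maps each backup job to the block signatures it uses, `primary` maps signatures to storage locations, and `zero_ref` holds records awaiting deletion. All function and variable names are illustrative only.

```python
def prune_job(primary, secondary, zero_ref, job_id):
    """Move records used only by an erased job into the zero-reference table.

    No per-block reference counts are kept; usage is derived from the
    (assumed) secondary table of job -> set of block signatures.
    """
    released = secondary.pop(job_id, set())
    # Signatures still needed by any remaining job.
    still_used = set().union(*secondary.values())
    for sig in released - still_used:
        if sig in primary:
            zero_ref[sig] = primary.pop(sig)


def purge_zero_ref(zero_ref, storage):
    """Traverse the zero-reference table and delete the blocks from storage."""
    for sig, location in list(zero_ref.items()):
        storage.pop(location, None)
        del zero_ref[sig]


# Example: job1 is erased; block s2 survives because job2 still uses it.
primary = {"s1": "loc1", "s2": "loc2"}
secondary = {"job1": {"s1", "s2"}, "job2": {"s2"}}
storage = {"loc1": b"x", "loc2": b"y"}
zero_ref = {}

prune_job(primary, secondary, zero_ref, "job1")
purge_zero_ref(zero_ref, storage)
```

Separating the move into `zero_ref` from the physical delete mirrors the two stages the abstract describes, so the delete pass could run later or in the background.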
Abstract:
Data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, are performed within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, systems and methods for providing a cloud gateway and a scalable data object store within a cloud environment are disclosed, along with other features.
Abstract:
Disclosed methods and systems leverage resources in a storage management system to restore a selected backup to a production site. The backup is partitioned into blocks with associated signatures. The production site may have blocks that have not changed since the backup occurred, so those blocks do not need to be restored. Block signatures from the production site are compared with block signatures from the incremental backup to identify blocks that need to be restored. Efficiency may be achieved by synchronizing the replacement blocks from more easily accessible locations, where available, before synchronizing from less accessible locations. In some embodiments, a user may specify the location of the site with the replacement blocks.
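The signature comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: fixed-size blocks, SHA-256 as the signature, and block sources modeled as dictionaries ordered from most to least accessible are all choices made here for the example, and every name is hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # assumed, unrealistically small so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Partition data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def signature(block: bytes) -> str:
    """Signature of a block; SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(block).hexdigest()


def blocks_to_restore(production: bytes, backup_signatures: list[str]) -> list[int]:
    """Return indexes of backup blocks whose signature differs at the production site."""
    prod_sigs = [signature(b) for b in split_blocks(production)]
    return [
        index
        for index, backup_sig in enumerate(backup_signatures)
        if index >= len(prod_sigs) or prod_sigs[index] != backup_sig
    ]


def restore(production: bytes, backup_signatures: list[str], sources: list[dict]) -> bytes:
    """Rebuild the backup image, fetching only changed blocks.

    `sources` is ordered from most to least accessible; each maps
    signature -> block bytes.
    """
    prod_blocks = split_blocks(production)
    needed = set(blocks_to_restore(production, backup_signatures))
    result = []
    for index, sig in enumerate(backup_signatures):
        if index not in needed:
            result.append(prod_blocks[index])  # unchanged, reuse in place
            continue
        for source in sources:
            if sig in source:
                result.append(source[sig])
                break
        else:
            raise LookupError(f"block {index} not found in any source")
    return b"".join(result)


backup = b"AAAABBBBCCCC"
backup_sigs = [signature(b) for b in split_blocks(backup)]
production = b"AAAAXXXXCCCC"  # only the middle block has changed

local_cache = {}  # more accessible source, lacks the block
remote_copy = {signature(b"BBBB"): b"BBBB"}  # less accessible source
restored = restore(production, backup_sigs, [local_cache, remote_copy])
```

Here only the second block is fetched, and the lookup falls through to the less accessible source because the more accessible one does not hold it.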
Abstract:
An improved information management system that implements a staging area or cache to temporarily store primary data in a native format before the primary data is converted into secondary copies in a secondary format is described herein. For example, the improved information management system can include various media agents that each include one or more high speed drives. When a client computing device provides primary data for conversion into secondary copies, the primary data can initially be stored in the native format in the high speed drive(s). If the client computing device then submits a request for the primary data, the media agent can simply retrieve the primary data from the high speed drive(s) and transmit the primary data to the client computing device. Because the primary data is already in the native format, no conversion operations are performed by the media agent, thereby reducing the restore delay.
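A minimal sketch of the staging idea follows. It assumes the high speed drive can be modeled as an in-memory dictionary and that the secondary format is simply zlib-compressed data; both are stand-ins chosen for illustration, and the `MediaAgent` class and its method names are hypothetical rather than taken from the described system.

```python
import zlib


class MediaAgent:
    """Toy media agent with a native-format staging area."""

    def __init__(self):
        self.staging = {}    # stands in for the high speed drive (native format)
        self.secondary = {}  # stands in for secondary storage (converted format)
        self.conversions_on_restore = 0

    def backup(self, key: str, data: bytes) -> None:
        """Stage primary data in native format before any conversion."""
        self.staging[key] = data

    def flush(self, key: str, evict: bool = False) -> None:
        """Convert staged data into a secondary copy; optionally free the staging slot."""
        self.secondary[key] = zlib.compress(self.staging[key])
        if evict:
            del self.staging[key]

    def restore(self, key: str) -> bytes:
        """Serve from staging when possible, avoiding the conversion step."""
        if key in self.staging:
            return self.staging[key]
        self.conversions_on_restore += 1
        return zlib.decompress(self.secondary[key])


agent = MediaAgent()
agent.backup("report", b"quarterly numbers")
agent.flush("report")
fast = agent.restore("report")   # served from staging, no conversion
agent.flush("report", evict=True)
slow = agent.restore("report")   # converted back from the secondary copy
```

The counter makes the benefit visible: the first restore performs no conversion, while the second, after eviction, performs one.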
Abstract:
Described in detail herein are systems and methods for single instancing blocks of data in a data storage system. For example, the data storage system may include multiple computing devices (e.g., client computing devices) that store primary data. The data storage system may also include a secondary storage computing device, a single instance database, and one or more storage devices that store copies of the primary data (e.g., secondary copies, tertiary copies, etc.). The secondary storage computing device receives blocks of data from the computing devices and accesses the single instance database to determine whether the blocks of data are unique (meaning that no instances of the blocks of data are stored on the storage devices). If a block of data is unique, the single instance database stores it on a storage device. If not, the secondary storage computing device can avoid storing the block of data on the storage devices.
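The uniqueness check can be illustrated with a short sketch. It assumes the single instance database is a dictionary keyed by a SHA-256 digest of each block and that the storage device is a simple list; the `SingleInstanceStore` class and its fields are hypothetical names introduced for this example, and the class itself performs the write for brevity.

```python
import hashlib


class SingleInstanceStore:
    """Toy secondary storage component that stores each distinct block once."""

    def __init__(self):
        self.database = {}  # digest -> location of the stored instance
        self.storage = []   # stands in for the storage device

    def write_block(self, block: bytes) -> tuple[int, bool]:
        """Return (location, stored), where stored is False for a duplicate block."""
        digest = hashlib.sha256(block).hexdigest()
        if digest in self.database:
            # An instance already exists; skip the write and reuse its location.
            return self.database[digest], False
        self.storage.append(block)
        location = len(self.storage) - 1
        self.database[digest] = location
        return location, True


store = SingleInstanceStore()
first = store.write_block(b"payroll")
second = store.write_block(b"invoices")
repeat = store.write_block(b"payroll")  # duplicate, not stored again
```

After these three writes the storage list holds two blocks, and the repeated block resolves to the location of the first instance.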