Abstract:
Various embodiments are generally directed to techniques for reducing the time required for a node to take over for a failed node or to boot. An apparatus includes an access component to retrieve a first metadata from a storage device coupled to a first D-module of a first node during boot, the first metadata generated from a first mutable metadata portion and an immutable metadata portion and specifying a first address of a second D-module of a second node; a replication component to contact the second D-module at the first address; and a generation component to, in response to failure of the contact, request a second mutable metadata portion from an N-module of the first node and generate a second metadata from the second mutable metadata portion and the immutable metadata portion, the second mutable metadata portion specifying a second address of the second D-module.
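The fallback described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Metadata` class, the `partner_address` key, and the `try_contact` and `fetch_mutable` callbacks are hypothetical stand-ins for the stored metadata, the address of the second D-module, the contact attempt, and the request to the N-module.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Metadata:
    immutable: dict  # portion assumed not to change between boots
    mutable: dict    # portion that may go stale, e.g. the partner address

    @property
    def partner_address(self) -> str:
        # "partner_address" is a hypothetical key name for this sketch.
        return self.mutable["partner_address"]


def boot_metadata(stored: Metadata,
                  try_contact: Callable[[str], bool],
                  fetch_mutable: Callable[[], dict]) -> Metadata:
    """Return metadata whose address can be used to reach the partner."""
    # Fast path: the metadata retrieved from storage still works.
    if try_contact(stored.partner_address):
        return stored
    # Contact failed: request a fresh mutable portion (hypothetically from
    # the N-module) and combine it with the unchanged immutable portion.
    return Metadata(immutable=stored.immutable, mutable=fetch_mutable())
```

Under these assumptions, only the mutable portion is re-requested on failure, and the immutable portion is reused as-is, which is where the time saving would come from.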
Abstract:
Various embodiments are directed to techniques for coordinating at least partially parallel performance and cancellation of data access commands between nodes of a storage cluster system. An apparatus may include a processor component of a first node coupled to a first storage device storing client device data; an access component to perform replica data access commands of replica command sets on the client device data, each replica command set assigned a set ID; a communications component to analyze a set ID included in a network packet to determine whether a portion of a replica command set in the network packet is redundant, and to reassemble the replica command set from the portion if the portion is not redundant; and an ordering component to provide the communications component with set IDs of replica command sets for which the access component has fully performed the set of replica data access commands.
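As a rough illustration of the set ID check, the sketch below drops packet portions whose set ID has already been reported as fully performed and reassembles the rest. The class name, the `(set_id, index, total, payload)` packet layout, and the method names are assumptions made for this example, not details from the abstract.

```python
class ReplicaReceiver:
    """Hypothetical stand-in for the communications component."""

    def __init__(self):
        self.completed = set()  # set IDs reported as fully performed
        self.partial = {}       # set_id -> {portion index: payload}

    def mark_performed(self, set_id):
        # Called by the (hypothetical) ordering component once every
        # command of the replica command set has been performed.
        self.completed.add(set_id)
        self.partial.pop(set_id, None)

    def receive(self, set_id, index, total, payload):
        """Return the reassembled command set when complete, else None."""
        if set_id in self.completed:
            return None  # redundant: this set was already performed
        parts = self.partial.setdefault(set_id, {})
        if index in parts:
            return None  # redundant: this portion was already received
        parts[index] = payload
        if len(parts) == total:
            del self.partial[set_id]
            return [parts[i] for i in range(total)]
        return None
```

For example, two portions of set 7 arriving out of order are reassembled in index order, and a retransmission of either portion after `mark_performed(7)` is discarded.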
Abstract:
Various embodiments are generally directed to techniques for preparing to respond to failures in performing a data access command to modify client device data in a storage cluster system. An apparatus may include a processor component of a first node coupled to a first storage device; an access component to perform a command on the first storage device; a replication component to exchange a replica of the command with a second node via a communications session formed between the first and second nodes to enable at least partially parallel performance of the command by the first and second nodes; and a multipath component to change a state of the communications session from inactive to active to enable the exchange of the replica based on an indication of a failure within a third node that precludes performance of the command by the third node. Other embodiments are described and claimed.
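The state change described here can be sketched as a pre-formed session that refuses traffic until a failure indication arrives. The `ReplicationSession` class, the `handle_failure` function, and the node names are hypothetical and serve only to show the inactive-to-active transition.

```python
class ReplicationSession:
    """Hypothetical session formed in advance between two nodes."""

    def __init__(self, peer):
        self.peer = peer
        self.state = "inactive"  # formed ahead of time, but not yet used
        self.sent = []

    def send(self, replica):
        if self.state != "active":
            raise RuntimeError("session to %s is inactive" % self.peer)
        self.sent.append(replica)


def handle_failure(session, failed_node, watched_node):
    # Activate the standby session only when the failure indication
    # concerns the node this session is meant to stand in for.
    if failed_node == watched_node:
        session.state = "active"
```

Forming the session before it is needed and only flipping its state on failure is the design choice the abstract points at: the first node does not have to negotiate a new session at the moment the third node fails.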
Abstract:
Various embodiments are generally directed to techniques for handling errors affecting the at least partially parallel performance of data access commands between nodes of a storage cluster system. An apparatus may include a processor component of a first node; an access component to perform a command received from a client device via a network to alter client device data stored in a first storage device coupled to the first node; a replication component to transmit a replica of the command to a second node via the network to enable performance of the replica by the second node at least partially in parallel; an error component to retry transmission of the replica based on a failure indicated by the second node; and a status component to select a status indication to transmit to the client device based on the indication of failure and the results of retrying transmission of the replica.
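A minimal sketch of the retry-then-report behaviour follows. The retry limit and the three status strings are invented for illustration; the abstract does not say which status indications exist or how many retries are attempted.

```python
def replicate_with_retry(send_replica, max_retries=3):
    """Send a replica, retrying on failure, and pick a status for the client.

    send_replica is a hypothetical callable returning True on success and
    False when the second node indicates a failure.
    Returns a (status, attempts) tuple.
    """
    attempts = 0
    while True:
        attempts += 1
        if send_replica():
            status = "success" if attempts == 1 else "success_after_retry"
            return status, attempts
        if attempts > max_retries:
            # The first attempt plus max_retries retries have all failed.
            return "replication_failed", attempts
```

With `max_retries=2`, a transmission that always fails is attempted three times in total before the failure status is selected.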
Abstract:
Various embodiments are generally directed to an apparatus and method for receiving information to write on a clustered system comprising at least a first cluster and a second cluster; determining that a failure event has occurred on the clustered system, creating unsynchronized information comprising at least one of inflight information and dirty region information; and performing a resynchronization operation to synchronize the unsynchronized information on the first cluster and the second cluster based on log information in at least one of an inflight tracker log for the inflight information and a dirty region log for the dirty region information.
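One way to picture log-driven resynchronization is to copy only the regions named in the two logs instead of the whole volume. The sketch below assumes, purely for illustration, that both logs record region indices, that regions have a fixed size, and that one cluster's copy is treated as the source; none of these details come from the abstract.

```python
def resync(source, target, inflight_log, dirty_region_log, region_size=4):
    """Copy every logged region from source to target.

    source and target are bytearrays standing in for the two clusters'
    copies of the data. Returns the sorted list of region indices copied.
    """
    # A region needs resynchronizing if either log mentions it.
    regions = sorted(set(inflight_log) | set(dirty_region_log))
    for region in regions:
        start = region * region_size
        end = start + region_size
        target[start:end] = source[start:end]
    return regions
```

In this model, a region that appears in both logs is copied once, and regions absent from both logs are left untouched.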
Abstract:
Described herein are a method and apparatus for servicing software components of nodes of a cluster storage system. During data-access sessions with clients, client IDs and file handles for accessing files are produced and stored both on the clients and, as session data, on each node. A serviced node is taken offline, which disconnects its network connections to clients. Each disconnected client is configured to retain its client ID and file handles and to attempt reconnection. Session data of the serviced node is made available to a partner node by transferring the session data to the partner node. After clients have reconnected to the partner node, they may use the retained client IDs and file handles to continue a data-access session with the partner node, since the partner node has access to the session data of the serviced node and thus will recognize and accept the retained client IDs and file handles.
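The handoff can be sketched with a small in-memory model. The `Node` class, its `sessions` dictionary, and the `take_offline` function are hypothetical and only illustrate why a partner node that holds the transferred session data can accept a retained client ID and file handle.

```python
class Node:
    """Hypothetical cluster node holding per-client session data."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # client_id -> set of file handles

    def open_session(self, client_id, file_handle):
        self.sessions.setdefault(client_id, set()).add(file_handle)

    def accepts(self, client_id, file_handle):
        # A retained ID/handle pair is accepted only if this node
        # holds matching session data.
        return file_handle in self.sessions.get(client_id, set())


def take_offline(serviced, partner):
    # Transfer the serviced node's session data to the partner node
    # before the serviced node stops answering clients.
    for client_id, handles in serviced.sessions.items():
        partner.sessions.setdefault(client_id, set()).update(handles)
    serviced.sessions = {}
```

In this model, a client that reconnects to the partner with its retained client ID and file handle is accepted after `take_offline` runs, and would have been rejected before it.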