Abstract:
A system, method and article of manufacture are provided for the automatic recovery from errors encountered during an automated Licensed Internal Code (LIC) update on a storage controller. The present invention functions with a concurrent or nonconcurrent automated LIC update. The automated recovery from many error conditions is transparent to the attached host system and on-site service personnel, resulting in an improvement in the LIC update process.
Abstract:
Provided is a method, system, and program for processing Input/Output (I/O) requests to a storage network including at least one storage device and at least two adaptors, wherein each adaptor is capable of communicating I/O requests to the at least one storage device. An error is detected in a system including a first adaptor, wherein the first adaptor is capable of communicating on the network after the error is detected. In response to detecting the error, a master switch timer is started that is less than a system timeout period if the first adaptor is the master. An error recovery procedure in the system including the first adaptor would be initiated after the system timeout period has expired. An operation is initiated to designate another adaptor in the storage network as the master if the first adaptor is the master in response to detecting an expiration of the master switch timer.
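The timer relationship described above can be illustrated with a minimal sketch. It is not the patented implementation: the `Adaptor` class, both timeout constants, and the rule for choosing the replacement master (the first adaptor with no recorded error) are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

SYSTEM_TIMEOUT = 30.0        # hypothetical: seconds before system error recovery starts
MASTER_SWITCH_TIMEOUT = 5.0  # hypothetical: must be shorter than SYSTEM_TIMEOUT


@dataclass
class Adaptor:
    name: str
    is_master: bool = False
    error_detected_at: Optional[float] = None  # time the error was seen, if any


def on_error(adaptor: Adaptor, now: float) -> Optional[float]:
    """Record the error; return the master switch deadline if the adaptor is master."""
    adaptor.error_detected_at = now
    if adaptor.is_master:
        return now + MASTER_SWITCH_TIMEOUT
    return None


def on_master_switch_expiry(failed: Adaptor, adaptors: List[Adaptor]) -> Optional[Adaptor]:
    """Hand the master role to another adaptor that has no recorded error."""
    if not failed.is_master:
        return None
    for candidate in adaptors:
        if candidate is not failed and candidate.error_detected_at is None:
            failed.is_master = False
            candidate.is_master = True
            return candidate
    return None  # no healthy adaptor available; the master role is unchanged
```

Because the master switch deadline falls before the system timeout, the master role can move to another adaptor before the error recovery procedure begins on the system containing the first adaptor.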
Abstract:
An adaptive hierarchical cache management system for improving effective cache hit ratios by eliminating unnecessary duplicate cache entries in two coupled cache memories. When a cached Storage Controller (SC) is coupled to a Cached Storage Drawer (CSD), the hierarchical coupling of the SC cache memory and CSD cache memory unnecessarily duplicates cache entries during normal operation. A Conditional Purge procedure purges duplicate lines from the CSD cache subject to a DASD activity threshold. A Prenotify Intent parameter allows the SC to request restaging of the purged cache entry preparatory to fast write or LRU demotion in the SC cache. The new procedures substantially and transparently improve the combined caching efficiency without significant new hardware or software overhead.
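As a rough illustration of the two procedures named above, the sketch below models the CSD cache as a dictionary. The abstract does not state the direction of the threshold test, so the rule used here (purge only when DASD activity is below the threshold, when a later restage is assumed to be cheap) is an assumption, as are the class and method names.

```python
class DrawerCache:
    """Toy model of the CSD cache sitting in front of DASD (all names hypothetical)."""

    def __init__(self, backing, activity_threshold):
        self.backing = backing                  # stands in for data on DASD
        self.lines = {}                         # cache lines currently held in the CSD
        self.activity_threshold = activity_threshold

    def read_for_controller(self, track, dasd_activity):
        """Return a line to the SC, then conditionally purge the duplicate."""
        if track not in self.lines:
            self.lines[track] = self.backing[track]   # stage from DASD
        data = self.lines[track]
        # Conditional Purge (assumed rule): the SC now caches this line, so
        # drop the duplicate unless DASD is too busy to restage it later.
        if dasd_activity < self.activity_threshold:
            del self.lines[track]
        return data

    def prenotify_intent(self, track):
        """SC signals an upcoming fast write or LRU demotion: restage the line."""
        if track not in self.lines:
            self.lines[track] = self.backing[track]
```

In this model a purged line costs one extra DASD read when the SC later prenotifies its intent, which is the trade-off the activity threshold is meant to bound.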
Abstract:
Provided is a system for processing Input/Output (I/O) requests to a storage network including at least one storage device and at least two adaptors, wherein each adaptor is capable of communicating I/O requests to the at least one storage device. An error is detected in a system including a first adaptor, wherein the first adaptor is capable of communicating on the network after the error is detected. In response to detecting the error, a master switch timer is started that is less than a system timeout period if the first adaptor is the master. An error recovery procedure in the system including the first adaptor would be initiated after the system timeout period has expired. An operation is initiated to designate another adaptor in the storage network as the master if the first adaptor is the master in response to detecting an expiration of the master switch timer.
Abstract:
A system and method for reducing device wait time in response to a host initiated write operation modifying a data block. In a first embodiment, the system includes a host computer channel connected to a storage controller which has cache memory and a nonvolatile storage buffer. The second embodiment is an identical system, except that its storage controller has no nonvolatile storage buffer. The controller in either embodiment is coupled to a cache storage drawer containing a plurality of DASD devices for implementing a RAID parity data protection scheme and for permanently storing data. The drawer has nonvolatile cache memory which is used for accepting data destaged from controller cache. In the first embodiment, no commit reply is sent to the controller to indicate that data has been written to DASD. Instead, a status information block is created to indicate that the data has been destaged from controller cache but is not committed. The status information is stored in directory means attached to the controller. The system uses this information to create a list of data which is in the not-committed state. In this way, data can be committed according to a least recently used (LRU) cache management algorithm, rather than requiring a synchronous commit, which is inefficient because it requires waiting on a commit response and ties up nonvolatile storage space allocated to back-up copies of cache data. In the second embodiment, directory means attached to the controller stores information about status blocks that may be modified or unmodified. The status information is used to eliminate wait times associated with waiting for data to be written to the HDAs.
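The deferred-commit idea of the first embodiment can be sketched as a directory that records destaged blocks and commits them later in LRU order. This is only an illustrative model: the `ControllerDirectory` class, its method names, and the `write_to_dasd` callback are hypothetical.

```python
from collections import OrderedDict


class ControllerDirectory:
    """Tracks blocks destaged from controller cache but not yet committed."""

    def __init__(self):
        # block id -> status, ordered from least to most recently used
        self.not_committed = OrderedDict()

    def record_destage(self, block_id):
        """Mark a block as destaged without waiting for a commit reply."""
        self.not_committed.pop(block_id, None)
        self.not_committed[block_id] = "NOT_COMMITTED"  # now most recently used

    def commit_least_recently_used(self, write_to_dasd):
        """Commit the oldest not-committed block; return its id, or None if empty."""
        if not self.not_committed:
            return None
        block_id, _status = self.not_committed.popitem(last=False)
        write_to_dasd(block_id)  # hypothetical callback that performs the commit
        return block_id
```

The host write path only calls `record_destage`, so it never blocks on a commit response; commits happen later, oldest entry first.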
Abstract:
A dual cluster storage server maintains track control blocks (TCBs) in a data structure to describe the data stored in cache in corresponding track images or segments. Following a cluster failure and reboot, the surviving cluster uses the TCBs to rebuild data structures such as a scatter table, which is a hash table that identifies a location of a track image, and a least recently used (LRU)/most recently used (MRU) list for the track images. This allows the cache data to be recovered. The TCBs describe whether the data in the track images is modified and valid, and describe forward and backward pointers for the data in the LRU/MRU lists. A separate non-volatile memory that is updated as the track images are updated is used to verify the integrity of the TCBs.
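A minimal sketch of the rebuild step follows, assuming each TCB carries a track identifier, a segment location, validity and modified flags, and forward/backward links expressed as indices into the TCB array. The field names and the index-based links are assumptions; the abstract does not give the TCB layout.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TCB:
    track_id: int
    segment: int             # location of the track image in cache
    valid: bool
    modified: bool
    forward: Optional[int]   # index of the next TCB toward the MRU end
    backward: Optional[int]  # index of the previous TCB toward the LRU end


def rebuild(tcbs: List[TCB]) -> Tuple[Dict[int, int], List[int]]:
    """Rebuild the scatter (hash) table and the LRU-to-MRU order from TCBs."""
    scatter = {t.track_id: t.segment for t in tcbs if t.valid}

    # The LRU end is taken to be the valid TCB with no backward pointer.
    heads = [i for i, t in enumerate(tcbs) if t.valid and t.backward is None]
    order: List[int] = []
    seen = set()
    i = heads[0] if heads else None
    while i is not None and i not in seen:   # 'seen' guards against pointer loops
        seen.add(i)
        if tcbs[i].valid:
            order.append(tcbs[i].track_id)
        i = tcbs[i].forward
    return scatter, order
```

A real recovery path would also check each TCB against the separate non-volatile memory before trusting it; that verification is omitted here.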
Abstract:
A system and method for changing the number of logical volumes in a drawer in a rack in a direct access storage device subsystem are disclosed. The method and system are able to change the number of logical volumes without disrupting access to the other logical volumes in the rack. Channel connection addresses, which are logical volume addresses as known by the CPUs, are freed by removing the old drawer and are then reused. If the new drawer has more logical volumes than the old drawer, the next unused channel connection addresses are used with the new drawer. In a subsystem having a storage controller for providing control for a plurality of direct access storage devices, the logical volumes are spread across multiple physical devices. The storage controller maintains configuration data for the entire subsystem in redundant, non-volatile storage locations reserved specifically for its use. Each logical volume address for the rack is set by the drawer location and the logical sequence of the volumes within the drawer. As drawers are installed, the control unit sequentially assigns the volume addresses for the control unit and the channel connection addresses for the CPUs.
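The address reuse rule can be sketched as follows. The function name and the interpretation of "next unused" as the lowest-numbered addresses not already in use are assumptions for illustration.

```python
from typing import Iterable, List


def assign_addresses(in_use: Iterable[int], freed: Iterable[int],
                     new_volume_count: int) -> List[int]:
    """Pick channel connection addresses for the volumes of a replacement drawer.

    in_use: addresses held by the other drawers (must not include 'freed').
    freed:  addresses released by removing the old drawer.
    """
    # Reuse the freed addresses first.
    assigned = sorted(freed)[:new_volume_count]
    taken = set(in_use) | set(assigned)

    # If the new drawer has more volumes, take the next unused addresses.
    candidate = 0
    while len(assigned) < new_volume_count:
        if candidate not in taken:
            assigned.append(candidate)
            taken.add(candidate)
        candidate += 1
    return assigned
```

For example, with addresses 0, 1, 4 and 5 in use and 2 and 3 freed, a four-volume replacement drawer receives 2, 3, 6 and 7, so the other drawers keep their addresses untouched.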
Abstract:
A system, method and article of manufacture are provided for the automatic recovery from errors encountered during an automated Licensed Internal Code (LIC) update on a storage controller. The present invention functions with a concurrent or nonconcurrent automated LIC update. The automated recovery from many error conditions is transparent to the attached host system and on-site service personnel, resulting in an improvement in the LIC update process.
Abstract:
An improved storage controller and method for storing and recovering data are disclosed. The storage controller includes a first cluster for directing data from a host computer to a storage device and a second cluster for directing data from a host computer to a storage device. A first cache memory is connected to the first cluster and a second cache memory is connected to the second cluster. A first preserved area of memory is connected to the first cluster and a second preserved area of memory is connected to the second cluster. Data is directed to the first cache and backed up to the second preserved area in a normal operating mode. Similarly, data is directed to the second cache and backed up to the first preserved area in the normal operating mode. In the event of a power failure or comparable event, data from the first and second preserved areas are transferred to, and stored on, a first storage device. Additionally, data from the first and second preserved areas are transferred to, and stored on, a second storage device. Thus, upon resumption of normal operation, if one of the clusters subsequently fails to resume normal operations, data from the failed cluster is available through the operating cluster.
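The cross-cluster backup and dual dump described above can be modeled in a few lines. This is a toy sketch under stated assumptions: the `Cluster` class, the function names, and the use of dictionaries for caches, preserved areas, and storage devices are all hypothetical.

```python
class Cluster:
    def __init__(self, name):
        self.name = name
        self.cache = {}      # data directed to this cluster
        self.preserved = {}  # backup copies of the partner cluster's cache data


def write(owner, partner, key, data):
    """Normal mode: store in the owner's cache, back up in the partner's preserved area."""
    owner.cache[key] = data
    partner.preserved[key] = data


def on_power_failure(first, second, devices):
    """Copy both preserved areas to every storage device (dicts stand in for devices)."""
    for device in devices:
        device[first.name] = dict(first.preserved)
        device[second.name] = dict(second.preserved)
```

Since each device ends up holding both preserved areas, a cluster that resumes operation alone can read its failed partner's data from either device.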