Abstract:
A method and system for increasing server cluster availability by requiring at a minimum only one node and a quorum replica set of replica members to form and operate a cluster. Replica members maintain cluster operational data. A cluster operates when one node possesses a majority of replica members, which ensures that any new or surviving cluster includes consistent cluster operational data via at least one replica member from the immediately prior cluster. Arbitration provides exclusive ownership by one node of the replica members, including at cluster formation, and when the owning node fails. Arbitration uses a fast mutual exclusion algorithm and a reservation mechanism to challenge for and defend the exclusive reservation of each member. A quorum replica set algorithm brings members online and offline with data consistency, including updating unreconciled replica members, and ensures consistent read and update operations.
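The majority rule above can be illustrated with a minimal sketch. The `Replica` class, its `epoch` counter, and the helper names are hypothetical and are not taken from the abstract; copying the epoch stands in for updating an unreconciled replica member.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    name: str
    epoch: int  # hypothetical counter, bumped on each committed update


def has_quorum(owned: int, total: int) -> bool:
    """True when the owned replicas form a strict majority of the set."""
    return owned > total // 2


def freshest(owned: List[Replica]) -> Replica:
    """Pick the owned replica carrying the most recent data."""
    return max(owned, key=lambda r: r.epoch)


replica_set = [Replica("r1", 7), Replica("r2", 7), Replica("r3", 5)]
owned = replica_set[1:]  # assume this node won arbitration for r2 and r3

if has_quorum(len(owned), len(replica_set)):
    source = freshest(owned)
    for r in owned:
        r.epoch = source.epoch  # stand-in for copying data to a stale member
```

Because any two majorities of the same replica set share at least one member, the node that owns a majority always holds at least one replica from the prior cluster.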
Abstract:
Systems and methods are described that establish and maintain a virtual session between a client and one or more database servers. A database server establishes a first session with a client wherein establishing the virtual session with the client comprises associating a virtual session identifier (ID) with the first session, generates state information in association with the first session, and stores the state information in a repository in association with the virtual session ID. After the first session fails, a same or different database server establishes a second session with the client wherein establishing the second session with the client comprises receiving the virtual session ID from the client, accesses the state information that was stored in the repository in association with the virtual session ID, and associates the state information with the second session.
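A minimal sketch of the reconnect flow, assuming an in-memory dictionary as the repository. `DatabaseServer`, `REPOSITORY`, and the method names are hypothetical illustrations, not an API named in the abstract.

```python
import uuid
from typing import Dict, Optional

# Hypothetical shared repository: virtual session ID -> saved state.
REPOSITORY: Dict[str, dict] = {}


class DatabaseServer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: Dict[str, dict] = {}  # live sessions on this server

    def connect(self, virtual_id: Optional[str] = None) -> str:
        """Open a session; reuse stored state if the client presents an ID."""
        if virtual_id is None:
            virtual_id = str(uuid.uuid4())
            REPOSITORY[virtual_id] = {}
        # Associate the stored state with the new physical session.
        self.sessions[virtual_id] = dict(REPOSITORY[virtual_id])
        return virtual_id

    def set_state(self, virtual_id: str, key: str, value: str) -> None:
        self.sessions[virtual_id][key] = value
        REPOSITORY[virtual_id][key] = value  # stored under the virtual ID

    def crash(self) -> None:
        self.sessions.clear()  # all physical sessions are lost


server_a, server_b = DatabaseServer("a"), DatabaseServer("b")
vid = server_a.connect()
server_a.set_state(vid, "isolation", "snapshot")
server_a.crash()
server_b.connect(vid)  # second session picks up the stored state
```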
Abstract:
Multiple versions of a set of data objects can be maintained to allow concurrent conflicting access to the objects. Additionally, a range of acceptable timestamps for each transaction in a set of database transactions can be tracked. Conflicting access requests for an object in the set of objects can be detected, and the requests can be made by two or more conflicting transactions in the set of transactions. A range of acceptable timestamps for at least one of the conflicting transactions can be adjusted, such that an order of transaction timestamps can be maintained in accordance with a specified transaction isolation level. Such timestamp range adjustment can frequently permit conflicting read and write accesses to proceed concurrently. When concurrent access cannot be allowed while maintaining such an order of transaction timestamps, in many cases one of the conflicting accesses can be blocked instead of aborting one of the transactions.
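One way to picture the range adjustment is the sketch below. It assumes integer timestamps and a midpoint split policy; both are illustrative choices, and `Txn` and `order_before` are hypothetical names rather than the patented procedure.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    lo: int  # earliest acceptable commit timestamp
    hi: int  # latest acceptable commit timestamp


def order_before(first: Txn, second: Txn) -> bool:
    """Shrink both ranges so every timestamp of first precedes second.

    Returns False when no such ordering exists, in which case the caller
    would have to block or abort one of the transactions.
    """
    if first.hi < second.lo:
        return True  # already ordered, nothing to adjust
    lo_bound = max(first.lo, second.lo - 1)
    hi_bound = min(first.hi, second.hi - 1)
    if lo_bound > hi_bound:
        return False
    split = (lo_bound + hi_bound) // 2  # assumed policy: split in the middle
    first.hi = split
    second.lo = split + 1
    return True


reader = Txn("reader", 0, 100)
writer = Txn("writer", 0, 100)
ok = order_before(reader, writer)          # reader saw the old version
conflict = order_before(writer, reader)    # the opposite order is now impossible
```

After the first call the reader's range is [0, 49] and the writer's is [50, 100], so both can proceed concurrently; the second call reports that the reverse order can no longer be granted.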
Abstract:
Systems and methods for facilitating more efficient timestamping in a lazy timestamping transaction time database environment are described herein. A recovery log component can store timestamp information of a transaction in a commit record of the transaction, wherein the commit record of the transaction is included in a recovery log. A volatile reference count component can update reference count data of the transaction in a volatile timestamp table to reflect a number of records of the transaction that do not persistently include timestamp information. Further, a checkpoint component can update timestamp information for a batch of transactions, wherein the timestamp information is updated in a record of the persistent timestamp table to ensure that the timestamp information persists in the record of the persistent timestamp table before the commit record of the transaction that contains the transaction's timestamp information is deleted from the recovery log.
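The interplay of the three components might look like the following sketch. The class and method names are hypothetical, plain Python containers stand in for the recovery log and the two timestamp tables, and `stamp_record` is assumed to be called only once a record durably carries its timestamp.

```python
from typing import Dict, List, Tuple


class LazyTimestamper:
    def __init__(self) -> None:
        self.recovery_log: List[Tuple[str, int]] = []  # commit records (txn, ts)
        self.volatile_refcount: Dict[str, int] = {}    # lost on a crash
        self.persistent_table: Dict[str, int] = {}     # survives a crash

    def commit(self, txn: str, ts: int, records_written: int) -> None:
        """Store the timestamp in the commit record; no table write yet."""
        self.recovery_log.append((txn, ts))
        self.volatile_refcount[txn] = records_written

    def stamp_record(self, txn: str) -> None:
        """One more record of txn now persistently holds its timestamp."""
        self.volatile_refcount[txn] -= 1

    def checkpoint(self) -> None:
        """Persist timestamps still needed, then truncate the log."""
        for txn, ts in self.recovery_log:
            if self.volatile_refcount.get(txn, 0) > 0:
                self.persistent_table[txn] = ts
        self.recovery_log.clear()  # safe: needed timestamps are persistent


db = LazyTimestamper()
db.commit("t1", 10, records_written=2)
db.commit("t2", 20, records_written=3)
db.stamp_record("t1")
db.stamp_record("t1")   # t1 is fully stamped
db.stamp_record("t2")   # t2 still has two unstamped records
db.checkpoint()
```

Only `t2` reaches the persistent table, which is the batching benefit: fully stamped transactions never cost a table write.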
Abstract:
Logical logging to extend recovery is described. In one aspect, a dependency cycle between at least two objects is detected. The dependency cycle indicates that the two objects should be flushed simultaneously from a volatile main memory to a non-volatile memory to preserve those objects in the event of a system crash. One of the two objects is written to a stable log to break the dependency cycle. The other of the two objects is flushed to the non-volatile memory. The object that has been written to the stable log is then flushed to the non-volatile memory.
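The sequence above can be sketched for the two-object case. The representation is an assumption: an edge `(x, y)` in `deps` is taken to mean that `x` cannot be flushed unless `y` is flushed at the same time, and `flush_pair` is a hypothetical helper that handles only a two-object cycle.

```python
from typing import Dict, List, Set, Tuple


def flush_pair(a: str, b: str,
               deps: Set[Tuple[str, str]],
               memory: Dict[str, bytes],
               stable_log: List[Tuple[str, bytes]],
               disk: Dict[str, bytes]) -> None:
    """Flush a and b one at a time, logging a's value if they form a cycle."""
    if (a, b) in deps and (b, a) in deps:
        # With a's full value in the stable log, neither object needs the
        # other to be flushed at the same moment: the cycle is broken.
        stable_log.append((a, memory[a]))
        deps.discard((a, b))
        deps.discard((b, a))
    disk[b] = memory[b]  # the other object is flushed first
    disk[a] = memory[a]  # the logged object follows


memory = {"A": b"1", "B": b"2"}
deps = {("A", "B"), ("B", "A")}
stable_log: List[Tuple[str, bytes]] = []
disk: Dict[str, bytes] = {}
flush_pair("A", "B", deps, memory, stable_log, disk)
```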
Abstract:
A system and methodology that facilitate persistence for an execution state are provided. The system and methodology employ generalized “idempotent” request(s), which have the property that they execute a request only once and always return the result of that first execution should the request be repeated, so as to ensure exactly-once execution. A calling middle tier component can exploit these procedures so that it can engage in exploratory reads (which are not idempotent) yet still have its state recovered via replay based on the log at the client and the results retained by the generalized idempotent procedures provided by back end services. The system and methodology can be employed to facilitate successful replay of logless persistent component(s) (e.g., components that do not themselves log any information). To exploit generalized idempotent procedures, what a middle tier logless component can do with the results of non-idempotent exploratory reads must be circumscribed so that these results only impact arguments to the next generalized idempotent procedure invoked from the middle tier. Optionally, the system and methodology can facilitate idempotent procedure(s) that support idempotent request abort(s). When an idempotent request abort is requested, it can be identified with the request via a request identifier. Subsequent request(s) with the same request identifier can then return the same “abort” message.
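A minimal sketch of a back end service offering such requests, assuming results are cached in memory by request identifier. `IdempotentService` and its methods are hypothetical names, and undoing the effects of an aborted request is deliberately left out.

```python
from typing import Callable, Dict, Tuple

Outcome = Tuple[str, object]


class IdempotentService:
    def __init__(self) -> None:
        self._outcomes: Dict[str, Outcome] = {}  # request ID -> first outcome
        self.executions = 0                      # how often work really ran

    def request(self, request_id: str, operation: Callable[[], object]) -> Outcome:
        """Run operation once per request ID; repeats return the saved outcome."""
        if request_id not in self._outcomes:
            self.executions += 1
            self._outcomes[request_id] = ("ok", operation())
        return self._outcomes[request_id]

    def abort(self, request_id: str) -> Outcome:
        """Record an abort; later requests with this ID see the same message."""
        self._outcomes[request_id] = ("abort", None)
        return self._outcomes[request_id]


svc = IdempotentService()
first = svc.request("r1", lambda: 42)
replayed = svc.request("r1", lambda: 42)   # replay after a failure
svc.abort("r2")
aborted = svc.request("r2", lambda: 1)     # never executes
```

During replay the middle tier simply reissues the same request identifiers, so the retained outcomes rebuild its state without any log of its own.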
Abstract:
Persistent components are provided across both process and server failures, without the application programmer needing to take actions for component recoverability. Application interactions with a stateful component are transparently intercepted and stably logged to persistent storage. A “virtual” component isolates an application from component failures, permitting the mapping of a component to an arbitrary “physical” component. Component failures are detected and masked from the application. A virtual component is re-mapped to a new physical component, and the operations required to recreate a component and reinstall state up to the point of the last logged interaction are replayed from the log automatically.
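The interception and replay idea can be sketched as follows, assuming deterministic component methods and a Python list standing in for the stable log. `Counter` and `VirtualComponent` are hypothetical examples.

```python
from typing import Callable, List, Tuple


class Counter:
    """Hypothetical stateful physical component."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, n: int) -> int:
        self.value += n
        return self.value


class VirtualComponent:
    def __init__(self, factory: Callable[[], object]) -> None:
        self._factory = factory
        self._log: List[Tuple[str, tuple]] = []  # stand-in for a stable log
        self._physical = factory()

    def call(self, method: str, *args):
        """Intercept the interaction, log it, then forward it."""
        self._log.append((method, args))
        return getattr(self._physical, method)(*args)

    def recover(self) -> None:
        """Re-map to a fresh physical component and replay the log."""
        self._physical = self._factory()
        for method, args in self._log:
            getattr(self._physical, method)(*args)


vc = VirtualComponent(Counter)
vc.call("add", 2)
vc.call("add", 3)
vc.recover()               # the physical component failed and was replaced
result = vc.call("add", 1)
```

The application holds only `vc`, so the swap of the underlying `Counter` is invisible to it.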
Abstract:
A technique is described for guaranteeing recovery in a computer system, comprising recovery contracts with a plurality of obligations for a message exchange between a first component and a second component. Three forms of contract are described, governing interactions between three types of components. Each contract is bilateral, i.e., between a first component and a second component. The first and second components mutually agree on when the contract will be released, which facilitates log truncation and independent and/or autonomous recovery.
Abstract:
This invention concerns a database computer system and method for making applications recoverable from system crashes. The application state (i.e., address space) is treated as a single object which can be atomically flushed in a manner akin to flushing individual pages in database recovery techniques. To enable this monolithic treatment of the application, executions performed by the application are mapped to logical loggable operations which can be posted to the stable log. Any modifications to the application state are accumulated and the application state is periodically flushed to stable storage using an atomic procedure. The application recovery integrates with database recovery, and effectively eliminates or at least substantially reduces the need for checkpointing applications. In addition, optimization techniques are described to make the read, write, and recovery phases more efficient.
Abstract:
Each node in a data processing system contains at least one undo buffer and at least one redo buffer for ensuring that any changes made to a section of a non-volatile storage medium, such as a disk, can be removed if a transaction has not been committed, or can be recreated if the transaction has been committed. The undo buffers each correspond to a different uncommitted transaction. The redo buffer contains the changes made to a copy of the section which is maintained in the memory.
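A minimal sketch of the two buffer kinds, assuming a dictionary models the disk section and that stored values are never `None`. The `Node` class and its method names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


class Node:
    def __init__(self, disk: Dict[str, int]) -> None:
        self.disk = disk                        # non-volatile section
        self.cache = dict(disk)                 # in-memory copy of the section
        self.undo: Dict[str, List[Tuple[str, object]]] = {}  # one per open txn
        self.redo: List[Tuple[str, str, int]] = []           # (txn, key, new)
        self.committed: Set[str] = set()

    def write(self, txn: str, key: str, value: int) -> None:
        self.undo.setdefault(txn, []).append((key, self.cache.get(key)))
        self.redo.append((txn, key, value))
        self.cache[key] = value

    def abort(self, txn: str) -> None:
        """Remove an uncommitted transaction's changes, newest first."""
        for key, old in reversed(self.undo.pop(txn, [])):
            if old is None:
                self.cache.pop(key, None)
            else:
                self.cache[key] = old
        self.redo = [r for r in self.redo if r[0] != txn]

    def commit(self, txn: str) -> None:
        self.undo.pop(txn, None)  # undo data is no longer needed
        self.committed.add(txn)

    def recreate(self) -> None:
        """Reapply committed changes from the redo buffer to the disk."""
        for txn, key, value in self.redo:
            if txn in self.committed:
                self.disk[key] = value


node = Node({"x": 1})
node.write("t1", "x", 2)
node.write("t2", "y", 9)
node.abort("t2")
node.commit("t1")
node.recreate()
```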