Distributed system with fault tolerance and self-maintenance
Abstract:
A distributed system includes a plurality of compute nodes configured to process messages. Each compute node processes messages corresponding to an assigned value of a parameter common to the messages. The values are assigned to the compute nodes such that two or more compute nodes are available to process each message. The values can be assigned to the compute nodes in a grouping configuration or a striping configuration. The compute nodes also circulate one or more tokens among themselves, and each node performs a self-maintenance operation during a given state of possession of the token. During a self-maintenance operation, the values assigned to that compute node can be reassigned to other compute nodes to ensure that the corresponding messages continue to be processed.
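The assignment and token mechanisms described above can be illustrated with a minimal Python sketch. The abstract does not define the grouping or striping configurations in detail, so the interpretations here are assumptions: "grouping" is taken to mean that nodes are split into fixed groups whose members share the same values, and "striping" that each value is placed on consecutive nodes with wrap-around. All function names, the replica count of two, the ring-ordered token, and the least-loaded reassignment rule are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def assign_grouping(values, nodes, group_size=2):
    """Assumed 'grouping': nodes form fixed groups; all members of a
    group are assigned the same values."""
    groups = [nodes[i:i + group_size] for i in range(0, len(nodes), group_size)]
    assignment = defaultdict(set)
    for idx, value in enumerate(values):
        for node in groups[idx % len(groups)]:
            assignment[node].add(value)
    return dict(assignment)


def assign_striping(values, nodes, replicas=2):
    """Assumed 'striping': each value is assigned to `replicas`
    consecutive nodes, wrapping around the node list."""
    assignment = defaultdict(set)
    for idx, value in enumerate(values):
        for offset in range(replicas):
            assignment[nodes[(idx + offset) % len(nodes)]].add(value)
    return dict(assignment)


def nodes_for(assignment, value):
    """Return the sorted list of nodes able to process a given value."""
    return sorted(n for n, vals in assignment.items() if value in vals)


def next_token_holder(nodes, current):
    """Pass the token to the next node in ring order (assumed order)."""
    return nodes[(nodes.index(current) + 1) % len(nodes)]


def reassign_for_maintenance(assignment, node):
    """Take the token-holding node out of service and hand each of its
    values to the least-loaded remaining node that lacks that value."""
    remaining = {n: set(v) for n, v in assignment.items() if n != node}
    for value in sorted(assignment.get(node, ())):
        candidates = [n for n in sorted(remaining) if value not in remaining[n]]
        if candidates:
            target = min(candidates, key=lambda n: len(remaining[n]))
            remaining[target].add(value)
    return remaining
```

In this sketch, a node that holds the token would call `reassign_for_maintenance` before going offline, so that every value it handled stays covered by two other nodes, and would pass the token on with `next_token_holder` once maintenance is complete. Restoring the original assignment afterwards is left out for brevity.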