SYNCHRONIZATION ALGORITHM:
--------------------------
The synchronization algorithm is used by every service in corosync to
synchronize the state of the system.

The synchronization algorithm has four events. These events are in fact
functions that are registered in the service handler data structure. They
are called by the synchronization system whenever the network partitions or
merges.

init:
Within the init event a service handler should record the temporary state
variables used by the process event.

process:
The process event is responsible for executing synchronization. It returns a
status indicating whether synchronization has completed. This allows
synchronization to be interrupted when the message queue buffer is full and
resumed later. The synchronization service calls the process event again if
the value returned by process requests it.

abort:
The abort event occurs when a processor fails during synchronization.

activate:
The activate event occurs when process has reported that no more processing is
necessary for any node in the cluster and all messages originated by process
have completed.
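
As a rough illustration of how the four events fit together, the Python sketch
below drives a single service handler through init, process, and then either
activate or abort. It is not corosync's C interface: SyncHandler, run_sync,
make_demo_handler, and the DONE/CONTINUE values are hypothetical names chosen
for this example, and the cluster-wide condition for activate is reduced to
one local handler.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical status values returned by the process event.
DONE = 0        # no more processing is necessary
CONTINUE = -1   # call process again later


@dataclass
class SyncHandler:
    """The four callbacks a service registers (names are illustrative)."""
    init: Callable[[], None]
    process: Callable[[], int]
    activate: Callable[[], None]
    abort: Callable[[], None]


def run_sync(handler: SyncHandler,
             processor_failed: Callable[[], bool] = lambda: False) -> bool:
    """Drive one handler; return True if activate was reached."""
    handler.init()
    while True:
        # A processor failure during synchronization triggers abort.
        if processor_failed():
            handler.abort()
            return False
        # process is called repeatedly until it reports completion.
        if handler.process() == DONE:
            break
    handler.activate()
    return True


def make_demo_handler(chunks: int, log: List[str]) -> SyncHandler:
    """Build a handler whose process event needs `chunks` calls to finish."""
    remaining = [chunks]

    def process() -> int:
        log.append("process")
        remaining[0] -= 1
        return DONE if remaining[0] <= 0 else CONTINUE

    return SyncHandler(
        init=lambda: log.append("init"),
        process=process,
        activate=lambda: log.append("activate"),
        abort=lambda: log.append("abort"),
    )
```

Calling run_sync(make_demo_handler(3, log)) appends init, three process
entries, and activate to log. Passing a processor_failed callback that returns
True ends the run with abort instead.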

CHECKPOINT SYNCHRONIZATION ALGORITHM:
-------------------------------------
The primary purpose of the checkpoint synchronization algorithm is to
synchronize checkpoints after a partition or after a merge of two or more
partitions. The secondary purpose is to determine the cluster-wide reference
count for every checkpoint.

Every cluster contains a group of checkpoints. Each checkpoint has a
checkpoint name and a checkpoint number. The number is used to uniquely
reference an unlinked but still open checkpoint in the cluster.

Every checkpoint contains a reference count, which is used to determine when
that checkpoint may be released. The algorithm rebuilds the reference count
information each time a partition or merge occurs.

local variables
    my_sync_state may have the values SYNC_CHECKPOINT, SYNC_REFCOUNT
    my_current_iteration_state contains any data used to iterate the
        checkpoints and sections
checkpoint data
    refcount_set contains a reference count for every node, each entry
        consisting of the number of opened connections to the checkpoint
        and the node identifier
    refcount contains the sum of every reference count in the refcount_set
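
The checkpoint data above could be modelled roughly as follows. This is a
minimal Python sketch, not the structure corosync uses: the Checkpoint class
and its field layout are assumptions for illustration, with refcount_set held
as a mapping from node identifier to the number of open connections on that
node.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Checkpoint:
    """Illustrative checkpoint record (hypothetical layout)."""
    name: str
    number: int
    # refcount_set: node identifier -> open connections on that node
    refcount_set: Dict[int, int] = field(default_factory=dict)

    @property
    def refcount(self) -> int:
        # refcount is the sum of every entry in refcount_set
        return sum(self.refcount_set.values())
```

Deriving refcount from refcount_set on demand keeps the two from drifting
apart. A real implementation may instead cache the sum and update it whenever
the set changes.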

pseudocode executed by a processor when the synchronization service calls
the init event
    call sync_checkpoints_enter

pseudocode executed by a processor when the synchronization service calls
the process event in the SYNC_CHECKPOINT state
    if lowest processor identifier of old ring in new ring
        transmit checkpoints or sections starting from my_current_iteration_state
    if all checkpoints and sections could be queued
        call sync_refcounts_enter
    else
        record my_current_iteration_state

    require process to continue

pseudocode executed by a processor when the synchronization service calls
the process event in the SYNC_REFCOUNT state
    if lowest processor identifier of old ring in new ring
        transmit checkpoint reference counts
    if all checkpoint reference counts could be queued
        require process to not continue
    else
        record my_current_iteration_state for checkpoint reference counts
        require process to continue

sync_checkpoints_enter:
    my_sync_state = SYNC_CHECKPOINT
    my_current_iteration_state set to start of checkpoint list

sync_refcounts_enter:
    my_sync_state = SYNC_REFCOUNT
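
The two process states and their enter procedures can be sketched as a small
resumable state machine. The Python below is only an illustration under
several assumptions: the CheckpointSync class, the queue_capacity parameter
(how many messages the queue accepts per process call), the DONE/CONTINUE
values, and the reset of the iteration state in sync_refcounts_enter are all
hypothetical and not taken from corosync.

```python
SYNC_CHECKPOINT = "SYNC_CHECKPOINT"
SYNC_REFCOUNT = "SYNC_REFCOUNT"
DONE, CONTINUE = 0, -1  # hypothetical process return values


class CheckpointSync:
    def __init__(self, items, refcounts, is_lowest_old_ring_member,
                 queue_capacity):
        self.items = items              # checkpoints and sections, in order
        self.refcounts = refcounts      # reference count messages, in order
        self.is_sender = is_lowest_old_ring_member
        self.queue_capacity = queue_capacity
        self.sent = []                  # stands in for the message queue
        self.sync_checkpoints_enter()   # what the init event would do

    def sync_checkpoints_enter(self):
        self.my_sync_state = SYNC_CHECKPOINT
        self.my_current_iteration_state = 0  # start of checkpoint list

    def sync_refcounts_enter(self):
        self.my_sync_state = SYNC_REFCOUNT
        # Assumption: iteration restarts for the reference count messages.
        self.my_current_iteration_state = 0

    def _transmit(self, messages):
        """Queue from the recorded position; True if everything was queued."""
        if not self.is_sender:
            return True  # only the lowest old-ring member transmits
        start = self.my_current_iteration_state
        end = min(start + self.queue_capacity, len(messages))
        self.sent.extend(messages[start:end])
        self.my_current_iteration_state = end  # record where to resume
        return end == len(messages)

    def process(self):
        if self.my_sync_state == SYNC_CHECKPOINT:
            if self._transmit(self.items):
                self.sync_refcounts_enter()
            return CONTINUE  # refcounts still have to be sent
        if self._transmit(self.refcounts):
            return DONE
        return CONTINUE
```

With three checkpoint messages, two reference count messages, and a
queue_capacity of 2, process returns CONTINUE twice and then DONE. A processor
that is not the lowest member of the old ring transmits nothing and simply
steps through both states.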

on event receipt of foreign ring id message
    ignore message

pseudocode executed on event receipt of checkpoint update
    if checkpoint exists in temporary storage
        ignore message
    else
        create checkpoint
        reset checkpoint refcount array

pseudocode executed on event receipt of checkpoint section update
    if checkpoint section exists in temporary storage
        ignore message
    else
        create checkpoint section
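
The receipt handlers above are idempotent: a checkpoint or section that
already exists in temporary storage is left untouched, and messages from a
foreign ring are dropped. A minimal Python sketch of that behaviour follows.
The TempStore class, its dictionary layout, and the choice to drop a section
update for an unknown checkpoint are assumptions made for this example.

```python
class TempStore:
    """Temporary checkpoint storage filled during synchronization."""

    def __init__(self, my_ring_id):
        self.my_ring_id = my_ring_id
        # (name, number) -> {"refcount_set": {...}, "sections": {...}}
        self.checkpoints = {}

    def on_checkpoint_update(self, ring_id, name, number):
        """Return True if a checkpoint was created, False if ignored."""
        if ring_id != self.my_ring_id:
            return False  # foreign ring id: ignore message
        key = (name, number)
        if key in self.checkpoints:
            return False  # already in temporary storage: ignore message
        # Create the checkpoint with an empty (reset) refcount set.
        self.checkpoints[key] = {"refcount_set": {}, "sections": {}}
        return True

    def on_section_update(self, ring_id, name, number, section_id, data):
        """Return True if a section was created, False if ignored."""
        if ring_id != self.my_ring_id:
            return False
        ckpt = self.checkpoints.get((name, number))
        if ckpt is None:
            return False  # assumption: unknown checkpoint, drop the update
        if section_id in ckpt["sections"]:
            return False  # section already exists: ignore message
        ckpt["sections"][section_id] = data
        return True
```

Because duplicates are ignored rather than overwritten, receiving the same
update more than once leaves temporary storage unchanged.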

pseudocode executed on event receipt of reference count update
    update the reference count set in temporary checkpoint data storage by
        adding the reference counts carried in the event to those already
        recorded in the set
    update that checkpoint's reference count
    set the global checkpoint id to the current checkpoint id + 1 if it
        would increase the global checkpoint id
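
The reference count update can be sketched as a merge of per-node counts
followed by a recomputation of the total. The Python function below is an
illustration only: on_refcount_update and its parameters are hypothetical
names, and refcount_set is assumed to map node identifiers to connection
counts.

```python
def on_refcount_update(temp_refcount_set, message_refcounts,
                       global_ckpt_id, ckpt_id):
    """Merge counts from one message; return (refcount, global_ckpt_id)."""
    # Add each node's count from the message to the temporary set.
    for node_id, count in message_refcounts.items():
        temp_refcount_set[node_id] = temp_refcount_set.get(node_id, 0) + count
    # The checkpoint's reference count is the sum over the whole set.
    refcount = sum(temp_refcount_set.values())
    # Only ever move the global checkpoint id forward.
    global_ckpt_id = max(global_ckpt_id, ckpt_id + 1)
    return refcount, global_ckpt_id
```

For example, merging {1: 1, 2: 3} into a set that already holds {1: 2} yields
{1: 3, 2: 3} and a reference count of 6.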

pseudocode called when the synchronization service calls the activate event:
    for all checkpoints
        free all previously committed checkpoints and sections
        convert temporary checkpoints and sections to regular checkpoints
            and sections
    copy my_saved_ring_id to my_old_ring_id

pseudocode called when the synchronization service calls the abort event:
    free all temporary checkpoints and temporary sections
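
Taken together, activate commits the temporary state and abort discards it.
A minimal Python sketch of that commit-or-discard pattern is shown below. The
CheckpointStore class and its attribute layout are assumptions for
illustration, and freeing is modelled simply by dropping references.

```python
class CheckpointStore:
    def __init__(self, my_old_ring_id=None, my_saved_ring_id=None):
        self.committed = {}   # regular checkpoints and sections
        self.temporary = {}   # built up while synchronization runs
        self.my_old_ring_id = my_old_ring_id
        self.my_saved_ring_id = my_saved_ring_id

    def activate(self):
        # Previously committed state is released, temporary state takes over.
        self.committed = self.temporary
        self.temporary = {}
        self.my_old_ring_id = self.my_saved_ring_id

    def abort(self):
        # Synchronization failed: throw away everything built so far.
        self.temporary = {}
```

Because the committed state is only replaced in activate, an abort in the
middle of synchronization leaves the previously committed checkpoints intact.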