Resolve abort during simultaneous stopping of at least 4 nodes

Consider a cluster of 5 nodes.
Nodes 3 and 4 are stopped (by random stopping), and nodes 1, 2, and 5 form a
new configuration. During recovery, nodes 1 and 2 are stopped (via service
corosync stop). As a result, node 5 never finishes recovery within the timeout
period, triggering a token loss in recovery. Bug #623176 resolved an assert
which happened because the full ring id was being restored. The resolution
to Bug #623176 was to not restore the full ring id, and instead operate
(according to the specification) on the new ring id. Unfortunately, this
exposes a problem whereby restarting nodes 1-4 generates the same ring id.
This ring id reaches node 5, which failed recovery and is now in gather,
and triggers a condition not accounted for in the original Totem specification.
It appears that later work in Dr. Agarwal's PhD dissertation considers this
scenario. That solution entails rejecting the regular token in the above
condition. Since the ring id is also used to make decisions for commit token
acceptance, we must also take care to reject the regular token in all cases
after transitioning from OPERATIONAL.
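
The rejection rule described above can be illustrated with a minimal sketch.
It is a simplified model, not the totemsrp.c code: the types, the
discard_regular_token flag, and the function names are all hypothetical, and
the point at which the flag is cleared (entry into recovery on a newly
committed ring) is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; names do not mirror totemsrp.c. */
enum memb_state { MEMB_OPERATIONAL, MEMB_GATHER, MEMB_COMMIT, MEMB_RECOVERY };

struct ring_id {
    uint32_t rep;   /* representative node of the ring */
    uint64_t seq;   /* ring sequence number */
};

struct node {
    enum memb_state state;
    struct ring_id ring;
    bool discard_regular_token;   /* hypothetical flag, see lead-in */
};

static bool ring_id_equal(const struct ring_id *a, const struct ring_id *b)
{
    return a->rep == b->rep && a->seq == b->seq;
}

/* Leaving OPERATIONAL: from here on, regular tokens are rejected. */
static void enter_gather(struct node *n)
{
    n->state = MEMB_GATHER;
    n->discard_regular_token = true;
}

/* Assumption: a newly committed ring re-enables regular token handling. */
static void enter_recovery(struct node *n, struct ring_id new_ring)
{
    n->state = MEMB_RECOVERY;
    n->ring = new_ring;
    n->discard_regular_token = false;
}

/* Returns true if a regular token carrying token_ring should be processed. */
static bool accept_regular_token(const struct node *n,
                                 const struct ring_id *token_ring)
{
    if (n->discard_regular_token) {
        /* Reject even when the ring id matches: restarted nodes may have
         * regenerated the same ring id, so a match proves nothing here. */
        return false;
    }
    return ring_id_equal(&n->ring, token_ring);
}
```

In this model, a node in gather drops a regular token whose ring id equals
its own, which is the case the ring id comparison alone cannot catch.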

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>