ha-manager: add section for recovery after fencing

author Thomas Lamprecht <t.lamprecht@proxmox.com>

Tue, 14 Jun 2016 14:57:45 +0000 (16:57 +0200)

committer Dietmar Maurer <dietmar@proxmox.com>

Tue, 14 Jun 2016 16:06:31 +0000 (18:06 +0200)
diff --git a/ha-manager.adoc b/ha-manager.adoc
index 53ee3199a427b7e74b05e3404e177199c82fddcd..5db5b052e44d3f1817298a284ca8b65efc8eadca 100644
--- a/ha-manager.adoc
+++ b/ha-manager.adoc
@@ -350,6 +350,24 @@ If you have a hardware watchdog available remove its kernel module from the
  blacklist, load it with insmod and restart the 'watchdog-mux' service or reboot
  the node.
  
+Recover Fenced Services
+~~~~~~~~~~~~~~~~~~~~~~~
+
+After a node has failed and has been fenced successfully, we start to recover
+its services to other available nodes and restart them there so that they can
+provide service again.
+
+The selection of the node on which a service gets recovered is influenced
+by the user's group settings, the currently active nodes and their respective
+active service count.
+First we build the intersection of the user-selected nodes and the available
+nodes. Then the subset of those nodes with the highest priority gets chosen
+as candidates for recovery. From these we select the node with the currently
+lowest active service count as the new node for the service.
+This minimizes the risk of an overload, which could otherwise cause an
+unresponsive node and, as a result, a chain reaction of node failures in the
+cluster.
+
  Groups
  ------
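
The selection rule described in the added section can be sketched as follows. This is only an illustration in Python, not the ha-manager implementation: the function name `select_recovery_node`, the argument shapes, the convention that a larger number means higher priority, and the tie-break by node name are all assumptions made for the example.

```python
def select_recovery_node(group_nodes, online_nodes, active_services):
    """Pick a recovery node for one service (illustrative sketch).

    group_nodes:     dict node name -> priority (assumed: larger is preferred)
    online_nodes:    set of node names that are currently available
    active_services: dict node name -> number of active services
    """
    # Step 1: intersection of user-selected nodes and available nodes.
    candidates = {n: p for n, p in group_nodes.items() if n in online_nodes}
    if not candidates:
        return None  # no node in the group is available

    # Step 2: keep only the nodes that share the highest priority.
    top = max(candidates.values())
    best = [n for n, p in candidates.items() if p == top]

    # Step 3: choose the node with the lowest active service count.
    # Sorting by name first makes ties deterministic (an assumption).
    return min(sorted(best), key=lambda n: active_services.get(n, 0))
```

For example, with a group `{"node1": 2, "node2": 2, "node3": 1}` and all three nodes online, `node3` is never considered while a priority-2 node is available, and the choice between `node1` and `node2` depends only on their active service counts.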