+When a node fails, fencing guarantees that the faulty node is
+offline. This is required to make sure that no resource runs twice
+when it gets recovered on another node. This is a really important
+task, because without fencing it would not be possible to recover a
+resource on another node.
+
+If a node did not get fenced, it would be in an unknown state where
+it may still have access to shared resources. This is really
+dangerous! Imagine that every network except the storage network
+broke. Now, while not reachable from the public network, the VM still
+runs and writes to the shared storage.
+
+If we then simply start up this VM on another node, we would get a
+dangerous race condition, because both nodes write to the same
+storage. Such a condition can destroy all VM data, and the whole VM
+could be rendered unusable. The recovery could also fail if the
+storage protects against multiple mounts.
+
+
+How {pve} Fences
+~~~~~~~~~~~~~~~~
+
+There are different methods to fence a node, for example, fence
+devices which cut off the power from the node or disable its
+communication completely. Those are often quite expensive and bring
+additional critical components into a system, because if they fail,
+you cannot recover any service.
+
+We thus wanted to integrate a simpler fencing method, which does not
+require additional external hardware. This can be done using
+watchdog timers.
+
+.Possible Fencing Methods
+- external power switches
+- isolate nodes by disabling all of their network traffic on the switch
+- self-fencing using watchdog timers
+
+Watchdog timers have been widely used in critical and dependable
+systems since the advent of microcontrollers. They are often
+independent and simple integrated circuits which are used to detect
+and recover from computer malfunctions.
+
+During normal operation, `ha-manager` regularly resets the watchdog
+timer to prevent it from elapsing. If, due to a hardware fault or
+program error, the computer fails to reset the watchdog, the timer
+will elapse and trigger a reset of the whole server (reboot).
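+
+The reset-or-reboot behaviour described above can be illustrated with
+a small simulation. This is only a sketch, not how `ha-manager` or any
+watchdog driver is implemented: the `SimulatedWatchdog` class, the
+`simulate` helper and the 10 second timeout are hypothetical names and
+values chosen for illustration, and time is modelled as whole seconds.
+
```python
class SimulatedWatchdog:
    """Toy watchdog: fires once `timeout` seconds pass without a reset."""

    def __init__(self, timeout, now=0):
        self.timeout = timeout      # hypothetical value, in seconds
        self.last_reset = now
        self.fired = False

    def reset(self, now):
        # Keepalive from the monitored software. Once the watchdog has
        # fired, the machine is already being reset, so ignore it.
        if not self.fired:
            self.last_reset = now

    def tick(self, now):
        # Called once per simulated second by the timer itself.
        if now - self.last_reset >= self.timeout:
            self.fired = True
        return self.fired


def simulate(timeout, keepalive_times, end_time):
    """Return the second at which the watchdog fires, or None."""
    wd = SimulatedWatchdog(timeout)
    keepalives = set(keepalive_times)
    for now in range(1, end_time + 1):
        if now in keepalives:
            wd.reset(now)
        if wd.tick(now):
            return now  # the node would be rebooted here
    return None


# Healthy node: a keepalive every 5 seconds, the watchdog never fires.
print(simulate(10, range(0, 61, 5), 60))

# Faulty node: keepalives stop after second 15, the watchdog fires
# one timeout period later.
print(simulate(10, [5, 10, 15], 60))
```
+
+In the second call the last keepalive arrives at second 15, so the
+simulated watchdog elapses at second 25. The point of the design is
+that the node takes itself offline without any help from the rest of
+the cluster.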
+
+Recent server motherboards often include such hardware watchdogs, but
+these need to be configured. If no watchdog is available or
+configured, we fall back to the Linux Kernel 'softdog'. While still
+reliable, it is not independent of the server's hardware, and thus is
+less reliable than a hardware watchdog.