X-Git-Url: https://git.proxmox.com/?p=pve-docs.git;a=blobdiff_plain;f=pmxcfs.adoc;fp=pmxcfs.adoc;h=5a68598b9621319db66dbcfc2b1f9789f779bcb9;hp=12e51d19968dfd50ea0f1bad6b27fe20973c4bc1;hb=5db724de29143ffa5f0484723d9b8009b15c7ee0;hpb=4a751f38e6a2a957024fe36d34acb4fc5a3944a6

diff --git a/pmxcfs.adoc b/pmxcfs.adoc
index 12e51d1..5a68598 100644
--- a/pmxcfs.adoc
+++ b/pmxcfs.adoc
@@ -176,6 +176,45 @@ In some cases, you might prefer to put a node back to local mode without
 reinstall, which is described in
 <>
 
+
+Recovering/Moving Guests from Failed Nodes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For the guest configuration files in `nodes//qemu-server/` (VMs) and
+`nodes//lxc/` (containers), {pve} sees the containing node `` as the
+owner of the respective guest. This concept enables the use of local locks
+instead of expensive cluster-wide locks for preventing concurrent guest
+configuration changes.
+
+As a consequence, if the owning node of a guest fails (e.g., because of a power
+outage, fencing event, etc.), a regular migration is not possible (even if all
+the disks are located on shared storage), because the local lock on the
+(dead) owning node cannot be obtained. This is not a problem for HA-managed
+guests, as {pve}'s High Availability stack includes the necessary
+(cluster-wide) locking and watchdog functionality to ensure correct and
+automatic recovery of guests from fenced nodes.
+
+If a non-HA-managed guest has only shared disks (and no other local resources
+that are only available on the failed node are configured), a manual recovery
+is possible by moving the guest configuration file from the failed
+node's directory in `/etc/pve/` to a live node's directory (which changes the
+logical owner or location of the guest).
+
+For example, to recover the VM with ID `100` from a dead `node1` to another
+node `node2`, run the following command as root on any member node of the
+cluster:
+
+ mv /etc/pve/nodes/node1/qemu-server/100.conf /etc/pve/nodes/node2/qemu-server/
+
+WARNING: Before manually recovering a guest like this, make absolutely sure
+that the failed source node is really powered off/fenced. Otherwise {pve}'s
+locking principles are violated by the `mv` command, which can have unexpected
+consequences.
+
+WARNING: Guests with local disks (or other local resources which are only
+available on the dead node) are not recoverable like this. Either wait for the
+failed node to rejoin the cluster or restore such guests from backups.
+
 ifdef::manvolnum[]
 include::pve-copyright.adoc[]
 endif::manvolnum[]
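The manual recovery step added by this patch can be wrapped in a small helper that refuses to run when the paths look wrong. The following is only a sketch and not part of {pve}: the function name `recover_guest_config` and the `PVE_CONF_ROOT` variable are hypothetical, introduced here so the logic can be tried against a scratch directory. It assumes guest configuration files live in the `qemu-server/` or `lxc/` subdirectory of each node's directory, as described in the patch. It does not check that the failed node is really powered off or fenced, nor that the guest has no local resources; both remain the operator's responsibility.

```shell
#!/bin/sh
# Hypothetical helper, not a {pve} tool.
# PVE_CONF_ROOT is an assumption for testing; on a real cluster the
# configuration tree is /etc/pve.
PVE_CONF_ROOT="${PVE_CONF_ROOT:-/etc/pve}"

# usage: recover_guest_config <type> <vmid> <failed-node> <target-node>
#   <type> is "qemu-server" (VMs) or "lxc" (containers)
recover_guest_config() {
    rg_type="$1"; rg_vmid="$2"; rg_src_node="$3"; rg_dst_node="$4"

    # Only the two guest directories named in the patch are accepted.
    case "$rg_type" in
        qemu-server|lxc) ;;
        *) echo "unknown guest type: $rg_type" >&2; return 2 ;;
    esac

    rg_src="$PVE_CONF_ROOT/nodes/$rg_src_node/$rg_type/$rg_vmid.conf"
    rg_dst_dir="$PVE_CONF_ROOT/nodes/$rg_dst_node/$rg_type"

    # The config must exist on the failed node ...
    if [ ! -f "$rg_src" ]; then
        echo "no such config: $rg_src" >&2; return 1
    fi
    # ... the target node must have a matching guest directory ...
    if [ ! -d "$rg_dst_dir" ]; then
        echo "no such directory: $rg_dst_dir" >&2; return 1
    fi
    # ... and the move must not overwrite an existing config.
    if [ -e "$rg_dst_dir/$rg_vmid.conf" ]; then
        echo "target already exists: $rg_dst_dir/$rg_vmid.conf" >&2; return 1
    fi

    # Moving the file changes the logical owner of the guest.
    mv "$rg_src" "$rg_dst_dir/"
}
```

For the VM from the example in the patch, the call would be `recover_guest_config qemu-server 100 node1 node2`, run as root on a member node of the cluster.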