X-Git-Url: https://git.proxmox.com/?p=pve-docs.git;a=blobdiff_plain;f=pmxcfs.adoc;h=7b9cfac0d4faa0b51506f7d0a5f41389dc2182ef;hp=e50c1c41ab768adeb049f71a22201329f3019315;hb=94958b8b9230d5b9b5e2e70c481f115b18a5fa0b;hpb=5eba07434fd010e7b96459da2a5bb676a62fe8b1

diff --git a/pmxcfs.adoc b/pmxcfs.adoc
index e50c1c4..7b9cfac 100644
--- a/pmxcfs.adoc
+++ b/pmxcfs.adoc
@@ -1,17 +1,18 @@
+[[chapter_pmxcfs]]
 ifdef::manvolnum[]
-PVE({manvolnum})
-================
-include::attributes.txt[]
+pmxcfs(8)
+=========
+:pve-toplevel:

 NAME
 ----

 pmxcfs - Proxmox Cluster File System

-SYNOPSYS
+SYNOPSIS
 --------

-include::pmxcfs.8-cli.adoc[]
+include::pmxcfs.8-synopsis.adoc[]

 DESCRIPTION
 -----------
@@ -20,7 +21,7 @@ endif::manvolnum[]
 ifndef::manvolnum[]
 Proxmox Cluster File System (pmxcfs)
 ====================================
-include::attributes.txt[]
+:pve-toplevel:
 endif::manvolnum[]

 The Proxmox Cluster file system (``pmxcfs'') is a database-driven file
@@ -99,6 +100,7 @@ Files
 |`datacenter.cfg` | {pve} datacenter wide configuration (keyboard layout, proxy, ...)
 |`user.cfg` | {pve} access control configuration (users/groups/...)
 |`domains.cfg` | {pve} authentication domains
+|`status.cfg` | {pve} external metrics server configuration
 |`authkey.pub` | Public key used by ticket system
 |`pve-root-ca.pem` | Public certificate of cluster CA
 |`priv/shadow.cfg` | Shadow password file
@@ -145,11 +147,11 @@ Enable/Disable debugging

 You can enable verbose syslog messages with:

- echo "1" >/etc/pve/.debug
+ echo "1" >/etc/pve/.debug

 And disable verbose syslog messages with:

- echo "0" >/etc/pve/.debug
+ echo "0" >/etc/pve/.debug


 Recovery
@@ -172,34 +174,48 @@ The recommended way is to reinstall the node after you removed it from
 your cluster. This makes sure that all secret cluster/ssh keys and any
 shared configuration data is destroyed.

-In some cases, you might prefer to put a node back to local mode
-without reinstall, which is described here:
+In some cases, you might prefer to put a node back to local mode without
+reinstall, which is described in
+<>

-* stop the cluster file system in `/etc/pve/`

- # systemctl stop pve-cluster
+Recovering/Moving Guests from Failed Nodes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-* start it again but forcing local mode
+For the guest configuration files in `nodes//qemu-server/` (VMs) and
+`nodes//lxc/` (containers), {pve} sees the containing node `` as
+owner of the respective guest. This concept enables the usage of local locks
+instead of expensive cluster-wide locks for preventing concurrent guest
+configuration changes.

- # pmxcfs -l
+As a consequence, if the owning node of a guest fails (e.g., because of a power
+outage, fencing event, ..), a regular migration is not possible (even if all
+the disks are located on shared storage) because such a local lock on the
+(dead) owning node is unobtainable. This is not a problem for HA-managed
+guests, as {pve}'s High Availability stack includes the necessary
+(cluster-wide) locking and watchdog functionality to ensure correct and
+automatic recovery of guests from fenced nodes.

-* remove the cluster configuration
+If a non-HA-managed guest has only shared disks (and no other local resources
+which are only available on the failed node are configured), a manual recovery
+is possible by simply moving the guest configuration file from the failed
+node's directory in `/etc/pve/` to an alive node's directory (which changes the
+logical owner or location of the guest).

- # rm /etc/pve/cluster.conf
- # rm /etc/cluster/cluster.conf
- # rm /var/lib/pve-cluster/corosync.authkey
+For example, recovering the VM with ID `100` from a dead `node1` to another
+node `node2` works with the following command executed when logged in as root
+on any member node of the cluster:

-* stop the cluster file system again
+ mv /etc/pve/nodes/node1/qemu-server/100.conf /etc/pve/nodes/node2/

- # systemctl stop pve-cluster
-
-* restart PVE services (or reboot)
-
- # systemctl start pve-cluster
- # systemctl restart pvedaemon
- # systemctl restart pveproxy
- # systemctl restart pvestatd
+WARNING: Before manually recovering a guest like this, make absolutely sure
+that the failed source node is really powered off/fenced. Otherwise {pve}'s
+locking principles are violated by the `mv` command, which can have unexpected
+consequences.

+WARNING: Guest with local disks (or other local resources which are only
+available on the dead node) are not recoverable like this. Either wait for the
+failed node to rejoin the cluster or restore such guests from backups.

 ifdef::manvolnum[]
 include::pve-copyright.adoc[]
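
The manual recovery step documented by this patch (moving a guest's configuration file from the failed node's directory to a live node's directory) can be sketched as a small shell helper. This is a minimal sketch and not part of the patch or of {pve}: the function name `recover_vm` and the `PVE_ROOT` variable are hypothetical. It also assumes that the file should land in the target node's `qemu-server/` subdirectory, following the directory layout described in the added prose. The helper cannot check that the failed node is really powered off or fenced; as the first WARNING states, that remains a manual precondition.

```shell
#!/bin/sh
# Sketch only: recover_vm and PVE_ROOT are hypothetical names, not Proxmox VE tools.
# Usage: recover_vm <vmid> <failed-node> <target-node>
# PVE_ROOT defaults to /etc/pve; pointing it at a scratch directory allows a dry run.
# Precondition (not checked here): the failed node is powered off or fenced.

recover_vm() {
    if [ "$#" -ne 3 ]; then
        echo "usage: recover_vm <vmid> <failed-node> <target-node>" >&2
        return 2
    fi

    vmid=$1
    src_node=$2
    dst_node=$3
    root=${PVE_ROOT:-/etc/pve}

    src_conf="$root/nodes/$src_node/qemu-server/$vmid.conf"
    dst_dir="$root/nodes/$dst_node/qemu-server"

    # The guest must currently be owned by the failed node.
    if [ ! -f "$src_conf" ]; then
        echo "no such config: $src_conf" >&2
        return 1
    fi

    # The target node's directory must already exist.
    if [ ! -d "$dst_dir" ]; then
        echo "no such directory: $dst_dir" >&2
        return 1
    fi

    # Never overwrite a config that already exists on the target node.
    if [ -e "$dst_dir/$vmid.conf" ]; then
        echo "target already has $vmid.conf" >&2
        return 1
    fi

    # Moving the file changes the logical owner of the guest.
    mv "$src_conf" "$dst_dir/"
}
```

After sourcing the function, the example from the patch would correspond to `recover_vm 100 node1 node2`, run as root on a member node of the cluster. The extra checks only guard against typos in the ID or node names; guests with local disks or other local resources on the dead node remain unrecoverable this way.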