X-Git-Url: https://git.proxmox.com/?p=pve-docs.git;a=blobdiff_plain;f=pmxcfs.adoc;h=7b9cfac0d4faa0b51506f7d0a5f41389dc2182ef;hp=9d88763b5bb96b4535be98aa41f9a068e78f7354;hb=94958b8b9230d5b9b5e2e70c481f115b18a5fa0b;hpb=a8e9975414246aa8b481578fd5b860b3effc7b69

diff --git a/pmxcfs.adoc b/pmxcfs.adoc
index 9d88763..7b9cfac 100644
--- a/pmxcfs.adoc
+++ b/pmxcfs.adoc
@@ -1,18 +1,40 @@
-Proxmox Cluster file system (pmxcfs)
+[[chapter_pmxcfs]]
+ifdef::manvolnum[]
+pmxcfs(8)
+=========
+:pve-toplevel:
+
+NAME
+----
+
+pmxcfs - Proxmox Cluster File System
+
+SYNOPSIS
+--------
+
+include::pmxcfs.8-synopsis.adoc[]
+
+DESCRIPTION
+-----------
+endif::manvolnum[]
+
+ifndef::manvolnum[]
+Proxmox Cluster File System (pmxcfs)
 ====================================
+:pve-toplevel:
+endif::manvolnum[]

-The Proxmox Cluster file system (pmxcfs) is a database-driven file
+The Proxmox Cluster file system (``pmxcfs'') is a database-driven file
 system for storing configuration files, replicated in real time to all
-cluster nodes using corosync. We use this to store all PVE related
+cluster nodes using `corosync`. We use this to store all PVE related
 configuration files.

 Although the file system stores all data inside a persistent database
 on disk, a copy of the data resides in RAM. That imposes restriction
-on the maximal size, which is currently 30MB. This is still enough to
+on the maximum size, which is currently 30MB. This is still enough to
 store the configuration of several thousand virtual machines.

-Advantages
-----------
+This system provides the following advantages:

 * seamless replication of all configuration to all nodes in real time
 * provides strong consistency checks to avoid duplicate VM IDs
@@ -20,8 +42,9 @@ Advantages
 * automatic updates of the corosync cluster configuration to all nodes
 * includes a distributed locking mechanism

+
 POSIX Compatibility
-~~~~~~~~~~~~~~~~~~~
+-------------------

 The file system is based on FUSE, so the behavior is POSIX like. But
 some feature are simply not implemented, because we do not need them:
@@ -39,11 +62,11 @@ some feature are simply not implemented, because we do not need them:
 * `O_TRUNC` creates are not atomic (FUSE restriction)


-File access rights
-~~~~~~~~~~~~~~~~~~
+File Access Rights
+------------------

-All files and directories are owned by user 'root' and have group
-'www-data'. Only root has write permissions, but group 'www-data' can
+All files and directories are owned by user `root` and have group
+`www-data`. Only root has write permissions, but group `www-data` can
 read most files. Files below the following paths:

  /etc/pve/priv/
@@ -51,15 +74,16 @@ read most files. Files below the following paths:

 are only accessible by root.

+
 Technology
 ----------

 We use the http://www.corosync.org[Corosync Cluster Engine] for
 cluster communication, and http://www.sqlite.org[SQlite] for the
-database file. The filesystem is implemented in user space using
+database file. The file system is implemented in user space using
 http://fuse.sourceforge.net[FUSE].

-File system layout
+File System Layout
 ------------------

 The file system is mounted at:
@@ -71,59 +95,63 @@ Files

 [width="100%",cols="m,d"]
 |=======
-|corosync.conf |corosync cluster configuration file (previous to {pve} 4.x this file was called cluster.conf)
-|storage.cfg |{pve} storage configuration
-|datacenter.cfg |{pve} datacenter wide configuration (keyboard layout, proxy, ...)
-|user.cfg |{pve} access control configuration (users/groups/...)
-|domains.cfg |{pve} Authentication domains
-|authkey.pub | public key used by ticket system
-|pve-root-ca.pem | public certificate of cluster CA
-|priv/shadow.cfg | shadow password file
-|priv/authkey.key | private key used by ticket system
-|priv/pve-root-ca.key | private key of cluster CA
-|nodes//pve-ssl.pem | public ssl certificate for web server (signed by cluster CA)
-|nodes//pve-ssl.key | private ssl key for pve-ssl.pem
-|nodes//pveproxy-ssl.pem | public ssl certificate (chain) for web server (optional override for pve-ssl.pem)
-|nodes//pveproxy-ssl.key | private ssl key for pveproxy-ssl.pem (optional)
-|nodes//qemu-server/.conf | VM configuration data for KVM VMs
-|nodes//lxc/.conf | VM configuration data for LXC containers
-|firewall/cluster.fw | Firewall config applied to all nodes
-|firewall/.fw | Firewall config for individual nodes
-|firewall/.fw | Firewall config for VMs and Containers
+|`corosync.conf` | Corosync cluster configuration file (previous to {pve} 4.x this file was called cluster.conf)
+|`storage.cfg` | {pve} storage configuration
+|`datacenter.cfg` | {pve} datacenter wide configuration (keyboard layout, proxy, ...)
+|`user.cfg` | {pve} access control configuration (users/groups/...)
+|`domains.cfg` | {pve} authentication domains
+|`status.cfg` | {pve} external metrics server configuration
+|`authkey.pub` | Public key used by ticket system
+|`pve-root-ca.pem` | Public certificate of cluster CA
+|`priv/shadow.cfg` | Shadow password file
+|`priv/authkey.key` | Private key used by ticket system
+|`priv/pve-root-ca.key` | Private key of cluster CA
+|`nodes//pve-ssl.pem` | Public SSL certificate for web server (signed by cluster CA)
+|`nodes//pve-ssl.key` | Private SSL key for `pve-ssl.pem`
+|`nodes//pveproxy-ssl.pem` | Public SSL certificate (chain) for web server (optional override for `pve-ssl.pem`)
+|`nodes//pveproxy-ssl.key` | Private SSL key for `pveproxy-ssl.pem` (optional)
+|`nodes//qemu-server/.conf` | VM configuration data for KVM VMs
+|`nodes//lxc/.conf` | VM configuration data for LXC containers
+|`firewall/cluster.fw` | Firewall configuration applied to all nodes
+|`firewall/.fw` | Firewall configuration for individual nodes
+|`firewall/.fw` | Firewall configuration for VMs and Containers
 |=======

+
 Symbolic links
 ~~~~~~~~~~~~~~

 [width="100%",cols="m,m"]
 |=======
-|local |nodes/
-|qemu-server |nodes//qemu-server/
-|lxc |nodes//lxc/
+|`local` | `nodes/`
+|`qemu-server` | `nodes//qemu-server/`
+|`lxc` | `nodes//lxc/`
 |=======

+
 Special status files for debugging (JSON)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 [width="100%",cols="m,d"]
 |=======
-| .version |file versions (to detect file modifications)
-| .members |Info about cluster members
-| .vmlist |List of all VMs
-| .clusterlog |Cluster log (last 50 entries)
-| .rrd |RRD data (most recent entries)
+|`.version` |File versions (to detect file modifications)
+|`.members` |Info about cluster members
+|`.vmlist` |List of all VMs
+|`.clusterlog` |Cluster log (last 50 entries)
+|`.rrd` |RRD data (most recent entries)
 |=======

+
 Enable/Disable debugging
 ~~~~~~~~~~~~~~~~~~~~~~~~

 You can enable verbose syslog messages with:

- echo "1" >/etc/pve/.debug
+ echo "1" >/etc/pve/.debug

 And disable verbose syslog messages with:

- echo "0" >/etc/pve/.debug
+ echo "0" >/etc/pve/.debug


 Recovery
@@ -131,13 +159,14 @@ Recovery

 If you have major problems with your Proxmox VE host, e.g. hardware
 issues, it could be helpful to just copy the pmxcfs database file
-/var/lib/pve-cluster/config.db and move it to a new Proxmox VE
+`/var/lib/pve-cluster/config.db` and move it to a new Proxmox VE
 host. On the new host (with nothing running), you need to stop the
-pve-cluster service and replace the config.db file (needed permissions
-0600). Second, adapt '/etc/hostname' and '/etc/hosts' according to the
-lost Proxmox VE host, then reboot and check. (And don´t forget your
+`pve-cluster` service and replace the `config.db` file (needed permissions
+`0600`). Second, adapt `/etc/hostname` and `/etc/hosts` according to the
+lost Proxmox VE host, then reboot and check. (And don't forget your
 VM/CT data)

+
 Remove Cluster configuration
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -145,31 +174,49 @@ The recommended way is to reinstall the node after you removed it from
 your cluster. This makes sure that all secret cluster/ssh keys and any
 shared configuration data is destroyed.

-In some cases, you might prefer to put a node back to local mode
-without reinstall, which is described here:
-
-* stop the cluster file system in '/etc/pve/'
+In some cases, you might prefer to put a node back to local mode without
+reinstall, which is described in
+<>

- # systemctl stop pve-cluster

-* start it again but forcing local mode
+Recovering/Moving Guests from Failed Nodes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- # pmxcfs -l
+For the guest configuration files in `nodes//qemu-server/` (VMs) and
+`nodes//lxc/` (containers), {pve} sees the containing node `` as
+owner of the respective guest. This concept enables the usage of local locks
+instead of expensive cluster-wide locks for preventing concurrent guest
+configuration changes.

-* remove the cluster config
+As a consequence, if the owning node of a guest fails (e.g., because of a power
+outage, fencing event, ..), a regular migration is not possible (even if all
+the disks are located on shared storage) because such a local lock on the
+(dead) owning node is unobtainable. This is not a problem for HA-managed
+guests, as {pve}'s High Availability stack includes the necessary
+(cluster-wide) locking and watchdog functionality to ensure correct and
+automatic recovery of guests from fenced nodes.

- # rm /etc/pve/cluster.conf
- # rm /etc/cluster/cluster.conf
- # rm /var/lib/pve-cluster/corosync.authkey
+If a non-HA-managed guest has only shared disks (and no other local resources
+which are only available on the failed node are configured), a manual recovery
+is possible by simply moving the guest configuration file from the failed
+node's directory in `/etc/pve/` to an alive node's directory (which changes the
+logical owner or location of the guest).

-* stop the cluster file system again
+For example, recovering the VM with ID `100` from a dead `node1` to another
+node `node2` works with the following command executed when logged in as root
+on any member node of the cluster:

- # service pve-cluster stop
+ mv /etc/pve/nodes/node1/qemu-server/100.conf /etc/pve/nodes/node2/

-* restart pve services (or reboot)
+WARNING: Before manually recovering a guest like this, make absolutely sure
+that the failed source node is really powered off/fenced. Otherwise {pve}'s
+locking principles are violated by the `mv` command, which can have unexpected
+consequences.

- # service pve-cluster start
- # service pvedaemon restart
- # service pveproxy restart
- # service pvestatd restart
+WARNING: Guest with local disks (or other local resources which are only
+available on the dead node) are not recoverable like this. Either wait for the
+failed node to rejoin the cluster or restore such guests from backups.

+ifdef::manvolnum[]
+include::pve-copyright.adoc[]
+endif::manvolnum[]
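
The manual recovery step added by this patch (moving a guest configuration file from a failed node's directory to a live node's directory) could be wrapped in a small guard function. The following is only a sketch under stated assumptions: `recover_guest_conf` is a hypothetical helper, not part of {pve}, and its argument order is invented for illustration. Because the patch text places guest configurations under `nodes/<node>/qemu-server/` and `nodes/<node>/lxc/`, the sketch assumes the file belongs in the matching subdirectory of the target node. It cannot verify that the failed node is really powered off or fenced; that check stays with the administrator.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox VE): move a guest configuration
# file from a failed node's directory to a live node's directory.
#
# Usage: recover_guest_conf ROOT SRC_NODE DST_NODE GUEST_TYPE VMID
#   ROOT        mount point of the cluster file system (normally /etc/pve)
#   GUEST_TYPE  qemu-server (VMs) or lxc (containers)
#
# Assumption: the configuration belongs in the same GUEST_TYPE subdirectory
# on the target node. This function does NOT check that SRC_NODE is fenced.
recover_guest_conf() {
    root=$1
    src_node=$2
    dst_node=$3
    gtype=$4
    vmid=$5

    # Only the two guest types named in the text are accepted.
    case "$gtype" in
        qemu-server|lxc) ;;
        *) echo "unknown guest type: $gtype" >&2; return 2 ;;
    esac

    src="$root/nodes/$src_node/$gtype/$vmid.conf"
    dst_dir="$root/nodes/$dst_node/$gtype"

    # The source configuration must exist.
    if [ ! -f "$src" ]; then
        echo "no such configuration file: $src" >&2
        return 1
    fi
    # The target node's directory must already be present.
    if [ ! -d "$dst_dir" ]; then
        echo "no such directory: $dst_dir" >&2
        return 1
    fi
    # Never overwrite a configuration that already exists on the target.
    if [ -e "$dst_dir/$vmid.conf" ]; then
        echo "target already exists: $dst_dir/$vmid.conf" >&2
        return 1
    fi

    mv "$src" "$dst_dir/"
}
```

For the example in the patch, the call would be `recover_guest_conf /etc/pve node1 node2 qemu-server 100`, run as root on any member node of the cluster. Taking the mount point as a parameter keeps the function testable against a scratch directory instead of a live cluster.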