X-Git-Url: https://git.proxmox.com/?a=blobdiff_plain;f=pvecm.adoc;h=bb1477b8810ee06bcab2489443234f00d9a58637;hb=7cbfd9192ee6ba3e660e12fb95a667ffeba183b8;hp=2b7f5d21825379a34528b6c947612075c2cffc46;hpb=3be22308fb07213fb33b8781932d6ab8675e9cc4;p=pve-docs.git

diff --git a/pvecm.adoc b/pvecm.adoc
index 2b7f5d2..bb1477b 100644
--- a/pvecm.adoc
+++ b/pvecm.adoc
@@ -1,14 +1,15 @@
 ifdef::manvolnum[]
-PVE({manvolnum})
-================
+pvecm(1)
+========
 include::attributes.txt[]
+:pve-toplevel:
 
 NAME
 ----
 
 pvecm - Proxmox VE Cluster Manager
 
-SYNOPSYS
+SYNOPSIS
 --------
 
 include::pvecm.1-synopsis.adoc[]
@@ -21,6 +22,7 @@ ifndef::manvolnum[]
 Cluster Manager
 ===============
 include::attributes.txt[]
+:pve-toplevel:
 endif::manvolnum[]
 
 The {PVE} cluster manager `pvecm` is a tool to create a group of
@@ -177,7 +179,9 @@ When adding a node to a cluster with a separated cluster network you need to
 use the 'ringX_addr' parameters to set the node's address on those networks:
 
 [source,bash]
+----
 pvecm add IP-ADDRESS-CLUSTER -ring0_addr IP-ADDRESS-RING0
+----
 
 If you want to use the Redundant Ring Protocol you will also want to pass the
 'ring1_addr' parameter.
@@ -291,6 +295,7 @@ cluster again, you have to
 
 * then join it, as explained in the previous section.
 
+[[pvecm_separate_node_without_reinstall]]
 Separate A Node Without Reinstalling
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -315,32 +320,44 @@ conflicts and problems else.
 
 First stop the corosync and the pve-cluster services on the node:
 [source,bash]
+----
 systemctl stop pve-cluster
 systemctl stop corosync
+----
 
 Start the cluster filesystem again in local mode:
 [source,bash]
+----
 pmxcfs -l
+----
 
 Delete the corosync configuration files:
 [source,bash]
+----
 rm /etc/pve/corosync.conf
 rm /etc/corosync/*
+----
 
 You can now start the filesystem again as a normal service:
 [source,bash]
+----
 killall pmxcfs
 systemctl start pve-cluster
+----
 
 The node is now separated from the cluster. You can delete it from a
 remaining node of the cluster with:
 [source,bash]
+----
 pvecm delnode oldnode
+----
 
 If the command fails because the remaining node in the cluster lost quorum
 when the now separated node exited, you may set the expected votes to 1 as a
 workaround:
 [source,bash]
+----
 pvecm expected 1
+----
 
 Then repeat the 'pvecm delnode' command.
@@ -349,7 +366,9 @@ from the old cluster. This ensures that the node can be added to another
 cluster again without problems.
 
 [source,bash]
+----
 rm /var/lib/corosync/*
+----
 
 As the configuration files from the other nodes are still in the cluster
 filesystem you may want to clean those up too. Simply remove the whole
@@ -420,7 +439,9 @@ omping -c 10000 -i 0.001 -F -q NODE1-IP NODE2-IP ...
 no multicast querier is active. This test has a duration of around 10
 minutes.
 [source,bash]
+----
 omping -c 600 -i 1 -q NODE1-IP NODE2-IP ...
+----
 
 Your network is not ready for clustering if any of these tests fails. Recheck
 your network configuration. Switches especially are notorious for having
@@ -456,11 +477,15 @@ and want to send and receive all cluster communication over this interface
 you would execute:
 [source,bash]
+----
 pvecm create test --ring0_addr 10.10.10.1 --bindnet0_addr 10.10.10.0
+----
 
 To check if everything is working properly, execute:
 [source,bash]
+----
 systemctl status corosync
+----
 
 [[separate-cluster-net-after-creation]]
 Separate After Cluster Creation
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -596,12 +621,16 @@ As our change cannot be enforced live from corosync we have to do a restart.
 
 On a single node execute:
 [source,bash]
+----
 systemctl restart corosync
+----
 
 Now check if everything is fine:
 
 [source,bash]
+----
 systemctl status corosync
+----
 
 If corosync runs correctly again, restart it on all other nodes too.
 They will then join the cluster membership one by one on the new network.
@@ -628,15 +657,18 @@ So if you have two networks, one on the 10.10.10.1/24 and the other on the
 10.10.20.1/24 subnet you would execute:
 
 [source,bash]
+----
 pvecm create CLUSTERNAME -bindnet0_addr 10.10.10.1 -ring0_addr 10.10.10.1 \
 -bindnet1_addr 10.10.20.1 -ring1_addr 10.10.20.1
+----
 
 
 RRP On A Created Cluster
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
 When enabling an already running cluster to use RRP you will take similar steps
-as described in <>. You just do it on another ring.
+as described in
+<>. You
+just do it on another ring.
 
 First add a new `interface` subsection in the `totem` section, set its
 `ringnumber` property to `1`. Set the interface's `bindnetaddr` property to an
@@ -691,8 +723,8 @@ nodelist {
 ----
 
 
-Bring it into effect as described in the <> section.
+Bring it into effect as described in the
+<> section.
 
 This is a change which cannot take effect live and needs at least a restart
 of corosync. A restart of the whole cluster is recommended.
@@ -708,7 +740,9 @@ The `/etc/pve/corosync.conf` file plays a central role in a {pve} cluster. It
 controls the cluster membership and its network. To read more about it, check
 the corosync.conf man page:
 [source,bash]
+----
 man corosync.conf
+----
 
 For node membership you should always use the `pvecm` tool provided by {pve}.
 You may have to edit the configuration file manually for other changes.
@@ -729,7 +763,9 @@ instantly effect. So you should always make a copy and edit that instead, to
 avoid triggering some unwanted changes by an in-between save.
 
 [source,bash]
+----
 cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new
+----
 
 Then open the config file with your favorite editor; `nano` and `vim.tiny`,
 for example, are preinstalled on {pve}.
@@ -742,21 +778,29 @@ configuration file. This serves as a backup if the new configuration fails to
 apply or causes problems in other ways.
 
 [source,bash]
+----
 cp /etc/pve/corosync.conf /etc/pve/corosync.conf.bak
+----
 
 Then move the new configuration file over the old one:
 [source,bash]
+----
 mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf
+----
 
 With the following commands you can check
 
 [source,bash]
+----
 systemctl status corosync
 journalctl -b -u corosync
+----
 
 whether the change could be applied automatically. If not, you may have to
 restart the corosync service via:
 [source,bash]
+----
 systemctl restart corosync
+----
 
 On errors check the troubleshooting section below.
@@ -786,7 +830,9 @@ Write Configuration When Not Quorate
 If you need to change '/etc/pve/corosync.conf' on a node with no quorum, and
 you know what you are doing, use:
 [source,bash]
+----
 pvecm expected 1
+----
 
 This sets the expected vote count to 1 and makes the cluster quorate. You can
 now fix your configuration, or revert it to the last working backup.
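
Read in sequence, the last few hunks describe one complete, safe edit cycle for
`/etc/pve/corosync.conf`. As a minimal sketch that consolidates those steps in
one place — assuming the `.new`/`.bak` naming convention from the text, with
`nano` standing in for whatever editor you prefer — the whole cycle looks like
this:

[source,bash]
----
# work on a copy, so that an in-between save cannot trigger an unwanted reload
cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new
nano /etc/pve/corosync.conf.new

# keep a backup of the last working configuration, then activate the new one
cp /etc/pve/corosync.conf /etc/pve/corosync.conf.bak
mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf

# check whether the change could be applied automatically
systemctl status corosync
journalctl -b -u corosync

# if it could not, restart the service
systemctl restart corosync
----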
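The not-quorate case can be sketched the same way. In the sketch below,
`pvecm status` as a way to inspect the quorum state is an assumption on top of
the text above; `pvecm expected 1` and the backup path are quoted from it. As
the document warns, only use this if you know what you are doing:

[source,bash]
----
# inspect the current quorum state (assumed helper, not shown in the hunks above)
pvecm status

# make the lone node quorate by lowering the expected vote count
pvecm expected 1

# now fix the configuration, or roll back to the last working backup
cp /etc/pve/corosync.conf.bak /etc/pve/corosync.conf
----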