The {PVE} cluster manager `pvecm` is a tool to create a group of
physical servers. Such a group is called a *cluster*. We use the
http://www.corosync.org[Corosync Cluster Engine] for reliable group
-communication, and such clusters can consist of up to 32 physical nodes
-(probably more, dependent on network latency).
+communication. There's no explicit limit for the number of nodes in a cluster.
+In practice, the actual possible node count may be limited by the host and
+network performance. Currently (2021), there are reports of clusters (using
+high-end enterprise hardware) with over 50 nodes in production.
`pvecm` can be used to create a new cluster, join nodes to a cluster,
-leave the cluster, get status information and do various other cluster
-related tasks. The **P**rox**m**o**x** **C**luster **F**ile **S**ystem (``pmxcfs'')
+leave the cluster, get status information and do various other cluster-related
+tasks. The **P**rox**m**o**x** **C**luster **F**ile **S**ystem (``pmxcfs'')
is used to transparently distribute the cluster configuration to all cluster
nodes.
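
For illustration, a minimal workflow might look like the following (the cluster
name, host names and IP address are just examples):

----
# create a new cluster on the first node
node1# pvecm create CLUSTERNAME

# join a further node, using the address of an existing cluster member
node2# pvecm add 192.168.10.1

# check membership and quorum state
node2# pvecm status
----
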
* Date and time have to be synchronized.
-* SSH tunnel on TCP port 22 between nodes is used.
+* SSH tunnel on TCP port 22 between nodes is used.
* If you are interested in High Availability, you need to have at
least three nodes for reliable quorum. All nodes should have the
----
hp1# pvecm delnode hp4
+ Killing node 4
----
-If the operation succeeds no output is returned, just check the node
-list again with `pvecm nodes` or `pvecm status`. You should see
-something like:
+Use `pvecm nodes` or `pvecm status` to check the node list again. It should
+look something like:
----
hp1# pvecm status
scratch. But after removing the node from the cluster it will still have
access to the shared storages! This must be resolved before you start removing
the node from the cluster. A {pve} cluster cannot share the exact same
-storage with another cluster, as storage locking doesn't work over cluster
+storage with another cluster, as storage locking doesn't work over the cluster
boundary. Further, it may also lead to VMID conflicts.
It's suggested that you create a new storage where only the node which you want
WARNING: Ensure all shared resources are cleanly separated! Otherwise you will
run into conflicts and problems.
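
One way to enforce such a separation is the 'nodes' property of a storage,
which restricts on which nodes a storage can be used. For example (the storage
ID and node names are just placeholders):

----
# only the nodes staying in the cluster may use this shared storage
hp1# pvesm set sharedstore --nodes hp1,hp2,hp3
----
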
-First stop the corosync and the pve-cluster services on the node:
+First, stop the corosync and the pve-cluster services on the node:
[source,bash]
----
systemctl stop pve-cluster
[source,bash]
----
rm /etc/pve/corosync.conf
-rm /etc/corosync/*
+rm -r /etc/corosync/*
----
You can now start the filesystem again as a normal service:
Setting Up A New Network
^^^^^^^^^^^^^^^^^^^^^^^^
-First you have to set up a new network interface. It should be on a physically
+First, you have to set up a new network interface. It should be on a physically
separate network. Ensure that your network fulfills the
xref:pvecm_cluster_network_requirements[cluster network requirements].
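
As an illustration, a dedicated interface on such a separate network could be
configured in `/etc/network/interfaces` roughly like this (the interface name
and addresses are just examples):

----
auto eno2
iface eno2 inet static
        address 10.10.10.1
        netmask 255.255.255.0
----
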
which may lead to a situation where an address is changed without thinking
about implications for corosync.
-A seperate, static hostname specifically for corosync is recommended, if
+A separate, static hostname specifically for corosync is recommended, if
hostnames are preferred. Also, make sure that every node in the cluster can
resolve all hostnames correctly.
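
For example, such dedicated hostnames could be maintained through `/etc/hosts`
entries on every node (the names and addresses below are just examples):

----
10.10.10.1 corosync1.local corosync1
10.10.10.2 corosync2.local corosync2
10.10.10.3 corosync3.local corosync3
----
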
Nodes that joined the cluster on earlier versions likely still use their
unresolved hostname in `corosync.conf`. It might be a good idea to replace
-them with IPs or a seperate hostname, as mentioned above.
+them with IPs or a separate hostname, as mentioned above.
[[pvecm_redundancy]]
Links are used according to a priority setting. You can configure this priority
by setting 'knet_link_priority' in the corresponding interface section in
-`corosync.conf`, or, preferrably, using the 'priority' parameter when creating
+`corosync.conf`, or, preferably, using the 'priority' parameter when creating
your cluster with `pvecm`:
----
- # pvecm create CLUSTERNAME --link0 10.10.10.1,priority=20 --link1 10.20.20.1,priority=15
+ # pvecm create CLUSTERNAME --link0 10.10.10.1,priority=15 --link1 10.20.20.1,priority=20
----
-This would cause 'link1' to be used first, since it has the lower priority.
+This would cause 'link1' to be used first, since it has the higher priority.
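
For reference, the same priorities set directly in `corosync.conf` would
roughly correspond to the following 'interface' sub-sections inside the
'totem' section:

----
  interface {
    linknumber: 0
    knet_link_priority: 15
  }
  interface {
    linknumber: 1
    knet_link_priority: 20
  }
----
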
If no priorities are configured manually (or two links have the same priority),
links will be used in order of their number, with the lower number having higher
If you see a healthy cluster state, it means that your new link is being used.
+Role of SSH in {PVE} Clusters
+-----------------------------
+
+{PVE} utilizes SSH tunnels for various features.
+
+* Proxying console/shell sessions (node and guests)
++
+When using the shell for node B while being connected to node A, the session
+connects to a terminal proxy on node A, which is in turn connected to the login
+shell on node B via a non-interactive SSH tunnel.
+
+* VM and CT memory and local-storage migration in 'secure' mode.
++
+During the migration, one or more SSH tunnels are established between the
+source and target nodes, in order to exchange migration information and to
+transfer memory and disk contents.
+
+* Storage replication
+
+.Pitfalls due to automatic execution of `.bashrc` and siblings
+[IMPORTANT]
+====
+If you have a custom `.bashrc`, or similar files that get executed on login by
+the configured shell, `ssh` will automatically run them once the session is
+established successfully. This can cause unexpected behavior, as those commands
+may be executed with root permissions during any of the operations described
+above. This can have problematic side effects!
+
+In order to avoid such complications, it's recommended to add a check in
+`/root/.bashrc` to make sure the session is interactive, and only then run
+`.bashrc` commands.
+
+You can add this snippet at the beginning of your `.bashrc` file:
+
+----
+# Early exit if not running interactively to avoid side-effects!
+case $- in
+ *i*) ;;
+ *) return;;
+esac
+----
+====
+
+
Corosync External Vote Support
------------------------------
QDevice Technical Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~
-The Corosync Quroum Device (QDevice) is a daemon which runs on each cluster
+The Corosync Quorum Device (QDevice) is a daemon which runs on each cluster
node. It provides a configured number of votes to the cluster's quorum
subsystem, based on the decision of an externally running third-party
arbitrator.
Its primary use is to allow a cluster to sustain more node failures than
~~~~~~~~~~~~~~~~~
We recommend running any daemon which provides votes to corosync-qdevice as an
-unprivileged user. {pve} and Debian provides a package which is already
+unprivileged user. {pve} and Debian provide a package which is already
configured to do so.
The traffic between the daemon and the cluster must be encrypted to ensure a
safe and secure QDevice integration in {pve}.
-First install the 'corosync-qnetd' package on your external server and
-the 'corosync-qdevice' package on all cluster nodes.
+First, install the 'corosync-qnetd' package on your external server
+
+----
+external# apt install corosync-qnetd
+----
+
+and the 'corosync-qdevice' package on all cluster nodes
+
+----
+pve# apt install corosync-qdevice
+----
After that, ensure that all the nodes in the cluster are online.
pve# pvecm qdevice setup <QDEVICE-IP>
----
-The SSH key from the cluster will be automatically copied to the QDevice. You
-might need to enter an SSH password during this step.
+The SSH key from the cluster will be automatically copied to the QDevice.
+
+NOTE: Make sure that the SSH configuration on your external server allows root
+login via password, if you are asked for a password during this step.
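+
+For example, password-based root login can usually be enabled by setting the
+following in `/etc/ssh/sshd_config` on the external server, and reloading the
+SSH daemon afterwards:
+
+----
+# /etc/ssh/sshd_config (on the external server)
+PermitRootLogin yes
+----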
Once all the steps have completed successfully, you will see "Done". You can
check the status now:
address 192.X.Y.57
netmask 255.255.255.0
gateway 192.X.Y.1
- bridge_ports eno1
- bridge_stp off
- bridge_fd 0
+ bridge-ports eno1
+ bridge-stp off
+ bridge-fd 0
# cluster network
auto eno2