include::attributes.txt[]

pvecm - {pve} Cluster Manager

include::pvecm.1-synopsis.adoc[]

include::attributes.txt[]
The {PVE} cluster manager `pvecm` is a tool to create a group of
physical servers. Such a group is called a *cluster*. We use the
http://www.corosync.org[Corosync Cluster Engine] for reliable group
communication, and such clusters can consist of up to 32 physical nodes
(probably more, depending on network latency).
`pvecm` can be used to create a new cluster, join nodes to a cluster,
leave the cluster, get status information, and do various other cluster
related tasks. The **P**rox**m**o**x** **C**luster **F**ile **S**ystem (``pmxcfs'')
is used to transparently distribute the cluster configuration to all
cluster nodes.
Grouping nodes into a cluster has the following advantages:

* Centralized, web based management

* Multi-master clusters: each node can do all management tasks

* `pmxcfs`: database-driven file system for storing configuration files,
  replicated in real-time on all nodes using `corosync`

* Easy migration of virtual machines and containers between physical
  hosts

* Cluster-wide services like firewall and HA
Requirements
------------

* All nodes must be in the same network, as `corosync` uses IP multicast
to communicate between nodes (also see
http://www.corosync.org[Corosync Cluster Engine]). Corosync uses UDP
ports 5404 and 5405 for cluster communication.

NOTE: Some switches do not support IP multicast by default and must be
manually enabled first.
* Date and time have to be synchronized.

* An SSH tunnel on TCP port 22 between nodes is used.

* If you are interested in High Availability, you need to have at
least three nodes for reliable quorum. All nodes should have the
same version.

* We recommend a dedicated NIC for the cluster traffic, especially if
you use shared storage.

NOTE: It is not possible to mix Proxmox VE 3.x and earlier with
Proxmox VE 4.0 cluster nodes.
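Whether your network actually delivers IP multicast between the nodes can be probed with the `omping` utility before creating the cluster. This is a sketch under two assumptions: `omping` is installed on every node, and `hp1`/`hp2` are example hostnames that resolve to the nodes.

```shell
# Run the same command on all nodes at roughly the same time;
# each node reports unicast and multicast response statistics
# for the others. Near-100% multicast responses indicate a
# multicast-capable network.
omping -c 600 -i 1 -q hp1 hp2
```

If multicast loss is high while unicast works, check IGMP snooping and multicast settings on your switches first.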
Preparing Nodes
---------------

First, install {PVE} on all nodes. Make sure that each node is
installed with the final hostname and IP configuration. Changing the
hostname and IP is not possible after cluster creation.

Currently the cluster creation has to be done on the console, so you
need to login via `ssh`.
Create the Cluster
------------------

Login via `ssh` to the first {pve} node. Use a unique name for your cluster.
This name cannot be changed later.

 hp1# pvecm create YOUR-CLUSTER-NAME

CAUTION: The cluster name is used to compute the default multicast
address. Please use unique cluster names if you run more than one
cluster inside your network.
To check the state of your cluster use:

 hp1# pvecm status
Adding Nodes to the Cluster
---------------------------

Login via `ssh` to the node you want to add.

 hp2# pvecm add IP-ADDRESS-CLUSTER

For `IP-ADDRESS-CLUSTER` use the IP of an existing cluster node.
CAUTION: A new node cannot hold any VMs, because you would get
conflicts about identical VM IDs. Also, all existing configuration in
`/etc/pve` is overwritten when you join a new node to the cluster. As a
workaround, use `vzdump` to backup and restore to a different VMID after
adding the node to the cluster.
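The workaround above can be sketched as follows. Assumptions in this example: the joining node currently holds a QEMU guest with VMID 100, VMID 200 is free in the cluster after the join, and `192.168.15.91` is the IP of an existing cluster node.

```shell
# On the node that is about to join: back up the local guest first.
vzdump 100 --dumpdir /var/lib/vz/dump

# Join the cluster (this overwrites /etc/pve on this node).
pvecm add 192.168.15.91

# Restore the guest under a VMID that is free cluster-wide.
# The exact archive name depends on timestamp and compression settings.
qmrestore /var/lib/vz/dump/vzdump-qemu-100-*.vma 200
```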
To check the state of the cluster:

 hp2# pvecm status
.Cluster status after adding 4 nodes
----
Date:             Mon Apr 20 12:30:13 2015
Quorum provider:  corosync_votequorum

Votequorum information
~~~~~~~~~~~~~~~~~~~~~~

Membership information
~~~~~~~~~~~~~~~~~~~~~~
0x00000001          1 192.168.15.91
0x00000002          1 192.168.15.92 (local)
0x00000003          1 192.168.15.93
0x00000004          1 192.168.15.94
----
If you only want the list of all nodes use:

 hp1# pvecm nodes

.List nodes in a cluster
----
Membership information
~~~~~~~~~~~~~~~~~~~~~~
----
Remove a Cluster Node
---------------------

CAUTION: Read the procedure carefully before proceeding, as it may not
be what you want or need.

Move all virtual machines from the node. Make sure you have no local
data or backups you want to keep, or save them accordingly.
Log in to one remaining node via `ssh`. Issue a `pvecm nodes` command to
identify the node ID:

----
Date:             Mon Apr 20 12:30:13 2015
Quorum provider:  corosync_votequorum

Votequorum information
~~~~~~~~~~~~~~~~~~~~~~

Membership information
~~~~~~~~~~~~~~~~~~~~~~
0x00000001          1 192.168.15.91 (local)
0x00000002          1 192.168.15.92
0x00000003          1 192.168.15.93
0x00000004          1 192.168.15.94
----
IMPORTANT: At this point you must power off the node to be removed and
make sure that it will not power on again (in the network) as it is.
Log in to one remaining node via `ssh`. Issue the delete command (here
deleting node `hp4`):

 hp1# pvecm delnode hp4

If the operation succeeds no output is returned; just check the node
list again with `pvecm nodes` or `pvecm status`. You should see
something like:
----
Date:             Mon Apr 20 12:44:28 2015
Quorum provider:  corosync_votequorum

Votequorum information
~~~~~~~~~~~~~~~~~~~~~~

Membership information
~~~~~~~~~~~~~~~~~~~~~~
0x00000001          1 192.168.15.90 (local)
0x00000002          1 192.168.15.91
0x00000003          1 192.168.15.92
----
IMPORTANT: As said above, it is very important to power off the node
*before* removal, and make sure that it will *never* power on again
(in the existing cluster network) as it is.

If you power on the node as it is, your cluster will be damaged, and
it could be difficult to restore a clean cluster state.
If, for whatever reason, you want this server to join the same
cluster again, you have to:

* reinstall {pve} on it from scratch

* then join it, as explained in the previous section.
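A minimal sketch of that rejoin, assuming the node has been freshly reinstalled and `192.168.15.90` is the IP of an existing cluster node:

```shell
# On the freshly reinstalled node: join the existing cluster again.
pvecm add 192.168.15.90

# Verify that the node shows up and the cluster is quorate.
pvecm status
```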
Quorum
------

{pve} uses a quorum-based technique to provide a consistent state among
all cluster nodes.

[quote, from Wikipedia, Quorum (distributed computing)]
____
A quorum is the minimum number of votes that a distributed transaction
has to obtain in order to be allowed to perform an operation in a
distributed system.
____

In case of network partitioning, state changes require that a
majority of nodes are online. The cluster switches to read-only mode
if it loses quorum.

NOTE: {pve} assigns a single vote to each node by default.
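As a quick illustration of the majority rule (a hypothetical helper for this document, not part of `pvecm`), the number of votes needed for quorum when every node holds one vote is:

```shell
# quorum_votes N: minimum votes for a strict majority among N
# single-vote nodes, i.e. floor(N/2) + 1.
quorum_votes() {
  echo $(( $1 / 2 + 1 ))
}

quorum_votes 3   # prints 2: a 3-node cluster stays quorate with one node down
quorum_votes 4   # prints 3: a 4-node cluster also tolerates only one node down
```

This is why adding a fourth node does not improve failure tolerance over three, and why at least three nodes are recommended for reliable quorum.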
It is obvious that a cluster is not quorate when all nodes are
offline. This is a common case after a power failure.

NOTE: It is always a good idea to use an uninterruptible power supply
(``UPS'', also called ``battery backup'') to avoid this state, especially if
you want HA.
On node startup, service `pve-manager` is started and waits for
quorum. Once quorate, it starts all guests which have the `onboot`
flag set.

When you turn on nodes, or when power comes back after power failure,
it is likely that some nodes boot faster than others. Please keep in
mind that guest startup is delayed until you reach quorum.
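To mark a guest for automatic start once the node is quorate, set its `onboot` flag. This example assumes VMID 100 for a virtual machine and CTID 101 for a container:

```shell
# Start VM 100 automatically at boot, once the cluster is quorate.
qm set 100 --onboot 1

# The same flag exists for containers, via pct.
pct set 101 --onboot 1
```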
include::pve-copyright.adoc[]