[[chapter_pvecm]]
ifdef::manvolnum[]
pvecm(1)
========
:pve-toplevel:

NAME
----

pvecm - Proxmox VE Cluster Manager

SYNOPSIS
--------

include::pvecm.1-synopsis.adoc[]

DESCRIPTION
-----------
endif::manvolnum[]

ifndef::manvolnum[]
Cluster Manager
===============
:pve-toplevel:
endif::manvolnum[]

The {PVE} cluster manager `pvecm` is a tool to create a group of
physical servers. Such a group is called a *cluster*. We use the
http://www.corosync.org[Corosync Cluster Engine] for reliable group
communication, and such clusters can consist of up to 32 physical nodes
(probably more, dependent on network latency).

`pvecm` can be used to create a new cluster, join nodes to a cluster,
leave the cluster, get status information and do various other cluster
related tasks. The **P**rox**m**o**x** **C**luster **F**ile **S**ystem (``pmxcfs'')
is used to transparently distribute the cluster configuration to all cluster
nodes.

Grouping nodes into a cluster has the following advantages:

* Centralized, web based management

* Multi-master clusters: each node can do all management tasks

* `pmxcfs`: database-driven file system for storing configuration files,
  replicated in real-time on all nodes using `corosync`.

* Easy migration of virtual machines and containers between physical
  hosts

* Fast deployment

* Cluster-wide services like firewall and HA


Requirements
------------

* All nodes must be able to connect to each other via UDP ports 5404 and 5405
  for corosync to work.

* Date and time have to be synchronized.

* SSH tunnel on TCP port 22 between nodes is used.

* If you are interested in High Availability, you need to have at
  least three nodes for reliable quorum. All nodes should have the
  same version.

* We recommend a dedicated NIC for the cluster traffic, especially if
  you use shared storage.

* Root password of a cluster node is required for adding nodes.
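
To quickly verify the time synchronization requirement above, you can check the
NTP status on each prospective node before creating the cluster. A minimal
sketch (assuming systemd's `timedatectl` is available, as on a standard {pve}
installation):

[source,bash]
----
# the system clock should report as synchronized on every node
timedatectl status | grep -i 'synchronized'
----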

NOTE: It is not possible to mix {pve} 3.x and earlier with {pve} 4.X cluster
nodes.

NOTE: While it's possible to mix {pve} 4.4 and {pve} 5.0 nodes, doing so is
not supported as a production configuration and should only be done temporarily,
while upgrading the whole cluster from one major version to another.

NOTE: Running a cluster of {pve} 6.x with earlier versions is not possible. The
cluster protocol (corosync) between {pve} 6.x and earlier versions changed
fundamentally. The corosync 3 packages for {pve} 5.4 are only intended for the
upgrade procedure to {pve} 6.0.

Preparing Nodes
---------------

First, install {PVE} on all nodes. Make sure that each node is
installed with the final hostname and IP configuration. Changing the
hostname and IP is not possible after cluster creation.

While it's common to reference all node names and their IPs in `/etc/hosts` (or
make their names resolvable through other means), this is not necessary for a
cluster to work. It may be useful however, as you can then connect from one node
to the other with SSH via the easier-to-remember node name (see also
xref:pvecm_corosync_addresses[Link Address Types]). Note that we always
recommend referencing nodes by their IP addresses in the cluster configuration.
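
For example, a hand-maintained `/etc/hosts` with one entry per cluster node
might look like the following sketch (the node names and addresses are
placeholders, not a required layout):

----
127.0.0.1     localhost
192.168.15.91 hp1.example.com hp1
192.168.15.92 hp2.example.com hp2
192.168.15.93 hp3.example.com hp3
----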

[[pvecm_create_cluster]]
Create a Cluster
----------------

You can either create a cluster on the console (login via `ssh`), or through
the API using the {pve} web interface (__Datacenter -> Cluster__).

NOTE: Use a unique name for your cluster. This name cannot be changed later.
The cluster name follows the same rules as node names.

[[pvecm_cluster_create_via_gui]]
Create via Web GUI
~~~~~~~~~~~~~~~~~~

[thumbnail="screenshot/gui-cluster-create.png"]

Under __Datacenter -> Cluster__, click on *Create Cluster*. Enter the cluster
name and select a network connection from the dropdown to serve as the main
cluster network (Link 0). It defaults to the IP resolved via the node's
hostname.

To add a second link as fallback, you can select the 'Advanced' checkbox and
choose an additional network interface (Link 1, see also
xref:pvecm_redundancy[Corosync Redundancy]).

NOTE: Ensure the network selected for the cluster communication is not used for
any high traffic loads like those of (network) storages or live-migration.
While the cluster network itself produces small amounts of data, it is very
sensitive to latency. Check out the full
xref:pvecm_cluster_network_requirements[cluster network requirements].

[[pvecm_cluster_create_via_cli]]
Create via Command Line
~~~~~~~~~~~~~~~~~~~~~~~

Log in via `ssh` to the first {pve} node and run the following command:

----
 hp1# pvecm create CLUSTERNAME
----

To check the state of the new cluster use:

----
 hp1# pvecm status
----

Multiple Clusters In Same Network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is possible to create multiple clusters in the same physical or logical
network. Each such cluster must have a unique name to avoid possible clashes in
the cluster communication stack. This also helps avoid human confusion by making
clusters clearly distinguishable.

While the bandwidth requirement of a corosync cluster is relatively low, the
latency of packets and the packets per second (PPS) rate are the limiting
factors. Different clusters in the same network can compete with each other for
these resources, so it may still make sense to use separate physical network
infrastructure for bigger clusters.

[[pvecm_join_node_to_cluster]]
Adding Nodes to the Cluster
---------------------------

CAUTION: A node that is about to be added to the cluster cannot hold any guests.
All existing configuration in `/etc/pve` is overwritten when joining a cluster,
since guest IDs could conflict. As a workaround, create a backup of the
guest (`vzdump`) and restore it with a different ID after the node has been added
to the cluster.

Join Node to Cluster via GUI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[thumbnail="screenshot/gui-cluster-join-information.png"]

Log in to the web interface on an existing cluster node. Under __Datacenter ->
Cluster__, click the button *Join Information* at the top. Then, click on the
button *Copy Information*. Alternatively, copy the string from the 'Information'
field manually.

[thumbnail="screenshot/gui-cluster-join.png"]

Next, log in to the web interface on the node you want to add.
Under __Datacenter -> Cluster__, click on *Join Cluster*. Fill in the
'Information' field with the 'Join Information' text you copied earlier.
Most settings required for joining the cluster will be filled out
automatically. For security reasons, the cluster password has to be entered
manually.

NOTE: To enter all required data manually, you can disable the 'Assisted Join'
checkbox.

After clicking the *Join* button, the cluster join process will start
immediately. After the node has joined the cluster, its current node certificate
will be replaced by one signed by the cluster certificate authority (CA). This
means that the current session will stop working after a few seconds. You might
then need to force-reload the web interface and log in again with the cluster
credentials.

Now your node should be visible under __Datacenter -> Cluster__.

Join Node to Cluster via Command Line
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Log in via `ssh` to the node you want to join into an existing cluster.

----
 hp2# pvecm add IP-ADDRESS-CLUSTER
----

For `IP-ADDRESS-CLUSTER` use the IP or hostname of an existing cluster node.
An IP address is recommended (see xref:pvecm_corosync_addresses[Link Address Types]).

To check the state of the cluster use:

----
 # pvecm status
----

.Cluster status after adding 4 nodes
----
hp2# pvecm status
Quorum information
~~~~~~~~~~~~~~~~~~
Date:             Mon Apr 20 12:30:13 2015
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000001
Ring ID:          1/8
Quorate:          Yes

Votequorum information
~~~~~~~~~~~~~~~~~~~~~~
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3
Flags:            Quorate

Membership information
~~~~~~~~~~~~~~~~~~~~~~
    Nodeid      Votes Name
0x00000001          1 192.168.15.91
0x00000002          1 192.168.15.92 (local)
0x00000003          1 192.168.15.93
0x00000004          1 192.168.15.94
----

If you only want the list of all nodes use:

----
 # pvecm nodes
----

.List nodes in a cluster
----
hp2# pvecm nodes

Membership information
~~~~~~~~~~~~~~~~~~~~~~
    Nodeid      Votes Name
         1          1 hp1
         2          1 hp2 (local)
         3          1 hp3
         4          1 hp4
----

[[pvecm_adding_nodes_with_separated_cluster_network]]
Adding Nodes With Separated Cluster Network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When adding a node to a cluster with a separated cluster network, you need to
use the 'link0' parameter to set the node's address on that network:

[source,bash]
----
pvecm add IP-ADDRESS-CLUSTER --link0 LOCAL-IP-ADDRESS-LINK0
----

If you want to use the built-in xref:pvecm_redundancy[redundancy] of the
kronosnet transport layer, also use the 'link1' parameter.

Using the GUI, you can select the correct interface from the corresponding 'Link 0'
and 'Link 1' fields in the *Cluster Join* dialog.

Remove a Cluster Node
---------------------

CAUTION: Read the procedure carefully before proceeding, as it may not be
what you want or need.

Move all virtual machines from the node. Make sure you have no local
data or backups you want to keep, or save them accordingly.
In the following example we will remove the node hp4 from the cluster.

Log in to a *different* cluster node (not hp4), and issue a `pvecm nodes`
command to identify the node ID to remove:

----
hp1# pvecm nodes

Membership information
~~~~~~~~~~~~~~~~~~~~~~
    Nodeid      Votes Name
         1          1 hp1 (local)
         2          1 hp2
         3          1 hp3
         4          1 hp4
----

At this point you must power off hp4 and
make sure that it will not power on again (in the network) as it
is.

IMPORTANT: As mentioned above, it is critical to power off the node
*before* removal, and make sure that it will *never* power on again
(in the existing cluster network) as it is.
If you power on the node as it is, your cluster will be screwed up and
it could be difficult to restore a clean cluster state.

After powering off the node hp4, we can safely remove it from the cluster.

----
 hp1# pvecm delnode hp4
 Killing node 4
----
8a865621 332
10da5ce1
DJ
333Use `pvecm nodes` or `pvecm status` to check the node list again. It should
334look something like:
8a865621
DM
335
336----
337hp1# pvecm status
338
339Quorum information
340~~~~~~~~~~~~~~~~~~
341Date: Mon Apr 20 12:44:28 2015
342Quorum provider: corosync_votequorum
343Nodes: 3
344Node ID: 0x00000001
a9e7c3aa 345Ring ID: 1/8
8a865621
DM
346Quorate: Yes
347
348Votequorum information
349~~~~~~~~~~~~~~~~~~~~~~
350Expected votes: 3
351Highest expected: 3
352Total votes: 3
91f3edd0 353Quorum: 2
8a865621
DM
354Flags: Quorate
355
356Membership information
357~~~~~~~~~~~~~~~~~~~~~~
358 Nodeid Votes Name
3590x00000001 1 192.168.15.90 (local)
3600x00000002 1 192.168.15.91
3610x00000003 1 192.168.15.92
362----
363
a9e7c3aa
SR
364If, for whatever reason, you want this server to join the same cluster again,
365you have to
8a865621 366
26ca7ff5 367* reinstall {pve} on it from scratch
8a865621
DM
368
369* then join it, as explained in the previous section.
d8742b0c 370
41925ede
SR
371NOTE: After removal of the node, its SSH fingerprint will still reside in the
372'known_hosts' of the other nodes. If you receive an SSH error after rejoining
9121b45b
TL
373a node with the same IP or hostname, run `pvecm updatecerts` once on the
374re-added node to update its fingerprint cluster wide.

[[pvecm_separate_node_without_reinstall]]
Separate A Node Without Reinstalling
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

CAUTION: This is *not* the recommended method, proceed with caution. Use the
above mentioned method if you're unsure.

You can also separate a node from a cluster without reinstalling it from
scratch. But after removing the node from the cluster, it will still have
access to the shared storages! This must be resolved before you start removing
the node from the cluster. A {pve} cluster cannot share the exact same
storage with another cluster, as storage locking doesn't work over the cluster
boundary. Furthermore, it may also lead to VMID conflicts.

It's suggested that you create a new storage, where only the node which you want
to separate has access. This can be a new export on your NFS or a new Ceph
pool, to name a few examples. It's just important that the exact same storage
does not get accessed by multiple clusters. After setting up this storage, move
all data from the node and its VMs to it. Then you are ready to separate the
node from the cluster.

WARNING: Ensure all shared resources are cleanly separated! Otherwise you will
run into conflicts and problems.

First, stop the corosync and the pve-cluster services on the node:
[source,bash]
----
systemctl stop pve-cluster
systemctl stop corosync
----

Start the cluster filesystem again in local mode:
[source,bash]
----
pmxcfs -l
----

Delete the corosync configuration files:
[source,bash]
----
rm /etc/pve/corosync.conf
rm -r /etc/corosync/*
----

You can now start the filesystem again as a normal service:
[source,bash]
----
killall pmxcfs
systemctl start pve-cluster
----

The node is now separated from the cluster. You can delete it from a remaining
node of the cluster with:
[source,bash]
----
pvecm delnode oldnode
----

If the command fails because the remaining node in the cluster lost quorum
when the now separate node exited, you may set the expected votes to 1 as a workaround:
[source,bash]
----
pvecm expected 1
----

And then repeat the 'pvecm delnode' command.

Now switch back to the separated node and delete all remaining files left
over from the old cluster. This ensures that the node can be added to another
cluster again without problems.

[source,bash]
----
rm /var/lib/corosync/*
----

As the configuration files from the other nodes are still in the cluster
filesystem, you may want to clean those up too. Simply remove the whole
directory recursively from '/etc/pve/nodes/NODENAME', but check three times that
you used the correct one before deleting it.
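
For example, a minimal sketch (replace 'NODENAME' with the name of the node you
just separated, and double-check the path before running it):

[source,bash]
----
rm -r /etc/pve/nodes/NODENAME
----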

CAUTION: The node's SSH keys are still in the 'authorized_keys' file. This means
that the nodes can still connect to each other with public key authentication.
This should be fixed by removing the respective keys from the
'/etc/pve/priv/authorized_keys' file.

Quorum
------

{pve} uses a quorum-based technique to provide a consistent state among
all cluster nodes.

[quote, from Wikipedia, Quorum (distributed computing)]
____
A quorum is the minimum number of votes that a distributed transaction
has to obtain in order to be allowed to perform an operation in a
distributed system.
____

In case of network partitioning, state changes require that a
majority of nodes are online. The cluster switches to read-only mode
if it loses quorum.

NOTE: {pve} assigns a single vote to each node by default.
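
To see whether the cluster is currently quorate and how many votes are expected,
you can look at the quorum-related lines of `pvecm status`. A simple sketch:

[source,bash]
----
# show only the quorum-related lines of the cluster status
pvecm status | grep -E 'Quorate|Expected votes|Total votes'
----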

Cluster Network
---------------

The cluster network is the core of a cluster. All messages sent over it have to
be delivered reliably to all nodes in their respective order. In {pve} this
part is done by corosync, an implementation of a high-performance, low-overhead,
high-availability development toolkit. It serves our decentralized
configuration file system (`pmxcfs`).

[[pvecm_cluster_network_requirements]]
Network Requirements
~~~~~~~~~~~~~~~~~~~~
This needs a reliable network with latencies under 2 milliseconds (LAN
performance) to work properly. The network should not be used heavily by other
members; ideally corosync runs on its own network. Do not use a shared network
for corosync and storage (except as a potential low-priority fallback in a
xref:pvecm_redundancy[redundant] configuration).

Before setting up a cluster, it is good practice to check if the network is fit
for that purpose. To make sure the nodes can connect to each other on the
cluster network, you can test the connectivity between them with the `ping`
tool.
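
For example, a quick latency check between two prospective cluster nodes might
look like this (the target address is a placeholder; the round-trip times
reported by `ping` should stay well below the 2 millisecond requirement):

[source,bash]
----
# send a few probes to another node and inspect the rtt min/avg/max summary
ping -c 10 10.10.10.2
----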
e4ec4154 505
a9e7c3aa
SR
506If the {pve} firewall is enabled, ACCEPT rules for corosync will automatically
507be generated - no manual action is required.
e4ec4154 508
a9e7c3aa
SR
509NOTE: Corosync used Multicast before version 3.0 (introduced in {pve} 6.0).
510Modern versions rely on https://kronosnet.org/[Kronosnet] for cluster
511communication, which, for now, only supports regular UDP unicast.
e4ec4154 512
a9e7c3aa
SR
513CAUTION: You can still enable Multicast or legacy unicast by setting your
514transport to `udp` or `udpu` in your xref:pvecm_edit_corosync_conf[corosync.conf],
515but keep in mind that this will disable all cryptography and redundancy support.
516This is therefore not recommended.
e4ec4154
TL
517

Separate Cluster Network
~~~~~~~~~~~~~~~~~~~~~~~~

When creating a cluster without any parameters, the corosync cluster network is
generally shared with the Web UI and the VMs and their traffic. Depending on
your setup, even storage traffic may get sent over the same network. It's
recommended to change that, as corosync is a time-critical, real-time
application.

Setting Up A New Network
^^^^^^^^^^^^^^^^^^^^^^^^

First, you have to set up a new network interface. It should be on a physically
separate network. Ensure that your network fulfills the
xref:pvecm_cluster_network_requirements[cluster network requirements].

Separate On Cluster Creation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is possible via the 'linkX' parameters of the 'pvecm create'
command used for creating a new cluster.

If you have set up an additional NIC with a static address on 10.10.10.1/25,
and want to send and receive all cluster communication over this interface,
you would execute:

[source,bash]
----
pvecm create test --link0 10.10.10.1
----

To check if everything is working properly execute:
[source,bash]
----
systemctl status corosync
----

Afterwards, proceed as described above to
xref:pvecm_adding_nodes_with_separated_cluster_network[add nodes with a separated cluster network].

[[pvecm_separate_cluster_net_after_creation]]
Separate After Cluster Creation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can do this if you have already created a cluster and want to switch
its communication to another network, without rebuilding the whole cluster.
This change may lead to short durations of quorum loss in the cluster, as nodes
have to restart corosync and come up one after the other on the new network.

Check how to xref:pvecm_edit_corosync_conf[edit the corosync.conf file] first.
Then, open it and you should see a file similar to:

----
logging {
  debug: off
  to_syslog: yes
}

nodelist {

  node {
    name: due
    nodeid: 2
    quorum_votes: 1
    ring0_addr: due
  }

  node {
    name: tre
    nodeid: 3
    quorum_votes: 1
    ring0_addr: tre
  }

  node {
    name: uno
    nodeid: 1
    quorum_votes: 1
    ring0_addr: uno
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: testcluster
  config_version: 3
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    linknumber: 0
  }

}
----

NOTE: `ringX_addr` actually specifies a corosync *link address*. The name "ring"
is a remnant of older corosync versions that is kept for backwards
compatibility.

The first thing you want to do is add the 'name' properties in the node entries
if you do not see them already. Those *must* match the node name.

Then replace all addresses from the 'ring0_addr' properties of all nodes with
the new addresses. You may use plain IP addresses or hostnames here. If you use
hostnames, ensure that they are resolvable from all nodes (see also
xref:pvecm_corosync_addresses[Link Address Types]).

In this example, we want to switch the cluster communication to the
10.10.10.1/25 network. So we replace all 'ring0_addr' entries accordingly.

NOTE: The exact same procedure can be used to change other 'ringX_addr' values
as well, although we recommend not changing multiple addresses at once, to make
it easier to recover if something goes wrong.

After we increase the 'config_version' property, the new configuration file
should look like:

----
logging {
  debug: off
  to_syslog: yes
}

nodelist {

  node {
    name: due
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.10.2
  }

  node {
    name: tre
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.10.10.3
  }

  node {
    name: uno
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.1
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: testcluster
  config_version: 4
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    linknumber: 0
  }

}
----

Then, after a final check that all the changed information is correct, we save it
and once again follow the xref:pvecm_edit_corosync_conf[edit corosync.conf file]
section to bring it into effect.

The changes will be applied live, so restarting corosync is not strictly
necessary. If you changed other settings as well, or notice corosync
complaining, you can optionally trigger a restart.

On a single node execute:

[source,bash]
----
systemctl restart corosync
----

Now check if everything is fine:

[source,bash]
----
systemctl status corosync
----

If corosync runs correctly again, restart it on all other nodes as well.
They will then join the cluster membership one by one on the new network.

[[pvecm_corosync_addresses]]
Corosync addresses
~~~~~~~~~~~~~~~~~~

A corosync link address (for backwards compatibility denoted by 'ringX_addr' in
`corosync.conf`) can be specified in two ways:

* **IPv4/v6 addresses** will be used directly. They are recommended, since they
are static and usually not changed carelessly.

* **Hostnames** will be resolved using `getaddrinfo`, which means that by
default, IPv6 addresses will be used first, if available (see also
`man gai.conf`). Keep this in mind, especially when upgrading an existing
cluster to IPv6.

CAUTION: Hostnames should be used with care, since the address they
resolve to can be changed without touching corosync or the node it runs on -
which may lead to a situation where an address is changed without thinking
about implications for corosync.

A separate, static hostname specifically for corosync is recommended, if
hostnames are preferred. Also, make sure that every node in the cluster can
resolve all hostnames correctly.

Since {pve} 5.1, while supported, hostnames will be resolved at the time of
entry. Only the resolved IP is then saved to the configuration.

Nodes that joined the cluster on earlier versions likely still use their
unresolved hostname in `corosync.conf`. It might be a good idea to replace
them with IPs or a separate hostname, as mentioned above.

[[pvecm_redundancy]]
Corosync Redundancy
-------------------

Corosync supports redundant networking via its integrated kronosnet layer by
default (it is not supported on the legacy udp/udpu transports). It can be
enabled by specifying more than one link address, either via the '--linkX'
parameters of `pvecm`, in the GUI as **Link 1** (while creating a cluster or
adding a new node) or by specifying more than one 'ringX_addr' in
`corosync.conf`.

NOTE: To provide useful failover, every link should be on its own
physical network connection.

Links are used according to a priority setting. You can configure this priority
by setting 'knet_link_priority' in the corresponding interface section in
`corosync.conf`, or, preferably, using the 'priority' parameter when creating
your cluster with `pvecm`:

----
 # pvecm create CLUSTERNAME --link0 10.10.10.1,priority=15 --link1 10.20.20.1,priority=20
----

This would cause 'link1' to be used first, since it has the higher priority.

If no priorities are configured manually (or two links have the same priority),
links will be used in order of their number, with the lower number having higher
priority.

Even if all links are working, only the one with the highest priority will see
corosync traffic. Link priorities cannot be mixed, i.e. links with different
priorities will not be able to communicate with each other.

Since lower priority links will not see traffic unless all higher priorities
have failed, it becomes a useful strategy to specify even networks used for
other tasks (VMs, storage, etc...) as low-priority links. If worst comes to
worst, a higher-latency or more congested connection might be better than no
connection at all.
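
Instead of the 'priority' parameter of `pvecm`, the priorities can also be set
directly via 'knet_link_priority' in the `totem` section of
xref:pvecm_edit_corosync_conf[corosync.conf]. A sketch of the two interface
subsections matching the example above could look like this (excerpt only;
higher values win):

----
  interface {
    linknumber: 0
    knet_link_priority: 15
  }
  interface {
    linknumber: 1
    knet_link_priority: 20
  }
----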

Adding Redundant Links To An Existing Cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add a new link to a running configuration, first check how to
xref:pvecm_edit_corosync_conf[edit the corosync.conf file].

Then, add a new 'ringX_addr' to every node in the `nodelist` section. Make
sure that your 'X' is the same for every node you add it to, and that it is
unique for each node.

Lastly, add a new 'interface', as shown below, to your `totem`
section, replacing 'X' with your link number chosen above.

Assuming you added a link with number 1, the new configuration file could look
like this:

----
logging {
  debug: off
  to_syslog: yes
}

nodelist {

  node {
    name: due
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.10.2
    ring1_addr: 10.20.20.2
  }

  node {
    name: tre
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.10.10.3
    ring1_addr: 10.20.20.3
  }

  node {
    name: uno
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.1
    ring1_addr: 10.20.20.1
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: testcluster
  config_version: 4
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
}
----

The new link will be enabled as soon as you follow the last steps to
xref:pvecm_edit_corosync_conf[edit the corosync.conf file]. A restart should not
be necessary. You can check that corosync loaded the new link using:

----
journalctl -b -u corosync
----

It might be a good idea to test the new link by temporarily disconnecting the
old link on one node and making sure that its status remains online while
disconnected:

----
pvecm status
----

If you see a healthy cluster state, it means that your new link is being used.

Role of SSH in {PVE} Clusters
-----------------------------

{PVE} utilizes SSH tunnels for various features.

* Proxying console/shell sessions (node and guests)
+
When using the shell for node B while being connected to node A, this connects
to a terminal proxy on node A, which is in turn connected to the login shell on
node B via a non-interactive SSH tunnel.

* VM and CT memory and local-storage migration in 'secure' mode.
+
During the migration, one or more SSH tunnel(s) are established between the
source and target nodes, in order to exchange migration information and
transfer memory and disk contents.

* Storage replication

.Pitfalls due to automatic execution of `.bashrc` and siblings
[IMPORTANT]
====
In case you have a custom `.bashrc`, or similar files that get executed on
login by the configured shell, `ssh` will automatically run it once the session
is established successfully. This can cause some unexpected behavior, as those
commands may be executed with root permissions on any of the operations
described above. That can cause possible problematic side-effects!

In order to avoid such complications, it's recommended to add a check in
`/root/.bashrc` to make sure the session is interactive, and only then run
`.bashrc` commands.

You can add this snippet at the beginning of your `.bashrc` file:

----
# Early exit if not running interactively to avoid side-effects!
case $- in
  *i*) ;;
  *) return;;
esac
----
====

Corosync External Vote Support
------------------------------

This section describes a way to deploy an external voter in a {pve} cluster.
When configured, the cluster can sustain more node failures without
violating safety properties of the cluster communication.

For this to work, there are two services involved:

* a so-called QDevice daemon which runs on each {pve} node

* an external vote daemon which runs on an independent server.

As a result, you can achieve higher availability, even in smaller setups (for
example 2+1 nodes).

QDevice Technical Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~

The Corosync Quorum Device (QDevice) is a daemon which runs on each cluster
node. It provides a configured number of votes to the cluster's quorum
subsystem, based on an externally running third-party arbitrator's decision.
Its primary use is to allow a cluster to sustain more node failures than
standard quorum rules allow. This can be done safely as the external device
can see all nodes and thus choose only one set of nodes to give its vote.
This will only be done if said set of nodes can have quorum (again) when
receiving the third-party vote.

Currently, only 'QDevice Net' is supported as a third-party arbitrator. It is
a daemon which provides a vote to a cluster partition, if it can reach the
partition members over the network. It will only give votes to one partition
of a cluster at any time.
It's designed to support multiple clusters and is almost configuration and
state free. New clusters are handled dynamically and no configuration file
is needed on the host running a QDevice.

The only requirements for the external host are that it needs network access to
the cluster and has a corosync-qnetd package available. We provide such a
package for Debian based hosts; other Linux distributions should also have a
package available through their respective package manager.

NOTE: In contrast to corosync itself, a QDevice connects to the cluster over
TCP/IP. The daemon may even run outside of the cluster's LAN and can have longer
latencies than 2 ms.

Supported Setups
~~~~~~~~~~~~~~~~

We support QDevices for clusters with an even number of nodes and recommend
it for 2 node clusters, if they should provide higher availability.
For clusters with an odd node count, we currently discourage the use of
QDevices. The reason for this is the difference in the votes which the QDevice
provides for each cluster type. Even numbered clusters get a single additional
vote, which only increases availability, because if the QDevice
itself fails, we are in the same situation as with no QDevice at all.

Now, with an odd numbered cluster size, the QDevice provides '(N-1)' votes --
where 'N' corresponds to the cluster node count. This difference makes
sense; if it had only one additional vote, the cluster could get into a
split-brain situation.
This algorithm would allow all nodes but one (and naturally the
QDevice itself) to fail.
There are two drawbacks with this:

* If the QNet daemon itself fails, no other node may fail or the cluster
  immediately loses quorum. For example, in a cluster with 15 nodes, 7
  could fail before the cluster becomes inquorate. But, if a QDevice is
  configured here and said QDevice fails itself, **no single node** of
  the 15 may fail. The QDevice acts almost as a single point of failure in
  this case.

* The fact that all but one node plus QDevice may fail sounds promising at
  first, but this may result in a mass recovery of HA services that would
  overload the single node left. Also, a Ceph server will stop providing
  services when only '((N-1)/2)' nodes are online.

If you understand the drawbacks and implications, you can decide yourself if
you should use this technology in an odd numbered cluster setup.

QDevice-Net Setup
~~~~~~~~~~~~~~~~~

We recommend running any daemon which provides votes to corosync-qdevice as an
unprivileged user. {pve} and Debian provide a package which is already
configured to do so.
The traffic between the daemon and the cluster must be encrypted to ensure a
safe and secure integration of the QDevice in {pve}.

First, install the 'corosync-qnetd' package on your external server

----
external# apt install corosync-qnetd
----

and the 'corosync-qdevice' package on all cluster nodes

----
pve# apt install corosync-qdevice
----

After that, ensure that all your nodes on the cluster are online.

You can now easily set up your QDevice by running the following command on one
of the {pve} nodes:

----
pve# pvecm qdevice setup <QDEVICE-IP>
----

The SSH key from the cluster will be automatically copied to the QDevice.

NOTE: Make sure that the SSH configuration on your external server allows root
login via password, if you are asked for a password during this step.

After you enter the password and all the steps have successfully completed, you
will see "Done". You can check the status now:

----
pve# pvecm status

...

Votequorum information
~~~~~~~~~~~~~~~~~~~~~
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
~~~~~~~~~~~~~~~~~~~~~~
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.22.180 (local)
0x00000002          1    A,V,NMW 192.168.22.181
0x00000000          1            Qdevice

----

which means the QDevice is set up.

Frequently Asked Questions
~~~~~~~~~~~~~~~~~~~~~~~~~~

Tie Breaking
^^^^^^^^^^^^

In case of a tie, where two same-sized cluster partitions cannot see each other
but can see the QDevice, the QDevice chooses one of those partitions randomly
and provides a vote to it.

Possible Negative Implications
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For clusters with an even node count, there are no negative implications when
setting up a QDevice. If it fails to work, you are as good as without a QDevice
at all.

Adding/Deleting Nodes After QDevice Setup
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you want to add a new node or remove an existing one from a cluster with a
QDevice setup, you need to remove the QDevice first. After that, you can add or
remove nodes normally. Once you have a cluster with an even node count again,
you can set up the QDevice again as described above.

Removing the QDevice
^^^^^^^^^^^^^^^^^^^^

If you used the official `pvecm` tool to add the QDevice, you can remove it
trivially by running:

----
pve# pvecm qdevice remove
----

//Still TODO
//^^^^^^^^^^
//There is still stuff to add here


Corosync Configuration
----------------------

The `/etc/pve/corosync.conf` file plays a central role in a {pve} cluster. It
controls the cluster membership and its network.
For further information about it, check the corosync.conf man page:
[source,bash]
----
man corosync.conf
----

For node membership, you should always use the `pvecm` tool provided by {pve}.
You may have to edit the configuration file manually for other changes.
Here are a few best practice tips for doing this.

[[pvecm_edit_corosync_conf]]
Edit corosync.conf
~~~~~~~~~~~~~~~~~~

Editing the corosync.conf file is not always very straightforward. There are
two on each cluster node, one in `/etc/pve/corosync.conf` and the other in
`/etc/corosync/corosync.conf`. Editing the one in our cluster file system will
propagate the changes to the local one, but not vice versa.

The configuration will get updated automatically as soon as the file changes.
This means that changes which can be integrated in a running corosync will take
effect immediately. Thus, you should always make a copy and edit that instead,
to avoid triggering unintended changes when an intermediate version is saved.

[source,bash]
----
cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new
----

Then, open the config file with your favorite editor; for example, `nano` and
`vim.tiny` are preinstalled on every {pve} node.

NOTE: Always increment the 'config_version' number on configuration changes;
omitting this can lead to problems.

After making the necessary changes, create another copy of the current working
configuration file. This serves as a backup if the new configuration fails to
apply or causes other problems.

[source,bash]
----
cp /etc/pve/corosync.conf /etc/pve/corosync.conf.bak
----

Then move the new configuration file over the old one:
[source,bash]
----
mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf
----

You can check with the following commands whether the change could be applied
automatically:
[source,bash]
----
systemctl status corosync
journalctl -b -u corosync
----

If not, you may have to restart the corosync service via:
[source,bash]
----
systemctl restart corosync
----

On errors, check the troubleshooting section below.

Troubleshooting
~~~~~~~~~~~~~~~

Issue: 'quorum.expected_votes must be configured'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When corosync starts to fail and you get the following message in the system log:

----
[...]
corosync[1647]:  [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
corosync[1647]:  [SERV  ] Service engine 'corosync_quorum' failed to load for reason
    'configuration error: nodelist or quorum.expected_votes must be configured!'
[...]
----

It means that the hostname you set for a corosync 'ringX_addr' in the
configuration could not be resolved.

Write Configuration When Not Quorate
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you need to change '/etc/pve/corosync.conf' on a node with no quorum, and you
understand what you are doing, use:
[source,bash]
----
pvecm expected 1
----

This sets the expected vote count to 1 and makes the cluster quorate. You can
now fix your configuration, or revert it back to the last working backup.

This is not enough if corosync cannot start anymore. In that case, it is best to
edit the local copy of the corosync configuration in
'/etc/corosync/corosync.conf', so that corosync can start again. Ensure that on
all nodes, this configuration has the same content to avoid split-brain
situations. If you are not sure what went wrong, it's best to ask the Proxmox
Community to help you.

[[pvecm_corosync_conf_glossary]]
Corosync Configuration Glossary
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ringX_addr::
This names the different link addresses for the kronosnet connections between
nodes.


Cluster Cold Start
------------------

It is obvious that a cluster is not quorate when all nodes are
offline. This is a common case after a power failure.

NOTE: It is always a good idea to use an uninterruptible power supply
(``UPS'', also called ``battery backup'') to avoid this state, especially if
you want HA.

On node startup, the `pve-guests` service is started and waits for
quorum. Once quorate, it starts all guests which have the `onboot`
flag set.
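
The `onboot` flag is part of the guest configuration. For a VM, it could be
enabled like this (VMID 100 is just an example):

[source,bash]
----
# start VM 100 automatically once the node is quorate after boot
qm set 100 --onboot 1
----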

When you turn on nodes, or when power comes back after power failure,
it is likely that some nodes boot faster than others. Please keep in
mind that guest startup is delayed until you reach quorum.

Guest Migration
---------------

Migrating virtual guests to other nodes is a useful feature in a
cluster. There are settings to control the behavior of such
migrations. This can be done via the configuration file
`datacenter.cfg` or for a specific migration via API or command line
parameters.

It makes a difference if a guest is online or offline, or if it has
local resources (like a local disk).

For details about virtual machine migration, see the
xref:qm_migration[QEMU/KVM Migration Chapter].

For details about container migration, see the
xref:pct_migration[Container Migration Chapter].

Migration Type
~~~~~~~~~~~~~~

The migration type defines if the migration data should be sent over an
encrypted (`secure`) channel or an unencrypted (`insecure`) one.
Setting the migration type to `insecure` means that the RAM content of a
virtual guest is also transferred unencrypted, which can lead to
information disclosure of critical data from inside the guest (for
example, passwords or encryption keys).

Therefore, we strongly recommend using the secure channel if you do
not have full control over the network and can not guarantee that no
one is eavesdropping on it.

NOTE: Storage migration does not follow this setting. Currently, it
always sends the storage content over a secure channel.

Encryption requires a lot of computing power, so this setting is often
changed to `insecure` to achieve better performance. The impact on
modern systems is lower because they implement AES encryption in
hardware. The performance impact is particularly evident in fast
networks, where you can transfer 10 Gbps or more.
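
For example, assuming the command line tool accepts the same parameter names as
the API, a single online migration could be requested over the unencrypted
channel like this (a sketch only; the cluster-wide default is configured in
`datacenter.cfg` as shown below):

----
# qm migrate 106 tre --online --migration_type insecure
----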

Migration Network
~~~~~~~~~~~~~~~~~

By default, {pve} uses the network in which cluster communication
takes place to send the migration traffic. This is not optimal because
sensitive cluster traffic can be disrupted and this network may not
have the best bandwidth available on the node.

Setting the migration network parameter allows the use of a dedicated
network for the entire migration traffic. In addition to the memory,
this also affects the storage traffic for offline migrations.

The migration network is set as a network in the CIDR notation. This
has the advantage that you do not have to set individual IP addresses
for each node. {pve} can determine the real address on the
destination node from the network specified in the CIDR form. To
enable this, the network must be specified so that each node has one,
but only one IP in the respective network.

Example
^^^^^^^

We assume that we have a three-node setup with three separate
networks. One for public communication with the Internet, one for
cluster communication and a very fast one, which we want to use as a
dedicated network for migration.

A network configuration for such a setup might look as follows:

----
iface eno1 inet manual

# public network
auto vmbr0
iface vmbr0 inet static
        address 192.X.Y.57
        netmask 255.255.240.0
        gateway 192.X.Y.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

# cluster network
auto eno2
iface eno2 inet static
        address 10.1.1.1
        netmask 255.255.255.0

# fast network
auto eno3
iface eno3 inet static
        address 10.1.2.1
        netmask 255.255.255.0
----

Here, we will use the network 10.1.2.0/24 as a migration network. For
a single migration, you can do this using the `migration_network`
parameter of the command line tool:

----
# qm migrate 106 tre --online --migration_network 10.1.2.0/24
----

To configure this as the default network for all migrations in the
cluster, set the `migration` property of the `/etc/pve/datacenter.cfg`
file:

----
# use dedicated migration network
migration: secure,network=10.1.2.0/24
----

NOTE: The migration type must always be set when the migration network
is set in `/etc/pve/datacenter.cfg`.

ifdef::manvolnum[]
include::pve-copyright.adoc[]
endif::manvolnum[]