[[chapter_pmgcm]]
ifdef::manvolnum[]
pmgcm(1)
========
:pmg-toplevel:

NAME
----

pmgcm - Proxmox Mail Gateway Cluster Management Toolkit


SYNOPSIS
--------

include::pmgcm.1-synopsis.adoc[]


DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Management
==================
:pmg-toplevel:
endif::manvolnum[]

We are living in a world where email is becoming more and more important -
failures in email systems are not acceptable. To meet these
requirements, we developed the Proxmox HA (High Availability) Cluster.

The {pmg} HA Cluster consists of a master node and several slave nodes
(minimum one slave node). Configuration is done on the master,
and data is synchronized to all cluster nodes via a VPN tunnel. This
provides the following advantages:

* centralized configuration management

* fully redundant data storage

* high availability

* high performance

We use a unique application level clustering scheme, which provides
extremely good performance. Special considerations were taken to make
management as easy as possible. A complete cluster setup is done within
minutes, and nodes automatically reintegrate after temporary failures,
without any operator interaction.

image::images/Proxmox_HA_cluster_final_1024.png[]


Hardware Requirements
---------------------

There are no special hardware requirements, although it is highly
recommended to use fast and reliable server hardware, with redundant disks on
all cluster nodes (hardware RAID with BBU and write cache enabled).

The HA Cluster can also run in virtualized environments.


Subscriptions
-------------

Each node in a cluster has its own subscription. If you want support
for a cluster, each cluster node needs to have a valid
subscription. All nodes must have the same subscription level.


Load Balancing
--------------

It is usually advisable to distribute mail traffic among all cluster
nodes. Please note that this is not always required, because it is
also reasonable to use only one node to handle SMTP traffic. The
second node can then be used as a quarantine host, that only provides the web
interface to the user quarantine.

The normal mail delivery process looks up DNS Mail Exchange (`MX`)
records to determine the destination host. An `MX` record tells the
sending system where to deliver mail for a certain domain. It is also
possible to have several `MX` records for a single domain, each of which can
have different priorities. For example, our `MX` record looks like this:

----
# dig -t mx proxmox.com

;; ANSWER SECTION:
proxmox.com.            22879   IN      MX      10 mail.proxmox.com.

;; ADDITIONAL SECTION:
mail.proxmox.com.       22879   IN      A       213.129.239.114
----

Notice that there is a single `MX` record for the domain
`proxmox.com`, pointing to `mail.proxmox.com`. The `dig` command
automatically outputs the corresponding address record, if it
exists. In our case it points to `213.129.239.114`. The priority of
our `MX` record is set to 10 (preferred default value).


Hot standby with backup `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many people do not want to install two redundant mail proxies. Instead,
they use the mail proxy of their ISP as a fallback. This can be done
by adding an additional `MX` record with a lower priority (higher
number). Continuing from the example above, this would look like:

----
proxmox.com.    22879   IN      MX      100 mail.provider.tld.
----

In such a setup, your provider must accept mails for your domain and
forward them to you. Please note that this is not advisable, because
spam detection needs to be done by the backup `MX` server as well, and
external servers provided by ISPs usually don't do this.

However, you will never lose mails with such a setup, because the sending Mail
Transport Agent (MTA) will simply deliver the mail to the backup
server (mail.provider.tld), if the primary server (mail.proxmox.com) is
not available.

NOTE: Any reasonable mail server retries mail delivery if the target
server is not available. {pmg} stores mail and retries delivery
for up to one week. Thus, you will not lose emails if your mail server is
down, even if you run a single server setup.


Load balancing with `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using your ISP's mail server is not always a good idea, because many
ISPs do not use advanced spam prevention techniques, or do not filter
spam at all. It is often better to run a second server yourself to
avoid lower spam detection rates.

It's quite simple to set up a high-performance, load-balanced
mail cluster using `MX` records. You just need to define two `MX`
records with the same priority. The rest of this section will provide
a complete example.

First, you need to have at least two working {pmg} servers
(mail1.example.com and mail2.example.com), configured as a cluster (see
section xref:pmg_cluster_administration[Cluster Administration]
below), with each having its own IP address. Let us assume the
following DNS address records:

----
mail1.example.com.      22879   IN      A       1.2.3.4
mail2.example.com.      22879   IN      A       1.2.3.5
----

It is always a good idea to add reverse lookup entries (PTR
records) for those hosts, as many email systems nowadays reject mails
from hosts without valid PTR records. Then you need to define your `MX`
records:

----
example.com.    22879   IN      MX      10 mail1.example.com.
example.com.    22879   IN      MX      10 mail2.example.com.
----
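
The reverse lookup entries mentioned above can be sketched as well. As
an illustration only, matching PTR records for the two assumed example
addresses could look like this in the reverse zone (zone names follow
from the example IPs; the TTL values are illustrative):

----
4.3.2.1.in-addr.arpa.   22879   IN      PTR     mail1.example.com.
5.3.2.1.in-addr.arpa.   22879   IN      PTR     mail2.example.com.
----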

This is all you need. Following this, you will receive mail on both
hosts, load-balanced using round-robin scheduling. If one host fails,
the other one is used.


Other ways
~~~~~~~~~~

Multiple address records
^^^^^^^^^^^^^^^^^^^^^^^^

Using several DNS `MX` records can be tedious, if you have many
domains. It is also possible to use one `MX` record per domain, but
multiple address records:

----
example.com.            22879   IN      MX      10 mail.example.com.
mail.example.com.       22879   IN      A       1.2.3.4
mail.example.com.       22879   IN      A       1.2.3.5
----


Using firewall features
^^^^^^^^^^^^^^^^^^^^^^^

Many firewalls can do some kind of round-robin (RR) scheduling when
using DNAT. See your firewall manual for more details.
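
For illustration, with nftables such a round-robin DNAT for inbound
SMTP could be sketched as follows, reusing the hypothetical cluster
addresses 1.2.3.4 and 1.2.3.5 from the examples above (the exact
mechanism and syntax depend on your firewall product):

----
# sketch only: alternate new port-25 connections between the two nodes
table ip nat {
    chain prerouting {
        type nat hook prerouting priority dstnat; policy accept;
        tcp dport 25 dnat to numgen inc mod 2 map { 0 : 1.2.3.4, 1 : 1.2.3.5 }
    }
}
----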


[[pmg_cluster_administration]]
Cluster Administration
----------------------

Cluster administration can be done from the GUI or by using the command-line
utility `pmgcm`. The CLI tool is a bit more verbose, so we suggest
using it if you run into any problems.

NOTE: Always set up the IP configuration before adding a node to the
cluster. IP address, network mask, gateway address and hostname can't
be changed later.

Creating a Cluster
~~~~~~~~~~~~~~~~~~

[thumbnail="screenshot/pmg-gui-cluster-panel.png", big=1]

You can create a cluster from any existing {pmg} host. All data is
preserved.

* make sure you have the right IP configuration
(IP/MASK/GATEWAY/HOSTNAME), because you cannot change that later

* press the create button on the GUI, or run the cluster creation command:
+
----
pmgcm create
----

NOTE: The node where you run the cluster create command will be the
'master' node.


Show Cluster Status
~~~~~~~~~~~~~~~~~~~

The GUI shows the status of all cluster nodes. You can also view this
using the command-line tool:

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1)                  192.168.2.127 master A           1 day 21:18   0.30    80%    41%
----


[[pmgcm_join]]
Adding Cluster Nodes
~~~~~~~~~~~~~~~~~~~~

[thumbnail="screenshot/pmg-gui-cluster-join.png", big=1]

When you add a new node to a cluster (using `join`), all data on that node is
destroyed. The whole database is initialized with the cluster data from
the master.

* make sure you have the right IP configuration

* run the cluster join command (on the new node):
+
----
pmgcm join <master_ip>
----

You need to enter the root password of the master host when asked for
a password. When joining a cluster using the GUI, you also need to
enter the 'fingerprint' of the master node. You can get this information
by pressing the `Add` button on the master node.

NOTE: Joining a cluster with two-factor authentication enabled for the `root`
user is not supported. Remove the second factor when joining the cluster.

CAUTION: Node initialization deletes all existing databases, stops all
services accessing the database, and then restarts them. Therefore, do
not add nodes which are already active and receive mail.

Also note that joining a cluster can take several minutes, because the
new node needs to synchronize all data from the master (although this
is done in the background).

NOTE: If you join a new node, existing quarantined items from the
other nodes are not synchronized to the new node.


Deleting Nodes
~~~~~~~~~~~~~~

Please detach nodes from the cluster network before removing them
from the cluster configuration. Only then should you run the following
command on the master node:

----
pmgcm delete <cid>
----

Parameter `<cid>` is the unique cluster node ID, as listed with `pmgcm status`.
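
If you need the `<cid>` in a script, it can be parsed from the `pmgcm
status` output shown earlier. A minimal sketch (the status line is a
hard-coded sample here, and the output format assumption is ours, not
a documented `pmgcm` interface):

```shell
#!/bin/sh
# Extract the CID from a `pmgcm status` line of the form
# "name(cid)  ipaddress  role state ..." (sample line, not live output).
status_line='pmg5(1)            192.168.2.127  master A  1 day 21:18  0.30  80%  41%'
cid=$(printf '%s\n' "$status_line" | sed -n 's/^[^(]*(\([0-9][0-9]*\)).*/\1/p')
echo "$cid"
```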
292 | ||
293 | Disaster Recovery | |
294 | ~~~~~~~~~~~~~~~~~ | |
295 | ||
296 | It is highly recommended to use redundant disks on all cluster nodes | |
8f980b65 | 297 | (RAID). So in almost any circumstance, you just need to replace the |
3ea67bfe DM |
298 | damaged hardware or disk. {pmg} uses an asynchronous |
299 | clustering algorithm, so you just need to reboot the repaired node, | |
300 | and everything will work again transparently. | |
301 | ||
0c358d45 | 302 | The following scenarios only apply when you really lose the contents |
3ea67bfe DM |
303 | of the hard disk. |
304 | ||
305 | ||
306 | Single Node Failure | |
307 | ^^^^^^^^^^^^^^^^^^^ | |
308 | ||
309 | * delete failed node on master | |
310 | + | |
311 | ---- | |
312 | pmgcm delete <cid> | |
313 | ---- | |
314 | ||
315 | * add (re-join) a new node | |
316 | + | |
317 | ---- | |
318 | pmgcm join <master_ip> | |
319 | ---- | |
320 | ||
321 | ||
322 | Master Failure | |
323 | ^^^^^^^^^^^^^^ | |
324 | ||
325 | * force another node to be master | |
326 | + | |
327 | ----- | |
328 | pmgcm promote | |
329 | ----- | |
330 | ||
331 | * tell other nodes that master has changed | |
332 | + | |
333 | ---- | |
334 | pmgcm sync --master_ip <master_ip> | |
335 | ---- | |
336 | ||
337 | ||
338 | Total Cluster Failure | |
339 | ^^^^^^^^^^^^^^^^^^^^^ | |
340 | ||
8f980b65 | 341 | * restore backup (Cluster and node information is not restored; you |
3ea67bfe DM |
342 | have to recreate master and nodes) |
343 | ||
344 | * tell it to become master | |
345 | + | |
346 | ---- | |
347 | pmgcm create | |
348 | ---- | |
349 | ||
350 | * install new nodes | |
351 | ||
352 | * add those new nodes to the cluster | |
353 | + | |
354 | ---- | |
355 | pmgcm join <master_ip> | |
356 | ---- | |
357 | ||
a0f910ae DM |
358 | |
359 | ifdef::manvolnum[] | |
360 | include::pmg-copyright.adoc[] | |
361 | endif::manvolnum[] |