[[chapter_pmgcm]]
ifdef::manvolnum[]
pmgcm(1)
========
:pmg-toplevel:

NAME
----

pmgcm - Proxmox Mail Gateway Cluster Management Toolkit


SYNOPSIS
--------

include::pmgcm.1-synopsis.adoc[]


DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Management
==================
:pmg-toplevel:
endif::manvolnum[]

Email has become increasingly important, and failures in email systems
are not acceptable. To meet this requirement, we developed the
Proxmox HA (High Availability) Cluster.

The {pmg} HA Cluster consists of a master and several slave nodes
(minimum one slave node). Configuration is done on the master. Configuration
and data are synchronized to all cluster nodes over a VPN tunnel. This
provides the following advantages:

* centralized configuration management

* fully redundant data storage

* high availability

* high performance

We use a unique application level clustering scheme, which provides
extremely good performance. Special considerations were taken to make
management as easy as possible. A complete cluster setup is done within
minutes, and nodes automatically reintegrate after temporary failures
without any operator interaction.

image::images/Proxmox_HA_cluster_final_1024.png[]


Hardware requirements
---------------------

There are no special hardware requirements, although it is highly
recommended to use fast and reliable servers with redundant disks on
all cluster nodes (hardware RAID with BBU and write cache enabled).

The HA Cluster can also run in virtualized environments.


Subscriptions
-------------

Each node in a cluster has its own subscription. If you want support
for a cluster, each cluster node needs to have a valid
subscription. All nodes must have the same subscription level.


Load balancing
--------------

It is usually advisable to distribute mail traffic among all cluster
nodes. This is not always required, however, because it is also
reasonable to use only one node to handle SMTP traffic. In that case,
the second node is used as a quarantine host and only provides the web
interface to the user quarantine.

The normal mail delivery process looks up DNS Mail Exchange (`MX`)
records to determine the destination host. An `MX` record tells the
sending system where to deliver mail for a certain domain. It is also
possible to have several `MX` records for a single domain, and they can
have different priorities. For example, our `MX` record looks like this:

----
# dig -t mx proxmox.com

;; ANSWER SECTION:
proxmox.com.            22879   IN      MX      10 mail.proxmox.com.

;; ADDITIONAL SECTION:
mail.proxmox.com.       22879   IN      A       213.129.239.114
----

Notice that there is a single `MX` record for the domain
`proxmox.com`, pointing to `mail.proxmox.com`. The `dig` command
automatically prints the corresponding address record if it
exists. In our case it points to `213.129.239.114`. The priority of
our `MX` record is set to 10 (preferred default value).
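
To make the priority rule concrete, the following minimal Python sketch
parses record lines in the format printed by `dig` above and returns the
host or hosts with the lowest preference number. The function names
`parse_mx` and `preferred_hosts` are illustrative assumptions and are not
part of {pmg} or `dig`.

[source,python]
----
from typing import List, Tuple


def parse_mx(lines: List[str]) -> List[Tuple[int, str]]:
    """Return (preference, host) pairs for every MX line."""
    records = []
    for line in lines:
        fields = line.split()
        # Expected layout: name ttl class type preference target
        if len(fields) == 6 and fields[2] == "IN" and fields[3] == "MX":
            records.append((int(fields[4]), fields[5].rstrip(".")))
    return records


def preferred_hosts(records: List[Tuple[int, str]]) -> List[str]:
    """Hosts sharing the lowest preference number (highest priority)."""
    if not records:
        return []
    best = min(pref for pref, _ in records)
    return [host for pref, host in records if pref == best]


# The answer section from the dig example above.
answer = [
    "proxmox.com.            22879   IN      MX      10 mail.proxmox.com.",
]
records = parse_mx(answer)
----

With only one `MX` record, `preferred_hosts(records)` contains just that
single host. The sketch becomes more interesting once several records
with different priorities exist.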


Hot standby with backup `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many people do not want to install two redundant mail proxies. Instead,
they use the mail proxy of their ISP as fallback. This is simply done
by adding an additional `MX` record with a lower priority (higher
number). With the example above, this looks like this:

----
proxmox.com.            22879   IN      MX      100 mail.provider.tld.
----

In such a setup, your provider must accept mails for your domain and
forward them to you. Please note that this is not advisable, because
spam detection needs to be done by the backup `MX` server as well, and
external servers provided by ISPs usually do not do this.

However, you will never lose mails with such a setup, because the sending Mail
Transport Agent (MTA) will simply deliver the mail to the backup
server (mail.provider.tld) if the primary server (mail.proxmox.com) is
not available.
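
The fallback behavior described above can be sketched in a few lines of
Python. This is a simplified model of what a sending MTA does, not the
implementation of any particular mail server. The function `deliver` and
the `is_reachable` callback are hypothetical names introduced for
illustration.

[source,python]
----
from typing import Callable, List, Optional, Tuple


def deliver(mx_records: List[Tuple[int, str]],
            is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Try hosts in ascending preference order; return the first reachable one."""
    # sorted() orders by preference number first. Ties are broken by host
    # name here only to keep the sketch deterministic.
    for _pref, host in sorted(mx_records):
        if is_reachable(host):
            return host
    # No host accepted the mail: a real sender queues it and retries later.
    return None


records = [(10, "mail.proxmox.com"), (100, "mail.provider.tld")]

# Simulate an outage of the primary server.
down = {"mail.proxmox.com"}
target = deliver(records, lambda host: host not in down)
----

In this simulated outage, `target` is the backup server
`mail.provider.tld`, because the primary server is skipped.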

NOTE: Any reasonable mail server retries mail delivery if the target
server is not available, and {pmg} stores mail and retries delivery
for up to one week. So you will not lose mails if your mail server is
down, even if you run a single server setup.


Load balancing with `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using your ISP's mail server is not always a good idea, because many
ISPs do not use advanced spam prevention techniques, or do not filter
spam at all. It is often better to run a second server yourself to
avoid lower spam detection rates.

It is quite simple to set up a high-performance, load-balanced
mail cluster using `MX` records. You need to define two `MX` records
with the same priority. Here is a complete example to make it clearer.

First, you need to have at least two working {pmg} servers
(mail1.example.com and mail2.example.com) configured as a cluster (see
section xref:pmg_cluster_administration[Cluster administration]
below), each having its own IP address. Let us assume the following
DNS address records:

----
mail1.example.com.       22879   IN      A       1.2.3.4
mail2.example.com.       22879   IN      A       1.2.3.5
----

It is always a good idea to add reverse lookup entries (PTR
records) for those hosts. Many email systems nowadays reject mails
from hosts without valid PTR records. Then you need to define your `MX`
records:

----
example.com.            22879   IN      MX      10 mail1.example.com.
example.com.            22879   IN      MX      10 mail2.example.com.
----

This is all you need. You will receive mails on both hosts, load-balanced using
round-robin scheduling. If one host fails, the other one is used.
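
The effect of two `MX` records with equal priority can be illustrated
with a small simulation. How a sender chooses among equal-priority hosts
is up to the sending MTA. This sketch assumes a uniformly random choice,
which approximates the even distribution described above. The function
`pick_host` is a hypothetical helper, not part of {pmg}.

[source,python]
----
import random
from collections import Counter
from typing import List, Tuple


def pick_host(mx_records: List[Tuple[int, str]], rng: random.Random) -> str:
    """Pick one host at random among those with the lowest preference number."""
    best = min(pref for pref, _ in mx_records)
    candidates = [host for pref, host in mx_records if pref == best]
    return rng.choice(candidates)


records = [(10, "mail1.example.com"), (10, "mail2.example.com")]

# Simulate 10000 independent senders, each choosing a target host.
rng = random.Random(0)  # fixed seed so the run is repeatable
counts = Counter(pick_host(records, rng) for _ in range(10000))
----

After the run, `counts` holds the number of simulated deliveries per
host. Both hosts receive a share of roughly one half.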


Other ways
~~~~~~~~~~

Multiple address records
^^^^^^^^^^^^^^^^^^^^^^^^

Using several DNS `MX` records is sometimes tedious if you have many
domains. It is also possible to use one `MX` record per domain, but
multiple address records:

----
example.com.            22879   IN      MX      10 mail.example.com.
mail.example.com.       22879   IN      A       1.2.3.4
mail.example.com.       22879   IN      A       1.2.3.5
----


Using firewall features
^^^^^^^^^^^^^^^^^^^^^^^

Many firewalls can do some kind of round-robin (RR) scheduling when
using DNAT. See your firewall manual for more details.


[[pmg_cluster_administration]]
Cluster administration
----------------------

Cluster administration can be done in the GUI or by using the command
line utility `pmgcm`. The CLI tool is a bit more verbose, so we suggest
using it if you run into any problems.

NOTE: Always set up the IP configuration before adding a node to the
cluster. IP address, network mask, gateway address and hostname can’t
be changed later.

Creating a Cluster
~~~~~~~~~~~~~~~~~~

[thumbnail="pmg-gui-cluster-panel.png", big=1]

You can create a cluster from any existing {pmg} host. All data is
preserved.

* make sure you have the right IP configuration
  (IP/MASK/GATEWAY/HOSTNAME), because you cannot change that later

* press the create button on the GUI, or run the cluster creation command:
+
----
pmgcm create
----

NOTE: The node where you run the cluster create command will be the
'master' node.


Show Cluster Status
~~~~~~~~~~~~~~~~~~~

The GUI shows the status of all cluster nodes, and it is also possible
to use the command line tool:

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1)              192.168.2.127   master A       1 day 21:18   0.30    80%    41%
----
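
If you want to process the status output in a script, for example to
look up the cluster node ID (CID) of a node, the line format shown above
can be parsed with a regular expression. The following Python sketch
assumes that the columns look exactly like the sample line above; other
versions of `pmgcm` may print a different layout. The names `STATUS_RE`
and `parse_status_line` are illustrative.

[source,python]
----
import re

# Column layout assumed from the sample output above:
# NAME(CID) IPADDRESS ROLE STATE UPTIME LOAD MEM DISK
STATUS_RE = re.compile(
    r"^(?P<name>\S+)\((?P<cid>\d+)\)\s+"
    r"(?P<ip>\S+)\s+(?P<role>\S+)\s+(?P<state>\S+)\s+"
    r"(?P<uptime>.+?)\s+(?P<load>\d+\.\d+)\s+"
    r"(?P<mem>\d+)%\s+(?P<disk>\d+)%$"
)


def parse_status_line(line):
    """Return a dict of column values, or None if the line is not a node row."""
    match = STATUS_RE.match(line.strip())
    return match.groupdict() if match else None


sample = ("pmg5(1)              192.168.2.127   master A       "
          "1 day 21:18   0.30    80%    41%")
node = parse_status_line(sample)
----

For the sample line, `node["cid"]` is the string `"1"`. The header line
does not match the pattern, so `parse_status_line` returns `None` for it.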


[[pmgcm_join]]
Adding Cluster Nodes
~~~~~~~~~~~~~~~~~~~~

[thumbnail="pmg-gui-cluster-join.png", big=1]

When you add a new node to a cluster (using `join`), all data on that node is
destroyed. The whole database is initialized with the cluster data from
the master.

* make sure you have the right IP configuration

* run the cluster join command (on the new node):
+
----
pmgcm join <master_ip>
----

You need to enter the root password of the master host when asked for
a password. When joining a cluster using the GUI, you also need to
enter the 'fingerprint' of the master node. You can get that information
by pressing the `Add` button on the master node.

CAUTION: Node initialization deletes all existing databases, stops and
then restarts all services accessing the database. So do not add nodes
which are already active and receive mails.

Also, joining a cluster can take several minutes, because the new node
needs to synchronize all data from the master (although this is done
in the background).

NOTE: If you join a new node, existing quarantined items from the other nodes are not synchronized to the new node.


Deleting Nodes
~~~~~~~~~~~~~~

Please detach nodes from the cluster network before removing them
from the cluster configuration. Then run the following command on
the master node:

----
pmgcm delete <cid>
----

Parameter `<cid>` is the unique cluster node ID, as listed with `pmgcm status`.


Disaster Recovery
~~~~~~~~~~~~~~~~~

It is highly recommended to use redundant disks on all cluster nodes
(RAID). In almost all circumstances, you then only need to replace the
damaged hardware or disk. {pmg} uses an asynchronous
clustering algorithm, so you just need to reboot the repaired node,
and everything will work again transparently.

The following scenarios only apply if the contents of the hard disk
are actually lost.


Single Node Failure
^^^^^^^^^^^^^^^^^^^

* delete failed node on master
+
----
pmgcm delete <cid>
----

* add (re-join) a new node
+
----
pmgcm join <master_ip>
----


Master Failure
^^^^^^^^^^^^^^

* force another node to be master
+
----
pmgcm promote
----

* tell other nodes that master has changed
+
----
pmgcm sync --master_ip <master_ip>
----


Total Cluster Failure
^^^^^^^^^^^^^^^^^^^^^

* restore backup (cluster and node information is not restored; you
  have to recreate the master and nodes)

* tell it to become master
+
----
pmgcm create
----

* install new nodes

* add those new nodes to the cluster
+
----
pmgcm join <master_ip>
----


ifdef::manvolnum[]
include::pmg-copyright.adoc[]
endif::manvolnum[]