[[chapter_pmgcm]]
ifdef::manvolnum[]
pmgcm(1)
========
:pmg-toplevel:

NAME
----

pmgcm - Proxmox Mail Gateway Cluster Management Toolkit


SYNOPSIS
--------

include::pmgcm.1-synopsis.adoc[]


DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Management
==================
:pmg-toplevel:
endif::manvolnum[]

Email is becoming more and more important, and failures in email
systems are simply not acceptable. To meet these requirements, we
developed the Proxmox HA (High Availability) Cluster.

The {pmg} HA Cluster consists of a master and one or more slave
nodes. Configuration is done on the master, and both configuration
and data are synchronized to all cluster nodes over a VPN tunnel.
This provides the following advantages:

* centralized configuration management

* fully redundant data storage

* high availability

* high performance

We use a unique application-level clustering scheme, which provides
extremely good performance. Special care was taken to make management
as easy as possible. A complete cluster setup is done within minutes,
and nodes automatically reintegrate after temporary failures, without
any operator interaction.

image::images/pmg-ha-cluster.png[]


Hardware requirements
---------------------

There are no special hardware requirements, although it is highly
recommended to use fast and reliable servers with redundant disks on
all cluster nodes (hardware RAID with BBU and write cache enabled).

The HA Cluster can also run in virtualized environments.


Subscriptions
-------------

Each host in a cluster has its own subscription. If you want support
for a cluster, each cluster node needs to have a valid
subscription. All nodes must have the same subscription level.


Load balancing
--------------

It is usually advisable to distribute mail traffic among all cluster
nodes. Note that this is not always required, because it is also
reasonable to use only one node to handle SMTP traffic, while a
second node is used as quarantine host, only providing the web
interface to the user quarantine.

The normal mail delivery process looks up DNS Mail Exchange (`MX`)
records to determine the destination host. An `MX` record tells the
sending system where to deliver mail for a certain domain. There can
also be several `MX` records for a single domain, each with a
different priority. For example, our `MX` record looks like this:

----
# dig -t mx proxmox.com

;; ANSWER SECTION:
proxmox.com. 22879 IN MX 10 mail.proxmox.com.

;; ADDITIONAL SECTION:
mail.proxmox.com. 22879 IN A 213.129.239.114
----

Notice that there is a single `MX` record for the domain
`proxmox.com`, pointing to `mail.proxmox.com`. The `dig` command
automatically prints the corresponding address record if it exists;
in our case, it points to `213.129.239.114`. The priority of our `MX`
record is set to 10 (the preferred default value).


Hot standby with backup `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many people do not want to install two redundant mail proxies;
instead, they use the mail proxy of their ISP as a fall-back. This is
simply done by adding an additional `MX` record with a lower priority
(higher number). In the example above, this looks like:

----
proxmox.com. 22879 IN MX 100 mail.provider.tld.
----

Of course, your provider must accept mail for your domain and forward
received mail to you. Please note that such a setup is not really
advisable, because spam detection also needs to be done by the backup
`MX` server, and external servers provided by ISPs usually don't do
that.

You will never lose mail with such a setup, because the sending Mail
Transport Agent (MTA) will simply deliver the mail to the backup
server (`mail.provider.tld`) if the primary server
(`mail.proxmox.com`) is not available.

NOTE: Any reasonable mail server retries mail delivery if the target
server is not available. {pmg} stores mail and retries delivery for
up to one week, so you will not lose mail if your mail server is
down, even if you run a single server setup.

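If you want to verify that the backup `MX` really accepts mail for
your domain, you can open a test SMTP session against it from an
external host. The following sketch uses the `swaks` SMTP testing
tool (not part of {pmg}; the sender and recipient addresses are
placeholders):

----
# deliver a test mail directly to the backup MX, bypassing the primary
swaks --server mail.provider.tld --to postmaster@proxmox.com \
      --from test@example.net
----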


Load balancing with `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using your ISP's mail server is not always a good idea, because many
ISPs do not use advanced spam prevention techniques, or do not filter
spam at all. It is often better to run a second server yourself, to
avoid lower spam detection rates.

Anyway, it is quite simple to set up a high-performance, load-balanced
mail cluster using `MX` records: you just need to define two `MX`
records with the same priority. The following complete example makes
this clearer.

First, you need to have at least two working {pmg} servers
(`mail1.example.com` and `mail2.example.com`), configured as a
cluster (see section xref:pmg_cluster_administration[Cluster
administration] below), each having its own IP address. Let us assume
the following addresses (DNS address records):

----
mail1.example.com. 22879 IN A 1.2.3.4
mail2.example.com. 22879 IN A 1.2.3.5
----

It is always a good idea to add reverse lookup entries (PTR records)
for those hosts, because many email systems nowadays reject mail from
hosts without valid PTR records. Then you need to define your `MX`
records:

----
example.com. 22879 IN MX 10 mail1.example.com.
example.com. 22879 IN MX 10 mail2.example.com.
----

This is all you need. You will receive mail on both hosts, more or
less load-balanced using round-robin scheduling. If one host fails,
the other is used.

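You can check the published records with `dig`; once the zone entries
above are live, the answer should mirror them:

----
# dig +short -t mx example.com
10 mail1.example.com.
10 mail2.example.com.
----
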
Other ways
~~~~~~~~~~

Multiple address records
^^^^^^^^^^^^^^^^^^^^^^^^

Using several DNS `MX` records can be clumsy if you have many
domains. It is also possible to use just one `MX` record per domain,
but multiple address records:

----
example.com. 22879 IN MX 10 mail.example.com.
mail.example.com. 22879 IN A 1.2.3.4
mail.example.com. 22879 IN A 1.2.3.5
----

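Load distribution then relies on the DNS server rotating the order of
the returned address records between queries, which many name servers
do by default. A quick check with the example records above:

----
# dig +short -t a mail.example.com
1.2.3.4
1.2.3.5
----
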

Using firewall features
^^^^^^^^^^^^^^^^^^^^^^^

Many firewalls can do some kind of round-robin scheduling when using
DNAT. See your firewall manual for more details.

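As a sketch only (this is generic firewall configuration, not a {pmg}
feature), a Linux firewall could distribute incoming SMTP connections
across the two example nodes with an nftables round-robin DNAT rule:

----
# create a NAT table and prerouting chain, then spread port 25
# connections over both cluster nodes
nft add table ip nat
nft 'add chain ip nat prerouting { type nat hook prerouting priority -100; }'
nft add rule ip nat prerouting tcp dport 25 dnat to numgen inc mod 2 map \
    '{ 0 : 1.2.3.4, 1 : 1.2.3.5 }'
----
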
[[pmg_cluster_administration]]
Cluster administration
----------------------

Cluster administration is done with a single command line utility
called `pmgcm`, so you need to log in via SSH to manage the cluster
setup.

NOTE: Always set up the IP configuration before adding a node to the
cluster. IP address, network mask, gateway address and hostname can't
be changed later.

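You can double-check those settings on each node before creating or
joining a cluster, for example with the standard system tools:

----
# hostname --fqdn
# ip -brief address
# ip route show default
----
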

Creating a Cluster
~~~~~~~~~~~~~~~~~~

You can create a cluster from any existing {pmg} host. All data is
preserved.

* make sure you have the right IP configuration
  (IP/MASK/GATEWAY/HOSTNAME), because you cannot change that later

* run the cluster creation command:
+
----
pmgcm create
----


List Cluster Status
~~~~~~~~~~~~~~~~~~~

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1)                  192.168.2.127 master A           1 day 21:18   0.30    80%    41%
----


Adding Cluster Nodes
~~~~~~~~~~~~~~~~~~~~

When you add a new node to a cluster (join), all data on that node is
destroyed. The whole database is initialized with the cluster data
from the master.

* make sure you have the right IP configuration

* run the cluster join command (on the new node):
+
----
pmgcm join <master_ip>
----

You need to enter the root password of the master host when asked for
a password.

CAUTION: Node initialization deletes all existing databases, and
stops and then restarts all services accessing the database. So do
not add nodes which are already active and receiving mail.

Also, joining a cluster can take several minutes, because the new
node needs to synchronize all data from the master (although this is
done in the background).

NOTE: If you join a new node, existing quarantined items from the
other nodes are not synchronized to the new node.

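Once the join has finished, `pmgcm status` on the master should list
the new node. The second entry below is purely illustrative output
for a hypothetical node `pmg6`:

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1)                  192.168.2.127 master A           1 day 21:18   0.30    80%    41%
pmg6(2)                  192.168.2.128 node   A                 02:11   0.25    76%    38%
----
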

Deleting Nodes
~~~~~~~~~~~~~~

Please detach nodes from the cluster network before removing them
from the cluster configuration. Then run the following command on
the master node:

----
pmgcm delete <cid>
----

Parameter `<cid>` is the unique cluster node ID, as listed with
`pmgcm status`.

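For example, to remove the node with cluster ID 2 from the
illustrative listing above:

----
pmgcm delete 2
----
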

Disaster Recovery
~~~~~~~~~~~~~~~~~

It is highly recommended to use redundant disks on all cluster nodes
(RAID), so that under almost all circumstances you just need to
replace the damaged hardware or disk. {pmg} uses an asynchronous
clustering algorithm, so you just need to reboot the repaired node,
and everything will work again transparently.

The following scenarios only apply when you really lose the contents
of the hard disk.


Single Node Failure
^^^^^^^^^^^^^^^^^^^

* delete the failed node on the master
+
----
pmgcm delete <cid>
----

* add (re-join) a new node
+
----
pmgcm join <master_ip>
----


Master Failure
^^^^^^^^^^^^^^

* force another node to be master
+
----
pmgcm promote
----

* tell the other nodes that the master has changed
+
----
pmgcm sync --master_ip <master_ip>
----


Total Cluster Failure
^^^^^^^^^^^^^^^^^^^^^

* restore the backup (cluster and node information is not restored;
  you have to recreate the master and the nodes)

* tell the restored host to become master
+
----
pmgcm create
----

* install new nodes

* add those new nodes to the cluster
+
----
pmgcm join <master_ip>
----


ifdef::manvolnum[]
include::pmg-copyright.adoc[]
endif::manvolnum[]