[[chapter_pmgcm]]
ifdef::manvolnum[]
pmgcm(1)
========
:pmg-toplevel:

NAME
----

pmgcm - Proxmox Mail Gateway Cluster Management Toolkit


SYNOPSIS
--------

include::pmgcm.1-synopsis.adoc[]


DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Management
==================
:pmg-toplevel:
endif::manvolnum[]

We are living in a world where email is becoming more and more
important - failures in email systems are simply not acceptable. To
meet these requirements, we developed the Proxmox HA (High
Availability) Cluster.

The {pmg} HA Cluster consists of a master and several slave nodes
(minimum one node). Configuration is done on the master. Configuration
and data is synchronized to all cluster nodes over a VPN tunnel. This
provides the following advantages:

* centralized configuration management

* fully redundant data storage

* high availability

* high performance

We use a unique application-level clustering scheme, which provides
extremely good performance. Special considerations were taken to make
management as easy as possible. A complete cluster setup is done within
minutes, and nodes automatically reintegrate after temporary failures,
without any operator interaction.

image::images/Proxmox_HA_cluster_final_1024.png[]


Hardware requirements
---------------------

There are no special hardware requirements, although it is highly
recommended to use fast and reliable servers with redundant disks on
all cluster nodes (hardware RAID with BBU and write cache enabled).

The HA Cluster can also run in virtualized environments.


Subscriptions
-------------

Each host in a cluster has its own subscription. If you want support
for a cluster, each cluster node needs to have a valid
subscription. All nodes must have the same subscription level.


Load balancing
--------------

It is usually advisable to distribute mail traffic among all cluster
nodes. Please note that this is not always required, because it is
also reasonable to use only one node to handle SMTP traffic. The
second node is then used as quarantine host, and only provides the web
interface to the user quarantine.

The normal mail delivery process looks up DNS Mail Exchange (`MX`)
records to determine the destination host. An `MX` record tells the
sending system where to deliver mail for a certain domain. It is also
possible to have several `MX` records for a single domain, and they can
have different priorities. For example, our `MX` record looks like this:

----
# dig -t mx proxmox.com

;; ANSWER SECTION:
proxmox.com. 22879 IN MX 10 mail.proxmox.com.

;; ADDITIONAL SECTION:
mail.proxmox.com. 22879 IN A 213.129.239.114
----

Please note that there is a single `MX` record for the domain
`proxmox.com`, pointing to `mail.proxmox.com`. The `dig` command
automatically prints the corresponding address record, if it
exists. In our case, it points to `213.129.239.114`. The priority of
our `MX` record is set to 10 (the preferred default value).


Hot standby with backup `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many people do not want to install two redundant mail proxies; instead
they use the mail proxy of their ISP as a fall-back. This is simply
done by adding an additional `MX` record with a lower priority (higher
number). With the example above, this looks like this:

----
proxmox.com. 22879 IN MX 100 mail.provider.tld.
----

Of course, your provider must accept mail for your domain and forward
received mail to you. Please note that such a setup is not really
advisable, because spam detection needs to be done by the backup `MX`
server as well, and external servers provided by ISPs usually don't do
that.

You will never lose mail with such a setup, because the sending Mail
Transport Agent (MTA) will simply deliver the mail to the backup
server (mail.provider.tld) if the primary server (mail.proxmox.com) is
not available.

NOTE: Any reasonable mail server retries mail delivery if the target
server is not available. For example, {pmg} stores mail and retries
delivery for up to one week. So you will not lose mail if your mail
server is down, even if you run a single server setup.
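
The fallback behavior described above is driven purely by the `MX`
preference values: the sending MTA sorts the records by preference
(lower numbers are tried first) and falls back to the next host if
delivery fails. A minimal sketch of that ordering in Python
(illustrative data only, not a real DNS lookup):

```python
# MX records as (preference, hostname) pairs; lower preference is preferred.
mx_records = [
    (100, "mail.provider.tld."),  # backup MX
    (10, "mail.proxmox.com."),    # primary MX
]

def delivery_order(records):
    """Return hostnames in the order a sending MTA would try them."""
    return [host for _, host in sorted(records)]

print(delivery_order(mx_records))
# → ['mail.proxmox.com.', 'mail.provider.tld.']
```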


Load balancing with `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using your ISP's mail server is not always a good idea, because many
ISPs do not use advanced spam prevention techniques, or do not filter
spam at all. It is often better to run a second server yourself to
avoid lower spam detection rates.

Anyway, it's quite simple to set up a high-performance, load-balanced
mail cluster using `MX` records. You just need to define two `MX`
records with the same priority. We will explain this using a complete
example to make it clearer.

First, you need to have at least 2 working {pmg} servers
(mail1.example.com and mail2.example.com), configured as a cluster (see
section xref:pmg_cluster_administration[Cluster administration]
below), each having its own IP address. Let us assume the following
addresses (DNS address records):

----
mail1.example.com. 22879 IN A 1.2.3.4
mail2.example.com. 22879 IN A 1.2.3.5
----

It is always a good idea to add reverse lookup entries (PTR
records) for those hosts, as many email systems nowadays reject mail
from hosts without valid PTR records. Then you need to define your `MX`
records:

----
example.com. 22879 IN MX 10 mail1.example.com.
example.com. 22879 IN MX 10 mail2.example.com.
----

This is all you need. You will receive mail on both hosts, more or
less load-balanced using round-robin scheduling. If one host fails,
the other is used.
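
The reverse lookup (PTR) entries mentioned above could look like this
in the matching reverse zone (an illustrative zone-file fragment for
the example addresses; how you actually manage reverse zones depends
on your DNS provider):

----
; reverse zone 3.2.1.in-addr.arpa (illustrative)
4.3.2.1.in-addr.arpa. 22879 IN PTR mail1.example.com.
5.3.2.1.in-addr.arpa. 22879 IN PTR mail2.example.com.
----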


Other ways
~~~~~~~~~~

Multiple address records
^^^^^^^^^^^^^^^^^^^^^^^^

Using several DNS `MX` records is sometimes clumsy, if you have many
domains. It is also possible to use one `MX` record per domain, but
multiple address records:

----
example.com. 22879 IN MX 10 mail.example.com.
mail.example.com. 22879 IN A 1.2.3.4
mail.example.com. 22879 IN A 1.2.3.5
----
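
With multiple address records, name servers typically rotate the order
of the returned addresses between queries (DNS round robin), and
clients usually try the addresses in the order received. A small
Python sketch of the effect (illustrative only; actual rotation
behavior varies between DNS servers):

```python
# Simulate DNS round robin: each response starts at the next address.
addresses = ["1.2.3.4", "1.2.3.5"]

def rotated_responses(addrs, queries):
    """Return the address order handed out for consecutive queries."""
    responses = []
    for q in range(queries):
        start = q % len(addrs)
        responses.append([addrs[(start + i) % len(addrs)]
                          for i in range(len(addrs))])
    return responses

for response in rotated_responses(addresses, 3):
    print(response)
# → ['1.2.3.4', '1.2.3.5']
# → ['1.2.3.5', '1.2.3.4']
# → ['1.2.3.4', '1.2.3.5']
```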


Using firewall features
^^^^^^^^^^^^^^^^^^^^^^^

Many firewalls can do some kind of RR-Scheduling (round-robin) when
using DNAT. See your firewall manual for more details.
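
As an illustrative sketch (not taken from any specific firewall
product), a firewall based on nftables could distribute incoming SMTP
connections to the two example nodes with a round-robin DNAT rule
using the `numgen` expression; the exact syntax and capabilities
differ between firewall products:

----
# round-robin DNAT of SMTP (port 25) to the two cluster nodes
table ip nat {
  chain prerouting {
    type nat hook prerouting priority dstnat; policy accept;
    tcp dport 25 dnat to numgen inc mod 2 map { 0 : 1.2.3.4, 1 : 1.2.3.5 }
  }
}
----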


[[pmg_cluster_administration]]
Cluster administration
----------------------

Cluster administration can be done on the GUI or by using the command
line utility `pmgcm`. The CLI tool is a bit more verbose, so we suggest
using it if you run into problems.

NOTE: Always set up the IP configuration before adding a node to the
cluster. IP address, network mask, gateway address and hostname can't
be changed later.

Creating a Cluster
~~~~~~~~~~~~~~~~~~

image::images/screenshot/pmg-gui-cluster-panel.png[]

You can create a cluster from any existing Proxmox host. All data is
preserved.

* make sure you have the right IP configuration
  (IP/MASK/GATEWAY/HOSTNAME), because you cannot change that later

* press the create button on the GUI, or run the cluster creation command:
+
----
pmgcm create
----

NOTE: The node where you run the cluster create command will be the
'master' node.


Show Cluster Status
~~~~~~~~~~~~~~~~~~~

The GUI shows the status of all cluster nodes, and it is also possible
to use the command line tool:

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1) 192.168.2.127 master A 1 day 21:18 0.30 80% 41%
----


[[pmgcm_join]]
Adding Cluster Nodes
~~~~~~~~~~~~~~~~~~~~

image::images/screenshot/pmg-gui-cluster-join.png[]

When you add a new node to a cluster (join), all data on that node is
destroyed. The whole database is initialized with the cluster data from
the master.

* make sure you have the right IP configuration

* run the cluster join command (on the new node):
+
----
pmgcm join <master_ip>
----

You need to enter the root password of the master host when asked for
a password. When joining a cluster using the GUI, you also need to
enter the 'fingerprint' of the master node. You can get that
information by pressing the `Add` button on the master node.

CAUTION: Node initialization deletes all existing databases, stops and
then restarts all services accessing the database. So do not add nodes
which are already active and receiving mail.

Also, joining a cluster can take several minutes, because the new node
needs to synchronize all data from the master (although this is done
in the background).

NOTE: If you join a new node, existing quarantined items from the
other nodes are not synchronized to the new node.


Deleting Nodes
~~~~~~~~~~~~~~

Please detach nodes from the cluster network, before removing them
from the cluster configuration. Then run the following command on
the master node:

----
pmgcm delete <cid>
----

Parameter `<cid>` is the unique cluster node ID, as listed with `pmgcm status`.


Disaster Recovery
~~~~~~~~~~~~~~~~~

It is highly recommended to use redundant disks on all cluster nodes
(RAID). So in almost all circumstances, you just need to replace the
damaged hardware or disk. {pmg} uses an asynchronous
clustering algorithm, so you just need to reboot the repaired node,
and everything will work again transparently.

The following scenarios only apply when you really lose the contents
of the hard disk.


Single Node Failure
^^^^^^^^^^^^^^^^^^^

* delete the failed node on the master:
+
----
pmgcm delete <cid>
----

* add (re-join) a new node:
+
----
pmgcm join <master_ip>
----


Master Failure
^^^^^^^^^^^^^^

* force another node to be master:
+
----
pmgcm promote
----

* tell the other nodes that the master has changed:
+
----
pmgcm sync --master_ip <master_ip>
----


Total Cluster Failure
^^^^^^^^^^^^^^^^^^^^^

* restore the backup (cluster and node information is not restored; you
  have to recreate the master and nodes)

* tell one node to become the master:
+
----
pmgcm create
----

* install new nodes

* add those new nodes to the cluster:
+
----
pmgcm join <master_ip>
----


ifdef::manvolnum[]
include::pmg-copyright.adoc[]
endif::manvolnum[]