[[chapter_pmgcm]]
ifdef::manvolnum[]
pmgcm(1)
========
:pmg-toplevel:

NAME
----

pmgcm - Proxmox Mail Gateway Cluster Management Toolkit


SYNOPSIS
--------

include::pmgcm.1-synopsis.adoc[]


DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Management
==================
:pmg-toplevel:
endif::manvolnum[]

Email has become a critical service, and failures in email systems
are simply not acceptable. To meet these requirements, we developed
the Proxmox HA (High Availability) Cluster.

The {pmg} HA Cluster consists of a master and one or more slave
nodes. Configuration is done on the master. Configuration and data
are synchronized to all cluster nodes over a VPN tunnel. This
provides the following advantages:

* centralized configuration management

* fully redundant data storage

* high availability

* high performance

We use a unique application level clustering scheme, which provides
extremely good performance. Special considerations were taken to make
management as easy as possible. A complete cluster setup is done within
minutes, and nodes automatically reintegrate after temporary failures
without any operator interaction.

image::images/Proxmox_HA_cluster_final_1024.png[]


Hardware requirements
---------------------

There are no special hardware requirements, although it is highly
recommended to use fast and reliable servers with redundant disks on
all cluster nodes (hardware RAID with BBU and write cache enabled).

The HA Cluster can also run in virtualized environments.


Subscriptions
-------------

Each host in a cluster has its own subscription. If you want support
for a cluster, each cluster node needs to have a valid
subscription. All nodes must have the same subscription level.


Load balancing
--------------

It is usually advisable to distribute mail traffic among all cluster
nodes. Please note that this is not always required, because it is
also reasonable to use only one node to handle SMTP traffic. In that
case, the second node is used as a quarantine host and only provides
the web interface to the user quarantine.

The normal mail delivery process looks up DNS Mail Exchange (`MX`)
records to determine the destination host. An `MX` record tells the
sending system where to deliver mail for a certain domain. It is also
possible to have several `MX` records for a single domain, and they
can have different priorities. For example, our `MX` record looks like
this:

----
# dig -t mx proxmox.com

;; ANSWER SECTION:
proxmox.com. 22879 IN MX 10 mail.proxmox.com.

;; ADDITIONAL SECTION:
mail.proxmox.com. 22879 IN A 213.129.239.114
----

Note that there is a single `MX` record for the domain
`proxmox.com`, pointing to `mail.proxmox.com`. The `dig` command
automatically prints the corresponding address record if it
exists. In our case it points to `213.129.239.114`. The priority of
our `MX` record is set to 10 (preferred default value).
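
To make the record layout concrete, the following minimal Python sketch
splits one `MX` answer line, as printed by `dig` above, into its fields.
The names `MXRecord` and `parse_mx_line` are hypothetical helpers for
illustration only and are not part of {pmg}.

```python
from typing import NamedTuple


class MXRecord(NamedTuple):
    domain: str
    ttl: int
    preference: int  # lower number means higher priority
    exchange: str    # host that accepts mail for the domain


def parse_mx_line(line: str) -> MXRecord:
    """Parse one MX line in the dig answer format shown above.

    Assumed layout: <domain> <ttl> IN MX <preference> <exchange>
    """
    fields = line.split()
    if len(fields) != 6 or fields[2] != "IN" or fields[3] != "MX":
        raise ValueError(f"not an MX answer line: {line!r}")
    domain, ttl, _, _, preference, exchange = fields
    # Strip the trailing dot of the fully qualified names.
    return MXRecord(domain.rstrip("."), int(ttl), int(preference),
                    exchange.rstrip("."))


record = parse_mx_line("proxmox.com. 22879 IN MX 10 mail.proxmox.com.")
print(record.preference, record.exchange)  # prints: 10 mail.proxmox.com
```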


Hot standby with backup `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many people do not want to install two redundant mail proxies. Instead,
they use the mail proxy of their ISP as a fall-back. This is done by
adding an additional `MX` record with a lower priority (higher
number). Continuing the example above, the additional record looks
like this:

----
proxmox.com. 22879 IN MX 100 mail.provider.tld.
----

Of course, your provider must accept mail for your domain and forward
it to you. Please note that such a setup is not really advisable,
because spam detection also needs to be done by the backup `MX`
server, and external servers provided by ISPs usually don't do
that.

You will not lose mail with such a setup, because the sending Mail
Transfer Agent (MTA) will simply deliver the mail to the backup
server (mail.provider.tld) if the primary server (mail.proxmox.com) is
not available.

NOTE: Any reasonable mail server retries mail delivery if the target
server is not available. For example, {pmg} stores mail and retries
delivery for up to one week. So you will not lose mail if your mail
server is down, even if you run a single server setup.
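
The fall-back behaviour described above can be sketched in a few lines
of Python. This is only an illustration of how a sending MTA might walk
the `MX` list, not the code of any particular mail server. The function
`pick_target` and the `is_reachable` callback are hypothetical names.

```python
from typing import Callable, Iterable, Optional, Tuple

MX = Tuple[int, str]  # (preference, host); a lower number means higher priority


def pick_target(records: Iterable[MX],
                is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the first reachable host, trying the lowest preference number first."""
    for _preference, host in sorted(records):
        if is_reachable(host):
            return host
    # Nothing reachable: a real MTA would queue the mail and retry later.
    return None


records = [(100, "mail.provider.tld"), (10, "mail.proxmox.com")]

# Primary reachable: the primary server is used.
print(pick_target(records, lambda host: True))
# Primary down: delivery falls back to the backup MX.
print(pick_target(records, lambda host: host != "mail.proxmox.com"))
```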


Load balancing with `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using your ISP's mail server is not always a good idea, because many
ISPs do not use advanced spam prevention techniques, or do not filter
spam at all. It is often better to run a second server yourself to
avoid lower spam detection rates.

In any case, it is quite simple to set up a high performance load
balanced mail cluster using `MX` records. You just need to define two
`MX` records with the same priority. The following complete example
makes this clearer.

First, you need to have at least two working {pmg} servers
(mail1.example.com and mail2.example.com) configured as a cluster (see
section xref:pmg_cluster_administration[Cluster administration]
below), each having its own IP address. Let us assume the following
addresses (DNS address records):

----
mail1.example.com. 22879 IN A 1.2.3.4
mail2.example.com. 22879 IN A 1.2.3.5
----

It is always a good idea to add reverse lookup entries (PTR
records) for those hosts. Many email systems nowadays reject mail
from hosts without valid PTR records. Then you need to define your `MX`
records:

----
example.com. 22879 IN MX 10 mail1.example.com.
example.com. 22879 IN MX 10 mail2.example.com.
----

This is all you need. You will receive mail on both hosts, more or
less load-balanced using round-robin scheduling. If one host fails, the
other one is used.
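
Why two records with the same priority spread the load can be
illustrated with a small simulation. RFC 5321 asks sending servers to
randomize between `MX` hosts of equal preference, and the sketch below
models only that step. The helpers `delivery_order` and `simulate` are
hypothetical, and real traffic will not split this evenly.

```python
import random
from collections import Counter
from itertools import groupby


def delivery_order(records, rng):
    """Order MX hosts by preference, shuffling hosts that share a preference."""
    ordered = []
    for _pref, group in groupby(sorted(records), key=lambda rec: rec[0]):
        hosts = [host for _, host in group]
        rng.shuffle(hosts)  # equal preference: try them in random order
        ordered.extend(hosts)
    return ordered


def simulate(records, is_up, messages, seed=0):
    """Count which host receives each of `messages` simulated deliveries."""
    rng = random.Random(seed)
    received = Counter()
    for _ in range(messages):
        for host in delivery_order(records, rng):
            if is_up(host):
                received[host] += 1
                break
    return received


records = [(10, "mail1.example.com"), (10, "mail2.example.com")]

# Both hosts up: each receives roughly half of the messages.
print(simulate(records, lambda host: True, 1000))
# mail1 down: all messages end up on mail2.
print(simulate(records, lambda host: host != "mail1.example.com", 1000))
```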


Other ways
~~~~~~~~~~

Multiple address records
^^^^^^^^^^^^^^^^^^^^^^^^

Using several DNS `MX` records is sometimes clumsy if you have many
domains. It is also possible to use one `MX` record per domain, but
multiple address records:

----
example.com. 22879 IN MX 10 mail.example.com.
mail.example.com. 22879 IN A 1.2.3.4
mail.example.com. 22879 IN A 1.2.3.5
----


Using firewall features
^^^^^^^^^^^^^^^^^^^^^^^

Many firewalls can do some kind of RR-Scheduling (round-robin) when
using DNAT. See your firewall manual for more details.


[[pmg_cluster_administration]]
Cluster administration
----------------------

Cluster administration can be done on the GUI or using the command
line utility `pmgcm`. The CLI tool is a bit more verbose, so we suggest
using it if you run into problems.

NOTE: Always set up the IP configuration before adding a node to the
cluster. IP address, network mask, gateway address and hostname can't
be changed later.
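
Because these settings cannot be changed later, it can help to
sanity-check the planned values before creating or joining a cluster.
The following is a minimal sketch using Python's standard `ipaddress`
module. The function `check_ip_config` is a hypothetical helper and not
part of `pmgcm`, and it only catches obvious inconsistencies.

```python
import ipaddress


def check_ip_config(address: str, netmask: str, gateway: str) -> list:
    """Return a list of problems found in the planned network settings."""
    try:
        iface = ipaddress.ip_interface(f"{address}/{netmask}")
        gw = ipaddress.ip_address(gateway)
    except ValueError as err:
        return [f"invalid address, netmask or gateway: {err}"]

    problems = []
    if gw not in iface.network:
        problems.append(f"gateway {gw} is outside of {iface.network}")
    if gw == iface.ip:
        problems.append("gateway and node address are identical")
    return problems


# Consistent settings: no problems reported.
print(check_ip_config("192.168.2.127", "255.255.255.0", "192.168.2.1"))
# Gateway in a different network: one problem reported.
print(check_ip_config("192.168.2.127", "255.255.255.0", "192.168.3.1"))
```

An empty list means that no obvious inconsistency was found. It does not
prove that the configuration is the one you actually want.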

Creating a Cluster
~~~~~~~~~~~~~~~~~~

image::images/screenshot/pmg-gui-cluster-panel.png[]

You can create a cluster from any existing {pmg} host. All data is
preserved.

* make sure you have the right IP configuration
  (IP/MASK/GATEWAY/HOSTNAME), because you cannot change that later

* press the create button on the GUI, or run the cluster creation command:
+
----
pmgcm create
----

NOTE: The node where you run the cluster create command will be the
'master' node.


Show Cluster Status
~~~~~~~~~~~~~~~~~~~

The GUI shows the status of all cluster nodes, and it is also possible
to use the command line tool:

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1) 192.168.2.127 master A 1 day 21:18 0.30 80% 41%
----


Adding Cluster Nodes
~~~~~~~~~~~~~~~~~~~~

image::images/screenshot/pmg-gui-cluster-join.png[]

When you add a new node to a cluster (join), all data on that node is
destroyed. The whole database is initialized with cluster data from
the master.

* make sure you have the right IP configuration

* run the cluster join command (on the new node):
+
----
pmgcm join <master_ip>
----

You need to enter the root password of the master host when asked for
a password. When joining a cluster using the GUI, you also need to
enter the 'fingerprint' of the master node. You get that information
by pressing the `Add` button on the master node.

CAUTION: Node initialization deletes all existing databases, stops and
then restarts all services accessing the database. So do not add nodes
which are already active and receive mail.

Also, joining a cluster can take several minutes, because the new node
needs to synchronize all data from the master (although this is done
in the background).

NOTE: If you join a new node, existing quarantined items from the
other nodes are not synchronized to the new node.


Deleting Nodes
~~~~~~~~~~~~~~

Please detach nodes from the cluster network before removing them
from the cluster configuration. Then run the following command on
the master node:

----
pmgcm delete <cid>
----

Parameter `<cid>` is the unique cluster node ID, as listed with `pmgcm status`.
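
If you want to look up the `<cid>` from a script, the status output can
be parsed. The sketch below is based only on the sample `pmgcm status`
output shown in this chapter, so the column layout is an assumption and
may differ between versions. `NODE_ROW` and `parse_status` are
hypothetical names.

```python
import re

# One node row, assumed layout: NAME(CID) IPADDRESS ROLE STATE ...
# The remaining columns (uptime, load, memory, disk) are ignored.
NODE_ROW = re.compile(
    r"^(?P<name>\S+)\((?P<cid>\d+)\)\s+(?P<ip>\S+)\s+(?P<role>\S+)\s+(?P<state>\S+)"
)


def parse_status(output: str) -> list:
    """Extract name, CID, IP address, role and state of each listed node."""
    nodes = []
    for line in output.splitlines():
        match = NODE_ROW.match(line)
        if match:  # the header line has no numeric CID and is skipped
            node = match.groupdict()
            node["cid"] = int(node["cid"])
            nodes.append(node)
    return nodes


sample = """\
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1) 192.168.2.127 master A 1 day 21:18 0.30 80% 41%
"""

for node in parse_status(sample):
    print(node["cid"], node["name"], node["ip"], node["role"])
# prints: 1 pmg5 192.168.2.127 master
```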


Disaster Recovery
~~~~~~~~~~~~~~~~~

It is highly recommended to use redundant disks on all cluster nodes
(RAID). So in almost all circumstances, you just need to replace the
damaged hardware or disk. {pmg} uses an asynchronous
clustering algorithm, so you just need to reboot the repaired node,
and everything will work again transparently.

The following scenarios only apply when you really lose the contents
of the hard disk.


Single Node Failure
^^^^^^^^^^^^^^^^^^^

* delete failed node on master
+
----
pmgcm delete <cid>
----

* add (re-join) a new node
+
----
pmgcm join <master_ip>
----


Master Failure
^^^^^^^^^^^^^^

* force another node to be master
+
----
pmgcm promote
----

* tell other nodes that master has changed
+
----
pmgcm sync --master_ip <master_ip>
----
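
The order of the two steps matters when more than one node remains. As
a rough sketch, the following Python function only builds the list of
commands and does not run anything. It assumes that `pmgcm promote` is
run on the node that becomes the new master and that `pmgcm sync` is
run on each remaining node. The function name `master_failover_plan`
and the example addresses are hypothetical.

```python
def master_failover_plan(new_master_ip: str, other_node_ips: list) -> list:
    """Return (host, command) pairs in the order they should be run.

    Assumption: promote runs on the new master, sync on every other node.
    """
    plan = [(new_master_ip, ["pmgcm", "promote"])]
    for ip in other_node_ips:
        plan.append((ip, ["pmgcm", "sync", "--master_ip", new_master_ip]))
    return plan


for host, command in master_failover_plan("192.168.2.128", ["192.168.2.129"]):
    print(f"on {host}: {' '.join(command)}")
# prints:
# on 192.168.2.128: pmgcm promote
# on 192.168.2.129: pmgcm sync --master_ip 192.168.2.128
```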


Total Cluster Failure
^^^^^^^^^^^^^^^^^^^^^

* restore backup (cluster and node information is not restored, so you
  have to recreate master and nodes)

* tell it to become master
+
----
pmgcm create
----

* install new nodes

* add those new nodes to the cluster
+
----
pmgcm join <master_ip>
----


ifdef::manvolnum[]
include::pmg-copyright.adoc[]
endif::manvolnum[]