[[chapter_pmgcm]]
ifdef::manvolnum[]
pmgcm(1)
========
:pmg-toplevel:

NAME
----

pmgcm - Proxmox Mail Gateway Cluster Management Toolkit


SYNOPSIS
--------

include::pmgcm.1-synopsis.adoc[]


DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Management
==================
:pmg-toplevel:
endif::manvolnum[]

We live in a world where email is becoming more and more important, and
failures in email systems are not acceptable. To meet these
requirements, we developed the Proxmox HA (High Availability) Cluster.

The {pmg} HA Cluster consists of a master node and one or more slave
nodes. Configuration is done on the master. Configuration
and data are synchronized to all cluster nodes over a VPN tunnel. This
provides the following advantages:

* centralized configuration management

* fully redundant data storage

* high availability

* high performance

We use a unique application-level clustering scheme, which provides
extremely good performance. Special care was taken to make
management as easy as possible. A complete cluster setup can be done within
minutes, and nodes automatically reintegrate after temporary failures
without any operator interaction.

image::images/Proxmox_HA_cluster_final_1024.png[]


Hardware requirements
---------------------

There are no special hardware requirements, although it is highly
recommended to use fast and reliable servers with redundant disks on
all cluster nodes (hardware RAID with BBU and write cache enabled).

The HA Cluster can also run in virtualized environments.


Subscriptions
-------------

Each node in a cluster has its own subscription. If you want support
for a cluster, each cluster node needs to have a valid
subscription. All nodes must have the same subscription level.


Load balancing
--------------

It is usually advisable to distribute mail traffic among all cluster
nodes. Please note that this is not always required, because it is
also reasonable to use only one node to handle SMTP traffic. The
second node is then used as a quarantine host, and only provides the web
interface to the user quarantine.

The normal mail delivery process looks up DNS Mail Exchange (`MX`)
records to determine the destination host. An `MX` record tells the
sending system where to deliver mail for a certain domain. It is also
possible to have several `MX` records for a single domain, and they can
have different priorities. For example, our `MX` record looks like this:

----
# dig -t mx proxmox.com

;; ANSWER SECTION:
proxmox.com. 22879 IN MX 10 mail.proxmox.com.

;; ADDITIONAL SECTION:
mail.proxmox.com. 22879 IN A 213.129.239.114
----

Notice that there is a single `MX` record for the domain
`proxmox.com`, pointing to `mail.proxmox.com`. The `dig` command
automatically prints the corresponding address record if it
exists. In our case it points to `213.129.239.114`. The priority of
our `MX` record is set to 10 (preferred default value).
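
If you want to inspect the `MX` records of a domain from a script, the
following is a minimal sketch and not part of {pmg}. It assumes the
`<preference> <host>` line format that `dig +short` prints for `MX` queries;
the function name `mx_by_preference` is made up for this example.

```shell
# Sort MX answers by preference value, lowest (most preferred) first.
# Input on stdin: one "<preference> <host>" pair per line, which is the
# form printed by `dig +short -t mx <domain>`.
mx_by_preference() {
    sort -n -k1,1
}

# Example against live DNS (requires network access):
#   dig +short -t mx proxmox.com | mx_by_preference
```

With several `MX` records, the first line of the output is the host that
sending systems try first.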


Hot standby with backup `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many people do not want to install two redundant mail proxies. Instead,
they use the mail proxy of their ISP as a fallback. This is simply done
by adding an additional `MX` record with a lower priority (higher
number). With the example above, this looks like this:

----
proxmox.com. 22879 IN MX 100 mail.provider.tld.
----

In such a setup, your provider must accept mails for your domain and
forward them to you. Please note that this is not advisable, because
spam detection needs to be done by the backup `MX` server as well, and
external servers provided by ISPs usually do not do this.

However, you will never lose mails with such a setup, because the sending Mail
Transport Agent (MTA) will simply deliver the mail to the backup
server (mail.provider.tld) if the primary server (mail.proxmox.com) is
not available.

NOTE: Any reasonable mail server retries mail delivery if the target
server is not available, and {pmg} stores mail and retries delivery
for up to one week. So you will not lose mails if your mail server is
down, even if you run a single server setup.


Load balancing with `MX` records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using your ISP's mail server is not always a good idea, because many
ISPs do not use advanced spam prevention techniques, or do not filter
spam at all. It is often better to run a second server yourself to
avoid lower spam detection rates.

It’s quite simple to set up a high-performance, load-balanced
mail cluster using `MX` records. You need to define two `MX` records
with the same priority. Here is a complete example to make it clearer.

First, you need to have at least two working {pmg} servers
(mail1.example.com and mail2.example.com) configured as a cluster (see
section xref:pmg_cluster_administration[Cluster administration]
below), each having its own IP address. Let us assume the following
DNS address records:

----
mail1.example.com. 22879 IN A 1.2.3.4
mail2.example.com. 22879 IN A 1.2.3.5
----

It is always a good idea to add reverse lookup entries (PTR
records) for those hosts. Many email systems nowadays reject mails
from hosts without valid PTR records. Then you need to define your `MX`
records:

----
example.com. 22879 IN MX 10 mail1.example.com.
example.com. 22879 IN MX 10 mail2.example.com.
----

This is all you need. You will receive mails on both hosts, load-balanced using
round-robin scheduling. If one host fails, the other one is used.
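
To check such a setup from a script, you could list the hosts that share the
lowest preference value. This is only a sketch under the same assumption as
before (input lines in the `<preference> <host>` form printed by
`dig +short`); `preferred_mx_hosts` is a hypothetical helper name, not a {pmg}
command.

```shell
# Print the hosts that share the lowest MX preference value.
# Input on stdin: "<preference> <host>" lines, for example from
# `dig +short -t mx example.com`.
preferred_mx_hosts() {
    sort -n -k1,1 |
        awk 'NR == 1 { best = $1 } $1 == best { print $2 }'
}

# Example against live DNS (requires network access):
#   dig +short -t mx example.com | preferred_mx_hosts
```

If the output contains both cluster nodes, incoming mail is distributed
between them. If it contains only one host, the other records act as backups
only.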


Other ways
~~~~~~~~~~

Multiple address records
^^^^^^^^^^^^^^^^^^^^^^^^

Using several DNS `MX` records is sometimes tedious if you have many
domains. It is also possible to use one `MX` record per domain, but
multiple address records:

----
example.com. 22879 IN MX 10 mail.example.com.
mail.example.com. 22879 IN A 1.2.3.4
mail.example.com. 22879 IN A 1.2.3.5
----


Using firewall features
^^^^^^^^^^^^^^^^^^^^^^^

Many firewalls can do some kind of round-robin (RR) scheduling when
using DNAT. See your firewall manual for more details.


[[pmg_cluster_administration]]
Cluster administration
----------------------

Cluster administration can be done in the GUI or by using the command
line utility `pmgcm`. The CLI tool is a bit more verbose, so we suggest
using it if you run into any problems.

NOTE: Always set up the IP configuration before adding a node to the
cluster. IP address, network mask, gateway address and hostname can’t
be changed later.

Creating a Cluster
~~~~~~~~~~~~~~~~~~

[thumbnail="pmg-gui-cluster-panel.png", big=1]

You can create a cluster from any existing {pmg} host. All data is
preserved.

* make sure you have the right IP configuration
(IP/MASK/GATEWAY/HOSTNAME), because you cannot change that later

* press the create button on the GUI, or run the cluster creation command:
+
----
pmgcm create
----

NOTE: The node where you run the cluster create command will be the
'master' node.


Show Cluster Status
~~~~~~~~~~~~~~~~~~~

The GUI shows the status of all cluster nodes, and it is also possible
to use the command line tool:

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1) 192.168.2.127 master A 1 day 21:18 0.30 80% 41%
----


[[pmgcm_join]]
Adding Cluster Nodes
~~~~~~~~~~~~~~~~~~~~

[thumbnail="pmg-gui-cluster-join.png", big=1]

When you add a new node to a cluster (using `join`), all data on that node is
destroyed. The whole database is initialized with the cluster data from
the master.

* make sure you have the right IP configuration

* run the cluster join command (on the new node):
+
----
pmgcm join <master_ip>
----

You need to enter the root password of the master host when asked for
a password. When joining a cluster using the GUI, you also need to
enter the 'fingerprint' of the master node. You can get that information
by pressing the `Add` button on the master node.

CAUTION: Node initialization deletes all existing databases, and stops and
then restarts all services accessing the database. So do not add nodes
that are already active and receiving mail.

Also, joining a cluster can take several minutes, because the new node
needs to synchronize all data from the master (although this is done
in the background).

NOTE: If you join a new node, existing quarantined items from the other
nodes are not synchronized to the new node.
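
Because `join` wipes the node it runs on, it can be worth checking the master
address for typos before running the command. The following sketch assumes
that the master is addressed by a dotted-quad IPv4 address; `is_ipv4` is a
hypothetical helper, not part of `pmgcm`.

```shell
# Return success if $1 looks like a dotted-quad IPv4 address
# (four decimal octets, each in the range 0-255).
is_ipv4() {
    case $1 in
        ''|*[!0-9.]*|*..*|.*|*.) return 1 ;;
    esac
    # Split on dots in a subshell, so that IFS and the positional
    # parameters of the caller are left untouched.
    (
        IFS=.
        set -- $1
        [ "$#" -eq 4 ] || exit 1
        for octet in "$@"; do
            [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || exit 1
        done
    )
}

# Usage on the new node (master_ip is a placeholder variable):
#   is_ipv4 "$master_ip" && pmgcm join "$master_ip"
```

The check only validates the format of the address. It does not verify that
the address belongs to the master node.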


Deleting Nodes
~~~~~~~~~~~~~~

Please detach nodes from the cluster network before removing them
from the cluster configuration. Then run the following command on
the master node:

----
pmgcm delete <cid>
----

Parameter `<cid>` is the unique cluster node ID, as listed with `pmgcm status`.
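
If you need the CID in a script, it can be extracted from the status output.
This is a sketch that assumes the first column has the `name(cid)` form seen in
the `pmgcm status` example in this chapter; `cid_of_node` is a hypothetical
helper name.

```shell
# Print the CID of the node named $1.
# Input on stdin: the output of `pmgcm status`. The first column is
# assumed to have the form "name(cid)".
cid_of_node() {
    awk -v name="$1" '
        {
            p = index($1, "(")
            if (p > 0 && substr($1, 1, p - 1) == name) {
                cid = substr($1, p + 1)
                sub(/\).*$/, "", cid)
                print cid
            }
        }'
}

# Usage on the master node:
#   pmgcm status | cid_of_node pmg5
```

The function prints nothing if no node with the given name is listed, so check
for an empty result before passing it to `pmgcm delete`.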


Disaster Recovery
~~~~~~~~~~~~~~~~~

It is highly recommended to use redundant disks on all cluster nodes
(RAID). In almost all circumstances, you then just need to replace the
damaged hardware or disk. {pmg} uses an asynchronous
clustering algorithm, so you just need to reboot the repaired node,
and everything will work again transparently.

The following scenarios only apply when you really lose the contents
of the hard disk.


Single Node Failure
^^^^^^^^^^^^^^^^^^^

* delete the failed node on the master
+
----
pmgcm delete <cid>
----

* add (re-join) a new node
+
----
pmgcm join <master_ip>
----


Master Failure
^^^^^^^^^^^^^^

* force another node to be master
+
-----
pmgcm promote
-----

* tell the other nodes that the master has changed
+
----
pmgcm sync --master_ip <master_ip>
----


Total Cluster Failure
^^^^^^^^^^^^^^^^^^^^^

* restore the backup (cluster and node information is not restored; you
have to recreate the master and nodes)

* tell the restored node to become master
+
----
pmgcm create
----

* install new nodes

* add those new nodes to the cluster
+
----
pmgcm join <master_ip>
----


ifdef::manvolnum[]
include::pmg-copyright.adoc[]
endif::manvolnum[]