[[chapter_pmgcm]]
ifdef::manvolnum[]
pmgcm(1)
========
:pmg-toplevel:

NAME
----

pmgcm - Proxmox Mail Gateway Cluster Management Toolkit


SYNOPSIS
--------

include::pmgcm.1-synopsis.adoc[]


DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Management
==================
:pmg-toplevel:
endif::manvolnum[]

Email is becoming more and more important, and failures in email
systems are simply not acceptable. To meet these requirements, we
developed the Proxmox HA (High Availability) Cluster.

The {pmg} HA Cluster consists of a master and several slave nodes
(at least one slave node). Configuration is done on the master.
Configuration and data are synchronized to all cluster nodes over a
VPN tunnel. This provides the following advantages:

* centralized configuration management

* fully redundant data storage

* high availability

* high performance

We use a unique application-level clustering scheme, which provides
extremely good performance. Special considerations were taken to make
management as easy as possible. A complete cluster setup is done within
minutes, and nodes automatically reintegrate after temporary failures,
without any operator interaction.

image::images/pmg-ha-cluster.png[]


Hardware requirements
---------------------

There are no special hardware requirements, although it is highly
recommended to use fast and reliable servers with redundant disks on
all cluster nodes (hardware RAID with BBU and write cache enabled).

The HA Cluster can also run in virtualized environments.


Subscriptions
-------------

Each host in a cluster has its own subscription. If you want support
for a cluster, each cluster node needs to have a valid
subscription. All nodes must have the same subscription level.


Load balancing
--------------

You can use one of the mechanisms described in chapter 9 if you want
to distribute mail traffic among the cluster nodes. Note that this is
not always required, because it is also reasonable to use only one
node to handle SMTP traffic. A second node can then be used as a
quarantine host, providing the web interface for the user quarantine.

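For example, a DNS-based scheme can distribute inbound SMTP traffic
across the nodes. The following zone fragment is a hypothetical sketch
(domain and hostnames are placeholders, not from this document): two
MX records with equal preference make sending servers alternate
between both cluster nodes.

```
; hypothetical zone file fragment - two MX records with the same
; preference let remote mail servers pick either node
example.com.    IN  MX  10  pmg1.example.com.
example.com.    IN  MX  10  pmg2.example.com.
```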

Cluster administration
----------------------

Cluster administration is done with a single command line utility
called `pmgcm`. You need to log in via SSH to manage the cluster
setup.

NOTE: Always set up the IP configuration before adding a node to the
cluster. The IP address, network mask, gateway address and hostname
can't be changed later.
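
Before creating or joining a cluster, it is worth double-checking
these settings. A minimal pre-flight check might look like this
(standard Linux tools, not part of `pmgcm`):

```shell
# Verify the node's identity and network configuration - these
# settings cannot be changed once the node is part of the cluster.
uname -n             # hostname the node will keep
ip -brief address    # configured IP addresses and network masks
ip route             # routing table, including the default gateway
```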


Creating a Cluster
~~~~~~~~~~~~~~~~~~

You can create a cluster from any existing Proxmox host. All data is
preserved.

* make sure you have the right IP configuration
(IP/MASK/GATEWAY/HOSTNAME), because you cannot change that later

* run the cluster creation command:
+
----
pmgcm create
----


List Cluster Status
~~~~~~~~~~~~~~~~~~~

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1)                  192.168.2.127 master A           1 day 21:18   0.30    80%    41%
----
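
Since the status output is plain, column-oriented text, it can be fed
to standard tools for simple monitoring. A sketch, using a made-up
sample (the second node `pmg6` and its state `S` are invented for
illustration, not real `pmgcm` output):

```shell
# Hypothetical sample mirroring the `pmgcm status` format shown above;
# on a real cluster you would pipe `pmgcm status` itself instead.
status_sample='--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1)                  192.168.2.127 master A           1 day 21:18   0.30    80%    41%
pmg6(2)                  192.168.2.128 node   S           0 days 02:10  0.10    60%    35%'

# Print every node that is not in the active ("A") state.
printf '%s\n' "$status_sample" | awk 'NR > 1 && $4 != "A" { print $1, "state:", $4 }'
```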


Adding Cluster Nodes
~~~~~~~~~~~~~~~~~~~~

When you add a new node to a cluster (join), all data on that node is
destroyed. The whole database is initialized with the cluster data
from the master.

* make sure you have the right IP configuration

* run the cluster join command (on the new node):
+
----
pmgcm join <master_ip>
----

You need to enter the root password of the master host when asked for
a password.

CAUTION: Node initialization deletes all existing databases, and stops
and then restarts all services accessing the database. So do not add
nodes which are already active and receiving mail.

Also, joining a cluster can take several minutes, because the new node
needs to synchronize all data from the master (although this is done
in the background).

NOTE: If you join a new node, existing quarantined items from the
other nodes are not synchronized to the new node.


Deleting Nodes
~~~~~~~~~~~~~~

Please detach nodes from the cluster network before removing them
from the cluster configuration. Then run the following command on
the master node:

----
pmgcm delete <cid>
----

Parameter `<cid>` is the unique cluster node ID, as listed with `pmgcm status`.


Disaster Recovery
~~~~~~~~~~~~~~~~~

It is highly recommended to use redundant disks on all cluster nodes
(RAID). So in almost all circumstances you just need to replace the
damaged hardware or disk. {pmg} uses an asynchronous
clustering algorithm, so you just need to reboot the repaired node,
and everything will work again transparently.

The following scenarios only apply when you really lose the contents
of the hard disk.


Single Node Failure
^^^^^^^^^^^^^^^^^^^

* delete the failed node on the master
+
----
pmgcm delete <cid>
----

* add (re-join) a new node
+
----
pmgcm join <master_ip>
----


Master Failure
^^^^^^^^^^^^^^

* force another node to become the master
+
----
pmgcm promote
----

* tell the other nodes that the master has changed
+
----
pmgcm sync --master_ip <master_ip>
----


Total Cluster Failure
^^^^^^^^^^^^^^^^^^^^^

* restore the backup (cluster and node information is not restored;
you have to recreate the master and the nodes)

* tell the restored host to become the master
+
----
pmgcm create
----

* install new nodes

* add those new nodes to the cluster
+
----
pmgcm join <master_ip>
----

228
229 ifdef::manvolnum[]
230 include::pmg-copyright.adoc[]
231 endif::manvolnum[]