[[chapter_pmgcm]]
ifdef::manvolnum[]
pmgcm(1)
========
:pmg-toplevel:

NAME
----

pmgcm - Proxmox Mail Gateway Cluster Management Toolkit


SYNOPSIS
--------

include::pmgcm.1-synopsis.adoc[]


DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Management
==================
:pmg-toplevel:
endif::manvolnum[]

We live in a world where email is becoming more and more important -
failures in email systems are simply not acceptable. To meet these
requirements, we developed the Proxmox HA (High Availability) Cluster.

The {pmg} HA Cluster consists of a master and several slave nodes
(minimum one node). Configuration is done on the master. Configuration
and data are synchronized to all cluster nodes over a VPN tunnel. This
provides the following advantages:

* centralized configuration management

* fully redundant data storage

* high availability

* high performance

We use a unique application-level clustering scheme, which provides
extremely good performance. Special considerations were taken to make
management as easy as possible. A complete cluster setup is done within
minutes, and nodes automatically reintegrate after temporary failures
without any operator interaction.

image::images/pmg-ha-cluster.png[]


Hardware requirements
---------------------

There are no special hardware requirements, although it is highly
recommended to use fast and reliable servers with redundant disks on
all cluster nodes (hardware RAID with BBU and write cache enabled).

The HA Cluster can also run in virtualized environments.


Subscriptions
-------------

Each host in a cluster has its own subscription. If you want support
for a cluster, each cluster node needs to have a valid
subscription. All nodes must have the same subscription level.


Load balancing
--------------

You can use one of the mechanisms described in chapter 9 if you want to
distribute mail traffic among the cluster nodes. Please note that this
is not always required, because it is also reasonable to use only one
node to handle SMTP traffic. The second node can then be used as a
quarantine host, providing the web interface to the user quarantine.
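
One commonly used mechanism is plain DNS-based balancing: publishing
one equal-priority MX record per cluster node lets sending servers
pick either node. The following zone-file fragment is only a
hypothetical sketch - the domain and host names are placeholders:

----
; hypothetical zone-file fragment: two MX records with equal
; priority (10), so senders distribute mail across both nodes
example.com.   IN  MX  10  pmg1.example.com.
example.com.   IN  MX  10  pmg2.example.com.
----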


Cluster administration
----------------------

Cluster administration is done with a single command line utility
called `pmgcm`. You therefore need to log in via SSH to manage the
cluster setup.

NOTE: Always set up the IP configuration before adding a node to the
cluster. The IP address, network mask, gateway address and hostname
can't be changed later.


Creating a Cluster
~~~~~~~~~~~~~~~~~~

You can create a cluster from any existing Proxmox host. All data is
preserved.

* make sure you have the right IP configuration
(IP/MASK/GATEWAY/HOSTNAME), because you cannot change that later

* run the cluster creation command:
+
----
pmgcm create
----


List Cluster Status
~~~~~~~~~~~~~~~~~~~

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1)              192.168.2.127    master A           1 day 21:18   0.30    80%    41%
----


Adding Cluster Nodes
~~~~~~~~~~~~~~~~~~~~

When you add a new node to a cluster (join), all data on that node is
destroyed. The whole database is initialized with the cluster data from
the master.

* make sure you have the right IP configuration

* run the cluster join command (on the new node):
+
----
pmgcm join <master_ip>
----

You need to enter the root password of the master host when asked for
a password.

CAUTION: Node initialization deletes all existing databases, and stops
and then restarts all services accessing the database. Therefore, do
not add nodes which are already active and receiving mail.

Also, joining a cluster can take several minutes, because the new node
needs to synchronize all data from the master (although this is done
in the background).

NOTE: If you join a new node, existing quarantined items from the
other nodes are not synchronized to the new node.


Deleting Nodes
~~~~~~~~~~~~~~

Please detach nodes from the cluster network before removing them
from the cluster configuration. Then run the following command on
the master node:

----
pmgcm delete <cid>
----

Parameter `<cid>` is the unique cluster node ID, as listed with `pmgcm status`.
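
Since the CID has to be read manually from the `pmgcm status` listing,
a small helper function can extract it. This is a hypothetical
convenience, not part of {pmg}; it assumes the status output format
shown in the example above:

----
# Hypothetical helper: print the CID of a named node, given
# `pmgcm status` lines such as "pmg5(1)  192.168.2.127  master ..."
pmgcm_cid() {
    # $1 = node name; status lines are read from stdin
    awk -v node="$1" '$1 ~ "^" node "\\(" {
        cid = $1
        sub(/^.*\(/, "", cid)   # drop the "name(" prefix
        sub(/\).*/,  "", cid)   # drop the closing ")"
        print cid
    }'
}

# normally you would pipe live output: pmgcm status | pmgcm_cid pmg5
printf '%s\n' 'pmg5(1)  192.168.2.127  master A  1 day 21:18  0.30  80%  41%' \
    | pmgcm_cid pmg5    # prints "1"
----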


Disaster Recovery
~~~~~~~~~~~~~~~~~

It is highly recommended to use redundant disks on all cluster nodes
(RAID). So in almost any circumstance, you just need to replace the
damaged hardware or disk. {pmg} uses an asynchronous
clustering algorithm, so you just need to reboot the repaired node,
and everything will work again transparently.

The following scenarios only apply when you really lose the contents
of the hard disk.


Single Node Failure
^^^^^^^^^^^^^^^^^^^

* delete the failed node on the master
+
----
pmgcm delete <cid>
----

* add (re-join) a new node
+
----
pmgcm join <master_ip>
----


Master Failure
^^^^^^^^^^^^^^

* force another node to become master
+
----
pmgcm promote
----

* tell the other nodes that the master has changed
+
----
pmgcm sync --master_ip <master_ip>
----


Total Cluster Failure
^^^^^^^^^^^^^^^^^^^^^

* restore a backup (cluster and node information is not restored; you
have to recreate the master and the nodes)

* tell the restored node to become the master
+
----
pmgcm create
----

* install new nodes

* add those new nodes to the cluster
+
----
pmgcm join <master_ip>
----


ifdef::manvolnum[]
include::pmg-copyright.adoc[]
endif::manvolnum[]