[[chapter_pmgcm]]
ifdef::manvolnum[]
pmgcm(1)
========
:pmg-toplevel:

NAME
----

pmgcm - Proxmox Mail Gateway Cluster Management Toolkit


SYNOPSIS
--------

include::pmgcm.1-synopsis.adoc[]


DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Management
==================
:pmg-toplevel:
endif::manvolnum[]

We live in a world where email is becoming more and more important,
and failures in email systems are simply not acceptable. To meet these
requirements, we developed the Proxmox HA (High Availability) Cluster.

The {pmg} HA Cluster consists of a master and several slave nodes
(minimum one node). Configuration is done on the master. Configuration
and data are synchronized to all cluster nodes over a VPN tunnel. This
provides the following advantages:

* centralized configuration management

* fully redundant data storage

* high availability

* high performance

We use a unique application-level clustering scheme, which provides
extremely good performance. Special considerations were taken to make
management as easy as possible. A complete cluster setup is done within
minutes, and nodes automatically reintegrate after temporary failures
without any operator interaction.

image::images/pmg-ha-cluster.png[]


Hardware requirements
---------------------

There are no special hardware requirements, although it is highly
recommended to use fast and reliable servers with redundant disks on
all cluster nodes (hardware RAID with BBU and write cache enabled).

The HA Cluster can also run in virtualized environments.


Subscriptions
-------------

Each host in a cluster has its own subscription. If you want support
for a cluster, each cluster node needs to have a valid
subscription. All nodes must have the same subscription level.


Load balancing
--------------

You can use one of the mechanisms described in chapter 9 if you want
to distribute mail traffic among the cluster nodes. Please note that
this is not always required, because it is also reasonable to use only
one node to handle SMTP traffic. The second node can then be used as a
quarantine host (providing the web interface to the user quarantine).

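
For example, one common DNS-based approach is to publish one MX record
per cluster node, all with the same preference, so that sending servers
distribute connections across the nodes. The zone excerpt below is only
an illustration; `example.com`, `pmg1` and `pmg2` are placeholder names.

----
; illustrative DNS zone excerpt - both nodes accept inbound mail
example.com.   IN  MX  10  pmg1.example.com.
example.com.   IN  MX  10  pmg2.example.com.
----
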
Cluster administration
----------------------

Cluster administration is done with a single command line utility
called `pmgcm`, so you need to log in via SSH to manage the cluster
setup.

NOTE: Always set up the IP configuration before adding a node to the
cluster. The IP address, network mask, gateway address and hostname
can't be changed later.

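
A minimal pre-check before creating or joining a cluster could look
like the following, assuming the default Debian-based network setup
used by {pmg} (the exact files may differ in customized environments):

----
# review the settings that become fixed once the node is clustered
hostname --fqdn
cat /etc/network/interfaces
cat /etc/hosts
----
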
Creating a Cluster
~~~~~~~~~~~~~~~~~~

You can create a cluster from any existing Proxmox host. All data is
preserved.

* make sure you have the right IP configuration
  (IP/MASK/GATEWAY/HOSTNAME), because you cannot change that later

* run the cluster creation command:
+
----
pmgcm create
----

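
Afterwards, you can verify the result; the freshly created cluster
consists of the local node only, listed with the `master` role:

----
pmgcm status
----
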
List Cluster Status
~~~~~~~~~~~~~~~~~~~

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1)              192.168.2.127    master A           1 day 21:18   0.30    80%   41%
----


Adding Cluster Nodes
~~~~~~~~~~~~~~~~~~~~

When you add a new node to a cluster (join), all data on that node is
destroyed. The whole database is initialized with the cluster data from
the master.

* make sure you have the right IP configuration

* run the cluster join command (on the new node):
+
----
pmgcm join <master_ip>
----

You need to enter the root password of the master host when asked for
a password.
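
For example, if the master is the node `pmg5` from the status output
above (IP `192.168.2.127`), you would run the following on the new
node (the IP address is only an illustration, use your master's
address):

----
pmgcm join 192.168.2.127
----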

CAUTION: Node initialization deletes all existing databases, stops and
then restarts all services accessing the database. So do not add nodes
which are already active and receiving mail.

Also, joining a cluster can take several minutes, because the new node
needs to synchronize all data from the master (although this is done
in the background).

NOTE: If you join a new node, existing quarantined items from the
other nodes are not synchronized to the new node.


Deleting Nodes
~~~~~~~~~~~~~~

Please detach nodes from the cluster network before removing them
from the cluster configuration. Then run the following command on
the master node:

----
pmgcm delete <cid>
----

Parameter `<cid>` is the unique cluster node ID, as listed with
`pmgcm status`.

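
For example, to remove a node which `pmgcm status` lists with CID `2`
(an illustrative value), run the following on the master:

----
pmgcm delete 2
----
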
Disaster Recovery
~~~~~~~~~~~~~~~~~

It is highly recommended to use redundant disks on all cluster nodes
(RAID). So in almost all circumstances, you just need to replace the
damaged hardware or disk. {pmg} uses an asynchronous
clustering algorithm, so you just need to reboot the repaired node,
and everything will work again transparently.

The following scenarios only apply when you really lose the contents
of the hard disk.


Single Node Failure
^^^^^^^^^^^^^^^^^^^

* delete failed node on master
+
----
pmgcm delete <cid>
----

* add (re-join) a new node (a combined example follows below)
+
----
pmgcm join <master_ip>
----

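
For example, if the failed node was listed with CID `2` and the master
has IP `192.168.2.127` (both values are only illustrations):

----
# on the master: remove the failed node from the cluster configuration
pmgcm delete 2

# on the freshly installed replacement node: join the cluster again
pmgcm join 192.168.2.127
----
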
Master Failure
^^^^^^^^^^^^^^

* force another node to be the master
+
----
pmgcm promote
----

* tell the other nodes that the master has changed (combined example below)
+
----
pmgcm sync --master_ip <master_ip>
----

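
Putting both steps together, an illustrative sequence could look like
this, assuming the surviving node which should take over has the IP
address `192.168.2.128` (a placeholder value):

----
# on the node that should become the new master
pmgcm promote

# on all remaining cluster nodes
pmgcm sync --master_ip 192.168.2.128
----
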
Total Cluster Failure
^^^^^^^^^^^^^^^^^^^^^

* restore the backup (cluster and node information is not restored;
  you have to recreate the master and the nodes)

* tell it to become the master
+
----
pmgcm create
----

* install new nodes

* add those new nodes to the cluster
+
----
pmgcm join <master_ip>
----


ifdef::manvolnum[]
include::pmg-copyright.adoc[]
endif::manvolnum[]