[[chapter-ha-manager]]
ifdef::manvolnum[]
PVE({manvolnum})
================
include::attributes.txt[]

NAME
----

ha-manager - Proxmox VE HA Manager

SYNOPSIS
--------

include::ha-manager.1-synopsis.adoc[]

DESCRIPTION
-----------
endif::manvolnum[]

ifndef::manvolnum[]
High Availability
=================
include::attributes.txt[]
endif::manvolnum[]

'ha-manager' handles management of user-defined cluster services. This
includes handling user requests such as service start, service
disable, service relocate, and service restart. The cluster resource
manager daemon also handles restarting and relocating services in the
event of failures.

How It Works
------------

The local resource manager ('pve-ha-lrm') is started as a daemon on
each node at system start and waits until the HA cluster is quorate
and locks are working. After initialization, the LRM determines which
services are enabled and starts them. The watchdog is also initialized
at this point.

The cluster resource manager ('pve-ha-crm') starts on each node and
waits there for the manager lock, which can only be held by one node
at a time. The node which successfully acquires the manager lock gets
promoted to the CRM master; it handles cluster-wide actions like
migrations and failures.

When a node leaves the cluster quorum, its state changes to unknown.
If the current CRM can then secure the failed node's lock, the services
will be 'stolen' and restarted on another node.

When a cluster member determines that it is no longer in the cluster
quorum, the LRM waits for a new quorum to form. As long as there is no
quorum the node cannot reset the watchdog. This will trigger a reboot
after 60 seconds.

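If you want to verify this machinery on a node, the following sketch
may help; it assumes the standard Proxmox VE service names for the two
daemons described above:

----
# check that the local and cluster resource managers are running
systemctl status pve-ha-lrm pve-ha-crm

# check whether the node is part of the quorate cluster partition
pvecm status
----
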
Configuration
-------------

The HA stack is well integrated into the Proxmox VE API2. So, for
example, HA can be configured via 'ha-manager' or the PVE web
interface, both of which are easy to use.

The resource configuration file is located at
'/etc/pve/ha/resources.cfg' and the group configuration file at
'/etc/pve/ha/groups.cfg'. Use the provided tools to make changes;
there shouldn't be any need to edit them manually.

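For example, to put a virtual machine under HA control from the
command line, something like the following can be used (the VMID 100
is made up); the resulting entry is stored in
'/etc/pve/ha/resources.cfg':

----
# add VM 100 to the HA managed resources
ha-manager add vm:100

# inspect the generated resource configuration
cat /etc/pve/ha/resources.cfg
----
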
Resource/Service Agents
-----------------------

A resource, also called a service, can be managed by the
ha-manager. Currently we support virtual machines and containers.

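Services are referenced by an ID which encodes the resource type, e.g.
'vm:<vmid>' for virtual machines and 'ct:<vmid>' for containers. A
short sketch with made-up VMIDs:

----
# manage a virtual machine and a container through the HA stack
ha-manager add vm:100
ha-manager add ct:101
----
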
Groups
------

A group is a collection of cluster nodes which a service may be bound to.

Group Settings
~~~~~~~~~~~~~~

nodes::

list of group node members

restricted::

resources bound to this group may only run on nodes defined by the
group. If no group node member is available the resource will be
placed in the stopped state.

nofailback::

the resource won't automatically fail back when a more preferred node
(re)joins the cluster.

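For illustration, a group definition in '/etc/pve/ha/groups.cfg' might
look roughly like the following sketch. The group and node names are
made up, and the exact key/value layout may differ between versions,
so use 'ha-manager' or the web interface instead of editing the file
by hand:

----
group: mygroup
        nodes node1,node2,node3
        restricted 1
        nofailback 1
----

A service bound to 'mygroup' would then only run on 'node1', 'node2'
or 'node3', and would not automatically move back when a more
preferred node rejoins the cluster.
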
Recovery Policy
---------------

There are two service recovery policy settings which can be configured
specifically for each resource.

max_restart::

maximum number of attempts to restart a failed service on the current
node. The default is set to one.

max_relocate::

maximum number of attempts to relocate the service to a different
node. A relocation only happens after the max_restart value is
exceeded on the current node. The default is set to one.

Note that the relocate count only resets to zero when the service has
had at least one successful start. That means that if a service is
re-enabled without fixing the error, only the restart policy gets
repeated.

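As an illustration, a resource entry in '/etc/pve/ha/resources.cfg'
with adjusted recovery settings might look roughly like this (the
VMID, the group and the exact layout are examples only; prefer the
provided tools over manual edits):

----
vm: 100
        group mygroup
        max_restart 2
        max_relocate 2
----

With these values the HA stack would try to restart the failed service
twice on its current node before trying, at most twice, to relocate it
to another node.
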
Error Recovery
--------------

If, after all attempts, the service state could not be recovered, it
gets placed into an error state. In this state the service won't be
touched by the HA stack anymore. To recover from this state you should
follow these steps:

* bring the resource back into a safe and consistent state (e.g.,
killing its process)

* disable the HA resource to place it in a stopped state

* fix the error which led to these failures

* *after* you fixed all errors you may enable the service again, as
shown in the example below

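A minimal sketch of this cycle on the command line, assuming a
hypothetical service 'vm:100' that ended up in the error state:

----
# place the failed service into the stopped state
ha-manager disable vm:100

# ... investigate and fix the underlying problem ...

# once everything is fixed, let the HA stack manage it again
ha-manager enable vm:100
----
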
Service Operations
------------------

This is how the basic user-initiated service operations (via
'ha-manager') work.

enable::

the service will be started by the LRM if not already running.

disable::

the service will be stopped by the LRM if running.

migrate/relocate::

the service will be moved to another node: a migrate keeps the service
running while it is moved (live migration), whereas a relocate stops
it and starts it again on the target node.

remove::

the service will be removed from the HA managed resource list. Its
current state will not be touched.

start/stop::

start and stop commands can be issued to the resource specific tools
(like 'qm' or 'pct'); they will forward the request to the
'ha-manager', which will then execute the action and set the resulting
service state (enabled, disabled).

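For example, moving a hypothetical service 'vm:100' to a node named
'node2' could look like this:

----
# live-migrate the service to node2
ha-manager migrate vm:100 node2

# or: stop the service, move it and start it again on node2
ha-manager relocate vm:100 node2
----
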
Service States
--------------

stopped::

Service is stopped (confirmed by LRM)

request_stop::

Service should be stopped. Waiting for confirmation from LRM.

started::

Service is active, and the LRM should start it ASAP if not already
running.

fence::

Wait for node fencing (service node is not inside quorate cluster
partition).

freeze::

Do not touch the service state. We use this state while we reboot a
node, or when we restart the LRM daemon.

migrate::

Migrate service (live) to another node.

error::

Service disabled because of LRM errors. Needs manual intervention.

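The current manager, node and service states can be inspected at any
time with the 'ha-manager' command line tool:

----
# show the CRM master, the LRM state of each node and the state of
# each managed service
ha-manager status
----
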
ifdef::manvolnum[]
include::pve-copyright.adoc[]
endif::manvolnum[]