X-Git-Url: https://git.proxmox.com/?p=pve-docs.git;a=blobdiff_plain;f=ha-manager.adoc;h=6400e208f6338f6b136a91e2767e9c59a31d5a5f;hp=cef806d7af723599410a96f8c8a9a7b41e38b206;hb=1acab952e33ec87ba3d15fd1711fc45a1989f5b6;hpb=01911cf3ca3ed6f4560fe510f3cbbbf8b1219e0d

diff --git a/ha-manager.adoc b/ha-manager.adoc
index cef806d..6400e20 100644
--- a/ha-manager.adoc
+++ b/ha-manager.adoc
@@ -1,15 +1,15 @@
-[[chapter-ha-manager]]
+[[chapter_ha_manager]]
 ifdef::manvolnum[]
-PVE({manvolnum})
-================
-include::attributes.txt[]
+ha-manager(1)
+=============
+:pve-toplevel:
 
 NAME
 ----
 
 ha-manager - Proxmox VE HA Manager
 
-SYNOPSYS
+SYNOPSIS
 --------
 
 include::ha-manager.1-synopsis.adoc[]
@@ -17,14 +17,12 @@ include::ha-manager.1-synopsis.adoc[]
 
 DESCRIPTION
 -----------
 endif::manvolnum[]
-
 ifndef::manvolnum[]
 High Availability
 =================
-include::attributes.txt[]
+:pve-toplevel:
 endif::manvolnum[]
-
 
 Our modern society depends heavily on information provided by
 computers over the network. Mobile devices amplified that dependency,
 because people can access the network any time from anywhere. If you
@@ -58,7 +56,7 @@ yourself. The following solutions works without modifying the
 software:
 
 * Use reliable ``server'' components
-
++
 NOTE: Computer components with same functionality can have varying
 reliability numbers, depending on the component quality. Most vendors
 sell components with higher reliability as ``server'' components -
@@ -107,21 +105,27 @@ hard and costly. `ha-manager` has typical error detection and failover
 times of about 2 minutes, so you can get no more than 99.999%
 availability.
 
+
 Requirements
 ------------
 
+You must meet the following requirements before you start with HA:
+
 * at least three cluster nodes (to get reliable quorum)
 
 * shared storage for VMs and containers
 
 * hardware redundancy (everywhere)
 
+* use reliable ``server'' components
+
 * hardware watchdog - if not available we fall back to the
   linux kernel software watchdog (`softdog`)
 
 * optional hardware fencing devices
 
 
+[[ha_manager_resources]]
 Resources
 ---------
 
@@ -148,16 +152,17 @@ To provide High Availability two daemons run on each node:
 
 `pve-ha-lrm`::
 
-The local resource manager (LRM), it controls the services running on
-the local node.
-It reads the requested states for its services from the current manager
-status file and executes the respective commands.
+The local resource manager (LRM), which controls the services running on
+the local node. It reads the requested states for its services from
+the current manager status file and executes the respective commands.
 
 `pve-ha-crm`::
 
-The cluster resource manager (CRM), it controls the cluster wide
-actions of the services, processes the LRM results and includes the state
-machine which controls the state of each service.
+The cluster resource manager (CRM), which makes the cluster-wide
+decisions. It sends commands to the LRM, processes the results,
+and moves resources to other nodes if something fails. The CRM also
+handles node fencing.
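+
+To verify that both daemons are running and to see the manager's
+current view of the cluster, you can query them directly. This is a
+minimal sketch, assuming a standard {pve} installation where both
+daemons run as systemd units:
+
+----
+# systemctl status pve-ha-lrm pve-ha-crm
+# ha-manager status
+----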
+
 
 .Locks in the LRM & CRM
 [NOTE]
@@ -269,17 +274,137 @@ quorum, the LRM waits for a new quorum to form.
 As long as there is no quorum the node cannot reset the watchdog. This will trigger a reboot
 after the watchdog then times out, this happens after 60 seconds.
 
+
 Configuration
 -------------
 
-The HA stack is well integrated in the Proxmox VE API2. So, for
-example, HA can be configured via `ha-manager` or the PVE web
-interface, which both provide an easy to use tool.
+The HA stack is well integrated into the {pve} API. So, for example,
+HA can be configured via the `ha-manager` command line interface, or
+the {pve} web interface - both interfaces provide an easy way to
+manage HA. Automation tools can use the API directly.
+
+All HA configuration files are within `/etc/pve/ha/`, so they get
+automatically distributed to the cluster nodes, and all nodes share
+the same HA configuration.
+
+
+Resources
+~~~~~~~~~
+
+The resource configuration file `/etc/pve/ha/resources.cfg` stores
+the list of resources managed by `ha-manager`. A resource configuration
+inside that list looks like this:
+
+----
+<type>: <name>
+        <property> <value>
+        ...
+----
+
+It starts with a resource type followed by a resource-specific name,
+separated by a colon. Together this forms the HA resource ID, which is
+used by all `ha-manager` commands to uniquely identify a resource
+(example: `vm:100` or `ct:101`). The next lines contain additional
+properties:
+
+include::ha-resources-opts.adoc[]
+
+Here is a real-world example with one VM and one container. As you see,
+the syntax of those files is really simple, so it is even possible to
+read or edit those files using your favorite editor:
+
+.Configuration Example (`/etc/pve/ha/resources.cfg`)
+----
+vm: 501
+    state started
+    max_relocate 2
+
+ct: 102
+    # Note: use default settings for everything
+----
+
+The above config was generated using the `ha-manager` command line tool:
+
+----
+# ha-manager add vm:501 --state started --max_relocate 2
+# ha-manager add ct:102
+----
+
+
+[[ha_manager_groups]]
+Groups
+~~~~~~
+
+The HA group configuration file `/etc/pve/ha/groups.cfg` is used to
+define groups of cluster nodes. A resource can be restricted to run
+only on the members of such a group. A group configuration looks like
+this:
+
+----
+group: <group>
+       nodes <node_list>
+       <property> <value>
+       ...
+----
+
+include::ha-groups-opts.adoc[]
+
+A common requirement is that a resource should run on a specific
+node. Usually the resource is able to run on other nodes, so you can define
+an unrestricted group with a single member:
+
+----
+# ha-manager groupadd prefer_node1 --nodes node1
+----
+
+For bigger clusters, it makes sense to define a more detailed failover
+behavior. For example, you may want to run a set of services on
+`node1` if possible. If `node1` is not available, you want to run them
+equally split on `node2` and `node3`. If those nodes also fail, the
+services should run on `node4`. To achieve this you could set the node
+list to:
+
+----
+# ha-manager groupadd mygroup1 --nodes "node1:2,node2:1,node3:1,node4"
+----
+
+Another use case is if a resource uses other resources only available
+on specific nodes, let's say `node1` and `node2`. We need to make sure
+that the HA manager does not use other nodes, so we need to create a
+restricted group with said nodes:
+
+----
+# ha-manager groupadd mygroup2 --nodes "node1,node2" --restricted
+----
+
+The above commands create the following group configuration file:
+
+.Configuration Example (`/etc/pve/ha/groups.cfg`)
+----
+group: prefer_node1
+       nodes node1
+
+group: mygroup1
+       nodes node2:1,node4,node1:2,node3:1
+
+group: mygroup2
+       nodes node2,node1
+       restricted 1
+----
+
+
+The `nofailback` option is mostly useful to avoid unwanted resource
+movements during administration tasks. For example, if you need to
+migrate a service to a node which does not have the highest priority
+in the group, you can tell the HA manager not to move this service
+back instantly by setting the `nofailback` option.
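+
+For example, to keep services from moving back to `node1` while you do
+maintenance there, you could enable the flag on the `prefer_node1`
+group created above. This is a sketch only, assuming that
+`ha-manager groupset` accepts the same options as `groupadd` and that
+boolean options take a `1`/`0` value:
+
+----
+# ha-manager groupset prefer_node1 --nofailback 1
+----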
+
+Another scenario is when a service was fenced and recovered to another
+node. The admin then tries to repair the fenced node and brings it
+back online to investigate the failure cause and check whether it runs
+stably again. Setting the `nofailback` flag prevents the recovered
+services from moving straight back to the fenced node.
 
-The resource configuration file can be located at
-`/etc/pve/ha/resources.cfg` and the group configuration file at
-`/etc/pve/ha/groups.cfg`. Use the provided tools to make changes,
-there shouldn't be any need to edit them manually.
 
 Node Power Status
 -----------------
@@ -311,6 +436,7 @@ the update process can be too long which, in the worst case, may
 result in a watchdog reset.
 
 
+[[ha_manager_fencing]]
 Fencing
 -------
 
@@ -380,56 +506,6 @@ That minimizes the possibility of an overload, which else could
 cause an unresponsive node and as a result a chain reaction of node
 failures in the cluster.
 
-Groups
-------
-
-A group is a collection of cluster nodes which a service may be bound to.
-
-Group Settings
-~~~~~~~~~~~~~~
-
-nodes::
-
-List of group node members where a priority can be given to each node.
-A service bound to this group will run on the nodes with the highest priority
-available. If more nodes are in the highest priority class the services will
-get distributed to those node if not already there. The priorities have a
-relative meaning only.
- Example;;
- You want to run all services from a group on `node1` if possible. If this node
- is not available, you want them to run equally splitted on `node2` and `node3`, and
- if those fail it should use `node4`.
- To achieve this you could set the node list to:
-[source,bash]
- ha-manager groupset mygroup -nodes "node1:2,node2:1,node3:1,node4"
-
-restricted::
-
-Resources bound to this group may only run on nodes defined by the
-group. If no group node member is available the resource will be
-placed in the stopped state.
- Example;;
- Lets say a service uses resources only available on `node1` and `node2`,
- so we need to make sure that HA manager does not use other nodes.
- We need to create a 'restricted' group with said nodes:
-[source,bash]
- ha-manager groupset mygroup -nodes "node1,node2" -restricted
-
-nofailback::
-
-The resource won't automatically fail back when a more preferred node
-(re)joins the cluster.
- Examples;;
- * You need to migrate a service to a node which hasn't the highest priority
-   in the group at the moment, to tell the HA manager to not move this service
-   instantly back set the nofailnback option and the service will stay on
-
- * A service was fenced and he got recovered to another node. The admin
-   repaired the node and brang it up online again but does not want that the
-   recovered services move straight back to the repaired node as he wants to
-   first investigate the failure cause and check if it runs stable. He can use
-   the nofailback option to achieve this.
-
 Start Failure Policy
 ---------------------
 
@@ -480,6 +556,7 @@ killing its process)
 
 * *after* you fixed all errors you may enable the service again
 
+[[ha_manager_service_operations]]
 Service Operations
 ------------------
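+
+For day-to-day operation, the most common tasks map directly to
+`ha-manager` subcommands. The following is an illustrative sketch only:
+`vm:501` is the example resource from above, `node2` is a placeholder
+node name, and the exact options may differ between versions (see
+`ha-manager help`):
+
+----
+# ha-manager set vm:501 --state stopped
+# ha-manager migrate vm:501 node2
+# ha-manager remove vm:501
+----
+
+The first command requests a clean stop while keeping the resource
+under HA management, the second asks the CRM to move it to another
+node, and the last one removes it from HA management entirely.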