X-Git-Url: https://git.proxmox.com/?p=pve-docs.git;a=blobdiff_plain;f=ha-manager.adoc;h=d8489cb232652a4e2e0c04c30c8b3162749e4852;hp=d49a8c6d1ced7cdad6fe6e6f7be5e62242cc65c6;hb=a9c77fec9239c1dd979bb0fd025a4d9186ae6449;hpb=8b598c333ef7b680f1aeb8967a5ef9b88cb94d4c

diff --git a/ha-manager.adoc b/ha-manager.adoc
index d49a8c6..d8489cb 100644
--- a/ha-manager.adoc
+++ b/ha-manager.adoc
@@ -2,7 +2,6 @@
 ifdef::manvolnum[]
 ha-manager(1)
 =============
-include::attributes.txt[]
 :pve-toplevel:
 
 NAME
@@ -21,7 +20,6 @@ endif::manvolnum[]
 ifndef::manvolnum[]
 High Availability
 =================
-include::attributes.txt[]
 :pve-toplevel:
 endif::manvolnum[]
 
@@ -58,7 +56,7 @@ yourself. The following solutions works without modifying the
 software:
 
 * Use reliable ``server'' components
-
++
 NOTE: Computer components with same functionality can have varying
 reliability numbers, depending on the component quality. Most vendors
 sell components with higher reliability as ``server'' components -
@@ -107,15 +105,20 @@ hard and costly. `ha-manager` has typical error detection and failover
 times of about 2 minutes, so you can get no more than 99.999%
 availability.
 
+
 Requirements
 ------------
 
+You must meet the following requirements before you start with HA:
+
 * at least three cluster nodes (to get reliable quorum)
 
 * shared storage for VMs and containers
 
 * hardware redundancy (everywhere)
 
+* use reliable “server” components
+
 * hardware watchdog - if not available we fall back to the
   linux kernel software watchdog (`softdog`)
 
@@ -149,16 +152,17 @@ To provide High Availability two daemons run on each node:
 
 `pve-ha-lrm`::
 
-The local resource manager (LRM), it controls the services running on
-the local node.
-It reads the requested states for its services from the current manager
-status file and executes the respective commands.
+The local resource manager (LRM), which controls the services running on
+the local node. It reads the requested states for its services from
+the current manager status file and executes the respective commands.
 
 `pve-ha-crm`::
 
-The cluster resource manager (CRM), it controls the cluster wide
-actions of the services, processes the LRM results and includes the state
-machine which controls the state of each service.
+The cluster resource manager (CRM), which makes the cluster wide
+decisions. It sends commands to the LRM, processes the results,
+and moves resources to other nodes if something fails. The CRM also
+handles node fencing.
+
 
 .Locks in the LRM & CRM
 [NOTE]
@@ -270,17 +274,59 @@ quorum, the LRM waits for a new quorum to form. As long as there is no
 quorum the node cannot reset the watchdog. This will trigger a reboot
 after the watchdog then times out, this happens after 60 seconds.
 
+
 Configuration
 -------------
 
-The HA stack is well integrated in the Proxmox VE API2. So, for
-example, HA can be configured via `ha-manager` or the PVE web
-interface, which both provide an easy to use tool.
+The HA stack is well integrated into the {pve} API. So, for example,
+HA can be configured via the `ha-manager` command line interface, or
+the {pve} web interface - both interfaces provide an easy way to
+manage HA. Automation tools can use the API directly.
+
+All HA configuration files are within `/etc/pve/ha/`, so they get
+automatically distributed to the cluster nodes, and all nodes share
+the same HA configuration.
+
+
+Resources
+~~~~~~~~~
+
+The resource configuration file `/etc/pve/ha/resources.cfg` stores
+the list of resources managed by `ha-manager`. A resource configuration
+inside that list look like this:
+
+----
+:
+
+ ...
+----
+
+It starts with a resource type followed by a resource specific name,
+separated with colon. Together this forms the HA resource ID, which is
+used by all `ha-manager` commands to uniquely identify a resource
+(example: `vm:100` or `ct:101`). The next lines contain additional
+properties:
+
+include::ha-resources-opts.adoc[]
+
+
+Groups
+~~~~~~
+
+The HA group configuration file `/etc/pve/ha/groups.cfg` is used to
+define groups of cluster nodes. A resource can be restricted to run
+only on the members of such group. A group configuration look like
+this:
+
+----
+group:
+ nodes
+
+ ...
+----
+
+include::ha-groups-opts.adoc[]
 
-The resource configuration file can be located at
-`/etc/pve/ha/resources.cfg` and the group configuration file at
-`/etc/pve/ha/groups.cfg`. Use the provided tools to make changes,
-there shouldn't be any need to edit them manually.
 
 Node Power Status
 -----------------