costs.
TIP: Increasing availability from 99% to 99.9% is relatively
-simply. But increasing availability from 99.9999% to 99.99999% is very
+simple. But increasing availability from 99.9999% to 99.99999% is very
hard and costly. `ha-manager` has typical error detection and failover
times of about 2 minutes, so you can get no more than 99.999%
availability.
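
The arithmetic behind this limit can be sketched as follows. This is a minimal illustration, not part of `ha-manager`: the helper name `downtime_budget_minutes` and the 365.25-day year are assumptions chosen for the example. It converts an availability target into the downtime it allows per year, which can then be compared with the roughly 2 minute failover time mentioned above.

```python
# Illustrative only: yearly downtime allowed by an availability target.
# Assumes a year of 365.25 days; the function name is hypothetical.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_budget_minutes(availability_percent):
    """Return the downtime (in minutes per year) an availability target allows."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999, 99.9999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% -> {budget:.2f} minutes per year")
```

Under these assumptions, 99.999% leaves a budget of a little over 5 minutes per year, so a single 2 minute failover still fits, whereas 99.9999% leaves roughly half a minute, which is less than one failover.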
By using the HA simulator you can test and learn all functionalities of the
Proxmox VE HA solutions.
-The simulator allows you to watch and test the behaviour of a real-world 3 node
-cluster with 6 VMs. You can also add or remove additional VMs or Container.
+By default, the simulator allows you to watch and test the behaviour of a
+real-world 3 node cluster with 6 VMs. You can also add or remove additional VMs
+or containers.
You do not have to setup or configure a real cluster, the HA simulator runs out
of the box.
apt install pve-ha-simulator
----
-You can even install the package on a Debian or Debian based system without any
+You can even install the package on any Debian based system without any
other Proxmox VE packages. For that you will need to download the package and
copy it to the system you want to run it on for installation. When you install
the package with apt from the local file system it will also resolve the
If you are on a Linux machine you can use:
----
-ssh root@<IPofPVE4> -Y
+ssh root@<IPofPVE> -Y
----
On Windows it is working with https://mobaxterm.mobatek.net/[mobaxterm].
-After starting the simulator create a working directory:
+After either connecting to an existing {pve} node with the simulator installed,
+or installing it manually on your local Debian based system, you can try it out
+as follows.
+
+First, create a working directory where the simulator saves its
+current state and writes its default config:
----
mkdir working
----
-To start the simulator type
+Then, simply pass the created directory as a parameter to 'pve-ha-simulator':
----
pve-ha-simulator working/
----
+You can then start, stop, and migrate the simulated HA services, or even check
+what happens when a node fails.
Configuration
-------------
max_restart::
-Maximum number of tries to restart an failed service on the actual
+Maximum number of tries to restart a failed service on the actual
node. The default is set to one.
max_relocate::
When updating the ha-manager you should do one node after the other, never
all at once for various reasons. First, while we test our software
thoughtfully, a bug affecting your specific setup cannot totally be ruled out.
-Upgrading one node after the other and checking the functionality of each node
-after finishing the update helps to recover from an eventual problems, while
-updating all could render you in a broken cluster state and is generally not
+Updating one node after the other and checking the functionality of each node
+after finishing the update helps to recover from eventual problems, while
+updating all at once could result in a broken cluster and is generally not
good practice.
Also, the {pve} HA stack uses a request acknowledge protocol to perform