Thomas Lamprecht [Wed, 16 Mar 2016 12:01:28 +0000 (13:01 +0100)]
add Fence class for external fence devices
This class provides methods for starting and checking the current
status of a fence job.
When a fence job is started we execute a fence agent command.
If the environment allows forking this happens in a forked worker,
and there can also be multiple worker processes when parallel
devices are configured.
When a device fails to fence a node we try the next configured
device, or, if no device is left, we tell the CRM and let it decide
what to do.
If one process of a parallel device fails we kill the remaining
processes (with reset_hard) and try the next device, as we want to
avoid a partially fenced node.
The currently running fence jobs can be picked up (if the
environment allows forking) and processed by calling the
process_fencing method.
If the CRM (which should handle the fencing) loses its lock,
bail_out can be called to kill all currently running fencing
processes and reset the fencing status of all nodes.
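A rough usage sketch from a hypothetical caller's point of view; only
process_fencing and bail_out are named above, the constructor and the
start method name are assumptions:

    use PVE::HA::Fence; # module path assumed

    my $fence = PVE::HA::Fence->new($haenv);

    # kick off fencing of a failed node; forks the agent command(s),
    # possibly one worker per parallel device
    $fence->start_fencing($nodename); # method name assumed

    # called regularly from the main loop: reaps finished workers and
    # falls back to the next configured device on failure
    $fence->process_fencing();

    # CRM lost its lock: kill all running fence jobs and reset the
    # fencing status of all nodes
    $fence->bail_out() if !$have_cluster_lock;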
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 16 Mar 2016 12:01:27 +0000 (13:01 +0100)]
Env, HW: add HW fencing related functions
This adds three methods related to hardware fencing:
* read_fence_config
* fencing_mode
* exec_fence_agent
'read_fence_config' makes it a bit easier to create a common code
base between the real world and the test/sim world regarding
fencing. In PVE2 it parses the config from /etc/pve/ha/fence.cfg and
in the Sim method it parses the config from testdir/status/fence.cfg.
The 'fencing_mode' method checks, depending on the caller's
environment, whether hardware fencing is enabled and returns the
wanted mode:
* PVE2: if the datacenter.cfg key 'fencing' is set it will be used,
  else we use 'watchdog' as default.
* Sim: if the config exists and at least one device is configured we
  use 'hardware' fencing, else we default to 'watchdog'.
For the simulator we should also add an option to turn HW fencing on
and off independently of whether devices are configured once we
implement HW fencing for it, but that is not really needed for now.
'exec_fence_agent' executes, as the name suggests, a fence agent in
an environment-specific way.
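A minimal sketch of the PVE2 variant of fencing_mode, assuming the
datacenter config is read via PVE::Cluster::cfs_read_file:

    use PVE::Cluster;

    sub fencing_mode {
        my ($self) = @_;

        my $dcconf = PVE::Cluster::cfs_read_file('datacenter.cfg');

        # use the configured mode if set, else default to 'watchdog'
        return $dcconf->{fencing} // 'watchdog';
    }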
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Tue, 15 Mar 2016 11:40:51 +0000 (12:40 +0100)]
status: show added but not yet active services
If the CRM is dead or not yet active and we add a new service, we do
not see it in the HA status. This can be confusing for the user, as
the service is queued for adding but does not show up, so let's show
those services as well.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Tue, 15 Mar 2016 07:44:28 +0000 (08:44 +0100)]
add FenceConfig class for external fencing devices
Add a FenceConfig class which includes methods to parse a config
file for fence devices in the format specified by dlm.conf; see the
Fencing section of the dlm.conf manpage for more details regarding
this format.
With this we can generate commands for fencing a node from the
parsed config file.
We assume that the fence config resides on the pve cluster fs under
/etc/pve/ha/fence_devices.cfg for the PVE2 environment.
But we can pass parse_config an arbitrary raw string, which allows
fencing to be used in (regression) tests and simulation as well.
A simple regression testing script for the config generation was
also added. It mainly tests the parse_config and get_commands
methods. A config file can be passed as an argument; otherwise we
cycle through the *.cfg files in the fence_cfgs folder.
Example configs for regression testing are located in the
src/test/fence_cfgs directory.
Note that not all files are valid examples as some are used to check
the error handling of the parser!
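A hypothetical example config in that format (device name and agent
arguments are made up): a 'device' line declares the agent, the
'connect' lines map nodes to it.

    device  mypdu fence_apc ip=192.168.1.50 username=admin password=secret
    connect mypdu node=node1 port=1
    connect mypdu node=node2 port=2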
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 24 Feb 2016 07:33:59 +0000 (08:33 +0100)]
fix infinite started <=> migrate cycle
If we get an 'EWRONG_NODE' error from the migration we have no sane
way out. If we then place the service in the started state we get
the 'EWRONG_NODE' error again, and it will even place the service in
the migrate state again (when it's not restricted by a group), thus
resulting in an infinite started <=> migrate cycle.
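A hedged sketch of the resulting handling in the migrate state
(constant and helper names are assumed here): on EWRONG_NODE the
service goes to the error state instead of back to started.

    if ($exit_code == EWRONG_NODE) {
        # going back to 'started' would just re-queue the migration
        # and loop forever, so give up and require manual intervention
        $haenv->log('err', "service '$sid' - migration failed: service" .
                    " is not on the expected source node");
        $change_service_state->($self, $sid, 'error');
    }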
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 24 Feb 2016 09:06:52 +0000 (10:06 +0100)]
avoid out of sync command execution in LRM
We are only allowed to execute any command once, as otherwise we
may disturb the synchrony between CRM and LRM.
The following scenario could happen:
schedule CRM: deploy task 'migrate' for service vm:100 with UID 1234
schedule LRM: fork task with UID 1234
schedule CRM: idle as no result is available yet
schedule LRM: collect finished task UID and write result for CRM
              Result was an error
schedule LRM: fork task with UID 1234 _AGAIN_
schedule CRM: processes _previous_ erroneous result from LRM
              and places vm:100 in started state on source node
schedule LRM: collect finished task UID and write result for CRM
              This time the task was successful and the service is on
              the target node, but the CRM knows _nothing_ of it!
schedule LRM: try to schedule task, but the service is not on this node!
=> error
To fix that we _never_ execute exactly the same command twice in a
row; 'exactly the same' means here that the UID of the command to
queue already has a valid result.
This enforces the originally intended SYN - ACK behaviour between
CRM and LRMs.
We now generate a new UID for services that do not change state when
the following evaluates to true:
* enabled AND in started state
This ensures that the state from the CRM holds in the LRM and thus,
for example, a killed VM gets restarted.
Note that the 'stopped' command is an exception, as we do not check
its result in the CRM (thus no race here) and we always want to
execute it (even when no CRM is active).
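A minimal sketch of the resulting guard when the LRM queues a
command (variable and structure names are assumed): a UID that
already has a recorded result must not be executed again.

    sub queue_resource_command {
        my ($self, $sid, $uid, $state) = @_;

        # if this UID already produced a result the command ran before;
        # running it again would desync CRM and LRM (see above)
        return if defined($self->{results}->{$uid});

        $self->{workers}->{$sid} = {
            sid   => $sid,
            uid   => $uid,
            state => $state,
        };
    }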
Thomas Lamprecht [Fri, 19 Feb 2016 17:41:03 +0000 (18:41 +0100)]
fix 'uninitialized value' on online node usage computation
This fixes a bug introduced by commit 9da84a0, which set the wrong
hash entry when a disabled service got a migrate/relocate command.
We set "node => $target"; while our state machine could handle that,
we got some "uninitialized value" warnings when migrating a disabled
service to an inactive LRM. Better to set "target => $target".
Further, add a test for this scenario.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Fri, 12 Feb 2016 15:14:54 +0000 (16:14 +0100)]
fix 'change_service_location' misuse and recovery from fencing
First rename the change_service_location method from the environment
to a more fitting name, 'steal_service'.
The 'change_service_location' method from the virtual hardware class
stays as it is, because there the name fits (those functions do not
have the same meaning, so it's good that they are named differently
now).
As we misused the config steal method (formerly
change_service_location) in the stopped state to process the
services from fenced nodes, we need another way now.
This is achieved through the private method 'recover_fenced_service',
which is now the only place that has the right to steal a service
from a node.
When a node was successfully fenced we no longer change its
services' state to 'stopped'; instead we drop that hack and search
for a new node in 'recover_fenced_service'. If one is found we steal
the service, move it from the fenced node to the new (recovery) node
and place it there in the 'started' state; after that the state
machine is able to handle the rest.
If we do not find a node we try again next round, as that is better
than placing the service in the error state: this way we still have
a chance to recover, which we would not have in the error state.
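A hedged sketch of recover_fenced_service (helper names such as
select_service_node are assumed here):

    sub recover_fenced_service {
        my ($self, $sid, $cd) = @_;

        my ($haenv, $ss) = ($self->{haenv}, $self->{ss});
        my $sd = $ss->{$sid};

        return if $sd->{state} ne 'fence'; # only recover fenced services

        my $recovery_node = select_service_node(
            $self->{groups}, $self->{online_node_usage}, $cd, $sd->{node});

        if ($recovery_node) {
            $haenv->log('info', "recover service '$sid' from fenced node" .
                        " '$sd->{node}' to node '$recovery_node'");
            $haenv->steal_service($sid, $sd->{node}, $recovery_node);
            # the state machine handles the rest
            $change_service_state->($self, $sid, 'started',
                                    node => $recovery_node);
        } else {
            # no node found, try again next round; the error state would
            # take away our chance to recover
            $haenv->log('err', "recovering service '$sid' from fenced node" .
                        " '$sd->{node}' failed, no recovery node found");
        }
    }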
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 10 Feb 2016 13:13:44 +0000 (14:13 +0100)]
add VirtFail resource and use it in new regression tests
This resource lets us test a defined failure behaviour of services.
Through the VMID we define how the service should behave, with the
following rules:
When the service has the SID "fa:abcde" the digits a - e mean:
a - no meaning, but can be used to differentiate similar resources
b - how many tries are needed to start correctly (0 = default)
c - how many tries are needed to migrate correctly (0 = default)
d - should shutdown be successful (0 = yes, anything else = no)
e - return value of $plugin->exists(), defaults to 1 if not set
a, b and c should always be set, even if b and c have defaults (this
makes the test purpose clearer); d and e are optional.
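For illustration, a (hypothetical) service 'fa:12010' would need two
tries to start, migrate at the first attempt, fail to shut down, and
have exists() return 0. A sketch of how the digits could be picked
apart:

    # hedged sketch, not the actual implementation
    my ($vmid) = $sid =~ m/^fa:(\d+)$/;
    my @digit = split(//, $vmid);                 # (a, b, c, d, e)

    my $tries_to_start   = $digit[1] // 0;        # b: 0 = first try works
    my $tries_to_migrate = $digit[2] // 0;        # c: 0 = first try works
    my $shutdown_fails   = ($digit[3] // 0) != 0; # d: 0 = shutdown succeeds
    my $exists_result    = $digit[4] // 1;        # e: defaults to 1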
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Dietmar Maurer [Wed, 10 Feb 2016 06:20:05 +0000 (07:20 +0100)]
improve verbosity of API status call
Display the state of an LRM when it is not in the "active" state
and has some service configured.
This should reduce confusion when the LRM is active but still has to
wait for its lock. In such a case the user otherwise only sees that
it's active and could think it's malfunctioning because no action
happens.
Also display the LRM mode, so that we can see when we restart the LRM.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Remove the "service_config_exists" method from the Environment class
and replace it with a new method in PVE::HA::Tools.
First, we do not need this method in the Environment class, and
second, the new one can be reused more easily for checking if a node
has any service configured.
This also moves the regression tests a little closer to real-world
behaviour (while maintaining determinism).
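A hedged sketch of what such a helper could look like in
PVE::HA::Tools (name, signature and config layout are assumptions):

    package PVE::HA::Tools;

    # check whether a node has any service configured (name assumed)
    sub node_has_services {
        my ($node, $conf) = @_;

        foreach my $sid (keys %{$conf->{ids}}) {
            return 1 if ($conf->{ids}->{$sid}->{node} // '') eq $node;
        }

        return 0;
    }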
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fix postinstall script not removing watchdog-mux.socket
watchdog-mux.socket was removed in f8a3fc80af, but the postinstall
script used -e instead of -L to test for the symlink: -e follows the
link and thus fails since the destination is already removed at that
point, while -L tests the symlink itself.
Thomas Lamprecht [Wed, 27 Jan 2016 12:16:35 +0000 (13:16 +0100)]
move upid_wait from PVE2 env to HA::Tools
We can now use the new upid_wait from PVE::HA::Tools, and we output
a "Task still active" message every five seconds instead of every
second, so we clutter the log less.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 27 Jan 2016 12:16:34 +0000 (13:16 +0100)]
fix check if a resource migrated correctly
Move the check from exec_resource_agent to the migrate method of
the resource plugins so that we can remove all traces of the
additional method 'is_on_node', which was intended for this
purpose.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 27 Jan 2016 12:16:33 +0000 (13:16 +0100)]
add after_fork method to HA environment and use it in LRM
As both the realtime simulator and the real-world environment can
fork, but we only need to clean up after a fork in the real world,
introduce an after_fork method.
In PVE2 it closes the inherited INotify fd and reopens it for the
worker.
Also, a cfs_update gets triggered, as (other) workers may change the
cluster state.
Also use the newly introduced HA env method to clean up after
we forked an LRM worker.
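A sketch of the PVE2 variant, assuming the usual PVE::INotify and
PVE::Cluster helpers:

    use PVE::INotify;
    use PVE::Cluster;

    sub after_fork {
        my ($self) = @_;

        # drop the inotify fd inherited from the parent and get our own
        PVE::INotify::inotify_close();
        PVE::INotify::inotify_init();

        # (other) workers may have changed the cluster state meanwhile
        PVE::Cluster::cfs_update();
    }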
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Fri, 22 Jan 2016 16:06:42 +0000 (17:06 +0100)]
Move exec_resource_agent from environment classes to LRM
With the changes and preparation work from the previous commits we
can now move the quite important method exec_resource_agent from the
Env classes to the LRM, where it gets called.
The main advantage of this is that it now underlies the regression
tests and that we do not have two separate methods, which
- does not make sense, as the agents themselves should be
  virtualized, not the method executing them
- adds more work, as they must (or at least should) be kept in sync
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Fri, 22 Jan 2016 16:06:39 +0000 (17:06 +0100)]
Add virtual resources for tests and simulation
Introduce a base class for Virtual Resources with almost all methods
already implemented.
Also add classes for virtual CTs and VMs, with the primary
distinction that CTs may not migrate online.
The resources are registered in the Hardware class and overwrite
any already registered resources of the same type (e.g. VirtVM
overwrites PVEVM), so that the correct plugins are loaded for
regression tests and the simulator.
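A hedged sketch of the registration (module names are assumptions;
register and init come from the usual PVE plugin system):

    # in the simulated hardware setup: load the virtual plugins so
    # the 'vm' and 'ct' types resolve to them instead of the real ones
    PVE::HA::Sim::Resources::VirtVM->register();
    PVE::HA::Sim::Resources::VirtCT->register();
    PVE::HA::Resources->init();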
This clears the way for adding (deterministic) 'malicious'
resources, so we can write tests where, for example, a service fails
a few times to start or migrate.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 18 Jan 2016 10:35:20 +0000 (11:35 +0100)]
LRM: release lock also on restart
When restarting the LRM (e.g. on an update) we get a new PID and
thus would have to wait for our own lock to time out.
We can (and should) release the lock instead, as there are either no
services or all services are frozen. If they are frozen only our LRM
may touch them, so with this patch we can unfreeze them faster.
The expected log of the restart-lrm test does not change much, as
the test system does not need to wait for a timeout.
This lets the LRM start working directly after a restart, which is
especially useful on package updates.
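A hedged sketch of the idea (method name assumed): when the LRM
shuts down for a restart, give the agent lock back explicitly.

    # on a restart-type shutdown we come back with a new PID and would
    # otherwise have to wait for our own old lock to time out
    if ($shutdown_type eq 'restart') {
        $haenv->release_ha_agent_lock(); # method name assumed
    }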
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 18 Jan 2016 09:26:45 +0000 (10:26 +0100)]
TestHardware: correct shutdown/reboot behaviour of CRM and LRM
Instead of shutting down the LRM and then killing the CRM, we now
also make a shutdown request to the CRM. That mirrors the real-world
behaviour much better and also lets us test the lock release from
the CRM.
To accomplish this we add new sim_hardware commands for stopping and
starting the CRM.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The params of resource methods are normally specific to each
resource type.
CTs and VMs have the same interfaces in the needed cases, so we
could generate the params in the exec_resource_agent method. This is
not clean because:
* resource-specific stuff shouldn't be in this method
* this can cause problems if we want to add another resource type
  in the future which has a completely different interface
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 11 Jan 2016 12:20:17 +0000 (13:20 +0100)]
free cmd pointer after its execution
Quoting the asprintf man page:
> [..]
> This pointer should be passed to free(3) to release the allocated
> storage when it is no longer needed.
> [..]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 11 Jan 2016 12:20:15 +0000 (13:20 +0100)]
small cleanup
Remove the unlink_socket variable and its check, as they were
always true: the error path and the end of the program can only be
reached when the socket is already set up.
Also, unlinking a non-existent file does not result in an error.
Some whitespace cleanup in the surrounding area, too.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 11 Jan 2016 12:20:14 +0000 (13:20 +0100)]
remove watchdog-mux.socket
The use of a systemd socket unit for the watchdog socket is not
necessary for us; it even causes problems, as the socket already
exists and accepts input when the watchdog-mux daemon itself is not
running. So the LRM/CRM could successfully open and update the
watchdog even if watchdog-mux was not running!
This patch removes the unit file, adds a postinst script which
handles the removal of the links generated by systemd itself, and
also removes the code from watchdog-mux which handled the systemd
socket unit.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Tue, 22 Dec 2015 07:52:38 +0000 (08:52 +0100)]
Sim/Env: fix removing service from old node on migration
We only removed the service from the source node on a relocate; we
also want to remove it on a successful migration, else we have the
service on two nodes at the same time.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Tue, 22 Dec 2015 07:52:35 +0000 (08:52 +0100)]
add service disable/enable to regression tests
Allow execution of user-triggered service commands in regression
tests, like enable or disable. This is the test equivalent of a
ha-manager <action> <service:id>
command.
Also add a test for a disable/enable cycle.
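A disable/enable cycle in a test's cmdlist could then look like this
(command syntax assumed):

    [
        [ "service vm:103 disabled" ],
        [ "service vm:103 enabled" ]
    ]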
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 21 Dec 2015 09:12:47 +0000 (10:12 +0100)]
check_active_workers: fix typo /uuid/uid/
This typo caused a bug where resource_command_finished was never
called, as $w->{uuid} does not exist and is thus always undefined.
Use the correct $w->{uid} instead.
Also fix a comment which used 'uuid', to avoid confusion.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>