Thomas Lamprecht [Mon, 25 Nov 2019 16:35:43 +0000 (17:35 +0100)]
lrm.service: add after ordering for SSH and pveproxy
To avoid early disconnect during shutdown ensure we order After them,
for shutdown the ordering is reversed and so we're stopped before
those two - this allows to checkout the node stats and do SSH stuff
if something fails.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 25 Nov 2019 16:48:42 +0000 (17:48 +0100)]
do simple fallback if node comes back online from maintenance
We simply remember the node we where on, if moved for maintenance.
This record gets dropped once we move to _any_ other node, be it:
* our previous node, as it came back from maintenance
* another node due to manual migration, group priority changes or
fencing
The first point is handled explicitly by this patch. In the select
service node we check for and old fallback node, if that one is found
in a online node list with top priority we _always_ move to it - even
if there's no real reason for a move.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This adds handling for a new shutdown policy, namely "migrate".
If that is set then the LRM doesn't queues stop jobs, but transitions
to a new mode, namely 'maintenance'.
The LRM modes now get passed from the CRM in the NodeStatus update
method, this allows to detect such a mode and make node-status state
transitions. Effectively we only allow to transition if we're
currently online, else this is ignored. 'maintenance' does not
protects from fencing.
The moving then gets done by select service node. A node in
maintenance mode is not in "list_online_nodes" and so also not in
online_node_usage used to re-calculate if a service needs to be
moved. Only started services will get moved, this can be done almost
by leveraging exiting behavior, the next_state_started FSM state
transition method just needs to be thought to not early return for
nodes which are not online but in maintenance mode.
A few tests get adapted from the other policy tests is added to
showcase behavior with reboot, shutdown, and shutdown of the current
manager. It also shows the behavior when a service cannot be
migrated, albeit as our test system is limited to simulate maximal 9
migration failures, it "seems" to succeed after that. But note here
that the maximal retries would have been hit way more earlier, so
this is just artifact from our test system.
Besides some implementation details two question still are not solved
by this approach:
* what if a service cannot be moved away, either by errors or as no
alternative node is found by select_service_node
- retrying indefinitely, this happens currently. The user set this
up like this in the first place. We will order SSH, pveproxy,
after the LRM service to ensure that the're still the possibility
for manual interventions
- a idea would be to track the time and see if we're stuck (this is
not to hard), in such a case we could stop the services after X
minutes and continue.
* a full cluster shutdown, but that is even without this mode not to
ideal, nodes will get fenced after no partition is quorate anymore,
already. And as long as it's just a central setting in DC config,
an admin has a single switch to flip to make it work, so not sure
how much handling we want to do here, if we go over the point where
we have no quorum we're dead anyhow, soo.. at least not really an
issue of this series, orthogonal related yes, but not more.
For real world usability the datacenter.cfg schema needs to be
changed to allow the migrate shutdown policy, but that's trivial
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 25 Nov 2019 17:05:11 +0000 (18:05 +0100)]
account service to source and target during move
As the Service load is often still happening on the source, and the
target may feel the performance impact from an incoming migrate, so
account the service to both nodes during that time.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Tue, 19 Nov 2019 13:05:30 +0000 (14:05 +0100)]
fix #1339: remove more locks from services IF the node got fenced
Remove further locks from a service after it was recovered from a
fenced node. This can be done due to the fact that the node was
fenced and thus the operation it was locked for was interrupted
anyway. We note in the syslog that we removed a lock.
Mostly we disallow the 'create' lock, as here is the only case where
we know that the service was not yet in a runnable state before.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fabian Ebner [Thu, 10 Oct 2019 10:25:08 +0000 (12:25 +0200)]
Introduce crm-command to CLI and add stop as a subcommand
This should reduce confusion between the old 'set <sid> --state stopped' and
the new 'stop' command by making explicit that it is sent as a crm command.
fix # 2241: VM resource: allow migration with local device, when not running
qemu-server ignores the flag if the VM runs, so just set it to true
hardcoded.
People have identical hosts with same HW and want to be able to
relocate VMs in such cases, so allow it here - qemu knows to complain
if it cannot work, as nothing bad happens then (VM stays just were it
is) we can only win, so do it.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 10 Apr 2019 10:41:17 +0000 (12:41 +0200)]
handle the case where a node gets fully removed
If an admin removes a node he may also remove /etc/pve/nodes/NODE
quite soon after that, if the "node really deleted" logic of our
NodeStatus module has not triggered until then (it waits an hour) the
current manager still tries to read the gone nodes LRM status, which
results in an exception. Move this exception to a warn and return a
node == unkown state in such a case.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Sat, 30 Mar 2019 18:02:26 +0000 (19:02 +0100)]
d/control: remove obsolete dh-systemd dependency
We do not need to depend explicitly on dh-systemd as we have a
versioned debhelper dependency with >= 10~, and lintian on buster for
this .dsc even warns:
> build-depends-on-obsolete-package build-depends: dh-systemd => use debhelper (>= 9.20160709)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Fri, 15 Mar 2019 08:43:28 +0000 (09:43 +0100)]
PVE2 Env: get_ha_settings: don't die if pmxcfs failed
This is a method called in our shutdown path, so if we die here we
may silent a shutdown, nad just ignore it.
In combination with the fact that our service unit is configured
with: 'TimeoutStopSec=infinity' this means that a systemctl stop may
wait infinitely for this to happen, and any other systemctl command
will be queued for that long.
So if pmxcfs is stopped, we then get a shutdown request, we cannot
start pmxcfs again, at least not through systemd.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
d/control: do not track qemu-server and pve-container dependency
While it would be correct to have them tracked here we cannot do this
at the moment, as with those two also depend on pve-ha-manager, and
with dpkg packaged under strech there's an issue with such cyclic
dependencies and trigger cycle detection only resolved for buster[0]
Currently, the issue exists on the following condition:
* update of pve-ha-manager plus either pve-container or qemu-server
* but _no_ update of pve-manager in the same upgrade cycle
Thomas Lamprecht [Wed, 23 Jan 2019 09:34:40 +0000 (10:34 +0100)]
fix #1602: allow to delete 'ignored' services over API
service_is_ha_managed returns false if a service is in the resource
configuration but marked as 'ignore', as for the internal stack it is
as it wasn't HA managed at all.
But user should be able to remvoe it from the configuration easily
even in this state, without setting the requesttate to anything else
first.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 23 Jan 2019 08:43:14 +0000 (09:43 +0100)]
fix #1842: do not pass forceStop to CT shutdown
The vm_shutdown parameter forceStop differs in behaviour between VMs
and CTs. While on VMs it ensures that a VM gets stoppped if it could
not shutdown gracefully only after the timeout passed, the container
stack always ignores any timeout if forceStop is set and hard stops
the CT immediately.
To achieve this behaviour for CTs too, the timeout is enough, as
lxc-stop then does the hard stop after timeout itself.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Sun, 13 Jan 2019 11:39:53 +0000 (12:39 +0100)]
fence config parser: early return on ignored devices
We do not support all of the dlm.conf possibilities, but we also do
not want to die on such "unkown" keys/commands as an admin should be
able to share this config if it is already used for other purposes,
e.g. lockd, gfs, or such.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
in this package we provide api functions, thus we want to activate
the pve-api-update trigger, so that packages like pve-manager get
notified about it. But we also use api functions directly so we setup
an interest in the pve-api-update trigger. This results in an lintian
error (lintian version from buster or newer) which we can override:
> [...]
> This tag is also triggered if the package has an activate trigger
> for something on which it also declares an interest. The only (but
> rather unlikely) reason to do this is if another package also
> declares an interest and this package needs to activate that other
> package. If the package is using it for this exact purpose, then
> please use a Lintian override to state this.
-- https://lintian.debian.org/tags/repeated-trigger-name.html
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
addresses a few nits from Fabians review at:
https://pve.proxmox.com/pipermail/pve-devel/2018-December/035061.html
https://pve.proxmox.com/pipermail/pve-devel/2018-December/035085.html
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Thu, 20 Dec 2018 07:44:42 +0000 (08:44 +0100)]
fix #1378: allow to specify a service shutdown policy
Allow an admin to set a datacenter wide HA policy which can change
the way we handle services on a node shutdown.
There's:
* freeze: always freeze servivces, independent of the shutdown type
(reboot, poweroff)
* failover: never freeze services, this means that a service will get
recovered to another node if possible and if the current node does
not comes back up in the grace period of 1 minute.
* default: this is the current behavior, freeze on reboot but do not
freeze on poweroff
Add to tests, shutdown-policy1 which is based of the reboot1 test,
but enforces no freeze with a failover policy, and shutdown-policy2
which is based on the shutdown1 test but with a explicit freeze
policy. You can compare (diff) each tests log result to the test it's
based on to see what changes.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
use dpkg-buildpackage and debhelper properly, add missing dependencies and
embed used perl modules from libpve-common-perl to make pve-ha-simulator
standalone.
by moving parse_sid to PVE::HA::Env, with the default implementation in
PVE::HA::Config.
the bash completion methods use PVE::HA::Config (and PVE::Cluster), but
the corresponding use statements are only in PVE::CLI::ha_manager, where the
bash completion is actually used.