git.proxmox.com Git - pve-ha-manager.git/log
19 hours ago  d/postinst: make deb-systemd-invoke non-fatal  [master]
Fabian Grünbichler [Thu, 11 Apr 2024 10:10:44 +0000 (12:10 +0200)]
d/postinst: make deb-systemd-invoke non-fatal

otherwise this can break an upgrade for unrelated reasons.

this also mimics the debhelper behaviour more closely (which we only
don't use here because of its lack of reload support) - the snippet
was also restructured with an explicit `if` to be more similar.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
5 months ago  bump version to 4.0.3
Thomas Lamprecht [Fri, 17 Nov 2023 13:49:08 +0000 (14:49 +0100)]
bump version to 4.0.3

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
5 months ago  env: switch to matcher-based notification system
Lukas Wagner [Tue, 14 Nov 2023 12:59:31 +0000 (13:59 +0100)]
env: switch to matcher-based notification system

Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
5 months ago  usage stats: tiny code style clean-up
Thomas Lamprecht [Fri, 17 Nov 2023 13:47:12 +0000 (14:47 +0100)]
usage stats: tiny code style clean-up

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
5 months ago  watchdog-mux: code indentation and style cleanups
Thomas Lamprecht [Fri, 17 Nov 2023 13:46:49 +0000 (14:46 +0100)]
watchdog-mux: code indentation and style cleanups

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
5 months ago  buildsys: use dpkg default makefile snippet
Thomas Lamprecht [Fri, 17 Nov 2023 13:45:35 +0000 (14:45 +0100)]
buildsys: use dpkg default makefile snippet

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
6 months ago  crs: avoid auto-vivification when adding node to service usage
Fiona Ebner [Thu, 5 Oct 2023 14:05:46 +0000 (16:05 +0200)]
crs: avoid auto-vivification when adding node to service usage

Part of what caused bug #4984. Make the code future-proof and warn
when the node was never registered in the plugin, similar to what the
'static' usage plugin already does.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
 [ TL: rework commit message subject ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
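
 [ Editor's illustration, not part of the commit: a minimal Perl sketch of
   such a guard in a usage plugin; method and field names are assumptions,
   only the add_node()/warn idea comes from the commit messages. ]

    # hypothetical sketch, not the actual PVE::HA::Usage code
    sub add_service_usage_to_node {
        my ($self, $nodename, $sid, $service_node) = @_;

        # warn instead of silently auto-vivifying an entry for an unknown node
        if (!exists $self->{nodes}->{$nodename}) {
            warn "node '$nodename' was never registered via add_node()\n";
            return;
        }

        $self->{nodes}->{$nodename}++; # 'basic' plugin: count services per node
    }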
6 months ago  fix #4984: manager: add service to migration-target usage only if online
Fiona Ebner [Thu, 5 Oct 2023 14:05:45 +0000 (16:05 +0200)]
fix #4984: manager: add service to migration-target usage only if online

Otherwise, when using the 'basic' plugin, this would lead to
auto-vivification of the $target node in the Perl hash tracking the
usage, and that node would wrongly be considered online when selecting
the recovery node.

The 'static' plugin was not affected, because it would check and warn
before adding usage to a node that was not registered with add_node()
first. Doing the same in the 'basic' plugin will be done by another
patch.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
 [ TL: shorten commit message subject ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
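
 [ Editor's illustration, not part of the commit: a short, self-contained
   refresher on the Perl auto-vivification pitfall referenced above. ]

    use strict;
    use warnings;

    my %online_node_usage;            # node name => usage info
    my $target = 'node3';             # a node that is actually offline

    # merely *reading* a nested key auto-vivifies the parent entry ...
    my $cpu = $online_node_usage{$target}->{cpu};

    # ... so the offline node now looks like a tracked (online) node:
    print "oops, node3 is tracked\n" if exists $online_node_usage{$target};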
8 months ago  manager: send notifications via new notification module
Lukas Wagner [Thu, 3 Aug 2023 12:16:50 +0000 (14:16 +0200)]
manager: send notifications via new notification module

... instead of using sendmail directly.

If the new 'notify.target-fencing' parameter from datacenter config
is set, we use it as a target for notifications. If it is not set,
we send the notification to the default target (mail-to-root).

There is also a new 'notify.fencing' parameter which controls whether
notifications should be sent at all. If it is not set, we
default to the old behavior, which is to send them.

Also add a dependency on the `libpve-notify-perl` package to d/control.

Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
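
 [ Editor's illustration, not part of the commit: a hedged sketch of the
   decision logic described above; only the parameter names
   'notify.target-fencing' and 'notify.fencing' come from the commit, the
   config structure and the notification helper are made up. ]

    # hypothetical dispatch for fencing notifications
    my $dc_conf = $haenv->get_datacenter_settings();   # method named further down this log
    my $notify  = $dc_conf->{notify} // {};            # assumed structure
    my $policy  = $notify->{fencing} // 'always';      # unset => old behavior: send
    my $target  = $notify->{'target-fencing'} // 'mail-to-root';

    if ($policy ne 'never') {
        send_fence_notification($target, 'node fenced', 'details ...');  # hypothetical helper
    }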
10 months ago  bump version to 4.0.2
Thomas Lamprecht [Tue, 13 Jun 2023 06:35:56 +0000 (08:35 +0200)]
bump version to 4.0.2

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  manager: clear stale maintenance node caused by simultaneous cluster shutdown
Fiona Ebner [Mon, 12 Jun 2023 15:27:11 +0000 (17:27 +0200)]
manager: clear stale maintenance node caused by simultaneous cluster shutdown

Currently, the maintenance node for a service is only cleared when the
service is started on another node. In the edge case of a simultaneous
cluster shutdown however, it might be that the service never was
started anywhere else after the maintenance node was recorded, because
the other nodes were already in the process of being shut down too.

If a user ends up in this edge case, it would be rather surprising
that the service would be automatically migrated back to the
"maintenance node" which actually is not in maintenance mode anymore
after a migration away from it.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
10 months ago  tests: simulate stale maintenance node caused by simultaneous cluster shutdown
Fiona Ebner [Mon, 12 Jun 2023 15:27:10 +0000 (17:27 +0200)]
tests: simulate stale maintenance node caused by simultaneous cluster shutdown

In the test log, it can be seen that the service will unexpectedly be
migrated back. This is caused by the service's maintenance node
property being set by the initial shutdown, but never cleared, because
that currently happens only when the service is started on a different
node. The next commit will address the issue.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
10 months ago  bump version to 4.0.1
Thomas Lamprecht [Fri, 9 Jun 2023 08:41:15 +0000 (10:41 +0200)]
bump version to 4.0.1

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  d/control: bump versioned dependency for pve-container & qemu-server
Thomas Lamprecht [Fri, 9 Jun 2023 08:33:40 +0000 (10:33 +0200)]
d/control: bump versioned dependency for pve-container & qemu-server

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  resources: pve: avoid relying on internal configuration details
Fiona Ebner [Tue, 28 Feb 2023 10:54:10 +0000 (11:54 +0100)]
resources: pve: avoid relying on internal configuration details

Instead, use the new get_derived_property() method to get the same
information in a way that is robust regarding changes in the
configuration structure.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
10 months ago  api: fix/add return description for status endpoint
Fiona Ebner [Wed, 31 May 2023 08:12:46 +0000 (10:12 +0200)]
api: fix/add return description for status endpoint

The fact that no 'items' was specified made the api-viewer throw a
JavaScript exception: retinf.items is undefined

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
10 months ago  lrm: do not migrate if service already running upon rebalance on start
Fiona Ebner [Fri, 14 Apr 2023 12:38:30 +0000 (14:38 +0200)]
lrm: do not migrate if service already running upon rebalance on start

As reported in the community forum[0], currently, a newly added
service that's already running is shut down, offline migrated and
started again if rebalance selects a new node for it. This is
unexpected.

An improvement would be to online-migrate the service, but rebalance
is only supposed to happen for a stopped->start transition[1], so the
service should not be migrated at all.

The cleanest solution would be for the CRM to use the state 'started'
instead of 'request_start' for newly added services that are already
running, i.e. restore the behavior from before commit c2f2b9c
("manager: set new request_start state for services freshly added to
HA") for such services. But currently, there is no mechanism for the
CRM to check if the service is already running, because it could be on
a different node. For now, avoiding the migration has to be handled in
the LRM instead. If the CRM ever has access to the necessary
information in the future, the solution mentioned above can be
reconsidered.

Note that the CRM log message relies on the fact that the LRM only
returns the IGNORED status in this case, but it's more user-friendly
than using a generic message like "migration ignored (check LRM
log)".

[0]: https://forum.proxmox.com/threads/125597/
[1]: https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#_crs_scheduling_points

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
 [ T: split out adding the test to a previous commit so that one can
   see in git what the original bad behavior was and how it's now ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
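
 [ Editor's illustration, not part of the commit: a conceptual sketch of the
   LRM-side check; names and structure are assumptions, only the IGNORED
   return code is taken from the commits further down this log. ]

    # illustrative only: skip rebalance-on-start for an already running service
    sub handle_rebalance_on_start {
        my ($service_is_running, $selected_node, $current_node) = @_;

        # freshly added to HA but already running: don't stop + migrate it
        return 'IGNORED' if $service_is_running;

        return $selected_node ne $current_node ? 'MIGRATE' : 'OK';
    }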
10 months ago  tests: simulate adding running services to HA with rebalance-on-start
Thomas Lamprecht [Tue, 6 Jun 2023 17:02:40 +0000 (19:02 +0200)]
tests: simulate adding running services to HA with rebalance-on-start

Split out from Fiona's original series, to better show what actually
changes with her fix.

Currently, a newly added service that's already running is shut down,
offline migrated and started again if rebalance selects a new node
for it. This is unexpected and should be fixed; encode that behavior
as a test now, still showing the undesired behavior, and fix it in
the next commit.

Originally-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  tools: add IGNORED return code
Fiona Ebner [Fri, 14 Apr 2023 12:38:29 +0000 (14:38 +0200)]
tools: add IGNORED return code

Will be used to ignore rebalance-on-start when an already running
service is newly added to HA.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  sim: hardware: commands: make it possible to add already running service
Fiona Ebner [Fri, 14 Apr 2023 12:38:28 +0000 (14:38 +0200)]
sim: hardware: commands: make it possible to add already running service

Will be used in a test for balance on start, where it should make a
difference if the service is running or not.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  sim: hardware: commands: fix documentation for add
Fiona Ebner [Fri, 14 Apr 2023 12:38:27 +0000 (14:38 +0200)]
sim: hardware: commands: fix documentation for add

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  bump version to 4.0.0
Thomas Lamprecht [Wed, 24 May 2023 17:27:04 +0000 (19:27 +0200)]
bump version to 4.0.0

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  buildsys: derive upload dist automatically
Thomas Lamprecht [Wed, 24 May 2023 17:26:14 +0000 (19:26 +0200)]
buildsys: derive upload dist automatically

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  d/control: raise standards version compliance to 4.6.2
Thomas Lamprecht [Wed, 24 May 2023 17:26:05 +0000 (19:26 +0200)]
d/control: raise standards version compliance to 4.6.2

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  buildsys: improve DSC target & add sbuild convenience target
Thomas Lamprecht [Wed, 24 May 2023 17:25:47 +0000 (19:25 +0200)]
buildsys: improve DSC target & add sbuild convenience target

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  buildsys: make build-dir generation atomic
Thomas Lamprecht [Wed, 24 May 2023 17:24:59 +0000 (19:24 +0200)]
buildsys: make build-dir generation atomic

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  buildsys: rework doc-gen cleanup and makefile inclusion
Thomas Lamprecht [Wed, 24 May 2023 17:24:38 +0000 (19:24 +0200)]
buildsys: rework doc-gen cleanup and makefile inclusion

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  buildsys: use full DEB_VERSION and correct DEB_HOST_ARCH
Thomas Lamprecht [Wed, 24 May 2023 17:10:42 +0000 (19:10 +0200)]
buildsys: use full DEB_VERSION and correct DEB_HOST_ARCH

The DEB_HOST_ARCH is the one the package is actually built for, the
DEB_BUILD_ARCH is the one of the build host; having this correct
makes cross-building easier, but otherwise it makes no difference.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
10 months ago  makefile: convert to use simple parenthesis
Thomas Lamprecht [Wed, 24 May 2023 17:08:29 +0000 (19:08 +0200)]
makefile: convert to use simple parenthesis

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
11 months ago  bump version to 3.6.1
Thomas Lamprecht [Thu, 20 Apr 2023 12:16:18 +0000 (14:16 +0200)]
bump version to 3.6.1

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
11 months ago  lrm: keep manual maintenance mode independent of shutdown policy
Thomas Lamprecht [Thu, 20 Apr 2023 11:13:13 +0000 (13:13 +0200)]
lrm: keep manual maintenance mode independent of shutdown policy

We did not handle being in maintenance mode explicitly with shutdown
policies. In practice this is not often an issue, as most who use the
maintenance mode also switched the shutdown policy over to 'migrate',
which keeps the maintenance mode. But for all those evaluating HA or
only using the manual maintenance mode it meant that on shutdown the
mode was set to 'restart' or 'shutdown', which made the active
manager think that the node got out of the maintenance state again
and marked it as online. But as it wasn't really online (and on the
way to shutdown), this not only cleared the maintenance mode by
mistake, it also had a chance to cause fencing - if any service was
still on the node, i.e., maintenance mode wasn't reached yet but
moving the HA services (guests) was still in progress.

Fix that by checking if maintenance mode is requested or already
active (we currently don't differentiate those two explicitly, but it
could be determined from the active service count if required), and
avoid changing the mode in the shutdown and restart case. Also log
that explicitly so admins can understand what happened and why.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
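
 [ Editor's illustration, not part of the commit: the gist of the fix as
   pseudo-Perl; the mode names come from the commit messages, everything
   else (sub name, fields, log call) is assumed. ]

    # illustrative shutdown handling, not the actual LRM code
    sub handle_shutdown_request {
        my ($self, $haenv, $shutdown_policy, $is_reboot) = @_;

        if ($self->{mode} eq 'maintenance') {
            # manual maintenance requested/active: keep it, ignore the shutdown policy
            $haenv->log('info', "shutdown request while in maintenance mode, keeping mode");
        } elsif ($shutdown_policy eq 'migrate') {
            $self->{mode} = 'maintenance';
        } else {
            $self->{mode} = $is_reboot ? 'restart' : 'shutdown';
        }
    }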
11 months ago  test behavior of maintenance mode with another shutdown policy
Thomas Lamprecht [Wed, 19 Apr 2023 16:38:21 +0000 (18:38 +0200)]
test behavior of maintenance mode with another shutdown policy

Encode what happens if a node is in maintenance and gets shut down
with a shutdown policy other than 'migrate' (= maintenance mode)
active.

Currently this causes the maintenance mode to be disabled and might
even make fencing possible (if not all services got moved already).
This will be addressed in the next commit.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
11 months ago  manager: ensure node-request state transferred to new active CRM
Thomas Lamprecht [Wed, 19 Apr 2023 12:24:23 +0000 (14:24 +0200)]
manager: ensure node-request state transferred to new active CRM

We do not just take the full CRM status of the old master if a new
one gets active, we only take over the most relevant parts like node
state. But the relatively new node_request object entry is also
important, as without it a maintenance state request may get lost
if a new CRM becomes the active master.

Simply copy it over on initial manager construction, if it exists.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
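
 [ Editor's illustration, not part of the commit: a minimal sketch of
   "copy it over on initial manager construction"; the node_request key is
   from the commit, the status accessor is an assumption. ]

    # inside the manager constructor (illustrative): take over node_request
    # entries from the previous master's state
    my $old_ms = $haenv->read_manager_status();   # assumed accessor
    $self->{ms}->{node_request} = $old_ms->{node_request}
        if $old_ms && defined($old_ms->{node_request});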
11 months ago  test behavior of shutdown with maintenance mode on active master
Thomas Lamprecht [Wed, 19 Apr 2023 12:16:21 +0000 (14:16 +0200)]
test behavior of shutdown with maintenance mode on active master

this encodes the current bad behavior of the maintenance mode getting
lost on active CRM switch, due to the request node state not being
transferred. Will be fixed in the next commit.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
11 months ago  lrm: add maintenance to comment about available modes
Thomas Lamprecht [Thu, 20 Apr 2023 11:14:24 +0000 (13:14 +0200)]
lrm: add maintenance to comment about available modes

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
11 months ago  ha config: code style/indentation cleanups
Thomas Lamprecht [Thu, 20 Apr 2023 11:13:44 +0000 (13:13 +0200)]
ha config: code style/indentation cleanups

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  cli: assert that node exists when changing CRS request state
Thomas Lamprecht [Thu, 6 Apr 2023 12:09:01 +0000 (14:09 +0200)]
cli: assert that node exists when changing CRS request state

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  bump version to 3.6.0
Thomas Lamprecht [Mon, 20 Mar 2023 12:45:36 +0000 (13:45 +0100)]
bump version to 3.6.0

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  cli: expose new "crm-command node-maintenance enable/disable" commands
Thomas Lamprecht [Mon, 20 Mar 2023 12:29:28 +0000 (13:29 +0100)]
cli: expose new "crm-command node-maintenance enable/disable" commands

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  add CRM command to switch an online node manually into maintenance without reboot
Thomas Lamprecht [Mon, 20 Mar 2023 12:18:50 +0000 (13:18 +0100)]
add CRM command to switch an online node manually into maintenance without reboot

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  lrm: always give up lock if node went successfully into maintenance
Thomas Lamprecht [Mon, 20 Mar 2023 12:06:55 +0000 (13:06 +0100)]
lrm: always give up lock if node went successfully into maintenance

the change as of now is a no-op, as we only ever switched to
maintenance mode on shutdown-request, and there we exited immediately
if no active service and worker were around anyway.

So this is mostly preparing for a manual maintenance mode without any
pending shutdown.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  lrm: factor out check for maintenance-request
Thomas Lamprecht [Mon, 20 Mar 2023 12:04:11 +0000 (13:04 +0100)]
lrm: factor out check for maintenance-request

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  manager: some code style cleanups
Thomas Lamprecht [Mon, 20 Mar 2023 08:47:43 +0000 (09:47 +0100)]
manager: some code style cleanups

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  request start: allow to auto-rebalance on a new start request
Thomas Lamprecht [Sat, 19 Nov 2022 14:49:50 +0000 (15:49 +0100)]
request start: allow to auto-rebalance on a new start request

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  manager: select service node: allow to force best-score selection without try-next
Thomas Lamprecht [Mon, 20 Mar 2023 08:43:06 +0000 (09:43 +0100)]
manager: select service node: allow to force best-score selection without try-next

useful for re-balancing on start, where we do not want to exclude
the current node like setting the $try_next param does, but also
don't want to favor it like not setting the $try_next param does.

We might want to transform both, `try_next` and `best_scored` into a
single `mode` parameter to reduce complexity and make it more
explicit what we want here.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  manager: set new request_start state for services freshly added to HA
Thomas Lamprecht [Sat, 19 Nov 2022 14:49:38 +0000 (15:49 +0100)]
manager: set new request_start state for services freshly added to HA

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  manager: add new intermediate state for stop->start transitions
Thomas Lamprecht [Sat, 19 Nov 2022 14:27:59 +0000 (15:27 +0100)]
manager: add new intermediate state for stop->start transitions

We always check for re-starting a service if it's in the started
state, but for those that go from a (request_)stop to the stopped
state it can be useful to explicitly have a separate transition.

The newly introduced `request_start` state can also be used for CRS
to opt into starting a service on a load-wise better-suited node
in the future.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  manager: recompute online usage: iterate over keys sorted
Thomas Lamprecht [Mon, 20 Mar 2023 10:01:50 +0000 (11:01 +0100)]
manager: recompute online usage: iterate over keys sorted

mostly to be safe regarding reproducibility with the test system.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
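
 [ Editor's illustration, not part of the commit: the change boils down to
   iterating hashes in a defined order so test runs are reproducible. ]

    use strict;
    use warnings;

    my $services = { 'vm:101' => 1, 'ct:200' => 1, 'vm:102' => 1 };

    # Perl randomizes hash order between runs; sorting keeps output reproducible
    for my $sid (sort keys %$services) {
        print "recompute usage for $sid\n";
    }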
12 months ago  manager: service start: make EWRONG_NODE a non-fatal error
Thomas Lamprecht [Mon, 20 Mar 2023 08:49:10 +0000 (09:49 +0100)]
manager: service start: make EWRONG_NODE a non-fatal error

traverse the usual error counting mechanisms, as then the
select_service_node helper either picks up the right node and the
service starts there, or it can trigger fencing of that node.

Note, in practice this normally can only happen if the admin
butchered around in the node cluster state, but as we only select the
safe nodes from the configured groups, we should be safe in any case.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  sim hardware: avoid hard error on usage stats parsing
Thomas Lamprecht [Mon, 20 Mar 2023 10:05:09 +0000 (11:05 +0100)]
sim hardware: avoid hard error on usage stats parsing

now that we can automatically derive them from the SID

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
12 months ago  sim env: derive service usage from ID as fallback
Thomas Lamprecht [Mon, 20 Mar 2023 10:03:05 +0000 (11:03 +0100)]
sim env: derive service usage from ID as fallback

so that we don't need to specify all usage stats explicitly for
bigger tests.

Note, we explicitly use two digits for memory as with just one a lot
of services are exactly the same, which gives us flaky tests due to
rounding, or some flakiness in the rust code - so this is a bit of a
stopgap for that too and should be reduced to a single digit once
we fix it in the future.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
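
 [ Editor's illustration, not part of the commit: one way such a fallback
   could be derived from the SID; purely hypothetical, the simulator's
   actual derivation may differ. ]

    use strict;
    use warnings;

    # hypothetical: derive static usage stats from the numeric part of the SID
    my $sid    = 'vm:102';
    my ($vmid) = $sid =~ /(\d+)$/;
    my $cpu    = ($vmid % 4) + 1;            # 1..4 vCPUs
    my $memory = (($vmid * 7) % 90) + 10;    # 10..99, i.e. two varying digits
    print "sid $sid => cpu $cpu, memory $memory\n";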
15 months ago  update readme to be a bit less confusing/outdated
Thomas Lamprecht [Tue, 3 Jan 2023 12:19:16 +0000 (13:19 +0100)]
update readme to be a bit less confusing/outdated

E.g., pve-ha-manager is our current HA manager, so talking about the
"current HA stack" being EOL without mentioning the actually meant
`rgmanager` one, got taken the wrong way by some potential users.
Correct that and a few other things, but as some stuff is definitely
still out-of-date, or will be in a few months, mention that
this is an older readme and refer to the HA reference docs at the
top.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  bump version to 3.5.1
Thomas Lamprecht [Sat, 19 Nov 2022 14:51:16 +0000 (15:51 +0100)]
bump version to 3.5.1

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  api: status: add CRS info to manager if not set to default
Thomas Lamprecht [Sat, 19 Nov 2022 14:27:09 +0000 (15:27 +0100)]
api: status: add CRS info to manager if not set to default

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  manager: slightly clarify log message for fallback on init-failure
Thomas Lamprecht [Sat, 19 Nov 2022 13:15:36 +0000 (14:15 +0100)]
manager: slightly clarify log message for fallback on init-failure

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  api: status: code and indentation cleanup
Thomas Lamprecht [Sat, 19 Nov 2022 13:00:51 +0000 (14:00 +0100)]
api: status: code and indentation cleanup

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  manager: make crs a full blown hash
Thomas Lamprecht [Sat, 19 Nov 2022 14:38:05 +0000 (15:38 +0100)]
manager: make crs a full blown hash

To support potential additional CRS settings more easily.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  manager: update crs scheduling mode once per round
Thomas Lamprecht [Sat, 19 Nov 2022 12:36:57 +0000 (13:36 +0100)]
manager: update crs scheduling mode once per round

Pretty safe to do as we recompute everything per round anyway (and
much more often on top of that, but that's another topic).

Actually I'd argue that it's safer this way, as a user doesn't need
to actively restart the manager, which involves much more gear
grinding and watchdog changes than checking periodically and
updating it internally. Plus, a lot of admins won't expect that they
need to restart the current active master and thus they'll complain
that their recently made change to the CRS config had no effect/the
CRS doesn't work at all.

We should codify such a change in a test though.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  manager: factor out setting crs scheduling mode
Thomas Lamprecht [Sat, 19 Nov 2022 12:36:28 +0000 (13:36 +0100)]
manager: factor out setting crs scheduling mode

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  manager: various code style cleanups
Thomas Lamprecht [Sat, 19 Nov 2022 12:06:03 +0000 (13:06 +0100)]
manager: various code style cleanups

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  bump version to 3.5.0
Thomas Lamprecht [Fri, 18 Nov 2022 14:03:00 +0000 (15:03 +0100)]
bump version to 3.5.0

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  manager: better convey that basic is always the fallback
Thomas Lamprecht [Fri, 18 Nov 2022 13:24:25 +0000 (14:24 +0100)]
manager: better convey that basic is always the fallback

to hint to a potential "code optimizer" that it may not be easily
moved up above the scheduling selection.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  d/control: add (build-)dependency for libpve-rs-perl
Thomas Lamprecht [Fri, 18 Nov 2022 12:44:43 +0000 (13:44 +0100)]
d/control: add (build-)dependency for libpve-rs-perl

to ensure the perlmod for the basic scheduler is available.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
16 months ago  resources: add missing PVE::Cluster use statements
Fiona Ebner [Thu, 17 Nov 2022 14:00:16 +0000 (15:00 +0100)]
resources: add missing PVE::Cluster use statements

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  test: add tests for static resource scheduling
Fiona Ebner [Thu, 17 Nov 2022 14:00:15 +0000 (15:00 +0100)]
test: add tests for static resource scheduling

See the READMEs for more information about the tests.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  usage: static: use service count on nodes as a fallback
Fiona Ebner [Thu, 17 Nov 2022 14:00:14 +0000 (15:00 +0100)]
usage: static: use service count on nodes as a fallback

if something goes wrong with the TOPSIS scoring. Not expected to
happen, but it's rather cheap to be on the safe side.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
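
 [ Editor's illustration, not part of the commit: a sketched fallback; the
   scoring call and field names are assumptions, only the "fall back to the
   service count" idea is from the commit. ]

    # if static (TOPSIS) scoring fails or yields nothing, score by service count
    my $scores = eval { $self->score_nodes_static($sid, $service_node) };  # hypothetical
    if (!$scores || !%$scores) {
        warn "static scoring failed, using service counts: $@" if $@;
        $scores = { map { $_ => $self->{'service-count'}->{$_} // 0 }
                    keys %{ $self->{nodes} } };
    }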
16 months ago  manager: avoid scoring nodes when not trying next and current node is valid
Fiona Ebner [Thu, 17 Nov 2022 14:00:13 +0000 (15:00 +0100)]
manager: avoid scoring nodes when not trying next and current node is valid

With the Usage::Static plugin, scoring is not as cheap anymore and
select_service_node() is called for each running service.

This should cover most calls of select_service_node().

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  manager: avoid scoring nodes if maintenance fallback node is valid
Fiona Ebner [Thu, 17 Nov 2022 14:00:12 +0000 (15:00 +0100)]
manager: avoid scoring nodes if maintenance fallback node is valid

With the Usage::Static plugin, scoring is not as cheap anymore and
select_service_node() is called for each running service.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  manager: use static resource scheduler when configured
Fiona Ebner [Thu, 17 Nov 2022 14:00:11 +0000 (15:00 +0100)]
manager: use static resource scheduler when configured

Note that recompute_online_node_usage() becomes much slower when the
'static' resource scheduler mode is used. Tested it with ~300 HA
services (minimal containers) running on my virtual test cluster.

Timings with 'basic' mode were between 0.0004 - 0.001 seconds
Timings with 'static' mode were between 0.007 - 0.012 seconds

Combined with the fact that recompute_online_node_usage() is currently
called very often, this can lead to a lot of delay during recovery
situations with hundreds of services and low thousands of services
overall, and with generous estimates even run into the watchdog timer.

Ideas to remedy this are using PVE::Cluster's
get_guest_config_properties() instead of load_config() and/or
optimizing how often recompute_online_node_usage() is called.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  manager: set resource scheduler mode upon init
Fiona Ebner [Thu, 17 Nov 2022 14:00:10 +0000 (15:00 +0100)]
manager: set resource scheduler mode upon init

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  env: datacenter config: include crs (cluster-resource-scheduling) setting
Fiona Ebner [Thu, 17 Nov 2022 14:00:09 +0000 (15:00 +0100)]
env: datacenter config: include crs (cluster-resource-scheduling) setting

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  env: rename get_ha_settings to get_datacenter_settings
Fiona Ebner [Thu, 17 Nov 2022 14:00:08 +0000 (15:00 +0100)]
env: rename get_ha_settings to get_datacenter_settings

The method will be extended to include other HA-relevant settings from
datacenter.cfg.

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  usage: add Usage::Static plugin
Fiona Ebner [Thu, 17 Nov 2022 14:00:07 +0000 (15:00 +0100)]
usage: add Usage::Static plugin

for calculating node usage of services based upon static CPU and
memory configuration as well as scoring the nodes with that
information to decide where to start a new or recovered service.

For getting the service stats, it's necessary to also consider the
migration target (if present), because the configuration file might
have already moved.

It's necessary to update the cluster filesystem upon stealing the
service to be able to always read the moved config right away when
adding the usage.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  manager: online node usage: switch to Usage::Basic plugin
Fiona Ebner [Thu, 17 Nov 2022 14:00:06 +0000 (15:00 +0100)]
manager: online node usage: switch to Usage::Basic plugin

no functional change is intended.

One test needs adaptation too, because it created its own version of
$online_node_usage.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  manager: select service node: add $sid to parameters
Fiona Ebner [Thu, 17 Nov 2022 14:00:05 +0000 (15:00 +0100)]
manager: select service node: add $sid to parameters

In preparation for scheduling based on static information, where the
scoring of nodes depends on information from the service's
VM/CT configuration file (and the $sid is required to query that).

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
16 months ago  add Usage base plugin and Usage::Basic plugin
Fiona Ebner [Thu, 17 Nov 2022 14:00:04 +0000 (15:00 +0100)]
add Usage base plugin and Usage::Basic plugin

in preparation to also support static resource scheduling via another
such Usage plugin.

The interface is designed in anticipation of the Usage::Static plugin;
the Usage::Basic plugin doesn't require all parameters.

In Usage::Static, the $haenv will be necessary for logging and getting
the static node stats. add_service_usage_to_node() and
score_nodes_to_start_service() take the sid and service node, and the
former also the optional migration target (during a migration it's not
clear whether the config file has already been moved or not) to be
able to get the static service stats.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
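
 [ Editor's illustration, not part of the commit: how a caller might drive
   the interface described above; the method names are from the commit
   message, the class name and parameter order are assumptions. ]

    # illustrative use of the Usage plugin interface
    my $usage = PVE::HA::Usage::Basic->new($haenv);

    $usage->add_node($_) for qw(node1 node2 node3);

    # account for service 'vm:101' on node1, possibly migrating to node2
    $usage->add_service_usage_to_node('node1', 'vm:101', 'node1', 'node2');

    # per-node scores used to pick a start/recovery node
    my $scores = $usage->score_nodes_to_start_service('vm:101', 'node1');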
16 months ago  resources: add get_static_stats() method
Fiona Ebner [Thu, 17 Nov 2022 14:00:03 +0000 (15:00 +0100)]
resources: add get_static_stats() method

to be used for static resource scheduling.

In a container's vmstatus(), the 'cores' option takes precedence over
the 'cpulimit' one, but it felt more accurate to prefer 'cpulimit'
here.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
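
 [ Editor's illustration, not part of the commit: a sketch of that
   preference; the sample config, returned field names and defaults are
   assumptions. ]

    # containers: prefer 'cpulimit' over 'cores' for the static CPU estimate
    my $conf   = { cores => 4, cpulimit => 2, memory => 1024 };   # sample CT config values
    my $cpus   = $conf->{cpulimit} || $conf->{cores} || 1;        # => 2
    my $memory = ($conf->{memory} // 512) * 1024 * 1024;          # assumed MiB -> bytes
    my $stats  = { maxcpu => $cpus, maxmem => $memory };          # hypothetical field names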
16 months ago  env: add get_static_node_stats() method
Fiona Ebner [Thu, 17 Nov 2022 14:00:02 +0000 (15:00 +0100)]
env: add get_static_node_stats() method

to be used for static resource scheduling. In the simulation
environment, the information can be added in hardware_status.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
20 months ago  fixup variable name typo
Thomas Lamprecht [Fri, 22 Jul 2022 10:39:27 +0000 (12:39 +0200)]
fixup variable name typo

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
20 months ago  manager: add top level comment section to explain common variables
Thomas Lamprecht [Fri, 22 Jul 2022 10:15:55 +0000 (12:15 +0200)]
manager: add top level comment section to explain common variables

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
20 months ago  d/lintian-overrides: update for newer lintian
Thomas Lamprecht [Fri, 22 Jul 2022 08:06:47 +0000 (10:06 +0200)]
d/lintian-overrides: update for newer lintian

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
20 months ago  bump version to 3.4.0
Thomas Lamprecht [Fri, 22 Jul 2022 07:22:47 +0000 (09:22 +0200)]
bump version to 3.4.0

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
20 months ago  manager: online node usage: factor out possible target and future-proof
Thomas Lamprecht [Fri, 22 Jul 2022 07:12:37 +0000 (09:12 +0200)]
manager: online node usage: factor out possible target and future-proof

only count up target selection if that node is already in the online
node usage list, to avoid that an offline node is considered online
if it's a target of any command.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
20 months ago  test: update pre-existing policy tests for fixed balancing spread
Thomas Lamprecht [Fri, 22 Jul 2022 06:49:41 +0000 (08:49 +0200)]
test: update pre-existing policy tests for fixed balancing spread

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
20 months ago  fix variable name typo
Thomas Lamprecht [Fri, 22 Jul 2022 05:25:02 +0000 (07:25 +0200)]
fix variable name typo

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
20 months ago  fix spreading out services if source node isn't operational but otherwise ok
Thomas Lamprecht [Thu, 21 Jul 2022 16:14:32 +0000 (18:14 +0200)]
fix spreading out services if source node isn't operational but otherwise ok

as is the case when going into maintenance mode

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
20 months ago  tests: add shutdown policy scenario with multiple guests to spread out
Thomas Lamprecht [Thu, 21 Jul 2022 16:09:38 +0000 (18:09 +0200)]
tests: add shutdown policy scenario with multiple guests to spread out

currently wrong, as online_node_usage doesn't consider counting the
target node if the source node isn't considered online (=
operational) anymore.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
23 months ago  bump version to 3.3-4
Thomas Lamprecht [Wed, 27 Apr 2022 12:02:22 +0000 (14:02 +0200)]
bump version to 3.3-4

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
23 months ago  lrm: fix getting stuck on restart
Fabian Grünbichler [Wed, 27 Apr 2022 10:19:55 +0000 (12:19 +0200)]
lrm: fix getting stuck on restart

run_workers is responsible for updating the state after workers have
exited. if the current LRM state is 'active', but a shutdown_request was
issued in 'restart' mode (like on package upgrades), this call is the
only one made in the LRM work() loop.

skipping it if there are active services means the following sequence of
events effectively keeps the LRM from restarting or making any progress:

- start HA migration on node A
- reload LRM on node A while migration is still running

even once the migration is finished, the service count is still >= 1
since the LRM never calls run_workers (directly or via
manage_resources), so the service having been migrated is never noticed.

maintenance mode (i.e., rebooting the node with shutdown policy migrate)
does call manage_resources and thus run_workers, and will proceed once
the last worker has exited.

reported by a user:

https://forum.proxmox.com/threads/lrm-hangs-when-updating-while-migration-is-running.108628

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2 years ago  buildsys: track and upload debug package
Thomas Lamprecht [Thu, 20 Jan 2022 17:08:27 +0000 (18:08 +0100)]
buildsys: track and upload debug package

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 years ago  bump version to 3.3-3
Thomas Lamprecht [Thu, 20 Jan 2022 17:05:37 +0000 (18:05 +0100)]
bump version to 3.3-3

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 years ago  lrm: increase run_worker loop-time partition
Thomas Lamprecht [Thu, 20 Jan 2022 15:09:37 +0000 (16:09 +0100)]
lrm: increase run_worker loop-time partition

every LRM round is scheduled to run for 10s, but we spend only half
of that actively trying to run workers (within the max_worker limit).

Raise that to 80% duty cycle.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 years ago  lrm: avoid job starvation on huge workloads
Thomas Lamprecht [Thu, 20 Jan 2022 14:35:02 +0000 (15:35 +0100)]
lrm: avoid job starvation on huge workloads

If a setup has a lot of VMs we may run into the time limit from the
run_worker loop before processing all workers, which can easily
happen if an admin did not increase the default of max_workers in
the setup, but even with a bigger max_worker setting one can run into
it.

That, combined with the fact that we sorted just by the $sid
alpha-numerically, means that CTs were preferred over VMs (C comes
before V) and additionally lower VMIDs were preferred too.

That means that a set of SIDs had a lower chance of ever actually
getting run, which is naturally not ideal at all.
Improve on that behavior by adding a counter to the queued worker and
preferring those that have a higher one, i.e., that spent more time
waiting on getting actively run.

Note, due to the way the stop state is enforced, i.e., always
enqueued as a new worker, its start-try counter will be reset every
round and it thus has a lower priority compared to other request
states. We probably want to differentiate between a stop request when
the service is/was in another state just before, and the case where a
stop is just re-requested even if a service was already stopped for a
while.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
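
 [ Editor's illustration, not part of the commit: in essence the queue
   ordering changes from plain SID order to "waited longest first"; the
   counter's field name and helpers are made up. ]

    # prefer queued workers that already waited the most rounds
    my @todo = sort {
        $b->{start_tries} <=> $a->{start_tries}   # more waiting rounds => run first
            || $a->{sid} cmp $b->{sid}            # stable tie-break on the SID
    } values %$queued_workers;

    for my $w (@todo) {
        last if scalar(keys %$running_workers) >= $max_workers;  # respect the limit
        start_worker($w);                                        # placeholder
    }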
2 years ago  lrm: code/style cleanups
Thomas Lamprecht [Thu, 20 Jan 2022 13:40:27 +0000 (14:40 +0100)]
lrm: code/style cleanups

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 years ago  lrm: run worker: avoid an indentation level
Thomas Lamprecht [Thu, 20 Jan 2022 12:41:24 +0000 (13:41 +0100)]
lrm: run worker: avoid an indentation level

best viewed with the `-w` flag to ignore whitespace change itself

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 years ago  lrm: log actual error if fork fails
Thomas Lamprecht [Thu, 20 Jan 2022 12:39:35 +0000 (13:39 +0100)]
lrm: log actual error if fork fails

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 years ago  manager: refactor fence processing and rework fence-but-no-service log
Thomas Lamprecht [Thu, 20 Jan 2022 12:31:04 +0000 (13:31 +0100)]
manager: refactor fence processing and rework fence-but-no-service log

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 years ago  d/changelog: s/nodes/services/
Thomas Lamprecht [Thu, 20 Jan 2022 09:10:27 +0000 (10:10 +0100)]
d/changelog: s/nodes/services/

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 years ago  bump version to 3.3-2
Thomas Lamprecht [Wed, 19 Jan 2022 13:30:19 +0000 (14:30 +0100)]
bump version to 3.3-2

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 years ago  manage: handle edge case where a node gets stuck in 'fence' state
Fabian Ebner [Fri, 8 Oct 2021 12:52:26 +0000 (14:52 +0200)]
manage: handle edge case where a node gets stuck in 'fence' state

If all services in 'fence' state are gone from a node (e.g. by
removing the services) before fence_node() was successful, a node
would get stuck in the 'fence' state. Avoid this by calling
fence_node() if the node is in 'fence' state, regardless of service
state.

Reported in the community forum:
https://forum.proxmox.com/threads/ha-migration-stuck-is-doing-nothing.94469/

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
[ T: track test change of new test ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
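
 [ Editor's illustration, not part of the commit: conceptually the fix
   iterates over nodes in 'fence' state instead of over services in 'fence'
   state; hedged sketch, not the actual manager code. ]

    # fence every node marked as 'fence', even if it has no 'fence' services left
    for my $node (sort keys %$node_status) {
        next if $node_status->{$node} ne 'fence';
        fence_node($haenv, $node);   # assumed call shape
    }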