git.proxmox.com Git - pve-ha-manager.git/log
Dietmar Maurer [Mon, 12 Oct 2015 16:26:52 +0000 (18:26 +0200)]
bump version to 1.0-10
Thomas Lamprecht [Mon, 12 Oct 2015 13:04:42 +0000 (15:04 +0200)]
fix typo in error message s/storage/resource/
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 12 Oct 2015 13:04:41 +0000 (15:04 +0200)]
check resource better on addition and update
Check if the resource exists in the cluster when adding it to the
HA stack.
When trying to update, migrate or delete a resource, check if it is
HA managed at all.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 12 Oct 2015 13:04:40 +0000 (15:04 +0200)]
Add resource existence check helper
Add a helper to the resource class which checks, in a service-specific
way, whether the resource exists on the cluster, i.e. whether it can be added.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 12 Oct 2015 13:04:39 +0000 (15:04 +0200)]
Document parameters in parent class
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 12 Oct 2015 13:04:38 +0000 (15:04 +0200)]
Add 'service is ha managed' check
Add a check whether a given $sid is managed by the HA stack.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Dietmar Maurer [Tue, 29 Sep 2015 05:35:56 +0000 (07:35 +0200)]
bump version to 1.0-9
Thomas Lamprecht [Mon, 28 Sep 2015 09:34:52 +0000 (11:34 +0200)]
delete node from HA stack when deleted from cluster
When a node gets deleted from the cluster with pvecm delnode,
we set its node state in the manager status to 'gone'.
When set to 'gone', the manager waits an hour after the node was last
seen online and only then deletes it from the manager status.
When some HA services were forgotten on the node (shouldn't happen
at all!!) the node will be fenced, the service migrated and then its
state reset to 'gone'. After an hour the node will be deleted,
unless it joined the cluster again in the meantime.
Deleting a node from the HA manager status is by no means a final
act, the ha-manager could live without deleting it, but for the user
it is confusing to see dead nodes in the interface.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
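The one-hour grace period described in this commit can be sketched roughly as follows. This is an illustration in Python and not the ha-manager's Perl code: the function name `prune_gone_nodes`, the two dictionaries and the `grace` parameter are hypothetical, and fencing of forgotten services is left out entirely.

```python
GONE_GRACE_SECONDS = 3600  # "an hour", as stated in the commit message


def prune_gone_nodes(node_status, last_online, now, grace=GONE_GRACE_SECONDS):
    """Remove nodes marked 'gone' that were last seen online more than
    `grace` seconds ago.

    node_status: dict mapping node name -> state string (hypothetical layout)
    last_online: dict mapping node name -> timestamp of last online sighting
    now:         current timestamp in seconds
    """
    for node, state in list(node_status.items()):
        if state != "gone":
            # A node that rejoined the cluster is no longer 'gone' and is kept.
            continue
        # Without a recorded sighting we cannot tell how long the node has
        # been away, so it is kept (an assumption of this sketch).
        seen = last_online.get(node, now)
        if now - seen > grace:
            del node_status[node]
            last_online.pop(node, None)
    return node_status
```

A node whose state is set back to anything other than 'gone' before the hour has passed is simply skipped, which mirrors the "unless it joined the cluster again" clause.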
Dietmar Maurer [Sat, 26 Sep 2015 08:36:31 +0000 (10:36 +0200)]
bump version to 1.0-8
Thomas Lamprecht [Fri, 25 Sep 2015 15:50:06 +0000 (17:50 +0200)]
Use new lock domain sub instead of storage lock
Doesn't change behaviour at all, but makes the code clearer.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Dietmar Maurer [Mon, 21 Sep 2015 10:17:49 +0000 (12:17 +0200)]
bump version to 1.0-7
Thomas Lamprecht [Fri, 18 Sep 2015 06:19:37 +0000 (08:19 +0200)]
enhance ha-manager's group commands
Add commands for adding, deleting and modifying groups. Also add
better bash completion for these commands.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Fri, 18 Sep 2015 09:21:02 +0000 (11:21 +0200)]
vm_is_ha_managed: allow check on service state
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Dietmar Maurer [Thu, 17 Sep 2015 09:53:44 +0000 (11:53 +0200)]
Groups: correctly set optional flag in propertyList
Only group and type should be required, all other properties should
be marked optional inside propertyList. We can set correct values
for optional flag inside options().
Dietmar Maurer [Thu, 17 Sep 2015 05:51:43 +0000 (07:51 +0200)]
Makefile: use mv to create files atomically
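The idea behind this change is the write-to-temporary-then-rename pattern: a reader never sees a half-written file, because the final name only appears once the content is complete. The Makefile itself uses `mv`; the sketch below shows the same pattern in Python with `os.replace`, purely as an illustration. The helper name `write_atomically` is hypothetical.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write `data` to `path` so that readers see either the old file
    or the complete new one, never a partial write."""
    # The temporary file must live in the same directory (same filesystem),
    # otherwise the final rename would not be atomic.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
        # os.replace renames the file, overwriting an existing target.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```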
Thomas Lamprecht [Wed, 16 Sep 2015 13:49:28 +0000 (15:49 +0200)]
Extend ha-manager's man page
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Dietmar Maurer [Wed, 16 Sep 2015 10:06:37 +0000 (12:06 +0200)]
bump version to 1.0-6
Thomas Lamprecht [Wed, 16 Sep 2015 09:25:18 +0000 (11:25 +0200)]
fix includes from services
The crm and lrm daemon executables need to include SafeSyslog, as
they use syslog in their signal handler.
It is no longer needed in the Service class of the daemons.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 16 Sep 2015 09:25:17 +0000 (11:25 +0200)]
fixing typos, also whitespace cleanup in PVE2 env class
Fix typos throughout the whole project; codespell was used to find most of
them.
Also do a big whitespace cleanup in the PVE2 environment class.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 16 Sep 2015 09:25:16 +0000 (11:25 +0200)]
adjust log level on failed start and error to warning
use warning instead of info to represent the significance of the
log message
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 16 Sep 2015 09:25:15 +0000 (11:25 +0200)]
implement recovery policy for services
We implement recovery policies which use settings known from
rgmanager, however the behaviour is not strictly the same,
our approach is more configurable. For example rgmanager cannot
combine its restart and relocate policy.
The following policy settings kick in on a failed service start:
* max_restart: maximal number of tries to restart a failed service
  on the current node. The default is 1 restart try.
  This policy gets enforced by the LRM.
* max_relocate: maximal number of tries to relocate the service to
  a different node. A relocate only takes place after
  the max_restart value is exceeded on the current node.
  This policy gets enforced by the CRM.
If a service is still not running after all max tries, its state
gets set to 'error'. This means that the service needs to be checked
and disabled manually.
*Note* that the relocate state will only reset when the service had
at least one successful start. That means if a service is re-enabled
without fixing the error, only the restart policy gets repeated.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
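The interplay of the two counters described in this commit can be sketched as a small state machine. This is a simplified Python illustration, not the LRM/CRM implementation: the class `ServiceState`, both functions, the default of 1 for `max_relocate`, the choice of the first other node as relocation target, and resetting the restart counter after a relocation are all assumptions made for the example. Only the `max_restart` default of 1 comes from the commit message.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    node: str                 # node the service currently lives on
    restart_tries: int = 0    # failed restarts on the current node
    relocate_tries: int = 0   # relocations since the last successful start
    state: str = "started"


def handle_failed_start(svc, other_nodes, max_restart=1, max_relocate=1):
    """Apply the recovery policy after one failed start; return the action."""
    if svc.restart_tries < max_restart:
        # Restart policy first: retry on the same node.
        svc.restart_tries += 1
        return "restart"
    if svc.relocate_tries < max_relocate and other_nodes:
        # Restart tries are used up: relocate to another node.
        svc.relocate_tries += 1
        svc.restart_tries = 0  # assumption: restarts count per node
        svc.node = other_nodes[0]
        return "relocate"
    # All tries exhausted: the service needs manual attention.
    svc.state = "error"
    return "error"


def handle_successful_start(svc):
    """A successful start resets both counters, including the relocate state."""
    svc.restart_tries = 0
    svc.relocate_tries = 0
    svc.state = "started"
```

With the defaults above, four consecutive failed starts of a service on `node1`, with `node2` available, yield the actions restart, relocate, restart, error.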
Dietmar Maurer [Wed, 16 Sep 2015 06:32:58 +0000 (08:32 +0200)]
improve sid bash completion
Thomas Lamprecht [Tue, 15 Sep 2015 07:27:37 +0000 (09:27 +0200)]
use helpers to enable advanced auto completion
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Tue, 15 Sep 2015 07:27:36 +0000 (09:27 +0200)]
add auto completion helper for service IDs and HA groups
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Fri, 11 Sep 2015 14:57:17 +0000 (16:57 +0200)]
simulator: fix random output of manager status
Tell Data::Dumper to sort the keys before dumping. That fixes
the manager status mess of jumping keys.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
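The fix relies on sorting hash keys before dumping, so that the same data always produces the same text. As an analogy only (the simulator is Perl and uses Data::Dumper), the same idea in Python could look like this, with the hypothetical helper `dump_status` and `json.dumps(..., sort_keys=True)` standing in for the sorted dump:

```python
import json


def dump_status(status):
    """Serialize a status dictionary with keys in sorted order, so that
    equal data always yields identical output."""
    return json.dumps(status, sort_keys=True, indent=2)
```

Two dictionaries holding the same entries in a different insertion order then serialize to the same string, which keeps displayed or logged status output stable between runs.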
Dietmar Maurer [Tue, 15 Sep 2015 06:26:59 +0000 (08:26 +0200)]
remove 'exename' from CLIHandler classes (not required)
Dietmar Maurer [Tue, 15 Sep 2015 05:32:22 +0000 (07:32 +0200)]
ha-manager: fix manpage header
Thomas Lamprecht [Mon, 14 Sep 2015 15:21:56 +0000 (17:21 +0200)]
convert pve-ha-crm into a PVE::Service class
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 14 Sep 2015 15:21:55 +0000 (17:21 +0200)]
convert pve-ha-lrm into a PVE::Service class
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Dietmar Maurer [Tue, 15 Sep 2015 05:25:10 +0000 (07:25 +0200)]
re-add code silently removed by last commit
Thomas Lamprecht [Mon, 14 Sep 2015 15:21:54 +0000 (17:21 +0200)]
move ha-manager to separate CLIHandler class
Move ha-manager to separate CLIHandler class and add basic auto
completion support.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Dietmar Maurer [Tue, 8 Sep 2015 06:46:03 +0000 (08:46 +0200)]
bump version to 1.0-5
Thomas Lamprecht [Wed, 2 Sep 2015 15:52:33 +0000 (17:52 +0200)]
Adding error state behaviour
Previously there was no way out of the error state.
Now a 'safe' state can be reached by disabling the service manually.
Disabling and reactivating should only be done if the error cause
was found and fixed.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 2 Sep 2015 15:52:32 +0000 (17:52 +0200)]
Replacing hardcoded qemu commands with plugin calls
Now a service specific plugin gets loaded and the calls to commands
like 'migrate' or 'stop' will be handled by the plugin.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 2 Sep 2015 15:52:31 +0000 (17:52 +0200)]
Fixed hardcoded type 'vm' in check if vm is ha managed
The new approach checks every registered resource type.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 2 Sep 2015 15:52:30 +0000 (17:52 +0200)]
Adding PVECT resource class so that CT can be HA managed
Extend the PVEVM resource class and add a PVECT resource class so
that service type specific operations (e.g. start, migrate, ...)
can be handled through a plugin and are independent of the service
type.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Alen Grizonic [Tue, 1 Sep 2015 09:53:59 +0000 (11:53 +0200)]
HA parse_sid changed to accept CT
[PATCH v4] changes:
- fixed VM/CT exist check
- added internal error exception
- fix spelling errors
Wolfgang Link [Mon, 17 Aug 2015 08:52:02 +0000 (10:52 +0200)]
Fix Typo
Dietmar Maurer [Tue, 16 Jun 2015 07:59:25 +0000 (09:59 +0200)]
bump version to 1.0-4
Dietmar Maurer [Tue, 16 Jun 2015 07:57:09 +0000 (09:57 +0200)]
groups: encode nodes as hash (internally)
Dietmar Maurer [Tue, 16 Jun 2015 07:55:48 +0000 (09:55 +0200)]
add trigger for pve-api-updates
Dietmar Maurer [Wed, 10 Jun 2015 05:39:18 +0000 (07:39 +0200)]
crm: simply wait if there is no resource config
Dietmar Maurer [Tue, 9 Jun 2015 12:35:32 +0000 (14:35 +0200)]
bump version to 1.0-3
Dietmar Maurer [Tue, 9 Jun 2015 07:33:42 +0000 (09:33 +0200)]
bump version to 1.0-2
Dietmar Maurer [Tue, 9 Jun 2015 07:32:15 +0000 (09:32 +0200)]
use Wants instead of Requires inside systemd service definitions
To avoid unnecessary restarts of dependent services.
Dietmar Maurer [Fri, 5 Jun 2015 08:04:45 +0000 (10:04 +0200)]
bump version to 1.0-1
Dietmar Maurer [Fri, 5 Jun 2015 08:02:55 +0000 (10:02 +0200)]
delete stale files
Dietmar Maurer [Fri, 5 Jun 2015 08:00:48 +0000 (10:00 +0200)]
always start crm and lrm service
Even if there is no resources.cfg. That makes it easier to
enable HA, because we don't need to start services manually.
Dietmar Maurer [Fri, 10 Apr 2015 04:54:36 +0000 (06:54 +0200)]
bump version to 0.9-3
Dietmar Maurer [Fri, 10 Apr 2015 04:51:51 +0000 (06:51 +0200)]
implement delay command for regression tester
root [Fri, 3 Apr 2015 08:26:29 +0000 (10:26 +0200)]
test of failback
Signed-off-by: Wolfgang Link <w.link@proxmox.com>
Dietmar Maurer [Fri, 10 Apr 2015 04:32:46 +0000 (06:32 +0200)]
correctly pass parameters for change_service_location
Dietmar Maurer [Fri, 10 Apr 2015 04:31:44 +0000 (06:31 +0200)]
sort output so that we can compare logs
Dietmar Maurer [Tue, 7 Apr 2015 07:52:14 +0000 (09:52 +0200)]
bump version to 0.9-2
Dietmar Maurer [Tue, 7 Apr 2015 07:50:56 +0000 (09:50 +0200)]
add warnings if ha group does not exist
Dietmar Maurer [Tue, 7 Apr 2015 04:55:02 +0000 (06:55 +0200)]
use groups parser
Dietmar Maurer [Sun, 5 Apr 2015 15:55:09 +0000 (17:55 +0200)]
avoid perl warning
Dietmar Maurer [Fri, 3 Apr 2015 17:00:22 +0000 (19:00 +0200)]
update README
Dietmar Maurer [Fri, 3 Apr 2015 14:45:49 +0000 (16:45 +0200)]
do not allow deletion of ha group if group is used
Dietmar Maurer [Fri, 3 Apr 2015 09:16:19 +0000 (11:16 +0200)]
use correct class
Dietmar Maurer [Fri, 3 Apr 2015 09:08:23 +0000 (11:08 +0200)]
complete ha group api
Dietmar Maurer [Fri, 3 Apr 2015 06:33:37 +0000 (08:33 +0200)]
api: allow using plain VMIDs as resource id
Dietmar Maurer [Fri, 3 Apr 2015 04:47:07 +0000 (06:47 +0200)]
improve status API
Dietmar Maurer [Fri, 3 Apr 2015 04:24:47 +0000 (06:24 +0200)]
remove ipaddr resource type
Dietmar Maurer [Fri, 3 Apr 2015 04:18:23 +0000 (06:18 +0200)]
bump version to 0.9-1
Dietmar Maurer [Fri, 3 Apr 2015 04:16:40 +0000 (06:16 +0200)]
rename vm resource prefix: pvevm: => vm:
Dietmar Maurer [Fri, 3 Apr 2015 04:14:04 +0000 (06:14 +0200)]
add API to query ha status
Dietmar Maurer [Thu, 2 Apr 2015 06:48:37 +0000 (08:48 +0200)]
bump version to 0.8-2
Dietmar Maurer [Thu, 2 Apr 2015 06:47:01 +0000 (08:47 +0200)]
lrm: reduce TimeoutStopSec
because systemd waits 2*TimeoutStopSec
Dietmar Maurer [Thu, 2 Apr 2015 06:43:28 +0000 (08:43 +0200)]
lrm: set systemd killmode to 'process'
We do not want to kill running VMs (for example during software update).
Dietmar Maurer [Thu, 2 Apr 2015 06:21:26 +0000 (08:21 +0200)]
bump version to 0.8-1
Dietmar Maurer [Thu, 2 Apr 2015 06:17:15 +0000 (08:17 +0200)]
correctly send cfs lock update request
Dietmar Maurer [Wed, 1 Apr 2015 09:05:25 +0000 (11:05 +0200)]
bump version to 0.7-1
Dietmar Maurer [Wed, 1 Apr 2015 07:57:03 +0000 (09:57 +0200)]
create /etc/pve/ha
Dietmar Maurer [Wed, 1 Apr 2015 07:51:48 +0000 (09:51 +0200)]
use correct package for lock_ha_config
Dietmar Maurer [Wed, 1 Apr 2015 06:20:05 +0000 (08:20 +0200)]
fix ha-manager status when ha is unconfigured
Dietmar Maurer [Wed, 1 Apr 2015 06:19:32 +0000 (08:19 +0200)]
do not unlink watchdog socket when started via systemd
Dietmar Maurer [Wed, 1 Apr 2015 06:05:01 +0000 (08:05 +0200)]
depend on systemd (build-depend on dh-systemd)
Dietmar Maurer [Wed, 1 Apr 2015 05:53:08 +0000 (07:53 +0200)]
fix json_reader
Dietmar Maurer [Tue, 31 Mar 2015 11:46:33 +0000 (13:46 +0200)]
fix dependencies
Dietmar Maurer [Fri, 27 Mar 2015 11:42:20 +0000 (12:42 +0100)]
lrm: use correct rpcenv 'ha'
Dietmar Maurer [Fri, 27 Mar 2015 11:29:56 +0000 (12:29 +0100)]
bump version to 0.6-1
Dietmar Maurer [Fri, 27 Mar 2015 11:26:26 +0000 (12:26 +0100)]
move configuration handling into PVE::HA::Config
Dietmar Maurer [Fri, 27 Mar 2015 10:40:21 +0000 (11:40 +0100)]
use cfs_read_file and cfs_write_file
Dietmar Maurer [Fri, 27 Mar 2015 08:17:15 +0000 (09:17 +0100)]
ha-manager status: include service state
Dietmar Maurer [Fri, 27 Mar 2015 08:00:53 +0000 (09:00 +0100)]
ha-manager status: add --verbose flag
Dietmar Maurer [Fri, 27 Mar 2015 07:51:41 +0000 (08:51 +0100)]
restart lrm after upgrade
Dietmar Maurer [Fri, 27 Mar 2015 07:31:41 +0000 (08:31 +0100)]
ha-manager: improve status output
Dietmar Maurer [Fri, 27 Mar 2015 07:31:13 +0000 (08:31 +0100)]
add timestamp to manager status
Dietmar Maurer [Fri, 27 Mar 2015 05:56:51 +0000 (06:56 +0100)]
update lrm status on each iteration
Dietmar Maurer [Fri, 27 Mar 2015 05:50:45 +0000 (06:50 +0100)]
update_lrm_status: add a time stamp
Dietmar Maurer [Fri, 27 Mar 2015 05:49:19 +0000 (06:49 +0100)]
cleanup lrm startup code
Dietmar Maurer [Fri, 27 Mar 2015 05:32:04 +0000 (06:32 +0100)]
depend on qemu-server
Dietmar Maurer [Fri, 27 Mar 2015 05:28:50 +0000 (06:28 +0100)]
improve docu
Dietmar Maurer [Thu, 26 Mar 2015 16:17:49 +0000 (17:17 +0100)]
remove dead code
Dietmar Maurer [Thu, 26 Mar 2015 15:47:18 +0000 (16:47 +0100)]
add another test
Dietmar Maurer [Thu, 26 Mar 2015 15:39:56 +0000 (16:39 +0100)]
add another test case
Dietmar Maurer [Thu, 26 Mar 2015 12:23:20 +0000 (13:23 +0100)]
bump version 0.5-1
Dietmar Maurer [Thu, 26 Mar 2015 12:01:27 +0000 (13:01 +0100)]
implement migrate
Dietmar Maurer [Thu, 26 Mar 2015 11:50:47 +0000 (12:50 +0100)]
implement change_service_location