Thomas Lamprecht [Fri, 30 Oct 2015 09:55:44 +0000 (10:55 +0100)]
HA API: Fix permissions
Integrate permissions into the HA API so that users other than root
can make changes.
-) create/edit/update actions need the 'Sys.Console' privilege on
the root (/) path
-) read actions need the 'Sys.Audit' privilege on the root (/) path
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
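The privilege mapping from the "HA API: Fix permissions" commit can be pictured roughly as follows. This is a minimal sketch, not the actual Proxmox code: the function name `is_allowed`, the ACL layout, the user names, and the assumption that `root@pam` is always allowed are all illustrative.

```python
# Hypothetical sketch of the privilege mapping described in the commit.
# None of these names are taken from the real HA API code.
REQUIRED_PRIVILEGE = {
    "read": "Sys.Audit",
    "create": "Sys.Console",
    "edit": "Sys.Console",
    "update": "Sys.Console",
}


def is_allowed(user, action, acl):
    """Return True if `user` may perform `action` on the HA API.

    `acl` maps user -> path -> set of privileges (assumed layout).
    All HA privileges are checked on the root path "/".
    """
    if user == "root@pam":  # assumption: root is always allowed
        return True
    needed = REQUIRED_PRIVILEGE[action]
    return needed in acl.get(user, {}).get("/", set())


# Example ACL: alice may only read, bob may also change things.
acl = {
    "alice@pve": {"/": {"Sys.Audit"}},
    "bob@pve": {"/": {"Sys.Audit", "Sys.Console"}},
}
```

With this ACL, `is_allowed("alice@pve", "update", acl)` is false, while the same call for `bob@pve` is true.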
Thomas Lamprecht [Fri, 23 Oct 2015 12:04:25 +0000 (14:04 +0200)]
exec_resource_agent: return valid exit code instead of die's
Switch from die's to logging and returning the respective exit codes.
This makes it possible to handle (i.e. fix) some errors outside
of the forked exec_resource_agent worker.
This does not change behaviour for now, as the die returned a 255
exit code. We did not check for that exit code explicitly, so it is
safe to use the new exit codes; the result is the same behaviour
for the other code (most importantly the CRM Manager class).
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
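The idea of the exec_resource_agent commit, logging and returning an exit code instead of dying, might look like the sketch below. The exit code values, the function name `run_agent_command`, and the handler table are hypothetical; the commit does not list the actual codes.

```python
import logging

# Hypothetical exit codes; the real values are not given in the commit.
SUCCESS = 0
ERROR = 1
UNKNOWN_COMMAND = 2


def run_agent_command(command, handlers):
    """Run a resource command and return an exit code instead of raising."""
    handler = handlers.get(command)
    if handler is None:
        logging.error("unknown command '%s'", command)
        return UNKNOWN_COMMAND
    try:
        handler()
    except Exception as err:
        # Previously this would have aborted the worker (exit code 255).
        logging.error("command '%s' failed: %s", command, err)
        return ERROR
    return SUCCESS


def _ok():
    pass


def _fail():
    raise RuntimeError("start failed")


# Illustrative handler table: "start" succeeds, "stop" fails.
HANDLERS = {"start": _ok, "stop": _fail}
```

A caller that only distinguishes zero from non-zero sees the same behaviour as before, but can later react to individual codes.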
Thomas Lamprecht [Mon, 12 Oct 2015 13:04:41 +0000 (15:04 +0200)]
check resource better on addition and update
Check if the resource exists in the cluster when adding it to the
HA stack.
When trying to update, migrate or delete a resource, check if it is
HA managed at all.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
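The two checks from the commit above can be sketched as follows. The function names and the use of plain sets for the cluster and HA resource lists are assumptions made for illustration only.

```python
# Hypothetical sketch; the real checks work on the cluster configuration.
def check_add(sid, cluster_resources):
    """A resource may only be added to HA if it exists in the cluster."""
    if sid not in cluster_resources:
        raise ValueError(f"resource '{sid}' does not exist in cluster")


def check_managed(sid, ha_resources):
    """Update, migrate and delete require an HA managed resource."""
    if sid not in ha_resources:
        raise ValueError(f"resource '{sid}' is not HA managed")


cluster_resources = {"vm:100", "vm:101"}  # illustrative service IDs
ha_resources = {"vm:100"}
```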
Thomas Lamprecht [Mon, 28 Sep 2015 09:34:52 +0000 (11:34 +0200)]
delete node from HA stack when deleted from cluster
When a node gets deleted from the cluster with pvecm delnode,
we set its node state in the manager status to 'gone'.
Once the state is 'gone', the manager waits an hour after the node was
last seen online and only then deletes it from the manager status.
If some HA services were forgotten on the node (shouldn't happen
at all!!), the node will be fenced, the services migrated and then the
node's state reset to 'gone'. After an hour the node will be deleted,
unless it joined the cluster again in the meantime.
Deleting a node from the HA manager status is by no means a final
act. The ha-manager could live without deleting it, but it is
confusing for the user to see dead nodes in the interface.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
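The one hour grace period for 'gone' nodes described in the commit above could be modelled like this. It is a rough sketch under assumptions: the function name, the dictionaries and the use of plain epoch seconds are illustrative and not taken from the manager code.

```python
GONE_GRACE_SECONDS = 60 * 60  # one hour, as stated in the commit message


def prune_gone_nodes(node_state, last_online, now):
    """Delete nodes that are 'gone' and were last seen online at least
    an hour ago. Mutates node_state and returns the removed node names.
    """
    removed = []
    for node, state in list(node_state.items()):
        if state == "gone" and now - last_online[node] >= GONE_GRACE_SECONDS:
            del node_state[node]
            removed.append(node)
    return removed


# Illustrative data: node1 has been gone for a long time, node3 only briefly.
node_state = {"node1": "gone", "node2": "online", "node3": "gone"}
last_online = {"node1": 0, "node2": 4000, "node3": 3000}
removed = prune_gone_nodes(node_state, last_online, now=4000)
```

A node that rejoins the cluster would get a state other than 'gone' and a fresh last-online time, so it is skipped by the check.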
Groups: correctly set optional flag in propertyList
Only group and type should be required, all other properties should
be marked optional inside propertyList. We can set the correct values
for the optional flag inside options().
Thomas Lamprecht [Wed, 16 Sep 2015 09:25:18 +0000 (11:25 +0200)]
fix includes from services
The crm and lrm daemon executables need to include SafeSyslog, as
they use syslog in their signal handler.
It is no longer needed in the Service class of the daemons.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 16 Sep 2015 09:25:15 +0000 (11:25 +0200)]
implement recovery policy for services
We implement recovery policies which use settings known from
rgmanager. However, the behaviour is not strictly the same;
our approach is more configurable. For example, rgmanager cannot
combine its restart and relocate policy.
The following policy settings kick in on a failed
service start:
* max_restart: maximal number of tries to restart a failed service
  on the current node. The default is 1 restart try.
  This policy gets enforced by the LRM.
* max_relocate: maximal number of tries to relocate the service to
  a different node. A relocate only takes place after
  the max_restart value is exceeded on the current node.
  This policy gets enforced by the CRM.
If a service is still not running after all max tries, its state
gets set to 'error'. This means that the service needs to be checked
and disabled manually.
*Note* that the relocate state will only reset when the service has had
at least one successful start. That means if a service is re-enabled
without fixing the error, only the restart policy gets repeated.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
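How max_restart and max_relocate interact in the recovery policy commit above can be sketched as a small decision function. This is a simplified model, not the LRM/CRM code: the function names are hypothetical, and resetting the restart counter after each relocation is an assumption derived from the "on the current node" wording.

```python
def next_action(restart_tries, relocate_tries, max_relocate, max_restart=1):
    """Decide what to do after a failed service start.

    max_restart defaults to 1 as in the commit message; no default is
    stated there for max_relocate, so it must be passed explicitly.
    """
    if restart_tries < max_restart:
        return "restart"   # enforced by the LRM on the current node
    if relocate_tries < max_relocate:
        return "relocate"  # enforced by the CRM
    return "error"         # needs manual intervention


def simulate_failed_starts(max_restart, max_relocate):
    """List the actions taken if every start attempt keeps failing."""
    actions = []
    restart_tries = relocate_tries = 0
    while True:
        action = next_action(restart_tries, relocate_tries,
                             max_relocate, max_restart)
        actions.append(action)
        if action == "restart":
            restart_tries += 1
        elif action == "relocate":
            relocate_tries += 1
            restart_tries = 0  # assumption: counted per node
        else:
            return actions
```

For max_restart=1 and max_relocate=1 the model yields one restart, one relocation, one more restart on the new node and then the 'error' state.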
Adding PVECT resource class so that CT can be HA managed
Extend the PVEVM resource class and add a PVECT resource class so
that service type specific operations (e.g. start, migrate, ...)
can be handled through a plugin, keeping the calling code independent
of the service type.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
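The plugin approach of the PVECT commit above can be illustrated with a small class hierarchy. This sketch only mirrors the idea: the class names, the type keys 'vm' and 'ct', the registry and the returned strings are all assumptions and do not reflect the actual PVEVM and PVECT classes.

```python
class ResourcePlugin:
    """Base class: one plugin per service type (hypothetical interface)."""

    def start(self, sid):
        raise NotImplementedError

    def migrate(self, sid, target):
        raise NotImplementedError


class VMPlugin(ResourcePlugin):
    def start(self, sid):
        return f"start VM {sid}"

    def migrate(self, sid, target):
        return f"migrate VM {sid} to {target}"


class CTPlugin(ResourcePlugin):
    def start(self, sid):
        return f"start CT {sid}"

    def migrate(self, sid, target):
        return f"migrate CT {sid} to {target}"


# Registry keyed by service type; the keys are illustrative.
PLUGINS = {"vm": VMPlugin(), "ct": CTPlugin()}


def start_service(service_type, sid):
    """The caller never needs to know which service type it handles."""
    return PLUGINS[service_type].start(sid)
```

Adding a further service type then only requires a new plugin class and a registry entry.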