git.proxmox.com Git - pve-cluster.git/log

pvecm: add: create task log on cluster join

The API join path creates a task log when joining a cluster.
Also create such a log in the CLI code path.

Changes are mostly indentation only.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Tue, 9 Jan 2018 14:31:00 +0000 (15:31 +0100)]

lock locally on cluster create and join

If we are not part of a cluster we do not need to worry about other
members messing with the config. But there may be local contenders,
e.g., two automation script instances started in parallel by mistake
or two admin (sessions) which start a create or join cluster request
at the same time.
Reuse the local flock for this purpose.

lock_file silences an exception but does not alter it, so we die if
$@ is set, to ensure a worker gets to know that something bad
happened.
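
The local locking idea can be sketched as follows. This is an illustration in Python, not the project's Perl code: the helper names `local_lock` and `run_locked` are hypothetical, and failing immediately on contention (`LOCK_NB`) is an assumption made for the sketch, not necessarily what lock_file does.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def local_lock(path):
    """Hold an exclusive, non-blocking flock on `path` for the duration of the block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # LOCK_NB (an assumption of this sketch) makes a second local
        # contender fail right away instead of waiting for the first one.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def run_locked(path, action):
    """Run `action` under the lock; any error propagates to the caller."""
    with local_lock(path):
        return action()
```

Errors raised inside the locked section are not swallowed here, which mirrors the intent of dying when $@ is set so that a worker notices the failure.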

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 18 Dec 2017 14:30:10 +0000 (15:30 +0100)]

use resolved IP address for ring0_addr as default

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 1 Dec 2017 12:57:42 +0000 (13:57 +0100)]

api/cluster: add endpoint to GET cluster join information

Returns all information relevant for securely joining this cluster
over the API via the currently connected node: address, fingerprint,
totem config section and (not directly needed but possibly useful)
cluster configuration digest.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 26 Jan 2018 12:07:57 +0000 (13:07 +0100)]

factor out common parameter definitions

Besides the obvious reduction of duplicated code, this also
streamlines the descriptions.

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Tue, 12 Dec 2017 16:15:56 +0000 (17:15 +0100)]

api/cluster: create cluster in forked worker

Creating a cluster may take a while: we need to gather random
data for the corosync authkey, restart services and such.
As we're now exposed in the API, the 30 second response limit from
pveproxy is a big reason to do this. We also get a task log entry
with this, which is nice.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 1 Dec 2017 12:56:57 +0000 (13:56 +0100)]

move cluster create to API

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 1 Dec 2017 12:33:09 +0000 (13:33 +0100)]

cluster create: restart corosync & pmxfs in one go and say so

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 1 Dec 2017 12:31:42 +0000 (13:31 +0100)]

cluster create: factor out initial corosync config assembly

Easier to read and work with than heredoc text with
other string variables in there.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 27 Nov 2017 11:53:46 +0000 (12:53 +0100)]

pvecm add: use API by default to join cluster

Default to using the API for an add-node procedure.

But allow the user to manually fall back to the legacy SSH method.
Also fall back if the API detects a peer that is not up to date; this
is done by checking for the 501 HTTP_NOT_IMPLEMENTED response code.
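
A minimal sketch of this fallback logic, written in Python for illustration only. `ApiError`, `join_cluster` and the two callables are hypothetical names, not part of pvecm; only the 501 check comes from the description above.

```python
HTTP_NOT_IMPLEMENTED = 501


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed API call."""

    def __init__(self, status, message=""):
        super().__init__(message or f"API call failed with status {status}")
        self.status = status


def join_cluster(join_via_api, join_via_ssh, use_ssh=False):
    """Try the API join first; use SSH if requested or if the peer lacks the endpoint."""
    if use_ssh:
        return join_via_ssh()
    try:
        return join_via_api()
    except ApiError as err:
        if err.status != HTTP_NOT_IMPLEMENTED:
            raise  # a real failure, do not hide it behind the fallback
        return join_via_ssh()
```

Only the 501 case triggers the legacy path; every other API error is passed on unchanged.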

This could be removed in a later major release, e.g. 6.0.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 27 Nov 2017 09:55:14 +0000 (10:55 +0100)]

api/cluster: add join endpoint

Add an endpoint to the API which allows joining an existing PVE
cluster by only using the API instead of CLI tools (pvecm).

Use a worker as this operation may need longer than 30 seconds.
With the worker we also get a task log entry/window for a UI for
free, allowing us to give better feedback.

The join helper will be reused by the CLI handler in a later patch.
It is based on the CLI handler's behaviour, but swaps the SSH parts
for API calls.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 20 Nov 2017 14:04:56 +0000 (15:04 +0100)]

return cluster config and authkey in addnode API call

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 18 Dec 2017 14:13:28 +0000 (15:13 +0100)]

use run_command instead of system

Perl's system wants to open /dev/tty, which is not available in forked
workers. Use our run_command instead.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 18 Dec 2017 14:20:16 +0000 (15:20 +0100)]

assert_joinable: simplify error and warning handling

remove the if check for force, as we handle this on another level and
want to record errors independent of the value of force.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 26 Jan 2018 10:40:29 +0000 (11:40 +0100)]

node add: factor out local joining steps

Factor out the code which finishes the join to a cluster on the
joinee side, after a cluster member approved the join request and
supplied us with the necessary information.

Will be used by API and the SSH join code paths.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 23 Nov 2017 13:29:37 +0000 (14:29 +0100)]

node add: factor out checks for joining

Factor out the code which checks if the node can join another
cluster. It will be used by the new API endpoint to join a cluster
but stays also in the CLIHandler as we keep the old legacy SSH method
for a bit.

This is not a completely 1:1 move, I changed:
* &$error(...) to $error->(...)
* removing a few empty lines, where code was so spread out that those
lines resulted in the opposite of what they intended, i.e., less
readability

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Tue, 12 Dec 2017 15:27:45 +0000 (16:27 +0100)]

tell cluster log when adding/deleting a node

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 20 Nov 2017 12:08:40 +0000 (13:08 +0100)]

move addnode/delnode from CLI to cluster config API

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Tue, 30 Jan 2018 08:58:22 +0000 (09:58 +0100)]

datacenter.cfg: add bwlimit

This will define the global defaults which can be overridden
by per-storage limits.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Mon, 22 Jan 2018 14:27:18 +0000 (15:27 +0100)]

bump version to 5.0-20

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 22 Jan 2018 09:52:12 +0000 (10:52 +0100)]

avoid harmful '<>' pattern, explicitly read from STDIN

Fixes problems in CLIHandler using the code pattern:

while (my $line = <>) {
...
}

To see why this causes problems only _now_, let's first look at how <>
behaves:

"The null filehandle <> is special: [...] Input from <> comes either
from standard input, or from each file listed on the command line.
Here's how it works: the first time <> is evaluated, the @ARGV array
is checked, and if it is empty, $ARGV[0] is set to "-" , which when
opened gives you standard input. The @ARGV array is then processed
as a list of filenames." - 'perldoc perlop'

Recent changes in the CLIHandler code changed how we modified @ARGV.
Earlier we assumed that the first argument must be the command and
thus shifted it out of @ARGV; now we can have multiple levels of
(sub)commands. This change also changed how we handle @ARGV: we do
not shift anything but go through the arguments until we get to
the final command and copy the rest of @ARGV, as we know that this
must be the command's arguments.

For '<>' this means that @ARGV was still fully populated and perl
tried to open each element as a file, which naturally failed.
Thus the change in pve-common only exposed this 'dangerous' code
pattern.
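
The same pitfall exists outside Perl. As an illustration only, Python's `fileinput.input()` behaves much like `<>`: with no explicit file list it treats every remaining command line argument as a file name. The two function names below are hypothetical.

```python
import fileinput
import sys


def read_lines_implicit():
    """Like Perl's <>: treats every leftover argument as a file name."""
    with fileinput.input() as lines:
        return [line.rstrip("\n") for line in lines]


def read_lines_explicit(stream=None):
    """Always read from standard input, regardless of the argument list."""
    stream = sys.stdin if stream is None else stream
    return [line.rstrip("\n") for line in stream]
```

With leftover (sub)command arguments still in the argument list, the implicit variant fails while trying to open them as files; the explicit variant is unaffected.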

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 22 Dec 2017 13:34:32 +0000 (14:34 +0100)]

add pmxcfs restart detection heuristic for IPCC

Allow clean pmxcfs restarts to be fully transparent for the Perl
stack above that uses IPCC.

A restart of pmxcfs invalidates the connection cache. We only set the
cached connection to NULL in this case, and the next call to
ipcc_send_rec would connect anew.
Further, such a restart may need quite a bit of time (seconds).

Thus write a status file to flag a possible restart when terminating
from pmxcfs. Delete this flag file once we're up and ready again.
Error case handling is described further below.

If a new connection fails and this flag file exists, then retry
connecting for a certain period (for now five seconds).

If a cached connection fails, always retry once, as every pmxcfs
restart makes the cached connection invalid even if IPCC is
fully up and ready again. Then also follow the connection polling
heuristic if the restart flag exists, as new connections do.

We use the monotonic clock to avoid problems if the (system) time
changes and to keep things as easy as possible.

We delete the flag file if an IPCC call could not connect in the grace
period, but only if the file is still the same, i.e., no one else has
deleted and recreated it in the meantime (e.g. a second cfs restart).
This guarantees that IPCC calls try this heuristic only for a limited
time (5 seconds until the first failed one) if the cfs does not
start again.

Further, as the flag resides in /run/... - which is always a tmpfs
(thus in memory and thus cleaned upon reboot) - we cannot run into
leftover flag files on a node reset, e.g. done by the HA watchdog for
self-fencing.
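
The polling part of the heuristic could look roughly like this. It is a Python sketch, not the C implementation: the function names are hypothetical, comparing inode and mtime is an assumed way to decide whether the flag file "is still the same", and the retry-once rule for cached connections is left out.

```python
import os
import time

GRACE_SECONDS = 5.0


def flag_identity(flag_path):
    """Return (inode, mtime_ns) of the flag file, or None if it does not exist."""
    try:
        st = os.stat(flag_path)
    except FileNotFoundError:
        return None
    return (st.st_ino, st.st_mtime_ns)


def connect_with_restart_grace(connect, flag_path, grace=GRACE_SECONDS, poll=0.1):
    """Call connect(); while the restart flag exists, retry until `grace` runs out."""
    seen = flag_identity(flag_path)
    deadline = time.monotonic() + grace  # monotonic: immune to wall-clock jumps
    while True:
        try:
            return connect()
        except ConnectionError as err:
            last_error = err
        if seen is None or time.monotonic() >= deadline:
            break
        time.sleep(poll)
    # Give up. Remove the flag only if it is still the file seen at the start,
    # so a flag recreated by a second restart is left alone.
    if seen is not None and flag_identity(flag_path) == seen:
        try:
            os.unlink(flag_path)
        except FileNotFoundError:
            pass
    raise last_error
```

Without a flag file the first failure is reported immediately, so the grace period only costs time when a restart was actually announced.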

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 22 Dec 2017 13:34:31 +0000 (14:34 +0100)]

pmxcfs: do not wait artificially when stopping

Most PVE services profit from an up and running pmxcfs, thus
artificially prolonging its graceful termination is
counterproductive.

As everything gets de-initialized nicely and gracefully this should
not be problematic; the restart is not the fastest anyway.

As this specific line has no git tracking information available
(imported from svn) I could not find out whether there was a reason
for this, and if so, what it was.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Dominik Csapak [Thu, 21 Dec 2017 11:52:53 +0000 (12:52 +0100)]

deprecate and map 'applet' console setting in datacenter.cfg

we do not use the applet anymore, and setting it throws an error
in the gui when clicking the console button

map it to 'html5' and mark it deprecated, so that we can remove
that setting in the next major release

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Wed, 20 Dec 2017 09:35:21 +0000 (10:35 +0100)]

postinst: stop LRM before CRM in workaround

may help in a single node HA cluster which, while not usable for real
HA, can still be found in the wild and makes sense as an "ensure my
VM/CT stays started" manager.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Thu, 7 Dec 2017 14:20:54 +0000 (15:20 +0100)]

bump version to 5.0-19

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Thu, 7 Dec 2017 14:16:28 +0000 (15:16 +0100)]

whitespace fixup

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 7 Dec 2017 14:10:02 +0000 (15:10 +0100)]

ensure problematic ha service is stopped during update

Add a postinst file which stops, if running, the ha service before it
configures pve-cluster and starts them again, if enabled.
Do this only if the version installed before the upgrade is <= 2.0-3

dpkg-query has Version and Config-Version

Version is at this time the new unpacked version already, so we need
to check both to catch all cases.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Thu, 7 Dec 2017 07:11:54 +0000 (08:11 +0100)]

bump version to 5.0-18

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Thu, 7 Dec 2017 07:09:51 +0000 (08:09 +0100)]

whitespace fixup

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Fabian Grünbichler [Wed, 6 Dec 2017 19:39:49 +0000 (20:39 +0100)]

build: fix sysctl.d install path

and remove the directory before installing the snippet when upgrading
from a broken version (and if the incorrect directory exists).

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 1 Dec 2017 12:25:12 +0000 (13:25 +0100)]

datacenter.cfg write: retransform migration property to string

We use parse_property_string in the parser to make life easier for
code working with the migration format, but we did not retransform
it back when writing datacenter.cfg

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 1 Dec 2017 10:09:01 +0000 (11:09 +0100)]

pvecm create: there is no rrp_mode param anymore

We removed the possibility to pass a rrp_mode param to the create call
in commit 606a890448f0a6219db3d1c19b98960a3dcbcaa8 so this check is
rather useless.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Tue, 28 Nov 2017 15:42:03 +0000 (16:42 +0100)]

bump version to 5.0-17

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Fabian Grünbichler [Mon, 27 Nov 2017 08:48:03 +0000 (09:48 +0100)]

handle Net::SSLeay errors correctly

commit | commitdiff | tree

Thomas Lamprecht [Thu, 23 Nov 2017 08:42:58 +0000 (09:42 +0100)]

factor out reading a node's SSL cert fingerprint

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 23 Nov 2017 08:57:59 +0000 (09:57 +0100)]

setup_sshd_config: remove useless start_sshd parameter

This controlled if we use reload-or-restart or try-reload-or-restart.
They differ in the following way:
> reload-or-restart - Reload one or more units if possible, otherwise
> start or restart
>
> try-reload-or-restart - Reload one or more units if possible,
> otherwise (re)start if active

Under PVE we normally need a running ssh for a node/cluster to work;
there is no case where it should be stopped, especially not for
this method, which is normally called when setting up or joining a
cluster.
So always use 'reload-or-restart'.

Semantically reverts: 6c0e95b3

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 23 Nov 2017 11:12:05 +0000 (12:12 +0100)]

pvecm: add/delete: local lock & avoid problems on first node addition

cfs_lock is per node, thus we had a possibility for a node addition
race if the process was started on the same node (e.g. by a
script/ansible/...).

So always request a local lock first, if that is acquired check how
many members currently reside in the cluster and then decide if we
can directly execute the code (single node cluster = no contenders)
or must hold the lock.

One may think that there remains a race when adding a node to single
node cluster, i.e., once the node is added it may itself be a target
for another joining node. But this cannot happen as we only tell the
joining node that it could be added once we already *have* added it
locally.

Besides the defense against a race if two users execute a node
addition to the same node at the same time, this also addresses an
issue where the cluster lock could not be removed after writing the
corosync conf, as pmxcfs and corosync triggered a config reload and
added the new node, which itself did not yet know that it was
accepted in the cluster. Thus, the former single node cluster expects
two nodes but has only one for now, until the other node has pulled
the config and authkey and started up its cluster stack.

That resulted in a failing removal of the corosync lock, thus adding
another node did not work until this lock timed out (~2 minutes).

While node additions are often separated by more than a 2 minute
interval, deployment helpers (or fast admins, for that matter) may
trigger this easily.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 23 Nov 2017 11:12:04 +0000 (12:12 +0100)]

pvecm: check early if the deletion cannot work

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Wed, 22 Nov 2017 07:31:31 +0000 (08:31 +0100)]

bump version to 5.0-16

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 16 Nov 2017 14:27:53 +0000 (15:27 +0100)]

pvecm: module cleanup: remove unused modules

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 16 Nov 2017 14:27:52 +0000 (15:27 +0100)]

pvecm: module cleanup: use our get_host_address_family

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 16 Nov 2017 14:27:51 +0000 (15:27 +0100)]

pvecm: remove unused variable

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 17 Nov 2017 14:04:21 +0000 (15:04 +0100)]

buildsys: autogen debug package and cleanup unnecessary rules

don't do manually what the deb helpers do automatically and better.

Autogenerate the debug package; it now includes only the debug symbols
without effectively duplicating all executables and libraries.
In the same step add an install file which installs our sysctl
settings. This is done together as it allows skipping some
intermediate steps; also, the change is not too big, so it should be
possible to see what's going on.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 17 Nov 2017 14:04:20 +0000 (15:04 +0100)]

buildsys: remove traces from unused rsyslog config

has not been included in the package for a long time, if ever, and has
no purpose currently.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 17 Nov 2017 14:04:19 +0000 (15:04 +0100)]

buildsys: rules: remove outdated variables

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 17 Nov 2017 14:02:24 +0000 (15:02 +0100)]

buildsys: use autoreconf

do not manually call autogen/configure just use dh_autoreconf

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 17 Nov 2017 14:02:23 +0000 (15:02 +0100)]

buildsys: remove outdated postrm

we did nothing special here so just use the debhelper generated one

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 17 Nov 2017 14:02:22 +0000 (15:02 +0100)]

remove unused sysv init.d service file

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 17 Nov 2017 14:02:21 +0000 (15:02 +0100)]

pve-cluster service: remove $DAEMON_OPTS and environment file

"We use it to pass $DAEMON_OPTS variable to pmxcfs, which is empty per
default.

pmxcfs accepts following options at the moment:

> Usage:
>   pmxcfs [OPTION...]
>
> Help Options:
>   -h, --help           Show help options
>
> Application Options:
>   -d, --debug          Turn on debug messages
>   -f, --foreground     Do not daemonize server
>   -l, --local          Force local mode (ignore corosync.conf,
>   force quorum)

"help" can be safely ignored, as can "foreground" - it would break
the service as the Type is forking and thus systemd would expect that
the starting process returns rather quickly and kill it after the
timeout thinking the start failed when this is set.

Then there is "debug", while this is quite safe to use I do not
expect to find it in the wild. 1) it *spams* the logs in such a heavy
manner that it's useless for a user 2) if wished it can be
enabled/disabled on the fly by writing 1/0 to /etc/pve/.debug (which
is what we tell people in docs, et al.  anyway) So this parameter is
also quite safe to ignore, IMO.

Then there is "local", now this can be enabled but is such evil
and wrong so that anybody setting it *permanently*, deserves to
be saved by not allowing him to do so. If somebody uses this we
should hope that he did out of not knowing better and is actually
thankful when he learns what this meant."
https://pve.proxmox.com/pipermail/pve-devel/2017-November/029527.html

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 20 Nov 2017 07:42:46 +0000 (08:42 +0100)]

fix #1566: do not setup ssh config in updatecerts call

pvecm updatecerts gets called on each pve-cluster.service start,
thus at least on each node boot and on each pve-cluster update.

updatecerts contained a call to setup_sshd_config, which ensured that
the sshd_config parameter 'PermitRootLogin' gets set to yes, with the
intent that this is needed for a working cluster.
But the now more common and secure options 'prohibit-password'
and 'without-password' are also OK for a cluster to work properly.

This change was added by 6c0e95b3, without clear indication why; our
installer enforces this setting already, as does a cluster create and
a join to a cluster.

To allow a user to use the more secure setting, remove the call from
updatecerts again; thus he only needs to change this after cluster
create/add operations, on one node only.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Fri, 17 Nov 2017 09:07:31 +0000 (10:07 +0100)]

ssh helper: disable escape keys

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 9 Nov 2017 12:24:05 +0000 (13:24 +0100)]

fix #1559: pmxcfs: add missing lock when dumping .rrd

Adding our standard mutex for protecting cfs_status from multiple
conflicting changes solves two things. First, it now protects
cfs_status as it was changed here, and second, it protects the
global rrd_dump_buf and rrd_dump_last helper variables from
inconsistent access and a possible double free.

Fixes: #1559
Reported-by: Tobias Böhm <tb@robhost.de>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 10 Nov 2017 09:24:24 +0000 (10:24 +0100)]

cfs_lock: add missing trailing newline

When we do not instantly get the lock we print a respective message
to stderr. This also shows up in the task logs, and if it's the last
message before a 'Task OK' the UI gets confused and shows the task as
erroneous.

Keep the message as it's good feedback for the user to see why an op
seems to do nothing, so simply add a trailing newline.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Thu, 9 Nov 2017 11:12:26 +0000 (12:12 +0100)]

deps: we now break pve-ha-manager < 2.0-4

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Thu, 9 Nov 2017 09:36:51 +0000 (10:36 +0100)]

cfs_lock: subtract sleep time from rest timeout

We take the left-over timeout returned from alarm() and then
sleep for a second, so when continuing the alarm timeout we
need to subtract that second for consistency.
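
The bookkeeping can be sketched like this in Python, whose `signal.alarm` has the same "returns the seconds left" behaviour as Perl's alarm. The function name is hypothetical, and clamping the re-armed value to at least one second is a choice made for this sketch, not something taken from the patch.

```python
import signal
import time


def pause_alarm_and_sleep(seconds=1):
    """Suspend a pending alarm, sleep, then re-arm it minus the time slept."""
    remaining = signal.alarm(0)  # cancels the alarm, returns the seconds left
    time.sleep(seconds)
    if remaining > 0:
        # Re-arm with at least 1s: alarm(0) would cancel instead of firing.
        signal.alarm(max(remaining - seconds, 1))
    return remaining
```

Without the subtraction, every retry round would silently extend the overall timeout by the time spent sleeping.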

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 9 Nov 2017 08:47:27 +0000 (09:47 +0100)]

cfs_lock: save and restore outer alarm

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 9 Nov 2017 08:47:26 +0000 (09:47 +0100)]

cfs_lock: always include lockid in error

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 9 Nov 2017 08:47:25 +0000 (09:47 +0100)]

cfs_lock: swap checks for specific errors with $got_lock

We checked if a specific error was set or, respectively, not set to
know if we got the lock or not.
The check if we may unlock again was negated and thus could lead to
problems, in specific - rather unlikely - cases.

Use the $got_lock variable added by the previous patch instead; it
only gets set when we really got the lock.

While refactoring for the new variable, set the $noerr parameter of
check_cfs_quorum() as we do not want to die here.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 9 Nov 2017 08:47:24 +0000 (09:47 +0100)]

cfs_lock: address race where alarm triggers with lock acquired

As mkdir can possibly hang forever we need to enforce a timeout on
it. But this was done in such a way that a small time window
existed where the lock could be acquired successfully but the alarm
still triggered, leaving around an unused lock for 120 seconds.

Wrap only the mkdir call itself in an alarm and save its result
directly in a $got_lock variable; this minimizes the window as far as
possible from the perl side.

This is also easier to track for humans reading the code and should
cope better with code changes, e.g., it does not break just because an
error message typo got corrected a few lines above.
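
A rough Python rendering of the pattern, for illustration only. The names `LockTimeout` and `try_mkdir_lock` are hypothetical, the default timeout is arbitrary, and the sketch must run in the main thread because it installs a signal handler. As in the Perl version, a tiny window between the successful mkdir and recording the result remains.

```python
import os
import signal


class LockTimeout(Exception):
    """Raised by the SIGALRM handler when mkdir takes too long."""


def _on_alarm(signum, frame):
    raise LockTimeout()


def try_mkdir_lock(lock_dir, timeout=10):
    """Return True if we created `lock_dir`, False if it is taken or we timed out."""
    got_lock = False
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    try:
        signal.alarm(timeout)
        try:
            # Only the mkdir itself runs under the alarm.
            os.mkdir(lock_dir)
            got_lock = True
        finally:
            signal.alarm(0)
    except (FileExistsError, LockTimeout):
        pass
    finally:
        signal.signal(signal.SIGALRM, old_handler)
    return got_lock
```

Later code can then branch on the boolean alone instead of matching on specific error strings.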

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fabian Grünbichler [Tue, 17 Oct 2017 13:07:09 +0000 (15:07 +0200)]

bump version to 5.0-15

commit | commitdiff | tree

Wolfgang Bumiller [Wed, 11 Oct 2017 12:24:56 +0000 (14:24 +0200)]

cluster: improve error handling when reading files

When querying file contents via IPC we return undef if the
file does not exist, but also on any other error. This is
potentially problematic as the ipcc_send_rec() xs function
returns undef on actual errors as well, while setting $!
(errno).

It's better to die in cases other than ENOENT. Before this,
pvesr would assume an empty replication config and an empty
vm list if pmxcfs wasn't running, which could then clear out
the entire local replication state file.
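
The distinction between "file does not exist" and "something went wrong" can be sketched as below. This is a Python illustration; `read_cluster_file` and the `fetch` callable are hypothetical stand-ins for the IPC query, not the module's actual interface.

```python
import errno


def read_cluster_file(fetch, path):
    """Return the file contents, None if the file is missing; raise on other errors."""
    try:
        return fetch(path)
    except OSError as err:
        if err.errno == errno.ENOENT:
            return None  # a missing file is a normal, expected answer
        raise  # anything else must not be mistaken for "empty"
```

A caller such as a replication job can then treat None as "no config" while still aborting when the backend is unreachable.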

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Wed, 11 Oct 2017 12:24:57 +0000 (14:24 +0200)]

cluster: cfs_update: option to die rather than warn

It can be useful to know whether we actually have an empty
vm list or whether the last cfs_update call simply failed.
Previously this only warned.

This way we can avoid a nasty type of race condition. For
instance in pvesr where it's possible that the vm list query
fails while everything else worked (eg. if the pmxcfs was
just starting up, or died between the queries), in which
case it would assume there are no guests and the
purge-old-states step would clear out the entire local state
file.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Thu, 21 Sep 2017 12:26:31 +0000 (14:26 +0200)]

bump version to 5.0-14

commit | commitdiff | tree

Thomas Lamprecht [Thu, 21 Sep 2017 12:08:00 +0000 (14:08 +0200)]

cfs-func-plug: use RW lock for safe cached data access

fuse may spawn multiple threads if there are concurrent accesses.

Our virtual files, e.g. ".members", ".rrd", are registered over our
"func" cfs plug which is a bit special.

For each unique virtual file there exists a single cfs_plug_func_t
instance, shared between all threads.
As we directly operated unlocked on members of this structure
parallel accesses raced between each other.
This could result in quite visible problems like a crash after a
double free (Bug 1504) or in less noticeable effects where one thread
may read from an inconsistent, or already freed memory region.

Add a Reader/Writer lock to efficiently address this problem.
Other plugs implement more functions and use a mutex to ensure
consistency and thus do not have this problem.

Fixes: #1504
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Thu, 21 Sep 2017 07:44:29 +0000 (09:44 +0200)]

bump version to 5.0-13

commit | commitdiff | tree

Thomas Lamprecht [Wed, 20 Sep 2017 13:11:05 +0000 (15:11 +0200)]

test: add test for legacy corosync.conf

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Wed, 20 Sep 2017 13:11:04 +0000 (15:11 +0200)]

corosync: add atomic_write_conf and cleanup

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Wed, 20 Sep 2017 13:11:03 +0000 (15:11 +0200)]

corosync: transform config to allow easier access

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Wed, 20 Sep 2017 13:11:02 +0000 (15:11 +0200)]

corosync config parser: move to hash format

The old parser itself was simple and easy but resulted in quite a bit
of headache when changing corosync config sections, especially if
multiple section levels should be touched.

Move to a more practical internal format which represents the
corosync configuration as a hash.
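
To illustrate what a nested, hash-like representation buys, here is a small Python sketch that parses the `name { ... }` / `key: value` syntax of corosync.conf into nested dictionaries. The layout chosen here (every section stored in a list under its name, so repeated sections such as several node blocks survive) is an assumption of this example and not necessarily the format the patch introduces.

```python
def parse_corosync_conf(text):
    """Parse 'key: value' lines and 'name { ... }' sections into nested dicts."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            name = line[:-1].strip()
            section = {}
            # Sections are collected in a list so repeated names are kept.
            stack[-1].setdefault(name, []).append(section)
            stack.append(section)
        elif line == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        else:
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"cannot parse line: {raw!r}")
            stack[-1][key.strip()] = value.strip()
    if len(stack) != 1:
        raise ValueError("missing '}'")
    return root
```

With such a structure, changing a value several levels deep is a plain lookup and assignment rather than a walk over a flat list of sections.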

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Wed, 20 Sep 2017 12:17:59 +0000 (14:17 +0200)]

buildsys: remove autogenerated files

commit | commitdiff | tree

Wolfgang Bumiller [Wed, 20 Sep 2017 12:15:31 +0000 (14:15 +0200)]

pvecm addnode: pass code reference correctly

commit | commitdiff | tree

Thomas Lamprecht [Mon, 18 Sep 2017 08:32:53 +0000 (10:32 +0200)]

pvecm: import often needed run_command

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 18 Sep 2017 08:32:52 +0000 (10:32 +0200)]

pvecm: remove Data::Dumper

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Thu, 14 Sep 2017 07:51:51 +0000 (09:51 +0200)]

fixup: escape @ in double quoted string

commit | commitdiff | tree

Fabian Grünbichler [Wed, 31 May 2017 07:38:00 +0000 (09:38 +0200)]

update SSH Ciphers for Debian Stretch

blowfish, 3des and arcfour are not enabled by default on the
server side anyway.

on most hardware, AES is about 3 times faster than Chacha20
because of hardware accelerated AES, hence the changed order
of preference compared to the default.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

commit | commitdiff | tree

Alwin Antreich [Wed, 23 Aug 2017 08:49:29 +0000 (10:49 +0200)]

fix #1486 pmxcfs spelling mistake

Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 3 Aug 2017 15:11:18 +0000 (17:11 +0200)]

pvecm mtunnel: factor out run command

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 18 Aug 2017 09:21:18 +0000 (11:21 +0200)]

limit tasklist to the maximal pmxcfs status entry size

We tried to limit the size of the tasklist by including non-running
tasks only if we have less than 25 entries. A reason, among others,
was that a single status entry in the cfs_status.kvhash is limited to
32 KiB.

The "max. 25 entry" heuristic assumes that entries are small, which
is also the norm. But the entry of a failed task, e.g. a Qemu VM with a
problematic command line, can be far longer than the usual task entry.

This led to a situation where the last 25 tasks were bigger than
32 KiB, so the ipcc call to the pmxcfs failed with EFBIG.
This then aborted every new task run with fork_worker, and could
render a node partially unusable until "/var/log/pve/tasks/active"
got truncated.

To reproduce this issue quickly, do:

# qm create 11109 --args "'$(dd if=/dev/urandom bs=1024 count=1 2>/dev/null | base64 -w 0)'"
# while true; do qm start 11109; done

You should soon see an "ipcc_send_rec failed: File too large" error.
After this all new tasks fail, even if they could succeed. pvestatd
also fails to broadcast the tasklist now. To get out of this, truncate
"/var/log/pve/tasks/active".

To address this, check the length of the serialized list and remove
elements from its end until it no longer exceeds the size limit.

Currently running tasks and chronologically newer ones will get
prioritized.
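
The trimming step described above can be sketched roughly as follows.
This is a minimal illustration in Python, not the Perl code of the
commit. It assumes the list is serialized as JSON, that it is already
ordered with running and newer tasks first, and that the limit is
inclusive; the function name and the entry layout are hypothetical.

```python
import json

# 32 KiB limit of a single status entry, as named in the commit message.
MAX_STATUS_ENTRY_SIZE = 32 * 1024


def limit_tasklist(tasks, max_size=MAX_STATUS_ENTRY_SIZE):
    """Drop entries from the end until the serialized list fits.

    Assumes `tasks` is ordered with running and newer tasks first, so
    removing from the end discards the oldest finished tasks.
    """
    kept = list(tasks)
    # Re-serialize after each removal; simple rather than fast.
    while kept and len(json.dumps(kept).encode("utf-8")) > max_size:
        kept.pop()
    return kept
```

Re-serializing on every iteration is quadratic in the worst case, but
with a few dozen entries that cost should be negligible.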

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 23 Jun 2017 08:37:37 +0000 (10:37 +0200)]

cleanup outdated build files/links

All but the ChangeLog file are dead links, the correct and current
ones will get generated by autotools in the build directory, so
remove them here.

This ChangeLog file was unused for quite some years so remove it too.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Dominik Csapak [Mon, 7 Aug 2017 14:04:24 +0000 (16:04 +0200)]

fix #1472: fix rrd file path

upstream rrd-tools changed the syntax for the perl binding,
we now have to supply '-' as the path despite what the documentation
says (it says to supply an empty path, which is what we did)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Wed, 2 Aug 2017 11:27:30 +0000 (13:27 +0200)]

pvecm delnode: pass code reference correctly

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Tue, 25 Jul 2017 11:45:07 +0000 (13:45 +0200)]

ipcc_send_rec*: include msgid in error

else we often may have no idea which request failed at all...

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 13 Jul 2017 13:01:02 +0000 (15:01 +0200)]

pvecm: lock corosync config on addition and deletion

This avoids potential races which would lead to an inconsistent
corosync config.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Dietmar Maurer [Wed, 12 Jul 2017 10:00:54 +0000 (12:00 +0200)]

bump version to 5.0-12

commit | commitdiff | tree

Thomas Lamprecht [Wed, 12 Jul 2017 09:53:16 +0000 (11:53 +0200)]

ssh_merge_known_hosts: also add entry if current sshkey does not match

this ensures that our current valid SSH key gets added even if
another key for the same hostname exists already for some reason.
The code path which handles hashed host names has had this behavior
since the beginning, so let the new non-hashed code act the
same way.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Dietmar Maurer [Mon, 10 Jul 2017 06:54:38 +0000 (08:54 +0200)]

bump version to 5.0-11

commit | commitdiff | tree

Thomas Lamprecht [Thu, 6 Jul 2017 11:19:38 +0000 (13:19 +0200)]

ssh_merge_known_hosts: refactor and simplify

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 6 Jul 2017 11:19:37 +0000 (13:19 +0200)]

ssh_merge_known_hosts: address auth failure problem

On node addition we create two entries in the cluster-wide known_hosts
file with our public host key, one with the pve-localhost bound IP
address and one with our nodename.

SSH always lowercases hostnames or their aliases before comparing
them to the known_hosts entry. This is allowed as per RFC 1035,
Section "2.3.3 Character Case" [1].
This causes no problems if known_hosts entries are not hashed, as
the original value and the newly specified value can both be
compared in a case-insensitive manner.

But if a known_hosts entry is hashed, we have no access to its
original plain text value and cannot do a case-insensitive
comparison anymore. SSH thus expects that the original value was
transformed to lowercase before hashing. We did not follow this
convention when we added node keys to the cluster's known_hosts file,
as we kept the case. This resulted in problems when a user set up
nodes with names containing uppercase letters. [2]

Instead of transforming everything to lowercase on hashing, let's omit
hashing known_hosts entries completely.

To explain why this can be done safely, without security
implications, we need to state the reason why hashing those entries
would gain some security in the first place. It wants to prevent
information leakage in case a local account gets taken over by
an attacker. If not hashed, the attacker could use the known_hosts
file to see which other hosts the user connected to.
This could "just" leak information on what a user does, but could also
make it easier to attack the listed hosts too, e.g. if the user
had an unprotected SSH key which the hosts trust. As there are other
ways to get a list of hosts a user connected to
(.bash_history, monitoring outgoing traffic, ...), hashing known_hosts
entries itself provides just a small hurdle of obfuscation in
case an account already got taken over. And this is the case for a
normal, unprivileged user account.

In the case of PVE, hashing the used known_hosts file brings absolutely
*no* advantage. First, the affected known_hosts file is located under
/etc/pve/priv where only root has read access. Thus, an attacker
would need to take over root to get the known_hosts in the first
place. If he did take over root, all hope is lost one way or another.

Even if known_hosts was world readable, hashing would not do much.
As an attacker would know that the node IPs are entries, he could
use /etc/network/interfaces to get the subnet of interest and just
brute-force all entries until he got all node IPs. He normally would
only need to iterate through 8-16 bits of an IPv4 address space.
Even this could be simplified by just port scanning the range for an
open port 8006, to get all PVE nodes in a subnet.
Further, /etc/hosts (world readable) often provides the information
which hashing known_hosts tries to hide, as does /etc/pve/.members
(readable by: www-data,root).
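
To make the brute-force argument concrete, here is a small Python
sketch. It relies on the hashed host field format used by OpenSSH,
"|1|<base64 salt>|<base64 HMAC-SHA1(salt, host)>"; the function names
are hypothetical and only the host field of each line is handled.

```python
import base64
import hashlib
import hmac
import ipaddress


def host_hash(salt, host):
    """HMAC-SHA1 of the host name, keyed with the per-entry salt."""
    return hmac.new(salt, host.encode("ascii"), hashlib.sha1).digest()


def recover_hosts(hashed_fields, network):
    """Map each hashed field to the address in `network` producing it."""
    parsed = []
    for field in hashed_fields:
        # "|1|salt|hash" splits into ['', '1', salt, hash]
        _, _, salt_b64, hash_b64 = field.split("|")
        parsed.append(
            (field, base64.b64decode(salt_b64), base64.b64decode(hash_b64))
        )

    recovered = {}
    # A /24 has only 254 usable addresses, a /16 about 65 thousand.
    for addr in ipaddress.ip_network(network).hosts():
        for field, salt, digest in parsed:
            if hmac.compare_digest(host_hash(salt, str(addr)), digest):
                recovered[field] = str(addr)
    return recovered
```

With one cheap HMAC per candidate address and entry, a typical subnet
is exhausted almost instantly, which is why hashing hides little here.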

So, to summarize, while for an unprivileged user it may add a slight
defense against an information leak, it really doesn't for a PVE
system's root user or cluster members: all information which it tries
to hide is accessible in various other ways.

Add new entries in plain text, and check whether entries are already
there for the plain text case too. Further, use lowercase comparison
as OpenSSH does.
If hashed entries are already there, still allow them, but ensure that
a lowercased version is saved to avoid authentication failures.
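
The case problem with hashed entries can be reproduced with a short
Python sketch. It assumes the OpenSSH hashed host field format
"|1|<base64 salt>|<base64 HMAC-SHA1(salt, host)>"; hash_host and
matches are hypothetical helpers, not functions from pve-cluster.

```python
import base64
import hashlib
import hmac


def hash_host(host, salt):
    """Build a hashed host field from a host name and a salt."""
    digest = hmac.new(salt, host.encode("utf-8"), hashlib.sha1).digest()
    return "|1|{}|{}".format(
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    )


def matches(hashed_field, candidate):
    """Look up a host name the way SSH does: lowercase it, then hash."""
    # "|1|salt|hash" splits into ['', '1', salt, hash]
    _, _, salt_b64, _ = hashed_field.split("|")
    salt = base64.b64decode(salt_b64)
    return hmac.compare_digest(hash_host(candidate.lower(), salt), hashed_field)
```

An entry created with hash_host("Node1", salt) never matches, because
the lookup hashes "node1"; an entry created from the lowercased name
does. A plain text entry avoids the issue, as it can simply be
compared case-insensitively.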

[1]: https://tools.ietf.org/html/rfc1035#section-2.3.3
[2]: https://forum.proxmox.com/threads/35473

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 26 Jun 2017 12:10:57 +0000 (14:10 +0200)]

add simple corosync config parser self check

Each test reads and parses a config, "writes" it again and then
re-parses it.
Then both the parsed hash structures and the raw configs get compared.
This is cheap and should catch simple regressions in either the
parser or the writer, as we currently have no safety net ensuring that
modifications to either one didn't cause regressions.
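
The round-trip idea can be sketched in Python as below. The parser and
writer are toy stand-ins for a simplified, corosync-like syntax and
are not the Perl implementation; they assume unique section names, so
repeated sections such as several "node" blocks are not covered.
Comparing the two written outputs is one possible reading of the raw
config comparison.

```python
def parse_conf(text):
    """Parse 'section { key: value }' style text into nested dicts."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            section = {}
            stack[-1][line[:-1].strip()] = section
            stack.append(section)
        elif line == "}":
            stack.pop()
        else:
            key, value = line.split(":", 1)
            stack[-1][key.strip()] = value.strip()
    return root


def write_conf(conf, indent=0):
    """Serialize nested dicts back into the same text format."""
    pad = "  " * indent
    lines = []
    for key, value in conf.items():
        if isinstance(value, dict):
            lines.append(pad + key + " {")
            lines.append(write_conf(value, indent + 1))
            lines.append(pad + "}")
        else:
            lines.append(pad + key + ": " + value)
    return "\n".join(lines)


def round_trip_check(text):
    """Parse, write, re-parse; compare parsed data and raw output."""
    first = parse_conf(text)
    written = write_conf(first)
    second = parse_conf(written)
    return first == second and written == write_conf(second)


SAMPLE = """
totem {
  cluster_name: demo
  version: 2
}
quorum {
  provider: corosync_votequorum
}
"""
```

Calling round_trip_check(SAMPLE) should return True as long as parser
and writer agree; a change that makes the writer drop or rename a key
would be caught by the comparison of the parsed structures.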

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Dietmar Maurer [Thu, 22 Jun 2017 06:29:49 +0000 (08:29 +0200)]

bump version to 5.0-10

commit | commitdiff | tree

Thomas Lamprecht [Tue, 13 Jun 2017 07:25:34 +0000 (09:25 +0200)]

pvecm delnode: prevent deleting current node

Otherwise corosync really deletes itself from the cluster, which
pmxcfs cannot handle, and this is a bad idea in general.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Tue, 13 Jun 2017 07:25:33 +0000 (09:25 +0200)]

factor out corosync methods to own module

PVE::Cluster is already quite big, the corosync part is ~250 lines
of 1900 total. Further, the corosync part is only needed in a few
specialised places (API2/ClusterConfig and CLI/pvecm).
This speaks for factoring out this part into a separate perl module, as
most modules which use Cluster load the corosync parts for no reason.
Further, cluster handling through the API may add even more corosync
related methods.

Create a new Corosync perl module and move all relevant methods over.
Method names lose the 'corosync_' prefix, which is not really needed
anymore as they already live in the 'Corosync' namespace now.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fabian Grünbichler [Tue, 13 Jun 2017 13:22:05 +0000 (15:22 +0200)]

pmxcfs: fix segfault in cfs_create_status_msg

it's possible to request a status message for a no longer
existing nodename in a standalone setting (e.g., node was
renamed after pmxcfs was started).

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

commit | commitdiff | tree

Wolfgang Bumiller [Tue, 6 Jun 2017 08:03:57 +0000 (10:03 +0200)]

add sshinfo_to_command_base

required for rsync's --rsh

Cluster FS and Tools
