Cluster.pm: add get_ssh_info and ssh_info_to_command
Add a way to get a node's address info, optionally inside a specified
network (e.g. migration_network), and a standardized way to
create an SSH command from this info.
pvecm add: fix #1369 - re-allow using hostnames for ringX_addr
If a user passed a hostname as ring0_addr or ring1_addr, the check_ip
check failed, as it implicitly assumed IPs even though we allow a
general address (i.e. IP or hostname) as the format for those
properties.
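The distinction between an IP literal and a hostname can be sketched as follows. This is an illustrative Python sketch, not the Perl code from the commit; the function name classify_ring_addr is hypothetical.

```python
import ipaddress


def classify_ring_addr(addr: str) -> str:
    """Return "ip" for an IPv4/IPv6 literal, otherwise "hostname".

    Hypothetical helper: IP-only checks should run only when this
    returns "ip", so that hostnames are no longer rejected.
    """
    try:
        ipaddress.ip_address(addr)
    except ValueError:
        return "hostname"
    return "ip"
```

With such a split, IP-specific validation is applied only to IP literals, and a hostname is passed through as a general address.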
remote_node_ip: use same return signature for both branches
We have two return statements in the remote_node_ip submethod: one
checked whether we are in list context and adapted the returned values
accordingly, while the other just returned a list, independent of the
context.
Adapt the second one to check the context as well.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 22 Feb 2017 15:59:11 +0000 (16:59 +0100)]
pvecm add: assert that ringX IPs are available on node add
If 'ringX_addr' parameters are used when adding a node to a cluster,
check that those addresses are actually configured on the to-be-added
node. It makes no sense for an address to be configured not at all or
multiple times.
This prevents a node from ending up in limbo, waiting for quorum (if it
is the second node in a cluster, both nodes end up in the no-quorum
limbo), where manual pmxcfs kills, local starts and manual
configuration edits, which may need to be synced by hand to other
cluster members, are needed.
The check does not cost much and is only made on node additions, so
assert with our get_local_ip_from_cidr method that the IP is
configured on any interface.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 22 Feb 2017 15:59:08 +0000 (16:59 +0100)]
pvecm addnode: ensure ring address isn't already used by cluster
If someone enters the wrong address by accident when adding a node, it
may cause havoc in the cluster (meaning a reset of the whole cluster
when HA is used, which may even happen repeatedly during the recovery
attempts). A whole lot of problems get triggered in general, even
without HA.
Further, users get into a hard-to-repair situation which a layman may
not be able to fix by hand, even when given directions by an
experienced user.
This is a really bad outcome for such a small and easy-to-make
mistake, so just make a small check and assert that the requested IPs
are not used by any node on any ring in the cluster configuration.
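The check described above can be sketched roughly like this. It is a Python illustration only: the nodelist layout (a dict of node names mapping to their ringX_addr entries) and the function name find_ring_addr_conflicts are assumptions, not the actual data structures used by pvecm.

```python
def find_ring_addr_conflicts(nodelist, requested):
    """Return (address, node, ring key) for every requested address
    that is already used by some node on any ring.

    nodelist is assumed to look like:
        {"node1": {"ring0_addr": "10.0.0.1", "ring1_addr": "10.1.0.1"}}
    """
    conflicts = []
    for addr in requested:
        for name, node in nodelist.items():
            for key, value in node.items():
                is_ring_key = key.startswith("ring") and key.endswith("_addr")
                if is_ring_key and value == addr:
                    conflicts.append((addr, name, key))
    return conflicts
```

An empty result means the node can be added; any entry would be reported as an error before the cluster configuration is touched.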
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 22 Feb 2017 15:59:07 +0000 (16:59 +0100)]
pvecm addnode: error out on interactive call
addnode is intended to be used by the `add` command only.
So check if STDIN or STDOUT is connected to a tty and exit with an
error message if this is the case.
The force flag allows overriding this check.
Fixes bug #294
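The terminal check described above, sketched in Python for illustration. The function name and the error text are hypothetical; the real command is written in Perl.

```python
import sys


def check_not_interactive(stdin=None, stdout=None, force=False):
    """Return an error message if either stream is a terminal, else None.

    The force flag skips the check, mirroring the override described
    in the commit message.
    """
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    if force:
        return None
    if stdin.isatty() or stdout.isatty():
        # Hypothetical wording, not the message used by pvecm.
        return "this command is not intended for interactive use"
    return None
```

A caller would print the returned message and exit with a non-zero status when it is not None.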
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 22 Feb 2017 15:59:06 +0000 (16:59 +0100)]
pvecm create: remove rrp_mode parameter
I detected a bug where we overwrote the whole $interfaces variable
(and so all interfaces from the corosync config) if the 'rrp_mode'
param was set.
While this would be easy to fix by changing the line to
$interfaces .= ...
I removed the whole rrp_mode parameter instead.
As:
a) I've seen no one running into this bug, so this parameter was not
really used either way.
b) only 'passive' mode is supported and works; 'active' has a whole
lot of problems. If someone really wants it, they should edit the
corosync config file to achieve this.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 22 Feb 2017 15:59:04 +0000 (16:59 +0100)]
pvecm add: fix check if corosync already runs
`corosync-quorumtool` exits with 1 (CS_OK) if corosync runs and is
quorate.
Use `corosync-quorumtool -l` (list nodes) instead, which returns
1 if corosync does not run
0 if corosync runs, regardless of whether the cluster is quorate.
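A minimal Python sketch of a check built on these exit codes. The wrapper name corosync_is_running and the injectable run parameter are assumptions made for illustration and testing; only the `corosync-quorumtool -l` call and its 0/1 exit codes come from the description above.

```python
import subprocess


def corosync_is_running(run=subprocess.run):
    """Return True if `corosync-quorumtool -l` exits with 0.

    Per the commit message, exit code 0 means corosync runs (quorate
    or not) and 1 means it does not run.
    """
    try:
        result = run(
            ["corosync-quorumtool", "-l"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # Assumption: a missing binary is treated as "not running".
        return False
    return result.returncode == 0
```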
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Else the first make {deb,dinstall} from a clean repo fails, as we
generated the debian control file in the source debian/ folder.
Just write it directly to the build/debian directory, so we do not
clutter the source directory and always build with the up-to-date
control file.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fix #1199: pmxcfs: vmlist cache update condition in rename
rename() wrongly used the vmid filled in by
path_contain_vm_config() as a condition for whether to
update the vmlist cache rather than the returned nodename.
This caused a rename in any folder of a file whose name
was a number followed by '.conf' to remove the corresponding
vmid from the vmlist cache.
Thomas Lamprecht [Fri, 11 Nov 2016 08:03:13 +0000 (09:03 +0100)]
error out when getting remote ip address fails
Remove the noerr flag so that we error out when we get multiple IPs
from a CIDR or none at all. The user has to guarantee that the CIDR
matches exactly one IP on each local host.
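The "exactly one match" rule can be illustrated with a short Python sketch. The function name pick_single_ip_in_cidr and the plain list of local addresses are assumptions; this is not the Perl helper used in the code base.

```python
import ipaddress


def pick_single_ip_in_cidr(local_addrs, cidr):
    """Return the single local address inside cidr, or raise ValueError.

    Both "no match" and "multiple matches" are treated as errors,
    which is the behavior obtained by dropping the noerr flag.
    """
    network = ipaddress.ip_network(cidr, strict=False)
    matches = [a for a in local_addrs if ipaddress.ip_address(a) in network]
    if not matches:
        raise ValueError(f"no local IP found in network {cidr}")
    if len(matches) > 1:
        raise ValueError(f"multiple local IPs found in network {cidr}: {matches}")
    return matches[0]
```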
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Mon, 31 Oct 2016 08:42:30 +0000 (09:42 +0100)]
add migration format to datacenter config
This adds a new format for configuring cluster wide migration
settings.
Those settings include the migration transfer method, secure
(currently ssh) or insecure (tcp). This deprecates the
migration_unsecure parameter, which we only keep for backward
compatibility and map to the new property.
The mapping of the setting should be unproblematic for the user, as
the semantics stay exactly the same.
Only the case where both the new and the old setting are set at the
same time is problematic; here we warn the user and ignore the old
setting.
Further, the migration network can be set; this denotes the network
used for sending the migration traffic.
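The mapping from the deprecated parameter to the new property might look like the following Python sketch. The config layout, the "type" key, the default of "secure" and the warning text are assumptions for illustration; only the secure/insecure values and the "warn and ignore the old setting" rule come from the description above.

```python
def resolve_migration_type(config):
    """Return (migration type, warning or None) for a datacenter config dict.

    Assumed layout: {"migration": {"type": "secure"}, "migration_unsecure": 1}
    """
    new = config.get("migration", {}).get("type")
    old = config.get("migration_unsecure")
    if new is not None:
        warning = None
        if old is not None:
            # Both set: the new property wins, the old one is ignored.
            warning = "migration_unsecure is deprecated and ignored"
        return new, warning
    if old is not None:
        # Same semantics as before, expressed through the new property.
        return ("insecure" if old else "secure"), None
    return "secure", None  # assumed default
```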
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Wolfgang Link [Tue, 6 Sep 2016 09:43:57 +0000 (11:43 +0200)]
Fix #1093: also allow deleting a node by IP
If there is a second ring, or ring0_addr is set to an IP,
pvecm nodes shows the IP instead of the name.
So there is a reason to also allow deleting a node by IP.
Thomas Lamprecht [Mon, 12 Sep 2016 15:50:54 +0000 (17:50 +0200)]
pmxcfs: increase max filesize from 128k to 512k
This fixes bug 1014 and also fixes a few other problems where users
ran into the file size limitation. I did not find the bug entries
for them, but they covered:
1) there was a maximum of about <1500 services which could be
managed by our HA manager, as after that the manager_status file
got too big
2) firewall rules may also reach this limit on a bigger setup
I tested this with concurrently started reads/writes of random data
files from and into RAM (tmpfs mounts). As long as we do not flush
often and read everything at once (i.e. write/read with a big block
size), the performance stays good.
The limiting factor in speed is not corosync's CPG but sqlite; that
can be seen when comparing worst case scenarios between local pmxcfs
and clustered pmxcfs instances and simple debug logging.
We already optimize our sqlite usage quite heavily; relevant additional
speed gains cannot be made without losing reliability, as far as I've
seen.
So I only got into problems if I read/wrote small blocks
with a few hundred big writes started at once, e.g.
    for i in {1..100}
    do
        dd if=/tmp/random512k.data of="/etc/pve/data$i" bs=1k &
    done
With the above worst case, each block gets written as a single
transaction to the database, where each transaction has to be locked
and synced to disk for reliability.
Packing all changes (i.e. the whole file) into one DB transaction
does not produce much overhead for 512k files compared to 128k files.
As data written through the PVE framework is written and read in
such a way, we can increase this without seeing much of a
performance impact.
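To put numbers on the worst case, here is a small back-of-the-envelope calculation in Python. It assumes, as stated above, that every written block becomes one database transaction; the helper name is hypothetical.

```python
KIB = 1024


def transactions_for_write(file_size, block_size):
    """Number of write calls (and thus transactions) needed for one file."""
    return -(-file_size // block_size)  # ceiling division


files = 100
# Worst case from the dd loop: 512k files written in 1k blocks.
worst = files * transactions_for_write(512 * KIB, 1 * KIB)
# Whole file written at once: one transaction per file.
best = files * transactions_for_write(512 * KIB, 512 * KIB)
print(worst, best)  # 51200 100
```

Each of those transactions has to be locked and synced to disk, which is why the small-block case is so much slower than writing whole files.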
It should also be noted that just because files can now get bigger,
not many will actually do so. Rather, there may be just one to three
files bigger than 128k on some setups.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
use g_return_if_fail as cfs_loop_stop_worker returns void
Do not use g_return_val_if_fail, because the cfs_loop_stop_worker
function does not return anything and newer versions of GCC complain
about that (I used gcc version 5.4.0 20160609 (Debian 5.4.0-6) from
stretch).
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Thu, 30 Jun 2016 14:35:36 +0000 (16:35 +0200)]
ensure quorum is set false when corosync fails
If corosync directly fails (e.g. `killall corosync`), the local node
acted like it still had quorum, which is not ideal.
Ensure that we set quorate to false before we finalize the quorum.
Do this in:
* service_quorum_dispatch: if it fails, it is important that we set
  it to false, as there is a good possibility that the
  quorum_notification_fn won't get called anymore. Reproducible with
    $ killall corosync && sleep 0.1 && ls -l /etc/pve/ \
      && systemctl start corosync
  Expected behavior: corosync is dead, the ls should show that
  everything in /etc/pve is read only
  Shown behavior: /etc/pve still has read/write access and
  PVE::Cluster::check_cfs_quorum() still returns true
* service_quorum_initialize: just to be sure, as we successfully
  registered the quorum notification function already
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Tue, 24 May 2016 13:55:53 +0000 (15:55 +0200)]
cleanup format strings for cfs_* messages
This does not change semantics on our current target platform
(x86_64) but is needed for porting it to other platforms.
The GCC on ARM, for example, complains about them.
For all:
* size_t use "%z*"
* off_t use "%j*"
* uint64_t use "PRI*64"
where * may be one of (X,d,u).
Also cast guint64 to uint64_t to allow use of a general, portable
format which also supports hex output as the GUINT64_FORMAT allows
decimal output only.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Cc: mir@datanom.net