git.proxmox.com Git - mirror_corosync.git/log
6 years ago corosync-qdevice: send startup notification to systemd
Ferenc Wágner [Tue, 8 Nov 2016 21:36:53 +0000 (22:36 +0100)]
corosync-qdevice: send startup notification to systemd

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago corosync-qnetd: send startup notification to systemd
Ferenc Wágner [Mon, 19 Dec 2016 13:27:08 +0000 (14:27 +0100)]
corosync-qnetd: send startup notification to systemd

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago Send corosync-notifyd startup notification to systemd
Ferenc Wágner [Mon, 30 Oct 2017 21:12:14 +0000 (22:12 +0100)]
Send corosync-notifyd startup notification to systemd

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago Make systemd stop corosync-notifyd if corosync is stopped
Ferenc Wágner [Mon, 30 Oct 2017 21:12:09 +0000 (22:12 +0100)]
Make systemd stop corosync-notifyd if corosync is stopped

Otherwise it just exits successfully (which should probably be fixed
eventually), leading to confusion.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago corosync.spec: Add system-devel build requirement
Jan Friesse [Wed, 8 Nov 2017 14:39:34 +0000 (15:39 +0100)]
corosync.spec: Add system-devel build requirement

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Ferenc Wágner <wferi@debian.org>
6 years ago Send corosync startup notification to systemd
Ferenc Wágner [Mon, 30 Oct 2017 21:11:56 +0000 (22:11 +0100)]
Send corosync startup notification to systemd

This enables starting the daemon directly in the service file, because
dependent units won't be started until initialization is complete.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
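
The readiness notification described in the commit above can be illustrated with a minimal sketch of the systemd notify protocol: the daemon sends a datagram such as "READY=1" to the socket named by the NOTIFY_SOCKET environment variable. This is not the corosync code, which may simply call sd_notify() from libsystemd; notify_systemd() is a hypothetical helper written only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper (not corosync code): send one state string, for
 * example "READY=1", to the datagram socket named by $NOTIFY_SOCKET.
 * Returns 1 if the message was sent, 0 if NOTIFY_SOCKET is not set,
 * and -1 on error.
 */
static int notify_systemd(const char *state)
{
	const char *path = getenv("NOTIFY_SOCKET");
	struct sockaddr_un addr;
	socklen_t addr_len;
	size_t path_len;
	ssize_t sent;
	int fd;

	if (path == NULL || path[0] == '\0') {
		return 0;	/* not started by a notify-aware service manager */
	}

	path_len = strlen(path);
	if ((path[0] != '/' && path[0] != '@') ||
	    path_len >= sizeof(addr.sun_path)) {
		return -1;	/* only absolute paths and abstract sockets are valid */
	}

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	memcpy(addr.sun_path, path, path_len);
	addr_len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + path_len);
	if (path[0] == '@') {
		addr.sun_path[0] = '\0';	/* Linux abstract socket namespace */
	} else {
		addr_len += 1;			/* include the terminating NUL */
	}

	fd = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (fd < 0) {
		return -1;
	}
	sent = sendto(fd, state, strlen(state), 0,
	    (const struct sockaddr *)&addr, addr_len);
	close(fd);

	return (sent == (ssize_t)strlen(state)) ? 1 : -1;
}
```

A daemon would call notify_systemd("READY=1") once initialization is complete; with Type=notify in the unit file, systemd then holds back dependent units until that message arrives.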
6 years ago quorumtool: Use full buffer size in snprintf
Jan Friesse [Tue, 7 Nov 2017 14:55:30 +0000 (15:55 +0100)]
quorumtool: Use full buffer size in snprintf

Thanks Bin Liu <bliu@suse.com> for this patch.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
6 years ago cpghum: Mark print/log functions with printf attr
Jan Friesse [Tue, 7 Nov 2017 14:54:36 +0000 (15:54 +0100)]
cpghum: Mark print/log functions with printf attr

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
6 years ago cpg_test_agent: Fix snprintf compiler warnings
Jan Friesse [Tue, 7 Nov 2017 14:53:45 +0000 (15:53 +0100)]
cpg_test_agent: Fix snprintf compiler warnings

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
6 years ago sam: Fix snprintf compiler warnings
Jan Friesse [Tue, 7 Nov 2017 14:53:21 +0000 (15:53 +0100)]
sam: Fix snprintf compiler warnings

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
6 years ago coroparse: Do not convert empty uid, gid to 0
Jan Friesse [Mon, 6 Nov 2017 08:22:41 +0000 (09:22 +0100)]
coroparse: Do not convert empty uid, gid to 0

When the uid (or gid) value was an empty string, it was incorrectly converted to
0. The solution is to check the input string for emptiness.

Thanks Bin Liu <bliu@suse.com> for reporting the bug.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Bin Liu <bliu@suse.com>
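
The failure mode in the commit above is easy to reproduce: strtol() on an empty string returns 0 without setting errno, and 0 is the uid of root. The following is a minimal sketch of the kind of check described, assuming a purely numeric id; parse_numeric_id() is a hypothetical helper, not the coroparse function, which may also resolve user and group names.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a numeric uid/gid string.
 * Returns 0 and stores the value on success, -1 on empty or
 * malformed input.
 */
static int parse_numeric_id(const char *str, long *id_out)
{
	char *end;
	long val;

	if (str == NULL || *str == '\0') {
		return -1;	/* strtol("") would quietly yield 0, i.e. root */
	}

	errno = 0;
	val = strtol(str, &end, 10);
	if (errno != 0 || *end != '\0' || val < 0) {
		return -1;	/* overflow, trailing garbage or negative id */
	}

	*id_out = val;
	return 0;
}
```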
6 years ago cmapctl: Add option to clear the stats
Christine Caulfield [Thu, 2 Nov 2017 16:01:36 +0000 (16:01 +0000)]
cmapctl: Add option to clear the stats

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago stats: Don't display errors when reading knet stat
Christine Caulfield [Thu, 2 Nov 2017 13:16:00 +0000 (13:16 +0000)]
stats: Don't display errors when reading knet stat

Only add the knet handle stat keys if we are actually running knet. This
prevents errors occurring when iterating through all of the stats keys

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago make the output of "corosync-cfgtool -s" more readable (#269)
Bin Liu [Fri, 3 Nov 2017 09:50:29 +0000 (17:50 +0800)]
make the output of "corosync-cfgtool -s" more readable (#269)

6 years ago cfg: nodeid should be unsigned int
Bin Liu [Wed, 1 Nov 2017 08:23:41 +0000 (16:23 +0800)]
cfg: nodeid should be unsigned int

nodeid in struct req_lib_cfg_get_node_addrs is "unsigned int",
so the function corosync_cfg_get_node_addrs should have its param
"nodeid" to be unsigned int.

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago quorumtool: remove duplicated help message
Bin Liu [Wed, 1 Nov 2017 03:30:54 +0000 (11:30 +0800)]
quorumtool: remove duplicated help message

Option "-p" was included twice, so remove one of them.

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago man: fix cpg_mcast_joined.3.in
Jonathan Davies [Wed, 1 Nov 2017 14:36:40 +0000 (14:36 +0000)]
man: fix cpg_mcast_joined.3.in

Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago man: Add stats.clear keys to the cmap_keys man pg
Christine Caulfield [Tue, 31 Oct 2017 11:47:41 +0000 (11:47 +0000)]
man: Add stats.clear keys to the cmap_keys man pg

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago stats: Add cmap key to clear the various stats.
Christine Caulfield [Tue, 31 Oct 2017 10:54:43 +0000 (10:54 +0000)]
stats: Add cmap key to clear the various stats.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago Use RuntimeDirectory instead of tmpfiles.d
Ferenc Wágner [Mon, 28 Nov 2016 13:47:05 +0000 (14:47 +0100)]
Use RuntimeDirectory instead of tmpfiles.d

This reverts part of commit 32123f6bb2ebc4f9ac7865945cc85a9c9b903dc6.

A simple directive is a much lighter solution to the same problem, and
automatically follows the specified User.  I copied the 0770 modes from
the corresponding init scripts; they could use a little documentation.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago totemconfig: generate mcast icmap items for UDP
Bin Liu [Thu, 26 Oct 2017 09:29:43 +0000 (17:29 +0800)]
totemconfig: generate mcast icmap items for UDP

Generating mcastaddr and mcastport in icmap makes
sense only for UDP transport.

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago Use static case blocks to determine distro flavor
Ferenc Wágner [Tue, 24 Oct 2017 12:05:03 +0000 (14:05 +0200)]
Use static case blocks to determine distro flavor

This is a configure-time decision, avoid live filesystem checks.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago configure: add --with-initconfigdir option
Ferenc Wágner [Thu, 24 Nov 2016 11:06:37 +0000 (12:06 +0100)]
configure: add --with-initconfigdir option

Default value is /etc/sysconfig and resulting
INITCONFIGDIR is used to reduce duplication in init system
integration code.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago totemconfig: add nodeid check for knet
Bin Liu [Thu, 26 Oct 2017 07:57:12 +0000 (15:57 +0800)]
totemconfig: add nodeid check for knet

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago man: support SOURCE_DATE_EPOCH
Ferenc Wágner [Tue, 24 Oct 2017 14:06:37 +0000 (16:06 +0200)]
man: support SOURCE_DATE_EPOCH

Make reproducible builds possible by supporting
https://reproducible-builds.org/specs/source-date-epoch/

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
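
The SOURCE_DATE_EPOCH convention referenced above boils down to one rule: if the variable is set, use it as the timestamp instead of the current time. The corosync change applies this to man page generation in the build system; the C sketch below only illustrates the rule itself, and build_timestamp() is a hypothetical helper.

```c
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: pick the timestamp to embed in generated output.
 * A valid SOURCE_DATE_EPOCH (seconds since the Unix epoch) wins over
 * the wall clock so that repeated builds produce identical files.
 */
static time_t build_timestamp(void)
{
	const char *epoch = getenv("SOURCE_DATE_EPOCH");
	char *end;
	long long val;

	if (epoch != NULL && *epoch != '\0') {
		errno = 0;
		val = strtoll(epoch, &end, 10);
		if (errno == 0 && *end == '\0' && val >= 0) {
			return (time_t)val;	/* reproducible: fixed by the environment */
		}
	}

	return time(NULL);	/* fallback: current time, not reproducible */
}
```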
6 years ago man: fix in corosync-qdevice.8
Bin Liu [Tue, 24 Oct 2017 09:29:42 +0000 (17:29 +0800)]
man: fix in corosync-qdevice.8

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago man: must set nodeid for knet in nodelist
Bin Liu [Tue, 24 Oct 2017 06:12:08 +0000 (14:12 +0800)]
man: must set nodeid for knet in nodelist

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago logsys: Avoid redundant callsite section checking
Jan Pokorný [Wed, 18 Oct 2017 19:59:22 +0000 (21:59 +0200)]
logsys: Avoid redundant callsite section checking

Previously, corosync executable was repeatedly (proportionally to the
count of LOGSYS_DECLARE_SUBSYS macro applications involved in the
constituent source files) checking the same for no gain in the pre-main
startup. This is not needed since nothing changes with static data
shared within the same program space (it may have been a different
story once upon a time if loadable modules were in use), so make that
happen in (one-off per executable) LOGSYS_DECLARE_SYSTEM instead.

Libqb offers its own ready-made macro to that
effect, simply to isolate the inner peculiarities from the library user
(that should not be required to understand anything about the orphan
sections and respective autocreated symbols to denote their boundaries).
As it is compile-time conditionalized in the same way, just use it
directly instead. As a value added, corosync will be kept up to date
about the possibly growing set of the logging-sanity checks as it gets
compiled with newer and newer libqb versions (their header files, for
that matter).

Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago config: Fix memory leak
Christine Caulfield [Fri, 20 Oct 2017 08:16:45 +0000 (09:16 +0100)]
config: Fix memory leak

totem_volatile_config_set_string_value was not properly freeing memory.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago knet: Add support for knet compression
Christine Caulfield [Fri, 13 Oct 2017 13:53:21 +0000 (14:53 +0100)]
knet: Add support for knet compression

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago qdevice: Add support for heuristics
Jan Friesse [Thu, 10 Nov 2016 17:49:09 +0000 (18:49 +0100)]
qdevice: Add support for heuristics

Heuristics are a set of commands executed locally on startup, cluster
membership change, successful connect to corosync-qnetd and optionally
also at regular times. When all commands finish successfully
(their return error code is zero) on time, heuristics have passed,
otherwise they have failed. The heuristics result is sent to
corosync-qnetd and there it's used in calculations to determine which
partition should be quorate.

Right now, there are some problems (bugs):
- Regular heuristics is supported only by ffsplit. This is not a
problem for clusters with power fencing, but deployments where
non-quorate partition continues to operate may see this as a problem.
- Qdevice-tool status doesn't contain detailed information about
heuristics.
- Qdevice-tool doesn't have a possibility to trigger heuristics
re-execute.

Thanks Chrissie Caulfield for Englishifying the man pages.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
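
The pass/fail rule stated in the heuristics commit above (every command must exit with zero and finish on time) can be sketched as a small pure function. The types and names here are hypothetical and are not taken from the qdevice sources; treating an empty command list as a pass is an assumption of this sketch.

```c
#include <stddef.h>

enum heuristics_result { HEURISTICS_PASS, HEURISTICS_FAIL };

/* Hypothetical record of one finished heuristics command. */
struct heuristics_cmd_status {
	int exit_code;	/* process exit status, 0 means success */
	int timed_out;	/* non-zero if the command missed its deadline */
};

/*
 * Heuristics pass only when every command succeeded in time.
 * Assumption of this sketch: an empty list counts as a pass.
 */
static enum heuristics_result
heuristics_evaluate(const struct heuristics_cmd_status *cmds, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (cmds[i].timed_out || cmds[i].exit_code != 0) {
			return HEURISTICS_FAIL;
		}
	}

	return HEURISTICS_PASS;
}
```

The result would then be reported to corosync-qnetd, which uses it when deciding which partition should be quorate.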
6 years ago Spec: fix arch-qualified dependencies
Keisuke MORI [Wed, 18 Oct 2017 08:22:52 +0000 (08:22 +0000)]
Spec: fix arch-qualified dependencies

needed along with commit 30af25294e019678c4f31e3368b19266f69b8254

Signed-off-by: Keisuke MORI <kskmori@intellilink.co.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
6 years ago cmap: Remove noop highest config version check
Jan Friesse [Wed, 11 Oct 2017 15:11:33 +0000 (17:11 +0200)]
cmap: Remove noop highest config version check

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
6 years ago cmap: don't shutdown highest config_version node
Jonathan Davies [Tue, 10 Oct 2017 14:53:41 +0000 (15:53 +0100)]
cmap: don't shutdown highest config_version node

Scenario:
 1. node A starts corosync with config_version = 2, nodelist = {A, B}
 2. node B starts corosync with config_version = 1, nodelist = {A, B}

corosync.conf(5) says the config_version option is "used to prevent
joining old nodes with not up-to-date configuration."

So expected outcome is:
 * corosync on node A remains alive
 * corosync on node B exits

Actual outcome is:
 * corosync on node A exits
 * corosync on node B exits

Explanation of actual behaviour:
 * Host A will have cmap_my_config_version = 2 but
   cmap_highest_config_version_received = 1, so will shutdown in
   cmap_sync_activate because these are not equal.
 * Host B will have cmap_my_config_version = 1 but
   cmap_highest_config_version_received = 2, so will shutdown in
   cmap_sync_activate because these are not equal.

Instead, node A should consider its own config_version in the
calculation of the highest config_version, i.e.
cmap_highest_config_version_received = 2, and so not shutdown
in cmap_sync_activate.

Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
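
The scenario in the commit above can be modelled in a few lines. This is a sketch of the described logic, not the cmap service code: the function names are hypothetical, and only the idea of seeding the maximum with the node's own config_version comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Highest config_version seen in the membership. Seeding the maximum
 * with the local version is the fix: a node no longer compares itself
 * only against versions received from other nodes.
 */
static uint64_t highest_config_version(uint64_t my_version,
    const uint64_t *received, size_t count)
{
	uint64_t highest = my_version;
	size_t i;

	for (i = 0; i < count; i++) {
		if (received[i] > highest) {
			highest = received[i];
		}
	}

	return highest;
}

/* A node shuts down only if some node has a newer configuration. */
static int must_shutdown(uint64_t my_version,
    const uint64_t *received, size_t count)
{
	return my_version != highest_config_version(my_version, received, count);
}
```

With these definitions node A (version 2, receiving 1 from B) stays up, while node B (version 1, receiving 2 from A) shuts down, which matches the expected outcome listed in the commit message.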
6 years ago totemudp: Remove memb_join discarding
Kazunori INOUE [Tue, 26 Sep 2017 08:36:09 +0000 (17:36 +0900)]
totemudp: Remove memb_join discarding

This is already implemented in totemsrp in much cleaner way (added
by commit ab8942f6260fde93824ed2a18e09e572b59ceb25).

Signed-off-by: Kazunori INOUE <inouekazu@intellilink.co.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago votequorum: make atb consistent on nodelist reload
Edwin Torok [Fri, 22 Sep 2017 16:28:54 +0000 (17:28 +0100)]
votequorum: make atb consistent on nodelist reload

When the cluster changes from even sized to odd sized corosync
disables auto-tie-breaker if wait_for_all is not enabled.
However when changing from odd sized to even sized it doesn't reenable
it, causing auto_tie_breaker to be inconsistent across the cluster:
the newly added node and any nodes that restart corosync
will have it, but all the previously running nodes won't.

Signed-off-by: Edwin Torok <edvin.torok@citrix.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago totem: Remove unnecessary NSS headers
Fabio M. Di Nitto [Fri, 22 Sep 2017 08:25:40 +0000 (10:25 +0200)]
totem: Remove unnecessary NSS headers

Also fix corosync.spec.in to depend on libknet.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago config: Allow dynamic link configuration
Christine Caulfield [Fri, 15 Sep 2017 12:12:01 +0000 (13:12 +0100)]
config: Allow dynamic link configuration

Now we are using knet, it's possible to dynamically add, remove and
reconfigure links on the fly.

Also print 'n' for non-existent knet links. This will show up
only on loopback links >0. But it looks better than 'status ='

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago totemudp: Retry if bind fails
Masse Nicolas [Fri, 15 Sep 2017 14:52:15 +0000 (16:52 +0200)]
totemudp: Retry if bind fails

If the bind call fails, it is retried up to BIND_MAX_RETRIES times.
If it is still unsuccessful, corosync exits instead
of working incorrectly.

Slightly modified by reviewer.

Signed-off-by: Masse Nicolas <nicolas.masse@stormshield.eu>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
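
The retry behaviour described in the totemudp commit above follows a common bounded-retry pattern, sketched here. BIND_MAX_RETRIES is the name used in the commit message, but its value, the delay between attempts and the callback interface are assumptions of this sketch; fail_until_zero() is a demo stand-in for a real bind(2) call.

```c
#include <time.h>

#define BIND_MAX_RETRIES 10	/* name from the commit message; value assumed */

/* One bind attempt: returns 0 on success, non-zero on failure. */
typedef int (*bind_attempt_fn)(void *ctx);

/*
 * Try the operation up to max_retries times, sleeping delay_ms between
 * attempts. Returns 0 on success and -1 if every attempt failed, in
 * which case the caller is expected to exit rather than continue with
 * an unbound socket.
 */
static int bind_with_retries(bind_attempt_fn attempt, void *ctx,
    unsigned int max_retries, long delay_ms)
{
	struct timespec delay;
	unsigned int i;

	delay.tv_sec = delay_ms / 1000;
	delay.tv_nsec = (delay_ms % 1000) * 1000000L;

	for (i = 0; i < max_retries; i++) {
		if (attempt(ctx) == 0) {
			return 0;
		}
		if (delay_ms > 0 && i + 1 < max_retries) {
			nanosleep(&delay, NULL);
		}
	}

	return -1;
}

/* Demo stand-in for bind(2): fails until the counter in ctx hits zero. */
static int fail_until_zero(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -1;
	}
	return 0;
}
```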
7 years ago corosync.conf.5: watchdog support is conditional
Ferenc Wágner [Wed, 13 Sep 2017 13:29:32 +0000 (15:29 +0200)]
corosync.conf.5: watchdog support is conditional

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago wd: default to not using a watchdog
Ferenc Wágner [Mon, 11 Sep 2017 16:44:56 +0000 (18:44 +0200)]
wd: default to not using a watchdog

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago wd: remove extra capitalization typo
Ferenc Wágner [Mon, 11 Sep 2017 16:26:33 +0000 (18:26 +0200)]
wd: remove extra capitalization typo

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago corosync.conf.5: add warning about slow watchdogs
Ferenc Wágner [Mon, 11 Sep 2017 11:40:05 +0000 (13:40 +0200)]
corosync.conf.5: add warning about slow watchdogs

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago totemknet: fix debug message typo
Jonathan Davies [Thu, 7 Sep 2017 08:21:08 +0000 (10:21 +0200)]
totemknet: fix debug message typo

Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago corosync.conf.5: Fix watchdog documentation
Ferenc Wágner [Wed, 6 Sep 2017 12:43:00 +0000 (14:43 +0200)]
corosync.conf.5: Fix watchdog documentation

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago wd: fix typo
Ferenc Wágner [Thu, 8 Jun 2017 09:17:37 +0000 (11:17 +0200)]
wd: fix typo

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago Include fcntl.h for F_* and O_* defines
Khem Raj [Thu, 31 Aug 2017 00:28:55 +0000 (17:28 -0700)]
Include fcntl.h for F_* and O_* defines

Fixes errors like
utils.c:95:22: error: use of undeclared identifier 'O_WRONLY'

Signed-off-by: Khem Raj <raj.khem@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago stats: add knet 'handle' stats
Christine Caulfield [Tue, 22 Aug 2017 08:22:07 +0000 (09:22 +0100)]
stats: add knet 'handle' stats

knet handle stats show compression and crypto statistics. With these
you can see the effectiveness of compression and the overheads of both
crypto and compression.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago main: use syslog & printf directly for early log messages
Christine Caulfield [Tue, 22 Aug 2017 08:51:09 +0000 (09:51 +0100)]
main: use syslog & printf directly for early log messages

libqb seems funny about logging things before it's fully configured.
This corosync commit didn't help either:
8b6bd86a55b8bda9f3a8ff67bdff908263976fa3

So to make sure that messages about the config file not being opened
get delivered to the user/syslog we send them directly.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago totempg: Allow space for incoming overflow
Christine Caulfield [Mon, 14 Aug 2017 13:04:31 +0000 (14:04 +0100)]
totempg: Allow space for incoming overflow

totempg needs to store the current message + any
overflow for the next message which can be up to (nearly) the MTU size.
In knet that's large, but for UDP it's just 1500.

The reason we've never seen it before is because the actual max message
size is 1024 less than 1MB and after all the headers are stripped out the overflow is
usually 1024 bytes or less.
The 1024*1024 size of the assembly buffer is large enough to hold a max message (1047552) +
1024 bytes of a new UDP message. So we never saw any problems.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
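
The arithmetic in the totempg commit above can be checked directly. The two sizes (1024*1024 and 1047552) come from the commit message; the macro and function names are hypothetical and are not the totempg identifiers.

```c
#include <stddef.h>

#define ASSEMBLY_BUFFER_SIZE (1024 * 1024)			/* 1048576 bytes */
#define MAX_MESSAGE_SIZE     (ASSEMBLY_BUFFER_SIZE - 1024)	/* 1047552 bytes */

/*
 * Does a maximum-size message plus the overflow carried over for the
 * next message still fit into the original 1 MiB assembly buffer?
 */
static int overflow_fits(size_t overflow_bytes)
{
	return MAX_MESSAGE_SIZE + overflow_bytes <= ASSEMBLY_BUFFER_SIZE;
}
```

An overflow of 1024 bytes fits exactly, which is why the problem went unnoticed. Anything close to a full 1500-byte UDP packet, or the much larger packets possible with knet, does not fit, so the buffer needs extra room for the overflow.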
7 years ago cpghum: Add options to change flood start/mult/end sizes (#237)
Chrissie Caulfield [Fri, 11 Aug 2017 14:28:02 +0000 (15:28 +0100)]
cpghum: Add options to change flood start/mult/end sizes (#237)

I ran out of sensible short options for cpghum so added some long
ones to cope with them.

Also added is the ability to specify most size values in a sensible format
eg 64M for 64 Megabytes or 48K for 48 Kilobytes.

Strictly those are MiB and KiB of course, but I'm old-fashioned.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
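
A size format such as "64M" or "48K" can be parsed along the following lines. This is a sketch under assumptions, not the cpghum implementation: parse_size() is a hypothetical helper, it accepts only the K and M suffixes mentioned in the commit message, and it uses binary multiples (1024 and 1024*1024) as the message says.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse "48K", "64M" or a plain byte count.
 * Returns 0 and stores the size in bytes on success, -1 on malformed
 * input or overflow.
 */
static int parse_size(const char *str, unsigned long long *bytes_out)
{
	unsigned long long val;
	unsigned long long mult = 1;
	char *end;

	if (str == NULL || !isdigit((unsigned char)*str)) {
		return -1;	/* also rejects the empty string and signs */
	}

	errno = 0;
	val = strtoull(str, &end, 10);
	if (errno != 0) {
		return -1;
	}

	switch (*end) {
	case '\0':
		break;
	case 'K': case 'k':
		mult = 1024ULL;
		end++;
		break;
	case 'M': case 'm':
		mult = 1024ULL * 1024ULL;
		end++;
		break;
	default:
		return -1;
	}

	if (*end != '\0' || val > ULLONG_MAX / mult) {
		return -1;	/* trailing characters or multiplication overflow */
	}

	*bytes_out = val * mult;
	return 0;
}
```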
7 years ago totemknet: Use knet's LOOPBACK transport (#236)
Chrissie Caulfield [Fri, 4 Aug 2017 11:59:16 +0000 (12:59 +0100)]
totemknet: Use knet's LOOPBACK transport (#236)

knet now has a built-in LOOPBACK transport so use that
rather than special-casing it for ourselves.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago CFG: Remove ring-reenable code
Christine Caulfield [Thu, 3 Aug 2017 08:58:27 +0000 (09:58 +0100)]
CFG: Remove ring-reenable code

RRP doesn't exist any more so all the ring re-enable code is redundant.

I've removed it from the library and all the code that does anything,
but I've left the hole in the IPC just in case old libraries are
hanging around.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago main: Add support for libcgroup
Jan Friesse [Fri, 28 Jul 2017 14:32:58 +0000 (16:32 +0200)]
main: Add support for libcgroup

When corosync is started in an environment where it ends up in a cgroup without
properly set rt_runtime_us, it's impossible to get RT priority.

Already implemented workaround is to use higher non-RT priority.

This patch implements another solution. It moves corosync into root cpu
cgroup. Root cpu cgroup hopefully has enough RT budget.

Another solution was mentioned on ML
https://lists.freedesktop.org/archives/systemd-devel/2017-July/039353.html
but this means to generate some "random" values.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
(cherry picked from commit c56086c701d08fc17cf6d8ef603caf505a4021b7)

7 years ago stats: Add map with on-demand statistics
Christine Caulfield [Mon, 3 Jul 2017 13:54:33 +0000 (14:54 +0100)]
stats: Add map with on-demand statistics

Icmap is factored out so it's possible to add other
maps for cmap. API call to switch maps from application
end is added.

Corosync-cmapctl is enhanced with -m option.

Stats contains all statistics previously found in runtime.connections,
runtime.services and runtime.totem prefixes together with new knet
related. All stats are read only.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago ipc: Check for the libraries sending invalid message IDs
Christine Caulfield [Fri, 14 Jul 2017 13:06:49 +0000 (14:06 +0100)]
ipc: Check for the libraries sending invalid message IDs

If the library sent an invalid (ie too high) message ID to
corosync, then it could cause the daemon to crash.

Now we check the message ID before indexing the function array

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
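
The crash described in the ipc commit above is the classic unchecked table index. The sketch below shows the bounds check in isolation; the types, the dispatch function and the two demo handlers are hypothetical and do not mirror the real corosync service structures.

```c
#include <stddef.h>

typedef int (*lib_handler_fn)(const void *msg);

/* Hypothetical per-service table of library message handlers. */
struct service_handlers {
	const lib_handler_fn *handlers;
	size_t count;
};

/*
 * Validate the message ID before indexing the handler array.
 * Returns the handler's result, or -1 for an ID that is out of range
 * or has no handler.
 */
static int dispatch_message(const struct service_handlers *svc,
    unsigned int msg_id, const void *msg)
{
	if (msg_id >= svc->count || svc->handlers[msg_id] == NULL) {
		return -1;	/* reject instead of reading past the table */
	}

	return svc->handlers[msg_id](msg);
}

/* Demo handlers used only to exercise the dispatcher. */
static int demo_handler_a(const void *msg) { (void)msg; return 10; }
static int demo_handler_b(const void *msg) { (void)msg; return 11; }
```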
7 years ago main: Add option to set priority
Jan Friesse [Fri, 7 Jul 2017 15:49:46 +0000 (17:49 +0200)]
main: Add option to set priority

Option -P takes a numeric value with the same meaning
as nice, or the values max / min, meaning maximal / minimal priority (so
minimal / maximal nice value).

Scheduler / priority setting is moved in code so it is now executed
after logsys is configured so errors are logged.

Setting maximal priority is also used as fallback when realtime
scheduling is requested and sched_setscheduler fails.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
(cherry picked from commit a008448efb2b1d45c432867caf08f0bcf2b4b9b0)
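
The mapping between priority keywords and nice values in the commit above is easy to get backwards, since maximal priority means minimal nice value. The sketch below makes it explicit. It is not the corosync option parser: parse_priority() is hypothetical, and the -20..19 range is the conventional Linux nice range, assumed here for illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NICE_MIN (-20)	/* assumed: highest priority on Linux */
#define SKETCH_NICE_MAX 19	/* assumed: lowest priority on Linux */

/*
 * Hypothetical parser for a -P style argument: "max", "min" or a
 * numeric nice value. Returns 0 on success, -1 on invalid input.
 */
static int parse_priority(const char *arg, int *nice_out)
{
	char *end;
	long val;

	if (arg == NULL || *arg == '\0') {
		return -1;
	}
	if (strcmp(arg, "max") == 0) {
		*nice_out = SKETCH_NICE_MIN;	/* maximal priority = minimal nice */
		return 0;
	}
	if (strcmp(arg, "min") == 0) {
		*nice_out = SKETCH_NICE_MAX;	/* minimal priority = maximal nice */
		return 0;
	}

	errno = 0;
	val = strtol(arg, &end, 10);
	if (errno != 0 || *end != '\0' ||
	    val < SKETCH_NICE_MIN || val > SKETCH_NICE_MAX) {
		return -1;
	}

	*nice_out = (int)val;
	return 0;
}
```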

7 years ago totemknet: Prevent dead-loop in log_flush_messages
Jan Friesse [Mon, 3 Jul 2017 13:40:29 +0000 (15:40 +0200)]
totemknet: Prevent dead-loop in log_flush_messages

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
7 years ago corosync-keygen: Display number of needed bits
Jan Friesse [Mon, 3 Jul 2017 11:16:33 +0000 (13:16 +0200)]
corosync-keygen: Display number of needed bits

Instead of currently read bits, number of already read bits is
displayed to let the user know how long it's needed to "press keys"

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago totemknet: Flush knet log messages
Jan Friesse [Fri, 30 Jun 2017 08:35:57 +0000 (10:35 +0200)]
totemknet: Flush knet log messages

When initialization fails knet logs messages into pipe. Previously they
were never processed. Solution is to add log_flush_messages which takes
care to call log_deliver_fn.

Call of log_flush_messages is also added to totemknet_finalize because
this removes log pipe fd from qb_loop so similar problem can happen.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago corosync-keygen: Make less-secure default
Jan Friesse [Fri, 23 Jun 2017 12:31:53 +0000 (14:31 +0200)]
corosync-keygen: Make less-secure default

/dev/urandom is good enough for crypto keys and it's not blocking. If
superb randomness is really needed, it's possible to use newly added
option -r.

Also manpage is reworked a bit to use .nf instead of many .br.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago corosync-keygen: Adapt to knet key sizes
Jan Friesse [Fri, 23 Jun 2017 12:18:08 +0000 (14:18 +0200)]
corosync-keygen: Adapt to knet key sizes

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago totemconfig: Make crypto work again
Jan Friesse [Fri, 23 Jun 2017 09:22:09 +0000 (11:22 +0200)]
totemconfig: Make crypto work again

Knet needs longer key and supports various key lengths. Split
TOTEM_PRIVATE_KEY_LEN into TOTEM_PRIVATE_KEY_LEN_MIN and
TOTEM_PRIVATE_KEY_LEN_MAX (both using KNET_*_KEY_LEN).

Fix incorrect "Could only read..." message.

Make sure key is properly initialized/zeroed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago knet: Compile with latest knet API
Christine Caulfield [Thu, 29 Jun 2017 09:02:21 +0000 (10:02 +0100)]
knet: Compile with latest knet API

extra parameter added to knet_link_get_status()

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago totem: Propagate totem initialization failure
Jan Friesse [Thu, 15 Jun 2017 09:07:19 +0000 (11:07 +0200)]
totem: Propagate totem initialization failure

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago totemknet: Use new knet_link_set_config() API
Christine Caulfield [Fri, 9 Jun 2017 12:28:46 +0000 (13:28 +0100)]
totemknet: Use new knet_link_set_config() API

TC_PRIO_INTERACTIVE is now a link option in knet, so we have
to provide it at link config time.

This needs the latest knet git to compile as this is an updated API.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago coroapi: Use size_t for private_data_size
Michael Jones [Thu, 25 May 2017 18:29:19 +0000 (13:29 -0500)]
coroapi: Use size_t for private_data_size

Unsigned int and size_t represent two different concepts.

Same problem was present in ipc_glue.

Signed-off-by: Michael Jones <jonesmz@jonesmz.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago votequorum: Report errors from votequorum_exec_send_reconfigure
Christine Caulfield [Fri, 26 May 2017 12:55:24 +0000 (13:55 +0100)]
votequorum: Report errors from votequorum_exec_send_reconfigure

If votequorum_exec_send_reconfigure() returns an error (ie the
packet could not be sent) then we should either return it to the
sender (for a library call) or, for an internal call, log it.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago cpghum: remove space after delimiter
Christine Caulfield [Thu, 25 May 2017 09:35:45 +0000 (10:35 +0100)]
cpghum: remove space after delimiter

machine-readable stats do not need extra spaces!

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago cpghum: Add interim RTT to cpghum
Christine Caulfield [Thu, 25 May 2017 09:18:32 +0000 (10:18 +0100)]
cpghum: Add interim RTT to cpghum

when -f is selected the interim stats show the RTTs for that
size of packet.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago configure: Enable C99 language standard
Michael Jones [Mon, 10 Oct 2016 01:51:46 +0000 (20:51 -0500)]
configure: Enable C99 language standard

Also disable some obsolete warnings.

Signed-off-by: Michael Jones <jonesmz@jonesmz.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago main: Display reason why cluster cannot be formed
Jan Friesse [Thu, 18 May 2017 15:15:14 +0000 (17:15 +0200)]
main: Display reason why cluster cannot be formed

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago notifyd: Add the community name to an SNMP trap
Hideo Yamauchi [Thu, 18 May 2017 14:55:46 +0000 (23:55 +0900)]
notifyd: Add the community name to an SNMP trap

Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago cpghum: Add machine-readable output
Christine Caulfield [Mon, 15 May 2017 13:19:38 +0000 (14:19 +0100)]
cpghum: Add machine-readable output

and fix a few small counter bugs.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago test: Fold cpgbench into cpghum (#205)
Chrissie Caulfield [Thu, 11 May 2017 07:51:34 +0000 (08:51 +0100)]
test: Fold cpgbench into cpghum (#205)

* test: Fold cpgbench into cpghum

cpgbench and cpghum share a lot of code & concepts so it makes
sense to merge them into a single test program that can both
benchmark and sanity check CPG.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago knet: Allow space for encapsulated messages
Christine Caulfield [Tue, 9 May 2017 08:05:12 +0000 (09:05 +0100)]
knet: Allow space for encapsulated messages

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years ago Main: Call mlockall after fork
Andrew Price [Tue, 25 Apr 2017 12:44:33 +0000 (14:44 +0200)]
Main: Call mlockall after fork

Man page of mlockall is clear:
Memory locks are not inherited by a child created via fork(2) and are
automatically removed (unlocked) during an execve(2) or when the
process terminates.

So calling mlockall before corosync_tty_detach is noop when corosync is
executed as a daemon (corosync -f was not affected).

This regression is caused by ed7d054e552b4cb2a0cb502b65f84310ce6da844
(setprio for logsys/qblog was correct, mlockall was not).

Solution is to move corosync_mlockall call on correct place.

Signed-off-by: Andrew Price <anprice@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago Fix typos in README.recovery
Michael Schwarz [Fri, 21 Apr 2017 09:26:10 +0000 (11:26 +0200)]
Fix typos in README.recovery

Signed-off-by: Michael Schwarz <michi.schwarz@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago coroparse: Use readdir instead of readdir_r
Bin Liu [Thu, 20 Apr 2017 06:49:03 +0000 (08:49 +0200)]
coroparse: Use readdir instead of readdir_r

readdir_r is deprecated in glibc 2.24 in favor of readdir (which became
thread safe). Also, because corosync never calls read_uidgid_files_into_icmap
in multiple threads, no problem should appear even with a libc where
readdir is not thread-safe.

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
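
The readdir() pattern that replaces readdir_r() looks roughly like the sketch below. count_dir_entries() is a hypothetical helper, not the coroparse function named in the commit; the part worth noting is the errno handling, because readdir() returns NULL both at the end of the directory and on error.

```c
#include <dirent.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: count the entries of a directory, skipping "."
 * and "..". Returns the count, or -1 if the directory cannot be opened
 * or read.
 */
static long count_dir_entries(const char *path)
{
	struct dirent *ent;
	long count = 0;
	int read_errno;
	DIR *dp;

	dp = opendir(path);
	if (dp == NULL) {
		return -1;
	}

	for (;;) {
		errno = 0;	/* distinguishes end-of-directory from an error */
		ent = readdir(dp);
		if (ent == NULL) {
			break;
		}
		if (strcmp(ent->d_name, ".") == 0 ||
		    strcmp(ent->d_name, "..") == 0) {
			continue;
		}
		count++;
	}

	read_errno = errno;	/* save before closedir() can overwrite it */
	closedir(dp);

	return (read_errno != 0) ? -1 : count;
}
```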
7 years ago totemknet: Handle logpipe creation failure
Bin Liu [Thu, 20 Apr 2017 06:46:00 +0000 (08:46 +0200)]
totemknet: Handle logpipe creation failure

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago wd: Report error when close of wd fails
Bin Liu [Thu, 20 Apr 2017 06:43:45 +0000 (08:43 +0200)]
wd: Report error when close of wd fails

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years ago Qnetd lms: Use UTILS_PRI_RING_ID printf format str
Bin Liu [Thu, 20 Apr 2017 06:42:11 +0000 (08:42 +0200)]
Qnetd lms: Use UTILS_PRI_RING_ID printf format str

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years agocpghum: Fix printf of size_t variable
Bin Liu [Thu, 20 Apr 2017 06:41:21 +0000 (08:41 +0200)]
cpghum: Fix printf of size_t variable

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years agototemknet: Got back to recvmsg() from recvmmsg()
Christine Caulfield [Tue, 11 Apr 2017 12:44:08 +0000 (13:44 +0100)]
totemknet: Got back to recvmsg() from recvmmsg()

The kernel team has recommended that we not use recvmmsg, and since it
confers no particular speed advantage (especially given the extra
memory consumption), I'm going back to single-message recvmsg().

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
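
For illustration, receiving one message per call with recvmsg() might look
like the sketch below. recv_one is a hypothetical helper, not the totemknet
code, and the single-iovec layout is an assumption.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Receive a single message from fd into buf using recvmsg().
 * Returns the number of bytes received, or -1 on error. If truncated is
 * not NULL, it is set to 1 when the message did not fit into buf.
 */
static ssize_t recv_one(int fd, void *buf, size_t len, int *truncated)
{
	struct iovec iov;
	struct msghdr msg;
	ssize_t n;

	iov.iov_base = buf;
	iov.iov_len = len;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;	/* one buffer, one message per call */
	msg.msg_iovlen = 1;

	n = recvmsg(fd, &msg, 0);
	if (n >= 0 && truncated != NULL)
		*truncated = (msg.msg_flags & MSG_TRUNC) != 0;
	return n;
}
```

Unlike recvmmsg(), this needs only one buffer at a time, which matches the
memory-consumption point made in the message.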
7 years agototemconfig: Prefer nodelist over bindnetaddr
Bin Liu [Mon, 10 Apr 2017 02:45:10 +0000 (10:45 +0800)]
totemconfig: Prefer nodelist over bindnetaddr

In a two-node cluster, I have one node configured with Open vSwitch:
5: br-fixed: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UNKNOWN group default
inet 192.168.124.88/24 scope global br-fixed
inet 192.168.124.87/24 scope global secondary br-fixed
inet 192.168.124.83/24 brd 192.168.124.255 scope global secondary
tentative br-fixed
inet 192.168.124.89/24 scope global secondary br-fixed

I use 192.168.124.83 in the nodelist of corosync.conf with udpu, and
bindnetaddr is 192.168.124.0. After upgrading corosync on this node,
it uses 192.168.124.88 instead of 192.168.124.83, as we can see:

corosync-cfgtool -s
Printing ring status.
Local node ID 1084783704

corosync-quorumtool -s
Membership information:
Nodeid Votes Name
1084783697 1 d52-54-77-77-01-02
1084783699 1 d52-54-77-77-01-01 (local)

The other node can see only itself:
corosync-cfgtool -s
Printing ring status.
Local node ID 1084783697
RING ID 0
id = 192.168.124.81
status = ring 0 active with no faults

corosync-quorumtool -s
Membership information:
Nodeid Votes Name
1084783697 1 d52-54-77-77-01-02.virtual.cloud.suse.de (local)

This patch checks whether both nodelist and bindnetaddr are present and,
if so, displays a warning and uses the nodelist information.

Signed-off-by: Bin Liu <bliu@suse.com>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years agoknet: Close libknet down cleanly at shutdown
Christine Caulfield [Tue, 11 Apr 2017 08:03:26 +0000 (09:03 +0100)]
knet: Close libknet down cleanly at shutdown

By tidily shutting down knet in totemknet_finalize we
make sure all the links are cleanly taken down and,
more importantly for us, the corosync LEAVE message gets
sent so we don't get fenced on a clean exit.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years agoman: Document -a option to corosync-quorumtool
Christine Caulfield [Fri, 7 Apr 2017 15:10:17 +0000 (17:10 +0200)]
man: Document -a option to corosync-quorumtool

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years agocpghum test: Improve error codes
Jan Friesse [Fri, 7 Apr 2017 07:32:07 +0000 (09:32 +0200)]
cpghum test: Improve error codes

Return an error when an unknown option is found. Also return error code 2 if
any send/crc/length/sequence error happened. Finally, make sure abort
returns the same error code and not 999 (which is a nonsense code anyway).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
7 years agoquorumtool: Add option to show all node addresses
Christine Caulfield [Tue, 4 Apr 2017 09:08:33 +0000 (10:08 +0100)]
quorumtool: Add option to show all node addresses

The new -a option shows all of the names/IP addresses of nodes
in a multi-homed environment.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years agocpghum: Stop cpghum from reporting fake CRC errors
Christine Caulfield [Tue, 14 Mar 2017 16:38:19 +0000 (16:38 +0000)]
cpghum: Stop cpghum from reporting fake CRC errors

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years agologconfig: Do not overwrite logger_subsys priority
Bin Liu [Fri, 10 Mar 2017 07:22:13 +0000 (15:22 +0800)]
logconfig: Do not overwrite logger_subsys priority

logfile_priority and syslog_priority could be overwritten by
logging.logger_subsys.{logfile_priority|syslog_priority}, which could
lead to the following output (all at notice level):

corosync[21419]:   [QUORUM] Using quorum provider corosync_votequorum
corosync[21419]:   [QUORUM] Members[1]: 1084777643
corosync[21419]:   [QUORUM] This node is within the primary component
                   and will provide service.
corosync[21419]:   [QUORUM] Members[3]: 1084777563 1084777584 1084777643

even though syslog_priority is warning. This patch avoids the
overwrite.

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years agototem: Fix buffer sizes
Christine Caulfield [Thu, 2 Mar 2017 14:57:39 +0000 (14:57 +0000)]
totem: Fix buffer sizes

knet needs buffers to be KNET_MAX_PACKET_SIZE or messages will
get lost or corrupted.

UDPU packets shouldn't be that big so I introduced UDP_FRAME_SIZE_MAX
for that transport.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years agomain: Don't ask libqb to handle segv, it doesn't work
Christine Caulfield [Mon, 27 Feb 2017 15:14:41 +0000 (15:14 +0000)]
main: Don't ask libqb to handle segv, it doesn't work

segv should be handled by corosync; libqb is not the
place to be handling emergency signals.

This currently requires the head of the libqb git tree to
generate a blackbox & coredump in the event of a segfault,
but it's better than the write() spin that currently happens.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
7 years agoLogsys: Change logsys syslog_priority priority
Jan Friesse [Fri, 24 Feb 2017 15:23:50 +0000 (16:23 +0100)]
Logsys: Change logsys syslog_priority priority

LibQB adds a default "*" syslog filter, so we have to set syslog_priority
as low as possible so that filters applied later in
_logsys_config_apply_per_file take effect.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
7 years agoknet: improve logging messages by adding knet subsystem
Fabio M. Di Nitto [Fri, 24 Feb 2017 08:41:35 +0000 (09:41 +0100)]
knet: improve logging messages by adding knet subsystem

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
7 years agocpghum: Add abort_on_error option
Christine Caulfield [Fri, 17 Feb 2017 14:50:27 +0000 (14:50 +0000)]
cpghum: Add abort_on_error option

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years agocpghum: Add min rtt and print stats every alarm
Christine Caulfield [Thu, 16 Feb 2017 15:59:52 +0000 (15:59 +0000)]
cpghum: Add min rtt and print stats every alarm

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years agocpghum: Add Round Trip Time (RTT) reporting
Christine Caulfield [Wed, 15 Feb 2017 14:09:09 +0000 (14:09 +0000)]
cpghum: Add Round Trip Time (RTT) reporting

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
7 years agoknet: Change nodeids to knet_node_id_t for new knet compatibility
Fabio M. Di Nitto [Tue, 14 Feb 2017 05:08:45 +0000 (06:08 +0100)]
knet: Change nodeids to knet_node_id_t for new knet compatibility

After some feedback on GitHub, people prefer to have the option
to support up to 64K node IDs.

libknet added knet_node_id_t to mask the size and type, currently
set to uint16_t.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
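
As a rough illustration of why hiding the width behind a typedef helps, the
sketch below uses a hypothetical local stand-in, example_node_id_t, rather
than the real libknet type, and a hypothetical helper that rejects IDs which
do not fit into 16 bits.

```c
#include <stdint.h>

/* Hypothetical local stand-in; the real type comes from libknet's header. */
typedef uint16_t example_node_id_t;

/*
 * Convert a wider node ID to the 16-bit type, rejecting values that would
 * not fit. Returns 0 on success, -1 if wide_id is out of range.
 */
static int to_example_node_id(uint32_t wide_id, example_node_id_t *out)
{
	if (wide_id > UINT16_MAX)
		return -1;
	*out = (example_node_id_t)wide_id;
	return 0;
}
```

If the underlying type changes later, only the typedef and the range check
need to follow; callers keep using the type name.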
7 years agoknet: Fix MTU sizes & allow transport config in corosync.conf
Christine Caulfield [Mon, 13 Feb 2017 16:54:30 +0000 (16:54 +0000)]
knet: Fix MTU sizes & allow transport config in corosync.conf

Corosync layers don't need to know the knet MTU size - this way
corosync fragments buffers only when they get larger than the
KNET buffer size (64K) and knet fragments below that based on
the actual MTU and transport considerations.

It is also now possible to configure knet to use UDP or SCTP
transports in corosync.conf. This is currently done per-link
so if you have more than 1 link you need several interface{}
stanzas inside totem{} to make it use other than the default
of UDP. If it's useful, I might add the option of a global
default.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>