git.proxmox.com Git - mirror_corosync.git/log
5 years ago build: add option for enabling sanitizer builds
Fabio M. Di Nitto [Wed, 9 Oct 2019 08:46:19 +0000 (10:46 +0200)]
build: add option for enabling sanitizer builds

--with-sanitizers= option is strictly meant for runtime debugging
purposes. Do NOT use in production.

Please check gcc/clang man pages on how to use ASAN/UBSAN/TSAN.

Also allow users to specify SANITIZERS_CFLAGS and SANITIZERS_LDFLAGS
for advanced use.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago totemknet: Add locking for log call
Jan Friesse [Mon, 9 Sep 2019 15:47:24 +0000 (17:47 +0200)]
totemknet: Add locking for log call

Knet callbacks may be called from a different thread than the main
thread. If this happens, log messages may be lost. The most prominent
example is when a link goes up (logged by the main thread) and
host_change_callback_fn is called.

The implemented solution is to add a mutex around every log call in totemknet.
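
A minimal sketch of the idea, not the actual totemknet code: every log
call takes one process-wide mutex, so a callback thread and the main
thread never enter the logger at the same time. The names locked_log,
log_mutex and log_calls are hypothetical and exist only for this example.

```c
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are not corosync identifiers. */
static pthread_mutex_t log_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int log_calls;

/* Serialize every log call behind one mutex so that messages coming
 * from different threads cannot interleave or get lost. */
static void locked_log(const char *fmt, ...)
{
	va_list ap;

	pthread_mutex_lock(&log_mutex);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	log_calls++;
	pthread_mutex_unlock(&log_mutex);
}
```

A caller would use locked_log("link %d is up\n", 0) from any thread; the
counter only exists to make the sketch easy to check.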

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
5 years ago man: Fix link_mode priority description
Jan Friesse [Mon, 26 Aug 2019 13:44:18 +0000 (15:44 +0200)]
man: Fix link_mode priority description

... to match knet source code.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
5 years ago notifyd: Don't dereference NULL key_name
Jan Friesse [Tue, 30 Jul 2019 12:24:32 +0000 (14:24 +0200)]
notifyd: Don't dereference NULL key_name

This problem shouldn't really happen, but better safe than sorry.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
5 years ago totem: Increase ring_id seq after load
Jan Friesse [Mon, 15 Jul 2019 12:08:39 +0000 (14:08 +0200)]
totem: Increase ring_id seq after load

This patch handles the situation where the leader
node (the node with lowest node_id) crashes and is started again
before token timeout of the rest of the cluster.
The newly restarted node restores the ringid of the old ring from
stable storage, so it has the same ringid as rest of the nodes,
but ARU is zero. If the node is able to create a singleton membership
before receiving the joinlist from rest of the cluster,
everything works as expected, because the ring id gets increased
correctly.

But if the node receives a joinlist from another cluster node before
its own joinlist, then it continues as it would had it never left
the cluster. This is not correct, because the new node should always
create a singleton configuration first.

During the recovery phase, ARUs are compared and because they differ
(the ARU of the old leader node is 0), the other nodes
try to send all of their previous messages. This is impossible
(even if it was correct), because other nodes have already freed most
of those messages. The implementation uses an assert to limit maximum
number of messages sent during recovery (we could fix this,
but it's not really the point).

The solution here is to increase the ring_id sequence number by 1 after
loading it from storage. During creation of the commit token it is
always increased by 4, so it will not collide with an existing
sequence.
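
The arithmetic can be sketched as follows. The function names are
hypothetical and the real code paths are more involved; only the steps
of 1 and 4 come from the text above.

```c
#include <stdint.h>

/* Hypothetical helpers; names are illustrative, not corosync functions. */

/* Sequence to use right after restoring the ring id from stable storage:
 * one above the stored value, so the restarted node no longer shares a
 * ring id with the nodes that stayed in the cluster. */
static uint64_t ex_ring_seq_after_load(uint64_t stored_seq)
{
	return stored_seq + 1;
}

/* Sequence chosen when a commit token is created (step of 4, as stated
 * in the commit message). */
static uint64_t ex_ring_seq_for_commit_token(uint64_t current_seq)
{
	return current_seq + 4;
}
```

With a stored sequence of 8, the restarted node continues from 9 while
the surviving nodes move from 8 to 12, so the two values differ.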

Thanks to Christine Caulfield <ccaulfie@redhat.com> for clarifying the
commit message.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago init: Use cpgtool instead of cfgtool
Jan Friesse [Thu, 4 Jul 2019 13:07:44 +0000 (15:07 +0200)]
init: Use cpgtool instead of cfgtool

The init script used to call corosync-cfgtool -s to wait until
corosync accepts IPC connections. The problem with this approach
is that an error code is returned not only if IPC cannot be initialized,
but also when one of the rings is marked as failed, which prevents the
corosync service from starting. Corosync with one failed ring can work
just fine, so there is no need to fail startup.

This patch changes the call from corosync-cfgtool to corosync-cpgtool.
To make a broken ring easier to spot, corosync-cfgtool -s is called after
cpgtool returns successfully, and a warning is issued if cfgtool
fails.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago notifyd: Fix warning produced by 32-bit compiler
Jan Friesse [Thu, 4 Jul 2019 12:36:54 +0000 (14:36 +0200)]
notifyd: Fix warning produced by 32-bit compiler

time_t is a platform-dependent real type, which is usually long int on
64-bit platforms but only int on 32-bit platforms, so printing it with
%ld generated a warning.

The solution is either to cast time_t to long int or to use functions
that work with time_t. The latter option is used in this patch, which
uses localtime and strftime to print the time_t value.
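
A small sketch of the second approach, using only standard C calls. The
function name and the chosen format string are assumptions made for
illustration; the patch itself may format the value differently.

```c
#include <stddef.h>
#include <time.h>

/* Format a time_t without assuming its underlying integer type.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or 0 if the conversion failed or the buffer is too small. */
static size_t ex_format_timestamp(time_t t, char *buf, size_t buflen)
{
	const struct tm *tm_res = localtime(&t);

	if (tm_res == NULL) {
		return 0;
	}
	return strftime(buf, buflen, "%Y-%m-%d %H:%M:%S", tm_res);
}
```

Because no printf conversion is applied to the time_t itself, the same
code compiles without warnings whether time_t is int or long int.
Threaded code would probably prefer localtime_r where it is available.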

Also code is refactored to remove duplicate calls and add _cs_snmp
prefix to prevent snmp_ prefix collision.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago cfgtool: Remove unused code
Jan Friesse [Thu, 4 Jul 2019 13:38:18 +0000 (15:38 +0200)]
cfgtool: Remove unused code

corosync_cfg_ring_status_get returns string status, which is always OK
for UDP(U) and detailed status for Knet transport. Previously also
FAULTY status was returned for UDP(U) and cfgtool used to return error
code back to shell when one of the interfaces was faulty.

Because FAULTY is now not returned, it's not needed to have code for
handling it.

Also man page was misleading, so it is fixed too.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago logging: Add CS_PRI_NODE_ID and CS_PRI_RING_ID
Jan Friesse [Tue, 2 Jul 2019 08:36:30 +0000 (10:36 +0200)]
logging: Add CS_PRI_NODE_ID and CS_PRI_RING_ID

Previously the node id was logged either as %d (most often), %u, %x or
PRI.32, and the ring id either as %lld or %llx with various separators
(., :, /) between the rep nodeid and seq. This seems to cause confusion.

This patch adds macros CS_PRI_NODE_ID, CS_PRI_RING_ID and
CS_PRI_RING_ID_SEQ (CS prefix = corosync, PRI modeled in spirit of
inttypes.h PRIx32) and makes code use them.
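
To illustrate the pattern only: the EX_ macros below are hypothetical
stand-ins modeled on inttypes.h, and the real CS_PRI_* definitions in
the corosync headers may differ in type and separator.

```c
#include <inttypes.h>
#include <stdio.h>

/* Illustrative definitions only; not the actual corosync macros. */
#define EX_PRI_NODE_ID      PRIu32
#define EX_PRI_RING_ID_SEQ  PRIx64
#define EX_PRI_RING_ID      "%" EX_PRI_NODE_ID ".%" EX_PRI_RING_ID_SEQ

/* Format a ring id as "<rep nodeid>.<seq>" using one shared format, so
 * every log message prints the same representation. */
static int ex_format_ring_id(char *buf, size_t len,
    uint32_t rep_nodeid, uint64_t seq)
{
	return snprintf(buf, len, EX_PRI_RING_ID, rep_nodeid, seq);
}
```

With these assumed definitions, a ring id with rep nodeid 1 and
sequence 0x1c is rendered as "1.1c" everywhere the macro is used.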

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago vqsim: Fix gitignore
Jan Friesse [Tue, 2 Jul 2019 08:34:08 +0000 (10:34 +0200)]
vqsim: Fix gitignore

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
5 years ago totemknet: Disable forwarding on shutdown
Jan Friesse [Thu, 27 Jun 2019 06:33:27 +0000 (08:33 +0200)]
totemknet: Disable forwarding on shutdown

Disabling forwarding will make knet flush the messages (especially
LEAVE one).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago totemconfig: Fix compiler warning
Jan Friesse [Mon, 17 Jun 2019 13:40:13 +0000 (15:40 +0200)]
totemconfig: Fix compiler warning

The compiler is unable to understand the relation between members and
num_configured and warns about members being uninitialized. Instead of
initializing members to 0 and (potentially after some code
refactoring) letting the code fall through to display an error message,
the more explicit method of an assert is used.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago totem: fix check if all nodes have same number of links
Thomas Lamprecht [Fri, 14 Jun 2019 16:56:57 +0000 (18:56 +0200)]
totem: fix check if all nodes have same number of links

configured links may not come in order in the interfaces array, which
holds an entry for _all_ possible links, not just configured ones.

So iterate through all interfaces, but skip those which are not
configured. This allows to start corosync with a configuration where
link 0 is currently not mentioned, as else it was checked but had
member_count = 0 from its default initialization, which then made
this code report a false positive for the "Not all nodes have the
same number of links" check even on a correct config.
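
A simplified sketch of the corrected loop. The struct and function
names are hypothetical and reduced to the two fields that matter here;
the real totem_config structures carry much more state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for one entry of the interfaces array. */
struct ex_interface {
	bool configured;
	int member_count;
};

/* Return true when every configured link lists the same number of
 * members. Unconfigured slots are skipped instead of being compared
 * with their default member_count of 0. */
static bool ex_links_consistent(const struct ex_interface *ifaces, size_t n)
{
	int members = -1;

	for (size_t i = 0; i < n; i++) {
		if (!ifaces[i].configured) {
			continue;
		}
		if (members == -1) {
			/* First configured link sets the reference count. */
			members = ifaces[i].member_count;
		} else if (ifaces[i].member_count != members) {
			return false;
		}
	}
	return true;
}
```

A configuration that leaves link 0 unconfigured but gives links 1 and 2
three members each passes this check, which matches the intent above.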

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago totem: fix check if all nodes have name attrs in multi-link setups
Thomas Lamprecht [Fri, 14 Jun 2019 16:31:16 +0000 (18:31 +0200)]
totem: fix check if all nodes have name attrs in multi-link setups

As totem_config->interfaces entries are _all_ possible links and not
only the configured ones we cannot trust that interface[0] is
configured at the time of checking, and thus has a valid
member_count. So set the members variable to the member_count entry
from an actually configured interface and loop over that one.

This fixes a case where the check for the name property on all nodes
for multi links was skipped if link 0 was not configured, as then its
member_count was 0.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago corosync-notifyd: Add option to disable DNS lookup
dkutergin [Tue, 11 Jun 2019 20:58:13 +0000 (13:58 -0700)]
corosync-notifyd: Add option to disable DNS lookup

New configuration option -n is added.

Signed-off-by: dkutergin <dmytro.kutergin@harmonicinc.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago totemsrp: Fix warnings produced by gcc 9.1
Jan Friesse [Tue, 4 Jun 2019 13:24:58 +0000 (15:24 +0200)]
totemsrp: Fix warnings produced by gcc 9.1

Newer gcc warns about passing a possibly unaligned pointer from a packed
structure. This shouldn't be a problem for x86.

The implemented solution is to let the compiler do its job (the compiler
knows whether a pointer is aligned, so accessing a structure field is
safe) and use it together with support for assigning and returning a
structure (not a pointer to the structure).

- srp_addr_copy is removed and replaced by simple assignment
- srp_addr_copy_endian_convert is removed and replaced by
  srp_addr_endian_convert function which takes srp_addr structure and
  returns endian converted srp_addr structure
- functions which accept srp_addr array are not changed because
  (luckily) non-aligned pointer is always just one item array and
  such item is always used as a source pointer so it's possible to use
  temporary variable
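
The by-value approach can be sketched like this. All ex_ names and the
struct layout are assumptions made for the example, and the packed
attribute is the GCC/Clang spelling; the real srp_addr holds more than
one field.

```c
#include <stdint.h>

/* Illustrative types; the field layout is an assumption. */
struct ex_addr {
	uint32_t nodeid;
};

struct ex_packet {
	uint8_t type;
	struct ex_addr source;	/* misaligned because the struct is packed */
} __attribute__((packed));

static uint32_t ex_swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

/* Takes and returns the structure by value, so no pointer into the
 * packed structure is ever formed; the compiler generates whatever
 * access sequence the member's real alignment requires. */
static struct ex_addr ex_addr_endian_convert(struct ex_addr in)
{
	struct ex_addr out;

	out.nodeid = ex_swab32(in.nodeid);
	return out;
}
```

A caller writes pkt.source = ex_addr_endian_convert(pkt.source); plain
assignment replaces the former copy helper in the same way.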

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago cpg: Move filling of member_list to subfunction
Jan Friesse [Thu, 16 May 2019 12:08:25 +0000 (14:08 +0200)]
cpg: Move filling of member_list to subfunction

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
5 years ago cpg: Add more comments to notify_lib_joinlist
Jan Friesse [Wed, 15 May 2019 15:39:13 +0000 (17:39 +0200)]
cpg: Add more comments to notify_lib_joinlist

And make handling of left_list more generic. Also free skiplist
allocated by joinlist_inform_clients function. Last (but not least)
remove czechlish founded (should have been pp of "find").

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
5 years ago cpg: send single confchg event per group on joinlist
Fabian Grünbichler [Wed, 8 May 2019 14:31:15 +0000 (16:31 +0200)]
cpg: send single confchg event per group on joinlist

using a similar approach to

43bead364514e8ae2ba00bcf07c460e31d0b1765
"Send one confchg event per CPG group to CPG client"

which did the same for leave events on a network partition.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago cpg: notify_lib_joinlist: drop conn parameter
Fabian Grünbichler [Wed, 15 May 2019 11:45:13 +0000 (13:45 +0200)]
cpg: notify_lib_joinlist: drop conn parameter

since it is always set to NULL.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago vqsim: Check length of copied optarg (tag: v3.0.2)
Jan Friesse [Tue, 11 Jun 2019 13:30:00 +0000 (15:30 +0200)]
vqsim: Check length of copied optarg

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago vqsim: Check result of icmap_set_uint32
Jan Friesse [Tue, 11 Jun 2019 13:26:29 +0000 (15:26 +0200)]
vqsim: Check result of icmap_set_uint32

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago vqsim: Remove unused total_nodes
Jan Friesse [Tue, 11 Jun 2019 13:11:13 +0000 (15:11 +0200)]
vqsim: Remove unused total_nodes

... and remove unused nodes_in_partition function.

Also replace TAILQ_FOREACH with goto to while cycle.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago vqsim: Free allocated newvq on error
Jan Friesse [Tue, 11 Jun 2019 12:50:03 +0000 (14:50 +0200)]
vqsim: Free allocated newvq on error

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago vqsim: Check length of received message
Jan Friesse [Tue, 11 Jun 2019 12:48:41 +0000 (14:48 +0200)]
vqsim: Check length of received message

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago vqsim: Check write result
Jan Friesse [Tue, 11 Jun 2019 12:46:34 +0000 (14:46 +0200)]
vqsim: Check write result

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago vqsim: Do not access unitialized argv[0]
Jan Friesse [Tue, 11 Jun 2019 09:04:48 +0000 (11:04 +0200)]
vqsim: Do not access unitialized argv[0]

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago totemknet: Initialize return value in setup_nozzle
Jan Friesse [Tue, 11 Jun 2019 09:00:03 +0000 (11:00 +0200)]
totemknet: Initialize return value in setup_nozzle

Also add comment why return value is currently not used.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago totemknet: macaddr_str is always set
Jan Friesse [Tue, 11 Jun 2019 08:44:17 +0000 (10:44 +0200)]
totemknet: macaddr_str is always set

Check for NULL was invalid, because macaddr_str is either defined in cmap
or set to "54:54:01:00:00:00".

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago totemknet: Ignore icmap_get_string result
Jan Friesse [Tue, 11 Jun 2019 08:40:07 +0000 (10:40 +0200)]
totemknet: Ignore icmap_get_string result

... and add comment why it is not a bug.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago totemknet: create_nozzle_device simplify check
Jan Friesse [Tue, 11 Jun 2019 08:32:40 +0000 (10:32 +0200)]
totemknet: create_nozzle_device simplify check

ipaddr existence is checked for being not NULL by caller setup_nozzle.
Also ipaddr was passed to reparse_nozzle_ip_address function unchecked
so code would crash before reaching the actual check.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago totemip: Use res in totemip_sa_equal
Jan Friesse [Tue, 11 Jun 2019 08:28:41 +0000 (10:28 +0200)]
totemip: Use res in totemip_sa_equal

Setting res to -1 did not entirely follow the semantics of an "equal"
operation. Setting it to 0 and returning it when the families differ
makes the compiler happy.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago totemconfig: ipaddr_equal use switch
Jan Friesse [Tue, 11 Jun 2019 08:24:05 +0000 (10:24 +0200)]
totemconfig: ipaddr_equal use switch

The compiler may have a problem understanding the relation between
addr1p and addrlen. A small change makes the code a little more readable
and the compiler happy.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago configure: Fix GDB_CFLAGS typo
Jan Friesse [Mon, 10 Jun 2019 08:56:10 +0000 (10:56 +0200)]
configure: Fix GDB_CFLAGS typo

GDB_FLAGS (without C) is the correct name of variable
to print in the summary.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago man: Add vqsim man page into distributed tarball
Jan Friesse [Mon, 10 Jun 2019 06:04:50 +0000 (08:04 +0200)]
man: Add vqsim man page into distributed tarball

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
5 years ago spec: Add support for user-flags configure option
Jan Friesse [Fri, 7 Jun 2019 08:20:04 +0000 (10:20 +0200)]
spec: Add support for user-flags configure option

Passing -ggdb3 (or -g3) during compilation may result in corrupted
debuginfo files (a bug in debugedit, filed for Fedora as
https://bugzilla.redhat.com/show_bug.cgi?id=1708786). Until the bug is
fixed it's possible to either change configure to add -ggdb2/-g2 or use
the already existing --enable-user-flags option and rely on the
environment set by rpmbuild.

Patch implements second option so RPM distros without broken debugedit
are not affected.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago man: Enahnce block_unlisted_ips description
Jan Friesse [Fri, 31 May 2019 07:34:33 +0000 (09:34 +0200)]
man: Enahnce block_unlisted_ips description

Thanks Christine Caulfield <ccaulfie@redhat.com> for
Englishify and refining the description.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago man: Enhance corosync.conf mp a bit
Jan Friesse [Tue, 28 May 2019 08:08:37 +0000 (10:08 +0200)]
man: Enhance corosync.conf mp a bit

Fix issues found by Ulrich Windl <Ulrich.Windl@rz.uni-regensburg.de>

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago cfgtool: Fix link status display
Fabian Grünbichler [Wed, 29 May 2019 12:57:09 +0000 (14:57 +0200)]
cfgtool: Fix link status display

instead of the nodeid, this displayed arbitrary values (usually '1')
from other cmap keys under nodelist.node.XX.

sscanf returns the number of conversions even on mismatch, e.g. it also
returns 1 for

nodelist.node.2.quorum_votes
nodelist.node.2.ring0_addr
nodelist.node.2.name
...

instead of just

nodelist.node.2.nodeid

which leads to the value of (at least) quorum_votes being stored in
nodeid_list in addition to the actual nodeid.

storing the returned int in a cs_error_t enum also potentially masks
errors, so just compare the result with the expectation directly.
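
The sscanf behaviour is easy to reproduce in isolation. The sketch
below uses hypothetical function names; the stricter variant shows one
possible way to require a full match (via %n) and is not necessarily
what the patch does.

```c
#include <stdio.h>
#include <string.h>

/* Naive check: a return value of 1 only says that "%u" converted. It
 * says nothing about whether the trailing ".nodeid" literal matched. */
static int ex_naive_match(const char *key, unsigned int *pos)
{
	return sscanf(key, "nodelist.node.%u.nodeid", pos) == 1;
}

/* Stricter check: %n is only stored when the whole format, including
 * the trailing literal, was matched, so also require that the complete
 * key was consumed. */
static int ex_strict_match(const char *key, unsigned int *pos)
{
	int consumed = -1;

	if (sscanf(key, "nodelist.node.%u.nodeid%n", pos, &consumed) != 1) {
		return 0;
	}
	return consumed >= 0 && (size_t)consumed == strlen(key);
}
```

The naive variant accepts nodelist.node.2.quorum_votes and reports
position 2, while the strict variant accepts only keys that end in
.nodeid.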

Fixes: c0d14485c3ebdeb2332f7c48acd155163e5b7fc1
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago knet: Use block_unlisted_ips
Jan Friesse [Fri, 24 May 2019 07:33:13 +0000 (09:33 +0200)]
knet: Use block_unlisted_ips

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago udpu: Drop packets from unlisted IPs
Jan Friesse [Fri, 24 May 2019 06:48:01 +0000 (08:48 +0200)]
udpu: Drop packets from unlisted IPs

This feature allows corosync to block packets received from unknown
nodes (nodes with IP address which is not in the nodelist). This is
mainly for situations when "forgotten" node is booted and tries to join
cluster which already removed such node from configuration. Another use
case is to allow atomic reconfiguration and rejoin of two separate
clusters.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago knet: Fix initialising of knet access lists.
Christine Caulfield [Tue, 19 Mar 2019 10:47:58 +0000 (10:47 +0000)]
knet: Fix initialising of knet access lists.

It needs to be done at both reload and initialize time.
Also disable access lists if the config key is removed.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago knet: allow corosync to use knet access lists
Fabio M. Di Nitto [Sun, 10 Mar 2019 07:23:38 +0000 (08:23 +0100)]
knet: allow corosync to use knet access lists

currently knet acl are only available on master
but they might be backported
to stable1 as they don't break onwire protocol.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago man: Enhance token_retransmit description
yuan ren [Thu, 16 May 2019 10:31:44 +0000 (18:31 +0800)]
man: Enhance token_retransmit description

Signed-off-by: yuan ren <yren@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago totemconfig: Fix minimum limit for hold timeout
yuan ren [Tue, 14 May 2019 11:33:12 +0000 (19:33 +0800)]
totemconfig: Fix minimum limit for hold timeout

Make sure the retransmit timeout has the lowest limit
`MINIMUM_TIMEOUT`. So, the lowest limit of hold should be
recalculated.

Also token timeout and retransmits count should
keep a relational expression.

Signed-off-by: yuan ren <yren@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago vqsim: Enhance vqsim
Christine Caulfield [Fri, 26 Apr 2019 09:54:32 +0000 (10:54 +0100)]
vqsim: Enhance vqsim

1. Enable scripting of vqsim and add man page

I've added a 'sleep' command to help with scripting as well as
documentation on how to do it.

2. Make 'sync' operation much more robust and useful

Refactored a lot of code to make sure that in sync mode the
prompt appears at the 'right' time. What we do is wait for all
of the nodes in all partitions to have the same ring_id. If this
doesn't happen then the timeout will fire as before.

3. Rename binary to corosync-vqsim and add a sub-package for it

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago knet: Fix a couple of errors when adding a new link
Christine Caulfield [Thu, 2 May 2019 13:22:47 +0000 (14:22 +0100)]
knet: Fix a couple of errors when adding a new link

When adding a new link for the first time you will often see:
1) knet_link_set_ping_timers for nodeid 1, link 1 failed: Invalid
argument (22)
2) New config has different knet transport for link 1. Internal value
was NOT changed. To reconfigure an interface it must be deleted and
recreated. A working interface needs to be available to corosync at all
times

1) is caused by setting the ping timers twice, once in
totemknet_member_add() and once in totemknet_refresh_config().
The first time we don't know the value
so it's zero and thus display an error. For this we simply check
for the zero and skip the knet API call. It's not ideal, but
totemconfig needs a lot of reconfiguring itself before we can
make this more sane.

2) was caused by simply comparing an unconfigured link with
a configured one, so OF COURSE, they are going to be different!

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago totemconfig: fix autogen mcastaddr for ipv6-4
yuan ren [Sun, 28 Apr 2019 10:29:37 +0000 (18:29 +0800)]
totemconfig: fix autogen mcastaddr for ipv6-4

When UDP is used as a transport, the error
"Multicast address family does not match bind address family"
would occur if no IPv6 address is specified in /etc/hosts while
totem.ip_version is set to ipv6-4, because the generated mcastaddr
(if not specified) depends only on totem.ip_version.

Solution is to use bindnetaddr (configured or generated from
nodelist) addr family.

Signed-off-by: yuan ren <yren@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago totemconfig: Ensure nodeid is specified for IPv6
Jan Friesse [Wed, 24 Apr 2019 12:47:47 +0000 (14:47 +0200)]
totemconfig: Ensure nodeid is specified for IPv6

Thanks Yuan Ren <yren@suse.com> for finding this problem.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago vqsim: Fix vqsim for corosync 3.0
Christine Caulfield [Thu, 25 Apr 2019 13:38:52 +0000 (14:38 +0100)]
vqsim: Fix vqsim for corosync 3.0

A couple of small internal changes in corosync 3.0 broke vqsim.
1) The way the custom config file is specified (no longer an env variable)
2) votequorum now needs to know ouZ_node_pos

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago vqsim: Make vqsim compile
Jan Friesse [Tue, 23 Apr 2019 14:48:30 +0000 (16:48 +0200)]
vqsim: Make vqsim compile

Also add vqsim binary to .gitignore.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago totemconfig: ipaddr_equal check just addr part
Jan Friesse [Tue, 23 Apr 2019 10:38:04 +0000 (12:38 +0200)]
totemconfig: ipaddr_equal check just addr part

Checking the whole structure is fine for IPv4, but an IPv6 address also
contains a scope id, which may be a problem for a local address. It's
possible to use a zone index, but because it's not required when a host
name is used, it shouldn't be needed when an IPv6 address is used.

Example configuration snippet which fails without the patch:

...
nodelist {
  node {
    nodeid: 1
    ring0_addr: fe80::1234:5678:9abc:def1
  }
}
...

(the example succeeds when %eth0 is used).

With patch, zone index is not needed.
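
A sketch of an address-only comparison using the standard socket
structures. The function name is hypothetical and the real ipaddr_equal
has a different signature; the point is that sin6_scope_id is never
looked at.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Compare only the address bytes of two socket addresses. For IPv6 this
 * deliberately ignores sin6_scope_id (the zone index), the port and the
 * flow info. Returns 1 when equal, 0 otherwise. */
static int ex_ipaddr_equal(const struct sockaddr_storage *a,
    const struct sockaddr_storage *b)
{
	if (a->ss_family != b->ss_family) {
		return 0;
	}

	switch (a->ss_family) {
	case AF_INET:
		return memcmp(&((const struct sockaddr_in *)a)->sin_addr,
		    &((const struct sockaddr_in *)b)->sin_addr,
		    sizeof(struct in_addr)) == 0;
	case AF_INET6:
		return memcmp(&((const struct sockaddr_in6 *)a)->sin6_addr,
		    &((const struct sockaddr_in6 *)b)->sin6_addr,
		    sizeof(struct in6_addr)) == 0;
	default:
		return 0;
	}
}
```

Two sockaddr_in6 values holding fe80::1234:5678:9abc:def1 compare equal
here even when only one of them carries a non-zero scope id.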

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago cpg: Add CPG_REASON_UNDEFINED
Jan Friesse [Tue, 16 Apr 2019 10:52:31 +0000 (12:52 +0200)]
cpg: Add CPG_REASON_UNDEFINED

Previously the reason field for the member_list items
in cpg_totem_confchg_fn was unset, which may be a little confusing.

Solution is to add a special value CPG_REASON_UNDEFINED and use it for
the member_list items.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago crypto: re-introduce secauth parameter
Fabian Grünbichler [Wed, 10 Apr 2019 07:43:33 +0000 (09:43 +0200)]
crypto: re-introduce secauth parameter

with the following semantics:
- default off
- implies crypto_hash SHA256 and crypto_cipher AES256
- crypto_* have higher precedence
- only applicable for knet, like crypto_*

this should make upgrading from Corosync 2.x less painful for users that
have an explicit secauth=on in their configuration.
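
The precedence rules can be sketched as a small resolver. Everything
here is hypothetical: the struct, the function, the lower-case value
spellings and the "none" default are assumptions for illustration, not
the parser's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type for the sketch. */
struct ex_crypto {
	const char *hash;
	const char *cipher;
};

/* explicit_hash / explicit_cipher are NULL when the corresponding
 * crypto_* key is absent from the configuration. */
static struct ex_crypto ex_resolve_crypto(bool secauth,
    const char *explicit_hash, const char *explicit_cipher)
{
	/* Assumed default: no crypto unless something asks for it. */
	struct ex_crypto c = { "none", "none" };

	if (secauth) {
		/* secauth implies SHA256 and AES256. */
		c.hash = "sha256";
		c.cipher = "aes256";
	}
	/* Explicit crypto_* settings have higher precedence. */
	if (explicit_hash != NULL) {
		c.hash = explicit_hash;
	}
	if (explicit_cipher != NULL) {
		c.cipher = explicit_cipher;
	}
	return c;
}
```

With secauth on and only crypto_cipher given, the hash comes from
secauth and the cipher from the explicit setting.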

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago totemconfig: Remove support for 3des
Jan Friesse [Thu, 11 Apr 2019 06:23:29 +0000 (08:23 +0200)]
totemconfig: Remove support for 3des

Triple DES has been considered a "weak cipher" since 2016, so there is
really no need to support it in corosync. Thanks to a bug in
Corosync/Knet/NSS which caused 3des to not work at all,
no matter what library was used, we can just remove support for 3des
without breaking compatibility.

Also fix coroparse so:
- totem.crypto_type is removed (this is 1.x construct which was not used
even in 2.x)
- Add checking of totem.crypto_model.
- Enumerate possible values for crypto_model, crypto_cipher and
crypto_hash error messages

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago keygen: Reflect change in knet
Jan Friesse [Tue, 9 Apr 2019 15:09:34 +0000 (17:09 +0200)]
keygen: Reflect change in knet

Knet commit 1cb36f0cffd4559971826ca4774a88c5b05882fb reduced minimal
key length to 1024-bit. Keygen should keep compatibility with already
released 3.0.[0-1] so default key length should be 2048 bits. It's
possible to use -s argument to generate shorter key - keygen respects
minimum/maximum as defined by knet.

Also fix man page to reflect this change.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago set totem.keyfile and totem.key to RO
Fabian Grünbichler [Wed, 3 Apr 2019 19:57:30 +0000 (21:57 +0200)]
set totem.keyfile and totem.key to RO

so that we get the nice log message when attempting to modify them at
runtime, just like for totem.crypto_* and co.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago Revert "init: Enable StopWhenUnneeded"
Jan Friesse [Thu, 4 Apr 2019 09:40:19 +0000 (11:40 +0200)]
Revert "init: Enable StopWhenUnneeded"

This reverts commit 03d9321bc80887d4578744c26c05d61e2d9d4278.

Reverted because when the corosync service is not enabled and corosync
is started by "systemctl start corosync", it is then immediately
shut down because of "Unit not needed anymore. Stopping.".

This is really not expected behavior.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago totemsrp: Word spelling mistake
yuan ren [Fri, 29 Mar 2019 07:43:10 +0000 (15:43 +0800)]
totemsrp: Word spelling mistake

Signed-off-by: yuan ren <reyren179@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago coroparse: Fix compiler warning
Jan Friesse [Tue, 26 Feb 2019 12:28:08 +0000 (13:28 +0100)]
coroparse: Fix compiler warning

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago configure: Do not autodetect nozzle
Jan Friesse [Tue, 26 Feb 2019 10:04:16 +0000 (11:04 +0100)]
configure: Do not autodetect nozzle

Nozzle is part of kronosnet but it is independent library. Enabling it
when detected without ability to turn it off is not in line with
other libraries.

Solution is to use same method as for other libraries - add
--enable-nozzle to configure script and add support for this option into
spec file.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago nozzle: Add support for libnozzle devices
Christine Caulfield [Tue, 15 Jan 2019 15:18:18 +0000 (15:18 +0000)]
nozzle: Add support for libnozzle devices

A nozzle device is a pseudo ethernet device that routes network
traffic through a channel on the corosync knet network (NOT cpg or any
corosync internal service) to other nodes in the cluster. It allows
applications to take advantage of knet features such as multipathing,
automatic failover, link switching etc.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago quorumtool: Fix exit status codes
Jan Friesse [Thu, 14 Feb 2019 15:05:59 +0000 (16:05 +0100)]
quorumtool: Fix exit status codes

1. Use EXIT_SUCCESS and EXIT_FAILURE when possible
2. For -s option return EXIT_SUCCESS when no problem appeared and node
   is quorate, EXIT_FAILURE if problem appeared and exit code 2
   (EXIT_NOT_QUORATE) when no problem appeared but node is not quorate.
3. Document exit codes in the man page
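
The mapping in point 2 can be written down as a tiny function. The
function name and its two boolean inputs are hypothetical; only the
three exit values come from the list above.

```c
#include <stdbool.h>
#include <stdlib.h>

#define EXIT_NOT_QUORATE 2	/* value taken from the commit message */

/* Map the outcome of a status query to a shell exit code:
 * any problem -> EXIT_FAILURE, otherwise quorate -> EXIT_SUCCESS,
 * otherwise EXIT_NOT_QUORATE. */
static int ex_status_exit_code(bool had_error, bool quorate)
{
	if (had_error) {
		return EXIT_FAILURE;
	}
	return quorate ? EXIT_SUCCESS : EXIT_NOT_QUORATE;
}
```

A script can then distinguish "could not ask" from "asked, and the node
is not quorate" by checking for exit code 2.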

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago corosync-cfgtool: Fix -i matching
Jan Friesse [Wed, 13 Feb 2019 11:54:55 +0000 (12:54 +0100)]
corosync-cfgtool: Fix -i matching

Previously it was required to use link id together with IP address (ex.
"0 127.0.0.1") as a -i parameter.

This was reported as not very user friendly. The solution is to split
the returned interface name and try to match the link id and IP address
separately.

Also fix typo in description of parameter -s.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years ago build: Use the AWK variable provided by configure
Ferenc Wágner [Tue, 29 Jan 2019 14:25:18 +0000 (15:25 +0100)]
build: Use the AWK variable provided by configure

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years ago build: Use the SED variable provided by configure
Ferenc Wágner [Tue, 29 Jan 2019 14:24:19 +0000 (15:24 +0100)]
build: Use the SED variable provided by configure

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agoconfigure.ac: AC_PROG_SED is already present
Ferenc Wágner [Tue, 29 Jan 2019 14:15:27 +0000 (15:15 +0100)]
configure.ac: AC_PROG_SED is already present

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agocorosync.conf.5: typography fixes
Ferenc Wágner [Sat, 22 Dec 2018 17:58:27 +0000 (18:58 +0100)]
corosync.conf.5: typography fixes

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agocorosync.conf.5: fix grammar
Ferenc Wágner [Sat, 22 Dec 2018 17:56:01 +0000 (18:56 +0100)]
corosync.conf.5: fix grammar

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agocfgtool: Improve link status display
Christine Caulfield [Tue, 22 Jan 2019 10:06:29 +0000 (10:06 +0000)]
cfgtool: Improve link status display

Now show the nodeids properly, rather than node indexes which were
annoying and unhelpful.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agodoc: Update INSTALL file
Jan Friesse [Wed, 16 Jan 2019 13:39:42 +0000 (14:39 +0100)]
doc: Update INSTALL file

- Add LibQB and Knet links
- Remove old (pre udpu) config file example
- Change corosync.conf man page to contain useful information about
token timeout

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agoinit: Enable StopWhenUnneeded v3.0.1
Jan Pokorný [Fri, 14 Dec 2018 20:07:37 +0000 (21:07 +0100)]
init: Enable StopWhenUnneeded

It shall be a rule of thumb not to combine "application stack"
components run under particular init/supervision mechanism and
run by whatever other means (without transitive relationships
like when corosync's client runs from other pacemaker that is
itself started through systemd) when there's a directed graph
of reliance between them (sans constrained corner cases like
when one of such components is a kernel module).

And corosync on its own is just a service provider that only
appears useful when utilized as a basic building block for
application specific distributed environments.

Therefore, we may assume whenever corosync gets started by the
means of systemd, it's because of a mechanized attempt to satisfy
declared dependency of some such corosync's client that is about
to be started under the service manager realms (directly or, by
induction, through the same triggering mechanism indirectly).
Hence, when there's no such client around anymore (unless
this dependant is being restarted at the moment, see below)
corosync shall rather shut down as well.

In the past, there was an issue with systemd regarding said
inflicted restart of the dependant/client, but that's resolved
as of v236:
https://github.com/systemd/systemd/commit/deb4e7080db9dcd2a1d51ccf7c357f88ea863e54

Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agototemip: Use AF_UNSPEC for ipv4-6 and ipv6-4
Jan Friesse [Thu, 10 Jan 2019 14:06:20 +0000 (15:06 +0100)]
totemip: Use AF_UNSPEC for ipv4-6 and ipv6-4

AF_UNSPEC returns different results than AF_INET/AF_INET6, because the
nsswitch.conf modules are searched in order and the search stops asking
other modules once the current module succeeds.

Example of the difference between the previous and the new code when
ipv6-4 is used:
- /etc/hosts contains test_name with an IPv4 address
- the previous code called AF_INET6, where /etc/hosts failed, so other
methods were used which may return an IPv6 address -> the result was
either a failure or an IPv6 address.
- the new code calls AF_UNSPEC, returning the IPv4 address defined in
/etc/hosts -> the result is an IPv4 address

The new code's behavior should solve problems caused by nss-myhostname.
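
The idea of one AF_UNSPEC lookup followed by a family preference can be
sketched in Python as below. This is an assumption-laden illustration, not the
totemip code: the function name is hypothetical, and picking the first result
of the preferred family is a guess at the selection rule.

```python
import socket


def pick_address(name, order=(socket.AF_INET6, socket.AF_INET)):
    """Resolve `name` once with AF_UNSPEC, then prefer families in `order`.

    The default order corresponds to ipv6-4. Raises socket.gaierror if the
    name cannot be resolved at all.
    """
    results = socket.getaddrinfo(name, None, socket.AF_UNSPEC,
                                 socket.SOCK_DGRAM)
    for family in order:
        for res_family, _type, _proto, _canon, sockaddr in results:
            if res_family == family:
                return family, sockaddr[0]
    return None
```

Because the resolver is asked only once, whichever source answers first (for
example /etc/hosts) decides which addresses are available; the order then
only chooses among them.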

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
5 years ago[totemknet] update for libknet.so.2.0.0 init API
Fabio M. Di Nitto [Thu, 3 Jan 2019 08:57:49 +0000 (09:57 +0100)]
[totemknet] update for libknet.so.2.0.0 init API

more changes are to be expected on this front as the API evolves in
knet master.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agoConfig version must be specified
Ferenc Wágner [Sun, 16 Dec 2018 16:52:08 +0000 (17:52 +0100)]
Config version must be specified

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agoDon't declare success early
Ferenc Wágner [Sun, 16 Dec 2018 14:30:27 +0000 (15:30 +0100)]
Don't declare success early

Here we're very far from entering the main loop, even farther from
sending the READY notification to systemd.  This sounded awkward:

systemd[1]: Starting Corosync Cluster Engine...
corosync[827]:   [MAIN  ] Corosync Cluster Engine ('2.99.5'):
  started and ready to provide service.
corosync[827]:   [MAIN  ] Corosync built-in features: dbus monitoring
  watchdog augeas systemd xmlconf snmp pie relro bindnow
corosync[827]:   [MAIN  ] parse error in config: No interfaces defined
corosync[827]:   [MAIN  ] Corosync Cluster Engine exiting with status 8
  at main.c:1378.
systemd[1]: corosync.service: Main process exited, code=exited,
  status=8/n/a
systemd[1]: corosync.service: Failed with result 'exit-code'.
systemd[1]: Failed to start Corosync Cluster Engine.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agoMore natural error messages
Ferenc Wágner [Sun, 16 Dec 2018 14:18:58 +0000 (15:18 +0100)]
More natural error messages

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agomain: Rename run_dir to state_dir v3.0.0
Jan Friesse [Fri, 14 Dec 2018 12:29:53 +0000 (13:29 +0100)]
main: Rename run_dir to state_dir

system.run_dir was a somewhat unfortunate and confusing name. Renaming it
to state_dir makes the content of this directory more evident. To
keep the setting consistent with the code, get_run_dir is changed to
get_state_dir.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agototemconfig: Enhance totem.ip_version
Jan Friesse [Thu, 13 Dec 2018 15:50:49 +0000 (16:50 +0100)]
totemconfig: Enhance totem.ip_version

Originally totem.ip_version was used to force the IP version used by totem.
With Knet this variable didn't make much sense, so it was not used.

Sadly, relying only on the DNS resolver order doesn't always work (the RFC
is quite complicated, but if IPv6 is not configured then IPv4 is
preferred), which we tried to solve by forcing IPv6 and using IPv4 only if
that fails.

Sadly, this collides with nss_myhostname, which is able to return every
local address, and today systems usually have at least one autogenerated
link-local IPv6 address, so it is able to "overwrite" /etc/hosts.

The solution is to enhance totem.ip_version and use it also for Knet.
totem.ip_version is now just a flag for the resolver and can have four
states: ipv4 (only IPv4 is used), ipv6 (only IPv6 is used), ipv4-6 (ask
for IPv4 first and if it fails ask for IPv6) and ipv6-4 (ask for IPv6
first and if it fails ask for IPv4). The default for the Knet and UDPU
transports is ipv6-4; for UDP it's ipv4, because the autogenerated mcast
addr doesn't play too well with ipv6-4.

So wherever nss_myhostname becomes a problem, it is possible
to set totem.ip_version to ipv4-6.
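
The four states and the per-transport defaults described above can be
summarised as a small lookup table. The Python names below are hypothetical
and only restate the text; they are not taken from totemconfig.

```python
import socket

# Address families to ask for, in order, for each totem.ip_version value.
IP_VERSION_ORDER = {
    "ipv4": (socket.AF_INET,),
    "ipv6": (socket.AF_INET6,),
    "ipv4-6": (socket.AF_INET, socket.AF_INET6),
    "ipv6-4": (socket.AF_INET6, socket.AF_INET),
}

# Defaults as stated in the commit message.
DEFAULT_IP_VERSION = {
    "knet": "ipv6-4",
    "udpu": "ipv6-4",
    "udp": "ipv4",
}


def family_order(transport, ip_version=None):
    """Return the family order for a transport and an optional explicit setting."""
    value = ip_version if ip_version is not None else DEFAULT_IP_VERSION[transport]
    return IP_VERSION_ORDER[value]
```

For example, family_order("knet", "ipv4-6") describes the workaround
suggested for hosts where nss_myhostname gets in the way.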

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agototemip: Add debug information to totemip_parse
Jan Friesse [Wed, 12 Dec 2018 16:26:39 +0000 (17:26 +0100)]
totemip: Add debug information to totemip_parse

It's required to create TOTEM logsys subsys before totemip_parse is used
(so before totem_config_read). Logsys is not yet fully initialized, but
it's good enough.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agototemconfig: Add IPs to family mismatch error
Jan Friesse [Wed, 12 Dec 2018 16:17:22 +0000 (17:17 +0100)]
totemconfig: Add IPs to family mismatch error

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agoconfig: Look up hostnames in a defined order
Christine Caulfield [Fri, 7 Dec 2018 13:03:20 +0000 (13:03 +0000)]
config: Look up hostnames in a defined order

Current practice is to let getaddrinfo() decide which address we get
but this is not necessarily deterministic as DNS servers won't
always return addresses in the same order if a node has
several. While this doesn't deal with node names that have
multiple IP addresses of the same family (that's an installation issue
IMHO) we can, at least, force a definite order for IPv6/IPv4 name
resolution.

I've chosen IPv6 then IPv4 as that's what happens on my test system
(using /etc/hosts) and it also seems more 'future proof'.
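
A fixed lookup order of this kind can be sketched as one lookup per address
family, IPv6 first. This is a Python illustration under assumptions, not the
corosync implementation; the function name and the injectable `lookup`
parameter are hypothetical and exist only to make the sketch easy to try out.

```python
import socket


def resolve_in_order(name, families=(socket.AF_INET6, socket.AF_INET),
                     lookup=socket.getaddrinfo):
    """Ask for each family in turn and return the first address found."""
    for family in families:
        try:
            results = lookup(name, None, family)
        except socket.gaierror:
            # This family gave no answer; try the next one.
            continue
        if results:
            # Each result is (family, type, proto, canonname, sockaddr).
            return family, results[0][4][0]
    return None
```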

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agoFix corosync.conf.5 manpage typos
Ferenc Wágner [Mon, 10 Dec 2018 13:56:51 +0000 (14:56 +0100)]
Fix corosync.conf.5 manpage typos

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agoman: Add some information about address resolution
Christine Caulfield [Mon, 10 Dec 2018 10:03:44 +0000 (10:03 +0000)]
man: Add some information about address resolution

to corosync.conf(5)

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agototemconfig: Really use totemip_parse results
Jan Friesse [Mon, 10 Dec 2018 08:07:55 +0000 (09:07 +0100)]
totemconfig: Really use totemip_parse results

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agoman: Add instructions for adding/removing nodes v2.99.5
Christine Caulfield [Thu, 6 Dec 2018 09:47:04 +0000 (09:47 +0000)]
man: Add instructions for adding/removing nodes

This replaces the 'cmaptool' method previously documented
in cmap_keys.8

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agoconfig: Disallow corosync-cmapctl updates of nodelist
Christine Caulfield [Tue, 4 Dec 2018 15:31:24 +0000 (15:31 +0000)]
config: Disallow corosync-cmapctl updates of nodelist

It didn't work anyway (the config system requires whole links
to be configured at once) and caused crashes.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agoconfig: Report IP addr/nodename parse errors back
Christine Caulfield [Mon, 3 Dec 2018 15:25:05 +0000 (15:25 +0000)]
config: Report IP addr/nodename parse errors back

Corosync used to just ignore parse errors, so unresolved names
could cause silent failures. We now always check the result from
totemip_parse() and at least print something in syslog.
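
The "check the result and log it" pattern can be sketched as below. This is a
loose Python illustration only: parse_address is a hypothetical stand-in for
totemip_parse() that accepts literal IP addresses and does no name
resolution, and the logger stands in for syslog.

```python
import ipaddress
import logging

log = logging.getLogger("totemconfig-sketch")


def parse_address(text):
    """Stand-in parser: return an address object, or None on failure."""
    try:
        return ipaddress.ip_address(text)
    except ValueError:
        return None


def load_node_addresses(entries):
    """Parse every node address and report failures instead of ignoring them."""
    parsed, failed = {}, []
    for nodeid, text in entries.items():
        address = parse_address(text)
        if address is None:
            log.error("node %s: cannot parse address %r", nodeid, text)
            failed.append(nodeid)
        else:
            parsed[nodeid] = address
    return parsed, failed
```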

There's also a little get-out here that allows you to correct
a bad node address without having to destroy and recreate the
whole link. I'm being nice to you.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
5 years agocoroparse: Remove unused cs_err initialization
Jan Friesse [Fri, 23 Nov 2018 15:00:00 +0000 (16:00 +0100)]
coroparse: Remove unused cs_err initialization

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agocpghum: Check cpg_local_get return code
Jan Friesse [Fri, 23 Nov 2018 14:47:31 +0000 (15:47 +0100)]
cpghum: Check cpg_local_get return code

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agotestcpg2: Check cpg_dispatch return code
Jan Friesse [Fri, 23 Nov 2018 14:44:15 +0000 (15:44 +0100)]
testcpg2: Check cpg_dispatch return code

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agonotifyd: Delete registered tracking keys v2.99.4
Jan Friesse [Thu, 15 Nov 2018 16:02:22 +0000 (17:02 +0100)]
notifyd: Delete registered tracking keys

Forward port of needle 70fd66767494872b93018949d685f19482cd5bec by Hideo
Yamauchi <renayama19661014@ybb.ne.jp>.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agostats: Fix delete of track
Jan Friesse [Thu, 15 Nov 2018 15:54:47 +0000 (16:54 +0100)]
stats: Fix delete of track

When cmap_track_delete was called on a stats map (cmap created with the
CMAP_MAP_STATS parameter), the result was always ERR_BAD_HANDLE.

It turned out that the corosync part of cmap always called the icmap
function to get user data (where the required hdb handle is stored)
instead of the generalized map_fns.

After fixing this issue, valgrind showed an error about a jump depending on
uninitialized data in stats_map_track_delete. The solution seems to be to
always initialize tracker->events (so not only when track_type is add or
delete).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agoinit: Fix init script to work with containers
Jan Friesse [Mon, 3 Sep 2018 15:04:23 +0000 (17:04 +0200)]
init: Fix init script to work with containers

Previously the init scripts did not use a pid file, so pidof was used. This
is usually not a problem, but when containers are used it may result in
killing the wrong instance when issued on the host.

The solution is to always use a pidfile.

Also try to use LSB compliant status codes.
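
A pidfile-based status check with LSB status codes can be sketched as follows.
The real change is in the shell init script; this Python version is only an
illustration with hypothetical function names, and it assumes a POSIX system
where signal 0 can be used to probe a process.

```python
import os

# Status codes from the LSB init script specification.
LSB_RUNNING = 0               # program is running
LSB_DEAD_PIDFILE_EXISTS = 1   # program is dead and the pid file exists
LSB_NOT_RUNNING = 3           # program is not running


def read_pid(pidfile):
    """Return the pid stored in `pidfile`, or None if it cannot be read."""
    try:
        with open(pidfile) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def process_exists(pid):
    """Probe a pid with signal 0 (POSIX only); nothing is actually sent."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def lsb_status(pidfile):
    """Report status for exactly the instance recorded in the pidfile."""
    if not os.path.exists(pidfile):
        return LSB_NOT_RUNNING
    pid = read_pid(pidfile)
    if pid is not None and pid > 0 and process_exists(pid):
        return LSB_RUNNING
    return LSB_DEAD_PIDFILE_EXISTS
```

Unlike a lookup by process name, this only ever considers the pid written by
the instance that owns the pidfile, which is the point made above about
containers.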

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agomain: Remove COROSYNC_RUN_DIR
Jan Friesse [Tue, 13 Nov 2018 16:32:43 +0000 (17:32 +0100)]
main: Remove COROSYNC_RUN_DIR

Remove last used environment variable (reasons similar to removal of
COROSYNC_MAIN_CONFIG_FILE).

This environment variable was never documented, so document it properly.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agoman: Describe nodelist.node.name properly
Jan Friesse [Tue, 13 Nov 2018 16:14:17 +0000 (17:14 +0100)]
man: Describe nodelist.node.name properly

The old description is no longer true, because with the knet transport the
name got a new and very important role.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agomain: Remove COROSYNC_TOTEM_AUTHKEY_FILE
Jan Friesse [Tue, 13 Nov 2018 15:53:43 +0000 (16:53 +0100)]
main: Remove COROSYNC_TOTEM_AUTHKEY_FILE

Remove another environment variable (reasons similar to removal of
COROSYNC_MAIN_CONFIG_FILE).

Also properly document both totem.keyfile and totem.key.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agomain: Replace COROSYNC_MAIN_CONFIG_FILE
Jan Friesse [Mon, 12 Nov 2018 17:35:45 +0000 (18:35 +0100)]
main: Replace COROSYNC_MAIN_CONFIG_FILE

The COROSYNC_MAIN_CONFIG_FILE environment variable was quite well hidden
and it was never used by the init script. It also makes it quite hard to
debug possible problems.

Replace it with the -c option.

The patch also uses the configuration file path as a base for the uidgid.d
directory, so it's no longer necessary to keep uidgid.d in sysconfdir.
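
Deriving the uidgid.d location from the configuration file path might look
like the sketch below. It assumes that uidgid.d sits next to the file given
with -c, which is one reading of "as a base"; the function name is
hypothetical.

```python
import posixpath


def uidgid_dir_for(config_path: str) -> str:
    """Return the uidgid.d directory next to the given config file."""
    return posixpath.join(posixpath.dirname(config_path), "uidgid.d")
```

With this reading, a test instance started with -c /tmp/test/corosync.conf
would look for /tmp/test/uidgid.d rather than a directory under sysconfdir.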

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agomain: Move sched parameters to config file
Jan Friesse [Mon, 12 Nov 2018 14:46:14 +0000 (15:46 +0100)]
main: Move sched parameters to config file

The reason for this change is that the number of corosync CLI options
has kind of exploded, and the scheduler-based ones are really better kept
in the config file.

A nice side effect of this move is better "integration" with systemd,
because the currently used EnvironmentFile should really be used for the
environment and not so much for passing extra options to the CLI.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
5 years agoconfigure: move to AC_COMPILE_IFELSE
Jan Friesse [Wed, 7 Nov 2018 14:12:10 +0000 (09:12 -0500)]
configure: move to AC_COMPILE_IFELSE

from AC_PREPROC_IFELSE which is strongly discouraged.

Our detection system was very weak and recent versions of clang
showed that AC_PREPROC_IFELSE (cpp) would enable warning options that
the compiler does not support (clang).

Use a full compilation test to detect what works and what doesn't.

Based on knet patch 88491f27375a9e8aceb946853a1abf4d23ebb8f3.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>