A simple directive is a much lighter solution to the same problem, and
automatically follows the specified User. I copied the 0770 modes from
the corresponding init scripts; they could use a little documentation.
Signed-off-by: Ferenc Wágner <wferi@debian.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Jan Pokorný [Wed, 18 Oct 2017 19:59:22 +0000 (21:59 +0200)]
logsys: Avoid redundant callsite section checking
Previously, corosync executable was repeatedly (proportionally to the
count of LOGSYS_DECLARE_SUBSYS macro applications involved in the
constituent source files) checking the same for no gain in the pre-main
startup. This is not needed since nothing changes with static data
shared within the same program space (it may have been a different
story once upon a time if loadable modules were in use), so make that
happen in (one-off per executable) LOGSYS_DECLARE_SYSTEM instead.
Libqb offers its own ready-made macro to that
effect, simply to isolate the inner peculiarities from the library user
(that should not be required to understand anything about the orphan
sections and respective autocreated symbols to denote their boundaries).
As it is compile-time conditionalized in the same way, just use it
directly instead. As a value added, corosync will be kept up to date
about the possibly growing set of the logging-sanity checks as it gets
compiled with newer and newer libqb versions (their header files, for
that matter).
Signed-off-by: Jan Pokorný <jpokorny@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Jan Friesse [Thu, 10 Nov 2016 17:49:09 +0000 (18:49 +0100)]
qdevice: Add support for heuristics
Heuristics are a set of commands executed locally on startup, cluster
membership change, successful connect to corosync-qnetd and optionally
also at regular times. When all commands finish successfully
(their return error code is zero) on time, heuristics have passed,
otherwise they have failed. The heuristics result is sent to
corosync-qnetd and there it's used in calculations to determine which
partition should be quorate.
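The pass/fail rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the qdevice implementation: the helper name `heuristics_pass`, the command-list format and the timeout value are assumptions made for the example, and the commands are run one after another purely for simplicity.

```python
import subprocess
import sys


def heuristics_pass(commands, timeout_s=5.0):
    """Return True only if every command exits with 0 within timeout_s.

    Hypothetical helper: the name, signature and timeout are illustrative.
    """
    for argv in commands:
        try:
            result = subprocess.run(argv, timeout=timeout_s,
                                    stdout=subprocess.DEVNULL,
                                    stderr=subprocess.DEVNULL)
        except (subprocess.TimeoutExpired, OSError):
            # A command that does not finish on time (or cannot start) fails.
            return False
        if result.returncode != 0:
            # Any non-zero return code fails the whole heuristics run.
            return False
    return True


# Stand-in commands; a real deployment would list its own checks here.
ok_cmd = [sys.executable, "-c", "raise SystemExit(0)"]
bad_cmd = [sys.executable, "-c", "raise SystemExit(1)"]
```

Under these assumptions, `heuristics_pass([ok_cmd])` yields a pass, while adding `bad_cmd` to the list yields a fail.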
Right now, there are some problems (bugs):
- Regular heuristics are supported only by ffsplit. This is not a
problem for clusters with power fencing, but deployments where
non-quorate partition continues to operate may see this as a problem.
- Qdevice-tool status doesn't contain detailed information about
heuristics.
- Qdevice-tool doesn't have a way to trigger re-execution of
heuristics.
Thanks to Chrissie Caulfield for Englishifying the man pages.
Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Jonathan Davies [Tue, 10 Oct 2017 14:53:41 +0000 (15:53 +0100)]
cmap: don't shutdown highest config_version node
Scenario:
1. node A starts corosync with config_version = 2, nodelist = {A, B}
2. node B starts corosync with config_version = 1, nodelist = {A, B}
corosync.conf(5) says the config_version option is "used to prevent
joining old nodes with not up-to-date configuration."
So expected outcome is:
* corosync on node A remains alive
* corosync on node B exits
Actual outcome is:
* corosync on node A exits
* corosync on node B exits
Explanation of actual behaviour:
* Host A will have cmap_my_config_version = 2 but
cmap_highest_config_version_received = 1, so will shutdown in
cmap_sync_activate because these are not equal.
* Host B will have cmap_my_config_version = 1 but
cmap_highest_config_version_received = 2, so will shutdown in
cmap_sync_activate because these are not equal.
Instead, node A should consider its own config_version in the
calculation of the highest config_version, i.e.
cmap_highest_config_version_received = 2, and so not shutdown
in cmap_sync_activate.
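The difference between the old and the fixed calculation can be modelled as follows. This is a simplified sketch based only on the scenario above; the function names are hypothetical and do not correspond to identifiers in the cmap service.

```python
def should_shutdown_old(my_version, received_versions):
    """Model of the previous behaviour: only versions received from peers count."""
    highest = max(received_versions)
    return my_version != highest


def should_shutdown_fixed(my_version, received_versions):
    """Model of the fix: the local config_version takes part in the maximum."""
    # Seeding the maximum with the local version means a node holding the
    # newest configuration can no longer be forced out by older peers.
    highest = max([my_version, *received_versions])
    return my_version != highest


# Scenario from the text: node A has version 2, node B has version 1.
node_a = (2, [1])
node_b = (1, [2])
```

With the old rule both nodes decide to shut down; with the fixed rule only node B does.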
Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
votequorum: make atb consistent on nodelist reload
When the cluster changes from even sized to odd sized corosync
disables auto-tie-breaker if wait_for_all is not enabled.
However when changing from odd sized to even sized it doesn't reenable
it, causing auto_tie_breaker to be inconsistent across the cluster:
the newly added node and any nodes that restart corosync
will have it, but all the previously running nodes won't.
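A small stateful model may help to show why recomputing the flag on every reload keeps the cluster consistent. It is a sketch built only from the description above; the class and attribute names are hypothetical, and the real votequorum logic has more conditions than this.

```python
class NodeModel:
    """Toy model of one node's effective auto_tie_breaker state."""

    def __init__(self, configured_atb, wait_for_all, node_count):
        self.configured_atb = configured_atb
        self.wait_for_all = wait_for_all
        self.atb = False
        self.reload(node_count)

    def reload(self, node_count):
        odd = node_count % 2 == 1
        # Always derive the effective value from the configuration, so an
        # odd -> even change re-enables what an even -> odd change disabled.
        self.atb = self.configured_atb and (self.wait_for_all or not odd)


# A node that has been running since the cluster had three members ...
long_running = NodeModel(configured_atb=True, wait_for_all=False, node_count=3)
long_running.reload(4)
# ... and a node that joins when the cluster already has four members.
newly_added = NodeModel(configured_atb=True, wait_for_all=False, node_count=4)
```

In this model both nodes end up with the same effective setting after the nodelist reload.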
Signed-off-by: Edwin Torok <edvin.torok@citrix.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
knet handle stats show compression and crypto statistics. With these
you can see the effectiveness of compression and the overheads of both
crypto and compression.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
totempg needs to store the current message + any
overflow for the next message which can be up to (nearly) the MTU size.
In knet that's large, but for UDP it's just 1500.
The reason we've never seen it before is because the actual max message
size is 1024 less than 1MB and after all the headers are stripped out the overflow is
usually 1024 bytes or less.
The 1024*1024 size of the assembly buffer is large enough to hold a max message (1047552) +
1024 bytes of a new UDP message. So we never saw any problems.
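The arithmetic behind this can be checked directly. The sketch below uses the sizes quoted above; the 64K overflow used for the knet case is an assumption chosen for illustration, not a value taken from this commit.

```python
ASSEMBLY_BUFFER = 1024 * 1024          # 1048576 bytes
MAX_MESSAGE = ASSEMBLY_BUFFER - 1024   # 1047552 bytes, as quoted above


def fits(overflow_bytes):
    """True if a max-size message plus the overflow fits the assembly buffer."""
    return MAX_MESSAGE + overflow_bytes <= ASSEMBLY_BUFFER


udp_overflow = 1024        # typical overflow after headers are stripped
knet_overflow = 64 * 1024  # assumed, illustrative overflow for a large knet packet
```

`fits(udp_overflow)` holds exactly (the buffer is filled to the last byte), whereas `fits(knet_overflow)` does not.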
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Jan Friesse [Fri, 28 Jul 2017 14:32:58 +0000 (16:32 +0200)]
main: Add support for libcgroup
When corosync is started in an environment where it ends up in a cgroup
without a properly set rt_runtime_us, it's impossible to get RT priority.
The already implemented workaround is to use a higher non-RT priority.
This patch implements another solution. It moves corosync into the root cpu
cgroup. The root cpu cgroup hopefully has enough RT budget.
Another solution was mentioned on ML
https://lists.freedesktop.org/archives/systemd-devel/2017-July/039353.html
but this means generating some "random" values.
Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
(cherry picked from commit c56086c701d08fc17cf6d8ef603caf505a4021b7)
Icmap is factored out so it's possible to add other
maps for cmap. API call to switch maps from application
end is added.
Corosync-cmapctl is enhanced with -m option.
Stats contains all statistics previously found in runtime.connections,
runtime.services and runtime.totem prefixes together with new knet
related ones. All stats are read only.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Jan Friesse [Fri, 30 Jun 2017 08:35:57 +0000 (10:35 +0200)]
totemknet: Flush knet log messages
When initialization fails, knet logs messages into a pipe. Previously they
were never processed. The solution is to add log_flush_messages, which takes
care of calling log_deliver_fn.
A call to log_flush_messages is also added to totemknet_finalize, because
this removes the log pipe fd from qb_loop, so a similar problem can happen.
Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Jan Friesse [Fri, 23 Jun 2017 09:22:09 +0000 (11:22 +0200)]
totemconfig: Make crypto work again
Knet needs a longer key and supports various key lengths. Split
TOTEM_PRIVATE_KEY_LEN into TOTEM_PRIVATE_KEY_LEN_MIN and
TOTEM_PRIVATE_KEY_LEN_MAX (both using KNET_*_KEY_LEN).
Fix incorrect "Could only read..." message.
Make sure key is properly initialized/zeroed.
Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
votequorum: Report errors from votequorum_exec_send_reconfigure
If votequorum_exec_send_reconfigure() returns an error (i.e. the
packet could not be sent) then we should either return it to the
sender (for a library call) or, for an internal call, log it.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
cpgbench and cpghum share a lot of code & concepts so it makes
sense to merge them into a single test program that can both
benchmark and sanity check CPG.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Andrew Price [Tue, 25 Apr 2017 12:44:33 +0000 (14:44 +0200)]
Main: Call mlockall after fork
Man page of mlockall is clear:
Memory locks are not inherited by a child created via fork(2) and are
automatically removed (unlocked) during an execve(2) or when the
process terminates.
So calling mlockall before corosync_tty_detach is a no-op when corosync is
executed as a daemon (corosync -f was not affected).
Bin Liu [Thu, 20 Apr 2017 06:49:03 +0000 (08:49 +0200)]
coroparse: Use readdir instead of readdir_r
readdir_r is deprecated in glibc 2.24 in favor of readdir (which became
thread-safe). Also, because corosync never calls read_uidgid_files_into_icmap
from multiple threads, no problem should appear even with a libc where
readdir is not thread-safe.
Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
The kernel team have recommended us not to use recvmmsg and as it
confers no particular speed advantage (especially given the extra
memory consumption) I'm going back to single message recvmsg() again.
Bin Liu [Mon, 10 Apr 2017 02:45:10 +0000 (10:45 +0800)]
totemconfig: Prefer nodelist over bindnetaddr
In a two-node cluster, I've one node configured with open-vswitch:
5: br-fixed: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UNKNOWN group default
inet 192.168.124.88/24 scope global br-fixed
inet 192.168.124.87/24 scope global secondary br-fixed
inet 192.168.124.83/24 brd 192.168.124.255 scope global secondary
tentative br-fixed
inet 192.168.124.89/24 scope global secondary br-fixed
while I use 192.168.124.83 in the node list of corosync.conf with udpu, and
the bind_addr is 192.168.124.0. After upgrading corosync on this node,
it uses 192.168.124.88 instead of 192.168.124.83, as we can see:
corosync-cfgtool -s
Printing ring status.
Local node ID 1084783704
while the other node can only see itself:
corosync-cfgtool -s
Printing ring status.
Local node ID 1084783697
RING ID 0
id = 192.168.124.81
status = ring 0 active with no faults
By tidily shutting down knet in totemknet_finalize we
make sure all the links are cleanly taken down and,
more importantly for us, the corosync LEAVE message gets
sent so we don't get fenced on a clean exit.
Jan Friesse [Fri, 7 Apr 2017 07:32:07 +0000 (09:32 +0200)]
cpghum test: Improve error codes
Return an error when an unknown option is found. Also return error code 2 if
one of the send/crc/length/sequence errors happened. Finally, make sure abort
returns the same error code and not 999 (which is a nonsense code anyway).
Bin Liu [Fri, 10 Mar 2017 07:22:13 +0000 (15:22 +0800)]
logconfig: Do not overwrite logger_subsys priority
logfile_priority and syslog_priority could be modified by
logging.logger_subsys.{logfile_priority|syslog_priority}, which could
lead to the following output (which is at notice level):
corosync[21419]: [QUORUM] Using quorum provider corosync_votequorum
corosync[21419]: [QUORUM] Members[1]: 1084777643
corosync[21419]: [QUORUM] This node is within the primary component
and will provide service.
corosync[21419]: [QUORUM] Members[3]: 1084777563 1084777584 1084777643
even though syslog_priority is set to warning. This patch avoids the
overwrite.
Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
main: Don't ask libqb to handle segv, it doesn't work
segv should be handled by corosync; libqb is not the
place to be handling emergency signals.
This currently requires the head of libqb git tree to
generate a blackbox & coredump in the event of a segfault,
but it's better than the write() spin that currently happens.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Jan Friesse [Fri, 24 Feb 2017 15:23:50 +0000 (16:23 +0100)]
Logsys: Change logsys syslog_priority priority
LibQB adds a default "*" syslog filter, so we have to set syslog_priority
as low as possible so that filters applied later in
_logsys_config_apply_per_file take effect.
knet: Fix MTU sizes & allow transport config in corosync.conf
Corosync layers don't need to know the knet MTU size - this way
corosync fragments buffers only when they get larger than the
KNET buffer size (64K) and knet fragments below that based on
the actual MTU and transport considerations.
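The two-level split described here can be illustrated with a toy model. The 64K limit is the KNET buffer size quoted above; the per-packet payload of 1400 bytes is an assumed, illustrative figure, all header overhead is ignored, and the function names are hypothetical.

```python
COROSYNC_LIMIT = 64 * 1024   # KNET buffer size quoted above
LINK_MTU_PAYLOAD = 1400      # assumed usable payload per packet on the link


def split(payload, limit):
    """Cut payload into consecutive chunks of at most limit bytes."""
    return [payload[i:i + limit] for i in range(0, len(payload), limit)]


def send(message):
    """Return a list of buffers, each a list of on-wire packets."""
    wire = []
    # Upper layer: fragment only when the message exceeds the knet buffer.
    for buf in split(message, COROSYNC_LIMIT):
        # Lower layer: fragment each buffer according to the link MTU.
        wire.append(split(buf, LINK_MTU_PAYLOAD))
    return wire


small = send(bytes(1000))      # one buffer, one packet
large = send(bytes(100_000))   # two buffers, each split further by MTU
```

The upper layer never needs to know `LINK_MTU_PAYLOAD`; changing it only alters how many packets each buffer turns into.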
It is also now possible to configure knet to use UDP or SCTP
transports in corosync.conf. This is currently done per-link
so if you have more than 1 link you need several interface{}
stanzas inside totem{} to make it use other than the default
of UDP. If it's useful I might add the option of a global
default.
knet: Change nodeids to 8 bit for new knet compatibility
I've also put an assert in totemknet_member_add() to check
for invalid nodeids. Later on we need to fix the rest of the
corosync code to only use 8bit nodeids (or force people to use
UDPU if they want large nodeids).
Commit 8d8d4a936ab73d8449a3574f969b17a90ef9428e introduced the
configuration parameter resources.watchdog_device. This commit
introduces the resources section and watchdog_device parameter in
corosync.conf.5.
Signed-off-by: Adrian Vondendriesch <adrian.vondendriesch@credativ.de> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Jan Pokorný [Mon, 9 Jan 2017 18:42:57 +0000 (19:42 +0100)]
Spec: drop unneeded dependency
corosynclib-devel doesn't need to have a dependency on corosync package.
It's expected that libraries are still working properly (e.g. indicating
errors to their users) when there's no corosync process around in that
moment, and from this perspective it doesn't matter whether it is
installed at all for some purposes, especially having linkage with them
in mind.
Note that the inverse dependency, main corosync package on corosynclib,
is already there (not strictly needed, likely just to enforce package
version match -- otherwise RPM's dependency generator will handle this
on its own using SONAMEs -- hence the comments to that effect are also
added), so breaking this symmetry:
- is supposed to be harmless modulo cases that should be fixed to
express explicit dependency on corosync's runtime anyway
(but only for runtime, i.e., Requires as opposed to BuildRequires)
- will effectively enable more lightweight get-build-deps-and-build
process for programs linking with corosynclibs (e.g. pacemaker),
as corosync package won't need to be installed needlessly
Signed-off-by: Jan Pokorný <jpokorny@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Running 'doxygen -u Doxyfile.in' in the source root produces the
following results:
- SYMBOL_CACHE_SIZE at line 301 has become obsolete. This tag
has been removed.
- SHOW_DIRECTORIES at line 507 has become obsolete. This tag
has been removed.
- HTML_ALIGN_MEMBERS at line 881 has become obsolete. This tag
has been removed.
- USE_INLINE_TREES at line 1067 has become obsolete. This tag
has been removed.
- XML_SCHEMA at line 1311 has become obsolete. This tag has been
removed.
- XML_DTD at line 1317 has become obsolete. This tag has been
removed.
Signed-off-by: Richard B Winters <rik@mmogp.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Bin Liu [Fri, 2 Dec 2016 02:37:27 +0000 (10:37 +0800)]
Totempg: remove duplicate memcpy in mcast_msg func
In function mcast_msg of totempg.c, line 923, there is a memcpy call in
"else" branch, and also another memcpy out of the "else" branch, while
the two calls have the same parameters. It is possible to remove the memcpy
in "else" branch.
Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>