git.proxmox.com Git - mirror

configure: Modernize configure.ac a bit

... to make 2.71 happy. Also increase minimum version to 2.69 (10 years
old version so should be compatible enough).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>

log: Configure knet logging to the same as corosync

Before this, all knet messages, including debug, were sent
over the pipe from knet to corosync and filtered in corosync.
This was obviously a waste, so now we tell knet the logging
level we need from it and so only get the messages that the
user has requested.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

logrotate: Use copytruncate method by default

The reopen lograte method has two main problems:
1. It does fail when corosync is not running (solvable by
   adding "|| true")
2. If (for some reason, like SELinux) cfgtool -L fails, logrotate
   fails and corosync keeps logging into old file. Added "|| true"
   makes situation even worse because logrotate removes file but
   corosync keeps logging into it.

Solution is to install copytruncate logrotate snip by default (and
keep reopen config file only for reference).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totemconfig: Check uname return value correctly

uname in Solaris/Illumos returns non-negative value when succesful.

Signed-off-by: Andreas Grueninger <andreas.grueninger@noemail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

totempg: Fix alignment handling

Some platforms requires aligned memory access. For such platforms,
special code was added using address modulo 4 to check if aligning is
needed or not. This may be problem for 64 bits platforms. Also check in
app_deliver_fn was incorrect and always true.

Solution is to use modulo sizeof pointer and add parentheses to fix the
check in app_deliver_fn function.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

pkgconfig: Export corosysconfdir

Useful for external code to easily tell where corosync.conf
is (in case someone configured it for /usr/local/etc, ...)

E.g. pacemaker's crm_report collects corosync.conf, and some
of its testing tools generate a corosync.conf for a test cluster.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>

Remove bashism from configure script

This was the real problem behind 384d168: Debian experimental now
sports a dash with LINENO support, so configure does not fall back to
using bash instead, choking on such bash-only constructs. Unfortunately
this didn't bail out cleanly, just unexpectedly set link_all_deplibs to
no, and the error message

./configure: 13158: test: yes: unexpected operator

stayed unnoticed in the logs. Actually, link_all_deplibs=no is the
default in Debian, reducing overlinking and causing confusion overall,
see https://debbugs.gnu.org/db/13/13920.html for example.

I think being explicit about used interfaces has its merit, so now that
Corosync has it, it might be advantageous to disable link_all_deplibs
by default across the board (after this patch re-enables it as a side
effect).

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

totemudpu: Don't block local socketpair

Commit to drop packets from unlisted IPs made ifdown case not working
because msg_name is unset for socketpair.

solution is to drop packets from unlisted IPs only when bind state is
BIND_STATE_REGULAR.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

build: Add explicit dependency for used libraries

Don't rely on implicit symbol finding (cs_strerror being most prominent
example) but rather use explicit one.

This makes current debian experimental happy (compile source)

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>

totemsrp: Switch totempg buffers at the right time

Commit 92e0f9c7bb9b4b6a0da8d64bdf3b2e47ae55b1cc added switching of
totempg buffers in sync phase. But because buffers got switch too early
there was a problem when delivering recovered messages (messages got
corrupted and/or lost). Solution is to switch buffers after recovered
messages got delivered.

I think it is worth to describe complete history with reproducers so it
doesn't get lost.

It all started with 402638929e5045ef520a7339696c687fbed0b31b (more info
about original problem is described in
https://bugzilla.redhat.com/show_bug.cgi?id=820821). This patch
solves problem which is way to be reproduced with following reproducer:
- 2 nodes
- Both nodes running corosync and testcpg
- Pause node 1 (SIGSTOP of corosync)
- On node 1, send some messages by testcpg
  (it's not answering but this doesn't matter). Simply hit ENTER key
  few times is enough)
- Wait till node 2 detects that node 1 left
- Unpause node 1 (SIGCONT of corosync)

and on node 1 newly mcasted cpg messages got sent before sync barrier,
so node 2 logs "Unknown node -> we will not deliver message".

Solution was to add switch of totemsrp new messages buffer.

This patch was not enough so new one
(92e0f9c7bb9b4b6a0da8d64bdf3b2e47ae55b1cc) was created. Reproducer of
problem was similar, just cpgverify was used instead of testcpg.
Occasionally when node 1 was unpaused it hang in sync phase because
there was a partial message in totempg buffers. New sync message had
different frag cont so it was thrown away and never delivered.

After many years problem was found which is solved by this patch
(original issue describe in
https://github.com/corosync/corosync/issues/660).
Reproducer is more complex:
- 2 nodes
- Node 1 is rate-limited (used script on the hypervisor side):
  ```
  iface=tapXXXX
  # ~0.1MB/s in bit/s
  rate=838856
  # 1mb/s
  burst=1048576
  tc qdisc add dev $iface root handle 1: htb default 1
  tc class add dev $iface parent 1: classid 1:1 htb rate ${rate}bps \
    burst ${burst}b
  tc qdisc add dev $iface handle ffff: ingress
  tc filter add dev $iface parent ffff: prio 50 basic police rate \
    ${rate}bps burst ${burst}b mtu 64kb "drop"
  ```
- Node 2 is running corosync and cpgverify
- Node 1 keeps restarting of corosync and running cpgverify in cycle
  - Console 1: while true; do corosync; sleep 20; \
      kill $(pidof corosync); sleep 20; done
  - Console 2: while true; do ./cpgverify;done

And from time to time (reproduced usually in less than 5 minutes)
cpgverify reports corrupted message.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>

cpghum: Allow to continue if corosync is restarted

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

man: Fix consensus timeout

The consensus timeout is 1.2 * token_timeout,
which has been changeg from 1000 to 3000, so change also consensus
timeout.

Signed-off-by: miharahiro <hmihara@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

logsys: Unlock config mutex on error

Thanks Ryan Cai <ycaibb@gmail.com> for reporting the problem.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totem: Add cancel_hold_on_retransmit config option

Previously, existence of retransmit messages canceled holding
of token (and never allowed representative to enter token hold
state).

This makes token rotating maximum speed and keeps processor
resending messages over and over again - overloading network
and reducing chance to successfully deliver the messages.

Also there were reports of various Antivirus / IPS / IDS which slows
down delivery of packets with certain sizes (packets bigger than token)
what make Corosync retransmit messages over and over again.

Proposed solution is to allow representative to enter token hold
state when there are only retransmit messages. This allows network to
handle overload and/or gives Antivirus/IPS/IDS enough time scan and
deliver packets without corosync entering "FAILED TO RECEIVE" state and
adding more load to network.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totemconfig: Knet nodeid must be < 65536

Knet limits maximum node id to 16-bit type. This was not ensured in
corosync and it was possible to set nodeid to value >= 65536 and
(surprisingly) most of the things were working quite well because of
overflow. corosync-cmapctl -m stats contained knet nodeid in
stats.knet. subtree, so for nodeid 65536 result was:

Can't get value of stats.knet.node0.link0.connected. Error
CS_ERR_NOT_EXIST

Commit implements checking of nodeid and limits it to KNET_MAX_HOST
value when knet is used.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totemconfig: Ensure all knet hosts has a nodeid

Nodeid is required for knet for every node. Right now, existence of
nodeid is checked only for local for local node, so broaden the test.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

cfgtool: Use CS_PRI_NODE_ID for formatting nodeid

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

cfgtool: Fix brief mode display of localhost

Show 'n' also for first localhost link, so all localhost links
are marked consistently with non-brief display.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

cfgtool: Set nodeid indexes after sort

Needed for having correct index of localhost

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totemconfig: Put autogenerated nodeid back to cmap

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

cfgtool: Check existence of at least one of nodeid

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totemconfig: Do not process totem.nodeid

totem.nodeid is relict from times when nodelist was not required and
totemsrp was sending whole membership with ip addresses.

With Corosync 3 ip addresses are no longer sent so
it is not possible to find "next" node ip address where to send token
(because only nodeid is sent) without having information about all of
the nodes stored locally.

When totem.nodeid was configured it was partly used and other parts
(most notably totemudpu_token_target_set) were using autogenerated
nodeid. Together it was not possible to create even single node
membership.

Solution is to ignore totem.nodeid completely (and display warning when
it is set).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

knet: Fix node status display

Currently if there is a gap in the links (eg link0 is missing)
corosync-cfgtool -s will still display the links as 0,1,2,3...
even if they are 1,2,5,6...

Also display the KNET transport type with the link in
corosync-cfgtool -s & -n

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

main: Add support for cgroup v2 and auto mode

Support for cgroup v2 is very similar to cgroup v1 just checking (and
writing) different file.

Because of all the problems described later with cgroup v2 new "auto"
mode (new default) is added. This mode first tries to set rr scheduling
and moves Corosync to root cgroup only if it fails.

Testing this feature is a bit harder than with cgroup v1 so it's
probably worh noting in this commit message.

1. Copy some service file (I've used httpd service) and set
   CPUQuota=30% in the [service] section.
2. Check /sys/fs/cgroup/cgroup.subtree_control - there should be no
   "cpu"
3. Start modified service
4. Check /sys/fs/cgroup/cgroup.subtree_control - there should be "cpu"
5. Start corosync - It should be able to get rt priority

When move_to_root_cgroup is disabled (applies only for kernels
with CONFIG_RT_GROUP_SCHED enabled), behavior differs:
- If corosync is started before modified service, so
  there is no "cpu" in /sys/fs/cgroup/cgroup.subtree_control
  corosync starts without problem and gets rt priority.
  Starting modified service later will never add "cpu" into
  /sys/fs/cgroup/cgroup.subtree_control (because corosync is holding
  rt priority and it is placed in the non-root cgroup by systemd).

- When corosync is started after modified service, so "cpu"
  is in /sys/fs/cgroup/cgroup.subtree_control, corosync is not
  able to get RT priority.

It's worth noting problems when cgroup v2 is used together with systemd
logging described in corosync.conf(5) man page.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

stats: fix crash when iterating over deleted keys

The libqb map API leaves 'ownership' of the data with the caller
but does its own lifetime management, so it can easily happen that
map_rm() is called and the data deleted by the caller.
But if an iterator is running over that item then the map entry
will not get removed (leaving dangling pointers) until later.

libqb has a hack-y callback that tells the owner when it is safe to
delete the allocated memory, so we hook into that. icmap is already
using this.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

man: Add note about single node configuration

Internally knet is using just one link for localhost so for single node
configuration knet_link_get_link_list returns only one entry. This is
propagated to `corosync-cfgtool -s`.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

Revert "main: Add support for cgroup v2"

This reverts commit 57e6b86b53010dd2612b0a6a4e04917673062ecf.

We are in process of finding better solution so reverting for now.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>

Revert "man: Add info about cgroup v2 behavior"

This reverts commit 9d3df5696ed6b04b379a2fe643eec1fcd5a4b10d.

We are in process of finding better solution so reverting for now.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>

man: Add info about cgroup v2 behavior

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

cfg: corosync_cfg_trackstop blocks forever

corosync_cfg_trackstop expects reply but that was never sent. Make sure
to send reply so corosync_cfg_trackstop works.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

main: Add support for cgroup v2

Support for cgroup v2 is very similar to cgroup v1 just checking (and
writing) different file.

Testing this feature is a bit harder than with cgroup v1 so it's
probably worh noting in this commit message.

1. Copy some service file (I've used httpd service) and set
   CPUQuota=30% in the [service] section.
2. Check /sys/fs/cgroup/cgroup.subtree_control - there should be no
   "cpu"
3. Start modified service
4. Check /sys/fs/cgroup/cgroup.subtree_control - there should be "cpu"
5. Start corosync - It should be able to get rt priority

When move_to_root_cgroup is disabled, behavior differs:
- If corosync is started before modified service, so
  there is no "cpu" in /sys/fs/cgroup/cgroup.subtree_control
  corosync starts without problem and gets rt priority.
  Starting modified service later will never add "cpu" into
  /sys/fs/cgroup/cgroup.subtree_control (because corosync is holding
  rt priority and it is placed in the non-root cgroup by systemd).

- When corosync is started after modified service, so "cpu"
  is in /sys/fs/cgroup/cgroup.subtree_control, corosync is not
  able to get RT priority.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

main: Mark crypto_model key read only

... to be in align with crypto_cypher and crypto_hash.

Reload (corosync-cfgtool -R) works without any problem and changing of
key is not supported anyway,

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totemconfig: Ensure strncpy is always terminated

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

config: Properly check crypto and compress models

Use knet_get_crypto_list to find knet supported crypto models and use
them instead of hardcoded list.

Also fix compression handling. Previously knet_compression_model
value was not checked at all and was directly passed to knet.

Use knet_get_compress_list to find knet supported compress models and
use them to check validity of config file and for more informative
error message.

Lastly enhance corosync version display with information
about available crypto/compression models.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

man: corosync-cfgtool.8: use proper single quotes

Apostrophe as the first character of the input line indicates a
request, so groff complained: macro 'onwire'' not defined.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

knet: pass correct handle to knet_handle_compress

totemknet_configure_compression was using knet_context
just to gather the knet handle / instance.

On first time config knet_contex is not initialized till
much later in the code, passing some random garbage pointers
to knet_handle_compress, that would crash later trying
to acquire a mutex lock.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

totemconfig: fix integer underflow and logic bug

Fix integer underflow when computing `namelen` in `nodelist_byname`,
always use computed `namelen`.
Fixes #626.

Signed-off-by: Johannes Krupp <johannes.krupp@cispa.saarland>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

totemconfig: change udp netmtu value as a constant

Insted of using "magic number" use UDP_NETMTU constant.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

totemknet: retry knet_handle_new if it fails

Retry knet_handle_new without privileged operations if it fails

knet_handle_new can fail with ENAMETOOLONG if its privileged operations
fail, which can happen if we're running as a user process or in an
unprivileged container.

This adds a cmap key 'allow_knet_handle_fallback' that defaults to no,
which is the current behavior of exiting with error if the knet_handle
can't be created with privileged operations. If the new cmap key is set
to 'yes' and the knet_handle creation fails, fallback to creating the
handle using unprivileged operations is tried.

Signed-off-by: Dan Streetman <ddstreet@canonical.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

main: Check memlock rlimit

Don't lock all current and future memory if can't
increase memlock rlimit.

If we fail to increase our RLIMIT_MEMLOCK, then locking all our current
and future memory is extremely dangerous; once our memory use reaches
our RLIMIT_MEMLOCK, memory allocations will start failing, very likely
leading to our entire process crashing.

This can happen if we aren't a privileged process, for example if
running as non-root user, or inside an unprivileged container.

Signed-off-by: Dan Streetman <ddstreet@canonical.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

configure: drop unnecessary check and define

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

configure: move exec_prefix sanitize

Move exec_prefix sanitize closer to prefix. This is not
functional change, just group functional tests together.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

configure: drop dead code

prefix is sanitized already at the top of configure.ac to /usr,
hence the second instance can never hit.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

configure: detect and init pkg-config with macro

this also allows to use PKG_CONFIG_* macros immediately
in conditional calls

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

main: Close race condition when moving to statedir

Found by covscan which also didn't like us 'leaking' the
fd to the lockfile. So close that too.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

init: Use corosync-cfgtool for shutdown

... to trigger cfg shutdown callbacks.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

test: Add testcfg to exercise some cfg functions

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

cfg: Reinstate cfg tracking

CFG tracking was removed in 815375411e80131f31b172d7c43625769ee8b53d,
probably as a mistake, as part of the tidy up of cfg and the removal of
dynamic loading. This means that shutdown tracking (using
cfg_try_shutdown()) stopped working.

This patch restores the trackstart & trackstop API calls (renamed to be
more consistent with the exiting libraries) so that shutdown tracking
can be used again.

Change cfg.shutdown_timeout to be in milliseconds rather than seconds
nd use libqb macros for conversion.

Add --force option to corosync-cfgtool -H

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

cfg: Improve nodestatusget versioning

Patch tries to make nodestatusget really extendable. Following changes
are implemented:
- corosync_cfg_node_status_version_t is added with (for now) single
  value CFG_NODE_STATUS_V1
- corosync_knet_node_status renamed to corosync_cfg_node_status_v1 (it
  isn't really knet because it works as well for udp(u()
- struct res_lib_cfg_nodestatusget_version is added which holds only ipc
  result header and version on same position as for
  corosync_cfg_node_status_v1
- corosync_cfg_node_status_get requires version and pointer to one of
  corosync_cfg_node_status_v structures
- request is handled in case switches to make adding new version easier

Also fix following bugs:
- totempg_nodestatus_get error was retyped to cs_error_t without any
  meaning.
- header.error was not checked at all in the library

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

cfg: New API to get extended node/link infomation

Current we horribly over-use totempg_ifaces_get() to
retrieve information about knet interfaces. This is an attempt to
improve on that.

All transports are supported (so not only Knet but also UDP(U)).

This patch builds best against the "onwire-upgrade" branch of knet
as that's what sparked my interest in getting more information out.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

totemknet: Check both cipher and hash for crypto

Previously only crypto cipher was used as a way to find out if crypto is
enabled or disabled.

This usually works ok until cipher is set to none and hash to some other
value (like sha1). Such config is perfectly valid and it was not
supported correctly.

As a solution, check both cipher and hash.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

The ring id file needn't be executable

At the same time simplify the overwrite logic and stop clearing the
umask (which is unexpected and quite pointless here, as applications
can't really protect the users from their own pathological settings).

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

pkgconfig: export LOGDIR in corosync.pc

logdir is configurable at build time and can change
from distro to distro. Export the path for pcs to use.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

spec: Add isa version of corosync-devel provides

Also add release to version to match autogenerated corosynclib-devel
provides.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totemconfig: remove redundant nodeid error log

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totemsrp: More informative messages

... when token and consensus timeouts pop.

Signed-off-by: Aleksei Burlakov <aburlakov@suse.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

config: Increase default token timeout to 3000 ms

Default token timeout of 1000 ms was often changed by users because of
other workloads on machine which may make corosync responding a bit
later than needed and resulting in token loss.

3000 ms was chosen as a compromise between token timeout increase
and allow live cluster upgrade (other nodes should receive token
by node with new default on time).

It doesn't affect token token_coefficient so final token timeout still
depends on number of configured nodes (just base is higher).

This change slows down failover a bit so for clusters where failover
times are important, please change the token timeout in configuration
file corosync.conf as a:

totem {
version: 2
token: 1000
...

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

man: votequorum.5: use proper single quotes

Backtick and apostrophe are formatted as directional quotes by plain
groff, but they behave literally in the body of a man page.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

man: fix typo: avaialable

By slightly rewording the documentation of knet_compression_model.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

tests: Use CS_DISPATCH_BLOCKING instead of cycle

Some tests were using dispatch function in CS_DISPATCH_ALL mode
without poll/select on fd. This leads to busywait cycle, because
CS_DISPATCH_ALL masks CS_ERR_TRY_AGAIN error.

Simpliest solution is to use CS_DISPATCH_BLOCKING instead and remove
while cycle, because CS_DISPATCH_BLOCKING handles CS_ERR_TRY_AGAIN
correctly.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

quorum: Add support for nodelist callback

Current quorum callback contains only actual view list and there is no
way how to find out joined/left nodes. This cannot be emulated by user
app, because when corosync restarts before other nodes notices then view
list is unchanged (ring id is changed tho).

Solution is to implement similar callback as for cpg which contains ring
id, member list, joined list and left list.

To implement such callback and keep backwards compatibility,
quorum_model_initialize is introduced. Its behavior is similar to
cpg_model_initialize. This allows passing model v1, which contains
enhanced quorum (full ring id is passed instead of just seq number)
and nodelist callbacks.

To find out which events should be sent by corosync daemon, new message
MESSAGE_REQ_QUORUM_MODEL_GETTYPE is used. Quorum library on init was
sending MESSAGE_REQ_QUORUM_GETTYPE. Whem model v1 is requested the
MESSAGE_REQ_QUORUM_MODEL_GETTYPE is used, which contains model number
so corosync knows that client is using model v1 and can send enhanced
quorum and nodelist events.

Nodelist event is (for now) send both in case of change of membership
and also when requested, also when CS_TRACK_CURRENT is requested, but
then left_list and joined_list is left empty, because they don't make
too much sense there.

New test application testquorummodel is added as an example of new API
usage.

Also during patch developement, I found few bugs here and there, which
are also fixed:
- quorum_initialize was never returning error code returned by
  MESSAGE_REQ_QUORUM_GETTYPE call (always returned CS_OK)
- Allocated memory in send_library_notification was based
  on sizeof(unsigned int) instead of mar_uint32_t. That's not wrong,
  but   it make more sense to use sizeof(mar_uint32_t) instead

(big thanks to Chrissie for englishify the man pages)

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

man: reload during rolling upgrade

Make it clear that reloads during a rolling upgrade are not
supported.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

totemsrp: Move token received callback

Trigger token received callback only for valid token.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

common_lib: Remove trailing spaces in cs_strerror

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totemconfig: improve linknumber checking

Check whether linknumber larger than INTERFACE_MAX and display error if
so.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

totemconfig: add interface number to the error str

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

cfg: enhance message_handler_req_lib_cfg_killnode

While execute corosync-cfgtool -k <nodeid> to kill node:
* Check whether nodeid exists
* Check whether the node was joined

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

totemconfig: validate totem.transport value

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

cmapctl: return error on no result of print prefix

return EXIT_FAILURE if no result print for ACTION_PRINT_PREFIX.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

cmapctl: check NULL for key type and value for -p

To avoid segmentation fault.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

quorumtool: strict check for -o option

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

quorumtool: Help shouldn't require running service

Do not require corosync running when usage is requested.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

cfgtool: Return error when -i doesn't match

Give error message and EXIT_FAILURE return code when -i
option doesn't match.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

man: update output of -s and -b for cfgtool

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

cmapctl: return EXIT_FAILURE on failure

For -g and -d option return EXIT_FAILURE when error occurs (most often
because key does not exist).

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

tools: use util_strtonum for options checking

Function atoi is not safe since miss validation;
Function strtol is better but need to consider empty string and overflows
Function util_strtonum is a safer wrapper of strtoll

Use util_strtonum to check nodeid option and strict checking condition.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

cfgtool: enhancement -a option

* Add return code
* Give error message when nodeid not exist

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

cfgtool: output error messages to stderr

... and standardize the return code

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

configure: Use default systemd path with prefix

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>

build: Use git-version-gen during specfile build

Instead of copying parts of git-version-gen for spec target use
git-version-gen directly and parse final version into components
(rpmver, alphatag, numcomm) and use them.

Main reason is to simplify code a bit (sed scripts are a bit repetitive
tho), reuse the code and also allow building of RPM from dist tarball
generated from non-tagged commit or dirty git (not very useful).

The code relies on fact, that hyphen is never used in tagged release
name.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>

build: Update git-version-gen

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>

spec: Require at least knet 1.18 for crypto reload

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>

config: Allow reconfiguration of crypto options

Needs new knet crypto API.

If it's not available, then fall back to the old
API and forbid changing crypto while running.

To avoid us being dependant on the leader node, each
node sends its own crypto_reconfig_phase messages so
we can guarantee that the reconfiguration always completes
on each node.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

test: Fix cpgtest

... to cope with the max number of group members.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

config: Fix crash when a reload fails twice

Have string values stored in char arrays in totem_config
so we don't get into a mess with the pointers.

Also remove vsftype (which hasn't been used since corosync 1)

Use strncpy even though we know the string is fine. Keep covscan happy

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

config: Don't free pointers used by transports

reload failed for UDP[U] because they had saved pointers
to the interfaces[] array. so memcpy into that rather then
re-allocate it.

Also, move the check for different IP address families so
it also gets run at reload time.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

config: don't reload vquorum if reload fails

Fix an 'error: success' stype message by propogating error_string
back down the stack.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

cfg: Improve error return to cfgtool -R

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

config: Reorganise the config system

To be more reliable & maintainable

The basic plan here is to fix reloads to be more stable
using read/parse/verify/build/commit stages, so that any errors
will not leave corosync in an unstable state. This should
also make the code more maintainable as currently the verify/commit
stages are horribly intertwined.

Also:
- Fix local_node_pos not being updated in the new map during validation
(broke adding and removing new nodes in the middle of the list).
- Fix reconfiguration so that nodes are indexed by nodeid and not their
position in the list. This is an old bug that's just been carried
over

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

Revert "totemip: compare sin6_scope_id and interface_num"

This reverts commit efd34df531d1b23d6458dca863a7517b7ac0099d to make
master compile after revert of 934c47ed4384daf2819c26306bebba3225807499.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>

Revert "totemip: Add support for sin6_scope_id"

This reverts commit 934c47ed4384daf2819c26306bebba3225807499 which is
causing protocol incompatibility in needle. Master seems to be not
affected, but it needs more checking.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>

cfgtool: Fix error code as described in MP

If all links are connected 0 is returned to the shell, otherwise it's
error code 1.

Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

icmap: icmap_init_r() leaks if trie_create() fails

Thanks to Coverity for finding this

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>

votequorum: set wfa status only on startup

Previously reload of configuration with enabled wait_for_all result in
set of wait_for_all_status which set cluster_is_quorate to 0 but didn't
inform the quorum service so votequorum and quorum information may get
out of sync.

Example is 1 node cluster, which is extended to 3 nodes. Quorum service
reports cluster as a quorate (incorrect) and votequorum as not-quorate
(correct). Similar behavior happens when extending cluster in general,
but some configurations are less incorrect (3->4).

Discussed solution was to inform quorum service but that would mean
every reload would cause loss of quorum until all nodes would be seen
again.

Such behaviour is consistent but seems to be a bit too strict.

Proposed solution sets wait_for_all_status only on startup and
doesn't touch it during reload.

This solution fulfills requirement of "cluster will be quorate for
the first time only after all nodes have been visible at least
once at the same time." because node clears wait_for_all_status only
after it sees all other nodes or joins cluster which is quorate. It also
solves problem with extending cluster, because when cluster becomes
unquorate (1->3) wait_for_all_status is set.

Added assert is only for ensure that I haven't missed any case when
quorate cluster may become unquorate.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

quorumtool: exit on invalid expected votes

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

votequorum: Change check of expected_votes

Previously value of new expected_votes was checked so newly computed
quorum value was in the interval <total_votes / 2, total_votes>. The
upper range prevented the cluster to become unquorate, but bottom check
was almost useless because it allowed to change expected_votes so it is
smaller than total_votes.

Solution is to check if expected_votes is bigger or equal to total_votes
and for quorate cluster only check if cluster doesn't become unquorate
(for unquorate cluster one can set upper range freely - as it is
perfectly possible when using config file)

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

cfgtool: Simplify output a bit for link status

Display words connected/disconnected instead of 1/0 and show enabled
status only when link is not enabled (shouldn't happen).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

man: Enhance link_mode priority description

Some users found description of priority for passive link_mode
confusing (probably because "priority" word is too
overloaded) so add some redundancy to make description
unambiguous.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

main: Add schedmiss timestamp into message

This is useful for matching schedmiss event in stats map with logged
event.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>

totemip: compare sin6_scope_id and interface_num

When user configure a specific interface like vlan
with the same IPv6 link-local address, Corosync should
compare sin6_scope_id with interface_num, to make sure got
the right interface to bind

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>