git.proxmox.com Git - mirror_corosync.git/log
12 years ago onecrypt: move encryption code to crypto.c
Jan Friesse [Tue, 13 Mar 2012 10:34:33 +0000 (11:34 +0100)]
onecrypt: move encryption code to crypto.c

This removes code duplication.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
12 years ago cfg: remove crypto_set
Jan Friesse [Tue, 13 Mar 2012 08:36:55 +0000 (09:36 +0100)]
cfg: remove crypto_set

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
12 years ago corosync-cfgtool: Remove set of cryptography
Jan Friesse [Tue, 13 Mar 2012 08:30:51 +0000 (09:30 +0100)]
corosync-cfgtool: Remove set of cryptography

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
12 years ago Remove libtomcrypt
Jan Friesse [Mon, 12 Mar 2012 15:46:51 +0000 (16:46 +0100)]
Remove libtomcrypt

The libtomcrypt copy in corosync has not been updated for a long time.
Because we have support for libnss, libtomcrypt can be removed.

A few leftovers (AES is 256 bits, not 128, ...) are removed as well.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago drop evs service
Fabio M. Di Nitto [Sun, 11 Mar 2012 06:50:58 +0000 (07:50 +0100)]
drop evs service

there are several reasons for this:

1) evs is only partially implemented with no plans to complete it

typedef enum {
       EVS_TYPE_UNORDERED, /* not implemented */
       EVS_TYPE_FIFO,          /* same as agreed */
       EVS_TYPE_AGREED,
       EVS_TYPE_SAFE           /* not implemented */
} evs_guarantee_t;

2) evs has no users in any upstream distribution and no search
   engine can find any other upstream using it.

3) the only reason (I was told) to carry around evs was that evs
   receives the full ring_id struct from totem. This is only
   partially correct because while the structures are prepared
   to carry around those data, they are never transmitted from
   corosync engine down the IPC line to the user.
   CPG ring_id contains the exact same information and it's
   actually less buggy (due to prototyping of the info).

in the worst case, where a user absolutely needs libevs, it can
easily be reimplemented as a libcpg wrapper, avoiding lots of
code duplication.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
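
As a rough illustration of the "libcpg wrapper" idea in the commit above, a wrapper would mostly need to translate guarantee levels. The sketch below is only an assumption about how that could look: `wrap_guarantee_t` and `evs_to_wrapped_guarantee` are hypothetical names, not part of any corosync API, and only the `evs_guarantee_t` enum comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Guarantee levels as quoted in the commit message. */
typedef enum {
    EVS_TYPE_UNORDERED, /* not implemented */
    EVS_TYPE_FIFO,      /* same as agreed */
    EVS_TYPE_AGREED,
    EVS_TYPE_SAFE       /* not implemented */
} evs_guarantee_t;

/* Hypothetical stand-in for the guarantee type of the wrapped library. */
typedef enum {
    WRAP_TYPE_AGREED
} wrap_guarantee_t;

/*
 * Translate an evs guarantee into the wrapped one.
 * Returns 0 on success, -1 for levels evs never implemented.
 */
static int evs_to_wrapped_guarantee(evs_guarantee_t in, wrap_guarantee_t *out)
{
    assert(out != NULL);

    switch (in) {
    case EVS_TYPE_FIFO:   /* evs treated FIFO as agreed */
    case EVS_TYPE_AGREED:
        *out = WRAP_TYPE_AGREED;
        return 0;
    case EVS_TYPE_UNORDERED:
    case EVS_TYPE_SAFE:
    default:
        return -1;
    }
}
```

Rejecting the two unimplemented levels up front keeps the wrapper honest about what evs actually delivered.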
12 years ago build: drop another leftover from the past
Fabio M. Di Nitto [Sun, 11 Mar 2012 08:21:18 +0000 (09:21 +0100)]
build: drop another leftover from the past

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago build: drop obsoleted SOCKETDIR option
Fabio M. Di Nitto [Sun, 11 Mar 2012 08:02:44 +0000 (09:02 +0100)]
build: drop obsoleted SOCKETDIR option

yet another leftover from the past that can go away

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago build: drop last LCRSO references
Fabio M. Di Nitto [Sun, 11 Mar 2012 07:55:15 +0000 (08:55 +0100)]
build: drop last LCRSO references

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago pload: make it a test service and not a public one
Fabio M. Di Nitto [Sat, 10 Mar 2012 15:59:14 +0000 (16:59 +0100)]
pload: make it a test service and not a public one

pload is a performance benchmark that measures the onwire
speed of corosync.

problem is that once pload has been executed, the cluster
is basically dead.

turn pload into a test tool, by removing corosync-pload tool
and user library.

cleanup pload code to make it more readable and drop lots
of unnecessary stuff.

add test/ploadstart tool that can configure and start pload
via cmap calls.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago totem: drop crypt_accept: concept/option
Fabio M. Di Nitto [Fri, 9 Mar 2012 12:54:12 +0000 (13:54 +0100)]
totem: drop crypt_accept: concept/option

this was another old onwire compat mode that is not useful any longer.

we can safely move to the new model by default.

According to Honza (real hardware, 1 node testing) there is no
performance impact.

In my tests (8-node VM cluster), there is up to 10/12% performance
improvement up to 1M packet size, where old and new models are equal.

As a side note, nss still shows a performance loss on both
real and virtual hw (without any kind of nss hw acceleration).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago Fix typo in stats key name.
Angus Salkeld [Fri, 9 Mar 2012 02:59:35 +0000 (13:59 +1100)]
Fix typo in stats key name.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
12 years ago Remove unused function logsys_priority_name_get()
Angus Salkeld [Wed, 7 Mar 2012 23:40:28 +0000 (10:40 +1100)]
Remove unused function logsys_priority_name_get()

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
12 years ago Add pid, hostname and process name to the logfile
Angus Salkeld [Wed, 7 Mar 2012 23:37:05 +0000 (10:37 +1100)]
Add pid, hostname and process name to the logfile

Note this is only for file targets not stderr or syslog.

https://bugzilla.redhat.com/show_bug.cgi?id=789925

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
12 years ago drop last references to compatibility: whitetank
Fabio M. Di Nitto [Fri, 9 Mar 2012 10:38:54 +0000 (11:38 +0100)]
drop last references to compatibility: whitetank

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
12 years ago utils: cleanup main daemon exit codes
Fabio M. Di Nitto [Fri, 9 Mar 2012 09:46:50 +0000 (10:46 +0100)]
utils: cleanup main daemon exit codes

some of them are not in use anymore and can be dropped.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
12 years ago sync: kill evil and syncv1 in one shot
Fabio M. Di Nitto [Fri, 9 Mar 2012 09:36:27 +0000 (10:36 +0100)]
sync: kill evil and syncv1 in one shot

this change breaks onwire compatibility.

cpg is the only user of sync_* interface and it's the only
service that will require extra testing.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
12 years ago man: Add cmap pages to index.html v1.99.7
Jan Friesse [Mon, 5 Mar 2012 15:50:59 +0000 (16:50 +0100)]
man: Add cmap pages to index.html

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago man: Add description of cpg_iteration_* functions
Jan Friesse [Mon, 5 Mar 2012 15:42:05 +0000 (16:42 +0100)]
man: Add description of cpg_iteration_* functions

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago man: Fix cmap_iter_finalize typo
Jan Friesse [Mon, 5 Mar 2012 14:06:14 +0000 (15:06 +0100)]
man: Fix cmap_iter_finalize typo

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago Treat ENOMSG as TRY_AGAIN.
Angus Salkeld [Mon, 5 Mar 2012 11:10:02 +0000 (22:10 +1100)]
Treat ENOMSG as TRY_AGAIN.

ENOMSG is returned by the ringbuffer when you attempt to read
a message and there is nothing there to read.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
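
The ENOMSG handling described in the commit above can be pictured as one extra case in an errno-to-error mapping. This is a minimal sketch, not the actual corosync code: `sketch_error_t` and `sketch_errno_to_error` are hypothetical stand-ins, and it assumes the lower layer reports failures as negative errno values.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for the library error type. */
typedef enum {
    SKETCH_OK,
    SKETCH_ERR_TRY_AGAIN,
    SKETCH_ERR_LIBRARY
} sketch_error_t;

/* Map a negative-errno style result to an API error code. */
static sketch_error_t sketch_errno_to_error(int result)
{
    assert(result != INT_MIN); /* negating INT_MIN would overflow */

    if (result >= 0) {
        return SKETCH_OK;
    }

    switch (-result) {
    case EAGAIN:
    case ENOMSG: /* ring buffer had nothing to read yet */
        return SKETCH_ERR_TRY_AGAIN;
    default:
        return SKETCH_ERR_LIBRARY;
    }
}
```

Folding ENOMSG into the retry case lets callers treat an empty ring buffer like any other transient condition.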
12 years ago Add common IPC errors.
Angus Salkeld [Mon, 5 Mar 2012 11:05:08 +0000 (22:05 +1100)]
Add common IPC errors.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago quorumtool: improve display of status data
Fabio M. Di Nitto [Mon, 5 Mar 2012 12:12:55 +0000 (13:12 +0100)]
quorumtool: improve display of status data

always display membership data from the local node

display when a node is unknown to the local node instead of an error
from IPC.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: move last malloc/alloca buf to static
Fabio M. Di Nitto [Mon, 5 Mar 2012 11:47:02 +0000 (12:47 +0100)]
votequorum: move last malloc/alloca buf to static

this should guarantee that votequorum won't fail under high memory
pressure. The price is 3500 extra bytes preallocated at startup.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: fix node allocation memory leak
Fabio M. Di Nitto [Mon, 5 Mar 2012 11:39:54 +0000 (12:39 +0100)]
votequorum: fix node allocation memory leak

stop using malloc for each new node, because we cannot free the memory
easily. Move to a statically allocated buffer that can contain
PROCESSOR_MAX + qdevice cluster_node instead.

We can never have more than PROCESSOR_MAX nodes anyway and the memory
footprint is small enough compared to memory leaks (those can
effectively happen only in very dynamic clusters with tons of different
nodes joining/leaving with different nodeids).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
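
The move from per-node malloc to a static buffer described above could look roughly like the following pool. It is a sketch under assumptions: the names are hypothetical, `SKETCH_PROCESSOR_MAX` is an illustrative value rather than the real constant, and slot reuse is a choice made for this example that the commit message does not describe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PROCESSOR_MAX 384                    /* assumed value */
#define SKETCH_POOL_SIZE (SKETCH_PROCESSOR_MAX + 1) /* nodes + qdevice */

struct sketch_node {
    uint32_t node_id;
    int in_use;
};

/* Preallocated once, so no allocation can fail at runtime. */
static struct sketch_node sketch_pool[SKETCH_POOL_SIZE];

/* Hand out the first free slot; returns NULL when the pool is full. */
static struct sketch_node *sketch_node_alloc(uint32_t node_id)
{
    size_t i;

    for (i = 0; i < SKETCH_POOL_SIZE; i++) {
        if (!sketch_pool[i].in_use) {
            memset(&sketch_pool[i], 0, sizeof(sketch_pool[i]));
            sketch_pool[i].node_id = node_id;
            sketch_pool[i].in_use = 1;
            return &sketch_pool[i];
        }
    }
    return NULL;
}

/* Return a slot to the pool; the pointer must come from the pool. */
static void sketch_node_release(struct sketch_node *node)
{
    assert(node >= sketch_pool && node < sketch_pool + SKETCH_POOL_SIZE);
    node->in_use = 0;
}
```

The footprint is fixed at startup, which is the trade-off the commit message accepts in exchange for never leaking node structures.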
12 years ago votequorum: rename leave_remove to allow_downscale
Fabio M. Di Nitto [Fri, 2 Mar 2012 09:56:07 +0000 (10:56 +0100)]
votequorum: rename leave_remove to allow_downscale

it was pointed out that leave_remove can be easily confused with the old
cman leave_remove behavior. The two are substantially different
and we need to avoid confusion both for users and our support team.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: fix handling of config updates
Fabio M. Di Nitto [Fri, 2 Mar 2012 10:10:22 +0000 (11:10 +0100)]
votequorum: fix handling of config updates

cmap changes are local to the node only and should not be broadcast
as configuration changes.

if any change has happened to us, we will inform other nodes via
send_nodeinfo.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: free our data and lists on exit
Fabio M. Di Nitto [Fri, 2 Mar 2012 09:07:10 +0000 (10:07 +0100)]
votequorum: free our data and lists on exit

this is mostly to avoid valgrind errors on exit and make the output
more readable.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: disallow special features vs qdevice
Fabio M. Di Nitto [Thu, 1 Mar 2012 13:42:01 +0000 (14:42 +0100)]
votequorum: disallow special features vs qdevice

simply taking the safest path here since integration of qdevice is not
fully complete

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: fix node check based on reconfig parameter
Fabio M. Di Nitto [Thu, 1 Mar 2012 12:46:49 +0000 (13:46 +0100)]
votequorum: fix node check based on reconfig parameter

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: make a common function to calculate votes and cluster members
Fabio M. Di Nitto [Thu, 1 Mar 2012 11:01:56 +0000 (12:01 +0100)]
votequorum: make a common function to calculate votes and cluster members

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: incorporate static config into dynamic
Fabio M. Di Nitto [Thu, 1 Mar 2012 10:36:42 +0000 (11:36 +0100)]
votequorum: incorporate static config into dynamic

no functional changes or extra features yet

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: move all configuration in votequorum_readconfig
Fabio M. Di Nitto [Thu, 1 Mar 2012 10:14:23 +0000 (11:14 +0100)]
votequorum: move all configuration in votequorum_readconfig

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: start moving from static to fully dynamic config
Fabio M. Di Nitto [Thu, 1 Mar 2012 09:59:41 +0000 (10:59 +0100)]
votequorum: start moving from static to fully dynamic config

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: disallow wait_for_all and qdevice operations
Fabio M. Di Nitto [Thu, 1 Mar 2012 09:37:27 +0000 (10:37 +0100)]
votequorum: disallow wait_for_all and qdevice operations

The problem here is that user expectations, when using both modes
at the same time, have not been set yet. There are 2/3 options
that need investigation.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago votequorum: improve debugging output
Fabio M. Di Nitto [Wed, 29 Feb 2012 09:40:23 +0000 (10:40 +0100)]
votequorum: improve debugging output

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago Always set interface_up in totemip_iface_check
Jan Friesse [Thu, 1 Mar 2012 15:39:06 +0000 (16:39 +0100)]
Always set interface_up in totemip_iface_check

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago votequorum: fix node->flags type when receiving nodeinfo messages
Fabio M. Di Nitto [Wed, 29 Feb 2012 08:37:35 +0000 (09:37 +0100)]
votequorum: fix node->flags type when receiving nodeinfo messages

old_flags was set to uint16_t but it needs to be uint32_t.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
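
To show why the `old_flags` width in the commit above matters, here is a small sketch of the truncation that a 16-bit temporary causes. The function names are hypothetical and the flag bit chosen (bit 16) is only an example; the commit message does not say which flags were affected.

```c
#include <assert.h>
#include <stdint.h>

/* Test a single flag bit; bit must be in the range 0..31. */
static int sketch_flag_is_set(uint32_t flags, unsigned int bit)
{
    assert(bit < 32);
    return (int)((flags >> bit) & 1u);
}

/* Buggy pattern: a 16-bit temporary silently drops bits 16..31. */
static uint32_t sketch_save_flags_narrow(uint32_t flags)
{
    uint16_t old_flags = (uint16_t)flags;
    return old_flags;
}

/* Fixed pattern: the temporary is as wide as the field it copies. */
static uint32_t sketch_save_flags_wide(uint32_t flags)
{
    uint32_t old_flags = flags;
    return old_flags;
}
```

With the narrow copy, any comparison of old and new flags would miss changes in the upper half of the field.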
12 years ago votequorum: fix segfault in wfa status update
Fabio M. Di Nitto [Wed, 29 Feb 2012 07:53:28 +0000 (08:53 +0100)]
votequorum: fix segfault in wfa status update

this is a regression introduced by cb5fd775

when reading static config us->flags does not exist yet and therefore
setting it will cause a segfault.

Move the settings after cluster_node *us is created, with the long
term plan to simply kill the whole _static readconfig bits
in favour of dynamic (runtime changeable) bits.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
12 years ago quorumtool: improve Membership information output v1.99.6
Fabio M. Di Nitto [Tue, 28 Feb 2012 10:26:54 +0000 (11:26 +0100)]
quorumtool: improve Membership information output

align nodeid, votes and name to make it all more readable

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago quorumtool: make output more human friendly and retain machine parsable bits
Fabio M. Di Nitto [Tue, 28 Feb 2012 09:42:48 +0000 (10:42 +0100)]
quorumtool: make output more human friendly and retain machine parsable bits

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago quorumtools: fix typo in man page
Fabio M. Di Nitto [Tue, 28 Feb 2012 09:03:18 +0000 (10:03 +0100)]
quorumtools: fix typo in man page

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago quorumtools: drop unused option parsing
Fabio M. Di Nitto [Tue, 28 Feb 2012 09:03:01 +0000 (10:03 +0100)]
quorumtools: drop unused option parsing

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago quorumtool: fix version display info
Fabio M. Di Nitto [Tue, 28 Feb 2012 09:01:13 +0000 (10:01 +0100)]
quorumtool: fix version display info

we don't need that on every run

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years ago quorumtool: swap node state and node votes output
Fabio M. Di Nitto [Mon, 27 Feb 2012 09:41:41 +0000 (10:41 +0100)]
quorumtool: swap node state and node votes output

there is no point in showing the votes if the node is dead

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
12 years ago votequorum: fix votequorum_getinfo man page and align struct name
Fabio M. Di Nitto [Mon, 27 Feb 2012 09:40:41 +0000 (10:40 +0100)]
votequorum: fix votequorum_getinfo man page and align struct name

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
12 years ago quorumtool: update man page and help text
Fabio M. Di Nitto [Mon, 27 Feb 2012 09:10:15 +0000 (10:10 +0100)]
quorumtool: update man page and help text

improve error output since this is more than a debugging tool now

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
12 years ago votequorum: major rework to fix qdevice API and integration with core
Fabio M. Di Nitto [Thu, 23 Feb 2012 10:01:53 +0000 (11:01 +0100)]
votequorum: major rework to fix qdevice API and integration with core

qdevice is a very special node in the cluster and it adds a certain
amount of complexity and special cases across the code.

most of the qdevice data are shared across the cluster (name/votes)
but effectively each node has a different view of the qdevice
(registered/unregistered/voting/etc.)

with this change, we align the qdevice view across the nodes,
exchanging more data between nodes, and we fix how qdevice behaves
and how it is configured.

The only side effect is that the amount of data transmitted on wire
is slightly higher.

The qdevice API is still disabled by default. This means that
the amount of real change in current code is a lot smaller
than this patch makes it appear.

TODO: documentation/man pages need to be updated once
      this change is in (and behavior finalized).

User visible changes:

- configuration (coroparse, exec/votequorum):
  the quorum device section is now standalone within the quorum.

  quorum {
    provider: corosync_votequorum
    device {
      model: (name)
      timeout: (millisec)
      votes:
    }
  }

  the keyword "model:" is mandatory to enable qdevice in configuration
  and should express the name of the script/daemon that will provide
  the qdevice. Looking into the future, an init script or systemd
  service will look for that name in /path/to/be/decided/name
  and start/stop qdevice.

  timeout: defines the maximum interval the qdevice implementation
  has available between polls (see votequorum_qdevice_poll.3) before
  the device is considered dead and its votes are discarded

  votes: is now a configuration parameter and not an API call.
  quorum devices don't care what they need to vote.
  votes is autocalculated when a nodelist is available and all
  nodes in the list vote 1. Otherwise this parameter is mandatory.

- configuration (exec/votequorum):
  startup and runtime configuration changes have been improved.
  errors at startup are considered fatal. errors at runtime
  have different exit paths.

  startup:

  * quorum.two_node and qdevice are incompatible.
  * quorum.expected_votes requires quorum.device.votes.
  * quorum.expected_votes - quorum.device.votes cannot be lower
    than 2.
  * qdevice and last_man_standing are mutually exclusive.
  * qdevice and auto_tie_breaker are mutually exclusive.

  runtime config changes:

  * quorum.two_node and qdevice are incompatible:
    if quorum device is alive, two_node is disabled.
    if quorum device is not alive and node count is 2, two_node is
       enabled, and quorum device cannot be registered

  * if either last_man_standing or auto_tie_breaker were enabled
    at startup, and at runtime quorum device is configured,
    quorum device registration will be blocked.

  * if quorum.expected_votes is configured but not quorum.device.votes,
    quorum device registration will be blocked.

  * if quorum.device.votes is not configured and we cannot
    automatically calculate it, quorum device registration will be blocked.

  * An error in configuring quorum.expected_votes and quorum.device.votes
    will block quorum device registration.

blocking quorum device registration also means dropping the votes.

quorum.device.votes (either set or automatically calculated) is now
used to determine current expected_votes in the cluster.

- logging (exec/votequorum):

  all errors from configuration are treated as WARNING/CRITICAL.

  lots of extra DEBUG output is added (see internal changes too).

- corosync-quorumtool (tools/corosync-quorumtool):

  * added option to forcefully kick out a quorum device from the local
    node. This is for emergency recovery only and it is only
    available when qdevice API is built-in.

  * Improved status output, specifically add node state and qdevice
    information

[root@fedora-master-node2 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          132
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
   0     1  QDEVICE (Voting)

  * allow printing status for any node in the cluster known to
    the local node.

[root@fedora-master-node1 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net

[root@fedora-master-node1 coro]# corosync-quorumtool -s -n 2
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
   0     1  QDEVICE (Voting)

Internal changes:

- change qdevice timer to not run all the time, but only when necessary.
- change votequorum_nodeinfo on wire data to use flags instead of uint8_t
  and add QDEVICE status.
- allocate nodeid 0 to qdevice since it's the only real
  nodeid that can be reserved.
- change send_nodeinfo to allow sending nodeinfo for any node
  so that we can share qdevice info across the cluster
  (and this might be useful in future if we need to sync
   internal cluster view).
- add votequorum api call to update qdevice name
- add runtime data if quorum device has been forcefully disabled
  by config error
- add qdevice votes to expected_votes calculation (this
  is probably the biggest difference vs cman)
- change votequorum_read_nodelist_configuration so that
  we can autocalculate votes for qdevice (we need the nodecount
  vs votes).
- add all checks for startup/runtime config (see above).
- do not make qdevice part of the membership_list received from
  totem. None of our users care about it and it is not a real node.
- change onwire message handlers to deal with the "data for this node from any node"
  case and understand nodeid 0 for qdevice info
- always allocate qdevice at startup. this simplifies code a lot.
- dispatch qdevice nodeinfo on membership changes.
- inform libvotequorum users when a qdevice is registered
- improve substantially qdevice api and add a simple
  barrier based on qdevice name.
- add qdevice API barrier at cluster level. This feature allows
  only one qdevice name to be active in the cluster at any time.
- qdevice getinfo can now report status for qdevice on any node.
- slightly change the way the qdevice API is built in/out:
  only the libvotequorum calls are #ifdef'ed out now. Doing so in
  the core is too complex and would make the code unreadable,
  with the risk of missing a bit or two and effectively introducing
  an on-wire incompatibility if we ever turn the API on.
- probably added some bugs on the way...

TODO: update qdevice_* API once the above is settled and test
      qdevice integration with other features.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com> (only second part)
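
The startup rules listed in the commit above can be read as a single validation pass. The sketch below is a hedged restatement of those five rules in C: the struct, its field names, and `sketch_check_startup` are hypothetical and are not taken from exec/votequorum; a value of 0 is assumed to mean "not configured".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of the relevant quorum settings. */
struct sketch_quorum_config {
    int qdevice_configured;
    int two_node;
    int last_man_standing;
    int auto_tie_breaker;
    unsigned int expected_votes; /* 0: not configured */
    unsigned int device_votes;   /* 0: not configured */
};

/* Returns NULL if the startup config is acceptable, else a reason. */
static const char *sketch_check_startup(const struct sketch_quorum_config *c)
{
    assert(c != NULL);

    if (!c->qdevice_configured) {
        return NULL; /* the rules below only concern qdevice */
    }
    if (c->two_node) {
        return "quorum.two_node and qdevice are incompatible";
    }
    if (c->last_man_standing) {
        return "qdevice and last_man_standing are mutually exclusive";
    }
    if (c->auto_tie_breaker) {
        return "qdevice and auto_tie_breaker are mutually exclusive";
    }
    if (c->expected_votes > 0) {
        if (c->device_votes == 0) {
            return "quorum.expected_votes requires quorum.device.votes";
        }
        if (c->expected_votes < c->device_votes ||
            c->expected_votes - c->device_votes < 2) {
            return "expected_votes - device.votes cannot be lower than 2";
        }
    }
    return NULL;
}
```

Per the commit message, a non-NULL result at startup would be fatal, while the same conditions at runtime only block quorum device registration.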
12 years ago build: fix fallout from switching to common shared lib v1.99.5
Fabio M. Di Nitto [Wed, 22 Feb 2012 08:05:14 +0000 (09:05 +0100)]
build: fix fallout from switching to common shared lib

when building corosync on a clean system or for the very first
time, corosync_common needs to be visible both via -L for link
and for the LD_PATH, otherwise the linker cannot resolve
normal library dependencies.

This issue does NOT affect corosync users, but it's confined
to internal corosync only.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
12 years ago Document SAM_RECOVERY_POLICY_CMAP
Jan Friesse [Fri, 17 Feb 2012 11:05:49 +0000 (12:05 +0100)]
Document SAM_RECOVERY_POLICY_CMAP

Also, all irrelevant references to SAM_RECOVERY_POLICY_CONFDB are
corrected.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago Tweak nodeid warning
Jan Friesse [Fri, 17 Feb 2012 10:59:59 +0000 (11:59 +0100)]
Tweak nodeid warning

The nodeid warning now appears only when both totem.nodeid and nodelist
nodeid exist. When nodelist nodeid is not defined, totem.nodeid is
used.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago spec: Add optional xmlconf
Jan Friesse [Tue, 21 Feb 2012 11:40:36 +0000 (12:40 +0100)]
spec: Add optional xmlconf

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
12 years ago iba: Use configured node id
Jan Friesse [Tue, 21 Feb 2012 13:30:35 +0000 (14:30 +0100)]
iba: Use configured node id

Corosync was ignoring nodeid for iba transport and always used
an autogenerated one.

Original patch by: Jason Dillaman <jdillama@redhat.com>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
12 years ago Convert the common lib into a shared lib.
Angus Salkeld [Tue, 21 Feb 2012 09:26:08 +0000 (20:26 +1100)]
Convert the common lib into a shared lib.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
12 years ago Allow autoconfiguration of interface section
Jan Friesse [Wed, 15 Feb 2012 16:04:25 +0000 (17:04 +0100)]
Allow autoconfiguration of interface section

Thanks to the totemip_getifaddrs infrastructure it's now possible to use
nodelist information to autoconfigure interface bindnetaddr. Together
with cluster_name, the interface section can be completely omitted.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago totemconfig: ensure suffix for ringX_addr
Jan Friesse [Wed, 15 Feb 2012 16:00:25 +0000 (17:00 +0100)]
totemconfig: ensure suffix for ringX_addr

This patch makes sure that the ringX_addr key really has the _addr suffix.
Previously, it was possible to enter ringXanything and it was
interpreted as ringX_addr.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
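
The stricter key check described in the commit above amounts to requiring that nothing but `_addr` follows the ring number. A minimal sketch of such a parser, assuming a hypothetical helper name `sketch_parse_ring_key` (the real totemconfig code may be structured differently):

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a key of the form "ring<N>_addr".
 * Returns 0 and stores N on success, -1 for anything else.
 */
static int sketch_parse_ring_key(const char *key, unsigned int *ring_no)
{
    char *end;
    unsigned long n;

    assert(key != NULL && ring_no != NULL);

    /* Require the "ring" prefix followed by at least one digit. */
    if (strncmp(key, "ring", 4) != 0 || !isdigit((unsigned char)key[4])) {
        return -1;
    }

    n = strtoul(key + 4, &end, 10);
    if (n > UINT_MAX) {
        return -1;
    }

    /* The remainder must be exactly "_addr", not just any suffix. */
    if (strcmp(end, "_addr") != 0) {
        return -1;
    }

    *ring_no = (unsigned int)n;
    return 0;
}
```

Comparing the whole remainder with `strcmp` is what rejects inputs such as `ring0anything` that a prefix-only check would accept.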
12 years ago cmap: Handle NULL in [i]cmap_set_string value
Jan Friesse [Wed, 15 Feb 2012 15:59:19 +0000 (16:59 +0100)]
cmap: Handle NULL in [i]cmap_set_string value

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago Create solaris specific getifaddrs
Jan Friesse [Wed, 15 Feb 2012 12:45:53 +0000 (13:45 +0100)]
Create solaris specific getifaddrs

This not only makes it possible to use the generic totemip_iface_check, but
also fixes some problems with the previous implementation (fixed mask,
poorly supported ipv6, ...)

Tested on OpenIndiana 151a

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago Add totemip_iface_check based on totemip_getifaddrs
Jan Friesse [Wed, 15 Feb 2012 10:48:23 +0000 (11:48 +0100)]
Add totemip_iface_check based on totemip_getifaddrs

Also Linux and BSD/Darwin specific bits are no longer needed, so they
are gone.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago Add generic implementation of getifaddrs
Jan Friesse [Wed, 15 Feb 2012 09:47:22 +0000 (10:47 +0100)]
Add generic implementation of getifaddrs

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago build: fix make dist to include xml man pages v1.99.4
Fabio M. Di Nitto [Tue, 14 Feb 2012 12:45:18 +0000 (13:45 +0100)]
build: fix make dist to include xml man pages

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
12 years ago Change the IPC TIMEOUT to block. v1.99.3
Angus Salkeld [Tue, 14 Feb 2012 10:27:02 +0000 (21:27 +1100)]
Change the IPC TIMEOUT to block.

This is to make sure that we properly wait for responses
from corosync. I have made a fix to libqb to properly
handle the case when corosync exits/crashes between
a send and receive.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
12 years ago CPG: fix membership_get()
Angus Salkeld [Mon, 13 Feb 2012 11:21:47 +0000 (22:21 +1100)]
CPG: fix membership_get()

1) remove BUSY loop from membership get
   Note only cpg_join and cpg_leave ever set the
   BUSY error code.
2) set the size correctly
3) copy the name in correctly

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago TEST: remove unused code.
Angus Salkeld [Mon, 13 Feb 2012 10:45:29 +0000 (21:45 +1100)]
TEST: remove unused code.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago Move hdb_error_to_cs to corotypes.h
Angus Salkeld [Mon, 13 Feb 2012 10:43:07 +0000 (21:43 +1100)]
Move hdb_error_to_cs to corotypes.h

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago TEST: add logging to testcpg and testevs
Angus Salkeld [Thu, 9 Feb 2012 02:15:07 +0000 (13:15 +1100)]
TEST: add logging to testcpg and testevs

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years ago TEST: Use pacemaker repeat macro
Angus Salkeld [Thu, 9 Feb 2012 00:11:47 +0000 (11:11 +1100)]
TEST: Use pacemaker repeat macro

This is to simulate the way pacemaker uses the cpg api.
With this you can run testcpg directly after corosync
starts and it should initialise ok.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
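
The retry behaviour mentioned in the commit above (keep calling while the daemon is still starting) can be sketched as a bounded retry loop. This is not the pacemaker macro itself: it is written as a function for clarity, and `sketch_repeat`, `sketch_rc_t`, and `sketch_busy_then_ok` are hypothetical names introduced only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef enum {
    SKETCH_RC_OK,
    SKETCH_RC_TRY_AGAIN,
    SKETCH_RC_FAILED
} sketch_rc_t;

typedef sketch_rc_t (*sketch_call_fn)(void *ctx);

/* Re-run `call` while it reports TRY_AGAIN, at most max_tries times. */
static sketch_rc_t sketch_repeat(sketch_call_fn call, void *ctx,
                                 unsigned int max_tries,
                                 unsigned int delay_ms)
{
    sketch_rc_t rc = SKETCH_RC_TRY_AGAIN;
    unsigned int tries;

    assert(call != NULL && max_tries > 0);

    for (tries = 0; tries < max_tries; tries++) {
        rc = call(ctx);
        if (rc != SKETCH_RC_TRY_AGAIN) {
            break;
        }
        /* Pause between attempts, but not after the last one. */
        if (delay_ms > 0 && tries + 1 < max_tries) {
            struct timespec ts;
            ts.tv_sec = delay_ms / 1000;
            ts.tv_nsec = (long)(delay_ms % 1000) * 1000000L;
            nanosleep(&ts, NULL);
        }
    }
    return rc;
}

/* Example callee: busy until the counter in ctx reaches zero. */
static sketch_rc_t sketch_busy_then_ok(void *ctx)
{
    unsigned int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return SKETCH_RC_TRY_AGAIN;
    }
    return SKETCH_RC_OK;
}
```

A test client wrapped this way can be started right after the daemon and simply waits out the initial TRY_AGAIN responses instead of failing on the first one.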
12 years ago Remove a reference to openais that is present in corosync.conf.5
Steven Dake [Mon, 13 Feb 2012 21:07:46 +0000 (14:07 -0700)]
Remove a reference to openais that is present in corosync.conf.5

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
12 years ago Update corosync_overview.8 man page
Steven Dake [Mon, 13 Feb 2012 21:06:26 +0000 (14:06 -0700)]
Update corosync_overview.8 man page

Move forward 5 years on our main man page ;)

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
12 years agoRemove empty testquorum.c file
Steven Dake [Mon, 13 Feb 2012 20:37:03 +0000 (13:37 -0700)]
Remove empty testquorum.c file

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
12 years agoUpdate copyright header dates in exec directory
Steven Dake [Mon, 13 Feb 2012 20:35:54 +0000 (13:35 -0700)]
Update copyright header dates in exec directory

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
12 years agoUpdate copyright dates on include/totem files
Steven Dake [Mon, 13 Feb 2012 18:19:29 +0000 (11:19 -0700)]
Update copyright dates on include/totem files

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
12 years agoRemove jhash.h since it is not used
Steven Dake [Mon, 13 Feb 2012 18:17:54 +0000 (11:17 -0700)]
Remove jhash.h since it is not used

We would use libqb for hashing now if we needed hashing.
cpg no longer uses jhash.h.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
12 years agoUpdated copyright dates in include directory
Steven Dake [Mon, 13 Feb 2012 18:14:33 +0000 (11:14 -0700)]
Updated copyright dates in include directory

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
12 years agoUpdate copyright dates in tools directory
Steven Dake [Mon, 13 Feb 2012 18:04:26 +0000 (11:04 -0700)]
Update copyright dates in tools directory

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
12 years agoUpdate copyright dates in util directory
Steven Dake [Mon, 13 Feb 2012 18:00:51 +0000 (11:00 -0700)]
Update copyright dates in util directory

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
12 years agoRemove unused or unimplemented CFG apis
Steven Dake [Mon, 13 Feb 2012 02:44:11 +0000 (19:44 -0700)]
Remove unused or unimplemented CFG apis

Remove:
cfg_statetrack
cfg_statetrackstop
cfg_administrativestateset
cfg_administrativestateget
cfg_serviceload
cfg_serviceunload

Rev SO to 5.0.0

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
12 years agovotequorum: cleanup all man pages
Fabio M. Di Nitto [Mon, 13 Feb 2012 09:56:08 +0000 (10:56 +0100)]
votequorum: cleanup all man pages

sort and reference man pages in typical usage order

update some structures/defines

clean formatting to be consistent

don't ship qdevice API man pages for now

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
12 years agoquorum: cleanup all man pages
Fabio M. Di Nitto [Fri, 10 Feb 2012 14:35:41 +0000 (15:35 +0100)]
quorum: cleanup all man pages

sort and reference man pages in typical usage order

update some structures/defines

clean formatting to be consistent

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
12 years agocpg: drop dead code
Fabio M. Di Nitto [Thu, 9 Feb 2012 15:52:21 +0000 (16:52 +0100)]
cpg: drop dead code

not used/referenced anywhere

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agocoverity: increase aggressiveness of the test and fix build
Fabio M. Di Nitto [Tue, 7 Feb 2012 10:17:32 +0000 (11:17 +0100)]
coverity: increase aggressiveness of the test and fix build

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agovotequorum: fix variable init
Fabio M. Di Nitto [Tue, 7 Feb 2012 10:11:27 +0000 (11:11 +0100)]
votequorum: fix variable init

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoquorumtool: fix some var init and checks
Fabio M. Di Nitto [Tue, 7 Feb 2012 10:02:51 +0000 (11:02 +0100)]
quorumtool: fix some var init and checks

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agovotequorum: fix possible memory corruption
Fabio M. Di Nitto [Tue, 7 Feb 2012 09:25:26 +0000 (10:25 +0100)]
votequorum: fix possible memory corruption

nodeid = 0 is a valid nodeid, and the node associated with it should
not be freed

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoquorum: don't leak memory on error
Fabio M. Di Nitto [Tue, 7 Feb 2012 09:20:27 +0000 (10:20 +0100)]
quorum: don't leak memory on error

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agotestvotequorum: fix test loop to break if votequorum goes away
Fabio M. Di Nitto [Tue, 7 Feb 2012 09:11:38 +0000 (10:11 +0100)]
testvotequorum: fix test loop to break if votequorum goes away

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoquorumtool: fix return code
Fabio M. Di Nitto [Tue, 7 Feb 2012 09:08:47 +0000 (10:08 +0100)]
quorumtool: fix return code

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agotestquorum: check for quorum_dispatch return code
Fabio M. Di Nitto [Tue, 7 Feb 2012 09:05:07 +0000 (10:05 +0100)]
testquorum: check for quorum_dispatch return code

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoquorumtools: check for quorum_dispatch return code
Fabio M. Di Nitto [Tue, 7 Feb 2012 08:53:30 +0000 (09:53 +0100)]
quorumtools: check for quorum_dispatch return code

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agospecfile: ship new man pages
Fabio M. Di Nitto [Thu, 9 Feb 2012 12:24:34 +0000 (13:24 +0100)]
specfile: ship new man pages

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoman: add *quorum_track* devel man pages
Fabio M. Di Nitto [Thu, 9 Feb 2012 12:21:13 +0000 (13:21 +0100)]
man: add *quorum_track* devel man pages

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoquorum: drop dead code
Fabio M. Di Nitto [Thu, 9 Feb 2012 12:02:55 +0000 (13:02 +0100)]
quorum: drop dead code

spotted while writing man pages. There are no users for this struct

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoman: add quorum_overview.8 man page
Fabio M. Di Nitto [Thu, 9 Feb 2012 12:01:32 +0000 (13:01 +0100)]
man: add quorum_overview.8 man page

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoman: hook quorum and votequorum devel man pages with genman script
Fabio M. Di Nitto [Thu, 9 Feb 2012 10:01:28 +0000 (11:01 +0100)]
man: hook quorum and votequorum devel man pages with genman script

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoman: rename all devel man pages to .3.in
Fabio M. Di Nitto [Thu, 9 Feb 2012 09:35:38 +0000 (10:35 +0100)]
man: rename all devel man pages to .3.in

tidy up man/Makefile.am a bit in the process

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoman: add build infrastructure to generate devel man pages
Fabio M. Di Nitto [Thu, 9 Feb 2012 09:14:59 +0000 (10:14 +0100)]
man: add build infrastructure to generate devel man pages

this is useful to include ipc_common errors into all man pages

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agomove cs_strerror() to common_lib
Angus Salkeld [Tue, 7 Feb 2012 23:36:11 +0000 (09:36 +1000)]
move cs_strerror() to common_lib

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoTreat ENOBUFS as TRY_AGAIN
Angus Salkeld [Tue, 7 Feb 2012 23:31:22 +0000 (10:31 +1100)]
Treat ENOBUFS as TRY_AGAIN

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
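The idea behind the "Treat ENOBUFS as TRY_AGAIN" commit above can be illustrated with a small errno-to-result mapping. This is a sketch under assumptions, not the corosync code: sketch_err_t and sketch_errno_to_result are hypothetical names, and the result codes are stand-ins for the real ones. The only point shown is that a transient "no buffer space" condition is reported as retryable rather than as a hard failure.

```c
#include <errno.h>

/* Stand-in result codes (assumption); not the real corosync values. */
typedef enum {
	SKETCH_ERR_OK,
	SKETCH_ERR_TRY_AGAIN,
	SKETCH_ERR_LIBRARY
} sketch_err_t;

/*
 * Map a positive errno value to a result code.
 * ENOBUFS is grouped with EAGAIN so that callers retry
 * instead of treating it as a generic library error.
 */
static sketch_err_t sketch_errno_to_result(int err)
{
	switch (err) {
	case 0:
		return SKETCH_ERR_OK;
	case EAGAIN:
	case ENOBUFS:
		return SKETCH_ERR_TRY_AGAIN;
	default:
		return SKETCH_ERR_LIBRARY;
	}
}
```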
12 years agomove hdb_error_to_cs to common_lib
Angus Salkeld [Tue, 7 Feb 2012 23:00:47 +0000 (10:00 +1100)]
move hdb_error_to_cs to common_lib

Note the previous inconsistent implementation.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoAdd a common library that can be shared between libs and corosync
Angus Salkeld [Wed, 8 Feb 2012 23:43:49 +0000 (10:43 +1100)]
Add a common library that can be shared between libs and corosync

We have always had this problem and worked around it by copying code
or using inline functions. Neither is good IMO.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
12 years agoRemove cs_config.h from global header install
Steven Dake [Wed, 8 Feb 2012 14:41:02 +0000 (07:41 -0700)]
Remove cs_config.h from global header install

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>