git.proxmox.com Git - mirror_corosync.git/log
mirror_corosync.git
13 years agoCTS: get GenStopAllBeekhof working a bit better
Angus Salkeld [Thu, 9 Dec 2010 06:03:13 +0000 (17:03 +1100)]
CTS: get GenStopAllBeekhof working a bit better

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: log bind() errors better
Angus Salkeld [Thu, 9 Dec 2010 06:14:07 +0000 (17:14 +1100)]
CTS: log bind() errors better

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: log cfg results
Angus Salkeld [Wed, 8 Dec 2010 02:33:35 +0000 (13:33 +1100)]
CTS: log cfg results

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: rename flatiron to needle
Angus Salkeld [Tue, 23 Nov 2010 00:46:25 +0000 (11:46 +1100)]
CTS: rename flatiron to needle

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: add exit handler to test_agents
Angus Salkeld [Wed, 8 Dec 2010 01:18:02 +0000 (12:18 +1100)]
CTS: add exit handler to test_agents

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: add "Too many open files" to the BadNews pattern
Angus Salkeld [Thu, 9 Dec 2010 06:12:07 +0000 (17:12 +1100)]
CTS: add "Too many open files" to the BadNews pattern

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: improve debug during msgSend test
Angus Salkeld [Tue, 23 Nov 2010 00:47:33 +0000 (11:47 +1100)]
CTS: improve debug during msgSend test

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: add logging to test agent
Angus Salkeld [Wed, 8 Dec 2010 00:28:06 +0000 (11:28 +1100)]
CTS: add logging to test agent

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: increase wait for node to reboot
Angus Salkeld [Thu, 11 Nov 2010 04:45:40 +0000 (15:45 +1100)]
CTS: increase wait for node to reboot

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoCTS: support new pacemaker-cts
Angus Salkeld [Thu, 11 Nov 2010 04:47:43 +0000 (15:47 +1100)]
CTS: support new pacemaker-cts

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoAUGEAS: fix "tags" log field
Angus Salkeld [Fri, 3 Dec 2010 03:29:12 +0000 (14:29 +1100)]
AUGEAS: fix "tags" log field

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoTEST: fix the print out when cpg_finalize() fails
Angus Salkeld [Tue, 14 Dec 2010 06:15:08 +0000 (17:15 +1100)]
TEST: fix the print out when cpg_finalize() fails

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: use the new cs_strerror() to print out the error message.
Angus Salkeld [Tue, 14 Dec 2010 02:05:49 +0000 (13:05 +1100)]
libqb: use the new cs_strerror() to print out the error message.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: fix iov_len in pcmk_test
Angus Salkeld [Mon, 15 Nov 2010 02:39:04 +0000 (13:39 +1100)]
libqb: fix iov_len in pcmk_test

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: fix valgrind warnings in mon/wd
Angus Salkeld [Sun, 14 Nov 2010 12:54:27 +0000 (23:54 +1100)]
libqb: fix valgrind warnings in mon/wd

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: change pause_timestamp to uint64_t
Angus Salkeld [Tue, 16 Nov 2010 22:16:34 +0000 (09:16 +1100)]
libqb: change pause_timestamp to uint64_t

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: rip out objdb & serialize locks
Angus Salkeld [Thu, 11 Nov 2010 21:32:37 +0000 (08:32 +1100)]
libqb: rip out objdb & serialize locks

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: only init IPC on service engines that need it.
Angus Salkeld [Mon, 15 Nov 2010 10:20:23 +0000 (21:20 +1100)]
libqb: only init IPC on service engines that need it.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: remove the lib init/exit from the test service agent
Angus Salkeld [Thu, 11 Nov 2010 03:38:38 +0000 (14:38 +1100)]
libqb: remove the lib init/exit from the test service agent

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: use the main loop to shutdown
Angus Salkeld [Mon, 15 Nov 2010 10:19:18 +0000 (21:19 +1100)]
libqb: use the main loop to shutdown

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: remove tsafe.c
Angus Salkeld [Fri, 5 Aug 2011 02:30:14 +0000 (12:30 +1000)]
libqb: remove tsafe.c

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: remove worker thread - keep to one thread.
Angus Salkeld [Wed, 10 Nov 2010 08:38:34 +0000 (18:38 +1000)]
libqb: remove worker thread - keep to one thread.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: make timer api a wrapper around qb_loop timers.
Angus Salkeld [Wed, 10 Nov 2010 08:38:33 +0000 (19:38 +1100)]
libqb: make timer api a wrapper around qb_loop timers.

- change timeout value to nano seconds
- fix timer handles (don't alloc on stack)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: change ipc -> qb_ipc
Angus Salkeld [Fri, 5 Aug 2011 02:18:43 +0000 (12:18 +1000)]
libqb: change ipc -> qb_ipc

IPC: return 0/-ENOBUFS from message handler
IPC: use the new rate_limit API to improve perf.
CPG: add send_async API & hook up flow control
IPC: Fix flow control getting stuck.
IPC: Port the remaining libs to use libqb IPC
IPC: remove libqb flowcontrol API
TEST: put cpg_dispatch() in its own thread
IPC: cleanup ipc_glue.c name everything cs_ipcs_*()
IPC: add back statistics
IPC: remove coroipcc_ symbols from lib*.versions
IPC: init each se's IPC as it is loaded.
IPC: use the new connection_closed() event to free the context.
IPC: re-add zero copy functionality back
IPC: remove cpg_mcast_joined_async() and make it the default
 -> now cpg_mcast_joined() == cpg_mcast_joined_async()
libqb: expose a libqb error converter
libqb: add missing error conversions
libqb: remove repeat try loop in lib/cpg.c
CPG: fix zero copy mcast
CPG: use newer return codes
Add ENOTCONN to qb_to_cs_error()
libqb: fix error conversion from errno to cs_error_t in confdb
libqb: change errno_to_cs to qb_to_cs_error
libqb: add a cs_strerror() to get a more meaningful message
libqb: fix some confusing error conversions.
libqb: set the timeout on recv's to -1 (wait forever)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibqb: convert coropoll calls to qb_loop calls.
Angus Salkeld [Fri, 5 Aug 2011 01:52:28 +0000 (11:52 +1000)]
libqb: convert coropoll calls to qb_loop calls.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoAdd systemd unit files for corosync and corosync-notifyd
Angus Salkeld [Mon, 8 Aug 2011 11:01:52 +0000 (21:01 +1000)]
Add systemd unit files for corosync and corosync-notifyd

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agocorosync.conf.example: add note about host addresses in bindnetaddr
Florian Haas [Mon, 1 Aug 2011 06:47:58 +0000 (08:47 +0200)]
corosync.conf.example: add note about host addresses in bindnetaddr

https://lists.linux-foundation.org/pipermail/openais/2011-July/016563.html

Jan Friesse pointed out that bindnetaddr should be set to a host
address (as opposed to a network address) on hosts where multiple
NICs live on the same subnet. Add a comment to that effect to
the example configuration file.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agocorosync.conf.example: include comments
Florian Haas [Tue, 26 Jul 2011 16:54:10 +0000 (18:54 +0200)]
corosync.conf.example: include comments

It's nice to say people should read the man page. It's also naive to
assume that they always do. Include comments in the example config
file itself.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Dan Frincu <dan.frincu@1and1.ro>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agocorosync.conf.example: change mcastaddr
Florian Haas [Tue, 26 Jul 2011 16:16:31 +0000 (18:16 +0200)]
corosync.conf.example: change mcastaddr

Change suggested mcastaddr to one in the 239.255.0.0/16
pseudo-subnet. Multicast addresses outside 239.x.x.x may be IANA
registered and can clash with other services present on the
network. Suggest an address defined as part of the multicast IPv4
Local Scope in RFC 2365.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Dan Frincu <dan.frincu@1and1.ro>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
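
The address-range reasoning in the commit above can be checked mechanically. The following is a minimal sketch, not corosync code: the helper name in_ipv4_local_scope() is hypothetical, and it only tests whether a dotted-quad address falls inside 239.255.0.0/16, the IPv4 Local Scope named in the commit message.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of corosync).
 * Returns 1 if addr lies inside 239.255.0.0/16, 0 if it lies outside,
 * and -1 if addr is not a valid dotted-quad IPv4 address.
 */
int in_ipv4_local_scope(const char *addr)
{
    struct in_addr in;
    uint32_t host;

    assert(addr != NULL);

    if (inet_pton(AF_INET, addr, &in) != 1) {
        return -1;
    }

    /* Compare the top 16 bits in host byte order: 239 = 0xEF, 255 = 0xFF. */
    host = ntohl(in.s_addr);
    return (host & 0xFFFF0000u) == 0xEFFF0000u;
}
```

A configuration checker could call this on the configured mcastaddr and print a warning when the result is 0; whether such a warning is desirable is a policy choice, not something the commit prescribes.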
13 years agocorosync.conf.example: change bindnetaddr
Florian Haas [Tue, 26 Jul 2011 16:14:53 +0000 (18:14 +0200)]
corosync.conf.example: change bindnetaddr

Change the example configuration file so "bindnetaddr" has a value
that more obviously looks like a network address, so that people do
not think they need to set an existing IP address here (and hence
have non-identical corosync.conf files between nodes).

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Dan Frincu <dan.frincu@1and1.ro>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agomain: let poll really stop before totempg_finalize
Jan Friesse [Mon, 25 Jul 2011 13:18:10 +0000 (15:18 +0200)]
main: let poll really stop before totempg_finalize

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoRevert "totemsrp: Remove recv_flush code"
Jan Friesse [Tue, 26 Jul 2011 08:05:34 +0000 (10:05 +0200)]
Revert "totemsrp: Remove recv_flush code"

This reverts commit 1a7b7a39f445be63c697170c1680eeca9834de39.

Reversion is needed to remove overflow of receive buffers and dropping
messages.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
13 years agototemsrp: fix buffer overflows for large clusters (> 100 nodes)
MORITA Kazutaka [Sun, 24 Jul 2011 09:58:40 +0000 (18:58 +0900)]
totemsrp: fix buffer overflows for large clusters (> 100 nodes)

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agospecfile: Install corosync-signals.conf for dbus
Jan Friesse [Tue, 19 Jul 2011 14:41:44 +0000 (16:41 +0200)]
specfile: Install corosync-signals.conf for dbus

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agospecfile: use _datadir as var expansion not exec
Jan Friesse [Tue, 19 Jul 2011 14:35:28 +0000 (16:35 +0200)]
specfile: use _datadir as var expansion not exec

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agospecfile: Correct URL and source0
Jan Friesse [Tue, 19 Jul 2011 13:21:45 +0000 (15:21 +0200)]
specfile: Correct URL and source0

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoAdd some more stats for debugging
Tim Beale [Tue, 19 Jul 2011 15:58:21 +0000 (08:58 -0700)]
Add some more stats for debugging

+ overload - number of times client is told to try again
+ invalid_request - message contained invalid parameter, e.g. invalid size
+ msg_queue_avail - messages currently available at the Totem layer
+ msg-queue_reserved - messages currently reserved at the Totem layer

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agorrp: Handle rollover in passive rrp properly
Jan Friesse [Fri, 15 Jul 2011 12:29:06 +0000 (08:29 -0400)]
rrp: Handle rollover in passive rrp properly

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agorrp: handle rollover in active rrp properly
Jan Friesse [Tue, 12 Jul 2011 10:55:16 +0000 (06:55 -0400)]
rrp: handle rollover in active rrp properly

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agototemconfig: Change default FAIL_TO_RECV_CONST
Jan Friesse [Fri, 15 Jul 2011 15:10:41 +0000 (17:10 +0200)]
totemconfig: Change default FAIL_TO_RECV_CONST

Previous default (50) was too low for most modern switch hardware. This
may trigger abort because the aru doesn't increase for 50 token
rotations combined with a defect in how failed to recv conditions are
handled.  By increasing this tunable, the condition should no longer
trigger the errant code.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoCorrect missing poll functions from service handler struct needed for confdb APIs
Steven Dake [Mon, 4 Jul 2011 15:17:53 +0000 (08:17 -0700)]
Correct missing poll functions from service handler struct needed for confdb APIs

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
13 years agoFix problem where corosync will segfault if there are gaps in recovery queue
Steven Dake [Thu, 7 Jul 2011 22:29:10 +0000 (15:29 -0700)]
Fix problem where corosync will segfault if there are gaps in recovery queue

Fixes a problem where there are gaps in the recovery queue.  Example my_aru = 5,
but there are messages at 7,8.  8 = my_high_seq_received which results
in data slots taken up in new message queue.  What should really happen
is these last messages should be delivered after a transitional
configuration to maintain SAFE agreement.  We don't have support for
SAFE atm, so it is probably safe just to throw these messages away.  Without
this change, the new message queue on a new configuration change is out of sync.

Signed-off-by: Steven Dake <sdake@redhat.com>
Tested-by: Tim Beale <tlbeale@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
13 years agototemiba: free send_buf on ibv_reg_mr failure
Jan Friesse [Thu, 7 Jul 2011 08:58:06 +0000 (10:58 +0200)]
totemiba: free send_buf on ibv_reg_mr failure

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agobuild: disable RDMA support in RPMs by default
Florian Haas [Tue, 5 Jul 2011 11:44:57 +0000 (13:44 +0200)]
build: disable RDMA support in RPMs by default

Rather than curiously disable RDMA support by default in configure and
enable it by default in RPM builds, streamline the default
configuration to always turn RDMA support off. It can be enabled in
RPM builds with "--with rdma".

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agobuild: set RDMA related _LIBS and _CFLAGS only if building with RDMA support
Florian Haas [Tue, 5 Jul 2011 11:22:50 +0000 (13:22 +0200)]
build: set RDMA related _LIBS and _CFLAGS only if building with RDMA support

Having to force {ibverbs,rdmacm}_{LIBS,CFLAGS} looks positively odd;
so this may warrant further review. However, they are definitely not
needed if building without RDMA support.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agobuild: make RDMA support an RPM build conditional
Florian Haas [Tue, 5 Jul 2011 09:54:52 +0000 (11:54 +0200)]
build: make RDMA support an RPM build conditional

Enable RDMA in RPM builds by default to maintain the previous behavior
(which always included --enable-rdma in the %configure invocation).

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agobuild: force LC_ALL=C correctly for dates
Florian Haas [Tue, 5 Jul 2011 11:10:05 +0000 (13:10 +0200)]
build: force LC_ALL=C correctly for dates

Failure to force "C" dates will have RPM et al. complain about invalid
dates and timestamps.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoFix compile/runtime issues for _POSIX_THREAD_PROCESS_SHARED < 1
Tim Beale [Wed, 6 Jul 2011 13:38:17 +0000 (06:38 -0700)]
Fix compile/runtime issues for _POSIX_THREAD_PROCESS_SHARED < 1

For the case where _POSIX_THREAD_PROCESS_SHARED < 1, the code doesn't compile
for corosync v1.3.1. And when it does compile, it crashes on our system - our
version of uClibc seems to always expect a 4th arg. The man page suggests
the 4th arg is optional, but does say: 'For greater portability it is best to
always call semctl() with four arguments', which is what this patch does.
Also removed semop as it's an unused variable.

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
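
As an illustration of the four-argument semctl() style that the commit above adopts, here is a minimal sketch. It is not the corosync patch: the union is named sem_arg (an assumption, chosen to avoid clashing with platforms whose headers already define union semun), and the three wrapper functions are hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Caller-defined argument union; field layout follows the semctl() man page. */
union sem_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Set semaphore 0 of the set to value. Returns 0 on success, -1 on error. */
int sem_set_value(int semid, int value)
{
    union sem_arg arg;

    assert(value >= 0);
    arg.val = value;
    return semctl(semid, 0, SETVAL, arg);
}

/* Read semaphore 0; the 4th argument is ignored by GETVAL but still passed. */
int sem_get_value(int semid)
{
    union sem_arg arg;

    arg.val = 0;
    return semctl(semid, 0, GETVAL, arg);
}

/* Remove the semaphore set; again the 4th argument is a dummy. */
int sem_remove(int semid)
{
    union sem_arg arg;

    arg.val = 0;
    return semctl(semid, 0, IPC_RMID, arg);
}
```

Passing a dummy argument costs nothing on C libraries that ignore it and avoids reading an indeterminate value on libraries that always fetch a fourth argument.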
13 years agogetpwnam_r()/getgrnam_r() returns ERANGE for some systems
Tim Beale [Wed, 6 Jul 2011 13:31:45 +0000 (06:31 -0700)]
getpwnam_r()/getgrnam_r() returns ERANGE for some systems

On our system the expected buffer length is 256. This means calls to
getpwnam_r()/getgrnam_r() return ERANGE error and corosync fails to startup.
These 2 functions return ERANGE when insufficient buffer space is supplied.
Judging by the man page for getpwnam_r, the correct way to determine the
buffersize on any given system is to use sysconf().

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
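
The sysconf() approach mentioned in the commit above can be sketched as follows. This is an illustrative helper, not the corosync change itself: lookup_uid() is a hypothetical name, and the 1024-byte fallback used when sysconf() reports no limit is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: look up the uid for a user name, sizing the scratch
 * buffer from sysconf() instead of a fixed constant.
 * Returns 0 on success, -1 if the user is unknown or on error.
 */
int lookup_uid(const char *name, uid_t *uid_out)
{
    struct passwd pwd;
    struct passwd *result = NULL;
    long hint;
    size_t buflen;
    char *buf;
    int rc;

    assert(name != NULL && uid_out != NULL);

    /* sysconf() may return -1 when the limit is indeterminate. */
    hint = sysconf(_SC_GETPW_R_SIZE_MAX);
    buflen = (hint > 0) ? (size_t)hint : 1024; /* fallback size is an assumption */

    buf = malloc(buflen);
    if (buf == NULL) {
        return -1;
    }

    /* Grow the buffer for as long as the library reports it is too small. */
    while ((rc = getpwnam_r(name, &pwd, buf, buflen, &result)) == ERANGE) {
        char *bigger;

        buflen *= 2;
        bigger = realloc(buf, buflen);
        if (bigger == NULL) {
            free(buf);
            return -1;
        }
        buf = bigger;
    }

    if (rc != 0 || result == NULL) {
        free(buf);
        return -1;
    }

    *uid_out = pwd.pw_uid;
    free(buf);
    return 0;
}
```

The retry loop is an extra safeguard beyond what the commit describes: the sysconf() value is only an initial hint, so doubling on ERANGE covers entries that exceed it.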
13 years agoRRP: redundant ring automatic recovery
Jiaju Zhang [Tue, 5 Jul 2011 15:54:38 +0000 (23:54 +0800)]
RRP: redundant ring automatic recovery

This patch automatically recovers redundant ring failures.

Please note that this patch introduces rrp_autorecovery_check_timeout
in the totem config and hence breaks the internal ABI. Internal ABI users
of totem.h need to rebuild their binaries.

Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Signed-off-by: Steven Dake <sdake@redhat.com>
Tested-by: Jan Friesse <jfriesse@redhat.com>
Tested-by: Florian Haas <florian.haas@linbit.com>
Tested-by: Jiaju Zhang <jjzhang@suse.de>
13 years agoCorrect mailing list address in corosync_overview manpage
Tim Serong [Mon, 23 May 2011 04:19:23 +0000 (14:19 +1000)]
Correct mailing list address in corosync_overview manpage

Signed-off-by: Tim Serong <tserong@novell.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agofix typos in cpg_mcast_joined.3 and cpg_zcb_mcast_joined.3
Masatake YAMATO [Tue, 28 Jun 2011 09:06:23 +0000 (18:06 +0900)]
fix typos in cpg_mcast_joined.3 and cpg_zcb_mcast_joined.3

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
13 years agoAdd coverity target to corosync makefile.am
Steven Dake [Fri, 20 May 2011 02:53:00 +0000 (19:53 -0700)]
Add coverity target to corosync makefile.am

Allow a make coverity target for those developers with coverity tools
available to them.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
13 years agocoroipcc: Test _SC_PAGESIZE result
Jan Friesse [Mon, 30 May 2011 11:15:02 +0000 (13:15 +0200)]
coroipcc: Test _SC_PAGESIZE result

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoRemove spinlocks
Jan Friesse [Tue, 21 Jun 2011 09:57:08 +0000 (11:57 +0200)]
Remove spinlocks

Spinlocks are now removed. Although a spinlock can improve speed in
some special cases, in most cases it makes corosync's CPU usage much
more intensive and corosync less responsive than if only mutexes are used.

What we were doing is:
pthread_mutex_lock
pthread_spin_lock
pthread_spin_unlock
pthread_mutex_unlock

which is not safe.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agovotequorum: free newly allocated node if nodeid==0
Jan Friesse [Mon, 30 May 2011 14:00:45 +0000 (16:00 +0200)]
votequorum: free newly allocated node if nodeid==0

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoFix usage of strerror_r()/perror()
Jerome Flesch [Tue, 28 Jun 2011 07:56:58 +0000 (09:56 +0200)]
Fix usage of strerror_r()/perror()

Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
13 years agosched_params log message incorrect
Steven Dake [Thu, 23 Jun 2011 05:46:56 +0000 (22:46 -0700)]
sched_params log message incorrect

The sched_params parameter was set before being printed.

Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Reviewed-by: <sdake@redhat.com>
13 years agoconfigure.ac: Align --enable-* options description
Jan Friesse [Tue, 21 Jun 2011 10:02:56 +0000 (12:02 +0200)]
configure.ac: Align --enable-* options description

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoconfigure.ac: change edefault to default
Jan Friesse [Tue, 21 Jun 2011 10:51:55 +0000 (12:51 +0200)]
configure.ac: change edefault to default

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoCTS: Test for confdb dispatch deadlock
Jan Friesse [Wed, 15 Jun 2011 14:49:53 +0000 (16:49 +0200)]
CTS: Test for confdb dispatch deadlock

The test is disabled by default because it depends on SMP and about 2 GB
of RAM. It also tests a race, so the test is unreliable.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoconfdb: Resolve dispatch deadlock
Jan Friesse [Wed, 15 Jun 2011 13:54:23 +0000 (15:54 +0200)]
confdb: Resolve dispatch deadlock

The following situation could happen:
- one thread is waiting for a write operation to finish (line 853) while
  objdb is locked
- the flush (done in objdb_notify_dispatch) is called in the main thread,
  but this call will never happen because the main thread is waiting for
  the objdb lock.

In this situation a deadlock occurs.

This commit solves this by:
- setting the pipe to non-blocking mode
- using the pipe only as a trigger for coropoll
- storing dispatch messages in a list
- processing messages from the list in the main thread

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
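
The pattern described in the commit above (a non-blocking pipe used purely as a wakeup trigger, with the actual messages kept in a list) can be sketched as below. This is a single-threaded illustration under stated assumptions, not the confdb code: every name is hypothetical, the payload is a plain int, and a real multi-threaded version would need a mutex around the list.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

struct dispatch_msg {
    struct dispatch_msg *next;
    int payload;
};

struct dispatch_queue {
    int pipe_fd[2];               /* [0] is polled by the main loop, [1] is poked */
    struct dispatch_msg *head;
    struct dispatch_msg *tail;
};

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

int dispatch_queue_init(struct dispatch_queue *q)
{
    assert(q != NULL);
    q->head = NULL;
    q->tail = NULL;
    if (pipe(q->pipe_fd) == -1) {
        return -1;
    }
    if (set_nonblocking(q->pipe_fd[0]) == -1 ||
        set_nonblocking(q->pipe_fd[1]) == -1) {
        close(q->pipe_fd[0]);
        close(q->pipe_fd[1]);
        return -1;
    }
    return 0;
}

/* Store the message in the list, then write one trigger byte to the pipe. */
int dispatch_queue_push(struct dispatch_queue *q, int payload)
{
    struct dispatch_msg *m = malloc(sizeof(*m));
    char byte = 0;

    if (m == NULL) {
        return -1;
    }
    m->next = NULL;
    m->payload = payload;
    if (q->tail != NULL) {
        q->tail->next = m;
    } else {
        q->head = m;
    }
    q->tail = m;

    /* A full pipe is fine: a wakeup is already pending, so never block here. */
    if (write(q->pipe_fd[1], &byte, 1) == -1 &&
        errno != EAGAIN && errno != EWOULDBLOCK) {
        return -1; /* the message stays queued; only the trigger failed */
    }
    return 0;
}

/* Main-loop side: discard trigger bytes, then process every queued message. */
int dispatch_queue_drain(struct dispatch_queue *q, long *sum_out)
{
    char scratch[64];
    int handled = 0;

    while (read(q->pipe_fd[0], scratch, sizeof(scratch)) > 0) {
        /* trigger bytes carry no data */
    }
    while (q->head != NULL) {
        struct dispatch_msg *m = q->head;

        q->head = m->next;
        *sum_out += m->payload;
        free(m);
        handled++;
    }
    q->tail = NULL;
    return handled;
}
```

Because the producer never blocks on the pipe, it cannot end up waiting for a reader that is itself waiting on a lock the producer holds, which is the cycle the commit message describes.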
13 years agoobjdb: save copy of handles in object_find_create
Jan Friesse [Thu, 9 Jun 2011 13:46:31 +0000 (15:46 +0200)]
objdb: save copy of handles in object_find_create

The following situation could happen:
- process 1 creates a find handle through confdb
- it calls find iteration once
- a different process 2 deletes the object pointed to by the process 1 iterator
- process 1 calls iteration again ->
  object_find_instance->find_child_list is an invalid pointer

-> segfault

Now object_find_create creates an array of matching object handles and
object_find_next uses that array together with a check of the name. This
prevents the situation where, between steps 2 and 3, a new object is created
with a different name but sadly with the same handle.

Also good to note that this patch is more or less a quick hack rather
than a proper solution. The real proper solution is to not use pointers
and rather use handles everywhere. This is a big TODO.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoRRP: Fix ring initialization issue for UDPU mode
Jiaju Zhang [Wed, 8 Jun 2011 23:59:26 +0000 (07:59 +0800)]
RRP: Fix ring initialization issue for UDPU mode

Redundant ring has some problem in the UDP unicast mode. The problem
is the second ring has not been successfully initialized, that is, the
second time iface_changes happens, the member list for that interface
has not been added, which results in that ring cannot transmit normal
message. So the second ring cannot take over the work if the first
ring is down. This patch fixes this issue.

comments from review:
More work is probably needed in totemnet, where totemnet maintains
the node list and an iterator for it, and totemudpu_member_add adds
state information to a context for the iteration.

In any regard, that is somewhat difficult to test, so I'll merge this
patch for now - keep in mind interface changes on the bindnetaddr will
cause problems with udpu after this patch has been committed.

Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agocoroipcc: check recvmsg result in socket_recv
Jan Friesse [Thu, 9 Jun 2011 13:42:54 +0000 (15:42 +0200)]
coroipcc: check recvmsg result in socket_recv

According to the specification, recvmsg can return 0, which means that
the connection is closed. We had this check, but only on systems
other than Linux. recvmsg can return 0 even on Linux, so the check is now
applied on all systems.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
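
The recvmsg() check described in the commit above can be sketched as follows. This is an illustration, not the socket_recv code from coroipcc: the enum and the function recv_checked() are hypothetical names, and a stream socket with a non-zero buffer length is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

enum recv_status { RECV_DATA, RECV_PEER_CLOSED, RECV_ERROR };

/*
 * Hypothetical helper: receive up to len bytes from a stream socket and
 * report a closed connection explicitly instead of treating 0 as "no data".
 */
enum recv_status recv_checked(int fd, void *buf, size_t len, size_t *got)
{
    struct msghdr msg;
    struct iovec iov;
    ssize_t res;

    assert(buf != NULL && got != NULL && len > 0);

    iov.iov_base = buf;
    iov.iov_len = len;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /* Retry if a signal interrupted the call before any data arrived. */
    do {
        res = recvmsg(fd, &msg, 0);
    } while (res == -1 && errno == EINTR);

    if (res == -1) {
        return RECV_ERROR;
    }
    if (res == 0) {
        /* Orderly shutdown by the peer: no more data will ever arrive. */
        return RECV_PEER_CLOSED;
    }

    *got = (size_t)res;
    return RECV_DATA;
}
```

A caller that loops until a full message has arrived should leave the loop on RECV_PEER_CLOSED; otherwise it would spin forever on a socket that keeps returning 0.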
13 years agoconfdb: Properly check result of object_find_create
Jan Friesse [Thu, 9 Jun 2011 13:42:33 +0000 (15:42 +0200)]
confdb: Properly check result of object_find_create

In confdb_object_iter, the result of object_find_create is now properly
checked. object_find_create can return -1 if the object doesn't exist.
Without this check, an incorrect handle (memory garbage) was passed
directly to object_find_next.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
13 years agocrypto: rng_make_prng prevent buf overflow
Jan Friesse [Mon, 30 May 2011 14:55:45 +0000 (16:55 +0200)]
crypto: rng_make_prng prevent buf overflow

With bits set to 1023, a buf of 256 bytes was filled by rng_get_bytes
with up to 257 bytes. Buf is now 258 bytes, so this is no longer a problem.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agomainconfig: Check retval of logsys_format_set
Jan Friesse [Mon, 30 May 2011 11:02:36 +0000 (13:02 +0200)]
mainconfig: Check retval of logsys_format_set

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agotestcpgzc: fgets buffer to really allocated size
Jan Friesse [Mon, 30 May 2011 13:51:45 +0000 (15:51 +0200)]
testcpgzc: fgets buffer to really allocated size

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agocpg: do_proc_join change list_slice to list_add
Jan Friesse [Mon, 30 May 2011 14:41:37 +0000 (16:41 +0200)]
cpg: do_proc_join change list_slice to list_add

In this concrete case the result is equivalent, but the change makes coverity happy.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agototemudp: memset of proper size
Jan Friesse [Mon, 30 May 2011 13:53:39 +0000 (15:53 +0200)]
totemudp: memset of proper size

In totemudp_mcast_thread_state_constructor, memset
sizeof(struct totemudp_mcast_thread_state) bytes instead of the size
of a pointer.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
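
The class of bug fixed in the commit above is easy to reproduce in isolation. The sketch below uses a hypothetical struct thread_state as a stand-in for the real structure; it shows the corrected form, with the faulty form described in the comment.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a per-thread state structure. */
struct thread_state {
    unsigned char iobuf[1500];
    int in_use;
};

/*
 * Zero a caller-provided state object.
 *
 * Writing memset(state, 0, sizeof(state)) here would clear only as many
 * bytes as a pointer occupies (typically 4 or 8) and leave the rest of the
 * structure untouched. sizeof(*state) is the size of the pointed-to object.
 */
void thread_state_init(struct thread_state *state)
{
    assert(state != NULL);
    memset(state, 0, sizeof(*state));
}
```

Using sizeof(*state) rather than sizeof(struct thread_state) also keeps the call correct if the pointer's type is changed later.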
13 years agocoroipcs: init buf in coroipcs_handler_dispatch
Jan Friesse [Mon, 30 May 2011 13:50:04 +0000 (15:50 +0200)]
coroipcs: init buf in coroipcs_handler_dispatch

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agocoroparse: don't leak dirent
Jan Friesse [Mon, 30 May 2011 13:43:14 +0000 (15:43 +0200)]
coroparse: don't leak dirent

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agologsys: _logsys_wthread_create never returns != 0
Jan Friesse [Mon, 30 May 2011 11:08:23 +0000 (13:08 +0200)]
logsys: _logsys_wthread_create never returns != 0

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agonotifyd: Check retval of corosync_cfg_initialize
Jan Friesse [Mon, 30 May 2011 11:06:03 +0000 (13:06 +0200)]
notifyd: Check retval of corosync_cfg_initialize

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agototemconfig: discard check of objdb_get_string ret
Jan Friesse [Mon, 30 May 2011 10:37:20 +0000 (12:37 +0200)]
totemconfig: discard check of objdb_get_string ret

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agocoroipcc: proper path size in coroipcc_zcb_alloc
Jan Friesse [Mon, 30 May 2011 09:59:27 +0000 (11:59 +0200)]
coroipcc: proper path size in coroipcc_zcb_alloc

The memory_map function internally limits the maximum path size to
PATH_MAX, but coroipcc_zcb_alloc passed a smaller buffer.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agolibquorum: memset/memcpy proper size of callbacks
Jan Friesse [Mon, 30 May 2011 09:54:42 +0000 (11:54 +0200)]
libquorum: memset/memcpy proper size of callbacks

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoiazc: Reduce number of mem alloc and memcpy
Jan Friesse [Tue, 17 May 2011 09:20:37 +0000 (11:20 +0200)]
iazc: Reduce number of mem alloc and memcpy

X86 processors are able to handle unaligned memory access. Improve
performance by using that feature on i386 and x86_64 compatible
processors, and use old aligning code on different processors.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agologsys: When corosync is compiled with --enable-small-memory-footprint, also reduce...
Jerome Flesch [Fri, 27 May 2011 11:45:27 +0000 (13:45 +0200)]
logsys: When corosync is compiled with --enable-small-memory-footprint, also reduce the size of the logsys SHM

Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agocoroipcc_dispatch_get(): Fix --enable-small-memory-footprint support
Jerome Flesch [Fri, 27 May 2011 11:42:42 +0000 (13:42 +0200)]
coroipcc_dispatch_get(): Fix --enable-small-memory-footprint support

Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
13 years agocoroipcs_handler_dispatch(): Fix conn_info->service security value: -1 is not a good...
Jerome Flesch [Fri, 27 May 2011 11:40:36 +0000 (13:40 +0200)]
coroipcs_handler_dispatch(): Fix conn_info->service security value: -1 is not a good security value since it's equal to SOCKET_SERVICE_INIT

Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
13 years agocoroipcc: Fix unhandled BSD EOF in coroipcc_dispatch_get()
Jerome Flesch [Fri, 27 May 2011 11:35:02 +0000 (13:35 +0200)]
coroipcc: Fix unhandled BSD EOF in coroipcc_dispatch_get()

Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoCorosync: Fix build when done with --enable-fatal-warnings
Jerome Flesch [Fri, 27 May 2011 11:29:12 +0000 (13:29 +0200)]
Corosync: Fix build when done with --enable-fatal-warnings

Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
13 years agologsys.c: Use snprintf() instead of sprintf().
Russell Bryant [Sun, 8 May 2011 07:40:34 +0000 (02:40 -0500)]
logsys.c: Use snprintf() instead of sprintf().

Change a couple of string functions to use their output-length-limiting
counterparts.

Signed-off-by: Russell Bryant <russell@russellbryant.net>
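
To illustrate the difference the commit above relies on, here is a minimal sketch of bounded formatting with snprintf(). The function format_tag() and its output format are hypothetical and unrelated to the actual logsys.c strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical formatter: write "subsys[level]" into dst.
 * Returns 0 if the text fit, 1 if it was truncated, -1 on an encoding error.
 * Unlike sprintf(), snprintf() never writes more than dst_len bytes and
 * always NUL-terminates when dst_len is greater than zero.
 */
int format_tag(char *dst, size_t dst_len, const char *subsys, int level)
{
    int n;

    assert(dst != NULL && subsys != NULL && dst_len > 0);

    n = snprintf(dst, dst_len, "%s[%d]", subsys, level);
    if (n < 0) {
        return -1;
    }

    /* n is the length the full text would have had, excluding the NUL. */
    return ((size_t)n >= dst_len) ? 1 : 0;
}
```

Checking the return value against the buffer size is what turns a silent overflow into a detectable truncation.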
13 years agocorosync-objctl: Option to display binary data
Jan Friesse [Wed, 11 May 2011 14:58:23 +0000 (16:58 +0200)]
corosync-objctl: Option to display binary data

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agocpg: fix sync master selection when one node paused.
Angus Salkeld [Wed, 4 May 2011 23:29:37 +0000 (09:29 +1000)]
cpg: fix sync master selection when one node paused.

If one node is paused it can miss a config change and
thus report a larger old_members than expected.

The solution is to use the left_nodes field.

Master selection used to be "choose node with":
1) largest previous membership
2) (then as a tie-breaker) node with smallest nodeid

New selection:
1) largest (previous #nodes - #nodes known to have left)
2) (then as a tie-breaker) node with smallest nodeid

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
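
The new selection rule above can be sketched as a simple comparison loop. This is an illustration only, not the cpg implementation: `struct node_report`, its field names and `choose_sync_master` are hypothetical and merely mirror the terms used in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node summary; names are illustrative, not from cpg.c. */
struct node_report {
    uint32_t nodeid;
    uint32_t old_members;  /* size of the node's previous membership */
    uint32_t left_nodes;   /* members this node knows to have left */
};

/* Score used by rule 1; computed in 64 bits so it cannot wrap around. */
static int64_t report_score(const struct node_report *r)
{
    return (int64_t)r->old_members - (int64_t)r->left_nodes;
}

/* Returns the index of the chosen sync master, or -1 for an empty list. */
int choose_sync_master(const struct node_report *reports, size_t count)
{
    int best = -1;
    size_t i;

    for (i = 0; i < count; i++) {
        if (best < 0) {
            best = (int)i;
            continue;
        }
        int64_t s = report_score(&reports[i]);
        int64_t b = report_score(&reports[best]);

        /* Rule 1: larger score wins. Rule 2: smallest nodeid breaks ties. */
        if (s > b || (s == b && reports[i].nodeid < reports[best].nodeid)) {
            best = (int)i;
        }
    }
    return best;
}
```

For made-up reports (nodeid 3: 4 previous, 2 left), (nodeid 2: 3 previous, 0 left) and (nodeid 1: 3 previous, 0 left), the old rule would pick node 3 on raw membership size, while this rule picks node 1.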
13 years agoCTS: fix some tests that didn't handle being called more than once
Angus Salkeld [Wed, 4 May 2011 23:11:18 +0000 (09:11 +1000)]
CTS: fix some tests that didn't handle being called more than once

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: sort the configuration - prevent duplicates in the config file
Angus Salkeld [Wed, 4 May 2011 23:09:38 +0000 (09:09 +1000)]
CTS: sort the configuration - prevent duplicates in the config file

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: fix syntax error in log message
Angus Salkeld [Wed, 4 May 2011 23:10:20 +0000 (09:10 +1000)]
CTS: fix syntax error in log message

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: bump up log messages of failed RPC
Angus Salkeld [Wed, 4 May 2011 23:08:11 +0000 (09:08 +1000)]
CTS: bump up log messages of failed RPC

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: don't force all-once (breaks random tests)
Angus Salkeld [Wed, 4 May 2011 23:07:04 +0000 (09:07 +1000)]
CTS: don't force all-once (breaks random tests)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoautobuild: improve messages
Angus Salkeld [Wed, 4 May 2011 23:06:28 +0000 (09:06 +1000)]
autobuild: improve messages

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: add -l to keygen (normal keygen struggles to run on VMs)
Angus Salkeld [Wed, 4 May 2011 04:41:18 +0000 (14:41 +1000)]
CTS: add -l to keygen (normal keygen struggles to run on VMs)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agoCTS: send with correct number of iovecs
Angus Salkeld [Mon, 18 Apr 2011 02:46:53 +0000 (12:46 +1000)]
CTS: send with correct number of iovecs

Otherwise the payload won't be sent.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
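
The class of bug described above can be shown with plain POSIX writev() as a stand-in. This is a sketch under that assumption, not the CTS test agent code: the function name and the header/payload split are hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical example: send a header followed by a payload in one call.
 * Returns the number of bytes written, or -1 on error (as writev() does).
 */
ssize_t send_hdr_and_payload(int fd,
                             const void *hdr, size_t hdr_len,
                             const void *payload, size_t payload_len)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len = hdr_len;
    iov[1].iov_base = (void *)payload;
    iov[1].iov_len = payload_len;

    /* Derive the count from the array itself. Passing 1 here would
     * send only the header and silently drop the payload. */
    return writev(fd, iov, (int)(sizeof(iov) / sizeof(iov[0])));
}
```

Deriving the count from the array size keeps it correct if another element is added later.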
13 years agoCTS: timer should not be on the stack
Angus Salkeld [Mon, 18 Apr 2011 02:45:50 +0000 (12:45 +1000)]
CTS: timer should not be on the stack

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
13 years agototemsrp: Enhance mcast failure detection
Jan Friesse [Wed, 4 May 2011 13:00:31 +0000 (15:00 +0200)]
totemsrp: Enhance mcast failure detection

memb_state_gather_enter now increases stats.continuous_gather only if the
previous state was also gather. This should happen only when multicast is
not working properly (a local firewall in most cases), not when many
nodes join at one time.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
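
The counting rule described above can be sketched as follows. This is a simplified model, not the totemsrp code: the enum, the struct and the function name are hypothetical, and only the "increment when already in gather" behaviour comes from the commit message.

```c
/* Hypothetical, simplified membership states for illustration. */
enum sketch_memb_state {
    SKETCH_STATE_OPERATIONAL,
    SKETCH_STATE_GATHER,
    SKETCH_STATE_COMMIT,
    SKETCH_STATE_RECOVERY
};

struct srp_sketch {
    enum sketch_memb_state state;
    unsigned int continuous_gather;
};

/* Enter the gather state, counting only gather -> gather re-entries. */
void sketch_gather_enter(struct srp_sketch *s)
{
    if (s->state == SKETCH_STATE_GATHER) {
        /* Re-entering gather from gather: membership is not converging,
         * which the commit attributes to multicast not working. */
        s->continuous_gather++;
    }
    s->state = SKETCH_STATE_GATHER;
}
```

Entering gather from any other state leaves the counter unchanged, so a burst of joins from the operational state does not count as a multicast failure.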
13 years agocoroipcs: Deny connect to service without initfn
Jan Friesse [Tue, 29 Mar 2011 13:51:42 +0000 (15:51 +0200)]
coroipcs: Deny connect to service without initfn

If a library connects to a service with no init function, coroipcs will try
to dereference a NULL pointer. Now we correctly return the error code
CS_ERR_NOT_EXIST.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
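
A minimal sketch of the guard described above, assuming a service is represented by a struct holding an init function pointer. The types, names and numeric error values here are placeholders invented for the example; only the idea of returning a "not exist" error instead of calling through a NULL pointer comes from the commit message.

```c
#include <stddef.h>

/* Placeholder result codes; the numeric values are arbitrary stand-ins. */
enum sketch_result {
    SKETCH_OK = 0,
    SKETCH_ERR_NOT_EXIST = 1
};

typedef int (*sketch_init_fn)(void *conn);

/* Hypothetical service descriptor with an optional init function. */
struct sketch_service {
    sketch_init_fn init_fn;
};

/* Example init function so the sketch can be exercised. */
int sketch_sample_init(void *conn)
{
    (void)conn;
    return SKETCH_OK;
}

/* Refuse the connection when the service has no init function. */
int sketch_connect(const struct sketch_service *svc, void *conn)
{
    if (svc == NULL || svc->init_fn == NULL) {
        return SKETCH_ERR_NOT_EXIST;
    }
    return svc->init_fn(conn);
}
```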
13 years agoAdd ipc_refcnt to message_handler_req_{exec, lib}_cfg_ringreenable()
Tim Serong [Fri, 15 Apr 2011 00:40:11 +0000 (10:40 +1000)]
Add ipc_refcnt to message_handler_req_{exec, lib}_cfg_ringreenable()

Without refcounting the conn pointer here, corosync will segfault
if one kills a running instance of "corosync-cfgtool -r" (rhbz#695191)

Signed-off-by: Tim Serong <tserong@novell.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
13 years agoAlign ipc on 8 byte boundaries
Steven Dake [Mon, 3 Jan 2011 23:40:55 +0000 (16:40 -0700)]
Align ipc on 8 byte boundaries

Align all ipc messages on 8 byte boundaries.  This alignment will remove bus
errors on systems that can't access unaligned data and should improve
performance.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
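
Rounding a message length up to the next 8 byte boundary is commonly done with a mask, as in this sketch. The function name is hypothetical and this is not the code from the commit; it only illustrates the arithmetic.

```c
#include <stddef.h>

/* Round len up to the next multiple of 8 (hypothetical helper).
 * Assumes len is not within 7 of SIZE_MAX, so the addition cannot wrap. */
size_t sketch_align8(size_t len)
{
    return (len + 7u) & ~(size_t)7u;
}
```

A 13 byte message would be padded to 16 bytes, so the message that follows it starts on an 8 byte boundary.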