Most 32-bit architectures cannot do atomic loads and stores of data
wider than their pointer size, i.e. 32 bit. Funnily enough they
generally *can* do a CAS2, i.e., 64-bit compare-and-swap, but while a
CAS can emulate atomic add/bitops, loads and stores aren't available.
Replace with a mutex; since this is 99% used from the zserv thread, the
mutex should take the local-to-thread fast path anyway. And while one
atomic might be faster than a mutex lock/unlock, we're doing several
here, and at some point a mutex wins on speed anyway.
This fixes build on armel, mipsel, m68k, powerpc, and sh4.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>