Alex Wang [Sat, 15 Mar 2014 01:30:39 +0000 (18:30 -0700)]
cfm: Notify connectivity_seq on remote maintenance points change.
Commit f23d157c ("ofproto-dpif: Don't poll ports when nothing changes")
did not ensure the update of the row of remote maintenance points in ovsdb
when it changes. This commit makes the update happen by notifying the
global connectivity_seq.
Alex Wang [Wed, 19 Mar 2014 17:42:08 +0000 (10:42 -0700)]
ovs-rcu: Call ovsrcu_init() in ovsrcu_quiesce().
This commit fixes a bug introduced by 0f2ea848(ovs-rcu: New library.).
It is possible that ovsrcu_quiesce() is called before ovsrcu_init().
So, it is necessary to call ovsrcu_init() in ovsrcu_quiesce().
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Wed, 19 Mar 2014 15:51:52 +0000 (08:51 -0700)]
lib/hmap: Remove the memory fence from hmap_is_empty().
The fence made classifier_lookup() slower. Access to a size_t 'n' is
safe without synchronizing, and if racing with writers matters,
additional synchronization primitives are used anyway.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Fri, 14 Mar 2014 18:47:30 +0000 (11:47 -0700)]
datapath: compat: Downstream the reciprocal_div.{c,h}.
The reciprocal division code used in the datapath is flawed. The bug
has been fixed in the Linux kernel repo in commit 809fa972fd
("reciprocal_divide: update/correction of the algorithm").
This commit downstreams reciprocal_div.{c,h} from the Linux
kernel repo.
Signed-off-by: Alex Wang <alexw@nicira.com> Reviewed-by: Thomas Graf <tgraf@redhat.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
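For readers unfamiliar with the technique, the following is a minimal sketch of reciprocal division in the style of the corrected kernel algorithm: a divisor-specific multiplier and two shifts are precomputed once, and each later division becomes a multiply plus shifts. It is written against plain <stdint.h> types rather than the kernel's u32/u64, and the helper fls32() is a local stand-in for the kernel's fls(). Treat it as an illustration, not as the contents of the downstreamed files. The divisor is assumed to be nonzero.

```c
#include <stdint.h>

/* Precomputed values for dividing by one fixed 32-bit divisor. */
struct reciprocal_value {
    uint32_t m;         /* Multiplier. */
    uint8_t sh1, sh2;   /* Post-multiply shifts. */
};

/* Position of the highest set bit, counting from 1; 0 if x is 0.
 * Local stand-in for the kernel's fls(). */
static int fls32(uint32_t x)
{
    int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Computes the multiplier and shifts for divisor 'd', which must be
 * nonzero. */
static struct reciprocal_value reciprocal_value(uint32_t d)
{
    struct reciprocal_value R;
    int l = fls32(d - 1);
    uint64_t m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;

    R.m = (uint32_t) m;
    R.sh1 = l < 1 ? l : 1;             /* min(l, 1) */
    R.sh2 = l - 1 > 0 ? l - 1 : 0;     /* max(l - 1, 0) */
    return R;
}

/* Returns a / d, where R was computed by reciprocal_value(d). */
static uint32_t reciprocal_divide(uint32_t a, struct reciprocal_value R)
{
    uint32_t t = (uint32_t) (((uint64_t) a * R.m) >> 32);

    return (t + ((a - t) >> R.sh1)) >> R.sh2;
}
```

A caller would compute reciprocal_value(d) once and reuse it for every division by the same d.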
Andy Zhou [Thu, 13 Mar 2014 22:28:54 +0000 (15:28 -0700)]
backtrace: Add log_backtrace()
log_backtrace() and log_backtrace_msg() log the backtrace to
the log file. They may be most useful when debugging unit tests.
"backtrace.h" documents the usage. They are not called directly
in the code, but rather are handy tools available when needed.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Fri, 14 Mar 2014 04:48:55 +0000 (21:48 -0700)]
udpif: Bug fix udpif_flush
Before this commit, all datapath flows are cleared with dpif_flush(),
but the revalidator thread still holds ukeys, which are caches of the
datapath flows in the revalidator. Flushing ukeys causes flow_del
messages to be sent to the datapath again for flows that have already
been deleted by dpif_flush().
Double deletion is not a problem per se, though it may be an efficiency
issue. However, for every flow_del message sent to the datapath, a log
message at the warning level is generated if the datapath
fails to execute the command. In addition to causing spurious log
messages, double deletion causes unit tests to report erroneous
failures, as all warning messages are considered test failures.
The fix is to simply shut down the revalidator threads to flush all
ukeys, then flush the datapath before restarting the revalidator threads.
dpif_flush() was implemented to flush the flows of all datapaths, while
most of its invocations should flush only the local datapath.
Only the megaflow on/off commands should flush all datapaths. This bug is
also fixed.
Found during development.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
kmindg [Sun, 9 Mar 2014 09:48:52 +0000 (17:48 +0800)]
stp: Fix bpdu tx problem in listening state
compose_output_action__ only allows BPDUs to be sent in the forwarding
state. But a port may also send BPDUs in the listening
and learning states, according to the comments in lib/stp.h ("State of
an STP port").
Until this commit, OVS did not send out BPDUs in the listening and learning
states. But those two states are temporary; the STP port will eventually
reach the forwarding state and send out BPDUs (in the default
configuration, the listening and learning states last 15+15 seconds).
Therefore, this bug increased convergence time but did not entirely break STP.
Signed-off-by: kmindg <kmindg@gmail.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
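The distinction the commit message draws can be sketched as two separate predicates: one for ordinary data frames and one for BPDUs. The enum and function names below are hypothetical and are not the identifiers used in lib/stp.h; treating the blocking and disabled states as non-transmitting is also an assumption of this sketch.

```c
#include <stdbool.h>

/* Illustrative STP port states (hypothetical names). */
enum ex_stp_state {
    EX_STP_DISABLED,
    EX_STP_BLOCKING,
    EX_STP_LISTENING,
    EX_STP_LEARNING,
    EX_STP_FORWARDING
};

/* Ordinary data frames are forwarded only in the forwarding state. */
static bool ex_may_forward_data(enum ex_stp_state state)
{
    return state == EX_STP_FORWARDING;
}

/* BPDUs may also be sent while the port is listening or learning.
 * Applying the data-frame check to BPDUs is the bug described above. */
static bool ex_may_send_bpdu(enum ex_stp_state state)
{
    return state == EX_STP_LISTENING
           || state == EX_STP_LEARNING
           || state == EX_STP_FORWARDING;
}
```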
windows/netinet: Copy ip6.h and icmp6.h from netbsd.
There are a few structure definitions that are used from
these headers, so copy them from the NetBSD repo.
The following changes have been made on top of it:
* The keyword "__packed" has been removed
from the headers as the corresponding Linux headers don't
do packing.
* #if BYTE_ORDER == 'X' macros have been replaced by CONSTANT_HTONx().
* code inside #ifdef _KERNEL has been deleted.
* code inside #ifdef ICMP6_STRINGS has been deleted.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 11 Mar 2014 19:46:29 +0000 (12:46 -0700)]
ovs-atomic: Use raw types, not structs, when locks are required.
Until now, the GCC 4+ and pthreads implementations of atomics have used
struct wrappers for their atomic types. This had the advantage of allowing
a mutex to be wrapped in, in some cases, and of better type-checking by
preventing stray uses of atomic variables other than through one of the
atomic_*() functions or macros. However, the mutex meant that an
atomic_destroy() function-like macro needed to be used. The struct wrapper
also made it impossible to define new atomic types that were compatible
with each other without using a typedef. For example, one could not simply
define a macro like
    #define ATOMIC(TYPE) struct { TYPE value; }
and then have two declarations like:
    ATOMIC(void *) x;
    ATOMIC(void *) y;
and do anything with these objects that requires type-compatibility, even
"&x == &y", because the two structs are not compatible. One can do it
through a typedef:
    typedef ATOMIC(void *) atomic_voidp;
    atomic_voidp x, y;
but that is inconvenient, especially because of the need to invent a name
for the type.
This commit aims to ease the problem by getting rid of the wrapper structs
in the cases where the atomic library used them. It gets rid of the
per-object mutexes, in the cases where locking is still needed, by using a
global array of mutexes instead.
This commit also defines the ATOMIC macro described above and documents
its use in ovs-atomic.h.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
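The "global array of mutexes" idea can be sketched as follows: the atomic object stays a raw type, and the lock protecting it is chosen by hashing the object's address into a small fixed table. Because no mutex lives inside the object, nothing needs to be destroyed with it. All names, the table size, and the hash below are hypothetical choices for illustration and are not taken from ovs-atomic.

```c
#include <pthread.h>
#include <stdint.h>

#define EX_N_LOCKS 4    /* Hypothetical table size. */

/* One shared table of locks instead of one mutex per atomic object. */
static pthread_mutex_t ex_locks[EX_N_LOCKS] = {
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER
};

/* Maps an object's address to a slot in the lock table.  The low bits
 * are discarded because aligned objects tend to share them. */
static unsigned int ex_lock_index(const void *p)
{
    uintptr_t x = (uintptr_t) p;

    return (unsigned int) ((x >> 4) % EX_N_LOCKS);
}

/* Stores 'value' into the raw int at 'dst' under its table lock. */
static void ex_locked_store_int(int *dst, int value)
{
    pthread_mutex_t *mutex = &ex_locks[ex_lock_index(dst)];

    pthread_mutex_lock(mutex);
    *dst = value;
    pthread_mutex_unlock(mutex);
}

/* Reads the raw int at 'src' under the same table lock. */
static int ex_locked_read_int(const int *src)
{
    pthread_mutex_t *mutex = &ex_locks[ex_lock_index(src)];
    int value;

    pthread_mutex_lock(mutex);
    value = *src;
    pthread_mutex_unlock(mutex);
    return value;
}
```

Two unrelated objects may hash to the same lock. That costs some contention but does not affect correctness, since each operation holds only one lock at a time.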
idl headers won't be built if we build individual executables,
e.g., "make ovsdb/ovsdb-server.exe". According to
http://www.gnu.org/software/automake/manual/html_node/Built-Sources-Example.html
we may have to add the headers as dependencies for every executable.
Currently the lack of an ovs-appctl port to Windows prevents us from
running just a "make". We plan to get the ovs-appctl port done soon. Until
then, call out that the idl headers need to be built separately.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
ofp-actions: Relax build assertion condition for ofpact_nest struct.
struct ofpact has enums that are packed when __GNUC__ is defined.
This packing does not occur with Visual Studio. For 'struct ofpact_nest',
we are currently expecting that "struct ofpact actions[]" has an offset of
8 bytes. This condition won't be true in compilers where enums are
not packed.
It is good enough if struct ofpact actions[] starts at an offset which is
a multiple of OFPACT_ALIGNTO.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
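The relaxed condition can be illustrated with a compile-time check. The sketch below uses C11 _Static_assert rather than the project's own build-assertion macro, and the structs, field layout, and the value of EX_ALIGNTO are hypothetical stand-ins for struct ofpact, struct ofpact_nest, and OFPACT_ALIGNTO.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_ALIGNTO 8    /* Stand-in for OFPACT_ALIGNTO; value assumed. */

enum ex_type { EX_TYPE_A, EX_TYPE_B };

/* Header whose size depends on whether the compiler packs enums:
 * small with packed enums, typically 12 bytes with 4-byte enums. */
struct ex_header {
    enum ex_type type;
    enum ex_type compat;
    uint16_t len;
};

struct ex_nest {
    struct ex_header header;
    _Alignas(EX_ALIGNTO) uint8_t actions[];    /* Nested actions. */
};

/* A strict check such as "offsetof(struct ex_nest, actions) == 8" would
 * hold only for one particular header size.  The relaxed check accepts
 * any offset that keeps the nested actions properly aligned. */
_Static_assert(offsetof(struct ex_nest, actions) % EX_ALIGNTO == 0,
               "nested actions must start on an EX_ALIGNTO boundary");
```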
Windows does not have the getppid(), getuid(), getgid() functions.
We do get a random seed from CryptGenRandom(). That seed along with
process id and current time hopefully is good enough.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
kmindg [Sun, 9 Mar 2014 09:48:04 +0000 (17:48 +0800)]
ofproto: Update rule's priority in eviction group.
We do call heap_rebuild in ofproto_run, but we do not update rule's
priority with latest hard_timeout and idle_timeout before heap_rebuild.
This patch ensures that rule's priority has been updated before
heap_rebuild, and adds two test cases to check eviction with modified
hard_timeout and idle_timeout.
Signed-off-by: kmindg <kmindg@gmail.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
We use the number of cpu cores to determine the number
of threads that we spawn. We are not yet sure what is
the ideal number of OVS userspace threads that can run
on Hyper-V. Till we figure that out, use the same logic
of counting CPU cores in Windows too.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Fri, 7 Mar 2014 01:20:27 +0000 (17:20 -0800)]
tests: Test learned flow idle timeouts.
The previous tests would check that a single learned flow had its stats
correctly attributed to the right interfaces and flows. These new tests
take it a step further by causing two different learned flows to be
created, and checking the stats are correct. This is done for rules that
are learned with idle and hard timeouts.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Lorand Jakab [Tue, 11 Mar 2014 17:02:28 +0000 (19:02 +0200)]
README-lisp: improve LISP documentation
People familiar with LISP are used to the concept of a mapping cache in
a LISP Tunnel Router. Explain how that concept maps to OVS flow rules.
Additionally, mention that eth0 need not be added in all example
scenarios.
Signed-off-by: Lorand Jakab <lojakab@cisco.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Thu, 6 Mar 2014 00:56:05 +0000 (16:56 -0800)]
upcall: Configure datapath max-idle through ovs-vsctl.
This patch adds a new configuration option, "max-idle" to the
Open_vSwitch "other-config" column. This sets how long datapath flows
are cached in the datapath before revalidators expire them.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Justin Pettit <jpettit@nicira.com>
We use getrusage mainly to get user CPU time and system CPU time.
Windows has GetProcessTimes() and GetThreadTimes(), which do the
same job, so use them.
We also use getrusage to get page faults. Use GetProcessMemoryInfo()
for that.
We also get the number of context switches and block I/O counts and use
them for debug information when we wake up from poll_block late. I haven't
found functions for that in Windows. We only use them for debug information,
so it should be okay not to implement them.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Co-authored-by: Linda Sun <lsun@vmware.com> Signed-off-by: Linda Sun <lsun@vmware.com> Acked-by: Ben Pfaff <blp@nicira.com>
Use GetSystemTimePreciseAsFileTime() for gettimeofday().
GetSystemTimePreciseAsFileTime() provides a result with higher
resolution than the microseconds that gettimeofday() on
Linux provides, so we need to remove some additional precision.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
QueryPerformanceCounter() retrieves the current value of the performance
counter, which is a high resolution (<1us) time stamp that can be used for
time-interval measurements. So, use it for MONOTONIC clock.
The GetSystemTimePreciseAsFileTime() function retrieves the current system date
and time with the highest possible level of precision (<1us). Use it for
real time clock. This function returns a counter representing the number of
100-nanosecond intervals since January 1, 1601. To make it compatible with
Linux CLOCK_REALTIME, we need to calculate the value of the 100-nanosecond
counter at January 1, 1970, and subtract it.
An upcoming commit implements gettimeofday() using the same clock, so,
carve out a function.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
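The epoch arithmetic described above can be sketched in portable C. The function takes the 64-bit count of 100-nanosecond intervals since January 1, 1601 (on Windows the caller would assemble it from the two 32-bit halves of a FILETIME), subtracts the 11644473600 seconds between 1601 and 1970, and splits the result into seconds and nanoseconds. A second helper drops the sub-microsecond digit, as a gettimeofday()-style interface requires. The struct and function names are hypothetical, and the input is assumed to be at or after January 1, 1970.

```c
#include <stdint.h>

/* 100 ns intervals per second. */
#define EX_TICKS_PER_SEC 10000000ULL
/* Seconds from 1601-01-01 to 1970-01-01. */
#define EX_EPOCH_DIFF_SECS 11644473600ULL

struct ex_timespec {
    int64_t sec;
    int32_t nsec;
};

struct ex_timeval {
    int64_t sec;
    int32_t usec;
};

/* Converts a count of 100 ns intervals since 1601 into time since the
 * Unix epoch.  'ticks' must not be earlier than 1970-01-01. */
static struct ex_timespec ex_ticks_to_timespec(uint64_t ticks)
{
    uint64_t unix_ticks = ticks - EX_EPOCH_DIFF_SECS * EX_TICKS_PER_SEC;
    struct ex_timespec ts;

    ts.sec = (int64_t) (unix_ticks / EX_TICKS_PER_SEC);
    ts.nsec = (int32_t) ((unix_ticks % EX_TICKS_PER_SEC) * 100);
    return ts;
}

/* Same conversion, truncated to microsecond resolution. */
static struct ex_timeval ex_ticks_to_timeval(uint64_t ticks)
{
    struct ex_timespec ts = ex_ticks_to_timespec(ticks);
    struct ex_timeval tv;

    tv.sec = ts.sec;
    tv.usec = ts.nsec / 1000;
    return tv;
}
```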
Ben Pfaff [Thu, 23 Jan 2014 23:35:22 +0000 (15:35 -0800)]
Makefile: Compile Linux-specific files based on __linux__ macro.
We want to conditionally compile several files based on whether we're
building for a Linux host, so we need some Automake conditional for that.
Previously this was based on whether Netlink is available and we're not
on ESX (since ESX has Netlink but isn't Linux), but it's more
straightforward to just test for Linux directly.
CC: Luigi Rizzo <rizzo@iet.unipi.it> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 23 Jan 2014 23:33:25 +0000 (15:33 -0800)]
Use __linux__ instead of LINUX_DATAPATH in C code.
The LINUX_DATAPATH C preprocessor symbol was originally meant to be used as
a signal for whether the Linux datapath module could be used, but it was
used as a proxy for a lot of other stuff that is really just Linux
specific. This commit switches all of these users to just test for
__linux__, which is more straightforward and should have the same result.
CC: Luigi Rizzo <rizzo@iet.unipi.it> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Sun, 2 Mar 2014 01:15:00 +0000 (17:15 -0800)]
tunnel: Do not set padding bits in tunnel mask.
On most architectures other than 32-bit x86, struct flow_tnl ends with 4
padding bytes. Until now, tnl_xlate_init() set those bytes to nonzero
values in the wildcard mask. When the wildcard mask passed through Netlink
attributes and back to userspace, the padding bytes of course became zero
again, which caused a wildcard mask mismatch and premature deletion of the
flow in revalidation. This commit fixes the problem.
Bug #1192516. Reported-by: Krishna Miriyala <miriyalak@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Sun, 2 Mar 2014 01:11:02 +0000 (17:11 -0800)]
odp-util: Include tun_id when nonzero even if "key" flag not set.
When a flow_tnl is being translated to Netlink attributes, the tun_id field
was included only if the FLOW_TNL_F_KEY flag was set. This meant that for
a mask, where one would not necessarily expect that flag to be set even if
there were a key, the tun_id could be omitted even if it were nonzero.
This led to kernel flows that did not match on a field that was required
to be matched (possibly causing incorrect treatment of packets) and
premature deletion of kernel flows due to mask mismatch. This commit
fixes the problem.
Bug #1192516. Reported-by: Krishna Miriyala <miriyalak@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
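The change in the inclusion condition can be sketched with a simplified tunnel struct. The struct, the flag's bit value, and the function names below are hypothetical. Only the logic (emit tun_id when the key flag is set or when tun_id itself is nonzero) follows the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_TNL_F_KEY (1u << 2)  /* Hypothetical bit value. */

/* Simplified stand-in for struct flow_tnl. */
struct ex_flow_tnl {
    uint64_t tun_id;
    uint16_t flags;
};

/* Old condition: tun_id is emitted only when the key flag is set. */
static bool ex_emit_tun_id_old(const struct ex_flow_tnl *tnl)
{
    return (tnl->flags & EX_TNL_F_KEY) != 0;
}

/* New condition: a nonzero tun_id is emitted regardless of the flag,
 * which matters for masks, where the flag bit may be clear. */
static bool ex_emit_tun_id_new(const struct ex_flow_tnl *tnl)
{
    return tnl->tun_id != 0 || (tnl->flags & EX_TNL_F_KEY) != 0;
}
```

For a mask with tun_id set to all-ones and flags set to 0, the old condition omits the field while the new one includes it.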
Ben Pfaff [Sat, 1 Mar 2014 00:20:17 +0000 (16:20 -0800)]
vconn: Fix vconn_get_status() return value when connection in progress.
When a connection takes a few rounds of the state machine to complete,
'error' gets filled with EAGAIN until that completes. This didn't match
the vconn_get_status() documentation, which says that it only returns a
positive errno value if there was an error. One could fix the problem
by updating the documentation (and the callers) or by updating the
implementation. I decided that the latter was the way to go because
the distinction between the TCP connection being in progress or complete
isn't visible to the client; what is visible to the client is the OpenFlow
negotiation being complete.
This problem is difficult to find in the unit tests because TCP connections
to localhost complete immediately.
Bug introduced by commit accaecc419cc57d (rconn: Discover errors in
rconn_run() even if rconn_recv() is never called.)
Reported-by: Anuprem Chalvadi <achalvadi@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
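One way to make the implementation match the documented contract is to report "connection still in progress" as success rather than as an error. The sketch below shows that mapping on a hypothetical connection struct. It illustrates the idea described above and is not a copy of the vconn code.

```c
#include <errno.h>

/* Simplified stand-in for a connection object.  'error' holds 0 when
 * connected, EAGAIN while connecting, or a positive errno on failure. */
struct ex_conn {
    int error;
};

/* Returns 0 if the connection is healthy or still being established,
 * otherwise the positive errno value that describes the failure. */
static int ex_conn_get_status(const struct ex_conn *conn)
{
    return conn->error == EAGAIN ? 0 : conn->error;
}
```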
Andy Zhou [Thu, 27 Feb 2014 02:08:04 +0000 (18:08 -0800)]
lib: simplify flow_extract() API
Change the flow_extract() API to accept struct pkt_metadata,
instead of individual metadata fields. It will make the API more
logical and easier to maintain when we need to expand metadata
down the road.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Ben Pfaff [Fri, 28 Feb 2014 21:12:04 +0000 (13:12 -0800)]
datapath: Correctly report flow used times for first 5 minutes after boot.
The kernel starts out its "jiffies" timer as 5 minutes below zero, as
shown in include/linux/jiffies.h:
    /*
     * Have the 32 bit jiffies value wrap 5 minutes after boot
     * so jiffies wrap bugs show up earlier.
     */
    #define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))
The loop in ovs_flow_stats_get() starts out with 'used' set to 0, then
takes any "later" time. This means that for the first five minutes after
boot, flows will always be reported as never used, since 0 is greater than
any time already seen.
Bug #1192516. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
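The wraparound effect can be reproduced in userspace by modeling a 32-bit jiffies counter with uint32_t. In the sketch below, EX_HZ = 250 is an assumed tick rate, ex_time_after() mirrors the signed-difference comparison that the kernel's time_after() performs, and the "fixed" loop shows one possible remedy (treating 0 as "never used" instead of as a timestamp), which may differ from the actual patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_HZ 250       /* Assumed tick rate. */
#define EX_INITIAL_JIFFIES ((uint32_t) (-300 * EX_HZ))

/* True if 'a' is later than 'b' under wrapping 32-bit arithmetic. */
static bool ex_time_after(uint32_t a, uint32_t b)
{
    return (int32_t) (b - a) < 0;
}

/* Buggy: starts from 0, which compares as later than every timestamp
 * taken in the first five minutes after boot. */
static uint32_t ex_latest_used_buggy(const uint32_t *stamps, int n)
{
    uint32_t used = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (ex_time_after(stamps[i], used)) {
            used = stamps[i];
        }
    }
    return used;
}

/* Possible fix: 0 means "never used" and is never compared as a time. */
static uint32_t ex_latest_used_fixed(const uint32_t *stamps, int n)
{
    uint32_t used = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (stamps[i] && (!used || ex_time_after(stamps[i], used))) {
            used = stamps[i];
        }
    }
    return used;
}
```

With two timestamps taken shortly after boot, the buggy loop returns 0 ("never used"), while the fixed loop returns the later of the two.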