Jarno Rajahalme [Wed, 27 Nov 2013 20:58:46 +0000 (12:58 -0800)]
util: Better count_1bits().
Inline, use another well-known algorithm for 64-bit builds, and use
builtins when they are known to be fast at compile time. A 32-bit
version of the alternate algorithm is slower than the existing
implementation, so the old one is used for 32-bit builds. Inline
assembler would be a bit faster on a 32-bit i7 build, but we use the GCC
builtin for portability.
It should be stressed that builds for specific CPUs do not work on other
CPUs, and that neither the OVS build system nor the runtime currently
supports CPU detection.
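As an illustration of the approach described above, here is a minimal sketch of a 64-bit population count. It uses the widely known SWAR (parallel bit-summing) technique; whether this is exactly the algorithm the commit adopts is an assumption, and the function names are hypothetical. The builtin is selected only when the compiler predefines __POPCNT__, which GCC and Clang do when the target is known to have a popcount instruction.

```c
#include <stdint.h>

/* Portable 64-bit population count using the SWAR technique.
 * Hypothetical name; not taken from the Open vSwitch sources. */
static inline unsigned int
count_1bits_swar64(uint64_t x)
{
    /* Sum adjacent bits into 2-bit fields. */
    x -= (x >> 1) & UINT64_C(0x5555555555555555);
    /* Sum adjacent 2-bit fields into 4-bit fields. */
    x = (x & UINT64_C(0x3333333333333333))
        + ((x >> 2) & UINT64_C(0x3333333333333333));
    /* Sum adjacent 4-bit fields into 8-bit fields. */
    x = (x + (x >> 4)) & UINT64_C(0x0f0f0f0f0f0f0f0f);
    /* Add all eight bytes together; the total lands in the top byte. */
    return (unsigned int) ((x * UINT64_C(0x0101010101010101)) >> 56);
}

/* Uses the compiler builtin only when the build targets a CPU that is
 * known, at compile time, to have a fast popcount instruction. */
static inline unsigned int
count_1bits_example(uint64_t x)
{
#if defined(__GNUC__) && defined(__POPCNT__)
    return (unsigned int) __builtin_popcountll(x);
#else
    return count_1bits_swar64(x);
#endif
}
```

Because the choice is made at compile time, a binary built with the builtin enabled carries the CPU dependency the message warns about.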
Pravin B Shelar [Fri, 6 Dec 2013 18:43:12 +0000 (10:43 -0800)]
datapath: compat: Fix Compiler error for kernel 3.3 to 3.8
Kernels 3.3 to 3.8 define `struct flow_keys`, but it does not
contain the flow_keys.thoff field. Therefore we need to use the
compat definition of struct flow_keys.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Pravin B Shelar [Thu, 5 Dec 2013 23:50:27 +0000 (15:50 -0800)]
datapath: Use percpu allocator for flow-stats.
Use the percpu allocator for stats, due to objections to the stats array.
However, the percpu allocator is not designed for high-churn allocation
and deallocation, so we need to avoid allocating percpu stats for
short-lived flows. One cheap way to detect a long-lived flow is to check
whether the 5-tuple fields used in RSS are masked. If any one of them is
masked, the flow is likely shared across CPUs, where percpu stats
should be more scalable, and such a flow should be relatively
long lived.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Pravin B Shelar [Tue, 3 Dec 2013 16:33:36 +0000 (08:33 -0800)]
datapath: Improve compat rxhash functionality.
This patch improves the rxhash calculation. It is taken from
upstream Linux kernel code.
From kernel 3.8, skb_get_rxhash() can handle hardware-generated
l4-rxhash. Therefore the compat skb_get_rxhash() is not used on
kernel 3.8 or newer.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Thomas Graf <tgraf@redhat.com>
Acked-by: Jesse Gross <jesse@nicira.com>
James Page [Thu, 5 Dec 2013 17:29:05 +0000 (17:29 +0000)]
Add check for -latomic
Later versions of gcc on some architectures push atomic functions
out into a separate atomic library; add a check to see when this
is required and add it to LIBS if need be.
Specifically the problem was observed on GCC 4.8.2 on powerpc
architecture for Ubuntu 14.04:
Flavio Leitner [Tue, 3 Dec 2013 01:13:16 +0000 (23:13 -0200)]
fedora package: include python byte compiled files
Include byte compiled files to speed up the execution,
to avoid spurious SELinux AVC denials and also to make
rpm happy when checking for unpackaged files:
Jesse Gross [Tue, 3 Dec 2013 02:56:32 +0000 (18:56 -0800)]
datapath: Silence RCU lockdep checks from flow lookup.
Flow lookup can happen either in packet processing context or userspace
context but it was annotated as requiring RCU read lock to be held. This
also allows OVS mutex to be held without causing warnings.
Jarno Rajahalme [Mon, 2 Dec 2013 23:14:09 +0000 (15:14 -0800)]
lib: More intuitive syntax for TCP flags matching.
Allow TCP flags match specification with symbolic flag names. TCP
flags are optionally specified as a string of flag names, each
preceded by '+' when the flag must be one, or '-' when the flag must
be zero. Any flags not explicitly included are wildcarded. The
existing hex syntax is still allowed, and is used in flow dumps when
all the flags are matched.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
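To illustrate the symbolic TCP flags syntax described in the commit above, the following sketch parses a string such as "+syn-ack" into a value/mask pair. The function name, the lowercase flag spellings, and the table layout are assumptions made for this example, not the Open vSwitch parser itself; the bit values are the standard TCP header flag bits.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct tcp_flag_name {
    const char *name;
    uint16_t bit;
};

/* Standard TCP header flag bits; the name spellings are assumed. */
static const struct tcp_flag_name tcp_flag_names[] = {
    { "fin", 0x001 }, { "syn", 0x002 }, { "rst", 0x004 },
    { "psh", 0x008 }, { "ack", 0x010 }, { "urg", 0x020 },
    { "ece", 0x040 }, { "cwr", 0x080 }, { "ns",  0x100 },
};

/* Parses 's' as a sequence of flag names, each preceded by '+' (flag must
 * be one) or '-' (flag must be zero).  Flags that are not named stay
 * wildcarded, that is, their mask bit remains 0.  Returns false on a
 * missing sign, an unknown name, or a repeated flag. */
static bool
parse_tcp_flags_example(const char *s, uint16_t *flags, uint16_t *mask)
{
    *flags = 0;
    *mask = 0;
    while (*s) {
        bool set;
        size_t len, i;
        uint16_t bit = 0;

        if (*s == '+') {
            set = true;
        } else if (*s == '-') {
            set = false;
        } else {
            return false;
        }
        s++;

        /* The flag name runs up to the next sign or the end of string. */
        len = strcspn(s, "+-");
        for (i = 0; i < sizeof tcp_flag_names / sizeof tcp_flag_names[0];
             i++) {
            if (strlen(tcp_flag_names[i].name) == len
                && !strncmp(s, tcp_flag_names[i].name, len)) {
                bit = tcp_flag_names[i].bit;
                break;
            }
        }
        if (!bit || (*mask & bit)) {
            return false;
        }
        *mask |= bit;
        if (set) {
            *flags |= bit;
        }
        s += len;
    }
    return true;
}
```

With this sketch, "+syn-ack" yields flags 0x002 and mask 0x012: SYN must be set, ACK must be clear, and every other flag is wildcarded.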
Ben Pfaff [Fri, 22 Nov 2013 23:57:23 +0000 (15:57 -0800)]
ofproto-dpif: keep slow path flow time stamp up-to-date
Not updating a slow path subfacet's time stamp can cause its datapath
flow to be deleted periodically. For example, CFM datapath flows have
userspace actions that are handled in the dpif slow path. They are deleted
and recreated periodically without the fix.
This bug is not obvious during normal operation. A deleted CFM flow
causes CFM packets to be handled by the flow miss handler, which
reinstalls the flow in the datapath. The only potentially observable
behavior is that when userspace is overwhelmed with flow miss packets,
the periodic CFM miss packets may get stuck behind other miss packets,
causing tunnel flapping.
Ben refactored the patch to its current form.
Reported-by: Guolin Yang <gyang@nicira.com>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Alin Serdean [Tue, 26 Nov 2013 07:38:48 +0000 (23:38 -0800)]
Avoid printf type modifiers not supported by MSVC C runtime library.
The MSVC C library printf() implementation does not support the 'z', 't',
'j', or 'hh' length modifiers. This commit changes the Open vSwitch code
to avoid those modifiers, switching to standard macros from
<inttypes.h> where available and inventing new macros resembling them
where necessary. It also updates CodingStyle to specify the macros' use
and adds a Makefile rule to report violations.
Signed-off-by: Alin Serdean <aserdean@cloudbasesolutions.com>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
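The macro approach described in the commit above might look like the following sketch. EXAMPLE_PRIuSIZE is a hypothetical name invented here in the style of the <inttypes.h> PRI* macros; the actual macro names introduced by the commit are not shown in this log. The "I" size prefix is the MSVC runtime's own spelling for size_t-sized arguments.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical macro modeled on the <inttypes.h> PRI* macros.  The MSVC C
 * runtime spells the size_t length modifier "I" rather than "z". */
#ifdef _WIN32
#define EXAMPLE_PRIuSIZE "Iu"
#else
#define EXAMPLE_PRIuSIZE "zu"
#endif

/* Formats a size_t and a uint64_t without writing 'z' or 'll' directly in
 * the format string.  Returns what snprintf() returns. */
static int
format_sizes_example(char *buf, size_t bufsize, size_t n, uint64_t total)
{
    return snprintf(buf, bufsize,
                    "%" EXAMPLE_PRIuSIZE " items, %" PRIu64 " bytes",
                    n, total);
}
```

Callers keep a single format string that works on both runtimes, which is what makes a Makefile check for stray 'z', 't', 'j', or 'hh' modifiers practical.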
debian: Don't recreate bridges during manual restart.
Open vSwitch bridges and ports can be configured through
the /etc/network/interfaces script. During system startup,
Open vSwitch startup script reads the interfaces file
and creates the bridges and ports. During system shutdown,
the bridges and ports are removed.
The same behavior can also occur with a manual 'restart' of
Open vSwitch (e.g. service openvswitch-switch restart).
This behavior has come across as undesirable in some cases.
For example, when someone manually creates interfaces through ovs-vsctl
and then restarts Open vSwitch, those interfaces are lost.
This commit changes the startup script so that interfaces
are created and deleted through the startup script only when
the RUNLEVEL environment variable is set. This behavior will be
consistent with the OVS RHEL ifcfg-* scripts too.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Romain Lenglet [Wed, 20 Nov 2013 18:57:53 +0000 (10:57 -0800)]
ipfix: allow empty targets column in table IPFIX
The "targets" column in IPFIX had a min=1 constraint, so OVSDB
implicitly added an empty string "" into that column if no value was
given. No connection can be opened to a target with address "", so
the whole IPFIX exporter for that row was disabled until that ""
target was removed by users. That behavior is correct but proved to
be unintuitive to users.
This patch removes the min=1 constraint, to avoid the trouble for
users who insert IPFIX rows with no targets: it eliminates the log
messages due to failed connections to target "", and eliminates the
need to manually remove the "" target after row insertion.
This doesn't impact the behavior for any existing row, whether it has
a "" target or not.
Signed-off-by: Romain Lenglet <rlenglet@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Wed, 20 Nov 2013 19:38:29 +0000 (11:38 -0800)]
bfd: Add new key "flap_count" to "bfd_status".
This commit adds a new key "flap_count" to "bfd_status" to count
the number of bfd "forwarding" flag flaps. A flap is defined
as a change of the "forwarding" flag value.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Wed, 20 Nov 2013 19:15:54 +0000 (11:15 -0800)]
bfd: Add forwarding flag to struct bfd.
This commit adds a forwarding flag to "struct bfd". This flag
is for indicating the interface's capability of packet I/O.
Also, this flag makes it possible to count the number of interface
state flaps. bfd_forwarding__() will update this flag at
each invocation.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Fri, 15 Nov 2013 23:25:00 +0000 (15:25 -0800)]
coverage: Synchronize per-thread counters less aggressively
When profiling CPU usage in situations involving high numbers of ports,
coverage_clear() was highlighted as a commonly called function. It
appears that it can be quite expensive to access all of the per-thread
coverage counters when threads are constantly waking up.
This patch makes each thread only do coverage_clear() logic roughly once
per second by introducing per-thread timers. Upcall handler counters may
become less accurate, as these threads may sleep without synchronising
and not wake up for some time. When the main thread is under load at
~90% CPU, this drops to ~85%. Upcall handler threads sitting at ~2.5%
drop to ~1.5%.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
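The per-thread timer idea from the commit above can be sketched roughly as follows. All names are hypothetical, the current time is passed in as a parameter to keep the example deterministic, and the unsynchronized global counter stands in for whatever locking the real code uses; this is not the actual coverage_clear() logic. The sketch assumes a C11 compiler for _Thread_local.

```c
#include <stdbool.h>

#define EXAMPLE_SYNC_INTERVAL_MS 1000

/* Per-thread state: when this thread should next synchronize, and the
 * hits it has accumulated since its last synchronization. */
static _Thread_local long long example_next_sync_ms;
static _Thread_local unsigned int example_local_hits;

/* Shared total.  Real code would protect this with a mutex or atomics;
 * that is omitted here to keep the sketch short. */
static unsigned long long example_global_hits;

/* Folds this thread's counter into the shared total, but at most once per
 * EXAMPLE_SYNC_INTERVAL_MS.  Returns true if a synchronization happened.
 * A thread that sleeps for a long time simply synchronizes late, which is
 * the accuracy trade-off the commit message mentions. */
static bool
maybe_sync_counters_example(long long now_ms)
{
    if (now_ms < example_next_sync_ms) {
        return false;           /* Too soon: skip the expensive part. */
    }
    example_next_sync_ms = now_ms + EXAMPLE_SYNC_INTERVAL_MS;
    example_global_hits += example_local_hits;
    example_local_hits = 0;
    return true;
}
```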
Ben Pfaff [Fri, 15 Nov 2013 22:19:57 +0000 (14:19 -0800)]
ofp-actions: Make ofpacts_check() report consistency for all protocols.
Until now ofpacts_check() has been told either to enforce consistency or
not, but that means that the caller has to know exactly what protocol is
going to be in use (because some protocols require consistency to be
enforced and others don't). This commit changes ofpacts_check() to just
rule out protocols that require enforcement when it detects
inconsistencies.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
cccl: Provide '-' before the options instead of double slash.
While compiling Open vSwitch with Visual C++ in
a MinGW environment, I have observed that "//" before
options does not work for all the compiler options of
MSVC. Using "-", on the other hand, seems to work.
Also, echo the command line options passed to
the MSVC compiler.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
One option for compiling Open vSwitch code on Windows
is to use the Visual C++ compiler.
From http://cccl.sourceforge.net/ :
"cccl is a wrapper around Microsoft Visual C++'s cl.exe
and link.exe. It converts Unix compiler parameters
into parameters understood by cl and link. cccl's main
use is for using Unix build processes with Microsoft
compilers. Using cccl in conjunction with ports of Unix
utilities, it is possible to build many Unix packages
using MSVC, without modifying the build process."
There are a couple of forks of the project on the internet.
This particular piece is copied from:
https://gitorious.org/swift/swift/source/cf9b391b40a9c59a620c8093d438370381949c60:autoconf/cccl
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Wed, 20 Nov 2013 01:31:29 +0000 (17:31 -0800)]
Classifier: Staged subtable matching.
Subtable lookup is performed in ranges defined for struct flow,
starting from metadata (registers, in_port, etc.), then L2 header, L3,
and finally L4 ports. Whenever it is found that there are no matches
in the current subtable, the rest of the subtable can be skipped. The
rationale of this logic is that as many fields as possible can remain
wildcarded.
Ben Pfaff [Tue, 19 Nov 2013 19:02:08 +0000 (11:02 -0800)]
utilities: Regenerate ovs-lib if configuration changes.
Otherwise running "configure" twice with different --prefix (etc.) will
fail to update ovs-lib, so that "make install" installs an ovs-lib with
the wrong paths.
Reported-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Wed, 13 Nov 2013 17:50:54 +0000 (09:50 -0800)]
datapath: Shrink sw_flow_mask by 8 bytes (64-bit) or 4 bytes (32-bit).
We won't normally have a ton of flow masks but using a size_t to store
values no bigger than sizeof(struct sw_flow_key) seems excessive.
This reduces sw_flow_key_range and sw_flow_mask by 4 bytes on 32-bit
systems. On 64-bit systems it shrinks sw_flow_key_range by 12 bytes but
sw_flow_mask only by 8 bytes due to padding.
Compile tested only.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Mon, 18 Nov 2013 21:18:41 +0000 (13:18 -0800)]
util: Make popcount() handle 64-bit integers, not separate popcount64().
Having a single function that can do popcount() on any integer type is
easier for callers to get right. The implementation is probably slower
if the caller actually provides a 32-bit (or shorter) integer, but the
only existing callers always provide a full 64-bit integer so this seems
unimportant for now.
This also restores use, in practice, of the optimized implementation of
population count. (As the comment on popcount32() says, this version is
2x faster than __builtin_popcount().)
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Ben Pfaff [Mon, 18 Nov 2013 19:30:38 +0000 (11:30 -0800)]
util: Make raw_ctz() accept 64-bit integers.
Having a single function that can do raw_ctz() on any integer type is
easier for callers to get right, and there is no real downside in the
implementation.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Simon Horman [Fri, 15 Nov 2013 09:10:18 +0000 (18:10 +0900)]
ofp-actions: Consider L4 actions after mpls_push as inconsistent
After an mpls_push action the resulting packet is MPLS and
the MPLS payload is opaque. Thus nothing can be assumed
about the packet's network protocol and it is inconsistent
to apply L4 actions.
With regard to actions that affect the packet at other layers
of the protocol stack:
* L3: The consistency of L3 actions should already be handled correctly
by virtue of the dl_type of the flow being temporarily altered
during consistency checking by both push_mpls and pop_mpls actions.
* MPLS: The consistency checking of MPLS actions appears to already be
handled correctly.
* VLAN: At this time, on mpls_push Open vSwitch always adds the MPLS LSE
after any VLAN tags that follow the Ethernet header.
That is the tag ordering defined prior to OpenFlow 1.3. As such,
VLAN actions should still be equally valid before and after mpls_push
and mpls_pop actions.
* L2 actions are equally valid before and after mpls_push and mpls_pop actions.
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 15 Nov 2013 16:54:56 +0000 (08:54 -0800)]
util: New function ovs_scan().
This new function is essentially an implementation of sscanf() with
slightly different behavior (see the comment) that is more convenient for
Open vSwitch internal use. Also, this implementation ought to work out of
the box on Windows, which has a defective sscanf() that lacks the 'hh'
modifier required to scan into a char variable.
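For context on why the 'hh' modifier matters, here is a sketch that uses the standard C99 sscanf() (not ovs_scan(), whose exact interface is not shown in this log) to scan an Ethernet address directly into byte-sized variables. The function name is hypothetical. On a C library without 'hh' support, this kind of call is what fails.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Parses "xx:xx:xx:xx:xx:xx" into six bytes using the 'hh' length
 * modifier, which tells sscanf() that each target is a char-sized
 * object.  The trailing %n records how much input was consumed so that
 * trailing garbage can be rejected; %n does not count toward the return
 * value. */
static bool
parse_eth_addr_example(const char *s, uint8_t ea[6])
{
    int n = 0;

    return sscanf(s, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx%n",
                  &ea[0], &ea[1], &ea[2], &ea[3], &ea[4], &ea[5], &n) == 6
           && s[n] == '\0';
}
```

Without 'hh', each byte would have to be scanned into an unsigned int and then copied, which is the inconvenience the new function is meant to avoid.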
Ben Pfaff [Sat, 9 Nov 2013 23:21:12 +0000 (15:21 -0800)]
bitmap: New macro BITMAP_N_LONGS for use in constant expressions.
An upcoming commit will declare a bitmap on the stack, rather than heap
allocating it, which means that it is not possible to use a function call
in the declaration.
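A macro of this kind can be sketched as below. The EXAMPLE_ names are hypothetical stand-ins, and the definition shown is an assumption about how such a macro is typically written, not a copy of the Open vSwitch one. The point is that the array length is an integer constant expression, so the bitmap can be an ordinary fixed-size stack array.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in one bitmap word. */
#define EXAMPLE_ULONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold N_BITS bits, rounded up.
 * Being a macro over sizeof and constants, it is usable in array
 * declarations, unlike a function call. */
#define EXAMPLE_BITMAP_N_LONGS(N_BITS) \
    (((N_BITS) + EXAMPLE_ULONG_BITS - 1) / EXAMPLE_ULONG_BITS)

enum { EXAMPLE_N_BITS = 200 };

static void
example_bitmap_set1(unsigned long *bitmap, size_t offset)
{
    bitmap[offset / EXAMPLE_ULONG_BITS]
        |= 1UL << (offset % EXAMPLE_ULONG_BITS);
}

static int
example_bitmap_is_set(const unsigned long *bitmap, size_t offset)
{
    return (bitmap[offset / EXAMPLE_ULONG_BITS]
            >> (offset % EXAMPLE_ULONG_BITS)) & 1;
}

/* Declares a 200-bit bitmap on the stack, sets the first and last bits,
 * and returns nonzero if exactly the expected bits read back as set. */
static int
stack_bitmap_demo(void)
{
    unsigned long seen[EXAMPLE_BITMAP_N_LONGS(EXAMPLE_N_BITS)] = { 0 };

    example_bitmap_set1(seen, 0);
    example_bitmap_set1(seen, EXAMPLE_N_BITS - 1);
    return example_bitmap_is_set(seen, 0)
           && example_bitmap_is_set(seen, EXAMPLE_N_BITS - 1)
           && !example_bitmap_is_set(seen, 100);
}
```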
Simon Horman [Thu, 14 Nov 2013 02:19:07 +0000 (11:19 +0900)]
OPENFLOW-1.1+: Update MPLS items
* MPLS BoS match is supported as much as other MPLS matches.
That is, all that is missing is the kernel datapath side
which is part of my pending patchset.
* The rework of tag order does require work but it is
covered by my pending patchset.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Wed, 13 Nov 2013 19:19:56 +0000 (11:19 -0800)]
FAQ, NEWS: Notes on TCP flags matching.
Add a FAQ category "Performance Problems". So far the only entry
addresses the issue with using a new kernel module with an older
(pre-megaflows) userspace.
Simon Horman [Wed, 13 Nov 2013 06:13:42 +0000 (15:13 +0900)]
ovs-ofctl: Document masked versions of arp_sha and arp_tha matches
Document masked versions of arp_sha and arp_tha matches.
Also update documentation of unmasked versions of these
matches to include an example address as is the case with
the documentation of dl_src and dl_dst.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Ravi Kondamuru [Mon, 4 Nov 2013 23:07:10 +0000 (23:07 +0000)]
bond: Use active-backup mode on LACP failure.
Commit bdebeece5 (lacp: Require successful LACP negotiations when
configured.) makes successful LACP negotiation mandatory for the
bond to come UP. This patch provides a configuration option to
bring up the bond by falling back to active-backup mode on LACP
negotiation failure.
Several of the physical switches that support LACP block all traffic
for ports that are configured to use LACP, until LACP is negotiated
with the host. When configuring an LACP bond on an OVS host
(e.g. XenServer), this means that there will be an interruption of the
network connectivity between the time the ports on the physical
switch and the bond on the OVS host are configured. The interruption
may be relatively long, if different people are responsible for
managing the switches and the OVS host.
Such a network connectivity failure can be avoided if LACP can be
configured on the OVS host before configuring the physical switch,
with the OVS host falling back to a bond mode (active-backup) until
the physical switch LACP configuration is complete. An option
"lacp-fallback-ab" is introduced with this patch to provide such
behavior in Open vSwitch.
Signed-off-by: Ravi Kondamuru <Ravi.Kondamuru@citrix.com>
Signed-off-by: Dominic Curran <Dominic.Curran@citrix.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Mon, 7 Oct 2013 21:26:28 +0000 (14:26 -0700)]
odp-util: Fix IPFIX breakage with old kernel modules.
Before commit e995e3df57ea (Allow OVS_USERSPACE_ATTR_USERDATA to be
variable length.) userdata attributes in userspace actions were expected
to be exactly 64 bits long. The kernel only actually enforced that they
were at least 64 bits long (the previously referenced commit's log message
contains misinformation on this account).
Initially this was no problem, because all of the userdata that userspace
actually used was exactly 8 bytes long. Commit 29089a540c (Implement IPFIX
export), however, exposed a problem by reducing the length of userdata for
IPFIX support to just 4 bytes. This meant that IPFIX no longer worked on
older datapaths, because the userdata was no longer at least 8 bytes long.
This commit fixes the problem by padding out userdata attributes less than
8 bytes long to 8 bytes.
CC: Romain Lenglet <rlenglet@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Romain Lenglet <rlenglet at vmware.com>
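The padding fix described in the commit above amounts to something like the following sketch. The function and macro names are hypothetical, and the real code builds Netlink attributes rather than copying into a plain buffer; only the "pad short userdata to 8 bytes" rule comes from the commit message.

```c
#include <stddef.h>
#include <string.h>

/* Older kernel modules reject userdata shorter than 8 bytes. */
#define EXAMPLE_MIN_USERDATA_LEN 8

/* Copies 'len' bytes of userdata from 'src' to 'dst', zero-padding the
 * result to at least EXAMPLE_MIN_USERDATA_LEN bytes.  'dst' must have
 * room for the larger of 'len' and EXAMPLE_MIN_USERDATA_LEN bytes.
 * Returns the number of bytes written. */
static size_t
pad_userdata_example(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    if (len < EXAMPLE_MIN_USERDATA_LEN) {
        memset((char *) dst + len, 0, EXAMPLE_MIN_USERDATA_LEN - len);
        return EXAMPLE_MIN_USERDATA_LEN;
    }
    return len;
}
```

A 4-byte IPFIX userdata therefore goes out as 8 bytes, which satisfies the "at least 64 bits" check in older datapaths, while longer userdata is passed through unchanged.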
Ben Pfaff [Mon, 6 May 2013 17:55:06 +0000 (10:55 -0700)]
ofp-actions: Switch away from odd use of "q" in "enqueue" action format.
The formatting of the "enqueue" action uses a "q" to separate the port
number from the queue number, as in "enqueue:123q456". This is different
from every other action. This commit improves the situation by:
* Switching the formatting to use a colon (e.g. "enqueue:123:456"),
which is a little less odd-looking but still accepted by older
versions of Open vSwitch.
* Improving the parser to accept "enqueue(123,456)" also.
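The three spellings mentioned in the commit above could be accepted by a parser along these lines. This is only a sketch built on standard sscanf(); the function name is hypothetical and the real Open vSwitch parser is structured differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Accepts "enqueue:PORT:QUEUE", the older "enqueue:PORTqQUEUE", and
 * "enqueue(PORT,QUEUE)".  Each format ends in %n so that the whole
 * string, including any closing parenthesis, must match: if matching
 * stops early, 'n' stays 0 and the candidate is rejected. */
static bool
parse_enqueue_example(const char *s, unsigned int *port,
                      unsigned int *queue)
{
    static const char *const formats[] = {
        "enqueue:%u:%u%n",
        "enqueue:%uq%u%n",
        "enqueue(%u,%u)%n",
    };
    size_t i;

    for (i = 0; i < sizeof formats / sizeof formats[0]; i++) {
        int n = 0;

        if (sscanf(s, formats[i], port, queue, &n) == 2
            && n > 0 && s[n] == '\0') {
            return true;
        }
    }
    return false;
}
```

All three inputs "enqueue:123:456", "enqueue:123q456", and "enqueue(123,456)" produce port 123 and queue 456 in this sketch.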