debian: Don't recreate bridges during manual restart.
Open vSwitch bridges and ports can be configured through
the /etc/network/interfaces script. During system startup,
Open vSwitch startup script reads the interfaces file
and creates the bridges and ports. During system shutdown,
the bridges and ports are removed.
The same behavior also can occur with a manual 'restart' of
Open vswitch (ex: service openvswitch-switch restart).
This behavior has come across as undesirable in some cases.
ex: When some one manually creates interfaces through ovs-vsctl
and then restarts Open vSwitch, that interface is lost.
This commit changes the startup script such that, interfaces
are created and deleted through the startup script only when
RUNLEVEL environment variable is set. This behavior will be
consistent with the OVS RHEL ifcfg-* scripts too.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Romain Lenglet [Wed, 20 Nov 2013 18:57:53 +0000 (10:57 -0800)]
ipfix: allow empty targets column in table IPFIX
The "targets" column in IPFIX had a min=1 constraints, so OVSDB
implicitly adds an empty string "" into that column if no value is
given. No connection can be opened to a target with address "", so
the whole IPFIX exporter for that row was disabled until that ""
target was removed by users. That behavior is correct but proved to
be unintuitive to users.
This patch removes the min=1 constraint, to avoid the trouble for
users who insert IPFIX rows with no targets: it eliminates the log
messages due to failed connections to target "", and eliminates the
need to manually remove the "" target after row insertion.
This doesn't impact the behavior for any existing row, whether it has
a "" target or not.
Signed-off-by: Romain Lenglet <rlenglet@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Wed, 20 Nov 2013 19:38:29 +0000 (11:38 -0800)]
bfd: Add new key "flap_count" to "bfd_status".
This commit adds a new key "flap_count" to "bfd_status" to count
the number of bfd "forwarding" flag flaps. A flap is considered
as a change of the "forwarding" flag value.
Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Wed, 20 Nov 2013 19:15:54 +0000 (11:15 -0800)]
bfd: Add forwarding flag to struct bfd.
This commit adds a forwarding flag to "struct bfd". This flag
is for indicating the interface's capability of packet I/O.
Also, this flag makes it possible to count the number of interface
state flapping. bfd_forwarding__() will update this flag at
each invocation.
Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Fri, 15 Nov 2013 23:25:00 +0000 (15:25 -0800)]
coverage: Synchronize per-thread counters less aggressively
When profiling CPU usage in situations involving high numbers of ports,
coverage_clear() was highlighted as a commonly called function. It
appears that it can be quite expensive to access all of the per-thread
coverage counters when threads are constantly waking up.
This patch makes each thread only do coverage_clear() logic roughly once
per second by introducing per-thread timers. Upcall handler counters may
become less accurate, as these threads may sleep without synchronising
and not wake up for some time. When the main thread is under load at
~90% CPU, this drops to ~85%. Upcall handler threads sitting at ~2.5%
drop to ~1.5%.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 15 Nov 2013 22:19:57 +0000 (14:19 -0800)]
ofp-actions: Make ofpacts_check() report consistency for all protocols.
Until now ofpacts_check() has been told either to enforce consistency or
not, but that means that the caller has to know exactly what protocol is
going to be in use (because some protocols require consistency to be
enforced and others don't). This commit changes ofpacts_check() to just
rule out protocols that require enforcement when it detects
inconsistencies.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
cccl: Provide '-' before the options instead of double slash.
While compiling Open vSwitch with visual c++ in
a mingw environment, I have observed that "//" before
options does not work for all the compiler options of
MSVC. Using "-" on the other hand seems to work.
Also, echo the command line options passed to
the MSVC compiler.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
One option to compile Open vSwitch code in windows
is to use Visual c++ compiler.
From http://cccl.sourceforge.net/ :
"cccl is a wrapper around Microsoft Visual C++'s cl.exe
and link.exe. It converts Unix compiler parameters
into parameters understood by cl and link. cccl's main
use is for using Unix build processes with Microsoft
compilers. Using cccl in conjunction with ports of Unix
utilities, it is possible to build many Unix packages
using MSVC, without modifying the build process."
There are couple of forks of the project in the internet.
This particular piece is copied from:
https://gitorious.org/swift/swift/source/\ cf9b391b40a9c59a620c8093d438370381949c60:autoconf/cccl
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Wed, 20 Nov 2013 01:31:29 +0000 (17:31 -0800)]
Classifier: Staged subtable matching.
Subtable lookup is performed in ranges defined for struct flow,
starting from metadata (registers, in_port, etc.), then L2 header, L3,
and finally L4 ports. Whenever it is found that there are no matches
in the current subtable, the rest of the subtable can be skipped. The
rationale of this logic is that as many fields as possible can remain
wildcarded.
Ben Pfaff [Tue, 19 Nov 2013 19:02:08 +0000 (11:02 -0800)]
utilities: Regenerate ovs-lib if configuration changes.
Otherwise running "configure" twice with different --prefix (etc.) will
fail to update ovs-lib, so that "make install" installs an ovs-lib with
the wrong paths.
Reported-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Wed, 13 Nov 2013 17:50:54 +0000 (09:50 -0800)]
datapath: Shrink sw_flow_mask by 8 bytes (64-bit) or 4 bytes (32-bit).
We won't normally have a ton of flow masks but using a size_t to store
values no bigger than sizeof(struct sw_flow_key) seems excessive.
This reduces sw_flow_key_range and sw_flow_mask by 4 bytes on 32-bit
systems. On 64-bit systems it shrinks sw_flow_key_range by 12 bytes but
sw_flow_mask only by 8 bytes due to padding.
Compile tested only.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Mon, 18 Nov 2013 21:18:41 +0000 (13:18 -0800)]
util: Make popcount() handle 64-bit integers, not separate popcount64().
Having a single function that can do popcount() on any integer type is
easier for callers to get right. The implementation is probably slower
if the caller actually provides a 32-bit (or shorter) integer, but the
only existing callers always provide a full 64-bit integer so this seems
unimportant for now.
This also restores use, in practice, of the optimized implementation of
population count. (As the comment on popcount32() says, this version is
2x faster than __builtin_popcount().)
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Ben Pfaff [Mon, 18 Nov 2013 19:30:38 +0000 (11:30 -0800)]
util: Make raw_ctz() accept 64-bit integers.
Having a single function that can do raw_ctz() on any integer type is
easier for callers to get right, and there is no real downside in the
implementation.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Simon Horman [Fri, 15 Nov 2013 09:10:18 +0000 (18:10 +0900)]
ofp-actions: Consider L4 actions after mpls_push as inconsistent
After an mpls_push action the resulting packet is MPLS and
the MPLS payload is opaque. Thus nothing can be assumed
about the packets network protocol and it is inconsistent
to apply L4 actions.
With regards to actions that affect the packet at other layers
of the protocol stack:
* L3: The consistency of L3 actions should already be handled correctly
by virtue of the dl_type of the flow being temporarily altered
during consistency checking by both push_mpls and pop_mpls actions.
* MPLS: The consistency checking of MPLS actions appear to already be
handled correctly.
* VLAN: At this time Open vSwitch on mpls_push an MPLS LSE is always
added after any VLAN tags that follow the ethernet header.
That is the tag ordering defined prior to OpenFlow1.3. As such
VLAN actions should sill be equally valid before and after mpls_push
and mpls_pop actions.
* L2 actions are equally valid before and after mpls_push and mpls_pop actions.
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 15 Nov 2013 16:54:56 +0000 (08:54 -0800)]
util: New function ovs_scan().
This new function is essentially an implementation of sscanf() with
slightly different behavior (see the comment) that is more convenient for
Open vSwitch internal use. Also, this implementation ought to work out of
the box on Windows, which has a defective sscanf() that lacks the 'hh'
modifier required to scan into a char variable.
Ben Pfaff [Sat, 9 Nov 2013 23:21:12 +0000 (15:21 -0800)]
bitmap: New macro BITMAP_N_LONGS for use in constant expressions.
An upcoming commit will declare a bitmap on the stack, rather than heap
allocating it, which means that it is not possible to use a function call
in the declaration.
Simon Horman [Thu, 14 Nov 2013 02:19:07 +0000 (11:19 +0900)]
OPENFLOW-1.1+: Update MPLS items
* MPLS BoS match is supported as much as other MPLS matches.
That is, all that is missing is the kernel datapath side
which is part my pending patchset.
* The rework of tag order does require work but it is
covered by my pending patchset.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Wed, 13 Nov 2013 19:19:56 +0000 (11:19 -0800)]
FAQ, NEWS: Notes on TCP flags matching.
Add a FAQ categorry "Performance Problems". So far the only entry
addresses the issue with using a new kernel module with an older
(pre-megaflows) userspace.
Simon Horman [Wed, 13 Nov 2013 06:13:42 +0000 (15:13 +0900)]
ovs-ofctl: Document masked versions of arp_sha and arp_tha matches
Document masked versions of arp_sha and arp_tha matches.
Also update documentation of unmasked versions of these
matches to include an example address as is the case with
the documentation of dl_src and dl_dst.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ravi Kondamuru [Mon, 4 Nov 2013 23:07:10 +0000 (23:07 +0000)]
bond: Use active-backup mode on LACP failure.
Commit bdebeece5 (lacp: Require successful LACP negotiations when
configured.) makes successful LACP negotiation mandatory for the
bond to come UP. This patch provides a configuration option to
bring up the bond by falling back to active-backup mode on LACP
negotiation failure.
Several of the physical switches that support LACP block all traffic
for ports that are configured to use LACP, until LACP is negotiated
with the host. When configuring a LACP bond on a OVS host
(eg: XenServer), this means that there will be an interruption of the
network connectivity between the time the ports on the physical
switch and the bond on the OVS host are configured. The interruption
may be relatively long, if different people are responsible for
managing the switches and the OVS host.
Such network connectivity failure can be avoided if LACP can be
configured on the OVS host before configuring the physical switch,
and having the OVS host fall back to a bond mode (active-backup) till
the physical switch LACP configuration is complete. An option
"lacp-fallback-ab" is introduced with this patch to provide such
behavior on openvswitch.
Signed-off-by: Ravi Kondamuru <Ravi.Kondamuru@citrix.com> Signed-off-by: Dominic Curran <Dominic.Curran@citrix.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Mon, 7 Oct 2013 21:26:28 +0000 (14:26 -0700)]
odp-util: Fix IPFIX breakage with old kernel modules.
Before commit e995e3df57ea (Allow OVS_USERSPACE_ATTR_USERDATA to be
variable length.) userdata attributes in userspace actions were expected
to be exactly 64 bits long. The kernel only actually enforced that they
were at least 64 bits long (the previously referenced commit's log message
contains misinformation on this account).
Initially this was no problem, because all of the userdata that userspace
actually used was exactly 8 bytes long. Commit 29089a540c (Implement IPFIX
export), however, exposed a problem by reducing the length of userdata for
IPFIX support to just 4 bytes. This meant that IPFIX no longer worked on
older datapaths, because the userdata was no longer at least 8 bytes long.
This commit fixes the problem by padding out userdata attributes less than
8 bytes long to 8 bytes.
CC: Romain Lenglet <rlenglet@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Romain Lenglet <rlenglet at vmware.com>
Ben Pfaff [Mon, 6 May 2013 17:55:06 +0000 (10:55 -0700)]
ofp-actions: Switch away from odd use of "q" in "enqueue" action format.
The formatting of the "enqueue" action uses a "q" to separate the port
number from the queue number, as in "enqueue:123q456". This is different
from every other action. This commit improves the situation by:
* Switching the formatting to use a colon (e.g. "enqueue:123:456"),
which is a little less odd-looking but still accepted by older
versions of Open vSwitch.
* Improving the parser to accept "enqueue(123,456)" also.
Ben Pfaff [Tue, 19 Mar 2013 20:30:33 +0000 (13:30 -0700)]
ofproto: Check ofproto_port_query_by_name() return value when adding port.
Otherwise, if the port add succeeds but the query that looks up the port
number fails, then ofproto_port_add() would return zero as the OpenFlow
port number and ignore the error.
Reported-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Wed, 6 Nov 2013 18:36:17 +0000 (10:36 -0800)]
cfm: Count flaps when logging is disabled
Previously, the flap count logic for CFM was dependent on the state of
the logging facility. This patch ensures flaps are always counted when
there is a transition between non-fault and fault (and vice versa).
Found by inspection.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Fri, 8 Nov 2013 17:41:19 +0000 (09:41 -0800)]
netdev: Make naming more consistent
netdev-dummy and netdev-vport use the function name netdev_poll_notify()
for the same purpose as netdev-linux/bsd's netdev_*_changed(). This patch
changes the former two to be more consistent with the linux/bsd naming
scheme.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 7 Nov 2013 00:12:34 +0000 (16:12 -0800)]
ofproto: Limit OVS-assigned port numbers to 32767 and below.
A couple of controller vendors have mentioned to me that they would like to
have some part of the OpenFlow port number space reserved for the
controller to use. This commit reserves 32768 and up (roughly the upper
half of the OF1.0 port range) to the controller.
Bug #18753. Signed-off-by: Ben Pfaff <blp@nicira.com>
ofproto-dpif: Disassociate datapath max_ports with openflow port numbers.
With single datapath, multiple userspace bridges share the same datapath.
As such it does not look beneficial that we decide a valid open flow port
number based on the number of ports in the datapath specially now that
we have the ofport_request column in OVSDB.
This commit does not remove ofproto_init_max_ports() interface as defined
in ofproto-provider.h as there may be other implementations that still use it.
But ofproto-dpif should not need it.
Bug #20163. Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
'struct dp_netdev_flow' is currently being instantiated as 'flow'.
An upcoming commit introduces a classifier to dpif-netdev
which uses 'struct flow' at a few places and that can cause
confusion while reading code.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 11 Oct 2013 23:24:41 +0000 (16:24 -0700)]
ovs-controller: Rename test-controller and do not install or package.
Too many users have incorrectly assumed that ovs-controller is a necessary
or desirable part of an Open vSwitch deployment. This commit should fix
the problem by renaming it test-controller and removing it from the
default install and from packaging.
Joe Stringer [Fri, 1 Nov 2013 23:34:29 +0000 (16:34 -0700)]
netdev-linux: Skip miimon execution when disabled
When dealing with a large number of ports, one of the performance
bottlenecks is that we loop through all netdevs in the main loop. Miimon
is a contributor to this, checking all devices even if it has never been
enabled.
This patch introduces a counter for the number of netdevs with miimon
configured. If this is 0, then we skip miimon_run() and miimon_wait().
In a test environment of 5000 internal ports and 50 tunnel ports with
bfd, this reduces CPU usage from about 50% to about 45%.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 30 Oct 2013 09:17:20 +0000 (18:17 +0900)]
ofproto-dpif: Support weight for select groups
Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com adjusted this for highest random weight scoring and
updated the test] Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 30 Oct 2013 09:17:19 +0000 (18:17 +0900)]
ofproto-dpif: Implement translation of select groups.
Select bucket from those that are alive based on a hash of the destination
ethernet address of the packet.
Support for weights is proposed by a subsequent patch.
The selection is based on a hash of the destination ethernet
address of the flow. It should be possible to extend
this to cover a hash of user-specified elements of the flow.
Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com replaced bucket selection by "highest random weight"
method] Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 30 Oct 2013 09:17:15 +0000 (18:17 +0900)]
ofproto: Add group desc and stats tests
Lightly exercise group desc and stats
Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com found that a new test segfaulted and folded in fixes] Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 30 Oct 2013 09:17:14 +0000 (18:17 +0900)]
ofproto-dpif: Translation of indirect and all groups
Allow translation of indirect and all groups. Also allow insertion of
indirect and all groups by changing the maximum permitted number in the
groups table from 0 to OFPG_MAX.
Implementation note:
After translating the actions for each bucket ctx->flow is reset to its
state prior to translation of the buckets actions. This is equivalent to
cloning the bucket before applying actions. This is my interpretation of the
OpenFlow 1.3.2 specification section 5.6.1 Group Types, which includes the
following text. I believe there is room for other interpretations.
* On all groups: "The packet is effectively cloned for each bucket; one
packet is processed for each bucket of the group."
* On indirect groups: "This group type is effectively identical to an
all group with one bucket."
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 30 Oct 2013 09:17:13 +0000 (18:17 +0900)]
ofproto: Break out resubmit resource checking
Break out resubmit resource checking into a helper function
xlate_resubmit_resource_check() and use this new function.
This is to allow the check to be re-used by a subsequent patch.
Suggested-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Sat, 2 Nov 2013 18:36:50 +0000 (11:36 -0700)]
nicira-ext: Update comment.
The ARP headers have been acceptable as NXAST_REG_MOVE destinations since
commit f6c8a6b163 (Add software switch support for modifying ARP headers
in OpenFlow.)
Reported-by: Anupam Chanda <achanda@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Joe Stringer <joestringer@nicira.com>
Ben Pfaff [Sat, 2 Nov 2013 15:43:14 +0000 (08:43 -0700)]
ofproto-dpif-xlate: Handle oversized actions more gracefully.
If the datapath actions exceed the maximum size of a Netlink attribute
(about 64 kB), then previously we would assert-fail (before commit 542024c4c3d36 "ofproto-dpif-xlate: Suppress oversize datapath actions.")
or just drop all of them (after that commit). This commit makes OVS cope
by slow-pathing the flow and executing all of its actions in userspace.
Ben Pfaff [Mon, 30 Sep 2013 21:46:50 +0000 (14:46 -0700)]
flow: Fill in ->l7 in flow_compose().
flow_extract() fills in ->l7 but flow_compose() wasn't doing it, which
confused bfd_process_packet() when invoked via the ofproto/trace appctl
command.