Jesse Gross [Mon, 24 Jun 2013 22:02:18 +0000 (15:02 -0700)]
datapath: Make GRE support conditional on CONFIG_NET_IPGRE_DEMUX.
Now that GRE support has been upstreamed into Linux, OVS is
using the components in the native kernel when available. However,
this means that it is now dependent on the appropriate kernel
config, which is CONFIG_NET_IPGRE_DEMUX on 2.6.37 and later.
Reported-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Justin Pettit [Thu, 27 Jun 2013 20:42:14 +0000 (13:42 -0700)]
datapath: Convert IPv6 TCP and UDP port netlink attributes properly.
The code that converts netlink attributes to a flow match always
stored TCP and UDP ports in the IPv4 structure. This commit
properly puts TCP and UDP traffic into appropriate IPv4 and IPv6
structures.
Swap places of OFPRR_METER_DELETE and OFPRR_EVICTION in enumeration to be
compatible with OpenFlow 1.4.
Prior to OpenFlow 1.4 OFPRR_EVICTION was a Nicira specific flow removal reason
code. OpenFlow 1.3 added support for meters, which require dependent flow
removal when meters are deleted. The reason code for this is also added in
OpenFlow 1.4, but OFPRR_METER_DELETE now has the value OVS previously had for
OFPRR_EVICTION.
Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
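For reference, the reason-code ordering after the swap described above can be sketched as an enum. The numeric values follow the OpenFlow 1.4 numbering; the helper function is a hypothetical illustration, not an OVS API.

```c
#include <assert.h>
#include <stddef.h>

/* Flow-removed reason codes, numbered as in OpenFlow 1.4. */
enum sketch_flow_removed_reason {
    OFPRR_IDLE_TIMEOUT = 0,
    OFPRR_HARD_TIMEOUT = 1,
    OFPRR_DELETE = 2,
    OFPRR_GROUP_DELETE = 3,
    OFPRR_METER_DELETE = 4,   /* the value OVS previously used for eviction */
    OFPRR_EVICTION = 5
};

/* Hypothetical helper: returns a printable name for 'reason'. */
static const char *
sketch_ofprr_to_string(enum sketch_flow_removed_reason reason)
{
    static const char *const names[] = {
        "idle_timeout", "hard_timeout", "delete",
        "group_delete", "meter_delete", "eviction"
    };

    assert((size_t) reason < sizeof names / sizeof names[0]);
    return names[reason];
}
```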
ovsdb-server: Make database name mandatory when specifying db paths.
Currently, if we have just one database, we can optionally skip the
database name when providing the DB path for certain options (ex:
--remote=db:[db,]table,column). But in case we have multiple databases,
it is mandatory.
With this commit, we make the database name mandatory. This provides
increased flexibility for an upcoming commit that provides the ability
to add and remove databases during run time.
Feature #14595. Acked-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
ovsdb-server: Store databases in shash instead of array.
An upcoming commit provides the ability to add and remove databases.
Having the databases in a shash instead of an array makes it easier
to add and remove databases.
Feature #14595. Acked-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
This adds support for specifying flow miss handling behaviour at
runtime, through a new "other-config" option in the Open_vSwitch table.
This takes precedence over flow-eviction-threshold.
By default, the behaviour is the same as before. If force-miss-model is
set to 'with-facets', then flow miss handling will always result in the
creation of new facets and flow-eviction-threshold will be ignored. If
force-miss-model is set to 'without-facets', then flow miss handling will never
result in the creation of new facets (effectively the same as setting the
flow-eviction-threshold to 0, which is not currently configurable).
We intend to use this configuration option in the testsuite to force
particular code paths to be used, allowing us to improve test coverage.
Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Ben Pfaff <blp@nicira.com>
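The precedence rule described above can be sketched as follows. This is an illustration only: the enum, the function names, and the simplified default rule (create facets while the flow count is below the threshold) are assumptions, not the ofproto-dpif implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum sketch_miss_model {
    SKETCH_MISS_AUTO,            /* option unset: behave as before */
    SKETCH_MISS_WITH_FACETS,
    SKETCH_MISS_WITHOUT_FACETS
};

/* Parses the force-miss-model value; NULL or unknown strings mean "auto". */
static enum sketch_miss_model
sketch_parse_miss_model(const char *value)
{
    if (value && !strcmp(value, "with-facets")) {
        return SKETCH_MISS_WITH_FACETS;
    } else if (value && !strcmp(value, "without-facets")) {
        return SKETCH_MISS_WITHOUT_FACETS;
    }
    return SKETCH_MISS_AUTO;
}

/* The forced model wins; the threshold is consulted only in auto mode.
 * Assumption: auto mode creates facets while n_flows < threshold. */
static bool
sketch_should_make_facet(enum sketch_miss_model model,
                         size_t n_flows, size_t threshold)
{
    switch (model) {
    case SKETCH_MISS_WITH_FACETS:
        return true;
    case SKETCH_MISS_WITHOUT_FACETS:
        return false;
    case SKETCH_MISS_AUTO:
        return n_flows < threshold;
    }
    assert(0);   /* not reached for valid enum values */
    return false;
}
```

With a threshold of 0, auto mode never creates facets in this sketch, which mirrors the "effectively the same as setting flow-eviction-threshold to 0" remark.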
Justin Pettit [Tue, 25 Jun 2013 23:40:50 +0000 (16:40 -0700)]
tunnel: Only un-wildcard the ECN bits for IP traffic.
With tunnels carrying IP packets, ECN bits are always inherited by
the encapsulating tunnel. However, it doesn't make sense to
unwildcard the inner packet's TOS fields if the packet is not IP.
Ben Pfaff [Tue, 25 Jun 2013 20:50:26 +0000 (13:50 -0700)]
ovs-thread: Add per-thread data support.
POSIX defines a portable pthread_key_t API for per-thread data. GCC and
C11 have two different forms of per-thread data that are generally faster
than the POSIX API, where they are available. This commit adds a
macro-based wrapper, DEFINE_PER_THREAD_DATA, that takes advantage of these
features where they are available and falls back to the POSIX API
otherwise.
The Clang compiler implements C11 thread_local in its <threads.h>.
This commit also adds a convenience wrapper for the POSIX API, via the
DEFINE_PER_THREAD_MALLOCED_DATA macro.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
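The selection strategy described above can be sketched roughly as below. This is not the DEFINE_PER_THREAD_DATA macro itself: the SKETCH_* names and the counter are hypothetical, and the sketch covers a single int rather than arbitrary types.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Prefer the C11 keyword, then the GCC extension, else use POSIX keys. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define SKETCH_THREAD_LOCAL _Thread_local
#elif defined(__GNUC__)
#define SKETCH_THREAD_LOCAL __thread
#endif

#ifdef SKETCH_THREAD_LOCAL
static SKETCH_THREAD_LOCAL int sketch_counter;

/* Returns the calling thread's own counter (zero-initialized). */
static int *
sketch_counter_get(void)
{
    return &sketch_counter;
}
#else
static pthread_key_t sketch_counter_key;
static pthread_once_t sketch_counter_once = PTHREAD_ONCE_INIT;

static void
sketch_counter_key_init(void)
{
    int error = pthread_key_create(&sketch_counter_key, free);
    assert(!error);
    (void) error;
}

/* Returns the calling thread's own counter, allocating it on first use. */
static int *
sketch_counter_get(void)
{
    int *p;

    pthread_once(&sketch_counter_once, sketch_counter_key_init);
    p = pthread_getspecific(sketch_counter_key);
    if (!p) {
        p = calloc(1, sizeof *p);
        assert(p != NULL);
        pthread_setspecific(sketch_counter_key, p);
    }
    return p;
}
#endif
```

Callers only ever use sketch_counter_get(), so the choice of mechanism stays hidden behind one accessor.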
Andy Zhou [Tue, 25 Jun 2013 16:21:16 +0000 (09:21 -0700)]
datapath: Make the OVS_ACTION_ATTR_USERSPACE action send the packet key
OVS_ACTION_ATTR_USERSPACE action was sending the key from the matching
flow. This works for exact match flows because flow keys are the
same as packet keys. However, it does not work with wildcarded flows as
the packet keys may be different than the flow keys. This patch uses
the packet keys carried in OVS_CB(skb) when calling output_userspace().
Bug #18163
Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
Ben Pfaff [Tue, 21 May 2013 00:14:27 +0000 (17:14 -0700)]
ofproto-dpif: Make "fdb/show" report OpenFlow port numbers.
Users are more likely to be able to reasonably interpret OpenFlow port
numbers than datapath port numbers.
This issue has existed since at least 2011 but only recently has it been
possible for OpenFlow and datapath port numbers to differ (except for the
"local" port).
Reported-by: Christopher Paggen <cpaggen@cisco.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Mon, 24 Jun 2013 20:18:46 +0000 (13:18 -0700)]
ofp-util: New function ofputil_port_to_string().
This function is more convenient than ofputil_format_port() when a "struct
ds" is not already in use. This commit converts one caller for which this
was already true, and the following commit will add another.
Jesse Gross [Mon, 24 Jun 2013 19:21:29 +0000 (12:21 -0700)]
datapath: Do not clear key in ovs_match_init()
When executing packets sent from userspace, the majority of the
flow information is extracted from the packet itself and a small
amount of metadata supplied by userspace is added. However, when
adding this metadata, the extracted flow information is currently
being cleared.
This causes a problem when executing actions, because elements of the key
are used when verifying some actions. For example, a dec_ttl action verifies
the proto of the flow. An example of a flow that fails as a result of this
problem is:
Ben Pfaff [Mon, 24 Jun 2013 19:25:48 +0000 (12:25 -0700)]
acinclude: Improve detection of not-understood compiler options with clang.
By default, clang warns about but does not fail on unknown -W options.
This made configure add the option to WARNING_FLAGS, which caused the
warning about not-understood warnings to be emitted for every file
compiled.
In combination with -Werror, clang does fail on unknown -W options. This
commit adds -Werror during configure's warning tests, which should cause
the not-understood warnings to be detected that way.
Reported-by: Ed Maste <emaste@freebsd.org> Tested-by: Ed Maste <emaste@freebsd.org> Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Fri, 21 Jun 2013 23:07:08 +0000 (16:07 -0700)]
datapath: Fix a kernel crash caused by corrupted mask list.
When a flow table is copied, the mask list from the old table
is not properly copied into the new table. The corrupted mask
list in the new table will lead to a kernel crash. This patch
fixes this bug.
Pravin B Shelar [Fri, 21 Jun 2013 00:11:43 +0000 (17:11 -0700)]
gre: Restructure tunneling.
This patch restructures the OVS tunneling and GRE vport
implementation to bring OVS tunneling more in sync with
upstream kernel tunneling. This simplifies the tunneling code,
since most of the protocol processing on send and receive is
pushed to kernel tunneling. For the external OVS module, the
code is moved to the kernel compatibility code.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Jesse Gross [Fri, 21 Jun 2013 00:08:09 +0000 (17:08 -0700)]
datapath: Use a single attribute array for parsing values and masks.
When parsing flow Netlink messages we currently have arrays to hold the
attribute pointers for both values and masks. This results in a large
stack, which some compilers warn about. It's not actually necessary
to have both arrays at the same time, so we can collapse this to a
single array.
Reported-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
Justin Pettit [Tue, 11 Jun 2013 01:09:53 +0000 (18:09 -0700)]
ofproto-dpif: Handle failed flow 'put's.
If a flow cannot be installed in the datapath, we should notice
this and not treat it as installed. This becomes an issue with
megaflows, since a batch of unique flows may come in that generate
a single new datapath megaflow that covers them. Since userspace
doesn't know whether the datapath supports megaflows, each unique
flow will get a separate flow entry (which overlap when masks are
applied) and all except the first will get rejected by a megaflow-
supporting datapath as duplicates.
Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
James Page [Thu, 20 Jun 2013 21:31:52 +0000 (22:31 +0100)]
tests: Tolerate init process pid != 1.
On Ubuntu Saucy based desktops, upstart runs with user sessions
enabled which means that the init process under which a daemon
might run is not always pid = 1.
Instead of checking for pid = 1, check to ensure that the parent
pid of the monitor is not the pid of the shell that started it.
Signed-off-by: James Page <james.page@ubuntu.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Wed, 19 Jun 2013 23:58:44 +0000 (16:58 -0700)]
Create specific types for ofp and odp port
Until now, datapath ports and openflow ports were both represented by
unsigned integers of various sizes. With implicit conversions, etc., it is
easy to mix them up and use one where the other is expected. This commit
creates two typedefs, ofp_port_t and odp_port_t. Both types
are marked with "__attribute__((bitwise))" so that sparse can be used to
detect any misuse.
Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
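A rough sketch of how such sparse-checked typedefs can look. The SKETCH_* macro names, the integer widths, and the port mapping are assumptions for illustration; `__CHECKER__` is the macro sparse defines while it runs.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __CHECKER__                       /* defined when sparse is running */
#define SKETCH_BITWISE __attribute__((bitwise))
#define SKETCH_FORCE   __attribute__((force))
#else                                    /* plain integers for the compiler */
#define SKETCH_BITWISE
#define SKETCH_FORCE
#endif

/* Assumed widths: 16-bit OpenFlow ports, 32-bit datapath ports. */
typedef uint16_t SKETCH_BITWISE ofp_port_t;
typedef uint32_t SKETCH_BITWISE odp_port_t;

/* Explicit conversions are the only sanctioned way across the boundary. */
static inline ofp_port_t
sketch_u16_to_ofp(uint16_t port)
{
    return (SKETCH_FORCE ofp_port_t) port;
}

static inline uint16_t
sketch_ofp_to_u16(ofp_port_t port)
{
    return (SKETCH_FORCE uint16_t) port;
}

static inline odp_port_t
sketch_u32_to_odp(uint32_t port)
{
    return (SKETCH_FORCE odp_port_t) port;
}

static inline uint32_t
sketch_odp_to_u32(odp_port_t port)
{
    return (SKETCH_FORCE uint32_t) port;
}

/* Hypothetical mapping: OpenFlow port n lives on datapath port 100 + n. */
static inline odp_port_t
sketch_ofp_to_odp(ofp_port_t ofp)
{
    uint16_t n = sketch_ofp_to_u16(ofp);

    assert(n < 1000);
    return sketch_u32_to_odp(100u + n);
}
```

Under sparse, passing an ofp_port_t where an odp_port_t is expected is flagged; under a normal compiler both types are ordinary integers.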
Andy Zhou [Wed, 19 Jun 2013 07:15:10 +0000 (07:15 +0000)]
ovs-dpctl: Add mega flow support
Add support for specifying and displaying mega flows. The ovs-dpctl tool
is mainly used as a debugging tool.
This patch also implements the low-level user space routines to send
and receive mega flow netlink messages. Those netlink support
routines are required for forthcoming user space mega flow patches.
Added a unit test to test parsing and display of mega flows.
Ethan contributed the ovs-dpctl mega flow output function.
Co-authored-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Mon, 17 Jun 2013 14:51:00 +0000 (07:51 -0700)]
datapath: Mega flow implementation
Add wildcarded flow support in kernel datapath.
Wildcarded flows can improve OVS flow setup performance by avoiding sending
matching new flows to the user space program. The exact performance boost
will largely depend on the wildcarded flow hit rate.
In case all new flows hit wildcarded flows, the flow setup rate is
within 5% of that of the Linux bridge module.
Pravin has made significant contributions to this patch, including API
cleanups and bug fixes.
Co-authored-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
[jesse: Additional documentation, fix memory leak, and improve validation.] Signed-off-by: Jesse Gross <jesse@nicira.com>
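The core of a wildcarded (mega) flow lookup is a masked comparison. The sketch below shows that comparison in isolation; the key layout and names are hypothetical and far simpler than the kernel's flow key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_KEY_WORDS 4

/* Hypothetical flow key, treated as an opaque array of words. */
struct sketch_key {
    uint32_t w[SKETCH_KEY_WORDS];
};

/* A wildcarded flow: 1-bits in 'mask' must match, 0-bits are ignored. */
struct sketch_megaflow {
    struct sketch_key key;
    struct sketch_key mask;
};

/* Returns true if 'pkt' agrees with the flow on every unmasked bit. */
static bool
sketch_megaflow_matches(const struct sketch_megaflow *flow,
                        const struct sketch_key *pkt)
{
    size_t i;

    assert(flow != NULL && pkt != NULL);
    for (i = 0; i < SKETCH_KEY_WORDS; i++) {
        if ((pkt->w[i] ^ flow->key.w[i]) & flow->mask.w[i]) {
            return false;
        }
    }
    return true;
}
```

Every packet that matches an installed wildcarded flow is handled without a trip to user space, which is where the setup-rate gain comes from.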
Ben Pfaff [Wed, 19 Jun 2013 04:01:33 +0000 (21:01 -0700)]
ovs-vsctl: Improve error message for "ovs-vsctl del-port <bridge>".
Previously, commands like this:
ovs-vsctl add-br br0
ovs-vsctl del-port br0
yielded an error message like:
no port named br0
which is confusing. This commit improves the error message to:
cannot delete port br0 because it is the local port for bridge br0
(deleting this port requires deleting the entire bridge)
Bug #17994. Reported-by: Reid Price <reid@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Justin Pettit [Wed, 19 Jun 2013 06:55:47 +0000 (23:55 -0700)]
ofproto-dpif: Tighten up megaflow wildcard handling.
A number of use-cases weren't handled properly when determining what can
be wildcarded for megaflows. This commit both catches additional fields
that cannot be wildcarded and loosens a few other cases.
Bug #17979
Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
ovs-xapi-sync: Cache the bridge-id value for non nicira-bridge-id too.
Currently we connect to xapi in case there are multiple
external_ids:xs-network-uuids to get the single bridge id every time
we have a change in the database for all the interested columns in
ovs-xapi-sync. The xs-network-uuids value can also change whenever
new VLANs are added or deleted, which is a common use case. The
disadvantage with this approach is that we query XAPI more often
and set the bridge-id as "" if we don't get a valid response for
our query. This can take down the logical connectivity for all the
VMs on that xenserver.
Instead of looking at the PIF records for all the xs-network-uuids,
we can instead just look at the xapi record which has the same bridge
name as the OVS bridge name and then cache its uuid. This value will
hold true till the OVS bridge is recreated in which case we will re-read
the value.
Ethan Jackson [Mon, 17 Jun 2013 21:04:36 +0000 (14:04 -0700)]
ofproto-dpif: Store patch port peer in struct ofport_dpif.
This removes ofproto-dpif-xlate's dependency on ofport_get_peer()
which, while cleaner in and of itself, will become more important
as ofproto-dpif-xlate modularizes.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Tue, 18 Jun 2013 16:48:14 +0000 (19:48 +0300)]
ofproto: Index flows by cookie.
The simplest way for an OpenFlow controller to refer to a (set of) flows
is by a controller-issued flow cookie. Make this fast by inserting flows
to a hash index, and use that when flows are queried, deleted, or modified
with a full cookie mask.
Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
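A minimal sketch of a cookie index, assuming a fixed-size chained hash table. All names, the bucket count, and the hash mixer are hypothetical; ofproto uses its own hash map type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_N_BUCKETS 64   /* must be a power of two */

struct sketch_rule {
    uint64_t cookie;
    struct sketch_rule *cookie_next;   /* next rule in the same bucket */
};

struct sketch_cookie_index {
    struct sketch_rule *buckets[SKETCH_N_BUCKETS];
};

static size_t
sketch_cookie_bucket(uint64_t cookie)
{
    cookie ^= cookie >> 33;            /* simple 64-bit mixing step */
    cookie *= 0xff51afd7ed558ccdULL;
    cookie ^= cookie >> 33;
    return (size_t) (cookie & (SKETCH_N_BUCKETS - 1));
}

static void
sketch_cookie_insert(struct sketch_cookie_index *idx, struct sketch_rule *rule)
{
    size_t b;

    assert(idx != NULL && rule != NULL);
    b = sketch_cookie_bucket(rule->cookie);
    rule->cookie_next = idx->buckets[b];
    idx->buckets[b] = rule;
}

/* Counts rules whose cookie matches under 'mask'.  A full mask visits
 * one bucket; any other mask falls back to scanning every bucket. */
static size_t
sketch_cookie_count(const struct sketch_cookie_index *idx,
                    uint64_t cookie, uint64_t mask)
{
    size_t first = 0, last = SKETCH_N_BUCKETS, n = 0, b;

    assert(idx != NULL);
    if (mask == UINT64_MAX) {
        first = sketch_cookie_bucket(cookie);
        last = first + 1;
    }
    for (b = first; b < last; b++) {
        const struct sketch_rule *r;

        for (r = idx->buckets[b]; r; r = r->cookie_next) {
            if (!((r->cookie ^ cookie) & mask)) {
                n++;
            }
        }
    }
    return n;
}
```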
Justin Pettit [Tue, 18 Jun 2013 01:07:33 +0000 (18:07 -0700)]
ofproto-dpif: Don't wildcard fields used in special processing.
A number of fields are looked at when determining whether special
processing (slow-path) is needed. This commit un-wildcards those
fields whenever they are consulted.
Reported-by: Ethan Jackson <ethan@nicira.com> Reported-by: Paul Ingram <paul@nicira.com> Signed-off-by: Justin Pettit <jpettit@nicira.com>
Justin Pettit [Tue, 18 Jun 2013 00:56:54 +0000 (17:56 -0700)]
ofproto-dpif: Move process_special() to ofproto-dpif-xlate.c.
The action translation functions are the only ones that need
process_special(). Move that function closer to the callers, since a
future commit will use more xlate-related knowledge in process_special.
Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Tue, 23 Apr 2013 21:52:41 +0000 (14:52 -0700)]
leak-checker: Remove because it cannot be made thread-safe.
The underlying glibc interface is deprecated because the interface itself
is not thread-safe. That means that there's no way for a layer on top of
it to be thread-safe.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Tue, 23 Apr 2013 20:41:32 +0000 (13:41 -0700)]
backtrace: Remove variant that does not support threads.
This variant was Linux-specific, GCC-specific, only worked on
architectures with frame pointers (possibly only on i386?), and isn't used
with glibc anyway. Remove it.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
ovs-xapi-sync: Retry getting bridge-ids in case xapi is not ready.
When there are multiple xs-network-uuids set for a bridge,
we query xapi to get the record that does not have a VLAN
associated with it. For cases when xapi does not respond,
retry again after a second.
During the times when xapi does not respond, set the value
as external_ids:bridge_id "".
Justin Pettit [Thu, 13 Jun 2013 23:46:33 +0000 (16:46 -0700)]
tunnel: Don't wildcard TTL and TOS in some circumstances.
For tunnels, we need to handle the facet's wildcards specially in a
couple of cases:
- Don't wildcard TTL for facets if "ttl" option is "inherit".
- Never wildcard the ECN bits, since they are always inherited.
- Don't wildcard the rest of the TOS field if the "tos" option is "inherit".
Murphy McCauley [Thu, 13 Jun 2013 21:41:21 +0000 (14:41 -0700)]
lib/netdev-linux.c: Prevent receiving of sent packets
Commit 796223f5 (netdev: Add new "struct netdev_rx" for capturing packets
from a netdev) refactored send and receive into separate netdevs. As a
result, send and receive now use different socket descriptors (except for tap
interfaces which are treated specially). An unintended side effect was that
all sent packets are looped back and received, which had previously been
avoided as the kernel specifically prevents this from happening on a single
socket descriptor.
To resolve the situation, a socket filter is added to the receive socket
so that it only accepts inbound packets.
Simon Horman co-discovered and initially reported this issue.
Signed-off-by: Murphy McCauley <murphy.mccauley@gmail.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Tested-by: Simon Horman <horms@verge.net.au> Reviewed-by: Simon Horman <horms@verge.net.au>
Ben Pfaff [Thu, 13 Jun 2013 19:25:39 +0000 (12:25 -0700)]
ovsdb-server: Preserve remotes across crash and restart.
Commit b421d2af0ab (ovsdb-server: Add commands for adding and removing
remotes) made it possible to make ovsdb-server connect to OVS managers only
after ovs-vswitchd has completed its initial configuration. But this
results in an undesirable effect: whenever ovsdb-server crashes, the
monitor restarts it, but ovsdb-server can no longer connect to the manager
because the remotes were added during runtime and that information is lost
during the crash.
Ethan Jackson [Wed, 12 Jun 2013 20:58:16 +0000 (13:58 -0700)]
mac-learning: Simplify mac_learning_changed().
With this patch, the mac_learning module takes responsibility for
remembering tags which need revalidation after a
mac_learning_changed() call. This removes one of
ofproto-dpif-xlate's dpif_backer uses.
Ben Pfaff [Wed, 12 Jun 2013 21:49:19 +0000 (14:49 -0700)]
ofp-print: Avoid returning static data.
Returning a static data buffer makes code more brittle and definitely
not thread-safe, so this commit switches to using a caller-provided
buffer instead.
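The pattern can be illustrated with a small formatter that writes into storage owned by the caller. The function name and output format are hypothetical, not the ofp-print interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats 'port' into the caller's buffer and returns that buffer.
 * No static storage is involved, so concurrent callers with separate
 * buffers cannot overwrite each other's results. */
static const char *
sketch_format_port(unsigned int port, char *buf, size_t bufsize)
{
    assert(buf != NULL && bufsize > 0);
    snprintf(buf, bufsize, "port%u", port);
    return buf;
}
```

With a static buffer, a second call would silently change the string returned by the first; here both results stay valid for as long as their buffers do.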
Ben Pfaff [Wed, 12 Jun 2013 21:37:18 +0000 (14:37 -0700)]
ofproto-dpif-xlate: Make code more readable via 'flow' and 'wc' locals.
'ctx->xin->flow' and 'ctx->xout->wc' are both pretty long. Where they get
in the way of code readability, this patch replaces them with 'flow' and
'wc' using local variables.
Also, replace an explicit comparison against IP and IPv6 Ethertypes by
a call to is_ip_any().
Co-authored-by: Jarno Rajahalme <jarno.rajahalme@nsn.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Jarno Rajahalme [Wed, 12 Jun 2013 21:33:17 +0000 (14:33 -0700)]
ofproto-dpif-xlate: Harmonize naming of internal functions.
It would be good to be able to harmonize the use of "xlate", "execute",
"compose", etc. "xlate" clearly relates to the use of the various
translation context structures, but the distinction between "execute" and
"compose" is not that clear, so these names could be going either way.
Choose to go with "compose", keeping with the older tradition.
Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Fri, 31 May 2013 11:35:11 +0000 (14:35 +0300)]
ofproto-dpif: Check for MPLS depth at the flow.
The earlier check on base_flow.mpls_depth seemed wrong, as multiple
MPLS push actions would have resulted in the flow.mpls_depth being
set to 1 each time.
Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Tue, 11 Jun 2013 20:32:30 +0000 (13:32 -0700)]
ofproto-dpif: Pull xlate_actions() into its own file.
Ideally, this patch would move xlate_actions() into its own module
with a clearly defined regular interface which is minimally
dependent on ofproto-dpif. While I've done this in a prototype,
moving large amounts of code into a new file while simultaneously
changing the logic and keeping up with changes to master has proved
nearly impossible.
This patch takes a different approach. It simply copies the logic
directly from ofproto-dpif with no changes. Once this is in,
future patches can begin breaking the ties between
ofproto-dpif-xlate and ofproto-dpif proper.
Justin Pettit [Wed, 12 Jun 2013 00:15:31 +0000 (17:15 -0700)]
ofproto-dpif: Never wildcard dl_type for "normal" action.
The is_gratuitous_arp() function is occasionally called when
processing the "normal" action. The previous code only disabled
wildcarding the dl_type field when the function was called, but
since it runs occasionally, it could lead to inconsistencies in the
facet table. This commit causes the dl_type to never be wildcarded
when the "normal" action is used.
Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ethan Jackson [Thu, 6 Jun 2013 00:15:35 +0000 (17:15 -0700)]
ofproto-dpif: Retire 'struct initial_vals'.
By detecting that a port is a vlan splinter realdev, we can force
xlate_actions() to emit the appropriate vlan push action. This
allows us to ditch struct initial_vals. It will not be missed.