Jesse Gross [Sun, 8 Feb 2015 06:40:52 +0000 (22:40 -0800)]
datapath: Account for "vxlan: Refactor vxlan driver to make use of the common UDP tunnel functions."
Upstream commit:
vxlan: Refactor vxlan driver to make use of the common UDP tunnel functions.
Simplify vxlan implementation using common UDP tunnel APIs.
Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Using the upstream functions where available also has the side effect
of ensuring that we can use hardware offloads. The GBP changes forced
the use of the OVS emulated GSO path on kernels that lack GBP. This
resulted in the loss of VXLAN offload on earlier kernels. This restores
the offload support (for both GBP and non-GBP VXLAN).
Upstream: acbf74a7 ("vxlan: Refactor vxlan driver to make use of the common UDP tunnel functions.") Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Jesse Gross [Wed, 18 Feb 2015 00:27:04 +0000 (16:27 -0800)]
datapath: Consistently set skb->inner_protocol for tunnels.
skb->inner_protocol is used by GSO and TSO for tunnels on new
kernels. Since we are setting up packets to be handled by the
kernel's GSO and not just our own, we need to initialize this
field properly.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Jesse Gross [Sat, 7 Feb 2015 03:25:09 +0000 (19:25 -0800)]
datapath: Account for "udp-tunnel: Add a few more UDP tunnel APIs"
Upstream commit:
udp-tunnel: Add a few more UDP tunnel APIs
Added a few more UDP tunnel APIs that can be shared by UDP based
tunnel protocol implementation. The main ones are highlighted below.
setup_udp_tunnel_sock() configures UDP listener socket for
receiving UDP encapsulated packets.
udp_tunnel_xmit_skb() and udp_tunnel6_xmit_skb() transmit skb
using UDP encapsulation.
udp_tunnel_sock_release() closes the UDP tunnel listener socket.
Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 6a93cc90 ("udp-tunnel: Add a few more UDP tunnel APIs") Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Jesse Gross [Wed, 18 Feb 2015 22:27:17 +0000 (14:27 -0800)]
datapath: Enable OVS GSO to be used up to 3.18 if necessary.
There are two important GSO tunnel features that were introduced
after the 3.12 cutoff for our current out of tree GSO implementation:
* 3.16 introduced support for outer UDP checksums.
* 3.18 introduced support for verifying hardware support for protocols
other than VXLAN.
In cases where these features are used, we should use OVS GSO to
ensure correct behavior. However, we also want to continue to use
kernel GSO or hardware TSO in existing situations. Therefore, this
extends the range of kernels where OVS GSO is available to 3.18 and
makes it easier to select which one to use.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Pravin B Shelar [Fri, 20 Feb 2015 22:59:47 +0000 (14:59 -0800)]
datapath: Fix net exit.
Open vSwitch allows moving an internal vport to a different namespace
while it is still connected to the bridge. But when the namespace is
deleted, OVS does not detach these vports. That leaves a dangling
pointer to the netdevice, which causes a kernel panic.
This issue is fixed by detaching all ovs ports from the deleted
namespace at net-exit.
Reported-by: Assaf Muller <amuller@redhat.com> Fixes: 46df7b81454 ("openvswitch: Add support for network namespaces.") Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 7b4577a9da ("openvswitch: Fix net exit"). Acked-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Fri, 20 Feb 2015 20:32:08 +0000 (12:32 -0800)]
userspace: Replace all uses of strncpy() by ovs_strlcpy().
strncpy() has a lot of pitfalls. A while back we replaced all its uses by
calls to ovs_strlcpy() or ovs_strzcpy(), but some more have crept in. This
commit fixes them.
Reported-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com>
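Two of the strncpy() pitfalls the commit alludes to are that it leaves the destination unterminated when the source does not fit, and that it zero-pads the rest of the buffer when it does. The following is a minimal sketch of a strlcpy()-style copy that avoids the first problem. The name sketch_strlcpy() is hypothetical and this is not the OVS implementation of ovs_strlcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a strlcpy()-style helper; not the OVS source.
 * Copies at most size - 1 bytes of 'src' into 'dst' and always
 * null-terminates when size > 0.  Returns strlen(src), so callers can
 * detect truncation by comparing the result against 'size'. */
size_t
sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

A return value greater than or equal to the buffer size tells the caller that the copy was truncated.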
Ben Pfaff [Fri, 20 Feb 2015 19:30:50 +0000 (11:30 -0800)]
socket-util: Use correct address family in set_dscp(), instead of guessing.
The set_dscp() function, until now, tried to set the DSCP as IPv4 and as
IPv6. This worked OK on Linux, where an ENOPROTOOPT error made it really
clear which one was wrong, but FreeBSD uses EINVAL instead, which has
multiple meanings and which it therefore seems somewhat risky to ignore.
Instead, this commit just tries to set the correct address family's DSCP
option.
Tested by Alex Wang on FreeBSD 9.3.
Reported-by: Atanu Ghosh <atanu@acm.org> Signed-off-by: Ben Pfaff <blp@nicira.com> Co-authored-by: Alex Wang <alexw@nicira.com> Signed-off-by: Alex Wang <alexw@nicira.com> Tested-by: Alex Wang <alexw@nicira.com>
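As a rough illustration of setting only the option that matches the socket's address family, the sketch below uses the standard IP_TOS and IPV6_TCLASS socket options. The function name and signature are assumptions made for this example; the actual set_dscp() in socket-util may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Illustrative only; the name and signature are assumptions, not the OVS
 * code.  Sets the DSCP on 'fd' using only the option that matches
 * 'family'.  Returns 0 on success or a positive errno value. */
int
sketch_set_dscp(int fd, int family, uint8_t dscp)
{
    int val;

    if (dscp > 63) {
        return EINVAL;              /* DSCP is a 6-bit field. */
    }
    val = dscp << 2;                /* DSCP is the upper 6 bits of the byte. */

    switch (family) {
    case AF_INET:
        return setsockopt(fd, IPPROTO_IP, IP_TOS, &val, sizeof val)
               ? errno : 0;
    case AF_INET6:
        return setsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, &val, sizeof val)
               ? errno : 0;
    default:
        return EAFNOSUPPORT;
    }
}
```

Because only one option is attempted, no error code has to be interpreted as "wrong address family", which is the ambiguity the commit describes for EINVAL on FreeBSD.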
Ben Pfaff [Fri, 20 Feb 2015 16:44:48 +0000 (08:44 -0800)]
stream: Eliminate pstream_set_dscp().
This function is really of marginal utility. This commit drops it and
makes the existing callers instead open a new pstream with the desired
dscp.
The ulterior motive here is that the set_dscp() function that actually sets
the DSCP on a socket really wants to know the address family (AF_INET vs.
AF_INET6). We could plumb that down through the stream code, and that's
one reasonable option, but I thought that simply eliminating some calls
to set_dscp() where we don't already have the address family handy was
another reasonable way to go.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Alex Wang <alexw@nicira.com>
Russell Bryant [Fri, 20 Feb 2015 18:22:14 +0000 (13:22 -0500)]
timeval: Remove duplicate memset().
init_clock begins with a memset of 0 of the full clock struct. This
memset at the end of a single struct member just makes extra sure that
it's set to 0, which is unnecessary.
Signed-off-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Sat, 7 Feb 2015 03:24:32 +0000 (19:24 -0800)]
datapath: Account for "udp: Generic functions to set checksum"
Upstream commit:
udp: Generic functions to set checksum
Added udp_set_csum and udp6_set_csum functions to set UDP checksums
in packets. These are for simple UDP packets such as those that might
be created in UDP tunnels.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: af5fcba7 ("udp: Generic functions to set checksum") Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com>
Jesse Gross [Fri, 6 Feb 2015 23:42:04 +0000 (15:42 -0800)]
datapath: Account for "vxlan: Call udp_sock_create"
Upstream commit:
vxlan: Call udp_sock_create
In vxlan driver call common function udp_sock_create to create the
listener UDP port.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 3ee64f39 ("vxlan: Call udp_sock_create") Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com>
Jesse Gross [Fri, 6 Feb 2015 23:16:20 +0000 (15:16 -0800)]
datapath: Account for "udp: Add udp_sock_create for UDP tunnels to open listener socket"
Upstream commit:
udp: Add udp_sock_create for UDP tunnels to open listener socket
Added udp_tunnel.c which can contain some common functions for UDP
tunnels. The first function in this is udp_sock_create which is used
to open the listener port for a UDP tunnel.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 8024e028 ("udp: Add udp_sock_create for UDP tunnels to open listener socket") Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com>
Jesse Gross [Fri, 6 Feb 2015 23:19:38 +0000 (15:19 -0800)]
datapath: Remove compat vxlan_src_port().
vxlan_src_port() has been replaced with the more generic
udp_flow_src_port() upstream. We already have a backport for this and
it is used everywhere where this is needed, so we can remove the
dead vxlan_src_port() function.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com>
Andy Zhou [Thu, 19 Feb 2015 01:17:33 +0000 (17:17 -0800)]
ovs-sandbox: Add an option to allow running ovs-vswitchd under gdb
It is sometimes useful to leverage the sandbox facility to experiment
and explore the internals of ovs-vswitchd. Since GDB requires console
access for user input, this patch launches an xterm for GDB. The main
terminal continues to run the sub-shell as before. Exiting the sub-shell
will also kill the ovs-vswitchd under GDB (but not GDB itself currently).
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 19 Feb 2015 19:09:22 +0000 (11:09 -0800)]
ovsdb-doc: Add license and copyright notice.
The copyright dates are taken from "git log --follow ovsdb/ovsdb-doc",
considering only Nicira authors' changes. (Only one change was from
a non-Nicira author anyhow.)
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com>
Ben Pfaff [Thu, 19 Feb 2015 00:09:21 +0000 (16:09 -0800)]
vswitch: Document columns that had been previously overlooked.
A fair number of columns had been overlooked. This documents them.
The patch is smaller than it appears because this rearranges the STP and
RSTP documentation to group configuration, status, and statistics together
in the documentation for clarity.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Gurucharan Shetty <gshetty@nicira.com>
Russell Bryant [Thu, 19 Feb 2015 18:10:51 +0000 (13:10 -0500)]
ovsdb-idlc: Constify 'char **'.
Update the logic used in constify() to add const to a 'char **' while
still excluding all other cases of more than one level of indirection.
This results in adding const to a parameter of a generated setter
function where we're generally passing in an array of constant strings.
As a result, this patch includes the other necessary fixes to the code
base to reflect the const addition.
Signed-off-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
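To show why a 'const char **' parameter is convenient for callers, here is a small sketch. The setter name is hypothetical and only stands in for a generated setter; it is not actual ovsdb-idlc output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical setter in the style the commit describes.  Taking
 * 'const char **' lets callers pass an array of constant strings without a
 * cast. */
size_t
sketch_set_names(const char **names, size_t n_names)
{
    size_t total = 0;

    for (size_t i = 0; i < n_names; i++) {
        total += strlen(names[i]);  /* Strings are read, never modified. */
    }
    return total;
}

/* Example caller: an array of pointers to string literals. */
size_t
sketch_caller(void)
{
    const char *names[] = { "br0", "eth0", "vxlan0" };

    return sketch_set_names(names, sizeof names / sizeof names[0]);
}
```

With a plain 'char **' parameter, the call in sketch_caller() would need a cast or would draw a qualifier warning.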
Russell Bryant [Thu, 19 Feb 2015 18:03:53 +0000 (13:03 -0500)]
CodingStyle: recommend PEP 8 for Python code
Add a new section about Python code to the coding style document.
Suggest that all new Python code should adhere to the PEP 8 standard.
Also include a reference to tools that can quickly check code for
style issues.
Signed-off-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 19 Feb 2015 07:50:18 +0000 (23:50 -0800)]
AUTHORS: Add Madhu Venugopal.
Madhu reported a bug last year, mentioned in commit 639b6d9c9093
(ovsdb-server: Document RFC 7047 extensions to ovsdb <error>s.) but I
forgot to credit him in AUTHORS at the time.
Andy Zhou [Thu, 12 Feb 2015 23:10:28 +0000 (15:10 -0800)]
ofproto/bond: Fix a race condition in updating post recirculation rules
When updating post recirc rules, rule management requires calls to
hmap APIs, which require proper locking to ensure mutual exclusion in
accessing the hmap internal data structure. The locking is currently
missing from the output_normal() xlate path, thus causing
a race condition.
The race condition leads to a segfault crash of ovs-vswitchd.
The crash was found by adding and deleting bond interfaces repeatedly
with on-going traffic hitting the bond interfaces. The same test was
run over multiple days with this patch to ensure the same crash was
not seen.
The patch added the necessary lock annotation that would have caught
the bug.
Tested-by: Salvatore Cambria <salvatore.cambria@citrix.com> Reported-by: Salvatore Cambria <salvatore.cambria@citrix.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Patches that modify existing code can break expected behaviour.
Flag this by testing the patch with 'make check' prior to submission.
Furthermore, it is not sufficient to only test patches that add files
using 'make distcheck'; the compile flags for this target could change
the definition of some functions (ovs_assert, for example), altering
the outcome of some unit tests. Rather, it is preferable to use a
combination of 'make distcheck' with 'make check' to cover all bases.
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 13 Feb 2015 22:31:21 +0000 (14:31 -0800)]
ofp-parse: Correctly update bucket lists if they are empty.
Previously, list_moved() only worked with non-empty lists, but this was a
caveat that was really easy to miss. parse_ofp_group_mod_file() had a bug
because it didn't honor that restriction. This commit fixes the problem,
by modifying the list_moved() interface to be harder to use incorrectly
and then updating the callers.
Reported-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com>
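The empty-list caveat comes from the circular representation: an empty list head points at itself, so after the head is copied elsewhere its pointers still refer to the old location. The sketch below shows one way an interface can make this hard to get wrong, by taking the original address as a second argument. Type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list with hypothetical names. */
struct sk_list {
    struct sk_list *prev, *next;
};

void
sk_list_init(struct sk_list *list)
{
    list->prev = list->next = list;
}

bool
sk_list_is_empty(const struct sk_list *list)
{
    return list->next == list;
}

void
sk_list_push_back(struct sk_list *list, struct sk_list *elem)
{
    elem->prev = list->prev;
    elem->next = list;
    list->prev->next = elem;
    list->prev = elem;
}

/* Call after the list head has been copied from '*orig' to '*list'.  An
 * empty list's head points at itself, so after a copy its pointers still
 * refer to 'orig'; passing 'orig' lets the function detect that case. */
void
sk_list_moved(struct sk_list *list, const struct sk_list *orig)
{
    if (list->next == orig) {
        sk_list_init(list);         /* Was empty: make it self-referential. */
    } else {
        list->prev->next = list->next->prev = list;
    }
}
```

Without the 'orig' argument, the function would have to fix up neighbors unconditionally, which for an empty list means writing through pointers to the stale head.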
tests: Enable running parallel unit tests for Windows.
testsuite uses mkfifo in its job dispatcher that manages
parallel unit tests. MinGW does not have a mkfifo. This
results in unit tests running serially on Windows. Right
now it takes up to approximately 40 minutes to run all the
unit tests on Windows.
This commit provides a job dispatcher for MinGW that uses
temporary files instead of mkfifo to manage parallel jobs.
With this commit, on a Windows machine with 4 cores and with
8 parallel unit test sessions, it takes approximately 8
minutes to finish a unit test run.
Shu Shen [Wed, 11 Feb 2015 05:20:12 +0000 (21:20 -0800)]
docs: Fix overlapping 'weak' edges in ovs-vswitchd.conf.db.5.
Multiple weak edges between nodes at the same rank overlap each other in
a dot/graphviz diagram. The vswitchd.pic used in ovs-vswitchd.conf.db.5 suffers
from this problem.
Removing "constraint=false" allows graphviz to rank the nodes using the weak
edges as well, so that the nodes at the ends of a weak edge won't be at the
same rank, which allows multiple 'weak' edges to be visible.
Signed-off-by: Shu Shen <shu.shen@radisys.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 12 Feb 2015 07:34:50 +0000 (23:34 -0800)]
mac-learning: Implement per-port MAC learning fairness.
In "MAC flooding", an attacker transmits an overwhelming number of frames
with unique Ethernet source addresses on a switch port. The goal is to
force the switch to evict all useful MAC learning table entries, so that
its behavior degenerates to that of a hub, flooding all traffic. In turn,
that allows an attacker to eavesdrop on the traffic of other hosts attached
to the switch, with all the risks that that entails.
Before this commit, the Open vSwitch "normal" action that implements its
standalone switch behavior (and that can be used by OpenFlow controllers
as well) was vulnerable to MAC flooding attacks. This commit fixes the
problem by implementing per-port fairness for MAC table entries: when
the MAC table is at its maximum size, MAC table eviction always deletes an
entry from the port with the most entries. Thus, MAC entries will never
be evicted from ports with only a few entries if a port with a huge number
of entries exists.
Controllers could introduce their own MAC flooding vulnerabilities into
OVS. For a controller that adds destination MAC based flows to an OpenFlow
flow table as a reaction to "packet-in" events, such a bug, if it exists,
would be in the controller code itself and would need to be fixed in the
controller. For a controller that relies on the Open vSwitch "learn"
action to add destination MAC based flows, Open vSwitch has existing
support for eviction policy similar to that implemented in this commit
through the "groups" column in the Flow_Table table documented in
ovs-vswitchd.conf.db(5); we recommend that users of "learn" who are not
already familiar with eviction groups read that documentation.
In addition to implementation of per-port MAC learning fairness,
this commit includes some closely related changes:
- Access to client-provided "port" data in struct mac_entry
is now abstracted through helper functions, which makes it
easier to ensure that the per-port data structures are maintained
consistently.
- The mac_learning_changed() function, which had become trivial,
vestigial, and confusing, was removed. Its functionality was folded
into the new function mac_entry_set_port().
- Many comments were added and improved; there had been a lot of
comment rot in previous versions.
CERT: VU#784996 Reported-by: "Ronny L. Bull - bullrl" <bullrl@clarkson.edu>
Reported-at: http://www.irongeek.com/i.php?page=videos/derbycon4/t314-exploring-layer-2-network-security-in-virtualized-environments-ronny-l-bull-dr-jeanna-n-matthews Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
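The eviction policy described in this commit can be sketched with a toy model that tracks only per-port entry counts. All names and sizes below are hypothetical, and the real MAC learning table stores the entries themselves and has its own data structures.

```c
#include <assert.h>
#include <stddef.h>

#define SK_N_PORTS 4
#define SK_MAX_ENTRIES 8

/* Toy model: only per-port entry counts are tracked. */
struct sk_mac_table {
    size_t n_entries[SK_N_PORTS];   /* Learned entries per port. */
    size_t total;                   /* Sum of n_entries[]. */
};

/* Returns the port that currently holds the most entries. */
size_t
sk_port_with_most_entries(const struct sk_mac_table *t)
{
    size_t best = 0;

    for (size_t p = 1; p < SK_N_PORTS; p++) {
        if (t->n_entries[p] > t->n_entries[best]) {
            best = p;
        }
    }
    return best;
}

/* Learns one new address on 'port'.  If the table is full, first evicts an
 * entry from the port with the most entries.  Returns the port evicted
 * from, or -1 if nothing was evicted. */
int
sk_mac_learn(struct sk_mac_table *t, size_t port)
{
    int evicted = -1;

    if (t->total >= SK_MAX_ENTRIES) {
        size_t victim = sk_port_with_most_entries(t);

        t->n_entries[victim]--;
        t->total--;
        evicted = (int) victim;
    }
    t->n_entries[port]++;
    t->total++;
    return evicted;
}
```

In this model a port that floods the table keeps evicting its own entries, so ports with only a few entries are left alone.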
Thomas Graf [Thu, 12 Feb 2015 20:23:08 +0000 (21:23 +0100)]
datapath: vxlan: Only set has-GBP bit in header if any other bits would be set
vxlan: Only set has-GBP bit in header if any other bits would be set
This allows for a VXLAN-GBP socket to talk to a Linux VXLAN socket by
not setting any of the bits.
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: db79a621835e ("vxlan: Only set has-GBP bit in header if any other bits would be set") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Nithin Raju [Thu, 12 Feb 2015 18:53:10 +0000 (10:53 -0800)]
lib/util.h: use types compatible with DWORD
_BitScanForward() and friends are part of the Windows API and
take DWORD as parameter type. DWORD is defined to be 'unsigned long'
in Windows' header files.
We call into these functions from within lib/util.h. Currently, we
pass arguments of type uint32_t, which is a typedef for
'unsigned int'. This incompatibility causes failures when we compile
the code as C++, or as C with warnings enabled.
The fix is to use 'unsigned long' rather than a fixed-size type.
Co-Authored-by: Linda Sun <lsun@vmware.com> Signed-off-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Linda Sun <lsun@vmware.com> Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
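A sketch of the type fix follows, assuming a helper that counts trailing zero bits. The helper name is hypothetical and this is not the lib/util.h code; on non-MSVC compilers it falls back to the GCC/Clang builtin so that the example builds elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#ifdef _MSC_VER
#include <intrin.h>
#endif

/* Hypothetical helper.  Returns the number of trailing zero bits in 'n',
 * which must be nonzero. */
int
sketch_ctz32(uint32_t n)
{
#ifdef _MSC_VER
    unsigned long index;            /* Same underlying type as DWORD. */
    unsigned long mask = n;

    _BitScanForward(&index, mask);
    return (int) index;
#else
    return __builtin_ctz(n);        /* GCC and Clang builtin. */
#endif
}
```

Declaring 'index' as 'unsigned long' means its address has exactly the pointer type that _BitScanForward() expects, so no cast is needed in either C or C++.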
daemon.at: Fix a race condition with windows service test.
OVS daemon service for Windows creates the pidfile and then
registers with the Windows services manager that the service
is running. There is a small time gap between the two steps.
So retry a few times in the test.
Ben Pfaff [Wed, 11 Feb 2015 02:00:22 +0000 (18:00 -0800)]
timeval: Correctly report usage statistics in log_poll_interval().
Most of the information that timeval was reporting for long poll intervals
was comparing per-thread with per-process statistics, which yielded
nonsense a lot of the time.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Alex Wang <alexw@nicira.com>
Alex Wang [Wed, 4 Feb 2015 01:08:13 +0000 (17:08 -0800)]
netdev-dpdk: Allow changing NON_PMD_CORE_ID for testing purpose.
For testing purposes, developers may want to change the NON_PMD_CORE_ID
and use a different core for non-pmd threads. Since the netdev-dpdk
module is hard-coded to assert that non-pmd threads use core 0, such a
change will cause OVS to abort.
This commit fixes the assertion and allows changing NON_PMD_CORE_ID.
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Andy Zhou [Sat, 7 Feb 2015 00:14:10 +0000 (16:14 -0800)]
test: Reverse the order of commands added by ON_EXIT macro
Executing clean-up commands in the reverse order of their addition
seems to be better for most of the cleanup situations. For example,
in kmod tests, we should remove namespaces before removing kernel
modules.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
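The ON_EXIT macro itself belongs to the shell-based test suite, but the ordering idea can be sketched in C as a LIFO stack of cleanup callbacks. Everything below is a hypothetical illustration, including the two example callbacks, which only record the order in which they run.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_CLEANUPS 16

typedef void sk_cleanup_fn(void);

static sk_cleanup_fn *sk_cleanups[SK_MAX_CLEANUPS];
static size_t sk_n_cleanups;

/* Records which example callbacks ran, in order. */
char sk_order[8];
size_t sk_order_len;

/* Registers 'fn' to run at cleanup time.  Returns 0 on success, -1 if the
 * stack is full. */
int
sk_on_exit(sk_cleanup_fn *fn)
{
    if (sk_n_cleanups >= SK_MAX_CLEANUPS) {
        return -1;
    }
    sk_cleanups[sk_n_cleanups++] = fn;
    return 0;
}

/* Runs the registered functions, most recently added first. */
void
sk_run_cleanups(void)
{
    while (sk_n_cleanups > 0) {
        sk_cleanups[--sk_n_cleanups]();
    }
}

/* Example callbacks standing in for real cleanup commands. */
void
sk_remove_modules(void)
{
    sk_order[sk_order_len++] = 'M';
}

void
sk_remove_namespaces(void)
{
    sk_order[sk_order_len++] = 'N';
}
```

If a test loads kernel modules first and creates namespaces afterwards, registering the cleanups in that same order makes the namespace removal run before the module removal.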
Shu Shen [Wed, 4 Feb 2015 06:24:18 +0000 (22:24 -0800)]
ofproto: add support of OFPR_ACTION_SET as packet-in reason for OF1.4+
This patch adds support for OFPR_ACTION_SET as the packet-in reason when
a Packet-In message is triggered by an output action within an
action-set. By default reason code OFPR_ACTION_SET is enabled for async
messages when OpenFlow 1.4+ is used. A test case is included.
Signed-off-by: Shu Shen <shu.shen@radisys.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Jason Kölker [Sat, 7 Feb 2015 06:47:51 +0000 (06:47 +0000)]
XenServer: Don't reset on xe-toolstack-restart
With XenServer only 1 manager is configured in the pool, which may not
be the first manager returned from `get-manager` as it returns in
lexicographical order.
Signed-off-by: Jason Kölker <jason@koelker.net> Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Thu, 5 Feb 2015 23:29:38 +0000 (15:29 -0800)]
test: remove unnecessary leading blanks
This is mostly a style fix. The macro is used in the next patch to
add commands to the 'cleanup' file. This fix makes the 'cleanup' file
easier to read.
Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Joe Stringer <joestringer@nicira.com>
Andy Zhou [Thu, 5 Feb 2015 23:19:41 +0000 (15:19 -0800)]
test: remove namespace after ovs-vswitchd is stopped
Removing namespaces also removes the ports in them, which may
cause ovs-vswitchd to generate warning log messages about not being
able to find the port before it exits.
Removing namespaces after ovs-vswitchd exits improves test reliability.
Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Joe Stringer <joestringer@nicira.com>
Thomas Graf [Fri, 6 Feb 2015 20:10:44 +0000 (21:10 +0100)]
datapath: Account for "vxlan: Eliminate dependency on UDP socket in transmit path"
Excludes VXLAN_F_REMCSUM_TX bits as OVS currently doesn't support it.
Upstream commit:
vxlan: Eliminate dependency on UDP socket in transmit path
In the vxlan transmit path there is no need to reference the socket
for a tunnel which is needed for the receive side. We do, however,
need the vxlan_dev flags. This patch eliminates references
to the socket in the transmit path, and changes VXLAN_F_UNSHAREABLE
to be VXLAN_F_RCV_FLAGS. This mask is used to store the flags
applicable to receive (GBP, CSUM6_RX, and REMCSUM_RX) in the
vxlan_sock flags.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: af33c1adae1e ("vxlan: Eliminate dependency on UDP socket in transmit path") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Thomas Graf [Fri, 6 Feb 2015 20:10:44 +0000 (21:10 +0100)]
datapath: Support VXLAN Group Policy extension
Upstream commit:
openvswitch: Support VXLAN Group Policy extension
Introduces support for the group policy extension to the VXLAN virtual
port. The extension is disabled by default and only enabled if the user
has provided the respective configuration.
The configuration interface to enable the extension is based on a new
attribute OVS_VXLAN_EXT_GBP nested inside OVS_TUNNEL_ATTR_EXTENSION
which can carry additional extensions as needed in the future.
The group policy metadata is stored as binary blob (struct ovs_vxlan_opts)
internally just like Geneve options but transported as nested Netlink
attributes to user space.
Renames the existing TUNNEL_OPTIONS_PRESENT to TUNNEL_GENEVE_OPT with the
binary value kept intact, a new flag TUNNEL_VXLAN_OPT is introduced.
The attributes OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS and existing
OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS are implemented mutually exclusive.
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 1dd144 ("openvswitch: Support VXLAN Group Policy extension") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
INSTALL.Windows: Don't specify the openssl version.
Windows openssl libraries are distributed through slproweb.
The maintainer there is very strict about retiring unsafe
versions of openssl and only allows the latest versions of
openssl to be downloaded. Instead of updating INSTALL.Windows
every time the latest version changes, leave it to the
user's discretion to download the latest version.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Thomas Graf [Wed, 4 Feb 2015 15:45:06 +0000 (07:45 -0800)]
datapath: Fix missing symbols when required to use own VXLAN stack
Fixes an insufficient ifdef in compat/vxlan.c which caused required
symbols not to be included in the build. The declarations were properly
enabled so the build would succeed but the module would spit missing
symbols when being inserted.
The fix uses a new define USE_UPSTREAM_VXLAN which is set in the compat
header <net/vxlan.h> as required. This centralizes the decision when to
include VXLAN compat code to a single place which eases further changes.
Reported-by: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Jarno Rajahalme [Wed, 4 Feb 2015 20:59:04 +0000 (12:59 -0800)]
ofproto: Initialize tunnel module earlier.
The tunnel module can get called by the handler threads right after
they are started, so we need to call ofproto_tunnel_init() before
opening the backer.
Late initialization caused a spurious ovs-vswitchd handler thread
crash on start-up due to the tunnel module fat_rwlock not being
initialized yet.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Flavio Leitner [Thu, 11 Dec 2014 11:38:19 +0000 (09:38 -0200)]
mcast-snoop: Add support to control Reports forwarding
The RFC4541 section 2.1.1 item 1 allows the snooping switch
to provide an administrative control to allow Report messages
to be flooded to ports not connected to multicast routers.
Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Mon, 2 Feb 2015 22:50:47 +0000 (14:50 -0800)]
ovs-dpctl: Do not report pmd info for 'dpif-netlink' datapath.
In 'ovs-dpctl dump-flows' output, we should only report the pmd
related info for 'dpif-netdev' datapath. However, current
implementation also reports uninitialized pmd info for
'dpif-netlink' datapath, which is very confusing to users.
This commit fixes it.
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Mon, 2 Feb 2015 20:57:16 +0000 (12:57 -0800)]
ovs-command-completion: Complete on file path by default.
This commit makes the bash completion script complete on file
path when there is no completion available. The unit tests
are also adjusted accordingly.
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 30 Jan 2015 22:20:52 +0000 (14:20 -0800)]
ofp-util: Issue error when OFPGC_DELETE command includes buckets.
An OFPGC_DELETE command deletes a whole group, including all of its
buckets, and so it doesn't make sense for the command itself to include any
specification of buckets.
ONF-JIRA: EXT-510 Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Tue, 3 Feb 2015 21:57:55 +0000 (13:57 -0800)]
datapath: update exact match lookup hash value to avoid hash collision
Currently, the exact match cache lookup uses 'skb->hash' as an index.
In most cases, this value will be the same for pre and post
recirculation lookup, thrashing the exact match cache. This patch
avoids this hash collision by using a rehashed value, which mixes
in the 'recirc_id', as the lookup index.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
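The idea of mixing 'recirc_id' into the lookup hash can be sketched as follows. The mixing function here is a hypothetical stand-in chosen for the example; the datapath uses its own kernel hash helpers, and the constants below are not taken from it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mixer for illustration.  Each step is invertible, and the
 * multiplier applied to 'recirc_id' is odd, so two different recirculation
 * IDs always produce different results for the same packet hash. */
uint32_t
sketch_emc_hash(uint32_t skb_hash, uint32_t recirc_id)
{
    uint32_t h = skb_hash;

    h ^= recirc_id * 0x9e3779b1u;   /* Fold in the recirculation ID. */
    h ^= h >> 16;                   /* Spread high bits into low bits. */
    h *= 0x85ebca6bu;               /* Odd multiplier: a bijection. */
    h ^= h >> 13;
    return h;
}
```

Because the pre- and post-recirculation lookups of one packet carry different 'recirc_id' values, they no longer derive the same index from the same 'skb->hash'.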
Thomas Graf [Tue, 3 Feb 2015 20:53:36 +0000 (21:53 +0100)]
datapath: Account for "netlink: make nlmsg_end() and genlmsg_end() void"
genlmsg_end() no longer returns an error value. Not a problem as it
never returned an error code anyway.
Upstream: 053c09 ("netlink: make nlmsg_end() and genlmsg_end() void") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Thomas Graf [Tue, 3 Feb 2015 20:53:36 +0000 (21:53 +0100)]
datapath: Account for "genetlink: pass only network namespace to genl_has_listeners()"
Upstream commit:
genetlink: pass only network namespace to genl_has_listeners()
There's no point to force the caller to know about the internal
genl_sock to use inside struct net, just have them pass the network
namespace. This doesn't really change code generation since it's
an inline, but makes the caller less magic - there's never any
reason to pass another socket.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: f8403a2 ("genetlink: pass only network namespace to genl_has_listeners()") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Thomas Graf [Tue, 3 Feb 2015 20:53:36 +0000 (21:53 +0100)]
datapath: Allow for any level of nesting in flow attributes
Upstream commit:
openvswitch: Allow for any level of nesting in flow attributes
nlattr_set() is currently hardcoded to two levels of nesting. This change
introduces struct ovs_len_tbl to define minimal length requirements plus
next level nesting tables to traverse the key attributes to arbitrary depth.
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 81bfe3 ("openvswitch: Allow for any level of nesting in flow attributes") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Thomas Graf [Tue, 3 Feb 2015 20:53:36 +0000 (21:53 +0100)]
datapath: Rename GENEVE_TUN_OPTS() to TUN_METADATA_OPTS()
Backport of upstream commit:
openvswitch: Rename GENEVE_TUN_OPTS() to TUN_METADATA_OPTS()
Also factors out Geneve validation code into a new separate function
validate_and_copy_geneve_opts().
A subsequent patch will introduce VXLAN options. Rename the existing
GENEVE_TUN_OPTS() to reflect its extended purpose of carrying generic
tunnel metadata options.
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: d91641d ("openvswitch: Rename GENEVE_TUN_OPTS() to TUN_METADATA_OPTS()") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Thomas Graf [Tue, 3 Feb 2015 20:53:35 +0000 (21:53 +0100)]
datapath: Account for "vxlan: add x-netns support"
Upstream commit:
vxlan: add x-netns support
This patch allows switching the netns when a packet is encapsulated or
decapsulated.
The vxlan socket is opened in the i/o netns, i.e. the netns where
encapsulated packets are received. The socket lookup is done in this netns to
find the corresponding vxlan tunnel. After decapsulation, the packet is
injected into the corresponding interface, which may reside in another netns.
When one of the two netns is removed, the tunnel is destroyed.
Configuration example:
ip netns add netns1
ip netns exec netns1 ip link set lo up
ip link add vxlan10 type vxlan id 10 group 239.0.0.10 dev eth0 dstport 0
ip link set vxlan10 netns netns1
ip netns exec netns1 ip addr add 192.168.0.249/24 broadcast 192.168.0.255 dev vxlan10
ip netns exec netns1 ip link set vxlan10 up
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: f01ec1c017de ("vxlan: add x-netns support") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Thomas Graf [Tue, 3 Feb 2015 20:53:35 +0000 (21:53 +0100)]
datapath: Account for "vxlan: Group Policy extension"
Upstream commit:
vxlan: Group Policy extension
Implements support for the Group Policy VXLAN extension [0] to provide
a lightweight and simple security label mechanism across network peers
based on VXLAN. The security context and associated metadata is mapped
to/from skb->mark. This allows further mapping to a SELinux context
using SECMARK, to implement ACLs directly with nftables, iptables, OVS,
tc, etc.
The group membership is defined by the lower 16 bits of skb->mark, the
upper 16 bits are used for flags.
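As a rough illustration of the split described above, the following Python sketch separates a 32-bit mark value into its two halves. The name `split_gbp_mark` and the returned tuple layout are hypothetical and chosen only for this example; the kernel performs the mapping in C.

```python
GROUP_MASK = 0xFFFF  # lower 16 bits carry the group membership


def split_gbp_mark(mark):
    """Return (flags, group) for a 32-bit skb->mark style value.

    Hypothetical helper for illustration only.
    """
    mark &= 0xFFFFFFFF        # treat the input as an unsigned 32-bit value
    flags = mark >> 16        # upper 16 bits: flags
    group = mark & GROUP_MASK  # lower 16 bits: group ID
    return flags, group


flags, group = split_gbp_mark(0x200)
print(hex(flags), hex(group))
```

With the mark 0x200 used in the iptables example, all upper bits are zero, so the whole value is interpreted as the group ID.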
SELinux allows managing labels to secure local resources. However,
distributed applications require ACLs to be implemented across hosts. This
is typically achieved by matching on L2-L4 fields to identify the
original sending host and process on the receiver. On top of that,
netlabel and specifically CIPSO [1] allow to map security contexts to
universal labels. However, netlabel and CIPSO are relatively complex.
This patch provides a lightweight alternative for overlay network
environments with a trusted underlay. No additional control protocol
is required.
Host 1:                      Host 2:
Group A       Group B         Group B    Group A
+-----+   +-------------+    +-------+   +-----+
| lxc |   | SELinux CTX |    | httpd |   | VM  |
+--+--+   +--+----------+    +---+---+   +--+--+
   \---+---/                     \----+---/
       |                              |
   +---+---+                      +---+---+
   | vxlan |                      | vxlan |
   +---+---+                      +---+---+
       +------------------------------+
Backwards compatibility:
A VXLAN-GBP socket can receive standard VXLAN frames and will assign
the default group 0x0000 to such frames. A Linux VXLAN socket will
drop VXLAN-GBP frames. The extension is therefore disabled by default
and needs to be specifically enabled:
ip link add [...] type vxlan [...] gbp
In a mixed environment with VXLAN and VXLAN-GBP sockets, the GBP socket
must run on a separate port number.
Examples:
iptables:
host1# iptables -I OUTPUT -m owner --uid-owner 101 -j MARK --set-mark 0x200
host2# iptables -I INPUT -m mark --mark 0x200 -j DROP
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 351149 ("vxlan: Group Policy extension") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
This patch cleans up the header flags of VXLAN in anticipation of
defining some new ones:
- Move header related definitions from vxlan.c to vxlan.h
- Change VXLAN_FLAGS to be VXLAN_HF_VNI (only currently defined flag)
- Move check for unknown flags to after we find vxlan_sock, this
assumes that some flags may be processed based on tunnel
configuration
- Add a comment about why the stack treats unknown set flags as an
error instead of ignoring them
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 3bf394 ("vxlan: Improve support for header flags") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Thomas Graf [Tue, 3 Feb 2015 20:53:35 +0000 (21:53 +0100)]
datapath: Account for "rename vlan_tx_* helpers since "tx" is misleading there"
Upstream commit:
net: rename vlan_tx_* helpers since "tx" is misleading there
The same macros are used for rx as well. So rename it.
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: df8a39d ("net: rename vlan_tx_* helpers since "tx" is misleading there") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Ben Pfaff [Fri, 30 Jan 2015 21:36:34 +0000 (13:36 -0800)]
ofproto-dpif: Revalidate when sFlow probability changes.
Until now, when the sFlow selection probability changed, OVS failed to
immediately revalidate the flow table, delaying the new probability taking
effect. This commit fixes the problem.
Reported-by: K 華 <k940545@hotmail.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
commit 7905aae3fc1633c2c44c8fdb9e9d3a3d6e63439b
("vlog: Don't fail syslog initialization in chroot.")
uses os.path.isfile("/dev/log"), which tests if the given path
is a regular file, to see if syslog can be usable.
However, /dev/log is not a regular file for platforms I looked at.
* On Ubuntu 14.04 and CentOS 6.5, /dev/log is a socket
* On NetBSD-6, /dev/log is a symlink to a socket
Replace the test with os.path.exists() so that it can work
as intended for these platforms.
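The difference between the two checks can be reproduced without touching /dev/log. The sketch below is only an illustration: it binds a Unix datagram socket in a temporary directory, adds a symlink to it, and probes both paths. The helper names `probe` and `demo` are made up for this example, and it assumes a platform with AF_UNIX sockets and symlinks.

```python
import os
import socket
import tempfile


def probe(path):
    """Report what the two os.path checks say about a path."""
    return {"isfile": os.path.isfile(path), "exists": os.path.exists(path)}


def demo():
    """Probe a Unix socket and a symlink pointing at it."""
    tmpdir = tempfile.mkdtemp()
    sock_path = os.path.join(tmpdir, "log")
    link_path = os.path.join(tmpdir, "log-link")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        server.bind(sock_path)            # creates a socket node, not a regular file
        os.symlink(sock_path, link_path)  # mimics a symlink to a socket
        return probe(sock_path), probe(link_path)
    finally:
        server.close()
        for path in (link_path, sock_path):
            if os.path.lexists(path):
                os.unlink(path)
        os.rmdir(tmpdir)


if __name__ == "__main__":
    print(demo())
```

In both cases os.path.isfile() reports False because the target is a socket, while os.path.exists() reports True.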
Jarno Rajahalme [Tue, 3 Feb 2015 02:06:50 +0000 (18:06 -0800)]
miniflow: Fix miniflow push of L4 port numbers.
Replace a 64-bit copy of the L4 src/dst ports, which also
copied additional packet fields (e.g. the TCP sequence number).
As a result, all packets from the flow later missed in
the EMC.
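To see why an over-wide copy breaks cache lookups, consider the first 8 bytes of a TCP header: source port, destination port, then the 32-bit sequence number. The Python sketch below is not the OVS code. It only shows that a key built from 8 bytes differs between two packets of the same flow, while a key built from the 4 port bytes does not. All function names are hypothetical.

```python
import struct


def tcp_header_prefix(src_port, dst_port, seq):
    """First 8 bytes of a TCP header in network byte order."""
    return struct.pack("!HHI", src_port, dst_port, seq)


def ports_key_64(header):
    """Over-wide key: 8 bytes, so the sequence number leaks in."""
    return header[:8]


def ports_key_32(header):
    """Key covering only the source and destination ports."""
    return header[:4]


first = tcp_header_prefix(1234, 80, 1000)
second = tcp_header_prefix(1234, 80, 2000)  # same flow, later segment

print(ports_key_64(first) == ports_key_64(second))
print(ports_key_32(first) == ports_key_32(second))
```

Only the 4-byte key stays stable across segments of one connection, which is what an exact-match cache needs.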
Alex Wang [Mon, 24 Nov 2014 19:15:45 +0000 (11:15 -0800)]
ovs-command-completion: Autotest integration.
This commit integrates the unit tests defined in
utilities/ovs-command-compgen-test.bash into 'make check'.
The tests will be skipped if the current shell is not bash.
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Sorin Vinturis [Mon, 26 Jan 2015 19:00:40 +0000 (19:00 +0000)]
datapath-windows: Solved BSOD when loading an activated extension
If the OVS extension was previously enabled and the driver unloaded,
when the driver is loaded again a BSOD is triggered.
This happens because the OVS extension registers its FilterXxx routines
with NDIS, by calling NdisFRegisterFilterDriver, before performing all
the necessary initialization. Drivers that call
NdisFRegisterFilterDriver must be prepared for an immediate call to any
of their FilterXxx functions.
The BSOD is triggered because the FilterAttach routine, OvsExtAttach,
tries to acquire the control lock, when the lock is not yet initialized.
This happens because the FilterAttach is called before the driver
finishes initialization, in OvsInit().
The solution is to perform all necessary initialization before
registering OVS FilterXxx routines.
If device object creation fails, all resources allocated during the init
phase are released by calling the OvsCleanup and NdisFDeregisterFilterDriver
functions.
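The ordering problem is not specific to NDIS. The following Python sketch is an analogy only, not the driver code: `FakeRegistrar` and `Driver` are hypothetical stand-ins, and the registrar is assumed to invoke the attach callback immediately during registration.

```python
import threading


class FakeRegistrar:
    """Hypothetical stand-in for a framework that may call back at once."""

    def register_filter_driver(self, attach):
        attach()  # immediate callback during registration


class Driver:
    def __init__(self):
        self.control_lock = None  # not yet initialized
        self.attached = False

    def attach(self):
        # Needs the control lock; fails if it was never created.
        with self.control_lock:
            self.attached = True

    def load_buggy(self, registrar):
        registrar.register_filter_driver(self.attach)  # callback runs too early
        self.control_lock = threading.Lock()

    def load_fixed(self, registrar):
        self.control_lock = threading.Lock()           # initialize first
        registrar.register_filter_driver(self.attach)  # then register


fixed = Driver()
fixed.load_fixed(FakeRegistrar())
print(fixed.attached)
```

Calling `load_buggy` raises an exception inside `attach` because the lock is still None, which is the Python counterpart of touching an uninitialized lock in the driver.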
Christoph Jaeger [Mon, 26 Jan 2015 16:26:12 +0000 (11:26 -0500)]
autotest: Use modprobe for kernel module unloading
rmmod fails if the module is not loaded; thus, Vagrant aborts provisioning
when started from a clean slate. Use modprobe, which does not fail, instead.
Unloading unused modules the to-be-unloaded module depends on may also be
desirable.
Signed-off-by: Christoph Jaeger <cj@linux.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
When OVS unit tests are run inside a chroot environment,
no syslog infrastructure is available. In that
situation, don't fail and don't log additional messages
to syslog; instead, raise the syslog severity level very high
(log messages continue to be logged to the console and to file).
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com>
Mark D. Gray [Thu, 29 Jan 2015 15:22:53 +0000 (15:22 +0000)]
INSTALL.DPDK: Update documentation to indicate VFIO support
Since DPDK 1.7, VFIO is supported in place of UIO. This allows
a user to avoid having to insert a non-standard kernel module.
This patch updates the documentation with instructions for
setting up OVS with VFIO. As part of this work, VFIO was also
successfully tested with OVS and the DPDK netdev.
[tgraf: Added some more markdown formatting]
Signed-off-by: Mark D. Gray <mark.d.gray@intel.com> Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
in OVS_KMOD_VSWITCHD_STOP make up a syntactically correct function definition
(OVS_SWITCHD_STOP does not exist, and therefore the call does not expand):
Consequently, neither of the calls has the intended effect, i.e., stopping
ovs-vswitchd and ovsdb-server, checking their log files, and unloading the
datapath kernel module. Fix the misnaming, so all calls expand properly.
Fixes: 69c2bdfef9 ("autotest: add autotest framework for adding kernel module unit tests") Signed-off-by: Christoph Jaeger <cj@linux.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
Author: add Christoph Jaeger
vlog: Ability to override the default log facility.
When Open vSwitch is run on hundreds of hypervisors, it is
useful to collect log messages through log collectors. To
collect log messages like this, it is useful to log them
to a particular RFC5424 facility on the local system. The
log collectors can then be used to collect logs whenever
desired.
This commit provides a sysadmin the ability to specify the
facility through which the log messages are logged.
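For background, RFC 5424 encodes the facility together with the severity in the PRI value at the start of each message, computed as facility * 8 + severity. The Python sketch below only illustrates that arithmetic. The function names are hypothetical and this is not how vlog is implemented.

```python
FACILITY_LOCAL0 = 16  # RFC 5424 numerical code for local0
SEVERITY_INFO = 6     # RFC 5424 numerical code for Informational


def syslog_pri(facility, severity):
    """Compute the RFC 5424 PRI value from facility and severity codes."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be in 0..23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be in 0..7")
    return facility * 8 + severity


def format_prefix(facility, severity):
    """Render the PRI part as it appears at the start of a message."""
    return "<%d>" % syslog_pri(facility, severity)


print(format_prefix(FACILITY_LOCAL0, SEVERITY_INFO))
```

A collector can recover the facility from the PRI value by integer division by 8, which is what makes per-facility filtering possible.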
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
INSTALL.DPDK.md: Provide a little more consistency to documentation.
A few users (based on the reports in discuss@openvswitch.org) have been
literally following the instructions in INSTALL.DPDK.md and mixing up
pre-installed utilities and daemons with freshly compiled utilities
because the current documentation does not consistently call out
using utilities from the compiled sources.
This commit updates DPDK documentation to ask users to do a 'make install'
and then use the utilities and daemons directly from Linux PATH.
It also adds github markup where applicable.