git.proxmox.com Git - mirror_ovs.git/log
6 years ago AUTHORS: Add Ivan Dyukov.
Ben Pfaff [Thu, 14 Jun 2018 23:57:23 +0000 (16:57 -0700)]
AUTHORS: Add Ivan Dyukov.

Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago tests/stp: Make validation of flows before changing of topology.
Ivan Dyukov [Tue, 5 Jun 2018 14:37:42 +0000 (17:37 +0300)]
tests/stp: Make validation of flows before changing of topology.

The change fixes a random STP test failure that reproduces in about 20%
of runs. The failing test is the following:
2337: STP - flush the fdb and mdb when topology changed

In some cases, validation is executed after the topology change, which
increases the STP stabilization time. To prevent this, a delay that
waits for validation is added before deleting a port.

CC: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Fixes: 427e9751f300 ("tests: Add and improve stp tests.")
Signed-off-by: Ivan Dyukov <i.dyukov@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago tap: flag as present after opening it.
Flavio Leitner [Thu, 7 Jun 2018 14:10:17 +0000 (11:10 -0300)]
tap: flag as present after opening it.

Assume the device is present if it can be opened.

Reported-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Tested-by: Eelco Chaudron <echaudro@redhat.com>
6 years ago linux: Assume it is local if no API is available.
Flavio Leitner [Thu, 7 Jun 2018 14:10:52 +0000 (11:10 -0300)]
linux: Assume it is local if no API is available.

If the 'openvswitch' kernel module is not loaded, the API is not
available and the userspace will keep retrying. This approach is
not ideal for the netdev datapath type.

This patch disables netns support if the error code returned
indicates that the API is not available.

Reported-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Tested-by: Eelco Chaudron <echaudro@redhat.com>
6 years ago linux: disable netns support for tap.
Flavio Leitner [Thu, 7 Jun 2018 14:11:19 +0000 (11:11 -0300)]
linux: disable netns support for tap.

Tap device is not added to the kernel datapath, so there is
no way to get netns information.

Reported-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Tested-by: Eelco Chaudron <echaudro@redhat.com>
6 years ago rhel: Add python-netifaces as a dependency for openvswitch-test
Timothy Redaelli [Tue, 12 Jun 2018 09:27:40 +0000 (11:27 +0200)]
rhel: Add python-netifaces as a dependency for openvswitch-test

Currently python-netifaces is needed for ovs-tcpdump, which is installed
by the openvswitch-test package.

This commit adds {python,python2}-netifaces as a dependency for the
openvswitch-test package.

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
6 years ago tests/sendpkt.py: Fix to work with Python3
Timothy Redaelli [Thu, 31 May 2018 14:52:40 +0000 (16:52 +0200)]
tests/sendpkt.py: Fix to work with Python3

CC: Ashish Varma <ashishvarma.ovs@gmail.com>
Fixes: 296251ca0c82 ("tests: Added NSH related unit test cases for datapath")
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago tests: Fix test that tests if the system doesn't support IPv6
Timothy Redaelli [Mon, 11 Jun 2018 11:15:35 +0000 (13:15 +0200)]
tests: Fix test that tests if the system doesn't support IPv6

Currently if IPv6 is globally disabled (net.ipv6.conf.all.disable_ipv6=1) or
if IPv6 is disabled on loopback interface (net.ipv6.conf.lo.disable_ipv6=1)
the check doesn't work since no interface has ::1 and EADDRNOTAVAIL is
returned.

This causes a Python exception to be printed, like this:

Traceback (most recent call last):
  File "<string>", line 6, in <module>
  File "/usr/lib64/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 99] Cannot assign requested address

In this case HAVE_IPV6 is not set and all IPv6 tests fail.
This commit fixes the problem by also checking for EADDRNOTAVAIL.
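
As an illustration only, the check described above could look roughly like
the following Python sketch. The helper names ipv6_unavailable() and
have_ipv6() are hypothetical and are not taken from the OVS test suite; the
sketch merely treats both EAFNOSUPPORT and EADDRNOTAVAIL as "no IPv6"
instead of letting the exception propagate.

```python
import errno
import socket


def ipv6_unavailable(err):
    """Return True if 'err' only means that IPv6 is not usable here."""
    # EAFNOSUPPORT: the address family is not supported at all.
    # EADDRNOTAVAIL: IPv6 is compiled in, but no interface has ::1.
    return err.errno in (errno.EAFNOSUPPORT, errno.EADDRNOTAVAIL)


def have_ipv6():
    """Probe IPv6 by binding a UDP socket to the loopback address ::1."""
    try:
        sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    except OSError as err:
        if ipv6_unavailable(err):
            return False
        raise
    try:
        sock.bind(("::1", 0))
    except OSError as err:
        if ipv6_unavailable(err):
            return False
        raise
    finally:
        sock.close()
    return True
```

Any other error is still raised, so an unexpected failure remains visible.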

CC: Ben Pfaff <blp@ovn.org>
Fixes: 5c1d812d7fb3 ("tests: Avoid printing Python exception for hosts without IPv6 support.")
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago tests: Enable Valgrind for userspace system tests.
Darrell Ball [Tue, 12 Jun 2018 00:51:42 +0000 (17:51 -0700)]
tests: Enable Valgrind for userspace system tests.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago lldp: fix string warnings
Aaron Conole [Wed, 13 Jun 2018 19:43:03 +0000 (15:43 -0400)]
lldp: fix string warnings

lib/lldp/lldpd.c: In function :
lib/lldp/lldpd.c:520:17: warning:  output truncated before terminating nul copying as many bytes from a string as its length [-Wstringop-truncation]
                strncat(buffer, cfg->g_protocols[i].name,
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                strlen(cfg->g_protocols[i].name));
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

lib/lldp/lldpd.c: In function :
lib/lldp/lldpd.c:519:17: warning:  specified bound 2 equals source length [-Wstringop-overflow=]
                strncat(buffer, ", ", 2);
                ^~~~~~~~~~~~~~~~~~~~~~~~

Closer inspection shows that buffer is only used to output protocol names
when debug logging is enabled, so restructure the code a bit as well.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago OVN: add ICMP time exceeded support to OVN logical router
Lorenzo Bianconi [Thu, 14 Jun 2018 15:27:18 +0000 (17:27 +0200)]
OVN: add ICMP time exceeded support to OVN logical router

Using the icmp4 action, send an ICMP time exceeded frame whenever
an OVN logical router receives an IPv4 packet whose TTL has
expired (ip.ttl == {0, 1}).
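
The decision can be pictured with the small Python sketch below. It is not
OVN code: route_ipv4() and its return values are hypothetical, and only the
match (TTL of 0 or 1) and the ICMPv4 type 11 / code 0 values come from the
text and the ICMP standard.

```python
ICMP_TIME_EXCEEDED = 11      # ICMPv4 type: Time Exceeded
ICMP_TTL_IN_TRANSIT = 0      # ICMPv4 code: TTL exceeded in transit


def ttl_expired(ttl):
    """Mirror the match ip.ttl == {0, 1}."""
    return ttl in (0, 1)


def route_ipv4(ttl):
    """Return a hypothetical description of what the router does."""
    if ttl_expired(ttl):
        # The packet cannot be forwarded: answer with ICMP time exceeded.
        return ("icmp4", ICMP_TIME_EXCEEDED, ICMP_TTL_IN_TRANSIT)
    # Otherwise forward the packet with a decremented TTL.
    return ("forward", ttl - 1)
```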

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago datapath: compat: Fix RHEL 7.5 build warning from ip_tunnel_get_stats64()
Yi-Hung Wei [Tue, 12 Jun 2018 00:50:23 +0000 (17:50 -0700)]
datapath: compat: Fix RHEL 7.5 build warning from ip_tunnel_get_stats64()

This patch fixes the following warnings on the RHEL 7.5 kernel.

  CC [M]  /root/git/ovs/datapath/linux/geneve.o
/root/git/ovs/datapath/linux/geneve.c:1273:2: warning: initialization
from incompatible pointer type [enabled by default]
  .ndo_get_stats64 = ip_tunnel_get_stats64,
  ^
/root/git/ovs/datapath/linux/geneve.c:1273:2: warning: (near
initialization for ‘geneve_netdev_ops.<anonymous>.ndo_get_stats64’)
[enabled by default]
/root/git/ovs/datapath/linux/ip_gre.c:1162:2: warning: initialization
from incompatible pointer type [enabled by default]
  .ndo_get_stats64 = ip_tunnel_get_stats64,
  ^
/root/git/ovs/datapath/linux/ip_gre.c:1162:2: warning: (near
initialization for ‘ipgre_netdev_ops.<anonymous>.ndo_get_stats64’)
[enabled by default]
/root/git/ovs/datapath/linux/ip_gre.c:1180:2: warning: initialization
from incompatible pointer type [enabled by default]
  .ndo_get_stats64 = ip_tunnel_get_stats64,
  ^

Fixes: 436d36db ("compat: Fixups for newer kernels")
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: Fix ip6_gre, ip6_tunnel, and ip_gre backport
Yi-Hung Wei [Tue, 12 Jun 2018 00:50:22 +0000 (17:50 -0700)]
datapath: Fix ip6_gre, ip6_tunnel, and ip_gre backport

The recently added ERSPAN feature introduced changes in ip6_gre, ip6_tunnel,
and ip_gre which break the build on the RHEL 7.5 kernel because of
ndo_change_mtu().  This patch fixes the issue on the RHEL 7.5 kernel.

Fixes: 8e53509c ("gre: introduce native tunnel support for ERSPAN")
Fixes: c387d817 ("compat: Add ipv6 GRE and IPV6 Tunneling")
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: Fix max MTU size on RHEL 7.5 kernel
Yi-Hung Wei [Tue, 12 Jun 2018 00:50:21 +0000 (17:50 -0700)]
datapath: Fix max MTU size on RHEL 7.5 kernel

Without the patch, in RHEL 7.5, the maximum configurable MTU of the vport
internal device is 1500, but it should be 65535.  This patch fixes this
issue.

Fixes: 39ca338374ab ("datapath: compat: Fix build on RHEL 7.5")
Reported-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago Merge branch 'dpdk_merge' of https://github.com/istokes/ovs into HEAD
Ben Pfaff [Tue, 12 Jun 2018 19:39:12 +0000 (12:39 -0700)]
Merge branch 'dpdk_merge' of https://github.com/istokes/ovs into HEAD

6 years ago ovn-controller: Drop controller_ctx structure entirely.
Ben Pfaff [Mon, 11 Jun 2018 21:44:11 +0000 (14:44 -0700)]
ovn-controller: Drop controller_ctx structure entirely.

The remaining controller_ctx members were ovsdb_idl_txn pointers that could
be passed to functions directly, so this commit makes that change and
removes the structure.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years ago ovn-controller: Drop ovs_idl, ovnsb_idl from controller_ctx.
Ben Pfaff [Mon, 11 Jun 2018 21:13:37 +0000 (14:13 -0700)]
ovn-controller: Drop ovs_idl, ovnsb_idl from controller_ctx.

These were essentially unused except within ovn-controller.c itself.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years ago ovn-controller: Use chassis_lookup_by_name() instead of get_chassis().
Ben Pfaff [Fri, 8 Jun 2018 21:51:12 +0000 (14:51 -0700)]
ovn-controller: Use chassis_lookup_by_name() instead of get_chassis().

This was duplicate functionality.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years ago chassis-index: Use OVSDB index mechanism.
Ben Pfaff [Fri, 8 Jun 2018 21:47:32 +0000 (14:47 -0700)]
chassis-index: Use OVSDB index mechanism.

It seems like a good idea to use the built-in indexing instead of doing it
by hand.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years ago ovsdb-idl: Redesign use of indexes.
Ben Pfaff [Fri, 8 Jun 2018 04:07:34 +0000 (21:07 -0700)]
ovsdb-idl: Redesign use of indexes.

The design of the compound index feature in the C OVSDB IDL was unusual.
Indexes were generally referenced only by name rather than by pointer, and
could be obtained only from the top-level ovsdb_idl object.  To iterate or
otherwise search an index required explicitly creating a special
ovsdb_idl_cursor object, which at least seemed somewhat heavy-weight given
that it required a string lookup in a table of indexes.

This commit redesigns the compound index interface.  It discards the use of
names for indexes, instead having clients pass in a pointer to the index
object itself.  It simplifies how indexes are created, gets rid of the need
for explicit cursor objects, and updates all of the users to the new
interface.

The underlying reason for this commit is to make it easier in
ovn-controller to keep track of the dependencies for a given function, by
making the indexes explicit arguments to any function that needs to use
them.
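
The idea of passing the index object itself, rather than looking it up by
name, can be sketched in Python as follows. This is a conceptual
illustration, not the C IDL interface: the Index class is hypothetical, and
only the function name chassis_lookup_by_name() is borrowed from the related
ovn-controller commit.

```python
class Index:
    """A hypothetical index over rows, keyed by one column."""

    def __init__(self, column):
        self._column = column
        self._rows = {}

    def insert(self, row):
        self._rows.setdefault(row[self._column], []).append(row)

    def find(self, value):
        return self._rows.get(value, [])


def chassis_lookup_by_name(chassis_by_name, name):
    """The index is an explicit argument, so the dependency is visible."""
    rows = chassis_by_name.find(name)
    return rows[0] if rows else None
```

Because the caller must hand in chassis_by_name, a reader can tell from the
signature alone which index the function depends on.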

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years ago Makefile: Add build-time check for files with initial tabs.
Ben Pfaff [Mon, 4 Jun 2018 21:16:40 +0000 (14:16 -0700)]
Makefile: Add build-time check for files with initial tabs.

This should make it harder to reintroduce inappropriate indentation.
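
A check of this kind can be approximated by the Python sketch below. The
actual check lives in the Makefile and may differ in detail; the function
name lines_with_leading_tabs() is hypothetical and it only flags lines whose
first character is a tab.

```python
def lines_with_leading_tabs(text):
    """Return the 1-based numbers of lines that start with a tab."""
    return [number
            for number, line in enumerate(text.splitlines(), start=1)
            if line.startswith("\t")]
```

A build step could fail whenever this list is non-empty for a source file.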

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago treewide: Convert leading tabs to spaces.
Ben Pfaff [Sat, 26 May 2018 00:11:07 +0000 (17:11 -0700)]
treewide: Convert leading tabs to spaces.

It's always been OVS coding style to use spaces rather than tabs for
indentation, but some tabs have snuck in over time.  This commit converts
them to spaces.
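
For illustration, a conversion like this could be done per line as in the
Python sketch below. It assumes a tab stop of 8 columns, which the commit
message does not state, and convert_leading_tabs() is a hypothetical helper
rather than the tool that was actually used.

```python
def convert_leading_tabs(line, tabsize=8):
    """Expand tabs in the leading whitespace only; keep the rest as is."""
    stripped = line.lstrip(" \t")
    indent = line[:len(line) - len(stripped)]
    return indent.expandtabs(tabsize) + stripped
```

Tabs that appear after the first non-blank character are left untouched.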

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ofproto-dpif: Remove tabs from output.
Ben Pfaff [Sat, 26 May 2018 00:03:05 +0000 (17:03 -0700)]
ofproto-dpif: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ofproto-dpif-upcall: Remove tabs from output.
Ben Pfaff [Sat, 26 May 2018 00:02:22 +0000 (17:02 -0700)]
ofproto-dpif-upcall: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ofproto-dpif-trace: Remove tabs from output.
Ben Pfaff [Sat, 26 May 2018 00:01:48 +0000 (17:01 -0700)]
ofproto-dpif-trace: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago bond: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:59:40 +0000 (16:59 -0700)]
bond: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago stopwatch: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:58:25 +0000 (16:58 -0700)]
stopwatch: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago rstp, stp: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:57:59 +0000 (16:57 -0700)]
rstp, stp: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ovs-lldp: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:57:13 +0000 (16:57 -0700)]
ovs-lldp: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago lacp: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:56:18 +0000 (16:56 -0700)]
lacp: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago dpctl: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:55:18 +0000 (16:55 -0700)]
dpctl: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago cfm: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:50:54 +0000 (16:50 -0700)]
cfm: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago bfd: Remove leading tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:50:29 +0000 (16:50 -0700)]
bfd: Remove leading tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ovn-sandbox: Fix link.
Ben Pfaff [Sun, 3 Jun 2018 20:40:26 +0000 (13:40 -0700)]
ovn-sandbox: Fix link.

I couldn't figure out a way to fix this without making it inline.  Weird.

Reported-by: Qiuyu Xiao <qxiao@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ofp-bundle: Minor style fixes for header.
Ben Pfaff [Thu, 17 May 2018 15:22:45 +0000 (08:22 -0700)]
ofp-bundle: Minor style fixes for header.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ovn-controller: Pass around pointers to individual tables.
Ben Pfaff [Thu, 7 Jun 2018 21:22:33 +0000 (14:22 -0700)]
ovn-controller: Pass around pointers to individual tables.

We're working to make ovn-controller compute more incrementally, to reduce
CPU usage.  To make it easier to keep track of dependencies, it makes sense
to pass around pointers to fine-grained resources instead of an entire
database at a time.  This commit introduces a way to pass individual tables
around and starts using that feature in ovn-controller.

CC: Han Zhou <zhouhan@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years ago ovn-controller: Style fixes.
Ben Pfaff [Tue, 5 Jun 2018 18:04:39 +0000 (11:04 -0700)]
ovn-controller: Style fixes.

The OVS coding style says that input parameters should come first,
followed by output parameters.  This changes a few functions in
ovn-controller to fit this style.  It also marks a number of input
parameters 'const', for clarity.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years ago datapath-windows: Add support for handling DEI bit of VLAN header
Anand Kumar [Tue, 15 May 2018 23:38:00 +0000 (16:38 -0700)]
datapath-windows: Add support for handling DEI bit of VLAN header

The Drop Eligible Indicator (DEI) is 1 bit wide and is part of the
Tag Control Information (TCI) in the VLAN header. It indicates that
the frame can be dropped during congestion.
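
As a reminder of where the bit sits, the Python sketch below splits a 16-bit
TCI value into its three 802.1Q fields. parse_tci() is a hypothetical helper
for illustration and is unrelated to the Windows datapath code.

```python
def parse_tci(tci):
    """Split a 16-bit 802.1Q TCI into PCP (3 bits), DEI (1), VID (12)."""
    return {
        "pcp": (tci >> 13) & 0x7,    # priority code point
        "dei": (tci >> 12) & 0x1,    # drop eligible indicator
        "vid": tci & 0x0FFF,         # VLAN identifier
    }
```

For example, a TCI of 0xB064 carries priority 5, DEI set, and VLAN 100.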

Signed-off-by: Anand Kumar <kumaranand@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
6 years ago ovs-thread: Fix thread id for threads not started with ovs_thread_create()
Eelco Chaudron [Mon, 4 Jun 2018 08:07:36 +0000 (10:07 +0200)]
ovs-thread: Fix thread id for threads not started with ovs_thread_create()

When ping-ponging a live VM migration between two machines running
OVS-DPDK, every now and then the ping misses would increase
dramatically. For example:
===========Stream Rate: 3Mpps===========
No Stream_Rate Downtime Totaltime Ping_Loss Moongen_Loss
 0       3Mpps      128     13974       115      7168374
 1       3Mpps      145     13620        17      1169770
 2       3Mpps      140     14499       116      7141175
 3       3Mpps      142     13358        16      1150606
 4       3Mpps      136     14004        16      1124020
 5       3Mpps      139     15494       214     13170452
 6       3Mpps      136     15610       217     13282413
 7       3Mpps      146     13194        17      1167512
 8       3Mpps      148     12871        16      1162655
 9       3Mpps      137     15615       214     13170656

I identified this issue as being introduced in OVS commit
f3e7ec254738 ("Update relevant artifacts to add support for DPDK 17.05.1.")
and, more specifically, by DPDK commit
af1475918124 ("vhost: introduce API to start a specific driver").

The combined changes no longer have OVS start the vhost socket polling
thread at startup, but DPDK will do it on its own when the first vhost
client is started.

Figuring out the reason why this happens kept me puzzled for quite some time...
What happens is that the callbacks called from the vhost thread are
calling ovsrcu_synchronize() as part of destroy_device(). This will
end up calling seq_wait__().

By default, all created threads outside of OVS will get thread id 0,
which is equal to the main ovs thread. So for example in the
seq_wait__() function above if the main thread is waiting already we
won't add ourselves as a waiter.

The fix below assigns OVSTHREAD_ID_UNSET to non-OVS-created threads,
which will get updated to a valid ID on the first call to
ovsthread_id_self().
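
The lazy assignment can be modelled with the Python sketch below. It is an
analogy, not the C implementation: the names OVSTHREAD_ID_UNSET and
ovsthread_id_self() are reused from the text, while the sentinel value, the
counter, and the thread-local storage are assumptions made for illustration.

```python
import itertools
import threading

OVSTHREAD_ID_UNSET = object()        # sentinel: no id assigned yet
_next_id = itertools.count(0)        # source of unique ids
_next_id_lock = threading.Lock()
_local = threading.local()           # per-thread storage


def ovsthread_id_self():
    """Return this thread's id, assigning one on the first call."""
    tid = getattr(_local, "id", OVSTHREAD_ID_UNSET)
    if tid is OVSTHREAD_ID_UNSET:
        with _next_id_lock:
            tid = next(_next_id)
        _local.id = tid
    return tid
```

A thread started by an external library therefore no longer shares an id
with the main thread; it receives its own on first use.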

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Fixes: f3e7ec254738 ("Update relevant artifacts to add support for DPDK
                      17.05.1.")
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years ago netdev-dpdk: Handle ENOTSUP for rte_eth_dev_set_mtu.
Ian Stokes [Thu, 3 May 2018 09:06:10 +0000 (10:06 +0100)]
netdev-dpdk: Handle ENOTSUP for rte_eth_dev_set_mtu.

The function rte_eth_dev_set_mtu is not supported for all DPDK drivers.
Currently if it is not supported we return an error in
dpdk_eth_dev_queue_setup. There are two issues with this.

(i) A device can still function even if rte_eth_dev_set_mtu is not
supported albeit with the default max rx packet length.

(ii) When ENOTSUP is returned it will not be caught in port_reconfigure()
at the dpif-netdev layer. port_reconfigure() checks if a netdev_reconfigure()
function is supported for a given netdev and ignores EOPNOTSUPP errors as it
assumes errors of this value mean there is no reconfiguration function.
In this case the reconfiguration function is supported for netdev dpdk but
a function called as part of the reconfigure (rte_eth_dev_set_mtu) may
not be supported.

As this is a corner case, this commit warns a user when
rte_eth_dev_set_mtu is not supported and informs them of the default
max rx packet length that will be used instead.
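
The intended behaviour can be summarised by the Python sketch below. It is a
model of the control flow only: apply_mtu() and its set_mtu callback are
hypothetical stand-ins, and the real code calls rte_eth_dev_set_mtu() from C.

```python
import errno
import logging


def apply_mtu(set_mtu, mtu):
    """Return (error, used_default) after trying to set the MTU.

    'set_mtu' is a hypothetical callable that returns 0 on success or a
    negative errno value, in the style of the DPDK C API.
    """
    err = set_mtu(mtu)
    if err == -errno.ENOTSUP:
        # Not fatal: the device keeps its default max rx packet length.
        logging.warning("setting MTU %d is not supported; "
                        "using the default max rx packet length", mtu)
        return 0, True
    return err, False
```

Other errors are still reported to the caller unchanged.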

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Co-author: Michal Weglicki <michalx.weglicki@intel.com>
Tested-By: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Cian Ferriter <cian.ferriter@intel.com>
Tested-by: Cian Ferriter <cian.ferriter@intel.com>
6 years ago netdev-dpdk: Enable HW_CRC_STRIP for virtual functions.
Michal Weglicki [Thu, 3 May 2018 08:00:34 +0000 (09:00 +0100)]
netdev-dpdk: Enable HW_CRC_STRIP for virtual functions.

Virtual functions such as igb_vf and i40e_vf require HW_CRC_STRIP to be
explicitly enabled before configuration, otherwise device configuration
will fail.

This commit achieves this by adding NETDEV_RX_HW_CRC_STRIP to
dpdk_hw_ol_features. When a dpdk device is added, the driver for the
device is examined; if the device is a virtual function, HW_CRC_STRIP
is enabled.

Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
Co-Authored: Ian Stokes <ian.stokes@intel.com>
Acked-by: Cian Ferriter <cian.ferriter@intel.com>
Tested-by: Cian Ferriter <cian.ferriter@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years ago OVS-DPDK: Change "dpdk-socket-mem" default value.
Marcin Rybka [Wed, 9 May 2018 10:14:25 +0000 (11:14 +0100)]
OVS-DPDK: Change "dpdk-socket-mem" default value.

When "dpdk-socket-mem" and "dpdk-alloc-mem" are not specified,
"dpdk-socket-mem" will be set to allocate 1024MB on each NUMA node.
This change will prevent OVS from failing when a NIC is attached to
NUMA node 1 or higher. The patch contains a documentation update.
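
As a rough illustration, the default could be built as in the Python sketch
below. default_socket_mem() is a hypothetical helper; it assumes the usual
comma-separated, per-node megabyte format of "dpdk-socket-mem" and that the
number of NUMA nodes is already known.

```python
def default_socket_mem(numa_nodes, mb_per_node=1024):
    """Build a per-NUMA-node memory list such as "1024,1024"."""
    return ",".join([str(mb_per_node)] * numa_nodes)
```

On a two-node system this yields "1024,1024", so memory is also available
on node 1.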

Signed-off-by: Marcin Rybka <marcinx.rybka@intel.com>
Co-authored-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Billy O'Mahony <billy.o.mahony@intel.com>
Tested-by: Hariprasad Govindharajan <hariprasad.govindharajan@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years ago lib: fix typo in fragment handling error message
Louis Peens [Tue, 29 May 2018 18:51:15 +0000 (20:51 +0200)]
lib: fix typo in fragment handling error message

The error message states that "not_first" is a valid selection
for the ip_frag field, but looking at the structure that is defined
this should say "not_later".

Signed-off-by: Louis Peens <louis.peens@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
6 years ago datapath: Check if gre kernel module is loaded
Greg Rose [Wed, 6 Jun 2018 22:23:28 +0000 (15:23 -0700)]
datapath: Check if gre kernel module is loaded

Before attempting to add a gre tunnel to OVS via the vport gre
kernel interface make sure that the openvswitch kernel module has
been able to grab the gre protocol entry point.  If OVS does not
own the gre protocol then report address family not supported.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago dhparams: Add pregenerated .c file to the repository.
Eneas U de Queiroz [Tue, 5 Jun 2018 22:25:42 +0000 (15:25 -0700)]
dhparams: Add pregenerated .c file to the repository.

The version of dhparams.c generated by any given version of OpenSSL or
LibreSSL might work only with that version of the library.  This can be
inconvenient for cross-compiling if the "openssl" program on the build
machine has a different version from the library on the host where OVS will
run, since it could generate code that won't compile.

This commit fixes the problem by generating dhparams.c that works on the
currently important versions of OpenSSL and LibreSSL.

Submitted-at: https://github.com/openvswitch/ovs/pull/235
Signed-off-by: Eneas U de Queiroz <cote2004-github@yahoo.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago rhel: remove ovs-sim man page from temporary directory (also for RHEL)
Ansis Atteka [Wed, 6 Jun 2018 02:48:26 +0000 (19:48 -0700)]
rhel: remove ovs-sim man page from temporary directory (also for RHEL)

Fix the following compilation error when building rpm packages
with the rhel/openvswitch.spec file.

error: Installed (but unpackaged) file(s) found:
   /usr/share/man/man1/ovs-sim.1.gz

Signed-off-by: Ansis Atteka <aatteka@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
6 years ago rhel: remove ovs-sim man page from temporary directory
Lorenzo Bianconi [Tue, 5 Jun 2018 12:42:23 +0000 (14:42 +0200)]
rhel: remove ovs-sim man page from temporary directory

Fix the following compilation error when running 'make rpm-fedora':

error: Installed (but unpackaged) file(s) found:
   /usr/share/man/man1/ovs-sim.1.gz

RPM build errors:
    Installed (but unpackaged) file(s) found:
   /usr/share/man/man1/ovs-sim.1.gz
make: *** [Makefile:7049: rpm-fedora] Error 1

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Roi Dayan <roid@mellanox.com>
6 years ago ovndb-servers.ocf: add LB support for managing ovndb cluster:
aginwala [Sat, 2 Jun 2018 16:11:56 +0000 (09:11 -0700)]
ovndb-servers.ocf: add LB support for managing ovndb cluster:

using pacemaker so that controllers can be placed in different fault domains.
More background about the discussions can be found on:
https://mail.openvswitch.org/pipermail/ovs-discuss/2018-May/046770.html

Signed-off-by: aginwala <aginwala@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Tested-by: Numan Siddique <nusiddiq@redhat.com>
6 years ago python: Update docstring in ovs.db.idl.Idl class.
Toms Atteka [Mon, 4 Jun 2018 18:33:32 +0000 (11:33 -0700)]
python: Update docstring in ovs.db.idl.Idl class.

Adjusted docstring and variable names according to previous code changes;
Fixed grammar "a attribute" > "an attribute".

Fixes: bf42f674 (idl: Convert python daemons to utilize SchemaHelper)
Signed-off-by: Toms Atteka <cpp.code.lv@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago rstp: Eliminate BPDU padding and uninitialized bytes.
Ben Pfaff [Mon, 4 Jun 2018 20:42:10 +0000 (13:42 -0700)]
rstp: Eliminate BPDU padding and uninitialized bytes.

When the RSTP implementation sent BPDUs, it failed to initialize some of
their bytes.  None of the code initialized an array of 7 padding bytes, and
some of it also failed to initialize the version1_length field.  In
addition, the padding bytes confused some implementations that did not
correctly ignore extra bytes.

This commit fixes both problems, by removing the padding bytes and
initializing every byte in outgoing messages.
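
The "initialize every byte, no padding" approach can be pictured with the
Python sketch below. It is not the OVS C code: build_rst_bpdu() is a
hypothetical helper that packs the 36-byte RST BPDU layout from IEEE
802.1D-2004 into a zero-filled buffer, ending with the version1_length field
and nothing after it.

```python
import struct

# Protocol id, version, type, flags, root id, root path cost, bridge id,
# port id, message age, max age, hello time, forward delay, version 1
# length. "!" means network byte order with no implicit padding.
RST_BPDU = struct.Struct("!HBBB8sI8sHHHHHB")


def build_rst_bpdu(flags, root_id, root_path_cost, bridge_id, port_id,
                   message_age, max_age, hello_time, forward_delay):
    """Return a fully initialized RST BPDU as bytes."""
    buf = bytearray(RST_BPDU.size)          # every byte starts as zero
    RST_BPDU.pack_into(buf, 0,
                       0,                   # protocol identifier
                       2,                   # protocol version (RSTP)
                       2,                   # BPDU type (RST)
                       flags, root_id, root_path_cost, bridge_id,
                       port_id, message_age, max_age, hello_time,
                       forward_delay,
                       0)                   # version1_length
    return bytes(buf)
```

Starting from a zeroed buffer means no field can leak uninitialized memory,
even if a later change forgets to set it.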

Reported-by: David van Moolenbroek <dvmoolenbroek@aimvalley.nl>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-June/046864.html
Tested-by: David van Moolenbroek <dvmoolenbroek@aimvalley.nl>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago Removed calls to API deprecated in openssl 1.1
Eneas U de Queiroz [Tue, 5 Jun 2018 13:36:51 +0000 (10:36 -0300)]
Removed calls to API deprecated in openssl 1.1

In openssl 1.1, there is no need to initialize the library.  It is
automatically done when first used.  This allows compiling openvswitch
with openssl 1.1.0 with the deprecated API disabled.

Signed-off-by: Eneas U de Queiroz <cote2004-github@yahoo.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago datapath: Do not fail to load on gre protocol conflict
Greg Rose [Mon, 4 Jun 2018 20:14:38 +0000 (13:14 -0700)]
datapath: Do not fail to load on gre protocol conflict

The ERSPAN feature depends on the gre kernel module so on systems where
the ERSPAN feature isn't supported the openvswitch kernel module would
attempt to grab the ipv4 GRE protocol entry point and would fail to load
if it could not.

This patch modifies openvswitch to not fail to load when the gre kernel
module is loaded and instead it will print a warning message to the
kernel system log indicating that the ERSPAN feature may not be
available.

We need this patch because users are experiencing failures due to the
conflicts and high priority bugs are resulting.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years ago Revert "utilities/ovs-ctl: Force removal of ip_gre/gre"
Greg Rose [Mon, 4 Jun 2018 20:14:37 +0000 (13:14 -0700)]
Revert "utilities/ovs-ctl: Force removal of ip_gre/gre"

This reverts commit 2bdd1f3d96a86bea6bdb8788f23ec7dd99b289e3.

This is the wrong direction for the solution to the ip_gre/gre kernel
module conflicts, as reported by Jiri Benc <jbenc@redhat.com> and others in
https://mail.openvswitch.org/pipermail/ovs-dev/2018-June/347803.html and
elsewhere in the same thread.

Rather than attempting to force the removal of the ip_gre/gre kernel
modules, which often fails because they're in use, we will add a patch that
does not cause the openvswitch kernel module to fail to load when the
ip_gre/gre protocol entry points are already claimed.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years ago Revert "dpif: Ensure ERSPAN GRE support"
Greg Rose [Mon, 4 Jun 2018 20:14:36 +0000 (13:14 -0700)]
Revert "dpif: Ensure ERSPAN GRE support"

This reverts commit 8929c55287abae37efeac1e8876e6b3c2ccad0b9.

This is the wrong direction for the solution to the ip_gre/gre kernel
module conflicts, as reported by Jiri Benc <jbenc@redhat.com> and others in
https://mail.openvswitch.org/pipermail/ovs-dev/2018-June/347803.html and
elsewhere in the same thread.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years ago compat: Fix compile warning
Greg Rose [Mon, 4 Jun 2018 20:33:30 +0000 (13:33 -0700)]
compat: Fix compile warning

Fix a compile warning about a redefined symbol.

Fixes: 10f242363d ("compat: Add skb_checksum_simple_complete()")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago Fix typo in database commands documentation.
Mark Michelson [Mon, 4 Jun 2018 14:36:31 +0000 (10:36 -0400)]
Fix typo in database commands documentation.

s/remov/remove/

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago compat: Add skb_checksum_simple_complete()
Greg Rose [Fri, 1 Jun 2018 20:07:43 +0000 (13:07 -0700)]
compat: Add skb_checksum_simple_complete()

A recent patch to gre.c added a call to skb_checksum_simple_complete()
which is not present in kernels before 3.16.  Fix up the compatibility
layer to allow compiling on older kernels that do not have it.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago utilities/ovs-ctl: Force removal of ip_gre/gre
Greg Rose [Thu, 31 May 2018 21:20:45 +0000 (14:20 -0700)]
utilities/ovs-ctl: Force removal of ip_gre/gre

On Linux kernels older than 4.16 the user cannot take advantage of
OVS ERSPAN features if the older ip_gre and gre kernel modules are
loaded.  In addition, the openvswitch kernel module will fail to
load because it cannot grab the IPPROTO_GRE inet protocol handler
since the gre kernel module has already taken it.

Update the force_reload_kmod() script function to force removal
of the ip_gre and gre built-in kernel modules so that the openvswitch
kernel module can load and provide support for ERSPAN.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago dpif: Ensure ERSPAN GRE support
Greg Rose [Thu, 31 May 2018 22:50:31 +0000 (15:50 -0700)]
dpif: Ensure ERSPAN GRE support

When verifying the built-in gre kernel module, check for ERSPAN support.

Reported-by: Guru Shetty <guru@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago compat: Fixups for newer kernels
Greg Rose [Thu, 31 May 2018 21:10:10 +0000 (14:10 -0700)]
compat: Fixups for newer kernels

A recent patch series added support for ERSPAN but left some problems
remaining for kernel releases from 4.10 to 4.14.  This patch
addresses those problems.

Of note is that the old cisco gre compat layer code is gone for good.

Also, several compat defines in acinclude.m4 were looking for keys
in .c source files - this does not work on distros without source
code.  A more reliable key was already defined so we use that instead.

We have pared support for the Linux kernel releases in .travis.yml
to reflect that 4.15 is no longer in the LTS list.  With this patch
the Out of Tree OVS datapath kernel modules can build on kernels
up to 4.14.47.  Support for kernels up to 4.16.x will be added
later.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago datapath: ip6_gre: fix tunnel metadata device sharing.
William Tu [Tue, 29 May 2018 12:55:05 +0000 (05:55 -0700)]
datapath: ip6_gre: fix tunnel metadata device sharing.

commit b80d0b93b991e551a32157e0d9d38fc5bc9348a7
Author: William Tu <u9012063@gmail.com>
Date:   Fri May 18 19:22:28 2018 -0700

    net: ip6_gre: fix tunnel metadata device sharing.

    Currently ip6gre and ip6erspan share single metadata mode device,
    using 'collect_md_tun'.  Thus, the second command below fails:
      ip link add dev ip6gre11 type ip6gretap external
      ip link add dev ip6erspan12 type ip6erspan external
      RTNETLINK answers: File exists
    because it tries to create the same collect_md_tun.

    The patch fixes it by adding a separate collect md tunnel device
    for the ip6erspan, 'collect_md_tun_erspan'.  As a result, a couple
    of places need to be refactored/split up in order to distinguish ip6gre
    and ip6erspan.

    First, move the collect_md check at ip6gre_tunnel_{unlink,link} and
    create separate functions {ip6gre,ip6erspan}_tunnel_{link_md,unlink_md}.
    Then before link/unlink, make sure the link_md/unlink_md is called.
    Finally, a separate ndo_uninit is created for ip6erspan.  Tested it
    using the samples/bpf/test_tunnel_bpf.sh.

Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Fix ip6erspan hlen calculation
William Tu [Tue, 29 May 2018 12:55:04 +0000 (05:55 -0700)]
datapath: ip6_gre: Fix ip6erspan hlen calculation

commit 2d665034f239412927b1e71329f20f001c92da09
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:51 2018 +0200

    net: ip6_gre: Fix ip6erspan hlen calculation

    Even though ip6erspan_tap_init() sets up hlen and tun_hlen according to
    what ERSPAN needs, it goes ahead to call ip6gre_tnl_link_config() which
    overwrites these settings with GRE-specific ones.

    Similarly for changelink callbacks, which are handled by
    ip6gre_changelink(), which calls ip6gre_tnl_change(), which in turn
    calls ip6gre_tnl_link_config() as well.

    The difference ends up being 12 vs. 20 bytes, and this is generally not
    a problem, because a 12-byte request likely ends up allocating more and
    the extra 8 bytes are thus available. However correct it is not.

    So replace the newlink and changelink callbacks with an ERSPAN-specific
    ones, reusing the newly-introduced _common() functions.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Split up ip6gre_changelink()
William Tu [Tue, 29 May 2018 12:55:03 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_changelink()

commit c8632fc30bb03aa0c3bd7bcce85355a10feb8149
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:45 2018 +0200

    net: ip6_gre: Split up ip6gre_changelink()

    Extract from ip6gre_changelink() a reusable function
    ip6gre_changelink_common(). This will allow introduction of
    ERSPAN-specific _changelink() function with not a lot of code
    duplication.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Split up ip6gre_newlink()
William Tu [Tue, 29 May 2018 12:55:02 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_newlink()

commit 7fa38a7c852ec99e3a7fc375eb2c21c50c2e46b8
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:39 2018 +0200

    net: ip6_gre: Split up ip6gre_newlink()

    Extract from ip6gre_newlink() a reusable function
    ip6gre_newlink_common(). The ip6gre_tnl_link_config() call needs to be
    made customizable for ERSPAN, thus reorder it with calls to
    ip6_tnl_change_mtu() and dev_hold(), and extract the whole tail to the
    caller, ip6gre_newlink(). Thus enable an ERSPAN-specific _newlink()
    function without a lot of duplication.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Split up ip6gre_tnl_change()
William Tu [Tue, 29 May 2018 12:55:01 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_tnl_change()

commit a6465350ef495f5cbd76a3e505d25a01d648477e
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:33 2018 +0200

    net: ip6_gre: Split up ip6gre_tnl_change()

    Split a reusable function ip6gre_tnl_copy_tnl_parm() from
    ip6gre_tnl_change(). This will allow ERSPAN-specific code to
    reuse the common parts while customizing the behavior for ERSPAN.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Split up ip6gre_tnl_link_config()
William Tu [Tue, 29 May 2018 12:55:00 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_tnl_link_config()

commit a483373ead61e6079bc8ebe27e2dfdb2e3c1559f
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:27 2018 +0200

    net: ip6_gre: Split up ip6gre_tnl_link_config()

    The function ip6gre_tnl_link_config() is used for setting up
    configuration of both ip6gretap and ip6erspan tunnels. Split the
    function into the common part and the route-lookup part. The latter then
    takes the calculated header length as an argument. This split will allow
    the patches down the line to sneak in a custom header length computation
    for the ERSPAN tunnel.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()
William Tu [Tue, 29 May 2018 12:54:59 +0000 (05:54 -0700)]
datapath: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()

commit 5691484df961aff897d824bcc26cd1a2aa036b5b
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:15 2018 +0200

    net: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()

    dev->needed_headroom is not primed until ip6_tnl_xmit(), so it starts
    out zero. Thus the call to skb_cow_head() fails to actually make sure
    there's enough headroom to push the ERSPAN headers to. That can lead to
    the panic cited below. (Reproducer below that).

    Fix by requesting either needed_headroom if already primed, or just the
    bare minimum needed for the header otherwise.

    [  190.703567] kernel BUG at net/core/skbuff.c:104!
    [  190.708384] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
    [  190.714007] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_temp_thermal mlx_platform nfsd e1000e leds_mlxcpld
    [  190.728975] CPU: 1 PID: 959 Comm: kworker/1:2 Not tainted 4.17.0-rc4-net_master-custom-139 #10
    [  190.737647] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
    [  190.747006] Workqueue: ipv6_addrconf addrconf_dad_work
    [  190.752222] RIP: 0010:skb_panic+0xc3/0x100
    [  190.756358] RSP: 0018:ffff8801d54072f0 EFLAGS: 00010282
    [  190.761629] RAX: 0000000000000085 RBX: ffff8801c1a8ecc0 RCX: 0000000000000000
    [  190.768830] RDX: 0000000000000085 RSI: dffffc0000000000 RDI: ffffed003aa80e54
    [  190.776025] RBP: ffff8801bd1ec5a0 R08: ffffed003aabce19 R09: ffffed003aabce19
    [  190.783226] R10: 0000000000000001 R11: ffffed003aabce18 R12: ffff8801bf695dbe
    [  190.790418] R13: 0000000000000084 R14: 00000000000006c0 R15: ffff8801bf695dc8
    [  190.797621] FS:  0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
    [  190.805786] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  190.811582] CR2: 000055fa929aced0 CR3: 0000000003228004 CR4: 00000000001606e0
    [  190.818790] Call Trace:
    [  190.821264]  <IRQ>
    [  190.823314]  ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
    [  190.828940]  ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
    [  190.834562]  skb_push+0x78/0x90
    [  190.837749]  ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
    [  190.843219]  ? ip6gre_tunnel_ioctl+0xd90/0xd90 [ip6_gre]
    [  190.848577]  ? debug_check_no_locks_freed+0x210/0x210
    [  190.853679]  ? debug_check_no_locks_freed+0x210/0x210
    [  190.858783]  ? print_irqtrace_events+0x120/0x120
    [  190.863451]  ? sched_clock_cpu+0x18/0x210
    [  190.867496]  ? cyc2ns_read_end+0x10/0x10
    [  190.871474]  ? skb_network_protocol+0x76/0x200
    [  190.875977]  dev_hard_start_xmit+0x137/0x770
    [  190.880317]  ? do_raw_spin_trylock+0x6d/0xa0
    [  190.884624]  sch_direct_xmit+0x2ef/0x5d0
    [  190.888589]  ? pfifo_fast_dequeue+0x3fa/0x670
    [  190.892994]  ? pfifo_fast_change_tx_queue_len+0x810/0x810
    [  190.898455]  ? __lock_is_held+0xa0/0x160
    [  190.902422]  __qdisc_run+0x39e/0xfc0
    [  190.906041]  ? _raw_spin_unlock+0x29/0x40
    [  190.910090]  ? pfifo_fast_enqueue+0x24b/0x3e0
    [  190.914501]  ? sch_direct_xmit+0x5d0/0x5d0
    [  190.918658]  ? pfifo_fast_dequeue+0x670/0x670
    [  190.923047]  ? __dev_queue_xmit+0x172/0x1770
    [  190.927365]  ? preempt_count_sub+0xf/0xd0
    [  190.931421]  __dev_queue_xmit+0x410/0x1770
    [  190.935553]  ? ___slab_alloc+0x605/0x930
    [  190.939524]  ? print_irqtrace_events+0x120/0x120
    [  190.944186]  ? memcpy+0x34/0x50
    [  190.947364]  ? netdev_pick_tx+0x1c0/0x1c0
    [  190.951428]  ? __skb_clone+0x2fd/0x3d0
    [  190.955218]  ? __copy_skb_header+0x270/0x270
    [  190.959537]  ? rcu_read_lock_sched_held+0x93/0xa0
    [  190.964282]  ? kmem_cache_alloc+0x344/0x4d0
    [  190.968520]  ? cyc2ns_read_end+0x10/0x10
    [  190.972495]  ? skb_clone+0x123/0x230
    [  190.976112]  ? skb_split+0x820/0x820
    [  190.979747]  ? tcf_mirred+0x554/0x930 [act_mirred]
    [  190.984582]  tcf_mirred+0x554/0x930 [act_mirred]
    [  190.989252]  ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
    [  190.996109]  ? __lock_acquire+0x706/0x26e0
    [  191.000239]  ? sched_clock_cpu+0x18/0x210
    [  191.004294]  tcf_action_exec+0xcf/0x2a0
    [  191.008179]  tcf_classify+0xfa/0x340
    [  191.011794]  __netif_receive_skb_core+0x8e1/0x1c60
    [  191.016630]  ? debug_check_no_locks_freed+0x210/0x210
    [  191.021732]  ? nf_ingress+0x500/0x500
    [  191.025458]  ? process_backlog+0x347/0x4b0
    [  191.029619]  ? print_irqtrace_events+0x120/0x120
    [  191.034302]  ? lock_acquire+0xd8/0x320
    [  191.038089]  ? process_backlog+0x1b6/0x4b0
    [  191.042246]  ? process_backlog+0xc2/0x4b0
    [  191.046303]  process_backlog+0xc2/0x4b0
    [  191.050189]  net_rx_action+0x5cc/0x980
    [  191.053991]  ? napi_complete_done+0x2c0/0x2c0
    [  191.058386]  ? mark_lock+0x13d/0xb40
    [  191.062001]  ? clockevents_program_event+0x6b/0x1d0
    [  191.066922]  ? print_irqtrace_events+0x120/0x120
    [  191.071593]  ? __lock_is_held+0xa0/0x160
    [  191.075566]  __do_softirq+0x1d4/0x9d2
    [  191.079282]  ? ip6_finish_output2+0x524/0x1460
    [  191.083771]  do_softirq_own_stack+0x2a/0x40
    [  191.087994]  </IRQ>
    [  191.090130]  do_softirq.part.13+0x38/0x40
    [  191.094178]  __local_bh_enable_ip+0x135/0x190
    [  191.098591]  ip6_finish_output2+0x54d/0x1460
    [  191.102916]  ? ip6_forward_finish+0x2f0/0x2f0
    [  191.107314]  ? ip6_mtu+0x3c/0x2c0
    [  191.110674]  ? ip6_finish_output+0x2f8/0x650
    [  191.114992]  ? ip6_output+0x12a/0x500
    [  191.118696]  ip6_output+0x12a/0x500
    [  191.122223]  ? ip6_route_dev_notify+0x5b0/0x5b0
    [  191.126807]  ? ip6_finish_output+0x650/0x650
    [  191.131120]  ? ip6_fragment+0x1a60/0x1a60
    [  191.135182]  ? icmp6_dst_alloc+0x26e/0x470
    [  191.139317]  mld_sendpack+0x672/0x830
    [  191.143021]  ? igmp6_mcf_seq_next+0x2f0/0x2f0
    [  191.147429]  ? __local_bh_enable_ip+0x77/0x190
    [  191.151913]  ipv6_mc_dad_complete+0x47/0x90
    [  191.156144]  addrconf_dad_completed+0x561/0x720
    [  191.160731]  ? addrconf_rs_timer+0x3a0/0x3a0
    [  191.165036]  ? mark_held_locks+0xc9/0x140
    [  191.169095]  ? __local_bh_enable_ip+0x77/0x190
    [  191.173570]  ? addrconf_dad_work+0x50d/0xa20
    [  191.177886]  ? addrconf_dad_work+0x529/0xa20
    [  191.182194]  addrconf_dad_work+0x529/0xa20
    [  191.186342]  ? addrconf_dad_completed+0x720/0x720
    [  191.191088]  ? __lock_is_held+0xa0/0x160
    [  191.195059]  ? process_one_work+0x45d/0xe20
    [  191.199302]  ? process_one_work+0x51e/0xe20
    [  191.203531]  ? rcu_read_lock_sched_held+0x93/0xa0
    [  191.208279]  process_one_work+0x51e/0xe20
    [  191.212340]  ? pwq_dec_nr_in_flight+0x200/0x200
    [  191.216912]  ? get_lock_stats+0x4b/0xf0
    [  191.220788]  ? preempt_count_sub+0xf/0xd0
    [  191.224844]  ? worker_thread+0x219/0x860
    [  191.228823]  ? do_raw_spin_trylock+0x6d/0xa0
    [  191.233142]  worker_thread+0xeb/0x860
    [  191.236848]  ? process_one_work+0xe20/0xe20
    [  191.241095]  kthread+0x206/0x300
    [  191.244352]  ? process_one_work+0xe20/0xe20
    [  191.248587]  ? kthread_stop+0x570/0x570
    [  191.252459]  ret_from_fork+0x3a/0x50
    [  191.256082] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
    [  191.275327] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d54072f0
    [  191.281024] ---[ end trace 7ea51094e099e006 ]---
    [  191.285724] Kernel panic - not syncing: Fatal exception in interrupt
    [  191.292168] Kernel Offset: disabled
    [  191.295697] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

    Reproducer:

        ip link add h1 type veth peer name swp1
        ip link add h3 type veth peer name swp3

        ip link set dev h1 up
        ip address add 192.0.2.1/28 dev h1

        ip link add dev vh3 type vrf table 20
        ip link set dev h3 master vh3
        ip link set dev vh3 up
        ip link set dev h3 up

        ip link set dev swp3 up
        ip address add dev swp3 2001:db8:2::1/64

        ip link set dev swp1 up
        tc qdisc add dev swp1 clsact

        ip link add name gt6 type ip6erspan \
                local 2001:db8:2::1 remote 2001:db8:2::2 oseq okey 123
        ip link set dev gt6 up

        sleep 1

        tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
                action mirred egress mirror dev gt6
        ping -I h1 192.0.2.2

Fixes: e41c7c68ea77 ("ip6erspan: make sure enough headroom at xmit.")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Request headroom in __gre6_xmit()
William Tu [Tue, 29 May 2018 12:54:58 +0000 (05:54 -0700)]
datapath: ip6_gre: Request headroom in __gre6_xmit()

Upstream commit:
commit 01b8d064d58b4c1f0eff47f8fe8a8508cb3b3840
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:10 2018 +0200

net: ip6_gre: Request headroom in __gre6_xmit()

__gre6_xmit() pushes GRE headers before handing over to ip6_tnl_xmit()
for generic IP-in-IP processing. However it doesn't make sure that there
is enough headroom to push the header to. That can lead to the panic
cited below. (Reproducer below that).

Fix by requesting either needed_headroom if already primed, or just the
bare minimum needed for the header otherwise.
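
The selection rule described above can be sketched as follows. This is only an illustration of the decision, not the kernel code; the function and parameter names are hypothetical.

```python
def headroom_to_request(needed_headroom, header_len):
    """Pick how much headroom to ask for before pushing a tunnel header.

    needed_headroom models dev->needed_headroom, which is 0 until the
    first packet has gone through ip6_tnl_xmit().  header_len models the
    tunnel header length, the bare minimum that must fit.
    """
    if needed_headroom:          # already primed: use the device's value
        return needed_headroom
    return header_len            # not primed yet: at least fit the header


print(headroom_to_request(0, 20))
print(headroom_to_request(64, 20))
```

Requesting zero bytes, as happened before the fix, lets the later header push run past the start of the buffer, which is what triggers skb_panic() in the trace below.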

[  158.576725] kernel BUG at net/core/skbuff.c:104!
[  158.581510] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[  158.587174] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_temp_thermal mlx_platform nfsd e1000e leds_mlxcpld
[  158.602268] CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 4.17.0-rc4-net_master-custom-139 #10
[  158.610938] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
[  158.620426] RIP: 0010:skb_panic+0xc3/0x100
[  158.624586] RSP: 0018:ffff8801d3f27110 EFLAGS: 00010286
[  158.629882] RAX: 0000000000000082 RBX: ffff8801c02cc040 RCX: 0000000000000000
[  158.637127] RDX: 0000000000000082 RSI: dffffc0000000000 RDI: ffffed003a7e4e18
[  158.644366] RBP: ffff8801bfec8020 R08: ffffed003aabce19 R09: ffffed003aabce19
[  158.651574] R10: 000000000000000b R11: ffffed003aabce18 R12: ffff8801c364de66
[  158.658786] R13: 000000000000002c R14: 00000000000000c0 R15: ffff8801c364de68
[  158.666007] FS:  0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
[  158.674212] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  158.680036] CR2: 00007f4b3702dcd0 CR3: 0000000003228002 CR4: 00000000001606e0
[  158.687228] Call Trace:
[  158.689752]  ? __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.694475]  ? __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.699141]  skb_push+0x78/0x90
[  158.702344]  __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.706872]  ip6gre_tunnel_xmit+0x3bc/0x610 [ip6_gre]
[  158.711992]  ? __gre6_xmit+0xd80/0xd80 [ip6_gre]
[  158.716668]  ? debug_check_no_locks_freed+0x210/0x210
[  158.721761]  ? print_irqtrace_events+0x120/0x120
[  158.726461]  ? sched_clock_cpu+0x18/0x210
[  158.730572]  ? sched_clock_cpu+0x18/0x210
[  158.734692]  ? cyc2ns_read_end+0x10/0x10
[  158.738705]  ? skb_network_protocol+0x76/0x200
[  158.743216]  ? netif_skb_features+0x1b2/0x550
[  158.747648]  dev_hard_start_xmit+0x137/0x770
[  158.752010]  sch_direct_xmit+0x2ef/0x5d0
[  158.755992]  ? pfifo_fast_dequeue+0x3fa/0x670
[  158.760460]  ? pfifo_fast_change_tx_queue_len+0x810/0x810
[  158.765975]  ? __lock_is_held+0xa0/0x160
[  158.770002]  __qdisc_run+0x39e/0xfc0
[  158.773673]  ? _raw_spin_unlock+0x29/0x40
[  158.777781]  ? pfifo_fast_enqueue+0x24b/0x3e0
[  158.782191]  ? sch_direct_xmit+0x5d0/0x5d0
[  158.786372]  ? pfifo_fast_dequeue+0x670/0x670
[  158.790818]  ? __dev_queue_xmit+0x172/0x1770
[  158.795195]  ? preempt_count_sub+0xf/0xd0
[  158.799313]  __dev_queue_xmit+0x410/0x1770
[  158.803512]  ? ___slab_alloc+0x605/0x930
[  158.807525]  ? ___slab_alloc+0x605/0x930
[  158.811540]  ? memcpy+0x34/0x50
[  158.814768]  ? netdev_pick_tx+0x1c0/0x1c0
[  158.818895]  ? __skb_clone+0x2fd/0x3d0
[  158.822712]  ? __copy_skb_header+0x270/0x270
[  158.827079]  ? rcu_read_lock_sched_held+0x93/0xa0
[  158.831903]  ? kmem_cache_alloc+0x344/0x4d0
[  158.836199]  ? skb_clone+0x123/0x230
[  158.839869]  ? skb_split+0x820/0x820
[  158.843521]  ? tcf_mirred+0x554/0x930 [act_mirred]
[  158.848407]  tcf_mirred+0x554/0x930 [act_mirred]
[  158.853104]  ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
[  158.860005]  ? __lock_acquire+0x706/0x26e0
[  158.864162]  ? mark_lock+0x13d/0xb40
[  158.867832]  tcf_action_exec+0xcf/0x2a0
[  158.871736]  tcf_classify+0xfa/0x340
[  158.875402]  __netif_receive_skb_core+0x8e1/0x1c60
[  158.880334]  ? nf_ingress+0x500/0x500
[  158.884059]  ? process_backlog+0x347/0x4b0
[  158.888241]  ? lock_acquire+0xd8/0x320
[  158.892050]  ? process_backlog+0x1b6/0x4b0
[  158.896228]  ? process_backlog+0xc2/0x4b0
[  158.900291]  process_backlog+0xc2/0x4b0
[  158.904210]  net_rx_action+0x5cc/0x980
[  158.908047]  ? napi_complete_done+0x2c0/0x2c0
[  158.912525]  ? rcu_read_unlock+0x80/0x80
[  158.916534]  ? __lock_is_held+0x34/0x160
[  158.920541]  __do_softirq+0x1d4/0x9d2
[  158.924308]  ? trace_event_raw_event_irq_handler_exit+0x140/0x140
[  158.930515]  run_ksoftirqd+0x1d/0x40
[  158.934152]  smpboot_thread_fn+0x32b/0x690
[  158.938299]  ? sort_range+0x20/0x20
[  158.941842]  ? preempt_count_sub+0xf/0xd0
[  158.945940]  ? schedule+0x5b/0x140
[  158.949412]  kthread+0x206/0x300
[  158.952689]  ? sort_range+0x20/0x20
[  158.956249]  ? kthread_stop+0x570/0x570
[  158.960164]  ret_from_fork+0x3a/0x50
[  158.963823] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
[  158.983235] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d3f27110
[  158.988935] ---[ end trace 5af56ee845aa6cc8 ]---
[  158.993641] Kernel panic - not syncing: Fatal exception in interrupt
[  159.000176] Kernel Offset: disabled
[  159.003767] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Reproducer:

ip link add h1 type veth peer name swp1
ip link add h3 type veth peer name swp3

ip link set dev h1 up
ip address add 192.0.2.1/28 dev h1

ip link add dev vh3 type vrf table 20
ip link set dev h3 master vh3
ip link set dev vh3 up
ip link set dev h3 up

ip link set dev swp3 up
ip address add dev swp3 2001:db8:2::1/64

ip link set dev swp1 up
tc qdisc add dev swp1 clsact

ip link add name gt6 type ip6gretap \
local 2001:db8:2::1 remote 2001:db8:2::2
ip link set dev gt6 up

sleep 1

tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
action mirred egress mirror dev gt6
ping -I h1 192.0.2.2

Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago ofproto-dpif: Use dp_hash as default selection method
Jan Scheurich [Thu, 24 May 2018 15:28:01 +0000 (17:28 +0200)]
ofproto-dpif: Use dp_hash as default selection method

The dp_hash selection method for select groups overcomes the scalability
problems of the current default selection method which, due to L2-L4
hashing during xlation and un-wildcarding of the hashed fields,
basically requires an upcall to the slow path to load-balance every
L4 connection. The consequences are an explosion of datapath flows
(megaflows degenerate to miniflows) and a limit on the connection
setup rate OVS can handle.

This commit changes the default selection method to dp_hash, provided the
bucket configuration is such that the dp_hash method can accurately
represent the bucket weights with up to 64 hash values. Otherwise we
stick to the original default hash method.

We use the new dp_hash algorithm OVS_HASH_L4_SYMMETRIC to maintain the
symmetry property of the old default hash method.

A controller can explicitly request the old default hash selection method
by specifying selection method "hash" with an empty list of fields in the
Group properties of the OpenFlow 1.5 Group Mod message.

Update the documentation about selection method in the ovs-ofctl man page.

Revise and complete the ofproto-dpif unit test cases for select groups.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago ofproto-dpif: Improve dp_hash selection method for select groups
Jan Scheurich [Thu, 24 May 2018 15:28:00 +0000 (17:28 +0200)]
ofproto-dpif: Improve dp_hash selection method for select groups

The current implementation of the "dp_hash" selection method suffers
from two deficiencies: 1. The hash mask and hence the number of dp_hash
values is just large enough to cover the number of group buckets, but
does not consider the case that buckets have different weights. 2. The
xlate-time selection of best bucket from the masked dp_hash value often
results in bucket load distributions that are quite different from the
bucket weights because the number of available masked dp_hash values
is too small (2-6 bits compared to 32 bits of a full hash in the default
hash selection method).

This commit provides a more accurate implementation of the dp_hash
select group by applying the well known Webster method for distributing
a small number of "seats" fairly over the weighted "parties"
(see https://en.wikipedia.org/wiki/Webster/Sainte-Lagu%C3%AB_method).
The dp_hash mask is automatically chosen large enough to provide good
enough accuracy even with widely differing weights.

This distribution happens at group modification time and the resulting
table is stored with the group-dpif struct. At xlation time, we use the
masked dp_hash values as index to look up the assigned bucket.
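
A minimal sketch of the Webster (Sainte-Laguë) allocation, assuming integer bucket weights and a table size already chosen by the caller. The function name, the tie-breaking rule (lowest bucket index wins), and the use of bucket indices as table entries are assumptions for illustration; the sketch does not show how OVS picks the hash mask.

```python
from fractions import Fraction


def webster_table(weights, n_slots):
    """Assign n_slots hash values ("seats") to weighted buckets ("parties").

    Each seat goes to the bucket with the highest quotient
    weight / divisor, where the divisor starts at 1 and grows by 2
    every time the bucket wins a seat (1, 3, 5, ...).
    Returns a list mapping hash value -> bucket index.
    """
    divisors = [1] * len(weights)
    table = []
    for _ in range(n_slots):
        winner = max(range(len(weights)),
                     key=lambda b: Fraction(weights[b], divisors[b]))
        divisors[winner] += 2
        table.append(winner)
    return table


table = webster_table([3, 1], 4)
print(table, table.count(0), table.count(1))
```

With weights 3 and 1 and four hash values, the first bucket receives three entries and the second one, so the table reproduces the 3:1 ratio exactly. Larger tables are needed when the weights differ widely.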

If the selected bucket is not live, we do a circular search over the
mapping table until we find the first live bucket. As the buckets in
the table are by construction in pseudo-random order with a frequency
according to their weight, this method maintains correct distribution
even if one or more buckets are non-live.
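
The lookup with circular search can be sketched like this, assuming the mapping table has a power-of-two length so that the hash mask is simply the length minus one. The names and the list-based liveness flags are hypothetical.

```python
def select_bucket(table, live, dp_hash):
    """Return the bucket for dp_hash, skipping buckets that are not live.

    table maps masked hash values to bucket indices; live[b] says whether
    bucket b may be used.  Returns None when no bucket is live.
    """
    mask = len(table) - 1               # len(table) is a power of two
    start = dp_hash & mask
    for offset in range(len(table)):
        bucket = table[(start + offset) & mask]
        if live[bucket]:
            return bucket
    return None


table = [0, 0, 1, 0]                    # e.g. weights 3:1 over 4 hash values
print(select_bucket(table, [True, True], 0x1236))
print(select_bucket(table, [False, True], 0x1234))
```

The search wraps around the table at most once, so it terminates even when every bucket is down.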

Xlation is further simplified by storing some derived select group state
at group construction in struct group-dpif in a form better suited for
xlation purposes.

Adapted the unit test case for dp_hash select group accordingly.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago userspace datapath: Add OVS_HASH_L4_SYMMETRIC dp_hash algorithm
Jan Scheurich [Thu, 24 May 2018 15:27:59 +0000 (17:27 +0200)]
userspace datapath: Add OVS_HASH_L4_SYMMETRIC dp_hash algorithm

This commit implements a new dp_hash algorithm OVS_HASH_L4_SYMMETRIC in
the netdev datapath. It will be used as default hash algorithm for the
dp_hash-based select groups in a subsequent commit to maintain
compatibility with the symmetry property of the current default hash
selection method.
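
To illustrate what the symmetry property means, here is a hedged sketch of
a symmetric L4 hash. It is not the OVS_HASH_L4_SYMMETRIC implementation:
struct flow_key, mix32() and symmetric_l4_hash() are hypothetical names,
and XOR-folding the endpoints is just one simple way to make the result
independent of packet direction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical IPv4 5-tuple; not an OVS structure. */
struct flow_key {
    uint32_t ip_src, ip_dst;
    uint16_t tp_src, tp_dst;
    uint8_t ip_proto;
};

/* 32-bit finalizer from MurmurHash3, used here only to spread bits. */
static uint32_t
mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* XOR is commutative, so swapping source and destination leaves 'ips'
 * and 'ports' unchanged and therefore yields the same hash. */
static uint32_t
symmetric_l4_hash(const struct flow_key *key)
{
    uint32_t ips = key->ip_src ^ key->ip_dst;
    uint32_t ports = (uint32_t) (key->tp_src ^ key->tp_dst);

    return mix32(ips ^ mix32(ports | ((uint32_t) key->ip_proto << 16)));
}

/* Asserts that reversing the direction of 'key' does not change the hash. */
static void
check_symmetry(const struct flow_key *key)
{
    struct flow_key rev = *key;

    rev.ip_src = key->ip_dst;
    rev.ip_dst = key->ip_src;
    rev.tp_src = key->tp_dst;
    rev.tp_dst = key->tp_src;
    assert(symmetric_l4_hash(key) == symmetric_l4_hash(&rev));
}
```

Both directions of a connection then select the same bucket, which is the
behavior the default hash selection method already provides.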

A new dpif_backer_support field 'max_hash_alg' is introduced to reflect
the highest hash algorithm a datapath supports in the dp_hash action.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agovlog: exit with error if explicitly specified logfile cannot be opened
Dan Williams [Fri, 25 May 2018 17:49:59 +0000 (12:49 -0500)]
vlog: exit with error if explicitly specified logfile cannot be opened

It seems like if the user wanted a specific logfile but that request
cannot be fulfilled, OVS/OVN shouldn't just continue as if nothing
really happened (besides logging a warning).

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agovconn: Remove obsolete comment.
Ben Pfaff [Wed, 23 May 2018 23:39:56 +0000 (16:39 -0700)]
vconn: Remove obsolete comment.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agotests: Avoid printing Python exception for hosts without IPv6 support.
Ben Pfaff [Wed, 23 May 2018 22:15:36 +0000 (15:15 -0700)]
tests: Avoid printing Python exception for hosts without IPv6 support.

The tests probe whether the host has IPv6 support and, if it doesn't, skip
the tests that require IPv6.  However, until now, when the host lacks
support, this caused a Python exception to be printed, like this:

Traceback (most recent call last):
  File "<string>", line 3, in <module>
  File "/usr/lib64/python2.7/socket.py", line 187, in __init__
    _sock = _realsocket(family, type, proto)
socket.error: [Errno 97] Address family not supported by protocol

This exception is expected and harmless, but it reasonably surprised some
users.  This commit fixes the problem.

Reported-by: Paul Greenberg
Reported-at: https://github.com/openvswitch/ovs-issues/issues/146#issuecomment-390081887
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim: Support backup and clustered databases for ovn.
Ben Pfaff [Thu, 17 May 2018 21:20:15 +0000 (14:20 -0700)]
ovs-sim: Support backup and clustered databases for ovn.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-vsctl, ovn-nbctl, ovn-sbctl, vtep-ctl: Parse options before logging.
Ben Pfaff [Thu, 17 May 2018 21:04:25 +0000 (14:04 -0700)]
ovs-vsctl, ovn-nbctl, ovn-sbctl, vtep-ctl: Parse options before logging.

These utilities logged the command very early, before parsing the options
or the command.  This meant that logging options (like --log-file or
-vsyslog:off) weren't considered for the purpose of logging the command.
This fixes the problem.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim, ovs-sandbox: Turn off logging to syslog.
Ben Pfaff [Thu, 17 May 2018 20:53:44 +0000 (13:53 -0700)]
ovs-sim, ovs-sandbox: Turn off logging to syslog.

There's no value in having these testing tools log to syslog.  It just
pollutes the system log.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim: Install RST manpages into simulation environment too.
Ben Pfaff [Thu, 17 May 2018 19:41:23 +0000 (12:41 -0700)]
ovs-sim: Install RST manpages into simulation environment too.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sandbox: Add option to support multiple ovn-controllers.
Ben Pfaff [Fri, 25 May 2018 21:24:18 +0000 (14:24 -0700)]
ovs-sandbox: Add option to support multiple ovn-controllers.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim: Convert documentation to RST format.
Ben Pfaff [Thu, 17 May 2018 19:31:45 +0000 (12:31 -0700)]
ovs-sim: Convert documentation to RST format.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovsdb: Improve timing in cluster torture test.
Ben Pfaff [Thu, 17 May 2018 18:06:01 +0000 (11:06 -0700)]
ovsdb: Improve timing in cluster torture test.

Until now the timing in the cluster torture test has been pretty
inaccurate because it just worked by calling "sleep 1" in a loop that
did other things.  The longer those other things took, the more
inaccurate it got.

This commit changes to using a separate process for timing.  It still won't
be all that accurate but at least the timing loop doesn't try to do
anything else.

(I'm not sure how to actually get accurate timing in shell.)

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovsdb: Improve torture test for clusters.
Ben Pfaff [Thu, 17 May 2018 17:41:51 +0000 (10:41 -0700)]
ovsdb: Improve torture test for clusters.

This test is supposed to be parameterized, but one of the loops didn't
honor the parameterization and just had hardcoded values.  Also, the
output comparison didn't work properly for more than 100 client sets
(n1 > 100), so this adds some explicit sorting to the mix.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovn: Update TODO.
Ben Pfaff [Thu, 24 May 2018 18:00:35 +0000 (11:00 -0700)]
ovn: Update TODO.

We've actually made a lot of improvements.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agocompat/ip_gre: remove duplicate vport definition.
William Tu [Fri, 25 May 2018 13:28:49 +0000 (06:28 -0700)]
compat/ip_gre: remove duplicate vport definition.

Clean up the duplicate definition of OVS_VPORT_TYPE_ERSPAN
since it is defined in openvswitch.h.

Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoodp-util: refactor erspan option parsing.
William Tu [Fri, 25 May 2018 13:28:48 +0000 (06:28 -0700)]
odp-util: refactor erspan option parsing.

Instead of copying the erspan metadata to a local stack variable with
memcpy, parse it in place.

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoerspan: update NEWS and FAQ.
William Tu [Fri, 25 May 2018 16:47:20 +0000 (09:47 -0700)]
erspan: update NEWS and FAQ.

Update Documentation/faq/configuration.rst about ERSPAN
and update NEWS.

Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoEmbrace anonymous unions.
Ben Pfaff [Thu, 24 May 2018 17:32:59 +0000 (10:32 -0700)]
Embrace anonymous unions.

Several OVS structs contain embedded named unions, like this:

struct {
    ...
    union {
        ...
    } u;
};

C11 standardized a feature that many compilers already implemented
anyway, where an embedded union may be unnamed, like this:

struct {
    ...
    union {
        ...
    };
};

This is more convenient because it allows the programmer to omit "u."
in many places.  OVS already used this feature in several places.  This
commit embraces it in several others.
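
As a small compilable illustration of the difference, the sketch below
declares the same data once with a named union and once with an anonymous
one. The struct and member names (value_named, value_anon, ipv4, mac) are
hypothetical and do not come from the OVS tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct with a named union member 'u'. */
struct value_named {
    int type;
    union {
        uint32_t ipv4;
        uint64_t mac;
    } u;
};

/* The same members with a C11 anonymous union. */
struct value_anon {
    int type;
    union {
        uint32_t ipv4;
        uint64_t mac;
    };
};

/* Dropping the union's name is expected to leave the size unchanged. */
static_assert(sizeof(struct value_named) == sizeof(struct value_anon),
              "anonymous union should not change the struct size");

/* Access through the named union needs the extra "u." step. */
static uint32_t
get_ipv4_named(const struct value_named *v)
{
    return v->u.ipv4;
}

/* With the anonymous union the member is reached directly. */
static uint32_t
get_ipv4_anon(const struct value_anon *v)
{
    return v->ipv4;
}
```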

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Tested-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
6 years agoovs-fields: Improve formatting of NSH section.
Ben Pfaff [Fri, 18 May 2018 17:16:41 +0000 (10:16 -0700)]
ovs-fields: Improve formatting of NSH section.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-fields: Correct ideas about which OXM classes are official.
Ben Pfaff [Fri, 18 May 2018 17:16:40 +0000 (10:16 -0700)]
ovs-fields: Correct ideas about which OXM classes are official.

The purpose of including an OpenFlow version in the notes in meta-flow.h
and ovs-fields.7 is to explain what version of OpenFlow standardized a
given field.  NXOXM_* are not standardized so they should not have an
OpenFlow version.  This commit corrects it.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoerspan: add NXOXM_ET_ERSPAN_ field tests.
William Tu [Fri, 25 May 2018 15:03:18 +0000 (08:03 -0700)]
erspan: add NXOXM_ET_ERSPAN_ field tests.

ERSPAN is the first real-world use case of Experimenter OXM,
which introduces 4 new NXOXM_ET_ fields (ver, idx, dir, hwid).
The patch adds test cases for these fields.

At the same time, delete the special case for NXOXM_ET_DP_HASH,
because it was only in there for testing anyway.

Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoovn pacemaker: Fix promotion issue when the master node is reset
Numan Siddique [Thu, 17 May 2018 10:04:09 +0000 (15:34 +0530)]
ovn pacemaker: Fix promotion issue when the master node is reset

When a node 'A' in the pacemaker cluster running OVN db servers in
master is brought down ungracefully ('echo b > /proc/sysrq-trigger'
for example), pacemaker is not able to promote any other node to
master in the cluster. When pacemaker selects a node B for instance to
promote, it moves the IPAddr2 resource (i.e. the master ip) to node
'B'. As soon as the node is configured with the IP address, when the
issue is seen, the OVN db servers which were running as standby
earlier transition to active. Ideally this should not have happened.
The ovsdb-servers are expected to remain in standby until they are
promoted. (This needs separate investigation). When pacemaker
calls the OVN OCF script's promote action, the ovsdb_server_promote
function returns almost immediately without recording the present
master. Later, in the notify action, it demotes the OVN db
servers again since the last known master doesn't match node 'B's
hostname. This results in pacemaker promoting/demoting in a loop.

This patch fixes the issue by not returning immediately when the promote
action is called if the OVN db servers are running as active. Now it
continues with the ovsdb_server_promote function and records the
new master by setting the proper master score ($CRM_MASTER -N $host_name
-v ${master_score}).

This issue is not seen when a node is brought down gracefully because
pacemaker, before promoting a node, calls the stop, start and then
promote actions. It is not clear why pacemaker doesn't call the stop,
start and promote actions when a node is reset ungracefully.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1579025
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
6 years agodpdk: reflect status and version in the database
Aaron Conole [Thu, 3 May 2018 19:08:01 +0000 (15:08 -0400)]
dpdk: reflect status and version in the database

The normal way of retrieving the running DPDK status involves parsing
log files and issuing various incantations of ovs-vsctl and ovs-appctl
commands to determine whether the rte_eal_init successfully started.

This commit adds two new records to reflect the dpdk version, and
the dpdk initialization status.

To support this, the other_config:dpdk-init configuration block supports
the 'true' and 'try' keywords now, instead of just 'true'.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agodpdk: allow init to fail
Aaron Conole [Thu, 3 May 2018 19:08:00 +0000 (15:08 -0400)]
dpdk: allow init to fail

It's possible for dpdk initialization to fail either due to an internal
error or an invalid configuration.  When that happens, it's rather
impolite to immediately abort without any details.

With this change, a failed dpdk initialization attempt will continue to
trigger a SIGABRT.  However, the failure details will be logged, and a
user or administrator may have more information to correct the issue.
A restart of OvS would still be required to re-attempt initialization.

The refactor to propagate the init error will be used in an upcoming
commit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agodpif-netdev: Free packets on TUNNEL_PUSH if should_steal.
Ilya Maximets [Thu, 24 May 2018 09:51:21 +0000 (12:51 +0300)]
dpif-netdev: Free packets on TUNNEL_PUSH if should_steal.

Unconditional return may cause packet leak in case of
'should_steal == true'.

Additionally, removed redundant checking for depth level.

CC: Sugesh Chandran <sugesh.chandran@intel.com>
Fixes: 7c12dfc527a5 ("tunneling: Avoid datapath-recirc by
                      combining recirc actions at xlate.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agonetdev-dpdk: fix check for "net_nfp" driver
Timothy Redaelli [Thu, 17 May 2018 16:45:01 +0000 (18:45 +0200)]
netdev-dpdk: fix check for "net_nfp" driver

Currently the check of "net_nfp" driver while enabling scatter compares
only the first 6 bytes, but "net_nfp" is 7 bytes long.

This change fixes the check by comparing the first 7 bytes.
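
The effect of the shorter length can be reproduced with a small sketch.
The helper names are hypothetical and "net_nfx" is a made-up driver name
used only to trigger the false positive; the actual check in netdev-dpdk.c
is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Old behavior: only "net_nf" is compared, so any driver name that
 * shares those 6 bytes matches. */
static bool
is_nfp_prefix6(const char *driver_name)
{
    return !strncmp(driver_name, "net_nfp", 6);
}

/* Fixed behavior: all 7 bytes of "net_nfp" are compared. */
static bool
is_nfp_prefix7(const char *driver_name)
{
    return !strncmp(driver_name, "net_nfp", 7);
}

/* Demonstrates the false positive and its fix. */
static void
check_nfp_match(void)
{
    assert(is_nfp_prefix6("net_nfx"));      /* False positive. */
    assert(!is_nfp_prefix7("net_nfx"));     /* Correctly rejected. */
    assert(is_nfp_prefix7("net_nfp"));      /* Still matches. */
}
```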

CC: Pablo Cascón <pablo.cascon@netronome.com>
CC: Simon Horman <simon.horman@netronome.com>
Fixes: 65a87968f4cf ("netdev-dpdk: don't enable scatter for jumbo RX support for nfp")
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Pablo Cascón <pablo.cascon@netronome.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agonetdev-dpdk: Don't use PMD driver if not configured successfully
Eelco Chaudron [Wed, 16 May 2018 14:15:34 +0000 (16:15 +0200)]
netdev-dpdk: Don't use PMD driver if not configured successfully

When initialization of the DPDK PMD driver fails
(dpdk_eth_dev_init()), the reconfigure_datapath() function will remove
the port from dp_netdev, and the port is not used.

Now when bridge_reconfigure() is called again, no changes to the
previous failing netdev configuration are detected and therefore the
port gets added to dp_netdev and used uninitialized. This is causing
exceptions...

The fix has two parts to it. First in netdev-dpdk.c we remember if the
DPDK port was started or not, and when calling
netdev_dpdk_reconfigure() we also try re-initialization if the port
was not already active. The second part of the change is in
dpif-netdev.c where it makes sure netdev_reconfigure() is called if
the port needs reconfiguration, as netdev_is_reconf_required() is only
true until netdev_reconfigure() is called (even if it fails).

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agonetdev-dpdk: Remove use of rte_mempool_ops_get_count.
Kevin Traynor [Wed, 23 May 2018 13:41:30 +0000 (14:41 +0100)]
netdev-dpdk: Remove use of rte_mempool_ops_get_count.

rte_mempool_ops_get_count is not exported by DPDK so it means it
cannot be used by OVS when using DPDK as a shared library.

Remove rte_mempool_ops_get_count but still use rte_mempool_full
and document its behavior.

Fixes: 91fccdad72a2 ("netdev-dpdk: Free mempool only when no in-use mbufs.")
Reported-by: Timothy Redaelli <tredaelli@redhat.com>
Reported-by: Markos Chandras <mchandras@suse.de>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agoExtend tests for conjunctive match support in OVN
Numan Siddique [Thu, 24 May 2018 15:45:53 +0000 (17:45 +0200)]
Extend tests for conjunctive match support in OVN

Check the application of conjunctive matching to logical flow match
expressions. In particular cover the case where conjunctive matching is
applied to ACL match expressions that refer to Address Sets.

Mark Michelson, who tested a similar patch [1], found a significant
improvement in ACL processing and a reduction of OF flows from the order
of 1 million to a few thousand. [2]

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
[1] - https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344523.html
[2] - https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344311.html

Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoFactor prerequisites out of AND/OR trees with unique symbol
Jakub Sitnicki [Thu, 24 May 2018 15:45:52 +0000 (17:45 +0200)]
Factor prerequisites out of AND/OR trees with unique symbol

Appending prerequisites to sub-expressions of OR that are all over one
symbol prevents the expression-to-matches converter from applying
conjunctive matching. This happens during the annotation phase.

input:      s1 == { c1, c2 } && s2 == { c3, c4 }
expanded:   (s1 == c1 || s1 == c2) && (s2 == c3 || s2 == c4)
annotated:  ((p1 && s1 == c1) || (p1 && s1 == c2)) &&
            ((p2 && s2 == c3) || (p2 && s2 == c4))
normalized: (p1 && p2 && s1 == c1 && s2 == c3) ||
            (p1 && p2 && s1 == c1 && s2 == c4) ||
            (p1 && p2 && s1 == c2 && s2 == c3) ||
            (p1 && p2 && s1 == c2 && s2 == c4)

Where s1,s2 - symbols, c1..c4 - constants, p1,p2 - prerequisites.

Since sub-expressions of OR trees that are over one symbol all have the
same prerequisites, we can factor them out leaving the OR tree intact,
and enabling the converter to apply conjunctive matching to
AND(OR(clause)) trees.

Going back to our example this change gives us:

input:      s1 == { c1, c2 } && s2 == { c3, c4 }
expanded:   (s1 == c1 || s1 == c2) && (s2 == c3 || s2 == c4)
annotated:  (s1 == c1 || s1 == c2) && p1 && (s2 == c3 || s2 == c4) && p2
normalized: p1 && p2 && (s1 == c1 || s1 == c2) && (s2 == c3 || s2 == c4)

We also factor the prerequisites out of pure AND or mixed AND/OR
trees to keep the common code path, but in this case the only thing we
gain is a shorter expression as prerequisites for each symbol appear
only once.

Documentation comments have been contributed by Ben Pfaff.

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agonetdev-native-tnl: Fix alignment for erspan index.
Darrell Ball [Thu, 24 May 2018 02:13:56 +0000 (19:13 -0700)]
netdev-native-tnl: Fix alignment for erspan index.

Flagged by clang.

CC: William Tu <u9012063@gmail.com>
Fixes: 068794b43f0e ("erspan: Add flow-based erspan options")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>