git.proxmox.com Git - mirror_ovs.git/log
6 years ago ofproto-dpif: Remove tabs from output.
Ben Pfaff [Sat, 26 May 2018 00:03:05 +0000 (17:03 -0700)]
ofproto-dpif: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ofproto-dpif-upcall: Remove tabs from output.
Ben Pfaff [Sat, 26 May 2018 00:02:22 +0000 (17:02 -0700)]
ofproto-dpif-upcall: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ofproto-dpif-trace: Remove tabs from output.
Ben Pfaff [Sat, 26 May 2018 00:01:48 +0000 (17:01 -0700)]
ofproto-dpif-trace: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago bond: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:59:40 +0000 (16:59 -0700)]
bond: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago stopwatch: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:58:25 +0000 (16:58 -0700)]
stopwatch: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago rstp, stp: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:57:59 +0000 (16:57 -0700)]
rstp, stp: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ovs-lldp: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:57:13 +0000 (16:57 -0700)]
ovs-lldp: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago lacp: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:56:18 +0000 (16:56 -0700)]
lacp: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago dpctl: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:55:18 +0000 (16:55 -0700)]
dpctl: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago cfm: Remove tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:50:54 +0000 (16:50 -0700)]
cfm: Remove tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago bfd: Remove leading tabs from output.
Ben Pfaff [Fri, 25 May 2018 23:50:29 +0000 (16:50 -0700)]
bfd: Remove leading tabs from output.

OVS uses spaces for indentation in source code and it makes sense for it to
also use spaces for indentation in output.  Spaces also consume less
horizontal space in output, which often makes it easier to read.  This
commit transitions one part of output from tabs to spaces and updates
appropriate parts of the tests to match.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ovn-sandbox: Fix link.
Ben Pfaff [Sun, 3 Jun 2018 20:40:26 +0000 (13:40 -0700)]
ovn-sandbox: Fix link.

I couldn't figure out a way to fix this without making it inline.  Weird.

Reported-by: Qiuyu Xiao <qxiao@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ofp-bundle: Minor style fixes for header.
Ben Pfaff [Thu, 17 May 2018 15:22:45 +0000 (08:22 -0700)]
ofp-bundle: Minor style fixes for header.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years ago ovn-controller: Pass around pointers to individual tables.
Ben Pfaff [Thu, 7 Jun 2018 21:22:33 +0000 (14:22 -0700)]
ovn-controller: Pass around pointers to individual tables.

We're working to make ovn-controller compute more incrementally, to reduce
CPU usage.  To make it easier to keep track of dependencies, it makes sense
to pass around pointers to fine-grained resources instead of an entire
database at a time.  This commit introduces a way to pass individual tables
around and starts using that feature in ovn-controller.
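
The dependency-tracking idea can be illustrated with a minimal sketch.  All
names here (ex_port_table, ex_chassis_table, ex_database,
ex_count_ports_with_prefix) are hypothetical and are not the types or
functions this commit actually introduces; the point is only that a function
taking a pointer to one table makes its dependency visible in its signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for two database tables (illustration only). */
struct ex_port_table {
    const char *const *names;       /* Port names. */
    size_t n;                       /* Number of ports. */
};

struct ex_chassis_table {
    const char *const *hostnames;   /* Chassis hostnames. */
    size_t n;                       /* Number of chassis. */
};

/* Hypothetical stand-in for an entire database. */
struct ex_database {
    struct ex_port_table ports;
    struct ex_chassis_table chassis;
};

/* Takes only the port table, so the signature itself documents that this
 * computation depends on ports and on nothing else in the database. */
size_t
ex_count_ports_with_prefix(const struct ex_port_table *ports,
                           const char *prefix)
{
    assert(ports != NULL && prefix != NULL);

    size_t prefix_len = strlen(prefix);
    size_t count = 0;
    for (size_t i = 0; i < ports->n; i++) {
        if (!strncmp(ports->names[i], prefix, prefix_len)) {
            count++;
        }
    }
    return count;
}
```

A caller that holds a whole ex_database would pass &db.ports, so a change to
the chassis table clearly cannot affect the result.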

CC: Han Zhou <zhouhan@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years ago ovn-controller: Style fixes.
Ben Pfaff [Tue, 5 Jun 2018 18:04:39 +0000 (11:04 -0700)]
ovn-controller: Style fixes.

The OVS coding style says that input parameters should come first,
followed by output parameters.  This changes a few functions in
ovn-controller to fit this style.  It also marks a number of input
parameters 'const', for clarity.
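
As a rough illustration of the parameter ordering described above, here is a
small sketch.  The struct port_stats type and the sum_port_stats() function
are hypothetical and do not come from ovn-controller; they only show input
parameters first (marked 'const') and the output parameter last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record type, used only for this illustration. */
struct port_stats {
    unsigned long long rx_packets;
    unsigned long long tx_packets;
};

/* Input parameters ('ports', 'n_ports') come first; 'ports' is marked
 * 'const' because the function only reads it.  The output parameter
 * ('total') comes last. */
bool
sum_port_stats(const struct port_stats *ports, size_t n_ports,
               struct port_stats *total)
{
    assert(total != NULL);

    if (!ports && n_ports) {
        return false;
    }

    total->rx_packets = 0;
    total->tx_packets = 0;
    for (size_t i = 0; i < n_ports; i++) {
        total->rx_packets += ports[i].rx_packets;
        total->tx_packets += ports[i].tx_packets;
    }
    return true;
}
```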

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years ago datapath-windows: Add support for handling DEI bit of VLAN header
Anand Kumar [Tue, 15 May 2018 23:38:00 +0000 (16:38 -0700)]
datapath-windows: Add support for handling DEI bit of VLAN header

The Drop Eligible Indicator (DEI) is 1 bit wide and is part of the
Tag Control Information (TCI) in the VLAN header.  It indicates that
the frame can be dropped during congestion.
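
For reference, the 16-bit TCI holds a 3-bit priority (PCP) in bits 15-13, the
DEI in bit 12, and a 12-bit VLAN ID in bits 11-0.  The sketch below shows one
way to pack and unpack those fields.  The EX_ macros and ex_vlan_* helpers are
hypothetical names, not the ones used by datapath-windows, and the TCI is
assumed to be in host byte order already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the field layout follows the 802.1Q TCI. */
#define EX_VLAN_VID_MASK  0x0fff    /* Bits 11-0: VLAN ID. */
#define EX_VLAN_DEI_MASK  0x1000    /* Bit 12: drop eligible indicator. */
#define EX_VLAN_PCP_SHIFT 13        /* Bits 15-13: priority code point. */

/* Builds a host-byte-order TCI from its three fields. */
uint16_t
ex_vlan_tci_make(uint8_t pcp, bool dei, uint16_t vid)
{
    assert(pcp <= 7 && vid <= EX_VLAN_VID_MASK);
    return (uint16_t) ((pcp << EX_VLAN_PCP_SHIFT)
                       | (dei ? EX_VLAN_DEI_MASK : 0)
                       | vid);
}

/* Returns true if the frame is marked as eligible to be dropped. */
bool
ex_vlan_tci_dei(uint16_t tci)
{
    return (tci & EX_VLAN_DEI_MASK) != 0;
}

/* Returns the 3-bit priority code point. */
uint8_t
ex_vlan_tci_pcp(uint16_t tci)
{
    return (uint8_t) (tci >> EX_VLAN_PCP_SHIFT);
}

/* Returns the 12-bit VLAN ID. */
uint16_t
ex_vlan_tci_vid(uint16_t tci)
{
    return tci & EX_VLAN_VID_MASK;
}
```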

Signed-off-by: Anand Kumar <kumaranand@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
6 years ago lib: fix typo in fragment handling error message
Louis Peens [Tue, 29 May 2018 18:51:15 +0000 (20:51 +0200)]
lib: fix typo in fragment handling error message

The error message states that "not_first" is a valid selection
for the ip_frag field, but looking at the structure that is defined
this should say "not_later".

Signed-off-by: Louis Peens <louis.peens@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
6 years ago datapath: Check if gre kernel module is loaded
Greg Rose [Wed, 6 Jun 2018 22:23:28 +0000 (15:23 -0700)]
datapath: Check if gre kernel module is loaded

Before attempting to add a gre tunnel to OVS via the vport gre
kernel interface make sure that the openvswitch kernel module has
been able to grab the gre protocol entry point.  If OVS does not
own the gre protocol then report address family not supported.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago dhparams: Add pregenerated .c file to the repository.
Eneas U de Queiroz [Tue, 5 Jun 2018 22:25:42 +0000 (15:25 -0700)]
dhparams: Add pregenerated .c file to the repository.

The version of dhparams.c generated by any given version of OpenSSL or
LibreSSL might work only with that version of the library.  This can be
inconvenient for cross-compiling if the "openssl" program on the build
machine has a different version from the library on the host where OVS will
run, since it could generate code that won't compile.

This commit fixes the problem by generating dhparams.c that works on the
currently important versions of OpenSSL and LibreSSL.

Submitted-at: https://github.com/openvswitch/ovs/pull/235
Signed-off-by: Eneas U de Queiroz <cote2004-github@yahoo.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago rhel: remove ovs-sim man page from temporary directory (also for RHEL)
Ansis Atteka [Wed, 6 Jun 2018 02:48:26 +0000 (19:48 -0700)]
rhel: remove ovs-sim man page from temporary directory (also for RHEL)

Fix the following compilation error when building rpm packages
with the rhel/openvswitch.spec file:

error: Installed (but unpackaged) file(s) found:
   /usr/share/man/man1/ovs-sim.1.gz

Signed-off-by: Ansis Atteka <aatteka@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
6 years ago rhel: remove ovs-sim man page from temporary directory
Lorenzo Bianconi [Tue, 5 Jun 2018 12:42:23 +0000 (14:42 +0200)]
rhel: remove ovs-sim man page from temporary directory

Fix the following compilation error when running 'make rpm-fedora':

error: Installed (but unpackaged) file(s) found:
   /usr/share/man/man1/ovs-sim.1.gz

RPM build errors:
    Installed (but unpackaged) file(s) found:
   /usr/share/man/man1/ovs-sim.1.gz
make: *** [Makefile:7049: rpm-fedora] Error 1

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Roi Dayan <roid@mellanox.com>
6 years ago ovndb-servers.ocf: add LB support for managing ovndb cluster:
aginwala [Sat, 2 Jun 2018 16:11:56 +0000 (09:11 -0700)]
ovndb-servers.ocf: add LB support for managing ovndb cluster:

using pacemaker so that controllers can be placed in different fault domains.
More background about the discussions can be found on:
https://mail.openvswitch.org/pipermail/ovs-discuss/2018-May/046770.html

Signed-off-by: aginwala <aginwala@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Tested-by: Numan Siddique <nusiddiq@redhat.com>
6 years ago python: Update docstring in ovs.db.idl.Idl class.
Toms Atteka [Mon, 4 Jun 2018 18:33:32 +0000 (11:33 -0700)]
python: Update docstring in ovs.db.idl.Idl class.

Adjusted docstring and variable names according to previous code changes;
Fixed grammar "a attribute" > "an attribute".

Fixes: bf42f674 (idl: Convert python daemons to utilize SchemaHelper)
Signed-off-by: Toms Atteka <cpp.code.lv@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago rstp: Eliminate BPDU padding and uninitialized bytes.
Ben Pfaff [Mon, 4 Jun 2018 20:42:10 +0000 (13:42 -0700)]
rstp: Eliminate BPDU padding and uninitialized bytes.

When the RSTP implementation sent BPDUs, it failed to initialize some of
their bytes.  None of the code initialized an array of 7 padding bytes, and
some of it also failed to initialize the version1_length field.  In
addition, the padding bytes confused some implementations that did not
correctly ignore extra bytes.

This commit fixes both problems, by removing the padding bytes and
initializing every byte in outgoing messages.
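
One common way to guarantee that every byte of an outgoing message is
initialized is to zero the whole structure before filling in individual
fields.  The sketch below illustrates that pattern only.  struct ex_bpdu and
ex_bpdu_init() are hypothetical, the layout is simplified rather than the real
RSTP BPDU, and this is not necessarily how the commit itself does it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified message layout (not the real RSTP BPDU).  All
 * members are single bytes, so the compiler inserts no hidden padding. */
struct ex_bpdu {
    uint8_t protocol_id[2];
    uint8_t version;
    uint8_t type;
    uint8_t flags;
    uint8_t version1_length;
};

/* Zeroes the entire message first, so that any field the caller does not
 * set explicitly (here 'protocol_id' and 'version1_length') goes out on the
 * wire as 0 instead of whatever happened to be in memory. */
void
ex_bpdu_init(struct ex_bpdu *bpdu, uint8_t version, uint8_t type,
             uint8_t flags)
{
    assert(bpdu != NULL);

    memset(bpdu, 0, sizeof *bpdu);
    bpdu->version = version;
    bpdu->type = type;
    bpdu->flags = flags;
}
```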

Reported-by: David van Moolenbroek <dvmoolenbroek@aimvalley.nl>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-June/046864.html
Tested-by: David van Moolenbroek <dvmoolenbroek@aimvalley.nl>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago Removed calls to API deprecated in openssl 1.1
Eneas U de Queiroz [Tue, 5 Jun 2018 13:36:51 +0000 (10:36 -0300)]
Removed calls to API deprecated in openssl 1.1

In openssl 1.1, there is no need to initialize the library.  It is
automatically done when first used.  This allows openvswitch to be compiled
with openssl 1.1.0 with the deprecated API disabled.

Signed-off-by: Eneas U de Queiroz <cote2004-github@yahoo.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago datapath: Do not fail to load on gre protocol conflict
Greg Rose [Mon, 4 Jun 2018 20:14:38 +0000 (13:14 -0700)]
datapath: Do not fail to load on gre protocol conflict

The ERSPAN feature depends on the gre kernel module so on systems where
the ERSPAN feature isn't supported the openvswitch kernel module would
attempt to grab the ipv4 GRE protocol entry point and would fail to load
if it could not.

This patch modifies openvswitch to not fail to load when the gre kernel
module is loaded and instead it will print a warning message to the
kernel system log indicating that the ERSPAN feature may not be
available.

We need this patch because users are experiencing failures due to the
conflicts and high priority bugs are resulting.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago Revert "utilities/ovs-ctl: Force removal of ip_gre/gre"
Greg Rose [Mon, 4 Jun 2018 20:14:37 +0000 (13:14 -0700)]
Revert "utilities/ovs-ctl: Force removal of ip_gre/gre"

This reverts commit 2bdd1f3d96a86bea6bdb8788f23ec7dd99b289e3.

This is the wrong direction for the solution to the ip_gre/gre kernel
module conflicts, as reported by Jiri Benc <jbenc@redhat.com> and others in
https://mail.openvswitch.org/pipermail/ovs-dev/2018-June/347803.html and
elsewhere in the same thread

Rather than attempting to force the removal of the ip_gre/gre kernel
modules, which often fails because they're in use, we will add a patch that
does not cause the openvswitch kernel module to fail to load when the
ip_gre/gre protocol entry points are already claimed.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years ago Revert "dpif: Ensure ERSPAN GRE support"
Greg Rose [Mon, 4 Jun 2018 20:14:36 +0000 (13:14 -0700)]
Revert "dpif: Ensure ERSPAN GRE support"

This reverts commit 8929c55287abae37efeac1e8876e6b3c2ccad0b9.

This is the wrong direction for the solution to the ip_gre/gre kernel
module conflicts, as reported by Jiri Benc <jbenc@redhat.com> and others in
https://mail.openvswitch.org/pipermail/ovs-dev/2018-June/347803.html and
elsewhere in the same thread

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years ago compat: Fix compile warning
Greg Rose [Mon, 4 Jun 2018 20:33:30 +0000 (13:33 -0700)]
compat: Fix compile warning

Fix compile warning about redefined symbol

Fixes: 10f242363d ("compat: Add skb_checksum_simple_complete()")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago Fix typo in database commands documentation.
Mark Michelson [Mon, 4 Jun 2018 14:36:31 +0000 (10:36 -0400)]
Fix typo in database commands documentation.

s/remov/remove/

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago compat: Add skb_checksum_simple_complete()
Greg Rose [Fri, 1 Jun 2018 20:07:43 +0000 (13:07 -0700)]
compat: Add skb_checksum_simple_complete()

A recent patch to gre.c added a call to skb_checksum_simple_complete()
which is not present in kernels before 3.16.  Fix up the compatibility
layer to allow compilation on older kernels that do not have it.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago utilities/ovs-ctl: Force removal of ip_gre/gre
Greg Rose [Thu, 31 May 2018 21:20:45 +0000 (14:20 -0700)]
utilities/ovs-ctl: Force removal of ip_gre/gre

On Linux kernels older than 4.16 the user cannot take advantage of
OVS ERSPAN features if the older ip_gre and gre kernel modules are
loaded.  In addition, the openvswitch kernel module will fail to
load because it cannot grab the IPPROTO_GRE inet protocol handler
since the gre kernel module has already taken it.

Update the force_reload_kmod() script function to force removal
of the ip_gre and gre built-in kernel modules so that the openvswitch
kernel module can load and provide support for ERSPAN.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago dpif: Ensure ERSPAN GRE support
Greg Rose [Thu, 31 May 2018 22:50:31 +0000 (15:50 -0700)]
dpif: Ensure ERSPAN GRE support

When verifying the built-in gre kernel module, check for ERSPAN support.

Reported-by: Guru Shetty <guru@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago compat: Fixups for newer kernels
Greg Rose [Thu, 31 May 2018 21:10:10 +0000 (14:10 -0700)]
compat: Fixups for newer kernels

A recent patch series added support for ERSPAN but left some problems
remaining for kernel releases from 4.10 to 4.14.  This patch
addresses those problems.

Of note is that the old cisco gre compat layer code is gone for good.

Also, several compat defines in acinclude.m4 were looking for keys
in .c source files - this does not work on distros without source
code.  A more reliable key was already defined so we use that instead.

We have pared support for the Linux kernel releases in .travis.yml
to reflect that 4.15 is no longer in the LTS list.  With this patch
the Out of Tree OVS datapath kernel modules can build on kernels
up to 4.14.47.  Support for kernels up to 4.16.x will be added
later.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years ago datapath: ip6_gre: fix tunnel metadata device sharing.
William Tu [Tue, 29 May 2018 12:55:05 +0000 (05:55 -0700)]
datapath: ip6_gre: fix tunnel metadata device sharing.

commit b80d0b93b991e551a32157e0d9d38fc5bc9348a7
Author: William Tu <u9012063@gmail.com>
Date:   Fri May 18 19:22:28 2018 -0700

    net: ip6_gre: fix tunnel metadata device sharing.

    Currently ip6gre and ip6erspan share single metadata mode device,
    using 'collect_md_tun'.  Thus, when doing:
      ip link add dev ip6gre11 type ip6gretap external
      ip link add dev ip6erspan12 type ip6erspan external
      RTNETLINK answers: File exists
    the second command simply fails, because it tries to create the same collect_md_tun.

    The patch fixes it by adding a separate collect md tunnel device
    for the ip6erspan, 'collect_md_tun_erspan'.  As a result, a couple
    of places need to be refactored/split up in order to distinguish ip6gre
    and ip6erspan.

    First, move the collect_md check at ip6gre_tunnel_{unlink,link} and
    create separate functions {ip6gre,ip6erspan}_tunnel_{link_md,unlink_md}.
    Then before link/unlink, make sure the link_md/unlink_md is called.
    Finally, a separate ndo_uninit is created for ip6erspan.  Tested it
    using the samples/bpf/test_tunnel_bpf.sh.

Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Fix ip6erspan hlen calculation
William Tu [Tue, 29 May 2018 12:55:04 +0000 (05:55 -0700)]
datapath: ip6_gre: Fix ip6erspan hlen calculation

commit 2d665034f239412927b1e71329f20f001c92da09
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:51 2018 +0200

    net: ip6_gre: Fix ip6erspan hlen calculation

    Even though ip6erspan_tap_init() sets up hlen and tun_hlen according to
    what ERSPAN needs, it goes ahead to call ip6gre_tnl_link_config() which
    overwrites these settings with GRE-specific ones.

    Similarly for changelink callbacks, which are handled by
    ip6gre_changelink(), which calls ip6gre_tnl_change(), which in turn calls
    ip6gre_tnl_link_config() as well.

    The difference ends up being 12 vs. 20 bytes, and this is generally not
    a problem, because a 12-byte request likely ends up allocating more and
    the extra 8 bytes are thus available. However correct it is not.

    So replace the newlink and changelink callbacks with ERSPAN-specific
    ones, reusing the newly-introduced _common() functions.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Split up ip6gre_changelink()
William Tu [Tue, 29 May 2018 12:55:03 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_changelink()

commit c8632fc30bb03aa0c3bd7bcce85355a10feb8149
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:45 2018 +0200

    net: ip6_gre: Split up ip6gre_changelink()

    Extract from ip6gre_changelink() a reusable function
    ip6gre_changelink_common(). This will allow introduction of
    ERSPAN-specific _changelink() function with not a lot of code
    duplication.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Split up ip6gre_newlink()
William Tu [Tue, 29 May 2018 12:55:02 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_newlink()

commit 7fa38a7c852ec99e3a7fc375eb2c21c50c2e46b8
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:39 2018 +0200

    net: ip6_gre: Split up ip6gre_newlink()

    Extract from ip6gre_newlink() a reusable function
    ip6gre_newlink_common(). The ip6gre_tnl_link_config() call needs to be
    made customizable for ERSPAN, thus reorder it with calls to
    ip6_tnl_change_mtu() and dev_hold(), and extract the whole tail to the
    caller, ip6gre_newlink(). Thus enable an ERSPAN-specific _newlink()
    function without a lot of duplication.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Split up ip6gre_tnl_change()
William Tu [Tue, 29 May 2018 12:55:01 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_tnl_change()

commit a6465350ef495f5cbd76a3e505d25a01d648477e
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:33 2018 +0200

    net: ip6_gre: Split up ip6gre_tnl_change()

    Split a reusable function ip6gre_tnl_copy_tnl_parm() from
    ip6gre_tnl_change(). This will allow ERSPAN-specific code to
    reuse the common parts while customizing the behavior for ERSPAN.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Split up ip6gre_tnl_link_config()
William Tu [Tue, 29 May 2018 12:55:00 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_tnl_link_config()

commit a483373ead61e6079bc8ebe27e2dfdb2e3c1559f
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:27 2018 +0200

    net: ip6_gre: Split up ip6gre_tnl_link_config()

    The function ip6gre_tnl_link_config() is used for setting up
    configuration of both ip6gretap and ip6erspan tunnels. Split the
    function into the common part and the route-lookup part. The latter then
    takes the calculated header length as an argument. This split will allow
    the patches down the line to sneak in a custom header length computation
    for the ERSPAN tunnel.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years ago datapath: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()
William Tu [Tue, 29 May 2018 12:54:59 +0000 (05:54 -0700)]
datapath: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()

commit 5691484df961aff897d824bcc26cd1a2aa036b5b
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:15 2018 +0200

    net: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()

    dev->needed_headroom is not primed until ip6_tnl_xmit(), so it starts
    out zero. Thus the call to skb_cow_head() fails to actually make sure
    there's enough headroom to push the ERSPAN headers to. That can lead to
    the panic cited below. (Reproducer below that).

    Fix by requesting either needed_headroom if already primed, or just the
    bare minimum needed for the header otherwise.
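
The selection rule described above can be sketched in plain C.  The helper
name ex_headroom_to_request() and its parameters are hypothetical; in the
kernel the resulting value would be passed to skb_cow_head(), which is not
modeled here.

```c
#include <assert.h>

/* Hypothetical helper: picks how much headroom to ask for before pushing
 * the tunnel headers.
 *
 * 'needed_headroom' is 0 until the device has been "primed" by a first
 * transmit, so fall back to the bare header length in that case. */
unsigned int
ex_headroom_to_request(unsigned int needed_headroom, unsigned int hdr_len)
{
    assert(hdr_len > 0);
    return needed_headroom ? needed_headroom : hdr_len;
}
```

With needed_headroom still at zero, the request is at least the header
length, so a later push of the headers does not run past the start of the
buffer.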

    [  190.703567] kernel BUG at net/core/skbuff.c:104!
    [  190.708384] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
    [  190.714007] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_temp_thermal mlx_platform nfsd e1000e leds_mlxcpld
    [  190.728975] CPU: 1 PID: 959 Comm: kworker/1:2 Not tainted 4.17.0-rc4-net_master-custom-139 #10
    [  190.737647] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
    [  190.747006] Workqueue: ipv6_addrconf addrconf_dad_work
    [  190.752222] RIP: 0010:skb_panic+0xc3/0x100
    [  190.756358] RSP: 0018:ffff8801d54072f0 EFLAGS: 00010282
    [  190.761629] RAX: 0000000000000085 RBX: ffff8801c1a8ecc0 RCX: 0000000000000000
    [  190.768830] RDX: 0000000000000085 RSI: dffffc0000000000 RDI: ffffed003aa80e54
    [  190.776025] RBP: ffff8801bd1ec5a0 R08: ffffed003aabce19 R09: ffffed003aabce19
    [  190.783226] R10: 0000000000000001 R11: ffffed003aabce18 R12: ffff8801bf695dbe
    [  190.790418] R13: 0000000000000084 R14: 00000000000006c0 R15: ffff8801bf695dc8
    [  190.797621] FS:  0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
    [  190.805786] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  190.811582] CR2: 000055fa929aced0 CR3: 0000000003228004 CR4: 00000000001606e0
    [  190.818790] Call Trace:
    [  190.821264]  <IRQ>
    [  190.823314]  ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
    [  190.828940]  ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
    [  190.834562]  skb_push+0x78/0x90
    [  190.837749]  ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
    [  190.843219]  ? ip6gre_tunnel_ioctl+0xd90/0xd90 [ip6_gre]
    [  190.848577]  ? debug_check_no_locks_freed+0x210/0x210
    [  190.853679]  ? debug_check_no_locks_freed+0x210/0x210
    [  190.858783]  ? print_irqtrace_events+0x120/0x120
    [  190.863451]  ? sched_clock_cpu+0x18/0x210
    [  190.867496]  ? cyc2ns_read_end+0x10/0x10
    [  190.871474]  ? skb_network_protocol+0x76/0x200
    [  190.875977]  dev_hard_start_xmit+0x137/0x770
    [  190.880317]  ? do_raw_spin_trylock+0x6d/0xa0
    [  190.884624]  sch_direct_xmit+0x2ef/0x5d0
    [  190.888589]  ? pfifo_fast_dequeue+0x3fa/0x670
    [  190.892994]  ? pfifo_fast_change_tx_queue_len+0x810/0x810
    [  190.898455]  ? __lock_is_held+0xa0/0x160
    [  190.902422]  __qdisc_run+0x39e/0xfc0
    [  190.906041]  ? _raw_spin_unlock+0x29/0x40
    [  190.910090]  ? pfifo_fast_enqueue+0x24b/0x3e0
    [  190.914501]  ? sch_direct_xmit+0x5d0/0x5d0
    [  190.918658]  ? pfifo_fast_dequeue+0x670/0x670
    [  190.923047]  ? __dev_queue_xmit+0x172/0x1770
    [  190.927365]  ? preempt_count_sub+0xf/0xd0
    [  190.931421]  __dev_queue_xmit+0x410/0x1770
    [  190.935553]  ? ___slab_alloc+0x605/0x930
    [  190.939524]  ? print_irqtrace_events+0x120/0x120
    [  190.944186]  ? memcpy+0x34/0x50
    [  190.947364]  ? netdev_pick_tx+0x1c0/0x1c0
    [  190.951428]  ? __skb_clone+0x2fd/0x3d0
    [  190.955218]  ? __copy_skb_header+0x270/0x270
    [  190.959537]  ? rcu_read_lock_sched_held+0x93/0xa0
    [  190.964282]  ? kmem_cache_alloc+0x344/0x4d0
    [  190.968520]  ? cyc2ns_read_end+0x10/0x10
    [  190.972495]  ? skb_clone+0x123/0x230
    [  190.976112]  ? skb_split+0x820/0x820
    [  190.979747]  ? tcf_mirred+0x554/0x930 [act_mirred]
    [  190.984582]  tcf_mirred+0x554/0x930 [act_mirred]
    [  190.989252]  ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
    [  190.996109]  ? __lock_acquire+0x706/0x26e0
    [  191.000239]  ? sched_clock_cpu+0x18/0x210
    [  191.004294]  tcf_action_exec+0xcf/0x2a0
    [  191.008179]  tcf_classify+0xfa/0x340
    [  191.011794]  __netif_receive_skb_core+0x8e1/0x1c60
    [  191.016630]  ? debug_check_no_locks_freed+0x210/0x210
    [  191.021732]  ? nf_ingress+0x500/0x500
    [  191.025458]  ? process_backlog+0x347/0x4b0
    [  191.029619]  ? print_irqtrace_events+0x120/0x120
    [  191.034302]  ? lock_acquire+0xd8/0x320
    [  191.038089]  ? process_backlog+0x1b6/0x4b0
    [  191.042246]  ? process_backlog+0xc2/0x4b0
    [  191.046303]  process_backlog+0xc2/0x4b0
    [  191.050189]  net_rx_action+0x5cc/0x980
    [  191.053991]  ? napi_complete_done+0x2c0/0x2c0
    [  191.058386]  ? mark_lock+0x13d/0xb40
    [  191.062001]  ? clockevents_program_event+0x6b/0x1d0
    [  191.066922]  ? print_irqtrace_events+0x120/0x120
    [  191.071593]  ? __lock_is_held+0xa0/0x160
    [  191.075566]  __do_softirq+0x1d4/0x9d2
    [  191.079282]  ? ip6_finish_output2+0x524/0x1460
    [  191.083771]  do_softirq_own_stack+0x2a/0x40
    [  191.087994]  </IRQ>
    [  191.090130]  do_softirq.part.13+0x38/0x40
    [  191.094178]  __local_bh_enable_ip+0x135/0x190
    [  191.098591]  ip6_finish_output2+0x54d/0x1460
    [  191.102916]  ? ip6_forward_finish+0x2f0/0x2f0
    [  191.107314]  ? ip6_mtu+0x3c/0x2c0
    [  191.110674]  ? ip6_finish_output+0x2f8/0x650
    [  191.114992]  ? ip6_output+0x12a/0x500
    [  191.118696]  ip6_output+0x12a/0x500
    [  191.122223]  ? ip6_route_dev_notify+0x5b0/0x5b0
    [  191.126807]  ? ip6_finish_output+0x650/0x650
    [  191.131120]  ? ip6_fragment+0x1a60/0x1a60
    [  191.135182]  ? icmp6_dst_alloc+0x26e/0x470
    [  191.139317]  mld_sendpack+0x672/0x830
    [  191.143021]  ? igmp6_mcf_seq_next+0x2f0/0x2f0
    [  191.147429]  ? __local_bh_enable_ip+0x77/0x190
    [  191.151913]  ipv6_mc_dad_complete+0x47/0x90
    [  191.156144]  addrconf_dad_completed+0x561/0x720
    [  191.160731]  ? addrconf_rs_timer+0x3a0/0x3a0
    [  191.165036]  ? mark_held_locks+0xc9/0x140
    [  191.169095]  ? __local_bh_enable_ip+0x77/0x190
    [  191.173570]  ? addrconf_dad_work+0x50d/0xa20
    [  191.177886]  ? addrconf_dad_work+0x529/0xa20
    [  191.182194]  addrconf_dad_work+0x529/0xa20
    [  191.186342]  ? addrconf_dad_completed+0x720/0x720
    [  191.191088]  ? __lock_is_held+0xa0/0x160
    [  191.195059]  ? process_one_work+0x45d/0xe20
    [  191.199302]  ? process_one_work+0x51e/0xe20
    [  191.203531]  ? rcu_read_lock_sched_held+0x93/0xa0
    [  191.208279]  process_one_work+0x51e/0xe20
    [  191.212340]  ? pwq_dec_nr_in_flight+0x200/0x200
    [  191.216912]  ? get_lock_stats+0x4b/0xf0
    [  191.220788]  ? preempt_count_sub+0xf/0xd0
    [  191.224844]  ? worker_thread+0x219/0x860
    [  191.228823]  ? do_raw_spin_trylock+0x6d/0xa0
    [  191.233142]  worker_thread+0xeb/0x860
    [  191.236848]  ? process_one_work+0xe20/0xe20
    [  191.241095]  kthread+0x206/0x300
    [  191.244352]  ? process_one_work+0xe20/0xe20
    [  191.248587]  ? kthread_stop+0x570/0x570
    [  191.252459]  ret_from_fork+0x3a/0x50
    [  191.256082] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
    [  191.275327] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d54072f0
    [  191.281024] ---[ end trace 7ea51094e099e006 ]---
    [  191.285724] Kernel panic - not syncing: Fatal exception in interrupt
    [  191.292168] Kernel Offset: disabled
    [  191.295697] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

    Reproducer:

        ip link add h1 type veth peer name swp1
        ip link add h3 type veth peer name swp3

        ip link set dev h1 up
        ip address add 192.0.2.1/28 dev h1

        ip link add dev vh3 type vrf table 20
        ip link set dev h3 master vh3
        ip link set dev vh3 up
        ip link set dev h3 up

        ip link set dev swp3 up
        ip address add dev swp3 2001:db8:2::1/64

        ip link set dev swp1 up
        tc qdisc add dev swp1 clsact

        ip link add name gt6 type ip6erspan \
                local 2001:db8:2::1 remote 2001:db8:2::2 oseq okey 123
        ip link set dev gt6 up

        sleep 1

        tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
                action mirred egress mirror dev gt6
        ping -I h1 192.0.2.2

Fixes: e41c7c68ea77 ("ip6erspan: make sure enough headroom at xmit.")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years agodatapath: ip6_gre: Request headroom in __gre6_xmit()
William Tu [Tue, 29 May 2018 12:54:58 +0000 (05:54 -0700)]
datapath: ip6_gre: Request headroom in __gre6_xmit()

Upstream commit:
commit 01b8d064d58b4c1f0eff47f8fe8a8508cb3b3840
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:10 2018 +0200

net: ip6_gre: Request headroom in __gre6_xmit()

__gre6_xmit() pushes GRE headers before handing over to ip6_tnl_xmit()
for generic IP-in-IP processing. However it doesn't make sure that there
is enough headroom to push the header to. That can lead to the panic
cited below. (Reproducer below that).

Fix by requesting either needed_headroom if already primed, or just the
bare minimum needed for the header otherwise.
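
The idea behind the fix can be pictured with a toy model of a packet buffer. This is only a sketch in Python, not kernel code: `PacketBuffer`, `cow_head`, `push`, and `gre6_xmit` are hypothetical stand-ins for `sk_buff`, a headroom-reserving call, `skb_push()`, and `__gre6_xmit()`, and the sizes are made up.

```python
class PacketBuffer:
    """Toy stand-in for an sk_buff: payload preceded by free headroom."""

    def __init__(self, payload: bytes, headroom: int):
        self.headroom = headroom
        self.data = payload

    def cow_head(self, needed: int) -> None:
        """Grow the headroom to at least `needed` bytes (hypothetical helper)."""
        if self.headroom < needed:
            self.headroom = needed

    def push(self, header: bytes) -> None:
        """Prepend a header; fail if it does not fit (the kernel would panic)."""
        if len(header) > self.headroom:
            raise OverflowError("not enough headroom for header")
        self.headroom -= len(header)
        self.data = header + self.data


def gre6_xmit(buf: PacketBuffer, header: bytes, needed_headroom: int = 0) -> None:
    """Request headroom before pushing, as the commit message describes."""
    # Use needed_headroom if it is already primed (non-zero), otherwise
    # just the bare minimum needed for this header.
    buf.cow_head(needed_headroom or len(header))
    buf.push(header)
```

Calling `push()` directly on a buffer with too little headroom raises, which models the panic in the trace; going through `gre6_xmit()` reserves the space first.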

[  158.576725] kernel BUG at net/core/skbuff.c:104!
[  158.581510] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[  158.587174] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_temp_thermal mlx_platform nfsd e1000e leds_mlxcpld
[  158.602268] CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 4.17.0-rc4-net_master-custom-139 #10
[  158.610938] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
[  158.620426] RIP: 0010:skb_panic+0xc3/0x100
[  158.624586] RSP: 0018:ffff8801d3f27110 EFLAGS: 00010286
[  158.629882] RAX: 0000000000000082 RBX: ffff8801c02cc040 RCX: 0000000000000000
[  158.637127] RDX: 0000000000000082 RSI: dffffc0000000000 RDI: ffffed003a7e4e18
[  158.644366] RBP: ffff8801bfec8020 R08: ffffed003aabce19 R09: ffffed003aabce19
[  158.651574] R10: 000000000000000b R11: ffffed003aabce18 R12: ffff8801c364de66
[  158.658786] R13: 000000000000002c R14: 00000000000000c0 R15: ffff8801c364de68
[  158.666007] FS:  0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
[  158.674212] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  158.680036] CR2: 00007f4b3702dcd0 CR3: 0000000003228002 CR4: 00000000001606e0
[  158.687228] Call Trace:
[  158.689752]  ? __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.694475]  ? __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.699141]  skb_push+0x78/0x90
[  158.702344]  __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.706872]  ip6gre_tunnel_xmit+0x3bc/0x610 [ip6_gre]
[  158.711992]  ? __gre6_xmit+0xd80/0xd80 [ip6_gre]
[  158.716668]  ? debug_check_no_locks_freed+0x210/0x210
[  158.721761]  ? print_irqtrace_events+0x120/0x120
[  158.726461]  ? sched_clock_cpu+0x18/0x210
[  158.730572]  ? sched_clock_cpu+0x18/0x210
[  158.734692]  ? cyc2ns_read_end+0x10/0x10
[  158.738705]  ? skb_network_protocol+0x76/0x200
[  158.743216]  ? netif_skb_features+0x1b2/0x550
[  158.747648]  dev_hard_start_xmit+0x137/0x770
[  158.752010]  sch_direct_xmit+0x2ef/0x5d0
[  158.755992]  ? pfifo_fast_dequeue+0x3fa/0x670
[  158.760460]  ? pfifo_fast_change_tx_queue_len+0x810/0x810
[  158.765975]  ? __lock_is_held+0xa0/0x160
[  158.770002]  __qdisc_run+0x39e/0xfc0
[  158.773673]  ? _raw_spin_unlock+0x29/0x40
[  158.777781]  ? pfifo_fast_enqueue+0x24b/0x3e0
[  158.782191]  ? sch_direct_xmit+0x5d0/0x5d0
[  158.786372]  ? pfifo_fast_dequeue+0x670/0x670
[  158.790818]  ? __dev_queue_xmit+0x172/0x1770
[  158.795195]  ? preempt_count_sub+0xf/0xd0
[  158.799313]  __dev_queue_xmit+0x410/0x1770
[  158.803512]  ? ___slab_alloc+0x605/0x930
[  158.807525]  ? ___slab_alloc+0x605/0x930
[  158.811540]  ? memcpy+0x34/0x50
[  158.814768]  ? netdev_pick_tx+0x1c0/0x1c0
[  158.818895]  ? __skb_clone+0x2fd/0x3d0
[  158.822712]  ? __copy_skb_header+0x270/0x270
[  158.827079]  ? rcu_read_lock_sched_held+0x93/0xa0
[  158.831903]  ? kmem_cache_alloc+0x344/0x4d0
[  158.836199]  ? skb_clone+0x123/0x230
[  158.839869]  ? skb_split+0x820/0x820
[  158.843521]  ? tcf_mirred+0x554/0x930 [act_mirred]
[  158.848407]  tcf_mirred+0x554/0x930 [act_mirred]
[  158.853104]  ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
[  158.860005]  ? __lock_acquire+0x706/0x26e0
[  158.864162]  ? mark_lock+0x13d/0xb40
[  158.867832]  tcf_action_exec+0xcf/0x2a0
[  158.871736]  tcf_classify+0xfa/0x340
[  158.875402]  __netif_receive_skb_core+0x8e1/0x1c60
[  158.880334]  ? nf_ingress+0x500/0x500
[  158.884059]  ? process_backlog+0x347/0x4b0
[  158.888241]  ? lock_acquire+0xd8/0x320
[  158.892050]  ? process_backlog+0x1b6/0x4b0
[  158.896228]  ? process_backlog+0xc2/0x4b0
[  158.900291]  process_backlog+0xc2/0x4b0
[  158.904210]  net_rx_action+0x5cc/0x980
[  158.908047]  ? napi_complete_done+0x2c0/0x2c0
[  158.912525]  ? rcu_read_unlock+0x80/0x80
[  158.916534]  ? __lock_is_held+0x34/0x160
[  158.920541]  __do_softirq+0x1d4/0x9d2
[  158.924308]  ? trace_event_raw_event_irq_handler_exit+0x140/0x140
[  158.930515]  run_ksoftirqd+0x1d/0x40
[  158.934152]  smpboot_thread_fn+0x32b/0x690
[  158.938299]  ? sort_range+0x20/0x20
[  158.941842]  ? preempt_count_sub+0xf/0xd0
[  158.945940]  ? schedule+0x5b/0x140
[  158.949412]  kthread+0x206/0x300
[  158.952689]  ? sort_range+0x20/0x20
[  158.956249]  ? kthread_stop+0x570/0x570
[  158.960164]  ret_from_fork+0x3a/0x50
[  158.963823] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
[  158.983235] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d3f27110
[  158.988935] ---[ end trace 5af56ee845aa6cc8 ]---
[  158.993641] Kernel panic - not syncing: Fatal exception in interrupt
[  159.000176] Kernel Offset: disabled
[  159.003767] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Reproducer:

ip link add h1 type veth peer name swp1
ip link add h3 type veth peer name swp3

ip link set dev h1 up
ip address add 192.0.2.1/28 dev h1

ip link add dev vh3 type vrf table 20
ip link set dev h3 master vh3
ip link set dev vh3 up
ip link set dev h3 up

ip link set dev swp3 up
ip address add dev swp3 2001:db8:2::1/64

ip link set dev swp1 up
tc qdisc add dev swp1 clsact

ip link add name gt6 type ip6gretap \
        local 2001:db8:2::1 remote 2001:db8:2::2
ip link set dev gt6 up

sleep 1

tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
        action mirred egress mirror dev gt6
ping -I h1 192.0.2.2

Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years agoofproto-dpif: Use dp_hash as default selection method
Jan Scheurich [Thu, 24 May 2018 15:28:01 +0000 (17:28 +0200)]
ofproto-dpif: Use dp_hash as default selection method

The dp_hash selection method for select groups overcomes the scalability
problems of the current default selection method which, due to L2-L4
hashing during xlation and un-wildcarding of the hashed fields,
basically requires an upcall to the slow path to load-balance every
L4 connection. The consequences are an explosion of datapath flows
(megaflows degenerate to miniflows) and a limit on the connection
setup rate OVS can handle.

This commit changes the default selection method to dp_hash, provided the
bucket configuration is such that the dp_hash method can accurately
represent the bucket weights with up to 64 hash values. Otherwise we
stick to original default hash method.

We use the new dp_hash algorithm OVS_HASH_L4_SYMMETRIC to maintain the
symmetry property of the old default hash method.

A controller can explicitly request the old default hash selection method
by specifying selection method "hash" with an empty list of fields in the
Group properties of the OpenFlow 1.5 Group Mod message.

Update the documentation about selection method in the ovs-ofctl man page.

Revise and complete the ofproto-dpif unit test cases for select groups.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoofproto-dpif: Improve dp_hash selection method for select groups
Jan Scheurich [Thu, 24 May 2018 15:28:00 +0000 (17:28 +0200)]
ofproto-dpif: Improve dp_hash selection method for select groups

The current implementation of the "dp_hash" selection method suffers
from two deficiencies: 1. The hash mask and hence the number of dp_hash
values is just large enough to cover the number of group buckets, but
does not consider the case that buckets have different weights. 2. The
xlate-time selection of best bucket from the masked dp_hash value often
results in bucket load distributions that are quite different from the
bucket weights because the number of available masked dp_hash values
is too small (2-6 bits compared to 32 bits of a full hash in the default
hash selection method).

This commit provides a more accurate implementation of the dp_hash
select group by applying the well known Webster method for distributing
a small number of "seats" fairly over the weighted "parties"
(see https://en.wikipedia.org/wiki/Webster/Sainte-Lagu%C3%AB_method).
The dp_hash mask is automatically chosen large enough to provide good
enough accuracy even with widely differing weights.

This distribution happens at group modification time and the resulting
table is stored with the group-dpif struct. At xlation time, we use the
masked dp_hash values as index to look up the assigned bucket.
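
As an illustration of the Webster (Sainte-Lague) rule mentioned above, the sketch below hands out a fixed number of hash slots to weighted buckets. It is a minimal Python model, not the OVS code: `webster_allocate` is a hypothetical name, and how OVS picks the number of slots or orders them in its table is not shown.

```python
from fractions import Fraction


def webster_allocate(weights, n_slots):
    """Distribute n_slots hash values over buckets in proportion to their
    weights using the Webster/Sainte-Lague highest-averages rule."""
    seats = [0] * len(weights)
    for _ in range(n_slots):
        # The next slot goes to the bucket with the highest quotient
        # weight / (2 * seats + 1); on a tie the lowest index wins.
        winner = max(range(len(weights)),
                     key=lambda i: Fraction(weights[i], 2 * seats[i] + 1))
        seats[winner] += 1
    return seats
```

For example, `webster_allocate([3, 1], 8)` gives six slots to the first bucket and two to the second, matching the 3:1 weights exactly.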

If the bucket should not be live, we do a circular search over the
mapping table until we find the first live bucket. As the buckets in
the table are by construction in pseudo-random order with a frequency
according to their weight, this method maintains correct distribution
even if one or more buckets are non-live.
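
The lookup with circular search can be sketched as follows. This is a hypothetical Python model under simple assumptions: `table` maps each masked dp_hash value to a bucket index, `live` says which buckets are live, and `select_bucket` is an illustrative name rather than an OVS function.

```python
def select_bucket(table, live, dp_hash, mask):
    """Return the bucket for dp_hash, skipping non-live buckets."""
    start = dp_hash & mask
    n = len(table)
    for offset in range(n):
        # Walk the table circularly, starting at the masked hash value.
        bucket = table[(start + offset) % n]
        if live[bucket]:
            return bucket
    return None  # No live bucket at all.
```

With `table = [0, 1, 0, 2]`, `mask = 3` and bucket 1 down, a hash that lands on index 1 falls through to the next entry and selects bucket 0.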

Xlation is further simplified by storing some derived select group state
at group construction in struct group-dpif in a form better suited for
xlation purposes.

Adapted the unit test case for dp_hash select group accordingly.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agouserspace datapath: Add OVS_HASH_L4_SYMMETRIC dp_hash algorithm
Jan Scheurich [Thu, 24 May 2018 15:27:59 +0000 (17:27 +0200)]
userspace datapath: Add OVS_HASH_L4_SYMMETRIC dp_hash algorithm

This commit implements a new dp_hash algorithm OVS_HASH_L4_SYMMETRIC in
the netdev datapath. It will be used as default hash algorithm for the
dp_hash-based select groups in a subsequent commit to maintain
compatibility with the symmetry property of the current default hash
selection method.
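
The symmetry property means that both directions of a connection hash to the same value. The sketch below shows one common way to get that effect, by ordering the two endpoints before hashing. It is an assumption-laden Python illustration: `symmetric_l4_hash` is a hypothetical name and the use of CRC32 is arbitrary; it is not the algorithm the netdev datapath implements.

```python
import zlib


def symmetric_l4_hash(src_ip, dst_ip, proto, src_port, dst_port):
    """Hash an L4 flow so that swapping source and destination gives the
    same result."""
    # Put the two (address, port) endpoints into a canonical order so
    # that A->B and B->A produce the same key.
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = repr((proto, low, high)).encode()
    return zlib.crc32(key)
```

A request from 10.0.0.1:1234 to 10.0.0.2:80 and the reply in the opposite direction then map to the same hash value, so both directions would pick the same bucket.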

A new dpif_backer_support field 'max_hash_alg' is introduced to reflect
the highest hash algorithm a datapath supports in the dp_hash action.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agovlog: exit with error if explicitly specified logfile cannot be opened
Dan Williams [Fri, 25 May 2018 17:49:59 +0000 (12:49 -0500)]
vlog: exit with error if explicitly specified logfile cannot be opened

It seems like if the user wanted a specific logfile but that request
cannot be fulfilled, OVS/OVN shouldn't just continue as if nothing
really happened (besides logging a warning).
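
The intended behavior can be sketched like this. It is a minimal Python model, not the vlog code: `open_log_file` and its `explicit` flag are hypothetical, and the fallback branch only stands in for whatever a default log file does on failure.

```python
import sys


def open_log_file(path, explicit):
    """Open a log file for appending.

    If the user asked for this file explicitly and it cannot be opened,
    exit with an error instead of carrying on with only a warning.
    """
    try:
        return open(path, "a")
    except OSError as e:
        if explicit:
            sys.exit("cannot open log file %s: %s" % (path, e))
        print("warning: cannot open log file %s: %s" % (path, e),
              file=sys.stderr)
        return None
```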

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agovconn: Remove obsolete comment.
Ben Pfaff [Wed, 23 May 2018 23:39:56 +0000 (16:39 -0700)]
vconn: Remove obsolete comment.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agotests: Avoid printing Python exception for hosts without IPv6 support.
Ben Pfaff [Wed, 23 May 2018 22:15:36 +0000 (15:15 -0700)]
tests: Avoid printing Python exception for hosts without IPv6 support.

The tests probe whether the host has IPv6 support and, if it doesn't, skip
the tests that require IPv6.  However, until now, when the host lacks
support, this caused a Python exception to be printed, like this:

Traceback (most recent call last):
  File "<string>", line 3, in <module>
  File "/usr/lib64/python2.7/socket.py", line 187, in __init__
    _sock = _realsocket(family, type, proto)
socket.error: [Errno 97] Address family not supported by protocol

This exception is expected and harmless, but it reasonably surprised some
users.  This commit fixes the problem.
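
A probe that stays quiet on hosts without IPv6 could look roughly like the following. This is a sketch only: `host_has_ipv6` is a hypothetical name and the real test probe may check more than socket creation.

```python
import socket


def host_has_ipv6():
    """Return True if an IPv6 socket can be created, False otherwise,
    without letting the exception reach the user."""
    try:
        sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        # Covers "Address family not supported by protocol";
        # socket.error is an alias of OSError in Python 3.
        return False
    sock.close()
    return True
```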

Reported-by: Paul Greenberg
Reported-at: https://github.com/openvswitch/ovs-issues/issues/146#issuecomment-390081887
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim: Support backup and clustered databases for ovn.
Ben Pfaff [Thu, 17 May 2018 21:20:15 +0000 (14:20 -0700)]
ovs-sim: Support backup and clustered databases for ovn.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-vsctl, ovn-nbctl, ovn-sbctl, vtep-ctl: Parse options before logging.
Ben Pfaff [Thu, 17 May 2018 21:04:25 +0000 (14:04 -0700)]
ovs-vsctl, ovn-nbctl, ovn-sbctl, vtep-ctl: Parse options before logging.

These utilities logged the command very early, before parsing the options
or the command.  This meant that logging options (like --log-file or
-vsyslog:off) weren't considered for the purpose of logging the command.
This fixes the problem.
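
The ordering issue can be shown with a small model: parse the options first, then decide where the record of the command goes. Everything here is hypothetical (`run`, `--no-syslog`, the "syslog" label); it only illustrates the order of operations, not the utilities' real option handling.

```python
import argparse


def run(argv):
    """Parse options first, then log the command using the parsed
    logging settings. Returns (destination, record) pairs."""
    parser = argparse.ArgumentParser(prog="example-ctl")
    parser.add_argument("--log-file", default=None)
    parser.add_argument("--no-syslog", action="store_true")
    parser.add_argument("command", nargs="*")
    args = parser.parse_args(argv)

    # Only now is it known where the record of the command should go.
    destinations = []
    if not args.no_syslog:
        destinations.append("syslog")
    if args.log_file:
        destinations.append(args.log_file)
    record = "command: " + " ".join(argv)
    return [(dest, record) for dest in destinations]
```

Had the record been emitted before `parse_args()`, it would have gone to the default destination regardless of `--log-file` or `--no-syslog`.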

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim, ovs-sandbox: Turn off logging to syslog.
Ben Pfaff [Thu, 17 May 2018 20:53:44 +0000 (13:53 -0700)]
ovs-sim, ovs-sandbox: Turn off logging to syslog.

There's no value in having these testing tools log to syslog.  It just
pollutes the system log.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim: Install RST manpages into simulation environment too.
Ben Pfaff [Thu, 17 May 2018 19:41:23 +0000 (12:41 -0700)]
ovs-sim: Install RST manpages into simulation environment too.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sandbox: Add option to support multiple ovn-controllers.
Ben Pfaff [Fri, 25 May 2018 21:24:18 +0000 (14:24 -0700)]
ovs-sandbox: Add option to support multiple ovn-controllers.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim: Convert documentation to RST format.
Ben Pfaff [Thu, 17 May 2018 19:31:45 +0000 (12:31 -0700)]
ovs-sim: Convert documentation to RST format.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovsdb: Improve timing in cluster torture test.
Ben Pfaff [Thu, 17 May 2018 18:06:01 +0000 (11:06 -0700)]
ovsdb: Improve timing in cluster torture test.

Until now the timing in the cluster torture test has been pretty
inaccurate because it just worked by calling "sleep 1" in a loop that
did other things.  The longer those other things took, the more
inaccurate it got.

This commit changes to using a separate process for timing.  It still won't
be all that accurate but at least the timing loop doesn't try to do
anything else.

(I'm not sure how to actually get accurate timing in shell.)

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovsdb: Improve torture test for clusters.
Ben Pfaff [Thu, 17 May 2018 17:41:51 +0000 (10:41 -0700)]
ovsdb: Improve torture test for clusters.

This test is supposed to be parameterized, but one of the loops didn't
honor the parameterization and just had hardcoded values.  Also, the
output comparison didn't work properly for more than 100 client sets
(n1 > 100), so this adds some explicit sorting to the mix.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovn: Update TODO.
Ben Pfaff [Thu, 24 May 2018 18:00:35 +0000 (11:00 -0700)]
ovn: Update TODO.

We've actually made a lot of improvements.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agocompati/ip_gre: remove duplicate vport definition.
William Tu [Fri, 25 May 2018 13:28:49 +0000 (06:28 -0700)]
compati/ip_gre: remove duplicate vport definition.

Clean up the duplicate definition of OVS_VPORT_TYPE_ERSPAN
since it is defined in openvswitch.h.

Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoodp-util: refactor erspan option parsing.
William Tu [Fri, 25 May 2018 13:28:48 +0000 (06:28 -0700)]
odp-util: refactor erspan option parsing.

Instead of memcpy to a local stack, parse the erspan
metadata in memory.

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoerspan: update NEWS and FAQ.
William Tu [Fri, 25 May 2018 16:47:20 +0000 (09:47 -0700)]
erspan: update NEWS and FAQ.

Update Documentation/faq/configuration.rst about ERSPAN
and Update NEWS.

Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoEmbrace anonymous unions.
Ben Pfaff [Thu, 24 May 2018 17:32:59 +0000 (10:32 -0700)]
Embrace anonymous unions.

Several OVS structs contain embedded named unions, like this:

struct {
    ...
    union {
        ...
    } u;
};

C11 standardized a feature that many compilers already implemented
anyway, where an embedded union may be unnamed, like this:

struct {
    ...
    union {
        ...
    };
};

This is more convenient because it allows the programmer to omit "u."
in many places.  OVS already used this feature in several places.  This
commit embraces it in several others.
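
The difference in access syntax can be tried out from Python with ctypes, which mirrors C struct layouts and supports the same idea through its `_anonymous_` attribute. The type and field names below are made up for illustration and do not correspond to any OVS struct.

```python
import ctypes


class Value(ctypes.Union):
    _fields_ = [("as_int", ctypes.c_uint32),
                ("as_bytes", ctypes.c_uint8 * 4)]


class Named(ctypes.Structure):
    # Named union member: fields are reached through "u.".
    _fields_ = [("kind", ctypes.c_int), ("u", Value)]


class Anonymous(ctypes.Structure):
    # Anonymous union member: fields are reached directly.
    # _anonymous_ must be set before _fields_.
    _anonymous_ = ("u",)
    _fields_ = [("kind", ctypes.c_int), ("u", Value)]


named = Named()
named.u.as_int = 7        # needs the "u." prefix

anon = Anonymous()
anon.as_int = 7           # no prefix needed
```

Both layouts have the same size; only the way members are spelled changes.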

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Tested-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
6 years agoovs-fields: Improve formatting of NSH section.
Ben Pfaff [Fri, 18 May 2018 17:16:41 +0000 (10:16 -0700)]
ovs-fields: Improve formatting of NSH section.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-fields: Correct ideas about which OXM classes are official.
Ben Pfaff [Fri, 18 May 2018 17:16:40 +0000 (10:16 -0700)]
ovs-fields: Correct ideas about which OXM classes are official.

The purpose of including an OpenFlow version in the notes in meta-flow.h
and ovs-fields.7 is to explain what version of OpenFlow standardized a
given field.  NXOXM_* are not standardized so they should not have an
OpenFlow version.  This commit corrects it.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoerspan: add NXOXM_ET_ERSPAN_ field tests.
William Tu [Fri, 25 May 2018 15:03:18 +0000 (08:03 -0700)]
erspan: add NXOXM_ET_ERSPAN_ field tests.

ERSPAN is the first real-world use case of Experimenter OXM,
which introduces 4 new NXOXM_ET_ fields (ver, idx, dir, hwid).
The patch adds test cases for these fields.

At the same time, delete the special case for NXOXM_ET_DP_HASH,
because it was only in there for testing anyway.

Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoovn pacemaker: Fix promotion issue when the master node is reset
Numan Siddique [Thu, 17 May 2018 10:04:09 +0000 (15:34 +0530)]
ovn pacemaker: Fix promotion issue when the master node is reset

When a node 'A' in the pacemaker cluster running the OVN db servers as
master is brought down ungracefully ('echo b > /proc/sysrq-trigger'
for example), pacemaker is not able to promote any other node to
master in the cluster. When pacemaker selects a node, say 'B', to
promote, it moves the IPAddr2 resource (i.e. the master IP) to node
'B'. When the issue is seen, as soon as the node is configured with
the IP address, the OVN db servers that were running as standby
transition to active. Ideally this should not happen: the
ovsdb-servers are expected to remain in standby until they are
promoted. (This needs separate investigation.) When pacemaker then
calls the OVN OCF script's promote action, the ovsdb_server_promote
function returns almost immediately without recording the present
master. Later, in the notify action, it demotes the OVN db servers
again, since the last known master doesn't match node 'B's
hostname. This results in pacemaker promoting and demoting in a loop.

This patch fixes the issue by not returning immediately when the promote
action is called while the OVN db servers are running as active. Now it
continues with the ovsdb_server_promote function and records the
new master by setting the proper master score ($CRM_MASTER -N $host_name
-v ${master_score})

This issue is not seen when a node is brought down gracefully because
pacemaker, before promoting a node, calls the stop, start and then
promote actions. Not sure why pacemaker doesn't call the stop, start and
promote actions when a node is reset ungracefully.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1579025
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
6 years agodpdk: reflect status and version in the database
Aaron Conole [Thu, 3 May 2018 19:08:01 +0000 (15:08 -0400)]
dpdk: reflect status and version in the database

The normal way of retrieving the running DPDK status involves parsing
log files and issuing various incantations of ovs-vsctl and ovs-appctl
commands to determine whether the rte_eal_init successfully started.

This commit adds two new records to reflect the dpdk version, and
the dpdk initialization status.

To support this, the other_config:dpdk-init configuration block supports
the 'true' and 'try' keywords now, instead of just 'true'.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agodpdk: allow init to fail
Aaron Conole [Thu, 3 May 2018 19:08:00 +0000 (15:08 -0400)]
dpdk: allow init to fail

It's possible for dpdk initialization to fail either due to an internal
error or an invalid configuration.  When that happens, it's rather
impolite to immediately abort without any details.

With this change, a failed dpdk initialization attempt will continue to
trigger a SIGABRT.  However, the failure details will be logged, and a
user or administrator may have more information to correct the issue.
A restart of OvS would still be required to re-attempt initialization.

The refactor to propagate the init error will be used in an upcoming
commit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agodpif-netdev: Free packets on TUNNEL_PUSH if should_steal.
Ilya Maximets [Thu, 24 May 2018 09:51:21 +0000 (12:51 +0300)]
dpif-netdev: Free packets on TUNNEL_PUSH if should_steal.

Unconditional return may cause packet leak in case of
'should_steal == true'.

Additionally, removed redundant checking for depth level.

CC: Sugesh Chandran <sugesh.chandran@intel.com>
Fixes: 7c12dfc527a5 ("tunneling: Avoid datapath-recirc by
                      combining recirc actions at xlate.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agonetdev-dpdk: fix check for "net_nfp" driver
Timothy Redaelli [Thu, 17 May 2018 16:45:01 +0000 (18:45 +0200)]
netdev-dpdk: fix check for "net_nfp" driver

Currently the check of "net_nfp" driver while enabling scatter compares
only the first 6 bytes, but "net_nfp" is 7 bytes long.

This change fixes the check by comparing the first 7 bytes.
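
The off-by-one is easy to see with a small model of a length-limited string comparison. The helper below imitates `strncmp(a, b, n) == 0` in Python for strings without embedded NULs; `strncmp_equal` and the name "net_nfq" are hypothetical and used only for illustration.

```python
def strncmp_equal(a, b, n):
    """Model of C's strncmp(a, b, n) == 0: compare at most n characters."""
    return a[:n] == b[:n]


# Comparing only 6 characters checks "net_nf", so a different
# (made-up) driver name is accepted by mistake.
too_short = strncmp_equal("net_nfq", "net_nfp", 6)

# Comparing 7 characters covers the whole of "net_nfp".
correct = strncmp_equal("net_nfq", "net_nfp", 7)
```

Here `too_short` is true while `correct` is false, which is the difference the fix relies on.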

CC: Pablo Cascón <pablo.cascon@netronome.com>
CC: Simon Horman <simon.horman@netronome.com>
Fixes: 65a87968f4cf ("netdev-dpdk: don't enable scatter for jumbo RX support for nfp")
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Pablo Cascón <pablo.cascon@netronome.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agonetdev-dpdk: Don't use PMD driver if not configured successfully
Eelco Chaudron [Wed, 16 May 2018 14:15:34 +0000 (16:15 +0200)]
netdev-dpdk: Don't use PMD driver if not configured successfully

When initialization of the DPDK PMD driver fails
(dpdk_eth_dev_init()), the reconfigure_datapath() function will remove
the port from dp_netdev, and the port is not used.

Now when bridge_reconfigure() is called again, no changes to the
previous failing netdev configuration are detected and therefore the
port gets added to dp_netdev and used uninitialized. This causes
exceptions.

The fix has two parts to it. First in netdev-dpdk.c we remember if the
DPDK port was started or not, and when calling
netdev_dpdk_reconfigure() we also try re-initialization if the port
was not already active. The second part of the change is in
dpif-netdev.c where it makes sure netdev_reconfigure() is called if
the port needs reconfiguration, as netdev_is_reconf_required() is only
true until netdev_reconfigure() is called (even if it fails).

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agonetdev-dpdk: Remove use of rte_mempool_ops_get_count.
Kevin Traynor [Wed, 23 May 2018 13:41:30 +0000 (14:41 +0100)]
netdev-dpdk: Remove use of rte_mempool_ops_get_count.

rte_mempool_ops_get_count is not exported by DPDK so it means it
cannot be used by OVS when using DPDK as a shared library.

Remove rte_mempool_ops_get_count but still use rte_mempool_full
and document its behavior.

Fixes: 91fccdad72a2 ("netdev-dpdk: Free mempool only when no in-use mbufs.")
Reported-by: Timothy Redaelli <tredaelli@redhat.com>
Reported-by: Markos Chandras <mchandras@suse.de>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agoExtend tests for conjunctive match support in OVN
Numan Siddique [Thu, 24 May 2018 15:45:53 +0000 (17:45 +0200)]
Extend tests for conjunctive match support in OVN

Check the application of conjunctive matching to logical flow match
expressions. In particular cover the case where conjunctive matching is
applied to ACL match expressions that refer to Address Sets.

Mark Michelson, who tested a similar patch [1], found a significant
improvement in ACL processing and a reduction of OF flows from the order
of 1 million to a few thousand. [2]
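
A back-of-the-envelope calculation shows why the reduction can be so large. The sketch assumes a match on several independent sets: without conjunctive matching one flow is needed per combination of values, while with it one flow is needed per value plus one flow for the conjunction itself. The function names and set sizes are illustrative and are not the measured setup from [2].

```python
from math import prod


def cross_product_flows(set_sizes):
    """Flows needed when every combination of values gets its own flow."""
    return prod(set_sizes)


def conjunctive_flows(set_sizes):
    """Flows needed with a conjunctive match: one per value in each
    clause, plus one flow that matches the conjunction id."""
    return sum(set_sizes) + 1


sizes = [1000, 1000]  # two hypothetical address sets
before = cross_product_flows(sizes)
after = conjunctive_flows(sizes)
```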

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
[1] - https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344523.html
[2] - https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344311.html

Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoFactor prerequisites out of AND/OR trees with unique symbol
Jakub Sitnicki [Thu, 24 May 2018 15:45:52 +0000 (17:45 +0200)]
Factor prerequisites out of AND/OR trees with unique symbol

Appending prerequisites to sub-expressions of OR that are all over one
symbol prevents the expression-to-matches converter from applying
conjunctive matching. This happens during the annotation phase.

input:      s1 == { c1, c2 } && s2 == { c3, c4 }
expanded:   (s1 == c1 || s1 == c2) && (s2 == c3 || s2 == c4)
annotated:  ((p1 && s1 == c1) || (p1 && s1 == c2)) &&
            ((p2 && s2 == c3) || (p2 && s2 == c4))
normalized: (p1 && p2 && s1 == c1 && s2 == c3) ||
            (p1 && p2 && s1 == c1 && s2 == c4) ||
            (p1 && p2 && s1 == c2 && s2 == c3) ||
            (p1 && p2 && s1 == c2 && s2 == c4)

Where s1,s2 - symbols, c1..c4 - constants, p1,p2 - prerequisites.

Since sub-expressions of OR trees that are over one symbol all have the
same prerequisites, we can factor them out leaving the OR tree intact,
and enabling the converter to apply conjunctive matching to
AND(OR(clause)) trees.

Going back to our example this change gives us:

input:      s1 == { c1, c2 } && s2 == { c3, c4 }
expanded:   (s1 == c1 || s1 == c2) && (s2 == c3 || s2 == c4)
annotated:  (s1 == c1 || s1 == c2) && p1 && (s2 == c3 || s2 == c4) && p2
normalized: p1 && p2 && (s1 == c1 || s1 == c2) && (s2 == c3 || s2 == c4)

We also factor the prerequisites out of pure AND or mixed AND/OR
trees to keep the common code path, but in this case the only thing we
gain is a shorter expression as prerequisites for each symbol appear
only once.
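
As a rough illustration of why keeping the OR trees intact matters, the
sketch below compares the number of matches produced by the cross-product
normalization in the first example with the number of clauses needed when
each OR tree stays whole.  The function names are hypothetical and not part
of OVN, and the count ignores any extra flows a real conjunctive match needs
to tie its clauses together.

```c
#include <stddef.h>

/* Number of matches when every combination of constants gets its own
 * match, as in the "normalized" form of the first example. */
unsigned long long
cross_product_matches(const size_t *set_sizes, size_t n_sets)
{
    unsigned long long total = 1;
    for (size_t i = 0; i < n_sets; i++) {
        total *= set_sizes[i];
    }
    return total;
}

/* Number of clauses when each OR tree is kept whole: one per constant. */
unsigned long long
per_clause_matches(const size_t *set_sizes, size_t n_sets)
{
    unsigned long long total = 0;
    for (size_t i = 0; i < n_sets; i++) {
        total += set_sizes[i];
    }
    return total;
}
```

For the two-by-two example both counts are 4, but with two sets of 1,000
constants each the first is 1,000,000 and the second is 2,000.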

Documentation comments have been contributed by Ben Pfaff.

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agonetdev-native-tnl: Fix alignment for erspan index.
Darrell Ball [Thu, 24 May 2018 02:13:56 +0000 (19:13 -0700)]
netdev-native-tnl: Fix alignment for erspan index.

Flagged by clang.

CC: William Tu <u9012063@gmail.com>
Fixes: 068794b43f0e ("erspan: Add flow-based erspan options")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoofp-flow: Fix uninitialized data decoding OF1.5 flow stats.
Ben Pfaff [Wed, 23 May 2018 20:51:59 +0000 (13:51 -0700)]
ofp-flow: Fix uninitialized data decoding OF1.5 flow stats.

Reported-by: Paul Greenberg
Reported-at: https://github.com/openvswitch/ovs-issues/issues/149
Fixes: c7b02b800615 ("Add support for OpenFlow 1.5 statistics (OXS).")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Darrell Ball <dlu998@gmail.com>
6 years agorconn: Introduce new invariant to fix assertion failure in corner case.
Ben Pfaff [Wed, 23 May 2018 23:58:31 +0000 (16:58 -0700)]
rconn: Introduce new invariant to fix assertion failure in corner case.

Until now, rconn_get_version() has only reported the OpenFlow version in
use when the rconn is actually connected.  This makes sense, but it has a
harsh consequence.  Consider code like this:

    if (rconn_is_connected(rconn) && rconn_get_version(rconn) >= 0) {
        for (int i = 0; i < 2; i++) {
            struct ofpbuf *b = ofputil_encode_echo_request(
                rconn_get_version(rconn));
            rconn_send(rconn, b, NULL);
        }
    }

Maybe not the smartest code in the world, and probably no one would write
this exact code in any case, but it doesn't look too risky or crazy.

But it is.  The second trip through the loop can assert-fail inside
ofputil_encode_echo_request() because rconn_get_version(rconn) returns -1
instead of a valid OpenFlow version.  That happens if the first call to
rconn_send() encounters an error while sending the message and therefore
destroys the underlying vconn and disconnects so that rconn_get_version()
doesn't have a vconn to query for its version.

In a case like this where all the code to send the messages is close by, we
could just check rconn_get_version() in each loop iteration.  We could even
go through the tree and convince ourselves that individual bits of code are
safe, or be conservative and check rconn_get_version() >= 0 in the iffy
cases.  But this seems to me like an ongoing source of risk and a way to
get things wrong in corner cases.

This commit takes a different approach.  It introduces a new invariant: if
an rconn has ever been connected, then it returns a valid OpenFlow version
from rconn_get_version().  In addition, if an rconn is currently connected,
then the OpenFlow version it returns is the correct one (that may be
obvious, but there were corner cases before where it returned -1 even
though rconn_is_connected() returned true).

With this commit, the code above would work OK.  If the first call to
rconn_send() encounters an error sending the message, then
rconn_get_version() in the second iteration will return the same value as
in the first iteration.  The message passed to rconn_send() will end up
being discarded, but that's much better than either an assertion failure or
having to carefully analyze a lot of our code to deal with one unusual
corner case.
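
The invariant can be pictured with a toy model.  This is only a sketch:
'struct toy_rconn' and its functions are made up for illustration and do not
mirror the real rconn implementation.  The point is that disconnecting does
not reset the remembered version.

```c
#include <stdbool.h>

/* Toy stand-in for an rconn; not the real data structure. */
struct toy_rconn {
    bool connected;
    int version;            /* -1 until the first successful connection. */
};

void
toy_rconn_init(struct toy_rconn *rc)
{
    rc->connected = false;
    rc->version = -1;
}

/* Records the version negotiated on a new connection. */
void
toy_rconn_connected(struct toy_rconn *rc, int version)
{
    rc->connected = true;
    rc->version = version;
}

/* Disconnecting deliberately leaves 'version' alone: once an rconn has been
 * connected, it keeps reporting a valid version. */
void
toy_rconn_disconnect(struct toy_rconn *rc)
{
    rc->connected = false;
}

int
toy_rconn_get_version(const struct toy_rconn *rc)
{
    return rc->version;
}
```

In this model a caller that checks the version once before a loop can keep
using it even if a send inside the loop causes a disconnect.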

Reported-by: Han Zhou <zhouhan@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years agodatapath: compat: Fix ndo_size in RHEL 7.5 backport
Yi-Hung Wei [Thu, 17 May 2018 19:39:51 +0000 (12:39 -0700)]
datapath: compat: Fix ndo_size in RHEL 7.5 backport

If 'ndo_size' is not set in 'struct net_device_ops', RHEL kernel will not
make use of functions in 'struct net_device_ops_extended'.

Fixes: 39ca338374ab ("datapath: compat: Fix build on RHEL 7.5")
Reported-by: Jiri Benc <jbenc@redhat.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/347070.html
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
6 years agoovn-controller: Count calls to lflow_run()
Jakub Sitnicki [Fri, 18 May 2018 16:55:35 +0000 (18:55 +0200)]
ovn-controller: Count calls to lflow_run()

lflow_run() is the main logical flows processing routine that we spend
most of the CPU time in when testing at scale.

With the switch to an incremental processing approach in the controller,
we will try to avoid calling lflow_run() as much as possible.

A counter lets us confirm that we are doing logical flow processing
only when it's expected, without resorting to profiling under stress.

It can also serve as a hint as to why the ovn-controller process is
consuming CPU time.

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years agorhel: Use openvswitch user/group for the log directory
Timothy Redaelli [Wed, 23 May 2018 13:46:32 +0000 (15:46 +0200)]
rhel: Use openvswitch user/group for the log directory

Commit 94cd8383e297 ("rhel: fix log directory permissions") restored the
old 755 permission on /var/log/openvswitch and this can result in the
exposure of sensitive information.

Since commit f624bf23b62a ("rhel: user/group openvswitch does not exist")
moved the user/group creation to the %pre phase, it's now possible to
change the /var/log/openvswitch user/group to openvswitch:openvswitch and
remove the r/x bits for others again without getting a "permission denied"
error when the logs are rotated.

CC: Aaron Conole <aconole@redhat.com>
Fixes: 94cd8383e297 ("rhel: fix log directory permissions")
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Markos Chandras <mchandras@suse.de>
6 years agoovn.at: fix occasional failure - ACL reject rule test
Han Zhou [Sat, 19 May 2018 21:21:33 +0000 (14:21 -0700)]
ovn.at: fix occasional failure - ACL reject rule test

The test fails occasionally because it may start sending packets
before the new ACL-related flows are installed on HVs, even though it
ensures that lflows exist in the SB DB. This patch ensures the HVs are
in sync by using ovn-nbctl --wait=hv sync, and removes the check for
lflow readiness in the SB.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoodp-execute: Rename 'may_steal' to 'should_steal'.
Darrell Ball [Thu, 17 May 2018 02:24:46 +0000 (19:24 -0700)]
odp-execute: Rename 'may_steal' to 'should_steal'.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoodp-execute: Correct and clarify 'steal' parameter.
Darrell Ball [Thu, 17 May 2018 06:08:48 +0000 (23:08 -0700)]
odp-execute: Correct and clarify 'steal' parameter.

Correct and clarify 'steal'/'may_steal' comments in
odp_execute_actions().

Reported-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agotests: Make test result more predictable.
Darrell Ball [Fri, 18 May 2018 16:52:23 +0000 (09:52 -0700)]
tests: Make test result more predictable.

The test 'ofproto-dpif - in place modification (vlan)' fails often
due to miss handling. Hence, make it more predictable by specifying
that misses should just be dropped.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoerspan: fix invalid erspan version.
William Tu [Thu, 17 May 2018 20:36:47 +0000 (13:36 -0700)]
erspan: fix invalid erspan version.

ERSPAN only supports versions 1 and 2.  When packets are sent to an erspan
device that does not have a proper version number set, drop the packets.
In practice, we observe multicast packets sent to the erspan pernet device,
erspan0, which does not have an erspan version configured.

Without this patch, we observe the warning message below from ovs-vswitchd,
caused by receiving a malformed erspan packet:

odp_util|WARN|odp_tun_key_from_attr__ invalid erspan version

Reported-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agogre: Resolve gre receive issues
Greg Rose [Thu, 17 May 2018 17:43:53 +0000 (10:43 -0700)]
gre: Resolve gre receive issues

On newer Linux kernels, or on older kernels such as Red Hat's that backport
from newer upstream Linux kernel releases, the built-in gre kernel module
will interfere with OVS gre code in the receive path.  Fix this up by
placing the gre kernel code within the openvswitch driver so it will
not have to depend on the built-in gre kernel module.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agorhel: Enable ERSPAN features for RHEL 7.x
Greg Rose [Wed, 16 May 2018 20:13:20 +0000 (13:13 -0700)]
rhel: Enable ERSPAN features for RHEL 7.x

Enable ERSPAN on RHEL 7.x

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoerspan: set bso when truncated bit is set.
William Tu [Sun, 13 May 2018 14:09:41 +0000 (07:09 -0700)]
erspan: set bso when truncated bit is set.

Before this patch, the erspan BSO (Bad/Short/Oversized) field was not
handled.  BSO has 4 possible values:
  00 --> Good frame with no error, or unknown integrity
  11 --> Payload is a Bad Frame with CRC or Alignment Error
  01 --> Payload is a Short Frame
  10 --> Payload is an Oversized Frame

This patch sets BSO to 01 when truncate is true.
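
The four values and the truncation rule can be sketched as follows.  The
enum and function names are illustrative assumptions, not necessarily the
identifiers used in the datapath code.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four 2-bit BSO values listed above (names are hypothetical). */
enum toy_bso {
    TOY_BSO_NOERROR = 0x0,      /* Good frame, or unknown integrity. */
    TOY_BSO_SHORT = 0x1,        /* Payload is a short frame. */
    TOY_BSO_OVERSIZED = 0x2,    /* Payload is an oversized frame. */
    TOY_BSO_BAD = 0x3,          /* CRC or alignment error. */
};

/* Picks the BSO value for an outgoing packet: a truncated packet is
 * reported as a short frame, anything else as "no error". */
uint8_t
toy_bso_for_packet(bool truncated)
{
    return truncated ? TOY_BSO_SHORT : TOY_BSO_NOERROR;
}
```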

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoerspan: auto detect truncated ipv6 packets.
William Tu [Sun, 13 May 2018 14:03:49 +0000 (07:03 -0700)]
erspan: auto detect truncated ipv6 packets.

Upstream commit:
    commit d5db21a3e6977dcb42cee3d16cd69901fa66510a
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri May 11 05:49:47 2018 -0700

    erspan: auto detect truncated ipv6 packets.

    Currently the truncated bit is set only when 1) the mirrored packet
    is larger than mtu and 2) the ipv4 packet tot_len is larger than
    the actual skb->len.  This patch adds another case for detecting
    whether ipv6 packet is truncated or not, by checking the ipv6 header
    payload_len and the skb->len.
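
The header-based checks can be sketched as below.  This is a simplified
model, not the kernel code: the helper names are hypothetical, and both
lengths are assumed to be measured from the start of the IP header.  The
IPv4 total length includes the IPv4 header, while the IPv6 payload length
excludes the fixed 40-byte IPv6 header.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_IPV6_HDR_LEN 40     /* Size of the fixed IPv6 header. */

/* IPv4: the header claims more bytes than are actually present. */
bool
toy_ipv4_looks_truncated(size_t tot_len, size_t actual_len)
{
    return tot_len > actual_len;
}

/* IPv6: payload_len does not count the fixed header, so add it back
 * before comparing against the bytes actually present. */
bool
toy_ipv6_looks_truncated(size_t payload_len, size_t actual_len)
{
    return payload_len + TOY_IPV6_HDR_LEN > actual_len;
}
```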

Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoip6erspan: make sure enough headroom at xmit.
William Tu [Fri, 11 May 2018 03:03:31 +0000 (20:03 -0700)]
ip6erspan: make sure enough headroom at xmit.

Upstream commit:
    commit e41c7c68ea771683cae5a7f81c268f38d7912ecb
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Mar 9 07:34:42 2018 -0800

    ip6erspan: make sure enough headroom at xmit.

    The patch adds skb_cow_header() to ensure enough headroom
    at ip6erspan_tunnel_xmit before pushing the erspan header
    to the skb.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip6erspan: improve error handling for erspan version number.
William Tu [Fri, 11 May 2018 02:51:09 +0000 (19:51 -0700)]
ip6erspan: improve error handling for erspan version number.

Upstream commit:
    commit d6aa71197ffcb68850bfebfc3fc160abe41df53b
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Mar 9 07:34:41 2018 -0800

    ip6erspan: improve error handling for erspan version number.

    When users fill in incorrect erspan version number through
    the struct erspan_metadata uapi, current code skips pushing
    the erspan header but continue pushing the gre header, which
    is incorrect.  The patch fixes it by returning error.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip6gre: add erspan v2 to tunnel lookup
William Tu [Fri, 11 May 2018 02:45:44 +0000 (19:45 -0700)]
ip6gre: add erspan v2 to tunnel lookup

Upstream commit:
    commit 3b04caab81649a9e8d5375b919b6653d791951df
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Mar 9 07:34:40 2018 -0800

    ip6gre: add erspan v2 to tunnel lookup

    The patch adds the erspan v2 proto in ip6gre_tunnel_lookup
    so the erspan v2 tunnel can be found correctly.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoerspan: Add flow-based erspan options
Greg Rose [Fri, 18 May 2018 00:46:41 +0000 (17:46 -0700)]
erspan: Add flow-based erspan options

The patch adds support for flow-based erspan options.
The erspan_ver, erspan_idx, erspan_dir, and erspan_hwid options can each
be set to "flow" so that the value is set by the OpenFlow rule
instead of being statically configured at port creation time.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agolib/dpif-netlink: Fix miscompare of gre ports
Greg Rose [Fri, 4 May 2018 23:48:43 +0000 (16:48 -0700)]
lib/dpif-netlink: Fix miscompare of gre ports

netdev_to_ovs_vport_type() checks for netdev types matching "gre"
with strstr().  This makes it match ip6gre as well and return
OVS_VPORT_TYPE_GRE, which is clearly wrong.

Move the usage of strstr() *after* all the exact matches with strcmp()
to avoid the problem permanently; when I added the ip6gre type, I ran
into a bug that was very difficult to detect.
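
The ordering issue can be reproduced in a few lines.  The enum and function
below are simplified stand-ins, not the actual netdev_to_ovs_vport_type()
code; they only show why the substring test has to come last.

```c
#include <string.h>

/* Simplified stand-in for the vport type enumeration. */
enum toy_vport_type {
    TOY_VPORT_UNSPEC,
    TOY_VPORT_GRE,
    TOY_VPORT_IP6GRE,
};

enum toy_vport_type
toy_vport_type_from_name(const char *type)
{
    /* Exact matches first. */
    if (!strcmp(type, "ip6gre")) {
        return TOY_VPORT_IP6GRE;
    }

    /* Substring match last, so that it cannot shadow an exact match.  If
     * this test came first, "ip6gre" would be classified as plain GRE. */
    if (strstr(type, "gre")) {
        return TOY_VPORT_GRE;
    }

    return TOY_VPORT_UNSPEC;
}
```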

Cc: Ben Pfaff <blp@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip6gre: Add ip6gre vport type
Greg Rose [Fri, 4 May 2018 17:14:44 +0000 (10:14 -0700)]
ip6gre: Add ip6gre vport type

Add handlers for OVS_VPORT_TYPE_IP6GRE

Cc: Ben Pfaff <blp@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoerspan: auto detect truncated packets.
William Tu [Wed, 2 May 2018 22:06:04 +0000 (15:06 -0700)]
erspan: auto detect truncated packets.

Upstream commit:
    commit 1baf5ebf8954d9bff8fa4e7dd6c416a0cebdb9e2
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Apr 27 14:16:32 2018 -0700

    erspan: auto detect truncated packets.

    Currently the truncated bit is set only when the mirrored packet
    is larger than mtu.  For certain cases, the packet might already
    been truncated before sending to the erspan tunnel.  In this case,
    the patch detect whether the IP header's total length is larger
    than the actual skb->len.  If true, this indicated that the
    mirrored packet is truncated and set the erspan truncate bit.

    I tested the patch using bpf_skb_change_tail helper function to
    shrink the packet size and send to erspan tunnel.

Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoopenvswitch: fix vport packet length check.
William Tu [Wed, 2 May 2018 21:45:39 +0000 (14:45 -0700)]
openvswitch: fix vport packet length check.

Upstream commit:
    commit 46e371f0e78a82186a83cbcb4f4b8850417c7dd5
    Author: William Tu <u9012063@gmail.com>
    Date:   Wed Mar 7 15:38:48 2018 -0800

    openvswitch: fix vport packet length check.

    When sending a packet to a tunnel device, the dev's hard_header_len
    could be larger than the skb->len in function packet_length().
    In the case of ip6gretap/erspan, hard_header_len = LL_MAX_HEADER + t_hlen,
    which is around 180, and an ARP packet sent to this tunnel has
    skb->len = 42.  This causes the 'unsign int length' to become super
    large because it is negative value, causing the later ovs_vport_send
    to drop it due to over-mtu size.  The patch fixes it by setting it to 0.
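
The wraparound described above is easy to demonstrate in isolation.  The two
helpers below are hypothetical and only model the arithmetic; they are not
the kernel's packet_length() function.

```c
/* Unsafe: if hard_header_len exceeds skb_len, the unsigned subtraction
 * wraps around to a very large value. */
unsigned int
toy_length_unclamped(unsigned int skb_len, unsigned int hard_header_len)
{
    return skb_len - hard_header_len;
}

/* Safe: a would-be negative result is reported as 0. */
unsigned int
toy_length_clamped(unsigned int skb_len, unsigned int hard_header_len)
{
    return skb_len > hard_header_len ? skb_len - hard_header_len : 0;
}
```

With skb_len = 42 and hard_header_len = 180, the unclamped version yields a
value far above any MTU, while the clamped version yields 0.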

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoerspan: add kernel datapath support
William Tu [Wed, 21 Mar 2018 21:02:25 +0000 (14:02 -0700)]
erspan: add kernel datapath support

pass check, check-kernel (4.16-rc4), check-system-userspace

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agouserspace: add erspan tunnel support.
William Tu [Tue, 15 May 2018 20:10:48 +0000 (16:10 -0400)]
userspace: add erspan tunnel support.

ERSPAN is a tunneling protocol based on GRE.  This patch adds
erspan tunnel support to ovs-vswitchd with the userspace datapath.
Configuring an erspan tunnel is similar to configuring a gre tunnel,
but with additional erspan parameters.  Matching a flow on erspan
metadata is also supported; see ovs-fields for more details.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agouserspace: add gre sequence number support.
William Tu [Tue, 15 May 2018 20:10:49 +0000 (16:10 -0400)]
userspace: add gre sequence number support.

The patch adds support for gre sequence numbers.
It is disabled by default.  When enabled with 'options:seq=true',
each outgoing gre packet will have its sequence number
incremented by one.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agonetdev-native-tnl: refactor the tunnel push header.
William Tu [Fri, 9 Mar 2018 21:02:23 +0000 (13:02 -0800)]
netdev-native-tnl: refactor the tunnel push header.

The patch adds an additional 'struct netdev *' parameter to the
native tunnel's push_header() interface.  This is used
for later GRE sequence number support.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>