git.proxmox.com Git - ovs.git/log
ovs.git
5 years agoacinclude: Include libverbs and libmlx5 when needed
Eli Britstein [Mon, 11 Feb 2019 11:32:33 +0000 (13:32 +0200)]
acinclude: Include libverbs and libmlx5 when needed

DPDK 18.11 uses libverbs and libmlx5 when MLX5 PMD is enabled.

This commit makes OVS link against libverbs and libmlx5 when the MLX5 PMD
is enabled in DPDK.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Asaf Penso <asafp@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agoInitialize the right database.
Ted Elhourani [Fri, 15 Feb 2019 00:59:57 +0000 (00:59 +0000)]
Initialize the right database.

Use value of db parameter in order to initialize the correct database.

Signed-off-by: Ted Elhourani <ted.elhourani@nutanix.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoconntrack: Simplify 'ct_addr'.
Darrell Ball [Thu, 14 Feb 2019 21:15:08 +0000 (13:15 -0800)]
conntrack: Simplify 'ct_addr'.

Remove the struct wrapper and remove the unneeded union members.
There may even be a portability benefit here because of the
type punning.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoconntrack: Remove redundant call to 'hash_finish()'.
Darrell Ball [Thu, 14 Feb 2019 21:15:07 +0000 (13:15 -0800)]
conntrack: Remove redundant call to 'hash_finish()'.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoUserspace datapath: Add fragmentation handling.
Darrell Ball [Wed, 13 Feb 2019 23:34:21 +0000 (15:34 -0800)]
Userspace datapath: Add fragmentation handling.

Fragmentation handling is added for supporting conntrack.
Both v4 and v6 are supported.

After discussion with several people, I decided not to store
configuration state in the database. This is more consistent with
the kernel in future, matches other conntrack configuration, which
will not be in the database either, and is simpler overall.
Accordingly, fragmentation handling is enabled by default.

This patch enables fragmentation tests for the userspace datapath.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodp-packet: Add 'do_not_steal' packet batch flag.
Darrell Ball [Wed, 13 Feb 2019 23:34:20 +0000 (15:34 -0800)]
dp-packet: Add 'do_not_steal' packet batch flag.

This is needed in a subsequent patch and may otherwise be useful.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodp-packet: Add 'dp_packet_batch_is_full()' api.
Darrell Ball [Wed, 13 Feb 2019 23:34:19 +0000 (15:34 -0800)]
dp-packet: Add 'dp_packet_batch_is_full()' api.

This new api is used in a subsequent patch and may otherwise be useful.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovs-atomic: Add 64 bit apis.
Darrell Ball [Wed, 13 Feb 2019 23:34:18 +0000 (15:34 -0800)]
ovs-atomic: Add 64 bit apis.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoconntrack: Reword conntrack_execute() description.
Darrell Ball [Wed, 13 Feb 2019 23:34:17 +0000 (15:34 -0800)]
conntrack: Reword conntrack_execute() description.

Use 'must' instead of 'should'.

Suggested-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotests: Add missed local stack checks.
Darrell Ball [Wed, 13 Feb 2019 23:34:16 +0000 (15:34 -0800)]
tests: Add missed local stack checks.

Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoflow: Enhance parse_ipv6_ext_hdrs.
Darrell Ball [Wed, 13 Feb 2019 23:34:15 +0000 (15:34 -0800)]
flow: Enhance parse_ipv6_ext_hdrs.

Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodp-packet: Add const qualifiers for checksum apis.
Darrell Ball [Wed, 13 Feb 2019 23:34:14 +0000 (15:34 -0800)]
dp-packet: Add const qualifiers for checksum apis.

Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agomonitor: Fix crash when monitor condition adds new columns.
Han Zhou [Tue, 12 Feb 2019 02:19:21 +0000 (18:19 -0800)]
monitor: Fix crash when monitor condition adds new columns.

The OVSDB conditional monitor implementation allows many clients
to share the same copy of monitored data if the clients monitor the
same tables and columns, even though they can have different
monitor conditions. Monitor conditions can refer to columns other
than the columns being monitored, so struct ovsdb_monitor_table
maintains the union of all the columns used in any condition.

The problem with the current implementation is that, for each change
set generated, it doesn't maintain any metadata for the number of
columns of the data already populated in it. Instead, it
always relies on the n_columns field of struct ovsdb_monitor_table
to manipulate the data. However, n_columns in struct
ovsdb_monitor_table can increase (e.g. when a client changes its
condition so that it involves more columns). As a result, existing
rows in a change set with N columns can later be processed
as if they had more than N columns, typically when the row is freed.
This causes ovsdb-server to crash (see the example backtrace
below).

The patch fixes the problem by maintaining n_columns for each
change set, and adds a test case which fails without the fix.
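
The idea of the fix can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the ovsdb-server code: the MonitorTable and ChangeSet classes, their methods and the column names are all hypothetical. The point is that a change set remembers how many columns its rows were built with, and uses that count, not the table's current one, when it releases them.

```python
class MonitorTable:
    """Hypothetical stand-in for struct ovsdb_monitor_table."""

    def __init__(self, columns):
        self.columns = list(columns)

    @property
    def n_columns(self):
        return len(self.columns)

    def add_condition_column(self, name):
        # A client changed its condition and now needs one more column.
        if name not in self.columns:
            self.columns.append(name)


class ChangeSet:
    """Hypothetical change set that snapshots n_columns at creation."""

    def __init__(self, table):
        self.table = table
        self.n_columns = table.n_columns  # per-change-set metadata
        self.rows = []

    def add_row(self, values):
        cols = self.table.columns[:self.n_columns]
        self.rows.append([values.get(c) for c in cols])

    def free_rows(self):
        """Release every cell; return how many cells were released."""
        released = 0
        for row in self.rows:
            # Use the snapshot. Iterating up to table.n_columns here would
            # index past the end of rows built before the table grew.
            for i in range(self.n_columns):
                row[i] = None
                released += 1
        self.rows.clear()
        return released
```

With a two-column table, a change set created before a third column is added still releases exactly two cells per row.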

(gdb) bt
at lib/ovsdb-data.c:1031
out>, mt=<optimized out>) at ovsdb/monitor.c:320
mt=0x1e7b940) at ovsdb/monitor.c:333
out>, transaction=<optimized out>) at ovsdb/monitor.c:527
initial=<optimized out>, cond_updated=cond_updated@entry=false,
unflushed_=unflushed_@entry=0x20dae70,
    condition=<optimized out>, version=<optimized out>) at ovsdb/monitor.c:1156
(m=m@entry=0x20dae40, initial=initial@entry=false) at
ovsdb/jsonrpc-server.c:1655
at ovsdb/jsonrpc-server.c:1729
ovsdb/jsonrpc-server.c:551
ovsdb/jsonrpc-server.c:586
ovsdb/jsonrpc-server.c:401
exiting=0x7ffdb947f76f, run_process=0x0, remotes=0x7ffdb947f7c0,
unixctl=0x1e7a560, all_dbs=0x7ffdb947f800,
    jsonrpc=<optimized out>, config=0x7ffdb947f820) at ovsdb/ovsdb-server.c:209

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoacinclude: Also use LIBS from dpkg pkg-config
Christian Ehrhardt [Tue, 12 Feb 2019 06:29:58 +0000 (07:29 +0100)]
acinclude: Also use LIBS from dpkg pkg-config

DPDK 18.11 builds using the more modern meson build system no longer
provide the -ldpdk linker script. Instead, pkg-config is expected to
supply the linker options as well.

This change sets DPDK_LIB from pkg-config (if pkg-config is
available). Since that value already carries the whole-archive flags
around the PMDs, the further whole-archive wrapping is skipped
if it is already part of DPDK_LIB.

To work reliably in all environments this needs pkg-config 0.29.1.
We want to be able to use PKG_CHECK_MODULES_STATIC, which
is not yet available in 0.24. Therefore update pkg.m4
to pkg-config 0.29.1.

This should be backport-safe as these macro files are all versioned.
autoconf is smart enough to check the version if you have it locally,
and if the system's is higher, it will use that one instead.

Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agofaq: Update OVS/DPDK version table for OVS 2.11.
Kevin Traynor [Wed, 13 Feb 2019 16:27:00 +0000 (16:27 +0000)]
faq: Update OVS/DPDK version table for OVS 2.11.

Indicate that OVS 2.11 uses DPDK 18.11.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agopython: Fix E117 over-indented.
Ilya Maximets [Tue, 12 Feb 2019 15:34:02 +0000 (18:34 +0300)]
python: Fix E117 over-indented.

New check was added to recent pycodestyle-2.5.0 and flake8
complains while building on Travis:

  ../utilities/bugtool/ovs-bugtool.in:767:17: E117 over-indented
  ../utilities/bugtool/ovs-bugtool.in:771:17: E117 over-indented
  ../utilities/bugtool/ovs-bugtool.in:774:17: E117 over-indented
  ../utilities/bugtool/ovs-bugtool.in:778:17: E117 over-indented
  ../python/ovs/db/error.py:33:17: E117 over-indented
  ../python/ovs/poller.py:118:21: E117 over-indented
  ../python/ovs/reconnect.py:244:17: E117 over-indented

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodebian: Add libelf-dev dependency for dkms
Greg Rose [Tue, 12 Feb 2019 20:37:03 +0000 (12:37 -0800)]
debian: Add libelf-dev dependency for dkms

Newer kernels define CONFIG_UNWINDER_ORC for their kernel configurations
and to build this the kernel compilation requires the libelf-dev
package.  Add the dependency to the dkms build requirements.

VMware-BZ: #2287968
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoacinclude: Check for rte_config.h before checking dependencies.
Ilya Maximets [Tue, 12 Feb 2019 13:19:31 +0000 (16:19 +0300)]
acinclude: Check for rte_config.h before checking dependencies.

The current ./configure script shows misleading errors in case of a wrong
DPDK path:

  # ./configure --with-dpdk=/wrong/path
  ...
  checking whether dpdk datapath is enabled... yes
  checking for library containing get_mempolicy... -lnuma
  checking for library containing pcap_dump... -lpcap
  checking for library containing mnl_attr_put... no
  configure: error: unable to find libmnl, install the dependency package

This happens because we're not checking for headers before checking
for dependencies. All the compile attempts fail and the script thinks
that we need more dependencies.

With this change the script will check for 'rte_config.h' availability
and produce a sane error message:

  # ./configure --with-dpdk=/wrong/path
  ...
  checking for rte_config.h... no
  configure: error: unable to find rte_config.h in /wrong/path

'AC_INCLUDES_DEFAULT' is passed explicitly to avoid the preprocessor test.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agodpif-netdev: Add thread safety annotation to sorted_poll_list.
Ilya Maximets [Mon, 11 Feb 2019 17:35:41 +0000 (20:35 +0300)]
dpif-netdev: Add thread safety annotation to sorted_poll_list.

'sorted_poll_list()' uses the 'pmd->poll_list' that should be
guarded by 'pmd->port_mutex'.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agoacinclude: Use NUMA_AWARE_HUGEPAGES too for libnuma check.
Ilya Maximets [Thu, 7 Feb 2019 13:00:51 +0000 (16:00 +0300)]
acinclude: Use NUMA_AWARE_HUGEPAGES too for libnuma check.

This fixes build with NUMA_AWARE_HUGEPAGES enabled and VHOST_NUMA
disabled. This should not be a usual case. But it's possible to
configure DPDK this way.

Fixes: 5e925ccc2a6f ("netdev-dpdk: DPDK v17.11 upgrade")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agoconntrack: Exclude l2 padding in 'conn_key_extract()'.
Darrell Ball [Tue, 5 Feb 2019 00:23:07 +0000 (16:23 -0800)]
conntrack: Exclude l2 padding in 'conn_key_extract()'.

'conn_key_extract()' in userspace conntrack includes L2
(Ethernet) pad bytes in both the L3 and L4 sizes. One problem is
that any packet with non-zero L2 padding can incorrectly fail L4
checksum validation.

This patch fixes conn_key_extract() by ignoring L2 pad bytes.
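
A small Python sketch of why the pad bytes matter. The helper names (l3_size, csum16) and the 60-byte frame are illustrative assumptions, not OVS code: a minimum-size Ethernet frame carrying a 40-byte IP packet has 6 pad bytes, and a ones-complement checksum computed over a buffer that includes non-zero padding no longer matches the one computed over the real payload.

```python
ETH_HEADER_LEN = 14  # Ethernet header without VLAN tags


def l3_size(frame_len, l3_offset, l2_pad_size):
    """L3 size with the trailing L2 pad bytes excluded."""
    return frame_len - l3_offset - l2_pad_size


def csum16(data):
    """16-bit ones-complement checksum over a byte string."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# 60-byte frame, 14-byte Ethernet header, 6 bytes of L2 padding.
naive = 60 - ETH_HEADER_LEN                  # 46: padding counted as L3 data
correct = l3_size(60, ETH_HEADER_LEN, 6)     # 40: padding excluded

segment = bytes(range(20))                   # stand-in for an L4 segment
zero_pad = bytes(6)
nonzero_pad = b"\xde\xad\xbe\xef\x00\x01"
```

Zero padding leaves the checksum unchanged, which is why the problem only shows up with non-zero pad bytes.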

Fixes: a489b16854b5 ("conntrack: New userspace connection tracker.")
CC: Daniele Di Proietto <diproiettod@ovn.org>
Co-authored-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Co-authored-by: Venkatesan Pradeep <venkatesan.pradeep@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Venkatesan Pradeep <venkatesan.pradeep@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodp-packet: Add 'dp_packet_l3_size()'.
Darrell Ball [Tue, 5 Feb 2019 00:23:06 +0000 (16:23 -0800)]
dp-packet: Add 'dp_packet_l3_size()'.

The new api will be used in a subsequent patch.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoconntrack: Remove unnecessary check in process_ftp_ctl_v4
Li RongQing [Mon, 11 Feb 2019 02:52:54 +0000 (10:52 +0800)]
conntrack: Remove unnecessary check in process_ftp_ctl_v4

It has been assured that both the first and second int from the ftp
command are not bigger than 255, so their combination ((first
int << 8) + second int) cannot be bigger than 65535.
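
The bound can be checked with a short Python sketch. The function name ftp_port is hypothetical. The shift is parenthesized because '+' binds tighter than '<<' in both C and Python.

```python
def ftp_port(high, low):
    """Combine the two port octets of an FTP address into a port number."""
    if not (0 <= high <= 255 and 0 <= low <= 255):
        raise ValueError("each octet must be in the range 0..255")
    # Parentheses matter: 'high << 8 + low' would parse as high << (8 + low).
    return (high << 8) + low


# The largest possible result uses the largest octets.
largest = ftp_port(255, 255)
```

Since both octets are already range-checked, a separate check that the result fits in 16 bits can never fail.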

Co-authored-by: Wang Li <wangli39@baidu.com>
Signed-off-by: Wang Li <wangli39@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Cc: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agorhel: Add an example to specify custom options
Timothy Redaelli [Mon, 11 Feb 2019 18:55:53 +0000 (19:55 +0100)]
rhel: Add an example to specify custom options

Add an example to specify custom options of ovs-vswitchd and
ovsdb-server.
In the example, the log level for file and console destinations is set to dbg.

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovs-ctl: Permit to specify additional options
Timothy Redaelli [Mon, 11 Feb 2019 18:55:52 +0000 (19:55 +0100)]
ovs-ctl: Permit to specify additional options

Currently, when using ovs-ctl, it is not possible to specify additional
options for ovs-vswitchd and ovsdb-server (for example to specify a
different loglevel during daemon startup).

This patch adds --ovs-vswitchd-options and --ovsdb-server-options
options to ovs-ctl in order to specify the additional options.

Due to word splitting it may not be possible to specify an option that
includes whitespace.
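
The whitespace limitation can be illustrated in Python. This is a sketch of the general shell behaviour, not of the ovs-ctl script itself: word_split() mimics what happens when an unquoted variable is expanded (split on whitespace, quotes kept as literal characters), and shell_parse() shows what a full shell parse of the same text would give. Both helper names and the example path are made up for the illustration.

```python
import shlex


def word_split(options):
    """Mimic unquoted shell expansion: split on runs of whitespace only."""
    return options.split()


def shell_parse(options):
    """Parse the string the way a shell parses a command line."""
    return shlex.split(options)


simple = word_split("-vconsole:dbg -vfile:dbg")
broken = word_split('--log-file="/tmp/my log"')
intended = shell_parse('--log-file="/tmp/my log"')
```

Here 'simple' is two clean options, while 'broken' ends up as two fragments that still contain the quote characters, so quoting inside the option string does not help.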

Reported-at: https://bugzilla.redhat.com/1664794
Reported-by: Matt Flusche <mflusche@redhat.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovn: change load balancer references to weak in NB schema
Daniel Alvarez [Mon, 11 Feb 2019 16:06:14 +0000 (17:06 +0100)]
ovn: change load balancer references to weak in NB schema

When a load balancer is added to multiple logical switches
and routers, it has to be removed from all of them before
it can be deleted, due to the current 'strong' reference
in the NB schema.

By changing it to 'weak', users can simply remove the load
balancer without having to remove all the references manually.
In particular, this will make things easier for networking-ovn,
the OpenStack integration project as it'll save some
calculations upon load balancer deletion.

The update path has been successfully tested from the previous version
of the schema.

Acked-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovs-lib.in: Cleanup old socket and pidfiles in stop_daemon
Timothy Redaelli [Mon, 11 Feb 2019 16:22:53 +0000 (17:22 +0100)]
ovs-lib.in: Cleanup old socket and pidfiles in stop_daemon

Currently if a client crashes (signal 11) the unix socket (.ctl) and the
pidfile may not be deleted when you use ovs-ctl stop or restart.

Moreover since ovs-appctl is used on a closed socket some warnings are
printed.

This commit deletes the pidfile and the unix socket, then returns without
running ovs-appctl, if the pidfile points to a non-existent pid.
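
The check described above can be sketched in Python. The real change is in the ovs-lib shell library. The function names and file layout here are assumptions for illustration only.

```python
import os


def pid_exists(pid):
    """Return True if a process with this pid exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs the check without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def cleanup_if_stale(pidfile, ctl_socket):
    """Remove pidfile and socket if the recorded pid is gone.

    Returns True when stale files were cleaned up, meaning the caller
    can skip contacting the daemon over its control socket.
    """
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # no usable pidfile: nothing to decide here
    if pid <= 0 or pid_exists(pid):
        return False
    for path in (pidfile, ctl_socket):
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
    return True
```

A caller would try cleanup_if_stale() first and only fall back to the normal stop sequence when it returns False.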

Reported-at: https://bugzilla.redhat.com/1667845
Reported-by: Candido Campos <ccamposr@redhat.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis: Drop redundant DPDK build check.
Ilya Maximets [Fri, 8 Feb 2019 16:49:00 +0000 (19:49 +0300)]
travis: Drop redundant DPDK build check.

This check is covered by 'TESTSUITE=1 DPDK=1'.
No need to run it separately.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis: Use parallel jobs for DPDK and sparse builds.
Ilya Maximets [Fri, 8 Feb 2019 16:48:59 +0000 (19:48 +0300)]
travis: Use parallel jobs for DPDK and sparse builds.

This saves a few minutes.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis: Enable printing of executed commands.
Ilya Maximets [Fri, 8 Feb 2019 16:48:58 +0000 (19:48 +0300)]
travis: Enable printing of executed commands.

This increases the output by a few lines, but gives important
information regarding commands and their exact arguments.
Very useful for debugging.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis: Dump config.log on configure failures.
Ilya Maximets [Fri, 8 Feb 2019 16:48:57 +0000 (19:48 +0300)]
travis: Dump config.log on configure failures.

Useful for debugging.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis: Run testsuite with desired options.
Ilya Maximets [Fri, 8 Feb 2019 16:48:56 +0000 (19:48 +0300)]
travis: Run testsuite with desired options.

'make distcheck' executes its own './configure' without any options
provided to the script. This means that in the current configuration
Travis CI always rebuilds and runs the testsuite on default binaries,
i.e. we're not checking the testsuite with DPDK, not checking it
with '--enable-shared' and not checking it with '-ljemalloc'.
We're just running the testsuite 8 times without arguments. Only the
compiler changes (gcc or clang) because CC is exported by Travis.

This patch reorders the commands in the build script and provides
'DISTCHECK_CONFIGURE_FLAGS' to force 'make distcheck' using our
desired configuration.

Another issue addressed here is that we will no longer build
twice in case of TESTSUITE.

For linking inside the distcheck we also need to provide an absolute path
to the DPDK libraries.

'configure' is executed before 'distcheck' to have a Makefile target.
It's executed without arguments because 'configure' inside the
'distcheck' will fail if we use the sparse-wrapped CC.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoautomake: Clean up cxxtest.cc.
Ilya Maximets [Fri, 8 Feb 2019 16:48:55 +0000 (19:48 +0300)]
automake: Clean up cxxtest.cc.

'distcheck' complains on some configurations:

  ERROR: files left in build directory after distclean:
  ./include/openvswitch/cxxtest.cc

CC: Ben Pfaff <blp@ovn.org>
Fixes: 994bfc298502 ("Automatically verify that OVS header files work OK in C++ also.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodatapath: Clean up some gcov, tmp and cache files.
Ilya Maximets [Fri, 8 Feb 2019 16:48:54 +0000 (19:48 +0300)]
datapath: Clean up some gcov, tmp and cache files.

'distcheck' complains about these files while building --with-linux.

  ERROR: files left in build directory after distclean:
  ./datapath/linux/.tmp_ip6_gre.gcno
  ./datapath/linux/.tmp_ip_tunnels_core.gcno
  ./datapath/linux/.tmp_genetlink-openvswitch.gcno
  ./datapath/linux/.tmp_stt.gcno
  <..>
  ./datapath/linux/.tmp_versions/vport-gre.mod
  ./datapath/linux/.tmp_versions/vport-geneve.mod
  ./datapath/linux/.tmp_versions/vport-vxlan.mod
  ./datapath/linux/.tmp_versions/vport-lisp.mod
  ./datapath/linux/.tmp_versions/vport-stt.mod
  <..>
  ./datapath/linux/.dev-openvswitch.o.d
  ./datapath/linux/.ip_tunnels_core.o.d
  ./datapath/linux/.vport.o.d
  ./datapath/linux/.udp_tunnel.o.d
  ./datapath/linux/.cache.mk

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis: Fix building datapath instead of userspace with DPDK_SHARED.
Ilya Maximets [Fri, 8 Feb 2019 16:48:53 +0000 (19:48 +0300)]
travis: Fix building datapath instead of userspace with DPDK_SHARED.

The current script does not check the build of OVS with DPDK.
It builds the datapath instead.

CC: Ian Stokes <ian.stokes@intel.com>
Fixes: edfe8d263d2e ("travis: Add dpdk shared library build.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agolib/tc: Support optional tunnel id
Adi Nissim [Thu, 17 Jan 2019 15:41:36 +0000 (17:41 +0200)]
lib/tc: Support optional tunnel id

Currently the TC tunnel_key action is always
initialized with the given tunnel id value. However,
some tunneling protocols define the tunnel id as an optional field.

This patch initializes the id field of tunnel_key:set and tunnel_key:unset
only if a value is provided.

In the case that a tunnel key value is not provided by the user
the key flag will not be set.
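
The "only if a value is provided" behaviour can be sketched in Python. This is not the lib/tc code: the function name and the attribute dictionary are hypothetical. What the sketch shows is that "no tunnel id" must be a distinct state from "tunnel id 0".

```python
from typing import Optional


def build_tunnel_key_set(dst_ip: str, tun_id: Optional[int] = None) -> dict:
    """Build the attributes of a hypothetical tunnel_key:set action."""
    attrs = {"dst_ip": dst_ip}
    # Emit the id only when the caller supplied one. Testing 'is not None'
    # rather than truthiness keeps an explicit id of 0 valid.
    if tun_id is not None:
        attrs["id"] = tun_id
    return attrs


without_id = build_tunnel_key_set("192.0.2.1")
with_zero_id = build_tunnel_key_set("192.0.2.1", tun_id=0)
```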

Signed-off-by: Adi Nissim <adin@mellanox.com>
Acked-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
5 years agoacinclude: Drop DPDK_EXTRA_LIB variable.
Ilya Maximets [Thu, 7 Feb 2019 12:20:20 +0000 (15:20 +0300)]
acinclude: Drop DPDK_EXTRA_LIB variable.

AC_SEARCH_LIBS enables the libraries itself:

  checking for library containing get_mempolicy... -lnuma
  checking for library containing pcap_dump... -lpcap

So, they are available in LIBS. No need to add them twice.

Also, DPDK_EXTRA_LIB doesn't even work, because each check overwrites
the variable instead of appending the new library. It was first
misused while making libnuma optional and was copy-pasted to several
places after that.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agoAUTHORS: Add Ophir Munk <ophirmu@mellanox.com>
Ian Stokes [Wed, 6 Feb 2019 12:20:30 +0000 (12:20 +0000)]
AUTHORS: Add Ophir Munk <ophirmu@mellanox.com>

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agoAUTHORS: Add Asaf Penso <asafp@mellanox.com>
Ian Stokes [Wed, 6 Feb 2019 12:06:33 +0000 (12:06 +0000)]
AUTHORS: Add Asaf Penso <asafp@mellanox.com>

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agonetdev-dpdk: Memset rte_flow_item on a need basis.
Asaf Penso [Mon, 4 Feb 2019 16:14:41 +0000 (16:14 +0000)]
netdev-dpdk: Memset rte_flow_item on a need basis.

In the netdev_dpdk_add_rte_flow_offload function, different
rte_flow_items are created as part of the pattern matching.

For most of them, there is a check whether the wildcards are not zero.
In case of zero, nothing is being done with the rte_flow_item.

Before the wildcard check, and regardless of the result, the
rte_flow_item is being memset.

The patch moves the memset so that it runs only if the wildcard
condition is true, thus saving memset cycles when it is not needed.

Signed-off-by: Asaf Penso <asafp@mellanox.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agoofproto: Don't always treat passive controllers as "equal".
Ben Pfaff [Tue, 9 Oct 2018 18:15:00 +0000 (11:15 -0700)]
ofproto: Don't always treat passive controllers as "equal".

If a passive controller chooses to configure itself as a slave controller,
I don't know a reason why it should be considered "equal" for some
purposes.

Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agovswitchd: Allow user to configure controllers as "primary" or "service".
Ben Pfaff [Fri, 26 Oct 2018 22:53:55 +0000 (15:53 -0700)]
vswitchd: Allow user to configure controllers as "primary" or "service".

Normally it makes sense for an active connection to be primary and a
passive connection to be a service connection, but I've run into a corner
case where it is better for a passive connection to be a primary
connection.  This specific case is for use with OFtest, which expects to be
a primary controller.  However, it also wants to reconnect frequently,
which is slow for active connections because of the backoff; by
configuring a passive, primary controller, OFtest can reconnect as
frequently and as quickly as it wants, making the overall test much faster.

Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoRemove support for OpenFlow 1.6 (draft).
Ben Pfaff [Fri, 18 Jan 2019 00:20:20 +0000 (16:20 -0800)]
Remove support for OpenFlow 1.6 (draft).

ONF abandoned the OpenFlow specification, so that OpenFlow 1.6 will never
be completed.  It did not contain much in the way of useful features, so
remove what support Open vSwitch already had.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
5 years agoconnmgr: Make treatment of active and passive connections more uniform.
Ben Pfaff [Wed, 24 Oct 2018 21:23:38 +0000 (14:23 -0700)]
connmgr: Make treatment of active and passive connections more uniform.

Until now, connmgr has handled active and passive OpenFlow connections in
quite different ways.  Any active connection, whether it was currently
connected or not, was always maintained as an ofconn.  Whenever such a
connection (re)connected, its settings were cleared.  On the other hand,
passive connections had a separate listener which created an ofconn when
a new connection came in, and these ofconns would be deleted when such a
connection was closed.  This approach is inelegant and has occasionally
led to bugs when reconnection didn't clear all of the state that it
should have.

There's another motivation here.  Currently, active connections are
always primary controllers and passive connections are always service
controllers (as documented in ovs-vswitchd.conf.db(5)).  Sometimes it would
be useful to have passive primary controllers (maybe active service
controllers too but I haven't personally run into that use case).  As is,
this is difficult to implement because there is so much different code in
use between active and passive connections.  This commit will make it
easier.

Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis: Speed up linux kernel downloads.
Ilya Maximets [Tue, 5 Feb 2019 07:16:04 +0000 (10:16 +0300)]
travis: Speed up linux kernel downloads.

CDN links are much faster on average. https://www.kernel.org/
links usually show less than 10 MB/s, while https://cdn.kernel.org/
can give up to 200 MB/s and usually shows speeds much higher than
10 MB/s. Also, 'xz' archives are 30-50 MB smaller than gzip ones.
It takes a bit more time to unpack them, but it's negligible
compared with the download time.

For example,
  linux-3.16.54.tar.gz - 122064395 (116M)
  linux-3.16.54.tar.xz -  81057528 (77M)
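
The figures for linux-3.16.54 can be reproduced with a little arithmetic. A Python sketch, assuming the "M" suffix above means MiB (1024 * 1024 bytes). The helper name mib is made up.

```python
GZ_BYTES = 122064395  # linux-3.16.54.tar.gz
XZ_BYTES = 81057528   # linux-3.16.54.tar.xz


def mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / (1024 * 1024)


saved_bytes = GZ_BYTES - XZ_BYTES
saved_fraction = saved_bytes / GZ_BYTES

print(f"gz: {mib(GZ_BYTES):.1f} MiB, xz: {mib(XZ_BYTES):.1f} MiB")
print(f"saved: {mib(saved_bytes):.1f} MiB ({saved_fraction:.1%})")
```

For this kernel version the xz archive is roughly a third smaller than the gzip one.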

'xz' archive download via a CDN link is the default way of
downloading kernels provided by kernel.org.

Some examples from Travis builds:
Before:

  100%[==========================>] 122,064,395 3.11MB/s   in 36s
  (3.23 MB/s) - 'linux-3.16.54.tar.gz' saved [122064395/122064395]

  100%[==========================>] 157,764,715 7.16MB/s   in 24s
  (6.28 MB/s) - 'linux-4.17.14.tar.gz' saved [157764715/157764715]

After:

  100%[==========================>] 81,057,528  95.0MB/s   in 0.8s
  (95.0 MB/s) - 'linux-3.16.54.tar.xz' saved [81057528/81057528]

  100%[==========================>] 102,195,552  218MB/s   in 0.4s
  (218 MB/s) - 'linux-4.17.14.tar.xz' saved [102195552/102195552]

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoFix OpenFlow v1.3.4 Conf test failures: 430.500, 430.510
psiyengar [Thu, 17 Jan 2019 00:53:52 +0000 (16:53 -0800)]
Fix OpenFlow v1.3.4 Conf test failures: 430.500, 430.510

This commit adds additional verification to nx_pull_header__()
in lib/nx-match.c to distinguish between bad match and bad action
header conditions and return the appropriate error type/code.

Signed-off-by: Prashanth Iyengar <prashanth_iyengar@alliedtelesis.com>
Reviewed-by: Tony van der Peet <tony.vanderpeet@alliedtelesis.co.nz>
Reviewed-by: Rahul Gupta <Rahul_Gupta@alliedtelesis.com>
5 years agoskiplist: Drop data comparison in skiplist_delete.
Ilya Maximets [Tue, 29 Jan 2019 13:09:55 +0000 (16:09 +0300)]
skiplist: Drop data comparison in skiplist_delete.

Current version of 'skiplist_delete' uses data comparator to check
if the node that we're removing exists on current level. i.e. our
node 'x' is the next of update[i] on the level i.
But it's enough to just check pointers for that purpose.

Here is a small example of how the data structures look at
this moment:

        i   a   b   c        x        d   e   f
        0  [ ]>[ ]>[*] ---> [ ] ---> [#]>[ ]>[ ]
        1  [ ]>[*] -------> [ ] -------> [#]>[ ]
        2  [ ]>[*] -------> [ ] -----------> [#]
        3  [ ]>[*] ------------------------> [ ]
        4  [*] ----------------------------> [ ]

                        0  1  2  3  4
           update[] = { c, b, b, b, a }
        x.forward[] = { d, e, f }

        c.forward[0] = x
        b.forward[1] = x
        b.forward[2] = x
        b.forward[3] = f
        a.forward[4] = f

  Target:

        i   a   b   c                 d   e   f
        0  [ ]>[ ]>[*] ------------> [#]>[ ]>[ ]
        1  [ ]>[*] --------------------> [#]>[ ]
        2  [ ]>[*] ------------------------> [#]
        3  [ ]>[*] ------------------------> [ ]
        4  [*] ----------------------------> [ ]

        c.forward[0] = x.forward[0] = d
        b.forward[1] = x.forward[1] = e
        b.forward[2] = x.forward[2] = f
        b.forward[3] = f
        a.forward[4] = f

i.e. we're updating forward pointers while update[i].forward[i] == x.
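
The loop can be sketched as a small self-contained skiplist in Python. This is an illustration of the technique, not a translation of lib/skiplist.c: the class names, MAX_LEVEL, the seeded random height and the use of '<' on the stored values are all assumptions. The part that matters is delete(), which unlinks x level by level using pointer identity ('is') and stops at the first level where update[i] no longer points at x.

```python
import random

MAX_LEVEL = 8  # illustrative cap on node height


class Node:
    def __init__(self, data, height):
        self.data = data
        self.forward = [None] * height


class SkipList:
    def __init__(self, seed=0):
        self.head = Node(None, MAX_LEVEL)
        self.level = 0  # highest level index currently in use
        self._rng = random.Random(seed)

    def _random_height(self):
        height = 1
        while height < MAX_LEVEL and self._rng.random() < 0.5:
            height += 1
        return height

    def _find(self, value):
        """Return (update, x): per-level predecessors and the candidate."""
        update = [self.head] * MAX_LEVEL
        x = self.head
        for i in range(self.level, -1, -1):
            while x.forward[i] is not None and x.forward[i].data < value:
                x = x.forward[i]
            update[i] = x
        return update, x.forward[0]

    def insert(self, value):
        update, _ = self._find(value)
        height = self._random_height()
        self.level = max(self.level, height - 1)
        node = Node(value, height)
        for i in range(height):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def delete(self, value):
        update, x = self._find(value)
        if x is None or x.data != value:
            return False
        for i in range(self.level + 1):
            # Pointer identity is enough; no per-level data comparison.
            if update[i].forward[i] is not x:
                break
            update[i].forward[i] = x.forward[i]
        while self.level > 0 and self.head.forward[self.level] is None:
            self.level -= 1
        return True

    def items(self):
        out, x = [], self.head.forward[0]
        while x is not None:
            out.append(x.data)
            x = x.forward[0]
        return out
```

Stopping at the first level where the predecessor does not point at x is safe because a node of height h is linked on levels 0 to h-1 only.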

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoskiplist: Remove 'height' from skiplist_node.
Ben Pfaff [Fri, 25 Jan 2019 20:22:06 +0000 (12:22 -0800)]
skiplist: Remove 'height' from skiplist_node.

This member was write-only: it was initialized and never used later on.

Thanks to Esteban Rodriguez Betancourt <estebarb@hpe.com> for the
following additional rationale:

    In this case you are right, the "height" member is not only not
    used, it is in fact not required, and can be safely removed,
    without causing security issues.

    The code can't read past the end of the 'forward' array because of
    the skiplist "level" member, which specifies the maximum height of
    the whole skiplist.

    The "level" field is updated in insertions and deletions, so that
    in insertion the root node will point to the newly created item
    (if there isn't a list there yet). At the deletions, if the
    deleted item is the last one at that height then the root is
    modified to point to NULL at that height, and the whole skiplist
    height is decremented.

    For the forward_to case:

    - If a node is found in a list of level/height N, then it has
      height N (that's why it was inserted in that list)

    - forward_to travels through nodes in the same level, so it is
      safe, as it doesn't go up.

    - If a node has height N, then it belongs to all the lists
      initiated at root->forward[n, n-1 ,n-2, ..., 0]

    - forward_to may go to lower levels, but that is safe, because of
      previous point.

    So, the protection is given by the "level" field in skiplist root
    node, and it is enough to guarantee that the code won't go off
    limits at 'forward' array. But yes, the height field is unused,
    unneeded, and can be removed safely.

CC: Esteban Rodriguez Betancourt <estebarb@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoconntrack: Fix possibly uninitialized memory.
Darrell Ball [Tue, 5 Feb 2019 00:02:15 +0000 (16:02 -0800)]
conntrack: Fix possibly uninitialized memory.

There are a few cases where struct 'conn_key' padding may be unspecified
according to the C standard.  In practice, implementations do not seem to
have an issue, but it is better to be safe.  The code paths modified are not
hot ones.  Fix this by doing a memcpy in these cases in lieu of a
structure copy.

Found by inspection.
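
A minimal sketch of the idea, assuming a made-up 'struct demo_key' rather
than the real 'conn_key': a structure assignment is only required to copy
the members, while memcpy() copies every byte of the object, padding
included. That matters if the bytes are later hashed or compared as raw
memory, which is an assumption about how such keys are used.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical key.  On typical ABIs there are padding bytes between
 * 'proto' and 'addr'. */
struct demo_key {
    uint8_t proto;
    uint32_t addr;
};

/* Starts from all-zero bytes so that padding begins in a known state. */
static void
demo_key_init(struct demo_key *key, uint8_t proto, uint32_t addr)
{
    memset(key, 0, sizeof *key);
    key->proto = proto;
    key->addr = addr;
}

/* Copies the whole object representation, padding included, instead of
 * using '*dst = *src', which leaves the padding of 'dst' unspecified. */
static void
demo_key_copy(struct demo_key *dst, const struct demo_key *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

After demo_key_copy(), a memcmp() of the two objects over
sizeof(struct demo_key) reports them as equal.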

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoflow: fix a possible memory leak in parse_ct_state
Li RongQing [Mon, 28 Jan 2019 05:49:09 +0000 (13:49 +0800)]
flow: fix a possible memory leak in parse_ct_state

state_s should always be freed before exiting parse_ct_state.
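
The usual way to guarantee this is a single exit path. The sketch below
is not parse_ct_state(); demo_parse_states() and its two accepted tokens
are hypothetical and only show the pattern of routing the error return
through the same free() as the success return.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser: accepts a '|'-separated list in which every
 * token is "new" or "est". */
static bool
demo_parse_states(const char *state_str)
{
    size_t n = strlen(state_str) + 1;
    char *state_s = malloc(n);      /* Private copy that strtok() may modify. */
    bool ok = false;

    if (!state_s) {
        return false;
    }
    memcpy(state_s, state_str, n);

    for (char *tok = strtok(state_s, "|"); tok; tok = strtok(NULL, "|")) {
        if (strcmp(tok, "new") && strcmp(tok, "est")) {
            goto out;               /* Error path still reaches free(). */
        }
    }
    ok = true;

out:
    free(state_s);
    return ok;
}
```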

Fixes: b4293a336d8d ("conntrack: Move ct_state parsing to lib/flow.c")
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoofproto-dpif-trace: Fix for the segmentation fault in ofproto_trace().
Ashish Varma [Mon, 4 Feb 2019 23:34:34 +0000 (15:34 -0800)]
ofproto-dpif-trace: Fix for the segmentation fault in ofproto_trace().

Added a NULL check for the "next_ct_states" argument passed to the
"ofproto_trace()" function. In the normal case, this is non-NULL. A NULL
"next_ct_states" argument is passed from the "upcall_xlate()" function on
encountering XLATE_RECURSION_TOO_DEEP or XLATE_TOO_MANY_RESUBMITS error.

VMware-BZ: #2282287
Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodatapath: Fix IPv6 later frags parsing
Yi-Hung Wei [Thu, 31 Jan 2019 21:16:19 +0000 (13:16 -0800)]
datapath: Fix IPv6 later frags parsing

Upstream commit:
    commit 41e4e2cd75346667b0c531c07dab05cce5b06d15
    Author: Yi-Hung Wei <yihung.wei@gmail.com>
    Date:   Thu Jan 3 09:51:57 2019 -0800

    openvswitch: Fix IPv6 later frags parsing

    The previous commit fa642f08839b
    ("openvswitch: Derive IP protocol number for IPv6 later frags")
    introduces IP protocol number parsing for IPv6 later frags that can mess
    up the network header length calculation logic, i.e. nh_len < 0.
    However, the network header length calculation is mainly for deriving
    the transport layer header in the key extraction process, which does not
    apply to later fragments.

    Therefore, this commit skips the network header length calculation to
    fix the issue.

Reported-by: Chris Mi <chrism@mellanox.com>
Reported-by: Greg Rose <gvrose8192@gmail.com>
Fixes: fa642f08839b ("openvswitch: Derive IP protocol number for IPv6 later frags")
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 9a4ab6da01f7 ("datapath: Derive IP protocol number for IPv6 later frags")
Cc: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodatapath: Derive IP protocol number for IPv6 later frags
Yi-Hung Wei [Thu, 31 Jan 2019 21:16:18 +0000 (13:16 -0800)]
datapath: Derive IP protocol number for IPv6 later frags

Upstream commit:
    commit fa642f08839bf2ff35b2f6c6a6c062aee8121ba8
    Author: Yi-Hung Wei <yihung.wei@gmail.com>
    Date:   Tue Sep 4 15:33:41 2018 -0700

    openvswitch: Derive IP protocol number for IPv6 later frags

    Currently, OVS only parses the IP protocol number for the first
    IPv6 fragment, but sets the IP protocol number for the later fragments
    to be NEXTHDR_FRAGMENT.  This patch tries to derive the IP protocol
    number for the IPv6 later frags so that we can match that.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
CC: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodatapath: Avoid OOB read when parsing flow nlattrs
Ross Lagerwall [Thu, 31 Jan 2019 21:16:17 +0000 (13:16 -0800)]
datapath: Avoid OOB read when parsing flow nlattrs

Upstream commit:
    commit 04a4af334b971814eedf4e4a413343ad3287d9a9
    Author: Ross Lagerwall <ross.lagerwall@citrix.com>
    Date:   Mon Jan 14 09:16:56 2019 +0000

    openvswitch: Avoid OOB read when parsing flow nlattrs

    For nested and variable attributes, the expected length of an attribute
    is not known and marked by a negative number.  This results in an OOB
    read when the expected length is later used to check if the attribute is
    all zeros. Fix this by using the actual length of the attribute rather
    than the expected length.
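
    A minimal sketch of the fix described above, with hypothetical
    demo_* names rather than the kernel's helpers: the scan is bounded
    by the attribute's actual payload length, and the expected length,
    which may be negative, is never used as a size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool
demo_is_all_zero(const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (data[i]) {
            return false;
        }
    }
    return true;
}

/* 'expected_len' is negative for nested and variable-length attributes
 * in this sketch.  Converting such a value to size_t would give a huge
 * bound and an out-of-bounds read, so only 'actual_len' is used. */
static bool
demo_attr_is_all_zero(const uint8_t *payload, size_t actual_len,
                      int expected_len)
{
    (void) expected_len;
    return demo_is_all_zero(payload, actual_len);
}
```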

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodatapath: Add support for kernel 4.18.x
Yifeng Sun [Tue, 29 Jan 2019 22:18:08 +0000 (14:18 -0800)]
datapath: Add support for kernel 4.18.x

No code changes are necessary to support 4.18.x.

Only one kernel test failed and it is in the process of being fixed.

Updated .travis.yml to include 4.18.x and also use latest 4.17 version.
Updated test files to test 4.18 kernel.

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodpif-netlink: Fix a bug that causes duplicate key error in datapath
Yifeng Sun [Thu, 31 Jan 2019 23:10:00 +0000 (15:10 -0800)]
dpif-netlink: Fix a bug that causes duplicate key error in datapath

Kmod tests 122 and 123 failed and kernel reports a "Duplicate key of
type 6" error. Further debugging reveals that nl_attr_find__() should
start looking for OVS_KEY_ATTR_ETHERTYPE from the offset returned by
a previous call to nl_msg_start_nested(). This patch fixes it.

Tests 122 and 123 were skipped by kernel 4.15 and older versions.
Kernel 4.16 and later kernels start showing this failure.

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotest: Fix failed test "flow resume with geneve tun_metadata"
Yifeng Sun [Fri, 1 Feb 2019 17:56:53 +0000 (09:56 -0800)]
test: Fix failed test "flow resume with geneve tun_metadata"

Test "flow resume with geneve tun_metadata" failed because there is
no controller running to handle the continuation message. A previous
commit deleted the line that starts ovs-ofctl as a controller in
order to avoid a race condition on monitor log. This patch adds
back this line but omits the log file because this test doesn't
depend on the log file.

Fixes: e8833217914f9c071c49 ("system-traffic.at: avoid a race condition on monitor log")
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
CC: David Marchand <david.marchand@redhat.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoSupport for match & set ICMPv6 reserved and options type fields
Vishal Deep Ajmera [Mon, 28 Jan 2019 11:41:06 +0000 (11:41 +0000)]
Support for match & set ICMPv6 reserved and options type fields

Currently OVS supports all ARP protocol fields as OXM match fields to
implement the relevant ARP procedures for IPv4. This includes support
for matching, copying and setting ARP fields. In IPv6, ARP has been
replaced by ICMPv6 neighbor discovery (ND) procedures, neighbor
advertisement and neighbor solicitation.

The support for ICMPv6 fields in OVS is not complete for the use cases
equivalent to ARP in IPv4. OVS lacks support for matching, copying and
setting the “ND option type” and “ND reserved” fields. Without these, users
cannot implement all ICMPv6 ND procedures for IPv6 support.

This commit adds additional OXM fields to OVS for ICMPv6 “ND option type”
and ICMPv6 “ND reserved” using the OXM extension mechanism. This allows
support for parsing these fields from an ICMPv6 packet header and extending
the OpenFlow protocol with specifications for these new OXM fields for
matching, copying and setting.

Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Co-authored-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
Signed-off-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoodp-util: Stop parse odp actions if nlattr is overflow
Yifeng Sun [Fri, 1 Feb 2019 23:56:04 +0000 (15:56 -0800)]
odp-util: Stop parse odp actions if nlattr is overflow

`encap = nl_msg_start_nested(key, OVS_KEY_ATTR_ENCAP)` ensures that
key->size >= (encap + NLA_HDRLEN), so the `if` statement is safe.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11306
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoofp-actions: Set an action depth limit to prevent stackoverflow by ofpacts_parse
Yifeng Sun [Sat, 2 Feb 2019 00:44:26 +0000 (16:44 -0800)]
ofp-actions: Set an action depth limit to prevent stackoverflow by ofpacts_parse

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12557
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoAUTHORS: Add Hyong Youb Kim <hyonkim@cisco.com>.
Ben Pfaff [Mon, 4 Feb 2019 20:30:07 +0000 (12:30 -0800)]
AUTHORS: Add Hyong Youb Kim <hyonkim@cisco.com>.

Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovs-tcpdump: Fix an undefined variable
Hyong Youb Kim [Sat, 2 Feb 2019 07:19:40 +0000 (23:19 -0800)]
ovs-tcpdump: Fix an undefined variable

Run ovs-tcpdump without --span, and it throws the following
exception. Define mirror_select_all to avoid the error.

Traceback (most recent call last):
  File "/usr/local/bin/ovs-tcpdump", line 488, in <module>
    main()
  File "/usr/local/bin/ovs-tcpdump", line 454, in main
    mirror_select_all)
UnboundLocalError: local variable 'mirror_select_all' referenced before assignment

Fixes: 0475db71c650 ("ovs-tcpdump: Add --span to mirror all ports on bridge.")
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovn-controller: Fix chassisredirect port flapping when ovs-vswitchd crashes
Numan Siddique [Mon, 4 Feb 2019 16:50:31 +0000 (22:20 +0530)]
ovn-controller: Fix chassisredirect port flapping when ovs-vswitchd crashes

When ovs-vswitchd crashes on a chassis for some reason, the BFD status doesn't
get updated in the ovs db. ovn-controller will be reading the old BFD status
even though ovs-vswitchd has crashed. This results in the chassisredirect port
claim flapping between the master chassis and the chassis with the next higher
priority if ovs-vswitchd crashes on the master chassis.

All the other chassis notice that the BFD status with the master chassis is
down, and hence the chassis with the next higher priority claims the port. But
according to the master chassis, the BFD status is fine and it claims back the
chassisredirect port. This results in flapping. The issue gets resolved
when ovs-vswitchd comes back, but until then it leads to a lot of SB DB
transactions and high CPU usage in the ovn-controllers.

This patch fixes the issue by checking the OF connection status of the
ovn-controller with ovs-vswitchd and calculating the active bfd tunnels
only if it's connected.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agosystem-dpdk-macros.at: Drop dpdk-socket-mem configuration.
Ilya Maximets [Fri, 1 Feb 2019 14:45:38 +0000 (17:45 +0300)]
system-dpdk-macros.at: Drop dpdk-socket-mem configuration.

There are two reasons:
1. OVS provides the same default itself.
2. socket-mem is not necessary with dynamic memory model in DPDK 18.11.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agoconntrack: Fix max size for inet_ntop() call.
Darrell Ball [Fri, 1 Feb 2019 07:35:41 +0000 (23:35 -0800)]
conntrack: Fix max size for inet_ntop() call.

The size passed to inet_ntop() in repl_ftp_v6_addr() is 1 short of what is
needed to handle the maximum possible V6 address size for the v4 mapping case.
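
As a point of reference, POSIX defines INET6_ADDRSTRLEN (46) as a buffer
size that fits the longest textual IPv6 form, including the terminating
NUL. The sketch below is not repl_ftp_v6_addr(); demo_format_v6() is a
hypothetical helper that passes the full buffer size to inet_ntop().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Formats 'addr' into 'out', which must hold INET6_ADDRSTRLEN bytes.
 * Passing a smaller size can make inet_ntop() fail with ENOSPC for the
 * longest addresses. */
static bool
demo_format_v6(const struct in6_addr *addr, char *out)
{
    return inet_ntop(AF_INET6, addr, out, INET6_ADDRSTRLEN) != NULL;
}
```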

Found by inspection.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoconntrack: fix ftp ipv4 address substitution.
Darrell Ball [Fri, 1 Feb 2019 07:35:40 +0000 (23:35 -0800)]
conntrack: fix ftp ipv4 address substitution.

When replacing the ipv4 address in repl_ftp_v4_addr(), the remaining size
was incorrectly calculated which could lead to the wrong replacement
adjustment.

This goes unnoticed most of the time, unless you carefully choose your
initial and replacement addresses.

An example failing address combination is 10.1.1.200 DNAT'd to 10.1.100.1.

Fix this by doing something similar to V6 and also splicing out common
code for better coverage and maintainability.

A test is updated to exercise different initial and replacement addresses
and another test is added.
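
The remaining-size bookkeeping involved in this kind of substitution can
be sketched with a generic helper. This is not the code in
repl_ftp_v4_addr(); demo_replace_span() and its parameters are
hypothetical. The point is that the tail to move is computed from the end
of the old span, and the capacity check uses the length after
replacement.

```c
#include <stddef.h>
#include <string.h>

/* Replaces buf[off .. off + old_len) with 'rep' in a NUL-terminated
 * string of length 'len' stored in a buffer of 'cap' bytes.  Returns the
 * new length, or -1 if the span is invalid or the result does not fit. */
static int
demo_replace_span(char *buf, size_t len, size_t cap,
                  size_t off, size_t old_len, const char *rep)
{
    size_t rep_len = strlen(rep);

    if (off > len || old_len > len - off) {
        return -1;
    }
    size_t tail = len - (off + old_len);    /* Bytes after the old span. */
    size_t new_len = off + rep_len + tail;
    if (new_len + 1 > cap) {
        return -1;
    }
    /* Move the tail and its NUL first, then drop in the replacement. */
    memmove(buf + off + rep_len, buf + off + old_len, tail + 1);
    memcpy(buf + off, rep, rep_len);
    return (int) new_len;
}
```

For example, replacing "10,1,1,200" with "10,1,100,1" in an FTP PORT
argument leaves the overall length unchanged even though the individual
octets change width, which is the kind of case per-octet size arithmetic
can get wrong.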

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Reported-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodpdk: Limit DPDK memory usage.
Ilya Maximets [Tue, 29 Jan 2019 08:11:50 +0000 (11:11 +0300)]
dpdk: Limit DPDK memory usage.

Since the 18.05 release, DPDK has used a dynamic memory model in which
hugepages could be allocated on demand. At the same time '--socket-mem'
option was re-defined as a size of pre-allocated memory, i.e. memory
that should be allocated at startup and could not be freed.
So, DPDK with a new memory model could allocate more hugepage memory
than specified in '--socket-mem' or '-m' options.

This change adds new configurable 'other_config:dpdk-socket-limit'
which could be used to limit the amount of memory DPDK could use.
It uses new DPDK option '--socket-limit'.
Ex.:

  ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-limit="1024,1024"

Also, in order to preserve old behaviour, if '--socket-limit' is not
specified, it will be defaulted to the amount of memory specified by
'--socket-mem' option, i.e. OVS will not be able to allocate more.
This is needed, for example, to prevent OVS from allocating more memory
than reserved for it by Nova in OpenStack installations.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agolib/tc: add set ipv6 traffic class action offload via pedit
Pieter Jansen van Vuuren [Mon, 28 Jan 2019 12:29:10 +0000 (12:29 +0000)]
lib/tc: add set ipv6 traffic class action offload via pedit

Extend ovs-tc translation by allowing non-byte-aligned fields
for set actions. Use new boundary shifts and add set ipv6 traffic
class action offload via pedit.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
5 years agolib/tc: add set ipv4 dscp and ecn action offload via pedit
Pieter Jansen van Vuuren [Mon, 28 Jan 2019 12:29:09 +0000 (12:29 +0000)]
lib/tc: add set ipv4 dscp and ecn action offload via pedit

Add setting of ipv4 dscp and ecn fields in tc offload using pedit.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
5 years agolib/tc: fix 32 bits shift for pedit offset calculation
Pieter Jansen van Vuuren [Mon, 28 Jan 2019 12:29:08 +0000 (12:29 +0000)]
lib/tc: fix 32 bits shift for pedit offset calculation

pedit allows setting entire words with an optional mask and OVS
makes use of such masks to allow setting fields that do not span
entire words. One mask for leading bytes that should not be
updated and another mask for trailing bytes that should not be
updated. The masks are created using bit shifts.

In the case of the mask to omit trailing bytes a right bit shift
is used. Currently the code can produce shifts of 1, 2, 3 or 4
bytes (8, 16, 24 or 32 bits) based on the alignment of the end
of the field being set.

However, a shift of 32 bits on a 32bit value is undefined.
As it stands the code relies on the result of UINT32_MAX >> 32
being UINT32_MAX. Or in other words a mask that results in the
pedit action setting all bytes of the word under operation.

This patch adjusts the code to use a shift of 0 for this case,
which gives the same result as the undefined behaviour that was
relied on, and appears logically correct as the desire is for no
trailing bytes (or bits!) to be omitted from the set action.
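
A minimal sketch of the adjusted calculation, assuming a hypothetical
demo_trailing_mask() that takes the byte offset just past the end of the
field. It is not the code in lib/tc.c and it ignores byte order; it only
shows how the shift count is kept in the range 0..24 so that a 32-bit
value is never shifted by 32.

```c
#include <stdint.h>

/* 'field_end' is the byte offset one past the last byte of the field.
 * When the field ends on a 4-byte boundary, no trailing bytes are
 * omitted and the shift is 0 rather than 32. */
static uint32_t
demo_trailing_mask(unsigned int field_end)
{
    unsigned int omit_bytes = (4 - field_end % 4) % 4;  /* 0..3, never 4. */

    return UINT32_MAX >> (8 * omit_bytes);
}
```

With this form, an aligned end (for example field_end = 4) yields
UINT32_MAX by construction instead of by undefined behaviour.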

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
5 years agolib/tc: make pedit mask calculations byte order agnostic
Pieter Jansen van Vuuren [Mon, 28 Jan 2019 12:29:07 +0000 (12:29 +0000)]
lib/tc: make pedit mask calculations byte order agnostic

pedit allows setting entire words with an optional mask and OVS
makes use of such masks to allow setting fields that do not span
entire words.

The struct tc_pedit_key structure, which is part of the kernel
ABI, uses host byte order fields to store the mask and value for
a pedit action, however, these fields contain values in network
byte order.

In order to allow static analysis tools to check for endianness
problems this patch adds a local version of struct tc_pedit_key
which uses big endian types and refactors the relevant code as
appropriate.

In the course of making this change it became apparent that the
calculation of masks was occurring using host byte order although
the values are in network byte order. This patch also fixes that
problem by shifting values in host byte order and then converting
them to network byte order. It is believed this fixes a bug on big
endian systems although we are not in a position to test that.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
5 years agodpdk: Use svec instead of re-inventing.
Ilya Maximets [Tue, 29 Jan 2019 08:11:49 +0000 (11:11 +0300)]
dpdk: Use svec instead of re-inventing.

No need to implement dynamic vector to store arguments.
'svec' perfectly covers all the needed functionality.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agoraft.c: Remove noisy INFO log
Han Zhou [Tue, 29 Jan 2019 00:31:40 +0000 (16:31 -0800)]
raft.c: Remove noisy INFO log

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodpdk: Use dynamic string for socket-mem construction.
Ilya Maximets [Tue, 22 Jan 2019 13:22:30 +0000 (16:22 +0300)]
dpdk: Use dynamic string for socket-mem construction.

No need to allocate memory and use 'strcat' directly.
'dynamic-string' could do this for us.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agoFix test 'testing ovn -- IP packet buffering' on Windows
Alin Gabriel Serdean [Wed, 23 Jan 2019 10:41:10 +0000 (12:41 +0200)]
Fix test 'testing ovn -- IP packet buffering' on Windows

The test fails on Windows because of:
<--cut-->
ovn-nbctl: sw0: invalid network address: 2001;1\64
ovn-nbctl: sw1: invalid network address: 2002;1\64
<--cut-->

This is due to the fact msys converts '::1' into ';1'.

Use IPv6 long form instead of its short variant.

Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Ben Pfaff <blp@ovn.org>
5 years agodatapath-windows: Add support for 'OVS_KEY_ATTR_ENCAP' key attribute.
Anand Kumar [Fri, 11 Jan 2019 00:45:24 +0000 (16:45 -0800)]
datapath-windows: Add support for 'OVS_KEY_ATTR_ENCAP' key attribute.

Add a new structure in l2 header to accommodate vlan header,
based on commit "d7efce7beff25052bd9083419200e1a47f0d6066
datapath: 802.1AD Flow handling, actions, vlan parsing, netlink attributes"

Also reset vlan header in flow key, after deleting vlan tag from nbl

With this change a sample vlan flow would look like,
eth(src=0a:ea:8a:24:03:86,dst=0a:cd:fa:4d:15:5c),in_port(3),eth_type(0x8100),
vlan(vid=2239,pcp=0),encap(eth_type(0x0800),ipv4(src=13.12.11.149,dst=13.12.11.107,
proto=1,tos=0,ttl=128,frag=no),icmp(type=8,code=0))

Signed-off-by: Anand Kumar <kumaranand@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
5 years agoRevert "ofproto-dpif: Let the dpif report when a port is a duplicate."
Flavio Leitner [Thu, 13 Dec 2018 14:24:48 +0000 (12:24 -0200)]
Revert "ofproto-dpif: Let the dpif report when a port is a duplicate."

This reverts commit 7521e0cf9e88a62f2feff4e7253654557f94877e.

This patch introduced a regression in OSP environments using internal
ports in other netns. Their networking configuration is lost when
the service is restarted because the ports are recreated now.

Before the patch it checked using netlink if the port with a specific
"name" was already there. The check is a lookup in all ports attached
to the DP regardless of the port's netns.

After the patch it relies on the kernel to identify that situation.
Unfortunately the only protection there is register_netdevice() which
fails only if the port with that name exists in the current netns.

If the port is in another netns, it will get a new dp_port and because
of that userspace will delete the old port. At this point the original
port is gone from the other netns and there is a fresh port in the current
netns.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoRevert "ofproto-dpif: Check for EBUSY as well"
Flavio Leitner [Thu, 13 Dec 2018 14:24:47 +0000 (12:24 -0200)]
Revert "ofproto-dpif: Check for EBUSY as well"

This reverts commit c65259a9b6e5380ac963944b69949ceb71ae623a.

The original commit 7521e0cf9e88 ("ofproto-dpif: Let the dpif report
when a port is a duplicate.") relies on the kernel to check if the
port exists or not. However, the current kernel code doesn't handle
when the port is moved to another network namespace.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoRevert "dpif-netlink: Don't destroy and recreate port if it exists"
Flavio Leitner [Thu, 13 Dec 2018 14:24:46 +0000 (12:24 -0200)]
Revert "dpif-netlink: Don't destroy and recreate port if it exists"

This reverts commit a38dccb3ee80a1d0b8973191c9e94f045441f8cc.

The original commit 7521e0cf9e88 ("ofproto-dpif: Let the dpif report
when a port is a duplicate.") relies on the kernel to check if the
port exists or not. However, the current kernel code doesn't handle
when the port is moved to another network namespace.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotreewide: Get rid of // comments, even inside comments.
Ben Pfaff [Wed, 23 Jan 2019 20:09:46 +0000 (12:09 -0800)]
treewide: Get rid of // comments, even inside comments.

Just a style fix.

With this patch, the following reports no hits:

git ls-files | grep '\.[ch]$' | grep -vE 'datapath|sflow' \
    | xargs grep -n // | grep -vE "http|s/|'|\""

Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reported-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoAUTHORS: Add Li RongQing.
Ben Pfaff [Fri, 25 Jan 2019 21:08:51 +0000 (13:08 -0800)]
AUTHORS: Add Li RongQing.

Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoflow: fix udp checksum
Li RongQing [Fri, 25 Jan 2019 11:08:33 +0000 (19:08 +0800)]
flow: fix udp checksum

As per RFC 768, if the calculated UDP checksum is 0, it should
instead be set to 0xFFFF in the frame. A value of 0 in the checksum
field indicates to the receiver that no checksum was calculated
and hence it should not verify the checksum.
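
The rule can be sketched as a final step applied to an already computed
checksum. demo_udp_csum_finalize() is a hypothetical helper name, not a
function from lib/flow.c.

```c
#include <stdint.h>

/* Maps a computed UDP checksum of 0 to 0xFFFF, since 0 on the wire
 * means "no checksum was calculated".  Other values pass through. */
static uint16_t
demo_udp_csum_finalize(uint16_t computed)
{
    return computed ? computed : 0xffff;
}
```

In one's complement arithmetic 0x0000 and 0xFFFF both represent zero, so
the substitution does not change the value a receiver verifies.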

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agohash: Implement hash for aarch64 using CRC32c intrinsics.
Yanqin Wei (Arm Technology China) [Fri, 25 Jan 2019 11:28:01 +0000 (11:28 +0000)]
hash: Implement hash for aarch64 using CRC32c intrinsics.

This commit adds lib/hash-aarch64.h to implement hash for aarch64.
It is based on aarch64 built-in CRC32c intrinsics, which accelerates
hash function for datapath performance.

test:
1. "test-hash" case passed in aarch64 platform.
2.  OVS-DPDK datapath performance test was run (NIC to NIC).
    Test bed: aarch64(Centriq 2400) platform.
    Test case: DPCLS forwarding(disable EMC + avg 10 subtable lookups)
    Test result: improve around 10%.

Signed-off-by: Yanqin Wei <yanqin.wei@arm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovs-macros.at: Better hide 'exec -a' checking.
Ilya Maximets [Fri, 25 Jan 2019 13:21:22 +0000 (16:21 +0300)]
ovs-macros.at: Better hide 'exec -a' checking.

There is some issue with parsing of redirection options
on some shells. For example:

  $ (exec -a name true) 2>&1 >/dev/null || echo "failed"
  sh: 10: exec: -a: not found
  failed

  $ (exec -a name true) >/dev/null 2>&1 || echo "failed"
  failed

So, the order of redirections matters for some reason.
Let's replace our current version with simple redirection of stderr.
This version seems to work in most shells except [t]csh. But it's
really tricky to write portable redirections that work with csh, and
this shell will not be used by the testsuite on most systems.

With the new version:

  # cat test.sh
  ((exec -a myname true 2>/dev/null) && echo "OK") || echo "fail"

  # sh ./test.sh
  fail
  # bash ./test.sh
  OK
  # tcsh ./test.sh
  -a: Command not found.
  fail

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agostt: Fix return code during xmit.
Aaron Conole [Thu, 24 Jan 2019 18:20:13 +0000 (10:20 -0800)]
stt: Fix return code during xmit.

In the case of an error, return the error code as opposed to
NETDEV_TX_OK.

Caught by compiler warning:

  /home/travis/build/ovsrobot/ovs/datapath/linux/stt.c: In function ‘ovs_stt_xmit’:
  /home/travis/build/ovsrobot/ovs/datapath/linux/stt.c:1005:6: warning: variable ‘err’ set but not used [-Wunused-but-set-variable]
    int err;
        ^

Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
5 years agorhel: bug fix upgrade path in kmod fedora spec file
Martin Xu [Tue, 22 Jan 2019 23:02:30 +0000 (15:02 -0800)]
rhel: bug fix upgrade path in kmod fedora spec file

This patch removes the "Conflicts" tag and adds "Obsoletes" tag.

With the conflicts tag, when a user attempts to install or upgrade with
the same version as already installed, the conflict kicks in. Without the
tag, this is allowed with --replacepkgs.

Obsoletes is needed for the upgrade path from kmod-openvswitch to
openvswitch-kmod.

Fixes: 22c33c3039 (rhel: support kmod build against mulitple kernel
versions, fedora)

VMware-BZ: #2249788

Signed-off-by: Martin Xu <martinxu9.ovs@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
CC: Yi-Hung Wei <yihung.wei@gmail.com>
CC: Yifeng Sun <pkusunyifeng@gmail.com>
CC: Zak Whittington <zwhitt.vmware@gmail.com>
CC: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodatapath: return -EEXIST if inet6_add_protocol fails
Greg Rose [Tue, 22 Jan 2019 23:42:55 +0000 (15:42 -0800)]
datapath: return -EEXIST if inet6_add_protocol fails

Our code to determine whether receive functionality will work with
ip6 gre depends on the return of -EEXIST but inet6_add_protocol()
returns a -1 on failure to grab the pointer via a cmpxchg op.  Just
set the error return to -EEXIST to help out the vport init function.

Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-January/048090.html
Reported-by: Ken Ajiro <ken-ajiro@xr.jp.nec.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agocompat: Fixup ipv6 fragmentation on 4.9.135+ kernels
Greg Rose [Thu, 10 Jan 2019 22:09:51 +0000 (14:09 -0800)]
compat: Fixup ipv6 fragmentation on 4.9.135+ kernels

Upstream commit 648700f76b03 ("inet: frags: use rhashtables...") changed
how ipv6 fragmentation is implemented.  This patch was backported to
the upstream stable 4.9.x kernel starting at 4.9.135.

This patch creates the compatibility layer changes required to both
compile and also operate correctly with ipv6 fragmentation on these
kernels. Check if the inet_frags 'rnd' field is present to key on
whether the upstream patch is present.  Also update Travis to the
latest 4.9 kernel release so that this patch is compile tested.

Passes Travis:
https://travis-ci.org/gvrose8192/ovs-experimental/builds/478033409

Cc: William Tu <u9012063@gmail.com>
Cc: Yi-Hung Wei <yihung.wei@gmail.com>
Cc: Yifeng Sun <pkusunyifeng@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoFix crash due to multiple tnl push action
Anju Thomas [Mon, 7 May 2018 19:04:54 +0000 (00:34 +0530)]
Fix crash due to multiple tnl push action

During slow path packet processing, if the action is to output to a
tunnel port, the slow path processing of the encapsulated packet
continues on the underlay bridge and additional actions (e.g. optional
VLAN encapsulation, bond link selection and finally output to port) are
collected there.

To prepare for a continuation of the processing of the original packet
(e.g. output to other tunnel ports in a flooding scenario), the
“tunnel_push” action and the actions of the underlay bridge are
encapsulated in a clone() action to preserve the original packet.

If the underlay bridge decides to drop the tunnel packet (for example if
both bonded ports are down simultaneously), the clone(tunnel_push)
actions previously generated as part of translation of the output to
tunnel port are discarded and a stand-alone tunnel_push action is added
instead. Thus the tunnel header is pushed on to the original packet.
This is the bug.

Consequences: If packet processing continues with sending to further
tunnel ports, multiple tunnel header pushes will happen on the original
packet as typically the tunnels all traverse the same underlay bond
which is down. The packet may not have enough headroom to accommodate
all the tunnel headers. OVS crashes if it runs out of space while trying
to push the tunnel headers.

Even in case there is enough headroom, the packet will not be freed
since the accumulated action list contains only the tunnel header push
action without any output port action. Thus, we either have a crash or a
packet buffer leak.

Signed-off-by: Anju Thomas <anju.thomas@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis/linux-build: enable testing with clang builds
Aaron Conole [Mon, 21 Jan 2019 18:05:26 +0000 (13:05 -0500)]
travis/linux-build: enable testing with clang builds

The CLANG version of the builds have not honored the TESTSUITE variable.
This dates to at least 2015, and the reason for the restriction isn't
clear.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis: enable testsuite with dpdk
Aaron Conole [Mon, 21 Jan 2019 18:05:25 +0000 (13:05 -0500)]
travis: enable testsuite with dpdk

The testsuite flag isn't currently being passed for DPDK.  Let's pass it
and when a future DPDK supports running the check-dpdk suite, we can
turn that on then, too.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years ago seq: Correct example in comment.
Ben Pfaff [Tue, 22 Jan 2019 01:14:36 +0000 (17:14 -0800)]
seq: Correct example in comment.

It was deceptive for the example to imply that a seq can be declared
directly, because the API only allows for creating a new one on the heap.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
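
To illustrate why a direct declaration would be misleading: when a header
only forward-declares a type, callers cannot declare an instance and must
obtain one from a constructor that allocates it. The sketch below shows that
general pattern with hypothetical names (struct counter, counter_create(),
and so on); it is not the seq API itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* --- What a header would expose: an incomplete type plus functions. --- */
struct counter;                         /* Size is unknown to callers. */
struct counter *counter_create(void);
void counter_bump(struct counter *);
uint64_t counter_read(const struct counter *);
void counter_destroy(struct counter *);

/* --- Caller code.  "struct counter c;" would not compile at this point
 * because the type is incomplete; only a pointer obtained from
 * counter_create() can be used. --- */
static uint64_t
caller_example(void)
{
    struct counter *c = counter_create();
    counter_bump(c);
    counter_bump(c);
    uint64_t value = counter_read(c);
    counter_destroy(c);
    return value;
}

/* --- What the implementation file would contain. --- */
struct counter {
    uint64_t value;
};

struct counter *
counter_create(void)
{
    struct counter *c = malloc(sizeof *c);
    if (!c) {
        abort();
    }
    c->value = 0;
    return c;
}

void
counter_bump(struct counter *c)
{
    assert(c != NULL);
    c->value++;
}

uint64_t
counter_read(const struct counter *c)
{
    assert(c != NULL);
    return c->value;
}

void
counter_destroy(struct counter *c)
{
    free(c);
}
```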
5 years ago Prepare for post-2.11.0 (2.11.90).
Justin Pettit [Sun, 20 Jan 2019 21:28:40 +0000 (13:28 -0800)]
Prepare for post-2.11.0 (2.11.90).

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
5 years ago Prepare for 2.11.0.
Justin Pettit [Sun, 20 Jan 2019 21:26:52 +0000 (13:26 -0800)]
Prepare for 2.11.0.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
5 years ago conntrack: Fix FTP seq_skew boundary adjustments.
Darrell Ball [Wed, 16 Jan 2019 02:58:17 +0000 (18:58 -0800)]
conntrack: Fix FTP seq_skew boundary adjustments.

At the same time, splice out a function and also rely on the compiler
for overflow/underflow handling.

Found by inspection.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
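
As a rough illustration of what relying on the compiler for
overflow/underflow handling can look like for TCP sequence numbers: unsigned
32-bit arithmetic in C is defined to wrap modulo 2^32, so a signed skew can
be applied without explicit boundary checks. This is a sketch, not the code
from the commit above; adjust_seq() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the OVS sources: apply a signed
 * sequence skew to a 32-bit TCP sequence number.  Converting the skew to
 * uint32_t and adding is well defined in C and wraps modulo 2^32, so no
 * special cases are needed near 0 or UINT32_MAX. */
static uint32_t
adjust_seq(uint32_t seq, int32_t skew)
{
    return seq + (uint32_t) skew;
}

/* Small self-check that exercises both boundaries. */
static void
adjust_seq_selfcheck(void)
{
    assert(adjust_seq(UINT32_MAX - 1, 5) == 3);     /* Wraps past the top. */
    assert(adjust_seq(2, -5) == UINT32_MAX - 2);    /* Wraps below zero. */
    assert(adjust_seq(1000, 0) == 1000);            /* No skew, no change. */
}
```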
5 years ago conntrack: fix expectations for ftp+DNAT.
David Marchand [Wed, 16 Jan 2019 02:58:16 +0000 (18:58 -0800)]
conntrack: fix expectations for ftp+DNAT.

When configuring the nat part of an expectation, care must be taken to
look at the master nat action and direction to properly reproduce it.

DNAT tests have been added for both active and passive modes, and all
ftp/tftp test titles have been updated to reflect that they deal with
SNAT.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Co-authored-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years ago conntrack: fix tcp seq adjustments when mangling commands.
David Marchand [Wed, 16 Jan 2019 02:58:15 +0000 (18:58 -0800)]
conntrack: fix tcp seq adjustments when mangling commands.

The ftp alg deals with packets in two ways for the command connection:
either they are inspected and can be mangled when nat is enabled
(CT_FTP_CTL_INTEREST) or they just go through without being modified
(CT_FTP_CTL_OTHER).

For CT_FTP_CTL_INTEREST packets, we must both adjust the packet's tcp seq
number by the connection's current offset and prepare for the next
packets by setting an accumulated offset in the ct object.  However,
this was not done for multiple CT_FTP_CTL_INTEREST packets for the same
connection.  This is relevant for handling multiple child data
connections that also need natting.

The tests are updated so that some ftp+NAT tests send multiple port
commands or other similar commands for a single control connection.
Wget is not able to do this, so switch to lftp.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Co-authored-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
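
The two-step handling described in the commit above (shift this packet by
the connection's current offset, then accumulate this packet's size change
for later packets) can be sketched as follows. This is a minimal sketch
under assumptions: struct ctl_conn, its seq_skew member and
mangle_ctl_packet() are illustrative names, not the OVS implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative per-connection state (hypothetical, not the OVS struct). */
struct ctl_conn {
    int32_t seq_skew;   /* Payload size change accumulated so far. */
};

/* Handle one mangled control packet.  'seq' is the sequence number as sent
 * by the endpoint and 'size_delta' is how much NAT rewriting changed the
 * payload length of this packet.  Returns the sequence number to put on
 * the wire. */
static uint32_t
mangle_ctl_packet(struct ctl_conn *conn, uint32_t seq, int32_t size_delta)
{
    /* Step 1: shift this packet by the offset accumulated so far. */
    uint32_t wire_seq = seq + (uint32_t) conn->seq_skew;

    /* Step 2: add this packet's size change for the packets that follow. */
    conn->seq_skew += size_delta;

    return wire_seq;
}

/* Two rewritten commands on one control connection, each growing by 2. */
static void
two_commands_example(void)
{
    struct ctl_conn conn = { 0 };

    assert(mangle_ctl_packet(&conn, 100, 2) == 100);  /* No skew yet. */
    assert(mangle_ctl_packet(&conn, 130, 2) == 132);  /* First delta applied. */
    assert(conn.seq_skew == 4);                       /* Both deltas kept. */
}
```

If step 2 only worked for the first rewritten packet, the second command's
size change would be lost and every later packet on the connection would
carry a sequence number that is off by that amount.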
5 years ago odp-util: Avoid revalidation error for masked NSH set action.
Ben Pfaff [Sat, 15 Dec 2018 02:16:54 +0000 (18:16 -0800)]
odp-util: Avoid revalidation error for masked NSH set action.

A masked NSH set action has mdtype 0 because the mdtype is not being
changed, but odp_nsh_key_from_attr() rejects this because mdtype 0 does
not match up with the OVS_NSH_KEY_ATTR_MD1 attribute being present.  This
fixes the problem.

The kernel datapath function nsh_key_put_from_nlattr() in flow_netlink
has a similar exception.

Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years ago Fix bugs in L3 protocol support.
Ben Pfaff [Sat, 15 Dec 2018 02:16:53 +0000 (18:16 -0800)]
Fix bugs in L3 protocol support.

Test 854 "tunnel_push_pop - action" showed problems in revalidation for
L3 protocol support in its L3 GRE test.  L3 packets (that is, packets
without an Ethernet header but only some L3 protocol such as IPv4 or IPv6)
have an Ethernet type that is kept in the dl_type member of the flow, and
the flows that they pass through can cause L3 and L4 fields to be matched.
However, the translation process incorrectly forced the dl_type to be
wildcarded, which caused a contradiction since it's not possible to match
on L3 and L4 fields if the dl_type is not known, and the code in
odp_flow_key_to_flow() and related functions therefore rejected these flows
at revalidation time.

This commit fixes the problem by treating dl_type the same for L2 and L3
flows in translation.  It also makes odp_flow_key_to_flow__() copy the
Ethernet type that comes from a packet_type field into dl_type, which is
the expected behavior.

The actual error that this fixes is only visible after applying an upcoming
commit that improves logging for bad datapath flows.

Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
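
The contradiction described in the commit above (matching on L3 or L4 fields
while the Ethernet type is wildcarded) can be expressed as a simple
consistency rule. The sketch below uses a heavily reduced, hypothetical
match structure (struct toy_match and toy_match_is_consistent() are not OVS
names) only to show the rule itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced match: a nonzero mask means the field is
 * matched, zero means it is wildcarded. */
struct toy_match {
    uint16_t dl_type_mask;   /* Ethernet type. */
    uint32_t nw_src_mask;    /* An example L3 field. */
    uint16_t tp_dst_mask;    /* An example L4 field. */
};

/* L3 and L4 fields are only meaningful once the Ethernet type is known, so
 * a match that uses them must also match on dl_type. */
static bool
toy_match_is_consistent(const struct toy_match *m)
{
    bool uses_l3_l4 = m->nw_src_mask != 0 || m->tp_dst_mask != 0;
    return !uses_l3_l4 || m->dl_type_mask != 0;
}

static void
toy_match_example(void)
{
    struct toy_match wildcarded_type = { 0, UINT32_MAX, 0 };
    struct toy_match exact_type = { UINT16_MAX, UINT32_MAX, 0 };
    struct toy_match l2_only = { 0, 0, 0 };

    assert(!toy_match_is_consistent(&wildcarded_type));  /* Rejected. */
    assert(toy_match_is_consistent(&exact_type));
    assert(toy_match_is_consistent(&l2_only));
}
```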
5 years ago selinux: Add missing permissions for ovs-kmod-ctl
Yi-Hung Wei [Mon, 7 Jan 2019 23:48:19 +0000 (15:48 -0800)]
selinux: Add missing permissions for ovs-kmod-ctl

Starting from OVS 2.10, ovs-vswitchd may fail to run after a system reboot
because it fails to load the ovs kernel module.  This is because the
conntrack zone limit feature introduced in OVS 2.10 now depends on the
nf_conntrack_ipv4/6 kernel modules, and SELinux prevents them from being
loaded.

Example log of the AVC violations:
    type=AVC msg=audit(1546903594.735:29): avc:  denied  { execute_no_trans }
    for  pid=820 comm="modprobe" path="/usr/bin/bash" dev="dm-0" ino=50337111
    scontext=system_u:system_r:openvswitch_load_module_t:s0
    tcontext=system_u:object_r:shell_exec_t:s0 tclass=file

    type=AVC msg=audit(1546903594.791:30): avc:  denied  { module_request } for
    pid=819 comm="modprobe" kmod="nf_conntrack-2"
    scontext=system_u:system_r:openvswitch_load_module_t:s0
    tcontext=system_u:system_r:kernel_t:s0 tclass=system

This patch adds the missing permissions for the modprobe command in
ovs-kmod-ctl so that the aforementioned issue is resolved.

VMWare-BZ: #2257534
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>