git.proxmox.com Git - mirror_ovs.git/log
4 years ago OVN: run local logical flows first in S_ROUTER_OUT_SNAT table
Lorenzo Bianconi [Sat, 6 Jul 2019 10:45:00 +0000 (12:45 +0200)]
OVN: run local logical flows first in S_ROUTER_OUT_SNAT table

Run local logical flows first if the gw router port is scheduled
on the local chassis in order to properly manage snat traffic.

Tested-by: Eran Kuris <ekuris@redhat.com>
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago rhel: Fixed a bug for checking the correct major version and revision.
Ashish Varma [Mon, 8 Jul 2019 16:51:29 +0000 (09:51 -0700)]
rhel: Fixed a bug for checking the correct major version and revision.

Fixed a bug where checking for major version 3.10 and major revision not
equal to 327 or 693 or 957 should have gone to the default else at the end.
In the current code, the default else condition will not get executed
for kernel with major version 3.10 and major revision not equal
to 327/693/957 resulting in failure to load the kernel module.
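
The control-flow problem described above can be sketched as follows. This is a minimal illustration in C, not the shell logic of ovs-kmod-manage.sh itself; the enum, the function name, and the idea that each listed revision maps to its own branch are assumptions made for the example.

```c
#include <string.h>

/* Hypothetical labels for the branch a kernel ends up in. */
enum kmod_branch {
    KMOD_RHEL_327,
    KMOD_RHEL_693,
    KMOD_RHEL_957,
    KMOD_DEFAULT
};

/* Fixed shape: the version and the revision are tested together, so a 3.10
 * kernel with an unlisted revision falls through to the final default
 * instead of being swallowed by a version-only test. */
static enum kmod_branch
choose_kmod_branch(const char *major_version, int major_revision)
{
    int is_3_10 = !strcmp(major_version, "3.10");

    if (is_3_10 && major_revision == 327) {
        return KMOD_RHEL_327;
    } else if (is_3_10 && major_revision == 693) {
        return KMOD_RHEL_693;
    } else if (is_3_10 && major_revision == 957) {
        return KMOD_RHEL_957;
    } else {
        return KMOD_DEFAULT;
    }
}
```

With this shape, a 3.10 kernel whose revision is, say, 514 reaches KMOD_DEFAULT rather than matching no branch at all.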

Fixes: 402efbe4e176 ("rhel: Add 4.12 kernel support in ovs-kmod-manage.sh")
Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovn-controller: Update stale chassis entry at init
Dumitru Ceara [Mon, 8 Jul 2019 10:07:26 +0000 (12:07 +0200)]
ovn-controller: Update stale chassis entry at init

The first time ovn-controller initializes the Chassis entry (shortly
after start up) we first look if there is a stale Chassis record in the
OVN_Southbound DB by checking if any of the old Encap entries associated
to the Chassis record match the new tunnel configuration. If one is found, it
means that ovn-controller didn't shut down gracefully the last time it
ran, so it didn't clean up the Chassis table. Potentially in the meantime
the OVS system-id was also changed. We then update the stale entry with
the new configuration and store the last configured chassis-id in memory
to avoid walking the Chassis table every time.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovn-controller: Refactor chassis.c to abstract the string parsing
Dumitru Ceara [Mon, 8 Jul 2019 10:07:12 +0000 (12:07 +0200)]
ovn-controller: Refactor chassis.c to abstract the string parsing

Abstract out the chassis config string processing and use library data
structures (e.g., sset).
Rename the get_chassis_id function in ovn-controller.c to
get_ovs_chassis_id to avoid confusion with the newly added
chassis_get_id function from chassis.c which returns the last
successfully configured chassis-id.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovn-controller: Fix chassis ovn-sbdb record init
Dumitru Ceara [Mon, 8 Jul 2019 10:07:00 +0000 (12:07 +0200)]
ovn-controller: Fix chassis ovn-sbdb record init

The chassis_run code didn't take into account the scenario when the
system-id was changed in the Open_vSwitch table. Due to this the code
was trying to insert a new Chassis record in the OVN_Southbound DB with
the same Encaps as the previous Chassis record. The transaction used
to insert the new records was aborting due to the ["type", "ip"]
index constraint violation as we were creating new Encap entries with
the same "type" and "ip" as the old ones.

In order to fix this issue the flow is now:
1. the first time ovn-controller initializes the Chassis (shortly after
start up) we store the chassis-id.
2. for subsequent chassis_run calls we use the last configured
chassis-id stored at the previous step to look up the old Chassis record.
3. when ovn-controller shuts down gracefully we look up the Chassis
record based on the chassis-id stored in memory at steps 1 and 2 above.
This is to avoid failing to clean up the Chassis record in OVN_Southbound
DB if the OVS system-id changes between the last call to chassis_run and
chassis_cleanup.

Reported-at: https://bugzilla.redhat.com/1708146
Reported-by: Haidong Li <haili@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago netdev-dpdk: Enable tx-retries-max config.
Kevin Traynor [Tue, 2 Jul 2019 00:32:30 +0000 (01:32 +0100)]
netdev-dpdk: Enable tx-retries-max config.

vhost tx retries can provide some mitigation against
dropped packets due to a temporarily slow guest/limited queue
size for an interface, but on the other hand when a system
is fully loaded those extra cycles retrying could mean
packets are dropped elsewhere.

Up to now max vhost tx retries have been hardcoded, which meant
no tuning and no way to disable for debugging to see if extra
cycles spent retrying resulted in rx drops on some other
interface.

Add an option to change the max retries, with a value of
0 effectively disabling vhost tx retries.
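
A bounded retry loop of the kind described above might look like the sketch below. It is an illustration only, not the netdev-dpdk code: the callback type, the result struct, and the slow_guest() stub are hypothetical names introduced for the example.

```c
#include <stddef.h>

/* Hypothetical transmit callback: offers 'cnt' packets and returns how many
 * the receiver accepted. */
typedef size_t (*tx_burst_fn)(void *aux, size_t cnt);

struct tx_result {
    size_t sent;        /* Packets accepted by the receiver. */
    size_t dropped;     /* Packets given up on. */
    unsigned retries;   /* Extra attempts made after the first one. */
};

/* Sends 'cnt' packets, retrying at most 'max_retries' times while packets
 * remain.  A 'max_retries' of 0 means one attempt and no retries. */
static struct tx_result
send_with_retries(tx_burst_fn tx, void *aux, size_t cnt, unsigned max_retries)
{
    struct tx_result r = { 0, 0, 0 };
    size_t remaining = cnt;

    for (;;) {
        size_t n = tx(aux, remaining);

        if (n > remaining) {
            n = remaining;          /* Guard against a misbehaving callback. */
        }
        remaining -= n;
        r.sent += n;
        if (!remaining || r.retries >= max_retries) {
            break;
        }
        r.retries++;
    }
    r.dropped = remaining;
    return r;
}

/* Example stub: a slow guest that accepts at most '*aux' packets per call. */
static size_t
slow_guest(void *aux, size_t cnt)
{
    size_t per_call = *(size_t *) aux;

    return cnt < per_call ? cnt : per_call;
}
```

With a guest that accepts 4 packets per call and 10 packets to send, a limit of 0 sends 4 and drops 6, while a limit of 2 delivers all 10 at the cost of two extra attempts.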

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
4 years ago netdev-dpdk: Add custom stat for vhost tx retries.
Kevin Traynor [Tue, 2 Jul 2019 00:32:29 +0000 (01:32 +0100)]
netdev-dpdk: Add custom stat for vhost tx retries.

vhost tx retries may occur, and it can be a sign that
the guest is not optimally configured.

Add a custom stat so a user will know if vhost tx retries are
occurring and hence give a hint that guest config should be
examined.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
4 years ago doc: Move vhost tx retry info to separate section.
Kevin Traynor [Tue, 2 Jul 2019 00:32:28 +0000 (01:32 +0100)]
doc: Move vhost tx retry info to separate section.

vhost tx retry is applicable to vhost-user and vhost-user-client,
but was in the section that compares them. Also, move it further
down the doc, as it is preferable to have more fundamental info about vhost
nearer the top.

Fixes: 6d6513bfc657 ("doc: Add info on vhost tx retries.")
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
4 years ago OVN: do not distribute traffic for local FIP
Lorenzo Bianconi [Thu, 13 Jun 2019 17:47:59 +0000 (19:47 +0200)]
OVN: do not distribute traffic for local FIP

Do not send traffic for local FIP through the overlay tunnels but
manage it in the local hypervisor.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago Documentation: Clarify connection tracking tutorial
Greg Rose [Wed, 19 Jun 2019 21:56:54 +0000 (14:56 -0700)]
Documentation: Clarify connection tracking tutorial

The current documentation states that "all packets entering OVS for
the first time are "untracked"".  However there is a minor exception
to this in the case where a packet (re)enters the same datapath and
the namespace has not changed.  In that case there is no need to
scrub the packet and in this case the connection may already be
in the "tracked" state.

Reported-by: Quan Tian <qtian@vmware.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago rconn: Increase precision of timers.
Ben Pfaff [Tue, 11 Jun 2019 16:55:16 +0000 (09:55 -0700)]
rconn: Increase precision of timers.

Until now, the rconn timers have been precise only to the nearest second.
This increases them to millisecond precision, which seems cleaner these
days.

Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago rconn: Remove write-only struct members.
Ben Pfaff [Tue, 11 Jun 2019 16:55:15 +0000 (09:55 -0700)]
rconn: Remove write-only struct members.

Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago sat-math: Add functions for saturating arithmetic on "long long int".
Ben Pfaff [Tue, 11 Jun 2019 16:55:14 +0000 (09:55 -0700)]
sat-math: Add functions for saturating arithmetic on "long long int".

The first users will be added in an upcoming commit.

Also add tests.
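
For readers unfamiliar with the term, saturating arithmetic clamps a result to the representable range instead of overflowing. The sketch below shows the idea for addition and subtraction on "long long int"; the function names are hypothetical and this is not the sat-math code itself.

```c
#include <limits.h>

/* Returns x + y, clamped to [LLONG_MIN, LLONG_MAX].  The range checks are
 * done before the addition so that no signed overflow ever occurs. */
static long long
sat_add_ll(long long x, long long y)
{
    if (y > 0 && x > LLONG_MAX - y) {
        return LLONG_MAX;
    }
    if (y < 0 && x < LLONG_MIN - y) {
        return LLONG_MIN;
    }
    return x + y;
}

/* Returns x - y, clamped to [LLONG_MIN, LLONG_MAX]. */
static long long
sat_sub_ll(long long x, long long y)
{
    if (y < 0 && x > LLONG_MAX + y) {
        return LLONG_MAX;
    }
    if (y > 0 && x < LLONG_MIN + y) {
        return LLONG_MIN;
    }
    return x - y;
}
```

Checking the bounds first, rather than testing the result afterwards, keeps the code free of undefined behavior, since signed overflow is undefined in C.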

Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ofproto-dpif-upcall: Remove unused macro MAX_QUEUE_LENGTH.
Yunjian Wang [Wed, 19 Jun 2019 04:22:45 +0000 (12:22 +0800)]
ofproto-dpif-upcall: Remove unused macro MAX_QUEUE_LENGTH.

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovn-controller: Omit tracking external_ids columns
Numan Siddique [Fri, 28 Jun 2019 10:44:04 +0000 (16:14 +0530)]
ovn-controller: Omit tracking external_ids columns

Running the command "ovn-nbctl set logical_switch_port foo external_ids:foo=bar"
causes the incremental processing engine to recompute the flows on the
chassis where the logical port 'foo' is claimed.

This patch avoids this unnecessary recomputation by omitting the tracking of
the external_ids column of all the Southbound DB tables except the DNS, Chassis
and Datapath_Binding tables. ovn-controller refers to the external_ids
column of these tables.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago OVN: Enable E-W Traffic, Vlan backed DVR
Ankur Sharma [Thu, 20 Jun 2019 01:36:46 +0000 (01:36 +0000)]
OVN: Enable E-W Traffic, Vlan backed DVR

Background:
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
[2] https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing

Key difference between an overlay logical switch and
vlan backed logical switch is that for vlan logical switches
packets are not encapsulated.

Hence, if a distributed router port is connected to vlan backed
logical switch, then router port mac as source mac could be
seen from multiple hypervisors. Same <mac,vlan> pairs coming
from multiple ports from a top of the rack switch (TOR) perspective
could be seen as a security threat and it could send alarms, drop
the packets or block the ports etc.

This patch addresses the same by introducing the concept of chassis mac.
A chassis mac is CMS provisioned unique mac per chassis. For any routed packet
(i.e source mac is router port mac) going on the wire on a vlan type
logical switch, we will replace its source mac with chassis mac.

This replacing of source mac with chassis mac will happen in table=65
of the logical switch datapath. A flow is added at priority 150, which
matches the source mac and replaces it with chassis mac if the value
is a router port mac.

Example flow:
cookie=0x0, duration=67765.830s, table=65, n_packets=0, n_bytes=0,
idle_age=65534, hard_age=65534, priority=150,reg15=0x1,metadata=0x4,
dl_src=00:00:01:01:02:03 actions=mod_dl_src:aa:bb:cc:dd:ee:ff,
mod_vlan_vid:1000,output:16

Here, 00:00:01:01:02:03 is router port mac and aa:bb:cc:dd:ee:ff
is chassis mac.
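
The per-packet effect of such a flow can be sketched in C as below. The patch itself installs an OpenFlow flow rather than code like this; the function name and MAC_LEN constant are hypothetical and serve only to illustrate the match-and-rewrite step.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* If 'src' equals the router port mac, overwrites it with the chassis mac
 * and returns true.  Otherwise leaves 'src' untouched and returns false. */
static bool
replace_router_mac(uint8_t src[MAC_LEN],
                   const uint8_t router_mac[MAC_LEN],
                   const uint8_t chassis_mac[MAC_LEN])
{
    if (memcmp(src, router_mac, MAC_LEN)) {
        return false;
    }
    memcpy(src, chassis_mac, MAC_LEN);
    return true;
}
```

Using the addresses from the example flow, a packet with source 00:00:01:01:02:03 leaves with source aa:bb:cc:dd:ee:ff, while any other source mac is left alone.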

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovn-controller: Provide the option to configure inactivity probe interval for OpenFlo...
Numan Siddique [Mon, 1 Jul 2019 07:42:08 +0000 (13:12 +0530)]
ovn-controller: Provide the option to configure inactivity probe interval for OpenFlow conn

If the ovn-controller main loop takes more than 5 seconds (if there are lots of logical
flows) before it calls poll_block(), it causes the poll_block to wake up immediately,
since rconn module has to send echo request. With the incremental processing, this is
not an issue as ovn-controller will not recompute again. But for older versions, this
is an issue as it causes flow recomputations and this would result in 100% cpu all the
time.

With this patch, CMS can configure a higher value depending on the workload.

The main intention of this patch is to fix this recomputation issue for older versions
(thereby requesting a backport), but it would still be beneficial with the
incremental processing engine.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Tested-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago db-ctl-base: fix memory leak in cmd-get() function
Damijan Skvarc [Fri, 5 Jul 2019 11:38:47 +0000 (13:38 +0200)]
db-ctl-base: fix memory leak in cmd-get() function

A memory leak occurred when the specified key was not found in the table
record.

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovn: Send GARP for router port IPs of a router port connected to bridged logical...
Numan Siddique [Mon, 1 Jul 2019 07:43:39 +0000 (13:13 +0530)]
ovn: Send GARP for router port IPs of a router port connected to bridged logical switch

This patch handles sending GARPs for

 - router port IPs of a distributed router port

 - router port IPs of a router port which belongs to gateway router
   (with the option - redirect-chassis set in Logical_Router.options)

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovn: Send GARP for the router ports with reside-on-redirect-chassis options set
Numan Siddique [Mon, 1 Jul 2019 07:43:20 +0000 (13:13 +0530)]
ovn: Send GARP for the router ports with reside-on-redirect-chassis options set

With the commit [1], the routing for the provider logical switches
connected to a router is centralized on the master gateway chassis
(if the option reside-on-redirect-chassis is set). When the
failover happens and a standby gateway chassis becomes master,
it should send GARPs for the router port macs. Without this, the
physical switch doesn't learn the new location of the router port macs
immediately and this could result in traffic disruption.

This patch addresses this issue so that the ovn-controller which claims the
distributed gateway router port sends out the GARPs.

ovn-controller sends the GARPs if the Port_Binding.nat_addresses column
is set. This patch makes use of this column instead of adding a new column,
even though the name nat_addresses seems a bit of a misnomer. The documentation is
updated to highlight the usage of this column.

This patch doesn't handle sending the GARPs for the gateway router port IPs.
This will be handled in a separate patch.

[1] - 85706c34d53d ("ovn: Avoid tunneling for VLAN packets redirected to a gateway chassis")

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovn-northd: Refactor the code which sets nat_addresses
Numan Siddique [Mon, 1 Jul 2019 07:43:11 +0000 (13:13 +0530)]
ovn-northd: Refactor the code which sets nat_addresses

The present code which sets the Port_Binding.nat_addresses
can be simplified. This patch does this. This would help in
upcoming commits to set the nat_addresses column with the
mac and IPs of distributed logical router ports and logical
router ports with 'reside-on-redirect-chassis' set.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago tunnel: Add layer 2 IPv6 GRE encapsulation support.
William Tu [Mon, 1 Jul 2019 19:45:22 +0000 (12:45 -0700)]
tunnel: Add layer 2 IPv6 GRE encapsulation support.

The patch adds ip6gre support. Tunnel type 'ip6gre' with packet_type=
legacy_l2 is a layer 2 GRE tunnel over IPv6, carrying inner Ethernet packets
encapsulated with a GRE header and an outer IPv6 header.  Encapsulation of layer 3
packets over IPv6 GRE, ip6gre, is not supported yet.  I tested it by running:
  # make check-kernel TESTSUITEFLAGS='-k ip6gre'
under kernel 5.2 and for userspace:
  # make check TESTSUITEFLAGS='-k ip6gre'

Tested-by: Greg Rose <gvrose8192@gmail.com>
Tested-at: https://travis-ci.org/gvrose8192/ovs-experimental/builds/552977116
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago compat: Clean up tunnel_id_to_key
Greg Rose [Wed, 3 Jul 2019 17:04:55 +0000 (10:04 -0700)]
compat: Clean up tunnel_id_to_key

This function was just a duplicate of tunnel_id_to_key32 - I'm not sure
why it was ever needed but let's dump it now.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago compat: Clean up gre_calc_hlen
Greg Rose [Wed, 3 Jul 2019 17:04:54 +0000 (10:04 -0700)]
compat: Clean up gre_calc_hlen

It's proliferated throughout three .c files so let's pull them all
together in gre.h where the inline function belongs. This requires
some adjustments to the compat layer so that the various iterations
of gre_calc_hlen and ip_gre_calc_hlen since the 3.10 kernel are
handled correctly.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago compat: Remove duplicate metadata destination code
Greg Rose [Wed, 3 Jul 2019 17:04:53 +0000 (10:04 -0700)]
compat: Remove duplicate metadata destination code

ip_gre.c and ip6_gre.c both had duplicate code for handling the tunnel
metadata destinations.  Move the duplicate code over into the right
header file, dst_metadata.h.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ossfuzz: Remove duplicate tcp flags parsing in flow extract target
Bhargava Shastry [Fri, 21 Jun 2019 12:50:35 +0000 (14:50 +0200)]
ossfuzz: Remove duplicate tcp flags parsing in flow extract target

During a code audit, the flow extraction fuzzer target was seen to be
parsing tcp flags from the fuzzer supplied input twice. This is
probably a typo since the second call to `parse_tcp_flags()` is
identical to the first.
Since a call to `parse_tcp_flags()` parses the Ethernet and IP headers
contained in the packet, the second (buggy) call to `parse_tcp_flags()`
creates an expectation that there is a second set of Ethernet and IP
headers beyond the first which is incorrect. This patch fixes this
problem by removing the duplicate code in question.

Signed-off-by: Bhargava Shastry <bshas3@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ossfuzz: Add documentation
Bhargava Shastry [Fri, 21 Jun 2019 14:21:02 +0000 (16:21 +0200)]
ossfuzz: Add documentation

Documents OvS fuzzing effort and performs a rudimentary security
analysis of existing OvS fuzzing harnesses.

Feedback on the documentation and analysis appreciated.

Signed-off-by: Bhargava Shastry <bshas3@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovsdb-idl: Improve comments.
Ben Pfaff [Wed, 26 Jun 2019 21:02:14 +0000 (14:02 -0700)]
ovsdb-idl: Improve comments.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Suggested-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago faq: Correct supported kernel versions for OVS 2.11.x.
Ben Pfaff [Thu, 27 Jun 2019 13:51:43 +0000 (06:51 -0700)]
faq: Correct supported kernel versions for OVS 2.11.x.

I don't think we're planning to backport 5.0 support to OVS 2.11.x, because
that would be counter to our usual practice.

Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Fixes: 2adada0e3db2 ("datapath: Support kernel version 5.0.x")
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovn-nbctl: fix memory leak
Damijan Skvarc [Wed, 3 Jul 2019 11:50:40 +0000 (13:50 +0200)]
ovn-nbctl: fix memory leak

Patch is mostly intended to prevent valgrind from reporting memory leak issues
while running unit tests. Otherwise it brings no benefit, since
the application exits immediately after freeing the memory.

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago vswitchd: Always cleanup userspace datapath.
Ilya Maximets [Mon, 24 Jun 2019 14:20:17 +0000 (17:20 +0300)]
vswitchd: Always cleanup userspace datapath.

'netdev' datapath is implemented within ovs-vswitchd process and can
not exist without it, so it should be gracefully terminated with a
full cleanup of resources upon ovs-vswitchd exit.

This change forces dpif cleanup for 'netdev' datapath regardless of
passing '--cleanup' to 'ovs-appctl exit'. Such a solution makes it
unnecessary to pass this additional option every time for userspace datapath
installations and also avoids terminating the system datapath in
setups where both datapaths run at the same time.

The main part is that dpif_port_del() will lead to netdev_close()
and subsequent netdev_class->destroy(dev) which will stop HW NICs
and free their resources. For vhost-user interfaces it will invoke
vhost driver unregistering with a properly closed vhost-user
connection. For upcoming AF_XDP netdev this will allow to gracefully
destroy xdp sockets and unload xdp programs from linux interfaces.
Another important thing is that port deletion will also trigger
flushing of flows offloaded to HW NICs.

Exception made for 'internal' ports that could have user ip/route
configuration. These ports will not be removed without '--cleanup'.

This change fixes OVS disappearing from the DPDK point of view
(keeping HW NICs improperly configured, sudden closing of vhost-user
connections) and will help with linux devices clearing with upcoming
AF_XDP netdev support.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ben Pfaff <blp@ovn.org>
4 years ago NEWS: Update regarding dumping HW offloaded flows.
Ilya Maximets [Mon, 1 Jul 2019 10:20:55 +0000 (13:20 +0300)]
NEWS: Update regarding dumping HW offloaded flows.

NEWS update was missed while updating docs for dynamic Flow API.
Since this is a user visible change, it should be mentioned here.

Fixes: d74ca2269e36 ("dpctl: Update docs about dump-flows and HW offloading.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Acked-by: Eli Britstein <elibr@mellanox.com>
4 years ago netdev-offload-tc: Fix requesting match on wildcarded vlan tpid.
Ilya Maximets [Wed, 19 Jun 2019 08:05:38 +0000 (11:05 +0300)]
netdev-offload-tc: Fix requesting match on wildcarded vlan tpid.

'mask' must be checked first before configuring key in flower.

CC: Eli Britstein <elibr@mellanox.com>
Fixes: 0b0a84783cd6 ("netdev-tc-offloads: Support match on priority tags")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
4 years ago ovsdb-idl: memory leak while destroying database
Damjan Skvarc [Mon, 1 Jul 2019 10:24:38 +0000 (12:24 +0200)]
ovsdb-idl: memory leak while destroying database

While checking unit tests with valgrind option (make check-valgrind) I have
noticed several memory leaks of the following format:

.....
==20019== 13,883 (296 direct, 13,587 indirect) bytes in 1 blocks are definitely lost in loss record 346 of 346
==20019==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==20019==    by 0x530F52: xcalloc (util.c:121)
==20019==    by 0x5037A1: ovsdb_idl_row_create__ (ovsdb-idl.c:3120)
==20019==    by 0x5045A3: ovsdb_idl_row_create (ovsdb-idl.c:3133)
==20019==    by 0x507240: ovsdb_idl_process_update2 (ovsdb-idl.c:2478)
==20019==    by 0x507240: ovsdb_idl_db_parse_update__ (ovsdb-idl.c:2328)
==20019==    by 0x507240: ovsdb_idl_db_parse_update (ovsdb-idl.c:2380)
==20019==    by 0x508128: ovsdb_idl_process_response (ovsdb-idl.c:742)
==20019==    by 0x508128: ovsdb_idl_process_msg (ovsdb-idl.c:831)
==20019==    by 0x508128: ovsdb_idl_run (ovsdb-idl.c:915)
==20019==    by 0x4106D9: bridge_run (bridge.c:2977)
==20019==    by 0x40719C: main (ovs-vswitchd.c:127)
==20019==
==20019== LEAK SUMMARY:
==20019==    definitely lost: 296 bytes in 1 blocks
==20019==    indirectly lost: 13,587 bytes in 10 blocks
==20019==      possibly lost: 0 bytes in 0 blocks
==20019==    still reachable: 43,563 bytes in 440 blocks
==20019==         suppressed: 288 bytes in 1 blocks
....

The problem is that table records maintained by database which is going to
be destroyed with ovsdb_idl_db_destroy() function are not destroyed.

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago OVN: add the possibility to specify tunnel dst port
Lorenzo Bianconi [Tue, 25 Jun 2019 10:35:26 +0000 (12:35 +0200)]
OVN: add the possibility to specify tunnel dst port

Introduce dst_port in options column of Encap table in order to add the
capability to configure destination port used for tunnel encapsulation

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago doc: Add info on vhost tx retries.
Kevin Traynor [Thu, 27 Jun 2019 11:12:30 +0000 (12:12 +0100)]
doc: Add info on vhost tx retries.

Add documentation about vhost tx retries and external
configuration that can help reduce/avoid them.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
4 years ago stream-ssl: Fix crash on NULL private key and valid certificate.
Ilya Maximets [Tue, 25 Jun 2019 14:28:02 +0000 (17:28 +0300)]
stream-ssl: Fix crash on NULL private key and valid certificate.

Running ovsdb-server with an empty private-key and a non-empty certificate
(or vice versa) causes a crash:

 # ovsdb-tool create ./etc/openvswitch/conf.db ./vswitch.ovsschema
 # ovsdb-server --remote=punix:./db.sock \
                --remote=db:Open_vSwitch,Open_vSwitch,manager_options \
                --private-key=db:Open_vSwitch,SSL,private_key \
                --certificate=db:Open_vSwitch,SSL,certificate \
                --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert

 # ovs-vsctl --no-wait init
 # ovs-vsctl --no-wait set-ssl pkey.key cert.cert ca.cert
 # ovs-vsctl --no-wait set SSL . private_key='""'
 # ovs-vsctl --no-wait set SSL . certificate='cert.new'

 ==25513==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000
 ==25513==The signal is caused by a READ memory access.
 ==25513==Hint: address points to the zero page.
    #0 0x7ff7582aa0a9 in __GI___strlen_sse2
    #1 0x7ff759bdde81  (/lib64/libasan.so.5+0xace81)
    #2 0x7ff759479932  (/lib64/libcrypto.so.1.1+0xb3932)
    #3 0x7ff759473c5a in BIO_ctrl (/lib64/libcrypto.so.1.1+0xadc5a)
    #4 0x7ff7598decc1 in SSL_CTX_use_certificate_file (/lib64/libssl.so.1.1+0x40cc1)
    #5 0x4dbaa7 in stream_ssl_set_certificate_file__ lib/stream-ssl.c:1170
    #6 0x4dca2e in stream_ssl_set_key_and_cert lib/stream-ssl.c:1216
    #7 0x4146b2 in reconfigure_ssl ovsdb/ovsdb-server.c:1254
    #8 0x409c83 in main ovsdb/ovsdb-server.c:368
    #9 0x7ff758233812 in __libc_start_main
    #10 0x40f6bd in _start (ovsdb-server+0x40f6bd)

 AddressSanitizer can not provide additional info.
 SUMMARY: AddressSanitizer: SEGV (/lib64/libc.so.6+0x9a0a9) in __GI___strlen_sse2
 ==25513==ABORTING

Another way to reproduce is to use non-initialized DB entry for
private-key and a file for certificate in ovsdb-server cmdline.

The root cause is that stream_ssl_set_key_and_cert() triggers
configuration for both key and cert if any of them is valid, keeping
it possible for one of them to be NULL.
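
The root cause points at a guard of roughly the following shape: only hand the pair to the SSL library when both paths are present. This is a sketch under assumptions, not the actual patch; the helper names are hypothetical, and treating an empty string the same as NULL is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if 'path' names something, i.e. is neither NULL nor "". */
static bool
path_is_set(const char *path)
{
    return path && path[0] != '\0';
}

/* Returns true only if both the private key and the certificate are set,
 * so that neither is passed on as NULL to code that will dereference it. */
static bool
key_and_cert_usable(const char *private_key_file,
                    const char *certificate_file)
{
    return path_is_set(private_key_file) && path_is_set(certificate_file);
}
```

A caller would skip the key and certificate configuration while key_and_cert_usable() is false and try again once the missing half has been configured.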

Fixes: 6f1e91b1d7c0 ("stream-ssl: Make changing keys and certificate at runtime reliable.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
4 years ago netdev-dpdk: Fix additional vhost tx retry.
Kevin Traynor [Thu, 27 Jun 2019 11:12:29 +0000 (12:12 +0100)]
netdev-dpdk: Fix additional vhost tx retry.

Fix minor issue of one possible additional retry.

Fixes: c6ec9d176dbf ("netdev-dpdk: Fix vHost stats.")
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
4 years ago netdev-dpdk: Reset queue number for vhost devices on vm shutdown.
David Marchand [Thu, 27 Jun 2019 09:43:36 +0000 (11:43 +0200)]
netdev-dpdk: Reset queue number for vhost devices on vm shutdown.

Rather than poll all disabled queues and waste some memory for vms that
have been shut down, we can reconfigure when receiving a destroy
connection notification from the vhost library.

$ while true; do
  ovs-appctl dpif-netdev/pmd-rxq-show |awk '
  /port: / {
    tot++;
    if ($5 == "(enabled)") {
      en++;
    }
  }
  END {
    print "total: " tot ", enabled: " en
  }'
  sleep 1
done

total: 66, enabled: 66
total: 6, enabled: 2

This change requires a fix in the DPDK vhost library, so bump the minimal
required version to 18.11.2.

Co-authored-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
4 years ago dpdk: Use DPDK 18.11.2 release.
Ian Stokes [Wed, 26 Jun 2019 21:06:05 +0000 (22:06 +0100)]
dpdk: Use DPDK 18.11.2 release.

Modify travis linux build script to use the latest DPDK stable release
18.11.2. Update docs for latest DPDK stable releases.

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
4 years ago vswitchd: Separate disable system and route.
William Tu [Tue, 25 Jun 2019 21:52:38 +0000 (14:52 -0700)]
vswitchd: Separate disable system and route.

Previously, '--disable-system' disables both system dp and the system
routing table.  The patch makes '--disable-system' only disable system
dp and adds '--disable-system-route' for disabling the route table.
This fixes failures when running 'make check-system-userspace' for tunnel cases.

As a consequence, the tests hit errors because OVS userspace parses the IGMP packet
but its datapaths do not, so odp_flow_key_to_flow() returns ODP_FIT_TOO_LITTLE
(see commit c645550bb249 ("odp-util: Always report ODP_FIT_TOO_LITTLE for IGMP.")).
Fix it by filtering out the IGMP-related error message.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Co-authored-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years ago ovs-atomic-c++.h: Fix for 64 bit atomics.
Gurucharan Shetty [Wed, 12 Jun 2019 10:57:29 +0000 (03:57 -0700)]
ovs-atomic-c++.h: Fix for 64 bit atomics.

Commit e981a45a6cae4 (ovs-atomic: Add 64 bit apis.)
added a few 64 bit apis (e.g: atomic_count_inc64).  For C++,
this invokes std::atomic_fetch_*_explicit() functions in
lib/ovs-atomic-c++.h.

The function overloading for 64 bit functions fails without
specifying something like: std::atomic_fetch_*_explicit<std::uint64_t>().
But it looks tricky to do this with macros.

This patch tries to fix the compilation failures by calling atomic
functions on the variables themselves.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
4 years ago netdev-dpdk: Avoid reconfiguration on VIRTIO_NET_F_MQ changes.
David Marchand [Thu, 25 Apr 2019 15:22:09 +0000 (17:22 +0200)]
netdev-dpdk: Avoid reconfiguration on VIRTIO_NET_F_MQ changes.

At the moment, a malicious guest might negotiate VIRTIO_NET_F_MQ and
!VIRTIO_NET_F_MQ in a loop which would be seen as qp_num going from 1 to
n and n to 1 continuously, triggering datapath reconfigurations at each
transition.

Limit this by only reconfiguring on increased qp_num.
The previous patch reduced the observed cost of polling disabled queues,
so the only cost is memory.
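
The "only reconfigure on increase" rule can be sketched as below. The struct and function names are hypothetical and the real netdev-dpdk code tracks more state; the point is only that a drop in the negotiated queue-pair count is ignored.

```c
#include <stdbool.h>

/* Hypothetical per-device state: the queue-pair count the datapath was last
 * configured for. */
struct vhost_dev_sketch {
    int configured_qp;
};

/* Records a newly negotiated queue-pair count.  Returns true if a datapath
 * reconfiguration should be requested, which happens only when the count
 * grows.  Shrinking is ignored, so flipping VIRTIO_NET_F_MQ on and off in a
 * loop triggers at most one reconfiguration. */
static bool
vhost_update_qp_num(struct vhost_dev_sketch *dev, int negotiated_qp)
{
    if (negotiated_qp > dev->configured_qp) {
        dev->configured_qp = negotiated_qp;
        return true;
    }
    return false;
}
```

A guest alternating between 1 and 8 queue pairs would cause one reconfiguration on the first jump to 8 and none afterwards.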

Co-authored-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
4 years agodpif-netdev: Only poll enabled vhost queues.
David Marchand [Thu, 25 Apr 2019 15:22:08 +0000 (17:22 +0200)]
dpif-netdev: Only poll enabled vhost queues.

We currently poll all available queues based on the max queue count
exchanged with the vhost peer and rely on the vhost library in DPDK to
check the vring status beneath.
This can lead to some overhead when we have a lot of unused queues.

To improve the situation, we can skip the disabled queues.
On rxq notifications, we make use of the netdev's change_seq number so
that the pmd thread main loop can cache the queue state periodically.
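
The caching idea can be sketched as follows. This is a Python toy model
under stated assumptions, not the dpif-netdev code: the class and method
names are made up for illustration, and the real pmd loop refreshes its
cache periodically rather than on every call.

```python
class Netdev:
    """Toy stand-in for a netdev; only the fields this sketch needs."""

    def __init__(self, n_queues: int):
        self.enabled = [True] * n_queues
        self.change_seq = 0

    def set_queue_state(self, qid: int, enabled: bool) -> None:
        # An rxq notification updates the state and bumps the sequence
        # number so that pollers know their cache is stale.
        self.enabled[qid] = enabled
        self.change_seq += 1


class Poller:
    """Toy stand-in for a pmd thread polling one netdev."""

    def __init__(self, netdev: Netdev):
        self.netdev = netdev
        self.cached_enabled = list(netdev.enabled)
        self.last_seq = netdev.change_seq

    def queues_to_poll(self) -> list[int]:
        # Refresh the cached state only when the sequence number moved.
        if self.netdev.change_seq != self.last_seq:
            self.cached_enabled = list(self.netdev.enabled)
            self.last_seq = self.netdev.change_seq
        return [q for q, on in enumerate(self.cached_enabled) if on]
```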

$ ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 1:
  isolated : true
  port: dpdk0             queue-id:  0 (enabled)   pmd usage:  0 %
pmd thread numa_id 0 core_id 2:
  isolated : true
  port: vhost1            queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost3            queue-id:  0 (enabled)   pmd usage:  0 %
pmd thread numa_id 0 core_id 15:
  isolated : true
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
pmd thread numa_id 0 core_id 16:
  isolated : true
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost2            queue-id:  0 (enabled)   pmd usage:  0 %

$ while true; do
  ovs-appctl dpif-netdev/pmd-rxq-show |awk '
  /port: / {
    tot++;
    if ($5 == "(enabled)") {
      en++;
    }
  }
  END {
    print "total: " tot ", enabled: " en
  }'
  sleep 1
done

total: 6, enabled: 2
total: 6, enabled: 2
...

 # Started vm, virtio devices are bound to kernel driver which enables
 # F_MQ + all queue pairs
total: 6, enabled: 2
total: 66, enabled: 66
...

 # Unbound vhost0 and vhost1 from the kernel driver
total: 66, enabled: 66
total: 66, enabled: 34
...

 # Configured kernel bound devices to use only 1 queue pair
total: 66, enabled: 34
total: 66, enabled: 19
total: 66, enabled: 4
...

 # While rebooting the vm
total: 66, enabled: 4
total: 66, enabled: 2
...
total: 66, enabled: 66
...

 # After shutting down the vm
total: 66, enabled: 66
total: 66, enabled: 2

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
4 years agocompat: Fix compilation error on CentOS 7.6
Yi-Hung Wei [Tue, 25 Jun 2019 18:09:07 +0000 (11:09 -0700)]
compat: Fix compilation error on CentOS 7.6

This fixes the compilation issue on the CentOS 7.6 kernel
(3.10.0-957.21.3.el7.x86_64).

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-June/360013.html
Reported-by: Fred Neubauer <fred.neubauer@gmail.com>
Fixes: 6660a9597a49 ("datapath: compat: Introduce static key support")
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agorhel: Fix upgrade path
Greg Rose [Tue, 25 Jun 2019 18:45:52 +0000 (11:45 -0700)]
rhel: Fix upgrade path

There is a bug in the upgrade path from the old kmod-openvswitch SysV
based RPM to the new openvswitch-kmod systemd based RPM. Since the
name of the package is changed it is not possible to use the yum
or rpm upgrade options.  This prevents passing in a 1 or 2 to the
%postun scriptlet section of the older RPM and that causes the section
to be treated as an 'erase'.  The old kmod-openvswitch %postun section
proceeds to erase the symlinks in ../weak-updates/openvswitch that
the installation of the new package had just created.

Fix this by adding a %posttrans tag to the systemd spec file.  This
scriptlet is called after the symlinks have just been erased and
it calls the ovs-kmod-manage.sh script to recreate the symlinks and
run depmod -a again so that the correct kernel modules will be
found and loaded.

VMware-BZ: #236987

Cc: Aaron Conole <aconole@redhat.com>
Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Aaron Conole <aconole@redhat.com>
4 years agoofproto-dpif: Fix continuation with patch port
Yi-Hung Wei [Fri, 21 Jun 2019 17:51:23 +0000 (10:51 -0700)]
ofproto-dpif: Fix continuation with patch port

This patch fixes the ofp_port to odp_port translation issue on patch
port with nxt_resume.  When OVS resumes processing a packet from
nxt_resume, OVS does not translate the ofp in_port to odp in_port
correctly if the packet is originally received from a patch port.
Currently, OVS sets the odp in_port for this resume packet as ODPP_NONE
and pushes the resume packet back to the datapath. Later on, if the packet
goes through a recirc, OVS will generate the following message since it
can not translate odp in_port (ODPP_NONE) back to ofp in_port during upcall,
and push down a datapath rule to drop the packet.

    ofproto_dpif_upcall(handler16)|INFO|received packet on unassociated
        datapath port 4294967295

When OVS revalidates the drop datapath flow with ODPP_NONE in_port, we
will see the following warning.

    ofproto_dpif_upcall(revalidator18)|WARN|Failed to acquire udpif_key
        corresponding to unexpected flow (Invalid argument): ufid:....

This patch resolves this issue by storing the odp in_port in the
continuation messages, and restoring the odp in_port before pushing the
packet back to the datapath.
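
A rough Python model of the idea, not the ofproto-dpif code. The
Continuation class, its field names and the port numbers are assumptions
made for illustration; only ODPP_NONE (4294967295, as in the log line
above) comes from the text.

```python
from dataclasses import dataclass

ODPP_NONE = 0xFFFFFFFF  # 4294967295, the "unassociated" datapath port


@dataclass
class Continuation:
    ofp_in_port: int   # OpenFlow port the packet is currently "on"
    odp_in_port: int   # datapath port it was really received on (the fix)


def resume_in_port_before(cont: Continuation, ofp_to_odp: dict) -> int:
    # Old behaviour in this model: translate the ofp in_port again.
    # A patch port has no datapath port, so the lookup falls back to
    # ODPP_NONE.
    return ofp_to_odp.get(cont.ofp_in_port, ODPP_NONE)


def resume_in_port_after(cont: Continuation) -> int:
    # Fixed behaviour in this model: reuse the stored odp in_port.
    return cont.odp_in_port
```

Here ofp port 1 maps to datapath port 7 and ofp port 2 is a patch port
with no datapath port, so only the stored value survives the resume.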

VMWare-BZ: 2364696
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoOpenFlow: Enable OpenFlow 1.5 by default.
Ben Pfaff [Mon, 24 Apr 2017 18:49:59 +0000 (11:49 -0700)]
OpenFlow: Enable OpenFlow 1.5 by default.

Open vSwitch now supports all OpenFlow 1.5 required features, so enable
it by default.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoofp-actions: Support OF1.5 meter action.
Ben Pfaff [Tue, 30 Apr 2019 16:19:27 +0000 (09:19 -0700)]
ofp-actions: Support OF1.5 meter action.

OpenFlow 1.5 changed "meter" from an instruction to an action.  This commit
supports it properly.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agotravis: Make it possible to build against a dpdk branch.
David Marchand [Wed, 19 Jun 2019 07:26:29 +0000 (09:26 +0200)]
travis: Make it possible to build against a dpdk branch.

Rework the build script so that we can pass branches and tags.

With this, DPDK_VER can be passed as:
- a string starting with refs/ which is understood as a git reference.
  This triggers a git clone on DPDK_GIT (default value points to
  https://dpdk.org/git/dpdk) for a single branch pointing to this
  reference (to save some disk),
- else, any other string which is understood as an official release.
  This triggers a tarball download on dpdk.org.
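
The dispatch on DPDK_VER can be sketched like this. It is a Python
rendering of the rule stated above, not the shell build script; the
function name and return shape are assumptions, and the version strings
in the usage note are only examples.

```python
DEFAULT_DPDK_GIT = "https://dpdk.org/git/dpdk"  # default named above


def dpdk_source(dpdk_ver: str, dpdk_git: str = DEFAULT_DPDK_GIT) -> tuple:
    """Decide how to fetch DPDK for a given DPDK_VER value."""
    if dpdk_ver.startswith("refs/"):
        # A git reference: clone a single branch pointing at it.
        return ("git", dpdk_git, dpdk_ver)
    # Anything else is treated as an official release tarball.
    return ("tarball", dpdk_ver)
```

For example, "refs/heads/master" selects the git path, while a plain
release string such as "18.11.2" selects the tarball path.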

Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
4 years agotravis: Do not patch dpdk sources.
David Marchand [Wed, 19 Jun 2019 07:26:28 +0000 (09:26 +0200)]
travis: Do not patch dpdk sources.

Rather than patch the dpdk makefile and a template config file, we can
pass the -fPIC flag via EXTRA_CFLAGS.
This is more reliable than expecting the dpdk file names to be kept
unchanged.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
5 years agoAUTHORS: Add Yanqin Wei and Malvika Gupta.
Ben Pfaff [Thu, 13 Jun 2019 17:52:51 +0000 (10:52 -0700)]
AUTHORS: Add Yanqin Wei and Malvika Gupta.

Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoutil: implement count_1bits with Neon intrinsics or gcc built-in for aarch64.
Yanqin Wei [Thu, 13 Jun 2019 10:38:07 +0000 (18:38 +0800)]
util: implement count_1bits with Neon intrinsics or gcc built-in for aarch64.

Userspace datapath needs to traverse through miniflow values many times. In
this process, the 'count_1bits' operation for 'Flowmap' significantly impacts
performance. On arm, this function was defined by the portable implementation
because gcc for arm does not support the popcnt feature.
But on aarch64, the VCNT neon instruction can accelerate "count_1bits".
From gcc-7, the built-in function is implemented with the neon instruction.
In this patch, the count_1bits function is implemented with the gcc built-in
from gcc-7 on, and with neon intrinsics in gcc-6.
Performance tests were run on two aarch64 machines. In the NIC2NIC test, one
tuple dpcls lookup case achieves around 4% throughput improvement and
10(average) tuples case achieves around 5% improvement.
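
For readers unfamiliar with the operation: count_1bits returns the number
of set bits in a 64-bit value. The Python sketch below shows one common
portable bit-twiddling popcount and checks it against a reference. It is
not the OVS portable implementation, and it says nothing about the Neon or
gcc built-in code paths, which exist only in C.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def count_1bits_swar(x: int) -> int:
    """Population count of a 64-bit value using parallel partial sums."""
    x &= MASK64
    x = x - ((x >> 1) & 0x5555555555555555)                          # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)   # 4-bit sums
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F                          # 8-bit sums
    # Multiplying adds all byte sums into the top byte.
    return ((x * 0x0101010101010101) & MASK64) >> 56


def count_1bits_reference(x: int) -> int:
    """Straightforward reference: count '1' digits in the binary form."""
    return bin(x & MASK64).count("1")
```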

Tested-by: Malvika Gupta <malvika.gupta@arm.com>
Signed-off-by: Yanqin Wei <Yanqin.Wei@arm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodatapath: Support kernel version 5.0.x
Yifeng Sun [Wed, 12 Jun 2019 22:35:29 +0000 (15:35 -0700)]
datapath: Support kernel version 5.0.x

This patch updated acinclude.m4 so that OVS can be compiled on
5.0.x kernels.
This patch also updated travis files so that 5.0.x kernel versions
are used during travis test builds.
Besides, NEWS and releases.rst are also updated to reflect this
new support.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agonet: core: dev: Add extack argument to dev_change_flags()
Petr Machata [Wed, 12 Jun 2019 22:35:28 +0000 (15:35 -0700)]
net: core: dev: Add extack argument to dev_change_flags()

Upstream commit:
    commit 567c5e13be5cc74d24f5eb54cf353c2e2277189b
    Author: Petr Machata <petrm@mellanox.com>
    Date:   Thu Dec 6 17:05:42 2018 +0000

    net: core: dev: Add extack argument to dev_change_flags()

    In order to pass extack together with NETDEV_PRE_UP notifications, it's
    necessary to route the extack to __dev_open() from diverse (possibly
    indirect) callers. One prominent API through which the notification is
    invoked is dev_change_flags().

    Therefore extend dev_change_flags() with an extra extack argument and
    update all users. Most of the calls end up just encoding NULL, but
    several sites (VLAN, ipvlan, VRF, rtnetlink) do have extack available.

    Since the function declaration line is changed anyway, name the other
    function arguments to placate checkpatch.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch backports the above upstream patch and also adds fixes
in compat code.

Cc: Petr Machata <petrm@mellanox.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodatapath: Backport the removal of __tcp_checksum_complete()
Yifeng Sun [Wed, 12 Jun 2019 22:35:27 +0000 (15:35 -0700)]
datapath: Backport the removal of __tcp_checksum_complete()

Upstream commit 6ab6dfa6bb500f5cbb9b7a0f23a1613417ca2d12 ("net: get
rid of __tcp_checksum_complete())" deleted __tcp_checksum_complete()
and caused compilation failure for OVS on newer kernels.

This patch fixes it by using __skb_checksum_complete(), which is
equivalent to __tcp_checksum_complete().

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoOVS: remove use of VLAN_TAG_PRESENT
Michał Mirosław [Wed, 12 Jun 2019 22:35:26 +0000 (15:35 -0700)]
OVS: remove use of VLAN_TAG_PRESENT

Upstream commits:
    (1) commit 9df46aefafa6dee81a27c2a9d8ba360abd8c5fe3
    Author: Michał Mirosław <mirq-linux@rere.qmqm.pl>
    Date:   Thu Nov 8 18:44:50 2018 +0100

    OVS: remove use of VLAN_TAG_PRESENT

    This is a minimal change to allow removing of VLAN_TAG_PRESENT.
    It leaves OVS unable to use CFI bit, as fixing this would need
    a deeper surgery involving userspace interface.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
    (2) commit 6083e28aa02d7c9e6b87f8b944e92793094ae047
    Author: Michał Mirosław <mirq-linux@rere.qmqm.pl>
    Date:   Sat Nov 10 19:55:34 2018 +0100

    OVS: remove VLAN_TAG_PRESENT - fixup

    It turns out I missed one VLAN_TAG_PRESENT in OVS code while rebasing.
    This fixes it.

Fixes: 9df46aefafa6 ("OVS: remove use of VLAN_TAG_PRESENT")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch backports the above upstream patch to OVS and adds
extra checking in kernel module's compat code.

Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agodatapath: Check extack argument of rtnl_create_link()
Yifeng Sun [Wed, 12 Jun 2019 22:35:25 +0000 (15:35 -0700)]
datapath: Check extack argument of rtnl_create_link()

Upstream commit d0522f1cd25edb796548f91e04766fa3cbc3b6df ("net:
Add extack argument to rtnl_create_link") added new argument
to rtnl_create_link(). This introduced compiling errors in
the code of kernel datapath.

This patch fixes this issue.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agonetdev-tc-offloads: Use correct hook qdisc at init tc flow
Raed Salem [Mon, 10 Jun 2019 11:58:40 +0000 (14:58 +0300)]
netdev-tc-offloads: Use correct hook qdisc at init tc flow

A preliminary netdev qdisc cleanup is done during init tc flow.
The cited commit allows creating egress hook qdiscs on internal
ports. This breaks the netdev qdisc cleanup, as currently only the ingress
hook qdisc type is deleted. As a consequence, the check for tc ingress
shared block support fails when the check is done on an internal port.

Issue can be reproduced by the following steps:
- start openvswitch service
- create ovs bridge
- restart openvswitch service

Fix by using the correct hook qdisc type at netdev hook qdisc cleanup.

Fixes: 608ff46aaf0d ("ovs-tc: offload datapath rules matching on internal ports")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
5 years agoovn-controller: Fix parsing of OVN tunnel IDs
Dumitru Ceara [Wed, 12 Jun 2019 15:59:02 +0000 (17:59 +0200)]
ovn-controller: Fix parsing of OVN tunnel IDs

Encap tunnel-ids are of the form:
<chassis-id><OVN_MVTEP_CHASSISID_DELIM><encap-ip>.
In physical_run we were checking if a tunnel-id corresponds
to the local chassis-id by searching if the chassis-id string
is included in the tunnel-id (strstr). This can break quite
easily, for example, if the local chassis-id is a substring
of a remote chassis-id. In that case we were wrongfully
skipping the tunnel creation.

To fix that, new tunnel-id creation and parsing functions are added in
encaps.[ch]. These functions are now used wherever applicable.
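
The failure mode is easy to reproduce outside OVN. The Python sketch below
contrasts a substring test with parsing on the delimiter. The delimiter
value "@", the function names and the chassis names are assumptions for
illustration; the real helpers live in encaps.[ch] and are written in C.

```python
DELIM = "@"  # assumed value of OVN_MVTEP_CHASSISID_DELIM


def make_tunnel_id(chassis_id: str, encap_ip: str) -> str:
    return f"{chassis_id}{DELIM}{encap_ip}"


def parse_tunnel_id(tunnel_id: str):
    """Split a tunnel-id into (chassis_id, encap_ip), or None if malformed."""
    chassis_id, sep, encap_ip = tunnel_id.partition(DELIM)
    if not sep:
        return None
    return chassis_id, encap_ip


def is_local_substring(tunnel_id: str, local_chassis: str) -> bool:
    # Mirrors the old strstr-style check: true for any substring match.
    return local_chassis in tunnel_id


def is_local_parsed(tunnel_id: str, local_chassis: str) -> bool:
    # Compares the parsed chassis-id exactly.
    parsed = parse_tunnel_id(tunnel_id)
    return parsed is not None and parsed[0] == local_chassis
```

With a local chassis "hv1" and a remote chassis "hv10", the substring
check wrongly treats the remote tunnel-id as local, while the parsed
comparison does not.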

Acked-by: Venu Iyer <iyervl@ymail.com>
Reported-at: https://bugzilla.redhat.com/1708131
Reported-by: Haidong Li <haili@redhat.com>
Fixes: b520ca7 ("Support for multiple VTEP in OVN")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis: Don't install kernel for DPDK checks.
Ilya Maximets [Tue, 11 Jun 2019 15:31:21 +0000 (18:31 +0300)]
travis: Don't install kernel for DPDK checks.

We don't need to build DPDK kernel modules to test build with OVS.
And we don't need to build OVS datapath modules for checking
userspace with DPDK.

Removed 'max-inline-insns-single' changes that were only needed for
DPDK kernel modules. Config modifications changed to update
generated build/.config instead of changing sources.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Tested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
5 years agoovn-controller: Cleanup memory in binding_evaluate_port_binding_changes
Dumitru Ceara [Tue, 11 Jun 2019 14:55:34 +0000 (16:55 +0200)]
ovn-controller: Cleanup memory in binding_evaluate_port_binding_changes

The 'lport_to_iface' and 'egress_ifaces' hashtables were not cleaned up
when checking if port bindings require a recompute.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048822.html
Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com>
Fixes: 9d0b504abdee ("ovn-controller: runtime_data change handler for SB port-binding")
Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agonetdev-offload: Rename offload providers.
Ilya Maximets [Tue, 7 May 2019 09:24:09 +0000 (12:24 +0300)]
netdev-offload: Rename offload providers.

Flow API providers renamed to be consistent with parent module
'netdev-offload' and look more like each other.

'_rte_' replaced with more convenient '_dpdk_'.

We'll have following structure:

  Common code:
    lib/netdev-offload-provider.h
    lib/netdev-offload.c
    lib/netdev-offload.h

  Providers:
    lib/netdev-offload-tc.c
    lib/netdev-offload-dpdk.c

'netdev-offload-dummy' still resides inside netdev-dummy, but it
makes little sense to move it out of there.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Roi Dayan <roid@mellanox.com>
5 years agonetdev: Split up netdev offloading to separate module.
Ilya Maximets [Tue, 7 May 2019 09:24:08 +0000 (12:24 +0300)]
netdev: Split up netdev offloading to separate module.

New module 'netdev-offload' created to manage different flow API
implementations. All the generic and provider independent code moved
there from the 'netdev' module.

Flow API providers further encapsulated.

The only function that was changed is 'netdev_any_oor'.
Now it uses offloading related hmap instead of common 'netdev_shash'.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Roi Dayan <roid@mellanox.com>
5 years agodpctl: Update docs about dump-flows and HW offloading.
Ilya Maximets [Wed, 15 May 2019 14:32:32 +0000 (17:32 +0300)]
dpctl: Update docs about dump-flows and HW offloading.

Since introduction of dynamic flow API for netdevs, tricky
accesses to uninitialized flow API are no longer possible.
So, ovs-dpctl doesn't support dumping HW offloaded flows now.
State this in docs and man pages. Additionally, forbid the
'type' argument for 'ovs-dpctl dump-flows'.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Roi Dayan <roid@mellanox.com>
5 years agonetdev: Dynamic per-port Flow API.
Ilya Maximets [Tue, 7 May 2019 09:24:07 +0000 (12:24 +0300)]
netdev: Dynamic per-port Flow API.

Current issues with Flow API:

* OVS calls offloading functions regardless of successful
  flow API initialization. (ex. on init_flow_api failure)
* Static initialization of Flow API for a netdev_class forbids
  having different offloading types for different instances
  of netdev with the same netdev_class. (ex. different vports in
  'system' and 'netdev' datapaths at the same time)

Solution:

* Move Flow API from the netdev_class to netdev instance.
* Make Flow API dynamic, i.e. probe the APIs and choose the
  suitable one.

Side effects:

* Flow API providers are localized as much as possible in their modules.
* Now we have an ability to make runtime checks. For example,
  we could check if particular device supports features we
  need, like if dpdk device supports RSS+MARK action.
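
The per-instance probing described above can be sketched as follows. This
is a Python toy model, not the netdev C code: the provider names, the
probe conditions and the dictionary fields are all assumptions chosen to
show two instances of the same class ending up with different Flow APIs.

```python
def probe_tc(netdev: dict) -> bool:
    # Hypothetical probe: accept ports of the 'system' datapath.
    return netdev["dpif_type"] == "system"


def probe_dpdk(netdev: dict) -> bool:
    # Hypothetical runtime check: needs the 'netdev' datapath and a
    # device that reports RSS+MARK support.
    return netdev["dpif_type"] == "netdev" and netdev.get("rss_mark", False)


PROVIDERS = [("tc", probe_tc), ("dpdk", probe_dpdk)]


def assign_flow_api(netdev: dict, providers=PROVIDERS):
    """Probe each provider in turn and bind the first one that fits."""
    for name, init_ok in providers:
        if init_ok(netdev):
            netdev["flow_api"] = name
            return name
    netdev["flow_api"] = None
    return None


def offload_flow(netdev: dict, flow: str) -> bool:
    """Skip offloading entirely when no Flow API was initialized."""
    return netdev.get("flow_api") is not None
```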

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Roi Dayan <roid@mellanox.com>
5 years agorhel: let *-ctl handle runtime directory
Jaime Caamaño Ruiz [Mon, 10 Jun 2019 13:55:31 +0000 (15:55 +0200)]
rhel: let *-ctl handle runtime directory

Recent versions of systemd restore RuntimeDirectory ownership to the
unit's User in between execution of *Exec directives (see [1]). Using
ExecStartPre to reset RuntimeDirectory ownership to OVS_USER no longer
works as expected.

The ctl scripts already handle creation of the runtime directory with
correct ownership and permissions so we can basically remove
RuntimeDirectory from the systemd unit file. There is still a need to handle
ownership to cover some upgrade scenarios, but success of that will be
optional as the directory itself won't exist on the first run.

[1] https://github.com/systemd/systemd/issues/12713

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agorhel: Fix ovn database dir optional on first run
Jaime Caamaño Ruiz [Mon, 10 Jun 2019 16:58:12 +0000 (18:58 +0200)]
rhel: Fix ovn database dir optional on first run

OVN database directory is created on first run so make ownership
handling optional.

Fixes: 94e1e8be3187 ("rhel: run ovn with the same user as ovs")
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agorhel: set useropts optional for ovsdb-server
Jaime Caamaño Ruiz [Mon, 10 Jun 2019 16:58:11 +0000 (18:58 +0200)]
rhel: set useropts optional for ovsdb-server

systemd assesses the presence of every EnvironmentFile before execution
of Exec* directives, thus useropts needs to be optional even though it
will always be created at ExecStartPre.

Fixes: 94e1e8be3187 ("rhel: run ovn with the same user as ovs")
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agorhel: useropts should be owned by package
Jaime Caamaño Ruiz [Mon, 10 Jun 2019 13:55:45 +0000 (15:55 +0200)]
rhel: useropts should be owned by package

So that it is properly cleaned up after the package is uninstalled.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agolacp: Don't send or receive PDUs when carrier state of slave is down
Nitin Katiyar [Sun, 9 Jun 2019 14:18:10 +0000 (14:18 +0000)]
lacp: Don't send or receive PDUs when carrier state of slave is down

Fortville NICs (or their drivers) can get into an inconsistent state,
in which the NIC can actually transmit and receive packets even
though they report "PHY down". In such a state, OVS can exchange and
process LACP messages and enable a LACP slave. However, further packet
exchange over the slave fails because OVS sees that the PHY is down.

This commit fixes the problem by making OVS ignore received LACP PDUs
and suppress transmitting LACP PDUs when carrier is down. In addition,
when a LACP PDU is received with carrier down, this commit triggers
rechecking the carrier status (by incrementing the connectivity sequence
number) to ensure that it is updated as quickly as possible.
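
A small Python model of the behaviour described above. The class and its
fields are illustrative assumptions rather than the lacp.c structures: a
PDU that arrives while carrier is down is dropped and bumps a sequence
counter so the carrier state gets rechecked, and nothing is transmitted
while carrier is down.

```python
class LacpPort:
    """Toy model of one LACP slave; not the OVS data structure."""

    def __init__(self, carrier_up: bool):
        self.carrier_up = carrier_up
        self.pdus_processed = 0
        self.connectivity_seq = 0

    def receive_pdu(self) -> bool:
        if not self.carrier_up:
            # Ignore the PDU, but prompt a recheck of the carrier status.
            self.connectivity_seq += 1
            return False
        self.pdus_processed += 1
        return True

    def may_send_pdu(self) -> bool:
        # Transmission is suppressed while carrier is down.
        return self.carrier_up
```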

Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agolacp: Avoid packet drop on LACP bond after link up
Nitin Katiyar [Sun, 9 Jun 2019 14:17:45 +0000 (14:17 +0000)]
lacp: Avoid packet drop on LACP bond after link up

Problem:
========
The OVS state machine that enables and disables bond slaves runs in
the OVS main thread. The OVS code that processes received LACP packets
runs in a different thread. Until now, when the latter processes a LACP
PDU that should enable a slave, the slave was only enabled when the
main thread was able to run the state machine. In some cases this led
to delays of up to 350ms when the main thread was busy or not scheduled,
which led to corresponding delays in which packets were dropped due to
the bond-admissibility check.

Fix:
====
When a LACP PDU is received, evaluate whether LACP slave can be enabled
(slave_may_enable()) and set LACP slave's may_enable from the datapath
thread itself. When may_enable = TRUE, it means L1 state is UP and
LACP-SYNC is done and it is waiting for the main thread to enable the
slave. Relax the check in bond_check_admissibility() to check for both
"enable" and "may_enable" of the LACP slave. This would avoid dropping
of packets until the main thread enables the slave from bundle_run().
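
The relaxed check can be summarized with a short Python sketch. The
dataclass and function names are assumptions, and the real
bond_check_admissibility() considers more than these two flags; only the
"enabled or may_enable" relaxation is modelled here.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    enabled: bool = False     # set by the main thread's state machine
    may_enable: bool = False  # set when the LACP PDU is processed


def admissible_before(slave: Slave) -> bool:
    # Old rule in this model: only an enabled slave may carry packets.
    return slave.enabled


def admissible_after(slave: Slave) -> bool:
    # Relaxed rule: also accept a slave that is ready to be enabled.
    return slave.enabled or slave.may_enable
```

In the window where may_enable is already true but the main thread has
not yet run, the old rule drops packets and the relaxed rule admits them.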

Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotravis: Test with latest stable kernel releases.
Ilya Maximets [Thu, 16 May 2019 16:39:21 +0000 (19:39 +0300)]
travis: Test with latest stable kernel releases.

Instead of managing kernel minor versions manually we could always test
with the most recent stable release of the desired branch.

With this patch applied Travis will always check with the most recent
kernels, so we'll be notified about changes in upstream kernels that
break the build of our kernel module. However, this will also break
Travis checks on patches that don't touch the kernel parts until
we fix the module.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: Yifeng Sun <pkusunyifeng@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
5 years agoAUTHORS: Add Damijan Skvarc and Jaime Caamaño Ruiz.
Ben Pfaff [Mon, 10 Jun 2019 00:28:05 +0000 (17:28 -0700)]
AUTHORS: Add Damijan Skvarc and Jaime Caamaño Ruiz.

Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agorhel: run ovn with the same user as ovs
Jaime Caamaño Ruiz [Wed, 8 May 2019 11:53:48 +0000 (13:53 +0200)]
rhel: run ovn with the same user as ovs

Both ovn and ovs share the same log and run directories which are owned
by the user running ovs so it makes sense that ovn runs under that user
too to diminish security concerns and possible problems with log rotation.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agorhel: secure openvswitch useropts
Jaime Caamaño Ruiz [Wed, 8 May 2019 11:53:47 +0000 (13:53 +0200)]
rhel: secure openvswitch useropts

The openvswitch useropts file is being stored in a directory where the
openvswitch user has write permissions. The openvswitch user can then
manipulate the file to change the user under which switchd daemon runs.

This patch changes the file to /var/openvswitch.useropts preventing any
manipulation.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agorhel: start ovn-controller-vtep with ovn-ctl
Jaime Caamaño Ruiz [Wed, 8 May 2019 11:53:46 +0000 (13:53 +0200)]
rhel: start ovn-controller-vtep with ovn-ctl

Use ovn-ctl to start ovn-controller-vtep from the corresponding systemd
unit file.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovn-controller-vtep: Fix wrong value for ovnsb-db argument
Jaime Caamaño Ruiz [Wed, 8 May 2019 11:53:45 +0000 (13:53 +0200)]
ovn-controller-vtep: Fix wrong value for ovnsb-db argument

Fix help output of ovn-controller-vtep that was suggesting the
openvswitch database instead of the ovn southbound database for the
ovnsb-db argument.

Also fix the corresponding systemd unit that was passing the openvswitch
database instead of the ovn southbound database for the ovnsb-db
argument.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agonsh: Fix "shadow" warnings while compiling with clang.
Damijan Skvarc [Fri, 10 May 2019 08:57:16 +0000 (10:57 +0200)]
nsh: Fix "shadow" warnings while compiling with clang.

Because of the macro implementation of htonX() and ntohX(), using one in
the argument of the other yields warnings.  This commit avoids the issue by
using a temporary variable.

This does not fix a bug, only suppresses a warning.

Submitted-at: https://github.com/openvswitch/ovs/pull/283
Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agotests: Add negative tests for action and instruction parsing.
Ben Pfaff [Tue, 30 Apr 2019 22:30:41 +0000 (15:30 -0700)]
tests: Add negative tests for action and instruction parsing.

This adds a negative test for almost all of the error messages that
parsing an action or instruction can produce.

This commit removes now-redundant tests from multipath.at.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovs-ofctl: New testing command "parse-group".
Ben Pfaff [Tue, 30 Apr 2019 22:07:07 +0000 (15:07 -0700)]
ovs-ofctl: New testing command "parse-group".

This will be used in an upcoming test.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoofp-actions: Improve a few error messages.
Ben Pfaff [Tue, 30 Apr 2019 22:01:00 +0000 (15:01 -0700)]
ofp-actions: Improve a few error messages.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoofp-actions: Eliminate redundant error messages from ofpacts_parse__().
Ben Pfaff [Tue, 30 Apr 2019 22:06:34 +0000 (15:06 -0700)]
ofp-actions: Eliminate redundant error messages from ofpacts_parse__().

These duplicate messages emitted by ofpacts_verify(), so drop them.

These were previously useful because ofpacts_verify()'s error messages were
not as good as those emitted by ofpacts_parse__(), but that's been fixed
now.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoofp-actions: Improve error messages for verification failures in parsing.
Ben Pfaff [Tue, 30 Apr 2019 16:19:16 +0000 (09:19 -0700)]
ofp-actions: Improve error messages for verification failures in parsing.

Verification can fail for a variety of reasons but the code here always
reported "Incorrect instruction ordering".

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoofp-actions: Enforce minimum length for packet truncation during parsing.
Ben Pfaff [Tue, 30 Apr 2019 20:26:47 +0000 (13:26 -0700)]
ofp-actions: Enforce minimum length for packet truncation during parsing.

Otherwise, specifying something like output(port=1,max_len=5) would parse
OK and then cause a failure when it was received by the switch.
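
A sketch of what such a parse-time check looks like, in Python rather
than the C of ofp-actions. The minimum of 14 bytes (one Ethernet header)
and the error wording are assumptions made for illustration; the commit
message does not state them.

```python
ETH_HEADER_LEN = 14  # assumed minimum: one Ethernet header


def parse_max_len(text: str) -> int:
    """Parse a max_len value, rejecting values the switch would refuse."""
    max_len = int(text)
    if max_len < ETH_HEADER_LEN:
        # Hypothetical message; fail at parse time instead of at the switch.
        raise ValueError(
            f"max_len {max_len} is less than the minimum value {ETH_HEADER_LEN}"
        )
    return max_len
```

Under this assumption, max_len=5 from the example above is rejected
while parsing, and a value such as 64 is accepted.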

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoofp-actions: Make decap action format output match parsed input.
Ben Pfaff [Tue, 30 Apr 2019 20:41:52 +0000 (13:41 -0700)]
ofp-actions: Make decap action format output match parsed input.

The action expects 'type' as a parameter name so it should use 'type' when
it formats actions too.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoofp-actions: Make encap action really require OF1.3+.
Ben Pfaff [Tue, 30 Apr 2019 20:25:48 +0000 (13:25 -0700)]
ofp-actions: Make encap action really require OF1.3+.

This action is only supported in OpenFlow 1.3 and later, but the parser
from text allowed it in earlier versions, which could cause confusion,
e.g.:

$ ovs-ofctl parse-flow 'actions=encap(ethernet())'
usable protocols: any
chosen protocol: OpenFlow10-table_id
2019-04-30T20:19:59Z|00001|ofp_actions|WARN|unknown OpenFlow10 action for vendor 0x2320 and type 46
2019-04-30T20:19:59Z|00002|ofp_actions|WARN|bad action at offset 0 (OFPBAC_BAD_VENDOR_TYPE):
00000000  ff ff 00 10 00 00 23 20-00 2e 00 00 00 00 00 00
OFPT_FLOW_MOD (xid=0x1): ***decode error: OFPBAC_BAD_VENDOR_TYPE***
00000000  01 0e 00 58 00 00 00 01-00 38 20 ff 00 00 00 00 |...X.....8 .....|
00000010  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 |................|
00000020  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 |................|
00000030  00 00 00 00 00 00 00 00-00 00 00 00 00 00 80 00 |................|
00000040  ff ff ff ff ff ff 00 00-ff ff 00 10 00 00 23 20 |..............# |
00000050  00 2e 00 00 00 00 00 00-                        |........        |

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoofproto-dpif-xlate: Report DHCP output actions in trace.
Ben Pfaff [Fri, 7 Jun 2019 19:09:19 +0000 (12:09 -0700)]
ofproto-dpif-xlate: Report DHCP output actions in trace.

Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovn: grab pinctrl_mutex before running pinctrl_handle_buffered_packets
Lorenzo Bianconi [Wed, 15 May 2019 13:33:08 +0000 (15:33 +0200)]
ovn: grab pinctrl_mutex before running pinctrl_handle_buffered_packets

pinctrl_handle_buffered_packets can insert new elements in the
buffered_packets_map hashmap and it runs concurrently with pinctrl_run
starting from commit 3594ffab6b4b. Fix possible races by grabbing
pinctrl_mutex before running pinctrl_handle_buffered_packets.

Fixes: 3594ffab6b4b ("ovn-controller: Add a new thread in pinctrl module to handle packet-ins.")
Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
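
The locking pattern in the commit above can be sketched as follows. This is an illustration only, not the ovn-controller code: the Python names mirror those in the commit message, but the function bodies, packet_in_thread, and run_demo are assumptions made for the example.

```python
import threading

# Hypothetical stand-ins for the names in the commit message.
pinctrl_mutex = threading.Lock()
buffered_packets_map = {}


def handle_buffered_packets(key, packet):
    """Insert into the shared map; the caller must hold pinctrl_mutex."""
    buffered_packets_map.setdefault(key, []).append(packet)


def packet_in_thread(n):
    for i in range(n):
        # Take the lock before touching the shared map, as the fix does.
        with pinctrl_mutex:
            handle_buffered_packets(i % 4, ("handler", i))


def pinctrl_run(n):
    for i in range(n):
        with pinctrl_mutex:
            handle_buffered_packets(i % 4, ("main", i))


def run_demo(n=1000):
    """Run both writers concurrently and return the number of stored packets."""
    buffered_packets_map.clear()
    threads = [threading.Thread(target=packet_in_thread, args=(n,)),
               threading.Thread(target=pinctrl_run, args=(n,))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(len(v) for v in buffered_packets_map.values())
```

Because every insertion happens under the same mutex, the two writers never modify the map at the same time and no insertion is lost.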
5 years agodatapath: Fix build errors for 4.9.172+ kernels
Yifeng Sun [Fri, 24 May 2019 18:24:25 +0000 (11:24 -0700)]
datapath: Fix build errors for 4.9.172+ kernels

The 4.9.172+ kernels backported upstream patch 70b095c843266
("ipv6: remove dependency of nf_defrag_ipv6 on ipv6 module"),
which caused compilation errors in the OVS kernel module.

This patch fixes it by checking for and using the new functions
introduced by the upstream patch.

Travis tests passed at
https://travis-ci.org/yifsun/ovs-travis/builds/536527230
with latest Linux kernel version.

In addition, this patch doesn't introduce test failures on the latest kernels
of Ubuntu (bionic, trusty, xenial), fedora, centos 73, rhel (74, 75, 76).

Reported-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoofproto-dpif-xlate: Fix match criteria for in-band control rule
Anju Thomas [Wed, 29 May 2019 09:54:04 +0000 (09:54 +0000)]
ofproto-dpif-xlate: Fix match criteria for in-band control rule

As part of in-band control, OVS is expected to send DHCP server replies
to the LOCAL port as well. In this case, OVS implicitly adds an
additional action to output to the bridge’s LOCAL port after the ofproto
translation for the packet is completed in the ofproto layer but before
sending the actions to datapath for installation.

However, the match criteria are left unchanged and, as a result, all packets
(not just DHCP server replies) are also sent to the LOCAL port.

The fix is to add the IP protocol type (UDP), the UDP source and
destination ports to the match criteria so that a specific datapath flow
that matches only DHCP server replies is installed. As a result, only
DHCP server reply packets will be sent to the LOCAL port.

Signed-off-by: Anju Thomas <anju.thomas@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
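
The difference between the old and the narrowed match in the commit above can be sketched as below. This is not the ofproto-dpif-xlate code: the Packet type and the functions broad_match, dhcp_reply_match, and outputs are hypothetical, and the port numbers 67 (server) and 68 (client) are the standard DHCP ports, which the commit message does not spell out.

```python
from dataclasses import dataclass

IPPROTO_UDP = 17
DHCP_SERVER_PORT = 67  # standard DHCP server port (assumed, not in the commit text)
DHCP_CLIENT_PORT = 68  # standard DHCP client port (assumed, not in the commit text)


@dataclass(frozen=True)
class Packet:
    ip_proto: int
    src_port: int
    dst_port: int


def broad_match(pkt):
    """Models the unchanged match: every packet of the flow qualifies."""
    return True


def dhcp_reply_match(pkt):
    """Models the narrowed match: UDP from the server port to the client port."""
    return (pkt.ip_proto == IPPROTO_UDP
            and pkt.src_port == DHCP_SERVER_PORT
            and pkt.dst_port == DHCP_CLIENT_PORT)


def outputs(pkt, match):
    """Return the output list, adding LOCAL only when the match applies."""
    actions = ["normal"]
    if match(pkt):
        actions.append("LOCAL")
    return actions
```

With broad_match, a TCP packet is also copied to LOCAL; with dhcp_reply_match, only a UDP packet from port 67 to port 68 is.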
5 years agoconntrack: ignore port for ICMP/ICMPv6 NAT.
solomon [Wed, 5 Jun 2019 22:35:34 +0000 (15:35 -0700)]
conntrack: ignore port for ICMP/ICMPv6 NAT.

ICMP/ICMPv6 NAT fails if the src/dst port is set in a common NAT rule.
For example:
actions=ct(nat(dst=172.16.1.100:5000),commit,table=40)

Fixes: 4cd0481c9e8b ("conntrack: Fix wasted work for ICMP NAT.")
CC: Darrell Ball <dlu998@gmail.com>
Signed-off-by: solomon <liwei.solomon@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Co-authored-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
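
A minimal sketch of the behavior named in the commit subject above, assuming a dictionary-based packet and a hypothetical apply_dnat helper; it is not the conntrack implementation. The protocol numbers are the IANA values for ICMP (1) and ICMPv6 (58).

```python
from typing import Optional

IPPROTO_ICMP = 1
IPPROTO_ICMPV6 = 58
PORTLESS_PROTOCOLS = {IPPROTO_ICMP, IPPROTO_ICMPV6}


def apply_dnat(pkt: dict, nat_ip: str, nat_port: Optional[int]) -> dict:
    """Return a copy of pkt with destination NAT applied.

    The port from the NAT rule is ignored for ICMP/ICMPv6, which carry
    no transport ports, so one rule can serve all protocols.
    """
    out = dict(pkt)
    out["dst_ip"] = nat_ip
    if nat_port is not None and pkt["proto"] not in PORTLESS_PROTOCOLS:
        out["dst_port"] = nat_port
    return out
```

With a rule such as nat(dst=172.16.1.100:5000), a TCP packet gets both the address and the port rewritten, while an ICMP packet gets only the address rewritten.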
5 years agoflow: Don't include ports of first fragments in hash
Van Bemmel, Jeroen (Nokia - US) [Wed, 5 Jun 2019 22:36:33 +0000 (22:36 +0000)]
flow: Don't include ports of first fragments in hash

For a series of IP fragments, only the first packet includes the transport
header (TCP/UDP/SCTP) and the src/dst ports. By including these port
numbers in the hash, it may happen that a first fragment hashes to a
different value than subsequent packets, causing different packets from
the same flow to follow different paths. This in turn may result in
out-of-order delivery or failed reassembly. This patch excludes port
numbers from the hash calculation in case of IP fragmentation.

Signed-off-by: Jeroen van Bemmel <jeroen.van_bemmel@nokia.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
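
The effect of the change above can be illustrated with a small sketch. The function flow_hash and its CRC32-based hashing are assumptions chosen for brevity, not the hash used in the flow code; the point is only that ports are dropped from the key whenever the packet is a fragment.

```python
import zlib


def flow_hash(src_ip, dst_ip, proto, src_port=0, dst_port=0, fragment=False):
    """Hash a flow key, leaving ports out for any IP fragment.

    First fragments carry ports and later fragments do not, so including
    ports would let fragments of one datagram hash differently.
    """
    if fragment:
        src_port = dst_port = 0
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key)


# The first fragment still has a UDP header; later fragments have none.
first = flow_hash("10.0.0.1", "10.0.0.2", 17, 5000, 53, fragment=True)
later = flow_hash("10.0.0.1", "10.0.0.2", 17, fragment=True)
```

Here first and later are equal, so all fragments of the datagram follow the same path. Unfragmented packets still hash on the full five-tuple.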
5 years agoAUTHORS: Add Maciej Józefczyk.
Ben Pfaff [Fri, 7 Jun 2019 17:19:02 +0000 (10:19 -0700)]
AUTHORS: Add Maciej Józefczyk.

Signed-off-by: Ben Pfaff <blp@ovn.org>
5 years agoovn: Add support for DHCP option 15 - domain name
Maciej Józefczyk [Fri, 7 Jun 2019 12:14:28 +0000 (14:14 +0200)]
ovn: Add support for DHCP option 15 - domain name

OpenStack internal DNS functionality requires support for the
domain_name option.
DHCP option 15 was previously used only in parser
tests and, according to the RFC, it should be named
domain_name [1].

This patch changes its name in the tests from
'domain' to 'domain_name' and adds support for it
to the code.

[1] https://tools.ietf.org/html/rfc2132#section-3.17

Signed-off-by: Maciej Józefczyk <mjozefcz@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
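
For reference, the wire format of option 15 defined in the RFC section cited above is a code byte, a length byte, and the domain name. The sketch below encodes it; encode_domain_name is a hypothetical helper and does not show how OVN itself builds the option.

```python
DHCP_OPT_DOMAIN_NAME = 15  # option code from RFC 2132, section 3.17


def encode_domain_name(name: str) -> bytes:
    """Encode DHCP option 15 as code, length, then the name bytes."""
    data = name.encode("ascii")
    # The length field is a single byte and the option needs at least 1 byte.
    if not 1 <= len(data) <= 255:
        raise ValueError("domain name must be 1 to 255 bytes long")
    return bytes([DHCP_OPT_DOMAIN_NAME, len(data)]) + data
```

For example, encode_domain_name("example.org") yields the byte 0x0f, the length 0x0b, and then the eleven ASCII bytes of the name.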
5 years agotravis: Use enable-Werror knob.
Ilya Maximets [Tue, 28 May 2019 13:39:46 +0000 (16:39 +0300)]
travis: Use enable-Werror knob.

Unlike manually injecting "-Werror" into CFLAGS, '--enable-Werror'
also enables failure on "sparse" and flake8 warnings. Previously we
weren't notified about flake8 warnings at all.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
5 years agosparse: Re-allow sparse builds with dpdk.
Ilya Maximets [Fri, 26 Apr 2019 08:08:49 +0000 (11:08 +0300)]
sparse: Re-allow sparse builds with dpdk.

A few structures from rte_flow.h are updated to the version from DPDK 18.11
to fix incorrect structure definitions.

The rte_lcore.h and rte_vect.h "sparse" headers are removed because they are
not needed and only produce type-mismatch issues.

Enabled -Werror for sparse builds with DPDK to prevent regressions.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
5 years agoacinclude: Add vector defines to sparse.
Ilya Maximets [Tue, 28 May 2019 12:34:14 +0000 (15:34 +0300)]
acinclude: Add vector defines to sparse.

By adding the compiler's default flags for vector instructions to
cgcc we'll be able to check the same sources that we're building.
This also avoids re-defining these flags and
types specifically for "sparse" includes.

The "sparse" headers "bmi2intrin.h" and "emmintrin.h" are dropped as
they are no longer needed.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
5 years agodpif-netdev: Forbid vport offloading attempts.
Ilya Maximets [Mon, 13 May 2019 12:39:11 +0000 (15:39 +0300)]
dpif-netdev: Forbid vport offloading attempts.

'netdev_flow_put()' for vports could eventually succeed for the
userspace datapath if there is a kernel datapath with a
similar vport at the same time. The root cause is that vports
like 'vxlan' use the same 'vxlan_sys_<port>' system interfaces
for flow offloading and there is no way to distinguish system
and userspace vports using only the 'netdev' structure.

Let's forbid vport offloading from userspace datapath to avoid
installing userspace flows to unrelated system devices.

Future dynamic flow API management will allow vport offloading
to be re-enabled using more flexible checks.

Fixes: 241bad15d99a ("dpif-netdev: associate flow with a mark id")
Reported-by: Ophir Munk <ophirmu@mellanox.com>
Acked-By: Roni Bar Yanai <roniba@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
5 years agodpif-netdev: Fix flow mark leak on port lookup failure.
Ilya Maximets [Mon, 13 May 2019 11:37:26 +0000 (14:37 +0300)]
dpif-netdev: Fix flow mark leak on port lookup failure.

Flow mark should be properly freed in all error cases.

Fixes: 241bad15d99a ("dpif-netdev: associate flow with a mark id")
Acked-By: Roni Bar Yanai <roniba@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
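
The kind of leak fixed above can be sketched with a toy allocator. MarkPool and offload_flow are hypothetical names and do not correspond to the dpif-netdev functions; the sketch only shows that a mark taken before a failing port lookup has to be returned to the pool.

```python
class MarkPool:
    """Toy allocator for a fixed set of flow mark ids."""

    def __init__(self, size):
        self._free = set(range(1, size + 1))

    def alloc(self):
        mark = min(self._free)
        self._free.remove(mark)
        return mark

    def free(self, mark):
        self._free.add(mark)

    def available(self):
        return len(self._free)


def offload_flow(pool, ports, port_name):
    """Allocate a mark for a flow; release it again if the port is unknown."""
    mark = pool.alloc()
    try:
        ports[port_name]  # a KeyError models the port lookup failure
    except KeyError:
        pool.free(mark)   # without this line the mark would leak
        return None
    return mark
```

After a failed lookup the pool holds as many free marks as before, so repeated failures cannot exhaust it.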